25th Jan, 2008

Visitor Pattern: One Fix for LSP Violations

I wrote last night about a quick way to find Liskov Substitution Violations (LSP). Well, what good is it to find them if you don’t have a clue how to fix it? (Well, surely you do have some idea how to fix them, but I need some kind of segue into this, don’t I?)

The example in the previous post was a highly-anonymized bit of code from work that I came across a while ago. Kept sticking in the back of my mind — namely because my predecessor had put a comment right above the bit of code saying “This violates LSP, fix ASAP.” (Of course, he never got around to fixing it, nor did I ever really consider it myself — everything worked, right?)

Well, this past weekend I was playing around with NRefactory, a C# parser library from the folks who develop SharpDevelop. The whole library is built on the Visitor pattern, a pattern I wasn’t very familiar with in practice until I started trying to parse source code this weekend. I started playing with it, and at some point in time my mind went to that little snippet of code my ex-coworker had commented. I realized that — aside from having too little in my life to distract me from work (it’s the weekend, for Christ’s sake) — the visitor pattern would be a perfect solution for that little snippet of code.

Let’s get back to last night’s example though, which I’m going to put here again, slightly modified to make things a little more obvious:

[sourcecode language="csharp"]
public abstract class BaseObj { }

public class FooObj : BaseObj { }

public class BarObj : BaseObj { }

public class Worker
{
private void SomeMethod(Dictionary inputParams)
{
BaseObj obj = MyFactory.GetObject(inputParams);
if (obj is FooObj)
{
DoFooStuff(obj as FooObj);
}
else // obj must be a BarObj’s
{
DoBarStuff(obj as BarObj);
}
}

private void DoFooStuff(FooObj foo) { /* does stuff with a FooObj */ }

private void DoBarStuff(BarObj bar) { /* does different stuff with a BarObj */ }
}
[/sourcecode]

As an aside, I do want to point out that this is a blatant Open-Closed Principle (OCP) violation as well (and, generally speaking, an LSP violation is also an OCP violation).

If you’re not familiar with the Visitor pattern, it’s really simple: you call a method on one class, passing your current instance (the “this” variable) to that method; that method, in return, calls another method on the “this” object you’d passed in, again passing what is now the current “this” object back.

Clear as mud? Okay, it’s a lot easier to see in code here:

[sourcecode language="csharp"]
public abstract class BaseObj
{
public abstract void VisitWorker(Worker w);
}

public class FooObj : BaseObj
{
public override void VisitWorker(Worker w) { w.DoFooStuff(this); }
}

public class BarObj : BaseObj
{
private FooObj[] fooObjs;
public override void VisitWorker(Worker w) { w.DoBarStuff(this); }
}

public class Worker
{
private void SomeMethod(Dictionary inputParams)
{
BaseObj obj = MyFactory.GetObject(inputParams);
obj.VisitWorker(this);
}

public void DoFooStuff(FooObj foo) { /* does stuff with a FooObj */ }

public void DoBarStuff(BarObj bar) { /* does different stuff with a BarObj */ }
}
[/sourcecode]

Make a bit more sense? It was easy to solve not only the LSP violation, but the OCP violation as well, just by moving the calls to “DoFooStuff” and “DoBarStuff” into the FooObj and BarObj classes. At worst here, we have an Dependency Inversion Principle (DIP) violation because we have two methods that are relying on concrete implementations (FooObj and BarObj) instead of their common abstraction (BaseObj). This probably isn’t much of a problem here, since we’re apparently needing different functionality or information from FooObj that BarObj isn’t providing (or vice versa).

The only overtly “bad” thing about this implementation is that it requires our previously private members to now be public — or, at least, internal — so that the other classes can access them. I know there’s a lot of developers that think that the keyword “private” is holy, and reserve “public” access only for those godforsaken times you can’t avoid it — I’m all for encapsulation, but I don’t mind exposing functionality when it gives a benefit such as the above.

A couple of other criticisms here though: First, this example is kinda bad, in that we now have a highly coupled system. Your best bet would be to abstract the DoXStuff into an interface, and have the Visitors rely on that interface instead of directly on the Worker implementation.

Second, in spite of what I said earlier, there’s still a slight OCP violation (or, at least, the possibility for one). If you add another branch to the BaseObj tree straight from the root, you’re going to need to either add another method to the Worker class to handle it (i.e. DoNewObjStuff), rely on existing functionality, or create a degenerative method for VisitWorker (degenerative method == overridden method that does absolutely nothing, just returns). If you decide to abstract the DoXStuff to an interface, as suggested above, that’s fine, but you’ll need to add the functionality to your new interface and then change all implementations of that interface to provide the new method. Of course, your new descendant isn’t required to use a DoXStuff method from the worker — it can consume any method available to it.

A little painful, true, but it’s a hell of a lot cleaner than having to your if…else if… branch just because you added a new class. And if the new class needs additional functionality in the Worker class, that’s just par for the course — you’d have to add it one way or another.

Okay, I hope that made some kind of sense, as I’m out of time to elaborate further. Off to the garage for a smoke, then off to the grind again.

[Slashdot] [Digg] [Reddit] [del.icio.us] [Facebook] [Technorati] [Google] [StumbleUpon]

Responses

[...] things a-the-moment-of-saving would be to use the Visitor Pattern as described by Matthew Cory in this post.  Though, I would say in this particular example, there is a much more deep-rooted and [...]

I don’t get why you dont simplify this even further, like so:

public abstract class BaseObj
{
public abstract void DoStuff();
}

public class FooObj : BaseObj
{
public override void DoStuff()
{
// Do foo stuff here…
}
}

public class BarObj : BaseObj
{
private FooObj[] fooObjs;
public override void DoStuff()
{
// Do bar stuff here…
}
}

public class Worker
{
private void SomeMethod(Dictionary inputParams)
{
BaseObj obj = MyFactory.GetObject(inputParams);
obj.DoStuff();
}
}

That is a valid way of handling the situation, but in the example that I was giving, the Worker class was trying to work on a particular object, instead of having that object do the work.

What if the Worker object needs a database connection to work with a FooObj, but not with the BarObj? If you tack the database into the FooObj, you might be violating Single Responsibility — if you weren’t, you probably already would’ve had the database connection in it in the first place. If you make a database connection a requirement of the BaseObj class, then you have a BarObj that doesn’t need it — again, an SRP violation.

It generally depends on where the work makes most sense — if the Foo/Bar classes will be working on the data in the Worker class, then your solution is better; if the Worker class is going to be working on the data in the Foo/Bar class, I still claim my solution to be a more extensible design.

I get that, but I still believe for the sake of expansion-friendly class hierarchy that it is easier to have the classes do their own work. It doesn’t make it easier that one has to remember to update the Worker every time there is a new derived class that needs handling.

As for SRP/database scenario, it can be a sign that perhaps the classes shouldn’t derive from the same base class, or possibly delegate the database work to another class where needed.

You’re right; updating the Worker class for each new BaseObj derivative is a bit tedious (and an OCP violation at that). And I completely agree that the classes should do their own work; though this example was rather contrived, the point of it was more that the work needed to be done in this case was being performed by the Worked class and not directly relevant to the functioning of the BaseObj types. A better example probably would’ve helped, though right now I’m running a little late and can’t come up with one.

As a quick (and still quite contrived) example of the SRP/database scenario, think of a Logging class hierarchy. One instance logs to a database, one to a file, and one sends an email. The core logging functionality goes into a base class, while the database, file, and email relevant functionality goes into the derived classes. Granted, you’d probably have other classes in your system that would already maintain database tables or text files or whatnot (especially in a system robust enough to want three different logging methods).

You could still have a single logging class that just uses three different LogStorage derivatives; but that might also add extra complexity to the system, or at least it wouldn’t make it any simpler — instead of a Worker class and three logging classes, you have a Logger class and three LogStorage classes.

I need to run right now, but you’ve got my brain working and I’ll try and come up with a better example — honestly, this was more an excuse to implement the Visitor pattern than fix LSP, as I wrote this at a time when the Visitor pattern was just starting to seem a little more intuitive.

Leave a response

Your response:

Categories