SOLID - The L is for Liskov Substitution Principle
Between a holiday in the States, buying and redecorating a house, and the summer, it has been a long time since my last blog entry. It seems about time for a new one :) Today we'll be dealing with the third principle of SOLID, the Liskov Substitution Principle.
The Liskov Substitution Principle (LSP) was coined by Barbara Liskov as early as 1987. The principle is very tightly connected to the earlier discussed Open Closed Principle. A good way of adhering to the OCP is understanding and implementing code that uses the Liskov Substitution Principle. In this article we will discover why and how.
Barbara Liskov described the Liskov Substitution Principle as follows in 1988:
What is wanted here is something like the following substitution property: If for each object O1 of type S there is an object O2 of type T such that for all programs P defined in terms of T, the behavior of P is unchanged when O1 is substituted for O2 then S is a subtype of T
Wow.
That's a mouthful, so let's replace the letters with something more understandable (note: Raleigh is a UK-based bike manufacturer).
RaleighBike is a subtype of Bike if: "for each object of type RaleighBike there is an object of type Bike and a program that is defined in terms of Bikes does not change its behaviour when Bikes are substituted with RaleighBikes".
Still a mouthful. This is how Robert C. Martin summarized it:
Functions that use pointers or references to base classes must be able to use Objects of derived classes without knowing it.
Finally something that is understandable. And if you've read the previous article on the Open Closed Principle, it should sound somewhat familiar. If you read that article again, you will see that the quote above is almost exactly what we accomplished with our code to output the current stock of a bike:
foreach ($bikes as $Bike) {
    $API = BikeAPIFactory::getBikeAPI($Bike);
    echo $API->getCurrentStock();
}
If you haven't read the OCP article, here is a small recap for you. To avoid numerous switch statements around the different kinds of API calls for different brands of bikes, we decided to give each bike brand its own API with exactly the same function signatures as the others. In the end a factory is used to decide which API should be used for a specific bike object.
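For readers who skipped that article, the recap above might look roughly like the sketch below. The code from the OCP article is not reproduced here, so everything in it is an assumption made for illustration: the minimal Bike class, the RaleighAPI and GazelleAPI names, the brand lookup table and the canned stock numbers.

```php
<?php
// Hypothetical stand-ins: none of these classes are copied from the OCP article.
class Bike
{
    private $brand;

    public function __construct($brand)
    {
        $this->brand = $brand;
    }

    public function getBrand()
    {
        return $this->brand;
    }
}

class RaleighAPI
{
    private $Bike;

    public function __construct(Bike $Bike)
    {
        $this->Bike = $Bike;
    }

    public function getCurrentStock()
    {
        return 12; // canned value instead of a remote call
    }
}

class GazelleAPI
{
    private $Bike;

    public function __construct(Bike $Bike)
    {
        $this->Bike = $Bike;
    }

    public function getCurrentStock()
    {
        return 7; // canned value instead of a remote call
    }
}

class BikeAPIFactory
{
    // Maps a brand name to the API class that serves it (assumed lookup scheme).
    private static $apiClasses = array(
        'Raleigh' => 'RaleighAPI',
        'Gazelle' => 'GazelleAPI',
    );

    public static function getBikeAPI(Bike $Bike)
    {
        $brand = $Bike->getBrand();
        if (!isset(self::$apiClasses[$brand])) {
            throw new InvalidArgumentException("No API registered for brand: $brand");
        }
        $class = self::$apiClasses[$brand];

        return new $class($Bike);
    }
}

$bikes = array(new Bike('Raleigh'), new Bike('Gazelle'));
foreach ($bikes as $Bike) {
    $API = BikeAPIFactory::getBikeAPI($Bike);
    echo $API->getCurrentStock(), "\n";
}
```

The loop never mentions a brand; adding a new one would only mean writing a new API class and registering it in the factory.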
But we have to deal with the trustworthiness of our APIs. We expect them all to implement getCurrentStock, but no one is telling them to. If the APIs are a little like me, they might not implement that method until they're told to ;) We also have no common base to refer to (think of type hinting in function signatures, for instance).
Design against abstractions
APIs are just like us humans: if we don't impose some rules, there will be hell to pay. So let's come up with some rules. We can do this by defining an interface or an abstract class. Since the Bike APIs all have in common that they receive a Bike and store it in their constructor, I feel it would make sense to make an abstract class out of it, but let's show both:
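// Option 1: an interface
interface BikeAPI
{
    public function __construct(Bike $Bike);

    public function getCurrentStock();
}

// Option 2: an abstract class (use one or the other, not both)
abstract class BikeAPI
{
    // protected rather than private, so subclasses can reach the bike
    protected $Bike;

    public function __construct(Bike $Bike)
    {
        $this->Bike = $Bike;
    }

    abstract public function getCurrentStock();
}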
Our APIs can now either extend the abstract class or implement the interface (choose one though!). In both cases we have ensured that each of our API methods is present and has the same input parameters.
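As a rough illustration of what the shared base buys us, here is a minimal sketch using the abstract class variant. The Bike class, the RaleighAPI subclass, the describeStock() function and the canned stock value are hypothetical and only serve the example.

```php
<?php
// Hypothetical minimal Bike; the real class is not shown in this article.
class Bike
{
    private $brand;

    public function __construct($brand)
    {
        $this->brand = $brand;
    }

    public function getBrand()
    {
        return $this->brand;
    }
}

abstract class BikeAPI
{
    // protected so that subclasses can reach the bike they were built for
    protected $Bike;

    public function __construct(Bike $Bike)
    {
        $this->Bike = $Bike;
    }

    abstract public function getCurrentStock();
}

class RaleighAPI extends BikeAPI
{
    public function getCurrentStock()
    {
        return 12; // canned value standing in for a real API call
    }
}

// The client is written against the BikeAPI abstraction only.
function describeStock(BikeAPI $API)
{
    return 'In stock: ' . $API->getCurrentStock();
}

echo describeStock(new RaleighAPI(new Bike('Raleigh'))), "\n";
```

Because describeStock() type hints BikeAPI, PHP rejects any argument that is not a BikeAPI, and a subclass that forgets to implement getCurrentStock cannot be instantiated at all.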
It seems like we are well on our way to adhering to the Liskov Substitution Principle. After all, no code using a RaleighBikeApi object needs to know that it is dealing with a RaleighBikeApi as long as the RaleighBikeApi is a subclass of our parent BikeAPI. Right?
Wrong!
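The signatures of the BikeAPI methods are set now, so the input will not change. With the interface, for instance, there is no way to create an API that does not take a Bike in its constructor. (The abstract class is weaker on this point: PHP does not apply its signature compatibility checks to an overridden constructor unless the parent declares it abstract, so a subclass could still define a different one.)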
But what about return values? There are no rules for return values in either the interface or the abstract class. We could add some comments with return value hints, but they would be exactly that: hints.
Design by contract
When reading up on the LSP you will soon come across Bertrand Meyer's name and the term Design by contract. This is a programming methodology that defines contracts to ensure (among other things) a class's:
- input and return variables
- preconditions
- postconditions
You can see where this touches the LSP. A subclass that changes the return value, for instance, cannot be used in an algorithm as if it were its parent. If we expect getCurrentStock to return an array with info and one subclass starts returning a CurrentStock object (with stock for multiple warehouses, for instance), we are in trouble. We solved this problem for input variables, but we cannot force our APIs to do the same for output variables. This means we need to have conventions in our development team.
This convention would be the LSP. A subclass may not change input or output, it must not strengthen preconditions or weaken postconditions, and it should leave invariants intact. This is something programmers will have to start doing themselves. There is no syntax (at least not in PHP) to enforce it, but it is very important when you want to write code that adheres to the Open Closed Principle.
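Even without contract syntax, a base class can check a postcondition by hand. The sketch below is one possible way to do that, not an established pattern from this series. It assumes the agreed contract is "getCurrentStock returns a non-negative integer"; the fetchStock() hook, the MultiWarehouseAPI class and the stock numbers are hypothetical, and the Bike constructor is left out to keep it short.

```php
<?php
abstract class BikeAPI
{
    // final: subclasses cannot bypass the check below
    final public function getCurrentStock()
    {
        $stock = $this->fetchStock();

        // Postcondition (assumed contract): a non-negative integer.
        if (!is_int($stock) || $stock < 0) {
            throw new UnexpectedValueException(
                get_class($this) . ' broke the getCurrentStock() contract'
            );
        }

        return $stock;
    }

    // Hypothetical hook in which each brand does its own lookup.
    abstract protected function fetchStock();
}

class RaleighAPI extends BikeAPI
{
    protected function fetchStock()
    {
        return 12; // canned value standing in for a real API call
    }
}

class MultiWarehouseAPI extends BikeAPI
{
    // Violates the assumed contract by returning stock per warehouse.
    protected function fetchStock()
    {
        return array('warehouseA' => 4, 'warehouseB' => 3);
    }
}

$raleigh = new RaleighAPI();
echo $raleigh->getCurrentStock(), "\n";

try {
    $multi = new MultiWarehouseAPI();
    $multi->getCurrentStock();
} catch (UnexpectedValueException $e) {
    echo 'Contract violation: ', $e->getMessage(), "\n";
}
```

The check only fires at runtime, so it complements the team convention rather than replacing it. On PHP 7.0 and later, a return type declaration on the method could take over the type part of this check.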
Some rulebreaking
Let's look at a small example where the LSP is violated to see what kind of problems it will cause. Gazelle (a Dutch bike manufacturer) has started offering a new service. They no longer return the total stock for all their warehouses; instead they return the individual stock per warehouse, so customers can estimate delivery times. If a programmer notices this and wants to use it in a single location, we start to get into trouble. If he starts refactoring to incorporate the new information in the return value without paying attention to the fact that he is violating the LSP, the code will start to fail on us:
class GazelleAPI extends BikeAPI
{
    public function getCurrentStock()
    {
        // parse response for stock per warehouse location
        return array(
            'dutchWarehouse' => $dutchWarehouse,
            'germanWarehouse' => $germanWarehouse,
        );
    }
}
When running (hopefully) the unit test suite before committing this change, the developer will soon notice that his change is going to cause trouble. All API calls should return the same kind of information. At a minimum they must all return arrays, but it would probably be better if they returned some sort of stock object. Right now, unit tests fail and users get presented with the word Array instead of the actual stock.
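One way out that keeps existing callers working is to leave getCurrentStock alone and expose the new detail through an extra method. This is only a sketch under assumptions: it takes the shared contract to be a single total, and the getStockPerWarehouse() method and the canned numbers are hypothetical. The Bike constructor is omitted for brevity.

```php
<?php
abstract class BikeAPI
{
    abstract public function getCurrentStock();
}

class GazelleAPI extends BikeAPI
{
    // New, Gazelle-specific information lives in its own method.
    public function getStockPerWarehouse()
    {
        // canned values standing in for the parsed API response
        return array(
            'dutchWarehouse'  => 4,
            'germanWarehouse' => 3,
        );
    }

    // Still returns one total, like every other BikeAPI.
    public function getCurrentStock()
    {
        return array_sum($this->getStockPerWarehouse());
    }
}

$API = new GazelleAPI();
echo $API->getCurrentStock(), "\n";     // existing callers keep working
print_r($API->getStockPerWarehouse()); // the one place that needs the detail
```

Code that only knows about BikeAPI never sees the difference, while the single location that wants per-warehouse numbers asks for them explicitly.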
Conclusion
Validating the LSP automatically is not that easy in PHP. There are languages built around the Design by contract paradigm (Eiffel by Bertrand Meyer for one), but PHP is not. There is a PEAR project that adds functionality for design by contract though.
However, I feel something else is more important. Robert C. Martin writes in his article:
A model, viewed in isolation, can not be meaningfully validated. The validity of a model can only be expressed in terms of its clients.
And that is exactly it. It might seem perfectly valid to change the output of a function, certainly in cases that are less obvious than the one above. But if you don't look at the system, the software using the component you just changed, all kinds of problems may arise. By now it must be clear that adhering to the Open Closed Principle is impossible without the Liskov Substitution Principle.
Comments
-
While PHP allows constructors in interfaces, if proper DbC is something you want to attain, and "to the definition" LSP is something you want to strive for, you shouldn't put any constructors inside your interfaces.
Interfaces are about describing how existing objects behave and react when called from other objects. Since an object can't exist before __construct() is called, it is generally not subject to LSP.
(Case in point: neither Java nor C# allows ctors in interfaces. And I'd say their object model is well thought out.)
In summary, one object should not care how another object was created, only that it can respond to a particular call in a particular way. Additionally, interfaces generally should not talk about object-to-object dependencies; they should only reference other objects if they are a parameter to a method (not related to introducing object dependencies).
-
Hey Ralph,
Thanks for your reaction. What you write sounds very reasonable and I was about to remove the constructor from the interface, but then a question emerged.
If for each child class it should be possible to be used as its parent, does that not count for constructing an object as well? And if there is no contract on what a constructor looks like, there is no guarantee that I can use a child as if it was its parent.
So I am kind of divided on this. On the one hand, what you write makes sense. On the other hand it seems like enforcing a constructor from LSP point of view is what you'd want.
In the end I think that Robert C. Martin's paraphrase clearly talks about pointers or references to objects, meaning an object is either already constructed or doesn't need to be constructed. Since RCM's paraphrase seems to be what most people have in mind when talking about the LSP, I think the constructor could indeed be removed from the interface.
Any afterthoughts?
-
The interface's job is to declare demands (Ideally we could also demand return types, but not for now). Your demands sound like "accept a bike and get the current stock for it". So getCurrentStock should require a Bike be passed in. I see no gains from making demands on the constructor.
"If for each child class it should be possible to be used as its parent, does that not count for constructing an object as well?"
As long as the interface is correct, once it exists it can always be used as its parent. I think generally OOP doesn't consider anything as even existing before the constructor runs. How can you make demands of thin air?
Although I'd see this as an anti-pattern and avoid it, someone could get some value from promising that a particular set of arguments will work for construction. To verify it though you'd have to go into Reflection because you only have the string name of the class. And if you must use reflection, why not look at the constructor arguments directly?