SOLID - the O is for Open Closed Principle
In the second installment of this series we will be looking at the Open Closed Principle (OCP), responsible for the O in SOLID. This principle is extremely important if you accept the premise that all software systems change during their lifetime. If you want to design systems that need to last beyond their very first versions adhering to the Open Closed Principle is paramount.
What?
In his article on the OCP, Robert C. Martin paraphrases the words of Bertrand Meyer:
Software entities (classes, modules, functions, etc) should be open for extension, but closed for modification.
At first that may sound as a contradiction, how can you design software that should be able to change in behaviour without modifying it? The keyword here is extension.
Software that requires an enormous amount of changes to implement one new feature or fix a bug is unstable and should be considered as "bad". Software should be designed so, that in case of a new feature, no existing classes should have to change. In other words: it is closed for modification. Existing software may be extended to achieve new features however.
Just as in the previous article this may all sound very abstract so lets dive directly into the code again.
Why?
Let's assume we're connected to several webservices of Bike manufacturers. We're interested in requesting the current stock for a Bike with those manufacturers. They all have a unique API and different methods of course. A typical solution to extract the information from these webservices is by creating an abstraction for each of those API's:
class GazelleAPI { public function getCurrentStock($bikename) { //contents } } class RaleighAPI { public function getCurrentStock($serialNumber) { //contents } }
Typically when these kind of structures exist, what you will encounter is code like below:
if ($API instanceof GazelleAPI) { echo $API->getCurrentStock($Bike->getName()); } else if ($API instanceof RaleighAPI) { echo $API->getCurrentStock($Bike->getSerialNumber()); }
But what if a third manufacturer API is introduced? We can not keep our existing code closed for modification, we have to adapt the structure above. An API method is bound to be called at more than one location in your code as well, so the same structure will be found in other places around our code. If you're out of luck you may have to change code at numerous locations.
Also, the getCurrentStock function is probably not the only function these webservices will offer (what about getCurrentPrice for instance). As such, if there are multiple functions called multiple times inside your code, a change to a relatively small API can prove to be hours of work.
So what would be a better solution? What if we load each API function with a Bike object instead of a value of a Bike? Each method will have the same signature, so calling code doesn't need to know about the implementation of the getCurrentStock. If a third API uses the bike name and the serial number there will be no problems:
class GazelleAPI { private $Bike; public function __construct(Bike $Bike) { $this->Bike = $Bike; } public function getCurrentStock() { //use $this->Bike->getName() here } } class RaleighAPI { private $Bike; public function __construct(Bike $Bike) { $this->Bike = $Bike; } public function getCurrentStock() { //use $this->Bike->getSerialNumber() here } } class ThirdAPI { private $Bike; public function __construct(Bike $Bike) { $this->Bike = $Bike; } public function getCurrentStock() { //use $this->Bike->getName() as well as $this->Bike->getSerialNumber() here } }
Instead of elaborate if/else statements everywhere you can just call getCurrentStock and let the specific API implementation deal with your problems internally. The current solution is closed for modification (a new API will not result in code changes in existing code) but open for extension (a new API can easily be added).
You probably still want do some things to improve this solution, I'll save that for the next article however which deals with some of the problems this solution triggers.
How to get the API
You may be thinking "where do I get that API from" right now. If I have a Raleigh Bike, I need the Raleigh API and so on. In the worst case you will still need switch statements all over the place.
One way of solving this is creating a Factory for your API's. You could end up with a mess inside it, but the mess is contained. There is only on location that you need to update in case of a new API, the APIFactory. Suppose the Bike class knows what brand it is of:
class BikeAPIFactory { public static function getBikeAPI(Bike $Bike) { switch ($Bike->getBrand()) { case 'gazelle' return new GazelleAPI($Bike); case 'raleigh' return new RaleighAPI($Bike); //etc } } }
If Bike is just a superclass of several Bikes like RaleighBike and GazelleBike (i.e. they extend Bike) the same code could look like this:
class BikeAPIFactory { public static function getBikeAPI(Bike $Bike) { if ($Bike instanceof GazelleBike) { return new GazelleAPI($Bike); } else if ($Bike instanceof RaleighBike) { return new RaleighAPI($Bike); } //etc } }
Obviously for each API this switch/if-else will grow, possibly with more complex decision paths, but it is in one place, and in one place only. Robert C. Martin calls this the "one switch" rule in his book Clean Code:
There may be no more than one switch statement for a given type of selection. The cases in that switch statement must create polymorphic objects that take the place of such switch statements in the rest of the system.
Bringing it all together
So lets look at how iterating over a group of Bike's and outputting some API info could look with this setup:
foreach ($bikes as $Bike) { $API = BikeAPIFactory::getBikeAPI($Bike); echo $API->getCurrentStock(); }
If an API is added, this code doesn't need to change. If a whole new Bike manufacturer comes into play, this code stays the same. There is an awful lot to improve still, just think about what kind of info each API will return. This will probably differ and offer us a whole new set of trouble. But in regard to which API is selected, and what parameters its method need, this code is closed for modification, but very much open to extension
Taking it even further
Looking back at the Bike class from the previous article there is a problem that prevents us from closing the Bike from other code as well:
class Bike { public $wheels; public function __construct($wheels) { $this->wheels = $wheels; } }
The wheels property is public and thus can be used from everywhere. This means that changing the wheels property means changing every location that property is used. In other words, the code using the wheels property can not be closed regarding that property.
Lets assume at one point the amount of properties that you have gathered concerning wheels consists of profile type, wheel size, rim size and amount of spokes (I know, I know, the last one is pushing it :D ). Because you don't want to add each of those properties for each of the Bike's wheels, you decide that wheels will no longer be an integer. The new wheels property will hold an array of wheel objects. Each location where the amount of wheels is read directly from the wheel property has to be changed.
Had the bikes code been like this, you would only have to change the implementation of the function getWheelCount:
class Bike { private $wheels; public function __construct($wheels) { $this->wheels = $wheels; } //old implementation public function getWheelCount() { return $this->wheels; } //new implementation public function getWheelCount() { return count($this->wheels); } }
Conclusion
As we have seen, the OCP is incredibly important to writing code that is maintainable in the long run. Although our editors have great search and replace functions, even the smallest reafactoring action in a large codebase takes attention. With the OCP in your hart and mind, you will only fix bugs in existing code. Everything else belongs in new code.
Comments
-
The if statement in BikeAPIFactory is still a "logic bottleneck" and does ad-hoc polymorphism. Object oriented programming attempts to remove all such ad-hoc decision making. Every if and every switch should be viewed as a lost opportunity for dynamic polymorphism.
Solution can be found here http://www.csis.pace.edu/~bergin/patterns/ppoop.html
But I like your tip about wheels! :)
-
Ey Skirmantas,
First of all, thanks for the link. I've read and enjoyed it. In a way it goes a long way to what I describe above, only after the last example they offer a way to remove even the last switch statement.
For other readers, I will now talk on stuff written in this article. I am not sure I am 100% convinced by the solution that is presented there but here's my two cents on it:
Basically they replace the switch statement with a map (which would probably become an associative array in PHP). The decision path (when do we select which OS in the example) isn't walked ad-hoc but "pre-defined". The map is initialized and after that simply knows what to return for which key.
This is probably faster than the switch statement to begin with, but that aside I can see another pro for the map solution. It allows for less cheating. Once a switch statement is in place, it is easy to add a new switch inside of one of the cases of the existing switch statement. If you feel the urge to do this you're probably cheating. Your polymorphic model shouldn't need a new level of decisions after all. The map doesn't allow sub-levels at all, so we're forced to adapt the system.
I first tried to disprove the theory by reasoning that it is bad for the objects to register values in a centralized map. I could only partially do that. On the one hand, it holds the risk of overwriting previous mappings. The state of the map is defined externally and can therefor not be guaranteed. On the other hand, even the decision path is now closed for modification which is great.
Having said that, I am not convinced. In the example's case it makes sense to have a Box define its own response. But I don't think I would like a Bike to announce it's own API. We specifically split up Bike's and its API's. By linking them together again we are taking the risk that a change in the API structure involves changing Bike classes, which what we specifically don't want.
As a conclusion, I still think that having one switch that has decision making for linking a Bike and its API as sole responsibility is fine. But I also see possibilities for the map solution provided in the article. I think it depends on the problem at hand.
I am open for further discussion however :)
-
Nice OCP article. Still, I can't agree that usage of factory is the best option here. Mostly because it hardcodes specific API implementation. What if you want to do unit tests? What if you need to use development API key/service methods instead of your production Gazella API key? In both cases you really need some kind of external dependency configuration glued with your environments (production, testing, development, ...).
-
Yeah, so how would you propose to enforce return values upon the PHP code? Other languages do have something like this
( "[...] int function() {[...]}" comes to mind, but PHP doesn't allow this.. -
Cool article about Open closed principle and php: http://www.jacopobeschi.com/post/solid-design-principles-and-php-open-closed
-
The if-else statement could not be completely removed. If it is not used in BikeFactory, it will be used in other class or segment. But open-close principle tells us to use if-else in a more appropriate way so as to maintain the complexity of the program.
In that sense, S.O.L.I.D has to be considered in a whole, not just focusing on a just one of them to make our program less complex and maintainable