Incremental (non-) Design

Fighting Entropy like Sisyphus!

paul.winkler@percolate.com

revised for PyGotham 2014

Hi. I'm Paul Winkler and I'm here to talk about object-oriented software design.

I work here in NY at Percolate. There are a bunch of us here today, and yes, we are hiring. We're a PyGotham gold sponsor as well! We have a booth in the food area; come talk to us if you're looking for work or just curious what we do.

Intro: What is this talk?

There are some fairly beginner-level OO design tips here; there's also some waxing philosophical, so I hope the more experienced people won't be bored.

I'm going to talk about 3 things ... (slide)

Some of you know all of this. Some of you know more than me about all of this. Some of you may disagree with me about this. Some of you may know none of this.

First, to get a sense of who's here today - show of hands:

Who here would call themselves pretty experienced at OO design?

Who here is just starting at OO programming or design?

Out of curiosity, is there anybody here who came to Python after starting out in languages that aren't, or at least aren't typically, programmed using classes and inheritance (e.g. JavaScript, PHP)?

Anybody who learned Python specifically so they could do Django or Flask?

(The interesting thing about that for my purposes is that I've noticed it's possible to do quite a bit of productive work in Django without ever doing much OO design. Which is actually pretty cool. But it might leave you wondering what all this OO design fuss is about.)

Background: Emergent Design

Agile mantra:

  1. Build the simplest thing possible
  2. Refactor
  3. Repeat

In my career: Most software I've worked on is developed using some agile approach and deployed on the web frequently. There are no big releases, just a constant stream of improvements.

Q: How many people work in similar situations?

In this world, it's taken as gospel truth that the old big-design-up-front approaches are bad because you don't know enough up front to predict what you need to design, so you waste time and you design the wrong thing. I believe this is largely true, though I reject the extreme of thinking you should start without doing any preliminary design work at all.

Instead we have embraced as a best practice the mantra: refactor constantly. Do the simplest thing and always be improving your design and all will be well.

Thesis: Emergent design doesn't always emerge

Result: Big ball of mud!

(source: CBS via Baltimore Sun)

Unfortunately, I'm going to argue, we often do a poor job of this. And a good design fails to emerge. Instead, we often get the proverbial "Big Ball of Mud" design. And this happens not through any one bad decision but through a series of decisions that in isolation make good sense, but taken together add up to an overly complex software design.

And I'm going to say up front that there's no grand solution here. The solution is vigilance and being aware of the pitfalls.

"We must imagine Sisyphus happy"

"The struggle itself ... is enough to fill a man's heart". -- Camus

Hence, Sisyphus. We are never going to be done pushing the design rock up the hill. Or the kitten up the slide. Eternal vigilance is the price of, not just liberty, but also agile design.

Disclaimer: I have not read Camus. I can use the google.

Try to enjoy it!

from http://existentialcomics.com/comic/29

If that doesn't appeal to you, you might be in the wrong line of work... or need an attitude change. Savor the little victories. Always be learning.

How do things get worse?

For today, focusing on overuse of inheritance.

This talk could go on forever so I'm picking on my favorite target. Inheritance. Or more specifically, overuse of inheritance for things that can be done more flexibly and more simply in other ways.

Why do we over-use inheritance?

Bad defaults:

Hard to untangle.

Things we do by default as we incrementally improve a system. These are all often highly expedient and often make things worse.

OO 101: Over-inheritance falls out of any language with inheritance.

Easiest path to D.R.Y.: Add more base classes!

Alternatives may not be as intuitive or obvious.

We continue to overuse inheritance because it's a path of very low resistance. And once we have an existing system that uses inheritance, it's very difficult - perhaps prohibitively so - to stop doing that. Once you pop, you can't stop!

Zope 2 in a nutshell:

Confession: Hi, my name is Paul, and I'm a recovering Zope 2 programmer.

Perhaps this makes me overly sensitive?

Zope, for the young folks in the audience, was a web development framework that was very big in the Python world around 10-15 years ago. Internally it used multiple inheritance very very heavily.

Here's part of the inheritance tree of the ironically named SimpleItem. Nearly everything you did in Zope 2 involved inheriting from this class.

Easy things were usually easy. The hard things it made convenient were easy. Anything else was rough going.

https://twitter.com/slinkp23/status/382568693466935296

So, people with my history are typically very suspicious of big inheritance graphs. Not coincidentally, the guy that replied to this tweet of mine is also a recovering Zope 2 programmer.

Why is too much inheritance bad?

I'm going to show a simple contrived example, and a real-world example of the kinds of problems I'm talking about.

I'm going to show you why they're problems, and show you 3 or 4 common symptoms of overuse.

And what should we do instead?

I'm going to show you an alternative you may have heard of. How many people have heard the phrase "Favor composition over inheritance"? How many have not?

I'm going to briefly walk you through actually doing it.

Contrived Example: Requirements

shark_with_lasers.attack(target)

Your client just wants a freakin' shark with lasers.

Quick and Easy...

class SharkWithLasers(Shark, LaserMixin):

    def attack(self, target):
        self.shoot(target)
        self.eat(target)

Problem solved! Go home.

This is easy, right?

New Requirement

But now we want an orca with nunchaku.

Factor out commonalities into more base classes...

Another requirement!

Uh-oh.

Symptom 1: Class explosion.

Every concept we add makes more and more classes.
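To get a feel for how fast this grows, here's a rough sketch that just counts the combinations. The species, weapon, and armor lists are made-up assumptions for illustration, not classes from any real codebase.

```python
from itertools import product

# Hypothetical feature axes; each new requirement adds an entry to one of them.
species = ["Shark", "Orca"]
weapons = ["Lasers", "Nunchucks"]
armor = ["", "Armored"]

# With pure inheritance, every combination needs its own named subclass.
class_names = [
    "%s%sWith%s" % (a, s, w) for s, w, a in product(species, weapons, armor)
]

for name in class_names:
    print(name)
print("%d concrete classes, plus base classes and mixins" % len(class_names))
```

Adding one more species or one more weapon multiplies the count rather than adding to it.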

But even if we stop here forever, it's already bad, because...

Symptom 2: Yo-yo problem

https://en.wikipedia.org/wiki/Yo-yo_problem

"Often we get the feeling of riding a yoyo when we try to understand one [of] these message trees." -- Taenzer, Ganti, and Podar, 1989

With inheritance, when you see a method being called, and you want to understand what's going on, you have to mentally envision the inheritance graph and figure out which class defines the version that's actually getting called.

Since subclasses can call methods defined in superclasses, and superclasses can also call methods that are overridden or even only defined in subclasses, you have to go hunting by bouncing up and down through the inheritance tree looking for these method definitions.

Your development tools like IDEs and language servers can do the grunt work for you of course, but that doesn't mean you can _understand_ the design, or that it's any good.

Understanding state - instance state, typically via attribute assignments - is even worse, because it can change on literally any line.

Multiple inheritance makes it even more fun - it's not like being a yo-yo, it's like being a pinball and bouncing all over the place. You have to reconstruct Python's method resolution order in your head, or find a tool to do it for you.
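If you'd rather not reconstruct it in your head, Python will tell you. Here's a small sketch using the `__mro__` attribute that every class has. The Animal base class and the `defining_class` helper are assumptions I'm making up for the example.

```python
class Animal(object):
    def attack(self, target):
        # The superclass calls a method that only subclasses define.
        return self.eat(target)

class Shark(Animal):
    def eat(self, target):
        return "chomp! delicious %s" % target

class LaserMixin(object):
    def shoot(self, target):
        return "pew! pew! at %s" % target

class SharkWithLasers(Shark, LaserMixin):
    pass

# Python exposes the method resolution order on every class.
mro_names = [cls.__name__ for cls in SharkWithLasers.__mro__]
print(mro_names)

def defining_class(cls, method_name):
    """Return the name of the first class in the MRO that defines method_name."""
    for klass in cls.__mro__:
        if method_name in vars(klass):
            return klass.__name__
    return None

for method in ("attack", "eat", "shoot"):
    print("%s is defined in %s" % (method, defining_class(SharkWithLasers, method)))
```

Three methods, three different classes, and this is still the tiny version of the tree.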

Yo-yo problem larval stage

class SharkWithLasers(Shark, LaserMixin):

    def attack(self, target):
        self.shoot(target)
        self.eat(target)

Where are shoot() and eat() defined?

It starts innocuously enough...

Easy to guess in that example.

class Shark(object):
    def eat(self, target):
        print("chomp! delicious %s" % target)

class LaserMixin(object):
    def shoot(self, target):
        print("pew! pew! at %s" % target)

Not so much when there are dozens of classes.

Who is "self"?

Put another way: It's interesting to ask yourself in each method definition, what kind of object do I mean when I say "self"?

You don't know whether it currently means a shark, or a base Animal, or a thing with lasers, or a base Weapon, or a thing with armor. You have to look all over, with only the names to give you clues.

Symptom 3: Poor Separation of Concerns

ArmoredSharkWithLasers will have methods related to sharks, lasers, and armor.

Those are not conceptually related at all.

More classes + more methods = more yo-yo
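One quick way to see the mixed concerns is to list what ends up on the final class. This is a sketch with stand-in classes; the method bodies are assumptions and only there so the example runs.

```python
class Shark(object):
    def eat(self, target):
        return "chomp! delicious %s" % target

class LaserMixin(object):
    def shoot(self, target):
        return "pew! pew! at %s" % target

class ArmorMixin(object):
    def receive_hit(self, damage):
        return "clang! absorbed %d" % damage

class ArmoredSharkWithLasers(ArmorMixin, LaserMixin, Shark):
    pass

# Every public method from every ancestor ends up on the one class.
public_methods = sorted(
    name for name in dir(ArmoredSharkWithLasers) if not name.startswith("_")
)
print(public_methods)
```

Eating, shooting, and taking damage all live in one namespace, and all of them share the same "self".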

"Favor Composition Over Inheritance"

"Has-a" or "Uses-a" relationships, instead of "Is-a".

Underlying principle in "Design Patterns" (aka "Gang of Four" book)

Now we get back to this phrase we mentioned before.

Composition: Usually Better

class Shark(object):
    def __init__(self, weapon):
        self.weapon = weapon

    def eat(self, target):
        print("chomp! delicious %s" % target)

    def attack(self, target):
        self.weapon.attack(target)
        self.eat(target)

shark_with_laser = Shark(weapon=Laser())

Better: Fewer Classes, Simpler Tree

Better: Separation of Concerns

Better: Less Yo-yo Problem

def attack(self, target):
    self.weapon.attack(target)
    #    ^^^^^^  A clue!
    self.eat(target)
    # Still have to look, but the tree is smaller.

  • If needed, one-line wrapper methods can be added to Shark or a subclass, and these are nice and explicit internally. (Be mindful of the "Law of Demeter".)
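A wrapper like that might look roughly like this. The Laser class and the `shoot` name on Shark are assumptions for the sketch; the point is only that callers talk to the shark and the shark talks to its weapon.

```python
class Laser(object):
    def attack(self, target):
        return "pew! pew! at %s" % target

class Shark(object):
    def __init__(self, weapon):
        self.weapon = weapon

    def eat(self, target):
        return "chomp! delicious %s" % target

    def shoot(self, target):
        # One-line wrapper: callers say shark.shoot(x) instead of
        # reaching through to shark.weapon.attack(x).
        return self.weapon.attack(target)

shark = Shark(weapon=Laser())
print(shark.shoot("surfer"))
```

When you read `shoot`, the `self.weapon` tells you exactly where to look next.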

Better: More flexible too

These would have been hard to do without special case hacks and/or yet more classes:

mystery_shark = Shark(
    weapon=get_random_weapon())

armed_to_the_teeth = Shark(
    weapon=WeaponCollection(Lasers(), Grenades()))
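Here's one way those two helpers could be filled in. Everything here is a hypothetical sketch: the weapon classes, the list-of-strings return values, and the optional `rng` parameter are my assumptions, chosen so the example is runnable.

```python
import random

class Lasers(object):
    def attack(self, target):
        return ["pew! pew! at %s" % target]

class Grenades(object):
    def attack(self, target):
        return ["boom! near %s" % target]

class WeaponCollection(object):
    """Looks like a single weapon, but fans out to several."""
    def __init__(self, *weapons):
        self.weapons = weapons

    def attack(self, target):
        results = []
        for weapon in self.weapons:
            results.extend(weapon.attack(target))
        return results

def get_random_weapon(rng=random):
    # Pick a weapon class at random and instantiate it.
    return rng.choice([Lasers, Grenades])()

class Shark(object):
    def __init__(self, weapon):
        self.weapon = weapon

    def attack(self, target):
        return self.weapon.attack(target) + ["chomp! delicious %s" % target]

mystery_shark = Shark(weapon=get_random_weapon())
armed_to_the_teeth = Shark(weapon=WeaponCollection(Lasers(), Grenades()))
print(armed_to_the_teeth.attack("surfer"))
```

Shark never changes: anything with an `attack(target)` method can be plugged in, including a collection of other weapons.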

But that's all contrived!

Yes, it's a bad made-up design that nobody would ever do.

(right?)

A real-world story

One day I was working on some REST API endpoints at my job. I had these requirements.

Started with...

Names of classes changed to protect the innocent. But this was generated from real code from a real production system.

Solution for adding a new Salmon view...

An existing inheritance hierarchy tends to encourage more inheritance, because it's easier than puzzling out how to do without it. This is what I meant by "once you pop, you can't stop."

Here I factored out methods I needed to re-use into two new base classes.

Better approach!

Let's walk through refactoring SharkWithArmor!

Shark with Armor: Bad

class Shark(Animal):

    def receive_hit(self, damage):
        self.health -= damage
        if self.health <= 0:
            self.die()

class ArmorMixin(object):

    def receive_hit(self, damage):
        self.armor_health -= damage
        if self.armor_health < 0:
            super(ArmorMixin, self).receive_hit(-self.armor_health)
            self.armor_health = 0

class SharkWithArmor(ArmorMixin, Shark):
    pass

One nice thing about this design: the Shark class knows nothing about armor. All you have to do is put the base classes of SharkWithArmor in the right order, and receive_hit() will do the right thing.

One not-so-nice thing: ArmorMixin depends on super().receive_hit() but has no base class that provides it. It implicitly must be mixed into something that provides that method, and nothing in the code documents that.
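To make both points concrete, here's a sketch of what happens when the order is wrong or the mixin is used on its own. The starting health values and the BackwardsShark and JustArmor classes are assumptions added for the demonstration.

```python
class Shark(object):
    def __init__(self):
        self.health = 10  # assumed starting value

    def receive_hit(self, damage):
        self.health -= damage

class ArmorMixin(object):
    armor_health = 5  # assumed starting value, for illustration only

    def receive_hit(self, damage):
        self.armor_health -= damage
        if self.armor_health < 0:
            super(ArmorMixin, self).receive_hit(-self.armor_health)
            self.armor_health = 0

class SharkWithArmor(ArmorMixin, Shark):
    pass  # correct order: armor is hit first

class BackwardsShark(Shark, ArmorMixin):
    pass  # wrong order: Shark.receive_hit wins, the armor is never used

class JustArmor(ArmorMixin):
    pass  # nothing after the mixin in the MRO provides receive_hit

good = SharkWithArmor()
good.receive_hit(7)      # armor absorbs 5, shark takes 2

bad = BackwardsShark()
bad.receive_hit(7)       # shark takes all 7, silently

try:
    JustArmor().receive_hit(7)
    error = None
except AttributeError as exc:
    error = exc          # the hidden dependency only shows up at call time

print(good.health, bad.health, error)
```

Note that the wrong-order case doesn't fail at all; it just quietly ignores the armor.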

Better Armor: Proxy object

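class Armored(object):
    def __init__(self, wearer, armor_health=10):
        self.wearer = wearer
        # Set this here; otherwise __getattr__ would look it up on the wearer.
        # (The default of 10 is an arbitrary value for illustration.)
        self.armor_health = armor_health

    def receive_hit(self, damage):
        self.armor_health -= damage
        if self.armor_health < 0:
            # Pass whatever the armor could not absorb on to the wearer.
            self.wearer.receive_hit(-self.armor_health)
            self.armor_health = 0

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        # Or explicitly proxy all others if desired.
        return getattr(self.wearer, name)

shark_with_armor = Armored(wearer=Shark())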

This might look a little backwards at first. The armor has the wearer, rather than the wearer having the armor.

This is so we can maintain the nice property we had before, where the Shark class doesn't have to know about armor. Nothing knows about the armor except the armor itself... and the invocation that constructs it.
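Here's a self-contained sketch of the proxy in action. The minimal Shark class and the health numbers are assumptions so the example can run on its own.

```python
class Shark(object):
    def __init__(self, health=10):
        self.health = health

    def receive_hit(self, damage):
        self.health -= damage

    def eat(self, target):
        return "chomp! delicious %s" % target

class Armored(object):
    def __init__(self, wearer, armor_health=5):
        self.wearer = wearer
        self.armor_health = armor_health

    def receive_hit(self, damage):
        self.armor_health -= damage
        if self.armor_health < 0:
            self.wearer.receive_hit(-self.armor_health)
            self.armor_health = 0

    def __getattr__(self, name):
        # Called only when normal lookup fails, so the armor's own
        # attributes win and everything else goes to the wearer.
        return getattr(self.wearer, name)

shark_with_armor = Armored(wearer=Shark())
shark_with_armor.receive_hit(3)   # armor: 5 -> 2, shark untouched
health_after_first = shark_with_armor.health
shark_with_armor.receive_hit(4)   # armor: 2 -> 0, shark takes the extra 2
print(shark_with_armor.health, shark_with_armor.armor_health)
print(shark_with_armor.eat("seal"))  # proxied straight through to the shark
```

The same Armored wrapper would work on an orca, or on anything else with a `receive_hit` method.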

Better Laser: Delegation

Shark has and uses laser, rather than is laser.

class Shark(object):

    ...

    def attack(self, target):
        self.weapon.attack(target)
        self.eat(target)

shark_with_laser = Shark(weapon=Laser())

How do we get here?

Earlier we suggested that this was a better design for sharks with lasers. How do we get from the inheritance-based code to this delegation-based code, when there's a huge pile of other classes in the tree and we want to do it gradually?

Example refactoring of Sharks/Orcas/Nunchucks/Lasers:

https://github.com/slinkp/inheritance_talk_examples

Important: Tests before refactoring!

You need solid test coverage. If you don't have it, do that first. This is mandatory.

Filling out your test suite and getting decent coverage is more important to the success of your project than redoing your design. You could add tests and never redo the design and you'd be a hell of a lot better off than when you started.

The sample repo starts and ends with 100% line coverage.
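As a rough idea of what "tests first" can look like, here's a sketch of a test that pins down behavior rather than class structure. It is not taken from the sample repo; the classes, the return-a-list convention, and the `make_shark_with_lasers` factory are assumptions for illustration.

```python
import unittest

class LaserMixin(object):
    def shoot(self, target):
        return "pew! pew! at %s" % target

class Shark(object):
    def eat(self, target):
        return "chomp! delicious %s" % target

class SharkWithLasers(Shark, LaserMixin):
    def attack(self, target):
        return [self.shoot(target), self.eat(target)]

def make_shark_with_lasers():
    # The one place the tests build the object; after refactoring this
    # could return a composed object without touching the tests.
    return SharkWithLasers()

class AttackBehaviorTest(unittest.TestCase):
    def test_attack_shoots_then_eats(self):
        shark = make_shark_with_lasers()
        self.assertEqual(
            shark.attack("surfer"),
            ["pew! pew! at surfer", "chomp! delicious surfer"],
        )

# Run the test case programmatically so the example works as a plain script.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(AttackBehaviorTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Because the test only asks what `attack` does, it should keep passing, unchanged, while the inheritance underneath is replaced.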

END

Questions?

Appendix: References / Inspiration

Image Credits

Cats-on-a-slide gif: found at http://thisconjecture.com/2014/02/15/the-myth-of-sisyphus-a-touch-of-silly-and-a-great-animation-of-the-story/; original provenance unclear.

Orca designed by Sarah-Jean from the Noun Project

Nunchucks designed by Simon Henrotte (public domain)

Armor from http://infothread.org/Weapons+and+Military/Armor-Uniform-Insignia/

Car in mud from http://www.motoringexposure.com/20228/friday-fail-soccer-players-get-stuck-mud

Tools used for this talk

Appendix 1: Mixins usually suck

Question for audience: does everybody know what a mixin is in Python?

(If not: A mixin is a class designed not to be used by itself, but to be inherited from in order to add some behavior to your class. Get more behavior by inheriting from more mixins. In some languages, e.g. Ruby, this means something a bit more formal, but in Python it's just an informal idea: here's a class you can inherit from if you want its behavior.)
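For anyone who hasn't seen one, here's about the smallest mixin I can think of. ReprMixin is a made-up example, not from any particular library.

```python
class ReprMixin(object):
    """Adds a readable repr() built from the instance's attributes."""
    def __repr__(self):
        attrs = ", ".join(
            "%s=%r" % (key, value) for key, value in sorted(vars(self).items())
        )
        return "%s(%s)" % (type(self).__name__, attrs)

class Shark(ReprMixin):
    def __init__(self, name, health):
        self.name = name
        self.health = health

print(repr(Shark("Bruce", 10)))
```

It isn't useful on its own, it adds exactly one behavior, and it makes no demands on the class it's mixed into.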

Mixins are good...

BUT mixins are bad...

... not always bad

Some characteristics of nice mixins:

Appendix 2: "Template Method" Pattern Sucks

Symptom: Reuse is tied very tightly to the inheritance tree and is very hard to refactor away from that tree.

Symptom: As that tree grows, you don't have a yo-yo problem anymore, you have a pinball problem - bouncing all over the place.

Good use of Template Method

Simple example that does not suck: unittest.TestCase. Its setUp() and tearDown() methods are meant to be overridden.

Good because:

So Template Method is certainly not inherently bad; used well, it's useful and good.

Smells

Some code smells to watch out for: