Don't Play Piñata With Your Code

Mexicans have a long-standing tradition of playing a game that involves piñatas during celebrations. Many of you are familiar with piñatas. For those of you who are not, let me explain. A piñata is an object traditionally made of cardboard and shaped to resemble a cartoon character or a real-life object. They range in size from a regular soccer ball all the way to the size of a young child. They are decorated with papier-mâché on the outside and have an empty pocket on the inside that can be filled with candy. The piñata is then hung by a rope, which is usually controlled by a person, while blindfolded kids take turns, one at a time, swinging at it with a stick. It is fun for the kids and funny for those watching because most swings hit nothing but air. Eventually a kid will hit it and bust it open. Then all the kids swarm to get the candy that has poured out of the piñata.




At this point you may wonder what on earth this has to do with software engineering. I was trying to think of an analogy for debugging and troubleshooting by taking shots in the dark, wild guesses if you will. Swinging blindfolded at a moving object makes a great party event, but it is not such a good approach for troubleshooting software bugs.

There are many ways to troubleshoot applications, and the piñata approach should not be one of them, yet I've noticed it as a recurring anti-pattern. You may have experienced this yourself. You have a weird bug. You take a guess, you try something, and nope, you are wrong. You have another suspicion, you try something else, and you are wrong again. You do this repeatedly for a few hours, or sometimes days, and arrive at nothing. You are more baffled than when you started. You have done little to narrow the problem down and you have made zero progress.

In a perfect world everything would be caught with unit tests. You would identify the bug, write a failing test, and fix it. However, even in a fully testable codebase there are some issues that tests just won't reveal. Race conditions, asynchronous bugs, and works-on-my-box bugs are but a few examples. And even when that's not the case, if you happen to work in legacy code that has few or no unit tests, you can face a similar problem: not knowing where to begin or how to narrow down your problem area.

Most problems can be narrowed down by slicing the problem space in half, again and again, until you figure out the cause. Sometimes, though, you already know where the problem occurs but don't have a clue why it's happening. Another way to narrow down the problem is to add logging to the system. Sometimes that's not possible because you don't have permission to do so in the environment where the problem surfaces. So what can you do?
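To make the "slice it in half" idea concrete, here is a minimal sketch of a bisection search. Everything in it is an assumption for illustration: `changes` is a hypothetical ordered list of suspects (commits, config edits, pipeline stages), and `is_broken` is a check you would write yourself. It also assumes that once the bug appears it stays present for every later change, and that the bug is present with all changes applied.

```python
from typing import Callable, Sequence


def first_bad_index(changes: Sequence[str], is_broken: Callable[[int], bool]) -> int:
    """Return the index of the first change that introduces the bug.

    is_broken(n) reports whether the bug shows up with changes[0..n] applied.
    Assumes the list is non-empty, the bug is present with every change
    applied, and the bug never disappears once it has been introduced.
    """
    lo, hi = 0, len(changes) - 1  # invariant: the first bad index is within [lo, hi]
    while lo < hi:
        mid = (lo + hi) // 2
        if is_broken(mid):
            hi = mid        # the culprit is at mid or earlier
        else:
            lo = mid + 1    # the culprit is strictly after mid
    return lo


# Hypothetical example: eight changes, and the bug first appears with change "f".
changes = ["a", "b", "c", "d", "e", "f", "g", "h"]
probes = []  # records which positions were actually tested


def is_broken(n: int) -> bool:
    probes.append(n)
    return n >= changes.index("f")


culprit = changes[first_bad_index(changes, is_broken)]
```

Each probe discards half of the remaining suspects, so eight candidates need only three checks instead of up to eight guesses. That is the difference between a search and a swing in the dark.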

Before we get to the solution I want to explain something. Effective troubleshooting is second nature to a good mechanic. A good mechanic knows exactly how to divide the problem at each step until he narrows it down to the actual root cause. In a way, developers are a lot like mechanics because we need good troubleshooting skills to get past our daily roadblocks.



As developers we have a big disadvantage when compared to mechanics. The mechanic has the advantage that all automobiles are very similar. All automobiles work with the same types of systems: brakes, suspension, steering, electrical, cooling, and so on. Sure, some automobiles have slight variations of those systems, and some of those systems may have a little customization, but for the most part all cars and their components are very similar. Parts such as brake drums, fuel pumps, cylinders, and sensors are common across makes. All these components and how they work are very well documented. In software, things are quite different.

In fact, many of the components in software are the sole invention of one in-house programmer or another. Sure, they use some common building blocks and some common architectures, but in the middle, where all the gooey domain details reside, things get very different and highly customized. Overall, each software application is very different from one team or organization to the next. The design reveals a thing-a-majig-doer here and a watch-a-macallit-manager over there. This is a problem that could have a post or even a book of its own. That's not my intent with this post.

Rather, my intent is to provide a little advice on how to avoid playing piñata while troubleshooting your code. See, when mechanics get stuck or want to understand a component better, they go to their manuals for answers. And mechanics have lots of manuals, from general guides on automobile systems to repair manuals for specific makes and models. Our code, on the other hand, rarely has technical manuals. And for good reason: it would likely take at least as much effort to write the manual as to write the code. To make up for that shortcoming, we can closely inspect the code. However, it is difficult to keep all that information in our heads as we inspect the code. Instead, we can take high-level snapshots of the software design by transferring the code to visual diagrams.
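Pencil and paper work fine for this, but part of the snapshot can also be generated. The following is only a sketch under stated assumptions: it uses Python's standard `ast` module to pull class inheritance out of a piece of source code and writes it as Graphviz DOT text, which a tool like Graphviz can render as a picture. The class names in `SAMPLE_SOURCE` are made up for illustration, and the sketch only handles simple base-class names.

```python
import ast

# Hypothetical source code standing in for the module you are trying to understand.
SAMPLE_SOURCE = '''
class Manager:
    pass

class WhatchamacallitManager(Manager):
    pass

class ThingamajigDoer:
    pass
'''


def inheritance_edges(source: str) -> list[tuple[str, str]]:
    """Return sorted (subclass, base class) pairs for the classes defined in source."""
    edges = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ClassDef):
            for base in node.bases:
                # Only plain names are handled; dotted or computed bases are skipped.
                if isinstance(base, ast.Name):
                    edges.append((node.name, base.id))
    return sorted(edges)


def to_dot(edges: list[tuple[str, str]]) -> str:
    """Render the edges as Graphviz DOT text, one arrow from subclass to base class."""
    lines = ["digraph classes {"]
    lines += [f'    "{child}" -> "{parent}";' for child, parent in edges]
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_dot(inheritance_edges(SAMPLE_SOURCE)))
```

A generated graph like this is a starting point, not a finished model. It shows who inherits from whom, and you still have to add the relationships that matter for your bug by hand.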




This is where it is handy to turn back to design tools such as the Unified Modeling Language (UML), flow charts, truth tables, and data flow diagrams. I don't like to draw diagrams as a way of doing detailed design. I think it is overkill, and placing too many details in UML diagrams, for instance, makes the models lose their focus. Also, when the implementation changes, all those details are no longer accurate and require updates. However, UML makes a great tool for conveying design concepts. It's a lot easier to look at a UML diagram than to keep those class relationships in your head. As we inspect the design visually, we can get a little closer with the other tools: data flow diagrams, flow charts, and truth tables.
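A truth table is the easiest of these tools to try. As a small sketch, suppose the suspicious code hinges on a boolean condition. The function `can_ship` and its three flags below are hypothetical stand-ins for whatever tangled expression you find in your own code. Enumerating every combination of inputs shows exactly when the condition is true.

```python
from itertools import product


def can_ship(paid: bool, in_stock: bool, on_hold: bool) -> bool:
    # Hypothetical condition standing in for an expression found in real code.
    return paid and in_stock and not on_hold


def truth_table(func, arg_count: int) -> list[tuple[bool, ...]]:
    """Return one row per input combination: the inputs followed by the result."""
    rows = []
    for values in product([False, True], repeat=arg_count):
        rows.append((*values, func(*values)))
    return rows


def format_table(names: list[str], rows: list[tuple[bool, ...]]) -> str:
    """Lay the rows out as a plain-text table with a header line."""
    header = " | ".join(f"{name:<8}" for name in (*names, "result"))
    lines = [header, "-" * len(header)]
    for row in rows:
        lines.append(" | ".join(f"{str(value):<8}" for value in row))
    return "\n".join(lines)


if __name__ == "__main__":
    rows = truth_table(can_ship, 3)
    print(format_table(["paid", "in_stock", "on_hold"], rows))
```

Three flags give only eight rows, which is small enough to read at a glance and compare against what the business rule is supposed to be.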

Sometimes the problem is not that the bug is hard to find; it is simply that we don't have a good visual or a good understanding of the design involved. In other words, we are not approaching the problem in a systematic manner. When problems get really hard to solve, we must approach them in steps, recording and checking off each step along the way. Reverse engineering the design allows us to create these steps.

Turning to reverse engineering may sound like a lot of work, and it can be, but compared to the alternative piñata game it actually has an end date. With some practice, we can build information-rich models rather quickly. These models can then be checked against the code to help us formulate hypotheses about the problem. In many cases we will be able to identify the problem just by inspecting the UML design and the code. In all cases we will gain a better understanding of the system, and the graphical design will help us get to the solution.

So if you ever find yourself playing piñata with your code, slow down, pull out a piece of paper and a pencil, and start reverse engineering the code. You will be surprised at how much information a little reverse engineering can provide.

If you would like to get started with UML, the best resource that I have found is the course by Construx. It will cost you, but it is worth it if you are willing to pay the price. On the other hand, Martin Fowler's book is also a great resource and much more affordable.

I hope this post provides some value to you. Thanks for reading. If you would like to be updated on new posts, you can follow me on Twitter @fernandozamoraj.
