Monday, May 12, 2014

The Debugging Mindset

A good programmer can write clean, understandable code. A great programmer is also a great debugger. What do I mean? In every software project I've been a part of, there have been one or two issues that are almost impossible to solve. There's no apparent root cause, they happen sporadically, they are difficult to detect, and they cause all sorts of problems that require significant cleanup. In my experience, most people like to blame these issues on parts of the system they don't understand. Developers will say it's a network issue, DBAs will say it's the application code, and so on. It takes a great programmer to take the chaos of one of these bugs and figure out the true root cause.

This programmer follows a logical path to determine the root cause, every time. There are no guesses and no assumptions. Everything must be proven or disproven and checked off the list of possibilities. This programmer has the debugging mindset - the understanding that inside every system there is a series of causes and effects. Even this simple fact is sometimes disregarded when IT shops are dealing with the hard problems. When it comes to computers, nothing just happens. There is a cause, and it's your job to figure out what it is.

You must think - scenario A causes scenario Z. At this point the debugging mindset is simple: 1) figure out all the causes and effects between A and Z, and 2) figure out which ones are working and which ones aren't.

Let me provide an example.

Let's say you are working on a mobile app that plays a song from a server once the user presses a play button. Some users report that, every so often, the song doesn't play after they press the button.

The first step is to create a high level list of causes and effects.

For instance, when a user touches the screen, the app should call a certain method.  The method then calls a service which downloads the song. Then, the method will start playing the song.
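One way to make that list concrete is to write it down as an ordered sequence of links, each paired with a check. This is only a sketch: the `first_broken_link` helper and the lambda checks are hypothetical stand-ins for whatever real observation you would make (a debugger breakpoint, a log entry, a network trace).

```python
from typing import Callable, List, Optional, Tuple

# Each link is a (description, check) pair; a check returns True when the link works.
Link = Tuple[str, Callable[[], bool]]


def first_broken_link(chain: List[Link]) -> Optional[str]:
    """Return the description of the first link whose check fails, or None."""
    for description, check in chain:
        if not check():
            return description
    return None


# Hypothetical checks: in practice each one is a real observation, not a constant.
chain = [
    ("touch calls the play method", lambda: True),
    ("play method downloads the song from the service", lambda: False),
    ("play method starts playback", lambda: True),
]

print(first_broken_link(chain))
```

Walking the links in order matters: a failure early in the chain usually makes every later link look broken too.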

You have three high level causes and effects. The second step is to determine which one of these is not working. For that you need the appropriate tools, and different problems call for different tools. If you are trying to determine whether the method is called, you could attach a debugger from an IDE. If you have access to the code, you could change the first line of the method to write to a file so that you can verify the method was called. If neither of those is an option, does the method body change anything observable that you could check for? Is there a setting that lets the operating system log method calls? You get the idea.
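As a rough illustration of the "write to a file when the method is called" approach, here is a minimal Python sketch using the standard `logging` module. The names `log_calls`, `on_play_pressed`, and the log file location are all made up for this example; a real mobile app would use its platform's own logging facility.

```python
import functools
import logging
import os
import tempfile


def log_calls(logger: logging.Logger):
    """Decorator factory: record every call to the wrapped function."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Logged before the body runs, so an entry proves the method was entered.
            logger.info("called %s", func.__name__)
            return func(*args, **kwargs)
        return wrapper
    return decorator


# Hypothetical log location in the system temp directory.
LOG_PATH = os.path.join(tempfile.gettempdir(), "play_debug.log")
logger = logging.getLogger("play_debug")
logger.setLevel(logging.INFO)
logger.addHandler(logging.FileHandler(LOG_PATH))


@log_calls(logger)
def on_play_pressed(song_id: str) -> str:
    # Stand-in for the real handler that downloads and plays the song.
    return f"playing {song_id}"


on_play_pressed("song-1")
```

If the log file has no entry after a user reports the problem, the break is upstream of the method; if it does, the method was reached and you move on to the next link.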

Once you have determined the part that isn't working, you dig deeper. If you determine that the button isn't working, then what causes and effects are you missing? Again at a high level, the screen picks up an electrostatic charge and sends the coordinates to the CPU. The CPU then notifies the operating system. Is the screen not picking up the user's touch? Maybe the user is wearing gloves or the screen is wet? You could go deeper than that if you wanted. Usually you don't have to, but for the really hard problems you'll likely have to go deeper than you thought.
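Digging deeper is naturally recursive: find the broken link, split it into smaller links, and repeat. The sketch below models that with a hypothetical `Link` class and `find_root_cause` function; the checks are hard-coded lambdas purely for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Link:
    name: str
    works: Callable[[], bool]
    sublinks: List["Link"] = field(default_factory=list)


def find_root_cause(chain: List[Link]) -> List[str]:
    """Return the path from the first broken link down to its deepest broken sublink."""
    for link in chain:
        if not link.works():
            # Descend into the smaller pieces of the broken link.
            return [link.name] + find_root_cause(link.sublinks)
    return []


# Hypothetical breakdown of the button example.
button = Link("button press reaches the app", lambda: False, [
    Link("screen registers the touch", lambda: False),
    Link("CPU notifies the operating system", lambda: True),
])
chain = [button, Link("method downloads the song", lambda: True)]

print(find_root_cause(chain))
```

If the path stops at a link that has sublinks but none of them fail, that is a hint that the breakdown of that link is incomplete or one of the checks is wrong.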

But what if you verify each cause and effect and you still can't determine what's wrong? Then one of your assumptions is incorrect. Either you do not know all the causes and effects, your tests were not correct, or you interpreted the results incorrectly. This is where hours upon hours of development time get spent. Question everything. Every file name, every class, every link in the chain.

To summarize:
  1. Figure out the chain of causes and effects
  2. Use a tool to figure out which link is not working
  3. Break the piece that is not working into smaller pieces, and start back at 1 until the root cause is determined.

So how do you get better at debugging? I can't speak for everyone, but I've started a log of every debugging task that takes me more than 5 minutes to solve. It started paying dividends after just a few days. In particular, I learned that I wasn't reading the entire stack trace of exceptions and was missing crucial details. I was missing Step 1; I didn't know all the links in the chain.

Unfortunately, there is no silver bullet for becoming a great debugger. It is always a learning process, and as you learn new technologies you will find new links in the chain you didn't know existed and new tools to inspect each link. Keep at it, keep challenging your assumptions, log your 10,000 hours, and become a great programmer.
