Have you ever been around a person, or maybe you are that individual, who has the innate ability to figure out what’s wrong with _____? They merely glance at the misbehaving computer and the problem disappears, touch the printer and pages come out, or open the hood of the car and the squealing stops.
It’s a strange phenomenon, but certain people possess the ability to diagnose a situation, survey the playing field, and effectively implement a solution. I’m sure you’ve seen it happen before.
Truth be told, I don’t believe these people have any special magic. It is far more likely they’ve spent time honing their ability to observe what’s going on, consider possible courses of action, and choose the one that has the best probability of fixing the problem.
This is called troubleshooting.
Let’s talk about how to do it:
Principle 1: Identify and reproduce the problem at hand.
In my world of computer programming, this frequently takes the form of a customer reporting odd behavior in one of our software applications. They may say something like:
“I tried to do X, expected Y to happen, but instead, Z was the result! Oh noes!”
When we get a report like that, the first thing we do is try to replicate the issue the customer experienced. We reach out to the person and try to learn as much as we can about what they were doing when the bug showed up.
After we have a good understanding of the customer’s experience, we try the same actions in a testing environment to replicate the issue.
We will also write an automated test that specifically exposes the problem so that the bug can no longer hide behind obscurity.
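Here is a rough sketch of what such a test might look like in Python. Everything in it is made up for illustration: `format_price_buggy`, `format_price_fixed`, and the "price shows as $10.5" report are hypothetical stand-ins, not code from a real application. The idea is simply to pin the customer's X, Y, and Z down as an assertion.

```python
import unittest


def format_price_buggy(cents: int) -> str:
    """Hypothetical helper with the reported bug: it drops the leading zero."""
    return f"${cents // 100}.{cents % 100}"


def format_price_fixed(cents: int) -> str:
    """Hypothetical fix: always pad the cents to two digits."""
    return f"${cents // 100}.{cents % 100:02d}"


class PriceFormatRegressionTest(unittest.TestCase):
    # Point this at the function under test. Against format_price_buggy
    # the test fails, which is exactly what we want before the fix.
    format_price = staticmethod(format_price_fixed)

    def test_cents_below_ten_keep_leading_zero(self):
        # X: format 1005 cents. Y: expected "$10.05". Z: customer saw "$10.5".
        self.assertEqual(self.format_price(1005), "$10.05")
```

Saved in a test file, this can be run with `python -m unittest`. Once the test passes it stays in the suite, so the same bug cannot quietly return.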
Principle 2: Only change one factor at a time.
Once we have identified that, yes, there is a problem, it becomes tempting to pull out all kinds of potential solutions. But this is not a good approach. Why?
By changing only one element at a time, we can isolate the source of the problem.
For example, say you have a desk lamp that is not turning on. If you change the bulb, plug it in, and reset the circuit breaker all at once, the lamp may well turn on, but you will not know which of the three was the actual problem.
Instead, start with the easiest possible solution:
- Is it plugged in?
- If it is plugged in, is there a chance the circuit breaker has tripped?
- If it is plugged in, and the circuit breaker is working, is the light bulb burned out?
Ah, yes! The bulb is burned out. Now that we’ve worked through the troubleshooting process, we can implement the solution (install a new bulb).
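For the programmers in the room, the same checklist can be sketched in a few lines of Python. The lamp state and the check names below are invented for this example. The point is only that the checks run one at a time, easiest first, and stop at the first one that fails.

```python
from typing import Callable, Dict, List, Optional, Tuple

LampState = Dict[str, bool]

# Hypothetical checks, ordered from easiest to hardest to verify.
CHECKS: List[Tuple[str, Callable[[LampState], bool]]] = [
    ("power cord", lambda lamp: lamp["plugged_in"]),
    ("circuit breaker", lambda lamp: lamp["breaker_on"]),
    ("light bulb", lambda lamp: lamp["bulb_ok"]),
]


def first_failing_check(lamp: LampState) -> Optional[str]:
    """Test one factor at a time and return the first one that fails."""
    for factor, passes in CHECKS:
        if not passes(lamp):
            return factor
    return None  # every factor we know about checked out


lamp = {"plugged_in": True, "breaker_on": True, "bulb_ok": False}
print(first_failing_check(lamp))  # prints "light bulb"
```

Because the function stops at the first failure, there is never any doubt about which factor was responsible.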
Principle 3: Expect the problem to exist in my area of responsibility.
An interesting aspect of troubleshooting, and one that is often overlooked, is that the problem is very likely your fault.
When programming, we frequently rely on code written by other people, in the form of stable libraries or reputable frameworks. Assuming we are using them correctly, chances are that the issue is not in the framework, since it has been vetted by thousands of other programmers.
Because I assume the problem begins with me, I first inspect the areas in which I may have introduced the bug. In my experience, the vast majority of the time that is where the problem was and where I needed to implement a solution.
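A small Python illustration of this, chosen by me as an example rather than taken from a real bug report: the built-in `round` can look broken until you read its documentation, which says that halfway cases are rounded to the nearest even integer. The `round_half_up` helper is a hypothetical name for the fix, which lives in my code and not in the language.

```python
from decimal import Decimal, ROUND_HALF_UP

# My first instinct: "round() is broken, 2.5 should round to 3!"
print(round(2.5))  # prints 2, which is the documented behavior


def round_half_up(value: str) -> int:
    """Hypothetical helper: round halves away from zero, as I had assumed."""
    return int(Decimal(value).quantize(Decimal("1"), rounding=ROUND_HALF_UP))


print(round_half_up("2.5"))  # prints 3
```

The library was doing what it promised all along. The bug was my assumption about it.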
Principle 4: Begin with the obvious solutions but also look outside the box.
This is an interesting principle because it is hard to formulate exactly how to look outside the box. At the point when you become familiar with what is outside the box, your box has grown to encompass the new information.
Suffice it to say that if a solution is proving difficult to find, and you’ve been intensely focused in a single direction, pause and ask yourself:
“I’ve been focusing straight ahead, but have I looked left, right, up, and down too?”
All too often we have a gut feeling about where the solution should be and charge hard in that direction, when in reality the solution is on another plane completely.
In the case of the desk lamp example, maybe the breaker is on, the lamp is plugged in, and the bulb is good. But what if the internal circuitry has given out?
Rule out the variables you have control over and then look at ones you hadn’t even considered at the outset.
Principle 5: Share what you learned.
Lastly, you are not the only person who will ever experience this issue. Share what your solution was. Write a blog article about it, answer a similar question on a site like Stack Overflow, or share it with a co-worker.
Chances are it was hard work to weed out the bug, and you can now help other people. Take a few minutes and pay it forward!