Thursday, February 15, 2018

Go after root causes like you give a damn!

A root cause is not a singular gem buried within a complex problem, waiting to be unearthed, polished, and put on display for all the world to see while you soak up plaudits for your geological prowess.

And yet we often hear root causes bandied about in precisely that way. Statements like "the sales team didn't understand the product specs" or "the engineering team wasn't aware of all the use cases" might be put forth as root causes when things go awry, but rarely do statements like those about groups actually come from the groups themselves. 

Why do we suggest root causes originating outside of our domains? It might feel safer than calling out our own shortcomings. We might feel as though we need to maintain our own "credibility" or "status" or "subject matter expertise" or whatever you call the corporate rendering of a "be good" mindset, as opposed to a "get better" mindset.

But does that really help things? Does attributing causes of complex problems to others actually help us make progress in addressing those problems? If we aren't actually interested in making progress on those problems, why are we talking about them?

When looking for root causes, understand that you're evaluating many options and looking for leverage points that you can impact within your action radius. Your action radius is shaped by your range of motion and your time horizon. Think about your time horizon as a spectrum; whether you have three days or three months to address a problem will change the actions you take. Neither time horizon is wrong, but being blind to the influence of the spectrum is.

Similarly, think about your range of motion as made up of both the level of change you can effect (e.g. resources) as well as the level of risk you're willing to accept in addressing a problem (see: blast radius). The ways a front-line worker and a regional ops director work on problems will vary by their station. Again, neither is wrong, but ignoring the spectrum is.

Once you understand your action radius, use that as a lens to view the problems you face. What can you do about something? What can you do that will have the biggest impact on your problem with acceptable follow-on effects? What you start to see are the root causes that you should be addressing, and you should be addressing them because you can address them.

When viewed this way, your new lens can help ensure you're building logic chains that hang together both up and down the levels of detail. This will help you avoid the temptations of mental traps like solutioneering and pushing solutions in search of problems.

At this point you might be thinking that's all well and good, but you're facing some really big problems with some really tough root causes that are way outside your action radius no matter how you slice it, and you've still been tasked with getting it done. In that case, think about who does have an action radius that can address the root causes you've identified and go talk to them. Lay out your situation, your understanding, your assumptions, etc. Ask questions. Understand the problem from their perspective. In other words, help them come along with you and go after problems together!

Grounding root causes in what you can actually do about something turns problem solving from a disassociating blame game into a meaningful set of steps you can take to move through complex environments, and in doing so you learn more about the actual obstacles holding you back from doing what you want to be doing. 

Thursday, February 1, 2018

What's my blast radius?

Rapid experiment cycles can be applied in many settings to speed up the pace of learning and reduce the gaps between expectations and reality. One of the common terms in literature dealing with that concept is the idea of minimizing the "blast radius".

In short, minimizing the blast radius means making sure you don't screw up something major in your worst-case scenario. That way you can keep executing experiment cycles and improve the pace of your learning. So how can you think about limiting your blast radius in order to continue uninterrupted learning cycles? I recommend a three-part test:
  • Are you acting unilaterally?
  • Are your actions irreversible?
  • Is the potential damage unacceptable?
If your planned experiment only violates one or two parts of the test, proceed! If, however, you find yourself tripping all three, you might want to reconsider your next experiment. What does it mean to trip one of those tests? Let's take them one by one.

Acting unilaterally. In a classic unilateral setting, one person is making a decision to do something without consulting anyone else. That definition can apply quite readily to testing blast radii, but it should also be noted that groups can act unilaterally. In the case of groups, "unilateral" can be equated with "one perspective". Multiple people in a group might all agree to an action, but if their perspectives are mostly overlapping, then that action is still unilateral. In a setting of us vs. them, you need to get input from them if you want to avoid acting unilaterally.

Irreversible actions. Technical domains often have an ability to "undo", "restore", "revert", "roll back", or otherwise reverse an action just taken, but by no means is that capability restricted to technical domains. Putting a physical item somewhere can be reversed if you can just move it back. Announcing a change to a tight-knit team can be "reversed" with a similar level of communication. Limiting the population exposed to a change means the bulk of the population stays with the current reality, essentially letting you revert to old norms easily.

Unacceptable damage. Of the three parts of the test, this is probably the trickiest, and can seem a bit recursive: isn't the whole point of thinking about blast radii to avoid unacceptable damages from an experiment gone wrong? Thinking about the other two parts of the test can help one suss out unacceptable levels in a given situation. As an extreme example, if you decide to make an omelette for yourself, you're acting both unilaterally and irreversibly, but clearly it's an acceptable level of damage. Perhaps more realistically, if you're considering a major product change that you think will be irreversible, you can strengthen your case by seeking different perspectives.
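
For those who like to see a rule written down, here is a minimal sketch of the three-part test in Python. The names (Experiment, tripped_tests, should_reconsider) are made up for illustration, and the three yes/no answers still have to come from your own judgment; the code only encodes the "proceed unless all three trip" rule described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Experiment:
    """A planned experiment and your answers to the three questions.

    All field names are hypothetical, chosen for this sketch.
    """
    name: str
    unilateral: bool           # Are you acting from only one perspective?
    irreversible: bool         # Is there no way to undo or roll back?
    unacceptable_damage: bool  # Is the worst case more than you can accept?


def tripped_tests(experiment: Experiment) -> list[str]:
    """Return the labels of the tests this experiment trips."""
    checks = {
        "acting unilaterally": experiment.unilateral,
        "irreversible action": experiment.irreversible,
        "unacceptable damage": experiment.unacceptable_damage,
    }
    return [label for label, tripped in checks.items() if tripped]


def should_reconsider(experiment: Experiment) -> bool:
    """Reconsider only when all three tests are tripped."""
    return len(tripped_tests(experiment)) == 3


if __name__ == "__main__":
    # The omelette example: unilateral and irreversible, but acceptable.
    omelette = Experiment("make an omelette", True, True, False)
    verdict = "reconsider" if should_reconsider(omelette) else "proceed"
    print(f"{omelette.name}: {verdict} (tripped: {tripped_tests(omelette)})")
```

The omelette trips two of the three tests and still comes out as "proceed", which matches the reasoning above. Listing which tests tripped, rather than returning a bare yes or no, also shows you which one to work on (say, seeking another perspective) when an experiment does trip all three.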

The point of your experiments should not be to do, but rather to learn so that you can do better. Not every experiment will turn out as you expect it to, so you should mind your blast radius, but odds are if you haven't critically assessed your blast radius, you're either being too cautious and taking too long to learn or being reckless and inviting unnecessary risks.