The RAID Log: Right Topics, Wrong Sequence

If you were to poll project managers on which software tool (aside from email) they use most often, for the largest number of tasks, I imagine that Excel would predominate. From complex calculator to data analyzer to zero-effort database and factoid organizer, I doubt there is a more ubiquitous multi-tool. And yet, we sometimes use it in ways that inhibit our understanding of the problems we’re trying to solve. One example that comes to mind is a spreadsheet template we used in a former life, called a “RAID log.”

The spreadsheet had four tabs: Risks, Assumptions, Issues, and Decisions. Each tab had columns defined that would drive identification and management of their namesakes. In and of itself, it was an excellent starting point. But the sequencing that made the template name pronounceable hid the complex relationships among the four objects. And that probably caused more than a few problems along the way.

Assumptions

Most projects begin with a limited number of facts, or “knowns,” and a by-product of the project is to expand that knowledge base. But at the beginning, a lot of assumptions are needed to fill in the gaps. These assumptions help us better visualize the current state, the desired future state, and the space between. With assumptions, we can make plans and estimates, communicate and collaborate, and otherwise act on those things we know to be facts. Consequently, all projects begin with a set of assumptions.

That said: every assumption carries some amount of uncertainty. Because we depend on assumptions to proceed with our work, and that uncertainty is present, each assumption represents a source of risk. Of course, not all assumptions are foundational, and few will be purely wild-assed guesses, but one of our process goals in every project should be to verify every assumption. To do this, every assumption should be documented and referenced when used in estimates, plans, schedules, budgets, and any other project activity.

Issues

Whether the goal is to correct a problem, develop a new capability, or comply with some external requirement, projects are approved to address issues. If the status quo were acceptable, no one would spend money to change it. On many projects, the project is an umbrella for the resolution of all sorts of issues. As you explore the project scope for details, more issues will emerge, and some of your assumptions will be challenged. This is a good thing: solving the wrong problem won’t get you any points. You might discover some issues can be resolved along the way, while others depend on the outcome of the project. Persistent issues frequently have roots in an earlier solution, and it’s common for them to be well-documented. So capture that content in your project issue log.

Once you transition to the execution of your project, more issues will manifest, especially as you make changes or consume scarce resources (like the time and attention of your stakeholders). On some projects, tracking and resolving issues becomes the key to making progress. The old adage about being up to your ass in alligators speaks volumes about the potential for creating new problems while trying to solve old ones.

Risks

A risk is an uncertainty that matters. As noted, every assumption has an element of uncertainty, and every issue matters. Consequently, a review of the assumptions and issues logs should give you a starting point for identifying risks. If an assumption is incorrect, what are the consequences? What is the likelihood of resolving one issue, only to create a new one? In reviewing the plan, identifying risks requires understanding not just the proposed solution, but also the current state.

As Tim Lister so famously said, “Risk management is how adults manage projects.” Tracking risks, from identification to analysis to action to retirement, is one of the keys to a successful project. Once you understand both the risks and the issues, you can select better risk responses. Being able to link risks to assumptions, and to both the issues that made them worth taking and the issues that they might become, will help you keep preventive and corrective actions from falling through the cracks.
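If your log lives somewhere more structured than a spreadsheet, those links can be recorded explicitly. What follows is only a minimal sketch, not the layout of the original template: the `Entry` and `RaidLog` names, the two-way `link` method, and the convention of prefixing ids with A, I, R, or D are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    """One row in the log. The id prefix (A, I, R, D) marks its type (assumed convention)."""
    entry_id: str
    summary: str
    links: set = field(default_factory=set)  # ids of related entries


class RaidLog:
    """Hypothetical in-memory log that keeps cross-references between entries."""

    def __init__(self):
        self.entries = {}

    def add(self, entry_id, summary):
        self.entries[entry_id] = Entry(entry_id, summary)

    def link(self, a, b):
        """Record a two-way relationship so it can be followed from either side."""
        self.entries[a].links.add(b)
        self.entries[b].links.add(a)

    def related(self, entry_id, prefix):
        """Return linked ids of one type, sorted for stable output."""
        links = self.entries[entry_id].links
        return sorted(x for x in links if x.startswith(prefix))


log = RaidLog()
log.add("A1", "Vendor API will be available by March")
log.add("I1", "Manual re-keying of orders causes errors")
log.add("R1", "Vendor API slips, delaying integration")
log.link("R1", "A1")  # the risk exists because of this assumption
log.link("R1", "I1")  # the issue that made the risk worth taking
print(log.related("R1", "A"))  # assumptions behind risk R1
print(log.related("A1", "R"))  # risks that hang on assumption A1
```

Because links are stored in both directions, disproving an assumption immediately tells you which risks to revisit, and retiring a risk tells you which issues and assumptions to recheck.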

Decisions

The most expensive part of any project is indecision. Many decisions are the leverage needed to resolve issues; others drive risk responses. Still others change or confirm assumptions. Many schedule delays can be attributed to the failure to make a timely decision, or the failure to communicate a decision once made. Once the need for a decision is identified, whether to resolve an issue, manage a risk, or just eliminate alternatives, it should be logged. Allowing a pending decision to drag on too long is an impediment to the project, and to the business of the organization. A review of the decision log for pending items should be part of every steering committee meeting.
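The pending-items review can be reduced to a simple query. The sketch below assumes a hypothetical `Decision` record with a date raised and an optional date decided; the 14-day threshold is an arbitrary placeholder, not a recommendation.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Decision:
    decision_id: str
    question: str
    raised: date
    decided: Optional[date] = None  # None means still pending


def overdue_decisions(log, as_of, max_age_days=14):
    """Return (decision, age_in_days) for pending items older than the threshold, oldest first."""
    pending = [(d, (as_of - d.raised).days) for d in log if d.decided is None]
    overdue = [(d, age) for d, age in pending if age > max_age_days]
    return sorted(overdue, key=lambda pair: pair[1], reverse=True)


decisions = [
    Decision("D1", "Build or buy the reporting module?", date(2024, 3, 1)),
    Decision("D2", "Which region goes live first?", date(2024, 3, 20)),
    Decision("D3", "Approve extra test environment?", date(2024, 2, 10), decided=date(2024, 2, 15)),
]
for decision, age in overdue_decisions(decisions, as_of=date(2024, 3, 25)):
    print(decision.decision_id, decision.question, f"pending {age} days")
```

Sorting oldest first puts the decisions that have dragged on longest at the top of the steering committee agenda.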

The relationships among assumptions, issues, risks, and decisions are complex, and somewhat recursive. Tracking them is not a chore to be done once and then checked off the to-do list. Taken together, they are the background (and sometimes foreground) of every project. Understanding how they interact can be critical to keeping your project on track. So if you use a spreadsheet like our old RAID log, be sure to use it wisely.

Complexity, Profitability, and Risk Management

Writing in the Agile Pain Relief blog, Mark Levison ascribes the collapse of the global economy to the complexity of the market for sub-prime loans. He quotes Andrew Haldane’s analogy of a dog trying to catch a Frisbee as a complex application of physics, easily avoided with a simple heuristic. In advocating for simplicity, Mark asks: “Why didn’t the financial regulator system catch the problem early, while it was still small?”

The regulators didn’t catch it because it was legal, however ill-advised. It is the job of legislators to define what activities are illegal, and they are loath to proscribe highly profitable activities. This is especially true when the people making those profits donate to their election campaigns. So the people who saw the coming disaster avoided it by cashing out, buying up over-sold assets at the trough, and mocking the rest of us. Sort of a social corollary to natural selection. The complexity was profitable – it was greed that made it unsustainable.

As for Haldane: there are no physics problems in nature. They exist only in the minds of humans who reduce their observations to abstractions, in order to decant generalized rules that they can discuss with similarly inclined humans. Dogs bark at physicists, for good reason.

Complexity is More Profitable than Simplicity

Complexity is relative and temporary. At one time, sending a crate of spices from east Asia to Europe was an incredibly risky, complex endeavor. Now we just send it via DHL and it gets there the next day, if we pay a little extra. Making complex activities routine and reliable is the source of all commercial success – the activities are no less daunting, but the risk and capital cost have been spread across far more participants and transactions. There is tremendous profit potential in reliably reproducible complex activities, executed in volume for a small commission.

Consider modern self-service retail transactions: you wave your six-pack of Dos Equis over the barcode scanner and insert your credit card into a slot. The inventory is updated, the supply chain is notified, the purchase is made available for stock level analysis, the sales tax is calculated, the accounting system is updated, and the total is charged to an individual credit account, all without human-introduced errors. Repeat 80 million times a day for the modern US economy, and you begin to see real cost savings.

Risk Management Controls

Controls simplified for non-practitioners, rather than optimized for results, produce sub-optimal results. Similarly, risk management practices dumbed down for the disengaged decision maker, rather than optimized for predictable, reproducible results, result in more catastrophes.

There is generally a strong correlation between risk and reward, except for the category sometimes called “stupid risks.” Think Darwin Award contestants or investors in sub-prime REIT assets. Agile methods work only to the degree that they provide frequent points at which course corrections can be made and marginal expected benefit (reward) can be compared to marginal uncommitted cost (risk) in order to inform decision-making at both project and portfolio levels. It is as easy to apply Agile methods poorly and fail miserably as it is to model your approach on a ballistic trajectory, making all your decisions at the beginning and then adjusting based on the actual point of impact before you start over.

Predictable, Reproducible Results

The best argument for Agile methods is demonstrated, predictable, reproducible results. Nearly twenty years after the Snowbird conference, we have plenty of evidence of what works. Simply professing Agility, absent top-down commitment to excellent management practices, doesn’t change a damned thing. And neither does decrying Dilbert-style management. We have to embrace the complexity, master it, and make it our super-power.

All that said: a rigorous approach to defining the business case, followed by an unambiguous definition of “done,” ruthless execution, and continuous risk management still won’t guarantee results. But if you want a guarantee, buy a toaster – they’re simple.

Managing Risks That Evolve Over Time

Most project managers are used to making a qualitative risk analysis in two dimensions: the likelihood that an event will occur, and the impact of the event. And most risk management plans include some sort of “T-shirt” sizing scale to facilitate classification of probability and impact as small, medium, large, and so on. While not particularly rigorous, relative categorization does have the benefit of getting SME participation without making great demands on their time or making them uncomfortable with complex quantifying processes. A qualitative approach can be useful for determining which risks are significant enough to require a risk response strategy or additional quantitative analysis.
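As a rough illustration of how T-shirt sizes can be turned into a screening rule, the sketch below maps each size to an ordinal number and multiplies probability by impact. The three-point scale, the multiplication, and the threshold of 6 are all assumptions chosen for the example; your risk management plan may define them differently.

```python
# Assumed ordinal scale; real plans often use five sizes or different weights.
SCALE = {"S": 1, "M": 2, "L": 3}


def score(probability, impact):
    """Combine two T-shirt sizes into a single relative score."""
    return SCALE[probability] * SCALE[impact]


def needs_response(risks, threshold=6):
    """Return names of risks whose score meets the threshold, in log order."""
    return [name for name, (p, i) in risks.items() if score(p, i) >= threshold]


risks = {
    "Vendor delivery slips": ("M", "L"),
    "Key analyst leaves": ("S", "L"),
    "Scope creep in reporting": ("L", "L"),
}
for name in needs_response(risks):
    print(name, score(*risks[name]))
```

The scores are only relative rankings. They say which risks deserve a response strategy or further quantitative analysis, not how much money or time is actually at stake.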

But to paraphrase Rod Serling, there is a third dimension: time. Some risks are always present, while others are episodic or cyclical, or even overtaken by events. For that reason, it can be valuable to model the expected life cycle of a risk, in order to determine whether it needs special handling.

Characterizing the Evolution of a Risk

Let’s consider three cases:

  • A risk that impacts a task or deliverable. For example, an external event or delivery that, if delayed, would delay a project task on the critical path. We’ll call this a dependency risk.
  • A risk associated with an assumption. In many cases, a critical assumption can’t be confirmed to be accurate until somewhat later; thus, the possibility that it is incorrect will linger until that time. We’ll call this an assumption risk.
  • An external / environmental or internal business risk that might have significant impact but only in some specific window of time. We’ll call this a “fly-by” risk.

There are other scenarios, but in the typical IT project, most will fall into one of these categories.

Dependency Risks

Most project schedules include dependencies—one task can’t begin until another is complete. While good project plans have some slack built in, a task on the critical path that is delayed can have significant impact. For example, if your project has a requirement to go live at the beginning of a quarter, all it takes is a delay in the right place to significantly impact the schedule and all associated project costs. Thus, the probability of the event occurring is spread over a rather narrow span of time, but the impact follows it and can be spread over a much longer period.

In developing responses for dependency risks, it can be helpful to prepare both a mitigation strategy to reduce the probability of the event, and a contingency plan to manage the event as an issue if it happens anyway. Dependency risks naturally have an end date; the event probability of a properly managed dependency risk should decline as that date approaches. Part of your risk management strategy should include diagnostics, a specific expectation of the probability reduction over time, and triggers to proactively initiate conversion of the risk to an issue before the end date.
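One way to express an expectation of probability reduction, along with a trigger, is a simple glide path. The sketch below assumes a straight-line decline from the initial probability to zero at the end date, and a fixed tolerance of 0.10 before escalation; both are illustrative choices, and the function names are hypothetical.

```python
from datetime import date


def expected_probability(initial, start, end, today):
    """Straight-line glide path: `initial` at the start date, zero at the end date."""
    if today <= start:
        return initial
    if today >= end:
        return 0.0
    remaining = (end - today).days / (end - start).days
    return initial * remaining


def should_escalate(assessed, initial, start, end, today, tolerance=0.10):
    """Trigger: the currently assessed probability is running above the glide path."""
    return assessed > expected_probability(initial, start, end, today) + tolerance


start, end = date(2024, 1, 1), date(2024, 1, 31)
today = date(2024, 1, 16)
print(expected_probability(0.4, start, end, today))
print(should_escalate(0.35, 0.4, start, end, today))
```

If the assessed probability stays above the glide path at a status review, that is the signal to start working the contingency plan rather than waiting for the end date to arrive.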

Assumption Risks

Most projects are planned with at least a few assumptions—consider them to be accepted risks. Good project plans identify the assumptions as such, and include tasks to validate the critical assumptions. Where the validity of the assumption is questionable enough to represent a significant risk, the validation tasks should get a correspondingly higher priority. Risk responses for assumption risks may have to include revisiting the budget and business case. You don’t want to preside over a “zombie project” when the cost / benefit ratio has shifted in the wrong direction.

Assumptions are necessary for decision-making in the absence of certainty—we simply need to have a plan for how we’ll deal with newly found certainty, before we get to it.

Fly-by Risks

While an asteroid strike isn’t a significant risk for most IT projects, year-end processing and other business-as-usual activities can be. I’ve seen more than one project delayed when resources were pulled for an emergency response to an event we already had on the calendar. If you know you might have to compete for resources during some finite window of time, identify it as a risk.

You may not be able to reduce the probability of the event, but it may be possible to transfer the risk by assigning tasks during that critical period to other resources. It may also be possible to avoid the risk by proactively juggling some tasks during the planning stage to reduce your exposure during that window. This usually doesn’t need to be more complicated than incorporating public holidays and staff vacation or maternity leave plans into your schedule.
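Finding your exposure to a known window is a matter of checking date overlaps. The sketch below uses made-up task names and dates, and treats both task dates and window dates as inclusive; it only flags the tasks, leaving the reassignment or rescheduling decision to you.

```python
from datetime import date


def overlaps(start_a, end_a, start_b, end_b):
    """True if two inclusive date ranges share at least one day."""
    return start_a <= end_b and start_b <= end_a


def exposed_tasks(tasks, window):
    """Return names of tasks scheduled during the contention window."""
    window_start, window_end = window
    return [
        name
        for name, (start, end) in tasks.items()
        if overlaps(start, end, window_start, window_end)
    ]


# Hypothetical schedule and a year-end processing window.
tasks = {
    "Data migration rehearsal": (date(2024, 12, 9), date(2024, 12, 20)),
    "User acceptance testing": (date(2024, 12, 23), date(2025, 1, 10)),
    "Go-live preparation": (date(2025, 1, 13), date(2025, 1, 24)),
}
year_end = (date(2024, 12, 27), date(2025, 1, 3))
print(exposed_tasks(tasks, year_end))
```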

The Evolving Project Risk Exposure Profile

While it isn’t necessary for all projects, it can be helpful to express your project’s risk exposure profile over time. There are a number of ways to graph the combination of probability and impact of multiple risks over a timeline. Showing your key stakeholders how the risks to the project are being shepherded to retirement can go a long way toward retaining their support when those events you are trying to prevent happen anyway. If you can demonstrate how your risk management plan will ultimately eliminate risk exposure at some point near the end of the project, you’ll probably get a lot more engagement in reaching your goals.
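One simple way to build such a profile is to sum probability times impact for every risk that is active in each week. The sketch below is deliberately crude: it assumes each risk has a constant probability across its window, measures impact in days of delay, and uses invented risks; the `Risk` fields and `exposure_profile` name are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Risk:
    name: str
    probability: float  # 0.0 to 1.0, assumed constant while the risk is active
    impact: float       # e.g. days of schedule delay
    first_week: int     # first project week in which the risk is active
    last_week: int      # last active week, inclusive


def exposure_profile(risks, weeks):
    """Return total expected impact for each project week, starting at week 1."""
    return [
        sum(r.probability * r.impact for r in risks if r.first_week <= week <= r.last_week)
        for week in range(1, weeks + 1)
    ]


risks = [
    Risk("Vendor delivery slips", 0.5, 10, first_week=1, last_week=3),
    Risk("Year-end resource contention", 0.25, 20, first_week=2, last_week=4),
]
for week, exposure in enumerate(exposure_profile(risks, weeks=5), start=1):
    print(f"week {week}: {'#' * round(exposure)} {exposure}")
```

Even a text bar chart like this shows stakeholders where exposure peaks and when it is expected to fall away. Replacing the constant probability with a declining one per risk would give a more realistic picture.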