Complexity, Profitability, and Risk Management

Writing in the Agile Pain Relief blog, Mark Levison ascribes the collapse of the global economy to the complexity of the market for sub-prime loans. He quotes Andrew Haldane’s analogy of a dog trying to catch a Frisbee as a complex application of physics, easily avoided with a simple heuristic. In advocating for simplicity, Mark asks: “Why didn’t the financial regulator system catch the problem early, while it was still small?”

The regulators didn’t catch it because it was legal, however ill-advised. It is the job of legislators to define what activities are illegal, and they are loath to proscribe highly profitable activities. This is especially true when the people making those profits donate to their election campaigns. So the people who saw the coming disaster avoided it by cashing out, buying up over-sold assets at the trough, and mocking the rest of us. Sort of a social corollary to natural selection. The complexity was profitable; it was greed that made it unsustainable.

As for Haldane: there are no physics problems in nature. They exist only in the minds of humans who reduce their observations to abstractions, in order to decant generalized rules that they can discuss with similarly inclined humans. Dogs bark at physicists, for good reason.

Complexity is More Profitable than Simplicity

Complexity is relative and temporary. At one time, sending a crate of spices from East Asia to Europe was an incredibly risky, complex endeavor. Now we just send it via DHL and it gets there the next day, if we pay a little extra. Making complex activities routine and reliable is the source of all commercial success; the activities are no less daunting, but the risk and capital cost have been spread across far more participants and transactions. There is tremendous profit potential in reliably reproducible complex activities, executed in volume for a small commission.

Consider modern self-service retail transactions: you wave your six-pack of Dos Equis over the barcode scanner and insert your credit card into a slot. The inventory is updated, the supply chain is notified, the purchase is made available for stock level analysis, the sales tax is calculated, the accounting system is updated, and the total is charged to an individual credit account, all without human-introduced errors. Repeat 80 million times a day for the modern US economy, and you begin to see real cost savings.

Risk Management Controls

Controls simplified for non-practitioners, rather than optimized for results, produce sub-optimal results. Similarly, risk management practices dumbed down for the disengaged decision maker, rather than optimized for predictable, reproducible results, lead to more catastrophes.

There is generally a strong correlation between risk and reward, except for the category sometimes called “stupid risks.” Think Darwin Award contestants or investors in sub-prime REIT assets. Agile methods work only to the degree that they provide frequent points at which course corrections can be made and marginal expected benefit (reward) can be compared to marginal uncommitted cost (risk) in order to inform decision-making at both project and portfolio levels. It is as easy to apply Agile methods poorly and fail miserably as it is to model your approach on a ballistic trajectory, making all your decisions at the beginning and then adjusting based on the actual point of impact before you start over.

Predictable, Reproducible Results

The best argument for Agile methods is demonstrated, predictable, reproducible results. Nearly twenty years after the Snowbird conference, we have plenty of evidence of what works. Simply professing Agility, absent top-down commitment to excellent management practices, doesn’t change a damned thing. And neither does decrying Dilbert-style management. We have to embrace the complexity, master it, and make it our super-power.

All that said: a rigorous approach to defining the business case, followed by an unambiguous definition of “done,” ruthless execution, and continuous risk management still won’t guarantee results. But if you want a guarantee, buy a toaster – they’re simple.

Managing Risks That Evolve Over Time

Most project managers are used to making a qualitative risk analysis in two dimensions: the likelihood that an event will occur, and the impact of the event. And most risk management plans include some sort of “T-shirt” sizing scale to facilitate classification of probability and impact as small, medium, large, and so on. While not particularly rigorous, relative categorization does have the benefit of getting SME participation without making great demands on their time or making them uncomfortable with complex quantifying processes. A qualitative approach can be useful for determining which risks are significant enough to require a risk response strategy or additional quantitative analysis.
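As a rough illustration of the two-dimensional approach described above, here is a minimal Python sketch. The numeric values assigned to each T-shirt size, the threshold, and the example risks are all assumptions made for illustration; they are not a calibrated scale.

```python
from dataclasses import dataclass

# Hypothetical ordinal scale: the numbers only rank the sizes, they are not calibrated.
SIZES = {"S": 1, "M": 2, "L": 3}


@dataclass
class Risk:
    name: str
    probability: str  # T-shirt size: "S", "M", or "L"
    impact: str       # T-shirt size: "S", "M", or "L"


def score(risk: Risk) -> int:
    """Relative score: probability rank multiplied by impact rank."""
    return SIZES[risk.probability] * SIZES[risk.impact]


def needs_response(risk: Risk, threshold: int = 4) -> bool:
    """Flag risks that warrant a response strategy or quantitative analysis.

    The threshold is an assumed cut-off; each organization would set its own.
    """
    return score(risk) >= threshold


# Illustrative risks only.
risks = [
    Risk("Vendor delivery slips", "M", "L"),
    Risk("Data volume assumption is wrong", "S", "L"),
    Risk("Minor UI rework", "M", "S"),
]

flagged = [r.name for r in sorted(risks, key=score, reverse=True) if needs_response(r)]
print(flagged)
```

The point of a sketch like this is only to sort and filter; the sizes still come from SME judgment.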

But to paraphrase Rod Serling, there is a third dimension: time. Some risks are always present, while others are episodic or cyclical, or even overtaken by events. For that reason, it can be valuable to model the expected life cycle of a risk, in order to determine whether it needs special handling.

Characterizing the Evolution of a Risk

Let’s consider three cases:

  • A risk that impacts a task or deliverable. For example, an external event or delivery that, if delayed, would delay a project task on the critical path. We’ll call this a dependency risk.
  • A risk associated with an assumption. In many cases, a critical assumption can’t be confirmed to be accurate until somewhat later; thus, the possibility that it is incorrect will linger until that time.
  • An external / environmental or internal business risk that might have significant impact but only in some specific window of time. We’ll call this a “fly-by” risk.

There are other scenarios, but in the typical IT project, most will fall into one of these categories.

Dependency Risks

Most project schedules include dependencies—one task can’t begin until another is complete. While good project plans have some slack built in, a task on the critical path that is delayed can have significant impact. For example, if your project has a requirement to go live at the beginning of a quarter, all it takes is a delay in the right place to significantly impact the schedule and all associated project costs. Thus, the probability of the event occurring is spread over a rather narrow span of time, but the impact follows it and can be spread over a much longer period.

In developing responses for dependency risks, it can be helpful to prepare both a mitigation strategy to reduce the probability of the event, and a contingency plan to manage the event as an issue if it happens anyway. Dependency risks naturally have an end date; the event probability of a properly managed dependency risk should be expected to be reduced as that date comes closer. Part of your risk management strategy should include diagnostics, a specific expectation of the probability reduction over time, and triggers to proactively initiate conversion of the risk to an issue before the end date.
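One way to express the expected probability reduction and a trigger is sketched below in Python. The linear decline, the tolerance value, and the function names are assumptions chosen for illustration; a real plan might expect a stepped or front-loaded reduction instead.

```python
from datetime import date


def expected_probability(initial: float, start: date, end: date, today: date) -> float:
    """Expected event probability, assuming a linear decline to zero at the end date."""
    if today <= start:
        return initial
    if today >= end:
        return 0.0
    elapsed = (today - start).days
    total = (end - start).days
    return initial * (1 - elapsed / total)


def should_escalate(assessed: float, expected: float, tolerance: float = 0.10) -> bool:
    """Trigger: the current assessment exceeds the plan by more than the tolerance."""
    return assessed > expected + tolerance


# Illustrative dates and probabilities.
start, end = date(2024, 1, 1), date(2024, 3, 1)
today = date(2024, 1, 31)
expected = expected_probability(0.4, start, end, today)

if should_escalate(assessed=0.35, expected=expected):
    print("Probability is not declining as planned: start the contingency plan.")
```

The diagnostic here is simply the gap between the assessed and expected probability; the trigger converts the risk to an issue while there is still time to act.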

Assumption Risks

Most projects are planned with at least a few assumptions—consider them to be accepted risks. Good project plans identify the assumptions as such, and include tasks to validate the critical assumptions. Where the validity of the assumption is questionable enough to represent a significant risk, the validation tasks should get a correspondingly higher priority. Risk responses for assumption risks may have to include revisiting the budget and business case. You don’t want to preside over a “zombie project” when the cost / benefit ratio has shifted in the wrong direction.

Assumptions are necessary for decision-making in the absence of certainty—we simply need to have a plan for how we’ll deal with newly found certainty, before we get to it.

Fly-by Risks

While an asteroid strike isn’t a significant risk for most IT projects, year-end processing and other business-as-usual activities can be. I’ve seen more than one project delayed when resources were pulled for an emergency response to an event we already had on the calendar. If you know you might have to compete for resources during some finite window of time, identify it as a risk.

You may not be able to reduce the probability of the event, but it may be possible to transfer the risk by assigning tasks during that critical period to other resources. It may also be possible to avoid the risk by proactively juggling some tasks during the planning stage to reduce your exposure during that window. This usually doesn’t need to be more complicated than incorporating public holidays and staff vacation or maternity leave plans into your schedule.
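Checking a schedule against a known window can be as simple as the following Python sketch. The task names, dates, and the year-end window are hypothetical examples.

```python
from datetime import date


def overlaps(a_start: date, a_end: date, b_start: date, b_end: date) -> bool:
    """True if two inclusive date ranges share at least one day."""
    return a_start <= b_end and b_start <= a_end


# Hypothetical window when resources may be pulled for year-end processing.
window = (date(2024, 12, 16), date(2025, 1, 3))

# Hypothetical tasks: name -> (start, end).
tasks = {
    "Integration testing": (date(2024, 11, 25), date(2024, 12, 13)),
    "Data migration dry run": (date(2024, 12, 9), date(2024, 12, 20)),
    "User training": (date(2025, 1, 6), date(2025, 1, 10)),
}

exposed = [name for name, (start, end) in tasks.items() if overlaps(start, end, *window)]
print(exposed)
```

Tasks that land in the list are candidates for reassignment or rescheduling during planning.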

The Evolving Project Risk Exposure Profile

While it isn’t necessary for all projects, it can be helpful to express your project’s risk exposure profile over time. There are a number of ways to graph the combination of probability and impact of multiple risks over a timeline. Showing your key stakeholders how the risks to the project are being shepherded to retirement can go a long way toward retaining their support when those events you are trying to prevent happen anyway. If you can demonstrate how your risk management plan will ultimately eliminate risk exposure at some point near the end of the project, you’ll probably get a lot more engagement in reaching your goals.
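As one possible way to build such a profile, the Python sketch below sums probability times impact for every risk that is active in each week. The weekly granularity, the dictionary field names, and the sample figures are assumptions for illustration, and treating exposure as a simple sum ignores any correlation between risks.

```python
def exposure_profile(risks: list[dict], weeks: int) -> list[float]:
    """Per-week exposure: sum of probability * impact over the risks active that week."""
    profile = [0.0] * weeks
    for risk in risks:
        for week in range(risk["first_week"], risk["last_week"] + 1):
            if 0 <= week < weeks:
                profile[week] += risk["probability"] * risk["impact"]
    return profile


# Hypothetical risks, with impact in currency units and weeks numbered from zero.
risks = [
    {"name": "Assumption: data volume", "probability": 0.2, "impact": 100_000,
     "first_week": 0, "last_week": 5},
    {"name": "Dependency: vendor delivery", "probability": 0.3, "impact": 50_000,
     "first_week": 2, "last_week": 4},
    {"name": "Fly-by: year-end processing", "probability": 0.5, "impact": 20_000,
     "first_week": 8, "last_week": 9},
]

profile = exposure_profile(risks, weeks=12)

# Crude text chart: one '#' per 5,000 units of exposure.
for week, exposure in enumerate(profile):
    print(f"week {week:2d} {'#' * round(exposure / 5_000)}")
```

A profile that falls to zero before the final weeks is the picture of risks being shepherded to retirement.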

The Influence of Risk Tolerance on Risk Response Strategies

In a prior post on selecting means of communication, I quoted Master Kan, from the pilot of the early 1970s television series, Kung Fu:

Avoid, rather than check. Check, rather than hurt. Hurt, rather than maim. Maim, rather than kill. For all life is precious, nor can any be replaced.

We should adopt a similar rubric for selecting risk response strategies:

Avoid, rather than transfer. Transfer, rather than mitigate. Mitigate, rather than accept. For all risk response strategies have both a cost and a residual risk.

Selecting Risk Response Strategies

I bring this up because I see so many organizations and managers choose to mitigate or accept risks that they could otherwise avoid or transfer. Avoiding a risk usually results in an opportunity cost, or at least deferring the benefit, but it tends to result in the least residual risk. For example: responding to a schedule risk by removing some element from scope avoids the risk, at the opportunity cost of not having the capability provided by that element. Transfer and Mitigate responses usually have at least somewhat predictable direct costs while retaining some residual risk. Accepting a risk means it’s all residual, and acceptance can have a complex mix of direct and opportunity costs.
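The preference order in the rubric can be sketched as a simple selection rule in Python. The budget check, the cost figures, and the function name are assumptions for illustration; this deliberately ignores residual risk and the opportunity costs discussed above, which a real decision would have to weigh.

```python
from enum import IntEnum


class Strategy(IntEnum):
    """Lower value means more preferred, following the rubric."""
    AVOID = 1
    TRANSFER = 2
    MITIGATE = 3
    ACCEPT = 4


def choose_response(options: list[tuple[Strategy, float]], budget: float) -> Strategy:
    """Pick the most preferred feasible strategy whose direct cost fits the budget.

    `options` lists only the responses that are actually feasible for this risk.
    If none is affordable, the risk is accepted by default.
    """
    affordable = [strategy for strategy, cost in options if cost <= budget]
    return min(affordable, default=Strategy.ACCEPT)


# Hypothetical feasible responses and their estimated direct costs.
options = [
    (Strategy.MITIGATE, 5_000),
    (Strategy.TRANSFER, 12_000),
    (Strategy.AVOID, 30_000),
]

print(choose_response(options, budget=15_000).name)
```

With these figures the rule prefers transfer over the cheaper mitigation, because both fit the budget and transfer ranks higher in the rubric.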

In some cases, it’s about the perceived cost of the safer responses. But I see it happen most often in organizations following a merger or acquisition, where they haven’t reached an end state in their evolving culture. Perhaps one of the predecessor firms had a greater appetite for risk; perhaps middle management has internalized the acquisition itself as a willingness to take on significant risk. Or maybe their appetite for certain types of risks is higher than that of their new colleagues.

A few years ago, I worked with a customer that was being acquired by a much larger firm. They had initiated a project for the express purpose of reducing the chance of being found in non-compliance with a legal requirement, although they had relatively little exposure. The cost of the project far outweighed the potential cost of being found in non-compliance, or of making improvements to their existing manual process. But the decision-maker felt that the non-compliance risk absolutely had to be mitigated. That said, the project itself was very risky, in terms of schedule and quality. It was kicked off late, the vendor provided a relatively inexperienced team member in a key role, and there was no internal consensus on what business rules should be embedded in the process. In the end, a senior manager in the acquiring firm killed the project. Their view of the bundle of risks was quite different, and they decided to accept what they viewed as a relatively low-cost, low-impact risk, rather than take on all of that residual risk.

Gauging Appetite for Risk

It is extremely difficult to measure risk tolerance, or even to describe it in meaningful terms. In an interview, I once asked a PMO director about their organizational risk tolerance. He admitted that the question had never been asked before, and struggled to answer in a way that would be actionable for a contract project manager. Plainly, no organization is willing to admit that they have little appetite for risk, although few can express what level of risk they find acceptable. But in order to suggest risk response strategies, the team will benefit from an understanding of how the organization views their choices. So, let me propose a few interview questions that might start the process of gauging appetite for risk:

  • Are you willing to replace an established vendor with an acceptable level of performance in order to reduce costs? While the new vendor might have a lower price, any transition will involve a learning curve and, at least for a while, lower quality. If this is an acceptable trade-off, then the organization should be seen as having a somewhat higher appetite for risk.
  • Are you willing to accept higher retention risk after an implementation project is completed, in order to avoid the costs of augmenting your staff with temporary workers? Most projects that replace a legacy system provide a platform for team members to gain new skills and experience, and it is common for some folks to seek out greener pastures. If cost avoidance matters more than staff retention, then that tells you a bit about what risks they are willing to accept.
  • Are you willing to accept higher quality risk, in order to finish on schedule? If the project does not have an immovable finish-by date, follow up with questions on what drives this response.
  • Are you willing to defer some deliverables in order to reduce schedule risk? The answer can lead to some interesting discussions on perceived benefits of the project deliverables.
  • Are you willing to add administrative complexity, in order to reduce implementation risk? Again, this speaks to the trade-off between quality and cost.

While this list is not particularly comprehensive, I think it will provide some insight into the organization’s appetite for risk. Or at the very least, their tolerance as it applies to the proposed project. If you have some additional interview questions you’d like to add to this short list, please leave a comment.