Deliberate Risk Taking in Software Teams

Risk is a difficult concept to understand and deal with. As an industry, when building software, we talk about lowering our overall risk and that will increase our chances of success. So many things can potentially go wrong on projects that we are simply conditioned to remove those possibilities as much as possible. In many cases, I would agree with that approach, but just as in the field of Enterprise Risk Management (ERM), our industry needs more nuanced ways of looking at risk in order to move forward and striking the right balance between debilitating risk aversion and wild-west gun slinging. Today this is managed by our industry gurus, our project managers, and scrum masters using intuition and experience. Perhaps if we formalized the thinking about risk, we can increase our overall competency beyond our sages of wisdom.

ERM categorizes risks into two categories: ancillary and core.

  • A core risk is one that the company wants to take on because often it is the reason for them being in business and earning revenues. The business is said to be exploiting this risk and deriving profits from doing so. If you’re a consulting company, such as Polaris Solutions, you really want to take on as many fixed-bid projects as you can – as long as you know they are the types of projects your company can do well. Similarly, a staff augmentation firm that might not have strong engagement / project management competencies may want to avoid those same engagements. The consulting group should take on the risk because in theory it should be compensated for taking on the risk and it knows it can deliver on or ahead of schedule. Additionally, imagine a cloud hosting service provider like Microsoft Azure. People pay Microsoft so they can offload their risk of outages and complexity in infrastructure. If Microsoft hires a lot of really smart people that do infrastructure management, it wants the risks associated with that work because it can charge a price for that. In theory – their competency is keeping the datacenter functioning well.
  • An ancillary risk is one that the company would love to reduce, mitigate, or offload to a third party whenever it can. Companies extend credit to their customers all the time through “accounts receivable” – which are promises of customers to pay invoices within 30/60/90/etc. days. There is a risk always that a customer might not pay their invoice to the company. Most companies don’t like that risk and that’s why many have shrunken payment terms, or outsource collections from the outset to someone that is good at “collecting.” At my company, Polaris, there’s the risk that an employee gets hurt on the job. I don’t personally like having that risk, so we buy (in addition to compliance reasons) workers compensation insurance which prevents me from having to pay the medical bills directly. That’s an ancillary risk that I don’t want, and is not core to what we do.

These concepts can apply to software development teams as well. Unfortunately, most of us treat all risks in the same way – they are things that we want to remove from our paths. Some examples of risks that indeed are ancillary for most enterprise development teams:

  • Laptop / hard drive failure results in source code getting lost.
  • Changing business conditions resulting in changing requirements (*note on this in my next post)
  • Dependencies on other teams’ output not getting built in time
  • Dependencies on third party libraries that are unexpectedly failing
  • Not considering product like hardware when testing and sizing applications results in scalability challenges

If those are ancillary risk, then what would possibly be a core risk to the team? A software development team exists to build software, and therefore the core risks are often surrounding the software itself. Some examples might be:

  • An experiment in new design patterns results in slower code
  • An architecture spike didn’t produce a successful POC
  • Some new feature might be more complicated than we thought it was going to be

These do share a commonality that they are the actual product, and sometimes within the teams own control. I’ve seen agile and waterfall teams unnecessarily pad estimates in such a way to reduce their risk of missing commitments to the project or product. By padding estimates, especially on an agile team, they are ensuring that fewer User Stories are being accepted by the team. Those teams often always meet their commitments and then have time left over. If this culture persists, the team takes on less and less over time. Their overall value to the business becomes less and less as they are marginalized from key product enhancements. In extreme cases, when left unchecked, this leads to team members being “let go.”

In this post, I’m not arguing to take on all commitments you can, but I am pushing that we recognize when our teams are starting to look “risk averse” and are being overall safe beyond what is acceptable. As members of the team, we can recognize this most easily, but a trained scrum master can recognize this through the empirical data on sprints. The moral of the story is that padding estimates is not avoiding the same type of risk as adding another hard drive to prevent hardware failure. So think differently about these cases, and we should ensure that our teams keep striking a good balance between too much/too little core risk.

What do you think – do you see any core risks that your team is silently mitigating?

Check out the Chicago ALM User Group – June 25th – Git

If you’re in Chicago, and are interested in learning more about ALM topics, the Chicago ALM User Group is a great group to participate in. Full disclosure, I was one of the founding organizers years back, before I’ve moved down to St. Louis. This month, the topic will be Git. So if you’re in the neighborhood, be sure to check out the meeting on June 25th.