Is your approval process working?

Every business has scenarios which require some form of approval process. Common examples are:

  • requests for time-bound privileged access to resolve an incident or perform a change
  • requests for enduring access to an application which isn't part of the typical employee role
  • requests to approve a pull request for a new piece of code or configuration
  • requests to purchase something

The approval process can take many different forms, but we typically rely on some form of four-eyes check, or human approval, which usually involves three parties in a flow such as:

  1. a request is submitted from one human (the requester)
  2. to another (the approver),
  3. for the purposes of mitigating risk on behalf of the business (the risk manager).

There are of course occasions where there are multiple approvers, or where the approver is also the risk manager. Critically, though, there is some separation between the requester and the approver, to minimise the risk of malicious, accidental or unauthorised harm through the resource being requested. This is normally documented in policy and control statements which require an approval process to be followed.

To make the narrative easier to follow, this article focuses on examples using requests for access to a resource, but many of the considerations could apply equally to a code review or a finance approval. The focus is on identifying the business trade-offs involved, and assumes technology to support the choices is readily available (which it typically is).

Design Considerations

There are many things to consider in designing what an appropriate approval process should look like.

  • What is the risk involved with the action/access being requested? Apart from the approval process, are there other mitigating controls in place?
  • How frequently is the approval needed? Is this a multiple-times-per-day or a once-a-year type of activity?
  • What is the acceptable latency for getting the approval? If it is required to resolve a critical incident, then waiting for a different timezone or the start of the working day won't work.
  • What is the right level of skill and/or situational awareness to provide a meaningful approval?

These key pieces of information will typically inform the design of an approval process, most critically who will have the permissions to provide the approval.

Effectiveness

There are many perspectives from which to measure the effectiveness of the approval process, and the contextual factors mentioned above are critical to establishing this. Let us consider it from the perspective of the three different actors involved.

The Risk Manager

The foundational objective of many approval processes is compliance with policy. There will be a policy which states that all access needs to be approved, and an auditor will expect to see evidence of approval for each request. This is absolutely foundational, but on its own it provides very little assurance that any risk is being mitigated.

compliance != risk mitigation

To establish whether risk is actually being mitigated, we need to dig a bit deeper:

  • how many times is this access being requested? The more frequent the requests the less likely they will be subject to any scrutiny.
  • what is the approval versus rejection rate? If there are never any rejections, then is the process providing sufficient challenge?
  • if there is a group of approvers, which individuals are performing the approvals? Do they have the contextual awareness of what they are approving?
  • what is the response time for each approval? Is there any consideration taking place before clicking the approve button?
  • is the access actually being used once it has been granted?
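
As a rough illustration, the questions above can be turned into a handful of numbers per approval process. The sketch below assumes a log of past requests is available; the AccessRequest record and its field names are hypothetical and would need mapping onto whatever your access management tooling actually exports.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median
from typing import Optional


@dataclass
class AccessRequest:
    """One row of a hypothetical approval log (field names are assumptions)."""
    approver: str
    requested_at: datetime
    decided_at: datetime
    approved: bool
    first_used_at: Optional[datetime] = None  # None if the access was never used


def approval_metrics(requests: list[AccessRequest]) -> dict:
    """Summarise volume, rejection rate, response time, approver spread and usage."""
    if not requests:
        raise ValueError("no requests to summarise")

    approved = [r for r in requests if r.approved]
    response_seconds = [
        (r.decided_at - r.requested_at).total_seconds() for r in requests
    ]

    # Count how many decisions each individual approver made.
    decisions_per_approver: dict[str, int] = {}
    for r in requests:
        decisions_per_approver[r.approver] = decisions_per_approver.get(r.approver, 0) + 1

    # Share of approved requests where the access was never actually used.
    unused = sum(1 for r in approved if r.first_used_at is None)

    return {
        "total_requests": len(requests),
        "rejection_rate": 1 - len(approved) / len(requests),
        "median_response_seconds": median(response_seconds),
        "decisions_per_approver": decisions_per_approver,
        "unused_grant_rate": unused / len(approved) if approved else 0.0,
    }


if __name__ == "__main__":
    start = datetime(2024, 1, 1, 9, 0)
    sample = [
        AccessRequest("alice", start, start + timedelta(seconds=10), True, start + timedelta(hours=1)),
        AccessRequest("bob", start, start + timedelta(seconds=30), False),
    ]
    print(approval_metrics(sample))
```

A rejection rate of zero, a median response time of a few seconds, or a high share of unused grants does not prove a process is ineffective, but each is a prompt to look more closely.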

The Requester

In the vast majority of cases, the person making the request is doing so in order to perform their role. We have controls in place to mitigate risks to the business, but we should be listening to feedback from people requesting access regarding:

  • How often do they need to request access?
  • How long do they need to wait?

There are always trade-offs to be made, but a high frequency of requests needing approval from outside the immediate team often indicates an underlying problem. Whether it is poor engineering practice resulting in frequent use of break-glass, or new starters needing to request access to dozens of the same applications as their teammates, these are problems that need to be fixed.

The Approver

Arguably the most important person in the process is the approver. Depending on the process or the organisation, they could be part of a centralised IAM team, an incident manager who needs to approve break-glass access, a line manager, or an application owner.

For many forms of access, selecting approvers is not easy. As organisations scale, contextual awareness of employees and applications reduces, making the task even harder.

Key questions for the approvers:

  • Do they understand the risks associated with the access they are approving? Is the blast radius of the role they are approving understood? Is the requester asking for the least privilege role?
  • Do they understand under what circumstances access should be granted or not? Are we expecting them to ask questions, perform any checks? Have they been trained to do so?
  • Do they have sufficient capacity (within an appropriate time window) to process the number of requests they receive? Is the collective "cost" of this approval time understood?

Concluding on Effectiveness

When you dig a bit deeper into how the process is operating and gather some of the metrics above, it is not uncommon to find a subset of processes which:

  • create inefficiency and delay for the requesters
  • provide a constant stream of interruption for approvers who don't have the situational awareness to make an effective decision
  • and do little to genuinely mitigate risk

There are of course exceptions to this, but across all of the approval processes within your organisation, it is very uncommon not to find some which exhibit these characteristics. The processes which are used most frequently are often a good place to start looking.

There are many trade-offs to be made here. Sometimes some "friction" in the process can be a useful tool to encourage the right behaviours, but make these trade-offs consciously and recognise the impact they have on all participants in the process.

Alternative Approaches

There will always be some approval processes needed, but some techniques to try and mitigate more risk and reduce inefficiency are:

  • strict SDLC for all components - promote everything as code through pipelines and scrutinise all break glass access. Do not allow a culture of tweaking in the environment.
  • role decomposition - if high risk roles are being used on a regular basis to perform a specific action, can a lower privilege role, perhaps with a different approval flow, be created to reduce the number of times the higher risk role is used?
  • automation of any business rules - the classic integration is to check for a valid change or incident id when requesting break-glass access, but are there other checks which could be codified?
  • training and education for requesters and approvers - make sure users know how to get access to things they need and approvers understand what they are approving and when to redirect people to alternative options.
  • for lower risk items, move from a "preventative" approval process, to a "detective" reporting process that can auto-approve at the point of access, but then periodically report on what access has been used and provide training if inappropriate.
  • make sure different employee profiles in the organisation have the right "birth right" access. New employees should typically get most of their access provisioned automatically and not have to spend their first week gathering permissions to do their jobs.
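
To make the point about codifying business rules a little more concrete, here is a minimal sketch of an automated pre-check for a break-glass request. Only the "valid change or incident id" check comes from the list above; the Ticket record, its fields, and the extra checks on status, time window and assignee are hypothetical examples of rules that could be codified, and the ticket lookup itself is assumed to happen in whatever change or incident system you use.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class Ticket:
    """Hypothetical change or incident record fetched from a ticketing system."""
    ticket_id: str
    status: str               # e.g. "open" or "closed"
    window_start: datetime    # start of the agreed change/incident window
    window_end: datetime      # end of the agreed change/incident window
    assignees: set[str]       # people expected to work on the ticket


def check_break_glass(
    requester: str, ticket: Optional[Ticket], now: datetime
) -> tuple[bool, str]:
    """Return (passed, reason) for a break-glass request against codified rules."""
    if ticket is None:
        return False, "no matching change or incident found"
    if ticket.status != "open":
        return False, f"{ticket.ticket_id} is not open"
    if not (ticket.window_start <= now <= ticket.window_end):
        return False, f"request is outside the window for {ticket.ticket_id}"
    if requester not in ticket.assignees:
        return False, f"{requester} is not assigned to {ticket.ticket_id}"
    return True, "all codified checks passed"


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    incident = Ticket(
        ticket_id="INC-EXAMPLE",
        status="open",
        window_start=now - timedelta(hours=1),
        window_end=now + timedelta(hours=1),
        assignees={"alice"},
    )
    print(check_break_glass("alice", incident, now))
    print(check_break_glass("bob", incident, now))
```

Requests which pass every check could be routed to a lighter approval flow or reported on after the fact, while failures go to a human with the reason attached, so the approver's attention is spent on the cases that genuinely need judgement.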

What else?

I am quite sure there are other considerations not taken into account above. What would you add to the conversation?

Christian Shpilka

AI, Co-founder of Pixoft and Litrol, Software Development Advisor – HQ Science Ltd.

5d

Phil, thanks for sharing!

Interesting article. I'd add an additional consideration. From a security perspective there's a risk judgement to be made, but what's almost never considered is the opportunity cost of the approval process. How much does even having an approval process cost, in terms of not just implementation of the process but the chilling effect of the process on growth and change? I'd suggest considering alerting and monitoring of access in place of an approval process in the first place. As you point out, approval is often a tick-the-box process anyway. Regardless, even granting approval does almost nothing to prevent risky behaviour/actions once granted. I'd argue that in almost all cases, approval processes can be replaced with SDLC automation and judicious (not overly noisy) alerting.

Compliance != risk mitigation. So true!

Kai Harrod

IT Strategy, Change and Run Leader, Enterprise Architect or Risk leader. Experienced in Transformation including Mergers, Divestment & Acquisition within Financial Services

1y

Try Approver Apathy which you also neatly described. I can only concur that the right control is used for the severity of risk faced to ensure the cost of control doesn’t exceed the cost of failure. As described, oftentimes what appears to be a necessary control is nothing more than an inability to automate risk MI any other way. This can be addressed by a simple Cost Benefit analysis across Resource Effort and Delay Opportunity.

Matt Davis

Head of Core Infrastructure & Platform Services at Lloyds Banking Group

1y

Nice article! I think approver fatigue is definitely something to consider at scale. JIT, GitOps and the complexity of cloud in general have led to much higher volumes of approvals, and there is definitely a risk these start to become part of “trust based” relationships. In aviation, they don’t let pilots/co-pilots fly together often to avoid this developing. In IT we obviously have to look to Sentinel and similar tools to track patterns of behaviour and identify problems/responses - and this will only have to get more and more sophisticated.
