
Vulnerability timelines, SLA, Measurement and prioritization – the how and the why of application and cloud security objective setting


How should you set vulnerability targets for security and development teams across application security, cloud security, container security and infrastructure security? In this article, we explore the “why” of vulnerability prioritization and ask whether SLAs alone are good enough to prioritize.

The number of vulnerabilities has been increasing year on year, by 34% according to MITRE CVE statistics.

MITRE-CVE

It is no secret that the complexity of cloud and application security vulnerabilities is increasing consistently.

Modern organisations are building applications at an increasing speed, and security teams are racing to catch up with them. It comes as no shock that 54% of application security and cloud security professionals have considered changing jobs or industries in the last few years (CxO online).

In these challenging times, there is also a well-known, acute shortage of cybersecurity professionals, and it is most pronounced in the cloud security and application security spaces.

This is no surprise, since those two spaces are probably more complex for newcomers to grasp.

How to move forward, then?

  • Contextualize vulnerabilities
  • Prioritize based on risk and context
  • Set meaningful targets

We covered contextualisation in a previous blog, Vulnerability Contextualization and Risk-based prioritisation for application security and cloud security.

We also covered the prioritisation effort in an extensive whitepaper, Prioritizing Vulnerabilities On Cloud And in Software – CVE and the Land of Broken Dreams.

In this blog, we will focus on vulnerability timelines and the challenge of setting them.

Before we move on to the timeline, let’s dive into the definition of SLA in this particular context.

SLA and Measurement Definitions

Let’s start with definitions of the various terms, such as SLA and SLO:

  • A service-level agreement (SLA) is a commitment between a service provider and a client. In our specific case, an example of an SLA is: “A critical level vulnerability must be fixed within X number of days”.
  • A service-level objective (SLO) is the objective that teams agree on internally and is a critical element of a service-level agreement between a service provider and a customer.
  • An Operational Level Agreement (OLA) is an internal agreement within an organisation. An example of an OLA is: application X or team Y needs to maintain a fix rate for critical vulnerabilities below 15 days.

Note. An SLO attached to an SLA is typically a contractual commitment, while an SLO attached to an OLA is guidance. SLAs are often misused: they are agreed upon or mandated internally within teams where an OLA should be used instead. In summary, where contractual fines exist, the objectives must be met contractually; internal objectives are usually captured in policies and agreed upon amongst teams without contractual consequences. An example of an OLA: team X must maintain an uptime of 99.99%. An example of an SLA, contractually agreed between a client and a vendor: if availability is below 99%, a service credit/fine will be issued.

  • OKR – Objectives and Key Results (with metrics and timeframes). An OKR is a collection of objectives and supporting metrics within timeframes. A quick example of an objective for the team could be the number of vulnerabilities resolved per sprint, or a balance between user stories and security/bug fixes. The key part of OKRs is setting objectives and defining the key metrics to measure those objectives. Helpful books on the subject are Measure What Matters and How to Measure Anything in Cybersecurity Risk.

| | SLO | SLA | OKR |
|---|---|---|---|
| Definition | A target reliability level objective | A legal contract or agreement that, if breached, will have penalties | The team will meet an objective agreed internally (e.g. fix rate of critical below 14 days) |
| Example | Critical vulnerabilities will be resolved in 28 days 95% of the time | Publicly available products will have 0 critical vulnerabilities upon release; critical vulnerabilities disclosed will be solved in 10 days | The application will maintain a fix rate defined in SLO of 95% resolution within 15 days |
| Who sets it | Product owner in partnership with security teams | Business development, legal teams, IT and DevSecOps | Product owners in collaboration with security team |

Supported by: SLO or Key Indicators (the K in OKR)

Vulnerability and SLA/SLO/SLI Descriptions
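
As an illustration of how an SLO such as “critical vulnerabilities will be resolved in 28 days 95% of the time” could be checked, here is a minimal Python sketch. The record layout, the `slo_attainment` function and the sample dates are assumptions made up for this example, not the output of any specific tool. Open findings are simply excluded here; a production version would probably count open findings that are already past the target as misses.

```python
from datetime import date

def slo_attainment(findings, target_days):
    """Return the fraction of resolved findings fixed within target_days."""
    resolved = [f for f in findings if f["resolved"] is not None]
    if not resolved:
        return None  # nothing resolved yet, so attainment is undefined
    within = sum(
        1 for f in resolved
        if (f["resolved"] - f["discovered"]).days <= target_days
    )
    return within / len(resolved)

# Hypothetical critical findings (illustrative dates only)
criticals = [
    {"discovered": date(2023, 1, 2), "resolved": date(2023, 1, 20)},  # 18 days
    {"discovered": date(2023, 1, 5), "resolved": date(2023, 2, 10)},  # 36 days
    {"discovered": date(2023, 1, 9), "resolved": date(2023, 1, 30)},  # 21 days
    {"discovered": date(2023, 2, 1), "resolved": None},               # still open
]

attainment = slo_attainment(criticals, target_days=28)
# The SLO from the table: 95% of criticals resolved within 28 days
slo_met = attainment is not None and attainment >= 0.95
print(f"attainment={attainment:.0%}, SLO met: {slo_met}")
```

With this sample data, two of the three resolved findings land inside the 28-day target, so the 95% objective is missed.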

The Vulnerability Context and Landscape

The vulnerability landscape in modern organisations is complex; we can categorise vulnerability types or security misconfigurations into several categories. Vulnerabilities in the various categories have different behaviours and require different levels of attention.

We can categorise assets into the following categories:

  • Application Security – Vulnerabilities in software, 3rd party libraries, and code running in live systems
  • Infrastructure Cloud security-related – Vulnerabilities that concern images or similar infrastructure systems running in the cloud
  • Cloud security – Misconfiguration of cloud systems (Key manager, S3) 
  • Network security / Cloud security – Vulnerability affecting network equipment like WAF, Firewall, routers
  • Infrastructure security – Categorised as everything that supports an application to run, that is, traditionally servers, endpoints, and similar systems
  • Container security – Vulnerabilities derived from either the image (in a registry or deployed) or the build file that composes it.

The systems used to measure the security posture and health of the different elements that compose our system are wide-ranging, as illustrated in the picture below.

The resolution time and SLAs are fairly different between asset types across the various categories.

Vulnerability Tooling Landscape (note RAST, often referred to as RASP)

Vulnerabilities Timelines

The picture below explains the complexity of dealing with multiple metrics. It is tempting to simply use an SLA, but when do you start the timer?

  • The timeline at the bottom shows the lifecycle of a vulnerability
  • The second line shows the timeline of the vulnerability in your organisation, from the date of discovery to the patch/confirmed resolution
  • The top line shows the timeline of risk acceptance/mitigation
SLA Timelines & Timers & risk acceptance/mitigation

Vulnerability Timelines

Timelines to fix vulnerabilities are dictated by several events and are, in reality, composed of several timelines. We start from the official public timeline (bottom), which runs from the public or private disclosure of a vulnerability to the release of a patch/bugfix.

At any point in this evolution, your system can detect the vulnerability.

Usually, this happens when application security tooling releases a signature or detection for a newly discovered vulnerability. The zero-day window (second timeframe) spans the time between the vulnerability being released and the vendor releasing a patch/fix.

Usually, when a vulnerability is disclosed publicly, security scanner vendors tend to release a detection for it within days, enabling organisations to detect it.

Strictly speaking, the exposure window is the time from the release of the vulnerability to its resolution in your system. In practice, however, the timer for the exposure window starts when the vulnerability is identified in your system and stops when it is resolved.

SLAs or SLOs are usually target times measured from the vulnerability being identified in the system, or from the ticket being raised with the individual team (resolution SLA).

Discovery to Declaration to CVE – This timeline is usually the most dangerous. It covers the discovery of the vulnerability; commonly, no patch is available during this period, and systems are at risk from so-called zero-days.

  • Disclosure in the wild – The vulnerability is disclosed widely on the web, for various reasons, giving the vendor no chance to fix it. The resolution/mitigation time becomes critical.
  • CVE registration – The CVE register acknowledges the vulnerability, and the vulnerability receives a specific identifier.
  • PoC – Proof of concept made available. Usually, the PoC is a piece of code that exploits the vulnerability in systems.
  • Vulnerability identified in network/container/code.
  • The vulnerability is being worked on by a team – A vulnerability or patch is not always straightforward to fix. Sometimes an update is simple and requires only a few changes, whereas other times it requires extensive testing and careful planning.
  • The vulnerability is being remediated by the team.
  • The vulnerability remedy is being confirmed (pentest, security scanner).

Resolution of vulnerabilities can also be driven by architectural restructuring and upgrading to more modern systems. 

A system could also be out of support and maintenance, and the vendor might not be available anymore. 

The risk of those vulnerabilities needs to be addressed and adjusted depending on the criticality of the system and the blast radius it could cause. 

Nonetheless, a deeper analysis of systems might lead to compensating controls (like WAF, RASP, system lockdown, and restrictive access control) that lower the probability of exploitation and hence reduce the overall risk.

SLA, SLO and Vulnerability Timelines

The picture below adds the commonly used timers and shows when they start and finish. These timers matter because the SLAs/SLOs set by the organisation are based on some or all of them. The most frequently used timer is the Mean Time to Resolve, generally measured from the discovery of the vulnerability to its resolution. A caveat: unresolved vulnerabilities inflate the Mean Time to Resolve and clutter the data. A better timer for an SLA is usually the Mean Time to Resolve from Acknowledgement.

  • True Exposure is the timeline from the vulnerability being released/discovered (publicly or not) to the time it gets fixed in your organisation. Caveat: the vulnerability might be discovered or remain hidden in your organisation; it all depends on when a scanner signature, a vendor or a bug bounty programme discloses the vulnerability to your organisation.
  • Zero-day exposure is usually the time between the vulnerability being released and a patch being made available by the vendor (divided here for clarity between hidden, when the vulnerability is discovered, and explicit, when the discovery is made available on the web).
  • Mean Time to Resolve – MTTR – is the average time a ticket takes to get resolved, usually captured from the ticket being acknowledged to the ticket being resolved.
  • Mean Time to Acknowledgment – MTA – is the time between the discovery of a vulnerability and the time it gets taken into work by the dev team.
  • Mean Time to Resolution from Acknowledgment – MTTR-A – counts the time that passes from the acknowledgment of the ticket to the resolution of the vulnerability in the developer ticket. It is the average time it takes to resolve a vulnerability once work has started, and is a better indicator of work being done.
  • Mean Time to Open – MTO – is the time between the ticket being raised and the ticket being worked on. This is usually similar to MTTR-D.
  • Mean Time to Resolve from Discovery – MTTR-D – is the overall time from discovering a vulnerability to full resolution.
SLA Timelines & Timers
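
To make the timers above concrete, here is a minimal Python sketch that derives MTA, MTTR-A and MTTR-D from ticket timestamps. The ticket fields (`discovered`, `acknowledged`, `resolved`) and the sample data are assumptions for illustration; a real ticketing export will use different field names. Open tickets are left out of the means so they do not distort the averages; they should be tracked separately.

```python
from datetime import datetime
from statistics import mean

def days_between(start, end):
    """Elapsed time between two timestamps, in days."""
    return (end - start).total_seconds() / 86400

def timers(tickets):
    """Compute mean timers (in days) over fully resolved tickets only."""
    closed = [t for t in tickets if t["resolved"] is not None]
    return {
        # discovery -> taken into work by the dev team
        "MTA": mean(days_between(t["discovered"], t["acknowledged"]) for t in closed),
        # acknowledgment -> resolution
        "MTTR-A": mean(days_between(t["acknowledged"], t["resolved"]) for t in closed),
        # discovery -> resolution
        "MTTR-D": mean(days_between(t["discovered"], t["resolved"]) for t in closed),
    }

# Hypothetical tickets (illustrative timestamps only)
tickets = [
    {"discovered": datetime(2023, 3, 1), "acknowledged": datetime(2023, 3, 3),
     "resolved": datetime(2023, 3, 10)},
    {"discovered": datetime(2023, 3, 1), "acknowledged": datetime(2023, 3, 5),
     "resolved": datetime(2023, 3, 20)},
    {"discovered": datetime(2023, 3, 2), "acknowledged": datetime(2023, 3, 4),
     "resolved": None},  # still open, excluded from the means
]

result = timers(tickets)
for name, value in result.items():
    print(f"{name}: {value:.1f} days")
```

In this sample, MTTR-D is the sum of MTA and MTTR-A, which is why a slow acknowledgment phase can hide behind an apparently reasonable overall resolution time.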

How to set targets

Service Level Objectives/Agreements are not a solution in themselves but an aid to setting targets for the team; they can be a useful starting point if there is nothing else.

Expanding on the subject here would be too extensive; we wrote several whitepapers and articles on the subject.

In short, targets based on risk are much more precise but also variable.

Setting targets is a complex process and should be:

  • Collaborative
  • Outcome-based
  • Reviewed often

We wrote a previous article on SLA and how to set them for cloud security, container security, infrastructure and vulnerability, and application security.

We also wrote a specific blog on how to set targets for infrastructure and application security and move toward OKRs.

We are publishing another book/whitepaper on Vulnerability Management and how to move away from pure SLAs and more toward team-based OKRs.

Examples of SLA

Before we dive into how to set SLAs, here is a bit of context on SLAs and what they can be based on.

  • Severity-based SLAs – the simplest but also the most basic. They don’t offer a contextual view of where the vulnerabilities are.
  • Multi-severity SLAs – these SLAs change based on the severity of the vulnerability or the criticality of the asset. They are more contextual, and consistent, as the criticality of an asset and the severity do not change frequently. Nonetheless, they reflect a very static view of the organisation.
  • Exposure-based SLAs – more contextual, but they lack the focus on criticality and severity. Nonetheless, these SLAs are consistent and aligned with part of the probability of exploitation (derived from where the assets are exposed).
  • Risk-based SLAs – the best at representing an environment, but they come with challenges due to the dynamic nature of risk.

Setting SLA/ SLO based on vulnerability severity

Vulnerability Severity Based SLA

This is the most common type of objective for teams at an early stage of fixing vulnerabilities.

The SLAs are usually set for different levels of vulnerability severity:

  • Critical severity vulnerabilities fixed in X amount of time /days
  • High severity vulnerabilities fixed in X amount of time /days
  • Medium severity vulnerabilities fixed in X amount of time /days
  • Low severity vulnerabilities fixed in X amount of time /days
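
A severity-based SLA like the one above can be sketched as a simple lookup from severity to a number of days. The values in `SLA_DAYS` are hypothetical placeholders for the X in each bullet, and the function names are made up for this example.

```python
from datetime import date, timedelta

# Hypothetical targets in days; the article leaves the real values as X
SLA_DAYS = {"critical": 15, "high": 30, "medium": 60, "low": 90}

def due_date(severity, discovered):
    """Date by which a finding of this severity should be fixed."""
    return discovered + timedelta(days=SLA_DAYS[severity.lower()])

def is_breached(severity, discovered, today):
    """True when the finding is still open after its due date."""
    return today > due_date(severity, discovered)

print(due_date("Critical", date(2023, 3, 1)))                  # 2023-03-16
print(is_breached("high", date(2023, 3, 1), date(2023, 4, 1)))  # True
```

The lookup ignores where the vulnerability sits, which is exactly the lack of context that makes severity-based SLAs the most basic option.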

Asset Criticality Based SLA

These SLAs are a bit more sophisticated and rely on different levels of asset criticality.

Conversations with different teams have surfaced confusion when working with different SLA levels.

The SLA is usually set for different levels of vulnerability severity, per class of service:

Critical Services 

  • Critical severity vulnerabilities fixed in X amount of time/days
  • High severity vulnerabilities fixed in X amount of time/days
  • Medium severity vulnerabilities fixed in X amount of time/days
  • Low severity vulnerabilities fixed in X amount of time/days (caveat: most organisations don’t get around to fixing low or medium severity vulnerabilities)

Non-Mission Critical Services – Generally speaking, more time to fix vulnerabilities

  • Critical severity vulnerabilities fixed in X+Y  amount of time/days
  • High severity vulnerabilities fixed in X amount of time/days
  • Medium severity vulnerabilities fixed in X amount of time/days
  • Low severity vulnerabilities fixed in X amount of time/days
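
One way to picture an asset-criticality-based SLA is a two-level lookup: first the class of service, then the severity. The sketch below is only illustrative; the tier names and every number in `SLA_MATRIX` are assumptions standing in for the X and X+Y values above.

```python
# Hypothetical targets in days per service tier and severity
SLA_MATRIX = {
    "mission_critical": {"critical": 7, "high": 15, "medium": 30, "low": 60},
    "non_mission_critical": {"critical": 15, "high": 30, "medium": 60, "low": 90},
}

def sla_days(service_tier, severity):
    """Look up the fix target for a finding on an asset of the given tier."""
    return SLA_MATRIX[service_tier][severity]

# The same critical finding gets a different target depending on the asset
print(sla_days("mission_critical", "critical"))      # 7
print(sla_days("non_mission_critical", "critical"))  # 15
```

Keeping the matrix in one place makes the different levels explicit, which may help with the confusion teams report when several SLA levels coexist.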

Exposure Based SLA

The SLA is based on exposure of assets, usually assets that are externally facing. The exposure level is generally more complex to measure and relies on asset management accuracy or some form of tag-based strategy in container and cloud.


Risk Based SLA

Risk-based SLAs are more sophisticated and rely on composite metrics to set targets. These SLAs are the best to use but also the most complex to implement.

  • Risk Triage SLA – the agreed time it should take to triage a risk and accept/reject it.
  • Risk SLA – the agreed maximum time the risk should remain in a risk status such as accepted or signed off (maximum risk time).

Advantages
  • High granularity
  • Takes into account different factors
  • Risk can be based on the probability of exploitation
  • Risk can be based on asset criticality

Disadvantages
  • Those SLAs are not the most intuitive
  • The SLA can change as the landscape changes
  • The same vulnerability can have a different SLA depending on the deployment
  • The SLA can change while work is in progress (rare, but worth considering)
  • Requires middleware to enrich scanner data

Risk Based SLA

Additional Considerations

When setting SLAs, no one rule fits all. It is important to remember that an SLA should serve a purpose: driving metrics and resolution time down, or keeping resolution time consistent.

Risk Considerations

The issue with risk and severity

Currently, risk and severity are two words used interchangeably when they should not be. Severity is a non-actualized risk, while risk expresses not only the potential of a threat to manifest but also the impact and the probability of it manifesting.

How to calculate risk

The risk level mentioned above can be calculated in multiple ways, from simple severity (risk with no context at all) to location-based and granular probability-of-exploitation-based approaches.

The following is a more detailed version of the risk formula:

Risk = Probability (Likelihood of exploitation, Locality)  * Severity * Impact 

Contextual aspects are based on:

  • The severity of a vulnerability – how much a 3rd party vendor has declared that vulnerability to potentially be dangerous
  • Probability of exploitation – how likely is that vulnerability to be exploited
  • The locality is a factor in the probability of exploitation 
  • Impact (also known as a factor of the Business Impact assessments) communicates how much damage a vulnerability could cause to the organisation
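
The risk formula above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: every factor is normalised to the range 0 to 1, severity is taken as a CVSS base score divided by 10, and probability is modelled as likelihood multiplied by locality. The article does not prescribe these choices, and all input values are made up.

```python
def probability(likelihood, locality):
    """Probability of exploitation from likelihood and locality (both 0..1).

    Multiplying the two is an assumption for this sketch.
    """
    return likelihood * locality

def risk(likelihood, locality, severity, impact):
    """Risk = Probability(likelihood, locality) * Severity * Impact."""
    return probability(likelihood, locality) * severity * impact

cvss = 9.8  # hypothetical CVSS base score for one vulnerability

# Same vulnerability, two hypothetical deployments
external = risk(likelihood=0.6, locality=1.0, severity=cvss / 10, impact=0.9)
internal = risk(likelihood=0.6, locality=0.2, severity=cvss / 10, impact=0.2)

print(f"internet-facing payment service: {external:.3f}")
print(f"internal test environment:       {internal:.3f}")
```

The severity is identical in both cases, yet the resulting risk differs by more than an order of magnitude, which is why the same vulnerability can end up with different targets depending on where it is deployed.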

Risk-based threat assessment is usually done by security professionals. Still, this results in an overwhelming job, as the factors that need to be considered are simply too many and vary too quickly.

A topography of IT components in a typical enterprise/ Modern deployments

The following is a list of elements security professionals need to consider when triaging and deciding which vulnerabilities to fix first:

  • How an application is being built
  • Where it is deployed (which network, which environment) 
  • What kind of data does the application process
  • How many of the components are external, Internal or connected to those
  • What are the vulnerabilities of the code, libraries and API that the application is building
  • Where are the encryption keys stored? Are there any misconfigurations in the storage system 
  • Is any of the systems where the application is being deployed vulnerable or has it become vulnerable
  • Is any of the software in the system where the application is being deployed
  • Is there any threat actor group targeting a specific vulnerability/system
  • What is the blast radius if one of those components gets compromised

The following elements can be considered when calculating the components of risk:

Probability of exploitation

  • The severity of a vulnerability (CVE, CWE, CVSS and CWSS)
  • The locality of an asset, also known as Context
  • Exploitability of a vulnerability based on the availability of Proof of concept or code snippet
  • Probability of an attacker targeting the vulnerability
  • Active exploitation of the vulnerability by threat actor groups
  • Discussion on Twitter, LinkedIn, Reddit and other forums
  • The freshness of the vulnerability (in the first 40 days, vulnerabilities are exploited/targeted more frequently)

Impact on system

  • What data is being processed by the system
  • How many users are accessing the system or could be impacted 
  • How much revenue could be impacted if the system is unavailable, and for how long
  • Contractual impact – how much damage/ credit clients need to be compensated for a failure in a system
  • Brand image damage – how much new business
  • Stock/Share price damage – how much a public disclosure affects the trust of the stock market (in direct relationship with  

We recently wrote a whitepaper that expands on this problem: Contextual and Risk-based Prioritization of vulnerabilities in cloud and applications.

Conclusion

No matter how you decide to set targets, it is important that those targets are collectively aligned with team objectives and business objectives.

An objective not agreed upon by the business or teams is completely aspirational.

Measurable objectives are the key to achieving objectives. A rule of thumb is to establish SMART objectives:

S – Specific

M – Measurable

A – Actionable/Achievable

R – Realistic/Relevant

T – Time-Bound

Those objectives must be collectively agreed upon based on metrics that can be observed and measured.  

Measuring the Mean Time to Resolution and how much workload a team has is the key to identifying and revisiting objectives. Commercial and non-commercial solutions can tap into ticketing systems (Jira, ServiceNow, GitHub) and CI systems (Travis CI, Jenkins…) to measure how many vulnerabilities are introduced per release.

Appsec phoenix, with other applications and cloud security observability solutions, is here to provide a better experience for the security and development team.

Francesco is an internationally renowned public speaker, with multiple interviews in high-profile publications (eg. Forbes), and an author of numerous books and articles, who utilises his platform to evangelize the importance of Cloud security and cutting-edge technologies on a global scale.
