Why Software Risk Management Does Not Quantify Probability of Occurrence

In the development of safety-critical systems such as medical devices, automobiles, and aircraft, risk management is critically important. However, hardware and software differ fundamentally in how their risk can be assessed. In particular, quantifying the “probability of occurrence” is extremely difficult for software, and in practice severity-driven approaches have become mainstream.

Basic Concepts of Risk Management

In general risk management, risk is commonly expressed using the following formula:

Risk = Severity of Harm × Probability of Occurrence

For hardware, failure rates caused by component aging and the operating environment can be predicted statistically. For an electronic component, for example, it is possible to derive a concrete figure such as “one failure per 100,000 hours of operation.”
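To make the hardware case concrete, here is a minimal sketch that turns a figure like the one above into a failure probability over a given operating time. It assumes a constant failure rate (the exponential reliability model), which is a common simplification and is not stated in the text; the names `FAILURE_RATE_PER_HOUR`, `reliability`, and `failure_probability` are made up for this example.

```python
import math

# Assumption: constant failure rate of one failure per 100,000 operating hours.
FAILURE_RATE_PER_HOUR = 1 / 100_000


def reliability(hours: float, failure_rate: float = FAILURE_RATE_PER_HOUR) -> float:
    """Probability of operating for `hours` without failure,
    under the constant-failure-rate (exponential) assumption."""
    return math.exp(-failure_rate * hours)


def failure_probability(hours: float, failure_rate: float = FAILURE_RATE_PER_HOUR) -> float:
    """Probability of at least one failure within `hours` of operation."""
    return 1.0 - reliability(hours, failure_rate)


if __name__ == "__main__":
    for hours in (1_000, 10_000, 100_000):
        print(f"{hours:>7} h: P(failure) = {failure_probability(hours):.4f}")
```

The point of the sketch is only that a number of this kind exists for hardware and can be fed into the risk formula; the sections that follow explain why no comparable input is available for software.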

The Distinctive Nature of Software

Software possesses characteristics that are fundamentally different from hardware.

1. No Aging Degradation

Software has no physical entity and therefore does not degrade over time. Once a program operates correctly, it will continue to operate identically indefinitely, as long as the execution environment remains unchanged.

2. Deterministic Behavior

Given the same input and the same internal state, software produces the same output; no random element intervenes in the way that wear does for hardware. If a bug exists, it manifests every time its specific triggering conditions are met.

3. The Complexity Problem

Modern software consists of an enormous number of lines of code and countless execution paths. Testing every combination is practically impossible. For example, even a program with only 10 independent two-way conditional branches has up to 2^10 = 1,024 execution paths.
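The growth in path count can be checked with a few lines of code. This is a minimal sketch that assumes every branch is an independent two-way decision executed in sequence; the function names are illustrative only.

```python
from itertools import product


def count_paths(num_branches: int) -> int:
    """Upper bound on execution paths through `num_branches`
    independent two-way branches executed one after another."""
    return 2 ** num_branches


def enumerate_paths(num_branches: int):
    """Yield every combination of branch outcomes (False/True per branch)."""
    return product((False, True), repeat=num_branches)


if __name__ == "__main__":
    for n in (10, 20, 30, 40):
        print(f"{n} branches: {count_paths(n):,} paths")
```

Enumerating all combinations is feasible for 10 branches but quickly stops being practical as the number of branches grows, which is the reason exhaustive path testing is not an option for real programs.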

The Approach in IEC 62304

The international standard for medical device software, IEC 62304, classifies software safety classes as follows:

| Safety Class | Classification Criteria |
|---|---|
| Class A | No injury or damage to health is possible |
| Class B | Non-serious injury is possible |
| Class C | Death or serious injury is possible |

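Note that this classification is based solely on “severity” and does not take “probability of occurrence” into account. When a hazard could arise from a software failure, the standard directs that the probability of that failure be assumed to be 1.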
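The severity-only nature of the classification can be expressed as a lookup with a single input. The sketch below is a simplified reading of the three criteria listed above, not the standard's full decision procedure; the `Severity` enum and the `safety_class` function are hypothetical names introduced for illustration.

```python
from enum import Enum


class Severity(Enum):
    """Worst credible outcome of a software failure (hypothetical labels)."""
    NO_INJURY = "no injury or damage to health"
    NON_SERIOUS_INJURY = "non-serious injury"
    SERIOUS_INJURY_OR_DEATH = "death or serious injury"


def safety_class(worst_case: Severity) -> str:
    """Map the worst-case severity to a safety class.

    There is deliberately no probability parameter: the class
    depends on severity alone.
    """
    mapping = {
        Severity.NO_INJURY: "A",
        Severity.NON_SERIOUS_INJURY: "B",
        Severity.SERIOUS_INJURY_OR_DEATH: "C",
    }
    return mapping[worst_case]


if __name__ == "__main__":
    for severity in Severity:
        print(f"{severity.value}: Class {safety_class(severity)}")
```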

Note: A draft of the second edition of IEC 62304 (expected publication around 2026) proposes replacing the current three-class system (A, B, C) with a two-level “Software Process Rigor Level” system (Level I and Level II) to align with IEC 81001-5-1. Level I essentially corresponds to the current Class A, while Level II consolidates Classes B and C. This change aims to simplify the classification process and harmonize with other health software standards, while maintaining the fundamental principle of severity-based classification rather than probability-based assessment.

Why Quantitative Assessment of Probability of Occurrence Is Difficult

1. Unpredictability

When a software bug manifests depends on how users operate the system and on the input data it receives. Statistical prediction is difficult because usage patterns vary widely, the space of input combinations is effectively unbounded, and interactions with external systems are complex.

2. The Binary World

A software bug manifests whenever its triggering conditions are met and never manifests when they are not. This binary behavior fits poorly with probabilistic approaches.
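A small experiment illustrates both points. The sketch below uses a made-up function, `scale_reading`, that fails deterministically for exactly one input value; the function, the value 512, and the two usage profiles are all assumptions for illustration. The defect never changes, yet the observed failure rate depends entirely on which inputs the users happen to supply.

```python
import random


def scale_reading(raw: int) -> float:
    """Hypothetical function with a deterministic defect:
    it divides by zero whenever raw == 512, and only then."""
    return 1000.0 / (raw - 512)


def observed_failure_rate(inputs) -> float:
    """Fraction of calls that fail for a given sequence of inputs."""
    failures = 0
    for value in inputs:
        try:
            scale_reading(value)
        except ZeroDivisionError:
            failures += 1
    return failures / len(inputs)


if __name__ == "__main__":
    rng = random.Random(0)
    # Usage profile A: inputs never reach the triggering value.
    profile_a = [rng.randint(0, 500) for _ in range(10_000)]
    # Usage profile B: inputs cluster around the triggering value.
    profile_b = [rng.randint(510, 514) for _ in range(10_000)]
    print("profile A failure rate:", observed_failure_rate(profile_a))
    print("profile B failure rate:", observed_failure_rate(profile_b))
```

Any "probability" measured this way is a property of the usage profile rather than of the software, so it cannot be carried over from one deployment to another.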

3. Handling Latent Risks

It is not possible to reliably estimate the probability that undiscovered bugs exist. Risk assessment must therefore assume worst-case scenarios.

Note: While probability models using Poisson distributions or exponential distributions exist theoretically, these are primarily used to observe the progress of bug discovery and correction, and are rarely used in operational risk assessment or safety class determination. IEC 61508-3 Annex C explicitly states that “applying probabilistic methods to software is extremely difficult.”
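As one illustration of the kind of model the note refers to, the sketch below uses the Goel-Okumoto reliability growth model, in which the expected cumulative number of defects found by test time t is a(1 - e^(-bt)). The choice of this particular model and the parameter values are assumptions made here for illustration; the note does not name a specific model.

```python
import math


def expected_defects_found(t: float, a: float, b: float) -> float:
    """Goel-Okumoto mean value function m(t) = a * (1 - exp(-b * t)).

    a: expected total number of defects (assumed; normally fitted to test data)
    b: detection rate per defect per unit of test time (assumed)
    """
    return a * (1.0 - math.exp(-b * t))


def expected_defects_remaining(t: float, a: float, b: float) -> float:
    """Expected number of defects not yet found at test time t."""
    return a - expected_defects_found(t, a, b)


if __name__ == "__main__":
    a, b = 120.0, 0.05  # illustrative values only
    for week in (0, 10, 20, 40, 80):
        found = expected_defects_found(week, a, b)
        remaining = expected_defects_remaining(week, a, b)
        print(f"week {week:>2}: found {found:6.1f}, remaining {remaining:6.1f}")
```

The output tracks how defect discovery is progressing during testing. It says nothing about which remaining defect could cause harm or how often its triggering conditions will arise in use, which is why such models are of little help for safety classification.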

In IEC 61508, the concept of “Systematic Capability” (SC) is used as a measure (expressed on a scale from SC 1 to SC 4) of confidence that the systematic safety integrity of an element meets the requirements of the specified Safety Integrity Level (SIL). This approach focuses on the rigor of development processes and control measures to avoid and control systematic failures, rather than attempting to quantify failure probability for software. The SC concept recognizes that systematic failures in software stem from human errors in design and specification, and are not amenable to the same probabilistic treatment as random hardware failures.

Practical Approaches

Instead of seeking probability of occurrence, software development manages risk through the following methods:

1. Thorough Testing

This includes unit testing, integration testing, system testing, boundary value analysis, equivalence class partitioning, stress testing, and abnormal-case (error-path) testing.
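Boundary value analysis, one of the techniques listed above, can be sketched as follows. The validator `is_valid_rate` and its limits of 1 to 999 mL/h are hypothetical and chosen only to give the test something to exercise.

```python
# Hypothetical limits for an infusion rate setting, in mL/h.
MIN_RATE = 1
MAX_RATE = 999


def is_valid_rate(rate_ml_per_h: int) -> bool:
    """Accept only rates inside the closed range [MIN_RATE, MAX_RATE]."""
    return MIN_RATE <= rate_ml_per_h <= MAX_RATE


def boundary_values(low: int, high: int) -> list:
    """Classic boundary-value inputs for a closed integer range:
    just below, on, and just above each limit."""
    return [low - 1, low, low + 1, high - 1, high, high + 1]


if __name__ == "__main__":
    for value in boundary_values(MIN_RATE, MAX_RATE):
        print(f"rate {value:>4}: valid = {is_valid_rate(value)}")
```

Testing at and around each limit targets off-by-one errors, which are the defects most likely to hide at the edges of an input range.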

2. Ensuring Safety Through Design

This includes fail-safe design, redundancy, and thorough error handling.
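The three measures can be combined in a small decision function. This is a minimal sketch under stated assumptions: two redundant sensor readings, a hypothetical valid range of 0 to 100, and a hypothetical agreement tolerance of 2.0. Whenever anything looks wrong, the function falls back to the safe state instead of guessing.

```python
from enum import Enum
from typing import Optional


class Command(Enum):
    RUN = "run"
    SAFE_STOP = "safe_stop"


# Hypothetical limits, for illustration only.
VALID_RANGE = (0.0, 100.0)
MAX_DISAGREEMENT = 2.0


def decide(sensor_a: Optional[float], sensor_b: Optional[float]) -> Command:
    """Return RUN only when both redundant readings are present,
    plausible, and in agreement; otherwise fail safe."""
    low, high = VALID_RANGE
    for reading in (sensor_a, sensor_b):
        # Error handling: a missing reading is treated as a fault.
        if reading is None:
            return Command.SAFE_STOP
        # Written as a negated range check so that NaN also fails it.
        if not (low <= reading <= high):
            return Command.SAFE_STOP
    # Redundancy: the two channels must agree within the tolerance.
    if abs(sensor_a - sensor_b) > MAX_DISAGREEMENT:
        return Command.SAFE_STOP
    return Command.RUN


if __name__ == "__main__":
    print(decide(50.0, 50.5))
    print(decide(50.0, None))
    print(decide(50.0, 60.0))
```

The design choice is that `RUN` is the only outcome that has to be earned; every unexpected condition, including ones the developer did not anticipate such as a NaN reading, ends in `SAFE_STOP`.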

3. Process Management

This includes code review, static analysis, and rigorous change management.

Implications for Medical Device Development

In medical device software development, this approach is extremely important. Risk assessment that does not rely on probability of occurrence is the more conservative, safety-oriented choice.

For example, omitting testing because “it is a rarely used function” is not permissible. As long as that function has the potential to affect patient life, the highest level of quality assurance is required regardless of frequency of use.

Summary

The difficulty of quantifying probability of occurrence in software risk management stems from the inherent characteristics of software. Because software is deterministic yet highly complex, its failures do not lend themselves to statistical prediction.

IEC 62304 assigns safety classes based solely on severity, which is a practical response to this reality. Developers need to ensure safety not by relying on the uncertain element of probability of occurrence, but by assuming worst-case scenarios and implementing preventive measures against them.

Software quality is assured not by probability, but by the rigor of the development process and the robustness of the design. Recognizing this is the first step in developing safe and reliable software.
