Why Software Risk Management Does Not Quantify Probability of Occurrence
In the development of systems on which human life depends, such as medical devices, automobiles, and aircraft, risk management is critically important. However, risk assessment methods differ fundamentally between hardware and software. In particular, quantitative assessment of “probability of occurrence” is extremely difficult for software, and in practice approaches that emphasize severity have become mainstream.
Basic Concepts of Risk Management
In general risk management, risk is assessed using the following formula:
Risk = Severity of Harm × Probability of Occurrence
For hardware, failure rates due to aging of components and usage environment can be statistically predicted. For electronic components, for example, it is possible to calculate specific values such as “one failure per 100,000 hours of operation.”
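As a rough illustration of how such a figure feeds into a probability, the sketch below converts the failure rate quoted above into the chance of at least one failure over a given operating time. It assumes a constant failure rate (an exponential lifetime model) and a one-year continuous operating period; both are assumptions made for the example, not values from any standard.

```python
import math


def failure_probability(failure_rate_per_hour: float, hours: float) -> float:
    """Probability of at least one failure within `hours`.

    Assumes a constant failure rate (exponential lifetime model).
    """
    return 1.0 - math.exp(-failure_rate_per_hour * hours)


# One failure per 100,000 operating hours (the figure quoted in the text).
FAILURE_RATE = 1 / 100_000
# Assumed operating period: one year of continuous operation.
OPERATING_HOURS = 24 * 365

p_year = failure_probability(FAILURE_RATE, OPERATING_HOURS)
print(f"P(at least one failure in one year) = {p_year:.4f}")
```

A number of this kind can be multiplied by a severity rating to rank hardware risks. The rest of this article explains why no comparable number is available for software.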
The Distinctive Nature of Software
Software possesses characteristics that are fundamentally different from hardware.
1. No Aging Degradation
Software has no physical entity and therefore does not degrade over time. Once a program operates correctly, it will continue to operate identically indefinitely, as long as the execution environment remains unchanged.
2. Deterministic Behavior
For the same input, the same internal state, and the same execution environment, software always produces the same output. No random element intervenes in the logic itself. If a bug exists, it manifests every time its triggering conditions are met.
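A minimal sketch of this behavior follows. The function `average_flow_rate` and its missing zero check are hypothetical, invented only to show that a defect is either always triggered or never triggered for a given input.

```python
def average_flow_rate(total_volume_ml: float, elapsed_minutes: int) -> float:
    """Hypothetical helper with a latent defect: no guard for elapsed_minutes == 0."""
    return total_volume_ml / elapsed_minutes


def fails(total_volume_ml: float, elapsed_minutes: int) -> bool:
    """Return True if the call raises, False if it completes normally."""
    try:
        average_flow_rate(total_volume_ml, elapsed_minutes)
        return False
    except ZeroDivisionError:
        return True


# Repeat each call many times: the outcome never varies for a fixed input.
normal_input_results = [fails(120.0, 30) for _ in range(1000)]
trigger_input_results = [fails(120.0, 0) for _ in range(1000)]

print("failures with normal input: ", sum(normal_input_results))
print("failures with trigger input:", sum(trigger_input_results))
```

Repeating the call does not reveal a failure rate between 0 and 1, which is what a random hardware fault would show.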
3. The Complexity Problem
Modern software consists of an enormous amount of code and a vast number of execution paths, so testing every combination is practically impossible. For example, even a program with only 10 independent conditional branches has 2^10 = 1,024 execution paths, and each additional branch doubles that number.
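The growth can be checked directly. The sketch below enumerates every true/false combination for 10 independent branches and then uses the closed form 2^n for larger counts. The larger branch counts are arbitrary values chosen for illustration.

```python
from itertools import product


def count_paths_by_enumeration(num_branches: int) -> int:
    """Count paths by listing every true/false outcome of independent branches."""
    return sum(1 for _ in product((False, True), repeat=num_branches))


def count_paths(num_branches: int) -> int:
    """Closed form: each independent branch doubles the number of paths."""
    return 2 ** num_branches


print("10 branches (enumerated):", count_paths_by_enumeration(10))
for n in (20, 30, 40):
    print(f"{n} branches (2^n):", count_paths(n))
```

Real programs are worse than this model suggests, because loops and data-dependent branches multiply the paths further.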
The Approach in IEC 62304
The international standard for medical device software, IEC 62304, classifies software safety classes as follows:
| Safety Class | Classification Criteria |
| --- | --- |
| Class A | No injury or damage to health is possible |
| Class B | Non-serious injury is possible |
| Class C | Death or serious injury is possible |
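It should be noted that this classification is based solely on “severity” and does not use an estimated “probability of occurrence.” Instead, the standard directs that the probability of a software failure be assumed to be 1 (100%) when the class is assigned.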
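The shape of this rule can be sketched as a function whose only input is the worst-case severity. This is a simplification that mirrors the table above and nothing more. The names `Severity` and `safety_class` are hypothetical, and the standard's full decision procedure involves considerations that are not modeled here.

```python
from enum import Enum


class Severity(Enum):
    """Worst-case harm the software could contribute to (hypothetical names)."""
    NONE = "no injury or damage to health"
    NON_SERIOUS = "non-serious injury"
    SERIOUS_OR_DEATH = "death or serious injury"


def safety_class(worst_case: Severity) -> str:
    """Map worst-case severity to a class letter, as in the table above.

    There is deliberately no probability parameter.
    """
    mapping = {
        Severity.NONE: "A",
        Severity.NON_SERIOUS: "B",
        Severity.SERIOUS_OR_DEATH: "C",
    }
    return mapping[worst_case]


for severity in Severity:
    print(f"{severity.value}: Class {safety_class(severity)}")
```

The absence of a probability argument is the point of the sketch: how often a function is used does not appear anywhere in the signature.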
Note: A draft of the second edition of IEC 62304 (expected publication around 2026) proposes replacing the current three-class system (A, B, C) with a two-level “Software Process Rigor Level” system (Level I and Level II) to align with IEC 81001-5-1. Level I essentially corresponds to the current Class A, while Level II consolidates Classes B and C. This change aims to simplify the classification process and harmonize with other health software standards, while maintaining the fundamental principle of severity-based classification rather than probability-based assessment.
Why Quantitative Assessment of Probability of Occurrence is Difficult
1. Unpredictability
When software bugs will manifest depends entirely on user operation patterns and input data. Statistical prediction is difficult for the following reasons:
Usage patterns vary widely between users, the space of input data combinations is effectively unbounded, and interactions with external systems are complex.
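The dependence on usage patterns can be made concrete with a small sketch. The defect, the two sites, and their input lists are all invented for illustration. The software is identical in both cases, and only the inputs differ.

```python
def triggers_defect(elapsed_minutes: int) -> bool:
    """Hypothetical defect: the software fails whenever the input is 0."""
    return elapsed_minutes == 0


def observed_failure_rate(inputs: list[int]) -> float:
    """Fraction of inputs in a usage log that trigger the defect."""
    failures = sum(1 for value in inputs if triggers_defect(value))
    return failures / len(inputs)


# Two invented usage logs for the same software.
site_a_inputs = [30, 45, 60, 15, 30, 90, 45, 60, 30, 15]  # never enters 0
site_b_inputs = [30, 0, 60, 0, 30, 0, 45, 0, 30, 15]      # enters 0 four times

print("site A failure rate:", observed_failure_rate(site_a_inputs))
print("site B failure rate:", observed_failure_rate(site_b_inputs))
```

The observed rate is a property of each site's workflow, not of the software, so a rate measured at one site says little about another.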
2. The Binary World
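A given software bug either manifests every time its triggering conditions are met or never manifests when they are not. What varies is how often those conditions arise, and that depends on how the software is used rather than on the software itself. This binary nature fits poorly with probabilistic approaches designed for random failures.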
3. Handling Latent Risks
It is not possible to reliably estimate how many undiscovered bugs remain or how likely they are to be triggered. Therefore, risk assessment must assume worst-case scenarios.
Note: While probability models using Poisson distributions or exponential distributions exist theoretically, these are primarily used to observe the progress of bug discovery and correction, and are rarely used in operational risk assessment or safety class determination. IEC 61508-3 Annex C explicitly states that “applying probabilistic methods to software is extremely difficult.”
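As one example of the kind of model the note refers to, the sketch below uses the Goel-Okumoto model, a well-known Poisson-process reliability growth model that the original text does not name. Its mean value function m(t) = a(1 - e^(-bt)) gives the expected cumulative number of defects found by test time t. The parameter values are invented for illustration.

```python
import math


def expected_defects_found(a: float, b: float, t: float) -> float:
    """Goel-Okumoto mean value function m(t) = a * (1 - exp(-b * t)).

    a: expected total number of defects, b: detection rate, t: test time.
    """
    return a * (1.0 - math.exp(-b * t))


# Assumed parameters, chosen only for illustration.
TOTAL_DEFECTS = 100.0
DETECTION_RATE_PER_DAY = 0.05

for day in (10, 30, 60, 90):
    found = expected_defects_found(TOTAL_DEFECTS, DETECTION_RATE_PER_DAY, day)
    print(f"day {day:3d}: expected defects found = {found:5.1f}")
```

A curve like this helps judge whether testing is converging. It says nothing about whether one of the remaining defects can cause serious harm, which is why it is not used for safety classification.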
In IEC 61508, the concept of “Systematic Capability” (SC) is used as a measure (expressed on a scale from SC 1 to SC 4) of confidence that the systematic safety integrity of an element meets the requirements of the specified Safety Integrity Level (SIL). This approach focuses on the rigor of development processes and control measures to avoid and control systematic failures, rather than attempting to quantify failure probability for software. The SC concept recognizes that systematic failures in software stem from human errors in design and specification, and are not amenable to the same probabilistic treatment as random hardware failures.
Practical Approaches
Instead of seeking probability of occurrence, software development manages risk through the following methods:
1. Thorough Testing
This includes unit testing, integration testing, system testing, boundary value analysis, equivalence class partitioning, stress testing, and negative (error-path) testing.
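Boundary value analysis, for instance, can be sketched as follows. The requirement (valid doses from 0.5 mg to 50.0 mg inclusive) and the function `is_valid_dose` are hypothetical. The test values sit just below, on, and just above each limit.

```python
def is_valid_dose(dose_mg: float) -> bool:
    """Hypothetical requirement: valid doses are 0.5 mg to 50.0 mg inclusive."""
    return 0.5 <= dose_mg <= 50.0


# (input, expected result) pairs around the lower and upper boundaries.
BOUNDARY_CASES = [
    (0.4, False),   # just below the lower limit
    (0.5, True),    # on the lower limit
    (0.6, True),    # just above the lower limit
    (49.9, True),   # just below the upper limit
    (50.0, True),   # on the upper limit
    (50.1, False),  # just above the upper limit
]


def run_boundary_tests() -> list[tuple[float, bool, bool]]:
    """Return (input, expected, actual) for every boundary case."""
    return [(dose, expected, is_valid_dose(dose)) for dose, expected in BOUNDARY_CASES]


for dose, expected, actual in run_boundary_tests():
    status = "PASS" if expected == actual else "FAIL"
    print(f"{status}: is_valid_dose({dose}) -> {actual}")
```

Six cases stand in for the whole input range, on the assumption that off-by-one and comparison-operator mistakes cluster at the limits.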
2. Ensuring Safety Through Design
This includes fail-safe design, ensuring redundancy, and thorough error handling.
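A minimal sketch of fail-safe design combined with defensive error handling is shown below. It assumes, purely for illustration, that the safe state of a hypothetical pump is "stopped" (rate 0) and that 500 mL/h is its upper limit. In a real device the safe state has to come from the hazard analysis.

```python
import math

# Assumed values for a hypothetical pump; not taken from any real device.
SAFE_RATE_ML_PER_H = 0.0    # assumed safe state: pump stopped
MAX_RATE_ML_PER_H = 500.0   # assumed upper limit


def commanded_rate(requested) -> float:
    """Return the requested rate if it is plausible, otherwise the safe state."""
    try:
        rate = float(requested)
    except (TypeError, ValueError):
        # Unparseable or missing input: fall back to the safe state.
        return SAFE_RATE_ML_PER_H
    if math.isnan(rate) or rate < 0.0 or rate > MAX_RATE_ML_PER_H:
        # Out-of-range or undefined value: fall back to the safe state.
        return SAFE_RATE_ML_PER_H
    return rate


for value in (120, -1, float("nan"), "abc", None, 1e9):
    print(f"requested={value!r} -> commanded={commanded_rate(value)}")
```

The function never raises and never passes an implausible value downstream, so the consequence of an unexpected input is bounded however rarely that input occurs.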
3. Process Management
This includes code review, use of static analysis tools, and rigorous change management.
Implications for Medical Device Development
In the development of medical device software, this approach is extremely important. Risk assessment that does not rely on probability of occurrence is a more conservative and safety-oriented approach.
For example, omitting testing because “it is a rarely used function” is not permissible. As long as that function has the potential to affect patient life, the highest level of quality assurance is required regardless of frequency of use.
Summary
The difficulty of quantitative assessment of probability of occurrence in software risk management stems from the inherent characteristics of software. The nature of software being deterministic yet complex makes statistical prediction difficult.
The fact that IEC 62304 performs class classification based solely on severity is a practical approach that takes this reality into account. Developers need to ensure safety not by relying on the uncertain element of probability of occurrence, but by assuming worst-case scenarios and implementing preventive measures against them.
Software quality is assured not by probability, but by the rigor of the development process and the robustness of the design. Recognizing this is the first step in developing safe and reliable software.