TPK5160: Risk Analysis
Introduction
Three Main Questions
What can go wrong?
What is the likelihood of that happening?
What are the consequences?
Risk Analysis, Assessment, and Management
Risk Analysis
Systematic use of available information to identify hazards and to estimate the risk to individuals, property, and the environment.
Hazard identification
Frequency analysis
Consequence analysis
Qualitative or Quantitative
Risk Evaluation
Process in which judgments are made on the tolerability of the risk on the basis of a risk analysis and taking into account factors such as socioeconomic and environmental aspects.
Risk Assessment
Overall process of risk analysis and risk evaluation.
Five steps
- Identify the hazards
- Decide who might be harmed and how
- Evaluate the risks and decide on precautions
- Record your findings and implement them
- Review your assessment and update if necessary
Risk Management
A continuous management process with the objective to identify, analyze, and assess potential hazards in a system or related to an activity, and to identify and introduce risk control measures to eliminate or reduce potential harms to people, the environment, or other assets.
The Study Object
System
Composite entity, at any level of complexity, of personnel, procedures, materials, tools, equipment, facilities, and software. The elements of this composite entity are used together in the intended operational or support environment to perform a given task to achieve a specific objective.
- Hardware (H). Any physical and nonhuman element of the study object, such as workspace, buildings, machines, equipment, and signs.
- Software (S). Nonmaterial elements of the study object: for example, computer software, work procedures, norms, checklists, and practices.
- Liveware (L). Personnel, such as operators, maintenance staff, service personnel, visitors and third parties. Liveware also includes such elements as teamwork and leadership.
- Management/organization (M). Management, policies, strategies, training, and so on.
- Environment (E). The internal and external environment in which the study object operates.
The SHEL Model
- Liveware-Hardware (L-H). This interface between the human and the hardware is often called the man-machine interface.
- Liveware-Software (L-S). This interface describes the relationship between the personnel and the computer software, checklists, and so on. It depends on the presentation format, clarity, and the use of symbols.
- Liveware-Liveware (L-L). This interface covers the relationship between the individual and other persons in the workplace. It depends on leadership, cooperation, teamwork, and personal interactions.
- Liveware-Environment (L-E). This interface concerns the relationship between the individual and her/his internal and external environments. The internal environment involves factors such as temperature, light, noise, vibration, and air quality. The external environment includes such things as weather conditions and external systems that may influence the working conditions.
Complexity and Coupling
Complexity
Continuum from linear via non-linear to chaotic. A measure of the predictability of the system.
Coupling
A measure of the strength of the interconnectedness between system components.
Accident Categories
Jens Rasmussen's Categories
- Category 1: High frequency, Low consequence
- Category 2: Medium frequency, Medium consequence
- Category 3: Low frequency, High consequence
James Reason's Categories
- Individual Accidents: accidents that are caused and suffered by single individuals.
- Organizational Accidents: Organizational accidents are generally characterized by multiple causes and numerous interactions between different system elements.
Risk in our Modern Society
Safety Legislation
Risk and Decision-making
Stakeholder
Person or organization that can affect, be affected by, or perceive themselves to be affected by a decision or activity.
Risk-Based Decision-making
A process that uses quantification of risks, costs, and benefits to evaluate and compare decision options competing for limited resources.
Risk-Informed Decision-making
An approach to decision-making representing a philosophy whereby risk insights are considered together with other factors to establish requirements that better focus the attention on design and operational issues commensurate with their importance to health and safety.
The Words of Risk Analysis
Events and Scenarios
Event
Incident or situation which occurs in a particular place during a particular interval of time.
Hazardous Event
The first event in a sequence of events that, if not controlled, will lead to undesired consequences (harm) to some assets.
Initiating Event
An identified event that upsets the normal operations of the system and may require a response to avoid undesirable outcomes.
Accident Scenario
A specific sequence of events from an initiating event to an undesired consequence (or harm).
Reference accident scenario
An accident scenario that is considered to be representative of a set of accident scenarios that are identified in a risk analysis, where the scenarios in the set are considered to be likely to occur.
Worst-case accident scenario
The accident scenario with the highest consequence that is physically possible regardless of likelihood.
Worst credible accident scenario
The highest-consequence accident scenario identified that is considered plausible or reasonably believable.
Probability and Frequency
Probability
Classical Approach
Frequentist Approach
Bayesian Approach
Subjective probability: A numerical value in the interval [0, 1] representing an individual's degree of belief about whether or not an event will occur.
Prior probability: An individual's belief in the occurrence of an event E prior to any additional collection of evidence related to E.
Posterior probability: An individual's belief in the occurrence of the event E based on her prior belief and some additional evidence D1.
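The step from prior to posterior belief can be sketched with Bayes' theorem. This is a minimal illustration under stated assumptions: the function name `posterior` and the example numbers are hypothetical, and the evidence D is summarized by two assumed likelihoods, P(D | E) and P(D | not E).

```python
def posterior(prior: float, p_d_given_e: float, p_d_given_not_e: float) -> float:
    """Return P(E | D) from the prior P(E) and the two likelihoods of D."""
    # Total probability of observing the evidence D.
    evidence = p_d_given_e * prior + p_d_given_not_e * (1.0 - prior)
    if evidence == 0.0:
        raise ValueError("the evidence D has zero probability under this prior")
    return p_d_given_e * prior / evidence


# Hypothetical numbers: a 1 % prior belief in E, and evidence that is
# much more likely when E is true (0.9) than when it is not (0.05).
updated_belief = posterior(prior=0.01, p_d_given_e=0.9, p_d_given_not_e=0.05)
print(f"posterior belief in E: {updated_belief:.3f}")
```

With these assumed numbers the belief in E rises from the prior, because D is far more likely when E is true than when it is not.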
Assets and Consequences
Asset
Something we value and want to preserve.
Harm
Physical injury or damage to health, property, or the environment (assets).
Severity
Seriousness of the consequences of an event expressed either as a financial value or as a category.
Risk
Risk
The combined answer to three questions: (1) What can go wrong? (2) What is the likelihood of that happening? and (3) What are the consequences?
Safety performance
An account of all accidents that occurred in a specified (past) time period, together with frequencies and consequences observed for each type of accident.
Risk influencing factor (RIF)
A relatively stable condition that influences the risk.
Desired risk
Risk that is sought, not avoided, because of the thrill and intrinsic enjoyment it brings.
Risk Homeostasis
When the technical safety of cars increases, the drivers will tend to drive less cautiously.
Residual risk
The risk that remains after engineering, administrative, and work practice controls have been implemented.
Risk perception
Subjective judgment about the characteristics and severity of risk.
Objective risk
An accurate and reasonably complete characterization of a risk can be made by stating (only) objective facts about the physical world.
Subjective risk
An accurate and reasonably complete characterization of a risk does not refer to any objective facts about the physical world.
Dual view on risk
An accurate and reasonably complete characterization of a risk must refer to objective facts about the physical world and to (value) statements that do not refer to objective facts about the physical world.
Barriers
Barrier
Physical or engineered system or human action (based on specific procedures or administrative controls) that is implemented to prevent, control, or impede energy released from reaching the assets and causing harm.
Mitigation
Action to reduce the severity, seriousness, or painfulness of something. Implementation of reactive barriers.
Accidents
Accident
Accident (1): An unwanted transfer of energy, because of lack of barriers and/or controls, producing injury to persons, property, or process, preceded by sequences of planning and operational errors, which failed to adjust to changes in physical or human factors and produced unsafe conditions and/or unsafe acts, arising out of the risk in an activity, and interrupting or degrading the activity.
Accident (2): A sudden, unwanted, and unplanned event or event sequence that leads to harm to people, the environment, or other assets.
Incident
An unplanned and unforeseen event that may or may not result in harm to one or more assets.
Near accident
An unplanned and unforeseen event that could reasonably have been expected to result in harm to one or more assets, but actually did not.
Uncertainty
Uncertainty: A measure of the confidence we have in the results of risk assessment.
Vulnerability and Resilience
Vulnerability: The inability of an object to resist the impacts of an unwanted event and to restore it to its original state or function following the event.
Resilience: The ability to accommodate change without catastrophic failure, or the capacity to absorb shocks gracefully.
Safety and Security (not curriculum)
Hazards and Threats
Hazard
A source of danger that may cause harm to an asset.
Triggering event: An event or condition that is required for a hazard to give rise to an accident.
Safety issue: The manifestation of a hazard or combination of several hazards in a specific context.
Classification of Hazards
Threats
Energy Sources
Technical Failures
Failure: The termination of a required function.
Failure mode: The effect by which a failure is observed on a failed item.
Failure mechanism: A physical, chemical, or other process that leads to failure.
How to Measure and Evaluate Risk
Risk Indicators
A parameter that is estimated based on risk analysis models and by using generic and other available data. A risk indicator presents our knowledge and belief about a specific aspect of the risk of a future activity or a future system operation.
Safety performance indicator: A parameter that is estimated based on experience data from a specific installation or an activity. A safety performance indicator therefore tells us what has happened.
Risk to People
Individual risk: The frequency with which an individual may be expected to sustain a given level of harm from the realization of specified hazards.
Societal risk: The relationship between frequency and the number of people suffering from a specified level of harm in a given population from the realization of specific hazards.
Individual Risk per Annum (IRPA)
IRPA = (observed no. of fatalities due to hazard a) / (total no. of person-years exposed)
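A minimal sketch of the IRPA ratio above. The function name `irpa` and the example counts are hypothetical and only illustrate the arithmetic.

```python
def irpa(fatalities: int, person_years: float) -> float:
    """Observed fatalities from one hazard divided by person-years exposed."""
    if person_years <= 0:
        raise ValueError("person_years must be positive")
    if fatalities < 0:
        raise ValueError("fatalities cannot be negative")
    return fatalities / person_years


# Hypothetical data: 12 fatalities observed over 400 000 person-years.
print(f"IRPA = {irpa(12, 400_000):.1e} per year")
```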
Potential Equivalent Fatality (PEF)
A convention for aggregating harm to people by regarding major and minor injuries as being equivalent to a certain fraction of a fatality.
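The aggregation can be sketched as a weighted sum. The weights below (0.1 for a major injury, 0.01 for a minor injury) are illustrative assumptions, not values taken from these notes, and the function name is hypothetical.

```python
def potential_equivalent_fatalities(
    fatalities: int,
    major_injuries: int,
    minor_injuries: int,
    major_weight: float = 0.1,   # assumed fraction of a fatality per major injury
    minor_weight: float = 0.01,  # assumed fraction of a fatality per minor injury
) -> float:
    """Aggregate harm to people into an equivalent number of fatalities."""
    return fatalities + major_weight * major_injuries + minor_weight * minor_injuries


# Hypothetical year: 1 fatality, 5 major injuries, 40 minor injuries.
print(potential_equivalent_fatalities(1, 5, 40))
```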
Localized Individual Risk per Annum (LIRA, LSIR, IRI)
The probability that an average unprotected person, permanently present at a specified location, is killed in a period of one year due to an accident at a hazardous installation.
Risk Contour Plots
LIRA per point in an area
Reduction In Life Expectancy (RLE)
Lost-Time Injuries
LTIF = (no. of lost-time injuries (LTIs) / no. of hours worked) * 2 * 10^5
Lost Workdays Frequency. LWF = (lost days due to LTIs / no. of hours worked) * 2 * 10^5
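The two rates above can be computed as follows. The function names and the example figures are hypothetical. The factor 2 * 10^5 is commonly read as the hours worked by 100 full-time employees in one year (about 2000 hours each).

```python
HOURS_BASE = 2e5  # normalization factor from the formulas above


def ltif(lost_time_injuries: int, hours_worked: float) -> float:
    """Lost-time injury frequency per 2 * 10^5 hours worked."""
    if hours_worked <= 0:
        raise ValueError("hours_worked must be positive")
    return lost_time_injuries / hours_worked * HOURS_BASE


def lwf(lost_days: float, hours_worked: float) -> float:
    """Lost workdays frequency per 2 * 10^5 hours worked."""
    if hours_worked <= 0:
        raise ValueError("hours_worked must be positive")
    return lost_days / hours_worked * HOURS_BASE


# Hypothetical site: 6 LTIs and 90 lost days over 1.2 million hours worked.
print(f"LTIF = {ltif(6, 1_200_000):.2f}, LWF = {lwf(90, 1_200_000):.2f}")
```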
Relation Between the Frequencies of Fatalities and Injuries
Heinrich's triangle
Potential Loss of Life (PLL)
The PLL is the expected number of fatalities within a specified population (or within a specified area A) per annum. Same as annual fatality rate (AFR).
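As a sketch, the expected number of fatalities per year in a population is the sum of the individual annual fatality probabilities (by linearity of expectation). The function name and the example population are assumptions made for illustration.

```python
import math
from typing import Iterable


def potential_loss_of_life(individual_risks: Iterable[float]) -> float:
    """Sum of individual annual fatality probabilities in the population."""
    return math.fsum(individual_risks)


# Hypothetical population: 200 people at 1e-4 per year and 50 at 5e-4 per year.
population = [1e-4] * 200 + [5e-4] * 50
print(f"PLL = {potential_loss_of_life(population):.3f} fatalities per year")
```

If all n individuals share the same risk, the sum reduces to n times that individual risk.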
Fatal Accident Rate (FAR)
Deaths per Million
FN Curves
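A minimal sketch of how the points of an FN curve can be built, assuming the usual reading that F(N) is the frequency of accidents with N or more fatalities. The scenario list and the function name are hypothetical.

```python
from typing import List, Tuple


def fn_curve(scenarios: List[Tuple[float, int]]) -> List[Tuple[int, float]]:
    """Return (N, F(N)) points from (frequency per year, fatalities) pairs.

    F(N) is the summed frequency of all scenarios with at least N fatalities.
    """
    fatality_levels = sorted({n for _, n in scenarios})
    return [
        (n, sum(freq for freq, fatalities in scenarios if fatalities >= n))
        for n in fatality_levels
    ]


# Hypothetical accident scenarios: (frequency per year, number of fatalities).
for n, f in fn_curve([(1e-2, 1), (1e-3, 5), (1e-4, 50)]):
    print(f"N >= {n:>2}: F = {f:.4e} per year")
```

The resulting F values never increase with N, which is why FN curves are drawn as descending step functions, usually on log-log axes.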
FN Criterion Lines
Risk Matrices
Essentially a discretized FN plot: frequency and consequence are each divided into a few categories.
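A risk matrix lookup can be sketched as below. Everything specific here is an assumption made for illustration: the 5 x 5 size, the use of the sum of the two classes as a risk index, and the thresholds separating "low", "medium", and "high".

```python
def risk_level(freq_class: int, cons_class: int) -> str:
    """Classify one cell of an assumed 5 x 5 risk matrix."""
    for value in (freq_class, cons_class):
        if not 1 <= value <= 5:
            raise ValueError("classes must be integers from 1 to 5")
    index = freq_class + cons_class  # assumed risk index: sum of both classes
    if index <= 4:   # assumed threshold
        return "low"
    if index <= 7:   # assumed threshold
        return "medium"
    return "high"


# Hypothetical hazardous event: frequency class 3, consequence class 4.
print(risk_level(3, 4))
```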
Risk Acceptance Criteria
Criteria used as a basis for decisions about acceptable risk.
Acceptable risk: Risk that is accepted in a given context based on the current values of society and in the enterprise.
Acceptable and Tolerable Risk
Acceptable means almost always acceptable; tolerable is something you accept in order to gain some value.
Value of Life
Value of a Statistical Life (VSL): 1-15 million US dollars.
Societal willingness to pay
Value of averting (or preventing) a fatality (VAF)
Implied cost of averting a fatality (ICAF).
Net cost of averting a fatality (NCAF).
Approaches to Risk Acceptance
The ALARP Principle
As low as reasonably practicable.
Cost-Benefit Assessment: A disproportion factor d may be calculated as:
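A minimal sketch, assuming the disproportion factor is the ratio of the cost of a risk-reducing measure to the monetized benefit of the risk reduction it achieves. That ratio, the function name, and the example amounts are assumptions; the notes do not spell them out.

```python
def disproportion_factor(cost: float, benefit: float) -> float:
    """Assumed definition: d = cost of the measure / monetized benefit."""
    if benefit <= 0:
        raise ValueError("benefit must be positive")
    return cost / benefit


# Hypothetical measure: costs 2.0 million and averts harm valued at 0.5 million.
d = disproportion_factor(2_000_000, 500_000)
print(f"d = {d:.1f}")
```

Under ALARP, a measure would typically be rejected only if d exceeds some limit that marks the cost as grossly disproportionate; that limit is a policy choice and is not given here.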
The ALARA Principle
As low as reasonably achievable. No region of general acceptance.
The GAMAB Principle
Globally at least as good. New systems are in total not to be less safe than older ones. But tradeoffs are allowed.
The MEM Principle
Minimum endogenous mortality. New technologies are not allowed to cause a significant increase in mortality:
Societal Risk Criteria
The Precautionary Principle
Where there are threats of serious or irreversible damage, lack of full scientific certainty shall not be used as a reason for postponing cost-effective measures to prevent environmental degradation.
Risk Management
Introduction
Risk Management
- Risk analysis
- Risk evaluation
- Risk control and risk reduction
Bow-Tie Analysis
Risk Analysis
- What can go wrong? (i.e., hazard identification)
- What is the likelihood of that happening? (i.e., frequency analysis)
- What are the consequences? (i.e., consequence analysis)
Types of Risk Analysis
- Qualitative Risk Analysis
- Semiquantitative Risk Analysis
- Quantitative Risk Analysis
Risk Acceptance Criteria
The Steps In a Risk Analysis
- Plan and prepare the risk analysis.
- Define and delimit the system and the scope of the analysis.
- Identify hazards and potential hazardous events.
- Determine causes and frequency of each hazardous event.
- Identify accident scenarios (i.e., event sequences) that may be initiated by each hazardous event.
- Select relevant and typical accident scenarios.
- Determine the consequences of each accident scenario.
- Determine the frequency of each accident scenario.
- Assess the uncertainty.
- Sensitivity analysis: Analysis that examines how the results of a calculation or model vary as individual assumptions are changed
- Establish and describe the risk picture.
- Report the analysis.
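The sensitivity analysis step in the list above can be sketched as a one-at-a-time variation of each input. The toy model, the parameter names, and the 10 % change are hypothetical; a real study would use its own risk model and ranges.

```python
from typing import Callable, Dict, Tuple


def one_at_a_time(
    model: Callable[..., float],
    base: Dict[str, float],
    rel_change: float = 0.1,
) -> Dict[str, Tuple[float, float]]:
    """Change each input by -/+ rel_change alone; return the output shifts."""
    base_output = model(**base)
    shifts = {}
    for name, value in base.items():
        low = model(**{**base, name: value * (1 - rel_change)})
        high = model(**{**base, name: value * (1 + rel_change)})
        shifts[name] = (low - base_output, high - base_output)
    return shifts


def scenario_frequency(init_freq: float, p_barrier_fail: float) -> float:
    """Toy model: initiating frequency times barrier failure probability."""
    return init_freq * p_barrier_fail


base_case = {"init_freq": 0.1, "p_barrier_fail": 0.01}
for name, (down, up) in one_at_a_time(scenario_frequency, base_case).items():
    print(f"{name}: {down:+.2e} / {up:+.2e}")
```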
Risk Evaluation
- Evaluate the risk against risk acceptance criteria.
- Suggest and evaluate potential risk-reducing measures.
Risk Control and Risk Reduction
- Preventive measures intended to reduce the frequency of one or more hazardous events. Such measures are also called proactive or frequency-reducing measures.
- Mitigating measures intended to avoid or reduce the consequences of a potentially hazardous event. Such measures are also called reactive or consequence-reducing measures.
- Eliminate, substitute, and/or minimize.
- Prevent.
- Detect and warn.
- Mitigate.
Control of Human Error
- Error reduction
- Error capturing
- Error tolerance
Competence Requirements
To which degree the risk analysis will meet the objectives depends on the competence of the study team.
Quality Requirements
- Consistency in internal logic
- Empirical support
- Predictability of outcomes under similar conditions
Accident Models
Introduction
Accident Classification
Major accident (in aviation): An accident in which any of three conditions is met:
- The airplane was destroyed; or
- There were multiple fatalities; or
- There was one fatality and the airplane was damaged substantially.
Accident Investigation
There are two main objectives of an accident investigation: (i) to assign blame for the accident, and (ii) to understand why the accident happened so that similar future accidents may be prevented.
Accident Causation
Acts of God
Funny name for events outside human control.
Accident Proneness
Today's researchers tend to view accident proneness as associated with the propensity of individuals to take risks or to take chances.
Classification of Accident Causes
- Direct causes are the causes that lead immediately to accident effects. Direct causes are also called immediate causes or proximate causes, as they usually result from other, lower-level causes.
- Root causes are the most basic causes of an accident. The process used to identify and evaluate root causes is called root cause analysis.
- Risk-influencing factors (RIFs) are background factors that influence the causes and/or the development of an accident.
Accident Models
Objectives of Accident Models
- Accident investigation
- Prediction and prevention
- Quantification
Classification of Accident Models
- Energy and barrier models
- Event sequence models
- Event causation and sequencing models
- Epidemiological accident models
- Systemic accident models
- Accident reconstruction methods
Energy and Barrier Models
Haddon
Sequential Accident Models
Heinrich's Domino Model
- Social environment and ancestry
- Fault of the person
- Unsafe act or condition
- Accident
- Injury
Loss Causation Model
- Lack of management control
- Basic causes (personal factors or job factors)
- Immediate causes (substandard acts and conditions)
- Incident (contact with energy, substance, and/or people)
- Loss (people, property, environment, and material)
Rasmussen and Svedung's Model
a. Root cause.
b. Causal sequences.
c. Hazardous event.
d. Event sequences.
e. Persons, assets.
STEP
Sequentially timed events plotting.
- The start state describes the normal state of the system.
- The initial event is the event that disturbed the system and initiated the accident process. The initial event is an unplanned change done by an actor.
- The actors that changed the system or intervened to control the system. An actor does not need to be a person. Technical equipment and substances can also be actors.
- The elementary events. An event is an action committed by a single actor.
- The events are assumed to flow logically in the accident process.
- A timeline is used as the horizontal axis in the STEP diagram for recording when the events started and ended.
- The end event of the STEP diagram is the point where an asset is harmed and the point that defines the end of the diagram.
Epidemiological Accident Models
Reason's Swiss Cheese Model
- Decision-makers
- Line management
- Preconditions
- Unsafe acts
- Last barriers
Tripod
Tripod-Delta
Safety management system and a proactive method for accident prevention.
1. Basic risk factors (BRFs): "... those features of an operation that are wrong and have been so for a long time, but remain hidden because their influences do not surface without a local trigger".
2. Hazards and unsafe acts
3. Accidents, incidents, and losses
Tripod-Beta
Method for accident investigation and analysis. As such, it is a reactive approach that is used mainly after an accident has taken place in order to prevent recurrence.
Includes HEMP and Tripod-Delta
Event Causation and Sequencing Models
MTO-Analysis
- Man
- Technology
- Organization
MORT
The management oversight and risk tree.
- S-branch. This branch contains factors representing specific oversights and omissions associated with the accident.
- M-branch. This branch presents general characteristics of the management system that contributed to the accident.
- R-branch. This branch contains assumed risks: risk aspects that are known but, for some reason, are not controlled.
Systemic Accident Models
Rasmussen's Sociotechnical Framework
- Structural Hierarchy
- System Dynamics
AcciMap
Normal Accidents
A multiple-failure accident in which unforeseen interactions make the failures very difficult or impossible (with our current understanding of the system) to diagnose. A normal result of interactive complexity and tight coupling.
Interactive complexity: Failures of two or more components interact in an unexpected way, due to a multitude of connections and interrelationships.
Tight coupling: Processes that are part of a system happen quickly and cannot be turned off or isolated, due to direct and immediate connections and interactions between components.
High-Reliability Organizations (HRO)
Organizations built to proactively prevent accidents. Organizational redundancy.
STAMP
Systems-theoretic accident model and processes.
- Development of the hierarchical control structure, which includes identification of the interactions between the system components and identification of the safety requirements and constraints.
- Classification and analysis of flawed control (constraint failures), which includes the classification of causal factors followed by the reasons for flawed control and dysfunctional interactions.
Data for Risk Analysis
Types of Data
- Technical data: System description.
- Operational data: Procedures description.
- Accident data: Previous accidents.
- Hazard data: (i) checklists of relevant hazards and (ii) information (e.g., fact sheets) about dangerous substances and dosages that will harm human beings and the environment.
- Reliability data: Lifetime of components.
- Maintenance data: Time since service.
- Meteorological data.
- Data on natural events.
- Exposure data.
- Environmental data
- External safety functions
- Stakeholder data
Component Reliability Data
Human Error Data
Risk Assessment Process
Plan and Prepare
Objectives
Study Team
Project Planning
System Description
- What inputs to the system/element are required?
- What functions are performed by the system/element?
- What are the outputs from the system/element?
Familiarization
Document Control System
Laws and Regulation
Input Data
Selection of Method
Reporting
Updating
Hazard Identification
The process of identifying and describing all the significant hazards, threats, and hazardous events associated with a system.
Objectives of Hazard Identification
(a) Identify all the hazards and hazardous events that are relevant during all intended use and foreseeable misuse of the system, and during all interactions with the system.
(b) Describe the characteristics, and the form and quantity, of each hazard.
(c) Describe when and where in the system the hazard is present.
(d) Identify possible triggering events related to each hazard.
(e) Identify under what conditions the hazard could lead to a hazardous event and which pathways the hazard may follow.
(f) Identify potential hazardous events that could be caused by the hazard (or in combination with other hazards).
(g) Make operators and system owners aware of hazards and potential hazardous events.
Hazard Identification Methods
- Hazard Log
- Checklist and Brainstorming
- Preliminary Hazard Analysis (PHA)
- Change Analysis
- Failure Modes, Effects, and Criticality Analysis (FMECA)
- Hazard and Operability (HAZOP) Study
- Structured What-If Technique (SWIFT)
- Master Logic Diagram (MLD)
Hazard Log
A log of hazards of all kinds that threaten a system's success in achieving its safety objectives. It is a dynamic and living document, which is populated through the organization's risk assessment process. The log provides a structure for collating information about risk that can be used in risk analyses and in risk management of the system.
Can contain:
- Hazards
- Hazardous Events
- Incidents (scenarios)
- Threats and vulnerabilities
- Journal
Checklist Methods
Objectives and Applications
(a) Identify all the hazards that are relevant during all intended use and foreseeable misuse of the system, and during all interactions with the system.
(b) Identify required controls and safeguards.
(c) Check that available controls and safeguards conform to the requirements specified.
Analysis Procedure
Preliminary Hazard Analysis (PHA)
Change Analysis
FMECA
HAZOP
SWIFT (Not in curriculum 2018)
MLD (Not in curriculum 2018)
Causal and Frequency Analysis
Objectives of the Causal and Frequency Analysis
Methods for Causal and Frequency Analysis
- Cause and effect diagrams
- Fault tree analysis
- Bayesian networks
- Markov methods
- Petri nets
Fault Tree Analysis
Cut set: A cut set in a fault tree is a set of basic events whose (simultaneous) occurrence ensures that the TOP event occurs.
Minimal cut set: A cut set is said to be minimal if the set cannot be reduced without losing its status as a cut set.
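The definition above suggests a direct way to reduce a list of cut sets to the minimal ones: drop every cut set that strictly contains another cut set. This is a small sketch with hypothetical basic event names, not an algorithm for deriving cut sets from a fault tree.

```python
from typing import FrozenSet, Iterable, Set


def minimal_cut_sets(cut_sets: Iterable[Iterable[str]]) -> Set[FrozenSet[str]]:
    """Keep only the cut sets that have no other cut set as a proper subset."""
    unique = {frozenset(cut_set) for cut_set in cut_sets}
    return {
        candidate
        for candidate in unique
        # "other < candidate" is True when other is a proper subset.
        if not any(other < candidate for other in unique)
    }


# Hypothetical cut sets for some TOP event, with basic events A to D.
found = minimal_cut_sets([{"A"}, {"A", "B"}, {"B", "C"}, {"B", "C", "D"}])
print(sorted(sorted(cut_set) for cut_set in found))
```

Here {A, B} is dropped because {A} alone is already a cut set, and {B, C, D} is dropped because of {B, C}.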
Development of Accident Scenarios
Methods for Development of Accident Scenarios
- Event tree analysis
- Event sequence diagrams
- Cause-consequence analysis
- Consequence analysis methods
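For event tree analysis, the first method listed above, the frequency of one accident scenario is commonly taken as the initiating event frequency multiplied by the conditional probabilities of the branches along its path. The sketch below assumes exactly that; the function name and the numbers are hypothetical.

```python
import math
from typing import Sequence


def scenario_frequency(
    initiating_frequency: float, branch_probabilities: Sequence[float]
) -> float:
    """Frequency of one path through an event tree (per unit of time)."""
    for p in branch_probabilities:
        if not 0.0 <= p <= 1.0:
            raise ValueError("branch probabilities must lie in [0, 1]")
    return initiating_frequency * math.prod(branch_probabilities)


# Hypothetical path: initiating event 0.5 per year, then two barriers fail
# with conditional probabilities 0.1 and 0.05.
print(f"{scenario_frequency(0.5, [0.1, 0.05]):.4f} per year")
```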
Human Reliability Analysis
Human error: An out-of-tolerance action, or deviation from the norm, where the limits of acceptable performance are defined by the system. These situations can arise from problems in sequencing, timing, knowledge, interfaces, procedures, and other sources.
Human reliability: The probability that a person: (i) correctly performs some system-required activity in a required time period (if time is a limiting factor) and (ii) performs no extraneous activity that can degrade the system. Human unreliability is the opposite of this definition.
Human error probability (HEP): The probability that an error will occur when a given task is performed.
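When task data are available, a simple point estimate of the HEP is the number of observed errors divided by the number of opportunities for error. The sketch below assumes that estimator; the function name and the counts are hypothetical.

```python
def human_error_probability(errors: int, opportunities: int) -> float:
    """Point estimate: observed errors divided by opportunities for error."""
    if opportunities <= 0:
        raise ValueError("opportunities must be positive")
    if not 0 <= errors <= opportunities:
        raise ValueError("errors must be between 0 and opportunities")
    return errors / opportunities


# Hypothetical task record: 3 errors in 1500 performances of the task.
print(f"HEP = {human_error_probability(3, 1500):.4f}")
```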
Human error mode: The effect by which a human error can be observed.
Human Error Identification
HRA Methods
Job Safety Analysis
A job safety analysis (JSA) is a simple risk assessment method that is applied to review job procedures and practices in order to identify potential hazards and determine risk-reducing measures. Each job is broken down into specific tasks, for which observation, experience, and checklists are used to identify hazards and associated controls and safeguards.
Barriers and Barrier Analysis
Barriers and Barrier Classification
Proactive barrier: A barrier that is installed to prevent or reduce the probability of a hazardous event. A proactive barrier is also called a frequency-reducing barrier.
Reactive barrier: A barrier that is installed to reduce the consequences of a hazardous event. A reactive barrier is also called a consequence-reducing barrier or mitigating barrier.
Active barrier: A barrier that is dependent on the actions of an operator, a control system, and/or some energy sources to perform its function.
Passive barrier: A barrier that is integrated into the design of the workplace and does not require any human actions, energy sources, or information sources to perform its function.