Is one better than the other? Both Fault-Tree Analysis (FTA) and Failure Mode and Effects Analysis (FMEA) are popular tools for root cause analysis, fault-finding and risk analysis. But the similarities end there. Each analysis has its own approach to failures, which translates into very different results. Let’s dive into the differences between FTA and FMEA.
What are the differences between FTA and FMEA?
Before we begin, we recommend reading our articles about FMEA and FTA. If you’re already familiar with both methods, then we can go right to it! This table sums up the main differences between FMEA and FTA:
| FTA | FMEA |
|---|---|
| deductive, top-down approach | inductive, bottom-up approach |
| showcases the correlation between multiple failures | catalogs failures for each component and does not analyse the system as a whole |
| considers external events | doesn’t consider external events |
| does not consider partial failures | does not consider unexpected failures |
| easy to update with proper software | often consists of Excel sheets that require constant updating |
The first difference between FMEA and FTA is their approach to failure. FTA is a systematic, deductive, top-down method. The starting point is the failure itself, and from there the analysis works backwards through its possible causes, like an investigation or a diagnosis.
For example, if the fire alarm fails, an FTA will use that as a top event. The diagram then explores potential causes (whether the fire detection system failed or the heat sensors failed) until the root cause is found.
On the other hand, FMEA is an inductive method, which reasons from particular cases (the ways each component can fail) towards a general picture of the effects. Instead of starting at the top, it takes a “bottom-up” approach and focuses on one component at a time. Based on the asset’s history, it describes all possible failure modes and the effect of each failure mode.
This is why you might get the impression you’re creating a “failure catalog” more than anything else. In the example above, an FMEA would list all of the possible failures in the fire alarm and write down “fire propagation” as an effect for all of them.
Determine failure modes in an FMEA vs. FTA
These two very different approaches have their consequences when it comes to determining failure modes. However, they have one thing in common: both approaches require someone with deep knowledge about the asset and its reliability.
An FMEA hinges on predicting all possible failure modes for each component. You can break them down into modes that cause a complete breakdown, partial failures, and almost unnoticeable damage. Unlike FTA, it will not take into account conditioning events, nor does it establish the relationship between multiple failures.
On the other hand, with an FTA we might neglect partial failures, because each hypothesis equates to either “0” or “1”; there’s no scale. If there’s a failure but the asset retains some degree of functionality, this is not depicted in the data. However, in project and design analysis, it’s very efficient in identifying potential failures and components that need safety improvements.
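The “0” or “1” logic described above can be sketched as a small Boolean fault tree. This is only an illustration: the tree structure and the event names (`smoke_sensor`, `heat_sensor`, `siren`) are assumptions made up for the fire alarm example, not a real system model.

```python
from typing import Dict

# Each basic event is either failed (True) or working (False); there is no scale.
State = Dict[str, bool]


def or_gate(*inputs: bool) -> bool:
    """The output event occurs if any input event occurs."""
    return any(inputs)


def and_gate(*inputs: bool) -> bool:
    """The output event occurs only if every input event occurs."""
    return all(inputs)


def fire_alarm_fails(state: State) -> bool:
    """Top event: the fire alarm fails (hypothetical tree structure)."""
    # Assumption: the sensors are redundant, so detection fails only if both fail.
    detection_fails = and_gate(state["smoke_sensor"], state["heat_sensor"])
    # Assumption: the alarm fails if detection fails or the siren fails.
    return or_gate(detection_fails, state["siren"])


# A heat sensor that still works at reduced sensitivity has to be entered as
# either True or False; the partial failure itself cannot be represented.
print(fire_alarm_fails({"smoke_sensor": False, "heat_sensor": True, "siren": False}))  # False
```

Here a single failed sensor does not trigger the top event because of the assumed redundancy, which is the kind of relationship between failures that a per-component catalog does not capture.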
Why is it so hard to catalog failure modes in an FMEA?
Besides the process being time-consuming, it’s easy to overlook unexpected failure modes or failures that stem from multiple failures within the system. An example is what happened at the nuclear power plant in Fukushima, Japan. Initially, the reactors survived the impact of the earthquake thanks to backup power generators. But when the tsunami that followed flooded the generator room, the worst happened.
The possibility of these multiple failures occurring at the same time had not been hypothesized, and possibly neither had the flooding of the generator room. The Fukushima sea wall was only 5.7 meters tall, but it’s estimated the waves reached a height of 14 to 15 meters that day.
Quantitative vs. Qualitative Analysis
FTA is one of the most well-known ways to perform a Probabilistic Risk Assessment (PRA), which makes it a quantitative tool. That’s why it is almost mandatory in high-hazard industries, such as the nuclear, petrochemical and pharmaceutical industries. But that wasn’t always the case. Again, we’ll look to history to give you a real-life example that illustrates the differences between FTA and FMEA.
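To make “quantitative” concrete, here is a minimal sketch of how a fault tree can turn basic-event probabilities into a top-event probability. It assumes all basic events are statistically independent, which real PRAs often cannot assume, and the tree structure and numbers are invented for illustration.

```python
from math import prod


def p_and(*probs: float) -> float:
    """Probability that all independent input events occur."""
    return prod(probs)


def p_or(*probs: float) -> float:
    """Probability that at least one independent input event occurs."""
    return 1.0 - prod(1.0 - p for p in probs)


def top_event_probability(p_smoke: float, p_heat: float, p_siren: float) -> float:
    """Hypothetical tree: alarm fails if (both sensors fail) OR the siren fails."""
    p_detection = p_and(p_smoke, p_heat)
    return p_or(p_detection, p_siren)


# Made-up failure probabilities per demand, for illustration only.
print(top_event_probability(p_smoke=0.01, p_heat=0.02, p_siren=0.001))  # roughly 0.0012
```

The result is only as good as the input probabilities, which is why this kind of calculation depends on reliable statistical data for each basic event.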
Why is FMEA a subjective analysis?
When NASA applied a PRA in the 60s to calculate the likelihood of travelling to the Moon and returning “safely”, the result was only 5%. The American space agency found these numbers too grim, especially if they were leaked to the public, and opted for a qualitative method. You guessed it: FMEA.
Even though FMEA also calculates a Risk Priority Number (RPN), built from severity, occurrence and detection rankings that typically each run from 1 to 10, it’s subjective. In other words, a “4” is not necessarily two times more dangerous than a “2”. Yet NASA used FMEA to assign a criticality level to each component from the 60s to the early 80s. If a component’s failure put the crew’s lives at risk, it had “criticality 1”; if it put the mission at risk, it had “criticality 2” and all the other failures had “criticality 3”.
A failure like the one that caused the disintegration of the Challenger space shuttle in 1986 – the unusually low temperatures compromised the o-rings – had a criticality of 1, but the occurrence ranking was only “2”, meaning it would happen “2 times out of 100,000”. No calculations had been made to correlate temperature with performance, nor had it been tested by the manufacturer.
Ultimately, the occurrence ranking was misleading and subjective. NASA underestimated risk and, as a consequence, there was no backup for the o-rings. After the disaster, the aerospace industry started to use a combination of FMEA and PRA. The nuclear industry had already been an early adopter of FTAs after the Three Mile Island accident.
However, in Challenger’s case, it should be noted that the only available statistical data would have come from trial flights and stress tests, which might have been limited. For assets without reliable statistical data, no tool can truly be quantitative!
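The Risk Priority Number mentioned above is commonly computed as severity × occurrence × detection, with each ranking on a 1 to 10 scale. The sketch below assumes that convention (scales vary between organisations), and the component and failure modes are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FailureMode:
    component: str
    mode: str
    severity: int    # 1 (negligible) to 10 (catastrophic), assigned by judgment
    occurrence: int  # 1 (remote) to 10 (almost certain), assigned by judgment
    detection: int   # 1 (almost always detected) to 10 (undetectable)

    def __post_init__(self) -> None:
        # Reject rankings outside the assumed 1-10 ordinal scale.
        for name in ("severity", "occurrence", "detection"):
            value = getattr(self, name)
            if not 1 <= value <= 10:
                raise ValueError(f"{name} must be between 1 and 10, got {value}")

    @property
    def rpn(self) -> int:
        """Risk Priority Number: product of the three ordinal rankings."""
        return self.severity * self.occurrence * self.detection


# Two hypothetical failure modes with very different severities.
seal_leak = FailureMode("pump", "seal leak", severity=10, occurrence=2, detection=3)
bearing_wear = FailureMode("pump", "bearing wear", severity=3, occurrence=5, detection=4)

print(seal_leak.rpn, bearing_wear.rpn)  # 60 60
```

Both modes end up with the same RPN even though one is rated catastrophic, which illustrates why a product of subjective rankings should not be read as a measured probability.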
Chains of events in FTA and FMEA
This is another big difference between FTA and FMEA. FTA takes into account several events, including external events (e.g., earthquakes) and conditioning events (e.g., temperatures). FMEA creates an “isolated system” and doesn’t consider how external agents might compromise its integrity.
Day to day use and updates
This is not a difference between the analyses themselves, but in the way we use them day to day. FMEA is usually a detailed table that is hard to keep updated, while FTA is usually done with software (which also integrates statistical data). Hence, most maintenance managers find it easier to keep FTAs up to date.
When to use FTA vs. when to use FMEA
Given the substantial differences we explored throughout the text, it’s clear that FTA and FMEA each have their own advantages and limitations. Nobody is perfect!
When to use a Fault-Tree Analysis (FTA):
- there are very few “top events”;
- you need to assess safety within a system;
- you need to perform a PRA;
- you’re analysing a complex system in which several components interact with each other;
- there is a lot of room for “human error” or software issues that trigger safety modes.
When to use a Failure Modes and Effects Analysis (FMEA):
- you cannot pinpoint top events to perform an FTA;
- the goal is to identify all possible failure modes, even if they don’t have a dangerous effect (e.g., you need to prepare a product manual);
- the asset’s performance is predictable and doesn’t require a lot of human intervention, which makes it more likely that you’ll be able to catalog all failure modes.
Can FTA and FMEA work together?
FTA and FMEA are not mutually exclusive. Risk analysis can be both quantitative and qualitative, so the two methods can work side by side. For most maintenance and facility managers, a hybrid tool would be just right.
Other versions of FMEA, such as FMECA, PFMECA or a Variation Modes and Effect Analysis (VMEA), are intermediate solutions that allow managers to blend qualitative with probabilistic analysis.
These tools are increasingly used in Industry 4.0 to decide which assets and systems have top priority in predictive maintenance – which is (still) expensive and almost always reserved for critical assets.