Reliability-based approach reduces flare design relief load

Dec. 15, 1997
Refiners and petrochemical producers can use the reliability-based approach for sizing or retrofitting flare systems to significantly reduce capital investments. The approach systematically estimates the maximum credible relief load that could occur within a selected level of confidence. Traditional methods for sizing flare systems often result in relief loads that are unreasonable and uneconomical.
J. Patrick Williams
Arinc Annapolis, Md.

Michael D. Donovan
Scientech-NUS Gaithersburg, Md.

Although the American Petroleum Institute (API) recommended practices (RP) provide guidelines to calculate relief loads, they allow engineering judgment to determine the maximum credible flow for simultaneous relief from multiple sources.

The reliability-based approach uses plant-specific data to determine the frequency of relieving events and the failure probability of the emergency operations. The maximum credible relief load is calculated from the worst relieving scenario that could be expected at least once in a given period of time, e.g., once in 10,000 years.

Conservative relief loads

Process contingencies, such as fire, power failure, and closed outlet, can cause an overpressure condition in piping and equipment. Relief devices, usually one or more pressure relief valves (PRVs), protect the system from overpressuring beyond its safe limit.

The American Society of Mechanical Engineers (ASME) Boiler and Pressure Vessel Code and API RP 520 and 521 state that, unless strong evidence dictates otherwise, the worst-case conditions must be adopted (i.e., no credit for any actions that normally decrease relief load during a process upset) to size PRVs and their associated equipment and piping.

When this conservative approach is extrapolated to calculate plant-wide relieving systems (e.g., flare systems), the total relief load is usually unreasonably large. In such extrapolations, the total relief load is obtained by adding the relief loads calculated for PRV sizing purposes.

Flare systems are not easily retrofitted to meet the loads calculated by the traditional method. The capital investment and lost production to retrofit existing flare systems to meet large loads can add up to several million dollars. In one case, a West Coast refinery was unable to increase its flare size because a permit was denied by local authorities.

Except for small process units, adding the relief loads calculated for PRV sizing purposes for a contingency affecting multiple protected systems is overly conservative and impractical.¹ Recognizing this problem, API RP 521² suggests considering the capability and timing of human intervention to reduce the relief load, using good engineering judgment to establish the most appropriate flow basis.

How do we quantify good engineering judgment? The reliability-based approach complements API 521 by providing a quantitative methodology for determining the maximum credible relief load of a flare system.

Relief-reducing actions

A relief-reducing action (RRA) is any sequence of human or mechanical actions that has potential for decreasing or preventing the relief load caused by a given contingency. An RRA can range from a manual emergency operation (e.g., closing a valve or starting up nearby equipment) to actuation of highly redundant automatic emergency shutdown (ESD) devices (e.g., modern fired heater ESD systems).

To illustrate how RRAs affect the relief load of a process system, consider a power failure case affecting the distillation column system in Fig. 1. The feed, reflux, and products are pumped by motor-driven equipment, and the condenser is water-cooled.

The pumps and the condensing duty fail on a loss of power. Assume that the reboiler duty is not affected by a power failure if the steam is not shut off. The RRA is shutting off the reboiler steam to limit heat input to the column. The heat balance for normal operating conditions and at relief conditions, with and without credit for the RRA, is summarized in Table 1.

The worst-case scenario for this system occurs when, at relief conditions, the reboiler continues to deliver its full duty, causing a heat imbalance of 25 million BTU/hr. The unbalanced heat is absorbed by the liquid inventory in the column. The fraction of the liquid inventory that vaporizes escapes through the PRV.

Assuming a latent heat of vaporization of 100 BTU/lb, a relief load of 250,000 lb/hr is the worst-case scenario. This relief load would be used for PRV sizing purposes.

If the RRA reduces the heat input to 20% of the reboiler normal duty, the relief load is 200,000 lb/hr less than the worst case.
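
The arithmetic of this example can be checked with a short calculation. The sketch below is illustrative only: it assumes the 25 million BTU/hr imbalance equals the full reboiler duty (Table 1 is not reproduced here), and the function and variable names are hypothetical.

```python
def relief_load_lb_per_hr(unbalanced_heat_btu_per_hr, latent_heat_btu_per_lb):
    """Vapor generated by the unbalanced heat, all assumed to leave through the PRV."""
    return unbalanced_heat_btu_per_hr / latent_heat_btu_per_lb


# Assumption: the worst-case heat imbalance equals the full reboiler duty.
FULL_REBOILER_DUTY = 25e6  # BTU/hr
LATENT_HEAT = 100.0        # BTU/lb, as stated in the example

worst_case = relief_load_lb_per_hr(FULL_REBOILER_DUTY, LATENT_HEAT)
# RRA credited: steam shutoff leaves 20% of the normal reboiler duty.
with_rra = relief_load_lb_per_hr(0.20 * FULL_REBOILER_DUTY, LATENT_HEAT)

print(f"Worst case: {worst_case:,.0f} lb/hr")
print(f"With RRA:   {with_rra:,.0f} lb/hr")
print(f"Reduction:  {worst_case - with_rra:,.0f} lb/hr")
```

Under these assumptions the credited case relieves one fifth of the worst-case load, which is the 200,000 lb/hr reduction quoted above.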

Potential consequences

Taking credit for certain RRAs can result in smaller relief loads and flare systems than those sized using worst-case conditions. The trade-off is the potential for overloading the flare system.

A flare system can be limited by radiation, back pressure, or velocity, depending on which relief scenario was analyzed. The potential consequences of overloading a flare system include:

  • High thermal radiation and noise levels
  • Flame blowout
  • Flare header or subheader failure
  • PRV bellows or flange failure
  • Pressure vessel failure.

In general, plant systems are designed with safety margins that preclude failure at conditions moderately in excess of design. The effectiveness of these safety margins in mitigating the consequences of excessive relief load is difficult to quantify because it depends on the extent of flare overloading and the mechanical condition of the flare system. To provide safety assurance, the potential consequences are considered actual consequences.

Potential hazards associated with the petrochemical industry are explosions, fires, and releases of toxic substances. Although the probability of an accident is site-dependent, frequencies for accidents with serious consequences have been established over a large population of facilities. Event frequencies for potential plant hazards are shown in Table 2. Overloading the flare system may result in similar consequences.

Reliability-based approach

The reliability-based approach consists of the following four steps:
  • Selection of an acceptable frequency of occurrence for events that may overload the flare system
  • Determination of the frequency of occurrence for scenarios that could result in large relief loads and identification of measures to reduce the frequency of these scenarios
  • Identification of the applicable RRAs and determination of their probability of failure and impact on the total relief load
  • Identification of the combinations of RRA failures that are credible within the acceptable frequency for exceeding the flare system design capacity and determination of the maximum credible relief load.

Acceptable frequency

The selected acceptable frequency for exceeding flare system design capacity is a principal factor in making subsequent decisions. The frequency is site-specific and depends on many factors, which include location, corporate risk-management policies, and general perceived risk.

Designing for the worst-case scenario does not eliminate the possibility of a catastrophic overpressure event. Any flare system may fail to protect the plant if the flow between a process system and the flare tip is obstructed (e.g., improper operation of a block valve on a flare line) or if the PRV fails to open.

Table 2 shows that 1 × 10⁻⁴ events per year, i.e., once in 10,000 years, is a reasonable value for an acceptable frequency for overloading a flare system. This rate equates to a maximum probability of 0.5% that an event which would overload the flare system would occur once in a 50-year period. The rate is consistent with the threshold for accepting a serious plant accident proposed by Harvey⁷ and the 1 × 10⁻⁵ events per year frequency suggested by Bowen⁸ as a limit for practical engineering.
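
The conversion from an annual frequency to a probability over a plant lifetime can be sketched as follows. The code assumes overload events arrive at a constant rate (a Poisson process), which is a common reliability simplification and not something the data in Table 2 establish; the function name is hypothetical.

```python
import math


def prob_at_least_one_event(rate_per_year, years):
    """Probability of one or more events in the period, constant-rate assumption."""
    return 1.0 - math.exp(-rate_per_year * years)


ACCEPTABLE_FREQ = 1e-4  # events per year
PLANT_LIFE = 50         # years

exact = prob_at_least_one_event(ACCEPTABLE_FREQ, PLANT_LIFE)
linear = ACCEPTABLE_FREQ * PLANT_LIFE  # rare-event upper bound

print(f"Constant-rate model: {exact:.4%}")
print(f"Rate x time bound:   {linear:.4%}")
```

For rare events the simple product of rate and time is a slight overestimate, which is why 0.5% can be read as a maximum.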

The expected frequency of a relieving event in which all RRAs fail to operate as intended (worst-case scenario) is on the order of 10⁻¹³ events per year for a refinery provided with reasonably well maintained ESD systems.⁹

Frequency of relief scenarios

A relief scenario is the result of a single event (i.e., an initiating event) that triggers a sequence of events involving interrelated systems. Typical plant hazards to be considered as initiating events are power failure (total or local), cooling water failure, plant steam failure, and fire.

Additional initiating events with potential to become a flare header sizing case may be more plant-specific or process-dependent, e.g., emergency depressurizing of hydroprocessing units.

The frequency of initiating events can be obtained from historical plant data or estimated from a suitable analysis such as fault tree analysis (FTA) and applicable industry data.

A fault tree is a graphical representation of the failure modes of an undesired event (the top event) in terms of failures of its primary cause events (basic events).

The events in the fault tree structure are related through logic gates (AND and OR) representing the conditions that need to be met for the component above the gate to fail. The frequency or probability of failure of the component above each gate in the fault tree is computed from the frequency, probability, or unavailability of each of the components directly below the gate. This process is repeated until the top event is reached.
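
A minimal sketch of the gate arithmetic follows. It assumes the basic events are independent, and the tree structure and all numbers are hypothetical; they are not taken from any figure in this article.

```python
from math import prod


def and_gate(probabilities):
    """Output fails only if every input fails (independent events)."""
    return prod(probabilities)


def or_gate(probabilities):
    """Output fails if any input fails (independent events)."""
    return 1.0 - prod(1.0 - p for p in probabilities)


# Hypothetical basic-event probabilities of failure on demand.
backup_generator_fails = or_gate([0.02, 0.01])  # fails to start OR fails to run
transfer_switch_fails = 0.005
onsite_backup_fails = or_gate([backup_generator_fails, transfer_switch_fails])

# Hypothetical initiating frequency combined with the gate result:
# total power is lost when the grid fails AND the on-site backup fails.
grid_outages_per_year = 0.5
total_power_failures_per_year = grid_outages_per_year * onsite_backup_fails

print(f"On-site backup unavailability: {onsite_backup_fails:.4f}")
print(f"Total power failures per year: {total_power_failures_per_year:.4f}")
```

Working upward gate by gate in this way gives the frequency of the top event.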

A simplified fault tree representing a power failure is illustrated in Fig. 2. In this case, the power failure may lead to loss of steam, depending on the level of integration of the steam and power systems.

The event sequences that can follow an initiating event and lead to relief scenarios of interest are defined by the dependencies of other systems (e.g., power failure may lead to a power and cooling water failure scenario). The event sequences can be established using event tree analysis (ETA), which complements the FTA technique.

The primary event of an event tree is the initiating event. The events that follow the initiating event in each relief scenario are the successes and failures of other systems. For example, the event tree for a plant-wide, power-failure initiating event, shown in Fig. 3, models the four potential scenarios that can result from a total loss of plant power. The "up" branch of the tree represents the success of a system and the "down" branch represents the failure of the system.

Certain systems, however, may totally depend on other systems (i.e., failure of one system results in the failure of another system). Total dependence between systems is represented by a lack of branching under a system in the event tree. For example, in Fig. 3, there is no branching under cooling water or instrument air systems when the steam system has failed because, without electric power or steam, both cooling water and air systems have failed.

FTA and ETA techniques help identify ways to decrease the frequency of occurrence of the scenarios that could potentially overload the flare system. If the expected frequency of a scenario is reduced to, or below, the accepted frequency for exceeding the flare system design capacity, the scenario can be excluded from further consideration.
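
The event tree bookkeeping and the screening step can be sketched as below. The tree is a simplified, hypothetical one (it does not reproduce Fig. 3), and every frequency and branch probability is an assumed value chosen for illustration.

```python
# Hypothetical inputs, not plant data.
INITIATING_FREQ = 0.02            # total power failures per year
P_STEAM_FAILS = 0.3               # steam lost, given power failure
P_CW_FAILS_GIVEN_STEAM_OK = 0.4   # cooling water lost although steam survives
ACCEPTABLE_FREQ = 1e-4            # events per year

# Each path frequency is the initiating frequency times its branch probabilities.
scenarios = {
    "power only": INITIATING_FREQ
    * (1 - P_STEAM_FAILS) * (1 - P_CW_FAILS_GIVEN_STEAM_OK),
    "power + cooling water": INITIATING_FREQ
    * (1 - P_STEAM_FAILS) * P_CW_FAILS_GIVEN_STEAM_OK,
    # Total dependence: no branching below a failed steam system,
    # because cooling water and instrument air are lost with it.
    "power + steam + cooling water + air": INITIATING_FREQ * P_STEAM_FAILS,
}

for name, freq in scenarios.items():
    status = "exclude" if freq <= ACCEPTABLE_FREQ else "analyze further"
    print(f"{name:40s} {freq:.2e} per year -> {status}")
```

The path frequencies always sum to the initiating frequency, which is a useful check that no branch has been dropped.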

RRA failure probabilities

The failure probability of RRAs depends on human and mechanical factors. These factors are highly dependent on the operating and maintenance philosophy at the facility. The benefit of a well-designed automatic ESD system can be totally offset by poor operating or maintenance practices. The failure probability of RRAs can be obtained from plant test results or estimated using FTA and ETA techniques supported with representative industry data.

Those RRAs which have the greatest impact on relieving bottlenecks in the flare system are the most suitable candidates for reliability improvement. Typical approaches for improving RRA reliability include:

  • Increasing the frequency and scope of testing of emergency devices (e.g., heater fuel lockouts, standby pump autostarts) and ensuring the prompt correction of any detected malfunctions
  • Substituting unreliable mechanical components with more-reliable components or human actions with automatic emergency devices
  • Adding redundancy to the design to minimize common cause failures and undesired spurious trips (e.g., inclusion of the control valve in the ESD loop for a reboiler or heater, two-out-of-three voting logic for sensing devices)
  • Implementing and enforcing strict administrative procedures to minimize the probability of an emergency device being permanently disabled or inadvertently left in its bypass position.

RRAs involving poorly maintained devices (not routinely tested) may have a very low probability of operating when required. An important factor in determining the probability of failure on demand of emergency devices (not normally in operation) is the frequency of routine testing.

For most emergency devices, a malfunction is not revealed until the device is needed to operate. If the device is tested and a malfunction is detected, it should be addressed promptly. If it is not tested, the probability of failure on demand will increase over time.
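
The effect of the test interval can be illustrated with the standard constant-failure-rate model for a standby device whose faults stay hidden until the next test. This model, the failure rate, and the function names are assumptions made for illustration; they are not the data used in the studies cited here.

```python
import math


def pfd_at_time(failure_rate_per_year, years_since_test):
    """Probability the device has failed undetected since its last test."""
    return 1.0 - math.exp(-failure_rate_per_year * years_since_test)


def average_pfd(failure_rate_per_year, test_interval_years):
    """Time-averaged probability of failure on demand over one test interval."""
    x = failure_rate_per_year * test_interval_years
    return 1.0 - (1.0 - math.exp(-x)) / x


FAILURE_RATE = 0.1  # hidden failures per year, hypothetical

for interval in (0.25, 1.0, 5.0):
    print(f"Tested every {interval:4.2f} yr: "
          f"average PFD = {average_pfd(FAILURE_RATE, interval):.4f}")
```

For short intervals the average is close to half the failure rate times the test interval, so halving the interval roughly halves the probability of failure on demand.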

The process in Fig. 1 can be used to illustrate the effect of design improvements and testing practices on the failure probability of RRAs. Data for this example were obtained from previous studies involving reliability analysis for similar systems. Suppose that the RRA for Fig. 1 is an emergency operation that requires the steam block valve to be manually closed at the reboiler within 5 min after power failure. A failure probability of 9 × 10⁻¹ appears reasonable for this example. This is an unreliable type of RRA since it depends on the intervention of humans under stress in an area not normally staffed.

Reliability of this RRA can be improved through changes in design and testing practices. Typical RRA failure probabilities of the following four successive design improvements are presented in Table 3:⁹

  • Addition of a remotely operated ESD valve, actuated manually from the control room.
  • Upgrading of the ESD from a semiautomatic to a fully automatic lockout device tripped on power failure or high column pressure.
  • Inclusion of the control valve in the ESD function by installing an air-dumping solenoid valve on the air supply line to the control valve to force it to its fail-safe (closed) position. Although control valves may not ensure a tight shutoff, they provide a reliable means of mitigating relief.
  • Commitment to proper ESD system management (e.g., a keylock system combined with strict administrative control procedures). This minimizes the probability of the ESD function being inadvertently left in its bypass position.

Credible number of failed RRAs

The minimum relief load into the flare system results when all applicable RRAs operate as intended. The maximum possible relief load is the sum of all the relief loads calculated for PRV sizing purposes (worst-case scenario).

The maximum credible relief load for the flare system is given by the worst combination of RRA failures that could be expected to occur within the selected frequency for exceeding the flare system capacity. The credible number of RRA failures is determined by a probability analysis.

Each combination of RRA failures with potential for becoming a flare system sizing case must be analyzed to determine the adequacy of the flare system. If the back pressure or radiation constraints are still not met, further reliability improvements or piping changes may be required.
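
One way to shortlist the combinations worth a hydraulic check is to rank them by total load, as in the sketch below. The RRA names and relief loads are hypothetical, and ranking by mass flow alone is a simplification, since back pressure and radiation also depend on where each source ties into the flare header.

```python
from itertools import combinations

# Hypothetical relief loads, lb/hr: (load if the RRA works, load if it fails).
rra_loads = {
    "column A reboiler trip": (50_000, 250_000),
    "column B reboiler trip": (20_000, 120_000),
    "heater fuel lockout": (0, 90_000),
    "standby pump autostart": (10_000, 60_000),
}


def total_load(failed):
    """Total flare load when the RRAs named in `failed` do not operate."""
    return sum(
        fail if name in failed else ok
        for name, (ok, fail) in rra_loads.items()
    )


def worst_combinations(k, top=3):
    """The `top` heaviest combinations of exactly k failed RRAs."""
    ranked = sorted(combinations(rra_loads, k), key=total_load, reverse=True)
    return [(combo, total_load(combo)) for combo in ranked[:top]]


print(f"All RRAs work: {total_load(()):,} lb/hr")
print(f"All RRAs fail: {total_load(tuple(rra_loads)):,} lb/hr")
for combo, load in worst_combinations(2):
    print(f"{load:,} lb/hr with failures: {', '.join(combo)}")
```

Each shortlisted combination would still need to be run through the flare network model before it is accepted or rejected as the sizing case.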

This methodology is repeated until an optimum combination of reliability and piping changes is reached. Fig. 4 illustrates a sample probability distribution curve for the number of RRA failures in a plant. The failure probability of each RRA used in the construction of Fig. 4 is assumed to be 0.1.
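
A distribution of this kind can be sketched with a binomial model. The code assumes the RRAs fail independently with the same probability of 0.1; the number of RRAs (20) and the initiating-event frequency (0.1 per year) are hypothetical values chosen for illustration, and the function names are not from the original study.

```python
from math import comb


def prob_at_least(k, n, p):
    """Probability that k or more of n independent RRAs fail on demand."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k, n + 1))


def credible_failures(n, p, initiating_freq, acceptable_freq):
    """Largest k for which 'k or more RRAs fail' is still expected
    at or above the acceptable frequency."""
    credible = 0
    for k in range(n + 1):
        if initiating_freq * prob_at_least(k, n, p) >= acceptable_freq:
            credible = k
    return credible


N_RRAS = 20             # hypothetical
P_FAIL = 0.1            # per-RRA failure probability, as assumed for Fig. 4
INITIATING_FREQ = 0.1   # events per year, hypothetical
ACCEPTABLE_FREQ = 1e-4  # events per year

k = credible_failures(N_RRAS, P_FAIL, INITIATING_FREQ, ACCEPTABLE_FREQ)
print(f"Credible number of simultaneous RRA failures: {k} of {N_RRAS}")
```

With these assumed inputs, combinations involving more failures than the credible number fall below the acceptable frequency and need not be used as a flare sizing basis. Common cause failures would violate the independence assumption and would have to be treated separately.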

Considerable savings

Application of the reliability-based methodology has resulted in considerable savings in unnecessary capital investments while ensuring plant safety. The results from five flare studies are summarized in Table 4.

The reliability-based approach shows that, with some reliability improvements to plant utility systems and RRAs, the existing capacities of the flare systems in Table 4 were adequate for handling the worst-credible relief scenario for the plant. The cost of the improvements was minimal compared with the cost for implementing flare system modifications dictated by the traditional approach.

References

  1. Lees, F.P., Loss Prevention in the Process Industries, Vols. 1 and 2, Butterworths & Co. Ltd., London, 1983.
  2. Guide for Pressure-Relieving and Depressuring Systems, API Recommended Practice 521, Third Edition, American Petroleum Institute, Washington D.C., November 1990.
  3. Guidelines for Process Equipment Reliability Data, Center for Chemical Process Safety of the American Institute of Chemical Engineers, New York, 1989.
  4. Kletz, T.A., "The Applications of Hazard Analysis to Risks to the Public at Large," Proceedings: World Congress of Chemical Engineering, Amsterdam, 1976.
  5. Vervalin, C.H., "Hazard Evaluation," Hydrocarbon Processing, December 1986.
  6. Garrison, W.G., "Major Fires and Explosions Analyzed for 30-Year Period," Hydrocarbon Processing, September 1988.
  7. Harvey, B.H., First Report of the Advisory Committee on Major Hazards, HM Stationery Office, London, 1976.
  8. Bowen, J.H., "Individual Risk vs. Public Risk Criteria," Chemical Engineering Progress, Vol. 72, No. 2, February 1976.
  9. Proprietary studies for refining and petrochemical customers, Arinc Inc., Annapolis, Md.

J. Patrick Williams is a principal engineer with Arinc in Annapolis, Md. He has managed numerous safety and reliability projects for refineries and petrochemical plants, including seven reliability-based flare system assessment projects for U.S. and Latin American facilities. Williams holds a BS degree in chemical engineering from Catholic University of Valparaiso, Chile, and an MS degree in chemical engineering from the University of Maryland.

Michael D. Donovan is the general manager of the engineering division at Scientech-NUS, Gaithersburg, Md. While with Arinc, where he wrote this article, he headed the development efforts for the reliability-based methodology. Donovan, a licensed professional engineer in Georgia, holds a BS degree in mechanical engineering from Vanderbilt University, Nashville, and an MS degree in industrial management from Georgia Institute of Technology, Atlanta.

Copyright 1997 Oil & Gas Journal. All Rights Reserved.