
 


Don’t Gamble with Your SIS
Understand the benefits and limitations of safety instrumented systems
By Arthur Zatarain, P.E.
 
As a wise singer once crooned, you have to “know when to hold ’em, and know when to fold ’em.” But Kenny “The Gambler” Rogers merely had to beat long-shot odds to win at his game. Outside the casino, designers of industrial control systems don’t have the luxury of being right only 51% of the time. For many manufacturing and process systems, a control system failure - even for a second - simply isn’t an option. Hence, it’s important that control systems deliver safe and reliable performance, even when things go wrong.
 
Also important is the need to maintain production uptime; while additional control devices help prevent accidents, they also reduce uptime by increasing opportunity for nuisance trips. You need to find that delicate balance between safety, production reliability, and overall cost when designing, operating, and upgrading production control systems.
 
The established concepts of safety and reliability for industrial controls are detailed in ANSI/ISA 84.1, Application of Safety Instrumented Systems for the Process Industries. This ANSI/ISA standard applies within the United States, and is equivalent to the IEC 61511 standard in Europe and other areas. These standards reveal that statistical analysis of safety instrumented systems is a science in itself. Fortunately, only a few basic concepts are required to appreciate the simplified discussion presented in this article.
 
Safety in numbers
Although using only a single control device often is appropriate, much of safety instrumented system (SIS) design incorporates multiple devices to perform a single control function. The multiple units are cleverly arranged to accommodate the anticipated failure of any single device. Although formal terms such as replicated, complementary, or diverse aptly apply to the various arrangements, the catchall term “redundant” is normally used to describe any flavor of multiple-device configuration.
 
The SIS concept uses an “M out of N” terminology to describe device configuration; reliability is based on M number of properly functioning components out of a total of N. This concept often is noted as MooN (spoken as “M out of N”). For example, 1oo2 (“one out of two”) might represent an arrangement of two relays in series; depending on context, this arrangement can safely shut down a process with only one of the two devices, or it can continue safe operation with only one of two. The terminology for each context is the same, but the applications are quite different. Further examples of typical SIS architectures include:
• 1oo1: A single fuse or rupture disk that limits an over-current or over-pressure malfunction in a near infallible mode.
• 1oo2: Two power supplies connected in parallel to accommodate shutdown of either one. Only “one out of two” is required for continued safe operation.
• 2oo2: Two high-level sensors connected in series that permit a tank inlet valve to open. “Two out of two” devices, both indicating there is no high level, are required to safely open the valve.
• 2oo3: Triple modular redundant (TMR) pressure transmitters configured in a voting system. “Two out of three” devices must agree to continue safe production should one of the three transmitters fail in any manner.
 
Note that each of these examples addresses a specific malfunction of a control device. This important concept will be explored a bit more later on. Figure 1 illustrates four examples of increasingly complex SIS architectures; all are based on simple relay contact motor control.
 
 
Figure 1. Start with your needs. These relay contact motor control schemes show how the degree of reliability desired determines the degree of complexity needed in the control system.
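
The MooN idea described above can be pictured as a simple vote count. The following is a minimal sketch, not taken from the standard: the function name moon_vote is hypothetical, and it assumes each device reports a plain true/false vote.

```python
def moon_vote(votes, m):
    """Return True when at least m of the device votes are True (MooN)."""
    votes = list(votes)
    if not 1 <= m <= len(votes):
        raise ValueError("m must be between 1 and the number of devices")
    return sum(bool(v) for v in votes) >= m


# 2oo3: two of the three devices must agree before the system acts.
print(moon_vote([True, True, False], m=2))   # True
print(moon_vote([True, False, False], m=2))  # False
# 1oo2: either device alone is enough.
print(moon_vote([False, True], m=1))         # True
```

What a True vote means (trip the process, or permit it to keep running) depends on the context, which is why the same MooN label can describe quite different applications.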
 
 
Demanding reliability
A key SIS concept used to evaluate reliability is called probability of failure on demand, or PFD. Its calculation is complex, and often controversial, but is simplified here to denote the percentage of time that a device is expected to not perform its control function properly. As with golf, the goal with PFD is a low score.
 
Different levels of PFD might apply to the same device based on its role in the overall system. For example, a pressure sensor might have a 4% probability of causing a nuisance trip, but only a 2% probability of causing an unsafe situation. Because these probabilities are calculated on a per year basis, and accumulate over time, a device with a 4% PFD is estimated to malfunction once every 25 years (4% failure/year x 25 years = 100% failure). And because the PFD is estimated for each device, the net reliability of a total system rapidly decreases if multiple devices affect a single control function. Therefore, low PFD values for each device are prime design criteria.
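
The arithmetic in the paragraph above can be sketched in a few lines. This follows the article's simplified per-year view of PFD; the function names are hypothetical, and the assumption that devices fail independently is an illustration choice, not something the standard guarantees.

```python
def years_to_fail(annual_probability):
    """Simplified estimate: a 4% chance per year implies one failure in 25 years."""
    return 1.0 / annual_probability


def combined_annual_probability(device_probabilities):
    """Chance per year that at least one of several independent devices fails."""
    survive = 1.0
    for p in device_probabilities:
        survive *= 1.0 - p
    return 1.0 - survive


print(round(years_to_fail(0.04), 1))  # 25.0 (nuisance trip)
print(round(years_to_fail(0.02), 1))  # 50.0 (unsafe failure)

# Three 4% devices that can each upset the same control function.
loop = combined_annual_probability([0.04, 0.04, 0.04])
print(round(loop, 4), round(years_to_fail(loop), 1))  # 0.1153 8.7
```

The last line shows why low per-device values matter: three devices in one function cut the expected time between nuisance trips from 25 years to under nine.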
 
The values shown in Figure 2 compare the reliabilities (expressed in years to fail) obtained with typical SIS architectures. These values assume a single component with PFDs of 4% nuisance and 2% safety, as mentioned above. The 1oo1 values represent the reliability of a single device. Those numbers might be adequate for some situations, but they degrade rapidly when multiple devices affect a single system.
 
 
Figure 2. Redundant schemes 2oo2 and 2oo3 help avoid nuisance trips, but risk more frequent unsafe failures than the simpler 1oo2 scheme.
 
For redundant device configurations, it’s interesting to note that the simplest configuration, 1oo2, has the longest expected time before an unsafe condition occurs. However, it also has the shortest expected time before a nuisance trip. Systems that must operate reliably as well as avoid unsafe situations might be better served by the more sophisticated solutions found in the 2oo2 and 2oo3 modes.
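
The trade-off can be illustrated with a rough calculation. The sketch below does not reproduce the values in Figure 2. It assumes independent devices, the 4% nuisance and 2% unsafe figures used earlier, and that M is the number of devices that must vote before the system trips; the function names are hypothetical.

```python
from math import comb


def at_least(k, n, p):
    """Probability that at least k of n independent devices show an event of probability p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def moon_rates(m, n, nuisance, unsafe):
    """Return (nuisance-trip, unsafe-failure) probabilities per year for a MooN trip vote."""
    # A nuisance trip needs at least m devices to trip spuriously.
    # An unsafe failure needs enough dead devices that fewer than m can still vote,
    # which means at least n - m + 1 dangerous failures.
    return at_least(m, n, nuisance), at_least(n - m + 1, n, unsafe)


for m, n in [(1, 1), (1, 2), (2, 2), (2, 3)]:
    trip, danger = moon_rates(m, n, nuisance=0.04, unsafe=0.02)
    print(f"{m}oo{n}: nuisance about every {1 / trip:.0f} yr, "
          f"unsafe about every {1 / danger:.0f} yr")
```

Under these assumptions the ordering agrees with the discussion above: 1oo2 goes longest before an unsafe failure but trips most often, while 2oo2 and 2oo3 give up some of that safety margin in exchange for far fewer nuisance trips.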
 
Hold ’em or fold ’em
Two design philosophies for accommodating predictable failure are called fault-tolerant and fail-safe. Although these schemes are first cousins, they represent two distinct responses to a control malfunction. The fault-tolerant mode will “hold ’em” and let the control function continue to operate correctly. The fail-safe mode, however, will “fold ’em” and admit defeat while safely ceasing normal operation. Both modes have valuable - but different - roles in reliable control system design. The following simplified definitions (adopted from the SIS standard) highlight the similarity and difference between the two control concepts:
• Fault-tolerant: Continued correct execution in the presence of a specific malfunction.
• Fail-safe: Assumes a predetermined safe state in the event of a specific malfunction.
 
The similarity between the fault-tolerant and fail-safe modes is their delivery of a predictable response to a specific malfunction. The difference between the two modes lies in their responses; fault tolerance maintains the normal control function, while fail safe ceases normal operation in favor of an acceptable safe state. Note that both control modes require some portion of the overall affected system to remain functional. A control design that continues predictable operation after it itself has totally failed is neither reasonable nor reachable.
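
The two responses can be contrasted in a small sketch. The state names and function names here are hypothetical, chosen only to mirror the definitions above.

```python
SAFE_STATE = "shutdown"  # hypothetical predetermined safe state


def fault_tolerant(primary_ok, backup_ok):
    """Hold 'em: keep operating correctly while either source still works."""
    if primary_ok or backup_ok:
        return "running"
    # Nothing functional remains, so no predictable response is possible.
    raise RuntimeError("total failure: no functioning portion left to respond")


def fail_safe(device_ok):
    """Fold 'em: cease normal operation and assume the safe state on a malfunction."""
    return "running" if device_ok else SAFE_STATE


print(fault_tolerant(primary_ok=False, backup_ok=True))  # running
print(fail_safe(device_ok=False))                        # shutdown
```

The exception in fault_tolerant reflects the point made above: both modes depend on some part of the system remaining functional.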
 
Identification of specific malfunctions that require a predetermined response is another key aspect of failure-mode design; neither mode by itself can provide a predictable reaction to unknown or indeterminate malfunctions. Specific predictable malfunctions must first be identified such that a failure mode can be designed to accommodate them.
 
Fault tolerance
Generally speaking, no single device can provide a fault-tolerant control function. Most often, a combination of similar (or identical) devices is required to provide “replication” of a particular role such that they perform the same function independently. ANSI/ISA 84.1 labels this as “redundant” if the replicated functions are identical. An alternate method is called “diversity,” in which devices perform similar control functions by means of different technology, process interface points, or computer features.
 
Figure 3 represents a simple 1oo2 fault-tolerant system in which an AC-to-DC power supply is paired with a battery backup to power a DC load; this arrangement uses two so-called diverse components that provide fault-tolerant operation for the specific malfunction of power source failure.
 
 
Figure 3. Can’t lose. If the main power source, the AC-to-DC supply, fails, the system continues to operate because the battery backup remains functional.
 
 
More elaborate fault-tolerant examples include replicated input/output systems and logic solvers that use a 2oo3 voting scheme to accommodate I/O or processor malfunctions. Together with the simple battery backup of Figure 3, these examples mark the two ends of the fault-tolerance spectrum. The more robust voting designs are appropriate for industrial processes that can’t withstand abrupt suspension, and for any safety system that demands the highest level of reliability.
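
For analog inputs, such as three pressure transmitters, one common way to vote is to use the middle of the three readings. The article doesn't specify a voting method, so mid-value selection is an assumption here, and the function name is hypothetical.

```python
from statistics import median


def mid_value_select(a, b, c):
    """Return the middle of three readings so one bad transmitter can't steer the result."""
    return median([a, b, c])


print(mid_value_select(101.2, 99.8, 0.0))    # 99.8  (one transmitter failed low)
print(mid_value_select(101.2, 99.8, 250.0))  # 101.2 (one transmitter failed high)
```

Whether the failed unit reads low or high, the selected value stays with the two healthy transmitters, so the control function continues.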
 
As shown in Figure 2, 2oo3 voting systems promise the longest duration without nuisance trips while maintaining safe operation. That high-end performance comes at relatively high cost, although currently far less than when the concept went mainstream several decades ago. You can minimize total system cost by applying the principles of ANSI/ISA 84.1 and other related standards in a consistent and organized manner. Careful partitioning of the overall control and safety system isolates the critical process controls that require advanced SIS concepts.
 
Fault tolerance certainly isn’t appropriate for every control loop in a plant, but nearly every production environment can benefit from targeted application of a non-stop and safe control system design.
 
Fail-safe
While fault tolerance grabs most of the trade press, fail-safe controls still serve as journeymen in many control system designs. Continued normal operation, however, typically isn’t the goal of a fail-safe mode; the role of fail-safe is to place the control function in a predictable state in which other control functions can operate the ongoing process safely. So, although the control function has technically failed, safe overall process operation isn’t compromised in a fail-safe control system.
 
Fail-safe designs proudly say, “Sure, I might break one day, but I’m not taking anyone down with me.” Consider the lowly electrical fuse; it gives its life in the name of safety by preventing an over-current condition that could cause a fire, or worse. The affected process, however, must tolerate a total loss of power if it’s to rely on a simple fuse for protection.
 
However, many control situations demand a more sophisticated fail-safe solution to a specific malfunction, such as safely withstanding a loss of control power or input signal. The most common fail-safe actions are fail closed or fail open, in which the device output is forced open or closed when a specific malfunction occurs. Other options include fail-in-place, and fail to a specific value. These fail-safe actions permit the still-functioning device to place a controlled element, such as a throttle valve, into a predetermined state to maintain overall process safety.
 
Consider the current-to-pneumatic positioner shown in Figure 4. The local controller is designed to fail closed on loss of pneumatic motive power; if the air supply fails, a spring inside the positioner automatically closes the valve, regardless of the 4-20 mA input signal. Note that the positioner’s fail-safe feature doesn’t apply to failure of the positioner itself; the feature instead covers the specific malfunction of an external power source. Failure of the positioner itself would be a different specific malfunction that must be covered by another device. You must understand this important concept and apply it when using any device in a fail-safe situation. First determine the specific malfunction, then select control components that can remain operational while covering for that anticipated failure.
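
The positioner's behavior can be modeled in a few lines. This is only a sketch: the function name, the linear 4 mA = closed and 20 mA = fully open scaling, and the clamping of out-of-range signals are all assumptions made for illustration.

```python
def valve_position_percent(signal_ma, air_supply_ok):
    """Hypothetical fail-closed positioner: 4 mA -> 0% open, 20 mA -> 100% open."""
    if not air_supply_ok:
        # Specific malfunction covered: loss of motive power.
        # The spring closes the valve regardless of the input signal.
        return 0.0
    clamped = min(max(signal_ma, 4.0), 20.0)
    return (clamped - 4.0) / 16.0 * 100.0


print(valve_position_percent(12.0, air_supply_ok=True))   # 50.0
print(valve_position_percent(12.0, air_supply_ok=False))  # 0.0
```

Note what the model leaves out: a failure of the positioner itself. As the text says, that is a different specific malfunction and needs its own coverage.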
 
 
Figure 4. One form of fail-safe. A spring-loaded positioner will close the process valve if the pneumatic air supply fails. (Diagram labels: valve positioner, fail-close; 4-20 mA signal; air supply, marked X; fail-safe malfunction = loss of motive power.)
 
 
Play your cards right
Fault-tolerant and fail-safe designs clearly serve an important role in reliable control system design. Understanding the complexity, benefits, and costs of each mode is essential to keeping important processes safely online with the uptime demanded in high-production environments. Sometimes a few dollars of fail-safe control can prevent dangerous situations that harm people, property, and the planet. But those safety dollars also must prevent nuisance trips that can lead to costly lost production. Proper application of fault-tolerant and fail-safe designs is, therefore, vitally important when designing and maintaining process control systems. Every control function must be considered carefully because, as ole Kenny advised, you have to know when to hold ’em, and know when to fold ’em.

 

 

 


Artzat Consulting is owned by Arthur Zatarain, P.E., in Metairie, Louisiana, a suburb of New Orleans. Artzat provides consulting and expert witness services to attorneys, insurers, and end users. Typical projects relate to equipment, automation, instrumentation, and control systems. Service is available nationwide, with engineering licenses held in Louisiana, Alabama, California, and Alaska.
