Title: Gremlins That Jeopardize Control System Reliability

Deck: Four often-overlooked flaws that can creep into modern automation and control systems while you're not watching.

= = = = =

Are you committed to control system reliability, or are you merely involved? Ruthless oilman J.R. Ewing, of TV's Dallas fame, once explained the difference with, "It's like ham and eggs at breakfast: a chicken was certainly involved, but that pig was committed."

Hopefully, the folks who count on you to design and maintain a reliable control system won't expect quite such commitment from you. But they do expect your commitment to a control system that meets their expectations for performance, safety, and economy. And although you know your systems inside out, and often check their pulse, small problems can fly below the radar to dramatically decrease reliability and increase downtime.

My experience in field service and forensic engineering has shown that ordinarily dependable systems can be jinxed by relatively minor oversights. Smooth-running systems may be the most vulnerable because they are rarely analyzed. And while the four situations discussed below range from obvious to obscure, all of them may be closer than you think.

1. Incompatible electronic spares and diagnostics

I've never met a relay that stumped me, but I've certainly met my share of perplexing PLCs and other programmable devices. And one certainty about computerized equipment is that every element must be 100% compatible: hardware, operating software, application programming, and even interconnecting cables. Every aspect must fit perfectly to produce a reliable system.

With modern controls, even a small PLC can contain multiple processors as well as several I/O and communications boards, each driven by its own on-board microprocessor. All of those devices must have the proper operating code (either ROM or firmware) in order to function properly with a particular variant of your application code.
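The compatibility requirement above lends itself to a simple bookkeeping check. The following is a minimal sketch, not a vendor tool: the part numbers, firmware strings, and the `validated` table are all hypothetical, and it assumes each application program has been tested against a known list of part-number and firmware pairs. It flags spares whose firmware was never validated against the program, even though the part number matches.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Module:
    """One programmable module: a processor, I/O card, or comms board."""
    part_number: str
    firmware: str


def unvalidated_spares(spares, validated):
    """Return spares whose (part number, firmware) pair was never tested
    with the application program, even if the part number alone matches."""
    return [s for s in spares if (s.part_number, s.firmware) not in validated]


# Hypothetical data for illustration only.
validated = {("CPU-100", "2.1"), ("AI-8", "1.4")}
spares = [
    Module("CPU-100", "2.1"),   # same part, same firmware: validated
    Module("CPU-100", "3.0"),   # same part, later firmware: not validated
    Module("AI-8", "1.4"),
]

for spare in unvalidated_spares(spares, validated):
    print(f"Check before installing: {spare.part_number} firmware {spare.firmware}")
```

A list like this is only as good as the records behind it, so the firmware version on each shelf spare would need to be read from the part itself rather than assumed from its label.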
It's not always obvious that yesterday's program may be incompatible with today's apparently identical spare processor. Therefore, control system reliability is quietly threatened by the continually evolving software provided on replacement parts.

I tripped over this anomaly while analyzing an accident on a circuit board manufacturing machine. The equipment had been damaged in shipment to a new facility, and the PLC processor had been replaced with an assumed compatible part--same brand, same part number--but with a later version of firmware. The machine operated normally for weeks before a safety switch failed to prevent automatic operation.

The problem was traced to a "trick" in the ladder logic that had been used to work around a flaw in the firmware of earlier controllers. Although the flaw was later rectified onboard the controllers, the earlier work-around was incompatible with the later firmware. The manufacturer's literature explaining the situation had never made it to the factory floor.

Merely installing a spare card may therefore introduce unanticipated operation that goes unnoticed until a particular situation arises. Further, even a known incompatibility can result in expensive downtime until a suitable guru can be located to coordinate the mismatched parts and programs. And even the best guru can have his hands tied until a vintage DOS-based computer can be found to run the outdated software required to update the equipment.

The bottom line here is that although you have lots of spare parts on hand, are they really compatible? And will you be able to determine compatibility in real time while costly production is halted? If not, then now's the time to work out a better plan.

2. Incorrect fusion of Emergency Stop with Lockout-Tagout

Although ESTOP and LOTO sound like space-age cartoon characters, they are better known as essential components in any reliable control system. Both relate to effectively halting an industrial machine or process.
However, ESTOP and LOTO serve significantly different purposes that can inadvertently work against each other if improperly coordinated. Careful consideration during initial design and later modification is essential for reliable control system operation and maintenance.

ESTOP is aimed at producing a safe and immediate shutdown during any phase of operation. The primary goal is minimizing harm to people, property, production, and the planet. The details of performing an ESTOP vary greatly among situations, ranging from very simple (as in a drill press) to very complex (as in a roller coaster). Regardless of any particular design, all ESTOP systems share the common feature of simple and latched activation: once the red mushroom is pushed, a safe mode is activated, and nothing restarts until after a manual reset.

LOTO, on the other hand, addresses non-operational maintenance concerns that are generally handled by specifically trained maintenance staff. LOTO usually occurs via a pre-planned process rather than as a single action, and applies primarily to offline equipment.

Although aspects of ESTOP and LOTO often overlap, a major philosophical difference is that ESTOP does not demand relief of stored energy. In fact, stored energy is often required during ESTOP to prevent undesired motion. In contrast, LOTO requires removal of all energy sources such as electrical, pneumatic, hydraulic, and mechanical. LOTO also requires a means to test its effectiveness before maintenance begins. That particular aspect is one area that can affect control system reliability when ESTOP and LOTO functions overlap in the field.

I once analyzed an accident involving mixed ESTOP and LOTO that occurred on a hydraulically powered foundry conveyor. The incident began when a maintenance worker used ESTOP to halt the power source and disable the actuator controls before servicing a high-pressure hose.
Although the machine appeared dormant while the controls were tested, the energy trapped within the system was enough to remove his arm when the hose was disconnected.

One flaw in that LOTO process was the incorrect use of ESTOP to disable the solenoid valves. Subsequent testing after the ESTOP suggested that the conveyor was inert, but actually only the controls were dormant. The use of ESTOP rather than a separate LOTO control mode blocked the means to relieve the trapped pressure and properly check for stored energy.

If your controls are part of your lockout-tagout process, make certain the entire plan works as expected. It's possible that working controls are needed to make sure the equipment is indeed locked out.

3. Inadequate isolation of analog inputs

It's widely recognized that industrial analog inputs are rated to withstand specific over-voltage conditions. A lesser-known aspect of analog inputs is the potential (and undesirable) interference with adjacent points on a multipoint analog input device. This "isolation" rating is far from standardized, and is sometimes difficult to determine without digging through the detailed technical specs.

Input systems designated as "isolated" are generally non-multiplexed and do not suffer from this limitation. However, many analog input systems, especially in lower-cost equipment, do employ multiplexing circuits that are subject to adjacent channel interference. The detrimental effects may be only temporary, or they can persist until power is removed from the entire device.

This phenomenon was the cause of a gas detection failure in a supposedly fault-tolerant system based on redundant processors. Although the system correctly indicated a faulted input, the program had no means to sense the inaccurate measurements on other inputs that shared the same analog input multiplexer. This resulted in a high gas situation that went undetected until a flash fire destroyed a section of the facility.
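The blind spot in that program can be illustrated with a short sketch. It assumes a hypothetical channel map (`MUX_GROUPS`) recording which input channels share a multiplexer; a real map would have to come from the card manufacturer's specifications. When one channel faults, every other channel on the same multiplexer is treated as suspect rather than trusted.

```python
# Hypothetical map of multiplexer -> channels that share it.
MUX_GROUPS = {
    "mux_a": [0, 1, 2, 3],
    "mux_b": [4, 5, 6, 7],
}


def suspect_channels(faulted, mux_groups=MUX_GROUPS):
    """Return channels that are not faulted themselves but share a
    multiplexer with a faulted channel, so their readings may be wrong."""
    faulted = set(faulted)
    suspects = set()
    for channels in mux_groups.values():
        if faulted.intersection(channels):
            suspects.update(ch for ch in channels if ch not in faulted)
    return sorted(suspects)


# Channel 2 has faulted; list the channels that share its multiplexer.
print(suspect_channels([2]))
```

A program holding this information could alarm on the suspect readings instead of silently trusting them.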
Had the operators known of the isolation problem they could have disconnected the faulted input, thereby reinstating the integrity of the remaining inputs. The gas detection system was subsequently rebuilt using more expensive and space-consuming isolated inputs--probably a good plan in any fault-tolerant system. Regardless, understanding this phenomenon and planning for its eventual occurrence is essential when using non-isolated analog inputs. So, what's installed in your systems?

4. Outdated panel layouts

No one intentionally plans a confusing panel layout, yet examples of poor designs are common throughout industrial installations. Sometimes, no single person seems to know what all the buttons and lights really mean. Some flaws are inherent in the original design, while others creep into the system due to well-intended but poorly executed field modifications. But the end result is that poorly labeled controls can diminish reliability at the very time it's most needed.

I once investigated a horrific fatality caused by relocation of an existing panel to the opposite side of a battery recycling machine. Although the manual controls still functioned properly, they had been rotated 180 degrees out of sync with the physical orientation of a multi-ton hydraulic ram--moving the lever left now moved the ram to the right.

Under normal conditions this left-right disorientation was of little consequence. But when an equipment operator frantically pushed the control to free a trapped maintenance worker, the panicked rescue efforts actually made the situation worse. The end result was a decapitation under manual control.

Improper field modifications are a frequent source of control system problems. Perhaps a few replacement tags and minor wiring changes would have prevented that disaster. Are all your panels properly oriented and labeled?
Bonus tip

Although the problems described above aren't insurmountable, they all require focused attention that is difficult to schedule in a busy work environment. So here's one tip that won't cost you a bundle, and is so smart that you'll claim you thought of it yourself: recruit a co-op student or summer intern to survey your systems, and then have them document what's both right and wrong.

A mid-level Engineering or Industrial Tech student will eagerly delve into all those systems you'd rather not revisit. Their untainted eye will spot those illogical control layouts, missing documentation, and incompatible spare parts. They will be thrilled to salvage a vintage laptop and link it to a real-life PLC while building your disaster recovery kit. And you'll make good use of their final report that details the good, bad, and reliability-challenged aspects of all your control systems.

Don't just stay involved in control system reliability--get committed. That way we'll never have to meet by accident.

-end-

= = = = =

Illustration suggestions:
Section 1: Close-up of a circuit board ROM highlighting its firmware version number.
Section 2: Emergency stop mushroom button in a field installation.
Section 3: Wires terminating onto a PLC input card.
Section 4: Photo of a messy control panel (with sticky notes, missing buttons, etc.).
Other images: Seasoned plant-floor worker instructing a novice in the field.

= = = = =

Bio information for Arthur Zatarain, PE

Arthur Zatarain, PE, works in forensics and intellectual property through Artzat Consulting, LLC. He is also Vice President of TEST Automation & Controls, a provider of industrial and oilfield systems worldwide. He can be reached through www.artzat.com.


Artzat Consulting is owned by Arthur Zatarain, PE, in Metairie, Louisiana, a suburb of New Orleans. Artzat provides consulting and expert witness services to attorneys, insurers, and end users. Typical projects relate to equipment, automation, instrumentation, and control systems. Service is available nationwide with engineering licenses held in Louisiana, Alabama, California, and Alaska.

Forensic Engineer

A forensic engineer performs analysis and reporting on technical matters that are typically being processed through some form of legal proceeding. However, a legal environment isn't required for a forensic examination. The analysis may be performed merely to determine the cause of a specific event or condition. For example, a forensic examination may be made on a control system to determine why an accident occurred, or why a system did not perform as expected. The forensic analysis may be of software code such as ladder logic in a PLC, or it may involve hard-wired relay logic, electrical controls, power distribution, or instrumentation. Forensic engineering is therefore useful in a variety of situations regardless of the legal entanglement.

Industrial Equipment

Typical equipment includes the programmable logic controller (PLC), the distributed control system (DCS), and electric relay logic. PLC systems use ladder logic for most operations, while a DCS will often use function block programming. The concepts of PLC and DCS have merged into a unified control platform based on open architecture interfaces. The use of ladder logic is widespread due to its earlier application to relay logic circuits.

An expert witness is used to investigate and evaluate the technical and commercial aspects of accidents, intellectual property, and commercial matters. Artzat Consulting can assist clients in all these areas, with experience in steam boilers, paper mills, steel mills, burner management, and telemetry SCADA. Other areas include medical devices, flow measurement, meters, power distribution, and refrigeration.

Expert Witness Services

Expert witness services can be provided in any state, with experience in Louisiana, California, Alabama, and Alaska. Other states include North Carolina, Oklahoma, Illinois, Indiana, and Texas. Michigan has also been served, along with Washington, Colorado, Oregon, and the District of Columbia (DC). Any state such as New York or New Jersey can also be served by expert witness service. Professional credentials are important, such as licensed engineer or registered engineer. Also important is a master's degree in engineering or a similar field. A PhD is not a necessity for an expert witness because career experience and expert witness experience are more useful to the client than a PhD with no relevant experience.

Product Liability

A forensic engineer is useful for matters of product liability and product defects. Artzat Consulting has experience with product liability for industrial and commercial equipment. Product liability has also been analyzed for control systems, programmable controllers, ladder logic, and engineering design. Product liability can result from an original equipment manufacturer (OEM), or from a systems integrator who combines components into a complete system.

Forensic Engineering Locations

Service in Louisiana, Mississippi, Texas, and Alabama is efficient due to the proximity of Metairie to those areas. However, an airplane will take Artzat anywhere within the USA in a matter of hours. Travel to Alabama areas such as Birmingham, Montgomery, or Mobile is easy, with Huntsville also accessible by car. Visits to Houston, Dallas, San Antonio, and Austin are also less than one day away by car. A PhD is not unusual for an expert witness, but is not really important when compared to real-life experience with equipment, controls, and automation using PLC and DCS control system equipment.

Service in California includes Los Angeles, San Francisco, and San Diego as well as outlying Bakersfield and Antioch. Seattle is a bit far, but the airline does most of the heavy lifting. Travel to New York City (NYC) occurs easily on JetBlue and Delta. Once in NYC the entire tri-state area is easily accessible, as is upstate New York.

Service to New England is welcomed, so please inquire with your technical requirements for an expert witness. Travel to New England cities such as Boston is by JetBlue or other carriers, with onward connections to other New England cities.

Engineer for Machine Accident

An engineer may be required to serve as an expert witness or forensic engineer for a machine accident such as with a conveyor, power press, steel mill, or extraction machine. The instance could be an equipment accident, or it could be a process accident. A typical example is an expert engineer for a manufacturing accident. This could be an expert engineer or forensic engineer in an assembly plant, or an expert engineer in a production line or on a vehicle assembly line.

Oilfield accident

An expert engineer can be useful to evaluate an oilfield or oil and gas accident. Those events may involve oil and gas or related products such as water, CO2, H2S, and sulfates. The accidents occur on oil wells, gas wells, pipelines, storage tanks, and production vessels such as separators, treaters, waste heat recovery units, and water treating facilities. Such events can be generally divided into oil and gas drilling accidents and oil and gas production accidents. An oilfield accident requiring an expert engineer can occur onshore or offshore. The expert engineer can address a control system, production system, safety system, automation system, or instrumentation. The system can be electrical, electronic, hydraulic, or pneumatic. A computer control system can also require an expert engineer. An industrial engineer can also be used if the matter involves safety and production systems.

Automatic control

An expert engineer may be required for an accident involving automatic control. That expert could be an electrical engineer, control system engineer, or automation engineer. A mechanical engineer or someone with mechanical engineering experience can also be useful for an automatic control accident. A certified systems integrator is someone who can be an expert engineer for automatic control. Systems integration involves combining multiple pieces of equipment and technology into a single control system. This involves design, programming, fabrication, testing, installation, and maintenance.

Industrial accident

An industrial accident may require an expert engineer or forensic engineer to analyze and evaluate the control system connected with the event. The accident may have nothing to do with the control system. Still, a forensic engineer may be required to analyze the system to determine that the control system was not at fault.

Equipment accident

An equipment accident can require an expert engineer or expert witness to help evaluate the circumstances and situation, including the mechanical and electrical components of the equipment. This can be industrial equipment, process equipment, a manufacturing system, commercial equipment such as a heater or dryer, or a pump or compressor. Industrial equipment also includes flow meters, electrical switchgear, control switches, buttons, and instrumentation. End devices measure pressure, temperature, level, and other physical quantities. Much equipment is used for food production, packaging, transportation, storage, and conveying. Other equipment serves metal and material processing such as steel mills, paper mills, refineries, petrochemical plants, and tank farms. A vehicle can also be equipment itself, or it can contain devices related to an equipment accident.