By Adil F. Dalal
Here you will find a practical approach to self-motivation and higher conscious existence as a corporate employee, an executive and most importantly as a powerful human being.
(Based on “Pillar X: Learn From Failures” of the Award-Winning Book, The 12 Pillars of Project Excellence: A Lean Approach to Improving Project Results)
A proactive analysis uses the “why” approach with a focus on system- and process-level modes of failure. Prior to any major undertaking, leaders utilize tools like Failure Modes and Effects Analysis (FMEA) to identify the critical modes of failure and address them according to the priority of the risks involved. They are wise enough to know that merely conducting FMEAs does not guarantee failure-free projects or endeavors. But when a failure does occur, leaders generally take a proactive approach to prevent future failures: they take the time to conduct a “5 Why” analysis4, identify the root cause(s), eliminate the root cause(s), and update lessons-learned databases. Although this seems quite simple, it is one of the most difficult aspects of problem solving, as it requires not only leaders with the foresight and ability to peer beyond the horizon but also a culture that supports those leaders and formalizes the process. In the long term, taking short-cuts by instituting a reactive approach to problem solving will almost certainly jeopardize the product and service quality, credibility, growth, and even survival of the most successful and powerful organization.

A perfect case in point is NASA. Its history clearly indicates that the price of taking a reactive, Human Level approach to solving problems is extremely high. The catastrophic explosion and disintegration of the Challenger over the Atlantic Ocean on January 28, 1986 remains one of the darkest moments in NASA’s history of space exploration. Even today, nearly everyone believes the cause of the explosion was the O-ring that failed (a Human or Mechanical mode of failure).
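FMEA prioritization is commonly expressed as a Risk Priority Number, RPN = severity × occurrence × detection, with each factor scored from 1 (best) to 10 (worst), and the highest-RPN modes addressed first. A minimal sketch of that ranking, using hypothetical failure modes and scores (none of these come from the text), might look like:

```python
# Minimal FMEA sketch: rank failure modes by Risk Priority Number (RPN).
# RPN = severity * occurrence * detection, each scored 1 (best) to 10 (worst).
# The failure modes and scores below are hypothetical illustrations.

failure_modes = [
    # (description, severity, occurrence, detection)
    ("Seal fails at low temperature",     9, 4, 7),
    ("Sensor reports stale data",         6, 5, 3),
    ("Operator misreads unit of measure", 8, 3, 8),
]

def rpn(severity, occurrence, detection):
    """Risk Priority Number for one failure mode."""
    return severity * occurrence * detection

# Address the highest-risk modes first, as the priority of risk dictates.
ranked = sorted(failure_modes, key=lambda m: rpn(*m[1:]), reverse=True)
for desc, s, o, d in ranked:
    print(f"RPN {rpn(s, o, d):3d}  {desc}")
```

The ranking, not the absolute numbers, is what drives action: it tells the team which modes of failure to eliminate or mitigate before the undertaking begins.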
However, according to the Rogers Commission, created by the US government to investigate the accident, the root cause of the failure was not the infamous O-ring; the true root causes identified were “System Level” failures at NASA, including the “failure of communication” and “decisions based on conflicting or incomplete information”. According to the hearings of the U.S. House Committee, the primary root cause identified was the “weakness in the decision-making processes at NASA and its various contract organizations”5 – a much more serious “System Level” problem. In fact, this was a rather kind conclusion. It has been pointed out that NASA had strong political reasons to go through with the launch, despite the fact that the engineering staff was aware of the high probability of O-ring failure at low temperatures; the engineers had refused to sign off on the launch. Richard Feynman, a Nobel Laureate and a leading physicist of the period, conducted a personal investigation. He found that the engineers estimated the probability of a mission failure to be on the order of 1 in 100, a figure that has turned out to be very close to what actually transpired. NASA leadership, on the other hand, estimated the probability of failure at less than 1 in 1,000. According to Feynman, “It would appear that, for whatever purpose, be it for internal or external consumption, the management of NASA exaggerates the reliability of its product, to the point of fantasy. … When playing Russian roulette the fact that the first shot got off safely is little comfort for the next”6. Unfortunately, the leaders at NASA failed to deal with the system causes, beginning with their own erroneous beliefs about the likelihood of failure. This culture of “decision-making based on conflicting or incomplete information” also resulted in other major failures.
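The practical gap between the two estimates is easy to quantify: with a per-mission failure probability p, the chance of at least one loss in n missions is 1 − (1 − p)ⁿ. The short sketch below applies that formula to the two estimates over 25 flights (Challenger flew on the 25th shuttle mission); the arithmetic is my own illustration, not a figure from the Rogers report:

```python
# Chance of at least one mission loss in n flights, given a per-flight
# failure probability p: 1 - (1 - p)**n. Illustrative arithmetic only;
# the 1/100 and 1/1000 figures are the estimates quoted by Feynman.

def prob_at_least_one_loss(p, n):
    """Probability of one or more failures in n independent missions."""
    return 1 - (1 - p) ** n

for label, p in [("engineers, ~1 in 100", 1 / 100),
                 ("management, ~1 in 1,000", 1 / 1_000)]:
    print(f"{label}: P(loss in 25 flights) = "
          f"{prob_at_least_one_loss(p, 25):.1%}")
```

Under the engineers’ estimate, a loss somewhere in the first 25 flights was better than a one-in-five proposition; under management’s, it looked like a rounding error. The difference is exactly the “fantasy” Feynman was describing.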
We witnessed the crash-landing of the Mars Climate Orbiter (MCO) on the Red Planet on September 23, 1999 – a loss of $125 million to the taxpayers. The root cause of failure given by NASA was a unit-conversion error (Human Level), but a commission determined that the root cause that truly led to the failure of the Mars Climate Orbiter mission was NASA’s “culture of poor communications, inadequate training, and significant stress caused by multitasking” (a System Level failure). NASA again did not address this root cause, and the result was a far more serious incident: the loss of the Columbia shuttle and its crew of seven astronauts during re-entry on February 1, 2003.7 It is clear that had NASA leaders taken a System Level approach rather than a Human Level approach to failure analysis, the history of space exploration might have been more effective and accident-free.
According to Dr. Deming, “the aim of leadership should be to improve the performance of man and machine, to improve quality, to increase output, and simultaneously to bring pride of workmanship to people. Put in a negative way, the aim of leadership is not merely to find and record failures of men, but to remove the causes of failure: to help people to do a better job with less effort”8. Thus, great leaders and successful corporations must learn the art of System Level analysis in order to be proactive in their quality assurance and quality control programs. They must realize that the fundamental flaw of the reactive approach, focused on the Human Level mode of failure, is that it not only fails to address the root cause of the problem but also breeds a culture of fear, distrust, and risk aversion, lacking creativity and innovation. A proactive approach, on the other hand, could help them create a culture of engagement, scientific problem solving, proactive human-error prevention, and “open learning”9, innovation and creativity.
Case Study: Using a Proactive Approach to Determine the Root Cause of an Accident
In May 2011, while consulting at a manufacturing facility, I was asked to identify the root cause of an accident in which a mechanic, attempting to fix a problem with a material hopper on the night shift, narrowly escaped having his hand cut off; he lost the tip of a finger on his right hand and had to be rushed to the hospital. Upon investigation, it was determined that the mechanic had put his hand inside the hopper box to loosen the pellet clumps that were causing the equipment to shut off. With his right hand still inside the hopper, he used his left hand to push a switch, activating the pneumatic knife that shuts off the hopper throat to prevent a few thousand pounds of plastic pellets from rushing onto the floor. A “Caution” sign posted on the equipment clearly indicated that no body parts were to be introduced to dislodge materials from the hopper. At first glance, it was an “open-and-shut case” of negligence, and Human Resources got involved to initiate punitive action against the mechanic for ignoring the caution sign and putting himself and the company at risk. One of the HR managers approached me to investigate further using my System Level approach. I used the simple 5-Why approach as follows:
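The exercise itself was originally presented as a figure. As a minimal sketch, the chain of questions can be reconstructed from the findings of the investigation described in this case study; the exact wording of each “why” below is my own paraphrase, not the team’s worksheet:

```python
# Minimal 5-Why sketch, reconstructed from the hopper-injury investigation.
# Each answer becomes the subject of the next "why"; the final answer is
# the candidate root cause (here, a System Level cause, not a human one).

five_whys = [
    ("Why was the mechanic injured?",
     "He reached into the hopper while activating the pneumatic knife."),
    ("Why did he reach into the hopper?",
     "Pellet clumps were shutting the equipment off, and he had no tools "
     "to clear them."),
    ("Why were the pellets clumping?",
     "The plastic resin was being dried at too high a temperature."),
    ("Why was the temperature too high?",
     "Air flow to the back hopper was inadequate."),
    ("Why was the lack of air flow not caught?",
     "The system was not designed to monitor air flow or temperature "
     "on the back hopper."),
]

for i, (question, answer) in enumerate(five_whys, start=1):
    print(f"Why #{i}: {question}\n  -> {answer}")

root_cause = five_whys[-1][1]
print(f"\nRoot cause (System Level): {root_cause}")
```

Note how the chain deliberately moves past the first, Human Level answer: stopping at “Why #1” would have blamed the mechanic, while the fifth “why” lands on the missing monitoring capability in the system itself.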
It took the team about 45 minutes to go through the complete exercise of reviewing data, collecting evidence from eye-witnesses, and asking the 5 Whys. The root cause identified was a possible lack of air flow, which resulted in the plastic resin pellets being dried at too high a temperature; this caused the clumping of resin that required attention for the equipment to function properly, ultimately leading to the injury. The system was not designed to monitor the air flow or temperature on the back hopper. It was actually a System Level issue. Had we misdiagnosed it as a Human Level problem, the result could have been punitive action, possibly including the firing of an employee who had been provided no tools to fix the problem and so used a coat hanger, a screwdriver, and ultimately his own hand. Although the actions of the employee were without question unsafe, he was trying to do his job within the system designed for him and with the limited resources provided. It was definitely a culture issue, and several short- and long-term steps were immediately implemented to address and eliminate the root cause, so that the other 24 machines would not develop similar issues in the future – resulting not only in safer operations but also in improved productivity. Thus, this example clearly demonstrates that an investment of some time and effort in the System Level approach to problem solving can yield significant hard and soft benefits for the organization in the long term. To steer the organizational culture toward System Level failure analysis, we need to constantly ask ourselves and our leaders, on projects and other undertakings, some critical questions:
1. Do we consider every failure as a “new lesson learned” to build on, so that the organization never encounters the same failure ever again?
2. Do we encourage our teams to learn from mistakes throughout the entire project and use the “closing phase” to focus on identifying failures, analyzing the root causes, and learning from each and every failure?
3. Do we categorize failures based on the root cause analysis as “system level,” “process level,” and “human level” failures?
4. Do we believe in finding and eliminating the system-level failures first as they can be serious and can have an adverse effect on more than one project within the organization?
5. Do our leaders always utilize a “system-level” analysis on projects and consciously attempt to create a culture of “open learning”, innovation and creativity?
[Figure: 5-Why analysis – Problem: Injury while trying to de-clump material in a hopper box]