May 21, 2004
The Honorable Linton Brooks
National Nuclear Security Administration
U.S. Department of Energy
1000 Independence Avenue, SW
Washington, DC 20585-0701
Dear Ambassador Brooks:
The staff of the Defense Nuclear Facilities Safety Board (Board) has reviewed the incorporation of safety into work planning at each of the following sites of the National Nuclear Security Administration (NNSA): Los Alamos National Laboratory (LANL), Lawrence Livermore National Laboratory (LLNL), the Y-12 National Security Complex (Y-12), and the Pantex Plant.
Effective implementation of Integrated Safety Management (ISM) must include a process for work planning that protects workers from activity-level hazards. In its reviews, the Board evaluated how each site has accomplished the five ISM core functions (define the scope of work, analyze the hazards, develop and implement controls, perform the work, and provide feedback and continuous improvement) for programmatic work as well as maintenance.
The Board’s reviews revealed significant deficiencies in the ability to effectively incorporate ISM into the process for work planning and control. Problems were noted in the tailoring of generic work documents, the processes used to identify and analyze hazards, the development of appropriate and unambiguous controls to be included in work packages, the use of a hierarchy of controls, and the ability to effectively identify areas for improvement and take action accordingly. A recent review of occurrence reports and of the Board’s site representatives’ reports revealed that while some progress has been made, these issues remain essentially unaddressed.
A previous letter from the Board dated August 7, 2003, dealt with the issues at LANL, while the report of the Office of Independent Oversight and Performance Assurance (OA) on its review of Y-12 identified several deficiencies in work planning. The Board notes that actions to address some of these issues are being developed at the sites; however, significantly more senior management attention is required. The enclosed report prepared by the Board’s staff is provided to assist in this effort. The Board requests that NNSA inform the Board of actions taken to address the issues discussed in the enclosed report and in the Board’s letter and OA report noted above.
John T. Conway
c: The Honorable David K. Garman
Mr. Edwin L. Wilmot
Mr. Daniel E. Glenn
Mrs. Camille Yuan-Soo Hoo
Mr. William J. Brumley
Mr. Mark B. Whitaker, Jr.
DEFENSE NUCLEAR FACILITIES SAFETY BOARD
Staff Issue Report
May 4, 2004
MEMORANDUM FOR: J. K. Fortenberry, Technical Director
COPIES: Board Members
FROM: D. Burnfield
SUBJECT: Summary of Reviews of Documentation and Practices Associated with Activity-Level Work Planning at National Nuclear Security Administration Sites
Purpose. This report provides a summary of observations made by the staff of the Defense Nuclear Facilities Safety Board (Board) resulting from visits to four sites of the National Nuclear Security Administration (NNSA). The purpose of these visits was to review the documentation and practices used for work planning at each site, in order to determine how the Integrated Safety Management (ISM) process is used to establish appropriate controls to protect workers from activity-level hazards. Initial reviews were conducted between March and October 2003 at the Y-12 National Security Complex (Y-12), Los Alamos National Laboratory (LANL), the Pantex Plant, and Lawrence Livermore National Laboratory (LLNL). At Y-12, the review was conducted in conjunction with a review by the Office of Independent Oversight and Performance Assurance. D. Burnfield led the reviews, assisted by C. Goff, V. Anderson, J. Contardi, and A. Jordan, and outside expert D. Volgenau. Additional information on current LANL status was provided by C. Keilers. Additional information was provided via telephone conversations and briefings to D. Burnfield from January to March 2004.
Background. Effective implementation of an ISM System must include a process for work planning and control that protects workers from activity-level hazards. The staff’s reviews at the four sites evaluated how well the ISM System had been implemented for work at the activity level. Fundamentally, work reviewed at each of the four sites visited can be categorized as programmatic (i.e., operations, and research and development) or maintenance/modification (subcategorized as preventive, corrective, and facility). The reviews emphasized how each site accomplishes the five core functions of ISM (define the scope of work, analyze the hazards, develop and implement controls, perform the work, and provide feedback and improvement) for activity-level work in these two categories. Discussions with responsible individuals were held, documentation and directives were reviewed, and tours of the workplace were conducted.
Summary. The staff’s reviews revealed significant deficiencies at the four sites visited with regard to the ability to incorporate ISM effectively into the process for work planning and to implement adequate controls to protect workers from activity-level hazards. Broad variation in both the effectiveness and implementation of work planning and execution of directives was noted among the four sites, as well as among the different directorates at each site. Generic work documents were not properly tailored to reflect actual work intended. The processes used to identify and analyze hazards and to ensure that appropriate and unambiguous controls are included in work packages were particularly weak. The use of a hierarchy of controls (i.e., preference for engineered controls over administrative controls and personal protective equipment) was not discussed in governing directives and/or was not effectively implemented. None of the sites had a system that would ensure the timely closeout of completed work packages. Lack of such a system limited the ability to effectively identify areas for improvement in activity-level work planning and control, and then to ensure that those observations would result in meaningful changes to future work planning. Although many of the weaknesses in the ability to effectively plan and execute activity-level work noted during these site reviews were known to NNSA through local office and/or headquarters reviews, NNSA had not raised these deficiencies to a level of significance that would have ensured improvement, with the possible exception of NNSA’s Los Alamos Site Office. That site office and LANL moved to a single work management approach in late 2003; however, recent events indicate that senior management needs to keep work planning and control improvements as a high priority.
Comments. The following observations, offered for each of the five core functions of ISM, support the above summary comments.
Define the Scope of Work—All sites had formal procedures for defining the scope of work and for planning the work. Roles and responsibilities were well delineated. However, broad variation in the effectiveness of procedures and methods for their implementation was noted, both within particular sites, and across the sites. Common weaknesses noted include the following: (1) instances of overly complex and cumbersome maintenance work packages, (2) a failure to adequately define the scope of work associated with generic work documents, and (3) lack of a clear division of work at facility and/or programmatic system boundaries. Y-12 had effective manuals and codes of practice, but had not adequately translated them into field-usable work packages and operating procedures. For example, a review of Y-12 maintenance work packages revealed that several contained multiple scoping errors that had been missed by the many reviewers who had signed the package. As noted in the Board’s report of August 7, 2003, LANL directives did not adequately cover planning for conceptual work, while directives for maintenance work had not been updated to reflect the principles and functions of ISM. At Pantex, the incorporation of ISM principles and concepts into activity-level planning and control lacks specifics with regard to how worker safety is ensured, and definitions of engineering responsibilities associated with work planning were implemented inconsistently. A system for prioritizing maintenance work has been instituted. It includes one priority system for modifications and corrective maintenance, another for preventive maintenance, and a third for tooling work. It was not clear how the priorities were integrated. LLNL has formal laboratory-wide procedures, with clear roles and responsibilities, for defining the scope of work. The process appears to work well for programmatic work, but does not work as well for nonprogrammatic work (i.e., maintenance and modification). In the latter case, the bridging documents associated with generic work authorizations do not adequately define the specific scope of work to be done in a manner to support analysis.
Analyze the Hazards—The processes used to identify and analyze hazards associated with activity-level work required improvement at all of the sites visited. The programs at each site had fundamental weaknesses, although the degree of weakness varied. LANL, LLNL, and Pantex were particularly weak in their ability to effectively identify and analyze industrial work hazards. Observed areas of weakness included (1) lack of a formal or effective training and qualification program in the use of processes in place for hazard analyses, (2) a lack of structure and formality in the methodology used to identify and analyze hazards, and (3) little use of a team approach that included both safety professionals and workers. Y-12 was in the process of implementing an automated job hazard analysis (JHA) system to replace the existing manual system, but a number of poor practices had been carried over to the automated process. In general, the staff observed that use of automated JHAs has resulted in a failure to adequately involve the workforce and subject matter experts. For example, as implemented, the automated JHA system leaves controls dispersed in the various work permits (e.g., Radiological Work Permits, Material Safety Data Sheets (MSDS), and Beryllium Work Plans) rather than gathering all of the controls for review by appropriate experts to resolve potential conflicts. Although the remaining sites demonstrated the ability to examine hazards critically at the authorization basis level for program work, they did not have effective programs for examining standard industrial hazards associated with the work. For example, few responsible individuals have received formal training in activity-level hazard analysis techniques even though a relevant course is available to LLNL personnel. 
The identification and analysis of hazards are routinely performed in an informal manner and typically take place during one or more work planning meetings that include members of the Environment, Safety and Health (ES&H) Team assigned to the project. A formal process as described in DOE Guide 440.1-1, Worker Protection Management for DOE Federal and Contractor Employees Guide, is not performed. The inability to identify and analyze workplace hazards at LLNL resulted in two life-threatening occurrences during 2003. At Pantex, the directives include an ES&H Checklist, an Activity Hazards Analysis Screen Form, and a Job Hazard Analysis Form, but they either lack guidance or are inconsistent regarding who is to complete these forms and when and how they are to be used. Completed forms are not included with the work packages. There was little evidence that a team approach, including worker involvement, was being used.
Develop and Implement Controls—The ability to properly identify and implement controls appropriate to activity-level work required improvement at all of the sites. The Board’s staff was not able to evaluate the adequacy of the control sets because of the above-noted weaknesses in the process used to identify and analyze hazards and to ensure that appropriate controls would be incorporated in the work procedures. Areas of common weakness included (1) no effective means to review planned controls for possible conflict or to ensure that additional hazards would not be introduced as a result of those controls, (2) use of the hierarchy of controls as defined in DOE Guide 440.1-1 not being discussed in governing directives and/or being ineffective, and (3) failure to reflect appropriate controls in written work instructions.
Perform Work—All sites had provisions in their directives for scheduling and authorizing work and for conducting pre-job briefings. Some of these provisions were more effective than others. Stop-work authorization for potentially unsafe or uncertain conditions was included and understood by the workers. None of the sites had an effective process for ensuring that work packages would be closed out in a timely manner and that appropriate action items would be identified. Some poor work practices were noted at Y-12 during an operations evolution, and during both a corrective and a preventive maintenance action. Further, it was not clear that work tasks were being authorized properly. Work packages reviewed at Pantex revealed technical errors or omissions. In one case, a procedure had been changed by an e-mail message from the system engineer without having been referred back to the planning team. Interviews with key individuals at LLNL indicated a lack of knowledge of, or at least appreciation for, the elements of an effective work planning and control process. Further, the LLNL work packages reviewed were not user-friendly; moreover, not all controls were in the work procedure, but rather, the reader was referred to other documents.
Provide Feedback and Continuous Improvement—The ability to effectively identify areas for improvement in activity-level work planning and control and then to ensure that these observations would result in meaningful changes to these processes required improvement at each of the sites visited. Y-12 was not effectively capturing feedback from workers regarding completed work, even though site directives included this provision. LANL did not have an effective system for capturing lessons learned from activity-level work. The system intended to capture lessons learned from activity-level work planning and execution at Pantex was very weak, and important potential lessons learned had not been effectively documented. None of the feedback and assessment provisions prescribed at LLNL effectively evaluated activity-level work planning. The embedded ES&H Teams are hindered in carrying out their independent surveillance and feedback responsibilities effectively because the teams are viewed as a support organization. The process for capturing feedback from work activities does not require input from the workforce and appeared to be only minimally effective. A review of site-wide self-assessments indicated that the self-assessment process is ineffective at evaluating the processes used for activity-level work planning and execution.
LANL Initiatives—Between October and December 2003, LANL implemented a single work management approach to address common safety issues identified in fiscal year 2003 assessments and accident investigations. This is an interim action until longer-term improvements can be implemented, now expected in late 2004. The need for immediate action was driven by a recognition that a significant injury or near-miss had occurred on average every 6 weeks during the previous 6 to 8 months. LANL investigations of these events identified common safety issues. NNSA Site Office oversight and involvement were pivotal in the scope and timing of the action. These are the most positive actions that have taken place at LANL to improve worker safety during the last 2 years. However, the interim process is inefficient, particularly for routine maintenance tasks, and it is far from complete, based on significant events that occurred as recently as March 2004 (e.g., a mobile crane striking a 13.2-kV overhead power distribution line, and workers discovering that they were working in an uncontrolled, unrecognized High Radiation Area). NNSA and LANL senior management need to keep work planning and control improvements as a high priority.
Future Actions. The Board’s staff notes that actions have been taken or are in the process of being taken to correct many of the deficiencies discussed in this report. LANL, for example, has been working to address issues raised in the Board’s letter of August 7, 2003. The staff will conduct future site reviews and observe assessments by the Office of Independent Oversight and Performance Assurance to determine the effectiveness of these actions. In follow-up discussions with LLNL, laboratory personnel indicated that all of the major deficiencies were being addressed in an overhaul of their work planning process.