September 13, 2003
Overseers Missed Big Picture as Failures Led to Blackout
By ERIC LIPTON, RICHARD PEREZ-PENA and MATTHEW L. WALD
Twenty-two minutes before North America's biggest blackout, officials at two agencies charged with ensuring the safe and steady flow of power across the Midwest conferred by telephone about what they thought were troubling but still routine electrical line problems in Ohio.
"It looks like they lost South Canton-Star 345 line," said Roger Cummings, a control room operator at PJM Interconnection, referring to a 345,000-volt line, a major artery in the system.
Don Hunter at the Midwest Independent System Operator, or Midwest I.S.O., monitoring another part of Ohio and a different utility company, absorbed that news and replied, "I know that
Now, weeks after the vast power failure, the significance of that exchange, at 3:48 p.m. on Aug. 14, has become clearer. While the two agencies charged with monitoring the grid and warding off huge problems were discussing the loss of two power lines, there were, in fact, eight lines down, and others headed for failure.
Neither agency, it turns out, could see most of the picture.
Indeed, nearly 10 minutes after the blackout had swept from Michigan to Connecticut, monitoring officials in the Midwest were just starting to grasp how far the crisis had spread. "We want to know if anybody else is experiencing any problems," an official asked innocently.
In the end, then, it was not just a circuit breaker tripping or a transmission line sagging into a tree that caused the system to fail. Documents and interviews make clear that the blackout may well have resulted, just as surely, from the fact that the people whose job it was to respond to those failures lacked much of the information about what was happening.
In the 65 minutes in which a sequence of power line failures built up to a cascading blackout across the Midwest, the Northeast and parts of Canada, these two regional agencies took no active steps to stop the progression, largely because they were unable to see the full extent of it.
They were, that afternoon, like air traffic controllers trying to keep order in the sky without knowing where all the planes were.
That portrait emerges from a detailed examination of hundreds of pages of telephone transcripts, interviews with industry officials and experts, and a compilation of the numerous timelines of system failures that have been assembled by utilities and government officials in recent weeks.
That review offers a far deeper appreciation not only of what crucial elements went wrong that day, but also of the fundamental weaknesses in the way the nation's electricity grid is overseen and policed, especially in the Midwest. For example:
In the weeks since the blackout, electricity experts, including some involved in the government investigation, have turned to PowerWorld, a sophisticated computer program that can simulate conditions on the entire North American grid. The program shows that if the people monitoring the grid had known all the problems unfolding around them, they would have seen the need for decisive action and they could have limited the catastrophe, or even prevented it.
"It is an unacceptable level of confusion," said Ian A. Hiskens, a former electric utility executive and a professor of electrical and computer engineering at the University of Wisconsin, who ran PowerWorld simulations. "They should know what the state of their system was, that is fundamental to operating. And by 4:07 they are still not sure what has happened."
The Midwest I.S.O. declined to make any senior officials available for interview. A spokeswoman, Mary Lynn Webster, said, "We operated the system based on the information we had at the time."
The United States and Canadian governments are still in the preliminary stages of their joint investigation into the blackout. Secretary of Energy Spencer Abraham issued the government's first timeline of the blackout yesterday, a four-hour chronology of line failures and power plant shutdowns.
But he said answers as to cause and blame were not near. "We are not going to compromise quality for speed," he said.
An Early Failure
"We've got a huge problem."
The first call for help to the Midwest I.S.O. on Aug. 14 came not from FirstEnergy of Ohio in the hour before the blackout. Telephone transcripts released by the I.S.O. show that Spencer House, a controller at another utility,
The transcripts show that from 12:22 p.m. until at least 3:31 p.m., just 39 minutes before the blackout, the attention of Midwest I.S.O.'s monitors was consumed mostly by Cinergy's concerns.
Cinergy had lost the use of two major transmission lines south of Indianapolis. The solution, the company and the Midwest I.S.O. agreed, was to take the strain off the remaining lines by shifting power output from one part of the state to another.
This process was a textbook example of one function for grid monitors, intervening to prevent an isolated problem from becoming a major disruption. Many experts say it is also what should have happened later in the day, in eastern Ohio.
Yet a look at the Cinergy incident also illustrates the inherent weakness of the Midwest I.S.O., which, unlike other I.S.O.'s, can only urge companies to act responsibly. It cannot order them to do so.
To take the strain off Cinergy's lines, the Midwest I.S.O. turned to another power company,
Concerned that they might lose another 345,000-volt line, Mr. House told Midwest I.S.O., "I think we're a trip away from, 345 trip away from setting a little history."
Nine minutes later, Doug Kiskaden of Cinergy warned Midwest I.S.O. officials that if one more piece of the company's grid were to fail, it would "be in imminent danger of collapsing."
Coordinating the response to Cinergy's troubles involved at least six officials at Midwest I.S.O. They were hindered, too, by the failure of an I.S.O. computer program that left them unsure about what to do.
The program, called a state estimator, helps to monitor grid conditions and tries to predict what would happen if breakdowns were to occur.
"The state estimator has been down for an hour and a half," Ron Benbow, a lead monitor with the I.S.O., said at 2:36.
Ultimately, despite such obstacles, Cinergy was able to fix the problem by reducing power output at two of its own plants.
Both Cinergy and the Midwest I.S.O. insist that there was no causal connection between these events in Indiana and the later collapse that began in Ohio. But government officials and independent experts say it is probably too soon to rule out that prospect, or the possibility that the two crises were initiated by a common, still-unidentified cause.
What is beyond dispute is that the Cinergy problem taxed the monitors.
The Crunch Begins
The first overt sign of trouble in eastern Ohio came at 1:31 p.m., when a 597-megawatt unit at FirstEnergy's Eastlake power plant, in the northeast corner of the state, shut down for reasons that are still not known. Officials at the Midwest I.S.O. did not know about the failure until FirstEnergy called to let them know, more than 40 minutes later.
A number of plants in northern Ohio were already down that day, mostly for maintenance. A major line in the southwest part of the state owned by another utility, DPL, had already failed because of a brushfire. Vast amounts of power were coursing across the remaining lines to keep cities like Cleveland and Akron powered.
At 3:05 p.m., FirstEnergy's 345,000-volt Chamberlin-Harding line near Cleveland went out, straining other lines further. Once again, the I.S.O. learned of the failure well after the fact, through a phone call.
This pattern would be repeated several times in the next hour, as Midwest I.S.O. officials saw snippets of what was happening — voltage irregularities, high loads on some lines — but no more.
The Midwest I.S.O. receives, day in and day out, a flow of information from the utilities under its monitoring, like data on the performance of power lines. But there are crucial limits in that information that set the Midwest I.S.O. apart from other monitoring groups in the country.
The grid monitors who control New York and vast parts of the Mid-Atlantic region, for instance, constantly review computer screens that tell them the condition of every major transmission line and even some smaller lines. A problem on any line sets off visual and audible alarms in their control rooms.
But computers at the Midwest I.S.O., which has been in operation for less than two years, are not set up to sift the mountain of information and display most of it in the control room. Instead, its main monitoring computer system takes information from only selected "flow gates," places where problems are thought likely to occur, and the data is updated much less often than at other control rooms.
On Aug. 14, Midwest I.S.O. officials say, their system was monitoring just one of seven 345,000-volt and 138,000-volt FirstEnergy lines that failed, though they were told of two others. As a result, the transcripts show, I.S.O. controllers were perplexed by the disturbances they were seeing on the system.
"I wonder what is going on here," Mr. Hunter said at 3:36 p.m. "Something strange is happening."
Ordinarily, there would be ways for the Midwest I.S.O. to fill in the gaps in its knowledge. One is the state estimator program, which operated only intermittently that day.
The I.S.O.'s main backstop is the utilities, which call to pass on information about their systems. But on Aug. 14, FirstEnergy's computer problems prevented it from recognizing its own line failures. Transcripts show a reversal of the usual exchange, with FirstEnergy repeatedly asking the I.S.O. for information about its own equipment.
At 3:57, Jerry Snickey, in a FirstEnergy control room in Akron, told Mr. Hunter at the Midwest I.S.O. that the voltage on a major line was dangerously low. "Do you have any idea on what is going on?" he asked.
Mr. Hunter replied that the Hanna-Juniper line was out, adding, "I am wondering if it is still out."
"We have no clue," Mr. Snickey said. "Our computer is giving us fits too. We don't even know the status of some of the stuff around us."
One Region, Two Monitors
Contributing to the lack of information was the odd division of the Midwest region between two intertwined authorities, PJM and the Midwest I.S.O. The region had no I.S.O. until 2001, when some utilities formed the Midwest I.S.O., while others signed with PJM, which historically had overseen Pennsylvania, New Jersey, Maryland, West Virginia and the District of Columbia.
Among those that joined PJM were American Electric Power and DPL, utilities to the south of FirstEnergy. After the FirstEnergy lines began to shut down, so did several of American Electric's.
But electronic signals about American Electric and DPL lines go to PJM's two command centers in Pennsylvania, not to the Midwest I.S.O., whose terrain was being indirectly affected.
In interviews, PJM officials said they knew of the line failures in their region, but not most of FirstEnergy's troubles. The Midwest I.S.O. has said that it, in turn, did not know of most of the failures in PJM territory.
"It is fairly clear from the transcript nobody was aware of the exact nature of the problem or the extent," said Michael J. Kormos, vice president for operations at PJM.
PJM and the Midwest I.S.O. have long acknowledged needing to share more information. Even before the blackout, they developed a plan to begin doing so.
Had the I.S.O. a full appreciation of what was going on, and the authority to order power companies to make quick adjustments, it might have been able to take corrective action, according to some experts who have studied the blackout.
"It appears that the system was slowly slipping into the oblivion during the two hours before the blackout," said Mani V. Venkatasubramanian, an associate professor of electrical engineering and computer science at Washington State University. "It may be that they were blinded by the lack of available data."
Executives at the International Transmission Company, a Michigan-based distributor of electricity overseen by the Midwest I.S.O., said they were prepared to take steps to stabilize the grid but had no inkling of the unfolding trouble.
Richard Schultz, vice president for engineering and planning at International Transmission, said that even with limited information, the I.S.O.'s are responsible for reacting to problems. "The buck stops with the security coordinator and there are two security coordinators at issue here on that day," he said.
A System Overwhelmed
The transcripts show how handicapped those coordinators were.