System Blackout Causes and Cures

10.6.03   Damir Novosel, Senior Vice President and General Manager T&D Consulting, KEMA

    How Blackouts Happen

    The transmission grid that serves the eastern US and Ontario is congested by large power transfers from neighboring regions (to allow for open competition) and is vulnerable to disturbances. Although there is a tendency to point at a "single" event triggering cascading outages, major blackouts are typically caused by multiple contingencies with complex interactions. Major blackouts are rare because they require a sequence of low-probability events to occur. The exact sequence of events is difficult to predict, as there is a practically infinite number of operating contingencies. As the system changes (e.g. independent power producers selling power to remote regions, load growth, new equipment installations), these contingencies may differ significantly from the expectations of the system designers. As a chain of events unfolds at various locations in the connected grid, operators cannot act quickly enough to address fast-developing disturbances.

    The likelihood of low-probability events escalating into a cascading outage increases when the grid is already under stress due to certain preconditions such as weather (high temperatures, thunderstorms, fog, geo-magnetic disturbances, etc.). There are, however, a number of controllable preconditions and factors for blackouts. A congested grid is a major precondition. Public pressures and the "Not in My Back Yard" sentiment make it difficult to site lines especially in the more densely populated heavy load areas. Another important factor in blackouts in recent years is a lack of reactive support close to the load, which is needed to sustain adequate voltage levels. It is very important to assure that sufficient reactive power is available (e.g. by strengthening the system with reactive power sources) and to exhaust all generator reactive power capabilities when required. Other factors include aging equipment that is prone to failure, inadequate maintenance practices such as insufficient tree trimming, and insufficiently coordinated equipment maintenance and generation scheduling during stressed conditions.

    In general, the low level of investment in the grid in recent years is also a major contributing factor; the challenge here is identifying who is to invest and how to recover the costs of such investment. As blackouts rarely happen, it is not viable to require a 2-3 year return on investment. The system now runs under tight operating margins in order to sustain profitability without increasing rates. Regulatory uncertainty at both the state and Federal levels has impeded transmission investment and prevented necessary system coordination on a regional scale.

    The combination of these factors makes power systems more susceptible to disturbances. It is the cascading events that cause disturbances to propagate and turn into blackouts. When a system is stressed and system and equipment faults occur, the chain of events starts. For example, some generators and/or lines are out for maintenance, or a line trips due to a fault. Other lines get overloaded, and another line comes in contact with a tree and trips. There is a hidden failure in the protection system (e.g. outdated settings or hardware failures) that causes another line or generator to trip. At that stage, the power system is faced with overloaded equipment, voltage instability, transient instability, and/or small signal instability. If fast actions (e.g. load shedding, system separation) are not taken, the system cascades into a blackout.
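    The chain described above can be illustrated with a deliberately simplified sketch. It is not a power-flow model: it assumes a corridor of parallel lines with identical impedances, so the transfer splits equally among the lines in service, and it assumes protection trips one overloaded line at a time. The function name, its parameters, and all MW values are hypothetical and chosen only for illustration.

```python
def simulate_cascade(ratings_mw, transfer_mw, first_outage, shed_mw=0.0):
    """Toy cascade on parallel lines that share the transfer equally.

    ratings_mw   -- thermal rating of each line (hypothetical values)
    transfer_mw  -- total power the corridor must carry
    first_outage -- index of the line lost to the initiating fault
    shed_mw      -- load shed once, right after the first outage
    Returns (indices of lines still in service, power still delivered).
    """
    in_service = set(range(len(ratings_mw))) - {first_outage}
    transfer = max(transfer_mw - shed_mw, 0.0)
    while in_service:
        # Equal split is an assumption (identical line impedances).
        share = transfer / len(in_service)
        overloaded = [i for i in in_service if share > ratings_mw[i]]
        if not overloaded:
            return sorted(in_service), transfer
        # Protection trips the weakest overloaded line; flows then redistribute.
        in_service.remove(min(overloaded, key=lambda i: ratings_mw[i]))
    return [], 0.0  # every line has tripped: the corridor is blacked out


ratings = [1000.0, 1000.0, 1200.0, 1200.0]
print(simulate_cascade(ratings, 3200.0, first_outage=0))               # no action taken
print(simulate_cascade(ratings, 3200.0, first_outage=0, shed_mw=300))  # fast load shedding
```

    With these made-up numbers, losing one line with no corrective action overloads each remaining line in turn until none is left, while shedding 300 MW right after the first outage keeps the three remaining lines within their ratings.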

    Evaluation of disturbances shows that protection systems have been involved in 70% of the blackout events. Examples include zone 3 distance trips on overload and/or low voltage, and sensitive ground over-current trips on high unbalance during high load. Inadequate or faulty alarm and monitoring equipment, communications, and real-time information processing can further exacerbate disturbances in the system. Either information is not available or operators are flooded with alarms, so that they cannot make proper decisions quickly.

    Human error and slow operator response are major contributing factors to cascading outages. As a disturbance develops, operators in various regions are faced with questions such as 'Is the best course of action to sacrifice our own load, cut interties, or get support from neighbors?' and 'Should we help or should we separate?'. An important aspect in designing connected power systems is that individual systems should not allow cascading outages to spread throughout the system. For example, as one part of the network is heading towards, or is in, a blackout state, neighboring power systems should not go totally black as well.

    There are a number of other contributing factors that allow a blackout to spread, including lack of coordinated response among control areas. Each region focuses primarily on its own transmission system. Each of the individual parts can be very reliable, yet the total connected system may not be as reliable. While accounting systems have boundaries, electric power and critical communications do not obey these boundaries. Intertie separations are not pre-planned for severe emergencies, leaving the decision to the operators, which is very difficult during a fast-developing disturbance. In any case, it is desirable to take automated actions before the system separates, or to separate it in a controlled manner. Special protection schemes are wide area schemes designed to detect abnormal system conditions and initiate pre-planned automatic and corrective actions based on system studies. Missing or inadequate special protection schemes make it more difficult to prevent the disturbance from spreading.

    Phenomena that manifest during a power-system disturbance are divided into the following categories: angular instability, voltage instability, overload and power system cascading.

    In recent years, voltage instability has been one of the major reasons for blackouts. The power system becomes unable to maintain voltage so that both power and voltage are controllable. A typical scenario here is high system loading due to heavy transfers across the grid, followed by events that initiate relaying actions (a fault, line overload, or generator hitting an excitation limit). As the grid becomes even more overloaded, more reactive power is consumed causing voltages to drop. As it is difficult to transfer reactive power across distances, it is desirable to provide enough reactive power close to the load.
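    The loading limit behind this scenario can be illustrated with the standard textbook two-bus simplification, which is an assumption added here and not the author's model: a stiff source of voltage E feeds a unity-power-factor load through a lossless reactance X. With no reactive power delivered to the load, the load voltage is V = E cos(delta) and the delivered power is P = E^2 sin(2 delta) / (2X), so the most that can be delivered is E^2 / (2X), reached at V = E / sqrt(2). The function names and per-unit values below are hypothetical.

```python
import math


def pv_curve_point(e_pu, x_pu, delta_rad):
    """Load voltage and delivered power (per unit) at angle delta.

    Assumes a lossless reactance x_pu and a unity-power-factor load.
    """
    v = e_pu * math.cos(delta_rad)
    p = e_pu ** 2 * math.sin(2.0 * delta_rad) / (2.0 * x_pu)
    return v, p


def nose_point(e_pu, x_pu):
    """Voltage and power at the loading limit, which occurs at delta = 45 degrees."""
    return pv_curve_point(e_pu, x_pu, math.pi / 4.0)


# Hypothetical corridor: two parallel lines give X = 0.25 pu; losing one gives X = 0.5 pu.
for x_pu in (0.25, 0.5):
    v_nose, p_max = nose_point(1.0, x_pu)
    print(f"X = {x_pu} pu: limit = {p_max:.2f} pu at V = {v_nose:.3f} pu")
```

    In this simplified model, doubling the reactance (for example when one of two parallel lines trips) halves the power that can be delivered, which is one way a system that was operating normally can find itself beyond the limit after a tripping action.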

    However, regardless of provisions for reactive power support, heavy loading and tripping actions can bring the power system to a "point of no return" where voltage can no longer be maintained. Voltage instability can cause the whole grid to black out unless actions are taken to maintain the voltage. Those actions include switching on shunt capacitors and SVCs, blocking tap changers, exhausting reactive generation resources, and, as a last line of defense, shedding load (e.g. on under-voltage).

    During major blackouts, some areas very often separate from the rest of the system, causing a power imbalance. Where there is surplus generation in an area, coordinated generator tripping should be pursued to avoid a sudden loss of power leading to a complete blackout. Where there is surplus load in an area, a well-coordinated under-frequency load shedding scheme should be employed and coordinated with the generator under-frequency schemes.
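    A staged under-frequency load shedding scheme of the kind mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions, not a relay implementation: the stage thresholds, the shed fractions, the generator trip setting, and the function names are all hypothetical and are not taken from any utility or standard. Time delays and rate-of-change elements are ignored.

```python
# Hypothetical settings for a 60 Hz system: (frequency threshold in Hz, fraction of load shed).
EXAMPLE_STAGES = [(59.3, 0.10), (58.9, 0.10), (58.5, 0.10)]


def load_to_shed(frequency_hz, stages, already_tripped=frozenset()):
    """Return (fraction of load to shed now, set of stage thresholds tripped so far).

    A stage operates once, when frequency is at or below its threshold.
    """
    tripped = set(already_tripped)
    fraction = 0.0
    for threshold_hz, stage_fraction in stages:
        if frequency_hz <= threshold_hz and threshold_hz not in tripped:
            tripped.add(threshold_hz)
            fraction += stage_fraction
    return fraction, frozenset(tripped)


def is_coordinated(stages, generator_trip_hz):
    """True if every load shedding stage acts before generators trip on under-frequency."""
    return all(threshold_hz > generator_trip_hz for threshold_hz, _ in stages)


shed, state = load_to_shed(59.0, EXAMPLE_STAGES)          # first stage operates
shed_more, state = load_to_shed(58.4, EXAMPLE_STAGES, state)  # remaining stages operate
print(shed, shed_more, is_coordinated(EXAMPLE_STAGES, generator_trip_hz=57.5))
```

    The coordination check reflects the point made in the text: if a generator under-frequency setting sat above the lowest shedding stage, generation would be lost before the load meant to rescue it had been shed.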

    Reviewing an example of a recent system blackout can lead to further insight into the causes and cures for such events. The August 10, 1996 outage in California [1] alone cost $1 billion. An hour before the disturbance, three 500 kV lines tripped. This resulted in a heavy power flow (4700 MW) from North to South. A fourth line tripped due to a fault and a fifth line tripped due to a design flaw. As a result, 230 kV and 115 kV lines experienced heavy loads. The 115 kV line tripped due to a hidden relay failure, and the 230 kV line sagged and flashed over to a tree. Voltage declined, and power units went to full excitation and tripped. Power oscillations caused the tie line to trip and caused out-of-step conditions, separating the system and causing further cascading separations. In this case, 30,500 MW of load was lost and 7.5 million customers were affected. Although each blackout is different, the August 14, 2003 blackout experienced some very similar patterns.

    How to Prevent Blackouts

    The events of August 14 underscored the need for increased investment in the transmission system, but any investment should be preceded by a prudent analysis of which investments are most necessary. There is no silver bullet solution to preventing blackouts, but there are general measures that can and should be taken to minimize the impact of disturbances. Since the recent outage was caused by a complex sequence of cascading events, electric utilities, industry regulators, and state and Federal legislators must undertake the following steps to determine what happened, understand why it happened, and prevent it from happening again.

    Step One: Analysis & Audits

    Multiple regulatory and government agencies have already begun an intense analysis of the blackout data to identify what actually happened and to put to rest the rumors and conjecture offered thus far in the media. An immediate focus must be placed on the possible failures of the control and protection systems intended to restrict and contain power system disturbances within smaller areas. System protection design changes are also needed to limit the impact of future blackouts. Critical alarm monitoring systems must be maintained in top operating condition, and newer alarm analysis technologies should be deployed to detect and to prevent the spread of major disturbances. Other equipment failures may be involved, requiring detailed failure and root cause analyses. Most important, however, inadequate flows of information between neighboring control centers may have resulted in an inexcusable time delay in reacting to an escalating problem.

    Other reviews should also be conducted in the immediate term including audits of planning, operating, and maintenance practices to identify the factors that contributed to the recent blackout. Transmission system capabilities for handling today's higher flow levels and the huge volume of transactions must be investigated more thoroughly. Maintenance procedures should be revised to reduce the rate of equipment failures in critical transmission equipment, and vegetation clearance around transmission lines must be reexamined and corrected as necessary.

    Step Two: Preventive and Corrective Actions

    The analysis and audit process will identify the next set of actions required to minimize the possibility that a similar outage will happen again. Short-term upgrades will likely be required, such as improved monitoring and diagnostics as well as remedial action schemes and training for system operators. Monitoring systems with faster detection and wider communication capabilities could play a key role in the near term, as well as improved transient stability analysis capabilities and improved control algorithms that can take quicker and more appropriate corrective actions. The development of special protection schemes (SPS) can help manage system disturbances and prevent blackouts. These wide area protection schemes detect abnormal system conditions and take pre-planned, automatic corrective actions based on system studies, with the goal of restoring acceptable system performance. NERC has defined standards of acceptable SPS performance. It is expected that planning and budgets for 2004 will be significantly influenced by the August 14, 2003 blackout.

    In summary, the following measures should be taken to prevent blackouts:

    • Improve monitoring, diagnostics, and control center performance (e.g. availability of critical functions needs to increase to 99.99%);
    • Secure real-time operating limits on daily basis (e.g. dynamic line ratings);
    • Implement Special Protection Schemes and Adaptive Protection;
    • Perform protection coordination studies on a regular basis as system conditions change;
    • Test not only individual relays but system protection applications;
    • Perform dynamic voltage and transient stability studies on a regular basis as system conditions change;
    • Condition assessment of aging infrastructure and improved maintenance;
    • Operator training, including a coordinated approach among control areas.

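    To put the 99.99% availability target from the first measure in concrete terms, the short calculation below converts an availability figure into allowed downtime. It assumes a 365-day year (8760 hours); the function name is hypothetical.

```python
def allowed_downtime_minutes(availability, period_hours=8760.0):
    """Downtime budget, in minutes, implied by an availability over a period.

    period_hours defaults to one 365-day year, which is an assumption.
    """
    return (1.0 - availability) * period_hours * 60.0


for target in (0.999, 0.9999):
    print(f"{target:.2%} availability allows about "
          f"{allowed_downtime_minutes(target):.0f} minutes of downtime per year")
```

    Under that assumption, 99.99% leaves a budget of roughly 53 minutes per year for a critical control center function, about one tenth of what 99.9% would allow.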
    Step Three: Public Policy, Transmission and Future Investments

    A tightening of procedures at utility control centers and at the independent system operators (ISOs) is expected, and it will have long-term effects on the industry. Regulatory actions can be expected as well.

    Preventing a blackout of this magnitude in the future will require a combination of long-term investments in the transmission system and much needed improvements in public policy. Significant investment in transmission hardware will be made in the next decade. The retirement and replacement of transmission equipment at the end of its useful life will be another important remedy for skyrocketing failure rates and potential outages in the future. But beyond the aging infrastructure issue, the transmission grid must also be upgraded and expanded to handle the increased energy flows. High-voltage power electronics devices would allow more precise and rapid switching to improve system control and to help increase the level of power transfer that can be accommodated by the existing grid. Distributed energy technologies could also play a role in relieving certain power flow demands on the transmission and distribution networks as well as in improving reliability. While the new investment will certainly include some new transmission lines, it will also encompass new power delivery technologies, including thyristor switches, superconducting materials, and VAR controls.

    Today's communication and computer technologies have produced a new revolution in the power industry, especially in the field of power system control where vertical integration is much improved. Computer relays communicate not only with a center, but also with each other. This in turn will facilitate the overall system-wide protection and control philosophy. Microprocessor-based coordinated protection, monitoring and control systems are the key to innovations in preventing cascading disturbances. The implementation of an advanced wide-area protection system first requires a significant improvement of the existing decentralized systems. Decentralized subsystems will have to utilize advanced algorithms to make local decisions based on local measurements and/or selected remote information. With an information infrastructure, it is possible to tie all the monitoring, control and protection devices together through an information network. The key to a successful solution is fast detection, fast and powerful control devices, a communication system, and smart algorithms; in other words, a "True Wide Area Protection and Control System".

    In summary, the following measures should be taken to prevent blackouts:

    • Regulatory actions to assure coordination and enable efficient system planning, permitting, and market operations;
    • Strengthen transmission network, through building lines and cables and distributed generation;
    • Increase transmission power flow control capability by use of HVDC links and FACTS;
    • New technologies enable coordinated wide-area protection, monitoring and control systems as cost-effective solutions (a true Wide Area Protection & Control System);
    • Energy Storage;
    • Superconductivity.

    1. "Wide Area Protection and Emergency Control", Working Group C-6, System Protection Subcommittee, IEEE PES Power System Relaying Committee, January 2003.

    Copyright 2004 CyberTech, Inc.

    Readers Comments

    Ravinder Singh
    Dear Damir, There is very little difference between the articles by trading professionals, metering companies, law firms, and politicians and your thoughts. You will agree with me: power lines and substations are always overdesigned by a factor of 4 to 8. Transmission lines are designed to keep transmission losses under approx. 1% per 100 km, SO THEY ALWAYS HAVE CAPACITY TO TRANSMIT MUCH MORE ENERGY THAN MAX. DESIGNED CAPACITY. The cord of a 500 W appliance can take a load of 2 kW or more.

    Your idea of auditing to determine the cause of the blackout is also very immature: instruments record the electricity flow parameters, which will give indications of the system disturbances and their timings.

    I can't make out why transmission people (all) prefer one grid for 280m people. It's possible to bake one Christmas cake for all Americans, but it is not preferred because it will be too costly, it will be too difficult to distribute, it will not suit the taste of most Americans, and it will lose freshness before reaching the consumers. So the cake is baked in homes and local bakeries. Likewise it will be economical to have 50 to 100 grids, and there may be interconnections to transfer surplus electricity. Under a power disturbance the smaller grids can be isolated and quickly revived with assistance from nearby functional grids. Local generators will serve local needs efficiently and ensure electricity is delivered most economically to consumers.

    Auditing should be carried out to BREAK UP THE COST OF ELECTRICITY CHARGED TO CONSUMERS into fuel cost, generation cost, trading cost, distribution cost and billing cost. Please write about patents you have secured, which may be more interesting.---Ravinder Singh

    Dwight DaCosta
    Damir, I found the article to be very informative. However, if I should summarize what I understand from what you wrote, it is:

    "Do what we power engineers know should be done, and this would never have happened, and will never happen"

    While that is a true statement, it leaves us nowhere since we should have been doing all these things before. What will make us do them now, given the other current prevailing issues and environment which have driven us in this direction in the first place and which still exist?

    I think stronger regulation and sanctions are crucial. We have to provide the right incentive for ALL to do what "THEY KNOW" should be done and enough disincentive for them to find the cheapest, less reliable way out.

    Continue to let your voice be heard.

    Sam Mullen
    This "overview" article touched on an impressive number of causes and possible cures, and though mildly technically oriented, could be understood by many folks with a modest power background and certainly by most people who have served a time in system operations (system control). I congratulate the author for bringing the writing to earth, given his advanced background, as well as to the level of detail for more advanced readers. This is not always simple to do.

    However, this article is just as much about how to facilitate better system design and operation as it is about blackouts, and perhaps more so. And for that I applaud the writing as well. Implementing more DC ties has been the song of many experts involved in improving system stability at the HV transmission ties level, and I am a strong advocate of doing more of this if the funds are available. Getting more generation nearer the load center is also a very strong argument, but putting it there continues to be a problem. Give me an answer to how we can overcome this hurdle and sell it to everyone, and you'll see the more stable system we dream of, but at what price?

    More than most of what I have read about the blackout(s), this article offers more than "what might have happened" and reads to me like an excellent review of what utilities must continue to comprehend and operate by and with - everyday. I think many of us know the potential solutions, but find solutions difficult to implement - as always. Should we be more optimistic that solutions will come on a more timely basis? I think the jury is still out.

    Nice job, Damir

    Sam Mullen Author, Emergency Planning Guide for Utilities

    Roger Clarke-Johnson
    Alas, Mr. Ravinder, it is you who are immature. Mr. Novosel correctly states that there is a mountain of unsynchronized sequence-of-events data, from different devices, owned and calibrated by different organizations, that must be sifted through. "The man with two watches doesn't know what time it is."

    Add to that the potential for - indeed, the inevitability of - lawsuits following the identification of the smoking gun(s), and you see that the only mature, prudent course of action is to call for a complete and independent audit of all the data.

    To Mr. Novosel I would ask: When was the last time this system was modelled? Did the utilities stop consulting the model, or did it become obsolete behind their backs? Paraphrasing Mr. DaCosta: Why was this massive outage a surprise? It gives utility consultants and system operators both a bad name.

    Finally, to use a real world analogy: If you know the brakes are bad on your car, you don't go very fast, and you surely don't go down steep hills. To blame ancient equipment is a bit of a Red Herring unless you can prove there were multiple concurrent (not sequential) failures in the transmission/distribution grid.

    Wallace Brand
    Damir Novosel has provided an excellent summary of the conventional thinking on what can be done to prevent or at least to minimize future blackouts resulting from cascading outages.

    But it is time to go back to first principles of system planning. He rightfully comments that Public pressures and the "Not in My Back Yard" sentiment make it difficult to site lines especially in the more densely populated heavy load areas. And he correctly notes that major blackouts require a sequence of low probability events to occur. He suggests that low investment in the grid is a major factor.

    First principles of system planning suggest that all these be taken into account in system planning as possible failure contingencies. Therefore, in system planning, why not avoid all these possible failure contingencies simply by distributing generation and locating it at or near the load it serves? That will avoid the need to get permissions to locate transmission lines and large central stations near people who object to them. In that way one can also avoid those multiple transmission contingencies with complex interactions, since no transmission will be required. Finally, one can avoid inadequate tree trimming along transmission rights of way, insufficiently coordinated equipment maintenance and generation scheduling during stressed conditions, and insufficiently funded transmission.

    In addition to avoiding blackouts, we can, by using fuel cells and renewable distributed generation, also avoid almost all toxic pollution. The small scale of these devices would also permit a large number of electric power suppliers thus facilitating real competition in the supply of electric power.

    Going back to the 1880s, we note that small central stations were placed within one half mile of the load they served. When Nikola Tesla's polyphase transmission became available, there was a trend to ever larger central generating stations because these were more efficient. Since fuel cost was the biggest single cost of generation, there was a trend to larger generating stations needing transmission to collect the load of first several and then many load centers, placing a substation at each load center and collecting load from individual sites by primary distribution. Larger central stations also brought down the unit investment cost of the capacity.

    There is no intrinsic value in transmission. The reason for the transmission was the inadequacy of the small sized conventional Carnot cycle generators to convert the chemical energy of fuel into electric power efficiently. Edison's Pearl Street "Big Jumbo" reciprocating steam engines are estimated to have had an efficiency of only 8%. Over the years central station size increased to as large as 1,400,000 kW, increasing the efficiency of base load steam turbines to 38% and in supercritical units to 43%. Later came the somewhat smaller aeroderivative simple and combined cycles with slightly higher efficiencies. Now we have tiny simple cycle fuel cell generators of 300 kW which can supply a DC output at an efficiency of 54% and even after inversion to AC can do so with an efficiency of 47%. Integrate the fuel cell with a turbine and raise the efficiency to 65%.

    Is this a real solution? How about the high first cost of the fuel cell capacity?

    Yes it is a real solution. The fuel cell manufacturers have already solved the problem of efficiency for small generators. They can solve the high cost problems of manufacture by volume production. The production/cost curves slope down drastically with volume.

    For effective system planning skip the transmission.

    Wallace Edward Brand

    Len Gould
    Mr Brand. Your repeating of the (conventional) arguments for distributed generation fails at the point of the required fuel source for the distributed generation. The reason few large organizations will commit to these systems is that they have calculated the Nat. Gas costs, current and future. Or have you found a fuel cell which runs on coal and a guarantee of no future CO2 penalties?

    Wallace Brand
    Mr. Gould: I believe that the construction phase for a 2 MW carbonate fuel cell to be operated on coal gas at the Kentucky Pioneer project is complete and I am awaiting the operating results. The project is estimated to reduce carbon dioxide emissions by about 20%. I am not familiar, however, with the necessary scale for the BG/L or Destec gasifiers. If large scale is necessary, then transmission again becomes necessary if the fuel cell must be integrated with the gasifier.

