Myths of the Year 2000

Martyn Thomas

Chairman Emeritus, Praxis Critical Systems


© Martyn Thomas 1998. This article appeared in the Computer Journal 41(2), 1998. Reprinted by permission.

The shape of the Year 2000 problem around the world is becoming clearer, as many companies finish building their inventories of affected systems and processes and are able to assess the time and resources they will need if they are to reduce their risks to the minimum. For two years, I led Year 2000 services for one of the world’s largest global management consultancies, seeing projects in most industries and in many of the world’s leading economies. This is a snapshot of what I have learnt.

Myth 1: Year 2000 is a single problem

Several problems come together in the next three years.

For hundreds of years, people have abbreviated dates by omitting the century, causing ambiguity and confusion for historians and archivists. In the 1950s and 1960s, as computers were used more and more for business data processing, it was inevitable that this convention would be carried forward. Storage space and processor cycles were scarce and expensive, and the cost of any potential ambiguity seemed insignificant. Few programs had to handle date ranges that spanned two centuries, and those that did (such as pension administration) were either written to cope, or they soon encountered problems and were corrected.

As we reach the end of this century, most programs will need to manipulate date intervals that cross the century boundary. When the year is represented by only two digits, files that are sorted by date will have "00" records added at the front rather than after "99". Calculations that subtract an earlier date from a later one will get a negative result and fail. Comparisons of dates in different centuries will give the wrong answer, so that a credit card that expires in 01 seems 98 years out of date in 99, whereas one that expires in 99 may seem valid in 01 (and for a long time afterwards). Similar problems arise with the shelf lives of perishable foods and medicines.
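The comparison and sorting faults can be sketched in a few lines. This is a modern illustration, not code from any system of the period; the function name and values are invented for the example:

```python
def card_expired(expiry_yy: int, current_yy: int) -> bool:
    """Naive two-digit comparison: a card is expired when its
    two-digit year is less than the current two-digit year."""
    return expiry_yy < current_yy

# In 1999 ("99"), a card valid until 2001 ("01") looks 98 years out of date:
print(card_expired(1, 99))   # True  -- wrongly rejected

# In 2001 ("01"), a card that expired in 1999 ("99") looks valid:
print(card_expired(99, 1))   # False -- wrongly accepted

# Sorting records on a two-digit year puts "00" (meaning 2000)
# at the front, ahead of "99" (meaning 1999):
years = ["97", "98", "99", "00", "01"]
print(sorted(years))         # ['00', '01', '97', '98', '99']
```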

Throughout the 50-year history of computing, whenever there was the possibility of a serious problem, programmers have found many creative ways to make the problem worse. The "two-digit year" problem is no exception: year values of 99 and 00 have been used with special meanings or to mark invalid fields. Programmers designing user-friendly systems have assumed that if the year field is typed as 00 up to 09 then what was meant was 90 to 99, because the 9 key is next to the 0 key and these are common typing errors. Some programmers, knowing that century years need different leap year processing, have then made mistakes in the calculation and lost February 29th 2000 (1).
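The "user-friendly" remapping described above can be sketched as follows; the function name and the exact window are illustrative assumptions:

```python
def interpret_year(yy: int) -> int:
    """'Helpful' correction: treat a typed year of 00-09 as a
    mistyped 90-99, on the theory that the 0 key sits next to
    the 9 key on the keyboard."""
    return yy + 90 if 0 <= yy <= 9 else yy

print(interpret_year(98))  # 98 -- unchanged
print(interpret_year(0))   # 90 -- a genuine year-2000 entry is silently rewritten
print(interpret_year(5))   # 95 -- 2005 becomes 1995
```

Once dates in 2000-2009 become legitimate input, every one of them is silently moved back a decade.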

There is also a separate, coincidental problem with the real-time clock in PCs, which may reset to 1900, 1984, 1980 or some other date instead of ticking over successfully into 2000 at the end of the century. This will not usually cause a problem, as the BIOS in most recent PCs will detect the error and correct it. At worst, someone may have to reset the clock once manually. However, older PCs and those with a faulty BIOS may need the correct date set every time they are powered up, and if the PC is being used to control some process directly, with the time taken straight from the real-time clock and not through BIOS calls, any clock failure may have more serious consequences. PCs performing critical applications will need checking and may have to be replaced. (2)

Myth 2: Year 2000 is mainly a problem for mainframe systems

Three quarters of all mainframe applications have year 2000 faults. Finding the errors, making corrections, recompiling, re-linking, testing, integrating and further testing will cost a great deal of money: somewhere between 25p and £1 per line of software, depending on how professional the IS department is. Unfortunately, most large companies cannot reliably rebuild all their mainframe applications from program sources, even without the added problem of needing to change 10% of the source lines. The latest changes do not appear in the master source libraries, or parts of the system have not been recompiled for so long that they need an obsolete version of the compiler. (3) The stories are depressingly common, and the lack of basic software engineering disciplines will probably double the final cost of the Year 2000 problems.

Even so, an average of $1 US per line of software source may not seem an enormous cost, but many companies have tens of millions of lines of mainframe software source, and some have billions. An unforeseen expense of over $1 billion, with no business benefit, may not fatally wound a Fortune 500 company - but it is certainly painful and represents a volume of work that is unlikely to be funded, staffed, and completed successfully before systems start to fail.

So perhaps it is surprising that mainframe applications are not the biggest part of the Year 2000 problem.

Mainframe applications are usually managed by teams of programmers who know their systems well and who are able to change them and rebuild them competently. This may not be true for departmental systems (e.g. stock control), desktop systems (e.g. spreadsheets, laboratory systems), factory and warehouse automation, EDI, or communications systems. These present greater difficulties, because they may have been acquired or developed informally, the original vendor or developer may have disappeared, and the system may not be well understood by anyone. (4)

Year 2000 problems also exist in security and access systems (5), in air conditioning management and building control, in vital control systems, such as those driving industrial gas valves or monitoring temperature in power stations, and in engine management systems, alarms, and consumer products (6). The list of potential areas of risk is almost endless. It is already far too late to find and correct all the faults in these "embedded" systems, but some will be critical to safety, the environment or the business, and must be given priority for diagnosis, correction or replacement.

Even if the business has very modern systems, thoroughly checked and warranted free of Year 2000 problems, there could still be trouble. Customers may be unable to pay invoices for lengthy periods. Suppliers may fail, perhaps several of them at once. Business partners may have to switch from electronic data exchange to paper. Essential utilities may be interrupted.

Year 2000 is not especially a mainframe problem, or even an IT problem.

Myth 3: Year 2000 is not yet urgent

It is unfortunate that the 21st Century Date Problem was not called the 1999 problem, or even the 1998 problem, since that is when many systems will first fail. Too many companies are still saying "we know that we have a Year 2000 problem, and next year we will put something in our budgets to sort it out".

For most companies, systems will start to fail in 1998 or 1999 if they are not failing already. The critical time, for every application, is the first moment that it encounters dates in the 21st Century. From that point forward, errors could occur at any time. They may cause application failures, they may produce wrong results that are obvious, or the failures may be much subtler. Wrong data may be calculated and stored or passed to other systems. Records may be sorted into the wrong sequence and processed twice or ignored (7).

It makes sense to talk about the failure horizon for each application or item of plant or equipment. Some of these dates will be much closer than you expect; some may even have passed.

Myth 4: Year 2000 is an issue for the IT Department

Year 2000 affects the whole business, the deadline is immovable, and resources are limited in every company. Inevitably, important business investments will have to be delayed or abandoned if the year 2000 project is to be given the resources it needs. In most companies, only the executive committee or the Board can take such decisions. Auditors are already commenting on year 2000 readiness in their reports to audit committees. Soon they may have to start qualifying companies’ accounts. There may be issues affecting legal regulation, Health and Safety legislation, and litigation risk. Insurance cover for Year 2000 damage is limited and, in some cases, has even been withdrawn completely, leaving companies and individual Directors exposed to the possibility of crippling damages. Few IT Directors have the breadth of knowledge and executive authority to make the necessary decisions on behalf of the Board. Year 2000 is not an issue that can safely be left to the IT department.

Myth 5: Year 2000 is the only date-related problem

Year 2000 is a very significant member of a family of date-related problems. The Global Positioning System (GPS) overflows an internal week counter in August 1999. Countries that use local calendars have similar problems on other dates - for example, some Japanese systems used a calendar based on the years of the emperor’s rule. The hardware clocks in most (perhaps all) processors and the date fields in most operating systems overflow at some time - one such problem occurred last Autumn. Then there is the Year 10,000 problem - but that can wait for a later issue of the Computer Journal.

Myth 6: There will be a magic technical solution

The problems that have been created by incorrect date programming are very different from each other and embedded in almost every form of electronics technology. The corrections that have to be made, and whether they can be made at all, differ for each application. There will be no magic solution (8). There are tools that can be very cost-effective in helping with parts of the problem: preparing an inventory of software on a particular hardware platform; scanning code for suspected date processing; managing test data or controlling versions of source code. These tools can save more than half the effort that would otherwise be spent in some phases of the Year 2000 programme, and the cost estimates given earlier assume that tools will be used. Nevertheless, most risks can only be identified, prioritised and managed by people who understand the business and its processes.

Myth 7: The problem is under control

It is very difficult to get accurate information about the scale and nature of Year 2000 problems nation-wide or world-wide. At the end of 1997, it seemed that most companies had not finished their inventory of Year 2000 risks, so they had insufficient information to be sure what the problem would cost them, or whether they would get all the necessary work completed in time. Not surprisingly, most companies initially underestimate the scope and cost of the work needed, so budgets and timescales are constantly revised upwards. Those surveys that have been published have all depended on data from questionnaires filled in by companies themselves, without independent audit. The surveys are inevitably based on incomplete information and optimistic estimates.

Organisations are not good at delivering complex projects on time and within budget. Estimates vary, but it seems that more than 75% of projects are late or over budget and that many of the remaining 25% deliver less than was originally intended. Year 2000 has fixed deadlines and scope; it seems inevitable that a lot of the desirable work will not get finished, that testing and other quality management activities will be skimped, and that unplanned failures will occur.

Internationally, the level of awareness and action differs greatly from country to country. My impression from my own international experience, which is supported by the leaders of Year 2000 services in other major consultancies, is that the USA and other English-speaking countries are generally ahead of the rest of the world, but that even these countries still have a large part of their economic activity at risk. In continental Europe, preparations for European Monetary Union have taken priority over Year 2000 work. In Asia, awareness of the issue is at an early stage, although the problems exist in the same form as they do elsewhere. Central Europe and Russia seem to have major problems, as do South American countries.

The evidence is weak, in that it is anecdotal, but it is quite consistent. The problems are far from being under control.

Myth 8: There is nothing the individual can do

Computing professionals created this problem, and we have a responsibility to do what we can to ensure that the risks are well understood, that priority is given to the most important areas (9), and that nothing like this is ever allowed to happen again.

The individual can ensure that their employer is made aware of all the Year 2000 risks and the actions that should be taken. It is particularly important that companies that play an important role in the national infrastructure keep functioning; if there is a risk that they may fail, their customers need to know the worst feasible outcome so that they can be prepared. A free flow of credible, auditable information is essential (10).

In another, very different way, individuals’ actions will play a major part in what happens in 1999 and 2000. If there is long-term disruption at the end of the century it could cause serious damage to the health, wealth and security of individuals and families, so if there is a belief that such damage is a real possibility, people will sensibly try to protect themselves. Consumers will stockpile food, water, and fuel. Investors will want to hold real assets of assured value. This is rational behaviour and feeds on itself. As shelves empty and shortages occur, more people will buy when they can, for fear that they may not have the opportunity later.

The same factors will influence professional forecasters. If a bank is unsure that a company will survive undamaged, it will be more likely to reduce the company’s overdraft limit than to agree to additional credit, yet this will be at just the time when the company may need help to get through a cash crisis if customers have system failures and cannot pay their bills on the usual timetable. Each investment fund manager will have to decide whether they want their fund to be invested in a potentially falling market, or if they would rather sell and wait for buying opportunities. If they decide to sell, they must act while other managers still want to buy, so markets may become increasingly unstable.

It would be satisfying to be able to end with reassurance; a plan of action perhaps, and some confident predictions. Unfortunately, over the two years since I first became involved in Year 2000 issues, my fears have grown alongside my understanding, and the reports in the press have become steadily worse. The only solution that I can see is that we continue to prioritise and to address a million critical problems with diligence and urgency.

To quote Franklin Roosevelt in 1942: "Never before have we had so little time to do so much."


Martyn Thomas is Chairman Emeritus of Praxis Critical Systems Ltd. He can be contacted at: mct@hollylaw.demon.co.uk


(1) A year is a leap year if it divides by 4 with no remainder, unless it is a century year, which must divide by 400 with no remainder. If a programmer ignores the century rule, 1900 and 2100 will be incorrectly identified as leap years (which they are not), but 2000 will be treated correctly.
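The rule, and the common mistake, can be expressed directly (a modern illustration rather than code from any affected system):

```python
def naive_is_leap(year: int) -> bool:
    # Ignores the century rule entirely.
    return year % 4 == 0

def is_leap(year: int) -> bool:
    # Full Gregorian rule: divisible by 4, except century years,
    # which must also be divisible by 400.
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)

# The naive version gets 2000 right by accident (2000 divides by 400)
# but wrongly treats 1900 and 2100 as leap years:
print(naive_is_leap(2000), is_leap(2000))  # True True
print(naive_is_leap(1900), is_leap(1900))  # True False
print(naive_is_leap(2100), is_leap(2100))  # True False
```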
[back]

(2) It is important to prepare carefully before testing a PC by changing the system date to see what happens, as there may be unexpected consequences.

A company that was using a PC-based email client tested its PCs in this way last year, only to discover that the licence for the email software had expired. On resetting the date to 1997, the email software still would not work, because it maintained a secret record that it had already expired, as security against a dishonest customer resetting the PC clock to an earlier year so that they could continue to use the software after the expiration of the licence. Even reinstalling the software did not clear the problem; it was necessary to reformat the disks and reinstall Windows.

Some level of check can be made with one of the utilities that have been designed to test the PC clock and BIOS. It is important to understand what these utilities are actually testing, as the various utilities report different results on many PCs.
[back]

(3) Ideally, a company will keep a register of all its systems, and changes to these systems will be rigorously controlled, to ensure that there is full and up-to-date design documentation, that all changes are approved, documented, and made by modifying the design and the program source, recompiling, rebuilding and thoroughly testing, and that the test suite is kept up to date. These change control processes should also ensure that all the files, compiler versions and other tools needed for successful rebuild of every system are kept in a working state.

Unfortunately, some IS managers have decided that these disciplines are not necessary or cost-effective (perhaps in response to time pressures and lack of resources). Their companies are now paying the price for this lack of professionalism.
[back]

(4) In one factory, the critical testing tools were calibrated daily using a program that had been written some years before by an engineer who had since left. The calibration program was essential to the manufacturing process and the binaries existed on the C: drive of several PCs, but no source code or documentation could be found. When the program was tested with a Year 2000 system date it crashed.
[back]

(5) Security systems may use cards containing an expiration date that only has a two-digit year. The first sign of trouble may be when new cards, expiring in the next century, are rejected by the system. There is a more insidious fault: when 2000 comes, all the expired cards may suddenly become valid again.
[back]

(6) I have deliberately chosen examples of applications that have been shown to contain serious faults.

For example, a UK power generator reported at an industry Year 2000 meeting that they had carried out some Year 2000 testing on a UK power station during routine maintenance. One test involved setting system clocks to the end of the century and watching what happened when they ticked through midnight on December 31st 1999. Shortly after the start of the new century the power station was shut down by a temperature monitor in a cooling stack. The temperature monitor had been programmed to respond to average temperature readings over a 20-second period, which it calculated by taking readings every second and time-stamping them. When it attempted to average readings that spanned two centuries the calculation went wrong and the monitor tripped, shutting down the station.

As a single incident this would be embarrassing, but probably not disastrous. When every power station using the same monitor, in the same time zone, trips simultaneously, the consequences could be more severe.
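A hypothetical sketch of the kind of arithmetic involved: readings carry a two-digit year in their time-stamp, and an interval is computed by subtraction. The names and the time-stamp format here are illustrative assumptions, not the real monitor's design:

```python
SECONDS_PER_YEAR = 365 * 86400  # ignoring leap days, for simplicity

def elapsed(t1, t2):
    """Seconds between two (two-digit-year, second-of-year) time-stamps."""
    (y1, s1), (y2, s2) = t1, t2
    return (y2 - y1) * SECONDS_PER_YEAR + (s2 - s1)

# One second before and one second after midnight, 31 December 1999:
before = (99, SECONDS_PER_YEAR - 1)
after = (0, 1)
print(elapsed(before, after))   # roughly minus 100 years, not 2 seconds

# With full four-digit years the same arithmetic gives the right answer:
print(elapsed((1999, SECONDS_PER_YEAR - 1), (2000, 1)))  # 2
```

Any average computed from such an interval is nonsense, and a safety trip that compares it against a threshold will fire.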

More complete lists may be found on the Internet, for example at http://www.compinfo.co.uk/y2k/examples.htm and http://www.iee.org.uk/2000risk.
[back]

(7) There have already been a number of illustrative failures. A consignment of tinned meat that arrived at Marks and Spencer in 1996 had a bar-coded expiry date in 2000, represented as "00". The stock control system treated this as 1900, making the meat 96 years past its sell-by date!

A labelling system for pharmaceutical products failed when it was first required to generate Year 2000 expiry dates, bringing the production line to a standstill.

A packing system was programmed to dispatch the products with the shortest shelf lives first. It packed all the products that expired in "00" before those expiring in "99".

An advisory system monitoring nuclear waste storage recommended the release of radioactive waste several years too early, because the calculated number of half-lives led to a release date in 2002 ("02"), which was then treated as 1902.
[back]

(8) Every few weeks, someone publicises a "new idea" to solve the Year 2000 problem. These ideas have ranged from scanners that will automatically modify software, to resetting all calendars back 28 years at some agreed instant before 2000. It should be clear that no automatic scanner could do enough of the task, in general. The suggestion that the calendar should be set back a multiple of 28 years (to preserve the correct day of the week) is ingenious but, on consideration, seems as difficult to implement safely and consistently as other solutions, and creates other difficulties. Nevertheless, this may be a viable tactic for isolated equipment, such as some domestic video recorders.
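Why 28 years? Within the period 1901-2099, 28 years is a whole number of 4-year leap cycles and also a multiple of 7, so every date falls on the same day of the week 28 years earlier. This can be checked directly:

```python
from datetime import date

# 1 January 2000 and 1 January 1972 fall on the same day of the week:
print(date(2000, 1, 1).strftime("%A"))   # Saturday
print(date(1972, 1, 1).strftime("%A"))   # Saturday

# The correspondence holds right across the critical period:
for y in (1999, 2000, 2001):
    assert date(y, 3, 1).weekday() == date(y - 28, 3, 1).weekday()
```

The difficulty, of course, is not the arithmetic but applying the offset consistently to every interconnected system at the same instant.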
[back]

(9) Any Government that understands the seriousness of the Year 2000 crisis will want to ensure that it does not make the problem worse by passing legislation that requires big changes to computer systems in the next two or three years. At the time of writing, in March 1998, no Government has announced such a moratorium.
[back]

(10) The three-day week during the UK miners' strike in the early 1970s showed how effective planning can be. Despite the absence of electricity on two weekdays, industrial output actually rose.

The worst outcome of Year 2000 would be major, unexpected, long-lasting failures of power distribution, water, healthcare, communications, security, transport or emergency services. Short-term failures are sufficiently likely, even under normal circumstances, that most organisations have adequate contingency plans. Hospitals and air traffic control centres have stand-by power generators, but they only store enough diesel for a few days, and building new storage tanks will take time. In the extreme, if food distribution and emergency services were to break down for an extended period, there would be a risk of serious public disorder.

There is not yet enough information available to rule out such a possibility.
[back]