How One Company Is Exterminating The Millennium Bug
If you think fixing the Year 2000 bug is routine, talk to Mary Libens, chief "Y2K" troubleshooter at Medical Mutual of Ohio. The $2 billion medical insurance company in Cleveland made all the right moves. It got an early start grappling with its millennium problem. It brought in Ernst & Young consultants--some of the most experienced in the business--at bargain rates. Libens' coworkers cooperated, grasping the enormity of the problem. And senior management made Y2K a top priority.
The results? At 20 months and counting to Jan. 1, 2000, Mutual is still racing to debug and test its systems. Libens' team long ago despaired of scrutinizing every date in the company's sprawling software operations and databases, which contain the equivalent of 64 billion pages. For whole swaths of noncritical programs, they've resorted to a patchwork of quick fixes, trade-offs, tricks, and triage. Through it all, a string of unpleasant surprises in the repair process, called remediation, has left Mutual's Y2K team awestruck at the task's scope. "The job is orders of magnitude bigger than expected," Libens admits. "We're scrambling."
The good news: Mutual's critical systems should be shipshape by the end of 1998, allowing for a full year of testing. When the final bell tolls, Libens is confident that no major system will bump into a date that reads "00," interpret it as 1900, and run amok. But it has been a slog, even with tons of time and resources. And though all may end well, Mutual's travails and harrowing discoveries make it a chilling case study for any large company that is just heading down the Y2K obstacle course.
The majority of big corporations like Mutual have spent much of the past 40 years adding new computers and software willy-nilly to their data centers, building up great Augean stables of information and a tangle of jerry-built routines. With Y2K, for the first time they will be forced to sift through these diverse systems. The programs they devise to remedy the date bugs--like any software--will be imperfect, causing new glitches. And at this late date, companies could have trouble enlisting the aid of qualified consultants, most of whom are already tapped out assisting customers that signed up early. As the clock ticks, programs slapped together will produce less and less satisfactory results. The problems, says Libens, "are overwhelmingly huge."
Mutual owes its early start to Libens' boss, Chief Information Officer Kenneth Sidon. Early in the decade, Sidon recalls, the millennium bomb "was something we kidded about." But when he began to consider the implications for Mutual, then known as Blue Cross/Blue Shield of Ohio, all joking stopped. In 1995, he began combing through the company's systems and was aghast at what he discovered: a dizzying total of 25,000 computer programs, many added through three acquisitions made over the previous decade. All told, Sidon counted 70 distinct operating systems at work in 36 locations across several states.
The hardware diversity was equally horrific--a hodgepodge supplied by 64 different vendors. In one basement office, an ancient IBM mainframe toiled away. In another stood a spanking-new Hitachi mainframe, running a different operating system. PCs--some 3,300--were linked in heterogeneous networks managed by UNIX and Windows NT servers. These shared work with about 800 1970s-era "dumb terminals."
Incredibly, the hodgepodge performed just fine for years, processing a million claims for Mutual each month with nary a hitch. But Sidon knew this would unravel when it ran up against the millennium. Surveying the awesome complexity, he made a key decision: Mutual would fix as few dates as possible so it could focus all its resources on the dates that mattered.
DIGITAL TRIAGE. Some of the calls were no-brainers: Birthdates would all get remediated because any system handling a claim in the year 2000 would need to know whether a customer born in "99" was a baby or a 101-year-old. For the rest, Sidon's team began sorting programs into three categories. Marked for oblivion were all customized programs that would be obsolete by the Year 2000. A second group, deemed low-priority, included those programs whose dates weren't critical to operations. The third group consisted of software from outside vendors such as Electronic Data Systems, IBM, and hundreds of smaller players, which Sidon felt should be fixed by the supplier.
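The birthdate ambiguity described above can be made concrete with a short sketch. It is an illustration only, not anything from Mutual's systems: the helper name `possible_ages` and the 0-to-120 plausibility range are assumptions.

```python
def possible_ages(birth_yy, current_year):
    """List the ages a two-digit birth year could mean (hypothetical helper)."""
    ages = []
    for century in (1800, 1900, 2000):
        age = current_year - (century + birth_yy)
        if 0 <= age <= 120:  # assumed plausible range for a living customer
            ages.append(age)
    return sorted(ages)


# A customer whose record says "99", seen by a claims system in 2000:
print(possible_ages(99, 2000))  # [1, 101]
```

With only two digits stored, nothing in the record can tell the baby from the 101-year-old, which is why birthdates had to be widened rather than worked around.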
At this juncture, Sidon was handed another nasty surprise. Some suppliers had never gone back and tested their products for Y2K compliance. Many of the smaller companies were no longer in business. Some who were still around tried to turn Mutual's emergency into a sales opportunity, asking Sidon to buy special, Y2K-compliant upgrades. "We've shut some of those vendors down," Sidon snorts.
After the initial winnowing, Sidon found he had 9,000 programs to revamp. In October, 1995, he went to top management and the board for the first time and explained the problem. It was an easy sell, he says, and he was handed an initial budget of $5 million. That's small change today, when demand for expert Y2K programmers far exceeds supply. But in 1995, the Y2K service niche was just taking shape and consultants were hungry for experience in a promising new market.
Sidon got in touch with William T. Ruckle, managing director of Ernst & Young's nascent Year 2000 business. Ruckle's group was just finalizing plans for a 12-person software "factory" in Costa Mesa, Calif. At the same time, E&Y was negotiating a joint venture with an Indian contractor, the software arm of the giant Tata Group. This backup greatly eased the anxiety of customers such as Sidon. When the Y2K crunch hit, Ruckle explained, E&Y would be able to bounce terabytes of clients' data off satellites to programmers in multiple time zones. That way, E&Y would be able to fix faulty dates around the world and around the clock.
Ruckle took on Mutual as his first customer, offering to handle the bulk of Sidon's business in return for most of Mutual's budgeted $5 million. That's about one-third of what he would charge for a similar contract today. "It's the one we learned on," he explains.
After that, Sidon and his team performed the digital equivalent of brain surgery on Mutual's systems. Sidon and Ruckle began by dividing the company systems into seven clusters, including workers' compensation cases, prescription drugs, and the most crucial one: the claims system. One by one, the team isolated the clusters and determined which programs and files to fix.
SAFE ZONE. Along the way, the two companies used myriad work-around schemes to safeguard programs and data without actually changing any dates. One is a well-known technique called a "sliding window," in which a computer is instructed to interpret any ambiguous two-digit date from 00 to 50, say, as 2000 to 2050, and anything from 51 to 99 as 1951 to 1999. So far, Mutual has applied the sliding window to about 80% of its date fixes.
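The windowing rule can be sketched in a few lines of Python. This is a minimal illustration, not Mutual's or E&Y's code: the function name `expand_year` and the `pivot` parameter are hypothetical, and the pivot of 50 simply follows the example in the text.

```python
def expand_year(two_digit_year, pivot=50):
    """Expand a two-digit year using a pivot (hypothetical helper).

    Years from 00 through the pivot are read as 20xx; the rest as 19xx.
    """
    if not 0 <= two_digit_year <= 99:
        raise ValueError("expected a two-digit year between 0 and 99")
    century = 2000 if two_digit_year <= pivot else 1900
    return century + two_digit_year


print(expand_year(0))   # 2000
print(expand_year(50))  # 2050
print(expand_year(51))  # 1951
print(expand_year(99))  # 1999
```

The stored data stays two digits wide; only the code that reads it changes. The trade-off is that the rule breaks down for any field, such as a birthdate, whose values can span more than 100 years.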
Wherever two-digit dates must be replaced with four-digit ones, Mutual ships the whole software cluster out to Costa Mesa with explicit directions on what needs to be fixed. While under repair, that portion of the company enters a state called "freeze." People can still call up needed information. But they can only enter new information once a week or at the end of the entire process, depending on the depth of the freeze. It's a huge inconvenience. But it assures that no one unwittingly enters a program with a two-digit date, thereby contaminating the cluster and spoiling perhaps thousands of hours of work.
After the fixed files return from California, the Mutual team and its E&Y assistants begin testing--a process that soaks up 45% of total remediation time. Finally, each repaired cluster is removed from freeze and placed into a clean area of the mainframe, called a "library." This is segregated from the dirty library, where the two-digit files are still awaiting their fix. And woe to the employee who absentmindedly shuttles a dirty file into the clean library. "It's a danger that we face--making sure that what gets renovated stays renovated," says Sidon. Today, three of seven clusters reside in the clean library. Sidon intends to fix the remainder by June, so that Medical Mutual can close the books on 1998 with the new system--and make sure that it works.
FALSE HOPE. But if this sounds like a tidy finish, it's not, as Mary Libens can testify. Take the sliding window. Libens originally planned to use this shortcut to deal with two-digit dates on electronic files that are used to track annual deductibles on claims. Since nothing in these files other than birthdates refers back as far as 1950, the team wrote a program that instructed computers to simply interpret all year dates up to 50 as postmillennial.
This solution almost worked. Programmers who tested it in Costa Mesa said the computers were no longer confused about which century the files addressed. But there was a different, unexpected snag: Whenever the computers bumped into the number 00, they would assume that they had reached the bottom of the deductible file. "There was no way to code around it," says Libens. If they let it go, the computer could easily make an incorrect payment, by assuming, for example, that the deductible hadn't been paid, when in fact it had been. The program had to be scrapped, and Libens is back to inserting four-digit dates in every file.
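The snag is easy to reproduce in miniature. The sketch below assumes a record layout and an end-of-file convention invented for illustration (a year field of 00 marking the last record); the article does not describe the actual file format, and `read_deductibles` is a hypothetical name.

```python
END_OF_FILE = 0  # assumed sentinel: a year field of 00 marks the last record


def read_deductibles(records):
    """Return (year, amount_paid) pairs up to the end-of-file sentinel."""
    seen = []
    for year, amount_paid in records:
        if year == END_OF_FILE:
            # A genuine year-2000 record looks exactly like the marker.
            break
        seen.append((year, amount_paid))
    return seen


# Hypothetical file: payments in 1999, 2000, and 2001, then the terminator.
records = [(99, 250), (0, 500), (1, 100), (0, 0)]
print(read_deductibles(records))  # [(99, 250)]
```

The reader stops at the first year-2000 entry, so the later payments never register and the deductible appears unpaid. Windowing cannot help here, because the value 00 carries two meanings at once; only a wider year field separates them.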
Libens perseveres, because she knows Mutual will make it through. But what of the large companies that are just getting started? In a recent survey of 450 businesses in North America, the Information Technology Assn. in Arlington, Va., found that 45% of them are still studying their systems and have yet to start fixing them. A year from now, they'll begin hitting the same kinds of hitches and delays that Mutual has been living with for four years. It's inevitable. It's software. To one and to all...Godspeed.