Y2K + 20

There’s been some discussion on the inter-webs of the Y2K bug/crisis, partly because it’s still biting us: some fixes were not permanent but were more stopgaps, and 20 years later we’re running across some of those. Parking meter card readers in New York, or freight truck telematics. And these aren’t minor issues, but tend to cascade into reboot loops and other catastrophic failures. But wait! Those are new and cool. They aren’t old-timey systems that were around in the late 90s, much less earlier. How did they get a Y2K bug at all?

Okay, first. If you were wondering why all this happened, forget just the space-saving issue. In many ways it was short-term planning and assumptions of obsolescence. The systems that we realized would fail in the mid 90s were mostly built in the 70s (or were their descendants, and no one revised that part of the code), and no one expected they’d be around decades later. So the two-digit year shortcut was a reasonable choice for early software, not just done out of laziness or stupidity. How could it still be around decades later? Running critical systems? No way.
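To see why the shortcut works fine right up until it doesn’t, here’s a minimal sketch (not any real system’s code) of date arithmetic on a stored two-digit year:

```python
def years_elapsed(start_yy: int, end_yy: int) -> int:
    """Naive elapsed-years calculation, the way a 1970s system
    storing only the last two digits of the year would do it."""
    return end_yy - start_yy

# Within one century, this is perfectly correct:
years_elapsed(70, 99)  # 29

# But on January 1, 2000, the "current year" becomes 00,
# and suddenly ages, intervals, and sort orders go negative:
years_elapsed(70, 0)   # -70
```

Everything downstream of that negative number, from interest calculations to record sorting, inherits the failure.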

Many Y2K fixes were the same. A stopgap that says “don’t assume the year is from 1900, but from 19… oh, how about 20, assume that it’s 20[nn],” on the assumption that the system would surely be replaced by 2020. And yet, same issue. Forgotten, or parts of the code and database reused even today.
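That stopgap is usually called “windowing” or a “pivot year,” and it can be sketched in a few lines. Here the pivot value of 20 is hypothetical, matching the “assume it’s 20[nn]” fix described above:

```python
PIVOT = 20  # chosen on the assumption the system would be retired before 2020

def expand_year(yy: int) -> int:
    """Map a stored two-digit year onto a full four-digit year.
    Years below the pivot are assumed to be 20xx; the rest, 19xx."""
    return 2000 + yy if yy < PIVOT else 1900 + yy

expand_year(5)   # 2005, as intended
expand_year(99)  # 1999, as intended
expand_year(20)  # 1920: the window has run out, and the original bug is back
```

The data stays two digits wide, which is exactly why it was such an attractive fix: no database schema changes, no record conversion. And exactly why it comes back to bite twenty years later.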

And why do new systems still have these bugs? What’s “new” anyway? Go find a software dev or DBA, strap them to a chair, and make them admit that a hell of a lot of their work is libraries they got online, borrowed from someone else, or simply IS a legacy system they keep on using. Decade after decade.

But much of the coverage has been that it was all overblown, that nothing happened, and therefore “the media” blew it out of proportion. If your politics lean towards conspiracies and anti-government rhetoric, you may even use it as an example of why not to listen to climate change alarms. No, seriously, that’s a thing.

Well, I was old enough to live through it, did a tiny bit of work related to it, and had friends and co-workers who were deep into this stuff.

Before the day-of, there were plenty of giggles and eye rolls even among the fairly technical folks. See, the big end-consumers of stuff decided they would not be caught out. Governments, big manufacturers (auto, appliances, computers), some stores, construction, etc. would require a Y2K audit from all their providers. All of them. No judgment calls, no “common sense” exemptions: they demanded a real audit.

A common one to guffaw about was sand. An actual provider of sand (a stone’s throw away from me here in Kansas no less, FML) shared their request for audit, for the lulz. “Y2K compliant sand, funny.”

BUT... 

That most people didn’t laugh it off is actually exactly why there was no disaster. When a sand plant was asked for a Y2K audit, it was because the customer wasn’t actually buying sand, but the ability of the plant to deliver sand, to specification, in the quantities asked for, at the right place, on time.

That means a full process analysis, cascading down the supply chain.

What’s that take? Well, it’s not just sand and shovels. It’s dredges, pumps, sieves, trucks, scales, baggers, more trucks, invoicing, accounts receivable, phones, computers, databases, file cabinets, payment terminals, scanners… etc.

That same article of Y2K jokes and eye rolls has a solid point: gravestones, the actual stone ones, were not Y2K compliant. They were pre-cut with the “19” part of the year of death, so as a graveyard, or a supplier, you needed to get new “20” ones (once the manufacturer got up to speed) and not carry too much “19” inventory. Supply chains!

And very, very often, some part of the business even in 1998 was computer controlled. So the sand plant reluctantly hires someone to do the audit, and they tell the manager that (say) the gate access control has a two-digit year, so the trucks can’t get into the plant when it re-opens on January 2.

The scale has the same bug, and is tied to the gate ID, so it won’t work either. And on testing, that actually means all the records will be unrecoverably mangled, so the truckers won’t get paid, ever (and good luck getting deliveries the next week if that happens), and you can’t tell where the product came from, so the entire inventory and quality control system implodes. You are out of business in a week.

So you fix it, or more likely you either replace the system or go yell at the supplier of the access control system and its database. The audit cascaded down, and this happened at hundreds of thousands of dumb little businesses all over the US, millions around the world.

And thus, no tiny link in the chain caused it all to come tumbling down. 

Preparation, taking it seriously, and working hard for years meant we didn’t have a crisis. Maybe for the last time ever. Whether it’s social networks’ vulnerability to evil intent, the poor security of many digital products, urban planning, or climate policy, we’ve apparently lost this ability to look more than two weeks into the future and do something about it.

Steven Hoober