Chernobyl to COVID — on Truth in Data

Yeah, I watch a lot of mainstream movies and TV when I get around to it, often years later. Just now I’m watching Chernobyl, the HBO series.

The “not great, not terrible” meme almost entirely misses a critical point, one that we also miss a whole lot in our day-to-day design work, and even in our understanding today of how the pandemic is affecting and threatening us.

[Image: off-scale-high.jpg]

It’s even explained right there in the show:

Legasov: Yes, 3.6 roentgen, which, by the way, is not the equivalent of one chest X-ray, but rather 400 chest X-rays. That number's been bothering me for a different reason, though. It's also the maximum reading on low-limit dosimeters. They gave us the number they had. I think the true number is much, much higher. If I'm right, this fireman was holding the equivalent of four million chest X-rays in his hand.

The number they had

On 28 March 1979 the morning shift arrived at the control room for reactor 2 at the Three Mile Island nuclear plant. As they went about their work of recording all info to hand it off, they noticed the overflow cooling tank temperature was high.

A digital readout said the temperature was 280° F, which is, much like the Chernobyl dosimeter readings, higher than normal, but not too terribly bad.

It is too hot, so they start dealing with it. When things don’t respond right, because of other signal failures, they start taking more and more decisive actions to account for a loss-of-coolant accident, instead of the danger of a going-solid accident.

Of course, this is all wrong because the water is not really 280°. It’s a digital readout that simply crops off all values over 280, and unlike on a dial meter where you can see the needle pegged to the end, there’s simply no indication of this. The control room technicians didn’t even know that 280 was a lie.
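To make the fix concrete, here is a minimal Python sketch of a readout that clamps to its display range but remembers that it clamped. The 280° ceiling comes from the story above; the names `Reading`, `read_for_display`, and `format_reading` are hypothetical, and nothing here reflects how the actual TMI instrumentation worked.

```python
from dataclasses import dataclass

DISPLAY_MAX_F = 280.0  # display ceiling from the story; everything else is assumed


@dataclass(frozen=True)
class Reading:
    value_f: float    # the number the display is able to show
    saturated: bool   # True when the real value is at or beyond the ceiling


def read_for_display(raw_f: float, ceiling_f: float = DISPLAY_MAX_F) -> Reading:
    """Clamp a raw temperature to the display range, but keep the fact that we clamped."""
    if raw_f >= ceiling_f:
        return Reading(value_f=ceiling_f, saturated=True)
    return Reading(value_f=raw_f, saturated=False)


def format_reading(reading: Reading) -> str:
    """Render the reading so an off-scale value cannot pass for a real one."""
    if reading.saturated:
        return f">= {reading.value_f:.0f} °F (OFF SCALE HIGH)"
    return f"{reading.value_f:.0f} °F"
```

A reading of exactly 280 is also flagged here, since at the ceiling the display cannot tell “exactly 280” from “anything above 280.” That is the digital equivalent of the needle pegged at the end of the dial.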

Gauges are not the machine

In the excellent book Inviting Disaster, which covers the TMI2 disaster, and many others, James Chiles talks not just about displays, but about how the control room is oddly isolated from the machinery.

A nuclear power plant is a boiler-driven system, and no boiler operator from a century before this would let himself be remote from his machinery. Not just in proximity, so he can observe it, but in the type and value of his instruments. Those instruments were developed carefully, over time, to meet specific needs and avoid dangers.

Boilers used to blow up a lot, and eventually a few safety features were added, like overpressure valves; your hot water heater has one of these even today. But a good operator of a steamship boiler would likely be fired if the safety valve released pressure. You have to know how the equipment works, and know what is actually happening.

All sensors and gauges have limits. There’s no way around this, due to physical constraints. But back in the old days when machine operators were allowed to be near their machines, and the gauges were all dials, they knew what everything meant, contextually, and when to ignore a nonsense reading because the machine is working, or not working, instruments notwithstanding.

99 new messages and heat in the summer

Almost none of you are nuclear power station workers, but you encounter this same issue every day in one way or another. And many of you can design systems to avoid it, now that you are aware.

Let’s take one simple, often-misused case, the alert count. There’s a little red circle hanging off the side of your email, Twitter, Slack, or other icons. If big enough, it has a number inside it, the count of new messages.

How many of you have that number say “99”? Yeah, a lot. And how many are there?

Not 99. More than that. Same issue exactly. More than two characters doesn’t fit, so the digital industry decided to just truncate it. This wasn’t a good idea in the 70s, and it isn’t a good idea now.

You should indicate when it’s “a lot” instead of lying with the largest easy-to-show number you have. I actually had a thermostat that did this once. It would say “Lo” when temp was too low. Actually, a few cars I have seen do this also. You can set the thermostat to 64°, but if you want it colder, it’s just Lo, and works as hard as it can.
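For the alert count, the honest version is only a few lines. This is a sketch in Python with a hypothetical `badge_label` function, not any platform's real badge API. It assumes you can afford one extra character for a “+”. If you truly cannot, a symbol that means “a lot” does the same job as the thermostat's “Lo.”

```python
def badge_label(count: int, max_digits: int = 2) -> str:
    """Return the text for an alert badge without pretending the cap is the count."""
    if count <= 0:
        return ""                      # nothing new, so no badge at all
    ceiling = 10 ** max_digits - 1     # 99 for two digits, 999 for three
    if count > ceiling:
        return f"{ceiling}+"           # explicitly "more than this", not "exactly this"
    return str(count)
```

So `badge_label(42)` gives `"42"`, `badge_label(99)` gives `"99"` because that really is the count, and `badge_label(4000)` gives `"99+"`.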

And that’s the next problem: your system taking actions for you based on out-of-range data. See, that thermostat had a problem. In the heat of summer, a transient bad temperature reading would make it decide it was dangerously cold, so it would fire up the furnace on 100° days.
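One way to keep a single bad reading from starting the furnace is to ignore values the sensor could not plausibly report, and to require several cold readings in a row before acting. The Python sketch below does that. The plausible range, the confirmation count, and the `should_heat` name are all assumptions for illustration, not how any real thermostat is built.

```python
from typing import Iterable

PLAUSIBLE_MIN_F = -40.0   # assumed sensor range; anything outside is treated as noise
PLAUSIBLE_MAX_F = 130.0
CONFIRMATIONS = 3         # assumed: cold readings in a row required before heating


def should_heat(readings: Iterable[float], setpoint_f: float) -> bool:
    """Decide whether to heat, based on the most recent run of readings."""
    cold_streak = 0
    for temp in readings:
        if not (PLAUSIBLE_MIN_F <= temp <= PLAUSIBLE_MAX_F):
            cold_streak = 0            # nonsense value: do not act on it
            continue
        # A warm reading breaks the streak; a cold one extends it.
        cold_streak = cold_streak + 1 if temp < setpoint_f else 0
    return cold_streak >= CONFIRMATIONS
```

With a 64° setpoint, the series 100, 99, 31, 100, 101 does not start the furnace, because the lone 31 never gets confirmed. The series 66, 63, 62, 61 does.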

Precision, truth, and your real location

I encounter this sort of incorrect use of data a lot in location services I have used. Location is another area where everyone assumes high precision equals high accuracy.

Traditional location reference systems have imprecision built in. If you say that you are at 39° N 94° W, then humans understand that’s pretty vague compared to 39° 01.650' N 94° 39.582' W.

Computers do not, and people’s interpretations of what computers show are just as bad. The vague one isn’t treated as vague; it is read and used as 39.0000° N 94.0000° W. That’s not read as “about here” but /exactly/ at this grid reference to four decimal places. Now there is an accuracy value also, usually an R85, meaning the system is confident that there’s an 85% chance that the very, very precise location is actually somewhere in a circle. But there’s a 15% chance it’s outside that.
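One way to stop the false precision is to keep the accuracy value attached to the coordinates, and to show only as many decimal places as that accuracy supports. The Python sketch below assumes a hypothetical `Fix` record and a rough figure of 111,320 meters per degree of latitude. It is an illustration, not a geodesy library.

```python
import math
from dataclasses import dataclass

METERS_PER_DEGREE_LAT = 111_320.0  # rough average, an approximation


@dataclass(frozen=True)
class Fix:
    lat: float
    lon: float
    accuracy_m: float   # radius of the confidence circle in meters, must be > 0
    confidence: float   # chance the true position is inside it, e.g. 0.85


def honest_decimals(accuracy_m: float) -> int:
    """Decimal places of a degree that the stated accuracy can actually support."""
    degrees = accuracy_m / METERS_PER_DEGREE_LAT
    return max(0, math.floor(-math.log10(degrees)))


def describe(fix: Fix) -> str:
    """Format a position with no more precision than its accuracy justifies."""
    places = honest_decimals(fix.accuracy_m)
    return (
        f"{fix.lat:.{places}f}, {fix.lon:.{places}f} "
        f"(within {fix.accuracy_m:.0f} m, {fix.confidence:.0%} confidence)"
    )
```

A fix that is good to 5 meters gets four decimal places. A fix that is only good to 100 kilometers is shown as whole degrees, which is how a person would have written it anyway.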

If you go to the map program on your phone, zoom in and you’ll see your location isn’t a pinpoint, but a circle. Keep watching, and you can see it wander and jump around.

I’ve seen this data misused in ways that cause really troublesome results, like jumping in and out of geofences, and so firing notification rules over and over. Or there is fully bad data that shows all zeroes, which we’d say is wrong, but computers go right ahead and interpret it. Yet I am confident there is not a fleet of trucks and aircraft parked in the middle of the ocean, roughly 600 km south of Ghana.
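Both problems can be handled before any rule fires. The Python sketch below throws out the all-zeroes placeholder, and only changes the inside/outside state when the whole accuracy circle is clearly on one side of the fence. The `Fix` record, the function names, and the spherical-earth haversine distance are assumptions for illustration.

```python
import math
from dataclasses import dataclass

EARTH_RADIUS_M = 6_371_000.0  # mean radius, spherical approximation


@dataclass(frozen=True)
class Fix:
    lat: float
    lon: float
    accuracy_m: float  # radius of the reported accuracy circle, in meters


def distance_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points, using the haversine formula."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def is_usable(fix: Fix) -> bool:
    """Reject the all-zeroes placeholder and fixes that carry no accuracy."""
    if fix.lat == 0.0 and fix.lon == 0.0:
        return False
    return fix.accuracy_m > 0


def update_inside(was_inside: bool, fix: Fix,
                  center_lat: float, center_lon: float, radius_m: float) -> bool:
    """Change state only when the accuracy circle is entirely in or entirely out."""
    if not is_usable(fix):
        return was_inside                  # bad data: keep what we knew
    d = distance_m(fix.lat, fix.lon, center_lat, center_lon)
    if d + fix.accuracy_m <= radius_m:
        return True                        # whole circle is inside the fence
    if d - fix.accuracy_m >= radius_m:
        return False                       # whole circle is outside the fence
    return was_inside                      # straddling the edge: do not flip
```

A notification rule would then fire only when the returned state differs from the previous one, so a position wandering along the boundary does not trigger it over and over.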

Data sources and crashing into the ocean

Say instead you are flying a high tech airliner, at night, over the ocean. Your instruments become useless, airspeed and altitude randomly moving from off-scale-low to other, arbitrary values. Or are they? Are some of them the true values and you can fly to that?

Who cares, because you are fighting the computer, which doesn’t understand transient data or impossible fluctuations, and can’t just make do. It is reacting to every change with a series of warnings, some of which are contradictory: over-speed and stall, at the same time. Without any visual frame of reference, there's no way to tell even roughly how high or fast you are going. What do you do?

Ask for help.

Air traffic control still has radars covering the area, as you are close enough to the city. So you call back and tell them the problem, and ask them to give you accurate information off their screens.

They can easily, if slowly and by voice, give bearing, direction of travel and speed, and altitude.

Who knows what is wrong with this last data point?

Altitude is not derived from radar data, but is instead telemetry; the aircraft sends the information, along with the identifying codes, to the air traffic control system, and then it appears on the radar screen. If flight instruments are inaccurate, like for AeroPeru 603, then the number on the ATC display screen is wrong also.

And as a result, you fly into the ocean and everyone on the plane dies.

This is the same issue, of data presented on screen being entirely trusted at face value. The radar displays radar data, right? How can some of it be another type of data? How can it be wrong?

Tested is not infected

This same exact issue is arising daily in our new pandemic world. How many people in the US are infected with the SARS-CoV-2 virus? How many have been killed by COVID-19?

We don’t know.

The numbers you see, even from very, very good sources like Our World In Data, are the reported numbers. Hopefully by now you are staring aghast at the screen as you realize that our national testing program is the low-limit dosimeter.

Just in the last day there was some fanfare that the 2 millionth test was made available, which means that — even if everyone only needs one test, which isn’t true — we might have data on as much as 0.6% of the population.
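The arithmetic behind that 0.6% is simple enough to check. The Python sketch below uses the 2 million tests from the text and assumes a US population of roughly 330 million, which is a round 2020 figure that does not come from this post. The `max_share_tested` function is hypothetical.

```python
TESTS_AVAILABLE = 2_000_000    # from the text above
US_POPULATION = 330_000_000    # assumption: rough 2020 figure, not from the text


def max_share_tested(tests: int, population: int, tests_per_person: float = 1.0) -> float:
    """Upper bound on the fraction of people we could have any test data for."""
    return min(1.0, tests / tests_per_person / population)


best_case = max_share_tested(TESTS_AVAILABLE, US_POPULATION)
print(f"At most {best_case:.1%} of the population tested")
```

If people need two tests each on average, pass `tests_per_person=2.0` and the ceiling drops to about 0.3%. Either way it is a ceiling on what has been measured, and says nothing about how many are actually infected.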

How many are infected in the US? All we can say is “more than” 16,690. We do not know how many with any degree of precision, but we know with absolute certainty that, when the meter is at the edge of its measuring range, the true value is more than the reported number.



All data and other references to current events inside the post above are as of the morning of 10 April 2020. Links will provide real-time updates.

https://twitter.com/jamesrchiles

https://twitter.com/OurWorldInData

Blogging · Steven Hoober