This is a guest post by an anonymous IT professional.
We’re living in strange – “unprecedented” to use the cliché – times of the sort usually reserved for the subject of a dystopian novel. Unsurprisingly, the response to Covid-19 has exposed shortcomings in government. While the media love to report numbers (often with little or no context to put them into perspective) and refer to experts, little attention has been paid to the quality of those experts’ work.
Professor Neil Ferguson of Imperial College has told us that he implemented a flu pandemic model in thousands of lines of undocumented computer code 13 years ago:
I’m conscious that lots of people would like to see and run the pandemic simulation code we are using to model control measures against COVID-19. To explain the background – I wrote the code (thousands of lines of undocumented C) 13+ years ago to model flu pandemics…
— neil_ferguson (@neil_ferguson) March 22, 2020
To say I was gobsmacked at his admission is an understatement. He’s one of the experts advising the government about the Covid-19 pandemic, and was consulted in previous health crises such as Foot & Mouth disease. As with the approach taken to combat that outbreak, we’re seeing a kind of scorched-earth strategy for containing another transmissible disease.
Even though the “C” programming language that Ferguson used is nearly 50 years old, the language chosen isn’t the problem. Undocumented means that modules and other code fragments are not commented, so their purpose may be unclear to someone unfamiliar with the code. In the worst case it means that modules and variables don’t have self-documenting names. For example, an accounting program could have the variables BalanceBroughtForward and BalanceCarriedForward, but a sloppy programmer might call them B1 and B2 instead – a sure recipe for confusion.
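As a hypothetical illustration of that point (the names and figures below are invented for the example, not taken from any real program), both calculations do the same thing, but only one explains itself:

    #include <stdio.h>

    int main(void) {
        /* Opaque names: the reader has to guess what B1 and B2 represent. */
        double B1 = 1000.00, R = 250.00, P = 175.00;
        double B2 = B1 + R - P;

        /* Self-documenting names: the same calculation explains itself. */
        double BalanceBroughtForward = 1000.00, Receipts = 250.00, Payments = 175.00;
        double BalanceCarriedForward = BalanceBroughtForward + Receipts - Payments;

        printf("B2 = %.2f, BalanceCarriedForward = %.2f\n", B2, BalanceCarriedForward);
        return 0;
    }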
The “C” language is good to work with but has some inherent issues which can lead to subtle bugs affecting the output without causing an error. A common problem is the conditional which uses two equals signs rather than one.
To compare variables A and B for equality you would write this: if (A == B). However, it’s easy to accidentally write this: if (A = B). The latter assigns the value of B to A, and the condition is then true whenever B is non-zero, so the program compiles and runs but quietly does the wrong thing. I have no idea whether Ferguson’s code contains any bugs; this is just one minor example of why strict testing is needed.
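Here is a minimal, invented C sketch of how the mistake slips through unnoticed (nothing below is taken from Ferguson’s code):

    #include <stdio.h>

    int main(void) {
        int A = 3;
        int B = 7;

        if (A == B)        /* intended: compare A with B */
            printf("equal\n");

        if (A = B)         /* accidental: assigns B to A; true whenever B is non-zero */
            printf("this branch runs, and A has silently become %d\n", A);

        return 0;
    }

Modern compilers will flag the second test if warnings are enabled (GCC and Clang with -Wall suggest adding parentheses around an assignment used as a truth value), which is one more reason why build and test discipline matters.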
The reason for commenting code extensively and properly is so that the programmer himself, and anyone else who maintains it, can understand what it does and how it works, reducing the chance of mistakes and avoiding unnecessary effort. During my IT career I would have terminated the contract of any contractor working for me who wrote thousands of lines of undocumented code. Not only is such code a nightmare for others to work on, it can be difficult for the original programmer to maintain when coming back to it after a long time. Sloppiness in the coding raises the worry of a concomitant lack of rigour in testing, although that’s not to assert that Ferguson’s code isn’t working as intended or wasn’t tested.
A few years ago I did a stint at a bank testing software to generate returns to the Prudential Regulation Authority for stress testing the bank’s financial state. Millions were spent on the project and strict controls applied to design, code development and testing. If anything, the testing was almost anally retentive in its strictness.
Ferguson’s code isn’t the only example of poor software engineering standards in academia. Hacked emails and files from the Climatic Research Unit at the University of East Anglia in 2009 revealed climate-modelling software that was riddled with problems.
Any software which produces results used to inform public policy should be coded and tested to high standards. If only that were the case! I would be aghast, but not surprised, if Ferguson’s pandemic modelling program underwent less stringent testing than software used to manage our bank accounts.
In the interests of transparency, it’s fortunate that Ferguson publicly admitted his model comprises undocumented code. It has subsequently been documented and ported to another computer language. Thanks to his tweet we have further insight into the lack of best practice in academia. To reassure us, Ferguson should explain what level of testing was carried out on his program, say whether it was reviewed by another programmer, and provide evidence of that testing. It’s common in large organisations to have independent test teams and to keep an audit trail of their work.
When the clamour for a public inquiry into the government’s Covid-19 response grows, I hope its remit includes the state of programming and testing standards in academia.
18 thoughts on “Anon IT Professional: Can we trust Neil Ferguson’s flu pandemic model?”
Great article, thanks 🙂
James Delingpole has also written about Neil Ferguson’s model & its released code here https://www.breitbart.com/europe/2020/04/23/delingpole-why-must-covid-19-decision-makers-be-clouded-in-secrecy/ which is also a good read!
“Subsequently it’s been documented and ported to another computer language” Well, something has been documented and ported. How do we know that it hasn’t been polished up a bit too?
You write comments throughout your program; you give variables, subroutines, and so on helpfully explicit names; you test the program bit by bit, storing the results of your tests. That’s what I was taught decades ago and a great help it was. I could routinely re-use bits of code because it always told me what everything was, and was for. Does Ferguson suffer from the arrogance of physicists?
He seems to me to have been a reckless twerp. Has he done more damage to the British economy than the Luftwaffe managed?
Back in the late 1970s, as a self-taught programmer working in a safer language than C, I wrote a program to provide performance-analysis programs with selectable extracts from power plant instrument data. I knew how important comments were. There was one section of code where I decided that the logic of three to five lines or so was so obvious that comments were unneeded. (Admittedly, I may have been “clever” – I always prided myself on the efficiency of the code I wrote.)
A couple of years later, I returned to that employer (one of numerous bad career moves I made over the ensuing decades) and found myself, in a different role, needing to modify and enhance that program – my program, not written or touched in the interim by anyone else. I reached the section described above and stopped: what on earth had I been trying to do there? It took a considerable time, deep in thought, for me to figure it out, and I immediately slapped comprehensive comments into it.
What is breathtaking about what’s being reported now is partly that the man in question considers it acceptable to make an admission of such enormity, and partly that it is considered acceptable by peers and decision makers (who admittedly wouldn’t understand how devastating it is), to the extent that the model is viewed as a trustworthy source of input to hugely far-reaching strategy.
I’m not saying it’s incorrect, just that it’s unknowable, in any practical sense.
The lockdown strategy has been like asking the Luftwaffe not to bomb us so that we can shoot our people instead.
https://www.youtube.com/watch?time_continue=1&v=BTLii-e_UtY&feature=emb_logo
My apologies, but I don’t understand any of the above (no criticism implied or inferred as I’m an utter dunce where computers are concerned) but do, please, have a look at the above. That American doctor appears to know what he’s talking about and what he’s saying surely reflects what’s happening here in Britain.
Substitute 100 lines and still sacked
“Long time” < 1 year
Lots of comments, explanations; Easy to modify/update; No "GoTo"… was drummed into me from day one of Computing degree
Neil Ferguson of Imperial College and his/their 'models' were discredited years ago. It's incomprehensible and inexcusable that both UK & USA based their hugely damaging response on this 'establishment' charlatan's claims
Three weeks on from Neil Ferguson saying to Telegraph "I will release code next week", has he?
@James Macpherson
+1 re code comments, we forget very quickly
“Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it”.
– Brian W. Kernighan
Or rather how Milo Minderbinder ran the war in Catch-22 – he contracted with the Germans to use his own bombers to bomb his own airfields so they didn’t have to… it was good for business!
I’m not at all surprised by the state of the code; however, I would be very careful about thinking this is much of an issue.
It simply does not matter.
The real problem is how the modelling was used. There is some utility in models; however, it cannot be said enough: computer models have no predictive ability in this application.
That is the error that matters. The model created a Big Scary Number, and the term ‘computer model’ gave that Big Scary Number more credibility than it ever should have had.
That Big Scary Number then propagated something that will go down in history as one of the biggest human misjudgments of all time.
I was amazed to find he’s still working for the government after the F&M débacle, and his prediction that millions of us would get Mad Cow Disease over what is now the last 25 years. At least for now, I’m prepared to accept that lockdown is the best strategy against the Wuhan Virus, but, frankly, learning that he’s behind it casts more doubt in my mind than any counter-argument.
I’m a self-taught amateur, and even I know that failing to comment and document is bad practice, not to say downright stupid. If I go back to my own code after a few weeks I have trouble following it, but it’s almost as if he’s proud of the complex obscurity of his “thousands of lines”. “Look how clever I am. You won’t be able to follow it, of course…” Can he honestly say he understands it all himself, after all that time?
Whatever the usefulness of the model, a small error in input can result in a large error in prediction.
How fatal is this disease? We have never had a solid figure for the number infected, so we can’t know, even now.
How quickly does it spread? Well we can’t be sure that the first known case is in fact the first case, and we have no solid figure for the number infected so we can’t know.
What factors alter the rate of spread? Sure we can make some assessments based on other viruses but this one is new, we can’t know.
And if you feed a load of guesses, however intelligent, into a model the prediction is going to be worthless.
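To put a rough, hypothetical number on that point – the growth rates, starting figure and time horizon below are invented, not taken from any real model – here is how quickly a small difference in one input compounds:

    #include <stdio.h>
    #include <math.h>

    int main(void) {
        double starting_cases = 100.0;                  /* assumed starting point */
        double low  = starting_cases * pow(1.20, 30);   /* 20% daily growth over 30 days */
        double high = starting_cases * pow(1.25, 30);   /* 25% daily growth over 30 days */

        printf("20%% daily growth for 30 days: about %.0f cases\n", low);
        printf("25%% daily growth for 30 days: about %.0f cases\n", high);
        return 0;
    }

A five-percentage-point difference in the assumed daily growth rate changes the 30-day projection by more than a factor of three, so the quality of the inputs matters at least as much as the quality of the code.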
But the real failing is with our culture. The public cannot accept that sometimes we just have to put up with bad things. The press and the media encourage this.
We cannot accept that some things are unknowable, so we get told “facts” to keep us quiet.
And no politician dare say that this thing must be endured, a politician is required to have an answer to all problems.
And no expert dare admit that his knowledge is less than complete.
We demand certainty where none is possible and hence get lies.
“Any software which produces results used to inform public policy should be coded and tested to high standards.”
Indeed! And the code should be made public, so that anyone who has the time & interest can delve into it. To use the Climate Change Scam as an example, it was individual citizens (not peer-reviewing academics) who demonstrated the unacceptable flaws in Michael Mann’s “hockey stick” software.
The required openness should go beyond the software, and also include the asserted science and theory which the code is intended to represent. Michael Crichton made an excellent proposal in his unusual footnoted novel “State of Fear”: any science proposed as the basis of public policy should first be subjected to a good old-fashioned adversarial trial, with the opposition being adequately funded.
The Catholic Church has long used a Devil’s Advocate in proceedings about Beatification. Is it too much to ask that supposedly democratic governments should subject themselves to the same discipline on public policies supposedly justified by science?
I agree with the original post to a large extent and some very good points are made. “C” is not a language for amateurs and testing is everything. These days all code should be unit tested and developers find themselves writing more test code than application code.
On the point of documentation – commenting code – I have to disagree. Most comments should not be necessary; in fact they are themselves bugs, because if you code correctly they should not be required, and they are often used to disguise poor coding. There may be rare occasions when you need a comment to explain the chosen algorithm, but if you use well-chosen names, well-structured code and extensive unit tests they should not be necessary. Code gets changed – that’s fine if you have unit tests, as the changes will be picked up – but the comments are virtually never updated. Very soon you come to realise that the only thing you can trust is the code itself.
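As a minimal sketch of that philosophy (the function and figures are invented for illustration), the names carry the information and the tests record the expected behaviour instead of a comment doing so:

    #include <assert.h>

    /* The name and parameter names say what the function does. */
    static double balance_carried_forward(double brought_forward, double receipts, double payments) {
        return brought_forward + receipts - payments;
    }

    int main(void) {
        /* The tests document the expected behaviour and fail loudly if the code changes. */
        assert(balance_carried_forward(100.0, 50.0, 30.0) == 120.0);
        assert(balance_carried_forward(0.0, 0.0, 0.0) == 0.0);
        return 0;
    }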
The definition of writing software – naming things.
Having started forty years ago with Fortran and then used COBOL, Pascal, C and, for the last twenty years, Java, I have had this discussion with many developers, but I’m convinced even more today than I ever was: commenting code is lazy.
One problem with comments is that they frequently don’t get updated when the code does, so code and comments end up out of sync. This means comments ought to be quality-checked occasionally too if your code makes regular use of them, but that’s harder (needs more eyeball and thinking time) than checking the code works, which ought to be automated if the code has been made with testing in mind from the start.
While I understand the attraction of an injunction against commenting and a preference for writing “obvious” clear, well-structured code instead, I’m not sure how practical that always is.
It may also depend on what paradigm you’re working in. If you’re working in a language that encourages a functional programming style, then the bulk of your code is likely to consist of a set of relatively small functions that take certain inputs and apply a particular rule to produce certain outputs, with a smaller amount of code containing the control structures that actually “do the work” (deciding which initial inputs you pass to which function, which of those outputs you pass to the next function, and so on). In this approach it does make sense to document the purpose of each function, the nature and meaning of its inputs and outputs, and the process used to get from input to output.
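For example, a short hypothetical function documented in that spirit (the name, ranges and rule are all invented for illustration):

    #include <stdio.h>

    /*
     * growth_factor
     * Input:  current_cases  - cases observed in this period (>= 0)
     *         previous_cases - cases observed in the previous period (> 0)
     * Output: the ratio current_cases / previous_cases.
     * Rule:   callers must ensure previous_cases is non-zero.
     */
    static double growth_factor(double current_cases, double previous_cases) {
        return current_cases / previous_cases;
    }

    int main(void) {
        printf("growth factor: %.2f\n", growth_factor(120.0, 100.0));
        return 0;
    }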
The video you link to has been pulled by Youtube. It might be the same one that Dr Erickson put out (watched by 5 million people over 4 days) which has been memoryholed as well. When Youtube starts censoring doctors, you know the doctors are well and truly over the target.
As an ancient programmer, who cut his teeth on assembler, I once worked for one of the big computer companies (name withheld to avoid embarrassment/libel action) in the operating systems dept. Whilst examining a very convoluted and impenetrable chunk of assembler code in a tape-drive handler I found the following, neatly-boxed, comment from one of my predecessors…
“When I wrote this code last month only God and I knew what I was doing. I’ve looked at it again and now God alone knows”.
Ah, “the good old days”..!
I can remember working on some Assembler code decades ago where the programmer commented an INC A instruction with something like “increment accumulator by 1”. That merely stated what the instruction meant, not what value the accumulator was holding and why it needed to be incremented! The rest of his code was commented in this stupid way. It was one of many WTF moments in my career.
As for the importance of comments, I think that no matter how clearly written a program is they can help. They’re especially needed to document what a module does, what the input and output arguments are, what it returns, error conditions etc.
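As a hypothetical illustration of both points (everything below is invented): the header comment records purpose, return values and the error condition, while the inline comment explains why rather than restating what the code does.

    /*
     * read_sensor - hypothetical function for illustration only.
     * Purpose: return the latest reading from an imaginary device.
     * Returns: a value in the range 0..1023 on success, or -1 if the
     *          device did not respond; callers must check for -1.
     */
    static int read_sensor(void) {
        return 512;              /* stand-in value so the example runs */
    }

    int main(void) {
        int attempts = 1;        /* a bad comment here would just say "set attempts to 1" */
        int reading = read_sensor();

        /* Better: explain why, not what - the imaginary device sometimes misses
           the first poll, so one retry is allowed before giving up. */
        if (reading < 0) {
            attempts = attempts + 1;
            reading = read_sensor();
        }

        return (reading < 0) ? 1 : 0;
    }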