The Covid-19 pandemic has brought many scientific issues to wide public attention, but even in these extraordinary times, the way computer coding is used in research is not a topic many would have predicted for mainstream discourse.
Nonetheless, the subject has burst into the open, mainly because of scrutiny of the code used in epidemiological modelling, in particular the highly influential Imperial College London paper, led by Neil Ferguson, published just as the UK started going into lockdown.
The code underlying the modelling came in for criticism after it was posted to the public repository for programming, GitHub, although, according to a report this month, scientists who have tested the code have found that its results can be reproduced.
Bill Mitchell, director of policy at the British Computer Society (BCS), said that although it agreed that there was “no credible evidence” of major problems with the Imperial code, the episode had shone a light on the issue of how programming was performed and reviewed in academia.
The BCS released a statement last month in which it said “the quality of the software implementations of scientific models appear to rely too much on the individual coding practices of the scientists” and called for professional software engineering standards to be used where scientific code formed the basis of policy.
Dr Mitchell, a former computing lecturer at the universities of Manchester and Surrey, said there were “lots of very, very standard things that you would expect in the software world” that are not always being done in science.
This included code being readily shared on public repositories such as GitHub; being written in such a way that it can be easily understood and tested by others; and tests being published so reviewers can easily try to replicate the results.
“It goes to the heart of doing science. You tell people what experiments you’ve done; you allow them to look at your working,” he said.
Dr Mitchell said his “very personal” view was that scientists might sometimes view coding as just a “mechanical way of generating data” and might not fully appreciate “just how much innovation and ingenuity and cleverness is embedded in their own code and how valuable that is to other people”.
Changing this culture, especially given the “intense” publish or perish pressures in academia, might require incentives similar to those seen in the open access movement, he said.
The “simplest thing” would be to say that all scientific software developed with public money must be made openly available. “I think suddenly when people realise that, ‘Oh my gosh, people are going to be looking at my code’, the standard will instantly improve,” Dr Mitchell said.
Others say the direction of travel is towards more openness, but that there is a debate to be had about how to speed up progress.
“In my field, there has been a movement towards transparency for quite a number of years, and it is becoming more and more common for journals, reviewers and the community to require code to be made available with papers,” said Rosalind Eggo, assistant professor in infectious disease modelling at the London School of Hygiene and Tropical Medicine.
She added that one longer-term solution would be to invest more in employing research software engineers “who are experts in writing and translating scientific code and making it more efficient, shareable and, ultimately, more useful”.
“Making sure we have the resources that allow the hiring and long-term funding of software specialists would improve the quality of scientific code and hopefully make it easier to build efficient analysis, and to reuse and repurpose code,” she said.
Konrad Hinsen, a biophysicist at France’s National Centre for Scientific Research (CNRS) and an expert in scientific computing, suggested that employing more research software engineers was a good idea.
However, he added, using them to help write code might be difficult for “small, exploratory projects that are done in informal collaborations”.
“You can’t just add a software expert with a very different working style to such a team. But you can still do after-the-fact code review before accepting results for publication,” he said.
This is where research software engineers could have a key role more generally, including through the traditional publishing process, he said, pointing out that some “pioneering journals” were already including code review as an “integral part” of the peer review process.
More broadly, Dr Hinsen added, the issue was one of “training enough people, and then employing them in appropriate jobs”. However, he was somewhat sceptical about whether progress could be sped up across all disciplines in science.
“Much scientific code is long-lived, and habits are even more subject to inertia. Faster improvement is not possible for scientific code in general, though it is in specific, well-defined subjects where motivation is high. Epidemiology might be in that situation right now,” he said.
POSTSCRIPT:
Print headline: Pandemic models spark calls to reveal more code