The sequencing of COVID-19
Labs across the country have converted to the genetic sequencing of coronavirus samples to help track its mutation and spread. The initiative, COG-UK, is being led by Cambridge. We spoke to one of the scientists lending their time and expertise.
Dr Charlotte Houldcroft is spending long days inside a tent inside a lab.
The small black hydroponic gazebo (“like something that might be used to grow weed in”) has seen action at the sites of African Ebola outbreaks, but is now one of several erected within laboratories on the Cambridge Biomedical Campus.
“They act as self-contained little boxes: easy to clean, they prevent spread of contaminated genetic material. Whenever this is finally over, the tents will be sterilized and packed down, waiting for the next outbreak,” she says.
Houldcroft, a virus evolution researcher, volunteers for COG-UK: a massive national initiative to sequence the shifting genetics of COVID-19 as fast as possible, providing ‘real time’ data that can help map its spread and detect mutations.
The University of Cambridge is leading on coordination of COG-UK with the Wellcome Sanger Institute, but samples from confirmed COVID-19 cases are distributed to a network of sequencing centres: Belfast, Birmingham, Cardiff, Edinburgh, Exeter, Liverpool, and Nottingham to name a few.
Calls went out for scientists across social media. “I was hoping I could help if the UK started mass sequencing. A lot of researchers were.”
“There was a rush to volunteer, a bit like the Hunger Games”
And so, in a tent in a lab in Cambridge, Houldcroft works as part of a team led by Professor Ian Goodfellow, hosing down COVID-19 genetics at just the right pace. Virus RNA from across East Anglia is painstakingly amplified and converted to DNA for the sequencers.
“Put very simply, it’s a long series of steps, part of which involves repeatedly sticking genetic material to tiny beads, and then washing all the crap off with ethanol – all the RNA from other bugs up your nose or in your throat. But do it too slowly or too fast and the virus RNA diminishes.”
“We get samples from Public Health England about 10am, then finish about 7pm, by which time my brain’s a bit fried from concentrating very hard on carefully moving around minuscule quantities of liquid.”
The scientists generally work in pairs, swapping tasks to provide relief, and double-checking each other’s work. Virus samples arrive the day after being taken, attached to an anonymised barcode. Once the sample is prepped, it gets pipetted into the injection port of the latest handheld MinION sequencer.
“The chemical properties of each DNA base change the electrical current in the machine, so it reads DNA really fast,” explains Houldcroft. “Instead of taking up to two days, it takes one to eight hours.” This means her lab can sequence the genomes of between 24 and 70 virus samples a day.
‘Wind the clock back’
A virus is essentially a parasitic packet of genetics programmed to copy itself inside a host. Coronaviruses are encased in a layer of fat, which is why soap is so effective, says Houldcroft. “It breaks down fat, and the genetic guts of the virus spill out.”
As COVID-19 replicates within a host, mistakes get made. Most of these little genetic mutations make no difference to the effectiveness of the virus. They can, however, be tracked by scientists through virus genome sequencing.
The minor mutations lead to subtly different lineages. This can be seen in the RNA sequence, and used to determine the phylogenetics: the coronavirus family tree, as it splits and diversifies. Very roughly, a mutation occurs every 20 “transmission events” or about once every two weeks.
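As a rough illustration of that arithmetic, the Python sketch below counts the differences between two aligned sequences and converts them into elapsed time using the very approximate figure of one mutation every two weeks. The function names and the toy sequences are invented for this example. It is not the consortium's method: real phylogenetic analysis builds a full family tree with far more sophisticated statistical models.

```python
def count_mutations(reference: str, sample: str) -> int:
    """Count positions where two aligned, equal-length sequences differ."""
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for ref_base, base in zip(reference, sample) if ref_base != base)


# Assumption: the rough rate quoted in the article, about one mutation every two weeks.
WEEKS_PER_MUTATION = 2.0


def weeks_since_ancestor(reference: str, sample: str) -> float:
    """Very rough estimate of how long a sample has been diverging from a reference."""
    return count_mutations(reference, sample) * WEEKS_PER_MUTATION


# Toy sequences, invented for illustration only; real virus genomes are far longer.
ancestor = "ACGTACGTACGT"
sample = "ACGTTCGTACGA"
print(weeks_since_ancestor(ancestor, sample))  # two differences, so prints 4.0
```

Two differences from the reference would, on this crude clock, suggest about four weeks of separate evolution.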
“You look at the diversity in the genome and try to wind the clock back: working out what are mutations, when they occurred and so where this strain emerged and how it fits into the UK pattern,” says Houldcroft.
“Some of the first samples of the virus in the UK, the Chinese visitors in York and the cluster in Brighton, are not related that closely to anything we’re seeing in the country now. It suggests the tracking and tracing at the beginning worked.”
“It was not until it started to spread across Europe and the US that we get multiple introductions across the UK. Virus clusters in Wales, for example, had contributions from all over the world. There’s not a single Welsh strain.”
Unsurprisingly, perhaps, many UK strains are closely related to those in neighbouring countries, such as France and Holland, with London seeing the most viral diversity.
The sequencing data from across the nation is uploaded several times a day, feeding into a big picture that gets pored over at senior levels of the COG consortium. This can be used in mathematical models to provide better indications of infection rates.
It can also flag if part of the country starts “behaving weirdly”, says Houldcroft. “If you have a sudden expansion of a single viral lineage somewhere, you know you need to look closely at that area’s containment measures, focus resources.”
“If you have a COVID ward with multiple examples of the same lineage, you might have a local outbreak within a particular area, community or perhaps even hospital. Whereas diversity of viruses suggests pre-lockdown infection.” This genetic detective work can help locate transmission “hot spots”, or reassure that steps to control infection are working.
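The reasoning in that quote can be sketched in a few lines of Python: tally the lineage assigned to each sample from one location and check whether a single lineage dominates. The lineage labels, the function names and the 0.8 threshold are all hypothetical, chosen only to illustrate the idea. They are not COG-UK rules or tools.

```python
from collections import Counter


def dominant_lineage_share(lineages: list[str]) -> tuple[str, float]:
    """Return the most common lineage label and the fraction of samples carrying it."""
    if not lineages:
        raise ValueError("need at least one sample")
    label, count = Counter(lineages).most_common(1)[0]
    return label, count / len(lineages)


def looks_like_local_outbreak(lineages: list[str], threshold: float = 0.8) -> bool:
    """Flag a set of samples when one lineage accounts for most of them.

    The default threshold is an arbitrary illustrative value, not an official cut-off.
    """
    _, share = dominant_lineage_share(lineages)
    return share >= threshold


# Hypothetical lineage labels for two wards, invented for illustration.
ward_a = ["L1", "L1", "L1", "L1", "L1", "L2"]  # one lineage dominates
ward_b = ["L1", "L2", "L3", "L4", "L5", "L6"]  # many unrelated lineages
print(looks_like_local_outbreak(ward_a))  # True
print(looks_like_local_outbreak(ward_b))  # False
```

In this toy case the first ward would be flagged for a closer look, while the mix of lineages in the second points to infections picked up from many separate sources.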
One of the truly unnerving features of the new coronavirus is its unpredictability. Many suffer almost no symptoms, while some young and seemingly healthy people end up with pneumonia or worse. As well as short-term detection, COG-UK is building an invaluable resource for long-term prevention.
“Using electronic health records, we will ultimately be able to see if changes in the viral genome are associated with more or less severe symptoms, or cause problems for those with particular underlying conditions,” says Houldcroft.
Herpes and Corona
COVID-19’s closest known relative was found in bats seven years ago in China. It is believed the new coronavirus jumped into the human “reservoir” via an intermediary species, possibly the much-persecuted pangolin, traded for its scales.
“Highly mobile, high-density mammals are ideal for coronaviruses”
Members of this virus family – named for their protruding proteins resembling a spiky crown – have long troubled humans, causing childhood coughs and colds. Houldcroft points out that the genetics of familiar coronas date back to the Middle Ages.
“We get them as kids, build an immunity, and then the only reservoir available to the virus is the next generation. But COVID-19 is new to humans: no one has immunity.”
“Mutations in the spike proteins, probably in the bat ancestor, are what made this virus so successful – helping it access the cells in our lungs. We’re vigilant for new mutations within these proteins, which could affect vaccination strategies.”
Before COVID-19 hove into view, Houldcroft worked a lot on herpes, a larger virus with a big DNA-packed genome – five or six times the size of coronavirus – which often manifests as cold sores or genital lesions.
“Herpes became a good passenger, staying in a body for life, and its slow genetic mutations over decades and centuries can reveal deep histories.” Her research has looked at how herpes jumped species barriers between hominins over a million years ago, for example.
“The small COVID-19 genome is made of RNA, and mutates much faster, which tells us stories about last week or last month, rather than hundreds of thousands of years ago. Working on virus evolution means moving between vastly different timescales.”
Dr Charlotte Houldcroft
Charlotte is a Research Associate in the Department of Medicine. You can follow her on Twitter at @DrCJ_Houldcroft.