Leveraging Next-Generation Sequencing Technology in the Fight Against COVID-19

With the help of NGS, we can enhance our understanding of the virus and its underlying pathways impacting humans

The current lockdown of the country due to the COVID-19 pandemic will be difficult to sustain, especially for the extended period required to develop a vaccine. Most public health experts agree that the key component needed to reopen the country in a safe way prior to having a vaccine is a robust testing and contact tracing system. 

While detecting the mere presence of the virus in an individual is better than no detection at all, a positive/negative result alone is not enough to track viral spread in a population or to expose weaknesses in infection control. One needs to follow the chain of infection precisely, tracing it back to the initial infection in a group, subpopulation, or community. This requires a fast, easy method to analyze the sequence of the virus present in each individual and to determine how the virus in one person is related to the virus found in another. This level of detail is extremely important for decision making in confined subpopulations, such as those in hospitals and on deployed ships.

Next-generation sequencing (NGS) of SARS-CoV-2 can contribute a great deal to our understanding of the pathways through which the virus affects humans. Significant progress has been made in a short period of time: on January 24, 2020, the SARS-CoV-2 genome was published in the New England Journal of Medicine.

Through the Global Initiative on Sharing All Influenza Data (GISAID) and GenBank, researchers are sharing what they learn about the origin of the new virus, its epidemiology, and its transmission routes, and are facilitating the development of diagnostic and treatment strategies. These sites offer a wealth of information as well as the latest news on this subject. Having the SARS-CoV-2 genome early on has helped us understand how the virus spreads and has shaped response strategies. Here are a few examples in this context:

A recent Nature paper describes whole-genome sequences of the virus obtained from five patients at an early stage of the outbreak. The sequences are almost identical to one another and share 79.6 percent sequence identity with SARS-CoV. Furthermore, the authors showed that the virus, then called 2019-nCoV, is 96 percent identical at the whole-genome level to a bat coronavirus. Pairwise protein sequence analysis of seven conserved non-structural protein domains shows that this virus belongs to the same species as SARS-CoV. In addition, virus isolated from the bronchoalveolar lavage fluid of a critically ill patient could be neutralized by sera from several patients. Notably, the authors confirmed that the virus uses the same cell entry receptor as SARS-CoV, angiotensin-converting enzyme II (ACE2). The study also illustrates how future research studies can be designed.
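
To make the idea of percent sequence identity concrete, here is a minimal sketch in Python. It assumes the two sequences have already been aligned to the same length. The function name `percent_identity` and the toy fragments are invented for illustration and are not taken from the Nature study. Skipping columns that contain a gap is one common convention among several.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of aligned, gap-free columns in which the two sequences agree."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # skip alignment gaps (an assumed convention)
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


# Toy aligned fragments, invented for illustration only.
query = "ATGGCT-ACGTTAGC"
reference = "ATGGCTAACGATAGC"
identity = percent_identity(query, reference)
print(f"{identity:.1f}% identity")
```

Real genome comparisons run over roughly 30,000 bases and depend on the alignment method chosen, so published figures such as 79.6 percent cannot be reproduced from a toy example like this one.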

The scientific community publishes SARS-CoV-2 sequences on a regular basis, along with the date and location at which each sample was collected. Golden Helix’s software supports highly specific and accurate contact tracing, and the identification of potentially undiagnosed persons, by determining which previously sequenced sample a given sample most closely resembles (see Figure 1). In partnership with a collaborator in Germany, our system identified the SARS-CoV-2 cases in a hospital and traced them to their origin (see Figure 2). This fast, simple analysis revealed two separate infection clusters within the hospital, and steps were subsequently taken to tighten specific infection control measures.
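
The idea of matching a new sample to its most similar published sequence can be sketched in a few lines of Python. This is not Golden Helix's implementation, which the article does not describe. It is a simplified illustration that assumes pre-aligned sequences of equal length and uses the Hamming distance (the count of differing positions) as the similarity measure. The `Reference` record, the sample IDs, dates, locations, and sequences are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reference:
    """A published genome with its collection metadata (all values hypothetical)."""
    sample_id: str
    collected: str
    location: str
    sequence: str


def hamming(a: str, b: str) -> int:
    """Number of differing positions between two equal-length aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def closest_reference(query: str, references: list[Reference]) -> tuple[Reference, int]:
    """Return the reference with the fewest differences from the query, plus that count."""
    best = min(references, key=lambda ref: hamming(query, ref.sequence))
    return best, hamming(query, best.sequence)


# Toy reference set, invented for illustration only.
references = [
    Reference("ref-A", "2020-01-15", "City 1", "ACGTACGTAC"),
    Reference("ref-B", "2020-02-20", "City 2", "ACGTTCGTAA"),
    Reference("ref-C", "2020-03-05", "City 3", "ACCTTCGTAA"),
]

new_sample = "ACGTTCGTTA"
match, distance = closest_reference(new_sample, references)
print(f"closest: {match.sample_id} ({match.location}, {match.collected}), "
      f"{distance} difference(s)")
```

Because each reference carries a collection date and location, the closest match suggests where and when a related infection was previously observed. In this sketch ties go to the first reference in the list; a production system would need a more careful rule.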

NGS is allowing researchers to better understand the phylogeny of this virus, which helps establish a clear picture of its transmission routes. Additionally, NGS technologies will give us a deeper understanding of the virus and allow us to develop advanced diagnostic capabilities that help patients while also supporting research.
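
One simple way to see how groupings of related viruses emerge from sequence data is agglomerative clustering on pairwise distances. The sketch below assumes aligned toy sequences and single linkage, where the distance between two clusters is the smallest distance between any two of their members. It is an illustration only: the article does not say which method produced its figures, and real phylogenetic analysis relies on evolutionary models rather than raw difference counts. All sample names and sequences are hypothetical.

```python
from itertools import combinations


def hamming(a: str, b: str) -> int:
    """Count differing positions between two equal-length aligned sequences."""
    return sum(x != y for x, y in zip(a, b))


def cluster_distance(left: set[str], right: set[str], samples: dict[str, str]) -> int:
    """Single linkage: the smallest pairwise distance between members."""
    return min(hamming(samples[a], samples[b]) for a in left for b in right)


def single_linkage(samples: dict[str, str], n_clusters: int) -> list[set[str]]:
    """Merge the two closest clusters repeatedly until n_clusters remain."""
    if n_clusters < 1:
        raise ValueError("n_clusters must be at least 1")
    clusters = [{name} for name in samples]  # start with one cluster per sample
    while len(clusters) > n_clusters:
        left, right = min(
            combinations(clusters, 2),
            key=lambda pair: cluster_distance(pair[0], pair[1], samples),
        )
        clusters.remove(left)
        clusters.remove(right)
        clusters.append(left | right)
    return clusters


# Toy aligned sequences, invented for illustration only.
samples = {
    "s1": "AAAAAAAAAA",
    "s2": "AAAAAAAAAT",
    "s3": "AAAAAAAATT",
    "s4": "GGGGGAAAAA",
    "s5": "GGGGGAAAAC",
}

clusters = single_linkage(samples, n_clusters=2)
for members in clusters:
    print(sorted(members))
```

Recording the order and distance of each merge, rather than stopping at a fixed number of clusters, is what yields a dendrogram.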

Figure 1. This graph is a dendrogram of 50 samples observed in various labs showing how these viruses are related. The analysis showed that the viruses essentially fall into two clusters.
Figure 2. Principal component analysis was used as a technique for reducing the dimensionality of the SARS-CoV-2 genome, increasing interpretability but at the same time minimizing information loss. This graph shows that the 876 virus genomes fall into three clusters (red). The colored dots are virus genomes of new patients. With this representation it is possible to establish the proximity of these new cases to published virus genomes with known time and location information.
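
The principal component analysis mentioned in the Figure 2 caption can be illustrated with a small, dependency-free Python sketch. It makes several assumptions that do not come from the article: genomes are encoded as 0/1 vectors marking positions that differ from a reference, only the first principal axis is computed (by power iteration), and every sequence is a toy value. A real analysis would use a numerical library and more than one component.

```python
import math


def encode(sequences: list[str], reference: str) -> list[list[float]]:
    """One row per genome: 1.0 where the base differs from the reference, else 0.0."""
    return [[float(b != r) for b, r in zip(seq, reference)] for seq in sequences]


def fit_first_component(rows: list[list[float]], iterations: int = 200):
    """Return (column means, unit vector) for the first principal axis.

    Power iteration on X^T X, where X holds the mean-centered rows.
    """
    n, dim = len(rows), len(rows[0])
    means = [sum(col) / n for col in zip(*rows)]
    centered = [[v - m for v, m in zip(row, means)] for row in rows]
    axis = [1.0 / math.sqrt(dim)] * dim  # fixed start keeps the example deterministic
    for _ in range(iterations):
        scores = [sum(v * a for v, a in zip(row, axis)) for row in centered]
        updated = [sum(s * row[j] for s, row in zip(scores, centered)) for j in range(dim)]
        norm = math.sqrt(sum(v * v for v in updated))
        if norm == 0.0:
            break  # the data has no variance along the current direction
        axis = [v / norm for v in updated]
    return means, axis


def project(row: list[float], means: list[float], axis: list[float]) -> float:
    """Coordinate of one encoded genome along the fitted axis."""
    return sum((v - m) * a for v, m, a in zip(row, means, axis))


# Toy data, invented for illustration only.
reference = "AAAAAAAA"
published = ["AAAAAAAA", "AAAAAAAT", "CCCCAAAA", "CCCCGAAA"]
encoded = encode(published, reference)
means, axis = fit_first_component(encoded)
published_scores = [project(row, means, axis) for row in encoded]

new_patient = "CCCCAAAT"
new_score = project(encode([new_patient], reference)[0], means, axis)
print([round(s, 2) for s in published_scores], round(new_score, 2))
```

In this toy set the first two published genomes land on one side of the axis and the last two on the other, and the new patient's genome falls with the second pair. That is the kind of proximity the caption describes, reduced here to a single dimension.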