Genomic Signature Analysis of Virus Causing COVID-19

The rapid spread and subsequent evolution of the SARS-CoV-2 (hCoV-19) virus make it hard to apply traditional phylogenetic methods (such as trees), which are computationally intensive and may overlook more complex relationships between viral strains, such as recombination. By using alignment-free approaches such as genomic signature analysis, we are able to represent the similarity of viral genomes in 2-dimensional space. This allows us to visualize the relationships between all viral strains at once. For a more detailed description of the method, see Bauer et al. 2020.
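
To make the idea of an alignment-free 2-dimensional representation concrete, here is a minimal sketch. It is not the implementation from Bauer et al. 2020. It assumes that a genomic signature can be approximated by relative k-mer frequencies and that the 2-dimensional layout comes from a principal component projection. The function names `kmer_signature` and `project_2d`, the choice of k = 3, and the toy sequences are all hypothetical and chosen only for illustration.

```python
import math
import random
from collections import Counter
from itertools import product


def kmer_signature(seq, k=3):
    """Relative frequency of every k-mer over A, C, G, T, in a fixed order."""
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # k-mers containing ambiguous bases (e.g. N) are not in `kmers` and are ignored.
    total = sum(counts[m] for m in kmers)
    return [counts[m] / total if total else 0.0 for m in kmers]


def project_2d(vectors, iterations=100, seed=0):
    """Project vectors onto their first two principal components.

    Assumption: PCA via power iteration with deflation stands in for
    whatever dimensionality reduction the original analysis uses.
    """
    n, d = len(vectors), len(vectors[0])
    means = [sum(v[j] for v in vectors) / n for j in range(d)]
    rows = [[v[j] - means[j] for j in range(d)] for v in vectors]
    rng = random.Random(seed)
    axes = []
    for _axis in range(2):
        # Random unit start vector (a constant vector would be orthogonal
        # to every centered frequency vector, so it cannot be used here).
        w = [rng.uniform(-1.0, 1.0) for _ in range(d)]
        length = math.sqrt(sum(c * c for c in w))
        w = [c / length for c in w]
        for _step in range(iterations):
            scores = [sum(r[j] * w[j] for j in range(d)) for r in rows]
            w_new = [sum(rows[i][j] * scores[i] for i in range(n)) for j in range(d)]
            norm = math.sqrt(sum(c * c for c in w_new))
            if norm < 1e-12:  # no variance left in the data
                break
            w = [c / norm for c in w_new]
        scores = [sum(r[j] * w[j] for j in range(d)) for r in rows]
        axes.append(scores)
        # Deflation: remove the component just found before finding the next.
        rows = [[rows[i][j] - scores[i] * w[j] for j in range(d)] for i in range(n)]
    return list(zip(axes[0], axes[1]))


# Toy sequences standing in for viral genomes (made up for illustration).
genomes = {
    "isolate_A": "ACGT" * 50,
    "isolate_B": "ACGT" * 25 + "ACTT" + "ACGT" * 24,  # A with one substitution
    "isolate_C": "AAGGCCTT" * 25,
}
coords = project_2d([kmer_signature(s) for s in genomes.values()])
for name, (x, y) in zip(genomes, coords):
    print(f"{name}: x={x:.4f}, y={y:.4f}")
```

In this sketch no sequence is aligned to any other: each genome is reduced to a fixed-length vector independently, and only those vectors are compared. The two nearly identical toy sequences should land close together, while the third sits far from both.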

Here we analyze the hCoV-19 sequences in the GISAID EpiCoV™ Database. In particular, we focus on the emergence and evolution of Australian isolates (blue samples). The data is updated regularly and includes all samples isolated from humans for which the entire genome has been sequenced, made possible by the contributors of data to GISAID.

Please note that the isolate positions are relative and are updated with every new sequence added to the analysis; hence there are no fixed x-y coordinates for each isolate.

