
In April 2020, Genome Canada launched the Canadian COVID-19 Genomics Network (CanCOGeN). The mission of CanCOGeN was to establish a coordinated pan-Canadian, cross-agency network for large-scale SARS-CoV-2 and human host sequencing in order to track viral origin, spread and evolution; to characterize the role of human genetics in COVID-19 disease; and to inform time-sensitive, critical decision-making by health authorities across Canada during the pandemic. The network will further build national capacity to address future outbreaks and pandemics. Within CanCOGeN, VirusSeq coordinated and funded expanded genome sequencing efforts and supported data sharing within an open, ethical framework.
The project focused on metadata and analytics to aid the national viral sequencing effort as part of CanCOGeN VirusSeq. The team established the Canadian SARS-CoV-2 genomic metadata specification and reporting standard, which was adopted by the Canadian Public Health Laboratory Network (CPHLN) and harmonized standards across the country. The team also developed an international SARS-CoV-2 genomic metadata specification. They curated 277,000 SARS-CoV-2 genomic records for the national database at the National Microbiology Laboratory (NML) and 376,154 samples for the public database at the VirusSeq Portal, making high-quality SARS-CoV-2 genomic epidemiology data nationally and publicly available.
The project team also developed, and continues to manage, DataHarmonizer, an open-source tool that facilitates fast data harmonization. It is available on GitHub and has been adopted by groups outside of CanCOGeN. The tool could potentially be adapted for use with other pathogens, such as monkeypox.
