Environment

Global scale metabolic pathway reconstruction from environmental genomes

285MPR
  • Project Leaders: Steve Hallam
  • Institutions: University of British Columbia (UBC)
  • Budget: $1,063,953
  • Program/Competition: Bioinformatics and Computational Biology Competitions
  • Genome Centre(s): Genome Canada
  • Fiscal Year: 2018
  • Status: Closed

For more than 3.5 billion years, microorganisms have been the dominant form of life on Earth, mediating global fluxes of matter and energy. Over the past decade, high-throughput omics platforms have generated data (DNA, RNA, protein and metabolites) that carry information about the function and identity of microbial life. These data have transformed our perception of this microcosmos, illuminating microbial dark matter and conceptually linking microorganisms at the individual, population and community levels to a wide range of ecosystem functions and services. However, turning these data into useful knowledge has been a challenge: a lack of scalable software tools to mine, monitor and interact with environmental sequence information has limited knowledge creation and translation.

With the goal of exploring emergent metabolism, the project developed a software platform called MetaPathways (MP) and a data portal called the Environmental Genome Encyclopedia (EngCyc, https://engcyc.org/) to support metabolic pathway assessment. The team designed, built and tested EngCyc, a repository of metabolic models known as environmental Pathway Genome Databases (ePGDBs) built from microbiome sequence information. EngCyc’s core processes are supported by MetaPathways, which permits visualization and analysis at the individual, population and community levels. MetaPathways was deployed on grid and cloud computing infrastructures to generate ePGDBs and associated data products that facilitate gene and pathway discovery. Additionally, the team developed downstream analysis modules that offer user-friendly data exploration features to enhance knowledge generation and data interpretation.

This project led to the design and beta-testing of a scalable Software-as-a-Service (SaaS) pipeline that was commercialized through Koonkie Canada Inc. and the Digital Supercluster Mining Microbiome Analytics Platform (M-MAP). The project also established a robust user community, driven by use cases in the biorefining, mining and energy sectors. The resulting software modules and portal system have broadened access to the comparative and pathway-centric forms of microbiome sequence analysis needed to explore and harness the metabolic problem-solving power of microbial communities.