The reduced cost and increased quality of DNA sequencing technology, coupled with an ever-expanding collection of experimental results, allow for the regular sequencing of individual human genomes, which will bring social and economic benefits through improved, lower-cost healthcare. There is currently a global human reference genome based on the combined genetic material of 13 anonymous individuals. The availability of a reference genome is critical for the experimental methods used for sequencing DNA. Much as a jigsaw puzzle is solved far more efficiently with a picture of the completed puzzle to guide placement, sequence analysis is far more efficient when a reference genome is available to guide the placement of sequenced fragments. Improvements in the speed and accuracy of DNA sequence analysis will allow genetics to be used regularly in a clinical setting, leading to highly personalised and preventative medicine that draws on existing knowledge of actionable disease-related genes.
This project aims to implement a novel graph model, the GNOmics Genome Model (GGM), for representing the human genome and other genetic data. GNOmics, an acronym for Graphs ‘N’ Omics, is the brand name for the research centred on this underlying graph model. The project will develop a robust new computational framework for the analysis of genetic variation. The framework will include a unified reference database of publicly available genetic data and will replace text-based reference genomes. The GGM incorporates significantly more information when representing genomic data, potentially allowing for improved accuracy without compromising speed or dramatically increasing the strain on computational resources.