Poster Presentation Society for Molecular Biology and Evolution Conference 2016

FINGERPRINT: Computational filtering of targeted sequences from environmental contaminants (#308)

Michael S Rosenberg 1
  1. Arizona State University, Tempe, AZ, United States

DNA sequencing is often performed on mixed samples in NextGen laboratories, either intentionally, as in metagenomics, or unintentionally, as when the material of interest is mixed with environmental contaminants. The latter is a particular problem in ancient DNA (aDNA) projects, where differential preservation makes contamination difficult to avoid. The problem is substantially worse when the material of interest is a bacterium or virus that (a) cannot be readily separated or purified prior to sequencing and (b) is likely to resemble common environmental contaminants. A number of pre- and post-sequencing protocols have been developed to sort or filter NGS reads from mixed samples, but these have proven inadequate for removing closely related contaminants from aDNA samples. FINGERPRINT is a new, simple bioinformatics approach based on targeted k-mer genome profiling that more readily filters NGS reads into desired and contaminant bins prior to sequence assembly. Data from an ancient tuberculosis sequencing project are used to illustrate the power and efficacy of the method.
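
The abstract does not describe how FINGERPRINT works internally, so the following is only a minimal sketch of the general idea of k-mer profile filtering, not the FINGERPRINT algorithm itself. It assumes that a read can be binned by the fraction of its k-mers that occur in the target genome but not in known contaminant genomes. The function names, the `min_fraction` threshold, and the toy sequences are all hypothetical.

```python
def kmers(seq, k):
    """Yield every overlapping k-mer of length k in seq (uppercased)."""
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def build_profile(genomes, k):
    """Return the set of all k-mers found in a list of reference sequences."""
    profile = set()
    for genome in genomes:
        profile.update(kmers(genome, k))
    return profile


def classify_read(read, target_only, k, min_fraction=0.5):
    """Bin a read as 'target' or 'contaminant'.

    Assumption (not from the abstract): a read is kept as 'target' when at
    least min_fraction of its k-mers are target-specific. Reads shorter
    than k carry no k-mers and are conservatively binned as 'contaminant'.
    Reverse-complement k-mers are ignored here for brevity.
    """
    total = 0
    hits = 0
    for kmer in kmers(read, k):
        total += 1
        if kmer in target_only:
            hits += 1
    if total == 0:
        return "contaminant"
    return "target" if hits / total >= min_fraction else "contaminant"


# Toy data: two short "genomes" sharing a common prefix (hypothetical).
K = 4
target_profile = build_profile(["ACGTACGTTAGC"], K)
contaminant_profile = build_profile(["ACGTACGTCCGA"], K)

# Keep only k-mers that distinguish the target from the contaminant.
target_only = target_profile - contaminant_profile

reads = ["CGTTAGC", "CGTCCGA", "ACGTACG"]
bins = {read: classify_read(read, target_only, K) for read in reads}
```

In this toy case the third read consists entirely of k-mers shared by both genomes, so it carries no target-specific signal and falls into the contaminant bin. A real filter would need a policy for such ambiguous reads, as well as handling of both strands and sequencing errors.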