Jeremy Wang, Ph.D.
Department of Genetics
jeremy <at> TLD


My research focuses on metagenomics and long-read (single-molecule) informatics. We develop novel methods that enable more efficient *omic analysis, and we apply carefully architected high-performance computing approaches to improve the utility of genomics in studies of human disease.


Long-read metagenomics, selective nanopore sequencing, and mucosa-associated gut microbiota in inflammatory bowel disease

Single-molecule sequencing of metagenomes produces more specific taxonomic and genic characterization than standard NGS, but challenges remain due to relatively low throughput and high error rates, particularly for host-associated microbial communities. We have developed novel sequencing and informatics approaches to increase throughput and classification accuracy in both synthetic and human tissue-associated microbiota.

Hybrid de novo genome assembly

Hybrid assembly combines next-generation (short) reads with single-molecule (PacBio, Oxford Nanopore) long reads. We have developed methods for rapid overlapping and hybrid error correction of long, error-prone reads. Through a number of collaborations, we have used these methods and others to produce high-quality assemblies for several bacteria, plants, fungi, water bears, and various Diptera.

Long-read amplicon clustering/denoising

Many amplicon sequencing-based studies would benefit dramatically from the increased read length provided by single-molecule sequencing, including 16S rDNA analysis, HLA typing, and viral evolution experiments. However, deconvolving within-sample variation among full-length amplicons sequenced at a relatively high error rate is impossible using existing methods. We are developing a method that derives consensus subpopulations via unsupervised clustering and can distinguish unique amplicon populations whose sequence divergence is far below the sequencing error rate.