Using statistical alignment to elucidate the impact of selection on indels in protein-coding sequences

College of Science Department of Ecology and Evolutionary Biology

When

3 – 4 p.m., March 11, 2024

Where

Environment and Natural Resources 2 (ENR2) Building, Room S107
1064 E. Lowell St., Tucson 85719

or

Join Virtually

Event Information

Insertions and deletions, collectively called indels, are an important source of genetic variation, but they have not been studied as extensively as single-nucleotide mutations because of limitations in bioinformatic software. For example, in protein-coding sequences, indels can occur in three different phases (1, 2 and 3) depending on where they start in a codon; however, existing alignment strategies for protein-coding sequences typically assume that indels occur only between codons (phase 3). As a result, large-scale analyses of indel phases across the tree of life have not been done until now.
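
To make the idea of indel phase concrete, here is a minimal Python sketch that labels each gap in a gapped pairwise alignment with a phase. It is an illustration only and is not COATi's algorithm or interface. The function names `indel_phase` and `gap_phases` are hypothetical, and the code assumes that the reference sequence starts at the first position of a codon and that phases 1 and 2 mean the indel begins after the first or second nucleotide of a codon; the abstract itself only states that phase 3 lies between codons.

```python
def indel_phase(ref_nucleotides_before: int) -> int:
    """Phase of an indel given how many reference nucleotides precede it.

    Assumed convention: phase 3 = between codons (as in the abstract),
    phase 1 / 2 = after the first / second nucleotide of a codon.
    """
    offset = ref_nucleotides_before % 3
    return 3 if offset == 0 else offset


def gap_phases(ref_aln: str, qry_aln: str) -> list[tuple[str, int, int]]:
    """Return (kind, length, phase) for each gap run in a pairwise alignment.

    Both strings must have equal length and use '-' for gaps. The reference
    is assumed to be in frame, starting at codon position 1.
    """
    if len(ref_aln) != len(qry_aln):
        raise ValueError("aligned sequences must have the same length")

    events = []
    ref_pos = 0  # reference nucleotides consumed so far
    i, n = 0, len(ref_aln)
    while i < n:
        if ref_aln[i] == "-":
            # Gap in the reference: an insertion relative to the reference.
            j = i
            while j < n and ref_aln[j] == "-":
                j += 1
            events.append(("insertion", j - i, indel_phase(ref_pos)))
            i = j
        elif qry_aln[i] == "-":
            # Gap in the query: a deletion relative to the reference.
            j = i
            while j < n and qry_aln[j] == "-" and ref_aln[j] != "-":
                j += 1
            events.append(("deletion", j - i, indel_phase(ref_pos)))
            ref_pos += j - i
            i = j
        else:
            ref_pos += 1
            i += 1
    return events


if __name__ == "__main__":
    # A deletion that removes one whole codon, between codons.
    print(gap_phases("ATGAAACCC", "ATG---CCC"))
    # The same length of deletion, shifted one nucleotide into a codon.
    print(gap_phases("ATGAAACCC", "ATGA---CC"))
```

Under these assumptions the first alignment yields a phase-3 deletion and the second a phase-1 deletion, even though both remove three nucleotides; an aligner that only permits gaps between codons could represent the first but not the second.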

To address this need, we present COATi, a statistical, codon-aware pairwise aligner that supports complex insertion-deletion models and can handle artifacts present in genomic data. COATi allows researchers to generate more accurate sequence alignments while reducing the amount of data discarded during curation.

We have used COATi to quantify how selection shapes indels in coding sequences across the tree of life, using a comparative genomic dataset of 90 species pairs representing bacteria, archaea, protists and multicellular eukaryotes. We have found evidence that the genome-wide patterns of indel phases in coding sequences are rather consistent across the tree of life. Additionally, we have analyzed our data at the per-gene level and found that genes with stronger selection against non-synonymous single-nucleotide mutations also show evidence of stronger selection against indels that change an additional amino acid.

More information on Ecology and Evolutionary Biology seminars

Presenter Details

Reed Cartwright, PhD
Associate Professor, School of Life Sciences
Associate Professor, Biodesign Center for Personalized Diagnostics
Associate Professor, Mechanisms of Evolution Researchers
Arizona State University

Cartwright is an associate professor in the School of Life Sciences and the Biodesign Institute at Arizona State University. He runs a lab focused on research software engineering and on using computational and statistical approaches to study evolutionary processes from genomic data. Recent research in his lab has focused on improving alignment algorithms for coding sequences, improving mutation detection for non-model and emerging model organisms and understanding the evolution of indels across the tree of life. He is also interested in improving pedagogy and is a Carpentry Instructor Trainer with The Carpentries, an organization that focuses on using evidence-based pedagogy to improve the computational and data analysis skills of researchers.