Date of Graduation
5-2025
Document Type
Thesis
Degree Name
Master of Science in Cell & Molecular Biology (MS)
Degree Level
Graduate
Department
Cell & Molecular Biology
Advisor/Mentor
Adams, Richard
Committee Member
Alverson, Andrew J.
Second Committee Member
Pummill, Jeff F.
Keywords
Comparative Biology; Evolution; Phylogeny
Abstract
Recent advances in computational biology, artificial intelligence, and sequencing technologies have enabled new perspectives for addressing longstanding questions in evolutionary biology. Increased computational power has made large-scale simulations feasible for investigating diverse aspects of evolutionary and phylogenetic modeling. Similarly, deep learning algorithms now allow for the prediction of reasonably accurate tertiary protein structures, unlocking the potential for investigating the diversity and evolution of protein structure across a broad range of organisms. This thesis presents two studies that target different aspects of evolutionary inference from distinct perspectives. Yet they share an overarching goal of presenting new perspectives on the evolutionary basis of biodiversity from comparative analyses of protein and trait evolution.
Viewing molecular evolution through the lens of protein structural diversity, the second chapter leverages deep learning models to examine olfactory receptor (OR) evolution in longhorn beetles. That is, we sought to investigate the diversity and structure of proteins encoded within recently sequenced insect genomes using machine learning. Using two recently developed deep learning models, RoseTTAFold and AlphaFold, we predicted the tertiary structure of OR proteins in two cerambycid species. We then investigated diversity among these OR proteins and analyzed the relationship between structural and sequence-level evolutionary distances. These findings highlight the promise of deep learning models for gaining meaningful biological insights, particularly in systems where experimental resources are limited.
The third chapter addresses evolutionary biology at a broader comparative scale, evaluating how phylogenetic assumptions influence evolutionary conclusions using statistical regression approaches. Here, we focused on a core question of comparative biology: understanding how phylogenetic modeling choices influence statistical conclusions about trait evolution. Through large-scale simulation studies and analysis of an empirical dataset containing traits associated with longevity, we assessed the sensitivity of phylogenetic regression to tree choice. Across these analyses, tree selection emerged as an important factor influencing the behavior of phylogenetic regression. The results show that an incorrect tree choice, one that does not accurately represent the evolutionary history of a trait, can lead to significantly elevated false positive rates. This effect becomes more pronounced as the amount of data (species and traits) included in the analysis increases. These findings underscore the importance of thoughtful tree selection across studies in comparative biology. Together, these chapters highlight the diversity of questions and scales encompassed by evolutionary biology and contribute to a broader understanding of the field.
Citation
Duncan, M. (2025). Investigating Evolution Through the Lens of AI-driven Protein Exploration and Phylogenetic Modeling. Graduate Theses and Dissertations. Retrieved from https://scholarworks.uark.edu/etd/5765