Experience
Computer Specialist
Institute for Genomics, Biocomputing & Biotechnology
February 2021–Present
- Analyze client/collaborator data with current bioinformatics software and pipelines
- Develop custom reports to better visualize and explain complex data
- Develop scripts to customize analyses
- Develop software to meet needs of collaborators
- Draft manuscripts and assist with manuscript production
- Train student workers as needed
- Develop training material for the Atlas supercomputing cluster
- Lead training workshops for the Atlas supercomputing cluster
- Support USDA-ARS researchers on the Atlas supercomputing cluster
Research Associate II
Institute for Genomics, Biocomputing & Biotechnology
April 2019–February 2021
- Analyze client/collaborator data with current bioinformatics software and pipelines
- Develop custom reports to better visualize and explain complex data
- Develop scripts to customize analyses
- Develop software to meet needs of collaborators
- Draft manuscripts and assist with manuscript production
- Train student workers as needed
Research Associate I
Institute for Genomics, Biocomputing & Biotechnology
August 2015–April 2019
- Analyze client/collaborator data with current bioinformatics software and pipelines
- Develop scripts to customize analyses
- Develop software to meet needs of collaborators
- Draft manuscripts and assist with manuscript production
- Assist with training student workers as needed
Education
PhD, Computer Science
Mississippi State University
August 2019
Dissertation
A machine learning approach to genome assessment
An exploration of current methods of genome assembly assessment, where these methods fall short, and how machine learning can be used to model expert knowledge for genome assessment
Highlighted Coursework
- Machine Learning
- Visual Data Analysis with R
- Data Information and Visualization
- Directed Individual Study of Genome Assembly
MS, Computer Science
Mississippi State University
May 2016
Highlighted Coursework
- Essentials of Molecular Genetics
- Genomes and Genomics
- High Throughput Sequence Analysis
Certificate, Computational Biology
This certificate requires students to take a combination of five computer science and biological sciences courses to build their understanding of how to combine the two disciplines.
BS, Software Engineering
Mississippi State University
May 2014
Programs
PAST
- Institute for Genomics, Biocomputing & Biotechnology
- USDA-ARS
In recent years, a method for interpreting genome-wide association study (GWAS) data using metabolic pathway analysis was developed and successfully used to find significant pathways and mechanisms explaining phenotypic traits of interest in plants. The scripts implementing this method were both difficult to use and slow to run, sometimes taking longer than 24 hours. PAST (Pathway Association Study Tool), a new implementation of this method, was developed to address these concerns and is implemented as a package for the R language. Two user interfaces are provided: a console interface and an R Shiny application. PAST completed analyses in approximately half an hour to one hour and produced the same results as the previously developed method. Thus, to promote a powerful new pathway analysis methodology that interprets GWAS data to find biological mechanisms associated with traits of interest, we developed a more accessible, efficient, and user-friendly tool. PAST is available on GitHub, Bioconductor, and at MaizeGDB.
Keanu
- Institute for Genomics, Biocomputing & Biotechnology
- US Army ERDC
One of the main challenges when analyzing complex metagenomics data is the fact that large amounts of information need to be presented in a comprehensive and easy-to-navigate way. In the process of analyzing FASTQ sequencing data, visualizing which organisms are present in the data can be useful, especially with metagenomics data or data suspected to be contaminated. Keanu, a tool for exploring sequence content, helps a user to understand the presence and abundance of organisms in a sample by analyzing alignments against a database that contains taxonomy data and displaying them in an interactive web page. The content of a sample can be presented either as a collapsible tree or as a bilevel partition graph. Keanu is freely available at GitHub.
Quack
- Institute for Genomics, Biocomputing & Biotechnology
The quality of data generated by high-throughput DNA sequencing tools must be rapidly assessed in order to determine how useful the data may be in making biological discoveries; higher quality data leads to more confident results and conclusions. Due to the ever-increasing size of data sets and the importance of rapid quality assessment, tools that analyze sequencing data should quickly produce easily interpretable graphics. Quack addresses these issues by generating information-dense visualizations from FASTQ files at a speed far surpassing other publicly available quality assurance tools in a manner independent of sequencing technology. Quack is freely available at GitHub.
Skills
Languages
Programming | Web | Markup |
---|---|---|
Python | HTML | LaTeX |
R | JavaScript | Markdown |
Rust | CSS | orgmode |
Communication
- data visualization
- writing documentation
- creating training materials and leading workshops
- writing manuscripts
Technical
- software and analysis pipeline development in high-performance computing environments
- data organization
- project management via git
- reproducible projects via orgmode, software containers, and LMOD
Bioinformatics
- differential expression and functional analysis
- genome assembly and annotation
Awards and Honors
Featured Article
G3: Genes | Genomes | Genetics
2023
Best Paper Published in 2020
ALTEX
2021
Innovation Award
USDA-ARS
2020
Publications
Poudel, S.; Jia, L.; Arick, M.A., II; Hsu, C.-Y.; Thrash, A.; Sukumaran, A.T.; Adhikari, P.; Kiess, A.S.; Zhang, L. 2023. In silico prediction and expression analysis of vaccine candidate genes of Campylobacter jejuni, Poultry Science. DOI: 10.1016/j.psj.2023.102592
Arick, M.A., II; Grover, C.E.; Hsu, C.-Y.; Magbanua, Z.; Pechanova, O.; Miller, E.R.; Thrash, A.; Youngblood, R.C.; Ezzell, L.; Alam, M.S.; Benzie, J.A.H.; Hamilton, M.G.; Karsi, A.; Lawrence, M.L.; Peterson, D.G. 2023. A high-quality chromosome-level genome assembly of rohu carp, Labeo rohita, and its utilization in SNP-based exploration of gene flow and sex determination, G3: Genes, Genomes, Genetics. DOI: 10.1093/g3journal/jkad009
Grover, C.E.; Arick, M.A.; Thrash, A.; Sharbrough, J.; Hu, G.; Yuan, D.; Snodgrass, S.; Miller, E.R.; Ramaraj, T.; Peterson, D.G.; Udall, J.A.; Wendel, J.F. 2022. Dual Domestication, Diversity, and Differential Introgression in Old World Cotton Diploids, Genome Biology and Evolution. DOI: 10.1093/gbe/evac170
Warburton, M.L.; Jeffers, D.; Smith, J.S.; Scapim, C.; Uhdre, R.; Thrash, A.; Williams, W.P. 2022. Comparative Analysis of Multiple GWAS Results Identifies Metabolic Pathways Associated with Resistance to A. flavus Infection and Aflatoxin Accumulation in Maize, Toxins. DOI: 10.3390/toxins14110738
Poudel, S.; Li, T.; Arick, M.A.; Hsu, C.-Y.; Thrash, A.; Sukumaran, A.T.; Adhikari, P.; Kiess, A.S.; Zhang, L. 2022. Complete Genome Sequences of Four Campylobacter jejuni Strains Isolated from Retail Chicken Meat and Broiler Feces, Microbiology Resource Announcements. DOI: 10.1128/mra.00898-22
Tekedar, H.C.; Arick, M.A., II; Hsu, C.-Y.; Thrash, A.; Blom, J.; Lawrence, M.L.; Abdelhamed, H. 2020. Identification of Antimicrobial Resistance Determinants in Aeromonas veronii Strain MS-17-88 Recovered From Channel Catfish (Ictalurus punctatus), Frontiers in Cellular and Infection Microbiology. DOI: 10.3389/fcimb.2020.00348
Thrash, A.; Hoffmann, F.; Perkins, A. 2020. Toward a more holistic method of genome assembly assessment, BMC Bioinformatics. DOI: 10.1186/s12859-020-3382-4
Thrash, A.; Warburton, M.L. 2020. A pathway association study tool for GWAS analyses of metabolic pathway information, Journal of Visualized Experiments. DOI: 10.3791/61268
Rycroft, T.E.; Foran, C.M.; Thrash, A.; Cegan, J.C.; Zollinger, R.; Linkov, I.; Perkins, E.J.; Garcia-Reyero, N. 2020. AOPERA: A proposed methodology and inventory of effective tools to link chemicals to adverse outcome pathways, ALTEX. DOI: 10.14573/altex.1906201
Thrash, A.; Tang, J.D.; Deornellis, M.; Peterson, D.G.; Warburton, M.L. 2020. PAST: The pathway association studies tool to infer biological meaning from GWAS datasets, Plants. DOI: 10.3390/plants9010058
Li, H.; Thrash, A.; Tang, J.D.; He, L.; Yan, J.; Warburton, M.L. 2019. Leveraging GWAS data to identify metabolic pathways and networks involved in maize lipid biosynthesis, Plant Journal. DOI: 10.1111/tpj.14282
Thrash, A.; Arick, M., II; Barbato, R.A.; Jones, R.M.; Douglas, T.A.; Esdale, J.; Perkins, E.J.; Garcia-Reyero, N. 2019. Keanu: A novel visualization tool to explore biodiversity in metagenomes, BMC Bioinformatics. DOI: 10.1186/s12859-019-2629-4
Grover, C.E.; Arick, M.A., II; Thrash, A.; Conover, J.L.; Sanders, W.S.; Peterson, D.G.; Frelichowski, J.E.; Scheffler, J.A.; Scheffler, B.E.; Wendel, J.F. 2019. Insights into the evolution of the New World diploid cottons (Gossypium, subgenus Houzingenia) based on genome sequencing, Genome Biology and Evolution. DOI: 10.1093/gbe/evy256
Thrash, A.; Arick, M., II; Peterson, D.G. 2018. Quack: A quality assurance tool for high throughput sequence data, Analytical Biochemistry. DOI: 10.1016/j.ab.2018.01.028
Warburton, M.L.; Womack, E.D.; Tang, J.D.; Thrash, A.; Smith, J.S.; Xu, W.; Murray, S.C.; Williams, W.P. 2018. Genome-wide association and metabolic pathway analysis of corn earworm resistance in Maize, Plant Genome. DOI: 10.3835/plantgenome2017.08.0069
Grover, C.E.; Arick, M.A.; Conover, J.L.; Thrash, A.; Hu, G.; Sanders, W.S.; Hsu, C.-Y.; Naqvi, R.Z.; Farooq, M.; Li, X.; Gong, L.; Mudge, J.; Ramaraj, T.; Udall, J.A.; Peterson, D.G.; Wendel, J.F. 2017. Comparative Genomics of an Unusual Biogeographic Disjunction in the Cotton Tribe (Gossypieae) Yields Insights into Genome Downsizing, Genome Biology and Evolution. DOI: 10.1093/gbe/evx248
Rice, J.; Dees, K.; Perkins, A.; Thrash, A. 2015. Investigating genome similarity through cross mapping percentage, BCB 2015 - 6th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics. DOI: 10.1145/2808719.2811456