ConGen Population Genomic Data Analysis Course/Workshop, South Africa 2025
Goals: To train students, postdocs, faculty, and agency researchers to understand and use population genetics principles and DNA data to improve biodiversity conservation and management. The course teaches participants to understand, analyze, and interpret genetic datasets, including microsatellites, SNPs, and genome-scale sequencing datasets (RADseq, amplicon-seq, targeted capture, WGseq). We will teach genome assembly and how to take raw sequencing reads through to genotyping (with and without a reference genome). The course will help bridge the gap between researchers and managers to improve conservation. This course is urgently needed given the biodiversity crisis and the recent Kunming-Montreal Global Biodiversity Framework, in which 196 parties "committed to reporting the status of genetic diversity for all species", wild and domestic (Mastretta-Yanes et al. 2024; Hoban et al. 2024).
Target audiences:
- Advanced undergrads and technicians
- Master's and Ph.D. students
- Postdocs, faculty, government researchers or PIs
- Conservation, wildlife, and plant managers (especially for the introductory lecture on day 1)
When: Sunday-to-Friday, December 7-12 (plus a field trip to Kruger National Park Dec 13-16).
Where:
Register:
Recommended knowledge and background:
- Understanding of DNA markers (Microsatellites, SNPs, RADseq, WGseq), population genetic diversity metrics (H, allelic richness, haplotype diversity) and mechanisms of evolutionary change: genetic drift, gene flow, selection, & mutation.
- Understanding of population genetics concepts and tests (effective population size, inbreeding depression, testing for Hardy-Weinberg (HW) proportions and linkage disequilibrium, etc.). See Chapters 5-9 in the Allendorf et al. (2022) book. Ask the instructors to send you the book's Chapter 5 and the Appendix to help you prepare, or get them here.
- Experience using R and Linux. You can learn some basic R skills at our introductory Zoom lectures the weeks before the course (see Tentative Course Agenda Link).
- Participants should understand English (written, spoken).
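As a small warm-up for the diversity metrics and HW testing listed above, here is a minimal Python sketch (not course material). It assumes a single biallelic locus with genotype counts for AA, Aa, and aa. The function name and the example counts are hypothetical, and the chi-square statistic is the simple textbook goodness-of-fit version, without the exact tests or multi-locus corrections that dedicated programs provide.

```python
def hardy_weinberg_summary(n_AA, n_Aa, n_aa):
    """Summarise one biallelic locus from genotype counts (illustrative helper)."""
    n = n_AA + n_Aa + n_aa
    if n == 0:
        raise ValueError("at least one genotyped individual is required")

    # Allele frequency of A: each AA carries two copies, each Aa carries one.
    p = (2 * n_AA + n_Aa) / (2 * n)
    q = 1.0 - p

    # Genotype counts expected under Hardy-Weinberg proportions.
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_AA, n_Aa, n_aa)

    # Chi-square goodness-of-fit statistic (1 degree of freedom for 2 alleles).
    chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)

    return {
        "p": p,
        "H_obs": n_Aa / n,     # observed heterozygosity
        "H_exp": 2 * p * q,    # expected heterozygosity
        "chi_sq": chi_sq,
    }


if __name__ == "__main__":
    # Hypothetical counts: a strong heterozygote deficit.
    print(hardy_weinberg_summary(50, 0, 50))
```

A chi-square value above about 3.84 (1 degree of freedom, 5% level) would suggest a departure from HW proportions at that locus, although with small samples an exact test is usually preferred.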
Main instructors:
Eric Anderson (NMFS, Colorado State U), Ellie Armstrong (UC Riverside), Paulette Bloomer (U Pretoria), Gordon Luikart (U Montana), Will Hemstrom (Colorado State University), Monica Mwale (SANBI Pretoria), Jessica Da Silva (SANBI Cape Town), Paul Grobler (University of Free State), Paul Hohenlohe (U Idaho), Marty Kardos (NOAA, NMFS), Laura Bertola (National Centre for Biological Sciences (NCBS) in Bangalore, India), Robin Waples (NMFS/U Washington), Sandi Willows-Munro (U. of KwaZulu-Natal). Additional instructors to be announced. Companies like Nanopore and Diplomics also contribute training and sponsorship.
Workshop content:
ConGen teaches fundamental statistical and computational approaches that help prepare students and professionals to use population genetic and genomic data in their work. Microsatellite and SNP datasets will be discussed and analyzed, with forensics and other applications (individual ID and match probabilities). Emphasis will be on next-generation sequence (NGS) data analysis (RADs and genome sequencing/resequencing) and interpretation of output from fundamental and novel statistical approaches and software programs (including R and the Linux command line). The course promotes interactions among early-career researchers (e.g., grad students and postdocs), mid-career faculty and agency researchers, and leaders in population genomics to help develop our "next generation" of conservation and evolutionary geneticists. We will identify and discuss developments needed to improve data analysis approaches to advance the field. This course often feels like a workshop because multiple instructors ask questions and provide helpful comments during another instructor's lecture to help advance learning of basic and advanced concepts and approaches.
This course will cover concepts and methods including the coalescent, Bayesian, and likelihood-based approaches. Special lecture sessions and hands-on exercises will be conducted on assessing population structure, testing for HW proportions, detecting selection, genetic monitoring (of Ne, FST, Nm, etc.), inbreeding detection (RoH), population assignment (with microsatellites then lcWGseq data), whole-genome sequencing & assembly, phylogeny construction & phylogenomics, and more.
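To give a flavor of one monitoring statistic named above, the following Python sketch computes Wright's FST for a single biallelic locus as (HT - HS) / HT. It is only an illustration under simplifying assumptions: subpopulations are weighted equally, allele frequencies are treated as known, and no sample-size correction is applied (estimators used in practice, such as Weir and Cockerham's, do correct for this). The function names and example frequencies are hypothetical.

```python
def expected_heterozygosity(p):
    """Expected heterozygosity at a biallelic locus with allele frequency p."""
    return 2.0 * p * (1.0 - p)


def fst_biallelic(subpop_freqs):
    """Wright's FST = (HT - HS) / HT with equally weighted subpopulations."""
    if not subpop_freqs:
        raise ValueError("need at least one subpopulation allele frequency")
    k = len(subpop_freqs)

    # HS: mean expected heterozygosity within subpopulations.
    h_s = sum(expected_heterozygosity(p) for p in subpop_freqs) / k

    # HT: expected heterozygosity of the pooled (mean) allele frequency.
    p_bar = sum(subpop_freqs) / k
    h_t = expected_heterozygosity(p_bar)

    if h_t == 0.0:
        return 0.0  # locus is fixed everywhere; no differentiation to measure
    return (h_t - h_s) / h_t


if __name__ == "__main__":
    # Hypothetical allele frequencies in two subpopulations.
    print(fst_biallelic([0.2, 0.8]))
```

Identical frequencies in every subpopulation give FST = 0, and fixation for different alleles gives FST = 1.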
We will use popular programs like Structure, NeEstimator, and packages in RStudio. We'll analyze datasets (hands-on) using key software packages (including GenePop, Structure, bottleneck programs, GeneClass, Rubias, WGSassign, etc.). We'll discuss approaches for detecting illegal trafficking and quantifying dispersal of individuals using assignment tests. Finally, participants will learn to assess Ne without (and with) genetic data to help countries address the recent Kunming-Montreal Global Biodiversity Framework adopted by the UN Convention on Biological Diversity (CBD) (see Mastretta-Yanes et al. 2024; Hoban et al. 2024).
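The idea behind a frequency-based assignment test can be sketched in a few lines of Python. This is a simplified illustration, not the algorithm implemented in GeneClass, Rubias, or WGSassign. It assumes HW proportions within each candidate population, independent loci, and known allele frequencies. The population names, allele labels, and frequencies are invented for the example.

```python
import math


def genotype_log_likelihood(genotype, allele_freqs):
    """Log-likelihood of a multilocus genotype in one population.

    genotype: list of (allele1, allele2) tuples, one per locus.
    allele_freqs: list of dicts (allele -> frequency), one per locus.
    """
    total = 0.0
    for (a1, a2), freqs in zip(genotype, allele_freqs):
        p1 = freqs.get(a1, 0.0)
        p2 = freqs.get(a2, 0.0)
        # HW genotype probability: p^2 for homozygotes, 2pq for heterozygotes.
        prob = p1 * p1 if a1 == a2 else 2.0 * p1 * p2
        if prob == 0.0:
            return float("-inf")  # genotype impossible under these frequencies
        total += math.log(prob)
    return total


def assign_individual(genotype, populations):
    """Return the most likely source population and all log-likelihoods."""
    scores = {name: genotype_log_likelihood(genotype, freqs)
              for name, freqs in populations.items()}
    best = max(scores, key=scores.get)
    return best, scores


if __name__ == "__main__":
    # Hypothetical allele frequencies at two loci in two populations.
    populations = {
        "north": [{"A": 0.9, "a": 0.1}, {"B": 0.8, "b": 0.2}],
        "south": [{"A": 0.2, "a": 0.8}, {"B": 0.3, "b": 0.7}],
    }
    individual = [("A", "A"), ("B", "b")]
    print(assign_individual(individual, populations))
```

Real assignment software also handles alleles missing from the reference samples, sampling error in the allele frequencies, and genotype uncertainty in low-coverage data, all of which this sketch ignores.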
Sponsors and course publications:
This course is sponsored by the American Genetic Association, the Journal of Heredity, NASA (the National Aeronautics and Space Administration), NSF, along with Nanopore and support from publishers of Environmental DNA, Molecular Ecology Resources, and Conservation Genetics. This course/workshop should lead to a publication (meeting review) describing the course's main topics, course outcomes, genome assembly (quality testing), and recent advances in population genetics and phylogenetics worldwide. For example, see pubs below:
Registration fees: African nationals (at an African institute) get a reduced registration fee thanks to scholarships (ask ConGen organizers). Non-African nationals pay $800 if they pay by Friday, September 19th (so apply and pay early!). The cost for non-African nationals is $900 if you pay after September 19th.
Click for a Google sheets link for additional scholarship/fellowship opportunities (NOT associated with the course!)
Lodging and food: Participants will pay for their lodging ($30-$60 per night) and food ($5-30 per meal), both available within walking distance from the course. This course’s web page can advise on lodging and restaurants. Lodging costs less if you have a roommate. Ideally, you should get lodging on campus so it’s quick and easy to get to the lecture hall, including for evening work sessions (hands-on) if you can attend in the evenings.
Recommended lodging: (can be arranged through course organizers)
Visas: check as soon as possible whether you need a visa, and apply if needed; we can provide an invitation letter. North Americans and Europeans do NOT need a visa. Citizens of some African countries need a visa to enter South Africa.
Field trips (stay tuned for more information):
Kruger National Park, approximately December 13-16. We'll drive (on safari) around the southern half of the park, around the Skukuza area, with local experts. Costs are uncertain but likely $80 to $150 per day per person, including transportation (depending on whether you share a room and what you spend on food). Room costs are likely R380 (shared with 1 or 2 other people). You'll purchase your own food; we'll cut costs by bringing some food. It's a 5-hour (interesting) drive to the park. We'll drive in spacious 8-passenger University vehicles. Register here for this trip to reserve your place (limit: 20-22 people, in 3-4 vehicles). Several main ConGen instructors are going.
This course is held in collaboration with SANBI. See .
References cited:
Allendorf, F.W., W.C. Funk, S.N. Aitken, M. Byrne, G. Luikart. 2022. Conservation and the Genomics of Populations. [3rd Edition]. Oxford University Press. Pp. 784
Bailey, R. I. (2024). Bayesian hybrid index and genomic cline estimation with the R package gghybrid. Molecular Ecology Resources, 24(2), e13910.
Jombart T. and Ahmed I. (2011) adegenet 1.3-1: new tools for the analysis of genome-wide SNP data. Bioinformatics. doi: 10.1093/bioinformatics/btr521
Hemstrom, W., & Jones, M. (2023). snpR: User friendly population genomics for SNP data sets with categorical metadata. Molecular Ecology Resources, 23(4), 962-973.
Hoban, S., Paz-Vinas, I., Shaw, R. E., Castillo-Reina, L., Silva, J. M., DeWoody, J. A., ... & Grueber, C. E. (2024). DNA-based studies and genetic diversity indicator assessments are complementary approaches to conserving evolutionary potential. Conservation Genetics, 1-7.
Jenkins, T. L. (2024). mapmixture: An R package and web app for spatial visualisation of admixture and population structure. Molecular Ecology Resources, 24(4), e13943.
Kamvar, Z. N., López-Uribe, M. M., Coughlan, S., Grünwald, N. J., Lapp, H., & Manel, S. (2016). Developing educational resources for population genetics in R: An open and collaborative approach. Molecular Ecology Resources. https://doi.org/10.1111/1755-0998.12558
Kamvar ZN, Brooks JC and Grünwald NJ (2015) Novel R tools for analysis of genome-wide population genetic data with emphasis on clonality. Front. Genet. 6:208. doi: 10.3389/fgene.2015.00208
Kardos, M., and G. Luikart. 2021. The genomic architecture of fitness drives population viability in changing environments. American Naturalist, 197:511–525. doi.org/10.1086/713469
Kardos M, Armstrong EE, Fitzpatrick SW, Hauser S, Hedrick PW, Miller JM, Tallmon DA, Funk WC. The crucial role of genome-wide genetic variation in conservation. (2021) Proc Natl Acad Sci USA. 118(48):e2104642118. doi: 10.1073/pnas.2104642118.
Mastretta‐Yanes, A., Da Silva, J. M., Grueber, C. E., Castillo‐Reina, L., Köppä, V., Forester, B. R., ... & Hoban, S. (2024). Multinational evaluation of genetic diversity indicators for the Kunming‐Montreal Global Biodiversity Framework. Ecology Letters, 27(7), e14461.
Paradis, E. (2020). Population genomics with R. Chapman and Hall/CRC.
Zhang, R., Jia, G., & Diao, X. (2023). geneHapR: an R package for gene haplotypic statistics and visualization. BMC bioinformatics, 24(1), 199.
Yang, C., Mai, J., Cao, X., Burberry, A., Cominelli, F., & Zhang, L. (2023). ggpicrust2: an R package for PICRUSt2 predicted functional profile analysis and visualization. Bioinformatics, 39(8), btad470.