Start Here
Purpose: This field guide documents considerations for building a biomarker discovery pipeline from RNA-seq meta-analysis, with emphasis on lessons learned, reproducibility, and practical recommendations for researchers working with noisy multi-study omics data.
Grant mission: Developing molecular resilience biomarkers to support selective breeding and management of shellfish aquaculture under disease pressure. (Full narrative: ProjectSummaryandNarrative.pdf)
What was done: Integrated RNA-seq datasets from Crassostrea virginica exposed to Perkinsus marinus, conducted differential abundance analyses across multiple independent studies, and developed a two-step classifier pipeline that identifies reproducible gene expression signatures.
Key results: Identified a 6-gene classifier panel that distinguishes tolerant from sensitive oyster phenotypes, validated by Leave-One-Study-Out (LOSO) cross-validation.
Intended application: This guide is a reusable template for researchers facing batch effects, weak signals, and overfitting risks in multi-study biomarker discovery.
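For readers new to Leave-One-Study-Out (LOSO) cross-validation, the idea is that each study is held out in turn, the classifier is trained on the remaining studies, and accuracy is measured only on the held-out study. The sketch below is a minimal illustration of that splitting logic, not the project's pipeline. The function names, the toy expression values, the study labels, and the nearest-centroid classifier are all assumptions made for the example.

```python
from collections import defaultdict
from statistics import mean


def loso_splits(study_ids):
    """Yield (held_out_study, train_idx, test_idx), one split per study."""
    for held_out in sorted(set(study_ids)):
        train_idx = [i for i, s in enumerate(study_ids) if s != held_out]
        test_idx = [i for i, s in enumerate(study_ids) if s == held_out]
        yield held_out, train_idx, test_idx


def fit_centroids(X, y):
    """Compute the per-class mean vector (a stand-in nearest-centroid model)."""
    rows_by_class = defaultdict(list)
    for row, label in zip(X, y):
        rows_by_class[label].append(row)
    return {
        label: [mean(col) for col in zip(*rows)]
        for label, rows in rows_by_class.items()
    }


def predict(centroids, row):
    """Return the label whose centroid is closest in squared Euclidean distance."""
    def sqdist(centroid):
        return sum((a - b) ** 2 for a, b in zip(row, centroid))
    return min(centroids, key=lambda label: sqdist(centroids[label]))


def loso_accuracy(X, y, study_ids):
    """Train on all studies but one, then score on the held-out study."""
    scores = {}
    for held_out, train_idx, test_idx in loso_splits(study_ids):
        centroids = fit_centroids(
            [X[i] for i in train_idx], [y[i] for i in train_idx]
        )
        hits = sum(predict(centroids, X[i]) == y[i] for i in test_idx)
        scores[held_out] = hits / len(test_idx)
    return scores


# Hypothetical toy data: 8 samples, 2 genes, 3 studies (not real measurements).
X = [[5.0, 1.0], [4.8, 1.2], [1.0, 5.0], [1.2, 4.9],
     [5.2, 0.9], [0.9, 5.1], [4.9, 1.1], [1.1, 5.2]]
y = ["tolerant", "tolerant", "sensitive", "sensitive",
     "tolerant", "sensitive", "tolerant", "sensitive"]
study_ids = ["A", "A", "A", "A", "B", "B", "C", "C"]

scores = loso_accuracy(X, y, study_ids)
print(scores)
```

Because no sample from the held-out study is seen during training, a per-study score like this gives a more honest estimate of cross-study generalization than random k-fold splits, which can leak study-specific batch effects into the training set.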
How to Navigate This Guide
| Section | What You'll Find |
|---|---|
| 1. Research Context & Problem Framing | Background on Dermo disease, datasets, phenotype definitions, and key constraints |
| 2. Process Narrative | Chronological account of decisions, pivots, and surprises: the honest story of what happened |
| 3. Big Lessons Learned | Distilled, numbered insights for researchers adapting this work |
| 4. Methods & Pipelines | Decision guide and validated analysis pipelines |
| 5. Analysis Code & Source Code | Where to find and how to run everything |
| 6. Glossary | Terms defined in the context of this project |
| 7. Sources & References | Notebook posts, GitHub issues, and external references |
Developed by Shelly Wanamaker and Steve Yost with AI assistance. Cite as: Wanamaker, S.A. and Yost, S. (2025). Resilience Biomarkers Field Guide. https://resilience-biomarkers-for-aquaculture.github.io/field-guide/