Portfolio
Featured Projects
scRNA-seq CRISPR Pipeline
Reproducible Python Pipeline for ML-Ready Perturbation Datasets
Production-grade Snakemake pipeline that transforms raw Perturb-seq data into harmonized, balanced, ML-ready datasets for AI-powered perturbation prediction.
Key Features:
▸ Automated QC + normalization (Scanpy)
▸ Smart class balancing (scikit-learn)
▸ Batch integration (Harmony/BBKNN)
▸ Leave-genes-out cross-validation
▸ Interactive Jupyter notebooks
▸ Comprehensive Quarto documentation
Tech Stack: Python 3.10, Snakemake, Scanpy, scikit-learn, Harmony, Quarto, GitHub Actions
Performance: Processes 10k cells (~500 MB) in ~15 minutes on a laptop (AMD Ryzen 5 7535HS, 16 GB RAM)
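The leave-genes-out idea listed above can be sketched in a few lines of plain Python. This is a minimal illustration, not the pipeline's actual code: the function name `leave_genes_out_splits`, the fold count, and the toy gene labels are all assumptions made for the example. Folds are built over perturbed genes rather than over cells, so a model is always evaluated on perturbations it never saw during training.

```python
import random
from collections import defaultdict


def leave_genes_out_splits(perturbed_genes, n_folds=5, seed=0):
    """Yield (train_idx, test_idx) so no perturbed gene appears in both sets.

    perturbed_genes: one label per cell naming the gene targeted in that cell.
    """
    # Group cell indices by the gene they were perturbed with.
    cells_by_gene = defaultdict(list)
    for idx, gene in enumerate(perturbed_genes):
        cells_by_gene[gene].append(idx)

    genes = sorted(cells_by_gene)
    if n_folds > len(genes):
        raise ValueError("n_folds cannot exceed the number of distinct genes")

    # Shuffle genes reproducibly, then deal them round-robin into folds.
    random.Random(seed).shuffle(genes)
    folds = [genes[i::n_folds] for i in range(n_folds)]

    for held_out in folds:
        held_out_set = set(held_out)
        test_idx = [i for g in held_out for i in cells_by_gene[g]]
        train_idx = [
            i for g in genes if g not in held_out_set for i in cells_by_gene[g]
        ]
        yield sorted(train_idx), sorted(test_idx)


# Toy example: six cells, three hypothetical target genes.
labels = ["GENE_A", "GENE_A", "GENE_B", "GENE_B", "GENE_C", "GENE_C"]
splits = list(leave_genes_out_splits(labels, n_folds=3))
```

With scikit-learn available, `GroupKFold` with the perturbed gene passed as the group label gives the same guarantee. How control (non-targeting) cells should be treated is left out of this sketch.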
R/Pharma 2025 Workshops Portal
Knowledge Base for Pharmaceutical Bioinformatics
Comprehensive Quarto website documenting R/Pharma 2025 workshops, tools, and modern R workflows for pharma/biotech.
Key Features:
▸ Structured workshop documentation
▸ Reproducible examples (Positron IDE, Shiny)
▸ Curated resource library
▸ GitHub Pages deployment
Tech Stack: Quarto, RMarkdown, GitHub Actions, Positron
Client Projects
Lung Cancer Subtype Classifier
Precision Medicine Tool for MEK Inhibitor Response
Developed a 113-gene PAM classifier that identifies three transcriptional subtypes (MUC, PRO, MES) in >800 lung adenocarcinoma patients. The classifier predicts MEK inhibitor response with 87-91% accuracy.
Impact: Published in Clinical Cancer Research (2021)
Methods: Consensus NMF, PAM, survfit, Cox models
Tools: R/Bioconductor, TCGA/cBioPortal integration
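For readers unfamiliar with this kind of classifier, here is a small Python analogue. It assumes that PAM here means Prediction Analysis of Microarrays, i.e. a nearest shrunken centroid classifier. The project itself lists R/Bioconductor as its tooling, so this is not the code behind the published classifier. The data are synthetic, and the threshold value and subtype labels are hypothetical.

```python
import numpy as np
from sklearn.neighbors import NearestCentroid

# Synthetic expression matrix: 40 samples x 5 genes, two hypothetical subtypes.
rng = np.random.default_rng(0)
X = rng.normal(loc=0.0, scale=1.0, size=(40, 5))
y = np.array([0] * 20 + [1] * 20)
X[y == 1, 0] += 5.0  # only gene 0 separates the subtypes

# shrink_threshold pulls class centroids toward the overall centroid,
# which suppresses genes that carry little class information.
clf = NearestCentroid(shrink_threshold=0.5)
clf.fit(X, y)

# Classify two new samples by their nearest shrunken centroid.
new_samples = np.array([[5.0, 0.0, 0.0, 0.0, 0.0],
                        [0.0, 0.0, 0.0, 0.0, 0.0]])
predicted = clf.predict(new_samples)
```

In a real setting the shrinkage threshold would be chosen by cross-validation, and the genes whose centroids survive shrinkage form the classifier's gene signature.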
CRISPR Screen Analysis Platform
Genome-Wide Screen for Immunotherapy Resistance
Analyzed a screen of ~160,000 sgRNAs for tumor immune evasion genes. Identified the ZFX transcription factor as a biomarker for anti-PD-1 response.
Impact: Published in iScience (2025)
Methods: crisprVerse, TMM normalization, edgeR, ChIP-seq
Tools: R/Bioconductor, BWA, COSMIC signatures
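As a rough illustration of why normalization comes before any fold-change call in a pooled screen, the sketch below scales sgRNA counts to counts per million and computes per-guide log2 fold changes. It is a deliberately simpler stand-in for the TMM normalization and edgeR modeling named above, not a reimplementation of them. Guide names, counts, and the pseudocount are hypothetical.

```python
import math


def counts_per_million(counts):
    """Scale raw sgRNA read counts so each sample sums to one million."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no reads")
    return {guide: 1e6 * c / total for guide, c in counts.items()}


def log2_fold_changes(control, treated, pseudocount=1.0):
    """Per-guide log2(treated / control) on CPM values with a pseudocount."""
    control_cpm = counts_per_million(control)
    treated_cpm = counts_per_million(treated)
    return {
        guide: math.log2(
            (treated_cpm.get(guide, 0.0) + pseudocount)
            / (control_cpm[guide] + pseudocount)
        )
        for guide in control_cpm
    }


# Hypothetical guide names and counts, for illustration only.
control = {"sgA_1": 500, "sgA_2": 500, "sgB_1": 1000}
deeper = {"sgA_1": 1000, "sgA_2": 1000, "sgB_1": 2000}   # same mix, 2x depth
depleted = {"sgA_1": 100, "sgA_2": 500, "sgB_1": 1000}   # sgA_1 drops out

lfc_depth_only = log2_fold_changes(control, deeper)
lfc_depleted = log2_fold_changes(control, depleted)
```

A sample that is merely sequenced twice as deeply yields fold changes of zero after scaling, while a guide that is actually lost shows a negative value. TMM additionally guards against a few dominant guides skewing the scaling factor, which plain CPM does not.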
Single-Cell Atlas of Tumor Microenvironment
Multi-Modal Analysis of KRAS-Driven Lung Cancer
10X Genomics scRNA-seq analysis (40,000 cells, 22 clusters) revealing EGFR/ERBB-mediated immune escape mechanisms.
Impact: bioRxiv preprint (2023)
Methods: ImmGen annotation, PANTHER enrichment, UMAPs
Tools: 10X Genomics, Seurat, WES integration
NK Cell Activation Profiling
Bispecific Antibody Mechanism Study
RNA-seq analysis of NK cells post-TDB treatment, identifying cytokine-driven activation pathways (IFN, TNF, IL2/IL10 axes).
Impact: Published in Cancer Immunology Research (2024)
Methods: HTSeqGenie, fgsea, MSigDB enrichment, Luminex integration
Tools: R/Bioconductor, Smart-Seq V4, NovaSeq 6000
CNS Remyelination Biomarker Discovery
Microarray Analysis for Multiple Sclerosis Therapy
Identified PDE4 inhibition as a therapeutic approach for remyelination. Analyzed microarray data (Rat Exon 1.0ST) with integrated in vivo validation.
Impact: Published in EMBO Molecular Medicine (2013)
Methods: RMA normalization, limma, qRT-PCR validation
Tools: Affymetrix platform, R/Bioconductor, GEO (GSE50042)
Interactive Shiny Dashboard Suite
Real-Time NGS QC & Exploratory Analysis
Custom Shiny applications for client-facing data exploration, including:
▸ Real-time sequencing QC monitoring
▸ Interactive UMAP/volcano plot explorers
▸ Drug response visualization tools
▸ Pathway enrichment browsers
Tech Stack: Shiny, ggplot2, plotly, DT, shinydashboard
Deployment: RStudio Connect / shinyapps.io
Open Source Contributions
R Package Development
Custom Packages
Experience developing R packages for internal workflows at Roche/Avenga, including a Fluidigm qPCR data handler and other analysis tools.
Skills: devtools, roxygen2, testthat, pkgdown
Status: Internal use (proprietary)
Pipeline Templates
Snakemake/Nextflow Starters
Open-source templates for common NGS workflows (RNA-seq, ChIP-seq, CRISPR screens) with reproducible environments.
Status: In development for ACTN3 Bioinformatics
Documentation & Tutorials
R/Pharma Workshop Materials
Quarto-based educational resources for pharmaceutical bioinformatics, including reproducible examples and best practices.
Want to Collaborate?
I'm available for:
▸ Contract NGS Analysis
▸ R Package Development (custom Bioconductor packages, documentation, testing)
▸ Pipeline Development (Snakemake/Nextflow workflows, Python/R)
▸ AI Integration (ML-ready dataset preparation, predictive modeling)
▸ Shiny Dashboard Creation (interactive data exploration tools)
▸ Publication Support (methods sections, supplementary analyses, figures)