Christian Bertsch chr13b

Hi, I'm Christian Bertsch 👋

Machine learning for biology, health, and decision-making under uncertainty — from algorithms built from scratch to deep learning, LLMs, and provable guarantees.

🎓 MSc @ ETH Zürich · 🔬 ML × computational biology · 📍 Zürich / Basel

About me

I build end-to-end machine-learning pipelines and care about getting the whole thing right, not just the model: leakage-aware validation, honest metrics, and reproducibility. My work spans applied ML in biology and healthcare, probabilistic / trustworthy ML, and data-science hackathons. I'm equally comfortable implementing a method from first principles in NumPy and wiring up PyTorch, BoTorch, or an LLM-backed RAG system.

🧬 Domains: protein structure & design, genomics, clinical & mobile-health data, molecular simulation
🤖 Methods: deep learning, foundation-model fine-tuning, LLMs & RAG, Gaussian processes & Bayesian optimization, formal verification
🧪 Principle: cross-validation over leaderboard-chasing; I report what actually reproduces

🚀 Featured projects

Project	What it is	Highlights
OpenFold3 — PDE10A Domain Adaptation	Fine-tuning the OpenFold3 structure-prediction model for PDE10A protein–ligand pose prediction via distribution-aware, PDB-scale data augmentation	+0.20 PL LDDT and −2.3 Å ligand RMSD on a held-out set · foundation-model fine-tuning
GP-BayesOpt for Antibody Design	Gaussian-process surrogate + multi-task Bayesian optimization over ESM-2 protein embeddings	GP predicts developability/specificity at R² ≈ 0.86 / 0.97 on real lab data; closed-loop BO beats random search (2.26 vs 1.45) · BoTorch · GPyTorch
DeepPoly Robustness Verifier	Sound neural-network verifier that proves L∞ robustness via convex relaxation + learnable ReLU bounds	69/70 test cases correct, 0 unsound certifications, across 13 networks · PyTorch (autograd)
LLM Post-Training Recipes Lab	An autonomous, git-ratcheted loop that searches SFT/DPO post-training recipes for small instruct models	Reproducible evals and honest negative results · SFT · DPO
Custom RAG Challenge	Retrieval-augmented generation that grounds GPT-4o answers in a scraped-web corpus, with source attribution	Cleaning pipeline cuts the corpus ~9× (17 GB → 1.9 GB); live Gradio demo · LangChain · Chroma · GPT-4o
Synthetic Market Sharpe	Cross-sectional position sizing for a simulated market, optimizing the Sharpe ratio	Packaged pipeline: feature builders, Bayesian/gradient-boosted regressors, and an attention-BiLSTM trained on a differentiable Sharpe loss · session-level CV

📚 More projects

Machine learning & deep learning

Machine Learning for Genomics — gene expression from chromatin signal (Spearman ρ ≈ 0.56) + single-cell clustering & bulk RNA-seq deconvolution
Clinical ML Interpretability & Tweet NLP — explainable clinical classification (SHAP / Grad-CAM, sanity-checked) + fine-tuned BERT (F1 0.73)
ml-regression-to-deep-learning — four ML tasks from regularized regression to deep metric & transfer learning (all cleared the course's hard baseline)
Classical → Deep Computer Vision — five CV projects from PCA/graph-cuts to CNNs and ResNet transfer learning, built from primitives
Probabilistic AI — Gaussian processes, Bayesian deep learning (SWAG), constrained BO, and DDPG reinforcement learning

Computational biology & from-scratch algorithms

molecular-phylogenetics-r — alignment, tree-building & Felsenstein likelihood in R, 363 passing test assertions
Data Mining from Scratch — distances, DTW, graph & string kernels, k-NN/Naive Bayes implemented in pure NumPy
GROMOS Biomolecular MD — six molecular-dynamics studies benchmarked against experiment

Health & sensing

Mobile-Sensing Symptom Prediction — leakage-aware, participant-grouped CV on 4 years of GLOBEM data
IMU Step Counting & Activity Recognition — peak-detection step counter + Random-Forest activity classifier on wearable IMU data
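
To illustrate what "participant-grouped" means in practice, here is a minimal stdlib-only sketch, not the project's actual code. The function name `grouped_folds` and the toy participant IDs are hypothetical. The idea is that all rows from one participant land in the same fold, so a model is never evaluated on a person it saw during training. In a real pipeline, scikit-learn's `GroupKFold` serves the same purpose.

```python
from collections import defaultdict


def grouped_folds(groups, n_folds):
    """Yield (train_idx, test_idx) pairs in which no group spans both sides.

    `groups` holds one group label (e.g. a participant ID) per sample.
    """
    by_group = defaultdict(list)
    for idx, label in enumerate(groups):
        by_group[label].append(idx)
    if n_folds > len(by_group):
        raise ValueError("n_folds cannot exceed the number of distinct groups")

    # Greedy balancing: place the largest groups first, each into the
    # fold that currently holds the fewest samples.
    folds = [[] for _ in range(n_folds)]
    ordered = sorted(by_group, key=lambda g: (-len(by_group[g]), str(g)))
    for label in ordered:
        smallest = min(range(n_folds), key=lambda f: len(folds[f]))
        folds[smallest].extend(by_group[label])

    for fold in folds:
        test_idx = sorted(fold)
        held_out = set(test_idx)
        train_idx = [i for i in range(len(groups)) if i not in held_out]
        yield train_idx, test_idx


# Hypothetical toy data: eight samples from four participants.
participants = ["p1", "p1", "p2", "p2", "p2", "p3", "p4", "p4"]
splits = list(grouped_folds(participants, n_folds=2))
for train_idx, test_idx in splits:
    train_people = {participants[i] for i in train_idx}
    test_people = {participants[i] for i in test_idx}
    print(sorted(test_people), "held out; overlap:", train_people & test_people)
```

A plain shuffled split would scatter each participant's days across train and test, which tends to inflate scores because the model can memorize person-specific patterns.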

Data-science hackathons

Solar PV Forecasting & Spot-Market Trading — day-ahead Swiss PV forecasting (Ridge + XGBoost ensemble) → trading positions (Axpo Datathon 2024)

Tooling & open source

honest-loop — a Claude Code plugin for disciplined, self-falsifying experiment loops: propose one change, measure it against the noise, keep only real wins, and report the honest limits (the methodology behind the LLM post-training lab above)
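
The "measure it against the noise, keep only real wins" step can be sketched roughly as below. This is an illustrative assumption about how such a gate might look, not the plugin's implementation: the names `is_real_win` and `ratchet`, the threshold `k`, and all scores are hypothetical, and higher scores are assumed to be better. Each configuration is run several times, and a change is kept only if its mean gain exceeds `k` standard errors of the difference.

```python
import statistics


def is_real_win(baseline_runs, candidate_runs, k=2.0):
    """Return (accepted, gain, standard_error) for a candidate change.

    Each argument is a list of at least two repeated scores.
    """
    gain = statistics.mean(candidate_runs) - statistics.mean(baseline_runs)
    # Standard error of the difference between the two means.
    se = (
        statistics.variance(baseline_runs) / len(baseline_runs)
        + statistics.variance(candidate_runs) / len(candidate_runs)
    ) ** 0.5
    return gain > k * se, gain, se


def ratchet(baseline_runs, proposals, k=2.0):
    """Try one change at a time; a kept change becomes the new baseline."""
    kept = []
    for name, runs in proposals:
        accepted, _gain, _se = is_real_win(baseline_runs, runs, k)
        if accepted:
            baseline_runs = runs
            kept.append(name)
    return kept, baseline_runs


# Hypothetical scores from three repeated runs per configuration.
baseline = [0.70, 0.72, 0.71]
proposals = [
    ("change_a", [0.71, 0.72, 0.70]),  # same mean as baseline: within noise
    ("change_b", [0.78, 0.80, 0.79]),  # clearly above the noise floor
]
kept, final_baseline = ratchet(baseline, proposals)
print("kept:", kept)
```

With only a handful of repeats the standard error is itself a rough estimate, so a gate like this is a guard against chasing noise rather than a formal significance test.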

🛠️ Tech stack

Also: BoTorch / GPyTorch · OpenFold3 · Scanpy / AnnData · XGBoost / LightGBM · SHAP / Captum · Gradio · Chroma

📫 Get in touch

📧 Email · 💼 LinkedIn
