I started coding in Python in 2019, after completing my bachelor's degree. During my MSc in Biochemistry at the University of Geneva, Switzerland, I learned R and strengthened my background in statistical modelling, machine learning, multivariate analysis, and computational approaches for biological research.
My MSc thesis focused on untargeted metabolomics and data-driven strategies to identify and annotate biomarkers. This was a highly enriching experience that allowed me to develop skills in data science and bioinformatics. To gain further experience in software development, I concluded the thesis with a final project: developing a Python library that automates unsupervised and supervised multivariate analysis.
After graduating, I had the opportunity to work at the Faculty of Medicine of the University of Geneva, where I joined the Bioinformatics Support Platform as a junior data scientist. This experience allowed me to broaden my expertise and learn how to analyse bulk RNA-seq and single-cell RNA-seq data. I also received in-depth training in machine learning, including running experiments with neural networks, implementing variational autoencoders in Torch, and conducting benchmarks for different prediction tasks. In addition, I gained practical experience with Docker, high-performance computing, version control, and collaborative software development workflows.
Since 2024, I have been a PhD candidate at the University of Würzburg, Germany. My current work is multidisciplinary and follows two lines. First, I am developing a deep learning model for personalized drug prioritization. This involves model architecture design, data collection, leakage-proof model training, and benchmarking, with the goal of achieving robust generalization to patient-derived samples. Second, I am working on the discovery of molecular subtypes in oral squamous cell carcinoma (OSCC). In this context, I analyse bulk RNA-seq, single-cell RNA-seq, and spatial transcriptomics data. I apply statistical learning models, test hypotheses using robust statistical approaches, and interpret the results in their biological context.
Overall, my work is geared toward translational impact, aiming to help OSCC patients benefit from more precise drug recommendations and a better understanding of the disease.
I hope to share some of my latest work soon.
- 📄 Learn about my experience
- 🔭 I’m currently working on this page to showcase my projects, which include:
- 🐍 Python
  - numpy
  - pandas
  - scikit-learn
  - PyTorch
  - plotly
  - seaborn
- 📈 Machine Learning

  Imbalanced data sampling strategies:
  - SMOTE
  - Tomek Links
  - Random Under-Sampling

  Models:

  | Regression | Classification |
  | --- | --- |
  | OLS with LASSO / Ridge regularization | Logistic Regression |
  | Gradient Boosting | Random Forest Classifier |
  | Random Forest | Multi-Layer Perceptron |
  | PLS-RA | PLS-DA |

- 📊 R
  - Bioconductor
  - survival
  - ggplot2
  - caret
- 💻 BASH scripting