Software Engineer · AI Engineer · Explainable ML · Visual Analytics · Agentic AI
Dual Ph.D. background in Computer Science and Geology. I build research software that makes complex models, high-dimensional data, and scientific workflows easier to inspect, reproduce, and use.
Website · Google Scholar · Repositories · GitHub
I am based in Utrecht, Netherlands, with a Ph.D. in Computer Science from Utrecht University and a Ph.D. in Geology from China University of Geosciences, Beijing. My work connects machine learning, visual analytics, geoscience, and research-oriented software development: implementing methods from papers, building usable prototypes, and turning research workflows into software that other people can run and inspect.
I am especially interested in research software engineering roles where software quality, reproducibility, and collaboration across scientific domains matter. Recently, I have also been focusing on agentic AI, including retrieval-augmented generation, tool-using agents, and AI-assisted workflows. My broader research interests include explainable machine learning, human-in-the-loop data generation, inverse projection, and decision maps.
- Ph.D. in Computer Science, Utrecht University: enhanced decision maps for exploring classification models.
- Ph.D. in Geology, China University of Geosciences, Beijing: machine learning and visualization for mineral-genesis classification.
- Publications in journals and conference venues including Computers & Graphics, American Mineralogist, IVAPP/VISIGRAPP, Algorithms, SN Computer Science, and JGR: Solid Earth.
- Best Student Paper Award at IVAPP/VISIGRAPP 2024.
- Open-source contribution to SHAP.
For a quick review of my research software work:
- Best research software engineering (RSE) example: LCIP shows tested Python research software, GUI tooling, CUDA/PyTorch workflows, and reproducibility scripts.
- Best Earth-science example: SDBM for Pyrite is a reproducible geoscience ML workflow linked to an American Mineralogist paper.
- Best reusable-package example: InverseProjections exposes inverse-projection methods through a scikit-learn-style API.
LCIP (Loss-Controlled Inverse Projection of High-Dimensional Data) is my strongest current research-software project: a paper-linked Python implementation with a Qt GUI, command-line entry points, tests, CUDA/PyTorch-based demos, reproducibility scripts, and documented workflows for inverse-projection experiments.
In plain terms, LCIP supports human-in-the-loop, visually guided generation of high-dimensional data from a 2D embedding. It connects visualization, generative modelling, and interactive model steering.
Signals: research software engineering, scientific visualization, human-in-the-loop generative ML, interactive tooling, reproducibility, GPU-enabled workflows.
A reproducible workflow for interpreting mineral-genesis classification with supervised decision maps on pyrite trace-element data. The project combines geoscience data, classifier evaluation, SSNP-based projection, inverse feature mapping, and notebook-generated manuscript figures.
Signals: Earth-science-facing ML, geochemistry, explainable classification, visual analytics, reproducible computational research.
Implementation accompanying research on fast and accurate decision maps for explaining classification models.
Signals: explainable AI, model inspection, visual analytics, research-method implementation.
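For readers new to the idea, a decision map labels every pixel of a 2D projection with the class a model assigns to the data-space point behind that pixel. The following is a brute-force sketch of that idea only, not the code or the accelerated method from the repository; `decision_map`, `inverse_project`, and `classify` are hypothetical names chosen for illustration.

```python
def decision_map(inverse_project, classify, bounds, resolution):
    """Return a resolution x resolution grid of class labels.

    inverse_project: callable mapping a 2D point (x, y) to a data-space sample
    classify:        callable mapping a data-space sample to a class label
    bounds:          (xmin, xmax, ymin, ymax) of the 2D projection space
    """
    if resolution < 1:
        raise ValueError("resolution must be at least 1")
    xmin, xmax, ymin, ymax = bounds
    grid = []
    for row in range(resolution):
        # Sample at pixel centres rather than pixel corners.
        y = ymin + (row + 0.5) * (ymax - ymin) / resolution
        labels = []
        for col in range(resolution):
            x = xmin + (col + 0.5) * (xmax - xmin) / resolution
            labels.append(classify(inverse_project((x, y))))
        grid.append(labels)
    return grid


# Toy stand-ins (assumptions): a fixed 2D -> 3D mapping and a threshold classifier.
toy_inverse = lambda p: (p[0], p[1], p[0] + p[1])
toy_classifier = lambda sample: int(sample[2] > 1.2)

toy_map = decision_map(toy_inverse, toy_classifier, (0.0, 1.0, 0.0, 1.0), 2)
# toy_map == [[0, 0], [0, 1]]
```

Evaluating every pixel this way costs one inverse projection and one classification per pixel, which is why faster approaches matter at realistic resolutions.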
A Python package implementing inverse projection techniques such as NNinv, iLAMP, RBF inverse mapping, and MDS multilateration with a scikit-learn-style API.
Signals: reusable research code, dimensionality reduction, interpretable high-dimensional data analysis.
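To show what a scikit-learn-style interface for inverse projection can look like, here is a minimal sketch. It is not the package's API and does not implement any of the methods listed above: the class `IDWInverseProjection` and its `fit`/`transform` signatures are assumptions, and the technique is plain inverse-distance weighting over the nearest projected samples, used here only because it fits in a few lines.

```python
import math


class IDWInverseProjection:
    """Hypothetical estimator: maps 2D points back to data space by
    inverse-distance-weighted averaging of the k nearest training samples."""

    def __init__(self, n_neighbors=3, eps=1e-12):
        self.n_neighbors = n_neighbors
        self.eps = eps

    def fit(self, X2d, X):
        """Store paired 2D projections (X2d) and original samples (X)."""
        if len(X2d) != len(X):
            raise ValueError("X2d and X must have the same number of rows")
        if len(X) == 0:
            raise ValueError("at least one training sample is required")
        self.X2d_ = [tuple(p) for p in X2d]
        self.X_ = [tuple(x) for x in X]
        return self

    def transform(self, P2d):
        """Inverse-project each 2D point in P2d to a data-space sample."""
        return [self._invert_one(p) for p in P2d]

    def _invert_one(self, p):
        # Sort training points by 2D distance to p and keep the k nearest.
        nearest = sorted(
            (math.dist(p, q), i) for i, q in enumerate(self.X2d_)
        )[: self.n_neighbors]
        # A query that coincides with a training point returns that sample.
        if nearest[0][0] <= self.eps:
            return list(self.X_[nearest[0][1]])
        weights = [1.0 / d for d, _ in nearest]
        total = sum(weights)
        dim = len(self.X_[0])
        return [
            sum(w * self.X_[i][j] for w, (_, i) in zip(weights, nearest)) / total
            for j in range(dim)
        ]


# Toy data (assumption): three 3D samples and their 2D projections.
inv = IDWInverseProjection(n_neighbors=2).fit(
    [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)],
    [(0.0, 0.0, 0.0), (1.0, 0.0, 1.0), (0.0, 1.0, 1.0)],
)
midpoint = inv.transform([(0.5, 0.0)])[0]  # halfway between the first two samples
```

The fit-then-transform shape is what lets different inverse-projection methods be swapped behind the same calls.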
A lightweight local shell agent that turns natural-language requests into executable shell commands, explains the proposed command, and asks for confirmation before running it.
Signals: LLM tooling, CLI user experience, safe tool use, practical agent design.
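The propose, explain, confirm, then run loop described above can be sketched as follows. This is an illustration under stated assumptions rather than the agent's actual code: `propose_command` is a hypothetical stand-in for the LLM call (here a canned lookup), and `run_with_confirmation` is a name invented for this example.

```python
import shlex
import subprocess


def propose_command(request):
    """Hypothetical stand-in for the LLM call: returns (command, explanation)."""
    canned = {
        "list files": (
            "ls -la",
            "Lists all files in the current directory, including hidden ones.",
        ),
    }
    return canned.get(
        request.strip().lower(),
        ("", "No command proposed for this request."),
    )


def run_with_confirmation(request, ask=input, runner=subprocess.run):
    """Show the proposed command and its explanation; run only after a 'y'."""
    command, explanation = propose_command(request)
    if not command:
        print(explanation)
        return None
    print(f"Proposed: {command}")
    print(f"Why: {explanation}")
    answer = ask("Run this command? [y/N] ").strip().lower()
    if answer not in ("y", "yes"):
        print("Cancelled.")
        return None
    # Split into an argument list and avoid shell=True, so the confirmed
    # text is executed as one program invocation, not reinterpreted by a shell.
    return runner(shlex.split(command), capture_output=True, text=True, check=False)
```

Defaulting to "no" and passing `ask` and `runner` as parameters keeps the confirmation step testable without executing anything.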
Exploration of retrieval-augmented generation workflows and knowledge-aware LLM application design.
Signals: retrieval, context construction, LLM application engineering.
A browser-based interactive decision-map demo using TensorFlow.js and D3 to inspect MNIST projections, inverse projections, decision regions, and observation windows directly in the browser.
Signals: interactive visualization, browser ML, scientific demos, user-facing research prototypes.
Languages: Python · JavaScript · SQL
ML/data: PyTorch · TensorFlow · scikit-learn · pandas · NumPy · XGBoost
Visualization: Matplotlib · seaborn · D3.js · PySide/PyQt · pyqtgraph · vispy
RSE strengths: Reproducibility · Scientific Workflows · Documentation · Interactive Tools
AI systems: Agentic AI · RAG · LLM Applications · Tool Use · Workflow Automation
Engineering: FastAPI · SQLAlchemy · Pydantic · pytest · PostgreSQL · TensorFlow.js
I am interested in research software engineering, AI engineering, and data science roles, especially in teams that build software for scientific research, environmental and Earth-science applications, visual analytics, explainable AI, agentic AI, or knowledge-intensive workflows.
Website: http://yuwang-vis.github.io/
GitHub: https://github.com/wuyuyu1024
