Skip to content
View Keshav0781's full-sized avatar

Block or report Keshav0781

Block user

Prevent this user from interacting with your repositories and sending you notifications. Learn more about blocking users.

You must be logged in to block users.

Maximum 250 characters. Please don’t include any personal information such as legal names or email addresses. Markdown is supported. This note will only be visible to you.
Report abuse

Contact GitHub support about this user’s behavior. Learn more about reporting abuse.

Report abuse
Keshav0781/README.md

Keshav Jha

AI Engineer based in Erlangen, Germany, with an M.Sc. in Data Science from Friedrich-Alexander University Erlangen-Nuremberg.

I have spent most of my career working at the intersection of data, analytics, and engineering. At Accenture I worked as a Senior Software Engineer for 4 years 7 months on BestBuy's order management, delivering OMS solutions on IBM Sterling Order Management System and IBM Sterling Commerce, sustaining 80,000+ daily orders with zero critical outages across Black Friday and Cyber Monday peaks. At Siemens Healthineers I worked as an AI Engineer (Working Student) building enterprise RAG semantic search over 5,000+ R&D documents, agentic workflows using LangGraph, and LLM-based summarisation pipelines using LangChain, Qdrant and Azure OpenAI.

My current focus is production AI engineering: building systems that work end to end, from the retrieval architecture to the cloud deployment. I am also interested in data science, data analytics, and senior software engineering roles where the emphasis is on turning complex data into decisions that matter.


Skills

Python is my primary language. On the AI engineering side: LangGraph, LangChain, ChromaDB, sentence-transformers, FastAPI, Docker, GCP Cloud Run, BigQuery, Pub/Sub, Terraform, Ragas. On the deep learning side: PyTorch, U-Net, nnU-Net, SAM, LoRA adapters. For analytics and forecasting: Pandas, Scikit-learn, Prophet, Statsmodels, Power BI, SQL. Cloud experience across GCP and Azure, with hands-on BigQuery work both at Accenture and in personal projects.


Projects

MedFlow Referral Agent: Clinical Referral AI System

A production-grade multi-agent AI system built for LMU Klinikum Munich that automatically processes patient referral PDFs and routes them to the correct hospital department, classifies urgency, and generates a clinical summary for coordinator review. The coordinator reviews every AI decision before any action is taken — AI assists, never decides.

The pipeline runs as a LangGraph graph with five nodes: PDF extraction via PyMuPDF, department routing using ChromaDB RAG and Gemini, urgency classification using ChromaDB RAG and Gemini, escalation check for Emergency cases, and clinical summary generation. Emergency cases trigger an immediate escalation alert before the coordinator has even reviewed the document. All coordinator decisions — Approve, Edit, or Reject — are logged to BigQuery for full clinical audit trail compliance.

The system is deployed across three isolated environments on GCP Cloud Run with a full CI/CD pipeline. Every push runs 43 automated tests. Staging deployment requires a Ragas accuracy gate — routing accuracy above 90% and urgency accuracy above 95% — before promotion. Production deployments require manual approval.

Tech Stack: Python, LangGraph, gemini-2.5-flash, ChromaDB, FastAPI, PyMuPDF, GCP Cloud Run, Pub/Sub, BigQuery, Terraform, GitHub Actions, Ragas

Live Demo:

View on GitHub


ClinicalRAG: Intelligent Document Search System

A production RAG pipeline for searching Siemens Healthineers clinical documents, deployed as a live REST API on GCP Cloud Run. The system takes a natural language question and returns a precise answer with source references in under 3 seconds.

The retrieval pipeline uses two stages: ChromaDB for semantic search across 834 document chunks, then CrossEncoder reranking to select the best five results. Session-based conversation memory allows follow-up questions to work naturally across multiple turns. Three guardrail rules block medical advice requests and off-topic queries before they reach the LLM. Every query is async-logged to BigQuery with retry logic and exponential backoff so the API response is never delayed by the logging operation.

The Docker setup uses a multi-stage build with a non-root user and went through GCP Container Analysis scanning before deployment. Critical OS vulnerabilities were identified and patched before the service went live.

Tech Stack: Python, FastAPI, ChromaDB, LangChain, Groq (Llama 3.3 70B), Docker, GCP Cloud Run, BigQuery

Live API: https://clinical-rag-service-324111066236.europe-west1.run.app/

View on GitHub


OCT Biomarker Segmentation: Master's Thesis

Multi-class segmentation of retinal biomarkers in OCT scans for automated detection of Age-related Macular Degeneration and Macular Hole. The work compared multiple U-Net variants against nnU-Net across two clinical datasets, with full methodology documentation and quantitative evaluation. This is the foundation that the SAM fine-tuning project below builds on directly.

Tech Stack: Python, PyTorch, nnU-Net, U-Net, Medical Imaging

View on GitHub


SAM Fine-tuning with LoRA: OCT Biomarker Segmentation

An extension of the thesis work, exploring whether a large vision foundation model could outperform task-specific architectures with the right adaptation strategy. SAM performs poorly on medical scans without fine-tuning: zero-shot IoU of 0.27 on AMD OCT images. Fine-tuning with LoRA adapters, which trains only a small fraction of the model parameters, brought that to 0.86. On Macular Hole the result was similar, from 0.20 to 0.89.

The approach is practically relevant because annotated medical imaging data is always limited. A method that achieves strong performance with few trainable parameters has real clinical value.

Tech Stack: Python, PyTorch, SAM, LoRA (HuggingFace PEFT), Medical Imaging

View on GitHub


Supply Chain Operations Analytics

An end-to-end analytics case study on four years of daily delivery operations data from Munich. The work covers SLA performance analysis, order volume forecasting across four models (Holt-Winters achieved MAPE 4.70%), driver capacity planning, and revenue impact quantification. The analysis identified 5.7 million euros in lost commission from SLA failures and translated that into operational recommendations across three planning scenarios.

The structure of this analysis reflects the kind of work done at Accenture on BestBuy's operations data: order lifecycle tracking, fulfilment metrics, and capacity planning applied here with full end-to-end documentation.

Tech Stack: Python, Prophet, SARIMA, Holt-Winters, Pandas, Scikit-learn, Matplotlib

View on GitHub


Contact

LinkedIn: https://www.linkedin.com/in/keshav-jha-11899b255/

Email: itskeshavjha1996@gmail.com

Pinned Loading

  1. media-intelligence-agent media-intelligence-agent Public

    Multimodal AI agent for automated video summarisation and archive indexing for media industry

    Python 1

  2. gtm-ops-agent gtm-ops-agent Public

    Production-grade GTM operations automation using LangGraph, n8n, and MCP — multi-agent system for lead intelligence, email triage, meeting summaries, and CRM hygiene

    Python

  3. medflow-referral-agent medflow-referral-agent Public

    Production-grade multi-agent AI system for automated clinical referral processing — LangGraph, gemini-2.5-flash, ChromaDB, Ragas, GCP Cloud Run

    Python

  4. medical-agentic-rag-langgraph medical-agentic-rag-langgraph Public

    Agentic RAG pipeline with LangGraph multi-step reasoning, Qdrant vector search, RAGAS evaluation and LangSmith tracing for medical R&D document intelligence

    Python

  5. clinical-rag clinical-rag Public

    Production RAG pipeline for clinical document search - FastAPI, ChromaDB, LangChain, GCP Cloud Run

    Python

  6. OCT-SAM-FineTuning-LoRA OCT-SAM-FineTuning-LoRA Public

    SAM (Segment Anything Model) fine-tuning with LoRA adapters for OCT biomarker segmentation (AMD & Macular Hole).

    Python