rajput-t/README.md
██████╗  █████╗      ██╗██████╗ ██╗   ██╗████████╗    ████████╗
██╔══██╗██╔══██╗     ██║██╔══██╗██║   ██║╚══██╔══╝    ╚══██╔══╝
██████╔╝███████║     ██║██████╔╝██║   ██║   ██║          ██║   
██╔══██╗██╔══██║██   ██║██╔═══╝ ██║   ██║   ██║          ██║   
██║  ██║██║  ██║╚█████╔╝██║     ╚██████╔╝   ██║          ██║   
╚═╝  ╚═╝╚═╝  ╚═╝ ╚════╝ ╚═╝      ╚═════╝    ╚═╝          ╚═╝   

ML Engineer · AI Systems · Data Engineering

Building systems that think, retrieve, stream, and adapt.


⚡ About

I build end-to-end AI/ML systems — from raw data pipelines to production-ready inference APIs. My work spans classical ML, deep learning, NLP, multimodal models, RAG pipelines, and agentic AI. I care about the full stack: not just model accuracy, but experiment tracking, deployment, and explainability.

Based in Nashik, India · Open to ML/Data roles


🔧 Core Stack

ML & AI: PyTorch · TensorFlow/Keras · scikit-learn · HuggingFace Transformers · CLIP · BERT · Mistral 7B · Phi-3

MLOps & Infra: MLflow · FastAPI · Docker · Apache Kafka · SQLite · ChromaDB

RAG & Agents: LangChain · Ollama · Google ADK · Gemini · Streamlit · Gradio

Data: Pandas · NumPy · Matplotlib · Seaborn · Plotly · SQL

Cloud & Certs: Microsoft Azure · Oracle Cloud (OCI) · Google Advanced Data Analytics


🚀 Featured Projects

multimodal-sentiment-analyzer

BERT + CLIP gated fusion model trained on MVSA-Single. EDA revealed systematic image-label misalignment (53.3% agreement), and the model correctly learned to down-weight the image branch.

PyTorch · BERT · CLIP · Gated Fusion · Gradio
Macro F1: 0.7084
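The gating idea can be illustrated with a small dependency-free sketch. The README does not spell out the exact fusion layer, so the formula below (a sigmoid gate computed from the concatenated features, then a per-dimension blend) is an assumption about one common form of gated fusion, and `gated_fusion`, `gate_weights`, and `gate_bias` are hypothetical names.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(text_vec, image_vec, gate_weights, gate_bias):
    """Blend two same-sized feature vectors with a per-dimension gate.

    gate_weights holds one row per output dimension; each row scores the
    concatenated [text_vec, image_vec] input. In a trained model these
    weights are learned; here they are passed in by hand (assumption).
    """
    concat = list(text_vec) + list(image_vec)
    fused = []
    for i, (t, v) in enumerate(zip(text_vec, image_vec)):
        score = sum(w * x for w, x in zip(gate_weights[i], concat)) + gate_bias[i]
        g = sigmoid(score)  # g near 1 trusts text, g near 0 trusts the image
        fused.append(g * t + (1.0 - g) * v)
    return fused


text_features = [1.0, 2.0]
image_features = [5.0, -3.0]
no_weights = [[0.0] * 4, [0.0] * 4]

# A neutral gate (g = 0.5) averages the two branches.
balanced = gated_fusion(text_features, image_features, no_weights, [0.0, 0.0])

# A strongly positive bias pushes g toward 1, which down-weights the image branch.
text_heavy = gated_fusion(text_features, image_features, no_weights, [20.0, 20.0])
```

When image labels are noisy, a gate of this kind gives the model a way to lean on the text branch, which matches the behavior described above.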


rag-finance-assistant

Fully local RAG pipeline: 1,256 pages ingested into 8,568 chunks, retrieved via ChromaDB, and answered by Mistral 7B running on-device. Zero API costs, with source attribution on every response.

LangChain · ChromaDB · Ollama · Mistral 7B · Streamlit
Zero inference cost
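The page and chunk counts above work out to roughly 6.8 chunks per page. The sketch below shows one way such chunking could look while keeping the page number for source attribution. The splitter and its settings are not stated in the README, so `chunk_pages`, `chunk_size`, and `overlap` are hypothetical, and the actual project may rely on a LangChain text splitter instead.

```python
def chunk_pages(pages, chunk_size=500, overlap=50):
    """Split page texts into overlapping character chunks.

    Each chunk keeps its 1-based page number so an answer can cite its source.
    chunk_size and overlap are illustrative values, not the project's settings.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for page_number, text in enumerate(pages, start=1):
        for start in range(0, len(text), step):
            piece = text[start:start + chunk_size]
            if piece.strip():
                chunks.append({"page": page_number, "text": piece})
            if start + chunk_size >= len(text):
                break  # the last chunk already reached the end of the page
    return chunks


pages = ["a" * 1200, "b" * 300, ""]
chunks = chunk_pages(pages)
```

Overlap keeps a sentence that straddles a chunk boundary retrievable from at least one chunk, at the cost of some duplicated text in the vector store.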


loan-default-mlops

Four models tracked with MLflow, with alias-based champion promotion and serving via a FastAPI REST endpoint. PR-AUC is prioritized over ROC-AUC because the data is heavily imbalanced (7% minority class).

MLflow · GradientBoosting · FastAPI · Pydantic
AUC: 0.868 · PR-AUC: 0.397
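To see why the two metrics can diverge on imbalanced data, here is a small from-scratch sketch. The labels and scores are toy values, not the project's data, and `average_precision` is used as a common estimate of PR-AUC; the project itself may compute these with scikit-learn.

```python
def roc_auc(labels, scores):
    """Probability that a random positive is scored above a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Mean of the precision values at the rank of each positive."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / hits


# Toy data: 20 samples, 2 positives, ranked 2nd and 6th by the model.
labels = [0, 1, 0, 0, 0, 1] + [0] * 14
scores = [20 - i for i in range(20)]

auc = roc_auc(labels, scores)           # 31/36, about 0.861
ap = average_precision(labels, scores)  # 5/12, about 0.417
```

ROC-AUC stays high because the many negatives ranked below the positives dominate the pairwise comparison, while average precision is pulled down by the few negatives ranked above them. With a rare positive class, the second view is usually closer to what matters in practice.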


realtime-streaming-analytics

Kafka 4.2 (KRaft, no ZooKeeper) streaming stock ticks simulated with geometric Brownian motion (GBM) at 5 ticks/sec. Z-score anomaly detection over a rolling 20-tick window, with a live Plotly dashboard auto-refreshing every 2 s.

Apache Kafka · Streamlit · Plotly · SQLite
5 ticks/sec · 2.5σ threshold
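A minimal sketch of rolling Z-score detection with the window (20 ticks) and threshold (2.5σ) quoted above. The README does not say whether the score is computed on prices or returns, or whether the current tick is part of its own window. This sketch assumes raw prices compared against the preceding 20 ticks, and `detect_anomalies` is a hypothetical name.

```python
from collections import deque
from statistics import mean, pstdev


def detect_anomalies(prices, window=20, threshold=2.5):
    """Return indices of ticks that deviate from the trailing window by more than threshold sigmas."""
    history = deque(maxlen=window)  # the oldest tick drops out automatically
    flagged = []
    for i, price in enumerate(prices):
        if len(history) == window:  # wait until the window is full
            mu = mean(history)
            sigma = pstdev(history)
            if sigma > 0 and abs(price - mu) / sigma > threshold:
                flagged.append(i)
        history.append(price)
    return flagged


# Toy stream: prices alternate between 100 and 101, with one spike at index 25.
prices = [100.0 + (i % 2) for i in range(30)]
prices[25] = 110.0
anomalies = detect_anomalies(prices)
```

In a streaming consumer the same logic would run once per incoming Kafka message, with the deque held in memory between messages.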


E-Com_Support_Agent_googleADK

Agentic AI assistant built with Google ADK and Gemini 2.5 Flash Lite. Uses tool calls for product lookup, order tracking, and localized return policies. Eval suite: tool trajectory accuracy of 1.0.

Google ADK · Gemini · Tool Use · Vertex AI
Trajectory accuracy: 1.0
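Tool trajectory accuracy can be read as "did the agent call the expected tools, with the expected arguments, in the expected order". The sketch below illustrates that exact-match idea only. It is not ADK's own evaluation code, and the tool names (`track_order`, `get_return_policy`, `lookup_product`) and argument keys are hypothetical.

```python
def tool_trajectory_accuracy(expected_runs, actual_runs):
    """Fraction of eval cases whose tool-call sequence exactly matches the expected one.

    Each run is a list of (tool_name, arguments) tuples in call order.
    """
    if len(expected_runs) != len(actual_runs):
        raise ValueError("expected and actual must cover the same eval cases")
    if not expected_runs:
        return 0.0
    matches = sum(1 for exp, act in zip(expected_runs, actual_runs) if exp == act)
    return matches / len(expected_runs)


expected = [
    [("track_order", {"order_id": "A1"})],
    [("get_return_policy", {"country": "IN"})],
]

# The agent made exactly the expected calls in both cases.
perfect = tool_trajectory_accuracy(expected, expected)

# In the second case the agent called a different tool.
wrong_tool = [expected[0], [("lookup_product", {"name": "return policy"})]]
partial = tool_trajectory_accuracy(expected, wrong_tool)
```

A score of 1.0 on this kind of metric says the agent picked the right tools on the eval set. It does not measure the quality of the final natural-language response.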


🔬 Currently Building

🧬 LLM Fine-Tuning · QLoRA · Text-to-SQL

github.com/rajput-t/llm-finetuning-text2sql (in progress)

Fine-tuning Phi-3 Mini (3.8B) on the Spider benchmark (7,000+ Text-to-SQL examples) using QLoRA: 4-bit quantization via bitsandbytes, LoRA adapters via PEFT, and training on an RTX 2070 Super (8 GB VRAM).
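A back-of-the-envelope sketch of why 4-bit quantization matters on an 8 GB card. Only the 3.8B parameter count comes from the text above. The hidden size (3072), LoRA rank (16), and number of adapted matrices (64) are assumptions chosen for illustration, and the estimate ignores quantization metadata, activations, and optimizer state.

```python
def quantized_weight_gb(num_params: float, bits_per_param: float = 4.0) -> float:
    """Approximate size of the frozen base weights in GB (10^9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


def lora_trainable_params(d_in: int, d_out: int, rank: int, num_matrices: int) -> int:
    """Each adapted d_out x d_in matrix adds A (rank x d_in) and B (d_out x rank)."""
    return num_matrices * rank * (d_in + d_out)


base_4bit = quantized_weight_gb(3.8e9, bits_per_param=4)   # about 1.9 GB
base_fp16 = quantized_weight_gb(3.8e9, bits_per_param=16)  # about 7.6 GB

# Assumed adapter shape: rank 16 on 64 square 3072 x 3072 projections.
adapter_params = lora_trainable_params(d_in=3072, d_out=3072, rank=16, num_matrices=64)
adapter_share = adapter_params / 3.8e9
```

Under these assumptions the fp16 weights alone would nearly fill the 8 GB of VRAM, while the 4-bit base leaves room for activations and for the adapters, which amount to well under 1% of the base parameter count.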

Target: 78%+ execution accuracy, outperforming the zero-shot baseline, with GPT-3.5 zero-shot as a comparison point.
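Execution accuracy compares what the queries return, not how they are written. The sketch below is a simplified version of that idea using an in-memory SQLite database. The `singer` table and its rows are made-up examples, and the official Spider evaluation handles more cases, such as queries where row order matters.

```python
import sqlite3


def execution_match(conn, predicted_sql, gold_sql):
    """True if both queries run and return the same rows, ignoring row order."""
    try:
        predicted = conn.execute(predicted_sql).fetchall()
    except sqlite3.Error:
        return False  # SQL that fails to execute counts as a miss
    gold = conn.execute(gold_sql).fetchall()
    return sorted(predicted, key=repr) == sorted(gold, key=repr)


def execution_accuracy(conn, pairs):
    """Share of (predicted, gold) query pairs whose results match."""
    return sum(execution_match(conn, p, g) for p, g in pairs) / len(pairs)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE singer (id INTEGER, name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO singer VALUES (?, ?, ?)",
    [(1, "Ana", 25), (2, "Raj", 41), (3, "Mei", 33)],
)

pairs = [
    # Different SQL text, same result set: counts as correct.
    ("SELECT name FROM singer WHERE age >= 31", "SELECT name FROM singer WHERE age > 30"),
    # Misspelled column: fails to execute, counts as wrong.
    ("SELECT nam FROM singer", "SELECT name FROM singer"),
]
accuracy = execution_accuracy(conn, pairs)
```

Because the first pair matches despite different SQL text, this metric is more forgiving than exact string match, which is one reason it is a common headline number for Text-to-SQL.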

HuggingFace PEFT · QLoRA · bitsandbytes · TRL SFTTrainer · Spider


📂 Other Repositories

Repo                        Description
deep-learning-fundamentals  ANN, CNN (CIFAR-10), LSTM: early DL experiments in Keras/TensorFlow
ML_algorithms               Linear Regression, Decision Trees, Random Forests, KNN, K-Means, PCA
leetcode                    DSA problem solving (ongoing)
certificates                Google · Microsoft Azure · Oracle Cloud · BCG Forage

📊 Currently Learning

LLMOps            ████████░░░░  Fine-tuning → eval pipelines → deployment
Data Engineering  ██████░░░░░░  Kafka → Spark → Airflow
System Design     █████░░░░░░░  Scaling ML systems for interviews
CV Fine-tuning    ████░░░░░░░░  Unfreeze CLIP, detection, segmentation

"Build systems, not demos."

Pinned

  1. multimodal-sentiment-analyzer (Public)

    Multimodal sentiment analysis combining BERT (text) and CLIP (image) via gated fusion, trained on MVSA-Single. Includes EDA revealing systematic image-label misalignment, per-modality baselines, an…

    Python

  2. rag-finance-assistant (Public)

    A fully local RAG pipeline for financial document Q&A — Mistral 7B via Ollama, LangChain, ChromaDB vector store, and all-MiniLM-L6-v2 embeddings. Zero API costs, source attribution on every answer,…

    Python

  3. realtime-streaming-analytics (Public)

    Real-time stock market analytics pipeline — Kafka 4.2 (KRaft mode) streams GBM-simulated tick data for 5 equities, Z-score anomaly detection flags price spikes, and a live Streamlit + Plotly dashbo…

    Python

  4. E-Com_Support_Agent_googleADK (Public)

    E-commerce support agent built with Google ADK and Gemini 2.5 Flash Lite — handles product inquiries, order tracking, and localized return policies via structured tool calls, in-memory session mana…

    Python

  5. loan-default-mlops (Public)

    End-to-end MLOps pipeline for loan default prediction — 4 models tracked with MLflow, GradientBoosting champion at AUC 0.868 / PR-AUC 0.397 on 7% imbalanced data, alias-based model registry, and a …

    Jupyter Notebook

  6. credentials_and_courses (Public)

    A collection of professional certifications spanning data analytics, cloud platforms, and data science — including Google, Microsoft Azure, Oracle Cloud, and BCG Forage.