TammineniTanay/README.md


Portfolio · LinkedIn · Zenodo · ResearchGate



About Me

#!/usr/bin/env python3
# tanay_tammineni.py

class AIEngineer:

    def __init__(self):
        self.name       = "Tanay Tammineni"
        self.role       = "AI Systems Engineer"
        self.education  = "MS CS @ SEMO · 3.9 GPA"
        self.location   = "Irving, TX · Open to Relocate"
        self.status     = "🟢 Open to Opportunities"

    @property
    def expertise(self):
        return {
            "LLM"  : ["QLoRA", "DeepSpeed ZeRO-3",
                      "vLLM", "Flash Attention 2"],
            "RAG"  : ["Qdrant", "Elasticsearch",
                      "Neo4j", "LangGraph", "CRAG"],
            "Data" : ["PySpark", "Databricks",
                      "SQL", "Pandas", "RAGAS"],
            "Cloud": ["AWS", "Docker",
                      "Terraform", "CI/CD"],
        }

    @property
    def achievements(self):
        return {
            "memory_reduction"  : "41.2% per-GPU",
            "throughput_gain"   : "3.8x on Llama 3 8B",
            "faithfulness_gain" : "+23.7% via RRF",
            "publications"      : 2,
            "gpa"               : 3.9,
        }

    def say_hi(self):
        print("I don't just build AI. I ship it.")


me = AIEngineer()
me.say_hi()





⚡ Key Metrics



🔧 Currently Building

🤖 JobAgent

Zero-cost job application pipeline. Local Ollama · llama3.1:8b · SQLite · LaTeX. No API calls. No cost.

Ollama SQLite Pandas LaTeX

🎙️ LiveWire AI Co-Pilot

Chrome MV3 extension · Tab + mic capture · Whisper STT · Evidence packs.

WebSocket FastAPI Whisper Chrome MV3

📄 UniLLMOps Framework

Unified LLM production framework — fine-tuning to serving. Zenodo · Targeting arXiv cs.AI.

LLM RAG CRAG RAGAS vLLM



🛠️ Tech Stack

Core AI/ML

Infrastructure & Cloud

Databases & Search

Dev Tools


Python PyTorch LangChain SQL PySpark AWS Docker Terraform



🚀 Flagship Projects

QLoRA DeepSpeed vLLM Terraform Scale

| Metric | Result |
| --- | --- |
| 💾 Per-GPU memory | −41.2% via ZeRO-3 |
| ⚡ Throughput | 3.8x on Llama 3 8B |
| 🔬 Techniques | DPO · TIES · DARE · SLERP |
| 📦 Infra | Prometheus · Grafana · CI/CD |
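
One of the merge techniques listed above is SLERP (spherical linear interpolation). As a rough illustration only, and not the pipeline's actual code, here is a minimal pure-Python sketch of SLERP between two weight vectors. It assumes each model's weights have been flattened into a plain list of floats and that neither vector is all zeros; the function name `slerp` and the `eps` threshold are illustrative choices, not names taken from the repository.

```python
import math


def slerp(v0, v1, t, eps=1e-8):
    """Spherically interpolate between two non-zero vectors.

    t=0 returns v0, t=1 returns v1 (up to floating-point error).
    """
    norm0 = math.sqrt(sum(x * x for x in v0))
    norm1 = math.sqrt(sum(x * x for x in v1))
    # Cosine of the angle between the vectors, clamped for acos safety.
    dot = sum(a * b for a, b in zip(v0, v1)) / (norm0 * norm1)
    dot = max(-1.0, min(1.0, dot))
    theta = math.acos(dot)
    sin_theta = math.sin(theta)
    if sin_theta < eps:
        # Nearly parallel (or anti-parallel): fall back to plain
        # linear interpolation to avoid dividing by ~0.
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]
    w0 = math.sin((1 - t) * theta) / sin_theta
    w1 = math.sin(t * theta) / sin_theta
    return [w0 * a + w1 * b for a, b in zip(v0, v1)]


# Hypothetical toy "weights" for two fine-tuned checkpoints.
model_a = [1.0, 0.0]
model_b = [0.0, 1.0]
merged = slerp(model_a, model_b, t=0.5)
print(merged)
```

In a real merge this would typically be applied tensor by tensor rather than to one flat vector; the toy example only shows the interpolation itself.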

Qdrant Elasticsearch Neo4j LangGraph RAGAS

| Metric | Result |
| --- | --- |
| 📈 Faithfulness gain | +23.7% via RRF |
| ⏱️ Retrieval latency | 163.5ms mean |
| 🎯 Context precision | 1.0 |
| 🔄 CRAG rewrite rate | 38% |
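
The faithfulness gain above is attributed to RRF (Reciprocal Rank Fusion), which merges ranked lists from several retrievers by summing 1 / (k + rank) for each document. The sketch below is a minimal illustration under stated assumptions, not the project's implementation: the three ranked lists and their document IDs are hypothetical stand-ins for vector, keyword, and graph retrieval results, and k = 60 is the commonly used default rather than a value confirmed for this system.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it
    appears in, with rank starting at 1.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from three retrievers.
dense_hits = ["doc_a", "doc_b", "doc_c"]
keyword_hits = ["doc_b", "doc_d", "doc_a"]
graph_hits = ["doc_b", "doc_a", "doc_e"]

fused = reciprocal_rank_fusion([dense_hits, keyword_hits, graph_hits])
print(fused)
```

Because RRF uses only ranks, it needs no score normalization across retrievers whose raw scores are on different scales.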

Ollama SQLite LaTeX Cost

Fully local automated job pipeline. Ingests Excel job feeds, filters roles, generates tailored resumes + cover letters — zero external API calls.
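
As a hedged sketch of what the "filters roles" stage could look like with SQLite, the snippet below loads job rows into an in-memory table and keeps titles that match any include keyword and no exclude keyword. The table schema, the function name `filter_roles`, the keywords, and the sample rows are all hypothetical; Excel ingestion and resume generation are not shown.

```python
import sqlite3


def filter_roles(rows, include_keywords, exclude_keywords=()):
    """Return (title, company, location) rows whose title matches any
    include keyword and none of the exclude keywords."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE jobs (title TEXT, company TEXT, location TEXT)")
    conn.executemany("INSERT INTO jobs VALUES (?, ?, ?)", rows)

    # Build parameterized LIKE clauses; "1" keeps the SQL valid when
    # no include keywords are given (everything passes).
    include = " OR ".join("LOWER(title) LIKE ?" for _ in include_keywords) or "1"
    where = f"({include})"
    for _ in exclude_keywords:
        where += " AND LOWER(title) NOT LIKE ?"
    params = [f"%{kw.lower()}%" for kw in (*include_keywords, *exclude_keywords)]

    matches = conn.execute(
        f"SELECT title, company, location FROM jobs WHERE {where} ORDER BY title",
        params,
    ).fetchall()
    conn.close()
    return matches


# Hypothetical rows standing in for an ingested job feed.
feed = [
    ("AI Engineer", "Acme", "Remote"),
    ("Senior ML Engineer", "Globex", "Irving, TX"),
    ("Sales Manager", "Initech", "Dallas, TX"),
    ("ML Engineer Intern", "Umbrella", "Remote"),
]
shortlist = filter_roles(feed, ["ai engineer", "ml engineer"], ["intern"])
print(shortlist)
```

Keywords are passed as bound parameters rather than interpolated into the SQL string, so titles or keywords containing quotes cannot break the query.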

Published Accuracy Frames Award

Published in CVR Journal of Science & Technology · June 2023. Real-time detection + classification for intelligent traffic analysis.



💼 Experience

AI Systems Developer Intern · VoiceBotics AI (formerly Automate365)

Remote · Irving, TX · Chrome MV3 · WebSocket · FastAPI · Whisper STT · Sprints 8–13 delivered

Data Engineer Intern · Globalshala

Azure Databricks · ETL pipelines · Power BI dashboards · SQL optimization · 99.9% uptime



📄 Publications

📘 UniLLMOps

A Unified Framework for End-to-End LLM Production Systems

DOI Target

Distributed fine-tuning · Hybrid RAG · CRAG · RAGAS evaluation · vLLM serving at scale

Verified Metrics: +23.7% faithfulness gain · 41.2% per-GPU memory reduction · 3.8x throughput · 163.5ms mean retrieval latency

📗 Computer Vision Paper

Real-Time Vehicle Detection & Classification

ResearchGate Journal

OpenCV · Real-time multi-class detection · Traffic monitoring · Intelligent transportation

Results: 88% accuracy · 5,000+ frames · 🏆 3rd Prize at Project Expo2K23



🏅 Certifications

Columbia UCSC HackerRank Cisco Duke



📊 GitHub Stats






📬 Connect

Portfolio · LinkedIn · UniLLMOps · CV Paper



Pinned

  1. face-recognition-attendance-system (Public)

    Face-recognition-based attendance monitoring system that identifies individuals and records their attendance automatically using computer vision.

    Python

  2. hybrid-rag-system (Public)

    Production-grade RAG system with hybrid retrieval (Qdrant + Elasticsearch + Neo4j), Corrective RAG via LangGraph, feedback-driven reward model, and RAGAS evaluation dashboard

    Python 2

  3. distributed-finetune-pipeline (Public)

    End-to-end distributed LLM fine-tuning pipeline: data curation → QLoRA + DPO training → TIES/DARE/SLERP model merging → evaluation → vLLM deployment. Built with DeepSpeed ZeRO-3, Flash Attention, P…

    Python

  4. Live-Wire-AI (Public)

    Forked from H-R-Shubha-shree/Live-Wire-AI

    Jupyter Notebook

  5. realtime-Vechile-Detection-using-AI (Public)

    Jupyter Notebook

  6. JobAgent (Public)

    Zero-cost automated job application pipeline — Excel/GitHub ingestion, Ollama resume generation, Playwright auto-fill

    Python