Spectraa28/README.md

Hey, I'm Sonu (Spectra) 👋

I'm a Backend + ML Engineer who came up through Java and Spring Boot before deliberately moving into ML infrastructure. That background shapes how I think about AI: not just whether the model is accurate, but what happens when a worker crashes mid-embedding, how failures propagate across language boundaries, and what the monitoring should look like at 3am.

I go deep before going wide. I wrote backpropagation in NumPy before touching PyTorch. I built BM25 retrieval from scratch before using a library. Every project I ship has Docker, health endpoints, Prometheus metrics, and documented failure modes — not because someone asked, but because that's what production-ready actually means.


🚀 What I've Built

🔐 LexGuard — Legal Document Intelligence API

Java · Spring Boot · Python · FastAPI · RabbitMQ · pgvector · Prometheus · Grafana · Docker

Enterprise document ingestion + RAG retrieval pipeline with full observability and security hardening.

  • Transactional Outbox pattern across Java + Python — zero document loss under worker crashes
  • Background supervisor with staged rollback (EMBEDDING→PARSED→UPLOADED) using FOR UPDATE SKIP LOCKED, so one failure can't block 49 concurrent recoveries
  • SHA-256 API auth, SlowAPI rate limiting, prompt injection defense with Pydantic regex rejection
  • Prometheus Histogram (p99 0.18s idle / 0.48s concurrent), Grafana dashboard, correlation ID threading
  • RAG retrieval score 0.6233 on real ISO 27001 legal text (pgvector HNSW, 148 vectors)
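
As a rough illustration of the Transactional Outbox bullet above, here is a minimal sketch. It is not LexGuard's code: it swaps in SQLite for PostgreSQL and a plain callback for RabbitMQ so it runs anywhere, and the table names, column names, and the `upload_document` / `relay_once` helpers are all hypothetical. The idea it shows is that the document row and its outbox event commit in one transaction, and a relay marks an event published only after the publish call returns.

```python
import json
import sqlite3


def init_db(conn):
    # Hypothetical schema: one business table plus one outbox table.
    conn.executescript("""
        CREATE TABLE documents (
            id INTEGER PRIMARY KEY,
            name TEXT NOT NULL,
            status TEXT NOT NULL
        );
        CREATE TABLE outbox (
            id INTEGER PRIMARY KEY,
            payload TEXT NOT NULL,
            published INTEGER NOT NULL DEFAULT 0
        );
    """)


def upload_document(conn, name):
    # One transaction: the document row and its event commit together,
    # or neither is written if anything raises.
    with conn:
        cur = conn.execute(
            "INSERT INTO documents (name, status) VALUES (?, 'UPLOADED')",
            (name,),
        )
        doc_id = cur.lastrowid
        event = {"doc_id": doc_id, "event": "DOCUMENT_UPLOADED"}
        conn.execute(
            "INSERT INTO outbox (payload) VALUES (?)",
            (json.dumps(event),),
        )
    return doc_id


def relay_once(conn, publish):
    # Publish pending events in insertion order. If publish() raises, the
    # row stays unpublished and is retried on the next call, which gives
    # at-least-once delivery.
    rows = conn.execute(
        "SELECT id, payload FROM outbox WHERE published = 0 ORDER BY id"
    ).fetchall()
    sent = 0
    for row_id, payload in rows:
        publish(json.loads(payload))
        with conn:
            conn.execute(
                "UPDATE outbox SET published = 1 WHERE id = ?", (row_id,)
            )
        sent += 1
    return sent
```

Because delivery is at-least-once in this sketch, a consumer would need to tolerate duplicate events, for example by keying work on `doc_id`.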

🔗 Live recruiter demo + /demo/query endpoint
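
The staged rollback with FOR UPDATE SKIP LOCKED mentioned above can be pictured with the sketch below. It is an in-memory analogy, not the project's supervisor: a non-blocking `threading.Lock` per row stands in for PostgreSQL row locks, and the `Doc` class, the `claim_next` / `recover` helpers, the `documents` table, the `updated_at` column, and the 5-minute cutoff in the SQL string are all assumptions. The reading that each rollback moves a document back exactly one stage is also an assumption.

```python
import threading
from dataclasses import dataclass, field

# Assumed reading of EMBEDDING→PARSED→UPLOADED: each rollback moves a
# stuck document back one stage.
ROLLBACK_TARGET = {"EMBEDDING": "PARSED", "PARSED": "UPLOADED"}

# What the claim query might look like in PostgreSQL (not executed here;
# table name, column names, and the timeout are hypothetical).
CLAIM_STUCK_SQL = """
SELECT id, status
FROM documents
WHERE status IN ('EMBEDDING', 'PARSED')
  AND updated_at < now() - interval '5 minutes'
ORDER BY updated_at
LIMIT 1
FOR UPDATE SKIP LOCKED
"""


@dataclass
class Doc:
    id: int
    status: str
    # Stand-in for a database row lock.
    lock: threading.Lock = field(default_factory=threading.Lock)


def claim_next(docs):
    # Mimics SKIP LOCKED: rows locked by another worker are skipped
    # instead of waited on, so one stuck recovery cannot block the rest.
    for doc in docs:
        if doc.status in ROLLBACK_TARGET and doc.lock.acquire(blocking=False):
            return doc
    return None


def recover(doc):
    # Roll the claimed document back one stage, then release its lock.
    try:
        doc.status = ROLLBACK_TARGET[doc.status]
    finally:
        doc.lock.release()
```

A supervisor loop would call `claim_next` repeatedly and hand each claimed document to `recover`; a row whose lock is held elsewhere is simply passed over on that round.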


📊 Financial RAG API — SEC Document Intelligence

Python · MiniLM · BM25 · Llama 3.3 70B · FastAPI · MLflow · Prometheus · Docker

Layout-aware retrieval pipeline over Apple + Microsoft 10-K SEC filings.

  • Rebuilt retrieval from scratch: pure semantic → BM25 + dense hybrid (alpha=0.7), Context Recall 0 → 1.0
  • Built TableToNaturalLanguage converter — made XBRL financial tables (previously 0% retrievable) semantically searchable
  • Validated across 30 manually curated QA pairs: Faithfulness 0.82, Context Recall 1.0, top score 0.8082
  • Deployed on HuggingFace Spaces with Qdrant Cloud (ephemeral filesystem fix)
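
The BM25 + dense hybrid with alpha=0.7 from the first bullet could be fused roughly as follows. This is a sketch under stated assumptions rather than the project's scoring code: the README does not say which side alpha weights, so here alpha is assumed to weight the dense score and (1 - alpha) the BM25 score, and min-max normalization is assumed as the way to put both on a common scale. `min_max` and `hybrid_rank` are hypothetical helper names.

```python
def min_max(scores):
    # Rescale a {doc_id: score} map to [0, 1] so BM25 and cosine scores
    # become comparable. All-equal scores collapse to 0.0.
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc_id: 0.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}


def hybrid_rank(dense, bm25, alpha=0.7):
    # Weighted sum of normalized scores. Assumption: alpha weights the
    # dense score; a document missing from one retriever gets 0.0 there.
    d, b = min_max(dense), min_max(bm25)
    fused = {
        doc_id: alpha * d.get(doc_id, 0.0) + (1 - alpha) * b.get(doc_id, 0.0)
        for doc_id in set(d) | set(b)
    }
    return sorted(fused.items(), key=lambda kv: kv[1], reverse=True)
```

Setting `alpha=1.0` reduces this to pure semantic ranking and `alpha=0.0` to pure BM25, which makes it easy to compare the hybrid against either baseline on the same queries.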

🔗 Live on HuggingFace Spaces
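
To give a feel for what a table-to-text step like the TableToNaturalLanguage converter does, here is a minimal sketch. It is not the project's converter: the row fields (`label`, `value`, `unit`), the sentence template, and the function names are hypothetical, and the sample values used with it should be placeholders rather than figures from any filing.

```python
def row_to_sentence(company, period, row):
    # Turn one table row into a standalone sentence that an embedding
    # model or BM25 index can match against a natural-language question.
    return (
        f"{company} reported {row['label']} of "
        f"{row['value']:,} {row['unit']} for {period}."
    )


def table_to_sentences(company, period, rows):
    # Each row becomes its own retrievable chunk.
    return [row_to_sentence(company, period, row) for row in rows]
```

For example, a placeholder row `{"label": "total net sales", "value": 1234, "unit": "million USD"}` for a made-up `ExampleCorp` becomes a sentence that shares vocabulary with a question such as "What were ExampleCorp's total net sales?", which a bare grid of cells does not.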


LLM Cost Router — 3-Layer Routing Pipeline

Python · FastAPI · FAISS · Redis · TF-IDF · Logistic Regression · Docker

Routes LLM queries through semantic cache → ML classifier → LLM fallback to minimize API spend.

  • 93.19% cost reduction, 87% cache hit rate, 100% routing accuracy across benchmark suite
  • SemanticCache: FAISS + Redis (threshold 0.88), graceful in-memory degradation on Redis failure
  • Classifier: TF-IDF + Logistic Regression on 450 weakly-supervised samples — sub-millisecond inference
  • Deployed on Render
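
The three layers (semantic cache → ML classifier → LLM fallback) can be sketched as below. This is a toy under stated assumptions, not the router itself: a bag-of-words vector and a linear scan stand in for real embeddings with FAISS + Redis, and `classify`, `cheap_answer`, and `call_llm` are hypothetical callables supplied by the caller. The assumption that the classifier decides whether a cheaper handler is enough is also mine. Only the 0.88 similarity threshold comes from the bullets above.

```python
import math
from collections import Counter


def embed(text):
    # Toy bag-of-words vector; a stand-in for a sentence embedding model.
    return Counter(text.lower().split())


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class SemanticCache:
    def __init__(self, threshold=0.88):
        self.threshold = threshold
        self.entries = []  # list of (vector, answer) pairs

    def get(self, query):
        # Return the answer of the most similar cached query, if it
        # clears the similarity threshold.
        qv = embed(query)
        best = max(self.entries, key=lambda e: cosine(qv, e[0]), default=None)
        if best is not None and cosine(qv, best[0]) >= self.threshold:
            return best[1]
        return None

    def put(self, query, answer):
        self.entries.append((embed(query), answer))


def route(query, cache, classify, cheap_answer, call_llm):
    # Layer 1: semantic cache.
    cached = cache.get(query)
    if cached is not None:
        return "cache", cached
    # Layer 2: classifier picks the cheap path when it judges it enough.
    if classify(query) == "simple":
        layer, answer = "classifier", cheap_answer(query)
    else:
        # Layer 3: fall back to the LLM.
        layer, answer = "llm", call_llm(query)
    cache.put(query, answer)
    return layer, answer
```

Returning the layer name alongside the answer makes it straightforward to count hits per layer, which is the kind of data a cache hit rate or cost reduction figure would be computed from.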

🗂️ TaskFlow — Task Orchestration API

Java · Spring Boot 4.0 · PostgreSQL · Redis · RabbitMQ · WebSockets · JWT · Gemini AI · Docker

Production task management backend — live on Render.

  • JWT auth + refresh tokens, AOP role system, async processing (RabbitMQ), real-time WebSockets, Gemini AI integration

🛠️ Tech Stack

Gen AI / LLMs: RAG Pipelines · Vector Search · pgvector · FAISS · ChromaDB · LLM APIs · Prompt Engineering · Semantic Caching · HuggingFace SentenceTransformers

Classical ML: PyTorch · XGBoost · scikit-learn · TF-IDF · Logistic Regression · MLflow · SHAP · Drift Detection · Backprop from scratch

Backend: FastAPI · Spring Boot · RabbitMQ · WebSockets · JWT · REST APIs

Infra & Observability: Docker · AWS (EC2, S3) · Prometheus · Grafana · Git · Linux (Arch, btw)

Languages: Python · Java · SQL



🌐 Find Me

LinkedIn Portfolio Email


"I find where the system breaks under real conditions and engineer it out."

Pinned

  1. ai-ml-journey (Public)

    Building production ML systems from scratch. Java dev → ML Engineer. Follow the journey.

    Python

  2. TaskManager (Public)

    🚧 Real-time task management system built with Spring Boot, WebSockets & RabbitMQ — building in public

    Java

  3. ecom-spring (Public)

    A multi-role e-commerce REST API built with Spring Boot 3, JWT auth, Stripe payments, and PostgreSQL — supporting Admin, Seller, and User roles.

    Java

  4. fraud-ml-pipeline (Public)

    XGBoost fraud detection model served as a production REST API — FastAPI + Docker + Railway

    Python

  5. llm-router (Public)

    A high-performance, deterministic LLM cost router that optimizes traffic splits between Gemini Flash and Flash-Lite using an additive heuristic matrix. Features regex text sanitization and MLflow t…

    Python

  6. Financial-Rag (Public)

    Production RAG API for financial document Q&A — hybrid BM25 + MiniLM retrieval, Docling ingestion, RAGAs evaluation, FastAPI + Docker. Built on Apple 10-K FY2023.

    Python