AdityaSinghDevs/README.md

Aditya Pratap Singh

· Mechanistic Interpretability  ·  Transformers  ·  NLP  ·  LLMs · 


tl;dr

I love building close to the metal, think from first principles, and have a weird thing for transformers and Mech Interp. Horror movies, philosophy, cooking, sketching, lifting weights and a terrible sleep schedule are all recurring themes around here. Building things, learning things and turning chaos into systems is what I do best. Sometimes it's a project, sometimes it's a paper, sometimes it's my own life. Three real projects, one preprint incoming, looking for research that actually matters. Just want to be part of taking something from 0 to 1 or 1 to 100.


I like working on Mech Interp, Language Models, NLP, and everything that lives close to the metal in AI.

Have a weird infatuation with transformers. Every week I feel "ohh, so that's how it works, now I finally get them", and every week they humble me back down. A perfectly healthy relationship, you see.

Research, infra, model internals and architectures, systems, new techniques and the complex maths behind it all, that's where I live. Not really into the shiny layers on top. (You know, all of the Applied-AI and wrapper gold rush? Yeah, not really my cup of tea.)

An itch to understand things from first principles, chasing the "Whys?" after all the "Hows?", obsessing over things until they're finished. I mean, basically everything that plays a part in my messed up sleep schedule. (Trust me when I say, I am trying to fix it.)

Things I built that I'm proud of:

╰─► NanoLens (my current flagship. seriously, check this one out.)

  • Built a character-level autoregressive transformer from scratch. Tokenizer to attention heads to optimizer, every component hand-coded in plain PyTorch. Made it modular and config-driven so it scales up to 150M params. Then added a Mechanistic Interpretability toolkit to visualise attention patterns and hidden states across all heads and layers using simple CLI flags. Built as a research platform, not just a model.
  • 25M parameters, 8 layers, 64 attention heads in total, trained on Dostoevsky, then dissected head-by-head through attention-circuit and hidden-state analysis, with everything documented.
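For a feel of what "character level" means here, a minimal sketch follows. It is not the actual NanoLens code: `CharTokenizer`, `encode`, `decode` and `vocab_size` are hypothetical names used only for illustration, and the sample string is made up.

```python
from __future__ import annotations


class CharTokenizer:
    """Hypothetical character-level tokenizer: one integer id per unique character."""

    def __init__(self, text: str) -> None:
        # Vocabulary is every distinct character in the corpus, in sorted order.
        chars = sorted(set(text))
        self.stoi = {ch: i for i, ch in enumerate(chars)}
        self.itos = {i: ch for ch, i in self.stoi.items()}

    @property
    def vocab_size(self) -> int:
        return len(self.stoi)

    def encode(self, s: str) -> list[int]:
        # Raises KeyError for characters never seen in the corpus.
        return [self.stoi[ch] for ch in s]

    def decode(self, ids: list[int]) -> str:
        return "".join(self.itos[i] for i in ids)


tok = CharTokenizer("crime and punishment")
ids = tok.encode("pain")
roundtrip = tok.decode(ids)
```

Encoding followed by decoding returns the original string, and the vocabulary stays tiny compared to subword tokenizers, which is what makes a from-scratch model at this scale easy to inspect.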

╰─► Mimir

  • Named after the Norse god of wisdom. Investigated whether structured reasoning actually reduces hallucinations in LLMs, or just moves the problem around. 12 controlled trials, real DevOps incident data, Qwen 2.5-3B.
  • Short answer: it depends. Ambiguity is the moderating variable nobody talks about. Long answer: read the repo. (Also the bridge that led me to this LLM rabbit hole I am falling into.)

╰─► Tesseract

  • Started as a random internship assignment. They never got back to me. Built a production-grade text-to-3D inference system around OpenAI's Shap-E anyway.
  • Stateless async FastAPI backend, device-aware GPU/CPU fallback, modular config-driven pipeline. Then added a full benchmarking suite in v1.2, ran it, documented everything. Craziest finding: CPU inference was 330× slower than GPU. Wrote a long-form technical deep dive and got it published in Towards AI. Laid the foundations of every systems and engineering decision I've made since.
  • (their loss, honestly.)
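A device-aware GPU/CPU fallback can be sketched roughly like this. It is an illustration under assumptions, not Tesseract's actual code: `pick_device` and its `prefer_gpu` parameter are hypothetical, and the only library call used is PyTorch's `torch.cuda.is_available()`.

```python
def pick_device(prefer_gpu: bool = True) -> str:
    """Return "cuda" when a usable GPU is present, otherwise fall back to "cpu"."""
    try:
        import torch  # imported lazily so the helper still works without PyTorch
    except ImportError:
        return "cpu"
    if prefer_gpu and torch.cuda.is_available():
        return "cuda"
    return "cpu"


device = pick_device()
```

Resolving the device once at startup and passing it through the config keeps the rest of the pipeline free of hardware checks.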

And some earlier work worth mentioning: ORCA, a CV assistant embedded in wearable hardware for the visually impaired (face recognition, depth estimation, object detection). Sky Sentinel X, drone vs bird classification using micro-Doppler spectrograms and ResNet. VULKYRIE, chemical testing in carcasses via RGB detection and random forest regression, built for vulture conservation. Different domains, same obsession with building things that actually do something real.

Beyond Code

  • As President, scaled ADVAIT, an AI community in college, from ~50 to 530+ members. Designed and built its division structure and launched technical events, workshops, speaker sessions, project sprints and showcases.
  • You might find me watching horror movies, reading philosophy and journaling my own, cooking, sketching or maybe in the gym when I ain't on the code.

Currently

  • Working on a preprint, reading papers, researching, understanding, learning and looking for a research internship where the work is real.

Ambitious people, difficult problems, conversations where everyone walks out learning something. Yep, that's where my heart lies. I enjoy leading things. I enjoy learning even more. If the knowledge is real, I don't mind being the dumbest person in the room.



mail   linkedin   x

Pinned

  1. nanolens (Public)

    Configurable character-level transformer training suite with built-in mechanistic interpretability toolkit — scale to 150M+ parameters and beyond, no ceilings, only hardware limits. Inspect attenti…

    Jupyter Notebook 1

  2. tesseract (Public)

    A production-grade modular ML pipeline that uses diffusion-driven neural nets to generate usable 3D mesh assets from text or image inputs.

    Python 1

  3. Reddit-Persona (Public)

    This repository contains a Reddit profile scraper that builds user personas from posts and comments, using LLMs for analysis and summarization.

    Python

  4. SkySentinel-X (Public)

    Micro-Doppler Target Classification system for the Smart India Hackathon (SIH). Using spectrograms generated from STFT on FMCW radar data, the system employs deep learning models to classify drones…

    Jupyter Notebook 2

  5. mimir-v0 (Public)

    Mimir v0 is a controlled research prototype that studies whether enforcing structured diagnostic reasoning in large language models reduces hallucinations and improves root-cause accuracy in log-ba…

    Jupyter Notebook