B.Tech CSE student specializing in AI & ML, graduating in 2027. I'm interested in the more mathematical and low-level side of deep learning: building models from scratch, understanding what's inside them, and working on problems where physics and neural networks intersect.
Currently learning CUDA and working on 3D Gaussian Splatting.
Marcella ★ 2
A ~60M-parameter decoder-only transformer built entirely from scratch in PyTorch (no Hugging Face, no shortcuts). Implements RoPE, RMSNorm, a SwiGLU FFN, Flash SDP attention with custom causal masking, and a per-layer KV cache backed by pre-allocated fixed-size tensors, so decoding never reallocates cache memory. Trained on a weighted mix of FineWeb-Edu, Wikipedia, and SlimPajama with a custom SentencePiece tokenizer (32K vocab). Instruction-finetuned with response-only loss masking. Reaches a perplexity of 32.87 on a held-out split. Ships with a FastAPI streaming backend and a Svelte chat UI.
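For readers unfamiliar with response-only loss masking, here is a minimal, dependency-free sketch of the idea, not Marcella's actual training code. Prompt positions get a sentinel label and are skipped when averaging the loss, and perplexity is the exponential of that mean negative log-likelihood. The names `IGNORE`, `mask_prompt_labels`, `masked_nll`, and `perplexity` are hypothetical; the value `-100` is chosen only because it matches the default `ignore_index` of PyTorch's cross-entropy loss.

```python
import math

IGNORE = -100  # hypothetical sentinel: positions with this label are excluded from the loss


def mask_prompt_labels(token_ids, prompt_len):
    """Copy token_ids as labels, hiding the first prompt_len positions."""
    return [IGNORE] * prompt_len + list(token_ids[prompt_len:])


def masked_nll(log_probs, labels):
    """Mean negative log-likelihood over unmasked positions only.

    log_probs[t] maps token id -> log-probability the model assigns to that
    token as the label at position t (assumed already aligned with labels).
    """
    kept = [-log_probs[t][y] for t, y in enumerate(labels) if y != IGNORE]
    if not kept:
        raise ValueError("every position is masked")
    return sum(kept) / len(kept)


def perplexity(mean_nll):
    """Perplexity is exp of the mean per-token negative log-likelihood."""
    return math.exp(mean_nll)


# Toy example: two prompt tokens followed by two response tokens.
tokens = [5, 6, 7, 8]
labels = mask_prompt_labels(tokens, prompt_len=2)

# A stand-in "model" that is uniform over a 4-token vocabulary.
uniform = [{tok: math.log(0.25) for tok in (5, 6, 7, 8)} for _ in tokens]

loss = masked_nll(uniform, labels)  # averaged over the two response tokens only
ppl = perplexity(loss)              # a uniform guess over 4 tokens has perplexity of about 4
```

Only the response tokens contribute to the gradient under this scheme, so the model is not trained to reproduce the instruction text itself.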
FWI ★ 13
Physics-Informed GAN for Elastic Full Waveform Inversion — reconstructing subsurface Earth properties (Vp, Vs, density, Poisson's ratio, Young's modulus) from multi-component seismic waveforms. The generator is a U-Net that maps waveform inputs [B, 10, 1000, 70] to 70×70 subsurface grids; a Fourier Neural Operator acts as the differentiable elastic wave solver; a WGAN discriminator enforces realism. Total loss combines adversarial, data misfit (MSE), and PDE residual terms. Uses the ECFB dataset from the SMILE team. (In progress)
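The five target properties are not independent. Under the standard assumption of an isotropic, linearly elastic medium, Poisson's ratio and Young's modulus follow from Vp, Vs, and density. The sketch below shows those textbook relations for scalar inputs in SI units. Whether the project predicts these maps directly or derives them this way is not stated here, and the function name `elastic_moduli` is hypothetical.

```python
import math


def elastic_moduli(vp, vs, rho):
    """Return (poisson_ratio, youngs_modulus) for an isotropic elastic medium.

    vp, vs: P- and S-wave velocities in m/s; rho: density in kg/m^3.
    Young's modulus is returned in pascals.
    """
    if not (vp > vs > 0 and rho > 0):
        raise ValueError("expected vp > vs > 0 and rho > 0")
    vp2, vs2 = vp * vp, vs * vs
    # nu = (Vp^2 - 2 Vs^2) / (2 (Vp^2 - Vs^2))
    poisson = (vp2 - 2.0 * vs2) / (2.0 * (vp2 - vs2))
    # E = rho Vs^2 (3 Vp^2 - 4 Vs^2) / (Vp^2 - Vs^2)
    young = rho * vs2 * (3.0 * vp2 - 4.0 * vs2) / (vp2 - vs2)
    return poisson, young


# Sanity check: a "Poisson solid" has Vp = sqrt(3) * Vs and nu = 0.25.
nu, e_mod = elastic_moduli(vp=2000.0 * math.sqrt(3.0), vs=2000.0, rho=2500.0)
```

In this example the shear modulus is rho * Vs^2 = 1e10 Pa, so E = 2 * mu * (1 + nu) comes out near 2.5e10 Pa.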
Vision-Transformer-for-DeepFake-Detection ★ 1
ViT-based deepfake detector with self-supervised pretraining via masked image modeling on CelebA, finetuned on DFDC. Achieves ~85% accuracy and ROC-AUC ~0.93. Includes Grad-CAM to validate that detections focus on manipulated facial regions rather than background artifacts.
4x-Upscaler-ESRGAN
ESRGAN for 4× image super-resolution. Two-phase training: PSNR-optimised first, then adversarial + VGG perceptual loss. RRDBNet with 23 RRDB blocks. Tile-based inference reduces peak GPU memory from ~4.5 GB to ~500 MB (≈89% reduction) without degrading output quality.
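The quoted saving is (4.5 − 0.5) / 4.5 ≈ 89%. It comes from running the network on one tile at a time, so activation memory scales with the tile area rather than the full image. A minimal sketch of the tiling step follows, assuming square tiles with a small overlap to hide seams. The function name `tile_boxes` and the defaults `tile=256`, `overlap=16` are illustrative assumptions, not the repository's settings.

```python
def tile_boxes(height, width, tile=256, overlap=16):
    """Return (top, left, bottom, right) boxes that together cover the image.

    Neighbouring tiles share at least `overlap` pixels; the last tile in each
    direction is shifted back so it ends flush with the image edge.
    """
    if overlap < 0 or tile <= overlap:
        raise ValueError("expected 0 <= overlap < tile")
    stride = tile - overlap

    def starts(size):
        if size <= tile:
            return [0]  # the whole axis fits in one tile
        positions = list(range(0, size - tile, stride))
        positions.append(size - tile)  # final tile aligned to the edge
        return positions

    return [
        (top, left, min(top + tile, height), min(left + tile, width))
        for top in starts(height)
        for left in starts(width)
    ]


boxes = tile_boxes(500, 300)
```

Each low-resolution box would then be upscaled independently and pasted into the output at 4× its coordinates, with the overlapping borders blended or cropped.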
NeuralStyleTransfer
Neural style transfer using VGG19 with multi-scale pyramid optimisation and L-BFGS refinement, balancing content, style, and total variation loss.
- Learning CUDA — kernels, memory hierarchies, warp-level operations
- Exploring 3D Gaussian Splatting for real-time radiance field rendering
Python · PyTorch · CUDA · C++ · FastAPI
Computer Vision · Language Modelling · Physics-Informed Neural Networks · Fourier Neural Operators · GANs · Self-Supervised Learning · Flash Attention · KV Cache · SentencePiece
