Deep Learning Engineer — multimodal modeling & efficient LLM inference.
Research-oriented deep learning engineer working at the intersection of vision–language models and LLM inference efficiency. Currently focused on KV cache compression for long-context inference.
- Interests: Vision & Language, multimodal foundation models, LLMs, efficient inference
- Working on multimodal modeling for foundation-scale VLMs
- Working on KV cache compression for efficient long-context LLM inference





