Nguyễn Trung Hiếu HieuNTg

Hi, I'm Hieu (Nguyen Trung Hieu) 👋

AI Engineer · Computer Vision × LLM / Agentic AI × Speech (ASR)

I turn research into production — systems that see, listen, reason, and create.

👀 About Me

AI Engineer with 2 years of experience building and shipping production ML systems across Computer Vision, OCR, LLM / Agentic AI, and Speech (ASR). I own the full lifecycle, from fine-tuning (PEFT/LoRA with 4-bit quantization) through evaluation to GPU-optimized inference, and I enjoy taking a model from a research paper all the way to a reliable end-to-end pipeline.

🔭 Currently building OCR, object-detection, and LLM-agent systems @ WorkerBot AI
🧠 Deepest expertise: speech recognition & efficient LLM fine-tuning
⚡ Fun fact: I fine-tuned Gemma 3N for Vietnamese ASR down to 7.21% WER

⭐ Flagship Project — Audio2Text (Vietnamese ASR)

End-to-end Vietnamese speech recognition on a fine-tuned Gemma 3N, with the surrounding pipeline built from scratch.

🎯 7.21% WER on a 5,000-sample test set (0 empty predictions, ~97K reference words)

🧩 Production inference pipeline: Demucs → denoise → VAD → overlap-aware chunking → context-aware decoding

⚙️ PEFT/LoRA + 4-bit quantization via Unsloth — trainable on a single consumer GPU

📦 Clean, reproducible codebase: separate train / evaluate / predict modules
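
For readers unfamiliar with the metric, here is a minimal sketch of how a word error rate like the one above can be computed. It assumes corpus-level scoring (total word edits divided by total reference words) and plain whitespace tokenization; the repo's own evaluate module may normalize text or aggregate differently, and the function names are illustrative only.

```python
def edit_distance(ref_words, hyp_words):
    """Levenshtein distance between two word lists (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        cur = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            cost = 0 if ref_word == hyp_word else 1
            cur.append(min(prev[j] + 1,          # deletion
                           cur[j - 1] + 1,       # insertion
                           prev[j - 1] + cost))  # match or substitution
        prev = cur
    return prev[-1]


def word_error_rate(references, hypotheses):
    """Corpus-level WER (assumed aggregation): total edits / total reference words."""
    total_edits = 0
    total_words = 0
    for ref, hyp in zip(references, hypotheses):
        ref_words, hyp_words = ref.split(), hyp.split()
        total_edits += edit_distance(ref_words, hyp_words)
        total_words += len(ref_words)
    if total_words == 0:
        raise ValueError("references contain no words")
    return total_edits / total_words


# One deleted word out of four reference words.
print(word_error_rate(["xin chào các bạn"], ["xin chào bạn"]))  # 0.25
```

An empty prediction counts every reference word as a deletion under this definition, which is why the number of empty predictions is worth reporting alongside WER.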

🔗 Explore the repo → github.com/HieuNTg/Audio2Text
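
The overlap-aware chunking step in the pipeline above can be pictured with a small sketch. This is an assumption about the general technique, not the repo's implementation: it cuts fixed-length windows that share a fixed overlap, whereas the actual pipeline may cut along VAD boundaries. `chunk_spans` and its parameters are hypothetical names.

```python
def chunk_spans(total_samples, chunk_samples, overlap_samples):
    """Return (start, end) sample index pairs covering the audio,
    with consecutive chunks sharing `overlap_samples` samples."""
    if chunk_samples <= 0:
        raise ValueError("chunk_samples must be positive")
    if not 0 <= overlap_samples < chunk_samples:
        raise ValueError("overlap_samples must be in [0, chunk_samples)")

    spans = []
    step = chunk_samples - overlap_samples
    start = 0
    while start < total_samples:
        end = min(start + chunk_samples, total_samples)
        spans.append((start, end))
        if end == total_samples:
            break  # the last chunk reached the end of the audio
        start += step
    return spans


# 10 samples, windows of 4 with 1 sample shared between neighbours.
print(chunk_spans(10, 4, 1))  # [(0, 4), (3, 7), (6, 10)]
```

The shared region gives the decoder some context on both sides of each cut, so words that straddle a boundary are less likely to be dropped or duplicated when the transcripts are merged.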

🚀 Featured Projects

| Project | What it does | Tech |
| --- | --- | --- |
| 🎙️ Audio2Text | Vietnamese ASR toolkit on fine-tuned Gemma 3N: training, eval & production inference. 7.21% WER | `Gemma 3N` `PEFT/LoRA` `Unsloth` `Demucs` `VAD` |
| 📖 StoryForge | Multi-agent story generator: 13-agent drama simulation, LLM-as-judge auto-revision & RAG | `FastAPI` `LLM` `Multi-Agent` `RAG` |
| 🧑‍💼 AI HR Interview | Full-stack AI interviewer with real-time voice/video via Gemini Live + JD↔CV matching | `Next.js` `Gemini Live` `PostgreSQL` `Redis` |
| 💊 MedGraph | Drug-interaction cascade analyzer: knowledge graph over CYP450 pathways on real FDA data | `FastAPI` `React` `Knowledge Graph` |
| 🔢 Date-Recognition | Expiry-date OCR: YOLOv8 detection + CRNN/CTC recognition with Streamlit UI | `YOLOv8` `OCR` `CRNN` |
| 🙂 FaceReg | Real-time face recognition: MTCNN + FaceNet across image, video & live camera | `PyTorch` `MTCNN` `FaceNet` |

🛠️ Tech Stack

Languages · ML / LLM · Computer Vision · Speech · Backend · Infra

🔍 Currently Exploring

Efficient LLM fine-tuning & on-device / low-VRAM inference
Multi-agent systems and autonomous research workflows
Advanced OCR & document understanding

"Turn research into systems people can actually use."

📫 Reach me at nt.hieu2207@gmail.com · ⭐ Star anything you find useful!
