- M.Sc. in Artificial Intelligence & Machine Learning (CGPA: 9.3)
- Ex-ML Intern @ Tech Mahindra Makers Lab | AI Intern @ Codec Technologies (Current)
- GenAI & NLP Engineer: I build production-ready AI systems, not just models.
- Contributor to vinta/awesome-python, the #10 most starred repo on GitHub
- HackerRank: active problem solver in Python, SQL & Algorithms

Focus Areas:
- RAG Systems & LLM Pipelines
- Multilingual NLP (Indian Languages, 10,000+ dialect variations)
- Synthetic Data Generation (10× scaling)
- Document Intelligence & OCR
| # | Repo | Type | Description | Status |
|---|---|---|---|---|
| 1 | vinta/awesome-python | PR | fix: Remove duplicate ruff entry from Code Linters section (merged into the #10 most starred GitHub repo) | 🟣 Merged |
| 2 | recodehive/Opensource-practice | PR | fix: Resolve misplaced names & duplicate J section | 🟣 Merged |
| 3 | canonical/pycloudlib | PR | docs: Update contributing guidelines | 🟡 Open |
| 4 | topoteretes/cognee | PR | Add beginner-friendly example to documentation | 🟡 Open |
| 5 | recodehive/machine-learning-repos | Issue | [Enhancement] Add Multilingual & Indian Language NLP Resources | 🟢 Raised |
| 6 | recodehive/Opensource-practice | Issue | [Bug] Multiple names misplaced & duplicate section under letter J | 🟢 Raised |
| 7 | supabase/supabase-py | Issue | Suggestion to improve documentation clarity for beginners | 🟢 Raised |
| 8 | topoteretes/cognee | Issue | [Docs] Improve documentation with beginner-friendly knowledge graph example | 🟢 Raised |
| 9 | public-apis/public-apis | Issue | [Bug] Multiple defunct/dead APIs still listed in README (FTX, AnonFiles, MetaWeather & more) | 🟢 Raised |
| 10 | Mohitha1406/text-emotion-classifier | Issue | [Docs] Add dataset source and download instructions for emotion classification model | 🟢 Raised |
| 11 | shrutiii16/Traffic-Patterns | Issue | [Docs] Add README with project overview, dataset description and usage instructions | 🟢 Raised |
| 12 | EpistasisLab/tpot | Issue | [Docs] Fix typo: "feautures" should be "features" in README Tips section | 🟢 Raised |
| 13 | TheAlgorithms/Python | Issue | [Feature] Add Naive Bayes Text Classification Algorithm to `machine_learning/` | 🟢 Raised |
| 14 | datasciencemasters/go | Issue | [Docs] Update outdated book prices and mark freely available resources | 🟢 Raised |
| 15 | TheAlgorithms/Python | Review | [Code Review] Reviewed PR #14665: identified a type hint bug in `predict()`, missing edge-case doctests for empty string input, a validation order issue in `fit()`, and missing demo output in `__main__` | 🟡 Reviewed |
Actively solving problems in Python · SQL · Problem Solving · Algorithms
- Selected through the National Internship Portal (NCS) at Codec Technologies, a global IT & consultancy platform operating in 27+ countries
- Working on applied AI/ML project assignments under Codec Technologies' Google for Education partner platform framework
- Applying GenAI and deep learning skills to real-world business challenges
- Building domain expertise in AI systems, production deployment, and cross-cultural AI problem solving
- Completed an internship as a Data Analyst and Python Developer at Huban Technologies LLP
  - Developed an end-to-end OCR Invoice Data Extraction & Automation system using Python, EasyOCR, and machine learning techniques to digitize and process invoice documents automatically.
  - Built an intelligent pipeline to extract structured fields (invoice number, date, vendor, line items, totals) from scanned and digital invoices using OCR and NLP-based post-processing.
  - Automated data validation and export workflows, significantly reducing manual data entry effort and improving invoice processing accuracy through ML-based field classification.
  - Integrated OpenCV for image preprocessing (deskewing, noise removal, binarization) to improve OCR accuracy on low-quality scanned invoice documents.
- Built multilingual NLP pipelines handling 10,000+ dialect variations
- Developed ASR (Automatic Speech Recognition) systems with WER evaluation
- Designed automated annotation pipelines using NVIDIA NIM
- Built real-time translation systems for low-resource Indian languages
- Built LLM-based Synthetic Data Generation pipelines achieving 10× dataset expansion
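
For readers unfamiliar with the WER evaluation mentioned above, here is a minimal sketch. It assumes the standard definition of word error rate (word-level edit distance divided by the number of reference words) and simple whitespace tokenization. The function name `word_error_rate` and the sample sentences are hypothetical and are not taken from the actual evaluation code.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = (substitutions + deletions + insertions) / reference words.

    Illustrative only: tokenization is a plain whitespace split, with no
    text normalization (casing, punctuation) applied.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Hypothetical transcripts: one substitution and one deletion over six words.
    print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```

Because the denominator is the reference length, WER can exceed 1.0 when the hypothesis contains many inserted words.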
- Contribute to top open-source AI projects (LangChain, HuggingFace, etc.)
- Build and deploy scalable GenAI systems end-to-end
- Strengthen MLOps, system design & distributed training
- Earn HackerRank Gold badges in Python, SQL & Problem Solving
- Land a high-impact AI/ML engineering role
⭐ Building AI systems that actually work in the real world
