pavanmanjunath18/README.md

Hi, I'm Pavan Manjunath 👋

I studied Computer Science at Arizona State University (graduated May 2026, ~3.8 GPA), with a minor in Data Science.

My strongest background is in analytics and data science: working with data, finding patterns in it, and trying to make it legible to people who need to act on it. I've built dashboards, written pipelines, done exploratory analysis, and learned that most data problems are harder than they look.

Lately I've been getting curious about something adjacent: how the systems that produce and serve data actually work. The backend plumbing, the deployment story, what happens when a model hits production and breaks in a way your notebook never predicted.

I'm not an infrastructure engineer. But I'm genuinely trying to understand how modern AI and data systems are built, not just how to use them.

I use AI tools in my development process. They help me move faster, but I try not to use them as a reason to understand less.


🧠 What I'm actually good at

  • Exploratory data analysis and making sense of messy datasets
  • Building dashboards and visualizations that communicate something true
  • Writing SQL that does what you think it does
  • Data workflows, cleaning, and the unglamorous parts of working with real data
  • Applying ML and AI tools to practical problems
  • Thinking through what a metric actually measures

📖 What I'm actively learning

  • How backend systems are structured, and why those choices matter
  • What retrieval-augmented generation actually does under the hood (not just the API)
  • Production patterns for AI applications: latency, reliability, observability
  • How deployed systems behave differently than notebook experiments
  • Infrastructure basics: containers, CI/CD, what makes systems debuggable

I find learning by building more useful than learning by reading. So most of what's below came from trying to build something real and running into things I didn't understand.


🔨 Selected projects

ReadmitScope US · Live Dashboard

Healthcare analytics project on CMS Medicare hospital readmissions. I built the full workflow: live CMS data pull, cleaning, data quality logs, notebooks, statistical analysis, enrichment with hospital ownership/star ratings, and a deployed React dashboard.

Python pandas scipy scikit-learn React TypeScript Recharts Vercel

ReadmitScope dashboard hero

Signal: Data Analyst / Data Scientist - exploratory analysis, statistical testing, healthcare metric framing, deployed BI-style dashboard.

Live Demo · Repository


CardioScope 3D · Live Demo

Interactive 3D cardiovascular risk explorer using the UCI Cleveland Heart Disease dataset. The app projects 297 patients into PCA space, supports k-means clustering, trains a logistic regression risk model, and lets users simulate a patient to inspect risk and top contributing features.

React TypeScript Three.js PCA k-means Logistic Regression Data Visualization

CardioScope 3D preview

Signal: Data Scientist / Visualization Engineer - dimensionality reduction, clustering, predictive modeling, explainable feature contributions, interactive analytics UI.
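
One common way to get "top contributing features" out of a logistic regression model is to multiply each coefficient by the standardized feature value and rank the results. The sketch below illustrates that idea only; it is not taken from the CardioScope 3D code. The function name `risk_and_contributions` and every number (means, standard deviations, weights) are hypothetical. The feature names `age`, `chol`, and `thalach` are columns in the UCI Cleveland dataset.

```python
import math


def risk_and_contributions(features, means, stds, weights, bias):
    """Return (probability, contributions ranked by absolute size).

    Each contribution is weight * z-score, so it is that feature's
    additive share of the model's log-odds for this patient.
    """
    contributions = {}
    for name, value in features.items():
        z = (value - means[name]) / stds[name]  # standardize the raw value
        contributions[name] = weights[name] * z
    logit = bias + sum(contributions.values())
    probability = 1.0 / (1.0 + math.exp(-logit))  # sigmoid
    ranked = sorted(contributions.items(), key=lambda kv: abs(kv[1]), reverse=True)
    return probability, ranked


# Hypothetical scaler statistics and coefficients, for illustration only.
means = {"age": 54.0, "chol": 247.0, "thalach": 150.0}
stds = {"age": 9.0, "chol": 52.0, "thalach": 23.0}
weights = {"age": 0.4, "chol": 0.2, "thalach": -0.6}

# A simulated patient: older than average, lower max heart rate than average.
patient = {"age": 63.0, "chol": 247.0, "thalach": 127.0}

prob, ranked = risk_and_contributions(patient, means, stds, weights, bias=0.0)
for name, contribution in ranked:
    print(f"{name}: {contribution:+.2f}")
print(f"risk: {prob:.3f}")
```

Because the contributions sum to the log-odds, they are easy to show in a UI, but they describe the model's arithmetic, not causal effects.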


Enterprise Retail Lakehouse

Post-acquisition retail data integration project that consolidates parent and acquired-company data into a Databricks lakehouse. Implements Bronze/Silver/Gold layers, Delta MERGE upserts, S3 landing-zone ingestion, incremental processing, grain alignment, and a gold star-schema analytics view.

Databricks PySpark Delta Lake AWS S3 SQL Unity Catalog Data Engineering

Lakehouse dashboard

Signal: Data Engineer / Analytics Engineer - medallion architecture, schema harmonization, incremental loads, quality remediation, dashboard-ready data modeling.


SaaS Revenue & Churn Intelligence

Subscription analytics platform that turns raw billing events into MRR, ARR, NRR, GRR, churn, cohort retention, and customer health intelligence. The core model is a PostgreSQL MRR state machine using window functions and materialized views.

Python PostgreSQL SQL Streamlit Plotly Data Modeling Analytics Engineering

SaaS dashboard

Signal: Data Analyst / Analytics Engineer - SaaS metric design, SQL modeling, synthetic data generation, validation checks, executive dashboarding.
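
For readers unfamiliar with NRR and GRR, here is a minimal sketch of how the two retention rates are conventionally defined for a cohort of existing customers over one period. It is plain Python for illustration and does not reproduce the project's PostgreSQL model. The function name `retention_rates` and the sample figures are hypothetical.

```python
def retention_rates(starting_mrr, expansion, contraction, churned):
    """Return (nrr, grr) as fractions of the cohort's starting MRR.

    GRR counts only losses (downgrades and cancellations).
    NRR also credits expansion from the same customers.
    MRR from new customers is excluded from both.
    """
    if starting_mrr <= 0:
        raise ValueError("starting_mrr must be positive")
    grr = (starting_mrr - contraction - churned) / starting_mrr
    nrr = (starting_mrr + expansion - contraction - churned) / starting_mrr
    return nrr, grr


# Hypothetical month: $100k starting MRR, $12k upgrades,
# $3k downgrades, $5k lost to cancellations.
nrr, grr = retention_rates(100_000, 12_000, 3_000, 5_000)
print(f"NRR: {nrr:.1%}  GRR: {grr:.1%}")
```

GRR can never exceed 100%, while NRR can, which is why the two are usually reported together.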


FraudStream AI

Real-time transaction risk platform with an XGBoost fraud model, SHAP explanations, deterministic risk rules, drift monitoring, FastAPI backend, and Next.js dashboard. Designed to show the ML engineering loop from synthetic data to scoring API to monitoring.

Python XGBoost SHAP FastAPI PostgreSQL Redis Next.js ML Engineering

FraudStream dashboard

Signal: Data Scientist / ML Engineer - feature engineering, model evaluation, explainability, fraud rules, inference API, monitoring.


Iowa Nitrate Analysis

Geospatial analysis of agricultural intensity and nitrate contamination across Iowa counties from 2010-2017. Combines USDA crop acreage, EPA water-quality data, county shapefiles, correlation analysis, maps, trend plots, and clear limitations.

Python pandas GeoPandas Matplotlib SciPy Geospatial Analysis Jupyter

Iowa nitrate map

Signal: Data Analyst / Data Scientist - research framing, geospatial joins, correlation analysis, environmental data storytelling, statistical caveats.


πŸ—‚οΈ Background

At ASU's Community Engagement department (Social Embeddedness), I worked as a data specialist: maintaining a database, building dashboards in Tableau and Power BI, and translating operational data into something decision-makers could actually use. That work shaped how I think about analytics: the visualization is the easy part; the hard part is understanding what question you're actually trying to answer.

Before that, I interned at a startup doing data work: pipelines, SQL validation, scraping. I learned that data quality problems are mostly discovered after someone has already trusted the bad data.

I'm from India, and I've been navigating school and career in the US as an international student. It's taught me to be resourceful and to not take shortcuts when I can't afford them.


💡 Things I find genuinely interesting

  • Why distributed systems fail in non-obvious ways
  • How operational data can improve human decision-making (not replace it)
  • The design of systems that are debuggable, not just functional
  • What makes an AI feature actually useful in practice vs. impressive in a demo
  • The gap between analytics and engineering, and what lives in it

💬 A few things I think are true

Good analysis and good engineering both require the same thing: understanding the problem well enough to know which simplifications are safe.

The most useful AI tools are the ones that make people better at their jobs. The rest are demos.

"It works on my machine" is not the same as working.


📬 Reach me

Email: pvmmallipudi@gmail.com
LinkedIn: linkedin.com/in/pavan-mallipudi


Still figuring things out. Enjoying the process.

Pinned

  1. saas-revenue-churn-intelligence (Public)

    B2B SaaS analytics platform for MRR, churn, cohort retention, and customer health intelligence using Python, PostgreSQL, SQL, and Streamlit.

    Python

  2. enterprise-retail-lakehouse-aws-databricks (Public)

    Python

  3. stock-trading-dashboard (Public)

    Real-time stock analysis dashboard with Streamlit, technical indicators (SMA, Bollinger Bands, RSI), and a Groq LLM chatbot grounded in live market data

    Python

  4. cardioscope-3d (Public)

    TypeScript

  5. readmitscope (Public)

    Healthcare analytics dashboard for CMS Medicare hospital readmission performance

    Jupyter Notebook