I am a data analyst and aspiring AI engineer passionate about turning data into insights and building machine learning solutions that create real business value.
After several years in telecommunications and customer-facing roles, I now focus on analytics, automation, and data-driven decision-making using Python, SQL, Power BI, and machine learning.
Tech: Python • Pandas • Parquet • Bronze/Silver/Gold Architecture
A complete data engineering project that simulates real-time customer event streams.
The project continuously generates customer events (logins, purchases, cancellations), consumes them in a streaming-style architecture, and processes them through a Lakehouse pipeline with Bronze, Silver, and Gold layers.
- Streaming simulation using JSON microbatches
- Automated ingestion into Bronze (Parquet)
- Silver: cleaned and standardized event table
- Gold: analytics-ready tables prepared for Power BI
- Bronze/Silver/Gold architecture both documented and implemented
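The layering described above can be sketched in a few lines. This is a simplified, standard-library illustration rather than the project's actual code: the event fields, the sample batch, and the function names (`to_bronze`, `to_silver`, `to_gold`) are assumptions made for the example, and the Parquet write is left out so the sketch runs without extra dependencies.

```python
import json
from collections import Counter
from datetime import datetime, timezone

# Hypothetical JSON microbatch: one event per line.
RAW_BATCH = """\
{"customer_id": "c1", "event": "Login", "ts": "2024-01-01T10:00:00"}
{"customer_id": "c2", "event": "purchase ", "ts": "2024-01-01T10:05:00", "amount": 49.9}
{"customer_id": null, "event": "login", "ts": "2024-01-01T10:06:00"}
{"customer_id": "c1", "event": "cancellation", "ts": "not-a-date"}
{"customer_id": "c1", "event": "PURCHASE", "ts": "2024-01-01T11:00:00", "amount": 10.0}
"""


def to_bronze(raw_text):
    """Bronze: keep every raw record as-is and add only ingestion metadata."""
    ingested_at = datetime.now(timezone.utc).isoformat()
    return [
        {"raw": json.loads(line), "ingested_at": ingested_at}
        for line in raw_text.splitlines()
        if line.strip()
    ]


def to_silver(bronze):
    """Silver: drop invalid rows and standardize event names and types."""
    silver = []
    for row in bronze:
        rec = row["raw"]
        if not rec.get("customer_id"):
            continue  # a missing customer id makes the event unusable
        try:
            ts = datetime.fromisoformat(rec["ts"])
        except (KeyError, ValueError):
            continue  # missing or unparseable timestamp
        silver.append({
            "customer_id": rec["customer_id"],
            "event": rec["event"].strip().lower(),
            "ts": ts,
            "amount": float(rec.get("amount", 0.0)),
        })
    return silver


def to_gold(silver):
    """Gold: aggregate cleaned events into analytics-ready summaries."""
    revenue = {}
    for r in silver:
        if r["event"] == "purchase":
            revenue[r["customer_id"]] = revenue.get(r["customer_id"], 0.0) + r["amount"]
    return {
        "events_by_type": dict(Counter(r["event"] for r in silver)),
        "revenue_by_customer": revenue,
    }


bronze = to_bronze(RAW_BATCH)
silver = to_silver(bronze)
gold = to_gold(silver)
print(gold)
```

Keeping Bronze untouched means a bug in the Silver rules can be fixed and the later layers rebuilt from the raw events.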
Tech: FastAPI • Embeddings • Mock Mode • Retrieval Logic
An AI-powered support API combining retrieval, embeddings, and RAG-inspired reasoning.
The project includes a complete mock mode, allowing the entire system to be tested without any API costs, making it ideal for learning and portfolio development.
- FastAPI application with complete endpoint structure
- Embeddings (OpenAI or cost-free synthetic embeddings)
- RAG-style response generation based on relevant documents
- Structured response models with metadata and similarity scores
- Clean architecture and testability
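A rough idea of how a cost-free mock mode can work: replace the real embedding call with a deterministic hashing embedding, then rank documents by cosine similarity. Everything here is an assumption for illustration (the `mock_embed` function, the 64-dimension size, the `Match` model, and the sample documents); the FastAPI layer and the real embedding provider are deliberately left out.

```python
import hashlib
import math
from dataclasses import dataclass

DIM = 64  # hypothetical embedding size for the mock

# Hypothetical knowledge base.
DOCS = {
    "password": "how do i reset my password",
    "refund": "how long does a refund take",
    "cancel": "how do i cancel my subscription",
}


@dataclass
class Match:
    """Structured result carrying the similarity score as metadata."""
    doc_id: str
    text: str
    similarity: float


def mock_embed(text):
    """Deterministic bag-of-words hashing embedding: no API call, no cost."""
    vec = [0.0] * DIM
    for token in text.lower().split():
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        vec[int.from_bytes(digest[:4], "big") % DIM] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def cosine(a, b):
    # Both vectors are L2-normalized, so the dot product is the cosine.
    return sum(x * y for x, y in zip(a, b))


def retrieve(query, documents, top_k=2):
    """Return the top_k documents most similar to the query."""
    q = mock_embed(query)
    matches = [
        Match(doc_id, text, round(cosine(q, mock_embed(text)), 4))
        for doc_id, text in documents.items()
    ]
    matches.sort(key=lambda m: m.similarity, reverse=True)
    return matches[:top_k]


for match in retrieve("i forgot my password", DOCS):
    print(match)
```

Because the mock is deterministic, the same query always yields the same ranking, which keeps tests repeatable. Hash collisions mean the scores only loosely track meaning, so this is a stand-in for testing, not a substitute for real embeddings.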
Tech: Python • Streamlit • TF-IDF • Cosine Similarity
A FAQ chatbot built using NLP retrieval techniques that matches user questions against a knowledge base using TF-IDF vectorization and cosine similarity.
An interactive Streamlit interface makes it easy to test and adjust confidence thresholds and retrieval settings.
- TF-IDF + cosine similarity matching
- Adjustable confidence threshold and top-k retrieval
- Interactive Streamlit web application
- Easily extendable to embeddings and RAG architectures
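The matching step can be illustrated with a small, hand-rolled TF-IDF in pure Python. This is a sketch under assumptions, not the chatbot's code: the FAQ entries, the `answer` function, and the default threshold are made up for the example, and a real project would more likely use a library vectorizer. The smoothed IDF formula used here is the one scikit-learn applies by default.

```python
import math
from collections import Counter

# Hypothetical knowledge base: question -> answer.
FAQ = {
    "How do I reset my password?": "Use the reset link on the login page.",
    "How can I cancel my subscription?": "Go to Settings and choose Cancel.",
    "Which payment methods do you accept?": "We accept cards and invoices.",
}


def tokenize(text):
    tokens = (t.strip("?.,!").lower() for t in text.split())
    return [t for t in tokens if t]


def vectorize(counts, idf):
    """Turn term counts into an L2-normalized TF-IDF vector (as a dict)."""
    vec = {t: c * idf[t] for t, c in counts.items() if t in idf}
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm else {}


def build_index(questions):
    docs = [Counter(tokenize(q)) for q in questions]
    n = len(docs)
    df = Counter(term for d in docs for term in d)
    # Smoothed inverse document frequency.
    idf = {t: math.log((1 + n) / (1 + c)) + 1 for t, c in df.items()}
    return idf, [vectorize(d, idf) for d in docs]


def cosine(a, b):
    # Vectors are normalized, so the dot product is the cosine similarity.
    return sum(w * b.get(t, 0.0) for t, w in a.items())


QUESTIONS = list(FAQ)
IDF, DOC_VECTORS = build_index(QUESTIONS)


def answer(query, threshold=0.3, top_k=2):
    """Return up to top_k (question, score) pairs scoring at least threshold."""
    q = vectorize(Counter(tokenize(query)), IDF)
    scored = [(question, cosine(q, vec)) for question, vec in zip(QUESTIONS, DOC_VECTORS)]
    scored = [pair for pair in scored if pair[1] >= threshold]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


print(answer("reset password"))
print(answer("weather forecast tomorrow"))  # no shared terms, so nothing passes
```

Raising `threshold` makes the bot refuse more often instead of guessing; raising `top_k` returns more candidate answers per question.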
Tech: Python • Scikit-Learn • Pandas • Seaborn
Customer segmentation using K-Means clustering to identify distinct customer groups based on demographics and purchasing behavior.
- K-Means clustering with StandardScaler
- Elbow Method for optimal cluster selection
- Cluster visualization and interpretation
- Insight-driven customer analysis
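To show what scaling and the Elbow Method do, here is a from-scratch sketch in pure Python. It is not the project's scikit-learn code: the customer data is invented, `standardize` mimics what `StandardScaler` does (zero mean, unit population variance), and the deterministic farthest-first initialization is a swapped-in simplification of scikit-learn's default k-means++ seeding.

```python
import math

# Hypothetical customers: (age, annual_spend).
CUSTOMERS = [
    (22, 300), (25, 350), (24, 320),
    (48, 2100), (52, 2300), (50, 2200),
    (35, 1100), (37, 1200), (36, 1150),
]


def standardize(rows):
    """Scale each column to zero mean and unit (population) variance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) for c, m in zip(cols, means)]
    return [tuple((v - m) / s for v, m, s in zip(row, means, stds)) for row in rows]


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def init_centroids(points, k):
    """Farthest-first seeding: deterministic, chosen here for reproducibility."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centroids)))
    return centroids


def assign(points, centroids):
    return [min(range(len(centroids)), key=lambda j: sq_dist(p, centroids[j])) for p in points]


def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm; returns (labels, inertia)."""
    centroids = init_centroids(points, k)
    for _ in range(max_iter):
        labels = assign(points, centroids)
        updated = []
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            updated.append(
                tuple(sum(axis) / len(members) for axis in zip(*members)) if members else centroids[j]
            )
        if updated == centroids:
            break  # converged: centroids stopped moving
        centroids = updated
    labels = assign(points, centroids)
    inertia = sum(sq_dist(p, centroids[label]) for p, label in zip(points, labels))
    return labels, inertia


points = standardize(CUSTOMERS)
for k in range(1, 5):
    _, inertia = kmeans(points, k)
    print(f"k={k}  inertia={inertia:.3f}")  # look for the bend in this curve
```

Without scaling, annual spend would dominate the distance simply because its numbers are larger than ages. The elbow is the value of k after which inertia stops dropping sharply.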
Tech: Scikit-Learn • XGBoost • Pandas
A complete churn prediction project comparing multiple machine learning models and evaluation techniques to identify customers at high risk of leaving.
- Logistic Regression, Random Forest, and XGBoost
- ROC-AUC, Precision, Recall, and F1 evaluation
- End-to-end preprocessing and modeling pipeline
- Analysis of churn drivers and customer behavior
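For readers less familiar with the evaluation metrics listed above, this sketch computes them by hand on a tiny invented example. The labels, scores, and the 0.5 decision threshold are assumptions; in the project itself these numbers would come from a library such as scikit-learn.

```python
def precision_recall_f1(y_true, y_pred):
    """Precision, recall, and F1 for the positive (churn = 1) class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


def roc_auc(y_true, scores):
    """Probability that a random churner is scored above a random non-churner.

    Ties count as half a win. Assumes both classes are present.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical ground truth (1 = churned) and predicted churn probabilities.
y_true = [1, 0, 1, 0, 0, 1, 0, 1]
scores = [0.9, 0.2, 0.4, 0.6, 0.1, 0.8, 0.3, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed decision threshold

precision, recall, f1 = precision_recall_f1(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f} roc_auc={auc:.4f}")
```

Precision, recall, and F1 depend on the chosen threshold, while ROC-AUC judges the ranking of the scores across all thresholds. That difference is why churn models are usually compared on both.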
Tech: Python • Scikit-Learn • XGBoost
A regression project focused on predicting house prices using a rich set of numerical and categorical features.
- Linear Regression, Random Forest, and XGBoost
- RMSE, MAE, and R² evaluation metrics
- Feature engineering and preprocessing
- Model explainability and interpretation
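The three regression metrics above are simple to compute directly, which helps when interpreting them. The prices and predictions below are invented for illustration, and `regression_metrics` is a hypothetical helper rather than a function from the project.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (rmse, mae, r2) for paired true and predicted values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)                 # unexplained variation
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total variation
    r2 = 1 - ss_res / ss_tot
    return rmse, mae, r2


# Hypothetical house prices in thousands.
actual = [200, 300, 400, 500]
predicted = [210, 290, 420, 480]

rmse, mae, r2 = regression_metrics(actual, predicted)
print(f"RMSE={rmse:.2f}  MAE={mae:.2f}  R2={r2:.3f}")
```

RMSE and MAE are in the same unit as the price, so they read as a typical error size, with RMSE penalizing large misses more heavily. R² is unitless and compares the model against always predicting the mean price.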
Tech: Power BI • DAX • Data Cleaning
An interactive Power BI dashboard analyzing Norway's aquaculture industry with a focus on regions, production volume, and long-term trends.
- Data cleaning and modeling
- DAX measures and KPI development
- Geographic visualizations
- Strong focus on storytelling and business insights
Python • SQL • Power BI • Scikit-Learn • Pandas • NumPy • Matplotlib • Seaborn • XGBoost
Git • GitHub • VS Code • Jupyter Notebook
Exploratory Data Analysis (EDA) • Feature Engineering • Classification • Regression • Clustering • Model Evaluation • Data Visualization
FastAPI • Streamlit • NLP • TF-IDF • Embeddings • Retrieval Systems • RAG-Inspired Architectures • API Development
- Learn Python 3
- Analyze Data with SQL
- Analyze Data with Microsoft Excel
- BI Dashboards with Power BI
- Data and Programming Foundations for AI
- Data Scientist: Analytics (Codecademy)
- Data Scientist: Machine Learning Specialist
- Build Chatbots with Python
- Creating AI Applications Using RAG
- Learn How to Build AI Agents
- Time Series Analysis and Forecasting
- AI Agents and LLM-Powered Applications
- Retrieval-Augmented Generation (RAG)
- MLOps and Production Machine Learning
- Data Engineering and Lakehouse Architectures
- Advanced Power BI and PL-300 Certification Preparation
I am building a portfolio that combines business understanding with technical expertise and shows how data, machine learning, and AI can produce practical insights and real-world business value.
I enjoy working at the intersection of analytics, software development, and product thinking, with a strong interest in AI-powered applications and data-driven decision-making.
💬 Open to collaboration, feedback, and professional discussions within Data, AI, Analytics, and Software Engineering.
GitHub: https://github.com/Runar-Olsen