Nikolife13/machinelearning


machinelearning

Titanic Survival Prediction (Dockerized App)

This project is a Python-based data science application that predicts passenger survival on the Titanic using machine learning. It is fully containerized using Docker to ensure a consistent environment for training and evaluation.

📋 Features

Data Processing: Cleans the Titanic dataset, handles missing values, and encodes categorical features (e.g., Sex).

Machine Learning: Trains a LogisticRegression model to predict survival outcomes.

Data Visualization: Automatically generates visual insights:

survival.png: A bar chart showing survival counts.

age.png: A histogram representing the age distribution of passengers.

Containerization: Environment-agnostic execution using Docker.
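The data-processing step listed above could be sketched roughly as follows. This is an illustration under stated assumptions, not the project's actual `app.py`: the column names (`Survived`, `Pclass`, `Sex`, `Age`) follow the common layout of the Titanic dataset, filling missing ages with the median is an assumed strategy, and `clean_titanic` is a hypothetical helper name. A small made-up frame stands in for `titanic.csv`.

```python
import pandas as pd


def clean_titanic(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy with missing ages filled and Sex encoded as 0/1.

    Hypothetical helper; the fill strategy and encoding are assumptions.
    """
    out = df.copy()
    # Fill missing ages with the median of the known ages (assumed strategy).
    out["Age"] = out["Age"].fillna(out["Age"].median())
    # Encode the categorical Sex column numerically: male -> 0, female -> 1.
    out["Sex"] = out["Sex"].map({"male": 0, "female": 1})
    return out


# Made-up sample rows standing in for titanic.csv.
sample = pd.DataFrame(
    {
        "Survived": [0, 1, 1, 0],
        "Pclass": [3, 1, 3, 2],
        "Sex": ["male", "female", "female", "male"],
        "Age": [22.0, 38.0, None, 35.0],
    }
)

cleaned = clean_titanic(sample)
print(cleaned)
```

Working on a copy keeps the raw frame untouched, which makes it easier to compare the data before and after cleaning.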

🛠 Tech Stack

Language: Python 3.11

Data Analysis: Pandas, Scikit-learn

Visualization: Matplotlib

DevOps: Docker, GitHub Actions

🚀 Getting Started

Prerequisites

Docker installed on your machine.

(Optional) Git to clone the repository.

Local Setup & Execution

Build the Docker Image:

    docker build -t titanic-app .

Run the Container:

    docker run --name titanic-container titanic-app

View Results: The model accuracy will be printed in your terminal. To view the generated graphs, copy them from the container to your local machine:

    docker cp titanic-container:/app/survival.png ./survival.png
    docker cp titanic-container:/app/age.png ./age.png

🤖 GitHub Deployment (CI/CD)

This project is configured to run automatically via GitHub Actions. Every time you push code to the repository, GitHub will:

Initialize an Ubuntu runner.

Build the Docker image.

Run the container to verify the code and model training.

📁 File Structure

app.py: The main script containing data cleaning, model training, and plotting logic.

Dockerfile: Configuration for creating the Docker image.

requirements.txt: Python dependencies (pandas, matplotlib, scikit-learn).

titanic.csv: The dataset used for training and testing.
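As a rough idea of what the plotting logic attributed to `app.py` might look like, the sketch below writes `survival.png` and `age.png` with Matplotlib. It is not the project's actual code: `save_plots` is a hypothetical helper, the `Survived` and `Age` column names are assumed from the usual Titanic dataset layout, and the sample rows are made up. The non-interactive `Agg` backend is chosen on the assumption that the script runs in a container without a display.

```python
import os
import tempfile

import matplotlib

matplotlib.use("Agg")  # non-interactive backend; no display is needed
import matplotlib.pyplot as plt
import pandas as pd


def save_plots(df: pd.DataFrame, out_dir: str) -> list[str]:
    """Write survival.png and age.png into out_dir and return their paths."""
    survival_path = os.path.join(out_dir, "survival.png")
    age_path = os.path.join(out_dir, "age.png")

    # Bar chart of survival counts (0 = did not survive, 1 = survived).
    counts = df["Survived"].value_counts().sort_index()
    fig, ax = plt.subplots()
    ax.bar(counts.index.astype(str), counts.values)
    ax.set_xlabel("Survived")
    ax.set_ylabel("Passengers")
    fig.savefig(survival_path)
    plt.close(fig)

    # Histogram of passenger ages; rows with a missing age are dropped.
    fig, ax = plt.subplots()
    ax.hist(df["Age"].dropna(), bins=10)
    ax.set_xlabel("Age")
    ax.set_ylabel("Passengers")
    fig.savefig(age_path)
    plt.close(fig)

    return [survival_path, age_path]


# Made-up sample rows standing in for titanic.csv.
sample = pd.DataFrame(
    {
        "Survived": [0, 1, 1, 0, 1],
        "Age": [22.0, 38.0, None, 35.0, 4.0],
    }
)

out_dir = tempfile.mkdtemp()
paths = save_plots(sample, out_dir)
print(paths)
```

Inside the container, `out_dir` would presumably be the working directory (`/app` in the `docker cp` commands), so the images end up where those commands expect them.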

💡 Project Insights

The model uses features such as Passenger Class (Pclass), Sex, and Age to estimate survival probability. Running everything in Docker avoids the "it works on my machine" problem, so the analysis can be reproduced on any machine with Docker installed.
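To make the idea of a survival probability concrete, here is a minimal sketch that fits scikit-learn's `LogisticRegression` on `Pclass`, `Sex`, and `Age` and asks for the probability of the positive class. The training rows are made up for illustration, the `Sex` encoding (male = 0, female = 1) is an assumption, and the real project may split, scale, or tune the data differently.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Made-up, already-cleaned rows (Sex encoded as male=0, female=1; assumed).
train = pd.DataFrame(
    {
        "Pclass": [3, 1, 3, 1, 2, 3, 2, 1],
        "Sex": [0, 1, 1, 1, 0, 0, 1, 0],
        "Age": [22.0, 38.0, 26.0, 35.0, 54.0, 2.0, 27.0, 40.0],
        "Survived": [0, 1, 1, 1, 0, 0, 1, 0],
    }
)

features = ["Pclass", "Sex", "Age"]

# max_iter is raised so the solver has room to converge on unscaled features.
model = LogisticRegression(max_iter=1000)
model.fit(train[features], train["Survived"])

# Estimate the survival probability for one hypothetical passenger.
passenger = pd.DataFrame({"Pclass": [1], "Sex": [1], "Age": [30.0]})
# predict_proba returns one column per class; column 1 is class "1" (survived).
proba = model.predict_proba(passenger[features])[0, 1]
print(f"Estimated survival probability: {proba:.2f}")
```

With only eight invented rows the printed number means nothing statistically; the point is the shape of the workflow: select feature columns, fit, then read the second column of `predict_proba`.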

Developed as part of a Containerization & Docker learning journey.
