machinelearning

Titanic Survival Prediction (Dockerized App)

This project is a Python-based data science application that predicts passenger survival on the Titanic using Machine Learning. It is fully containerized using Docker to ensure a consistent environment for training and evaluation.

📋 Features

Data Processing: Cleans the Titanic dataset, handles missing values, and encodes categorical features (e.g., Sex).

Machine Learning: Trains a LogisticRegression model to predict survival outcomes.

Data Visualization: Automatically generates visual insights:

survival.png: A bar chart showing survival counts.

age.png: A histogram representing the age distribution of passengers.

Containerization: Environment-agnostic execution using Docker.
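The data processing and visualization features listed above could look roughly like the sketch below. It is not the project's actual app.py: the rows are made up for illustration, and the median fill for Age, the 0/1 encoding for Sex, and the helper names `clean` and `save_plots` are all assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, since a container has no display
import matplotlib.pyplot as plt
import pandas as pd

# Made-up rows for illustration only; the real app reads titanic.csv.
df = pd.DataFrame({
    "Survived": [0, 1, 1, 0, 1, 0],
    "Pclass": [3, 1, 3, 2, 1, 3],
    "Sex": ["male", "female", "female", "male", "female", "male"],
    "Age": [22.0, 38.0, None, 35.0, None, 54.0],
})


def clean(frame: pd.DataFrame) -> pd.DataFrame:
    """Fill missing ages with the median and encode Sex as 0/1 (assumed scheme)."""
    out = frame.copy()
    out["Age"] = out["Age"].fillna(out["Age"].median())
    out["Sex"] = out["Sex"].map({"male": 0, "female": 1})
    return out


def save_plots(frame: pd.DataFrame,
               survival_path: str = "survival.png",
               age_path: str = "age.png") -> None:
    """Write a survival-count bar chart and an age histogram to PNG files."""
    fig, ax = plt.subplots()
    frame["Survived"].value_counts().sort_index().plot(kind="bar", ax=ax)
    ax.set_xlabel("Survived (0 = no, 1 = yes)")
    ax.set_ylabel("Passengers")
    fig.savefig(survival_path)
    plt.close(fig)

    fig, ax = plt.subplots()
    ax.hist(frame["Age"], bins=10)
    ax.set_xlabel("Age")
    ax.set_ylabel("Passengers")
    fig.savefig(age_path)
    plt.close(fig)


cleaned = clean(df)
```

Calling `save_plots(cleaned)` writes both PNG files to the current directory, which matches the file names the README mentions.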

🛠 Tech Stack

Language: Python 3.11

Data Analysis: Pandas, Scikit-learn

Visualization: Matplotlib

DevOps: Docker, GitHub Actions

🚀 Getting Started

Prerequisites

Docker installed on your machine.

(Optional) Git to clone the repository.

Local Setup & Execution

Build the Docker Image:

    docker build -t titanic-app .

Run the Container:

    docker run --name titanic-container titanic-app

View Results: The model accuracy will be printed in your terminal. To view the generated graphs, copy them from the container to your local machine:

    docker cp titanic-container:/app/survival.png ./survival.png
    docker cp titanic-container:/app/age.png ./age.png

🤖 GitHub Deployment (CI/CD)

This project is configured to run automatically via GitHub Actions. Every time you push code to the repository, GitHub will:

Initialize an Ubuntu runner.

Build the Docker image.

Run the container to verify the code and model training.
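To try the same build-and-run check locally before pushing, the steps above can be approximated with a small Python script. This is only a sketch, not the project's workflow file: the helper names `ci_commands` and `run_ci` are hypothetical, and the `--rm` flag is an assumption added here so that repeated runs do not leave a stopped container behind.

```python
import subprocess

IMAGE = "titanic-app"  # image tag used in the build step of this README


def ci_commands(image: str = IMAGE) -> list[list[str]]:
    """Return the build and run commands as argument lists."""
    return [
        ["docker", "build", "-t", image, "."],
        # --rm deletes the container once it exits (assumption, not in the README)
        ["docker", "run", "--rm", image],
    ]


def run_ci(image: str = IMAGE) -> None:
    """Run each command in order and stop at the first failure."""
    for cmd in ci_commands(image):
        print("$", " ".join(cmd))
        # check=True raises CalledProcessError on a non-zero exit code
        subprocess.run(cmd, check=True)
```

Calling `run_ci()` from the repository root requires Docker to be installed. Because `--rm` removes the container on exit, the `docker cp` step for the graphs would not work with this variant; it only verifies that the image builds and the script runs.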

📁 File Structure

app.py: The main script containing data cleaning, model training, and plotting logic.

Dockerfile: Configuration for creating the Docker image.

requirements.txt: Python dependencies (pandas, matplotlib, scikit-learn).

titanic.csv: The dataset used for training and testing.

💡 Project Insights

The model uses features like Passenger Class (Pclass), Sex, and Age to determine survival probability. By using a Dockerized approach, we eliminate the "it works on my machine" problem, making the analysis reproducible anywhere.
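As a rough illustration of how Pclass, Sex, and Age can feed a LogisticRegression model, here is a minimal sketch. The rows are invented for the example (they are not taken from titanic.csv), Sex is assumed to be already encoded as 0 = male and 1 = female, and the split ratio, `random_state`, and `max_iter` values are arbitrary choices rather than the project's settings.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

FEATURES = ["Pclass", "Sex", "Age"]

# Invented rows for illustration only; Sex is pre-encoded (0 = male, 1 = female).
df = pd.DataFrame({
    "Pclass":   [1, 1, 2, 2, 3, 3, 1, 2, 3, 3, 1, 2, 3, 3, 1, 2],
    "Sex":      [1, 0, 1, 0, 1, 0, 1, 1, 0, 0, 1, 0, 1, 0, 0, 1],
    "Age":      [29, 45, 31, 40, 18, 25, 52, 24, 33, 60, 36, 28, 8, 21, 47, 30],
    "Survived": [1, 0, 1, 0, 1, 0, 1, 1, 0, 0, 1, 0, 1, 0, 1, 1],
})

X = df[FEATURES]
y = df["Survived"]

# Hold out a quarter of the rows; stratify keeps the class balance in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42, stratify=y
)

model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# Mean accuracy on the held-out rows.
accuracy = model.score(X_test, y_test)
print(f"Accuracy: {accuracy:.2f}")

# Survival probability for one hypothetical passenger.
passenger = pd.DataFrame([{"Pclass": 1, "Sex": 1, "Age": 30}])[FEATURES]
proba = model.predict_proba(passenger)  # columns: P(not survived), P(survived)
print(f"Estimated survival probability: {proba[0, 1]:.2f}")
```

With only sixteen invented rows the printed numbers mean nothing in themselves; the point is the shape of the pipeline, which stays the same when the real dataset is loaded instead.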

Developed as part of a Containerization & Docker learning journey.
