A small Python data environment and learning repository for my local developer setup.
This repository verifies that a Python 3.12 data workflow works correctly with a project-specific virtual environment, PyCharm, DataSpell, Jupyter Notebook, pandas, NumPy, matplotlib, scikit-learn, Git, and GitHub.
This is not intended to be a large data science project. It is a compact baseline repository for:
- verifying a clean Python data environment
- practicing Python data analysis fundamentals
- using a project-specific virtual environment
- working with pandas, NumPy, matplotlib, and scikit-learn
- testing PyCharm and DataSpell with the same interpreter
- keeping a clean Git/GitHub workflow for Python projects
Python is one of my core technical skills in my current learning path. My main focus is data and process analysis, SQL, Python, BI, and Microsoft-oriented data tooling.
- iMac Retina 4K, 21.5-inch, Late 2015
- Intel x86_64
- macOS Sonoma 14.8.7 via OpenCore Legacy Patcher
- PyCharm via JetBrains Toolbox
- DataSpell via JetBrains Toolbox
- Python 3.12.13
- Project-specific .venv
- pandas
- NumPy
- matplotlib
- scikit-learn
- Jupyter Notebook
- Git / GitHub
This repository also documents that the Python data stack works on a legacy Intel Mac setup used as a stable learning and development machine.
python-data-basics/
├── main.py
├── dataspell_test.ipynb
├── README.md
├── requirements.txt
├── requirements-core.txt
├── LICENSE
├── .editorconfig
└── .gitignore
Local virtual environments, IDE metadata, cache files, and machine-specific files are intentionally excluded from Git.
.venv/
.idea/
__pycache__/
*.pyc
.DS_Store
Create the virtual environment with Python 3.12:
/usr/local/bin/python3.12 -m venv .venv
Activate it:
source .venv/bin/activate
Install the exact tested dependency set:
python -m pip install -r requirements.txt
Alternatively, install only the core packages:
python -m pip install -r requirements-core.txt
Run:
python main.py
The script verifies the interpreter and package versions, creates a small example DataFrame, and trains a minimal logistic regression model on synthetic learning data.
Expected output includes:
Python data environment check
Python: 3.12.13
pandas
NumPy
matplotlib
scikit-learn
Minimal logistic regression example
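A version report like the one above could be produced roughly as follows. This is a minimal sketch, not the actual contents of main.py: the names `PACKAGES`, `installed_version`, and `build_report` are hypothetical, and the exact output format of the real script may differ. It uses only the standard library, so it also runs when a package is missing.

```python
import sys
from importlib import metadata

# Assumption: these are the PyPI distribution names of the core packages.
PACKAGES = ["pandas", "numpy", "matplotlib", "scikit-learn"]


def installed_version(dist_name: str) -> str:
    """Return the installed version of a distribution, or a marker if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return "not installed"


def build_report() -> list[str]:
    """Collect report lines: a header, the interpreter, then each package."""
    python_version = ".".join(str(part) for part in sys.version_info[:3])
    lines = ["Python data environment check", f"Python: {python_version}"]
    lines += [f"{name}: {installed_version(name)}" for name in PACKAGES]
    return lines


if __name__ == "__main__":
    print("\n".join(build_report()))
```

Reading versions through `importlib.metadata` avoids importing the packages themselves, so a broken install shows up as "not installed" instead of a traceback.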
The notebook dataspell_test.ipynb verifies that DataSpell uses the same project-specific Python 3.12 virtual environment and can import the core data stack.
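A notebook cell along the following lines can confirm which interpreter is in use. This is an illustrative sketch, not the actual notebook content; the helper names `in_virtualenv` and `describe_interpreter` are hypothetical. It relies on the fact that inside a virtual environment `sys.prefix` differs from `sys.base_prefix`.

```python
import sys
from pathlib import Path


def in_virtualenv() -> bool:
    """True when the interpreter runs inside a virtual environment."""
    return sys.prefix != sys.base_prefix


def describe_interpreter() -> dict[str, str]:
    """Summarize where the running interpreter lives."""
    return {
        "executable": sys.executable,
        "prefix": sys.prefix,
        "env_name": Path(sys.prefix).name,  # expected to be ".venv" in this setup
        "in_venv": str(in_virtualenv()),
    }


for key, value in describe_interpreter().items():
    print(f"{key}: {value}")
```

Running the same cell in PyCharm and DataSpell and comparing the `executable` lines is one way to check that both IDEs point at the same environment.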
This repository demonstrates a working Python data baseline setup using:
- Python 3.12
- a project-specific virtual environment
- pandas for tabular data handling
- NumPy for numerical work
- matplotlib, verified as available for visualization
- scikit-learn for a minimal machine learning example
- Jupyter Notebook / DataSpell for notebook-based work
- PyCharm as the primary Python IDE
- Git and GitHub for version control
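As a rough illustration of how the pandas, NumPy, and scikit-learn pieces listed above fit together, a minimal logistic regression on synthetic data might look like this. The data set (hours studied versus pass/fail), the column names, and the random seed are made up for the example and are not taken from main.py.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Fixed seed so the synthetic data is reproducible.
rng = np.random.default_rng(seed=42)

# Hypothetical learning data: more study hours make passing more likely.
hours = rng.uniform(0, 10, size=200)
noise = rng.normal(0, 1.5, size=200)
df = pd.DataFrame({"hours": hours, "passed": (hours + noise > 5).astype(int)})

# Hold out a quarter of the rows to evaluate the fitted model.
X_train, X_test, y_train, y_test = train_test_split(
    df[["hours"]], df["passed"], test_size=0.25, random_state=42
)

model = LogisticRegression()
model.fit(X_train, y_train)
accuracy = model.score(X_test, y_test)

print(df.head())
print(f"Test accuracy: {accuracy:.2f}")
```

The goal here is only to exercise the stack end to end; the accuracy figure says nothing about real data.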
Possible future additions:
- a small CSV analysis example
- a simple matplotlib visualization
- a notebook with a short exploratory data analysis workflow
- a basic logistic regression example with clearer explanation
- a small SQL-to-pandas workflow using a local database
This repository is intentionally small. Its purpose is to document and verify a clean Python data development setup before building larger data analysis and BI-related projects.
No virtual environment, IDE metadata, cache files, or machine-specific files are committed.