spoti

Vibe-based playlist generator over your Spotify liked songs. NLP coursework project — describe a mood, get a playlist drawn from your library.

Live: https://spoti.phirios.com

What it does

Logs you in with Spotify OAuth, then syncs your full liked-songs library in the background, showing progress and an ETA.
Enriches every track with Last.fm genre tags, GetSongBPM tempo, and LRCLIB lyrics (all cached in Postgres so each track only hits upstream once globally).
Embeds each track with a local sentence-transformer (all-MiniLM-L6-v2, 384-d) using "<title> by <artists>. Tags: <last.fm tags>".
Vibe playlists — type a free-form mood prompt; backend embeds it, cosine-sims against your library embeddings, returns top N. One click saves the result as a real Spotify playlist.
Smart playlists — k-means over your embeddings, named by the most frequent Last.fm tags in each cluster. One click saves any cluster to Spotify.
Click any track for a subpage with BPM, genres, lyrics, and similar tracks in your library (cosine sim within your own collection).
Search bar + infinite scroll over the cached library.
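
The embedding text and the vibe-matching step described above can be sketched in a few lines. This is an illustration only, not the backend's actual code (which is Rust): the function names are hypothetical, the ", " separator for artists and tags is an assumption, and toy 3-d vectors stand in for the real 384-d embeddings.

```python
import math

def embedding_text(title, artists, tags):
    """Build the text fed to the embedder, following the template above.
    Joining artists and tags with ", " is an assumption."""
    return f"{title} by {', '.join(artists)}. Tags: {', '.join(tags)}"

def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_n(prompt_vec, library, n):
    """Return the n track ids whose embeddings are most similar to the prompt."""
    ranked = sorted(library, key=lambda tid: cosine(prompt_vec, library[tid]), reverse=True)
    return ranked[:n]

# Toy library: track id -> embedding (hypothetical values).
library = {
    "track_a": [1.0, 0.0, 0.0],
    "track_b": [0.0, 1.0, 0.0],
    "track_c": [0.7, 0.7, 0.0],
}
print(top_n([1.0, 0.1, 0.0], library, 2))
```

The same cosine ranking, run with a track's own embedding as the query, would give the "similar tracks in your library" list on the track subpage.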

Stack

backend — Rust, axum, sqlx (Postgres), reqwest. OAuth flow, library sync engine, vibe matcher, auto-clusterer.
embedder — Python, FastAPI, sentence-transformers, scikit-learn. CPU-only PyTorch. Two endpoints: /embed, /cluster.
web — Next.js 16 + React 19 + Tailwind 4. URL-routed track and playlist subpages, polling-based sync progress UI.
infra — k8s namespace nlp-project, plain Postgres deployment, nginx ingress with cert-manager, manifests in kubernetesmanifests.

Architecture

┌────────────────┐  OAuth  ┌────────────────┐
│ user (browser) │ ──────▶ │   spoti-web    │ ◀─ static landing + dashboard
└────────────────┘         └────────┬───────┘
                                    │ /api/* (Next rewrite)
                                    ▼
                           ┌────────────────┐    /embed    ┌─────────────────┐
                           │ spoti-backend  │ ───────────▶ │ spoti-embedder  │
                           │     (Rust)     │ ◀─ embeddings (sentence-       │
                           └────────┬───────┘                 transformers)  │
                            ┌───────┴───────┐                └────────────────┘
                            ▼               ▼
                       ┌─────────┐    ┌──────────┐
                       │postgres │    │ Spotify, │
                       │ (cache) │    │ Last.fm, │
                       └─────────┘    │ LRCLIB,  │
                                      │ GetSong- │
                                      │ BPM      │
                                      └──────────┘

Why per-track caches are global

Tracks (tracks, track_tags, track_embeddings, track_bpm, track_lyrics) are keyed by spotify_track_id, not per-user. The first user who triggers enrichment for a track pays the upstream cost; every subsequent user sees a cache hit. Only the /me/tracks enumeration itself is per-user (Spotify gives no other way to know what's in a specific user's library).
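
To make the cost model concrete, here is a small in-memory sketch of a cache keyed by track id and shared across users. It is not the real implementation (which uses Postgres tables from Rust); `TrackCache`, `fake_lastfm`, and the track ids are hypothetical stand-ins.

```python
class TrackCache:
    """In-memory stand-in for a table keyed by spotify_track_id."""

    def __init__(self, fetch_upstream):
        self._fetch = fetch_upstream
        self._rows = {}
        self.upstream_calls = 0

    def get(self, spotify_track_id):
        # Only the first request for a given track id reaches upstream.
        if spotify_track_id not in self._rows:
            self.upstream_calls += 1
            self._rows[spotify_track_id] = self._fetch(spotify_track_id)
        return self._rows[spotify_track_id]

def fake_lastfm(track_id):
    """Pretend upstream lookup; returns fixed tags for any track."""
    return ["indie", "dream pop"]

cache = TrackCache(fake_lastfm)
alice_library = ["track_1", "track_2"]  # per-user enumeration
bob_library = ["track_2"]               # overlaps with Alice's library

for tid in alice_library + bob_library:
    cache.get(tid)

print(cache.upstream_calls)  # 2: Bob's copy of track_2 was a cache hit
```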

Project layout

spotiList/
├── backend/        Rust axum service
│   ├── src/        modules: auth, sync, library, playlists, auto, …
│   ├── migrations/ sqlx-migrate SQL files
│   └── Dockerfile
├── embedder/       Python FastAPI sidecar (sentence-transformers + sklearn)
│   └── app/main.py
└── web/            Next.js dashboard
    └── src/app/dashboard/
        ├── page.tsx
        ├── components/  Column/section/panel components
        ├── types.ts
        └── dummy.ts     localhost-only mock data for layout previews

Running locally

The web app has a DUMMY mode that activates when served from localhost so you can preview the layout without backend access.

cd web
bun install
bun dev   # → http://localhost:3000/dashboard with mock data

For the full stack (backend + embedder + Postgres) you'll want the k8s manifests or a docker-compose you assemble from the Dockerfiles. The embedder pre-downloads the model into the image; first build is heavy (~1.5 GB image with PyTorch CPU + model weights).

Notes

Spotify deprecated the Audio Features, Audio Analysis, track popularity, artist genres, 30s previews, and recommendations endpoints for new apps in November 2024. We replace what we can: tempo via GetSongBPM (correct host api.getsong.co), genres via Last.fm tags, lyrics via LRCLIB.
GetSongBPM has a 3000 req/hour limit. We cache hits and misses in track_bpm so we never re-query the same track.
Last.fm tags are filtered to drop noise (year-only, decade, artist-name-equal tags) before being saved or used in embedding text.
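
A rough sketch of the tag filter described above. The exact rules in the backend may differ; the regular expressions, the lowercasing, and the de-duplication here are assumptions, and `clean_tags` is a hypothetical name.

```python
import re

YEAR = re.compile(r"^(?:19|20)\d{2}$")          # e.g. "2014"
DECADE = re.compile(r"^(?:19|20)?\d0'?s$")      # e.g. "80s", "1990s", "90's"

def clean_tags(tags, artists):
    """Drop year-only, decade, and artist-name tags; lowercase and dedupe the rest."""
    artist_names = {a.strip().casefold() for a in artists}
    kept = []
    for tag in tags:
        t = tag.strip().casefold()
        if not t or YEAR.match(t) or DECADE.match(t) or t in artist_names:
            continue
        if t not in kept:
            kept.append(t)
    return kept

raw = ["Indie", "2014", "80s", "1990s", "Some Band", "alternative rock", "indie"]
print(clean_tags(raw, ["Some Band"]))
```

Filtering before building the embedding text keeps tags that carry little mood information out of the vector.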

Data attributions

BPM data: GetSongBPM
Lyrics: LRCLIB
Genre tags: Last.fm
