Audio Super Resolution is a Python CLI and library for audio super-resolution and bandwidth extension. It ships with a deterministic sinc-resample baseline, optional external AudioSR support, and managed metadata for self-contained model backends.
The baseline package stays lightweight: normal inference is offline, model downloads are explicit, and heavyweight model dependencies live behind optional extras.
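To make the idea of a deterministic sinc-resample baseline concrete, here is a minimal pure-Python sketch of Hann-windowed sinc interpolation. It is an illustration only and is not taken from this package: the function name `sinc_resample` and the `half_width` parameter are hypothetical, and the real backend may use a different kernel, window, or library.

```python
import math


def sinc_resample(samples, in_sr, out_sr, half_width=16):
    """Upsample a mono signal with Hann-windowed sinc interpolation (sketch)."""
    if out_sr < in_sr:
        raise ValueError("this sketch only covers upsampling")
    n_out = len(samples) * out_sr // in_sr
    out = []
    for n in range(n_out):
        # Position of output sample n, measured in input-sample units.
        t = n * in_sr / out_sr
        first = max(0, math.ceil(t - half_width))
        last = min(len(samples) - 1, math.floor(t + half_width))
        acc = 0.0
        for k in range(first, last + 1):
            d = t - k
            kernel = 1.0 if d == 0 else math.sin(math.pi * d) / (math.pi * d)
            window = 0.5 * (1.0 + math.cos(math.pi * d / half_width))
            acc += samples[k] * kernel * window
        out.append(acc)
    return out


# A 440 Hz tone at 16 kHz, upsampled to 48 kHz.
tone = [math.sin(2 * math.pi * 440 * i / 16000) for i in range(160)]
upsampled = sinc_resample(tone, 16000, 48000)
print(len(upsampled))  # 480
```

The same input always yields the same output, which is what makes this kind of resampler useful as a regression baseline. It raises the sample rate without creating content above the original Nyquist frequency, so model backends are what attempt actual bandwidth extension.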
- CLI and Python API for single files, directory batches, dry runs, and recursive path-preserving output.
- Pluggable backend registry with sinc-resample, optional audiosr, and experimental lavasr-compat.
- Shared inference config for device, precision, chunking, preprocessing, seeds, and model cache paths.
- JSON run manifests, manifest comparison, and quality reports for regression workflows.
- Explicit local weight resolution with multi-file manifests, size/SHA256 checks, and opt-in Hugging Face downloads.
- Pixi tasks for repeatable test, lint, format, and build commands.
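The size/SHA256 checks mentioned in the list above can be pictured with a short standard-library sketch. This is a hedged illustration, not the package's verification code: `verify_file` and the `entry` dictionary layout are hypothetical, and the real manifest schema may differ.

```python
import hashlib
import tempfile
from pathlib import Path


def verify_file(path, expected_size, expected_sha256, chunk_size=1 << 20):
    """Return True only if the file matches both the expected size and SHA256."""
    path = Path(path)
    # Cheap check first: a size mismatch avoids hashing a large file.
    if not path.is_file() or path.stat().st_size != expected_size:
        return False
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Hash in chunks so large weight files are never fully loaded in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_sha256.lower()


with tempfile.TemporaryDirectory() as tmp:
    weight = Path(tmp) / "weights.bin"
    payload = b"example weights"
    weight.write_bytes(payload)
    # Hypothetical manifest entry; the real schema is an assumption here.
    entry = {"size": len(payload), "sha256": hashlib.sha256(payload).hexdigest()}
    ok = verify_file(weight, entry["size"], entry["sha256"])
    tampered = verify_file(weight, entry["size"] + 1, entry["sha256"])

print(ok, tampered)  # True False
```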
Install from PyPI:
pip install audio-super-resolution
Install optional model/runtime extras only when needed. For example, LavaSR-compatible inference with managed Hugging Face downloads uses:
pip install "audio-super-resolution[lavasr,download]"
Install the unreleased repository version from GitHub:
pip install "audio-super-resolution @ git+https://github.com/Tinnci/python-audio-super-resolution.git"
GitHub installs can include extras:
pip install "audio-super-resolution[lavasr,download] @ git+https://github.com/Tinnci/python-audio-super-resolution.git"
For local development:
git clone https://github.com/Tinnci/python-audio-super-resolution.git
cd python-audio-super-resolution
pixi install
Optional extras:
| Extra | Purpose |
|---|---|
| audiosr | External AudioSR wrapper. Use Python 3.10 because upstream dependencies are older. |
| download | Hugging Face model weight downloads. |
| weights | Optional safetensors loading helpers. |
| lavasr | Torch runtime for the experimental LavaSR-compatible backend. |
Enhance one file:
audio-super-res input.wav output.wav --target-sr 48000
If output.wav is omitted, the CLI writes next to the input as input-sr48000.wav.
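The default naming rule can be sketched in a few lines of Python. The helper below is hypothetical and only mirrors the documented input.wav to input-sr48000.wav example. Keeping the input's extension for other formats is an assumption, not confirmed behavior.

```python
from pathlib import Path


def default_output_path(input_path, target_sr):
    """Place the output next to the input, appending -sr<rate> to the stem."""
    source = Path(input_path)
    # Assumption: the original file extension is preserved.
    return source.with_name(f"{source.stem}-sr{target_sr}{source.suffix}")


print(default_output_path("input.wav", 48000))  # input-sr48000.wav
```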
Batch process a directory:
audio-super-res ./low-res-audio ./enhanced-audio --recursive --target-sr 48000
Preview or record a run:
audio-super-res ./low-res-audio ./enhanced-audio --recursive --dry-run --manifest plan.json
audio-super-res ./low-res-audio ./enhanced-audio --recursive --manifest run.json
audio-super-res --compare-manifests expected.json actual.json
List backends and models:
audio-super-res --list-backends
audio-super-res --list-models --list-format json
Run post-write quality checks:
audio-super-res input.wav output.wav --quality-report --fail-on-quality-issue
audio-super-res input.wav output.wav --quality-report-json quality.json
The shorter audiosr command is also available as an alias for audio-super-res.
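For regression workflows, the preview, record, and compare commands above can be driven from a script. The sketch below only assembles argument lists from flags documented in this README. The helper names are hypothetical, and it makes no assumption about what the CLI's exit codes mean.

```python
import shutil
import subprocess

CLI = "audio-super-res"


def plan_command(src, dst, manifest):
    """Dry run: write the planned jobs to a manifest without enhancing audio."""
    return [CLI, src, dst, "--recursive", "--dry-run", "--manifest", manifest]


def record_command(src, dst, manifest):
    """Real run: enhance the files and record the results in a manifest."""
    return [CLI, src, dst, "--recursive", "--manifest", manifest]


def compare_command(expected, actual):
    """Compare two previously written manifests."""
    return [CLI, "--compare-manifests", expected, actual]


def run_if_installed(argv):
    """Run the command if the executable is on PATH; return its exit code or None."""
    if shutil.which(argv[0]) is None:
        return None
    return subprocess.run(argv, check=False).returncode


print(" ".join(compare_command("expected.json", "actual.json")))
# audio-super-res --compare-manifests expected.json actual.json
```

Passing arguments as a list avoids shell quoting problems with paths that contain spaces.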
Current backend status:
| Backend | Status |
|---|---|
| sinc-resample | Default deterministic baseline. |
| audiosr | Optional external package backend; upstream package owns its checkpoint behavior. |
| lavasr-compat | Experimental self-contained LavaSR v2 BWE path with managed weights. Gated real-weight download, torch smoke, and initial upstream parity validation pass. |
Use audio-super-res --list-models --list-format json for machine-readable comparison metadata, including task/domain, input and target sample rates, implementation family, I/O capabilities, accelerator declarations, weight source/size/license, validation evidence, recommended use, and known limitations.
Future model candidates are tracked in the speech and general-audio reviews linked from docs/README.md. Candidate entries are not supported backends until they pass admission and validation.
Managed downloads are explicit. Normal enhancement only uses local verified files unless --download-weights is set:
audio-super-res --backend lavasr-compat --download-weights --prepare-model-cache
audio-super-res --backend lavasr-compat --verify-weights
Use an existing manifest:
audio-super-res input.wav output.wav \
--backend lavasr-compat \
--target-sr 48000 \
--weights-manifest C:\path\to\lavasr-v2-bwe\manifest.json
Run the optional external AudioSR backend:
audio-super-res input.wav output.wav \
--backend audiosr \
--target-sr 48000 \
--model-name basic \
--device auto
Use the Python API:
from audio_super_resolution import AudioSuperResolver
resolver = AudioSuperResolver(target_sr=48000)
result = resolver.enhance("input.wav", "output.wav")
print(result.output_path)
print(result.sample_rate)
Batch planning and manifests:
from audio_super_resolution import InferenceConfig, build_manifest, plan_enhancements
jobs = plan_enhancements("low-res-audio", "enhanced-audio", recursive=True)
manifest = build_manifest("dry-run", jobs, InferenceConfig(), backend="sinc-resample", target_sample_rate=48000)
Managed weights:
from audio_super_resolution import (
InferenceConfig,
download_model_weights,
resolve_model_weights,
verify_model_weights,
)
download_model_weights("lavasr-v2-bwe")
verified = verify_model_weights("lavasr-v2-bwe")
weights = resolve_model_weights("lavasr-v2-bwe", InferenceConfig(model_cache_dir=verified.root_dir.parent))
model_path = weights.path_for("enhancer_v2/pytorch_model.bin")
Run the Pixi development tasks:
pixi run test
pixi run lint
pixi run format
pixi run build
Run optional real AudioSR integration only when model inference and upstream checkpoint handling are intended:
set AUDIO_SUPER_RESOLUTION_RUN_AUDIOSR_INTEGRATION=1
pixi run pytest tests/test_audiosr_integration.py
Real LavaSR weight download and torch smoke tests are also gated; see tests/README.md.
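An environment-variable gate of this kind is commonly written with a skip decorator. The sketch below uses the standard library's unittest, which pytest also collects. It is an assumption about the general pattern, not a copy of this repository's tests: the class name, the test body, and the rule that only the value "1" enables the test are all illustrative.

```python
import os
import unittest

ENV_FLAG = "AUDIO_SUPER_RESOLUTION_RUN_AUDIOSR_INTEGRATION"


def integration_enabled(environ=None):
    """Return True when the opt-in flag is set to "1" (assumed convention)."""
    environ = os.environ if environ is None else environ
    return environ.get(ENV_FLAG) == "1"


class AudioSRIntegrationTest(unittest.TestCase):
    @unittest.skipUnless(integration_enabled(), f"set {ENV_FLAG}=1 to run")
    def test_real_inference(self):
        # Real model inference and checkpoint handling would go here.
        # Without the flag, the test is reported as skipped, not failed.
        self.assertTrue(integration_enabled())
```

Keeping the gate in one helper makes it easy to reuse the same opt-in rule for other heavyweight tests.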
Build and run the Docker image:
docker build -t audio-super-resolution .
docker run --rm -v "%cd%":/workdir audio-super-resolution input.wav output.wav --target-sr 48000
On Unix-like shells, use -v "$PWD":/workdir.
- docs/README.md: documentation map and ownership rules.
- docs/ARCHITECTURE.md: package layers, backend contract, and weight-management boundaries.
- ROADMAP.md: milestone state and next implementation tracks.
- CHANGELOG.md: release history and unreleased changes.
- tests/README.md: default and optional test strategy.
- examples/: Python examples and sample JSON artifacts.
- Python 3.10 or newer
- Pixi for development
- libsndfile-compatible audio files for the default reader/writer
This project is licensed under the MIT License. See LICENSE for details.
Inspired by the project structure and user experience of python-audio-separator.