Audio Super Resolution

Audio Super Resolution is a Python CLI and library for audio super-resolution and bandwidth extension. It ships with a deterministic sinc-resample baseline, optional external AudioSR support, and managed metadata for self-contained model backends.

The baseline package stays lightweight: normal inference is offline, model downloads are explicit, and heavyweight model dependencies live behind optional extras.

Features

CLI and Python API for single files, directory batches, dry runs, and recursive path-preserving output.
Pluggable backend registry with sinc-resample, optional audiosr, and experimental lavasr-compat.
Shared inference config for device, precision, chunking, preprocessing, seeds, and model cache paths.
JSON run manifests, manifest comparison, and quality reports for regression workflows.
Explicit local weight resolution with multi-file manifests, size/SHA256 checks, and opt-in Hugging Face downloads.
Pixi tasks for repeatable test, lint, format, and build commands.

Installation

Install from PyPI:

pip install audio-super-resolution

Install optional model/runtime extras only when needed. For example, LavaSR-compatible inference with managed Hugging Face downloads uses:

pip install "audio-super-resolution[lavasr,download]"

Install the unreleased repository version from GitHub:

pip install "audio-super-resolution @ git+https://github.com/Tinnci/python-audio-super-resolution.git"

GitHub installs can include extras:

pip install "audio-super-resolution[lavasr,download] @ git+https://github.com/Tinnci/python-audio-super-resolution.git"

For local development:

git clone https://github.com/Tinnci/python-audio-super-resolution.git
cd python-audio-super-resolution
pixi install

Optional extras:

Extra	Purpose
`audiosr`	External AudioSR wrapper. Use Python 3.10 because upstream dependencies are older.
`download`	Hugging Face model weight downloads.
`weights`	Optional safetensors loading helpers.
`lavasr`	Torch runtime for the experimental LavaSR-compatible backend.

CLI Quick Start

Enhance one file:

audio-super-res input.wav output.wav --target-sr 48000

If output.wav is omitted, the CLI writes next to the input as input-sr48000.wav.
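
The default naming rule can be pictured with a few lines of pathlib. This is only a sketch of the documented behavior; `default_output_path` is a hypothetical helper, not a function exported by the package.

```python
from pathlib import Path


def default_output_path(input_path: Path, target_sr: int) -> Path:
    """Return a sibling path such as input-sr48000.wav (hypothetical helper)."""
    # Keep the directory and extension, and append the target rate to the stem.
    new_name = f"{input_path.stem}-sr{target_sr}{input_path.suffix}"
    return input_path.with_name(new_name)


print(default_output_path(Path("input.wav"), 48000))  # input-sr48000.wav
```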

Batch process a directory:

audio-super-res ./low-res-audio ./enhanced-audio --recursive --target-sr 48000
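
As a rough illustration of what "recursive, path-preserving output" means, the sketch below mirrors each input file's relative location under the output directory. `plan_outputs` and its `suffixes` parameter are assumptions made for this example; the package's own planner may filter and name files differently.

```python
import tempfile
from pathlib import Path


def plan_outputs(input_root: Path, output_root: Path, recursive: bool = False,
                 suffixes: tuple[str, ...] = (".wav",)) -> list[tuple[Path, Path]]:
    """Pair each input file with an output path that mirrors its relative location."""
    pattern = "**/*" if recursive else "*"
    pairs = []
    for src in sorted(input_root.glob(pattern)):
        if src.is_file() and src.suffix.lower() in suffixes:
            pairs.append((src, output_root / src.relative_to(input_root)))
    return pairs


# Build a small throwaway tree and show where each file would be written.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "low-res-audio"
    (root / "speech").mkdir(parents=True)
    (root / "a.wav").touch()
    (root / "speech" / "b.wav").touch()
    out_root = Path("enhanced-audio")
    planned = [dst.relative_to(out_root).as_posix()
               for _, dst in plan_outputs(root, out_root, recursive=True)]

print(planned)  # ['a.wav', 'speech/b.wav']
```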

Preview or record a run:

audio-super-res ./low-res-audio ./enhanced-audio --recursive --dry-run --manifest plan.json
audio-super-res ./low-res-audio ./enhanced-audio --recursive --manifest run.json
audio-super-res --compare-manifests expected.json actual.json
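
To give a feel for manifest comparison in a regression workflow, here is a generic JSON diff. It is not the CLI's comparison logic, and the field names in the sample documents (`backend`, `jobs`, `output`, `sample_rate`) are invented for illustration; the real manifest schema is not described in this README.

```python
import json


def diff_json(expected, actual, path=""):
    """Return the dotted paths where two JSON-like values differ."""
    if isinstance(expected, dict) and isinstance(actual, dict):
        diffs = []
        for key in sorted(set(expected) | set(actual)):
            child = f"{path}.{key}" if path else key
            if key not in expected or key not in actual:
                diffs.append(child)  # key present on one side only
            else:
                diffs.extend(diff_json(expected[key], actual[key], child))
        return diffs
    if isinstance(expected, list) and isinstance(actual, list) and len(expected) == len(actual):
        diffs = []
        for index, (left, right) in enumerate(zip(expected, actual)):
            diffs.extend(diff_json(left, right, f"{path}[{index}]"))
        return diffs
    return [] if expected == actual else [path or "<root>"]


# Hypothetical manifests that disagree on one job's sample rate.
expected = json.loads('{"backend": "sinc-resample", "jobs": [{"output": "a.wav", "sample_rate": 48000}]}')
actual = json.loads('{"backend": "sinc-resample", "jobs": [{"output": "a.wav", "sample_rate": 44100}]}')
differences = diff_json(expected, actual)
print(differences)  # ['jobs[0].sample_rate']
```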

List backends and models:

audio-super-res --list-backends
audio-super-res --list-models --list-format json

Run post-write quality checks:

audio-super-res input.wav output.wav --quality-report --fail-on-quality-issue
audio-super-res input.wav output.wav --quality-report-json quality.json

The shorter audiosr command is also available as an alias for audio-super-res.

Models And Weights

Current backend status:

Backend	Status
`sinc-resample`	Default deterministic baseline.
`audiosr`	Optional external package backend; upstream package owns its checkpoint behavior.
`lavasr-compat`	Experimental self-contained LavaSR v2 BWE path with managed weights. The gated real-weight download, torch smoke, and initial upstream parity validation tests pass.
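
For intuition about the sinc-resample baseline listed in the table, the sketch below upsamples by an integer factor using Hann-windowed sinc interpolation in pure Python. It is a teaching example under stated assumptions (integer ratio, a fixed window half-width of 16 taps, a hypothetical `sinc_upsample` function), not the package's implementation. A resampler of this kind raises the sample rate deterministically but does not synthesize content above the original Nyquist frequency.

```python
import math


def sinc_upsample(samples: list[float], factor: int, half_width: int = 16) -> list[float]:
    """Upsample by an integer factor with Hann-windowed sinc interpolation."""
    out = []
    for m in range(len(samples) * factor):
        u = m / factor  # output position measured in input samples
        center = math.floor(u)
        acc = 0.0
        norm = 0.0
        for n in range(center - half_width + 1, center + half_width + 1):
            if 0 <= n < len(samples):
                d = u - n
                x = math.pi * d
                sinc = 1.0 if d == 0 else math.sin(x) / x
                window = 0.5 * (1.0 + math.cos(math.pi * d / half_width))
                weight = sinc * window
                acc += weight * samples[n]
                norm += weight
        # Dividing by the weight sum keeps the DC gain at 1, including near the edges.
        out.append(acc / norm)
    return out


# A factor of 3 corresponds to, for example, 16 kHz -> 48 kHz.
tone = [math.sin(2 * math.pi * 0.05 * n) for n in range(64)]
upsampled = sinc_upsample(tone, factor=3)
print(len(tone), "->", len(upsampled))  # 64 -> 192
```

Because the interpolation has no random state, repeated runs on the same input give identical output, which is the property that makes a baseline like this useful for regression comparisons.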

Use audio-super-res --list-models --list-format json for machine-readable comparison metadata, including task/domain, input and target sample rates, implementation family, I/O capabilities, accelerator declarations, weight source/size/license, validation evidence, recommended use, and known limitations.

Future model candidates are tracked in the speech and general-audio reviews linked from docs/README.md. Candidate entries are not supported backends until they pass admission and validation.

Managed downloads are explicit. Normal enhancement uses only local, verified files unless --download-weights is set:

audio-super-res --backend lavasr-compat --download-weights --prepare-model-cache
audio-super-res --backend lavasr-compat --verify-weights
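
The size and SHA256 checks mentioned in the feature list can be sketched with the standard library. `file_matches` is a hypothetical helper, and the idea that each manifest entry carries an expected size and digest is an assumption about the manifest layout, which this README does not spell out.

```python
import hashlib
import tempfile
from pathlib import Path


def file_matches(path: Path, expected_size: int, expected_sha256: str) -> bool:
    """Check a local file against an expected byte size and SHA256 digest."""
    # The size check is cheap, so do it before hashing the whole file.
    if not path.is_file() or path.stat().st_size != expected_size:
        return False
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Hash in 1 MiB blocks so large weight files are not loaded at once.
        for block in iter(lambda: handle.read(1 << 20), b""):
            digest.update(block)
    return digest.hexdigest() == expected_sha256.lower()


# Demonstrate with a throwaway file standing in for a weight file.
with tempfile.TemporaryDirectory() as tmp:
    weight_file = Path(tmp) / "pytorch_model.bin"
    payload = b"placeholder weights"
    weight_file.write_bytes(payload)
    good = file_matches(weight_file, len(payload), hashlib.sha256(payload).hexdigest())
    bad = file_matches(weight_file, len(payload), "0" * 64)

print(good, bad)  # True False
```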

Use an existing manifest:

audio-super-res input.wav output.wav \
  --backend lavasr-compat \
  --target-sr 48000 \
  --weights-manifest path/to/lavasr-v2-bwe/manifest.json

Run the optional external AudioSR backend:

audio-super-res input.wav output.wav \
  --backend audiosr \
  --target-sr 48000 \
  --model-name basic \
  --device auto

Python API

from audio_super_resolution import AudioSuperResolver

resolver = AudioSuperResolver(target_sr=48000)
result = resolver.enhance("input.wav", "output.wav")

print(result.output_path)
print(result.sample_rate)

Batch planning and manifests:

from audio_super_resolution import InferenceConfig, build_manifest, plan_enhancements

jobs = plan_enhancements("low-res-audio", "enhanced-audio", recursive=True)
manifest = build_manifest(
    "dry-run",
    jobs,
    InferenceConfig(),
    backend="sinc-resample",
    target_sample_rate=48000,
)

Managed weights:

from audio_super_resolution import (
    InferenceConfig,
    download_model_weights,
    resolve_model_weights,
    verify_model_weights,
)

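# Fetch the managed weights into the model cache.
download_model_weights("lavasr-v2-bwe")
# Verify the cached files before using them.
verified = verify_model_weights("lavasr-v2-bwe")
# Resolve the weights through a config that points at the same cache directory.
weights = resolve_model_weights(
    "lavasr-v2-bwe",
    InferenceConfig(model_cache_dir=verified.root_dir.parent),
)
model_path = weights.path_for("enhancer_v2/pytorch_model.bin")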

Development

pixi run test
pixi run lint
pixi run format
pixi run build

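Run the optional real AudioSR integration test only when you intend to exercise model inference and upstream checkpoint handling. The command below uses Windows cmd syntax; on Unix-like shells, replace set with export: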

set AUDIO_SUPER_RESOLUTION_RUN_AUDIOSR_INTEGRATION=1
pixi run pytest tests/test_audiosr_integration.py

Real LavaSR weight download and torch smoke tests are also gated; see tests/README.md.
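
The opt-in gating described above can be sketched with the standard library's unittest. Only the environment variable name comes from this README; `integration_enabled`, the test class, and the rule that the flag must equal exactly "1" are assumptions, and the project's real tests may use a different mechanism.

```python
import os
import unittest
from collections.abc import Mapping

ENV_FLAG = "AUDIO_SUPER_RESOLUTION_RUN_AUDIOSR_INTEGRATION"


def integration_enabled(env: Mapping[str, str] = os.environ) -> bool:
    """Return True only when the opt-in flag is set to exactly "1" (assumed rule)."""
    return env.get(ENV_FLAG) == "1"


@unittest.skipUnless(integration_enabled(), f"set {ENV_FLAG}=1 to run")
class AudioSRIntegrationTest(unittest.TestCase):
    def test_placeholder(self):
        # A real test would run model inference here; this only shows the gate.
        self.assertTrue(integration_enabled())
```

Keeping the check in a small function that accepts any mapping makes the gate itself easy to test without touching the real environment.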

Docker

docker build -t audio-super-resolution .
docker run --rm -v "%cd%":/workdir audio-super-resolution input.wav output.wav --target-sr 48000

On Unix-like shells, use -v "$PWD":/workdir.

Project Docs

docs/README.md: documentation map and ownership rules.
docs/ARCHITECTURE.md: package layers, backend contract, and weight-management boundaries.
ROADMAP.md: milestone state and next implementation tracks.
CHANGELOG.md: release history and unreleased changes.
tests/README.md: default and optional test strategy.
examples/: Python examples and sample JSON artifacts.

Requirements

Python 3.10 or newer
Pixi for development
libsndfile-compatible audio files for the default reader/writer

License

This project is licensed under the MIT License. See LICENSE for details.

Credits

Inspired by the project structure and user experience of python-audio-separator.
