jLantxa/mapache

mapache — Fast, Encrypted, Deduplicating Backup Tool

Mapache is a fast, secure, deduplicating, incremental backup tool written in Rust.

You can find more in-depth documentation.


About

mapache (Spanish for raccoon 🦝) is a high-performance, deduplicating backup tool designed for speed, reliability, and uncompromising security. Inspired by restic and built with Rust, it provides a modern approach to incremental backups.

At its core, mapache operates on a content-addressable repository model. Every file, directory, and piece of metadata is decomposed into binary objects identified by their cryptographic hash. This architecture naturally enables global deduplication: if multiple files share the same content, even across different snapshots or machines, mapache stores that data only once. For storage efficiency and high I/O throughput, these objects are bundled into "pack files" and tracked via a central index that allows near-instant lookups and atomic repository updates.
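The content-addressable idea can be sketched in a few lines of Python. This is only an illustration of the principle, not mapache's implementation: the `BlobStore` class is hypothetical, SHA-256 is an assumption (the hash function is not named here), and pack files, the index, compression, and encryption are all left out.

```python
import hashlib


class BlobStore:
    """Toy in-memory content-addressable store (hypothetical, for illustration)."""

    def __init__(self):
        self.blobs = {}  # hex digest -> raw bytes

    def put(self, data: bytes) -> str:
        # The ID is derived from the content itself (SHA-256 is an assumption).
        blob_id = hashlib.sha256(data).hexdigest()
        # Identical content maps to the same ID, so it is stored only once.
        self.blobs.setdefault(blob_id, data)
        return blob_id

    def get(self, blob_id: str) -> bytes:
        return self.blobs[blob_id]


store = BlobStore()
id_a = store.put(b"quarterly report")
id_b = store.put(b"quarterly report")  # same content from another file or machine
id_c = store.put(b"holiday photos")

print(id_a == id_b)      # True: both writes resolve to one object
print(len(store.blobs))  # 2: only two distinct blobs are kept
```

Because the ID depends only on the bytes, the store never needs to know which file, snapshot, or machine the data came from.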

Each backup is captured as a "Snapshot" representing a complete, point-in-time state of your file system. Unlike traditional backup tools that rely on complex "full vs. incremental" chains, every mapache snapshot is technically independent but shares underlying data blobs with others. This means you can delete any old snapshot at any time without risking the integrity of newer ones. All data, from file contents to directory structures, is compressed with zstd and protected by AES-GCM-SIV authenticated encryption, ensuring your repository remains a "black box" to anyone without the master key.
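Why deleting an old snapshot cannot damage a newer one can be shown with a small model. The sketch below is a simplification under stated assumptions: the function names (`take_snapshot`, `forget`, `restore`) are hypothetical and do not correspond to mapache commands, SHA-256 stands in for the unnamed hash, and a snapshot is reduced to a flat list of blob IDs.

```python
import hashlib

blobs = {}      # blob ID -> content, shared by all snapshots
snapshots = {}  # snapshot name -> list of blob IDs it references


def take_snapshot(name, files):
    """Record a full point-in-time state; unchanged content is reused."""
    ids = []
    for content in files:
        blob_id = hashlib.sha256(content).hexdigest()
        blobs.setdefault(blob_id, content)
        ids.append(blob_id)
    snapshots[name] = ids


def forget(name):
    """Drop a snapshot, then remove only blobs no other snapshot references."""
    del snapshots[name]
    live = {blob_id for ids in snapshots.values() for blob_id in ids}
    for blob_id in list(blobs):
        if blob_id not in live:
            del blobs[blob_id]


def restore(name):
    return [blobs[blob_id] for blob_id in snapshots[name]]


take_snapshot("monday", [b"notes v1", b"photo"])
take_snapshot("tuesday", [b"notes v2", b"photo"])  # "photo" is not stored again

forget("monday")
print(restore("tuesday"))  # [b'notes v2', b'photo']
print(len(blobs))          # 2: only b"notes v1" was removed
```

Each snapshot lists everything it needs, so there is no chain to break: removing "monday" only frees data that nothing else points to.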

Project Status

Mapache is a feature-complete backup solution. While the architecture is designed for reliability and has extensive test coverage, it is a relatively new project. As with any tool managing critical data, users should perform their own validation before relying on it for primary backups.


Key Features

  • Deduplication: FastCDC (Content-Defined Chunking) identifies shifted data to minimize storage footprint.
  • Security: Mandatory AES-GCM-SIV encryption and Argon2 KDF — your data is never stored or transmitted in the clear.
  • Compression: Zstd compression with adjustable levels to balance backup speed and storage usage.
  • Terminal UI: Rich interactive TUI with dashboard, snapshot/restore screens, file explorer, diff viewer, and live search across snapshots.
  • Backends: Native support for Local FS, SFTP, and S3.
  • Portable: A single, statically linked binary with zero external dependencies.
  • Verifiable: Verify all snapshots, packs, and blobs to make sure your data can be restored at any time.
  • TOML Config: Centralized repository settings via a .toml configuration file, overridable with CLI flags.
  • Bundle Files: Self-contained .mapache bundle format with deduplication, encryption, and FUSE mount support for secure data transfer.
  • Flexible Retention: Policy-based snapshot retention with hourly, daily, weekly, monthly, yearly rules, plus host and tag filtering.
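To illustrate why content-defined chunking helps with shifted data, here is a simplified gear-hash chunker in Python. It is not FastCDC itself (it omits normalized chunking and other optimizations) and is not taken from mapache; the gear table, size limits, and the `chunk` function are assumptions chosen for the example.

```python
import hashlib

# Deterministic 64-bit "gear" value per byte (derived from SHA-256 for the demo).
GEAR = [int.from_bytes(hashlib.sha256(bytes([i])).digest()[:8], "big")
        for i in range(256)]


def chunk(data, min_size=256, avg_bits=10, max_size=4096):
    """Cut where the rolling hash's low bits are zero, within size limits."""
    mask = (1 << avg_bits) - 1
    chunks, start, h = [], 0, 0
    for i, byte in enumerate(data):
        # Shifting left ages out old bytes: the low avg_bits bits of h
        # depend only on the most recent avg_bits bytes.
        h = ((h << 1) + GEAR[byte]) & 0xFFFFFFFFFFFFFFFF
        length = i - start + 1
        if (length >= min_size and (h & mask) == 0) or length >= max_size:
            chunks.append(data[start:i + 1])
            start, h = i + 1, 0
    if start < len(data):
        chunks.append(data[start:])  # trailing partial chunk
    return chunks


# 256 KiB of deterministic pseudo-random bytes.
data = b"".join(hashlib.sha256(i.to_bytes(4, "big")).digest()
                for i in range(8192))

original = chunk(data)
shifted = chunk(b"NEW HEADER " + data)  # every byte offset moves by 11

shared = set(original) & set(shifted)
print(len(original), "chunks originally,", len(shared), "reused after the shift")
```

With fixed-size blocks, the 11-byte prefix would change every block. Here the cut points depend on the content, so the boundaries realign shortly after the insertion and the remaining chunks hash to the same IDs as before.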

Benchmarks

This is a non-exhaustive set of benchmarks run on my development hardware. They serve as a baseline for comparing performance between versions, with restic v0.19.0 as the reference tool.

Test environment: Fedora 44, AMD Ryzen 9 3900X (24 threads), SanDisk Extreme PRO NVMe.

Each result is the average of 3 runs following a warmup run, all on local storage. Both tools are run with default settings and 8 readers (read-concurrency) for backup.

Mapache has traditionally been slower on datasets made up of many small files, so these benchmarks target that case specifically.

Workloads:

  • kernel — Linux kernel source tree (~1.6 GB, 99'131 objects)
  • enron — Enron email corpus (~1.4 GB, 520'901 objects)

kernel

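| Tool    | Action  | Avg Time (s) | Max Time (s) | Avg PSS (MB) | Peak PSS (MB) | Avg CPU (%) | Repo (MB) |
|---------|---------|--------------|--------------|--------------|---------------|-------------|-----------|
| mapache | backup  | 2.06         | 2.16         | 303.23       | 311.23        | 1344.65     | 304.15    |
| mapache | restore | 8.51         | 8.61         | 415.20       | 429.18        | 412.23      | --        |
| restic  | backup  | 4.13         | 4.65         | 825.88       | 849.96        | 1197.60     | 308.84    |
| restic  | restore | 15.77        | 15.81        | 256.42       | 272.48        | 149.65      | --        |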

enron

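| Tool    | Action  | Avg Time (s) | Max Time (s) | Avg PSS (MB) | Peak PSS (MB) | Avg CPU (%) | Repo (MB) |
|---------|---------|--------------|--------------|--------------|---------------|-------------|-----------|
| mapache | backup  | 4.47         | 4.51         | 425.24       | 442.99        | 1322.17     | 717.26    |
| mapache | restore | 40.10        | 41.14        | 566.07       | 572.32        | 380.97      | --        |
| restic  | backup  | 10.86        | 11.26        | 859.05       | 863.23        | 1124.77     | 725.07    |
| restic  | restore | 71.04        | 71.36        | 449.26       | 465.44        | 156.66      | --        |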
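For a quick reading of the numbers, the average times above can be turned into ratios. The snippet below only restates the "Avg Time (s)" values from the two workloads; the `speedup` helper is a hypothetical name, and the ratios apply to this hardware and these datasets only.

```python
# Average wall-clock times in seconds, copied from the benchmark results above.
avg_seconds = {
    ("kernel", "backup"):  {"mapache": 2.06,  "restic": 4.13},
    ("kernel", "restore"): {"mapache": 8.51,  "restic": 15.77},
    ("enron", "backup"):   {"mapache": 4.47,  "restic": 10.86},
    ("enron", "restore"):  {"mapache": 40.10, "restic": 71.04},
}


def speedup(workload, action):
    """How many times faster mapache was than restic for this run."""
    times = avg_seconds[(workload, action)]
    return times["restic"] / times["mapache"]


for workload, action in avg_seconds:
    print(f"{workload:<6} {action:<7} {speedup(workload, action):.2f}x")
# kernel backup  2.00x
# kernel restore 1.85x
# enron  backup  2.43x
# enron  restore 1.77x
```

Time is only one axis: in the same results, mapache's restore uses more memory (PSS) and CPU than restic's, so the trade-off depends on what is scarce on your machine.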

Getting Started

Installation

Quick install (Linux, macOS, Windows):

curl -fsSL https://github.com/jLantxa/mapache/raw/main/tools/install.sh | sh

Or compile from source with the Rust toolchain:

cargo build --release
cargo install --path core

cargo build produces binaries with some dynamically linked dependencies. This is fine for testing and development on the same machine. If you need a statically linked binary (which I strongly recommend for portability), run make release-static or use the binaries provided on the Releases page for a specific released version.

Note for Linux users: Building with support for the mount command requires the FUSE development headers (e.g., libfuse-dev). To build without FUSE support, pass --no-default-features to cargo.

Quick Start

Initialize a repository (local, S3, or SFTP)

# Local directory
mapache init -r /path/to/repo

# SFTP server
mapache init -r sftp://user@host/backup-folder

# S3 Bucket
mapache init -r s3://my-bucket/backup-folder

Create your first snapshot

mapache snapshot ~/Documents -r /path/to/repo

List snapshots

mapache log -c -r /path/to/repo

Restore data

mapache restore --target /tmp/restore-folder -r /path/to/repo
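If you want to script the steps above, for example from a scheduled job, a small wrapper is enough. The Python sketch below uses only the subcommands and flags shown in this Quick Start; the helper names (`mapache_cmd`, `backup_plan`, `run`) are hypothetical, the paths are placeholders, and it makes no attempt to handle password prompts or any other interaction. It defaults to a dry run that just prints the commands.

```python
import shlex
import subprocess


def mapache_cmd(*args, repo):
    """Build an argv list in the same shape as the commands shown above."""
    return ["mapache", *args, "-r", repo]


def backup_plan(source, repo):
    """Snapshot a directory, then list the snapshots in the repository."""
    return [
        mapache_cmd("snapshot", source, repo=repo),
        mapache_cmd("log", "-c", repo=repo),
    ]


def run(plan, dry_run=True):
    for argv in plan:
        print(shlex.join(argv))
        if not dry_run:
            # check=True stops the plan as soon as one command fails.
            subprocess.run(argv, check=True)


plan = backup_plan("/home/user/Documents", "/path/to/repo")
run(plan)  # prints the two commands without executing them
```

Call run(plan, dry_run=False) once the printed commands look right for your setup.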
