devbox-based development stack for dserver and the dtool-lookup-webapp. Services run as native processes (managed by process-compose) on top of Nix packages — no containers required.
This repository provides a complete development environment for:
- dserver - REST API for registering, looking up, and searching dtool dataset metadata
- dtool-lookup-webapp - Vue.js web frontend for searching datasets
- MinIO - S3-compatible object storage for datasets
- PostgreSQL - SQL database for dserver admin metadata
- MongoDB - NoSQL database for dataset search and retrieval
The service topology is defined in process-compose.yaml; the toolchain and
environment are defined in devbox.json. The launch/init scripts live in
devbox/.
- devbox (installs Nix on first use)
- Git
git clone --recursive git@github.com:your-org/dserver-development-stack.git
cd dserver-development-stack

If you already cloned without --recursive, initialize the submodules:

git submodule update --init --recursive

# MongoDB is distributed under the SSPL and is "unfree" in Nixpkgs,
# so unfree packages must be allowed for the install.
export NIXPKGS_ALLOW_UNFREE=1
devbox shell

On first entry this:
- installs all Nix packages (Python, Node, PostgreSQL, MongoDB, MinIO, …),
- builds the Python virtual environment (.venv) with every dserver package in editable mode,
- installs the Node dependencies for the webapp, and
- generates JWT keys for authentication.
Subsequent entries are fast.
devbox services up

This brings up PostgreSQL, MongoDB, MinIO (with bucket initialization), the
dserver API, and the webapp, respecting health checks and start-up order.
Runtime data is stored under .devbox-data/ (gitignored); every service binds
to 127.0.0.1 on the ports listed below. Stop the stack with Ctrl-C (or
devbox services stop from another shell).
| Service | Port | Description |
|---|---|---|
| dserver | 5000 | REST API for dataset metadata (includes token generator) |
| webapp | 8080 | Vue.js frontend |
| minio | 9000 (API), 9001 (Console) | S3-compatible storage |
| postgres | 5432 | PostgreSQL database |
| mongo | 27017 | MongoDB database |
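
To confirm that the stack is listening on the ports in the table above, a small check like the following can help. It is a minimal sketch, not part of the repository: the names SERVICE_PORTS, port_is_open and stack_status are hypothetical, and an open TCP port only shows that something is listening, not that the service behind it is healthy.

```python
import socket

# Ports copied from the table above; every service binds to 127.0.0.1.
# The dictionary keys are illustrative labels, not process-compose names.
SERVICE_PORTS = {
    "dserver": 5000,
    "webapp": 8080,
    "minio-api": 9000,
    "minio-console": 9001,
    "postgres": 5432,
    "mongo": 27017,
}


def port_is_open(port, host="127.0.0.1", timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def stack_status(ports=SERVICE_PORTS):
    """Map each service label to True (listening) or False (not reachable)."""
    return {name: port_is_open(port) for name, port in ports.items()}


if __name__ == "__main__":
    for name, is_up in stack_status().items():
        print(f"{name:14} {'up' if is_up else 'down'}")
```

Run it from a second shell while devbox services up is active; any service reported as down is a good place to start reading the logs.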
| Command | Description |
|---|---|
| devbox shell | Enter the dev environment (builds venv/node deps on first run) |
| devbox services up | Start the full stack |
| devbox services up index-s3 | Also run the on-demand dataset indexer |
| devbox run index | Index s3://dtool-bucket into dserver |
| devbox run create-test-dataset | Create, push and index a sample dataset |
| devbox run psql | Open a psql shell on the dserver database |
| devbox run rebuild-venv | Delete and rebuild .venv |
| devbox run clean-data | Remove the .devbox-data/ runtime data |
OAuth2 credentials are read from a local .env file — copy .env.template to
.env and fill it in.
- dserver API: http://127.0.0.1:5000
- API Documentation: http://127.0.0.1:5000/doc/swagger (requires authentication)
- Webapp: http://127.0.0.1:8080
- MinIO Console: http://127.0.0.1:9001 (credentials: minioadmin/minioadmin)
To create a sample dataset, push it to MinIO and index it in dserver:
devbox run create-test-dataset

If you have datasets in the MinIO bucket, index them with:

devbox run index

You can push datasets directly to the MinIO S3 storage using the dtool
command line tool. Inside devbox shell the dtool CLI is already on the
PATH; on a separate host machine, install it with pip install dtool dtool-s3.
Copy the provided configuration file to your dtool config directory:
cp dtool.json ~/.config/dtool/dtool.json

Or set the environment variables directly. The per-bucket variable names contain a hyphen (the bucket name), which shells such as bash reject in an export statement, so pass them through env for a single command instead:

env \
  "DTOOL_S3_ENDPOINT_dtool-bucket=http://127.0.0.1:9000" \
  "DTOOL_S3_ACCESS_KEY_ID_dtool-bucket=minioadmin" \
  "DTOOL_S3_SECRET_ACCESS_KEY_dtool-bucket=minioadmin" \
  "DTOOL_S3_DISABLE_BUCKET_VERSIONING_dtool-bucket=true" \
  dtool ls s3://dtool-bucket/

Inside the stack these variables are injected for you by devbox/with-dtool-s3.sh; for manual dtool use, the ~/.config/dtool/dtool.json route above is simplest.
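
Copying dtool.json over ~/.config/dtool/dtool.json replaces any settings you already keep there. If that is a concern, the provided keys can be merged into the existing file instead. The sketch below assumes only that both files are plain JSON objects; merge_dtool_config is a hypothetical helper, not something shipped with dtool or this repository.

```python
import json
from pathlib import Path


def merge_dtool_config(source, target):
    """Merge the keys of the JSON file `source` into the JSON file `target`.

    Keys that exist only in `target` are kept; keys defined in both files
    take the value from `source`. Returns the merged dictionary.
    """
    source, target = Path(source), Path(target)
    # Start from the existing user configuration, if there is one.
    merged = json.loads(target.read_text()) if target.exists() else {}
    merged.update(json.loads(source.read_text()))
    # Create ~/.config/dtool/ (or whatever the parent is) when missing.
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(json.dumps(merged, indent=2) + "\n")
    return merged
```

From the repository root, calling merge_dtool_config("dtool.json", Path.home() / ".config" / "dtool" / "dtool.json") would add the MinIO settings while leaving unrelated entries untouched.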
dtool create my-dataset
cp some-file.txt my-dataset/data/
dtool freeze my-dataset
dtool cp my-dataset s3://dtool-bucket/

Then index it so it appears in the webapp:

devbox run index

To list the datasets in the bucket and copy one to a local directory:

dtool ls s3://dtool-bucket/
dtool cp s3://dtool-bucket/<uuid> ./local-copy/

The dtool-dserver storage broker allows you to access datasets through
dserver without requiring direct S3/Azure credentials.
Install it on the client (pip install dtool-dserver) and configure access via
~/.config/dtool/dtool.json (set DSERVER_TOKEN, see below) or the
DSERVER_TOKEN environment variable.
URIs have the form dserver://<server>/<backend>/<bucket>[/<uuid>], where
<backend>/<bucket> maps to the storage backend base URI
<backend>://<bucket> on the server:
dtool ls dserver://127.0.0.1:5000/s3/dtool-bucket/
dtool cp dserver://127.0.0.1:5000/s3/dtool-bucket/<uuid> ./local-copy/
dtool cp my-dataset dserver://127.0.0.1:5000/s3/dtool-bucket/

Benefits:
- No need for S3/Azure credentials on client machines
- Centralized access control through dserver
- Automatic dataset registration on upload
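
The mapping from a dserver:// URI to the backend base URI described above can be illustrated with a few lines of Python. This is only a sketch of the documented URI shape; split_dserver_uri is a hypothetical function and makes no claim about how dtool-dserver parses URIs internally.

```python
from urllib.parse import urlsplit


def split_dserver_uri(uri):
    """Split dserver://<server>/<backend>/<bucket>[/<uuid>] into its parts.

    Returns (server, base_uri, uuid), where base_uri is <backend>://<bucket>
    and uuid is None when the URI names only a bucket.
    """
    parts = urlsplit(uri)
    if parts.scheme != "dserver":
        raise ValueError(f"not a dserver URI: {uri!r}")
    # Drop empty segments so a trailing slash is tolerated.
    segments = [segment for segment in parts.path.split("/") if segment]
    if len(segments) not in (2, 3):
        raise ValueError(f"expected <backend>/<bucket>[/<uuid>] in {uri!r}")
    backend, bucket = segments[0], segments[1]
    uuid = segments[2] if len(segments) == 3 else None
    return parts.netloc, f"{backend}://{bucket}", uuid
```

For the example URIs above, dserver://127.0.0.1:5000/s3/dtool-bucket/ corresponds to the server 127.0.0.1:5000 and the base URI s3://dtool-bucket.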
Authentication is provided by the OAuth2 token generator plugin
(dserver-token-generator-plugin-oauth2). There are three ways to obtain a
token:
1. Browser (OAuth2 / ORCID login) — the webapp's login button drives this.
It redirects through GET /auth/login → the provider → GET /auth/callback,
and requires OAUTH2_CLIENT_ID / OAUTH2_CLIENT_SECRET in your .env.
2. Headless API key — POST /auth/token exchanges an API key for a JWT.
This requires OAUTH2_API_KEY to be set in the environment:
curl -X POST http://127.0.0.1:5000/auth/token \
-H "Content-Type: application/json" \
-d '{"api_key": "<your-OAUTH2_API_KEY>", "username": "admin"}'

3. Local development (no OAuth provider needed): sign a JWT directly with
the server's private key. This is what indexall.sh does:
TOKEN=$(python - <<'PY'
import jwt, datetime
key = open("jwt/jwt_key").read()
print(jwt.encode({
    "sub": "admin", "username": "admin",
    "iat": datetime.datetime.now(datetime.timezone.utc),
    "exp": datetime.datetime.now(datetime.timezone.utc) + datetime.timedelta(hours=1),
    "fresh": True,
}, key, algorithm="RS256"))
PY
)
curl -H "Authorization: Bearer $TOKEN" http://127.0.0.1:5000/config/info

Use the returned token in the Authorization: Bearer <token> header.

Note: Indexing datasets does not need a token; devbox run index (or the index-s3 service) uses the flask base_uri index CLI directly.
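
The headless API-key exchange (option 2) can also be scripted with the Python standard library. The sketch below mirrors the curl call: same endpoint, same JSON body. The function names are hypothetical, and because the shape of the response is not documented here, fetch_token_response assumes only that the server answers with JSON and returns that JSON unchanged.

```python
import json
import urllib.request

DSERVER_URL = "http://127.0.0.1:5000"


def build_token_request(api_key, username, base_url=DSERVER_URL):
    """Build (but do not send) the POST /auth/token request."""
    body = json.dumps({"api_key": api_key, "username": username}).encode()
    return urllib.request.Request(
        f"{base_url}/auth/token",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_token_response(api_key, username, base_url=DSERVER_URL):
    """Send the request and return the decoded JSON response as-is.

    Assumption: the response body is JSON. Inspect it to find the token.
    """
    request = build_token_request(api_key, username, base_url)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


def bearer_headers(token):
    """Headers that authenticate a request with an already issued JWT."""
    return {"Authorization": f"Bearer {token}"}
```

Keeping request construction separate from sending makes it possible to inspect the request without a running server; bearer_headers then builds the header for any later call.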
The following packages are installed in editable mode (from the submodules), so changes to the code are reflected immediately:
- dtoolcore - Core dtool library
- dtool-s3 - S3 storage backend for dtool
- dservercore - dserver core application
- dserver-search-plugin-mongo - MongoDB search plugin
- dserver-retrieve-plugin-mongo - MongoDB retrieve plugin
- dserver-dependency-graph-plugin - Dependency graph extension
- dserver-signed-url-plugin - Signed URL generation for direct S3/Azure access
- dserver-token-generator-plugin-oauth2 - OAuth2 (e.g. ORCID) JWT token generator
The dtool CLI meta-package (and dtool-dserver) are installed from PyPI. See
devbox/setup-venv.sh for the full list.
If you add new dependencies or want to rebuild from scratch:
devbox run rebuild-venv

devbox services up streams all service logs. To attach to the process-compose
TUI (status, per-process logs, restart controls) from another shell:

devbox services attach

To stop the stack, press Ctrl-C in the devbox services up terminal, or run from another shell:

devbox services stop

To also discard the runtime data (databases, object storage):

devbox run clean-data

Stack-wide environment variables are set in devbox.json (the env block);
per-service settings live in process-compose.yaml. The main ones are:
| Variable | Description |
|---|---|
| SQLALCHEMY_DATABASE_URI | PostgreSQL connection string |
| SEARCH_MONGO_URI | MongoDB URI for search plugin |
| RETRIEVE_MONGO_URI | MongoDB URI for retrieve plugin |
| JWT_PRIVATE_KEY_FILE | Path to JWT private key (jwt/jwt_key) |
| JWT_PUBLIC_KEY_FILE | Path to JWT public key (jwt/jwt_key.pub) |
The per-bucket DTOOL_S3_*_dtool-bucket settings contain hyphens, which a
shell export (and therefore the devbox.json env block) cannot represent.
They are injected into the relevant processes by devbox/with-dtool-s3.sh.
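
Although a shell export cannot set these names, a process environment can carry them when it is passed directly to a child process. The sketch below shows the idea in Python. It is not a description of what devbox/with-dtool-s3.sh does; dtool_s3_env and run_with_dtool_s3 are hypothetical helpers, and the values are the development defaults used elsewhere in this README.

```python
import os
import subprocess
import sys

BUCKET = "dtool-bucket"


def dtool_s3_env(bucket=BUCKET):
    """Return a copy of the current environment plus the per-bucket settings."""
    env = dict(os.environ)
    env.update({
        # Hyphens are fine here: the names are plain dictionary keys.
        f"DTOOL_S3_ENDPOINT_{bucket}": "http://127.0.0.1:9000",
        f"DTOOL_S3_ACCESS_KEY_ID_{bucket}": "minioadmin",
        f"DTOOL_S3_SECRET_ACCESS_KEY_{bucket}": "minioadmin",
        f"DTOOL_S3_DISABLE_BUCKET_VERSIONING_{bucket}": "true",
    })
    return env


def run_with_dtool_s3(command, bucket=BUCKET):
    """Run `command` (a list of arguments) with the settings injected."""
    return subprocess.run(
        command,
        env=dtool_s3_env(bucket),
        check=True,
        capture_output=True,
        text=True,
    )


if __name__ == "__main__":
    # Print one injected variable from inside a child Python process.
    probe = "import os; print(os.environ['DTOOL_S3_ENDPOINT_dtool-bucket'])"
    print(run_with_dtool_s3([sys.executable, "-c", probe]).stdout, end="")
```

With dtool installed, run_with_dtool_s3(["dtool", "ls", "s3://dtool-bucket/"]) would run the listing with the settings applied to that one process only.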
The stack creates a bucket named dtool-bucket on MinIO and makes it publicly
readable. Because everything runs on 127.0.0.1, datasets are reachable at the
same URLs from both the server and the host — no /etc/hosts entry needed.
This repository includes the following submodules:
| Submodule | Description |
|---|---|
| dtool | dtool CLI meta-package |
| dtoolcore | Core Python API for managing datasets |
| dtool-s3 | S3 storage backend for dtool |
| dtool-dserver | Storage broker for accessing datasets via dserver |
| dservercore | dserver Flask application |
| dserver-search-plugin-mongo | MongoDB search plugin |
| dserver-retrieve-plugin-mongo | MongoDB retrieve plugin |
| dserver-dependency-graph-plugin | Dependency graph extension |
| dserver-notification-plugin | Notification extension |
| dserver-signed-url-plugin | Signed URL generation plugin |
| dserver-token-generator-plugin-oauth2 | OAuth2 (ORCID) JWT token generator plugin |
| dtool-lookup-webapp | Vue.js web frontend |
| dserver-client-js | JavaScript/TypeScript client library |
devbox services up prints each service's logs inline (prefixed with the
service name). Look for the failing process there, or use the process-compose
TUI via devbox services attach.
Common issues:
- Database not ready: dserver waits on the postgres/mongo health checks; on the very first run these take a few extra seconds.
- Stale .venv: rebuild with devbox run rebuild-venv.
- Corrupt runtime data: reset with devbox run clean-data, then restart.
MongoDB is unfree in Nixpkgs. Export NIXPKGS_ALLOW_UNFREE=1 before
devbox shell / devbox install.
See the LICENSE file for details.