- Framework: Roda (routing tree web toolkit) with Sequel ORM and SQLite
- Auth: Rodauth (login, email auth/magic links, password reset, lockout)
- Server: Falcon (async Ruby web server), using `falcon serve` with `--threaded`
- CSS: Tailwind CSS, compiled via the `tailwindcss-ruby` gem
- Assets: Roda assets plugin with precompilation (`assets/compiled_assets.json`)
- Ruby version: Defined in `.ruby-version`
git clone git@github.com:stephaniewilkinson/yonderbook.git
cd yonderbook
cp .env-example .env # if you msg me I can share my api keys
bundle install
rake db:migrate
falcon
Production (Render):
sqlite3 /var/data/production.db
Development:
sqlite3 db/development.db
bundle exec rake test
Tests require environment variables — copy .env-example to .env and fill in values.
- `app.rb`: Main Roda application class with routing, plugins, and Rodauth config
- `config.ru`: Rack config; loads Sentry, sets up env-specific middleware
- `Rakefile`: Defines `precompile`, `tailwind:build`, `tailwind:watch`, and loads `lib/tasks/*.rake`
- `lib/database.rb`: Sequel/SQLite setup; creates the `DB` constant, path depends on `RACK_ENV`
- `lib/tasks/db.rake`: Database rake tasks (migrate, reset, create_migration)
TODO: Clearly display the Goodreads name or logo on any location where Goodreads data appears. For instance if you are displaying Goodreads reviews, they should either be in a section clearly titled "Goodreads Reviews", or each review should say "Goodreads review from John: 4 of 5 stars..."
TODO: Link back to the page on Goodreads where the data appears. For instance, if displaying a review, the name of the reviewer and a "more..." link at the end of the review must link back to the review detail page. You may not nofollow this link.
The signup form uses a honeypot field to block bot registrations. A hidden name field is rendered off-screen — humans never see it, but bots parsing the form will fill it in. If the field has a value on POST, the request is silently redirected to the /check-email page without any database interaction. The bot thinks the signup succeeded.
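A minimal sketch of the honeypot decision described above. The helper name `handle_signup` and the return shape are hypothetical; only the hidden `name` field and the `/check-email` redirect come from the description, and the real route lives in the Roda app.

```ruby
CHECK_EMAIL_PATH = '/check-email'

# Hypothetical helper: returns [:redirect, path] for suspected bots and
# [:create_account, nil] for ordinary submissions. The bot branch returns
# before any database work would happen.
def handle_signup(params)
  honeypot = params.fetch('name', '').to_s.strip
  return [:redirect, CHECK_EMAIL_PATH] unless honeypot.empty?

  [:create_account, nil]
end
```

A bot that fills in every field gets the same redirect a real user would see after signing up, so it has no signal that it was filtered.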
BookMooch is a book trading community where users can give away books they no longer need and receive books they want.
The BookMooch API allows up to 10 requests/second. Exceeding this results in 302 redirect responses (not standard 429s). In practice, keeping requests concurrent with a connection pool limit (rather than throttling with a rate limiter) works best — a leaky bucket limiter causes timeouts and connection issues with BookMooch's server.
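The idea of capping concurrency rather than throttling can be sketched with a fixed pool of worker threads. This is a stand-in using only the standard library, not the app's actual code; `map_with_pool` and the limit value are assumptions for illustration.

```ruby
# Runs the block for every item using at most `limit` worker threads, so no
# more than `limit` requests are in flight at once. Results keep input order.
def map_with_pool(items, limit:, &block)
  queue = Queue.new
  items.each_with_index { |item, index| queue << [item, index] }
  queue.close # pop returns nil once the closed queue is drained

  results = Array.new(items.size)
  workers = Array.new([limit, items.size].min) do
    Thread.new do
      while (pair = queue.pop)
        item, index = pair
        results[index] = block.call(item)
      end
    end
  end
  workers.each(&:join)
  results
end
```

Usage would look like `map_with_pool(isbns, limit: 4) { |isbn| fetch(isbn) }`, where `fetch` is whatever performs the HTTP call. Unlike a leaky bucket, nothing waits on a timer: a new request starts as soon as a worker is free.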
All API calls accept parameters via either GET (URL params) or POST (body). Use POST for large payloads like bulk ASIN/ISBN submissions — GET has a ~2048 character URL limit, so large ISBN lists must be batched. POST can send arbitrarily large fields in a single request.
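If GET has to be used, the batching mentioned above could be done along these lines. The base URL, parameter name, and separator are placeholders, not confirmed BookMooch names; only the ~2048 character limit comes from the text.

```ruby
require 'uri'

MAX_URL_LENGTH = 2048 # approximate GET limit from the text above

# Builds the GET URL for one batch of values.
def get_url(base_url, param, values, separator)
  "#{base_url}?#{URI.encode_www_form(param => values.join(separator))}"
end

# Splits `isbns` into groups whose encoded GET URL stays within `max_length`.
# `base_url`, `param`, and `separator` are hypothetical placeholders.
def batch_for_get(isbns, base_url:, param:, separator: ',', max_length: MAX_URL_LENGTH)
  batches = [[]]
  isbns.each do |isbn|
    candidate = batches.last + [isbn]
    too_long = get_url(base_url, param, candidate, separator).length > max_length
    if too_long && !batches.last.empty?
      batches << [isbn]
    else
      batches.last << isbn
    end
  end
  batches.reject(&:empty?)
end
```

The length is measured on the encoded URL, since separators may expand when percent-encoded. With POST none of this is needed.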
Errors are indicated by a negative result_code field in the XML response, with a result_text description:
<?xml version="1.0" encoding="UTF-8"?>
<userids>
<userid>
<id>john_smith</id>
<result_code>-1</result_code>
<result_text>no data found</result_text>
</userid>
</userids>
The /api/userbook endpoint uses HTTP Basic Auth. A 302 response means rate limiting; a 401 or HTML error page means invalid credentials (users should use their BookMooch username, not email).
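As an illustration of the negative `result_code` convention, the sketch below pulls error entries out of a response using REXML. The helper name `bookmooch_errors` is hypothetical, and it assumes every entry carries its `result_code` as a direct child, as in the sample response.

```ruby
require 'rexml/document'

# Returns an array of { id:, code:, text: } hashes for entries whose
# result_code is negative. Entries with no result_code are ignored.
def bookmooch_errors(xml)
  doc = REXML::Document.new(xml)
  errors = []
  doc.elements.each('//result_code') do |code_el|
    code = code_el.text.to_i
    next unless code.negative?

    entry = code_el.parent
    errors << {
      id: entry.elements['id']&.text,
      code: code,
      text: entry.elements['result_text']&.text
    }
  end
  errors
end
```

Checking the XML body matters because an error can arrive with a normal HTTP status; the status code alone only distinguishes the 302 and 401 cases described above.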
OverDrive provides APIs for searching library digital collections and checking availability.
Uses OAuth2 client credentials flow via https://oauth.overdrive.com/token. The returned bearer token is used for all subsequent API calls. Tokens are short-lived and should be fetched per-session.
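A sketch of building the token request with Net::HTTP. It assumes the standard OAuth2 client credentials shape (client key and secret sent via HTTP Basic, `grant_type=client_credentials` in the form body); check OverDrive's documentation for anything beyond that. `build_token_request` is a hypothetical helper name.

```ruby
require 'net/http'
require 'uri'

TOKEN_URL = URI('https://oauth.overdrive.com/token')

# Builds (but does not send) the client credentials token request.
def build_token_request(client_key, client_secret)
  request = Net::HTTP::Post.new(TOKEN_URL)
  request.basic_auth(client_key, client_secret)
  request.set_form_data('grant_type' => 'client_credentials')
  request
end

# Sending it would look roughly like this (not executed here):
#   response = Net::HTTP.start(TOKEN_URL.host, TOKEN_URL.port, use_ssl: true) do |http|
#     http.request(build_token_request(key, secret))
#   end
```

The bearer token from the JSON response then goes into an `Authorization: Bearer ...` header on subsequent API calls.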
Library info — GET /v1/libraries/{consortiumId}
Returns collection token, website ID, and homepage URL. The collectionToken is required for all product/availability queries.
Product search — GET /v1/collections/{collectionToken}/products?q={query}
Searches the library's digital catalog. Accepts a single query string (ISBN, title, or author). Does not support batch/bulk queries — there is no way to search multiple ISBNs in one call. Pagination via limit (default 25) and offset.
Availability (v2) — GET /v2/collections/{collectionToken}/availability?products={id1},{id2},...
Accepts up to 25 comma-separated product IDs per request. Returns copiesAvailable, copiesOwned, and hold counts. Product IDs (reserveId) come from search results. Cannot accept ISBNs directly — must resolve ISBN to product ID via search first.
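Because availability takes at most 25 product IDs per call, a list of IDs has to be sliced before querying. A minimal sketch that builds the request paths (the helper name `availability_paths` is an assumption; the path template is the one shown above):

```ruby
AVAILABILITY_BATCH_SIZE = 25

# Returns one availability request path per batch of up to 25 product IDs.
def availability_paths(collection_token, product_ids)
  product_ids.uniq.each_slice(AVAILABILITY_BATCH_SIZE).map do |batch|
    "/v2/collections/#{collection_token}/availability?products=#{batch.join(',')}"
  end
end
```

For 500 resolved product IDs this yields 20 availability calls, which is small next to the per-book search phase.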
- No bulk search: Each book requires its own search API call. For a shelf of 500 books, that's 500+ search calls. This is the main bottleneck.
- Print ISBNs are not searchable: Goodreads shelves contain print ISBNs, but only digital ISBNs (ebook/audiobook format) are searchable via the `identifiers` parameter. Print ISBNs appear in `otherFormatIdentifiers` in responses but cannot be used as search input. This is why the code falls back to title+author matching when ISBN search returns no results.
- Rate limits are undocumented: The API Usage Requirements say "honor any limitations we set" but don't publish specific numbers. The code uses `Async::Semaphore.new(16)` for concurrent requests.
- Availability is product-ID-only: The v2 availability endpoint requires OverDrive product IDs, not ISBNs. A two-phase lookup (search then availability) is unavoidable without a local index.
Cache ISBN-to-product-ID mappings in the database. After the first lookup, store the mapping so repeat shelf checks skip the expensive search phase and go straight to availability batches. This would reduce repeat visits from O(n) search calls to O(new_books) searches + O(n/25) availability calls.
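The caching idea could be sketched as follows. A plain Hash stands in for the database table, and the block stands in for the expensive search call; both are assumptions for illustration, as is the choice to cache only successful lookups.

```ruby
# `cache` maps ISBN => product ID (a Hash here; a table in the real app).
# The block performs the search for one ISBN and returns a product ID or nil.
def resolve_product_ids(isbns, cache)
  isbns.each_with_object({}) do |isbn, resolved|
    product_id = cache[isbn]
    if product_id.nil?
      product_id = yield(isbn)
      cache[isbn] = product_id if product_id # only cache successful lookups
    end
    resolved[isbn] = product_id if product_id
  end
end
```

On a repeat visit only ISBNs missing from the cache trigger a search. Misses are retried each time in this sketch, since a library may add the title later; caching misses with an expiry would be a possible refinement.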
Local collection index (future). The products endpoint supports ?lastUpdateTime={timestamp} for incremental sync. Could paginate the entire library collection into a local table, then match ISBNs locally. Initial sync: 400-3,200 calls for a typical library (10k-80k titles at 25/page), then incremental updates. Eliminates per-book search calls entirely.
Books are processed in chunks of 100 to bound memory. Each chunk completes the full pipeline (search -> expand editions -> fetch availability) before the next starts. Raw JSON response bodies are discarded after parsing. Timing and RSS memory usage are logged per-chunk for monitoring.
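A rough sketch of the chunked loop. The block stands in for the full per-chunk pipeline, and `process_in_chunks` and the `logger` parameter are hypothetical names; RSS logging is left out here and only timing is shown.

```ruby
CHUNK_SIZE = 100

# Yields each chunk to the block (a placeholder for the search -> expand
# editions -> fetch availability pipeline) and logs wall-clock time per chunk.
def process_in_chunks(books, chunk_size: CHUNK_SIZE, logger: ->(msg) { warn msg })
  books.each_slice(chunk_size).with_index(1).flat_map do |chunk, number|
    started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    results = yield(chunk)
    elapsed_ms = ((Process.clock_gettime(Process::CLOCK_MONOTONIC) - started) * 1000).round(1)
    logger.call("[chunk #{number}] books=#{chunk.size} duration=#{elapsed_ms}ms")
    results
  end
end
```

Because each chunk finishes before the next begins, intermediate data for at most one chunk is alive at a time.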
The app runs on Render's Starter plan (512MB RAM). The process starts at ~100MB and grows steadily until OOM kill at 512MB.
There are two layers to the problem:
Layer 1: Per-request memory allocations that are never returned to the OS. Every GET / request leaked ~0.2-0.4MB of RSS, even though the homepage is a static marketing page with no DB queries or API calls. The leak came from middleware and analytics running on every request, including bot/monitor traffic hitting / every minute:
- Sentry transaction tracing (`traces_sample_rate = 0.1`): The `CaptureExceptions` middleware clones the Sentry hub, creates a scope, stores the full Rack `env` hash in the scope, and creates transaction/span objects for 10% of requests. Under Falcon's fiber-based concurrency, hub clones stored in `Thread.current` may not clean up properly between fibers.
- PostHog analytics on homepage: `Analytics.track` queued a PostHog event with a unique distinct_id (new session UUID) for every bot request. Useless analytics noise that allocated objects into PostHog's internal queue.
- Session writes for bots: `session['session_id'] ||= SecureRandom.uuid` forced the Roda sessions plugin to encrypt and set a cookie on every request, even for bots that never send cookies back.
Layer 2: Memory that GC cannot reclaim. Even after Ruby's major GC collects objects (old_objects drops from 549k to 50k), RSS doesn't decrease -- it stays at 506MB and keeps climbing. This happens even with MALLOC_ARENA_MAX=2 set, ruling out simple glibc arena fragmentation. The retained memory likely comes from C-level allocations in OpenSSL (used by Sentry's HTTP transport and session encryption) and object-slot fragmentation in Ruby's heap pages.
Server starts at ~100MB. At 0.3MB/request with bot traffic every minute:
- ~23 hours to reach 512MB and trigger SIGKILL
- SIGKILL cannot be caught -- no Ruby error handler, no Sentry, nothing runs
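The ~23 hour estimate follows from simple arithmetic, reproduced here as a small helper (the function name is hypothetical):

```ruby
# Hours until RSS grows from start_mb to limit_mb at a constant leak rate.
def hours_until_oom(start_mb:, limit_mb:, leak_mb_per_request:, requests_per_minute:)
  requests = (limit_mb - start_mb) / leak_mb_per_request.to_f
  requests / requests_per_minute / 60.0
end

hours_until_oom(start_mb: 100, limit_mb: 512, leak_mb_per_request: 0.3, requests_per_minute: 1).round(1)
# => 22.9
```

That is (512 - 100) / 0.3, roughly 1,373 requests, at one request per minute.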
Homepage served before middleware (app.rb) -- r.root is now matched before enrich_sentry, session['session_id'] assignment, and identify_user. Bot traffic to / no longer creates sessions, Sentry scopes, or PostHog events. This eliminates the primary source of per-request allocations.
Sentry::Rack::CaptureExceptions middleware removed (app.rb) -- This middleware cloned the Sentry hub, created a scope storing the full Rack env, and ran session tracking on every request. Under Falcon's fiber/thread model, these allocations leaked ~0.2-0.4MB/request that was never reclaimed. Errors are still captured via Sentry.capture_exception in the app's rescue block and error_handler plugin. Also set traces_sample_rate = 0 in config.ru to disable transaction tracing.
Periodic GC.compact (lib/memory_logger.rb) -- When RSS exceeds 400MB, GC.compact runs every 100 requests. This consolidates the Ruby heap so free pages can be returned to the OS. Won't fully solve malloc fragmentation but helps with Ruby-level fragmentation.
MALLOC_ARENA_MAX=2 (set in Render dashboard) -- Limits glibc to 2 memory arenas instead of the 64-bit default of 8 per CPU core. Heroku made this the default for all Ruby apps. Already set; insufficient on its own to prevent OOM -- the Sentry middleware removal was the critical fix.
Process.warmup (config.ru, production only) -- Ruby 3.3+ API that compacts the heap and optimizes GC after boot, before serving requests.
RUBY_GC_HEAP_OLDOBJECT_LIMIT_FACTOR=1.3 (optional) -- Triggers major GC more frequently. Sam Saffron measured ~22% RSS reduction. Causes more GC pauses, acceptable at low traffic.
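The periodic `GC.compact` mitigation might look roughly like the Rack middleware below. This is a sketch under assumptions, not the contents of lib/memory_logger.rb: the class name, the injectable `rss_reader` and `compactor` parameters, and reading RSS from /proc are all illustrative choices.

```ruby
class PeriodicCompactor
  def initialize(app, threshold_mb: 400, every: 100,
                 rss_reader: -> { PeriodicCompactor.current_rss_mb },
                 compactor: -> { GC.compact })
    @app = app
    @threshold_mb = threshold_mb
    @every = every
    @rss_reader = rss_reader
    @compactor = compactor
    @count = 0
    @mutex = Mutex.new # request threads share the counter
  end

  def call(env)
    response = @app.call(env)
    count = @mutex.synchronize { @count += 1 }
    # Compact on every Nth request, but only while RSS is above the threshold.
    @compactor.call if (count % @every).zero? && @rss_reader.call > @threshold_mb
    response
  end

  # Linux-only: reads VmRSS (in kB) from /proc; returns 0.0 elsewhere.
  def self.current_rss_mb
    File.read('/proc/self/status')[/^VmRSS:\s+(\d+)/, 1].to_f / 1024
  rescue SystemCallError
    0.0
  end
end
```

Checking RSS only on every Nth request keeps the per-request cost to a counter increment.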
MemoryLogger middleware (lib/memory_logger.rb) logs RSS and GC stats on every request. Runs in production only, skips /health and static assets. Logs twice per request -- START and END -- so the killing request is identifiable even after SIGKILL.
[mem] #42 START GET /goodreads/shelves rss=294.2MB
[mem] #42 END GET /goodreads/shelves status=200 duration=1234.5ms rss=312.4MB delta=+18.2MB heap_live=1823456 old_objects=982341 major_gc=1 minor_gc=3
[mem] #43 START GET /login rss=312.4MB
<-- process killed here, no END line
How to read the logs: A START with no matching END is the request that caused OOM. Large positive delta values on END lines show which requests grow memory. The WARNING line fires when RSS exceeds 400MB. After the fix, look for [mem] GC.compact lines showing compaction results.
The mitigations above eliminated the biggest leak (bot traffic on /), but RSS still creeps up on non-homepage requests. Every authenticated request runs through this pipeline (app.rb lines 103-121):
- Session decryption/encryption -- Rodauth decrypts the incoming session cookie and re-encrypts the outgoing one via OpenSSL. Cipher contexts are C-level `malloc` allocations.
- Sentry scope calls -- `enrich_sentry` calls `Sentry.set_user` and `Sentry.set_tags` on every request, creating scope objects on the Sentry hub even without the middleware.
- PostHog identify on every request -- `identify_user` calls `Analytics.alias_user` and `Analytics.identify` for every logged-in request, pushing events onto PostHog's internal queue.
- DB query -- `Account[rodauth.session_value]` runs a database query on every authenticated request.
The problem isn't Ruby objects -- GC collects those fine (old_objects drops from 549k to 50k). The problem is glibc malloc fragmentation from C-level allocations. OpenSSL cipher contexts, Sentry internals, and database buffers are allocated via malloc(). When freed, they leave holes in the heap that glibc can't return to the OS. Falcon's fiber concurrency makes this worse -- fibers interleave allocations across memory pages, so no page is ever fully free.
GC.compact only helps Ruby heap pages. MALLOC_ARENA_MAX=2 limits arenas but doesn't prevent fragmentation within them.
malloc_trim gem -- Calls malloc_trim() after each major GC cycle to return freed glibc pages to the OS. ~1% CPU overhead, Linux only (which Render uses). This is the lowest-effort next step. Typical RSS reduction: 10-30%.
jemalloc -- A drop-in malloc replacement that returns memory to the OS far more aggressively. Used by GitLab, Discourse, and Mastodon. However, it requires a Docker deploy on Render (apt-get install libjemalloc2 + LD_PRELOAD), which is overkill unless malloc_trim proves insufficient. Typical RSS reduction: 25-40%.
Health-check-based restart -- Write a custom /health that returns 500 when RSS > 450MB. Render restarts after 60s of failed checks. This is a fallback, not a fix.
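The health-check fallback reduces to a threshold comparison. A minimal sketch returning a Rack-style response triple; `health_response` is a hypothetical helper, and the caller is assumed to supply the current RSS in MB:

```ruby
HEALTH_RSS_LIMIT_MB = 450

# Returns a Rack response: 500 when RSS is over the limit, 200 otherwise.
def health_response(rss_mb, limit_mb: HEALTH_RSS_LIMIT_MB)
  if rss_mb > limit_mb
    [500, { 'content-type' => 'text/plain' }, ["rss #{rss_mb.round(1)}MB over limit"]]
  else
    [200, { 'content-type' => 'text/plain' }, ['ok']]
  end
end
```

The 450MB limit leaves headroom below 512MB so the restart can happen before the OOM kill.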
- Sentry / error_handler plugin -- SIGKILL terminates the process before any Ruby code can execute. These only catch Ruby exceptions.
- Reducing TupleSpace TTL -- Cached entries are ~1-2KB each, negligible at this scale.
- Wrapping OAuth calls in `Sync do` -- Inside Falcon, `Sync do` is a no-op (already in an async task). Net::HTTP calls are automatically non-blocking via Ruby's fiber scheduler.
- Removing `--verbose` from Falcon -- Falcon's verbose middleware writes to stdout and doesn't buffer in memory.
Observed GET / requests every ~1 minute growing RSS by 0.2-0.4MB with major_gc=0 minor_gc=0 on most requests. Key data points:
02:58 rss=500.7MB (WARNING threshold)
03:09 rss=506.3MB old_objects drops 549201 -> 50934 (major GC ran, but RSS didn't shrink)
03:10 rss=507.4MB old_objects=183943 (climbing back up)
03:13 rss=510.0MB -> OOM kill, Render restarts process
03:14 rss=100.3MB (fresh start, first request)
The fact that RSS didn't decrease after major GC -- even with MALLOC_ARENA_MAX=2 already set -- pointed to the Sentry middleware as the primary culprit. Sentry::Rack::CaptureExceptions clones the hub, creates scopes, and stores the Rack env on every request. Under Falcon's fiber/thread model, these allocations aren't properly reclaimed. Fix: remove Sentry middleware (keep manual error capture), move homepage route before session/analytics middleware, add GC.compact safety net.
Deployed on Render with a persistent disk for SQLite at /var/data/production.db.
Render does not use the Procfile — commands are set in the dashboard under Settings:
Build command:
bundle install && bundle exec rake precompile
Start command:
bundle exec rake db:migrate && bundle exec falcon --verbose serve --threaded -n 2 -b http://0.0.0.0:${PORT}
- Render's persistent disk (`/var/data`) is only mounted at runtime, not during builds. Migrations must run in the start command.
- Rake tasks in `lib/tasks/` must not `require` `database.rb` at the top level, because it calls `FileUtils.mkdir_p('/var/data')`, which fails during builds. Require it lazily inside task bodies that need it.
- The `precompile` task uses a bare Roda class (not the full App) to avoid loading all app dependencies during the build. `app.rb` also calls `compile_assets` at startup.
- `tailwindcss-ruby` must stay in the top-level Gemfile group (not `:development`) because it's needed by the build step.
This app uses the roda-route-list plugin. This makes all the routes available in a /routes.json file.
openssl req -x509 -out localhost.crt -keyout localhost.key \
-newkey rsa:2048 -nodes -sha256 \
-subj '/CN=localhost' -extensions EXT -config <( \
printf "[dn]\nCN=localhost\n[req]\ndistinguished_name = dn\n[EXT]\nsubjectAltName=DNS:localhost\nkeyUsage=digitalSignature\nextendedKeyUsage=serverAuth")