Add Credo and pay down tech debt: dedup generation paths, narrow rescues#65
Merged
Conversation
- Add credo ~> 1.7 (dev/test) and a .credo.exs that excludes the NIF entry points from FunctionArity (arities mirror the fixed C ABI) - Replace length/1 emptiness checks and fix number formatting in tests - Alphabetize alias groups - Swap negated if-else branches in Budget.distribute/3 and MTP.init/2 - Flatten deep nesting: guard clauses in Strategy.Batch reduces, pattern-matched embed helpers in Schema, split_weights/check_gpu in Budget, and run_forward_pass/emit_tick_telemetry/continue_if_active extracted from Server.run_tick/1 - Dedupe HuggingFace request handling in Hub behind hf_get/3 and hf_api_error/2 with parse_search_results/parse_gguf_tree helpers
- llama_cpp_ex.ex: extract shared generation plumbing (gen_config, create_gen_resources, spawn_generator/stop_generator, chunk builder, finish_reason mapping) — the four entry points were ~90% copy-paste. generate/3 now honors all @context_opt_keys like the other entry points (it previously forwarded only 4 of them), and chat_completion/3 now kills/drains its generator after collection so a timeout can't leave a runaway process and stray mailbox messages. - server.ex: single idle_slot_fields/2 source of truth for slot resets (init/reset_slot/fail_all_active_slots each spelled out all 18 fields); shared request_measurements/2 behind the :done/:exception telemetry. - model_manager.ex: with_route/3 dispatch skeleton for generate/stream/ chat; merged the two build_ready_entry clauses; narrowed safe_backend_init's rescue-all to ErlangError/UndefinedFunctionError with a debug log. - hub.ex: do_download_to rescues only File.Error/ErlangError so programming errors propagate. - tokenizer.ex/chat.ex: NIF ErlangError wrappers now re-raise :not_loaded (packaging bug, not bad input) via NIF.error_tuple/3. - embedding.ex: map_while_ok/2 replaces three hand-rolled short-circuit folds (also fixes O(n²) accumulation in decode_groups); NIF wrappers are thin delegates instead of no-op case passthroughs. - tests: fix MTP n_rs_seq assertion — the draft context is intentionally created with n_rs_seq=0 (rollback uses cached hidden states); the old assertion contradicted the implementation and always failed.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Adds Credo to the project and fixes every
--strictfinding, then addresses thelarger technical-debt items a full-codebase review surfaced: heavy copy-paste in
the generation entry points, triplicated slot-reset field lists in the server,
and overly broad
rescueclauses that could swallow real bugs.Tooling
credo ~> 1.7(dev/test) and a.credo.exs; the NIF entry points areexcluded from
FunctionAritysince their arities mirror the fixed C ABI.mix credo --strictnow reports 0 issues.Refactors
LlamaCppEx— the four generation entry points (generate,stream,chat_completion,stream_chat_completion) were ~90% copy-paste. They nowshare
gen_config/1,create_gen_resources/3,spawn_generator/4/stop_generator/2, achunk/3builder, and onefinish_reason/1mapping.Server— the 18-field slot map is defined once inidle_slot_fields/2(previously spelled out verbatim in
init/1,reset_slot/2, andfail_all_active_slots/2— a new field missed in one spot would silentlycarry stale state across requests).
:done/:exceptiontelemetry sharerequest_measurements/2.run_tick/1split into focused helpers.ModelManager—with_route/3dispatch skeleton forgenerate/stream/chat; merged the twobuild_ready_entryclauses.Hub— HuggingFace requests sharehf_get/3+hf_api_error/2(the 401/403/404 handling was copy-pasted across four functions).
Embedding—map_while_ok/2replaces three hand-rolled short-circuitfolds; NIF wrappers are thin delegates instead of no-op
casepassthroughs.Fixes
generate/3now honors all documented context options — it previouslyforwarded only 4 of the 20
@context_opt_keys(e.g.flash_attnwassilently ignored), unlike the other three entry points.
chat_completion/3now kills and drains its generator after collection —a timeout previously leaked a running generator process and left stray
messages in the caller's mailbox permanently.
decode_groups/3no longer accumulates with O(n²)acc ++ embs.flattened into error strings:
Hub.do_download_to/4(wasrescue e ->),ModelManager.safe_backend_init/0(wasrescue _ -> :ok+catch), andthe tokenizer/chat NIF wrappers now re-raise
:not_loaded(a packagingbug, not bad input) via a shared
NIF.error_tuple/3.n_rs_seq >= n_draft, but the draft context is intentionally created withn_rs_seq: 0(rollback uses cached hidden states, matching upstream).Verification
mix compile --warnings-as-errors,mix format --check-formatted,mix credo --strict(0 issues),mix dialyzer(0 errors) all pass.the previously always-failing MTP assertion.
Net: 13 files changed, +560 / −787 across the two commits.