From 84f5e6a5fbcaf4feccd37ac224d4fd272c61a83c Mon Sep 17 00:00:00 2001 From: postoso Date: Fri, 19 Jun 2026 00:37:09 -0400 Subject: [PATCH 1/3] fix: validate downloaded model content + surface onboarding errors (#353, #355) Reject HTML/markup proxy/block pages before persisting them as model files so a corrupt payload (e.g. a corporate proxy notification page returned with HTTP 200) is never cached permanently as a model artifact, and surface model download/load failures during onboarding instead of silently logging them. Content validation (#353): reject any payload whose Content-Type is HTML/XML, or whose leading bytes (after stripping a UTF-8 BOM + ASCII whitespace) begin with `<` followed by a markup-ish byte (an ASCII letter, `!`, `?`, or `/`). No artifact this downloader fetches legitimately begins with `<` (CoreML `.mlmodelc`/`.mlpackage` payloads are binary or JSON, `.mil` starts with `program`, vocab is JSON, the tokenizer is a SentencePiece binary), so this catches ``, ``, `