Amirali/overlay#7

Open
aircode610 wants to merge 6 commits into main from amirali/overlay
Conversation

@aircode610

Owner

No description provided.

aircode610 and others added 6 commits June 6, 2026 19:05
Emphasize pixel-perfect preservation of original document. The model
should only ADD red handwritten overlays on blank fields, never
modify or regenerate any existing content.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Two-step deterministic approach:
1. Qwen-VL detects blank form fields + coordinates (% of image)
2. Pillow draws red placeholder boxes + English text at those positions

No image regeneration — original document preserved pixel-perfect.
Robust JSON parsing handles LLM quirks (single quotes, trailing commas,
malformed coordinate pairs).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
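
The tolerant parsing this commit mentions could look roughly like the sketch below. It is not the PR's code: the function name `parse_fields` and the field keys in the usage line are hypothetical, and it covers only single quotes and trailing commas, not malformed coordinate pairs.

```python
import ast
import json
import re


def parse_fields(raw):
    """Best-effort parse of a model reply expected to contain a JSON array."""
    # Keep only the outermost [...] span and ignore any prose around it.
    start, end = raw.find("["), raw.rfind("]")
    if start == -1 or end < start:
        raise ValueError("no JSON array found in model reply")
    text = raw[start:end + 1]
    # Drop trailing commas that sit directly before a closing bracket or brace.
    # Caveat: this would also touch such a sequence inside a string value.
    text = re.sub(r",\s*([\]}])", r"\1", text)
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        # Single-quoted strings are valid Python literals, so fall back to
        # ast.literal_eval, which evaluates literals only and runs no code.
        return ast.literal_eval(text)


# Hypothetical reply with single quotes and trailing commas.
fields = parse_fields("Fields: [{'label': 'Name', 'x': 10, 'y': 20,},]")
```

The fallback fails on JSON-only tokens such as `true` or `null` when they appear alongside single quotes, so a production version would likely need a more careful repair step.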
…ction

Two-step approach:
1. Qwen-VL-Max detects blank form fields → bbox_2d [x1,y1,x2,y2] in 0-1000 range
2. Pillow draws red highlight boxes + English placeholder text at exact positions

Tested on tax form (data/fields.png): all 8 form fields detected and annotated.
Original image preserved pixel-perfect — only red overlays added.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
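
Mapping a `bbox_2d` from the 0-1000 range described in this commit to pixel coordinates is a per-axis rescale. The sketch below assumes that range holds; `bbox_to_pixels` and its `grid` parameter are hypothetical names, not taken from the PR.

```python
def bbox_to_pixels(bbox_2d, width, height, grid=1000):
    """Convert [x1, y1, x2, y2] on a 0..grid scale to integer pixel coordinates."""
    x1, y1, x2, y2 = bbox_2d
    sx, sy = width / grid, height / grid
    # Sort each axis so the result is always (left, top, right, bottom).
    left, right = sorted((x1 * sx, x2 * sx))
    top, bottom = sorted((y1 * sy, y2 * sy))

    def clamp(value, upper):
        # Keep coordinates inside the image even if the model overshoots.
        return max(0, min(upper, round(value)))

    return (clamp(left, width), clamp(top, height),
            clamp(right, width), clamp(bottom, height))


box = bbox_to_pixels([100, 250, 500, 300], width=2000, height=1000)
```

With Pillow, the resulting tuple can be passed to `ImageDraw.Draw(image).rectangle(box, outline="red", width=2)`, which draws on top of the loaded image without regenerating it.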
…lution

Qwen-VL returns bbox_2d in its internal processing resolution, not
a fixed 0-1000 range. Now derives scale factors from the max coordinate
values in the response, mapping accurately to actual image pixels.

Tested on data/fields.png: all 8 form fields correctly overlaid.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
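
The commit does not say how the scale factors are derived from the maximum coordinates. One possible reading, sketched below with hypothetical function names, treats the largest x and y in the response as the extent of the model's coordinate space. That is a heuristic: it overestimates the scale whenever no detected field reaches the right or bottom edge of the image.

```python
def estimate_scale(boxes, width, height):
    """Guess per-axis scale factors from the largest coordinates seen."""
    max_x = max(max(b[0], b[2]) for b in boxes)
    max_y = max(max(b[1], b[3]) for b in boxes)
    if max_x <= 0 or max_y <= 0:
        raise ValueError("boxes must contain positive coordinates")
    # Assumption: the largest coordinate corresponds to the image edge.
    return width / max_x, height / max_y


def scale_boxes(boxes, width, height):
    """Map model-space boxes to pixel space using the estimated scale."""
    sx, sy = estimate_scale(boxes, width, height)
    return [
        (round(x1 * sx), round(y1 * sy), round(x2 * sx), round(y2 * sy))
        for x1, y1, x2, y2 in boxes
    ]


scaled = scale_boxes([[10, 20, 100, 40], [50, 60, 200, 80]], width=400, height=160)
```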
Boxes now measured from the actual text width instead of the full
detected bbox — no more covering existing printed labels.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
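
Sizing the box from the text rather than from the detected bbox could be done roughly as follows. This is a sketch under assumptions: `draw` is expected to be a Pillow `ImageDraw` object and `font` a TrueType font, and the helper names and the `pad` parameter are hypothetical.

```python
def pad_box(box, pad=2):
    """Grow a (left, top, right, bottom) box by `pad` pixels on every side."""
    left, top, right, bottom = box
    return (left - pad, top - pad, right + pad, bottom + pad)


def placeholder_box(draw, xy, text, font, pad=2):
    """Box that hugs `text` as it would be rendered at `xy`."""
    # ImageDraw.textbbox returns the bounding box of the rendered text,
    # so the highlight never extends over neighbouring printed labels.
    return pad_box(draw.textbbox(xy, text, font=font), pad)
```

In use, `draw` would come from `ImageDraw.Draw(image)` and `font` from `ImageFont.truetype(...)`; the returned box can then be drawn with `draw.rectangle(...)` before the text itself is written with `draw.text(xy, text, fill="red", font=font)`.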
1 participant