feat: Improve link matching#3

Open
KHROTU wants to merge 1 commit into exefer:master from KHROTU:master

KHROTU commented Jun 10, 2026

added normalized_url, which does:

  • lowercase the scheme and host (e.g. HTTPS -> https, .COM -> .com)
  • strip www.
  • remove ports
  • remove fragment (e.g. /page#thing -> /page)
  • remove tracking params (e.g. ?utm_source=smth&share_id=xyz&id=abc -> ?id=abc)
  • strip index page, or at least the ones i could think of
  • remove trailing /
  • canonicalize percent encoding

also did some perf work, though it probably won't be very noticeable
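
For anyone reviewing, here is a rough Python sketch of what the steps listed above amount to. It is not the code from this PR: the language, the tracking-param list, and the index-page list are all assumptions made for illustration, and only the name `normalized_url` comes from the description.

```python
from urllib.parse import (
    parse_qsl, quote, unquote, urlencode, urlsplit, urlunsplit,
)

# Hypothetical lists: the PR does not say which params or index pages it covers.
TRACKING_PREFIXES = ("utm_",)
TRACKING_PARAMS = {"share_id", "fbclid", "gclid"}
INDEX_PAGES = {"index.html", "index.htm", "index.php"}


def normalized_url(url: str) -> str:
    """Illustrative normalization following the steps in the PR description."""
    parts = urlsplit(url)

    # urlsplit lowercases the scheme; .hostname is lowercased and carries
    # neither the port nor any userinfo, so both are dropped here.
    scheme = parts.scheme
    host = parts.hostname or ""
    if host.startswith("www."):
        host = host[len("www."):]
    if ":" in host:  # IPv6 literal: restore the brackets .hostname removed
        host = f"[{host}]"

    # Canonicalize percent-encoding by decoding and re-encoding the path.
    # Simplification: this also decodes %2F into a real slash.
    path = quote(unquote(parts.path), safe="/")

    # Strip a trailing index page, then any trailing slash.
    segments = path.split("/")
    if segments[-1].lower() in INDEX_PAGES:
        segments[-1] = ""
    path = "/".join(segments).rstrip("/")

    # Keep only query params that are not known tracking params.
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in TRACKING_PARAMS and not key.startswith(TRACKING_PREFIXES)
    ]

    # An empty last component drops the fragment.
    return urlunsplit((scheme, host, path, urlencode(kept), ""))


if __name__ == "__main__":
    print(normalized_url(
        "HTTPS://WWW.Example.COM:8080/page/index.html"
        "?utm_source=smth&share_id=xyz&id=abc#thing"
    ))
```

Two links then match when their normalized forms compare equal. Note that the sketch drops every port, as the list says, so `example.com:8080` and `example.com` end up treated as the same host.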

