feat: add switch RFC3986 / IRI-Style for @TrackLink #3077
There should be no
Good point regarding the environment switch. I agree this can be handled directly in the regexp and doesn't need an additional configuration option.

The underlying issue is that the current character-class approach requires continuously enumerating allowed URL characters. A delimiter-based approach is simpler. Furthermore, in real-world conditions, platforms such as SharePoint generate very long and complex URLs (parameters, nested paths, tokens, etc.), which makes strict or overly restrictive approaches even less suitable.

Rather than managing multiple variants, it's possible to simplify by using a single, more permissive regular expression.
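To make the contrast between the two approaches concrete, here is a minimal Go sketch. Both patterns, the `convert` helper, and the sample URL are assumptions written for illustration; they are not the patterns in listmonk's source. The sketch only shows why an allow-list class fails on a URL containing an emoji while a delimiter-based class does not.

```go
package main

import (
	"fmt"
	"regexp"
)

// Both patterns are illustrative assumptions, not copied from listmonk.
var (
	// Character-class style: every permitted URL character has to be listed.
	reAllowList = regexp.MustCompile(`(https?://[A-Za-z0-9\-._~:/?#\[\]@!$&'()*+,;=%]+)@TrackLink`)
	// Delimiter style: accept anything up to a quote, whitespace, angle bracket, or brace.
	reDelimited = regexp.MustCompile(`(https?://[^\s"'<>{}]+)@TrackLink`)
)

// convert rewrites every URL@TrackLink shortcut that re matches into a template tag.
func convert(re *regexp.Regexp, body string) string {
	return re.ReplaceAllString(body, `{{ TrackLink "$1" }}`)
}

func main() {
	body := `<a href="https://example.com/docs/✅-report?x=1@TrackLink">Open</a>`
	fmt.Println(convert(reAllowList, body)) // unchanged: the class cannot cross the emoji
	fmt.Println(convert(reDelimited, body)) // shortcut replaced by a TrackLink tag
}
```

With the allow-list pattern the shortcut is left untouched, because no listed character matches the emoji and the literal `@TrackLink` is never reached. The delimiter pattern only needs to know where a URL ends, so new characters require no changes to the expression.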
Response for Issue #3076
Summary
The root cause was identified in models/common.go (line 55): the existing regex pattern only accepted a strict subset of RFC 3986 characters (ASCII letters, digits, and a limited set of symbols). As a result, URLs containing emojis or non-ASCII Unicode characters caused the pattern to break before reaching @TrackLink, so the shortcut was never converted into the expected {{ TrackLink ... }} template tag and the raw, corrupted URL was left in the final email output.
Changes
models/common.go (line 45): configurable regex mode via LISTMONK_TRACKLINK_REGEX_MODE
The fix introduces an environment variable to control the URL matching strategy, allowing operators to choose between improved compatibility and strict legacy behaviour:
iri (default): matches everything up to the first delimiter (", ', whitespace, <, >, {, }); handles emojis, Unicode, encoded sequences, and long query strings
unicode: strict legacy behaviour
Regex comparison
Before (strict RFC 3986):
After (IRI-style, default):
This pattern captures everything up to the first structural HTML or template delimiter, making it robust against the full range of real-world client URLs.
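As a rough illustration of how such a mode switch could be wired up, the sketch below selects a pattern from the LISTMONK_TRACKLINK_REGEX_MODE variable named in this description. The two patterns, the `patterns` map, the `linkRegexp` helper, and the fallback to iri for unknown values are all assumptions made for the example, not the code in models/common.go.

```go
package main

import (
	"fmt"
	"os"
	"regexp"
)

// Environment variable name as given in the PR description.
const envMode = "LISTMONK_TRACKLINK_REGEX_MODE"

// Illustrative stand-ins for the two modes; not the patterns from models/common.go.
var patterns = map[string]*regexp.Regexp{
	// iri: everything up to the first quote, whitespace, angle bracket, or brace.
	"iri": regexp.MustCompile(`(https?://[^\s"'<>{}]+)@TrackLink`),
	// unicode: stand-in for the strict legacy allow-list of ASCII URL characters.
	"unicode": regexp.MustCompile(`(https?://[A-Za-z0-9\-._~:/?#\[\]@!$&'()*+,;=%]+)@TrackLink`),
}

// linkRegexp returns the pattern for mode. Falling back to "iri" for empty or
// unknown values is an assumption of this sketch.
func linkRegexp(mode string) *regexp.Regexp {
	if re, ok := patterns[mode]; ok {
		return re
	}
	return patterns["iri"]
}

func main() {
	re := linkRegexp(os.Getenv(envMode))
	url := "https://example.com/launch/🚀?utm=mail@TrackLink"
	fmt.Println(re.MatchString(url))
}
```

Keeping both patterns in one map keeps the legacy behaviour one lookup away, at the cost of maintaining two expressions.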
Backward compatibility
This change is non-breaking. The new iri mode is the default, but the previous behaviour can be fully restored by setting LISTMONK_TRACKLINK_REGEX_MODE to unicode.
Testing
URLs containing emojis (✅, 📊, 🚀)
URLs containing encoded sequences (%20, %2F, etc.)
unicode mode: behaviour identical to previous implementation