Skip to content

JavaScript: Exclude parsed URL host/hostname checks from incomplete URL substring sanitization#27

Open
mrigankpawagi wants to merge 1 commit into
mainfrom
improve-js-url-substring-host-check
Open

JavaScript: Exclude parsed URL host/hostname checks from incomplete URL substring sanitization#27
mrigankpawagi wants to merge 1 commit into
mainfrom
improve-js-url-substring-host-check

Conversation

@mrigankpawagi

Copy link
Copy Markdown
Owner

Summary

This PR improves the js/incomplete-url-substring-sanitization query by excluding substring checks performed on a parsed URL's .host or .hostname property from being flagged.

Problem

MRVA on the top-100 JavaScript repositories revealed 98 alerts, several of which are false positives where developers correctly check a parsed URL's host property:

// react-router: flagged as "may be preceded by arbitrary host name"
let isGithubUrl = new URL(tarballUrl).host.endsWith("github.com");

// material-ui: flagged as "can be anywhere in URL"
window.location.host.includes("mui.com")

When .endsWith() or .includes() is called on url.host or url.hostname (rather than the raw URL string), the check is inherently safe because:

  • The host property contains ONLY the hostname (and optionally port)
  • There are no path, query, or fragment components that could be used to bypass the check
  • host.endsWith("github.com") correctly matches github.com and *.github.com

Changes

Added a new whitelist case in IncompleteUrlSubstringSanitization.qll that excludes checks where the base string being tested comes from a .host or .hostname property read (via DataFlow::PropRead).

MRVA Validation

  • Before: 98 alerts across top-100 JavaScript repositories
  • After: Eliminates false positives from parsed-host checks (react-router, material-ui, gatsby, etc.)
  • True positives (checks on raw URL strings like url.includes("cdn.jsdelivr.net")) are still correctly flagged

Why this is correct

The URL constructor (or window.location) parses a URL and exposes its components separately:

  • .host"github.com" or "github.com:443"
  • .hostname"github.com"
  • .pathname"/path/to/resource"

Once you have the isolated hostname, substring checks like .endsWith("github.com") cannot be bypassed by crafting URLs with tricky paths like https://evil.com/github.com. The path component has already been separated out.

@github-actions github-actions Bot added documentation Improvements or additions to documentation JS labels Jun 28, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

documentation Improvements or additions to documentation JS

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants