Limitations section #441

Open
martinthomson wants to merge 6 commits into main from limitations-section

@martinthomson

@martinthomson martinthomson commented May 29, 2026

Member

Much of this document talks about what the API can do, but it fails to account for potential misapprehension about what is possible.

This section attempts to enumerate limitations when it comes to using this API for the measurement of advertising effectiveness, particularly when it comes to producing information that is helpful in making decisions about where to invest in marketing.

I've put this up front, so the disclaimer is clear. The section is longer than many of the adjoining sections; I hope that conveys the right sort of message.

Thanks to @rickcentralcontrolcom for raising the underlying issue.


@rickcentralcontrolcom

rickcentralcontrolcom commented May 29, 2026 via email


@bmcase

bmcase commented May 29, 2026

Contributor

@rickcentralcontrolcom you don't see the changes in the spec because the PR has not been merged yet. If you look at this preview link https://pr-preview.s3.amazonaws.com/w3c/attribution/pull/441.html#limitations you can see the new limitations section.

@rickcentralcontrolcom

rickcentralcontrolcom commented May 29, 2026 via email


These are designed to avoid overegging the pudding, by implying that
simple attribution (comparing sites or creatives) is the entire story.
@AramZS

AramZS commented Jun 9, 2026

Member

We will merge this on Monday unless there are any reasonable objections.

@bmayd

bmayd commented Jun 10, 2026


I want to add a voice to the concerns articulated by @rickcentralcontrolcom and suggest that the document be revised to accurately characterize:

  • What browser-based attribution is more broadly.
  • What intelligence it can provide and what the limits of that intelligence are.
  • What can be reasonably inferred from browser-based attribution more generally.
  • Where browser-based attribution fits into larger frameworks that can provide evidence suggestive of advertising effectiveness.
  • How the outputs from this API, which are an extremely narrow, purposefully limited and generalized subset of the evidence that can be gathered from browsers, compare with more common browser-derived datasets.
  • What the outputs of this API can reliably convey.
  • Where the outputs from this API meaningfully fit into, and what they contribute to, ad-effectiveness measurement frameworks.

As it stands, the document asserts in many places, both directly and indirectly (many of them outlined by Rick above), that the API enables "the measurement of advertising performance"; that's inaccurate.

What the API enables is:

  • Recording of very dimensionally limited and nondeterministically incomplete data about observable ad-related events.
  • Outputs of a subset of those events, correlated using fixed, client-side rulesets constrained by the limits of available data that are unable to account for confounding factors and don't natively incorporate counterfactuals.
  • Reporting of limited, noised, aggregate statistics about the correlated outputs.

In other words, the output of the API provides an extremely limited, approximate and incomplete snapshot of what happened during a campaign, while by design providing insufficient information to make meaningful inferences about why it happened or adjust for user intent or platform optimization bias.
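To make the three steps above concrete, here is a minimal toy model in Python. It is a sketch under stated assumptions, not the specification's algorithm: the `Impression` type, the function names, the last-touch rule, and the Laplace noise are all hypothetical stand-ins for "fixed client-side rulesets" and "noised, aggregate statistics".

```python
import random
from dataclasses import dataclass


@dataclass
class Impression:
    day: int     # day the ad was recorded in this browser
    bucket: int  # histogram bucket chosen by the advertiser (e.g. one per creative)


def last_touch_histogram(impressions, conversion_day, lookback_days, num_buckets):
    """Credit the most recent impression inside the lookback window.

    This is a fixed rule: it records which ad came last, not whether
    the ad changed the outcome.
    """
    histogram = [0] * num_buckets
    eligible = [
        imp for imp in impressions
        if 0 <= conversion_day - imp.day <= lookback_days
    ]
    if eligible:
        latest = max(eligible, key=lambda imp: imp.day)
        histogram[latest.bucket] += 1
    return histogram


def aggregate_with_noise(histograms, scale, rng):
    """Sum per-browser histograms, then add Laplace noise to each bucket."""
    totals = [sum(column) for column in zip(*histograms)]
    # The difference of two exponentials with mean `scale` is Laplace(0, scale).
    return [
        total + rng.expovariate(1 / scale) - rng.expovariate(1 / scale)
        for total in totals
    ]


# Hypothetical data: two browsers, two creatives, a 7-day lookback.
browser_a = last_touch_histogram(
    [Impression(day=1, bucket=0), Impression(day=5, bucket=1)],
    conversion_day=6, lookback_days=7, num_buckets=2,
)
browser_b = last_touch_histogram(
    [Impression(day=1, bucket=0)],
    conversion_day=20, lookback_days=7, num_buckets=2,
)
noisy_totals = aggregate_with_noise([browser_a, browser_b], scale=2.0, rng=random.Random(0))
```

In this toy run `browser_a` credits bucket 1 and `browser_b` credits nothing, because its only impression falls outside the window. Nothing in the output says what either user would have done without the ad.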

At best it allows marketers to make limited assumptions about the effectiveness of their targeting and inventory supply in correlating with conversions, but it doesn't provide any insight into if or why the targeting or inventory was or was not effective and it shouldn't imply that it does. The API cannot differentiate between ads that induced a conversion (causation) and those that were merely incidental to it (correlation); it is strictly an observational tracking tool, not an incrementality tool.

Per Rick's comment above, the W3C, via this standard, "should not unintentionally endorse attribution reporting as a scientifically valid measure of advertising effectiveness unless the specification is much clearer about what the API can and cannot establish."

I recognize that an ask to adequately frame this proposal's relationship with rigorous frameworks for ad effectiveness is significant and probably beyond the reasonable scope of this primarily technical standard. If that's the case, I suggest as an alternative that the document be revised to clearly indicate the limits of the API regarding campaign effectiveness measurement, that it include references to resources that can clarify those limits and inform the proper use of this API and that any language suggesting it can directly measure ad effectiveness or campaign performance be modified to clearly indicate it can, at best, provide inputs as part of a properly controlled framework for the assessment of campaign effectiveness.

Suggested introductory abstract:

This document defines a client-side API designed to quantify observable, cross-site associations between digital advertising impressions and subsequent conversion outcomes while preserving user privacy. Crucially, the API does not measure advertising effectiveness or establish a direct causal link between an ad exposure and a user's behavior. Instead, it aggregates data on the chronological sequencing of a limited set of observable ad-related events within a single browser instance. The scope and utility of these insights are inherently bounded by the configurations chosen for measurement—specifically what events are tracked, when lookback windows are applied, and how assignment logic correlates those events. To approximate actual advertising efficacy or incrementality, implementations must supplement these associative baselines with rigorous external experimental designs, such as randomized control trials.

@AramZS

AramZS commented Jun 11, 2026

Member

@bmayd Hi! Do you think the current PR's revisions cover those points? I had hoped we made that clear with those revisions and that merging them in would cover these concerns.

@rickcentralcontrolcom

rickcentralcontrolcom commented Jun 12, 2026 via email


@bmayd

bmayd commented Jun 12, 2026


@AramZS Thanks for asking directly. I've reviewed the current PR preview. This PR is an improvement and I don't object to it merging — but I don't think merging it should be treated as resolving the underlying concern, and I'd ask for two small additional edits before it does.

The d4ae7d9 commit is welcome but narrow. The passages Rick enumerated remain in the current preview, including the two most direct causal claims:

From the Abstract:

This specifies a browser API for the measurement of advertising performance. The goal is to produce aggregate statistics about how advertising leads to conversions...

From §1.4 (End-User Benefit):

Support for attribution enables more effective advertising, largely by helping advertisers understand which ads perform best, and in what circumstances.

The new limitations section directly contradicts both. It states that attribution measures associations, that attributing outcomes to ads can create "a false impression of their efficacy," and that randomized control trials are necessary to measure causal effects. A spec whose abstract claims to measure "advertising performance" and "how advertising leads to conversions," and whose benefits section claims it identifies "which ads perform best," while its limitations section explains it can do none of those things unaided, is internally inconsistent. I'm not asking the group to adopt my view of attribution here — only to make these passages and the rest of the document consistent with what the new section itself says.

Concretely, I'd suggest:

  1. Before merge: fix the two passages above. Rick has already drafted suggested replacement language for both in his May 29 comment; either version would do. These are one-line edits.
  2. After merge: open a follow-up issue to track the systematic terminology review of the remaining passages Rick enumerated, per his comments and my comment above.

I also want to add context for why this is worth the effort, beyond statistical pedantry. When correlational attribution metrics are taken at face value as measures of effectiveness, they don't just produce noisy answers — they produce systematically biased ones, and the bias has a well-understood structural cause.

The retargeting fallacy and attribution arbitrage

A system optimized to maximize conversions attributed through correlation learns to identify users with high baseline purchase propensity. Once enough data exists to anticipate a conversion, the economically rational strategy is to follow those users across contexts and serve them ads whose marginal influence is minimal, because the purchase decision has substantially already been made. The more effective a strategy is at identifying consumers already on the path to conversion — which is best accomplished with invasive cross-context tracking — the better it looks under correlational attribution. Attributed CPA and ROAS look excellent; incremental value can be near zero. This is not hypothetical: large-scale randomized experiments have repeatedly found exactly this divergence between attributed and incremental outcomes (e.g., Blake, Nosko & Tadelis's eBay paid-search experiments; Gordon et al.'s comparisons of attribution against experimental lift across Facebook campaigns).
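A small worked example shows how the two metrics can rank strategies in opposite order. All numbers here are hypothetical and chosen only for illustration; they are not taken from the cited experiments, and the `Segment` type and function names are assumptions of this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    name: str
    users_shown_ads: int
    baseline_per_1000: int  # conversions per 1000 users with no ad (hypothetical)
    lift_per_1000: int      # extra conversions per 1000 users caused by the ad (hypothetical)


def attributed_conversions(seg: Segment) -> int:
    """Every conversion that follows an ad is credited to the ad."""
    return seg.users_shown_ads * (seg.baseline_per_1000 + seg.lift_per_1000) // 1000


def incremental_conversions(seg: Segment) -> int:
    """Only conversions that would not have happened without the ad."""
    return seg.users_shown_ads * seg.lift_per_1000 // 1000


retargeting = Segment("retargeting (high intent)", 10_000, baseline_per_1000=200, lift_per_1000=5)
prospecting = Segment("prospecting (low intent)", 10_000, baseline_per_1000=10, lift_per_1000=20)

for seg in (retargeting, prospecting):
    print(seg.name, attributed_conversions(seg), incremental_conversions(seg))
# prints:
# retargeting (high intent) 2050 50
# prospecting (low intent) 300 200
```

Under these assumed numbers, attribution credits retargeting with nearly seven times as many conversions as prospecting, while prospecting causes four times as many. An optimizer that only sees the attributed column will shift budget toward retargeting.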

If a W3C standard's framing encourages these metrics to be read as effectiveness measures, the incentive structure it legitimizes is destructive for every constituency:

  • Advertisers pay a tax on conversions that would have happened anyway, and are pushed into an arms race of handing over first-party customer identity data with the promise it will optimize targeting and the threat that advertisers who don't will lose out to competitors who do.
  • Consumers — even if the browser mechanism itself is privacy-preserving — face an ecosystem incentivized to collect data outside the API to identify users already on a conversion path, then serve them ads primarily to capture attribution credit. They end up overexposed to products they've already decided to buy and underexposed to anything new: the opposite of what advertising is for.
  • Premium publishers that respect their audiences are penalized. The genuine purchase intent and brand equity that quality sites generate end up financially credited to low-quality inventory built to harvest attribution credit, and these publishers are similarly pressured into surrendering first-party audience data under threat of losing budgets to competitors who do.

Why this belongs in the W3C's remit

The W3C's mandate is to guide the web toward its full potential in the interest of users and the health of the ecosystem. By framing an observational, correlation-based mechanism as a tool for measuring "performance" or "effectiveness," the document — however unintentionally — validates a methodology that rewards privacy-invasive tracking, wastes advertiser capital, and starves quality publishers of funding. The limitations section is the right first step; aligning the rest of the document with it is the necessary second one.

Happy to contribute to the follow-up review.

@martinthomson

Member Author

Maybe @rickcentralcontrolcom, @bmayd, you could help me understand: If Attribution can be used correctly, to produce aggregated measurements that helps understanding of what advertising is working, why would you object to those claims in the abstract? The assumption that this is going to be used badly seems implicit in that objection, but the only attribution I've ever seen in practice is exactly the sort of controlled trial.

There might be a well-earned bias from advertising natives that says "attribution is bunkum". However, for a relative neophyte such as myself, who has only ever dealt with controlled trials, that seems pretty unnecessary.

My reaction to Rick's initial pushback was "well, of course if you use this wrong you'll get useless results".

The other pushback I can see is that the claims are overblown. That any measurement is imperfect and provides only a narrow read on the truth (like the blind men and elephants, if you will). I don't think that this perspective is disqualifying either, is it?

@rickcentralcontrolcom


You write, "If Attribution can be used correctly, to produce aggregated measurements that helps understanding of what advertising is working..." But that is precisely the point I dispute.

The controlled trial is what tells you what is working. The attribution is incidental. The causal evidence comes from the randomization, not from the attribution mechanism.

I ran advertising research at DoubleClick and Google from 2004-2009, when many of these ideas were first taking shape. Attribution was never originally intended to measure advertising effectiveness. It emerged as a way to allocate fractional credit, and therefore payment, among multiple media companies that touched a conversion path. Over time, the industry began treating it as a measure of ROI, even though it never solved the fundamental causality problem.

If someone sees an ad and later makes a purchase, attribution records that sequence and assigns credit, which was meant to mean payment credit. But it cannot tell us whether the purchase would have happened anyway. That requires a counterfactual, which only a randomized control group can provide.

The W3C proposal improves privacy around attribution reporting, but there is nothing in it that facilitates randomized experiments. In fact, any meaningful experiment would happen entirely outside this mechanism, so claims that it supports experimentation are ill-founded. Whether the randomization is based on user IDs, household IDs, cookies, clean rooms, or geographic regions, the assignment of test and control groups is performed independently of the Attribution API. The proposal may coexist with experiments, but it does not enable them.
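For readers unfamiliar with how such assignment is typically done, one common pattern is deterministic hashing of the randomization unit. The sketch below is an illustration of that general technique only; `assign_group`, its parameters, and the 10% control share are assumptions, and nothing here is part of the Attribution API.

```python
import hashlib


def assign_group(unit_id: str, experiment: str, control_percent: int = 10) -> str:
    """Deterministically place a unit (user, household, region) in control or treatment.

    Hashing the experiment name together with the unit ID gives each
    experiment an independent split while keeping a unit's assignment stable.
    """
    digest = hashlib.sha256(f"{experiment}:{unit_id}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100
    return "control" if bucket < control_percent else "treatment"


# Hypothetical usage: the caller withholds ads from the control group.
group = assign_group("household-1234", "spring-campaign")
```

The point of the sketch is where it runs: the party doing the assignment needs a stable identifier for the unit, which is exactly what the experiment has to supply from outside the attribution mechanism.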

That distinction matters because standards have a way of legitimizing ideas. My overarching concern is that the proposal positions attribution reporting in the realm of advertising effectiveness when attribution has never been capable of measuring causal impact.

The danger is not technical. It is conceptual. By having an august body like the W3C be perceived to legitimize and institutionalize the claim that attribution is a tool for understanding what advertising works, the industry risks perpetuating a fallacy that has already distorted media investment for decades. The result is predictable: advertisers are steered toward what is easiest to observe rather than what actually drives incremental business outcomes. That perpetuates waste and misallocation of advertising budgets while slowing the industry's adoption of genuinely causal measurement methods. At the same time, spending becomes increasingly concentrated among the handful of platforms best positioned to generate attribution signals (i.e., the biggest tech giants), to the detriment of thousands of other media companies that are already struggling to survive.

That's why I object to claims that attribution helps determine advertising effectiveness. Attribution measures sequences of events. Experiments measure causal impact.

I laid out my broader concerns about the proposal here:

https://www.adexchanger.com/data-driven-thinking/the-w3c-is-making-a-critical-mistake-about-measuring-advertising-effectiveness/

@bmayd

bmayd commented Jun 15, 2026


@martinthomson as I was responding @rickcentralcontrolcom expressed the core concern better than what I was writing — causal evidence comes from the randomization, which determines whether what's measured can be read causally, not the attribution mechanism that does the measuring, and the randomization and assignment of test and control groups is external to this API. The "used correctly" framing is the crux: when measuring causality, the correct use of the API is as a means of counting correlated events within experimental subgroups, at least one of which is a properly defined, randomly selected control group.
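As a rough sketch of what "counting within experimental subgroups" buys, the snippet below turns conversion counts from a treatment group and a randomly selected control group into an estimate of absolute lift. The function name and the numbers are hypothetical, and the interval uses a plain normal approximation that ignores any noise the API adds to the counts, which would widen it.

```python
import math


def lift_estimate(treat_conv, treat_users, ctrl_conv, ctrl_users):
    """Absolute lift in conversion rate with a 95% normal-approximation interval."""
    p_treat = treat_conv / treat_users
    p_ctrl = ctrl_conv / ctrl_users
    lift = p_treat - p_ctrl
    std_err = math.sqrt(
        p_treat * (1 - p_treat) / treat_users + p_ctrl * (1 - p_ctrl) / ctrl_users
    )
    return lift, (lift - 1.96 * std_err, lift + 1.96 * std_err)


# Hypothetical counts: 300 of 10,000 treated users converted, 250 of 10,000 controls.
lift, (low, high) = lift_estimate(300, 10_000, 250, 10_000)
```

With only the treatment count (300 conversions), there is nothing to subtract; the 250 conversions that happened without any ad are what the control group makes visible.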

Your experience — attribution used inside a controlled trial — is the right approach; the problem is it's the exception. The reality in adtech is the vast majority of digital ad spend is focused exclusively on campaigns consisting only of treatment groups: groups targeting inventory optimized automatically by algorithms trained on simple, observational last-interaction or multi-touch correlation. There are rarely control groups (outside of platforms running discrete lift studies) and almost never on the long tail of inventory. In other words, in the vast majority of cases campaign performance is measured exclusively based on correlation observations derived from a deliberately optimized, highly biased sample, the raw output of which is unfit for use in measuring causality. When the W3C Attribution draft includes statements that directly assert this API is for "the measurement of advertising performance" and produces statistics about "how advertising leads to conversions," and other similar statements, it validates false market assumptions that the API's raw output equals causal impact. Adtech, which has spent the past two decades optimizing to intercept the path to conversion and reporting it as proof of campaign success, will take that endorsement at face value and as justification to perpetuate the practice.

Regarding the blind-men-and-the-elephant framing: if the objection were "all measurement is imperfect and gives a narrow read," I'd agree it wouldn't be disqualifying and I wouldn't be raising it. That's not the issue; imperfection is about fidelity — a noisy or partial read on the right quantity. This is about identity — association and causation are categorically different and no matter how precise and complete observational attribution data is, it doesn't measure causal effect. A caveat mentioning this in any one section isn’t sufficient when the rest of the document strongly implies, or outright states, the API measures performance.

And to your suggestion "the objection assumes it'll be used badly", it doesn't assume bad faith. The default use of an attribution API is observational — impressions matched to conversions, no control arm — and as the limitations section states, that use creates "a false impression of [advertising’s] efficacy." The concern is the ordinary case the spec itself flags, not a hypothetical abuse. Nor is the implication "attribution is bunkum": attribution is genuinely useful for what it measures — which placements, creatives, and contexts are associated with conversions, i.e. diagnostics, optimization inputs, and counting inside an experiment. The objection is to labeling that association "effectiveness" or "performance" and implying the API measures more than it can.

As suggested in previous posts, the instances where language clearly asserts or implies that the API measures performance ought to be revised, if not before this PR is merged, then in a follow-up terminology review – happy to help with that.

@martinthomson

Member Author

This is helpful. My initial reaction to some of the feedback from Rick was that this was an overreaction. That is, the claims in the document were about the value of measurement more generally, which I think is still defensible. But your comments have highlighted that this was being directly linked to attribution, rather than saying that attribution provides information as part of a larger measurement strategy that can then contribute to those outcomes. I've made a few more changes, notably in the introduction, goals, and user benefit sections. I hope those are in the right direction.

I still think it is necessary to take some liberties with the user benefit section, but I hope that the line is clearer now: attribution -> measurement -> better advertising.

Comment thread api.bs Outdated
@bmayd

bmayd commented Jun 17, 2026


@martinthomson Good updates, which improve things overall. I found a typo ("enables" should be "enable"), which I flagged in a comment. I still think a terminology review is in order before finalizing, since some of the things Rick pointed out are still outstanding, but I don't think it should hold this up, so @AramZS I'm OK with merging this.

As the limitations section now observes, "the web platform presently does not have a facility for providing consistent allocation of users into control and treatment groups across different sites." I've been working on a proposal for filling that gap so the API can support incrementality measurement. I'll post a suggestion for a privacy-preserving way to provide that facility in a separate issue, and I'll link it here when it's up; it is intended as the constructive complement to this PR, not anything that should slow it down.

Co-authored-by: Brian May <61555125+bmayd@users.noreply.github.com>
@bmayd

bmayd commented Jun 18, 2026


I posted issue #451 with suggested additions for incrementality measurement support.

6 participants