Title: Topics API
Status: w3c/CG-DRAFT
ED: https://github.com/patcg-individual-drafts/topics
Shortname: topics
Level: 1
URL: https://github.com/patcg-individual-drafts/topics
Editor: Yao Xiao, Google, yaoxia@chromium.org
Editor: Josh Karlin, Google, jkarlin@chromium.org
Abstract: This specification describes a method that could enable ad-targeting based on a person's general browsing interests without exposing their exact browsing history.
!Participate: GitHub patcg-individual-drafts/topics (new issue, open issues)
Group: patcg
Repository: patcg-individual-drafts/topics
Markup Shorthands: markdown yes
spec: html; urlPrefix: https://html.spec.whatwg.org/multipage/
    type: dfn
        text: node navigable; url: document-sequences.html#node-navigable
        text: relevant settings object; url: webappapis.html#relevant-settings-object
        text: top-level traversable; for:navigable; url: document-sequences.html#nav-top
        text: active document; for:navigable; url: document-sequences.html#nav-document
        text: navigable; for: Window; url: nav-history-apis.html#window-navigable
spec: html; urlPrefix: https://wicg.github.io/nav-speculation/
    type: dfn
        text: prerendering navigable; url: prerendering.html#prerendering-navigable
        text: post-prerendering activation steps list; url: prerendering.html#platform-object-post-prerendering-activation-steps-list
spec: html; urlPrefix: https://www.rfc-editor.org/rfc/
    type: dfn
        text: HMAC algorithm; url: rfc6234#section-8.3
spec: html; urlPrefix: https://www.rfc-editor.org/rfc/
    type: dfn
        text: Structured Fields Token; url: rfc8941.html#name-tokens
    type: dfn
        text: Structured Fields Parameters; url: rfc8941.html#name-parameters

Introduction

In today's web, people's interests are typically inferred based on observing what sites or pages they visit, which relies on tracking techniques like third-party cookies or less-transparent mechanisms like device fingerprinting. It would be better for privacy if interest-based advertising could be accomplished without needing to collect a particular individual's browsing history. This specification provides an API to enable ad-targeting based on a person's general browsing interests, without exposing their exact browsing history.
Creating an ad based on browsing interests, using the {{Document/browsingTopics()|document.browsingTopics()}} JavaScript API: (Inside an `https://ads.example` iframe)
      // document.browsingTopics() returns an array of BrowsingTopic objects.
      const topics = await document.browsingTopics();

      // Get data for an ad creative.
      const response = await fetch('https://ads.example/get-creative', {
        method: 'POST',
        headers: {
          'Content-Type': 'application/json',
        },
        body: JSON.stringify(topics)
      });

      // Get the JSON from the response.
      const creative = await response.json();

      // Display the ad.
    
Creating an ad based on browsing interests, based on the [:Sec-Browsing-Topics:] HTTP request header sent by this invocation of {{WindowOrWorkerGlobalScope/fetch()}}: (Inside the top level context)
      // A 'Sec-Browsing-Topics: [topics header value]' header will be sent in
      // the HTTP request.
      const response = await fetch('https://ads.example/get-creative', {browsingTopics: true});
      const ad_creative = await response.json();
      // Display the ad.
    

Terminology and types

A taxonomy comprises a list of advertising topic ids as integers. A [=browsing topics types/taxonomy=] is identified by a taxonomy version string. A [=browsing topics types/topic id=] is no smaller than 1. The taxonomy must be in a tree hierarchy, where an ancestor [=browsing topics types/topic id=] always represents something more general than its descendant [=browsing topics types/topic ids=]. The browser should implement a get descendant topics algorithm, which takes in a [=browsing topics types/topic id=] and returns its descendant [=browsing topics types/topic ids=] as a [=list=].

The model version is a string that identifies the model used to classify a string into [=topic ids=]. The meaning may vary across browser vendors. The classification result [=topic ids=] should be relevant to the input string's underlying content.

The configuration version identifies the algorithm (other than the model part) used to calculate the topic. It should take the form of "<browser vendor identifier>.<an integer version>". The meaning may vary across browser vendors.

Given [=browsing topics types/configuration version=] |configurationVersion|, [=browsing topics types/taxonomy version=] |taxonomyVersion|, and [=browsing topics types/model version=] |modelVersion|, the version is the result of [=string/concatenating=] « |configurationVersion|, |taxonomyVersion|, |modelVersion| » using ":".

The maximum version string length is the maximum possible string length of a [=browsing topics types/version=] that a user agent could possibly generate in a given software release. For example, in Chrome's experimentation phase, 13 was used for the [=browsing topics types/maximum version string length=] to account for a version like chrome.1:1:11.

A user topics state is a struct with the following fields and default values:
- epochs: a list of [=epoch=]s, default to an empty list.
- hmac key: 128 bit number, default to 0.

An epoch is a struct with the following fields:
- taxonomy: a list of integers.
- taxonomy version: a string.
- model version: a string.
- config version: a string.
- top 5 topics with caller domains: a list of [=topic with caller domains=].
- time: a {{DOMHighResTimeStamp}} (from Unix epoch).

A topic with caller domains is a struct with the following fields:
- topic id: an integer.
- caller domains: a set of [=domains=].

A topics history entry is a struct with the following fields and default values:
- document id: an integer, default to 0.
- topics calculation input data: a string, default to an empty string.
- time: a {{DOMHighResTimeStamp}} (from Unix epoch).
- topics caller domains: an ordered set of [=domains=], default to an empty set.

A topics caller context is a struct with the following fields:
- caller domain: a [=domain=].
- top level context domain: a [=domain=].
- timestamp: a {{DOMHighResTimeStamp}} (from Unix epoch).
All [=domains=] used in this API will be the result of obtaining the [=registrable domain=] from some [=host=].
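The version concatenation described above can be sketched as follows. This is only an illustration: the helper name `makeVersion` and the constant `MAX_VERSION_STRING_LENGTH` are hypothetical, and the value 13 is taken from the Chrome experimentation example rather than being a normative limit.

```javascript
// Hypothetical helper: joins the three version components with ":" in the
// order given in the text (configuration, taxonomy, model).
function makeVersion(configurationVersion, taxonomyVersion, modelVersion) {
  return [configurationVersion, taxonomyVersion, modelVersion].join(":");
}

// Assumed value, borrowed from the Chrome experimentation example.
const MAX_VERSION_STRING_LENGTH = 13;

const version = makeVersion("chrome.1", "1", "11");
console.log(version, version.length <= MAX_VERSION_STRING_LENGTH);
// logs: chrome.1:1:11 true
```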

User agent associated state

Each [=user agent=] has an associated [=browsing topics types/user topics state=] user topics state with [=user topics state/epochs=] initially empty, and [=user topics state/hmac key=] initially a randomly generated 128 bit number.

Each [=user agent=] has an associated topics history storage to store the information about the visited pages that is needed for topics calculation. It is a [=list=] of [=topics history entries=], initially empty.

Each [=user agent=] has an associated [=browsing topics types/taxonomy=] taxonomy (identified by [=browsing topics types/taxonomy version=] taxonomy version) and [=browsing topics types/model=] model (identified by [=browsing topics types/model version=] model version). The [=user agent/taxonomy=] and [=user agent/model=] may be shipped to the browser asynchronously with respect to the browser release, and may be unavailable at a given point. They must be updated atomically with respect to algorithms that access them (e.g. the [=calculate user topics=] algorithm).

Note: The initial taxonomy used in Chrome is taxonomy_v1.md and the expectation is that it will change over time.

Each [=user agent=] has an associated topics algorithm configuration (identified by [=browsing topics types/configuration version=] configuration version). The initial value and meaning is browser defined.

Note: The [=browsing topics types/configuration version=] allows the browser vendor to provide algorithms different from the ones specified in this specification. For example, for some of the algorithms in this specification, it may be possible to use a different constant value, while the system overall still has utility and meets the privacy goals.

When [=user agent/configuration version=] is updated, the browser must properly migrate or delete data in [=user agent/user topics state=] and [=user agent/topics history storage=] so that the state and the configuration are consistent.

BrowsingTopic dictionary

The {{BrowsingTopic}} dictionary is used to contain the IDL correspondences of [=browsing topics types/topic id=], [=browsing topics types/version=], [=browsing topics types/configuration version=], [=browsing topics types/taxonomy version=], and [=browsing topics types/model version=].
  dictionary BrowsingTopic {
    [EnforceRange] unsigned long long topic;
    DOMString version;
    DOMString configVersion;
    DOMString modelVersion;
    DOMString taxonomyVersion;
  };
  
An example {{BrowsingTopic}} object from Chrome: { configVersion: "chrome.1", modelVersion: "1", taxonomyVersion: "1", topic: 43, version: "chrome.1:1:1" }.
A {{BrowsingTopic}} dictionary |a| is code unit less than a {{BrowsingTopic}} dictionary |b| if the following steps return true:
1. If |a|["{{BrowsingTopic/version}}"] is [=/code unit less than=] |b|["{{BrowsingTopic/version}}"], then return true.
1. If |a|["{{BrowsingTopic/topic}}"] < |b|["{{BrowsingTopic/topic}}"], then return true.
1. Return false.
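As an illustration, the three steps can be transcribed literally into JavaScript. The function name `browsingTopicLessThan` is hypothetical; the sketch relies on the fact that JavaScript's `<` operator on strings compares UTF-16 code units.

```javascript
// Literal transcription of the steps above (hypothetical helper name).
function browsingTopicLessThan(a, b) {
  // Step 1: compare the version strings by code units.
  if (a.version < b.version) return true;
  // Step 2: compare the numeric topic ids.
  if (a.topic < b.topic) return true;
  // Step 3: otherwise |a| is not less than |b|.
  return false;
}

const first = { version: "chrome.1:1:1", topic: 43 };
const second = { version: "chrome.1:1:1", topic: 57 };
console.log(browsingTopicLessThan(first, second)); // logs: true
```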

document ID

Each {{Document}} has a document id, which is an [=implementation-defined=] unique identifier shared with no other {{Document}} objects within or across browser sessions for a user agent.

Determine topics calculation input data

Given a {{Document}}, the browser must have a way to determine the topics calculation input data. [=determine-topics-calculation-input-data-header/topics calculation input data=] is a string that encodes the attributes to be used for topics classification, as determined by the browser vendor. By default, the attributes should be scoped to the document's [=Document/URL=] and metadata. Note: unless specifically allowed, data beyond the document shouldn't be included, such as data from localStorage or cookies. Note: In Chrome's experimentation phase, the [=host=] of a {{Document}}'s [=Document/URL=] is used as the [=determine-topics-calculation-input-data-header/topics calculation input data=], and the model is trained with human curated hostnames and topics.

Collect page topics calculation input data

To collect page topics calculation input data, given a {{Document}} |document|:
1. If |document|'s [=node navigable=] is a [=prerendering navigable=], then append the following steps to |document|'s [=post-prerendering activation steps list=] and return. Else, run the following steps [=in parallel=]:
    1. Let |documentId| be |document|'s [=document-id-header/document id=].
    1. If user agent's [=user agent/topics history storage=] contains a [=topics history entry=] whose [=topics history entry/document id=] is |documentId|, return.
    1. Let |topicsHistoryEntry| be a [=topics history entry=].
    1. Set |topicsHistoryEntry|'s [=topics history entry/document id=] to |documentId|.
    1. Set |topicsHistoryEntry|'s [=topics history entry/topics calculation input data=] to the [=determine-topics-calculation-input-data-header/topics calculation input data=] for |document|.
    1. Let |unsafeMoment| be the [=wall clock=]'s [=wall clock/unsafe current time=].
    1. Let |moment| be the result of running the [=coarsen time=] algorithm given |unsafeMoment| and [=wall clock=] as input.
    1. Let |fromUnixEpochTime| be the [=duration from=] the [=Unix epoch=] to |moment|.
    1. Set |topicsHistoryEntry|'s [=topics history entry/time=] to |fromUnixEpochTime|.
    1. [=list/Append=] |topicsHistoryEntry| to user agent's [=user agent/topics history storage=].

Collect topics caller domain

To collect topics caller domain, given a {{Document}} |document| and a [=domain=] |callerDomain|:
1. Run the following steps [=in parallel=]:
    1. Let |documentId| be |document|'s [=document-id-header/document id=].
    1. If user agent's [=user agent/topics history storage=] does not contain a [=topics history entry=] whose [=topics history entry/document id=] is |documentId|, return.
    1. Let |topicsHistoryEntry| be the [=topics history entry=] in user agent's [=user agent/topics history storage=] whose [=topics history entry/document id=] is |documentId|.
    1. [=set/Append=] |callerDomain| to |topicsHistoryEntry|'s [=topics caller domains=].

Derive top 5 topics

Given a [=list=] of [=topics history entries=] historyEntriesForUserTopics, the browser should provide an algorithm to derive top 5 topics, that are believed to be valuable for the Topics callers. The algorithm should return a [=list=] of 5 [=topic ids=].
Chrome's initial release scores topics by the frequency of page loads with that topic.
Given a [=list=] of [=topics history entries=] |historyEntriesForUserTopics|:
1. Let |topicsCount| be an empty map.
1. For each [=topics history entry=] |historyEntry| in |historyEntriesForUserTopics|:
    1. Let |topicIds| be the result of [=classifying=] |historyEntry|'s [=topics history entry/topics calculation input data=].
    1. For each |topicId| in |topicIds|:
        1. If |topicsCount|[|topicId|] does not exist:
            1. Initialize |topicsCount|[|topicId|] to 0.
        1. Increment |topicsCount|[|topicId|] by 1.
1. Let |top5Topics| be a list containing up to 5 of the |topicId|s in |topicsCount|'s [=map/keys=], where the |topicId|s with higher counts are retrieved first.
1. If |top5Topics| has fewer than 5 entries:
    1. Pad |top5Topics| with random topic ids from user agent's [=user agent/taxonomy=], until |top5Topics| has 5 entries.
1. Return |top5Topics|.
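The frequency-based scoring above could look roughly like the following sketch. It is not Chrome's implementation: the function name, the `inputData` field, and the injected `classify` and `random` parameters are assumptions made for illustration, and the padding step does not try to avoid duplicate topic ids because the steps above do not require it.

```javascript
// Hypothetical sketch: counts topics per page load and keeps the 5 most
// frequent, padding with random taxonomy entries when fewer are found.
function deriveTop5Topics(historyEntries, classify, taxonomy, random = Math.random) {
  const topicsCount = new Map();
  for (const entry of historyEntries) {
    // `classify` stands in for the browser's model (assumed signature:
    // string -> array of topic ids).
    for (const topicId of classify(entry.inputData)) {
      topicsCount.set(topicId, (topicsCount.get(topicId) ?? 0) + 1);
    }
  }
  // Sort by descending count and keep at most 5 topic ids.
  const top5 = [...topicsCount.entries()]
    .sort((x, y) => y[1] - x[1])
    .slice(0, 5)
    .map(([topicId]) => topicId);
  // Pad with random topic ids from the taxonomy until there are 5 entries.
  while (top5.length < 5) {
    top5.push(taxonomy[Math.floor(random() * taxonomy.length)]);
  }
  return top5;
}
```

Passing `random` as a parameter keeps the sketch deterministic in tests; a real implementation would presumably use its own source of randomness.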

Periodically calculate user topics

At the start of a browser session, run the [=schedule user topics calculation=] algorithm.
This roughly schedules topic calculation every 7 days, unless the browser is inactive at the scheduled time(s), in which case a topic calculation will occur as soon as the browser restarts.
To schedule user topics calculation, perform the following steps:
1. Let |unsafeMoment| be the [=wall clock=]'s [=wall clock/unsafe current time=].
1. Let |moment| be the result of running the [=coarsen time=] algorithm given |unsafeMoment| and [=wall clock=] as input.
1. Let |fromUnixEpochTime| be the [=duration from=] the [=Unix epoch=] to |moment|.
1. Let |presumedNextCalculationDelay| be a [=duration=] of 0.
1. If user agent's [=user agent/user topics state=]'s [=user topics state/epochs=] is not empty:
    1. Let |numEpochs| be user agent's [=user agent/user topics state=]'s [=user topics state/epochs=]'s [=list/size=].
    1. Let |lastTopicsCalculationTime| be user agent's [=user agent/user topics state=]'s [=user topics state/epochs=][|numEpochs| − 1]'s [=epoch/time=].
    1. Set |presumedNextCalculationDelay| to |lastTopicsCalculationTime| + (a [=duration=] of 7 days) − |fromUnixEpochTime|.
    1. If |presumedNextCalculationDelay| < (a [=duration=] of 0), then set |presumedNextCalculationDelay| to (a [=duration=] of 0).
    1. Else if |presumedNextCalculationDelay| ≥ (a [=duration=] of 14 days), then set |presumedNextCalculationDelay| to (a [=duration=] of 0).

        Note: This could happen if the machine time has gone backward since the last topics calculation. Recalculate immediately to align with the expected schedule rather than potentially stop calculating for a very long time.
1. Schedule the [=calculate user topics=] algorithm to run at [=Unix epoch=] + |fromUnixEpochTime| + |presumedNextCalculationDelay|.
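The delay computation in the scheduling steps can be sketched as a small pure function. The name `nextCalculationDelayMs` and the choice of milliseconds as the unit are assumptions; `epochTimes` is taken to hold the time of each stored epoch, oldest first.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Hypothetical helper: returns how long to wait before the next topics
// calculation, given the stored epoch times and the current time.
function nextCalculationDelayMs(epochTimes, nowMs) {
  // No epochs yet: calculate immediately.
  if (epochTimes.length === 0) return 0;
  const lastCalculationTime = epochTimes[epochTimes.length - 1];
  const delay = lastCalculationTime + 7 * DAY_MS - nowMs;
  // Overdue: calculate immediately.
  if (delay < 0) return 0;
  // Implausibly far in the future (e.g. the clock went backward):
  // recalculate immediately instead of waiting.
  if (delay >= 14 * DAY_MS) return 0;
  return delay;
}
```

For example, if the last calculation happened 3 days ago, the function returns a delay of 4 days.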
To calculate user topics, perform the following steps:
1. Let |unsafeMoment| be the [=wall clock=]'s [=wall clock/unsafe current time=].
1. Let |moment| be the result of running the [=coarsen time=] algorithm given |unsafeMoment| and [=wall clock=] as input.
1. Let |fromUnixEpochTime| be the [=duration from=] the [=Unix epoch=] to |moment|.
1. If either user agent's [=user agent/model=] or [=user agent/taxonomy=] isn't available:
    1. Let |epoch| be an [=epoch=] struct with default initial field values.
    1. Set |epoch|'s [=epoch/time=] to |fromUnixEpochTime|.
    1. [=list/Append=] |epoch| to user agent's [=user agent/user topics state=]'s [=user topics state/epochs=].
    1. If user agent's [=user agent/user topics state=]'s [=user topics state/epochs=] has more than 4 entries, remove the oldest epoch (i.e. the epoch with index 0).
    1. Schedule this [=calculate user topics=] algorithm to run at [=Unix epoch=] + |fromUnixEpochTime| + (a [=duration=] of 7 days).
    1. Return.
1. Let |historyEntriesForUserTopics| be an empty list.
1. Let |topicsCallers| be an empty map.
1. Let |userTopicsDataStartTime| be |fromUnixEpochTime| − (a [=duration=] of 7 days).
1. Let |topicsCallerDataStartTime| be |fromUnixEpochTime| − (a [=duration=] of 21 days).
1. For each [=topics history entry=] |topicsHistoryEntry| in user agent's [=user agent/topics history storage=]:
    1. Let |visitTime| be |topicsHistoryEntry|'s [=topics history entry/time=].
    1. If |visitTime| is before |topicsCallerDataStartTime|, then continue.
    1. Let |topicIds| be the result of [=classifying=] |topicsHistoryEntry|'s [=topics history entry/topics calculation input data=].
    1. If |visitTime| is greater than |userTopicsDataStartTime|:
        1. [=list/Append=] |topicsHistoryEntry| to |historyEntriesForUserTopics|.
    1. For each |topicId| in |topicIds|:
        1. If |topicsCallers|[|topicId|] does not exist:
            1. Initialize |topicsCallers|[|topicId|] to be an empty [=list=].
        1. For each |callerDomain| in |topicsHistoryEntry|'s [=topics history entry/topics caller domains=]:
            1. [=list/Append=] |callerDomain| to |topicsCallers|[|topicId|].
1. Let |top5Topics| be the result of running the [=derive top 5 topics=] algorithm, given |historyEntriesForUserTopics|.
1. Let |top5TopicsWithCallerDomains| be an empty [=list=].
1. For each |topTopicId| in |top5Topics|:
    1. Let |topicWithCallerDomains| be a [=topic with caller domains=] struct with [=topic with caller domains/topic id=] initially 0 and [=topic with caller domains/caller domains=] initially empty.
    1. If |topTopicId| is allowed by user preference setting:
        1. Set |topicWithCallerDomains|'s [=topic with caller domains/topic id=] to |topTopicId|.
        1. Let |topicWithDescendantIds| be the result of running [=get descendant topics=] given |topTopicId|.
        1. Add |topTopicId| to |topicWithDescendantIds|.
        1. For each |topicId| in |topicWithDescendantIds|:
            1. If |topicId| is allowed by user preference setting:
                1. Insert all elements in |topicsCallers|[|topicId|] to |topicWithCallerDomains|'s [=topic with caller domains/caller domains=].
    1. [=list/Append=] |topicWithCallerDomains| to |top5TopicsWithCallerDomains|.
1. Let |epoch| be an [=epoch=] struct with default initial field values.
1. Set |epoch|'s [=epoch/taxonomy=] to user agent's [=user agent/taxonomy=].
1. Set |epoch|'s [=epoch/taxonomy version=] to user agent's [=user agent/taxonomy version=].
1. Set |epoch|'s [=epoch/model version=] to user agent's [=user agent/model version=].
1. Set |epoch|'s [=epoch/config version=] to user agent's [=user agent/configuration version=].
1. Set |epoch|'s [=epoch/top 5 topics with caller domains=] to |top5TopicsWithCallerDomains|.
1. Set |epoch|'s [=epoch/time=] to |fromUnixEpochTime|.
1. [=list/Append=] |epoch| to user agent's [=user agent/user topics state=]'s [=user topics state/epochs=].
1. If user agent's [=user agent/user topics state=]'s [=user topics state/epochs=] has more than 4 entries, remove the oldest epoch.
1. Schedule this [=calculate user topics=] algorithm to run at [=Unix epoch=] + |fromUnixEpochTime| + (a [=duration=] of 7 days).
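One part of this algorithm that may benefit from an illustration is how caller domains are gathered for a top topic and its descendants. The sketch below assumes `topicsCallers` is a `Map` from topic id to a list of domains, and that `getDescendantTopics` and `isAllowed` are supplied by the caller; all of these names are hypothetical.

```javascript
// Hypothetical sketch: builds a "topic with caller domains" record for one
// top topic, merging in the callers that observed any allowed descendant.
function collectCallerDomains(topTopicId, topicsCallers, getDescendantTopics, isAllowed) {
  // Topic id 0 marks a topic that was cleared by user preference.
  const result = { topicId: 0, callerDomains: new Set() };
  if (!isAllowed(topTopicId)) return result;
  result.topicId = topTopicId;
  const topicWithDescendantIds = [...getDescendantTopics(topTopicId), topTopicId];
  for (const topicId of topicWithDescendantIds) {
    if (!isAllowed(topicId)) continue;
    // A topic with no recorded callers contributes nothing.
    for (const domain of topicsCallers.get(topicId) ?? []) {
      result.callerDomains.add(domain);
    }
  }
  return result;
}
```

Using a `Set` for the collected domains mirrors the set semantics of the caller domains field, so a domain that observed several descendant topics is stored once.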

Epochs for caller

To calculate the epochs for caller, given a [=topics caller context=] |callerContext|, perform the following steps. They return a list of [=epoch=]s.
1. Let |epochs| be user agent's [=user agent/user topics state=]'s [=user topics state/epochs=].
1. If |epochs| is empty, then return an empty [=list=].
1. Let |numEpochs| be |epochs|'s [=list/size=].
1. Let |lastEpochTime| be |epochs|[|numEpochs| − 1]'s [=epoch/time=].
1. Let |epochSwitchTimeDecisionMessageArray| be the concatenation of "epoch-switch-time-decision|" and |callerContext|'s [=topics caller context/top level context domain=].
1. Let |epochSwitchTimeDecisionHmacOutput| be the output of the [=HMAC algorithm=], given input parameters: whichSha=SHA256, key=user agent's [=user agent/user topics state=]'s [=user topics state/hmac key=], and message_array=|epochSwitchTimeDecisionMessageArray|.
1. Let |epochSwitchTimeDecisionHash| be 64 bit truncation of |epochSwitchTimeDecisionHmacOutput|.
1. Let |epochSwitchTimeDelayIntroduction| be a [=duration=] of (|epochSwitchTimeDecisionHash| % 172800) seconds (i.e. 172800 is 2 days in seconds).
1. Let |timestamp| be |callerContext|'s [=topics caller context/timestamp=].
1. Let |result| be an empty [=list=].
1. Let |startEpochIndex| be -1.
1. Let |endEpochIndex| be -1.
1. If |timestamp| ≤ |lastEpochTime| + |epochSwitchTimeDelayIntroduction|:
    1. Set |startEpochIndex| to max(|numEpochs| − 4, 0).
    1. Set |endEpochIndex| to |numEpochs| − 2.
1. Else:
    1. Set |startEpochIndex| to max(|numEpochs| − 3, 0).
    1. Set |endEpochIndex| to |numEpochs| − 1.
1. If |endEpochIndex| ≥ 0:
    1. Let |i| be |startEpochIndex|.
    1. While |i| ≤ |endEpochIndex|:
        1. [=list/Append=] |epochs|[|i|] to |result|.
        1. Set |i| to |i| + 1.
1. Return |result|.
This roughly returns 3 recently calculated epochs, either counting back from the last epoch, or from the second to the last epoch. The decision depends on whether some fixed duration (between 0 and 2 days, sticky to a user agent & site) has passed since the last epoch was calculated. This essentially adds a per-site fixed delay to the epoch switch time, to make it harder to correlate the same user across sites via the time that topics are changed. The HMAC helps to compute the per-site delay on the fly, without needing to store extra data for each site.
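The window selection can be sketched as follows. The HMAC computation itself is left out: `hash64` is assumed to be the 64 bit truncation of the HMAC output, passed in as a `BigInt`, and all times are assumed to be in seconds. The function names are hypothetical.

```javascript
const TWO_DAYS_SECONDS = 172800n;

// Hypothetical helper: derives the per-site epoch switch delay (in seconds)
// from the truncated HMAC output.
function epochSwitchDelaySeconds(hash64) {
  return Number(hash64 % TWO_DAYS_SECONDS);
}

// Hypothetical helper: returns the indices of the epochs visible to a caller.
function epochIndexRange(numEpochs, timestamp, lastEpochTime, delaySeconds) {
  if (numEpochs === 0) return [];
  let start;
  let end;
  if (timestamp <= lastEpochTime + delaySeconds) {
    // The site has not switched yet: skip the newest epoch.
    start = Math.max(numEpochs - 4, 0);
    end = numEpochs - 2;
  } else {
    // The site has switched: include the newest epoch.
    start = Math.max(numEpochs - 3, 0);
    end = numEpochs - 1;
  }
  const indices = [];
  for (let i = start; i <= end; i++) indices.push(i);
  return indices;
}

console.log(epochIndexRange(4, 1000, 900, 200)); // logs: [ 0, 1, 2 ]
console.log(epochIndexRange(4, 1200, 900, 200)); // logs: [ 1, 2, 3 ]
```

With only one stored epoch and the delay not yet elapsed, the range is empty, which matches the case where |endEpochIndex| is negative.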

Get the number of distinct versions in epochs

To get the number of distinct versions in epochs, given a [=topics caller context=] |callerContext|, perform the following steps. They return an integer.
1. Let |epochs| be the result of running the [=calculate the epochs for caller=] algorithm given |callerContext| as input.
1. Let |distinctVersions| be an empty set.
1. For each |epoch| in |epochs|:
    1. If |epoch|'s [=epoch/taxonomy version=] is empty (implying that the topics calculation for that epoch didn't occur), then continue.
    1. Insert tuple (|epoch|'s [=epoch/taxonomy version=], |epoch|'s [=epoch/model version=]) to |distinctVersions|.
1. Return |distinctVersions|'s [=list/size=].
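A minimal sketch of the counting step, assuming each epoch is a plain object with `taxonomyVersion` and `modelVersion` string fields (hypothetical field names). Since JavaScript sets compare arrays by identity, the tuple is serialized to a string before insertion.

```javascript
// Hypothetical helper: counts distinct (taxonomy version, model version)
// pairs among the epochs, skipping epochs whose calculation did not occur.
function countDistinctVersions(epochs) {
  const distinctVersions = new Set();
  for (const epoch of epochs) {
    if (!epoch.taxonomyVersion) continue;
    distinctVersions.add(JSON.stringify([epoch.taxonomyVersion, epoch.modelVersion]));
  }
  return distinctVersions.size;
}
```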

Topics for caller

To calculate the topics for caller, given a [=topics caller context=] |callerContext|, perform the following steps. They return a list of {{BrowsingTopic}}s.
1. Let |epochs| be the result of running the [=calculate the epochs for caller=] algorithm given |callerContext| as input.
1. Let |result| be an empty [=list=].
1. For each |epoch| in |epochs|:
    1. If |epoch|'s [=epoch/top 5 topics with caller domains=] is empty (implying the topics calculation failed for that epoch), then continue.
    1. Let |topic| be null.
    1. Let |topTopicIndexDecisionMessageArray| be the concatenation of "top-topic-index-decision|", |epoch|'s [=epoch/time=], and |callerContext|'s [=topics caller context/top level context domain=].
    1. Let |topTopicIndexDecisionHmacOutput| be the output of the [=HMAC algorithm=], given input parameters: whichSha=SHA256, key=user agent's [=user agent/user topics state=]'s [=user topics state/hmac key=], and message_array=|topTopicIndexDecisionMessageArray|.
    1. Let |topTopicIndexDecisionHash| be 64 bit truncation of |topTopicIndexDecisionHmacOutput|.
    1. Let |topTopicIndex| be |topTopicIndexDecisionHash| % 5.
    1. Let |topTopicWithCallerDomains| be |epoch|'s [=epoch/top 5 topics with caller domains=][|topTopicIndex|].
    1. If |topTopicWithCallerDomains|'s [=topic with caller domains/caller domains=] contains |callerContext|'s [=topics caller context/caller domain=]:
        1. Set |topic| to an empty {{BrowsingTopic}} dictionary.
        1. Set |topic|["{{BrowsingTopic/topic}}"] to |topTopicWithCallerDomains|'s [=topic with caller domains/topic id=].
    1. If |topic| is null, or if |topic|'s {{BrowsingTopic/topic}} is 0 (i.e. the candidate topic was cleared), then continue.
    1. Let |randomOrTopTopicDecisionMessageArray| be the concatenation of "random-or-top-topic-decision|", |epoch|'s [=epoch/time=], and |callerContext|'s [=topics caller context/top level context domain=].
    1. Let |randomOrTopTopicDecisionHmacOutput| be the output of the [=HMAC algorithm=], given input parameters: whichSha=SHA256, key=user agent's [=user agent/user topics state=]'s [=user topics state/hmac key=], and message_array=|randomOrTopTopicDecisionMessageArray|.
    1. Let |randomOrTopTopicDecisionHash| be 64 bit truncation of |randomOrTopTopicDecisionHmacOutput|.
    1. If |randomOrTopTopicDecisionHash| % 100 < 5:
        1. Let |randomTopicIndexDecisionMessageArray| be the concatenation of "random-topic-index-decision|", |epoch|'s [=epoch/time=], and |callerContext|'s [=topics caller context/top level context domain=].
        1. Let |randomTopicIndexDecisionHmacOutput| be the output of the [=HMAC algorithm=], given input parameters: whichSha=SHA256, key=user agent's [=user agent/user topics state=]'s [=user topics state/hmac key=], and message_array=|randomTopicIndexDecisionMessageArray|.
        1. Let |randomTopicIndexDecisionHash| be 64 bit truncation of |randomTopicIndexDecisionHmacOutput|.
        1. Let |randomTopicIndex| be |randomTopicIndexDecisionHash| % |epoch|'s [=epoch/taxonomy=]'s [=list/size=].
        1. Set |topic|'s {{BrowsingTopic/topic}} to |epoch|'s [=epoch/taxonomy=][|randomTopicIndex|].
    1. Set |topic|["{{BrowsingTopic/configVersion}}"] to |epoch|'s [=epoch/config version=].
    1. Set |topic|["{{BrowsingTopic/modelVersion}}"] to |epoch|'s [=epoch/model version=].
    1. Set |topic|["{{BrowsingTopic/taxonomyVersion}}"] to |epoch|'s [=epoch/taxonomy version=].
    1. Determine the [=browsing topics types/version=] |version|, given |topic|'s {{BrowsingTopic/configVersion}}, {{BrowsingTopic/modelVersion}} and {{BrowsingTopic/taxonomyVersion}} as input.
    1. Set |topic|["{{BrowsingTopic/version}}"] to |version|.
    1. Add |topic| to |result|.
1. Sort entries in |result| given the less-than comparator for the {{BrowsingTopic}} dictionary.
1. Remove duplicate entries in |result|. Two {{BrowsingTopic}} dictionaries |a| and |b| are considered equal if |a| is not [=browsing-topic/code unit less than=] |b| and |b| is not [=browsing-topic/code unit less than=] |a|.
1. Return |result|.
This roughly selects one random topic from each of the previous epochs (to limit cross-site reidentification capabilities), and only returns those that were observed by the caller (so that this provides roughly only a subset of the capabilities of third-party cookies). For each epoch, there is a 5% chance to return a random topic from the full taxonomy, rather than returning the real top topic, so as to provide some amount of plausible deniability. This random topic will only be returned if the caller would have received the real top topic (i.e. observed by the caller). This makes it non-trivial to detect which topics are the random topics (see GitHub issue). All of the randomness involved in this process is sticky to the user agent, epoch, and site. The HMAC helps to compute the random sticky values on the fly, without needing to store extra data for each epoch and site.
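The per-epoch choice between the real top topic and a random topic can be sketched as below. The HMAC is abstracted away: `hash64(label)` is an assumed callback returning, as a `BigInt`, the 64 bit truncation of the HMAC over the given label, the epoch time, and the top-level site. The epoch object shape and the function name are likewise hypothetical.

```javascript
// Hypothetical sketch: returns the topic id to expose for one epoch, or
// null when the caller did not observe the selected top topic.
function topicIdForEpoch(epoch, callerDomain, hash64) {
  const top5 = epoch.top5TopicsWithCallerDomains;
  if (top5.length === 0) return null;
  // Pick one of the 5 top topics, sticky per epoch and site.
  const candidate = top5[Number(hash64("top-topic-index-decision|") % 5n)];
  // Only callers that observed the topic may receive anything; topic id 0
  // marks a cleared topic.
  if (!candidate.callerDomains.has(callerDomain) || candidate.topicId === 0) {
    return null;
  }
  // 5% of the time, substitute a random topic from the epoch's taxonomy.
  if (hash64("random-or-top-topic-decision|") % 100n < 5n) {
    const size = BigInt(epoch.taxonomy.length);
    const index = Number(hash64("random-topic-index-decision|") % size);
    return epoch.taxonomy[index];
  }
  return candidate.topicId;
}
```

Because the random substitution is only reached after the observation check, a caller that did not observe the real top topic receives nothing, as described above.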

The JavaScript API

The Topics API lives under the {{Document}} interface, and is only available if the document is in a [=secure context=].
    dictionary BrowsingTopicsOptions {
      boolean skipObservation = false;
    };

    partial interface Document {
        [SecureContext] Promise<sequence<BrowsingTopic>> browsingTopics(optional BrowsingTopicsOptions options = {});
    };
  
The browsingTopics(options) method steps are:
1. Let |document| be [=this=].
1. Let |topLevelDocument| be |document|'s [=node navigable=]'s [=navigable/top-level traversable=]'s [=navigable/active document=].
1. Let |promise| be [=a new promise=].
1. Let |topicsCallerContext| be a [=topics caller context=].
1. Set |topicsCallerContext|'s [=topics caller context/caller domain=] to |document|'s [=Document/origin=]'s [=origin/host=]'s [=host/registrable domain=].
1. Set |topicsCallerContext|'s [=topics caller context/top level context domain=] to |topLevelDocument|'s [=Document/origin=]'s [=origin/host=]'s [=host/registrable domain=].
1. Let |unsafeMoment| be the [=wall clock=]'s [=wall clock/unsafe current time=].
1. Let |moment| be the result of running the [=coarsen time=] algorithm given |unsafeMoment| and [=wall clock=] as input.
1. Let |fromUnixEpochTime| be the [=duration from=] the [=Unix epoch=] to |moment|.
1. Set |topicsCallerContext|'s [=topics caller context/timestamp=] to |fromUnixEpochTime|.
1. If any of the following is true:
    - |document|'s [=Document/origin=] is an [=opaque origin=].
    - |document| is not [=allowed to use=] the browsing-topics feature.
    - |document| is not [=allowed to use=] the interest-cohort feature.
    - The user preference setting disallows the access to topics from |topLevelDocument| given |document|'s [=Document/origin=].

    Note: In Chrome's experimentation phase, it will additionally require a valid Origin Trial token to exist in |document|.

    then:
    1. [=Queue a global task=] on the browsing topics task source given |document|'s [=relevant global object=] to [=reject=] |promise| with a "{{NotAllowedError}}" {{DOMException}}.
    1. Abort these steps.
1. Run the following steps [=in parallel=]:
    1. Let |topics| be the result of running the [=calculate the topics for caller=] algorithm, with |topicsCallerContext| as input.
    1. If |options|["{{BrowsingTopicsOptions/skipObservation}}"] is false:
        1. Run the [=collect page topics calculation input data=] algorithm with |topLevelDocument| as input.
        1. Run the [=collect topics caller domain=] algorithm with |topLevelDocument| and |topicsCallerContext|'s [=topics caller context/caller domain=] as input.
    1. [=Queue a global task=] on the [=browsing topics task source=] given |document|'s [=relevant global object=] to perform the following steps:
        1. [=Resolve=] |promise| with |topics|.
1. Return |promise|.
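From a caller's point of view, the method might be used along these lines. The wrapper `readTopics` and its `observe` option are hypothetical; the sketch only assumes the `browsingTopics()` method, the `skipObservation` option, and the "NotAllowedError" rejection described above. The document is passed in as a parameter so the function can also be exercised outside a browser.

```javascript
// Hypothetical wrapper: reads topics without throwing when the API is
// missing or disallowed, and optionally skips recording the observation.
async function readTopics(doc, { observe = true } = {}) {
  // Feature-detect: the method only exists in supporting, secure contexts.
  if (typeof doc.browsingTopics !== "function") return [];
  try {
    return await doc.browsingTopics({ skipObservation: !observe });
  } catch (error) {
    // Rejected when policy or user settings disallow access.
    if (error.name === "NotAllowedError") return [];
    throw error;
  }
}
```

In a page, `await readTopics(document, { observe: false })` would request topics without registering the caller as having observed the current page.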

fetch() and iframe integration

Topics can be sent in the HTTP header for {{WindowOrWorkerGlobalScope/fetch()}} requests and for iframe navigation requests. The response header for a topics related request can specify whether the caller should be recorded.

send browsing topics header boolean associated with Request

A [=request=] has an associated send browsing topics header boolean. Unless stated otherwise it is false. TODO: make the modification directly to the fetch spec.

browsingtopics content attribute for HTMLIframeElement

The iframe element contains a browsingtopics content attribute. The IDL attribute browsingTopics reflects the <{iframe/browsingtopics}> content attribute.
  partial interface HTMLIFrameElement {
    [CEReactions] attribute boolean browsingTopics;
  };
  
TODO: make the modification directly to the html spec.
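A short sketch of how a page might opt an iframe in through the IDL attribute. The helper name `createTopicsIframe` is hypothetical, and the document is passed as a parameter purely so the function is easy to test; in a page it would simply be `document`.

```javascript
// Hypothetical helper: creates an iframe whose navigation request is
// eligible to carry the topics header.
function createTopicsIframe(doc, src) {
  const iframe = doc.createElement("iframe");
  // Reflects the browsingtopics content attribute.
  iframe.browsingTopics = true;
  iframe.src = src;
  return iframe;
}
```

The declarative equivalent would be an iframe element written with the `browsingtopics` content attribute present.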

browsingTopics attribute in RequestInit

The RequestInit dictionary contains a browsingTopics attribute:
  partial dictionary RequestInit {
    boolean browsingTopics;
  };
  
TODO: make the modification directly to the fetch spec.

Modification to request constructor steps

The following step will be added to the new Request(input, init) constructor steps, before step "Set this's [=Request/request=] to |request|": 1. If init["{{RequestInit/browsingTopics}}"] exists, then set |request|'s [=request/send browsing topics header boolean=] to it. TODO: make the modification directly to the fetch spec.

Modification to "create navigation params by fetching" steps

The following step will be added to the create navigation params by fetching steps, after step "Let |request| be a new [=Request/request=], with ...": 1. If navigable's [=container=] is an <{iframe}> element, and if it has a <{iframe/browsingtopics}> content attribute, then set |request|'s [=request/send browsing topics header boolean=] to true. TODO: make the modification directly to the html spec.

The \`Sec-Browsing-Topics\` HTTP request header

This specification defines a \`Sec-Browsing-Topics\` HTTP request header. It is used to send the topics.

Modification to HTTP-network-or-cache fetch algorithm

The following step will be added to the HTTP-network-or-cache fetch algorithm, before step "Modify |httpRequest|'s [=request/header list=] per HTTP. ...":

1. Append or modify a request \`Sec-Browsing-Topics\` header for |httpRequest|.

TODO: make the modification directly to the fetch spec.

Append or modify a request `Sec-Browsing-Topics` header

To append or modify a request \`Sec-Browsing-Topics\` header, given a [=request=] |request|, run these steps:

1. If |request|'s [=request/send browsing topics header boolean=] is not true, then return.
1. [=header list/Delete=] [:Sec-Browsing-Topics:] from |request|'s [=header list=].

    Note: The topics a request is allowed to see can change within its redirect chain. For example, different caller domains may receive different topics, as the callers can only get the topics about the sites they were on. The timestamp can also affect the candidate epochs that the topics are derived from, thus resulting in different topics across redirects.

1. Let |initiatorWindow| be |request|'s [=request/window=].
1. Let |requestOrigin| be |request|'s [=request/URL=]'s [=url/origin=].
1. If |requestOrigin| is not a [=potentially trustworthy origin=], then return.
1. If |initiatorWindow| is not an [=environment settings object=], then return.
1. If |initiatorWindow| is not a [=secure context=], then return.
1. For each feature |f| in « "browsing-topics", "interest-cohort" »:
    1. Run the Should request be allowed to use feature? algorithm with feature set to |f| and request set to |request|. If the algorithm returns false, then return.

        Note: the above algorithm should include the pending update, i.e. the |request| should be considered to contain the equivalent opt-in flags for both the "browsing-topics" and the "interest-cohort" features.

1. Let |topLevelDocument| be |initiatorWindow|'s [=environment settings object/global object=]'s [=Window/navigable=]'s [=navigable/top-level traversable=]'s [=navigable/active document=].
1. Let |topicsCallerContext| be a [=topics caller context=] with default initial field values.
1. Set |topicsCallerContext|'s [=topics caller context/caller domain=] to |requestOrigin|'s [=origin/host=]'s [=host/registrable domain=].
1. Set |topicsCallerContext|'s [=topics caller context/top level context domain=] to |topLevelDocument|'s [=Document/origin=]'s [=origin/host=]'s [=host/registrable domain=].
1. Let |unsafeMoment| be the [=wall clock=]'s [=wall clock/unsafe current time=].
1. Let |moment| be the result of running the [=coarsen time=] algorithm given |unsafeMoment| and [=wall clock=] as input.
1. Let |fromUnixEpochTime| be the [=duration from=] the [=Unix epoch=] to |moment|.
1. Set |topicsCallerContext|'s [=topics caller context/timestamp=] to |fromUnixEpochTime|.
1. If the user preference setting disallows the access to topics from |topLevelDocument| given |requestOrigin|, then return.
1. Let |topics| be the result of running the [=calculate the topics for caller=] algorithm, with |topicsCallerContext| as input.
1. Let |numVersionsInEpochs| be the result of running the [=get the number of distinct versions in epochs=] algorithm, with |topicsCallerContext| as input.
1. Let |versionsToTopics| be an [=ordered map=].
1. For each |topic| of |topics|:
    1. Let |version| be |topic|["{{BrowsingTopic/version}}"].
    1. Let |topicInteger| be |topic|["{{BrowsingTopic/topic}}"].
    1. If |versionsToTopics|[|version|] does not exist, then set it to an empty list.
    1. Append |topicInteger| to |versionsToTopics|[|version|].
1. Let |topicsStructuredFieldsList| be an empty Structured Fields List.
1. For each |version| → |topicIntegers| of |versionsToTopics|:
    1. Let |innerList| be an empty Structured Fields Inner List.
    1. Append all items from |topicIntegers| to |innerList|.
    1. Let |topicParameters| be an empty [=Structured Fields Parameters=].
    1. Set |topicParameters|["v"] to a [=Structured Fields Token=] with value |version|.
    1. Associate |topicParameters| with |innerList|.
    1. Append |innerList| to |topicsStructuredFieldsList|.
1. If |numVersionsInEpochs| is 0, then set |numVersionsInEpochs| to 1.
1. Let |maxNumberOfEpochs| be 3 (i.e. topics are selected from the last 3 epochs).
1. Let |topicMaxLength| be the number of base-10 digits in the maximum [=browsing topics types/topic id=] (e.g. for Chrome's initial taxonomy, |topicMaxLength| is 3, as the [=browsing topics types/topic id=] has at most 3 digits).
1. Let |versionMaxLength| be the length of the current [=browsing topics types/maximum version string length=].
1. Let |listItemsSeparatorLength| be 2 (i.e. structured fields use two characters (", ") to separate list items).
1. Let |perVersionedTopicsInnerListOverhead| be 5 (i.e. for "();v=").
1. Let |maxPaddingLength| be |maxNumberOfEpochs| * |topicMaxLength| + |maxNumberOfEpochs| - |numVersionsInEpochs| + |numVersionsInEpochs| * |perVersionedTopicsInnerListOverhead| + |numVersionsInEpochs| * |versionMaxLength| + (|numVersionsInEpochs| - 1) * |listItemsSeparatorLength|.
1. Let |paddingLength| be |maxPaddingLength|.
1. If |topicsStructuredFieldsList| is not empty:
    1. Let |serializedTopicsList| be the result of executing the serializing structured fields algorithm on |topicsStructuredFieldsList|.
    1. Decrement |paddingLength| by |serializedTopicsList|'s length.
1. Else:
    1. Increment |paddingLength| by |listItemsSeparatorLength| (i.e. to account for the separator characters that would be added when |topics| is not empty).
1. If |paddingLength| < 0, then set |paddingLength| to 0.

    Note: the padding should generally be ≥ 0. It may be negative in certain circumstances: when historically stored topic versions are greater (and use more digits) than the current [=browsing topics types/maximum version string length=]; or when there is a race between getting topics and getting the number of distinct topic versions. Clamp to 0 to prevent breakage in these rare circumstances.

1. Let |paddedToken| be "P".
1. Append |paddingLength| "0" characters to the end of |paddedToken|.
1. Let |paddedEntryParameters| be an empty [=Structured Fields Parameters=].
1. Set |paddedEntryParameters|["p"] to a [=Structured Fields Token=] with value |paddedToken|.
1. Let |emptyInnerList| be an empty Structured Fields Inner List.
1. Associate |paddedEntryParameters| with |emptyInnerList|.
1. Append |emptyInnerList| to |topicsStructuredFieldsList|.
1. [=Set a structured field value=] given ([:Sec-Browsing-Topics:], |topicsStructuredFieldsList|) in |request|'s [=request/header list=].

This algorithm transforms the topics list into the structured fields format and adds padding so that the total header length is consistent across different topics callers. For example:
Empty returned topics, and underlying epochs have same versions: ();p=P0000000000000000000000000000000
One returned topic, and underlying epochs have same versions: (1);v=chrome.1:1:2, ();p=P00000000000
Two returned topics, and underlying epochs have same versions: (1 2);v=chrome.1:1:2, ();p=P000000000
Two returned topics, and underlying epochs have two different versions: (1);v=chrome.1:1:2, (1);v=chrome.1:1:4, ();p=P0000000000
Three returned topics, and underlying epochs have three different versions: (100);v=chrome.1:1:20, (200);v=chrome.1:1:40, (300);v=chrome.1:1:60, ();p=P
Why padding is added: servers typically have a GET request size limit (e.g. 8KB) and will return an error when the limit is reached. An attacker can rely on this to learn the number of topics for a different domain, and/or a small amount of information about the topics themselves (e.g. whether the [=browsing topics types/topic ids=] are < 10, < 100, etc.). The header length still varies with the number of distinct versions, which could leak the epochs in which the user had disabled topics or did not use the browser, if that coincided with a version change. This leak is minor: the most common cases (i.e. returning topics that all share one version, or no topics) will have the same length.
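
The grouping, serialization, and padding steps of the algorithm above can be sketched in Python. This is an illustrative sketch, not a normative implementation: the function names are hypothetical, the serializer only covers the shape this algorithm emits (it is not a general RFC 8941 serializer), `TOPIC_MAX_LENGTH = 3` is taken from the Chrome taxonomy example in the steps, and `VERSION_MAX_LENGTH = 13` is an assumed value chosen to be consistent with the example version strings above.

```python
# Constants named after the variables in the algorithm steps.
MAX_NUMBER_OF_EPOCHS = 3
TOPIC_MAX_LENGTH = 3                  # from the spec's Chrome taxonomy example
VERSION_MAX_LENGTH = 13               # ASSUMPTION: e.g. len("chrome.1:1:20")
LIST_ITEMS_SEPARATOR_LENGTH = 2       # ", "
PER_VERSION_INNER_LIST_OVERHEAD = 5   # "();v="


def max_padding_length(num_versions_in_epochs):
    """Compute maxPaddingLength; a count of 0 is treated as 1, as in the steps."""
    n = max(num_versions_in_epochs, 1)
    return (
        MAX_NUMBER_OF_EPOCHS * TOPIC_MAX_LENGTH
        + MAX_NUMBER_OF_EPOCHS - n
        + n * PER_VERSION_INNER_LIST_OVERHEAD
        + n * VERSION_MAX_LENGTH
        + (n - 1) * LIST_ITEMS_SEPARATOR_LENGTH
    )


def build_topics_header(topics, num_versions_in_epochs):
    """Build a padded header value from a list of (version, topic_id) pairs."""
    # Group topic ids by version, preserving first-seen order.
    versions_to_topics = {}
    for version, topic_id in topics:
        versions_to_topics.setdefault(version, []).append(topic_id)

    # One inner list per version, e.g. "(1 2);v=chrome.1:1:2".
    entries = [
        "(" + " ".join(str(t) for t in ids) + ");v=" + version
        for version, ids in versions_to_topics.items()
    ]

    padding = max_padding_length(num_versions_in_epochs)
    if entries:
        padding -= len(", ".join(entries))
    else:
        # Account for the separator that a non-empty list would have added.
        padding += LIST_ITEMS_SEPARATOR_LENGTH
    padding = max(padding, 0)  # clamp, as in the steps

    entries.append("();p=P" + "0" * padding)
    return ", ".join(entries)


if __name__ == "__main__":
    print(build_topics_header([], 1))
    print(build_topics_header([("chrome.1:1:2", 1), ("chrome.1:1:2", 2)], 1))
```

Under these assumptions, the no-topics, one-topic, and two-topic cases that share a single version all serialize to the same total length, which is the property the padding is meant to provide.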
In Chrome's experimentation phase, it will additionally require a valid Origin Trial token to exist in |initiatorWindow|'s associated document for the request to be eligible for topics.
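
On the receiving side, a server might read the header value back into a mapping from version to topic ids. The following Python sketch is illustrative only: `parse_topics_header` is a hypothetical helper, and it assumes the value has exactly the shape produced by the append-or-modify algorithm above (inner lists of integers with a `v` parameter, plus one padding entry). A production server would more likely use a full RFC 8941 structured fields parser.

```python
from typing import Dict, List


def parse_topics_header(value: str) -> Dict[str, List[int]]:
    """Parse a Sec-Browsing-Topics value into {version: [topic ids]}.

    Simplified: handles only the shape emitted by the algorithm above.
    """
    result: Dict[str, List[int]] = {}
    for member in value.split(","):
        member = member.strip()
        if not member.startswith("("):
            continue  # not an inner list; ignore in this sketch
        inner, _, params = member[1:].partition(")")

        # Look for the "v" parameter that carries the version token.
        version = None
        for param in params.split(";"):
            key, _, val = param.partition("=")
            if key.strip() == "v":
                version = val.strip()

        if version is None:
            continue  # the padding entry "();p=P000" carries no topics

        result.setdefault(version, []).extend(int(t) for t in inner.split())
    return result


if __name__ == "__main__":
    print(parse_topics_header("(1 2);v=chrome.1:1:2, ();p=P000000000"))
```

The padding entry is skipped deliberately: it exists only to equalize header length and conveys no interest information.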

The \`Observe-Browsing-Topics\` HTTP response header

The \`Observe-Browsing-Topics\` HTTP response header can be used to record a caller's topics observation.
To handle topics response, given a [=response=] |response| and a [=request=] |request|:

1. If |request|'s [=request/header list=] does not [=list/contain=] [:Sec-Browsing-Topics:] (implying the |request|'s [=request/current URL=] is not eligible for topics), then return.
1. Let |topLevelDocument| be |request|'s [=request/window=]'s [=environment settings object/global object=]'s [=Window/navigable=]'s [=navigable/top-level traversable=]'s [=navigable/active document=].
1. Let |callerDomain| be |request|'s [=request/current URL=]'s [=url/origin=]'s [=origin/host=]'s [=host/registrable domain=].
1. Let |list| be |response|'s [=response/header list=].
1. Let |observe| be the result of running the [=get a structured field value=] algorithm given [:Observe-Browsing-Topics:], "item", and |list| as input.
1. If |observe| is true:
    1. Run the [=collect page topics calculation input data=] algorithm with |topLevelDocument| as input.
    1. Run the [=collect topics caller domain=] algorithm with |topLevelDocument| and |callerDomain| as input.
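
The decision made by the handle-topics-response steps can be sketched as follows. This Python sketch is illustrative only: the helper names are hypothetical, headers are modeled as plain dictionaries, and the structured-field parsing is reduced to recognizing the boolean item `?1` (optionally followed by parameters) rather than implementing RFC 8941 in full.

```python
from typing import Mapping, Optional


def _get_header(headers: Mapping[str, str], name: str) -> Optional[str]:
    """Case-insensitive header lookup over a plain mapping."""
    for key, value in headers.items():
        if key.lower() == name.lower():
            return value
    return None


def observe_is_true(header_value: Optional[str]) -> bool:
    """Return True only if the value is the structured-field boolean true.

    Simplified: drops any parameters and compares the bare item to "?1".
    """
    if header_value is None:
        return False
    bare_item = header_value.split(";", 1)[0].strip(" ")
    return bare_item == "?1"


def should_record_observation(
    request_headers: Mapping[str, str],
    response_headers: Mapping[str, str],
) -> bool:
    """Mirror the two gates in the steps above."""
    # Gate 1: the request must have carried Sec-Browsing-Topics.
    if _get_header(request_headers, "Sec-Browsing-Topics") is None:
        return False
    # Gate 2: the response must opt in with Observe-Browsing-Topics: ?1.
    return observe_is_true(_get_header(response_headers, "Observe-Browsing-Topics"))


if __name__ == "__main__":
    print(should_record_observation(
        {"Sec-Browsing-Topics": "();p=P"},
        {"Observe-Browsing-Topics": "?1"},
    ))
```

Note that the integer `1` is not the boolean true in structured fields, so a response carrying `Observe-Browsing-Topics: 1` would not trigger recording in this sketch.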

Modification to HTTP fetch steps

The following step will be added to the [=HTTP fetch=] steps, before checking the redirect status (i.e. "If |actualResponse|'s status is a redirect status, ..."):

1. [=Handle topics response=], given [=response=] |actualResponse| and [=request=] |request| as input.

TODO: make the modification directly to the fetch spec.

Permissions policy integration

This specification defines a [=policy-controlled feature=] identified by the string "browsing-topics". Its default allowlist is *.

For backward compatibility, this specification also defines a [=policy-controlled feature=] identified by the string "interest-cohort". Its default allowlist is *.

Privacy considerations

The Topics API attempts to provide just enough relevant interest information for advertisers to be able to personalize their ads for the user while maintaining user privacy. Some privacy safeguards include: usage in secure contexts only, topic limitation to a human-curated taxonomy, different topics given to different sites in the same epoch to prevent cross-site reidentification, noised topics, a limited number of topics provided per epoch, user opt-outs, site opt-outs, and a suggestion that user agents provide UX to give users a choice in which topics are returned.