Proposal: declarativeNetRequest: matching based on response headers

Celsius273 opened this issue 10 months ago

Hello!

My name is Kelvin Jiang and I am part of the Chrome Extensions team. I'll be working on adding the ability for declarativeNetRequest (DNR) rules to match based on response headers, which is tracked in this crbug.

From what I've gathered (requirements):

Add the ability to match on:

the existence of a response header
the non-existence of a response header
if a given response header includes/does not include a specified value

^to support this, I propose adding the following fields to RuleCondition:

// New type
HeaderCondition {
  // The name of the header.
  string header;
  
  // If specified, match this condition if the header's value contains at least one element in this list
  string[]? values; 
 
  // If specified, do not match this condition if the header exists but its value contains at least one element in this list.
  string[]? excludedValues;
}

RuleCondition: {
  // add the following 2 fields (below):

  // rule matches if the request matches any one condition in this list (if specified)
  HeaderCondition[]? responseHeaders;
  
  // rule does not match if the request matches any one condition in this list (if specified)
  // This is essentially a negation of responseHeaders above
  HeaderCondition[]? excludedResponseHeaders; 
}
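To make the proposed fields concrete, here is a minimal sketch of a rule that uses both of them. The rule ID, URL filter, and header names are made up for illustration, and the field names follow the schema as proposed in this comment, which may differ from what finally ships.

```javascript
// Hypothetical rule: field names follow the proposed schema above,
// everything else (ID, filter, header names) is illustrative only.
const rule = {
  id: 1,
  priority: 1,
  action: { type: "block" },
  condition: {
    urlFilter: "example.com",
    // The rule matches if ANY of these header conditions is satisfied.
    responseHeaders: [
      { header: "content-type", values: ["application/pdf"] },
      { header: "x-example-flag" }, // name only: an existence check
    ],
    // The rule does not match if ANY of these conditions is satisfied.
    excludedResponseHeaders: [{ header: "x-example-optout" }],
  },
};
```

In an extension, an object like this would go into the addRules list passed to chrome.declarativeNetRequest.updateSessionRules or updateDynamicRules.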

Context: request stage for webRequest and DNR

Currently, DNR rules are matched during the onBeforeRequest stage. For rules that match on response headers, the earliest stage that they can be matched is onHeadersReceived when we receive the response headers from the request.

A few more details:

DNR actions that happen during the onBeforeRequest stage (block/redirect) will prevent the request from proceeding to the onHeadersReceived stage and so it will not get matched based on response headers
^this means that rule priorities only pertain to the request stage that the rule will get matched in
rules that match on response headers cannot modify request headers (since the request headers have already been sent in an earlier request stage)
rules that match on response headers will examine headers in the onHeadersReceived stage BEFORE they are modified by the webRequest API or any other DNR rules (like modifyHeaders rules that were matched in the onBeforeRequest stage)
matching based on value/excludedValues will be case insensitive

Let me know if this looks good or if there should be any changes to the above proposal. I'm looking forward to working with all of you and I know that this feature has been requested for quite a while!

nir-walkme · Answer 1 · Sun Oct 01 2023 17:20:45 GMT+0800 (China Standard Time)

Hi @Celsius273

Thank you for consulting with the community.

We need to be able to modify the response headers by using rules that take into account the existing response header.
For example: Edit CSP response header and add a domain to the list of domains.
More information about the use case can be read here.

From what I understand, your proposal will not allow this, but it sounds like it makes this more feasible in the future, as DNR rules will now also be evaluated during the onHeadersReceived stage.

Did you think about this requested functionality?
Any plans to implement it?

Kelvin Jiang · Answer 2 · Mon Oct 02 2023 09:06:46 GMT+0800 (China Standard Time)

Hi @nir-walkme

DNR currently already has the ability to modify headers: not just remove them, but override their value or append a value onto them, which could be useful for CSP. See the ModifyHeaderInfo object for how to do this. However, it does not currently have the ability to substitute headers (i.e. replace parts of a header's value with something else).
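As a rough illustration of the three existing ModifyHeaderInfo operations mentioned above, a modifyHeaders rule could look like the sketch below. The rule ID, URL filter, header names, and values are placeholders, not a recommendation for CSP handling.

```javascript
// Sketch of a modifyHeaders rule using the existing operations.
// Header names and values are placeholders for illustration.
const modifyHeadersRule = {
  id: 10,
  priority: 1,
  action: {
    type: "modifyHeaders",
    responseHeaders: [
      // Remove the header entirely (no value allowed).
      { header: "x-example-remove", operation: "remove" },
      // Override the header's value.
      { header: "x-example-set", operation: "set", value: "new-value" },
      // Add a further entry for the header.
      { header: "x-example-append", operation: "append", value: "extra-value" },
    ],
  },
  condition: { urlFilter: "example.com", resourceTypes: ["main_frame"] },
};
```

None of these operations can rewrite only part of an existing value, which is the substitution gap described above.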

Note that even though modifyHeaders rules can be matched in the onBeforeRequest stage, their actions are still executed in the onHeadersReceived stage.

This proposal is essentially a "v1" of matching on response headers: we want to add/implement a viable base that satisfies some use cases before exploring something more complex such as header substitution, which seems like the use case you're suggesting.

Thanks

nir-walkme · Answer 3 · Sun Oct 08 2023 20:46:10 GMT+0800 (China Standard Time)

Thanks @Celsius273

I have one request regarding your existing proposal: Add the ability to match a header by value exactly and not by "contains".

Use case example

In order to partially solve the CSP problem described before, we could use predefined DNR rules that replace a CSP header value completely with another value.

Let's assume we want to set a rule that adds newdomain.com to allowed script-src domains.
The existing header is Content-Security-Policy: script-src example.com 'self'

We will create the following DNR rule:
If Content-Security-Policy header equals Content-Security-Policy: script-src example.com 'self'
Change it to Content-Security-Policy: script-src example.com newdomain.com 'self'

The reason that a 'contains' rule is not enough is that if we had set the following rule:
If Content-Security-Policy header contains Content-Security-Policy: script-src example.com 'self'
Then if the Content-Security-Policy header would change in the future to Content-Security-Policy: script-src example.com 'self' example2.com; style-src 'self' then the rule would match but we would change it to Content-Security-Policy: script-src example.com newdomain.com 'self' and miss the example2.com domain.
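The difference between the two matching modes can be sketched in plain JavaScript. This only models the string comparison; equalsMatch, containsMatch, and rewrite are hypothetical helpers and not part of the DNR API.

```javascript
// Header values taken from the use case above.
const expected = "script-src example.com 'self'";
const replacement = "script-src example.com newdomain.com 'self'";

// Hypothetical matchers modelling "equals" vs "contains".
const equalsMatch = (value, wanted) => value === wanted;
const containsMatch = (value, wanted) => value.includes(wanted);

// Replace the whole header value when the matcher accepts it.
function rewrite(headerValue, matcher) {
  return matcher(headerValue, expected) ? replacement : headerValue;
}

// The server later extends its policy.
const changedByServer =
  "script-src example.com 'self' example2.com; style-src 'self'";

console.log(rewrite(changedByServer, containsMatch)); // example2.com is lost
console.log(rewrite(changedByServer, equalsMatch)); // left untouched
```

With "contains", the extended policy still matches and is overwritten with a stale value; with "equals", the rule simply stops matching until it is updated.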

Summary

The above request would help improve your "v1" matching proposal.
We would still be very happy to see "v2" per my previous comment.

Kelvin Jiang · Answer 4 · Fri Oct 13 2023 16:16:06 GMT+0800 (China Standard Time)

Copying a few examples here: example 1 is relatively basic, 2a/2b deal with allow rules, and 3 deals with modifyHeaders rules.

Note about modifyHeaders (MH) rules: MH rule interactions are a bit difficult to reason about since multiple rules can match and rules can specify different operations.

MH rules that match on responseHeaders will still match on the request's original response header values, before they are modified by extensions.
MH rules that match on responseHeaders have a lower functional priority than MH rules that match on just a url/regex filter even if they have a higher specified priority (specified rule priorities only apply to the request stage that the rule matches in). See point 4 here: https://developer.chrome.com/docs/extensions/reference/declarativeNetRequest/#rule-prioritization-within-an-extension

Example 1

{
  id: 1,
  priority: 1,
  action: { type: "block" },
  condition: { urlFilter: "abc" }
}, {
  id: 2,
  priority: 99,
  action: { type: "upgradeScheme" },
  condition: {
    urlFilter: "abc",
    responseHeaders: [{ header: "set-cookie" }]
  }
}

In this (trivial) example, a request from abc.com will get blocked even though rule 2 has a higher priority, since it matches on a later request stage.

Example 2a

{
  id: 3,
  priority: 99,
  action: { type: "allow" },
  condition: { urlFilter: "abc" }
}, {
  id: 4,
  priority: 1,
  action: { type: "block" },
  condition: {
    urlFilter: "abc",
    responseHeaders: [{ header: "set-cookie" }]
  }
}

A request from abc.com with a set-cookie header will go through since rule 3 (allow) has a higher priority than rule 4 (block) and prevents rule 4 from matching.

Example 2b

{
  id: 3,
  priority: 1,
  action: { type: "allow" },
  condition: { urlFilter: "abc" }
}, {
  id: 4,
  priority: 99,
  action: { type: "block" },
  condition: {
    urlFilter: "abc",
    responseHeaders: [{ header: "set-cookie" }]
  }
}

A request from abc.com with a set-cookie header will get blocked from rule 4, since rule 3 (allow) has a lower priority and does not prevent rule 4 (higher priority) from matching.
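The interaction in examples 1, 2a, and 2b can be modelled with a small simulation. This is only a sketch of the behavior described in this comment, not Chromium's implementation: evaluate is a hypothetical function, urlFilter is treated as a plain substring, header conditions only check existence, and a tie between an allow rule and a later block rule is assumed to go to the allow rule.

```javascript
// responseHeaders: plain object keyed by lower-case header name.
function evaluate(rules, url, responseHeaders) {
  const urlMatches = (r) => url.includes(r.condition.urlFilter);
  const byPriority = (a, b) => b.priority - a.priority;

  // Stage 1 (onBeforeRequest): rules without response header conditions.
  const early = rules
    .filter((r) => !r.condition.responseHeaders && urlMatches(r))
    .sort(byPriority)[0];
  if (early && early.action.type === "block") return "blocked in onBeforeRequest";
  const allowPriority =
    early && early.action.type === "allow" ? early.priority : -Infinity;

  // Stage 2 (onHeadersReceived): rules with response header conditions.
  const hasHeader = (r) =>
    r.condition.responseHeaders.some((c) => c.header.toLowerCase() in responseHeaders);
  const late = rules
    .filter((r) => r.condition.responseHeaders && urlMatches(r) && hasHeader(r))
    .sort(byPriority)[0];
  if (late && late.action.type === "block" && late.priority > allowPriority) {
    return "blocked in onHeadersReceived";
  }
  return "allowed";
}

const allowThenBlock = (allowPriority, blockPriority) => [
  { id: 3, priority: allowPriority, action: { type: "allow" }, condition: { urlFilter: "abc" } },
  { id: 4, priority: blockPriority, action: { type: "block" },
    condition: { urlFilter: "abc", responseHeaders: [{ header: "set-cookie" }] } },
];
const headers = { "set-cookie": "a=b" };
console.log(evaluate(allowThenBlock(99, 1), "https://abc.com/", headers)); // example 2a
console.log(evaluate(allowThenBlock(1, 99), "https://abc.com/", headers)); // example 2b
```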

Example 3

{
  id: 5,
  priority: 1,
  action: {
    type: "modifyHeaders",
    responseHeaders: [{ header: "set-cookie", operation: "set", value: "asdf" }]
  },
  condition: { urlFilter: "abc" }
}, {
  id: 6,
  priority: 1,
  action: { type: "block" },
  condition: {
    urlFilter: "abc",
    responseHeaders: [{ header: "set-cookie", values: ["bad-cookie"] }]
  }
}

A request from abc.com with the set-cookie header “bad-cookie” will be blocked by rule 6 since it matches based on the header’s value. Note that this match is done on the header’s original value before rule 5 has a chance to modify it (block actions have higher precedence than modifyHeaders actions).

Thanks,
Kelvin

Kelvin Jiang · Answer 5 · Thu Oct 19 2023 06:12:22 GMT+0800 (China Standard Time)

Hello, one piece of the design for response header matching rules that we’d like to get some feedback on is the execution of modifyHeaders rules.

Option 1:

When a request is initiated, any rules that are possible to run at this stage (those which do not modify, or match based on response headers) are run in normal priority order.
When a response is received, the remaining rules are run. We first run any rules that matched purely based on the request but were delayed since they perform response header modifications. We then run any with response header conditions. In both cases we sort based on the normal priority rules.

Option 2:

When a request is initiated, any rules that are possible to run at this stage (those which do not modify, or match based on response headers) are run in normal priority order (same as option 1).
When a response is received, the remaining rules are run in normal priority order. This means that a rule matching purely based on the request could still run after a rule matching on response headers if the rule matching on response headers is higher priority.

In both cases, response header conditions match based on the original headers before any modifications by DNR or webRequest.
Additionally, for both options, a block rule with a response headers condition will not actually block a request - we can only cancel it once we have received the headers.

We believe that option 2 produces more intuitive behavior that can be better adapted to more use cases.
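The two orderings for the response stage can be sketched as follows. This only models the order in which already-matched rules would run; the rule shape (a matchesOnResponseHeaders flag) and the function names are made up for illustration.

```javascript
// Three matched rules that all act at the response stage.
const matched = [
  { id: "A", priority: 1, matchesOnResponseHeaders: false },
  { id: "B", priority: 5, matchesOnResponseHeaders: true },
  { id: "C", priority: 3, matchesOnResponseHeaders: false },
];

const byPriority = (a, b) => b.priority - a.priority;

// Option 1: request-only rules first, then header-conditional rules,
// each group sorted by priority.
function orderOption1(rules) {
  const requestOnly = rules.filter((r) => !r.matchesOnResponseHeaders).sort(byPriority);
  const headerBased = rules.filter((r) => r.matchesOnResponseHeaders).sort(byPriority);
  return [...requestOnly, ...headerBased].map((r) => r.id);
}

// Option 2: a single pass in priority order.
function orderOption2(rules) {
  return [...rules].sort(byPriority).map((r) => r.id);
}

console.log(orderOption1(matched)); // [ 'C', 'A', 'B' ]
console.log(orderOption2(matched)); // [ 'B', 'C', 'A' ]
```

Under option 1, rule B can never run before A or C regardless of its priority; under option 2 its priority of 5 puts it first.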

Feedback and example use cases would be greatly appreciated, thanks!

Adrien CONSTANCIN · Answer 6 · Fri Nov 17 2023 22:47:55 GMT+0800 (China Standard Time)

Using DNR, any slightly complex modification strategy is impossible.

We have the same need as @nir-walkme of being able to modify the CSP directives (add specific URLs in some directives, add 'unsafe-inline' when needed, remove 'none' in some case...).

In Manifest v2 we implemented it simply using blocking webRequest.onHeadersReceived.
In Manifest v3 it is not possible, except for enterprises deploying via GPO (maybe 95% of the deployments of our extension, but not all).

Would it be possible to get CSP modification tools? Or a way to delegate some treatment of DNR rules to some code? If we could tell "modify this header using this helper function (defined in the webextension), with this parameter", we could simply tailor the CSP as we would like. You could put some restrictions on the code so that it has no access to any API except some simple ones (no web extension API call, no fetch, no async...) to ensure it remains fast.

Kelvin Jiang · Answer 7 · Wed Jan 10 2024 08:41:18 GMT+0800 (China Standard Time)

Hello,

I have proposed an edit to the RuleCondition and HeaderCondition schema in the opening comment

Namely, excludedResponseHeaders will be specified as a list of HeaderCondition instead of just a list of header names (strings). This allows a rule to be not matched if a request contains a header with a specific value (vs. before, where the rule is not matched if a request just contains that header).

e.g. if a rule specifies the following in excludedResponseHeaders:

excludedResponseHeaders: [{
  header: 'Foo',
  values: ['included-value'],
  excludedValues: ['excluded-value']
}, {
  header: 'Bar'
}]

^the rule is not matched if:

it contains the header "bar", or
it contains the header "foo" and "included-value" but not "excluded-value"

FAQ: What's the difference between specifying a header's value in responseHeaders/excludedValues vs excludedResponseHeaders/values?

A:

responseHeaders: [{
  header: 'h1',
  excludedValues: ['bar']
}, {
  header: 'h2'
}]

excludedResponseHeaders: [{
  header: 'h3',
  values: ['bar']
}]

In this example, if the request contains the header "h1: [bar]", then other conditions in responseHeaders are still evaluated, and the rule will match if the request contains the header "h2".

However, if the request contains the header "h3: [bar]", then conditions in responseHeaders will not be evaluated and the rule will not match.
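The FAQ answer can be checked against a small model of the matching logic. This is a sketch under assumptions: matchesHeaderCondition and ruleMatches are hypothetical helpers, headers are given as an object mapping lower-case names to lists of values, and values are compared as whole strings, case-insensitively.

```javascript
// One HeaderCondition against the response headers.
function matchesHeaderCondition(cond, headers) {
  const present = headers[cond.header.toLowerCase()];
  if (present === undefined) return false; // header absent
  const lowered = present.map((v) => v.toLowerCase());
  const containsAny = (list) => list.some((x) => lowered.includes(x.toLowerCase()));
  if (cond.excludedValues && containsAny(cond.excludedValues)) return false;
  if (cond.values) return containsAny(cond.values);
  return true; // name-only condition: existence is enough
}

// excludedResponseHeaders is checked first; any hit rules the rule out.
function ruleMatches(condition, headers) {
  const { responseHeaders, excludedResponseHeaders } = condition;
  if (excludedResponseHeaders?.some((c) => matchesHeaderCondition(c, headers))) return false;
  if (responseHeaders) return responseHeaders.some((c) => matchesHeaderCondition(c, headers));
  return true;
}

const faqCondition = {
  responseHeaders: [{ header: "h1", excludedValues: ["bar"] }, { header: "h2" }],
  excludedResponseHeaders: [{ header: "h3", values: ["bar"] }],
};
console.log(ruleMatches(faqCondition, { h1: ["bar"], h2: ["x"] })); // true, via h2
console.log(ruleMatches(faqCondition, { h2: ["x"], h3: ["bar"] })); // false, via h3
```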

Alexei · Answer 8 · Fri Jan 19 2024 06:09:03 GMT+0800 (China Standard Time)

@Celsius273 If DNR gains matching during the onHeadersReceived stage, we should also be able to match on response status codes.

For example, to block particular redirects, something like:

{
  id: 1,
  action: { type: "block" },
  condition: {
    responseHeaders: [{ header: "Location" }],
    // response status code conditions combine via AND, not OR?
    // however it's done, the big idea is we need to be able to specify status code ranges
    responseStatusCode: [
      { operator: ">=", value: 300 },
      { operator: "<", value: 400 }
    ],
    urlFilter: "abc",
    resourceTypes: ["xmlhttprequest"]
  }
},
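For illustration, the AND-combined range check suggested in the comments of the rule above could be evaluated like this. Note that responseStatusCode is only a suggestion in this comment and not an existing DNR field; statusMatches and the operator table are hypothetical.

```javascript
// Hypothetical operator table for the suggested responseStatusCode field.
const operators = {
  ">=": (a, b) => a >= b,
  ">": (a, b) => a > b,
  "<=": (a, b) => a <= b,
  "<": (a, b) => a < b,
  "==": (a, b) => a === b,
};

// All conditions must hold (AND), which is what makes ranges expressible.
function statusMatches(conditions, status) {
  return conditions.every(({ operator, value }) => {
    const compare = operators[operator];
    if (!compare) throw new Error(`Unknown operator: ${operator}`);
    return compare(status, value);
  });
}

const redirectRange = [
  { operator: ">=", value: 300 },
  { operator: "<", value: 400 },
];
console.log(statusMatches(redirectRange, 302)); // true
console.log(statusMatches(redirectRange, 200)); // false
```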

Rob Wu · Answer 9 · Tue Jan 23 2024 09:45:52 GMT+0800 (China Standard Time)

Re #460 (comment) about the order of modifyHeader rule matching. It took some attempts to read it before I understood what you meant. To rephrase, the proposed options were:

Option 1: Two passes: run rules that are independent of header conditions first, followed by running rules that are dependent on header conditions. Priorities are only enforced as usual within each of the two runs, but "priority" cannot be used to get a header-conditional rule action (e.g. modifyHeaders) to apply before a rule that has no header conditions.
Option 2: Evaluate rules in the defined order of priority. The precedence of rules follows the usual definition and is not affected by the presence of a header condition.

I too agree that option 2 is preferred over option 1.

P.S. I am currently exploring edge cases and will post another comment later this week.

Daniel Jacobs · Answer 10 · Wed Jun 05 2024 21:46:01 GMT+0800 (China Standard Time)

With the API at https://crsrc.org/extensions/common/api/declarative_net_request.idl, is there a way to check if a response lacks a specific header altogether?

Not lacks a value in a header, but lacks the presence of a header entirely.

Daniel Jacobs · Answer 11 · Wed Jun 05 2024 22:25:08 GMT+0800 (China Standard Time)

Actually, I just realized this could be used:

                condition: {
                    regexFilter: ...,
                    excludedResponseHeaders: [
                        { header: headerName }
                    ],
                }

This would match anything matching the regex filter but lacking the header headerName, correct? So I suppose this use-case is already covered. Sorry for the confusion.

Daniel Jacobs · Answer 12 · Thu Jun 06 2024 02:45:32 GMT+0800 (China Standard Time)

Mozilla bug: https://bugzilla.mozilla.org/show_bug.cgi?id=1877486
I opened this WebKit bug: https://bugs.webkit.org/show_bug.cgi?id=275158

Rob Wu · Answer 13 · Fri Jun 07 2024 01:01:13 GMT+0800 (China Standard Time)

(I typed this many weeks ago, just submitting now for context)

@Celsius273 I have tried the in-progress implementation in Chrome 127.0.6503.0 (associated with https://crbug.com/40727004 ), and it looks like the condition only matches if the header value is (case-insensitively) equal to any of the specified values. This is too limited for the PDF viewer use case because the Content-Type header that it tries to parse is delimited by ;, e.g. application/pdf; charset=utf-8. To support Content-Type matching, it must be possible to match part of the value (the leading part at least).

The input is specified by: https://source.chromium.org/chromium/chromium/src/+/main:extensions/common/api/declarative_net_request.idl;l=173-184;drc=24a80999fbc083276e0fbef7f9e2eec36429f6a8

  [nodoc] dictionary HeaderInfo {
    // The name of the header. This condition matches on the name
    // only if both `values` and `excludedValues` are not specified.
    DOMString header;
    // If specified, this condition matches if the header's value
    // contains at least one element in this list.
    DOMString[]? values;
    // If specified, this condition is not matched if the header
    // exists but its value contains at least one element in this list.
    DOMString[]? excludedValues;
  };

And the implementation is in MatchesHeaderConditions at https://source.chromium.org/chromium/chromium/src/+/main:extensions/browser/api/declarative_net_request/request_params.cc;l=65-110;drc=3152449b79f22a5cfc3dcd81cd5ca19fdca8e49e

According to the implementation, a request only matches if the header is (case-insensitively) identical to any value in values.

Example: Test case

I confirmed the behavior by loading an extension with the declarativeNetRequest permission + "host_permissions": ["*://*/*"], and specifying the following rule that is supposedly matching all responses with MIME-type "application/json":

await chrome.declarativeNetRequest.updateSessionRules({
  removeRuleIds: [1],
  addRules:[
    {
      id: 1,
      condition: {
        regexFilter: ".*",
        resourceTypes: ["main_frame", "sub_frame"],
        responseHeaders: [
          { header: "content-Type", values: ["application/json"] },
        ],
      },
      action: { type: "redirect", redirect: { regexSubstitution: "https://example.com/?\\0." } }
    },
  ],
});

Then, consider the following test cases:

https://httpbingo.org/response-headers?content-type=application/json
✔️ Expected: matched. Actual: matched.
https://httpbingo.org/response-headers?content-type=%20application/json
✔️ Expected: matched. Actual: matched.
https://httpbingo.org/response-headers?content-type=application/json%20
✔️ Expected: matched. Actual: matched.
https://httpbingo.org/response-headers?content-type=application/json%3B%20charset=utf-8
✖️ Expected: matched. Actual: not matched.
https://httpbingo.org/response-headers?content-type=application/json%20%3B%20charset=utf-8
✖️ Expected: matched. Actual: not matched.
https://httpbingo.org/response-headers?content-type=not-application/json
✔️ Expected: not matched. Actual: not matched.
https://httpbingo.org/response-headers?content-type=application/json-not
✔️ Expected: not matched. Actual: not matched.
https://httpbingo.org/response-headers?content-type=not-application/json-not
✔️ Expected: not matched. Actual: not matched.

My expectation is that the first 5 requests are matched (and the last three not), because these are application/json requests. Case 4 + 5 are however not matched by your current implementation.
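The gap between the two columns can be reproduced with a small model. This is a sketch, not the Chromium code: wholeValueMatch assumes surrounding whitespace is trimmed and the comparison is case-insensitive (consistent with cases 1 to 3 above), and mediaTypeMatch is a hypothetical matcher that compares only the part before the first semicolon.

```javascript
// Decoded content-type values from the eight test URLs above.
const cases = [
  "application/json",
  " application/json",
  "application/json ",
  "application/json; charset=utf-8",
  "application/json ; charset=utf-8",
  "not-application/json",
  "application/json-not",
  "not-application/json-not",
];

// Models the observed behavior: the whole value must equal an entry.
const wholeValueMatch = (headerValue, values) =>
  values.some((v) => headerValue.trim().toLowerCase() === v.toLowerCase());

// Models the expectation: only the media type (before ";") is compared.
const mediaTypeMatch = (headerValue, values) =>
  wholeValueMatch(headerValue.split(";")[0], values);

const actual = cases.map((c) => wholeValueMatch(c, ["application/json"]));
const expected = cases.map((c) => mediaTypeMatch(c, ["application/json"]));
console.log(actual);
console.log(expected);
```

The two result lists differ only at cases 4 and 5, the ones carrying a charset parameter.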

More examples

For a better picture, here are more examples:

Semicolons:

Content-Type: text/html; charset=utf-8
Content-Security-Policy: default-src 'self'; script-src 'nonce-xxx123'; frame-src 'none'
Set-Cookie: foo=bar; HttpOnly; max-age=3600

Semicolons as delimiter mixed with semicolons that are not a delimiter:

Sec-CH-UA: " Not A;Brand";v="99", "Chromium";v="96", "Google Chrome";v="96"
Content-Disposition: attachment; filename="hello; world.txt" (example link)

Commas:

Access-Control-Expose-Headers: name1,name2,name3
Cache-Control: max-age=604800, must-revalidate

Proposed enhancement

To support the use case, it should be possible to match a substring of header values. This can range from supporting wildcards or left/right anchoring (startsWith/endsWith) to full regular expressions. Another option could be to expose the parsed header values for specific headers (e.g. Content-Type).

Kelvin Jiang · Answer 14 · Tue Jun 11 2024 08:01:00 GMT+0800 (China Standard Time)

Hello,

The feature is now enabled by default on Canary and dev. Please try it out and reach out for any feedback, future enhancement ideas and bugs.

re: Rob, more flexible matching on header values is next on my plate: a similar issue has been raised and I can perhaps build off of that? Unfortunately more flexible matching will require the use of a new filter type (likely regex), mainly because in Chromium's HTTPResponseHeaders class, convenience methods which retrieve multiple values from a header assume the use of a comma as a delimiter. The addition of a regex value filter should be able to satisfy your use case?

potential schema:

  [nodoc] dictionary HeaderInfo {
    // The name of the header. This condition matches on the name
    // only if both `values` and `excludedValues` are not specified.
    DOMString header;
    // If specified, this condition matches if the header's value
    // contains at least one element in this list.
    DOMString[]? values;
    // If specified, this condition is not matched if the header
    // exists but its value contains at least one element in this list.
    DOMString[]? excludedValues;
   
    // If specified, this condition matches if the header's value
    // matches the provided regex string. Only one of this field,
    // or (values and excludedValues) may be specified.
    DOMString? regexValueFilter;  // NEW FIELD!
  };

Thanks,
Kelvin

Rob Wu · Answer 15 · Tue Jun 11 2024 20:46:27 GMT+0800 (China Standard Time)

re: Rob, more flexible matching on header values is next on my plate: a similar issue has been raised and I can perhaps build off of that? Unfortunately more flexible matching will require the use of a new filter type (likely regex), mainly because in Chromium's HTTPResponseHeaders class, convenience methods which retrieve multiple values from a header assume the use of a comma as a delimiter.

Your linked HttpResponseHeaders::GetNormalizedHeader method is not generic. In fact if you look at its implementation, it has a debug assertion (DCHECK) to make sure that the method is not used on some headers such as Set-Cookie (which is listed among my examples and #439).

HttpResponseHeaders::EnumerateHeader is the generic method that offers access to each individual value.

Chromium's DNR implementation currently uses HasHeaderValue, whose internal implementation also relies on EnumerateHeader.

Based on this, I think that there is no implementation constraint to use the same (consistent) value for matching by values / regexValueFilter. If there are multiple lines (such as Set-Cookie), then each line value would be matched separately.

The addition of a regex value filter should be able to satisfy your use case?

Yes, it would satisfy the Content-Type use case that is needed by PDF.js

    // If specified, this condition matches if the header's value
    // matches the provided regex string. Only one of this field,
    // or (values and excludedValues) may be specified.
    DOMString? regexValueFilter;  // NEW FIELD!

I suggest to not make them mutually exclusive, but rather stack its effect on top of values (still required), AND to make it possible to match by substring instead of an exact match. Otherwise the API design forces the use of regexps on every request. By enabling the two conditions to stack on each other, it is possible to have a fast pass to skip further (expensive regexp) matching, like this:

Content-Type contains "application/pdf"? Most likely not -> skip rest of logic.
Content-Type matches ^ *application/pdf($| *;) -> likely yes (this condition mainly exists to rule out not-application/pdf and application/pdfnotthis).
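The two-step check above can be sketched in plain JavaScript. isPdfContentType is a hypothetical helper; the regular expression is the one from this comment, and the case-insensitive flag is an assumption based on the case-insensitive matching described earlier in the thread.

```javascript
// Regex from the comment above; "i" flag is an assumption.
const PDF_PATTERN = /^ *application\/pdf($| *;)/i;

function isPdfContentType(contentType) {
  // Fast pass: a cheap substring check that rejects most responses.
  if (!contentType.toLowerCase().includes("application/pdf")) return false;
  // Slow pass: the regex rules out look-alikes such as
  // "not-application/pdf" and "application/pdfnotthis".
  return PDF_PATTERN.test(contentType);
}

console.log(isPdfContentType("application/pdf; charset=utf-8")); // true
console.log(isPdfContentType("application/pdfnotthis")); // false
```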

Oliver Dunk · Answer 16 · Tue Jun 11 2024 21:24:56 GMT+0800 (China Standard Time)

I had a discussion with @Rob--W on various ways to avoid doing a full string match on values. This would expand the number of use cases that could be solved without regular expressions.

Option 1: Splitting headers based on delimiters

Imagine a rule values: ["application/pdf"] and header Content-Type: application/pdf; charset=utf-8. Rather than checking that the header is an exact match to an item in values, we could split the header into ["application/pdf", "charset=utf-8"] and just check that each value in values is contained within this array.

Option 2: Substring match up to first delimiter

Imagine a rule values: ["application/pdf"] and header Content-Type: application/pdf; charset=utf-8. Trim the header to the first delimiter "application/pdf" and check that this includes each value in values as a substring.

Option 3: General substring match

Imagine a rule values: ["application/pdf"] and header Content-Type: application/pdf; charset=utf-8. Check that each value is included as a substring in the header.

Note: In option 1 and 2, we would need a mapping of headers to delimiters. In Chromium, the append header allow list contains some of this but it would need to be more complete.
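The three options can be compared with a short sketch. The function names are made up, a semicolon is assumed as the delimiter (a real implementation would need the per-header mapping mentioned in the note), and comparisons are assumed to be case-insensitive.

```javascript
// Option 1: split on the delimiter, every value must be one of the parts.
function matchSplit(headerValue, values, delimiter = ";") {
  const parts = headerValue.split(delimiter).map((p) => p.trim().toLowerCase());
  return values.every((v) => parts.includes(v.toLowerCase()));
}

// Option 2: substring match against the part before the first delimiter.
function matchFirstPart(headerValue, values, delimiter = ";") {
  const first = headerValue.split(delimiter)[0].trim().toLowerCase();
  return values.every((v) => first.includes(v.toLowerCase()));
}

// Option 3: substring match against the whole header value.
function matchSubstring(headerValue, values) {
  const whole = headerValue.toLowerCase();
  return values.every((v) => whole.includes(v.toLowerCase()));
}

const contentType = "application/pdf; charset=utf-8";
const setCookie = "foo=bar; Secure";
console.log(matchSplit(contentType, ["application/pdf"])); // true
console.log(matchSplit(setCookie, ["Secure"])); // true
console.log(matchFirstPart(setCookie, ["Secure"])); // false
```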

Comparison

Matching by Content-Type

As described, option 1 and 2 would support a condition for the application/pdf Content-Type (or similar use cases for JSON or epub extensions). Option 3 would provide an early exit for this but a regular expression would also be required to check the value is not actually notapplication/pdf for example.

Matching based on a specific cookie attribute

For the header Set-Cookie: foo=bar; Secure, the first option would be able to match with values: ["Secure"]. The second option would not be able to match based on this since the occurrence would be after the delimiter. Option 3 is not really appropriate as secure could occur in many parts of a Set-Cookie header (e.g. path) and not be an attribute on its own.

Matching based on cookie name

This is trivially possible with option 2 since it always appears at the start of a string, followed by an equals sign. It is not possible with option 1 or 3 since the name may appear elsewhere in the header and not actually be the cookie name.

Matching based on CSP

It is unlikely that any of these options are sufficient and a regular expression would likely be required. There are simply too many ways in which directives can be ordered.

Matching based on referrer header

This is a use case which has come up, particularly in order to trim the header and remove (for example) a path from the URL. However, to do replacement like this a regular expression would be required so this is unlikely to be addressed solely by any of the options.

Conclusion

I lean slightly towards option 1. Like the other options, it solves the use cases we know about, including application/pdf matching without requiring a regular expression. In the future, I could definitely see us getting requests to match against specific header attributes and this would support that.

In conversation with @Rob--W, I believe he prefers option 2 or 3. Option 3 partly addresses the application/pdf use case, allowing a way to do an initial check, and then requiring a regular expression in addition as described in the previous comment. Option 2 should be sufficient to match based on application/pdf without another condition. Notably, the implementation may be simpler as neither requires matching against multiple values and option 3 does not require an understanding of delimiters.

Daniel Jacobs · Answer 17 · Wed Jun 12 2024 01:20:09 GMT+0800 (China Standard Time)

Does anyone know what's causing this not to work on Chrome Dev:

ruffle-rs/ruffle@master...danielhjacobs:ruffle:swf-takeover-dnr

No URL seemed to redirect, though the only ones I tested were http://i.notdoppler.com/files/axon.swf and https://new.weedtowonder.org/w2w-home.swf. The axon file has the Content-Type header "application/vnd.adobe.flash.movie" and the W2W file has the Content-Type header "application/x-shockwave-flash", which I'd expect to both match rule 1.

With Chrome stable, every URL in existence redirects, since they all match the regexFilter ^.*$ and the responseHeaders RuleCondition is unsupported so ignored.

My first rule is as follows:

const playerPage = chrome.runtime.getURL("/player.html");
...
            {
                id: 1,
                action: {
                    type: chrome.declarativeNetRequest.RuleActionType.REDIRECT,
                    redirect: { regexSubstitution: playerPage + "#\\0" },
                },
                condition: {
                    regexFilter: "^.*$",
                    responseHeaders: [
                        {
                            header: "content-type",
                            values: [
                                "application/x-shockwave-flash",
                                "application/futuresplash",
                                "application/x-shockwave-flash2-preview",
                                "application/vnd.adobe.flash.movie",
                            ],
                        },
                    ],
                    resourceTypes: [
                        chrome.declarativeNetRequest.ResourceType.MAIN_FRAME,
                    ],
                },
            },

I'm probably just missing something.

Kelvin Jiang · Answer 18 · Wed Jun 12 2024 11:57:50 GMT+0800 (China Standard Time)

Hi... I apologize for this oversight but it should be enabled by default on canary and in the next dev build after crrev.com/c/5624732 lands.

For now, you'll still need to enable the DeclarativeNetRequestResponseHeaderMatching feature flag (i.e. --enable-features=DeclarativeNetRequestResponseHeaderMatching )

I tested your rule with the flag enabled and it works as intended

Daniel Jacobs · Answer 19 · Wed Jun 12 2024 21:23:08 GMT+0800 (China Standard Time)

I tested your rule with the flag enabled and it works as intended

I didn't include my second rule header before, but it was as follows:

            {
                id: 2,
                action: {
                    type: chrome.declarativeNetRequest.RuleActionType.REDIRECT,
                    redirect: { regexSubstitution: playerPage + "#\\0" },
                },
                condition: {
                    regexFilter: "^.*\\.s(?:wf|pl)(\\?.*|#.*|)$",
                    responseHeaders: [
                        {
                            header: "content-type",
                            values: [
                                "application/octet-stream",
                                "application/binary-stream",
                                "",
                            ],
                        },
                    ],
                    resourceTypes: [
                        chrome.declarativeNetRequest.ResourceType.MAIN_FRAME,
                    ],
                },
            },

The idea was to match any URL ending with .swf or .spl (before any query parameters or fragment identifiers) that has the Content-Type header "application/octet-stream", "application/binary-stream", or an empty string as the Content-Type header. I'll admit I'm not fully sure how common a URL with an empty string as its Content-Type header is, but I think it's technically possible. I get this error though from the unpacked extension:

Uncaught (in promise) Error: Rule with id 2 must specify a valid header value for "condition.responseHeaders" key

Is that because I'm using an empty string as a value? How can I do this?

Daniel Jacobs · Answer 20 · Wed Jun 12 2024 21:55:49 GMT+0800 (China Standard Time)

Also, is there a way to feature detect this?

For example, Mozilla just added the MAIN ExecutionWorld to the scripting API in Firefox 128, and I was able to feature-detect that support by checking (browser || chrome).scripting.ExecutionWorld.MAIN. This is possible to check since it's an ENUM (according to https://developer.chrome.com/docs/extensions/reference/api/scripting#type-ExecutionWorld), and when the enum contains MAIN is when the browser supports using content scripts with world "main".
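The enum-based check described above can be written defensively as a small helper. supportsMainWorld is a hypothetical name; the API object is passed in as a parameter so that the function does not depend on which global (browser or chrome) exists.

```javascript
// Returns true when the scripting API exposes ExecutionWorld.MAIN.
// Optional chaining keeps this safe when any level is missing.
function supportsMainWorld(api) {
  return api?.scripting?.ExecutionWorld?.MAIN !== undefined;
}

// In an extension, reading from globalThis avoids a ReferenceError in
// environments where `browser` is not declared:
//   supportsMainWorld(globalThis.browser ?? globalThis.chrome)
```

This works only because ExecutionWorld is exposed as an enum object at runtime; dictionary types such as RuleCondition have no such runtime object to probe.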

On the other hand, looking at https://source.chromium.org/chromium/chromium/src/+/main:extensions/common/api/declarative_net_request.idl;l=174, HeaderInfo is a dictionary definition which I can't check is supported. Even if a Rule defined by updateDynamicRules has an unsupported RuleCondition, Chrome still applies it, just ignoring the unsupported RuleCondition, so I can't just check if the Rule has a RuleCondition with a responseHeaders key and if it doesn't remove the Rule, since when I define such a condition it's unconditionally added to the dynamic rule even on browsers that ignore it.

Opened #638.

Daniel Jacobs · Answer 21 · Thu Jun 13 2024 02:56:46 GMT+0800 (China Standard Time)

Uncaught (in promise) Error: Rule with id 2 must specify a valid header value for "condition.responseHeaders" key

Is !value.empty() && from https://chromium.googlesource.com/chromium/src/+/master/extensions/browser/api/declarative_net_request/indexed_rule.cc#494 really what should be used?

https://www.rfc-editor.org/rfc/rfc9110.html#name-content-type says "A sender that generates a message containing content SHOULD generate a Content-Type header field in that message unless the intended media type of the enclosed representation is unknown to the sender. If a Content-Type header field is not present, the recipient MAY either assume a media type of "application/octet-stream" ([RFC2046], Section 4.5.1) or examine the data to determine its type."

Kelvin Jiang · Answer 22 · Thu Jun 13 2024 11:55:23 GMT+0800 (China Standard Time)

Hi Daniel,

re: feature detection: the current intention is for the feature to be available by default on canary/dev for testing, and it will be enabled in the next stable release (M128). Unfortunately we don't have a great way to detect said features in DNR unless the extension can access feature flag values from within Chrome. I do agree that a way for the extension to learn whether certain fields are supported would be helpful.

re: empty header value: the RFC passage quoted describes the absence of the Content-Type header rather than an empty value. That said, I can change that code to allow for empty header values (header is present, but value is empty), though ideally, header values shouldn't be empty when possible.

Thanks,
Kelvin
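
For readers following along, the "header is absent" case mentioned above could be expressed with the excludedResponseHeaders field from the proposal, along these lines. This is only a sketch: the function name, the regexFilter, the priority, and the block action are hypothetical placeholders, not the rule from this thread.

```javascript
// Hypothetical builder for a rule that matches .swf/.spl main-frame
// responses that carry no Content-Type header at all.
function buildMissingContentTypeRule(id) {
  return {
    id,
    priority: 1, // placeholder priority
    action: { type: "block" }, // placeholder action
    condition: {
      // Assumed filter: path ends in .swf or .spl before any query or fragment.
      regexFilter: "^[^?#]*\\.(swf|spl)([?#]|$)",
      resourceTypes: ["main_frame"],
      // A HeaderCondition with only a name matches when the header exists,
      // so listing it under excludedResponseHeaders excludes responses
      // that do have a Content-Type header.
      excludedResponseHeaders: [{ header: "content-type" }],
    },
  };
}
```

A rule like this would sit alongside a separate rule whose responseHeaders condition lists the explicit values.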

Daniel Jacobs · Answer 23 · Thu Jun 13 2024 19:53:23 GMT+0800 (China Standard Time)

Unfortunately we don't have a great way to detect said features in DNR

Yeah, this is what I'm asking about and why I opened #638. In the scripting API I can check chrome.scripting.ExecutionWorld.MAIN, but checking chrome.declarativeNetRequest.RuleCondition.responseHeaders isn't possible since chrome.declarativeNetRequest.RuleCondition isn't an enum like chrome.scripting.ExecutionWorld. The extension I want to add these rules to currently supports Chrome 87 and above, but the way these rules work on old Chrome versions would require upping that to Chrome 128, which is an unacceptable jump in requirements.

ideally, header values shouldn't be empty when possible.

This is true but I believe it's allowed by the spec and I can't control what pages my extension users visit.

Kelvin Jiang · Answer 24 · Sat Jun 15 2024 09:08:53 GMT+0800 (China Standard Time)

crrev.com/c/5634852 will allow for empty header values to be specified in response header conditions...

Perhaps not an ideal solution but checking the user agent of the browser for a Chrome version may help detect if response header conditions are supported?

@Rob--W @oliverdunk One solution for more flexible header value matching without resorting to regex could be to use the MatchPattern function for strings? It supports ? and * matches, so specifying substring matches (e.g. for certain directives on headers) is easy.

Let me know if this will satisfy your use cases, or if there are any I've missed that won't require regex.

Thanks,
Kelvin

Daniel Jacobs · Answer 25 · Sat Jun 15 2024 09:17:24 GMT+0800 (China Standard Time)

Perhaps not an ideal solution but checking the user agent of the browser for a Chrome version may help detect if response header conditions are supported?

Edit: Done by adding const chrome127OrLess = /Chrome\/([0-9]|[0-9][0-9]|1[0-1][0-9]|12[0-7])\./.test(navigator.userAgent); and only adding the rule if (chrome.declarativeNetRequest && !chrome127OrLess). I'd still prefer to feature-detect a WebExtensions addition in Chrome whose version requirement is strictly higher than that of the responseHeaders RuleCondition. That check may end up being typeof browser !== "undefined" if that gets added to Chrome soon (https://issues.chromium.org/issues/40556351).
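
The same user-agent check can also be written by parsing the major version as a number, which may be easier to adjust than the digit-range regex. A minimal sketch, assuming M128 is the first version with response header conditions as stated earlier in the thread; both function names are made up for illustration.

```javascript
// Hypothetical helper: returns the Chrome major version from a UA string,
// or null when the string has no "Chrome/<digits>." token.
function chromeMajorVersion(userAgent) {
  const match = /Chrome\/(\d+)\./.exec(userAgent);
  return match ? Number(match[1]) : null;
}

// Assumption from this thread: response header conditions ship in Chrome 128.
const RESPONSE_HEADERS_MIN_CHROME = 128;

// Mirrors !chrome127OrLess: browsers without a Chrome token are not ruled
// out here, so the caller still needs its own handling for them.
function supportsResponseHeaderConditions(userAgent) {
  const major = chromeMajorVersion(userAgent);
  return major === null || major >= RESPONSE_HEADERS_MIN_CHROME;
}
```

In an extension this would be called as supportsResponseHeaderConditions(navigator.userAgent) before adding the rule.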

Daniel Jacobs · Answer 26 · Sat Jun 15 2024 09:21:51 GMT+0800 (China Standard Time)

As I noted in https://issues.chromium.org/issues/347186592, Firefox won't register rules with an unsupported RuleCondition in updateDynamicRules, allowing me to add these rules in a try block without needing feature detection (as the code in the try block throws as desired). Even if that bug were fixed in Chrome, it wouldn't fix the behavior on older Chrome versions, which is where the user-agent sniffing or typeof browser !== "undefined" comes in.

Oliver Dunk · Answer 27 · Thu Jun 27 2024 17:40:28 GMT+0800 (China Standard Time)

@Rob--W In the last meeting, you mentioned that you'd need to profile before indicating Firefox's stance on using glob patterns for header matching. Would you be able to take a look and follow-up with your thoughts?

Rob Wu · Answer 28 · Mon Jul 01 2024 03:38:09 GMT+0800 (China Standard Time)

@Rob--W In the last meeting, you mentioned that you'd need to profile before indicating Firefox's stance on using glob patterns for header matching. Would you be able to take a look and follow-up with your thoughts?

Since the expected number of header matching rules is low, I don't expect a significant performance impact with the use of globs for matching headers.

Kelvin Jiang · Answer 29 · Wed Jul 10 2024 06:10:07 GMT+0800 (China Standard Time)

^Ack, glob support has been added in crrev.com/c/5671762
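
To illustrate the ? and * semantics discussed in this thread, here is a rough sketch of glob matching against a header value. It is not Chrome's MatchPattern implementation: the helper names are invented, and details such as case sensitivity are assumptions that may differ in the browser.

```javascript
// Hypothetical helper: converts a glob with * (any run of characters)
// and ? (exactly one character) into an anchored regular expression.
function globToRegExp(pattern) {
  // Escape regex metacharacters other than * and ?.
  const escaped = pattern.replace(/[.+^${}()|[\]\\]/g, "\\$&");
  const source = escaped.replace(/\*/g, ".*").replace(/\?/g, ".");
  return new RegExp(`^${source}$`, "s");
}

// Returns true when the whole header value matches the glob.
// Matching here is case-sensitive, which is an assumption of this sketch.
function globMatch(pattern, value) {
  return globToRegExp(pattern).test(value);
}

// Substring match for a directive inside a header value.
globMatch("*max-age=*", "public, max-age=3600");
```

Wrapping a directive in * on both sides gives the substring-style match mentioned above, without needing a regex in the rule.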