mbaersch / url-cleaner

Google Tag Manager Custom Variable Template for URL cleanup with whitelisting for parameters and optional value redaction with regexes

URL Cleaner

Custom Variable Template for Google Tag Manager

URL cleanup with black- or whitelisting for parameters and optional value redaction with regexes


Note: this template does not need any permissions.

Usage

  • Define any URL as the input value. Select "Whitelist" and define a set of allowed parameters. All other parameters and their values are removed from the URL query string. Alternatively, a "Blacklist" can be defined in the same way; in this case, only blacklisted parameters are removed.

Update: Redact Path

In June 2023, a third option, "Redact Path", was added. You can use it to remove certain paths from URLs if they match any of the patterns or search terms provided in the redaction list.

Tip: If "Redact Path" is selected, parameters are not adjusted. If you need to change both path and parameters, use the result of a first URL Cleaner variable as the input for a second one.

  • The remaining parameter values can optionally be checked and redacted by using one or more regular expressions.

  • If you want to lowercase the result, check the option in the variable settings (default is "off").

The return value is the cleaned full URL, the path with parameters, or the parameter string only.
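The whitelist and blacklist behaviour described above can be sketched in a few lines. This is a minimal illustration in plain JavaScript, not the template's actual code (which runs in GTM's sandboxed JavaScript). The function name `filterParams`, the `mode` argument and the sample URL are assumptions made for this example, as is the choice to drop the "?" when no parameter remains.

```javascript
// Hypothetical helper, not taken from the template source.
// Keeps (whitelist) or removes (blacklist) query parameters by exact name.
function filterParams(url, list, mode) {
  const queryStart = url.indexOf("?");
  if (queryStart === -1) return url; // no query string, nothing to clean

  const base = url.slice(0, queryStart);
  const kept = url
    .slice(queryStart + 1)
    .split("&")
    .filter((pair) => {
      const name = pair.split("=")[0];
      const listed = list.includes(name);
      // Whitelist keeps only listed names, blacklist removes them.
      return mode === "whitelist" ? listed : !listed;
    });

  // Assumption: an empty query string is dropped together with the "?".
  return kept.length > 0 ? base + "?" + kept.join("&") : base;
}

const sampleUrl =
  "https://www.domain.com/path?utm_source=news&id=42&email=a%40b.com";

const whitelisted = filterParams(sampleUrl, ["utm_source", "id"], "whitelist");
const blacklisted = filterParams(sampleUrl, ["email"], "blacklist");

console.log(whitelisted);
console.log(blacklisted);
```

Both calls remove the `email` parameter here: the whitelist because it is not on the list, the blacklist because it is.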

"Partial Match And RegEx" Option

When comparing parameter keys with white- or blacklist entries, you can optionally enter only a common part of a set of parameters (like utm_) or use regular expressions.

When active, the template uses a JavaScript "match" (https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/match) to determine if a parameter is to be white- or blacklisted. When inactive, a parameter name must be exactly equal to an entry on the list to be handled by the code.

Example: using a blacklist on https://www.domain.com/?foo=1&bar=0&find=PII+here&find2=keep with find on the list results in

https://www.domain.com/?foo=1&bar=0

If you use ^find$ instead, only the find parameter is deleted and find2 is kept:

https://www.domain.com/?foo=1&bar=0&find2=keep
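The difference between a partial entry and an anchored one can be reproduced with `String.prototype.match`. The following is a sketch in plain JavaScript, not the template's code; `isListed` and `applyBlacklist` are hypothetical names used only for this example.

```javascript
// Hypothetical helpers illustrating the "Partial Match And RegEx" option.
// With partialMatch enabled, each list entry is used as a pattern for
// String.prototype.match; otherwise names must equal an entry exactly.
function isListed(paramName, entries, partialMatch) {
  return entries.some((entry) =>
    partialMatch ? paramName.match(entry) !== null : paramName === entry
  );
}

function applyBlacklist(url, entries, partialMatch) {
  const [base, query = ""] = url.split("?");
  const kept = query
    .split("&")
    .filter(
      (pair) =>
        pair !== "" && !isListed(pair.split("=")[0], entries, partialMatch)
    );
  return kept.length > 0 ? base + "?" + kept.join("&") : base;
}

const exampleUrl =
  "https://www.domain.com/?foo=1&bar=0&find=PII+here&find2=keep";

// "find" matches both "find" and "find2".
const partial = applyBlacklist(exampleUrl, ["find"], true);
// "^find$" matches only the parameter named exactly "find".
const anchored = applyBlacklist(exampleUrl, ["^find$"], true);
// With the option off, "find" is compared for equality.
const exact = applyBlacklist(exampleUrl, ["find"], false);

console.log(partial);  // https://www.domain.com/?foo=1&bar=0
console.log(anchored); // https://www.domain.com/?foo=1&bar=0&find2=keep
console.log(exact);    // https://www.domain.com/?foo=1&bar=0&find2=keep
```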

Special Functions

Redacting Parameters Without Removing

If you want to redact parameters without removing them (for controlling purposes or other reasons), you can use the "Redact Parameters" option and define a parameter name instead of a regex. When a list entry uses the special format below, the template does not match it as a RegEx against parameter values but compares it with parameter names.

In order to use this, add parameter names in the following format to the list:

%%parametername%%

By surrounding a parameter name with double "%" signs, any matching parameter name in the URL will be kept in the result, but the value gets replaced with the defined text just as if a normal RegEx had matched the parameter value. Note: this option is case-insensitive, so %%something%% would catch "Something" as well as "something", "someThing" or any other format. There must be a complete match (no RegEx or partial match here).
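To illustrate how the two kinds of list entries differ, here is a minimal sketch in plain JavaScript. It is not the template's implementation: `redactParams`, the sample query string and the replacement text "[redacted]" are assumptions for this example, and so is the choice to test regex entries against the raw (still URL-encoded) value.

```javascript
// Hypothetical helper, not taken from the template source.
// Entries of the form %%name%% are compared with parameter names
// (case-insensitive, full match). All other entries are treated as
// regular expressions and tested against the parameter value.
function redactParams(query, entries, replacement) {
  return query
    .split("&")
    .map((pair) => {
      const eq = pair.indexOf("=");
      if (eq === -1) return pair; // no value to redact

      const name = pair.slice(0, eq);
      const value = pair.slice(eq + 1);

      const hit = entries.some((entry) => {
        const named = /^%%(.+)%%$/.exec(entry);
        if (named) {
          return named[1].toLowerCase() === name.toLowerCase();
        }
        return new RegExp(entry).test(value);
      });

      // The parameter itself stays in place; only the value is replaced.
      return hit ? name + "=" + replacement : pair;
    })
    .join("&");
}

const redacted = redactParams(
  "Something=secret&phone=0123456789&page=2",
  ["%%something%%", "^\\d{6,}$"],
  "[redacted]"
);

console.log(redacted);
// Something=[redacted]&phone=[redacted]&page=2
```

`Something` is caught by name despite the different casing, `phone` is caught because its value matches the regex, and `page` is left untouched.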

Using Comma-separated List Items

You can define multiple comma-separated values like param1, param2, param3 in one list entry instead of creating three separate list items for black- and whitelists. This allows dynamic definition of blacklists in a separate variable that uses different sets of list items for different consent conditions.
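As a rough sketch of how such entries could be expanded, the snippet below splits every list item on commas before use. It is plain JavaScript and not the template's code; `expandEntries`, the `consentGiven` flag and the parameter names are hypothetical, and trimming whitespace around each value is an assumption.

```javascript
// Hypothetical helper, not taken from the template source.
// Turns list items such as "param1, param2, param3" into single entries.
function expandEntries(listItems) {
  return listItems
    .flatMap((item) => item.split(","))
    .map((entry) => entry.trim()) // assumption: surrounding spaces are ignored
    .filter((entry) => entry.length > 0);
}

// A separate variable could return a different string per consent state.
const consentGiven = false;
const dynamicItem = consentGiven ? "email" : "email, gclid, fbclid";

const blacklist = expandEntries([dynamicItem, "sessionid"]);

console.log(blacklist);
// [ 'email', 'gclid', 'fbclid', 'sessionid' ]
```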

About

License:Apache License 2.0
