AdguardTeam / ecsstree

Adblock Extended CSS supplement for CSSTree

Improve `:contains`

scripthunter7 opened this issue

commented

If the parentheses inside the argument are not balanced, the actual end of the pseudo-class cannot be determined unambiguously. For example:

  • :contains((aa)
  • :contains(aa))

In the above cases, the unbalanced parenthesis must be escaped as follows:

  • :contains(\(aa)
  • :contains(aa\))

But the following cases are fine, since balancing is fine:

  • :contains((aa))
  • :contains((aa)(bb)(cc)dd(ee))
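The balancing rule described above can be sketched roughly as follows. This is only an illustration and not ECSSTree's or CSSTree's actual code: `splitContains` and `ContainsSplit` are hypothetical names, and the sketch assumes that a backslash escapes exactly one following character.

```typescript
// Hypothetical helper for illustration; not part of ECSSTree or CSSTree.
interface ContainsSplit {
    argument: string; // text between the outer parentheses
    rest: string;     // whatever follows the closing parenthesis
}

function splitContains(selector: string): ContainsSplit | null {
    const prefix = ":contains(";
    const start = selector.indexOf(prefix);
    if (start === -1) return null;

    const open = start + prefix.length - 1; // index of the outer "("
    let depth = 0;
    for (let i = open; i < selector.length; i++) {
        const ch = selector[i];
        if (ch === "\\") {
            i++; // skip the escaped character, e.g. "\(" or "\)"
            continue;
        }
        if (ch === "(") {
            depth++;
        } else if (ch === ")") {
            depth--;
            if (depth === 0) {
                return {
                    argument: selector.slice(open + 1, i),
                    rest: selector.slice(i + 1),
                };
            }
        }
    }
    return null; // the outer "(" was never closed
}
```

With this sketch, `:contains((aa)` yields `null` because the outer parenthesis is never closed, while `:contains(aa))` yields the argument `aa` with a stray `)` left in `rest`. The escaped and the balanced variants come back with an empty `rest`.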

The following cases are currently problematic in ECSSTree (they throw a parsing error):

  • :contains('(aa')
  • :contains('aa)')

This is because ECSSTree overrides CSSTree's default algorithm when parsing :contains() and currently parses the argument ONLY as raw, without taking strings into account as CSSTree does. If we introduce string node handling here, the following cases must also be taken into account:

  • from :contains('aa"aa'), 'aa"aa' is parsed as a string node by CSSTree, but when generating, CSSTree always uses double quotes, so it generates this selector as :contains("aa\"aa") (reference)
  • :contains('aaa) - handle as unclosed string or parse 'aaa as raw?
  • :contains("aaa) - handle as unclosed string or parse "aaa as raw?
  • :contains('(aa) - invalid
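One possible shape for a quote-aware scan is sketched below. This is not the ECSSTree implementation: `scanContainsArgument`, `ScanResult` and `toDoubleQuoted` are hypothetical names. The sketch assumes that any unescaped quote starts a string. It only reports an unclosed string and leaves the "error or raw?" question above to the caller.

```typescript
// Hypothetical sketch; none of these names exist in ECSSTree or CSSTree.
type ScanResult =
    | { kind: "ok"; argument: string }
    | { kind: "unclosed-string" }
    | { kind: "unbalanced" };

// `openIndex` must point at the "(" that opens the pseudo-class argument.
function scanContainsArgument(input: string, openIndex: number): ScanResult {
    let depth = 0;
    let quote: string | null = null; // the quote character we are inside, if any
    for (let i = openIndex; i < input.length; i++) {
        const ch = input[i];
        if (ch === "\\") {
            i++; // an escaped character never opens or closes anything
            continue;
        }
        if (quote !== null) {
            if (ch === quote) quote = null; // closing quote found
            continue; // parentheses inside a string are ignored
        }
        if (ch === "'" || ch === '"') {
            quote = ch;
        } else if (ch === "(") {
            depth++;
        } else if (ch === ")") {
            depth--;
            if (depth === 0) {
                return { kind: "ok", argument: input.slice(openIndex + 1, i) };
            }
        }
    }
    return quote !== null ? { kind: "unclosed-string" } : { kind: "unbalanced" };
}

// Re-serializes string content with double quotes, escaping backslashes and
// double quotes, which mirrors the :contains("aa\"aa") output mentioned above.
function toDoubleQuoted(value: string): string {
    return '"' + value.replace(/[\\"]/g, "\\$&") + '"';
}
```

Under these assumptions, `:contains('(aa')` and `:contains('aa)')` both scan successfully, while `:contains('aaa)` and `:contains('(aa)` are reported as `unclosed-string`. One consequence of the quote assumption is that a lone apostrophe in raw text would also be flagged, so it would have to be escaped.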

See AdguardTeam/FiltersCompiler#156 (comment)

@slavaleleka What do you think about this?

commented

@slavaleleka It seems ExtendedCSS also throws an error for

  • :contains('(aa')
  • :contains('aa)')
  • :contains("(aa")
  • :contains("aa)")

But not for

  • :contains('(aa)')
  • :contains("(aa)")

commented

This should also be fixed:

:contains(/(\(| |^)EqG[) ]/)

(square brackets should also be included in the balancing when searching)
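As a rough illustration of the bracket handling suggested here, the sketch below ignores parentheses that appear between `[` and `]`. It is only one reading of "included in the balancing". `findClosingParen` and its `honorBrackets` flag are hypothetical names, and nested or unclosed brackets are not handled.

```typescript
// Hypothetical helper for illustration; not part of ECSSTree or ExtendedCSS.
// Returns the index of the ")" matching the "(" at `openIndex`, or -1.
function findClosingParen(
    input: string,
    openIndex: number,
    honorBrackets: boolean,
): number {
    let depth = 0;
    let inBrackets = false;
    for (let i = openIndex; i < input.length; i++) {
        const ch = input[i];
        if (ch === "\\") {
            i++; // skip the escaped character, e.g. "\("
            continue;
        }
        if (inBrackets) {
            if (ch === "]") inBrackets = false;
            continue; // parentheses inside [...] do not count
        }
        if (honorBrackets && ch === "[") {
            inBrackets = true;
        } else if (ch === "(") {
            depth++;
        } else if (ch === ")") {
            depth--;
            if (depth === 0) return i;
        }
    }
    return -1;
}
```

For the selector above, scanning from the outer `(` at index 9 stops at index 23 (the `)` inside `[) ]`) when `honorBrackets` is `false`, and at index 27 (the real end of the pseudo-class) when it is `true`.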