ua-parser / uap-core

The regex file necessary to build language ports of Browserscope's user agent parser.

Break up into two lists

sandstrom opened this issue

How about breaking up the list of user agents into two lists:

  1. Major browsers and common tools
  2. Everything else

This would make it easier to prioritize issues and PRs, since parsing accuracy will always matter more for certain products. It would also let some implementation libraries load only the smaller subset, which would be much quicker.

A primary list could look like this:

  • Chrome
  • Safari
  • Samsung Internet
  • Edge
  • Firefox
  • Opera
  • IE
  • UC
  • GoogleBot
  • curl
  • wget
  • etc…

It would make sense for this list to have a hard cap of ~50 products and explicit inclusion criteria. For example, the criteria could be:

  • All browsers with > 1% global market share
  • Top 20 most common crawlers on the web
  • Top 20 most common software tools (scrapers, http libraries, etc)

Having two lists would allow some tools to only load one (much smaller) and return 'unknown browser' for everything else. This would be configurable in the parser, so all parsers would still support both data sets (full + limited).
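
To make the idea concrete, here is a rough sketch of what a configurable two-list parser might look like. Everything in it is an assumption for illustration: the `PRIMARY` and `SECONDARY` names, the `parse` function, its `full` flag, and the simplified patterns are all hypothetical and are not the regexes or API of uap-core or any existing port.

```python
import re
from typing import List, Optional, Tuple

# Hypothetical, heavily simplified patterns for illustration only.
# These are NOT the regexes shipped in uap-core.
Rule = Tuple[str, "re.Pattern[str]"]

# List 1: major browsers and common tools (the small, capped list).
PRIMARY: List[Rule] = [
    ("Edge", re.compile(r"Edg(?:e|A|iOS)?/(\d+)")),
    ("Samsung Internet", re.compile(r"SamsungBrowser/(\d+)")),
    ("Firefox", re.compile(r"Firefox/(\d+)")),
    ("Chrome", re.compile(r"Chrome/(\d+)")),
    ("curl", re.compile(r"^curl/(\d+)")),
    ("wget", re.compile(r"^Wget/(\d+)")),
]

# List 2: everything else (a single example entry here).
SECONDARY: List[Rule] = [
    ("Konqueror", re.compile(r"Konqueror/(\d+)")),
]


def parse(ua: str, full: bool = True) -> Tuple[str, Optional[str]]:
    """Return (family, major version); the first matching rule wins.

    With full=False only the primary list is consulted, and anything
    it does not recognize is reported as 'unknown browser'.
    """
    rules = PRIMARY + SECONDARY if full else PRIMARY
    for family, pattern in rules:
        match = pattern.search(ua)
        if match:
            return family, match.group(1)
    return "unknown browser", None


konqueror_ua = "Mozilla/5.0 (compatible; Konqueror/4.5; Linux) KHTML/4.5.4 (like Gecko)"
print(parse("curl/7.88.1", full=False))
print(parse(konqueror_ua, full=False))
print(parse(konqueror_ua, full=True))
```

One thing the sketch glosses over: matching here is order-dependent (first match wins), so moving a product from one list to the other could change which rule matches a given user agent. A real split would probably need to account for that.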

@commenthol let me know what you think!