t1gor / Robots.txt-Parser-Class

PHP class for parsing robots.txt

Possible to unintentionally override method setUserAgent()

JanPetterMG opened this issue

Continuation from #68 (comment)

$parser->setUserAgent($string) is no longer ignored, but whenever a user-agent is passed (not null) to one of the following methods, it permanently overrides the user-agent value previously set:

  • $parser->getDelay(...)
  • $parser->getRules(...)
  • $parser->isAllowed(...)
  • $parser->isDisallowed(...)

Code to reproduce. Example: check whether your bot is denied while others are let in.

$content = <<<TXT
User-agent: *
Allow: /
User-agent: myBot
Disallow: /
TXT;

$parser = new RobotsTxtParser($content);
$parser->setUserAgent('myBot');

if ($parser->isDisallowed('/')) {
    // we're not allowed in

    if ($parser->isAllowed('/', '*')) {
        // Most others are let in, but not me...
    }
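    // ISSUE: the isAllowed('/', '*') call above internally calls
    // $parser->setUserAgent('*'), when it shouldn't. From here on the
    // parser checks rules for '*' instead of 'myBot'.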
}

There are two solutions I can think of:

  • Remove the user-agent input parameter from the four methods in the next major version
  • Somehow make it possible to override the previously set user-agent for a single call only
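
The second option could look roughly like the sketch below. ScopedUserAgentParser is a hypothetical stand-in class, not the library's code: its rules are a hard-coded map of path prefixes, and it has no fallback to `*` for unlisted user-agents. The only point it illustrates is that the optional $userAgent argument is resolved per call and never written back to the stored value.

```php
<?php
// Hypothetical stand-in for the parser, NOT the library's implementation.
// Rules are passed in as a plain array to keep the sketch self-contained.
final class ScopedUserAgentParser
{
    /** @var array<string, string[]> user-agent => disallowed path prefixes */
    private array $disallow;

    /** User-agent set via setUserAgent(); only that method may change it. */
    private string $userAgent = '*';

    public function __construct(array $disallow)
    {
        $this->disallow = $disallow;
    }

    public function setUserAgent(string $userAgent): void
    {
        $this->userAgent = $userAgent;
    }

    public function getUserAgent(): string
    {
        return $this->userAgent;
    }

    public function isDisallowed(string $path, ?string $userAgent = null): bool
    {
        // One-time override: use the argument for this call only and
        // leave $this->userAgent untouched.
        $effective = $userAgent ?? $this->userAgent;

        foreach ($this->disallow[$effective] ?? [] as $prefix) {
            if (strpos($path, $prefix) === 0) {
                return true;
            }
        }
        return false;
    }

    public function isAllowed(string $path, ?string $userAgent = null): bool
    {
        return !$this->isDisallowed($path, $userAgent);
    }
}

// Same scenario as the reproduction above: everyone is allowed except myBot.
$parser = new ScopedUserAgentParser(['*' => [], 'myBot' => ['/']]);
$parser->setUserAgent('myBot');

var_dump($parser->isDisallowed('/'));   // checked for myBot
var_dump($parser->isAllowed('/', '*')); // checked for '*', this call only
var_dump($parser->getUserAgent());      // still 'myBot'
```

With this shape, the nested isAllowed('/', '*') check from the reproduction no longer changes which user-agent later calls are evaluated for. The same per-call resolution would have to be applied to getDelay() and getRules() as well.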