nayzo / NzoGrabberBundle

Symfony2/3/4 Bundle used to Crawl and to Grab all types of links and Tags (img, js, css) from any website

NzoGrabberBundle

The NzoGrabberBundle is a Symfony bundle used to crawl any website and grab all types of links, URLs, and tags (img, js, css) from it.

Features include:

  • Compatible with Symfony 3 & 4
  • URL grabber/crawler for HTTP/HTTPS
  • URL grabber/crawler for HREF / SRC / IMG types
  • Exclude any type of file by extension
  • Prevent specified URLs from being grabbed
  • Compatible with PHP 5 & 7

Installation

Install the bundle through Composer:

$ composer require nzo/grabber-bundle

Register the bundle in app/AppKernel.php (Symfony V3):

// app/AppKernel.php

public function registerBundles()
{
    return array(
        // ...
        new Nzo\GrabberBundle\NzoGrabberBundle(),
    );
}
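
For Symfony 4, bundles are normally registered in config/bundles.php instead of app/AppKernel.php, and Symfony Flex typically adds the entry automatically when the package is required. If the entry is missing, a manual registration would look roughly like the fragment below. It reuses the bundle class name from the Symfony 3 example above; enabling it for all environments is an assumption, not something this README specifies.

```php
// config/bundles.php

return [
    // ...
    // Assumption: enable the bundle in every environment.
    Nzo\GrabberBundle\NzoGrabberBundle::class => ['all' => true],
];
```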

Usage

In a controller, fetch the Grabber service and pass the options you need:

Get all URLs:

    public function indexAction($url)
    {
        $tableOfUrls = $this->get('nzo_grabber.grabber')->grabUrls($url);

        //....
    }

OR .. get all URLs without recursion:

    public function indexAction($url)
    {
        $tableOfUrls = $this->get('nzo_grabber.grabber')->grabUrlsNoRecursive($url);

        //....
    }

OR .. get all URLs that do not appear in the exclude array:

    public function indexAction($url)
    {
        $notScannedUrlsTab = ['http://www.exemple.com/about'];
        $tableOfUrls = $this->get('nzo_grabber.grabber')->grabUrls($url, $notScannedUrlsTab);

        //....
    }

OR .. exclude URLs that contain a specified text and also select by file extension:

    public function indexAction($url)
    {
        $exclude = 'someText_to_exclude';
        $tableOfUrls = $this->get('nzo_grabber.grabber')->grabUrls($url, null, $exclude, array('png', 'pdf'));

        //....
    }
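
To make the two filters concrete, here is a standalone sketch in plain PHP that does not call the bundle at all. The function name filterUrls is hypothetical, and the sketch assumes that the exclude text is matched as a simple substring and that the extension array acts as an allow-list, as "select by file extension" suggests. The bundle's actual matching rules may differ. The nullable type hint needs PHP 7.1 or later.

```php
<?php

/**
 * Hypothetical helper, NOT part of NzoGrabberBundle: illustrates one
 * plausible reading of the "exclude text" and "extensions" arguments.
 */
function filterUrls(array $urls, ?string $exclude, array $extensions): array
{
    $kept = [];
    foreach ($urls as $url) {
        // Drop URLs that contain the excluded text (plain substring match).
        if ($exclude !== null && $exclude !== '' && strpos($url, $exclude) !== false) {
            continue;
        }

        // Read the extension from the URL path only, ignoring any query string.
        $path = (string) parse_url($url, PHP_URL_PATH);
        $ext = strtolower(pathinfo($path, PATHINFO_EXTENSION));

        // Assumption: a non-empty extension list keeps only matching files.
        if ($extensions !== [] && !in_array($ext, $extensions, true)) {
            continue;
        }

        $kept[] = $url;
    }

    return $kept;
}
```

Calling filterUrls($urls, 'someText_to_exclude', array('png', 'pdf')) on a list of already collected URLs would then keep only the .png and .pdf URLs that do not contain the excluded text.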

OR .. get all URLs selected by file extension:

    public function indexAction($url)
    {
        $tableOfUrls = $this->get('nzo_grabber.grabber')->grabUrls($url, null, null, array('png', 'pdf'));

        //....
    }

OR .. get all Img Files from the specified URL:

    public function indexAction($url)
    {
        $img = $this->get('nzo_grabber.grabber')->grabImg($url);

        //....
    }

OR .. get all Js Files from the specified URL:

    public function indexAction($url)
    {
        $js = $this->get('nzo_grabber.grabber')->grabJs($url);

        //....
    }

OR .. get all Css Files from the specified URL:

    public function indexAction($url)
    {
        $css = $this->get('nzo_grabber.grabber')->grabCss($url);

        //....
    }

OR .. get all Css, Img and Js Files from the specified URL:

    public function indexAction($url)
    {
        $extrat = $this->get('nzo_grabber.grabber')->grabExtrat($url);

        //....
    }

License

This bundle is under the MIT license. See the complete license in the bundle: Resources/doc/LICENSE
