bbc / wraith

Wraith — A responsive screenshot comparison tool

Home Page: http://bbc-news.github.io/wraith/

Wraith spidering mode checks the entire website even though paths are specified in spider.yaml

khana25 opened this issue

I want the spider to crawl and check only the paths that come under '/new-homes', not every path on the website.

At the moment, it checks the entire website rather than only the paths under '/new-homes'.

My spider.yaml file is included below.


Reporting a problem? Please describe the issue above, and complete the following checklist so that we can help you more quickly.

Issue checklist:

  • I have validated my config file against YAML Validator to make sure it is valid YAML.

  • I have run the wraith info command and pasted the output below:

/new-homes
    /
    /new-homes/developments-by-county
    /my-home
    /my-home/sign-in
    /new-homes/forthcoming-developments
    /new-apartments
    /luxury-new-homes-apartments
    /the-buying-process/our-team
    /new-homes/completed-developments
    /the-buying-process/meeting-your-expectations
    /the-buying-process/step-by-step-process
    /the-buying-process/mortgage-payment-calculator
    /purchasing-schemes/the-schemes
    /purchasing-schemes/help-to-buy
    /the-berkeley-difference/the-difference
    /the-berkeley-difference/berkeley-overview
    /the-berkeley-difference/berkeley-approach
    /the-berkeley-difference/world-class-places
    /about-berkeley-group
    /investor-information
    /sustainability
    /media-centre
    /the-buying-process
    /the-buying-process/our-team
    /purchasing-schemes
    /purchasing-schemes/the-schemes
    /property-developers/berkeley
    /the-berkeley-difference
    /the-berkeley-difference/the-difference
    /property-developers/st-george
    /property-developers/st-edward
    /property-developers/st-james
    /property-developers/st-joseph
    /property-developers/st-william
    /the-queens-award
    /about-berkeley-group/our-vision
    /about-berkeley/careers
    /accessibility
    /sitemap
    /legal
    /privacy-policy
    /about-berkeley-group/contact-us
    /modern-slavery-statement
    /cookie-policy
    /new-homes/buckinghamshire/taplow/taplow-riverside
    /new-homes/london/tower-bridge/one-tower-bridge
    /new-homes/west-sussex/horsham/highwood
    /new-homes/london
    /new-homes/london/twickenham/brewery-gate
    /press-releases/2018/local-schoolchildren-enjoy-new-facilities
    /media-centre/press-releases
    /press-releases/2017/berkeley-named-one-of-britains-most-admired-companies
    /media-centre/press-releases
    /press-releases/2018/new-development-launches-in-blackheath
    /media-centre/press-releases
    /media-centre/press-releases
    /new-homes/berkshire
    /new-homes/buckinghamshire

  • I have run the command in verbose mode (by adding verbose: true to my config) and pasted the output below:
paste results here
  • [Y] I have pasted the contents of my config file below:
##############################################################
##############################################################
# This is an example configuration provided by Wraith.
# Feel free to amend for your own requirements.
# ---
# This particular config is intended to demonstrate how
# to use Wraith in 'spider' mode.
##############################################################
##############################################################


# Add as many domains as necessary. Key will act as a label
domains:
  my_site: 'https://www.berkeleygroup.co.uk/new-homes'

# Notice the absence of a `paths` property. When no paths are provided, Wraith defaults to
# spidering mode to check your entire website.

paths:
  new-homes: /

# A list of URLs to skip when spidering.
# Ruby regular expressions can be used, if prefixed with `!ruby/regexp` as defined in the YAML Cookbook
# See http://www.yaml.org/YAML_for_ruby.html#regexps

imports: "spider_paths.yml"

# the filename of the spider file to use. Default: spider.txt
spider_file: example_com_spider.txt

# the number of days to keep the site spider file
spider_days: 10

# amount of fuzz ImageMagick will use when comparing images. A higher fuzz makes the comparison less strict.
fuzz: '20%'

# the maximum acceptable level of difference (in %) between two images.
# Wraith considers it a failure if an image diff goes above this threshold.
threshold: 5

# screen widths (and optional height) to resize the browser to before taking the screenshot
screen_widths:
  - 320x568   # iPhone 5
  - 375x667   # iPhone 6/7/8
  - 414x736   # iPhone 6/7/8 Plus
  - 375x812   # iPhoneX
  - 768x1024  # iPad
  - 834x1112  # iPad 10.5
  - 1024x1366 # iPad 12.5
  - 2560x1440 # iMac
  - 1440x900  # Desktop
  - 1366x768  # Desktop
  - 1920x1080 # Desktop

# the engine to run Wraith with.
browser: "phantomjs"

# the directory that your latest screenshots will be stored in
directory: 'shots'

# choose how results are displayed in the gallery (default is `alphanumeric` if omitted)
# Different screen widths are always grouped together.
# Options:
#   alphanumeric - all paths (with or without a difference) are shown, sorted by path
#   diffs_first - all paths (with or without a difference) are shown, sorted by difference size (largest first)
#   diffs_only - only paths with a difference are shown, sorted by difference size (largest first)
mode: diffs_only
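
The config comments above mention a list of URLs to skip when spidering, which accepts Ruby regular expressions, but the pasted config does not actually define such a list. One possible way to get the behaviour described in this issue is a skip pattern that matches every path that does not start with '/new-homes'. The sketch below only checks such a pattern against a few paths from the spider output above. It assumes that the spider skips any link whose path matches a skip pattern; the constant name and the sample list are illustrative, not part of Wraith.

```ruby
# Candidate skip pattern: matches any path that is NOT '/new-homes'
# or something below '/new-homes/'.
# Assumption: the spider drops every link whose path matches this pattern.
SKIP_OUTSIDE_NEW_HOMES = %r{\A(?!/new-homes(?:/|\z))}

# A handful of paths copied from the spider output in this issue.
paths = [
  "/",
  "/new-homes",
  "/new-homes/developments-by-county",
  "/my-home",
  "/new-apartments",
  "/the-buying-process/our-team",
  "/new-homes/london/twickenham/brewery-gate"
]

# Keep only the paths the pattern would not skip.
kept = paths.reject { |path| path.match?(SKIP_OUTSIDE_NEW_HOMES) }

puts kept
```

If the pattern behaves as intended, it could be tried as a `!ruby/regexp` entry in the skip list of the config. In Wraith's stock spider example that list is, to my knowledge, the `spider_skips` key, so treat the key name as an assumption and check it against your Wraith version. Note that '/new-apartments' is skipped too, because the pattern requires the exact '/new-homes' segment.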