wappalyzer / wappalyzer

Identify technology on websites.

Home Page:https://www.wappalyzer.com

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Does analyze() support DNS directly?

pmeenan opened this issue · comments

I'm in the process of adding DNS support to WebPageTest for the HTTP Archive (HTTPArchive/cwv-tech-report#24)

Does the analyze() call in wappalyzer.js uspport passing dns: directly or does DNS need to be resolved outside of the regular analyze flow? If it supports it, what format are the records supposed to be in?

I tried:

const dns = {
	"CNAME":["www.hostinger.com.cdn.cloudflare.net."],
	"NS":["any2.hostinger.com.","any1.hostinger.com."],
	"MX":["10 aspmx3.googlemail.com.","1 aspmx.l.google.com.","5 alt2.aspmx.l.google.com.","5 alt1.aspmx.l.google.com.","10 aspmx2.googlemail.com."],
	"TXT":["\"apple-domain-verification=IyFbOUpTx9DUOFwL\"","\"nordpass-domain-verification=6b627232b00e4e9ea70693c7994f2d50\"","\"v=spf1 ip4:31.220.23.4 include:_spf.google.com include:amazonses.com include:_spf.hostedemail.com include:_spf.psm.knowbe4.com -all\"","\"google-site-verification=MOjKs17dYrFXyEPndU4bK505my3D0dyC63-c5mvaNGU\"","\"mailru-verification: a8a9886e0072b036\"","\"google-site-verification=4EfGmYRIEIPWA_ACJsA5zFGUzzY1pa8Du2tiHb8EKuI\""],
	"SOA":["any1.hostinger.com. dns.hostinger.com. 2021102522 10800 3600 604800 3600"]};
result = await Wappalyzer.analyze({
	...
	dns: dns
	})

but it doesn't seem to be able to resolve the technologies that rely on DNS. In this case, the technology detection I'd expect to trigger is:

  "Hostinger": {
    "cats": [
      88
    ],
    "description": "Hostinger is an employee-owned Web hosting provider and internet domain registrar.",
    "dns": {
      "SOA": "\\.(?:dns-parking|hostinger)\\.com"
    },
    "icon": "Hostinger.svg",
    "pricing": [
      "low",
      "recurring"
    ],
    "website": "https://www.hostinger.com"
  },

I tried having each record as a string, concatenating with a space but that didn't seem to help. Do the values need to be massaged (stripping the expiration times) or otherwise tweaked or do I need to implement the DNS resolution logic outside of the analyze path like has to be done for JS and DOM?

Got it working with the node cli using -p so hopefully that's enough for me to figure out how to wire it up.

Lowercase keys (cname, mx, etc) did it. Should be working in the HTTP Archive crawl now.

I'm working at Hostinger, a quick comment: you can distinguish if the site is hosted under Hostinger using HTTP headers:

platform: hostinger or server: hcdn.

Related: #7186