microcosm-cc / bluemonday

bluemonday: a fast golang HTML sanitizer (inspired by the OWASP Java HTML Sanitizer) to scrub user generated content of XSS

Home Page: https://github.com/microcosm-cc/bluemonday

How to get tel: links to not be removed?

clarencefoy opened this issue

I have tried:

bMN.AllowStandardURLs()
bMN.AllowAttrs("href").OnElements("a")
bMN.AllowAttrs("href").Matching(regexp.MustCompile(`tel:`)).OnElements("a")

But to no avail. If someone submits a link that is simply "tel:123456", how can I make sure the href attribute does not get deleted?

Many thanks

commented

https://go.dev/play/p/Z2PWy4-1rQx

package main

import (
	"fmt"

	"github.com/microcosm-cc/bluemonday"
)

func main() {
	// Do this once for each unique policy, and use the policy for the life of the program
	// Policy creation/editing is not safe to use in multiple goroutines
	p := bluemonday.UGCPolicy()
	p.AllowURLSchemes(`tel`)

	// The policy can then be used to sanitize lots of input and it is safe to use the policy in multiple goroutines
	html := p.Sanitize(
		`<a href="tel:+44.12345678">Call me</a> or <a href="https://example.org">Check my website</a>`,
	)

	// Output:
	// <a href="tel:+44.12345678" rel="nofollow">Call me</a> or <a href="https://example.org" rel="nofollow">Check my website</a>
	fmt.Println(html)
}
commented

Ultimately UGCPolicy() builds the allow list, and in doing so it calls a helper named AllowStandardURLs, defined at helpers.go:L128.

The helper itself has this:

// AllowStandardURLs is a convenience function that will enable rel="nofollow"
// on "a", "area" and "link" (if you have allowed those elements) and will
// ensure that the URL values are parseable and either relative or belong to the
// "mailto", "http", or "https" schemes
func (p *Policy) AllowStandardURLs() {
	// URLs must be parseable by net/url.Parse()
	p.RequireParseableURLs(true)

	// !url.IsAbs() is permitted
	p.AllowRelativeURLs(true)

	// Most common URL schemes only
	p.AllowURLSchemes("mailto", "http", "https")

	// For linking elements we will add rel="nofollow" if it does not already exist
	// This applies to "a" "area" "link"
	p.RequireNoFollowOnLinks(true)
}

It is AllowURLSchemes that permits the scheme (the protocol part) of a URL.

That func takes a list of schemes and appends them to the allow list.

So all that was missing was calling it once more to allow tel as well.
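
To make the append behaviour concrete, here is a minimal standard-library sketch of a scheme allow list. It is a simplified model and not bluemonday's actual code: the schemeAllowList type and its allow and permits methods are hypothetical names invented for illustration. It assumes the check works roughly as the helper's comments describe, i.e. a URL must parse and must be either relative or use an allowed scheme. Under that assumption, a regexp on the href attribute alone would not help, because the scheme check is separate.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// schemeAllowList is a hypothetical stand-in for the set of URL schemes
// that a sanitizer policy permits. It is not part of bluemonday.
type schemeAllowList map[string]struct{}

// allow adds schemes to the list. Calling it again appends to the
// existing entries rather than replacing them.
func (s schemeAllowList) allow(schemes ...string) {
	for _, scheme := range schemes {
		s[strings.ToLower(scheme)] = struct{}{}
	}
}

// permits reports whether rawURL parses and is either relative
// (no scheme) or uses a scheme that is on the allow list.
func (s schemeAllowList) permits(rawURL string) bool {
	u, err := url.Parse(strings.TrimSpace(rawURL))
	if err != nil {
		return false
	}
	if u.Scheme == "" {
		return true // relative URL
	}
	_, ok := s[u.Scheme] // url.Parse lowercases the scheme
	return ok
}

func main() {
	list := schemeAllowList{}

	// The same schemes that AllowStandardURLs enables.
	list.allow("mailto", "http", "https")
	fmt.Println(list.permits("tel:123456")) // false

	// One more call appends "tel" without disturbing the others.
	list.allow("tel")
	fmt.Println(list.permits("tel:123456"))          // true
	fmt.Println(list.permits("https://example.org")) // true
	fmt.Println(list.permits("javascript:alert(1)")) // false
}
```

In the real library the equivalent of the second allow call is p.AllowURLSchemes("tel") on the policy, as in the playground example above.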