microcosm-cc / bluemonday

bluemonday: a fast golang HTML sanitizer (inspired by the OWASP Java HTML Sanitizer) to scrub user generated content of XSS

Home Page:https://github.com/microcosm-cc/bluemonday

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Double escaping attribute values

jtran opened this issue · comments

The values of attributes are getting double escaped. As far as I can tell, the problem was introduced in f0057e2. In particular, the double-quote character and non-breaking spaces won't ever make it through sanitizing.

In code, I expect the following test to pass.

func TestQuotesSanitization(t *testing.T) {
	tests := []test{
		{
			in:       `<p title="&quot;"></p>`,
			expected: `<p title="&quot;"></p>`,
		},
		{
			in:       `<p title="&nbsp;"></p>`,
			expected: `<p title="&nbsp;"></p>`,
		},
	}

	p := UGCPolicy()
	p.AllowAttrs("title").OnElements("p")

	// These tests are run concurrently to enable the race detector to pick up
	// potential issues
	wg := sync.WaitGroup{}
	wg.Add(len(tests))
	for ii, tt := range tests {
		go func(ii int, tt test) {
			out := p.Sanitize(tt.in)
			if out != tt.expected {
				t.Errorf(
					"test %d failed;\ninput   : %s\noutput  : %s\nexpected: %s",
					ii,
					tt.in,
					out,
					tt.expected,
				)
			}
			wg.Done()
		}(ii, tt)
	}
	wg.Wait()
}

However, I get this output.

=== RUN   TestQuotesSanitization
    sanitize_test.go:3713: test 1 failed;
        input   : <p title="&nbsp;"></p>
        output  : <p title="&amp;nbsp;"></p>
        expected: <p title="&nbsp;"></p>
    sanitize_test.go:3713: test 0 failed;
        input   : <p title="&quot;"></p>
        output  : <p title="&amp;quot;"></p>
        expected: <p title="&quot;"></p>
commented

Looks like I introduced a bug when attempting to fix HREF santization: f0057e2

However when I remove the double-sanitizing in that and test the output of the tests I wrote for that code... it doesn't pose a risk.

That is, removing the double escaping still results in sanitized and safe to use output.

As such, I'm going to remove the part of that prior commit that led to the double escaping.

@buro9 will you be releasing a new version of bluemonday or should we (gitea) just pull a head?


Ah I see you've already released 1.0.19 - Thanks!