extractus / feed-extractor

Simplest way to read & normalize RSS/ATOM/JSON feed data

Home Page: https://extractor-demos.pages.dev/feed-extractor

Empty description when content is wrapped in CDATA

lwojcik opened this issue

Hi!

When I pass a feed whose content is wrapped in CDATA tags, the normalized feed entry contains an empty description.

Sample feeds:

For now I use a dirty workaround using getExtraEntryFields and some custom code to process HTML:

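getExtraEntryFields: (feedEntry) => {
	// stripAndTruncateHTML is the custom HTML-processing code mentioned above
	const cdataDescription = feedEntry.description.includes("<![CDATA[")
	  ? stripAndTruncateHTML(
	      // Remove the CDATA wrapper and keep only the content inside it
	      feedEntry.description
	        .replaceAll("<![CDATA[", "")
	        .replaceAll("]]>", ""),
	      siteConfig.maxPostLength
	    )
	  : "";

	return { cdataDescription };
}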
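
For anyone who wants to try the workaround in isolation, here is a minimal self-contained sketch of the same idea. The helper stripAndTruncateHTML and the siteConfig value below are hypothetical stand-ins (the real ones are not shown in this thread), and unwrapCdata is a name introduced only for illustration. It assumes feedEntry.description holds the raw description string, as the snippet above does.

```javascript
// Hypothetical stand-in for the custom stripAndTruncateHTML helper;
// the real helper is not shown in the thread.
function stripAndTruncateHTML(html, maxLength) {
  const text = html.replace(/<[^>]*>/g, "").replace(/\s+/g, " ").trim();
  return text.length > maxLength ? text.slice(0, maxLength) + "..." : text;
}

// Assumed value; the thread only names siteConfig.maxPostLength.
const siteConfig = { maxPostLength: 40 };

// Unwrap every CDATA section, keeping its inner text (illustrative helper).
function unwrapCdata(raw) {
  return raw.replace(/<!\[CDATA\[([\s\S]*?)\]\]>/g, "$1");
}

function getExtraEntryFields(feedEntry) {
  // Guard against entries that have no string description at all.
  const raw =
    typeof feedEntry.description === "string" ? feedEntry.description : "";
  const cdataDescription = raw.includes("<![CDATA[")
    ? stripAndTruncateHTML(unwrapCdata(raw), siteConfig.maxPostLength)
    : "";
  return { cdataDescription };
}

const sample = { description: "<![CDATA[<p>Hello <b>world</b></p>]]>" };
console.log(getExtraEntryFields(sample)); // { cdataDescription: 'Hello world' }
```

Matching the opening and closing markers as a pair, rather than removing them with two separate replaceAll calls, leaves a stray "]]>" outside a CDATA section untouched.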

Also - do you have a donation link or something? I'd love to buy you a coffee because this project ROCKS. ❤️

@lwojcik thank you for pointing out the issue.

I've investigated and found the cause. It is in the buildDescription function, at the step where we strip HTML tags from the description text.

I'm going to fix it ASAP.
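
To illustrate how tag stripping can produce an empty description, here is a rough sketch. It is not the library's actual buildDescription code; naiveStripTags and stripTagsCdataAware are hypothetical names, and the regex-based stripper is only an assumption about how such a step might behave.

```javascript
// Hypothetical naive tag stripper; NOT the actual buildDescription code.
function naiveStripTags(text) {
  return text.replace(/<[^>]*>/g, "").trim();
}

// Unwrap CDATA first so its payload is not mistaken for one big tag.
function stripTagsCdataAware(text) {
  const unwrapped = text.replace(/<!\[CDATA\[([\s\S]*?)\]\]>/g, "$1");
  return naiveStripTags(unwrapped);
}

const wrapped = "<![CDATA[Plain text summary]]>";

// The whole CDATA section starts with "<" and ends with ">", so the naive
// pattern treats it as a single tag and removes everything.
console.log(JSON.stringify(naiveStripTags(wrapped))); // ""

// Unwrapping before stripping keeps the text.
console.log(JSON.stringify(stripTagsCdataAware(wrapped))); // "Plain text summary"
```

Under this assumption, the description comes out empty whenever the CDATA payload contains no ">" character of its own.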

Regarding the donation, thank you for the compliment. I will add my PayPal link to the README in the next update.

Works like a charm. Thank you very much!

@lwojcik glad it worked for you, and thank you very much for the gift 🥮