snarfed / granary

💬 The social web translator

Home Page: https://granary.io

Infer feed title from page title?

singpolyma opened this issue

https://granary.io/url?input=html&output=atom&url=https%3A%2F%2Fsingpolyma.net%2Fcategory%2Ftech%2F

The atom:title is just the entire text of the webpage... seems probably not the desired outcome ;)

hah, funny. that's just mf2util.interpret_feed(parsed, url).get('name'). i'll take a look at what it's doing.

title=mf2util.interpret_feed(parsed, url).get('name'),

whee, looks like this is actually implied name parsing and mf2py again. if an h-feed has no p-name, mf2py returns its entire text, including children, in the name property. ugh.

i guess i'll just ellipsize it.
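For illustration, ellipsizing an overlong implied name could look roughly like the sketch below. This is not granary's actual code: the `ellipsize` helper and its `max_chars` parameter are hypothetical names, and the 140-character default is an arbitrary assumption.

```python
def ellipsize(text, max_chars=140):
    """Hypothetical helper: shorten text to roughly max_chars characters.

    Collapses runs of whitespace first, then cuts at the last space that
    falls inside the limit, so the result does not end mid-word.
    """
    text = ' '.join(text.split())  # collapse newlines and repeated spaces
    if len(text) <= max_chars:
        return text
    # Drop the trailing fragment after the last space; if there is no
    # space at all, keep the hard cut.
    truncated = text[:max_chars].rsplit(' ', 1)[0]
    return truncated + '...'
```

This only hides the symptom: the title is still taken from the page body, just shorter.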

In this case there is no h-feed at all, so the feed itself is being implied as well. What's the right place to raise issues about h-feed design here? It should probably have a "use page title" fallback rule.

This particular case is technically a bug in mf2py, since implied name shouldn't be used for backcompat: http://microformats.org/wiki/microformats2-parsing#parsing_for_implied_properties

But even a page with a top-level h-entry will have this issue.

Oh, it's here:

If no "h-feed" nor "hfeed" element is found, however multiple top-level h-entry elements (explicit or backcompat) are found, implementations may use:

- top level h-entry elements as items in a synthetic h-feed.
- <title> of the page or the URL of the page as p-name

http://microformats.org/wiki/h-feed#Parser_Compatibility
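A minimal sketch of that fallback order, using only the Python standard library. The names `TitleParser` and `feed_title` are hypothetical, and the sketch assumes the caller passes `None` for `explicit_name` when the feed had no explicit p-name; detecting an implied name in the parser output is outside its scope.

```python
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Hypothetical helper: collects the text inside the page's <title>."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title_parts = []

    def handle_starttag(self, tag, attrs):
        if tag == 'title':
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == 'title':
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title_parts.append(data)


def feed_title(explicit_name, html, url):
    """Pick a feed title: explicit p-name, else page <title>, else the URL."""
    if explicit_name:
        return explicit_name
    parser = TitleParser()
    parser.feed(html)
    # Join the collected pieces and collapse whitespace.
    title = ' '.join(''.join(parser.title_parts).split())
    return title or url
```

For example, `feed_title(None, page_html, page_url)` would return the page's `<title>` text for a page with no explicit feed name, and fall back to `page_url` only if the page has no title either.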