RedHatProductSecurity / advisory-parser

A library for parsing security advisories

Chrome advisory parsing broken after Google blog post format change at the end of Oct 2019

thoger opened this issue · comments

The Google Chrome Releases blog recently changed its layout and can no longer be parsed by advisory-parser. Parsing fails with an error such as:

Could not parse public date (Beta Channel Update for Desktop) from https://chromereleases.googleblog.com/...

The following archive.org links can be used to compare how the formatting of the same post changed between Oct 22 and Oct 25:

https://web.archive.org/web/20191022191950/https://chromereleases.googleblog.com/2019/10/stable-channel-update-for-desktop_22.html
https://web.archive.org/web/20191025133128/https://chromereleases.googleblog.com/2019/10/stable-channel-update-for-desktop_22.html

The above error seems to be the direct consequence of having the post date moved before the post title (Stable Channel Update for Desktop).
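One way to make the date lookup independent of whether the date comes before or after the title is to scan the first few non-empty lines for anything that looks like a date, rather than expecting it at a fixed position. The following is only a rough sketch, not the code advisory-parser actually uses: the function name `find_public_date`, the `max_lines` cutoff, and the assumed date format ("Tuesday, October 22, 2019") are all illustrative assumptions, and the sample strings are made up.

```python
import re
from datetime import date

# Assumption: the post date is written in English as "<Month> <day>, <year>",
# optionally preceded by a weekday, e.g. "Tuesday, October 22, 2019".
MONTHS = {
    name: number
    for number, name in enumerate(
        ["January", "February", "March", "April", "May", "June",
         "July", "August", "September", "October", "November", "December"],
        start=1,
    )
}
DATE_RE = re.compile(r"\b(" + "|".join(MONTHS) + r")\s+(\d{1,2}),\s+(\d{4})\b")


def find_public_date(text, max_lines=10):
    """Return the first date found in the first few non-empty lines, or None.

    Hypothetical helper: it does not care whether the date line comes
    before or after the title line.
    """
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    for line in lines[:max_lines]:
        match = DATE_RE.search(line)
        if match:
            month, day, year = match.groups()
            return date(int(year), MONTHS[month], int(day))
    return None


# Made-up samples that mimic the two orderings described above.
old_layout = "Stable Channel Update for Desktop\nTuesday, October 22, 2019\nBody text."
new_layout = "Tuesday, October 22, 2019\nStable Channel Update for Desktop\nBody text."

print(find_public_date(old_layout))
print(find_public_date(new_layout))
```

Matching the month name by hand instead of using `strptime` with `%B` keeps the sketch independent of the process locale.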

An additional concern is that the end of the blog post is no longer detected correctly. The text 'Labels:\nStable updates' used to serve as the separator, but the text that appears now is 'Labels: Desktop Update, Stable updates'.
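If the parser were changed to cope with both variants, the end-of-post marker could be matched with a pattern that accepts the label list either on the line after 'Labels:' or on the same line. This is a sketch under that assumption only; `split_at_labels` and `LABELS_RE` are hypothetical names, not part of advisory-parser, and the sample texts are invented.

```python
import re

# Matches "Labels:" at the start of a line, followed by the label list either
# on the same line (new layout) or on the next line (old layout).
LABELS_RE = re.compile(r"^Labels:[ \t]*\n?[ \t]*(.*)$", re.MULTILINE)


def split_at_labels(text):
    """Split post text into (body, labels) at the first 'Labels:' line.

    Hypothetical helper: returns the whole text and an empty list
    when no 'Labels:' marker is present.
    """
    match = LABELS_RE.search(text)
    if match is None:
        return text, []
    body = text[:match.start()].rstrip()
    labels = [label.strip() for label in match.group(1).split(",") if label.strip()]
    return body, labels


# Made-up samples using the two separator variants quoted above.
old_text = "Advisory body.\nLabels:\nStable updates"
new_text = "Advisory body.\nLabels: Desktop Update, Stable updates"

print(split_at_labels(old_text))
print(split_at_labels(new_text))
```

Returning the labels as well as the body is a design choice of the sketch; a parser that only needs the cut-off point could ignore the second value.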

The chromereleases blog is now back to the old theme / layout, so parsing works again.

Oh, the joys of providing security advisories as blog posts...

@thoger wontfix/notabug then? Or do you think it's worth it to modify the parser to handle that one format as well just in case it pops up again in the future?

I do not dare to guess whether the new layout will be back, so I have no idea if adding support for it would be worth the effort. At this point, I'd wait: either keep this open, or close it, provided it can be re-opened later if needed.

I'll close this, and we can re-open it if they choose to switch to the new format again.