ycphs / openxlsx

openxlsx - a fast way to read and write complex xlsx files

Home Page:https://ycphs.github.io/openxlsx/

Saving HTML tables (rvest) as Excel files

Mkranj opened this issue

I'm downloading an HTML table with the rvest package. Currently I convert it to a regular data frame and then save that as .xlsx. However, the table has many merged cells. When it is converted to a data frame, every position a merged cell spans gets filled with that cell's text, which produces many duplicates.
Is there a way to save an HTML table directly as an Excel file? Since both Excel and openxlsx support merged cells, the output could stay true to the original. I believe this would be a very useful feature :)
From what I've tried, the table that rvest returns is an xml_node object.

Thanks for the great work!
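
One possible workaround, sketched under a few assumptions: instead of going through a data frame, walk the `<tr>` rows of the xml_node, read each cell's `rowspan` and `colspan` attributes, write the text once at the cell's top-left position, and merge the spanned range with `mergeCells()`. The helpers `span_attr()`, `html_table_layout()`, and `write_html_table()` are hypothetical names made up for this illustration and are not part of openxlsx or rvest. The sketch assumes rvest 1.0 or later, a table without nested tables, and it writes every value as text.

```r
library(rvest)
library(openxlsx)

# Hypothetical helper: read a rowspan/colspan attribute, defaulting to 1.
span_attr <- function(node, name) {
  v <- suppressWarnings(as.integer(html_attr(node, name)))
  if (is.na(v) || v < 1L) 1L else v
}

# Hypothetical helper: place every <th>/<td> on a grid, one row per HTML cell,
# keeping the span sizes instead of duplicating the text.
html_table_layout <- function(table_node) {
  rows <- html_elements(table_node, "tr")
  taken <- character()  # "row col" keys of grid positions already covered
  out <- list()
  for (r in seq_along(rows)) {
    cells <- html_elements(rows[[r]], xpath = "./th | ./td")
    col <- 1L
    for (i in seq_along(cells)) {
      # Skip positions covered by a rowspan from an earlier row.
      while (paste(r, col) %in% taken) col <- col + 1L
      rs <- span_attr(cells[[i]], "rowspan")
      cs <- span_attr(cells[[i]], "colspan")
      covered <- expand.grid(row = r:(r + rs - 1L), col = col:(col + cs - 1L))
      taken <- c(taken, paste(covered$row, covered$col))
      out[[length(out) + 1L]] <- data.frame(
        row = r, col = col, rowspan = rs, colspan = cs,
        text = html_text2(cells[[i]]), stringsAsFactors = FALSE
      )
      col <- col + cs
    }
  }
  do.call(rbind, out)
}

# Hypothetical helper: write the layout to a workbook and merge spanned cells.
write_html_table <- function(table_node, file, sheet = "Sheet1") {
  layout <- html_table_layout(table_node)
  wb <- createWorkbook()
  addWorksheet(wb, sheet)
  for (i in seq_len(nrow(layout))) {
    cell <- layout[i, ]
    writeData(wb, sheet, cell$text, startCol = cell$col, startRow = cell$row)
    if (cell$rowspan > 1L || cell$colspan > 1L) {
      mergeCells(wb, sheet,
                 cols = cell$col:(cell$col + cell$colspan - 1L),
                 rows = cell$row:(cell$row + cell$rowspan - 1L))
    }
  }
  saveWorkbook(wb, file, overwrite = TRUE)
  invisible(layout)
}

# Small made-up table with one rowspan and one colspan.
page <- minimal_html('<table>
  <tr><th rowspan="2">Region</th><th colspan="2">Sales</th></tr>
  <tr><th>Q1</th><th>Q2</th></tr>
  <tr><td>North</td><td>10</td><td>20</td></tr>
</table>')
out_file <- tempfile(fileext = ".xlsx")
layout <- write_html_table(html_element(page, "table"), out_file)
```

For a real page, the table node would come from something like `html_element(read_html(url), "table")`. Because all values are written as text, numeric columns would need to be converted before writing if they should behave as numbers in Excel.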