RLesur / crrri

A Chrome Remote Interface written in R

Home Page: https://rlesur.github.io/crrri/

Is it possible to extract data from a Power BI dashboard using the crrri package?

covid19ec opened this issue

Hi all, I hope you are well. I am trying to extract some data from a Power BI dashboard. The dashboard has three pages, and the data is on the last page, inside a plot. This is the dashboard I am trying to scrape:

https://app.powerbi.com/view?r=eyJrIjoiMTkwNTZjZmEtNDJkYi00MmI3LThlZmYtZjViMDVmYTk1NTJiIiwidCI6IjJmYzgyYWFkLWYyMjUtNDM0OS04YjliLTg0MTZhNGFmNGQ3ZiJ9&pageName=ReportSection5e050ac003d0b042a320

It looks like this. The main issue is that, to reach the final page, I need to click the Siguiente ("Next") button (circled in red):

[image]

Then, on the second page, there is a similar Siguiente button that I need to click:

[image]

After clicking that button, I finally arrive at the final page. The data I need is in the TOTAL DOSIS SEGÚN CANTÓN plot:

[image]

To get the data, I need to right-click the plot and choose the Show as table option:

[image]
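
For reference, the right-click step might be scripted with RSelenium roughly as follows. This is a minimal sketch, not a tested solution: `right_click()` is a hypothetical helper, `element` is assumed to be a webElement already located for the plot, and whether these mouse commands work may depend on the driver version in use.

```r
# Hypothetical helper: right-click a page element through RSelenium.
# `remDr` is assumed to be an open RSelenium remoteDriver and `element`
# a webElement previously returned by remDr$findElement().
right_click <- function(remDr, element) {
  # Move the mouse pointer over the target element
  remDr$mouseMoveToLocation(webElement = element)
  # Click at the current pointer position; buttonId 2 is the right button
  remDr$click(buttonId = 2)
  invisible(element)
}
```

The Show as table entry in the context menu would still have to be located and clicked separately.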

After clicking that option in the pop-up menu, I see this:

[image]

The data I need is at the bottom, below the plot (the three columns). I have had trouble obtaining the data because it is difficult to identify the Siguiente buttons and then right-click the plot to show it as a table. I tried to sketch some code using RSelenium, but I am not able to locate the buttons to click. Here is the code I have used:

library(dplyr)
library(purrr)
library(readr)
library(wdman)
library(RSelenium)
library(xml2)
library(selectr)

# use wdman to start a Selenium server
selServ <- selenium(
  port = 4444L,
  version = 'latest',
  chromever = '91.0.4472.101'
)
# using RSelenium to start chrome on the selenium server
remDr <- remoteDriver(
  remoteServerAddr = 'localhost',
  port = 4444L,
  browserName = 'chrome'
)
# open a browser session (a new Chrome window)
remDr$open()
# navigate to the site you wish to analyze
report_url <- "https://app.powerbi.com/view?r=eyJrIjoiMTkwNTZjZmEtNDJkYi00MmI3LThlZmYtZjViMDVmYTk1NTJiIiwidCI6IjJmYzgyYWFkLWYyMjUtNDM0OS04YjliLTg0MTZhNGFmNGQ3ZiJ9&pageName=ReportSection5e050ac003d0b042a320"
remDr$navigate(report_url)
# find and click the button labelled Siguiente
NexBtn <- remDr$findElement(
  using = "xpath",
  value = './/button[descendant::span[text()="Siguiente"]]'
)
NexBtn$clickElement()


The last two lines of code did not work because I do not know how to locate the Siguiente buttons.
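
One possible cause, assuming the report is rendered by JavaScript after the initial page load, is that the button does not exist yet when `findElement()` runs, or that it is not a `<button>` element at all. The sketch below polls for a match instead of looking once, and matches on visible text rather than tag name. Both `wait_for_elements()` and `xpath_by_text()` are hypothetical helpers, not part of RSelenium.

```r
# Poll `find` until it returns at least one element or the timeout expires.
# `find` is any zero-argument function returning a list of elements, e.g.
#   function() remDr$findElements(using = "xpath", value = "...")
wait_for_elements <- function(find, timeout = 30, interval = 0.5) {
  deadline <- Sys.time() + timeout
  repeat {
    # Treat lookup errors as "nothing found yet" and keep polling
    found <- tryCatch(find(), error = function(e) list())
    if (length(found) > 0) return(found)
    if (Sys.time() >= deadline) {
      stop("no matching element appeared within ", timeout, " seconds")
    }
    Sys.sleep(interval)
  }
}

# Build an XPath matching any element whose own text contains `label`,
# without assuming the control is a <button>.
# Note: `label` must not contain a double quote.
xpath_by_text <- function(label) {
  sprintf('//*[contains(normalize-space(text()), "%s")]', label)
}

# Usage against a live session (not run here):
# hits <- wait_for_elements(function() {
#   remDr$findElements(using = "xpath", value = xpath_by_text("Siguiente"))
# })
# hits[[1]]$clickElement()
```

`findElements()` is used rather than `findElement()` because it returns an empty list instead of raising an error when nothing matches, which makes polling simpler.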

Is it possible to extract this data using the crrri package? Any help is welcome.
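
To make the question concrete: since crrri exposes the Chrome DevTools Protocol, one idea would be to click the button from inside the page with JavaScript. The sketch below only builds the JavaScript expression. `click_by_text_js()` is a hypothetical helper, the choice of `button, span, div` as candidate tags is a guess about the report's markup, and it is an untested assumption that a scripted `click()` triggers the dashboard's navigation.

```r
# Hypothetical helper: build a JavaScript expression that clicks the first
# element whose trimmed text equals `label`. The expression evaluates to
# true if an element was clicked and false otherwise.
click_by_text_js <- function(label) {
  # Keep the generated JS string literal safe: reject quotes and backslashes
  if (grepl('"', label, fixed = TRUE) || grepl("\\", label, fixed = TRUE)) {
    stop("label must not contain double quotes or backslashes")
  }
  sprintf(
    paste0(
      "(function () {",
      " var nodes = document.querySelectorAll('button, span, div');",
      " for (var i = 0; i < nodes.length; i++) {",
      "  if (nodes[i].textContent.trim() === \"%s\") {",
      "   nodes[i].click(); return true;",
      "  }",
      " }",
      " return false;",
      "})()"
    ),
    label
  )
}

# With crrri (not run here), the expression could presumably be passed to
# the DevTools Runtime domain once the report has loaded, where `client`
# is assumed to be a connected crrri client:
#   client$Runtime$evaluate(expression = click_by_text_js("Siguiente"))
```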