Suppose that you are doing some fancy analytics with R for your business users. How do you present and visualize the results? Do you build a Shiny appdio.com/, use an external dashboard application or do you export the data to a database for ingesting by a visualization tool?
With its November update, we can now add Microsoft Power BI to our R visualization options. In this post, we will show you how to use R to perform some basic computations and then present the results with Power BI.
In our opinion, Power BI fulfils some of the most important requirements for an R business intelligence visualization tool.
- You have the ability to schedule the execution of your R scripts, i.e. data refreshes.
- Your business users do not need to use yet another application, as the visualization is part of a comprehensive suite of business intelligence tools.
- Power BI contains many useful and nice looking visualizations out of the box, with many more available at the Power BI Visuals Gallery.
To demonstrate how to use R with Power BI, I downloaded the Rossmann dataset from Kaggle. This dataset contains sales data for 1 115 German drugstores that belongs to the retailer Rossmann. The dataset also includes data about the individual stores, including a distance measurement to the store’s nearest competitor. In a predictive model, we will probably get a better result if we use a distance banding metric (i.e. distance intervals) instead of the actual distance. I made a simple R script that performs this banding, and I want to make this banding available to Power BI.
Ingest the R data into Power BI
To make the results from the execution of the R-script available to Power BI, open Power BI Desktop and select Get Data > Others > R Script in the Home ribbon.
Next, you paste your R script inside the Execute R Script window. Note that Power BI can only read data from R that are stored within data frames. You can also specify the path to your R installation, which is very useful if you have multiple R installations. When you click OK, the script is executed locally on your computer. If your script contain errors, you are taken back to the dialogue windows.
After the script has executed successfully, you are taken to the data preview window. Here you select the data frames that you want to load into the Power BI model. Note that it is possible get an empty window if there is not any data available that can be loaded into the model. If this is the case, you need to start all over as there is not a back button to Execute R Script window. Unfortunately, as the execution ended successfully, you will also have to go to Recent Sources in the Home ribbon and remove the source that was just added. If not, you’re going to end up with a lot of R connectors that you can’t tell apart from each other.
As the R connector is still in beta, you are also presented with a disclaimer that basically tells you that the functionality might change over time. I think it is safe to assume that there won’t be any large changes in how the data from the R script execution are made available to the model. Changes are more likely to the input of the R script (e.g. by specifying an R file) or how the script is processed is much more likely.
That’s it! Your data is now accessible within your Power BI model and it’s no different from any other data. You perform any necessary ETL using Edit Queries, you can create relationships between data from different data frame or different data sources, and you can define new calculated columns or measures.
For this post, I just created a simple table to illustrate the distance banding that was computed in the R-script.
Schedule the execution of the R script
To schedule the execution of the R-script, you first need to publish your report to your Power BI site. After this, the R connection used in the report is available as dataset. To schedule a refresh, go to the dataset’s settings and select Schedule Refresh. Next, you simply toggle the Schedule Refresh button and add your preferences for the refresh. Done! Once a day, the gateway will now execute my R script and update the report on the Power BI site. Naturally, in a production environment, the execution should be performed on a computer that’s always on, instead of a laptop as in my case.
With the Power BI November update, we get access to powerful tool for sharing analytics and insights computed with R. Given the fast pace of Power BI development and the upcoming SQL Server 2016, I’m sure that we will see many more features during the next twelve month.