Getting Started with R Services in SQL Server 2016 – Part 1

Introduction

Advanced Analytics is becoming a norm now. Every organization wants to use the power of advanced analytics to predict the future outcome and take proactive corrective measures to remain competitive and win over its competitors. Some of the scenarios where predictive analysis can help to find future outcomes are:

  • Which stocks should we target as part of our portfolio management?
  • Did some stocks show haphazard behavior? Which factors are impacting the stock gains the most?
  • How and why are users of e-commerce platforms, online games, and web applications behaving in a particular way?
  • How do we optimize the routing of our fleet of vehicles based on weather and traffic patterns?
  • How do we better predict future outcomes based on an identified pattern, like predicting the remaining useful life of the machine or time to failure for a machine, etc.

Some of the examples in different industries where predictive analysis is playing or can play crucial roles are:

  • Marketing: Clickstream Analytics, Mix and Price Optimization, etc.
  • CRM: Campaign Optimization
  • Digital Media and Advertisement: Recommendation Engines, Ad Placement, etc.
  • Social Media:  Sentiment Analysis
  • Retail: Customer Churn, Purchase Prediction
  • Insurance: Fraud Waste and Abuse, Usage Based Insurance, etc.
  • Manufacturing: Predictive Maintenance, Demand Forecasting, etc.
  • Operations: Supply Chain Optimization
  • Finance: Credit Risk Analysis, Econometric Market Prediction, etc.
  • Life Sciences: Pharmacogenomics
  • Healthcare: Outcome Prediction
  • Health Insurance: Fraud Detection
  • Transportation: Routing optimization, Asset Utilization, etc.

It’s now becoming necessary for every organization to exploit the data that they have (or in some cases external data) and predict future outcomes, to remain competitive by offering their customers what they want, when they want and where they want. Looking at this potential, SQL Server 2016 brings native support doing advance analytics in the database itself using R Services.

In this article we will learn what R is, what on-premise advance analytics options from Microsoft are and how to get started using R Services with SQL Server 2016.

What is R?

R programming language is an open source, popular and powerful statistical programming language optimized for Statistical Analysis, Machine Learning and Data Science. It has an ever growing, vibrant community of developers and data scientists across businesses, academics, research organization, etc. This community has contributed more than 8000 free pre-built solutions or packages (algorithms, test data and evaluations) to CRAN (Comprehensive R Archive Network), which can be reused and leveraged across your projects easily and quickly. It also includes packages for creating varieties of data visualization for better data exploration.

R Programming Language
R Programming Language

R programming language is one of the fastest growing languages and now ranks at 6 in the IEEE Spectrum survey result published in July 2015.

What Are Advance Analytics Options with R?

No matter what platform you are targeting, Microsoft has Advance Analytics offerings for both on-premise as well as cloud platforms. In this article I will be talking about advance analytics offerings available on-premise (though there is some overlap; for example, you can use Microsoft R Server on your on-premise machine as well as in the cloud). To learn about Cortana Intelligence Suite, advance analytics offerings in the cloud from Microsoft, you can refer here.

Figure 1 - Microsoft Advance Analytics Offerings
Figure 1 – Microsoft Advance Analytics Offerings

Microsoft has broadly three advance analytics offerings for on-premises:

  • Community – Microsoft R Open
    • Free, enhanced and open source R distribution
    • 100% compatible with all R-related software
    • Faster performance with multi-threading
    • Available for Windows, Mac and Linux
  • Enterprise ready – Microsoft R Server
    • Secure, Scalable and Supported Distribution of R
    • With proprietary components created by Microsoft (Revolution Analytics company acquired by Microsoft) for scaling and better performance
    • For customers who have already have invested in storing data on non-Microsoft platforms like Hadoop, Teradata, and Linux
    • Commercial licensing and enterprise support
  • Enterprise ready – SQL Server 2016 R Services
    • Combining the powers of SQL Server and Microsoft R Server
    • For customers who have data in SQL Server and want to do database advance analytics without moving data elsewhere
    • Commercial licensing and enterprise support

Why Use SQL Server R Services

Microsoft R Server and SQL Server R Services are both enterprise ready and have been designed to scale and perform, although SQL Server R Services has these additional benefits:

  • Bring the compute to the data (reduce data movement and duplication). With in-database analytics you can reduce data movement. Everyone can work on the same copy of the data and the data stays in the database.
  • Key benefits of R Services is to make it easy to bring R applications to production through a familiar T-SQL interface. It is all about accelerating innovation: allowing you to build smart applications with R, and deploy them to production through standard T-SQL Interfaces.
  • Dividing work between different roles creates efficiency (database administrator focuses on managing data whereas data scientists can focus on creating predictive models).
  • Protecting important data is much easier with the built in security feature in SQL Server.
  • Scalability and resource management for ongoing computation of data between SQL Server and R Service.
  • Superior performance and scale for common operations, over open-source R while allowing the use of any R package of your choice (applies to Microsoft R Server as well).

Getting Started with R Services

SQL Server 2016 installation wizard has been enhanced to let you install R Services on your SQL Server instance. On the Feature Selection page of the wizard, you need to select R Services (In-Database), as shown in the figure below, which includes and installs Advance Analytics extension (that supports executing external scripts and processes) for integration of standard T-SQL statement with R language.

SQL Server 2016 Setup
SQL Server 2016 Setup

In case you want to automate the process of installing R Services on a SQL Server instance, you can use execute this command from the SQL Server installation media folder:

Setup.exe /q /ACTION=Install /FEATURES=SQL,AdvancedAnalytics /INSTANCENAME=MSSQLSERVER /SECURITYMODE=SQL /SAPWD="<password>" 
/SQLSYSADMINACCOUNTS="<username>" /IACCEPTSQLSERVERLICENSETERMS /IACCEPTROPENLICENSETERMS

You need to install the database engine on each instance where you will use R Services (In-Database) in SQL Server 2016 and that’s the reason I have used both, SQL and AdvanceAnalytics features to be installed in the above command.

If you are doing offline installation, you need to follow the instructions mentioned here.

So far, we have installed the components required for running R Services; now to enable the R Services feature itself you need to explicitly enable it with the following command before you can invoke R scripts in SQL Server.

--Execute this script to verify if R Services in enabled 
EXEC sp_configure  'external scripts enabled'
 

Verify R Services
Verify R Services

--Execute this script to enable R Services on the SQL Server   instance
EXEC sp_configure  'external scripts enabled',
RECONFIGURE WITH OVERRIDE
 
--Execute this script to verify if R Services in enabled
EXEC sp_configure  'external scripts enabled'
 

Verify R Services


Verify R Services 

Finally, you need to restart SQL Server service, which will automatically restart the related SQL Server Trusted Launchpad service (it runs advanced analytics extensions to enable integration with Microsoft R Open using standard T-SQL statements; disabling this service will make the Advanced Analytics features of SQL Server unavailable) as well for executing external R scripts.

SQL Server Configuration Manager
SQL Server Configuration Manager

At this time, if you execute the following command again, you will notice 1 for both config_value and run_value, which means you are not good to execute R script on this SQL Server instance:

--Execute this script to verify if R Services in enabled 
  EXEC sp_configure  'external scripts enabled'

Verify R Services
Verify R Services

SQL Server 2016 includes a new system stored procedure (sp_execute_external_script) to enable you to execute external scripts inside the database engine. The stored procedure loads and executes a script written in a supported language from an external location. In my subsequent article, I will discuss this stored procedure in detail, though here is an example of calling this stored procedure to execute R script, which fetches a sample Irish dataset and returns to SQL Server.

EXEC sp_execute_external_script  
       @language = N'R'  
       , @script = N'iris_data <- iris;'  
       , @input_data_1 = N''  
       , @output_data_1_name = N'iris_data'  
       WITH RESULT SETS 
       ((
              "Sepal.Length" float not null,   
              "Sepal.Width" float not null,  
              "Petal.Length" float not null,   
              "Petal.Width" float not null, 
              "Species" varchar(100)
       )); 

Results
Results

Conclusion

SQL Server 2016 brings native support to doing advanced analytics in the database itself using R Services. In this article I talked about what R is, what the on-premise advance analytics options from Microsoft are, and how to get started using R Services with SQL Server 2016. In next article we will look into executing R scripts from SQL Server Management Studio and other R client tools.

Resources

Set up SQL Server R Services

Data Science End-to-End Walkthrough

Getting Started with SQL Server R Services

See all articles by Arshad Ali

Arshad Ali
Arshad Ali
Arshad Ali works with Microsoft India R&D Pvt Ltd. He has 8+ years of experience, mostly on Microsoft Technologies. Most recently, as a SQL Developer and BI Developer he has been working on a Data Warehousing project. Arshad is an MCSD, MCITP: Business Intelligence, MCITP: Database Developer 2008 and MCITP: Database Administrator 2008 certified and has presented at several technical events including SQL-School. On an educational front, he has an MCA (Master in Computer Applications) and MBA-IT. Disclaimer : I work for Microsoft and help people and businesses make better use of technology to realize their full potential. The opinions mentioned herein are solely mine and do not reflect those of my current employer or previous employers.

Get the Free Newsletter!

Subscribe to Cloud Insider for top news, trends & analysis

Latest Articles