Big Data with R - Exercise book

Learn how to use R with Hive, SQL Server, Oracle, and other scalable external data sources, as well as Big Data clusters, in this two-day workshop. We will cover how to connect, retrieve schema information, upload data, and explore data that lives outside of R. For databases, we will focus on the dplyr, DBI and odbc packages. With these packages we can write the same dplyr verbs we use on local data; they are translated into SQL queries and sent to the database to run there. For Big Data clusters, we will also learn how to use the sparklyr package to run models inside Spark and return the results to R. We will review recommendations for connection settings, security best practices and deployment options. Throughout the workshop, we will take advantage of the new data connections available with the RStudio IDE.
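
As a rough illustration of the database workflow described above (connect, upload data, retrieve schema information, and let dplyr verbs run as SQL), here is a minimal sketch. It assumes an in-memory SQLite database (via the RSQLite package) as a stand-in for a real server, so that it runs without any external setup; against SQL Server, Oracle or Hive the connection would typically be opened with the odbc package instead. The table name `mtcars` and the variable names are chosen only for this example.

```r
library(DBI)
library(dplyr)

# Assumption: in-memory SQLite stands in for a real database server.
# With odbc, the connection would be opened with dbConnect(odbc::odbc(), ...).
con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Upload a local data frame as a database table
dbWriteTable(con, "mtcars", mtcars)

# Retrieve schema information
dbListTables(con)
dbListFields(con, "mtcars")

# Lazy reference to the remote table; no rows are pulled into R yet
cars_db <- tbl(con, "mtcars")

# Ordinary dplyr verbs, recorded as a query rather than run in R
avg_mpg <- cars_db %>%
  group_by(cyl) %>%
  summarise(avg_mpg = mean(mpg, na.rm = TRUE)) %>%
  arrange(cyl)

# Print the SQL that dplyr generated for the pipeline
show_query(avg_mpg)

# Run the query in the database and bring only the result back into R
result <- collect(avg_mpg)

dbDisconnect(con)
```

The pipeline itself does no work until `collect()` is called, so the aggregation happens inside the database and only the summarised rows travel back to R.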