Wednesday, 10 September 2014

Installing RMySQL in Windows to connect R with MySQL

If you're using the wonderful statistical computation tool R, and need to fetch data from or write data to MySQL databases (or run any MySQL query for that matter from the comfort of an R environment) you'll need to download some third party packages to do the same.

One of the most popular of those packages, is - no points for guessing - RMySQL - but installing it in a Windows environment isn't a very easy task. This is because unlike other R packages that can be downloaded and installed using the install.packages("packagename") from the R console, RMySQL isn't available as a precompiled .zip archive. It needs a certain Windows dynamically linked library (.dll) in order to work, and therefore, while it works out of the box in *nix OSes, it requires compiling on Windows.

Here's how you get around to making it work. This has been tested on a 64 bit Windows 8.1 computer.

First up, you need to have R installed on your system. I'm not sure about the exact versions of R that are compatible with RMySQL, but my guess is that it should work with most of the newer ones (mine, for example is R 3.1.1 - which was released in July 2014.) If you don't have R, you can always head over to the r-project page on CRAN to get it.

Next, you need to have MySQL Server running on your machine. You can get that here, if you don't have it already.

Next you need to install RTools. Download it from this page. I downloaded RTools31.exe (choose the version that is compatible with your version of R) While installing RTools, select the following check-boxes in the setup wizard.

R toolset
Cygwin DLLs
R 2.15.x+ toolchain

Keep the other two checkboxes unchecked. (This isn't mandatory, but you won't be needing them for this exercise)

In the next window, you'll be shown an awful lot of file paths. Keep the checkboxes checked. This will ensure that RTools sets its required system variables correctly.

The next window prompts you to edit the system path. You don't need to do anything here, just check and see that the following filepaths are indeed added to your system path.


Now you can proceed with RTools installation.

The next step is to set your set your MySQL home, in order to tell Windows and other applications that wish to connect with MySQL (like in our case, R) where your MySQL libraries are located. There are multiple ways of doing that, but the simplest and quickest way is to add a new system variable called MYSQL_HOME and set it to your MySQL Server installation folder. For me, that looked something like this:

The last step is simple, but extremely important. RMySQL won't work if you don't do this, and I split my hairs oover this one step because I just couldn't figure out why RMySQL wasn't installing correctly. Open the lib folder of your MySQL installation and copy paste the libmysql.dll file over to the bin folder. This is because RMySQL expects for some reason, that libmysql.dll file exists in the bin folder, and therefore cannot compile the source code if it's not there.

Once that's done, you're good to go. Start up R, (or restart if it was already running) and type in the following command into the console.

install.packages(“RMySQL”, type = “source”)

It'll show a lot of compilation steps, but it should complete correctly.

You will now be able to load this package using the usual library("RMySQL") command use it to connect to your local database.

Create a connection object :

con <- dbConnect(MySQL(), host="", port= 3306, user="username",
   password = "password", dbname="databasename")

And run any queries that you wish :

dbGetQuery(con, "select * from foo;")

That's it. I hope you find this tutorial useful.
Post a Comment