Friday, June 19, 2015

Alternative Versions of R

Fair warning: most of this post is specific to Linux users, and in fact to users of Debian-based distributions (e.g., Debian, Ubuntu or Mint). The first section, however, may be of interest to R users on any platform.

An alternative to "official" R

By "official" R, I mean the version of R issued by the R Foundation. Revolution Analytics has been a source of information, courses and a beefed-up commercial version of R (Revolution R Enterprise). They have also produced an open-source version, Revolution R Open. It is freely available from their Managed R Archive Network (MRAN). I'll leave the description of its features to them, and just mention the one that caught my eye: coupled with the Intel Math Kernel Library, it provides out-of-the-box multithreading, which should speed up long computational runs.

Revolution Analytics is now a Microsoft property, and I can't being to describe the cognitive dissonance caused by downloading open-source software from Microsoft. Nonetheless, I decided to install RROpen. One catch is that it requires reinstalling pretty much every R package I use, which can be a bit time consuming. The remainder of this post is about the other catch: how to get RROpen and "official" R (henceforth "R" without any qualifications) to coexist peacefully, on a Debian-based distribution (Mint in my case).

What's the problem?

RRO and R install in different directories, so why is anything special necessary to let them coexist? I found out the hard way. After installing RRO (on a system that already had R), I found that the command R in a terminal launched /usr/bin/R, which RRO had replaced with its R binary, making itself the default choice. That was fine for me. This morning, though, the system package manager notified me that the r-base package had been updated. I authorized the update, which among other things resulted in <sigh>/usr/bin/R being replaced by the "official" R version</sigh>.

That can be fixed manually from a terminal (as root), but it appeared that I was doomed to perpetually mediating between the two versions whenever the r-base package updated. This could be why Rich Gillin ("owner" of the Google+ Statistics and R community) recommended uninstalling R before installing RRO.

What's the solution?


Debian-based distributions of Linux use the Debian alternatives system to manage contention between different programs that have the same use. In essence, the system puts a link in a directory on the system command path that points not to a specific executable but to a directory containing choices for what program should answer to that name. Applied to our case, /usr/bin/R becomes a symbolic link to /etc/alternatives/R, which in turn is a symbolic link to the version of R you want to execute when you type R at a command prompt (or when, say, the RStudio IDE launches). In my case, that link points to /usr/lib64/RRO-3.2.0/R-3.2.0/lib/R/bin/R. (How would like to have to type that every time you wanted to run RRO?)

To manage alternatives, you can either use the update-alternatives command in a terminal or the Galternatives GUI for it. On my system, though, Galternatives does not entirely work. It lets you add alternatives once a base configuration is set, but it will not let you create (or delete) the initial configuration. So I went with the command line.

Assuming you have both RRO and R installed, here are the commands to set them up as alternatives:

sudo rm /usr/bin/R
sudo update-alternatives --install /usr/bin/R R /usr/lib64/RRO-3.2.0/R-3.2.0/lib/R/bin/R 200
sudo update-alternatives --install /usr/bin/R R /usr/lib/R/bin/R 100

The anatomy of the middle line is as follows:
sudo
Run this stuff as root.
update-alternatives
Run the configuration update tool.
--install
I want to install a new alternative for a program. Other choices include --configure to alter the default choice, --list to show alternatives for a program, and --display to show a list including priorities.
/usr/bin/R
Where the link to the various alternatives will go. This should be somewhere on your command path. A program (such as RStudio) trying to execute R with no path, or using the default path for R, will look here and be forwarded to the chosen version. Similarly, if you just type R in a terminal, this is the program that will run.
R
The name used in the alternatives directory for the symbolic link to the actual executable.
/usr/lib64/RRO-3.2.0/R-3.2.0/lib/R/bin/R
The executable to be run.
200
A priority value. The highest priority alternative becomes the default choice, unless you manually override it.
To confirm that it worked, try running update-alternatives --display R in a terminal (no need to sudo this one). It should list your two choices for R, with priorities, and indicate which is the default choice. For me, that is RRO. If you want "official" R to be the default choice, switch the priorities.

The first time I tried this (with /usr/bin/R being the RRO executable), I omitted the first line, and update-alternatives informed me that it had not created /usr/bin/R (presumably because it was reluctant to overwrite what was already there, despite running with root privileges). So I suggest deleting the current entry before setting up the alternatives. Also, as an aside, after running the first two lines, I was able to us Galternatives to enter the third one, and I can use Galternatives to change which is the default choice for R.

I tried reinstalling the r-base package after doing this, and the default R choice remained RRO, so I think this will survive upgrades to either system. We'll see.

No comments:

Post a Comment

If this is your first time commenting on the blog, please read the Ground Rules for Comments. In particular, if you want to ask an operations research-related question not relevant to this post, consider asking it on OR-Exchange.