R For Analytics: A Beginner’s Guide, Part 1
Two years ago, or thereabouts: I was introduced in my statistics lectures to a programming language and software environment called R. I took to it immediately, preferring it by far to the simpler software we’d used before, and also to the proprietary software (SAS, SPSS) we used in addition.
Flash forward to:
A couple of months ago, sitting in the audience for the talks at MeasureFest, where we had what was, I think, by broad agreement, the best of them, by Ela Osterberger, about using R with the Google Analytics API. It was nice to see someone talking about this, as I’d spent a fair bit of time in the months leading up to it doing exactly that. The talk was excellent, and the slide design was sumptuous, but to the best of my knowledge, none of it was put up online.
Flash forward to:
Me, right now, writing this (I’m actually writing this on Friday but we’ll probably be publishing next week for better numbers). A little while ago, I was asked for some unsampled data from GA. Getting it can be quite annoying, especially when you bump up against the API limits, and running a million queries with the spreadsheets extension or query explorer is just a lengthy, tedious slog. However, there is an easier way!
R is incredibly powerful and, once you’ve got a handle on it, can be used for tasks both simple and complex. That said, it’s also free software, and suffers somewhat from the newcomer-hostility that free software is prone to. I learnt R through a command-line interface, but I already had a little bit of programming experience to smooth the way – some of the other people learning alongside me found it heavier going. However, it can, through judicious use of additional tools, be made much friendlier.
First, you need to download and install R itself from any of the innumerable mirrors, and then I’d suggest, to ameliorate some of that user-unfriendliness, that you download RStudio, an IDE (integrated development environment) that provides a graphical user interface. In fact, even if you don’t care*, it’s probably a good thing to have, as it makes a fair few things easier to do.
Once you’ve installed those two, you need to pick your R Google Analytics package. It may surprise you (it surprised me) to learn there are several of these. They are:
1). R Google Analytics
The package that was most easily discoverable when I was researching this a while back was R Google Analytics, developed by Google employees and so arguably the most “official”. It was rather better-documented when I found it than it is now, the user guide having for some reason been removed since (there’s still a step-by-step on Daniel Waisberg’s site). Regrettably, I’ve never been able to get this particular extension working as it should, especially with regard to paginating results so as to bypass the API query limits.
2). rga (lower-case)
rga is much better-documented, and seemed as though it would do everything I wanted it to. Regrettably, I couldn’t get it going (though I know several people who were able to, and swear by it, so your mileage may vary).
3). RGA (upper-case)
This one took me a little while to notice because, as you can imagine, given that the name is exactly the same but for case, it’s not the easiest thing to search for. This is the one that I’ve settled on for the moment, because it seems to be the one that (for me, anyway) allows the easiest access to the data, with the least fuss. It also lets you query the metadata, management, real-time reporting and multi-channel funnels APIs, so it has a great deal of versatility and additional functionality.
Once you’ve installed R and RStudio, you need to install RGA, which you do by entering into the RStudio console:
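A minimal sketch of that step, assuming RGA is still available from CRAN under exactly that (upper-case) name; if it isn’t, the install line will fail and you’d need to get the package from wherever its maintainers currently host it:

```r
# Assumption: the RGA package (upper-case) can be installed from CRAN.
# Only install if it is not already present on this machine.
if (!requireNamespace("RGA", quietly = TRUE)) {
  install.packages("RGA")
}

# Load the package so its functions are available in this session.
library(RGA)
```

Package names in R are case-sensitive, so "RGA" and "rga" are not interchangeable here.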
Next time, we’ll be looking at how to actually use the extension.
*though if that’s the case and you’re willing to tackle the command-line interface, this is probably a bit basic for you