R For Analytics: A Beginner’s Guide, Part 2
IMPORTANT: THE PACKAGE HAS BEEN UPDATED BUT ADAM HASN’T HAD TIME TO UPDATE THIS YET! PROCEED AT YOUR PERIL!
Last week, I discussed R briefly, and why it might be useful to people doing analytics work. I also had a quick look at the libraries that would allow you to interface R with Google Analytics. At this point, you should have R and RStudio installed on your computer, and also have installed the RGA package on your R setup. If you haven’t done that yet, go back to the previous entry and do that first.
Next, we need to authorise the package to access our GA data. Go to the Google Developers Console and set up a new project (it doesn’t really matter what it’s called). Select the “APIs” option from the “APIs & auth” menu on the left-hand side and enable the Analytics API (click on it, turn it “on”). Then go to “Credentials” (below “APIs” on the left-hand menu), click “Create new Client ID”, select “Installed application” and click “Create ID”. This is a bit of a bother, and the Developers Console interface can be a bit sluggish and weird, but it’s preferable to all the messing around with temporary tokens that expire after an hour, and that kind of nonsense, that you have to do with some of the other libraries.
Now you’ve got the client ID and client “secret”, you can authorise… sorry, authorize* the application. Open RStudio, click on the “Packages” tab in the bottom-right window if it’s not already selected, and tick the box next to “RGA”. Then give it access to your GA account by entering the following code into the R console:
authorize(client.id = "your client id", client.secret = "your client secret")
(quotes included). This will open an OAuth page in your browser asking if you want to allow the app to access your Google Analytics data (you do, obviously).
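If you’d rather not paste your client secret into the console (or, worse, save it in a script), one option is to keep both values in environment variables, for instance in your .Renviron file, and read them from there. What follows is only a rough sketch: the helper read_ga_credentials() and the variable names GA_CLIENT_ID and GA_CLIENT_SECRET are my own inventions rather than anything RGA provides, and the authorize() call is the same one shown above.

```r
# Hypothetical helper (not part of RGA): read the OAuth credentials from
# environment variables. The variable names are an assumption; use your own.
read_ga_credentials <- function(id_var = "GA_CLIENT_ID",
                                secret_var = "GA_CLIENT_SECRET") {
  creds <- list(
    client.id     = Sys.getenv(id_var),
    client.secret = Sys.getenv(secret_var)
  )
  # Sys.getenv() returns "" for anything that isn't set, so flag the gaps.
  missing <- names(creds)[!nzchar(unlist(creds))]
  if (length(missing) > 0) {
    stop("Missing credentials: ", paste(missing, collapse = ", "))
  }
  creds
}

# Only start the OAuth flow in an interactive session with RGA installed.
if (interactive() && requireNamespace("RGA", quietly = TRUE)) {
  creds <- read_ga_credentials()
  RGA::authorize(client.id = creds$client.id,
                 client.secret = creds$client.secret)
}
```

Calling RGA::authorize() with the package prefix means it works whether or not you have ticked the box in the Packages tab.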
Now you have access to the API with R, you can begin to pull in data.
get_accounts: This is a pretty basic command – it lists all the accounts to which you have access.
get_accounts(start.index = NULL, max.results = NULL, verbose = getOption("rga.verbose", FALSE))
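The verbose = getOption("rga.verbose", FALSE) bit looks more cryptic than it is. It just means “use whatever the rga.verbose option is set to, or FALSE if nobody has set it”. Here is a small sketch in plain R, with no API call needed, showing how that default behaves. The variable names old and verbose_now are mine, purely for illustration.

```r
# Remember the current setting, then turn verbose output on for this session.
old <- options(rga.verbose = TRUE)

# This is the same expression the function uses as its default for `verbose`:
# it returns the option if it is set, otherwise the fallback value FALSE.
verbose_now <- getOption("rga.verbose", FALSE)
print(verbose_now)

# While the option is set, a bare get_accounts() call would use verbose = TRUE
# without you having to pass the argument.

# Put the option back the way it was.
options(old)
```

Since every argument has a default, you can also leave the lot out and just call get_accounts() once you’ve authorised.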
You can also get profiles (views), segments and goals, using similar commands:
get_profiles(account.id = "~all", webproperty.id = "~all", start.index = NULL, max.results = NULL, verbose = getOption("rga.verbose", FALSE))
get_segments(start.index = NULL, max.results = NULL, token, verbose = getOption("rga.verbose", FALSE))
get_goals(account.id = "~all", webproperty.id = "~all", profile.id = "~all", start.index = NULL, max.results = NULL, verbose = getOption("rga.verbose", FALSE))
For these, the things you can tweak are pretty similar: account.id, webproperty.id and profile.id for some of them, which just means that you can get only the results for the accounts, properties and views (profiles) that you’re interested in. Only want the views for, e.g., the Universal Analytics property for an account? Enter the account.id and webproperty.id in the quotes in place of “~all”. Pretty simple.
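As a rough illustration of that filtering, here is a small wrapper around get_profiles(). To be clear about what’s made up: list_views() is a hypothetical helper of my own, the IDs are placeholders, and the fetch argument only exists so you can try the function with a stand-in rather than the live API. The one check it does relies on Universal Analytics property IDs taking the form UA-(account ID)-(number).

```r
# Hypothetical wrapper around get_profiles(); not part of RGA itself.
list_views <- function(account_id, property_id, fetch = RGA::get_profiles) {
  # A Universal Analytics property ID starts with "UA-<account id>-", so a
  # mismatch here usually means a copy-and-paste slip.
  if (!startsWith(property_id, paste0("UA-", account_id, "-"))) {
    stop("Property ", property_id, " does not belong to account ", account_id)
  }
  # Same arguments as in the signature above, with "~all" replaced by real IDs.
  fetch(account.id = account_id, webproperty.id = property_id)
}

# Placeholder IDs, not real ones. Run this after authorize():
# views <- list_views("12345678", "UA-12345678-1")
```

You could, of course, skip the wrapper and call get_profiles() directly with the two IDs. The check is just there to catch typos before they turn into a confusing API error.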
This isn’t, it should be emphasised, a comprehensive guide, and there are quite a few other things you can do in this vein (pulling lists of filters on views, for instance), but for that I would suggest you investigate the library itself more fully. As for the meat of the RGA library’s utility, getting analytics data, we’ll be looking at that next time.
* Note to the reader: It’s probably easier to copy and paste a lot of this code, filling in your specific values where necessary. When I was initially experimenting with this stuff, I had the irritating experience of inexplicable non-functionality, and after [far more time than I’m comfortable admitting to] I realised that I had introduced the error myself by typing out the code by hand (a useful discipline I picked up doing Zed A Shaw’s excellent Learn Python The Hard Way course). I had neglected the fact that, like all good British things (football, Doctor Who, etc.), the spelling of the word ‘authorise’ has been ruined by Americans: a ‘z’, not an ‘s’.