Geocoding location data with dismo

Today’s Gist could actually end up being very useful to a number of you. It’s something of a contrived example, but it illustrates in very simple code how to do three interesting things:

  1. Gather Tweets by search term (which we’ve done before), and look up user info for each of the users returned by that search.
  2. Convert textual user location data to approximate latitude & longitude coordinates with the Google geocoding web-service, using a single function, geocode(), from the dismo package. This is a revelation to me, and though there appears to be a daily rate limit, I can imagine so many applications for which this would be useful.
  3. Very easily plot a world map (albeit with a lame projection), and superimpose points indicating the inferred location of #rstats-Tweeting users.

And all in just 29 (+/-) lines. Truly, truly, we are living in a great era for statistical computing.
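The three steps above can be sketched roughly as follows. This is not the original Gist, just a minimal illustration under stated assumptions: `clean_locations()` is a hypothetical helper, the sample locations are made up, the `GOOGLE_API_KEY` environment variable name is an assumption, and the map is drawn with the maps package, which may or may not be what the Gist uses. The geocoding call is guarded so that the snippet still runs without dismo, network access or a key.

```r
# Hypothetical helper: tidy free-text profile locations before geocoding,
# dropping blanks/NAs and duplicates so fewer lookups count against the limit.
clean_locations <- function(x) {
  x <- trimws(x)
  x <- x[!is.na(x) & nzchar(x)]
  unique(x)
}

# Made-up sample input standing in for the location field of user profiles.
raw_locations <- c("Berlin, Germany", "  ", "Portland, OR", "Berlin, Germany", NA)
locs <- clean_locations(raw_locations)

# The lookup needs the dismo package, network access and (assumed here) a
# Google API key, so it only runs when all of those are available.
key <- Sys.getenv("GOOGLE_API_KEY")  # hypothetical environment variable name
if (requireNamespace("dismo", quietly = TRUE) && nzchar(key)) {
  # oneRecord = TRUE asks for a single best match per input string.
  coords <- dismo::geocode(locs, oneRecord = TRUE, geocode_key = key)

  # Superimpose the inferred coordinates on a plain long/lat world map.
  if (requireNamespace("maps", quietly = TRUE)) {
    maps::map("world", col = "grey70")
    points(coords$longitude, coords$latitude, pch = 19, col = "red")
  }
}
```

Deduplicating before the lookup is a design choice rather than a requirement: many users share a location string, and each distinct string only needs to be geocoded once.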

Evaluating term popularity with twitteR

I really wanted to put something together for this series on the twitteR package. Unfortunately, at the moment the number of interesting things that can be done with twitteR, as opposed to through API calls and RCurl, is limited. Regardless, I have Yet Another Invented Application to illustrate a pretty typical use-case for twitteR: grabbing Tweets by search term.

I’ve done this before, for sentiment analysis of Tweets about Republican presidential primary candidates, and indeed, despite its limitations, the searchTwitter() function can be useful. Since the number of Tweets one can grab appears to be limited to 1000, this Gist attempts to infer term popularity by frequency, with only minor success, as you can see in the plot below.
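One way to read "popularity by frequency" when every search is capped at the same number of Tweets is to look at how quickly that sample accumulated: a popular term fills the cap in a shorter time span. The sketch below illustrates that idea only; it is an assumption about the approach, not the Gist's actual code. `tweets_per_hour()` is a hypothetical helper and the timestamps are synthetic.

```r
# Hypothetical helper: estimate how often a term is tweeted, given the
# creation times (POSIXct) of a capped sample of matching Tweets.
tweets_per_hour <- function(created) {
  if (length(created) < 2) return(NA_real_)
  created <- sort(created)
  span_hours <- as.numeric(
    difftime(created[length(created)], created[1], units = "hours")
  )
  if (span_hours == 0) return(NA_real_)
  # n Tweets delimit n - 1 intervals between the first and the last one.
  (length(created) - 1) / span_hours
}

# Synthetic timestamps: 13 Tweets spaced 10 minutes apart (a 2-hour span).
start <- as.POSIXct("2012-01-01 00:00:00", tz = "UTC")
created <- start + seq(0, by = 600, length.out = 13)
rate <- tweets_per_hour(created)

# With twitteR, and after authenticating, real timestamps could come from
# something like: twListToDF(searchTwitter("#rstats", n = 1000))$created
```

Comparing this rate across several search terms gives a rough ranking, though it only reflects the window covered by the most recent Tweets returned.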