Originally, I wanted to explore whether punk rock is still a present style in the Austin music scenes around both SXSW and ACL. I planned to look at how much people on Twitter were discussing the two and draw some comparisons to show whether the genre could be considered “dead” in the area. However, working within the confines of this project and with the Twitter API raised several problems that made this difficult. The biggest was the lack of geolocated tweets available to scan, which left me unable to draw any real correlation between the festivals and whether people were talking about punk rock.

In fact, when I tried to pull just the terms “punk” and “austin” from the API, I was left with only a small handful of tweets, and several of them were about how much of a “punk” some guy named Austin was.

This leads to my current data set and some of the problems I had dealing with the data. In order to get a large enough collection of data to use, I had to choose an incredibly broad search, using the terms “music” and “austin”. This still needed to be pared down because some of the tweets had nothing to do with music in the Austin area. In fact, many of the tweets that ended up on the map were tweets where I couldn’t find ANY reference to either of the search terms I was interested in.
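One simple way to weed out tweets that never mention the search terms is a keyword check run after the download. The sketch below is only an illustration, assuming the tweet text has already been collected into a list of plain strings. The function names and sample tweets are made up for this example and are not part of the Twitter API.

```python
import re


def mentions_all_terms(text, terms):
    """Return True only if every term appears in the text as a whole word.

    Matching is case-insensitive, so "MUSIC" and "music" both count.
    """
    return all(
        re.search(r"\b" + re.escape(term) + r"\b", text, flags=re.IGNORECASE)
        for term in terms
    )


def filter_tweets(tweets, terms):
    """Keep only the tweets whose text mentions every search term."""
    return [tweet for tweet in tweets if mentions_all_terms(tweet, terms)]


# Hypothetical sample data standing in for downloaded tweet text.
sample_tweets = [
    "Live music in Austin tonight on 6th Street!",
    "Austin is such a punk for bailing on us",
    "Great tacos for lunch",
    "Best MUSIC festival lineup, see you in austin",
]

relevant = filter_tweets(sample_tweets, ["music", "austin"])
for tweet in relevant:
    print(tweet)
```

A check like this only looks at whole words, so it would miss a run-together hashtag such as #austinmusic, and it still can't tell Austin the city from Austin the person. It is a first pass for dropping obviously unrelated tweets, not a full relevance filter.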

I’m looking forward to learning how to increase the number of tweets I can get access to (outside of lat/long geolocation) and to discovering what techniques can be used to filter out some of the irrelevant or unrelated conversations going on around particular search terms. Until then, feel free to explore what I was able to put together for the assignment, and feel free to leave me any comments with thoughts or recommendations!

Too Much Data VS Not the Right Type of Data