Using Twitter as a source, we can analyze every ‘tweet’ that comes through. The challenge is to determine whether a tweet is talking about a news story and, if so, which news story it refers to. People may tweet about breaking news even before it reaches the popular news websites, and the algorithm should recognize this. The algorithm will need to understand Twitter shorthand: text may or may not contain links, ‘retweets’, ‘hashtags’, ‘@ messages’, or a ‘trending topic’. The source of a tweet, whether it is the web, a cell phone, or some other Twitter application, can also be used to enhance the algorithm. Other challenges include ambiguous news stories and hyper-local stories. For example, suppose a tweet mentions “car accident on Main Street”. First, it must be determined that this is in fact a news story, and not someone simply tweeting that they scratched their bumper. Second, Main Street is a common street name and accidents happen very frequently, so the text alone does not identify which story is meant. The algorithm will therefore try to determine the origin of the tweet from its latitude and longitude, and check whether that origin matches the location of the news story. Using the same latitude and longitude information, a map of the United States will show the most popular news stories for each region.
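As an illustration of the location check described above, the following is a minimal sketch rather than the project's actual algorithm. It assumes a tweet and a news story are plain dictionaries with hypothetical `text`, `headline`, `lat`, and `lon` fields, and it treats a tweet as matching a story when the two are within an assumed distance threshold and share a few words. The threshold values and the simple word-overlap test are placeholders chosen for the example.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def matches_story(tweet, story, max_km=25.0, min_shared_words=2):
    """Return True if the tweet plausibly refers to the story.

    Assumed (hypothetical) fields: tweet has 'text', 'lat', 'lon';
    story has 'headline', 'lat', 'lon'. Tweets without coordinates
    cannot be placed, so they are rejected by this check.
    """
    if tweet.get("lat") is None or tweet.get("lon") is None:
        return False
    distance = haversine_km(tweet["lat"], tweet["lon"],
                            story["lat"], story["lon"])
    if distance > max_km:
        return False
    # Crude text similarity: count lowercase words common to both.
    tweet_words = set(tweet["text"].lower().split())
    story_words = set(story["headline"].lower().split())
    return len(tweet_words & story_words) >= min_shared_words


story = {"headline": "Car accident closes Main Street",
         "lat": 39.30, "lon": -76.60}
nearby = {"text": "car accident on Main Street", "lat": 39.29, "lon": -76.61}
far_away = {"text": "car accident on Main Street", "lat": 47.60, "lon": -122.33}

print(matches_story(nearby, story))
print(matches_story(far_away, story))
```

With these sample values, the two tweets have identical text but only the nearby one is attributed to the story, which is the disambiguation the “Main Street” example calls for. A real system would need a better text model and a way to handle tweets that carry no coordinates.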
Bio:
Zack Schwartz graduated with his B.S. in Computer Science, with a minor in Sociology, in 2011. He graduated with his M.S. in Computer Science, with a concentration in Computer Security/Information Assurance, in 2013. He studied abroad at University College Dublin in Dublin, Ireland, during Spring 2010.
Documentation: