•   When: Tuesday, January 16, 2018 from 02:00 PM to 03:00 PM
  •   Speakers: Sneha Nagpaul
  •   Location: ENGR 4201

Given the deluge of crowd-generated content from social media websites, the complexity of extracting information from it has increased manifold. An important characteristic of such text is the location where it originated, which can, in turn, be used to respond to emergencies such as floods and crimes. The patterns discovered by geolocating unstructured social media text can also be used commercially for targeted advertising and recommender systems. This work deals with geolocating text from Twitter data that is labeled with the user's information. However, instead of locating the user, who can be viewed as a collection of tweets, it focuses on locating individual tweets. For this task, the problem is described within the multiple instance learning framework, and a novel neural network approach is designed that trains a tweet-level classifier using only user location labels. The model outperforms the state of the art in multiple instance learning and provides significant scalability and speedup compared to existing methods. The intuitive instance-level neural network classifier exceeds the Bag of Words models from prior geolocation research by discovering high-level language features such as grammar and by identifying place names without feature engineering.
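To make the multiple instance learning setup concrete, here is a minimal sketch of the general idea: each user is a "bag" of tweets with a single location label, training only ever sees the bag label, and the learned scorer is then applied to individual tweets. This is not the speaker's model. It swaps the neural network for a linear bag-of-words scorer with mean pooling, and the toy tweets, the two location classes, and all function names are assumptions made up for illustration.

```python
import math
from collections import defaultdict


def softmax(scores):
    """Numerically stable softmax over a list of class scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def tweet_scores(weights, tweet, n_classes):
    """Instance-level scores: sum of the per-word weight vectors."""
    scores = [0.0] * n_classes
    for word in tweet.split():
        for c, w in enumerate(weights[word]):
            scores[c] += w
    return scores


def train(bags, n_classes, epochs=200, lr=0.5):
    """Fit word weights using only bag-level (user-level) labels."""
    weights = defaultdict(lambda: [0.0] * n_classes)
    for _ in range(epochs):
        for tweets, label in bags:
            # Bag score = mean of the instance scores (mean pooling).
            per_tweet = [tweet_scores(weights, t, n_classes) for t in tweets]
            bag = [sum(s[c] for s in per_tweet) / len(tweets)
                   for c in range(n_classes)]
            probs = softmax(bag)
            # Gradient of cross-entropy with respect to the bag scores.
            grad = [p - (1.0 if c == label else 0.0)
                    for c, p in enumerate(probs)]
            # Each word occurrence contributes 1/len(tweets) to the bag score.
            for t in tweets:
                for word in t.split():
                    for c in range(n_classes):
                        weights[word][c] -= lr * grad[c] / len(tweets)
    return weights


def predict_tweet(weights, tweet, n_classes):
    """Classify a single tweet with the bag-trained scorer."""
    scores = tweet_scores(weights, tweet, n_classes)
    return max(range(n_classes), key=scores.__getitem__)


# Hypothetical toy data: (tweets of one user, user location label).
# Label 0 and label 1 stand for two made-up cities.
bags = [
    (["subway delayed again", "coffee time"], 0),
    (["stuck on the subway", "lunch break"], 0),
    (["fog over the bay", "coffee time"], 1),
    (["cable car in the fog", "lunch break"], 1),
]

weights = train(bags, n_classes=2)
print(predict_tweet(weights, "subway is late", 2))
print(predict_tweet(weights, "fog today", 2))
```

In this toy setting, location-neutral tweets such as "coffee time" appear under both labels and end up with little influence, while words that occur under only one label drive the tweet-level prediction. A neural instance classifier, as described in the abstract, would replace the linear scorer but could keep the same bag-level training signal.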
