I’ve had a number of interesting conversations with various groups engaged in natural language processing (NLP) over the past few weeks. These have included Harvard University’s Institute for Quantitative Social Science (IQSS), Harvard’s Berkman Center on Internet and Democracy, the European Commission’s Joint Research Center (JRC) and the private sector company Virtual Research Associates (VRA). There is a lot of very interesting (potentially groundbreaking) work being carried out in the field of NLP, particularly given the shift towards a more generative Internet (see Zittrain’s new book The Future of the Internet and Benkler’s excellent book The Wealth of Networks for some fascinating insights on the generative Internet).
So I thought I’d use this blog entry to list some of the leading papers/articles on the topic:
- Hopkins, Daniel and Gary King. 2008. “Extracting Systematic Social Science Meaning from Text,” Institute for Quantitative Social Science, Harvard University. [PDF]
- Tanev, Hristo; Jakub Piskorski and Martin Atkinson. 2008. “Real-time News Event Extraction for Global Monitoring Systems,” Joint Research Center of the European Commission, Web and Language Technology Group of the Institute for the Protection and Security of the Citizen (IPSC). [PDF]
- Piskorski, Jakub; Hristo Tanev; Martin Atkinson and Erik van der Goot. 2008. “Cluster Centric Approach to News Event Extraction,” Joint Research Center of the European Commission, Institute for the Protection and Security of the Citizen (IPSC). [PDF]
- King, Gary and Will Lowe. 2003. “An Automated Information Extraction Tool for International Conflict Data with Performance as Good as Human Coders: A Rare Events Evaluation Design,” International Organization, 57, Summer 2003: 617-642. [PDF]
- Bond, Doug; Joe Bond; Churl Oh; Craig Jenkins and Charles Taylor. 2003. “Integrated Data for Events Analysis (IDEA): An Event Typology of Automated Events Data Development,” Journal of Peace Research, 40(6): 733-745. [PDF]
I would be happy to keep adding to this list, so if anyone has recommendations for additional references, please don’t hesitate to contact me.