2012 Poster Sessions : Understanding Digital Communications in 5,000 Languages

Student Name : Robert Munro
Advisor : Chris Manning
Research Areas: Artificial Intelligence
The recent global proliferation of technology, cell phones in particular, means that there are roughly 5,000 languages in the connected world -- that's how many languages you could find at the other end of your phone right now. Text-messaging (SMS) is the most popular form of remote communication for most languages, surpassing regular mail, email, and actual phone calls. However, very little is known about how people express language in short message communications, especially in the context of non-standardized spellings, varying literacy, and frequent code-switching between languages. This poster will showcase a number of research projects and actual deployments that seek to triage communications in less-resourced languages, leveraging advances in natural language processing and crowdsourcing. These include using new methods to process health-related messages in the Chichewa language of Malawi, emergency response communications in Haitian Kreyol, and crisis-information reports in the Urdu, Sindhi and Pashto languages of Pakistan. For all three contexts, it is shown that new natural language processing technologies allow us to better understand the world’s digital linguistic diversity and, in turn, to use the same technologies to aid the speaker communities in projects as varied as health, education, crisis response, employment, and access to market information.

Robert is a Stanford Graduate Fellow in computational linguistics in the final year of his PhD, specializing in natural language processing and crowdsourcing technologies for processing large volumes of communications in less-resourced languages. His background includes working for the UN High Commission for Refugees in Liberia, Mission 4636 in Haiti, Energy for Opportunity in Sierra Leone, the Endangered Languages Archive in London, Global Viral Forecasting worldwide, and various Silicon Valley search engines and start-ups.