Sean Simpson

PhD Candidate, Computational Linguistics

Georgetown University

Detecting Novel Drug Terminology

Up-to-date knowledge of street terms for high-risk, rapidly evolving recreational drugs is essential for health-care professionals working with addicted and at-risk populations. Unfortunately, the lag between the advent of a new term and its recognition by public health researchers is often measured in months, if not years.

Since May 2016 I've been working with a research team at the Center for Advanced Study of Language (CASL), in collaboration with the Center for Substance Abuse Research (CESAR), to develop automated methods for detecting novel drug terminology in social media discourse. The system we are currently developing, of which I am the principal designer, crawls continuous social media streams and draws on vector space semantics to search for unknown terms that fit the contextual linguistic profile of known drug terminology.

Although the system is still in its trial phase, our initial results indicate that this approach can identify previously unknown drug slang with a high degree of accuracy and cut the lag between term introduction and detection to weeks, if not days. An article detailing the system and our initial results is currently under review at the Journal of Medical Internet Research.
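
As a rough illustration only (this is not the CASL/CESAR system itself), the vector-space idea described above can be sketched in a few lines of Python. The sketch assumes each term already has an embedding vector, builds a "profile" by averaging the vectors of known terms, and ranks unseen terms by cosine similarity to that profile. The term names, the toy three-dimensional vectors, the 0.8 threshold, and the helper names (cosine, centroid, rank_candidates) are all hypothetical placeholders.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def rank_candidates(embeddings, known_terms, threshold=0.8):
    """Return (term, score) pairs for unknown terms whose vectors lie
    close to the average vector of the known terms, best match first.

    The threshold is an arbitrary illustrative value, not a tuned one.
    """
    profile = centroid([embeddings[t] for t in known_terms if t in embeddings])
    scored = [
        (term, cosine(vector, profile))
        for term, vector in embeddings.items()
        if term not in known_terms
    ]
    hits = [pair for pair in scored if pair[1] >= threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Toy, hand-made vectors standing in for embeddings learned from real text.
TOY_EMBEDDINGS = {
    "known_a": [0.9, 0.1, 0.0],
    "known_b": [0.8, 0.2, 0.1],
    "unseen_x": [0.7, 0.2, 0.1],  # used in contexts similar to the known terms
    "unseen_y": [0.0, 0.2, 0.9],  # used in unrelated contexts
}

if __name__ == "__main__":
    for term, score in rank_candidates(TOY_EMBEDDINGS, {"known_a", "known_b"}):
        print(f"{term}\t{score:.3f}")
```

In this toy setup only unseen_x clears the threshold. A real pipeline would learn the vectors from a large corpus and would need far more careful filtering than a single similarity cutoff.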

The Catalogue of Endangered Languages (ELCat)

Since 2012 I have been contributing (along with many others) to the endangered language research presented in both the Catalogue of Endangered Languages and the Endangered Languages Project (ELP). An exciting and encouraging result of our work on the Catalogue is that our data do not support the oft-cited claim that a language dies roughly every two weeks. Rather, our data suggest that the current extinction rate is closer to one language every three months.

To read more about our revised language extinction rate and other findings, please see our forthcoming publication from Taylor & Francis, entitled Cataloguing the Endangered Languages of the World. Look also for the chapter that Anna Belew and I contributed to the forthcoming Oxford Handbook of Endangered Languages, which draws heavily on the data we have collected in the Catalogue.

Hawaii English

In collaboration with Katie Drager, Joelle Kirtley, and James Grama, I have been involved in several related research projects investigating language variation and change in Hawaii. We presented a subset of our findings on socially conditioned variation in the vowels /ɪ/, /ɛ/, and /æ/ in Hawaii English at NWAV 41 (2013), and presented work on changes in progress involving the back vowels of Hawaii English at NWAV 42 (2014). These projects have produced the first major acoustic description of the vowels of Hawaii English, published in 2016 in the Journal of the International Phonetic Association.

Twitter: @LongSeanSilvr
GitHub: LongSeanSilvr