


Robin Stewart


Automatic Identification of Off-Topic Regions of Conversation

Senior Thesis
Williams College
May 22, 2006


A collection of recorded and transcribed telephone conversations clearly demonstrates the universality of small talk and other socially motivated utterances. Building on theories about the linguistics of conversational speech, I consider several ways of describing each utterance, including the words it contains, their parts of speech, and the utterance's proximity to the beginning of the conversation. To better understand which of these features are most useful, I build a system that automatically distinguishes between on- and off-topic utterances and compare its performance across different combinations of features. The central hypothesis is that conversational speech contains sufficient low-level clues to separate on- and off-topic utterances with an automatic classifier. I find that the overall structure of conversations is predictable and that automatic classification can indeed be done with better-than-chance accuracy. Distinguishing more reliably between on- and off-topic utterances, however, will probably require deeper knowledge of the context and overall topic.
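As a rough illustration of the kind of system described above, the sketch below turns each utterance into word features plus a position feature and feeds them to a small classifier. It is not the thesis's implementation: the abstract does not name a learning algorithm, so naive Bayes with add-one smoothing is an assumption, as are the names `extract_features` and `NaiveBayes`, the quarter-of-conversation position buckets, and the six-utterance toy conversation with its labels. Part-of-speech features are left out because they would require an external tagger.

```python
import math
from collections import Counter, defaultdict


def extract_features(utterance, index, total):
    """Describe one utterance as a list of discrete feature strings.

    Hypothetical feature set: one feature per word, plus the quarter of
    the conversation (0-3) in which the utterance occurs.
    """
    feats = ["word=" + w for w in utterance.lower().split()]
    quarter = min(3, (4 * index) // total)
    feats.append("quarter=%d" % quarter)
    return feats


class NaiveBayes:
    """Multinomial naive Bayes with add-one smoothing (an assumed model)."""

    def __init__(self):
        self.label_counts = Counter()
        self.feat_counts = defaultdict(Counter)
        self.vocab = set()

    def train(self, examples):
        for feats, label in examples:
            self.label_counts[label] += 1
            for f in feats:
                self.feat_counts[label][f] += 1
                self.vocab.add(f)

    def predict(self, feats):
        total = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.label_counts.items():
            # Log prior plus smoothed log likelihood of each feature.
            score = math.log(count / total)
            denom = sum(self.feat_counts[label].values()) + len(self.vocab)
            for f in feats:
                score += math.log((self.feat_counts[label][f] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented toy conversation; "on"/"off" labels are for illustration only.
conversation = [
    ("hi how are you", "off"),
    ("fine how is the weather there", "off"),
    ("so the topic is minimum wage", "on"),
    ("i think the minimum wage is too low", "on"),
    ("yes the wage should go up", "on"),
    ("well nice talking to you bye", "off"),
]

model = NaiveBayes()
model.train(
    (extract_features(text, i, len(conversation)), label)
    for i, (text, label) in enumerate(conversation)
)

print(model.predict(extract_features("how is the weather", 0, 6)))
print(model.predict(extract_features("the minimum wage is low", 3, 6)))
```

Comparing feature combinations, as the abstract describes, would amount to dropping or adding entries in `extract_features` and measuring accuracy on held-out conversations rather than on the training data used here.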

Download Thesis (pdf)

Page last modified on October 30, 2008, at 03:23 PM