In recent years, machine learning techniques have shown significant potential for automated message processing, in particular dialog act classification for spoken language, but the use of such classifiers on non-spoken language has not been investigated. Almost every Internet user uses one or more text messaging services, including email, newsgroups, and instant messaging. In 1999, an estimated 3 billion email messages were sent every day in the USA. The study by Kraut et al. showed that interpersonal communication is a stronger driver of Internet use than information and entertainment applications. Text messaging services are thus virtually everywhere, and they constantly demand our attention as messages arrive. This ubiquity, together with the accelerating growth in the number of text messages, makes it important to develop automated text message processing and, more generally, text-message-based query processing.
However, conversational speech (e.g. a telephone conversation) differs from electronic text messages (e.g. email, newsgroup articles). The main difference is that, in text messages, the speaker turns (i.e. who is talking) are explicitly specified by a non-text field of the message (i.e. the sender), so there is no need to identify them. Another difference is that text messages contain longer and more descriptive sentences than conversational speech. In this sense, a text message is a mix of conversational speech and an ordinary text document.
We propose a method of extracting intentions from online text messages, such as web query strings and instant messages. The experiments describe a first attempt to extract intended acts from Usenet newsgroup messages. Dialog-act classifiers were constructed to extract dialog acts, and the senders' intentions were then deduced on the basis of intentional speech act theories. The predicate form (verb-object-subject) of each dialog-act unit, together with the extracted dialog act, was then used to represent the set of intended acts (speech acts) of the message sender.
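As a rough illustration of the representation described above, the following Python sketch pairs a dialog-act label with a verb-object-subject predicate and reads the speaker turn directly from the sender field of a message. It is a minimal sketch under stated assumptions, not the classifiers used in the experiments: the names `Predicate`, `IntendedAct`, and `toy_dialog_act`, the labels `QUESTION` and `STATEMENT`, and the sample message are all hypothetical, the punctuation rule merely stands in for a trained dialog-act classifier, and the predicate is filled in by hand rather than extracted.

```python
from dataclasses import dataclass
from email import message_from_string


@dataclass(frozen=True)
class Predicate:
    """Verb-object-subject form of one dialog-act unit (hypothetical type)."""
    verb: str
    obj: str
    subject: str


@dataclass(frozen=True)
class IntendedAct:
    """A dialog act plus its predicate, attributed to a sender (hypothetical type)."""
    sender: str
    dialog_act: str
    predicate: Predicate


def toy_dialog_act(unit: str) -> str:
    """Stand-in for a trained classifier: uses a punctuation cue only."""
    return "QUESTION" if unit.strip().endswith("?") else "STATEMENT"


# Made-up message; the sender header plays the role of the speaker turn.
RAW = (
    "From: alice@example.org\n"
    "Subject: printer\n"
    "\n"
    "Can you fix the printer?\n"
)

msg = message_from_string(RAW)
sender = msg["From"]              # no speaker-turn detection is needed
unit = msg.get_payload().strip()  # a single dialog-act unit in this toy case

# The predicate is written by hand here; extracting it is outside this sketch.
act = IntendedAct(sender, toy_dialog_act(unit), Predicate("fix", "printer", "you"))
print(act)
```

In a fuller system, the rule in `toy_dialog_act` would be replaced by a learned classifier, and the predicate would be obtained from a parse of each dialog-act unit.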