Every once in a while when I'm talking to Siri, I can get something that hints at conversation:
ME: Set a timer, please
SIRI: For how long?
ME: 45 minutes
SIRI: OK, 45 minutes and counting
But mostly, the modern voice-driven digital assistants (Apple's Siri, Amazon Echo, Microsoft Cortana, Google Now, etc.) respond to single commands or questions, without any conversational back and forth: "Find me the nearest pizza place." "How tall is Mount Rainier?" "Play me some Stevie Wonder." "What's the temperature in Duluth?"
Contrast this with this short fragment of conversation from the film 2001: A Space Odyssey:
HAL: I've just picked up a fault in the AE-35 unit. It's going to go a hundred percent failure within 72 hours.
DAVE: Is it still within operational limits right now?
HAL: Yes, and it will stay that way until it fails.
One of the many things that distinguish this fictional exchange between computer and person from modern speech-driven assistants is that HAL is able to understand Dave's use of the word "it". HAL's ability to determine that "it" refers to the "AE-35 unit" is an example of anaphora resolution. Anaphora resolution is one of many problems confronted in the general research area of dialogue systems.
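To make the idea concrete, here is a minimal sketch of anaphora resolution using nothing more than a recency heuristic: "it" is taken to refer to whichever known entity was mentioned most recently in the conversation so far. The function names and the entity list are hypothetical, invented for illustration; real systems from the literature also weigh syntax, semantics, and salience, so treat this as a toy starting point rather than a recommended algorithm.

```python
import re


def most_recent_entity(history, entities):
    """Return the entity mentioned latest in the history, or None.

    `history` is a list of earlier utterances (strings); `entities` is a
    hypothetical, hand-written list of things the system knows about.
    """
    text = " ".join(history).lower()
    best, best_pos = None, -1
    for entity in entities:
        pos = text.rfind(entity.lower())  # position of the last mention
        if pos > best_pos:
            best, best_pos = entity, pos
    return best


def resolve_it(utterance, history, entities):
    """Replace the pronoun "it" with the most recently mentioned entity.

    Assumption: recency alone picks the antecedent. If no known entity
    appears in the history, the utterance is returned unchanged.
    """
    antecedent = most_recent_entity(history, entities)
    if antecedent is None:
        return utterance
    return re.sub(r"\b[Ii]t\b", "the " + antecedent, utterance)


history = [
    "I've just picked up a fault in the AE-35 unit. "
    "It's going to go a hundred percent failure within 72 hours."
]
entities = ["AE-35 unit", "pod bay doors"]  # illustrative only

print(resolve_it("Is it still within operational limits right now?",
                 history, entities))
```

With the HAL exchange as input, Dave's "it" is rewritten as "the AE-35 unit". The heuristic breaks down as soon as two candidate entities are mentioned close together, which is exactly the kind of limitation the literature addresses.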
This project will focus on surveying and learning about the field of dialogue systems.
For this project, you will study the dialogue systems theory literature and, based on your research, develop a voice-driven conversational command-and-control system for a simple program. The project will focus on studying and understanding a selection of algorithms from the dialogue systems literature. Implementing those algorithms will illustrate your understanding and presumably result in a neato program, but the research and understanding come first, and I will require each feature in the final system to be explicitly backed by references from the literature.
There are many possible subjects for your conversational system. One example might be a simple diagram construction program:
PERSON: Draw three small circles side by side
COMPUTER: What color?
PERSON: Make the left one red, the middle one blue, and the right one purple
COMPUTER: OK
PERSON: Draw an arrow from the blue circle to the purple circle
etc.
There are, of course, many other possible programs such a system might be designed to control. A to-do list ("move the next two items to the top of the list" and "delete the items about my dog"). A window control system ("maximize the other window"). A calendar ("schedule me for yoga every Tuesday at four"). In any case, we will need something that's rich enough to support a reasonable variety of dialogues, but also simple enough that we'll be able to keep our focus on the dialogue itself rather than on the program the dialogue is about. We'll negotiate this choice early in the fall.
Your final deliverables will include: