ATOM Spoken Dialogue SDK

Step 1: Data Collection

Data collection is the process of gathering spoken natural-language data for system development, testing, and tuning. Because data collection takes place at the beginning of the development process, no working system is available yet. Therefore, unbeknownst to the user, a human simulates the behavior of the dialogue system. This is known as a Wizard of Oz simulation, after the book The Wizard of Oz, in which the Wizard turns out to be an illusion operated by a man behind a curtain.

Tools for Data Collection

For data collection, two networked computers are needed: one on which the user's speech is recorded and one for the wizard. The data collection software included in the SDK is installed on the computer used for recording speech. The wizard interacts with the user by listening to the user's recorded speech and by generating utterances through a text-to-speech system. Because the data collection software acts as a web server, all interactions on the wizard's side take place in a web browser.

Results of the Data Collection Process:

The result of Step 1 is a corpus of interactions between users and wizards. The interactions are logged and time-stamped. The log files are stored in XML format and can be viewed using standard web browsers. The users' speech is recorded and referenced from the log files. The recorded speech can be used to train speech recognizers or, after transcription, to develop grammars.
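
As an illustration of how such a corpus might be processed, the following Python sketch extracts the time-stamped user recordings from a log file. The SDK's actual log schema is not described here, so the element and attribute names (session, turn, speaker, time, audio) and the file names are hypothetical and would need to be adapted to the real log files.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# Hypothetical log excerpt: element and attribute names are assumptions
# made for this sketch, not the SDK's actual schema.
SAMPLE_LOG = """\
<session id="s001">
  <turn speaker="wizard" time="2005-03-01T10:15:02">
    <text>Where would you like to go?</text>
  </turn>
  <turn speaker="user" time="2005-03-01T10:15:06" audio="s001_0001.wav"/>
  <turn speaker="wizard" time="2005-03-01T10:15:11">
    <text>Leaving from which city?</text>
  </turn>
  <turn speaker="user" time="2005-03-01T10:15:15" audio="s001_0002.wav"/>
</session>
"""


def user_recordings(log_xml):
    """Return (timestamp, audio file) pairs for every user turn, in time order."""
    root = ET.fromstring(log_xml)
    pairs = []
    for turn in root.iter("turn"):
        # Only user turns reference a recording; wizard turns carry text.
        if turn.get("speaker") == "user" and turn.get("audio"):
            stamp = datetime.fromisoformat(turn.get("time"))
            pairs.append((stamp, turn.get("audio")))
    return sorted(pairs)


recordings = user_recordings(SAMPLE_LOG)
for stamp, wav in recordings:
    print(stamp.isoformat(), wav)
```

A list like this could serve as the work queue for transcription, since each entry points to one recorded user utterance.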

Supported Standards and APIs for Transcription:

EMMA
EMMA (Extensible MultiModal Annotation markup language) is used to represent the input events from the user.
SAPI 4.0 and 5.1
SAPI 4.0 or SAPI 5.1 compliant speech synthesis engines can be used to synthesize speech from the text entered by the wizard.
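
To give a sense of what an EMMA representation of a user input can look like, the following Python sketch reads a small EMMA document. The emma: namespace and the annotation attributes (medium, mode, confidence, tokens) follow the W3C EMMA 1.0 recommendation; the sample values and the destination payload element are invented for illustration, and how the SDK itself fills these fields is not specified here.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the W3C EMMA 1.0 recommendation.
EMMA_NS = "http://www.w3.org/2003/04/emma"

# Illustrative input event. The <destination> payload and all values
# are assumptions made for this sketch.
SAMPLE_EMMA = """\
<emma:emma version="1.0" xmlns:emma="http://www.w3.org/2003/04/emma">
  <emma:interpretation id="int1"
      emma:medium="acoustic" emma:mode="voice"
      emma:confidence="0.82"
      emma:tokens="flights to boston">
    <destination>Boston</destination>
  </emma:interpretation>
</emma:emma>
"""


def read_interpretations(emma_xml):
    """Collect the id, mode, tokens and confidence of each interpretation."""
    root = ET.fromstring(emma_xml)
    results = []
    for interp in root.iter(f"{{{EMMA_NS}}}interpretation"):
        results.append({
            "id": interp.get("id"),
            # EMMA annotations are namespaced attributes.
            "mode": interp.get(f"{{{EMMA_NS}}}mode"),
            "tokens": interp.get(f"{{{EMMA_NS}}}tokens"),
            "confidence": float(interp.get(f"{{{EMMA_NS}}}confidence")),
        })
    return results


interpretations = read_interpretations(SAMPLE_EMMA)
print(interpretations)
```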