XML Output for Sphinx


Upload: darius-padilla

Post on 31-Dec-2015


TRANSCRIPT

Page 1: XML Output for Sphinx

XML Output for Sphinx

• Motivation: applications could make use of richer information from Sphinx, including n-best lists, the word lattice, and other features. An XML format defined by a DTD would be standard and easy to parse, express, and modify.

Page 2: XML Output for Sphinx

Proposed DTD

– http://www.cs.cmu.edu/~tkharris/usi/utterance-0.1.dtd

– Sphinx produces utterances; each utterance is an XML document that conforms to the DTD

– An utterance is an n-best list, a word lattice, or both

– An n-best list is a list of lists of words

– Each list, and each word within it, may have features

– The DTD desperately needs review
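
As a rough illustration of the structure described above, the sketch below holds one utterance with an n-best list and reads it back using Python's standard `xml.etree.ElementTree`. The element and attribute names (`utterance`, `nbest`, `hypothesis`, `word`, `rank`, `score`) are assumptions made for this example; they are not taken from utterance-0.1.dtd, which may name things differently.

```python
import xml.etree.ElementTree as ET

# Hypothetical utterance document: tag and attribute names are
# illustrative only and are NOT taken from the proposed DTD.
SAMPLE = """\
<utterance id="utt-001">
  <nbest>
    <hypothesis rank="1" score="-1234.5">
      <word score="-400.0">go</word>
      <word score="-834.5">forward</word>
    </hypothesis>
    <hypothesis rank="2" score="-1301.2">
      <word score="-400.0">go</word>
      <word score="-901.2">for</word>
    </hypothesis>
  </nbest>
</utterance>
"""


def parse_nbest(xml_text):
    """Return a list of (score, [words]) tuples, one per hypothesis."""
    root = ET.fromstring(xml_text)
    results = []
    for hyp in root.iter("hypothesis"):
        # The list-level feature (score) sits on the hypothesis element;
        # word-level features would sit on each word element.
        words = [w.text for w in hyp.findall("word")]
        results.append((float(hyp.get("score")), words))
    return results


nbest = parse_nbest(SAMPLE)
# Pick the hypothesis with the highest (least negative) score.
best_score, best_words = max(nbest, key=lambda h: h[0])
print(" ".join(best_words))
```

Here the n-best list is literally a list of lists of words, with features carried as attributes; a word lattice would need a different representation (nodes and edges) that this sketch does not attempt.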

Page 3: XML Output for Sphinx

Issues

• Is the motivation justified?

• Is the computational/network impact too high?

• APIs are needed to parse XML

• Need to get requirements/observations from Sphinx customers
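
One way to get a first feel for the network-impact question is to compare the size of a plain-text hypothesis with an XML encoding of the same words. The sketch below does this with Python's standard library only; the tag names and the per-word `score` feature are assumptions for illustration, and the actual ratio will depend on the real DTD and on which features are emitted.

```python
import xml.etree.ElementTree as ET


def to_xml(words, scores):
    """Encode one hypothesis as XML bytes (hypothetical tag names)."""
    utt = ET.Element("utterance")
    hyp = ET.SubElement(utt, "hypothesis")
    for word, score in zip(words, scores):
        # Each word carries one feature, stored as an attribute.
        w = ET.SubElement(hyp, "word", score=str(score))
        w.text = word
    return ET.tostring(utt, encoding="utf-8")


# Made-up example data, not real recognizer output.
words = ["go", "forward", "ten", "meters"]
scores = [-400.0, -834.5, -210.0, -512.25]

plain = " ".join(words).encode("utf-8")
xml_bytes = to_xml(words, scores)

# Report both sizes and the expansion ratio.
print(len(plain), len(xml_bytes), round(len(xml_bytes) / len(plain), 1))
```

The same measurement could be repeated on real n-best lists and lattices collected from Sphinx users, which would also feed the requirements-gathering item above.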