SLR Validation: procedures and prospects Eric Sanders Henk van den Heuvel

Post on 14-Dec-2015


Page 1

SLR Validation: procedures and prospects

Eric Sanders

Henk van den Heuvel

Page 2

• WHAT is validation?

• WHY validate databases?

• WHEN validate databases?

• WHO validates databases?

• HOW do we validate databases?

• WHERE do we go from here?

Page 3

WHAT is validation?

1. checking a SLR against a fixed set of requirements

2. putting a quality stamp on a SLR as a result of this check

3. the evaluation of a SLR in a field test

Page 4

WHY validate databases?

• increasing number of databases

• high costs and price of databases

• fair trade (SpeechDat)

Page 5

WHO validates databases

• for SpeechDat?

• for ELRA?

• in general?

Page 6

WHO validates databases and WHEN?

Validator    Validation scheduling
             during production    after production
internal     1                    2
external     3                    4

Page 7

HOW do we validate databases?

• procedure

• check points

• rank order

• quality values

Page 8

Validation procedure

SLR → 1. Prevalidation (10 speakers) → 2. Validation → OK?

yes → Ready for distribution

no → 3. Revalidation
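
The flow on this slide can be sketched in a few lines of Python. This is only an illustration, not the procedure as implemented by the authors. The callback `is_ok` is a hypothetical stand-in for the validator's verdict. The loop from revalidation back to the "OK?" decision and the cap `max_revalidations` are assumptions, since the slide shows no arrow after step 3.

```python
from typing import Callable, List, Tuple


def run_procedure(
    is_ok: Callable[[str], bool],
    max_revalidations: int = 2,  # assumption: the slide gives no limit
) -> Tuple[bool, List[str]]:
    """Walk the three stages and return (ready_for_distribution, stages_run).

    is_ok(stage) is a hypothetical callback returning the validator's verdict.
    """
    # Stage 1: prevalidation on a small sample (10 speakers on the slide).
    # The slide shows no decision after this stage, so it is only recorded.
    stages = ["prevalidation"]

    # Stage 2: validation of the complete SLR, followed by the "OK?" decision.
    stages.append("validation")
    ok = is_ok("validation")

    # Stage 3: revalidation after a "no"; looping back is an assumption.
    rounds = 0
    while not ok and rounds < max_revalidations:
        rounds += 1
        stages.append("revalidation")
        ok = is_ok("revalidation")

    return ok, stages


# Example: the SLR fails validation once and passes after one revalidation.
verdicts = iter([False, True])
ready, stages_run = run_procedure(lambda stage: next(verdicts))
```

Here `ready` reports whether the SLR reached "Ready for distribution", and `stages_run` lists the stages in the order they were carried out.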

Page 9

Check points

• documentation

• database format

• design

• speech files

• label files

• phonemic lexicon

• speaker & environment distributions

• orthographic transcriptions

Page 10

Rank order

1. indispensable: speech signals, orthographic transcription, documentation

2. some flaws allowed: design, speaker and environment distributions

3. not very important: label files, database format, lexicon

Page 11

Quality values

• OK

• Not OK, but acceptable

• Not acceptable
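
One way to record rank order and quality values together is sketched below. The slides do not say how the two are combined into a final verdict, so the sketch does not decide anything. It only lists the check points that are not OK, with the most important rank first. The names `Quality`, `RANK` and `flagged` are hypothetical, and the sort order within a rank (worst quality first) is an assumption.

```python
from enum import IntEnum
from typing import Dict, List


class Quality(IntEnum):
    OK = 0
    ACCEPTABLE = 1       # "Not OK, but acceptable"
    NOT_ACCEPTABLE = 2


# Rank order as given on the "Rank order" slide (1 = indispensable).
RANK: Dict[str, int] = {
    "speech signals": 1,
    "orthographic transcription": 1,
    "documentation": 1,
    "design": 2,
    "speaker and environment distributions": 2,
    "label files": 3,
    "database format": 3,
    "lexicon": 3,
}


def flagged(results: Dict[str, Quality]) -> List[str]:
    """Return check points that are not OK, by rank, then worst quality first."""
    issues = [
        (RANK[name], -int(quality), name)
        for name, quality in results.items()
        if quality != Quality.OK
    ]
    return [name for _, _, name in sorted(issues)]


# Example with made-up results for four check points.
example = {
    "lexicon": Quality.NOT_ACCEPTABLE,
    "documentation": Quality.ACCEPTABLE,
    "design": Quality.OK,
    "speech signals": Quality.NOT_ACCEPTABLE,
}
report = flagged(example)
```

In this made-up example the report starts with the two indispensable items and ends with the lexicon, even though the lexicon has the worst quality value.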

Page 12

WHERE do we go from here?

• validate databases from ELRA’s catalogue

• bug report

• SPEECON

• NETWORK-DC

• exchange ideas

Page 13

References

• H. van den Heuvel et al., SLR Validation: present state of affairs and prospects, LREC 2000.

• H. van den Heuvel, The Art of Validation, to appear in ELRA News, Oct. 2000.

• www.spex.nl/spex