IBM - CVUT Student Research Projects

Creating mp3 from text documents & Viewing text information via voice

Zlatuška Martin (zlatum1@fel.cvut.cz)

Kašpárek Jaroslav ([email protected])

Mařan Jiří ([email protected])

Kadlec Antonín ([email protected])

Příhoda Karel ([email protected])

Meloun Jaroslav ([email protected])

Project goals

Creating mp3 from text documents

• Create an mp3 generator that produces a set of mp3 files with ID3v2 tags from a text file.

• The ID3 tags will hold the original synchronized text, chapter names, images and other information.

• Input is a text document; output is a compressed directory with mp3 files and images.

• The project can be extended to read Wikipedia, selected news portals or email.
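The slides do not say which ID3v2 frame carries the synchronized text. One plausible choice is the standard SYLT (synchronized lyrics/text) frame, and the sketch below builds the body of such a frame from word time marks. The function name `build_sylt_body`, the Latin-1 encoding and the sample words are assumptions made for illustration; Czech text would need a Unicode encoding instead, and the frame header is left out because its size field differs between ID3v2.3 and ID3v2.4.

```python
import struct


def build_sylt_body(words, language="eng", descriptor=""):
    """Build the body of an ID3v2 SYLT frame (hypothetical helper).

    words is a list of (text, start_ms) pairs, e.g. the time marks
    produced by a text-to-speech step.
    """
    if len(language) != 3:
        raise ValueError("language must be a 3-letter code")
    body = bytearray()
    body += b"\x00"                    # text encoding: ISO-8859-1
    body += language.encode("ascii")   # 3-letter language code
    body += b"\x02"                    # time stamp format: milliseconds
    body += b"\x02"                    # content type: text transcription
    body += descriptor.encode("latin-1") + b"\x00"  # content descriptor
    for text, start_ms in words:
        body += text.encode("latin-1") + b"\x00"    # terminated text
        body += struct.pack(">I", start_ms)         # 32-bit big-endian time
    return bytes(body)


if __name__ == "__main__":
    print(build_sylt_body([("Hello", 0), ("world", 480)]))
```

Storing one entry per word keeps the tag close to the "words with time marks" that the text→wav step produces, at the cost of a larger tag than one entry per sentence.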

Viewing text information via voice

• Design and create a multiplatform program for playing mp3s created from text documents.

• The player should have an intuitive user interface and be operable with several voice commands.

• The player reads the given text and allows jumping between paragraphs and chapters, or to given keywords.
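Jumping between paragraphs or to a keyword reduces to a lookup in the time marks stored with the audio. The sketch below shows one way this could work. It is written in Python only for illustration (the player itself is written in flash), and the function names and the data layout (a sorted list of paragraph start times plus a list of `(word, start_ms)` pairs) are assumptions, not the project's actual format.

```python
from bisect import bisect_right


def next_paragraph(paragraph_starts_ms, position_ms):
    """Return the start time of the next paragraph, or None at the end."""
    i = bisect_right(paragraph_starts_ms, position_ms)
    return paragraph_starts_ms[i] if i < len(paragraph_starts_ms) else None


def previous_paragraph(paragraph_starts_ms, position_ms):
    """Return the start time of the paragraph before the current one."""
    current = bisect_right(paragraph_starts_ms, position_ms) - 1
    return paragraph_starts_ms[max(current - 1, 0)]


def seek_keyword(words, keyword, position_ms):
    """Return the start time of the next occurrence of keyword, or None."""
    key = keyword.lower()
    for word, start_ms in words:
        # Ignore case and trailing punctuation when comparing words.
        if start_ms > position_ms and word.lower().strip(".,;:!?") == key:
            return start_ms
    return None


if __name__ == "__main__":
    words = [("Chapter", 0), ("one.", 400), ("Text", 1000), ("to", 1300),
             ("speech.", 1500), ("More", 2500), ("text.", 2900)]
    paragraphs = [0, 1000, 2500]
    print(next_paragraph(paragraphs, 1200))
    print(seek_keyword(words, "text", 1100))
```

The same lookups apply to chapters by passing a list of chapter start times instead of paragraph start times.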

Text→mp3 solution architecture

User (firefox plugin) → text URL → Server

Server (UIMA framework): html→text → text→wav → wav→mp3 → ID3 tags → zip

Server → link to archive with mp3 files → User

Description of text→mp3 solution

• The user sends the URL of the desired web page to the text→mp3 server with the firefox plugin.

• A CGI script on the server hands over the parameters.

(the following server parts are connected via the UIMA framework)

• html→text – downloads the web page and parses it into text paragraphs, headlines and images

• text→wav – converts text to audio and generates words with time marks

• wav→mp3 – encodes raw audio to mp3 format

• ID3 tags – writes synchronized text and other information into the ID3 tag

• zip – packs the directory with all generated data and sends a download link back to the user
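As an illustration of the html→text step, the following minimal sketch splits an HTML page into headlines, paragraphs and images using Python's standard `html.parser` module. It is not the project's UIMA component: the class `BlockExtractor`, the function `html_to_blocks` and the choice of tags are assumptions, and downloading the page is left out.

```python
from html.parser import HTMLParser


class BlockExtractor(HTMLParser):
    """Collect (kind, content) blocks from an HTML page (hypothetical helper)."""

    HEADLINES = {"h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__()
        self.blocks = []      # result: ("headline" | "paragraph" | "image", content)
        self._kind = None     # kind of the block currently being read
        self._buffer = []     # text pieces of the current block

    def handle_starttag(self, tag, attrs):
        if tag in self.HEADLINES:
            self._open("headline")
        elif tag == "p":
            self._open("paragraph")
        elif tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.blocks.append(("image", src))

    def handle_endtag(self, tag):
        if tag in self.HEADLINES or tag == "p":
            self.finish()

    def handle_data(self, data):
        if self._kind is not None:
            self._buffer.append(data)

    def _open(self, kind):
        self.finish()         # an unclosed block ends where the next one starts
        self._kind = kind

    def finish(self):
        """Store the current block, if any, with whitespace normalized."""
        if self._kind is not None:
            text = " ".join("".join(self._buffer).split())
            if text:
                self.blocks.append((self._kind, text))
        self._kind = None
        self._buffer = []


def html_to_blocks(html):
    parser = BlockExtractor()
    parser.feed(html)
    parser.close()
    parser.finish()
    return parser.blocks


if __name__ == "__main__":
    page = "<h1>Title</h1><p>First <b>bold</b> paragraph.</p><img src='pic.png'>"
    for block in html_to_blocks(page):
        print(block)
```

In this sketch an image nested inside a paragraph is emitted before that paragraph's text, because the paragraph is stored only when it ends. A real parser would also need to skip navigation, scripts and other non-content markup.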

MP3 player

• intuitive user interface

• speech command interface

• written in flash

• multiplatform: PC - linux, PC - windows, PDA, smartphones, carPC

Video with player demonstration

Project status

• text→mp3

  • server parts
    – text→wav (100%)
    – wav→mp3 (100%)
    – ID3 tags (100%)
    – connection via UIMA framework (70%)
    – html→text (50%)
    – zip (100%)
    – cgi scripts (0%)

  • client parts
    – firefox plugin (80%)

• mp3 player

  • finished and fully functional

  • tested under Windows XP; testing on other platforms is not finished yet

  • usability tests are in progress