IBM - CVUT Student Research Projects
Creating mp3 from text documents &
Viewing text information via voice
Zlatuška Martin
Kašpárek Jaroslav
Mařan Jiří
Kadlec Antonín
Příhoda Karel
Meloun Jaroslav
Project goals
Creating mp3 from text documents
• Create an mp3 generator that produces a set of mp3 files with ID3v2 tags from a text file.
• The ID3 tag will hold the original synchronized text, chapter names, images and other information.
• The input is a text document; the output is a compressed directory with mp3 files and images.
• The project can be extended to read Wikipedia, selected news portals or email.
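As a rough illustration of the tag goal above, the following sketch builds an ID3v2.3 tag holding a title and a synchronized-text (SYLT) frame. It is not the project's generator: the function names, the use of the TIT2 frame for a chapter name, the UTF-16 encoding and the "eng" language code are all assumptions made for the example, and images (APIC frames) are left out.

```python
import struct

BOM = b"\xff\xfe"  # UTF-16 little-endian byte order mark


def synchsafe(n: int) -> bytes:
    """Encode n as a 4-byte ID3v2 synchsafe integer (7 payload bits per byte)."""
    if not 0 <= n < (1 << 28):
        raise ValueError("value does not fit in 28 bits")
    return bytes((n >> shift) & 0x7F for shift in (21, 14, 7, 0))


def frame(frame_id: str, body: bytes) -> bytes:
    """ID3v2.3 frame: 4-char ID, 32-bit big-endian size, 2 flag bytes, body."""
    return frame_id.encode("ascii") + struct.pack(">IH", len(body), 0) + body


def title_frame(title: str) -> bytes:
    # Assumption: the chapter name is stored as the track title (TIT2).
    # Encoding byte 0x01 = UTF-16 with BOM, so Czech characters survive.
    return frame("TIT2", b"\x01" + BOM + title.encode("utf-16-le"))


def sylt_frame(words, language: str = "eng") -> bytes:
    """words: iterable of (word, start_ms) pairs in playback order."""
    body = bytearray(b"\x01")           # text encoding: UTF-16 with BOM
    body += language.encode("ascii")    # 3-letter ISO 639-2 language code
    body += b"\x02"                     # time stamp format: milliseconds
    body += b"\x02"                     # content type: text transcription
    body += BOM + b"\x00\x00"           # empty, terminated content descriptor
    for word, start_ms in words:
        body += BOM + word.encode("utf-16-le") + b"\x00\x00"
        body += struct.pack(">I", start_ms)
    return frame("SYLT", bytes(body))


def build_tag(frames) -> bytes:
    """Prefix the frames with the 10-byte ID3v2.3 header."""
    payload = b"".join(frames)
    return b"ID3" + b"\x03\x00" + b"\x00" + synchsafe(len(payload)) + payload


words = [("Hello", 0), ("world", 480)]
tag = build_tag([title_frame("Kapitola 1"), sylt_frame(words)])
```

The resulting bytes would be written in front of the mp3 audio data. Note that the tag size in the header is synchsafe, while ID3v2.3 frame sizes are plain 32-bit integers.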
Viewing text information via voice
• Design and create a multiplatform program for playing mp3s created from text documents.
• The player should have an intuitive user interface and be operable with several voice commands.
• The player reads the given text and allows skipping between paragraphs or chapters, or jumping to given keywords.
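Skipping between paragraphs and jumping to keywords could be driven by the time marks stored with the text. The sketch below is only one possible approach, not the player's actual code (which is written in Flash); the class name `Navigator`, its methods and the millisecond time base are assumptions for illustration.

```python
from bisect import bisect_right


class Navigator:
    """Maps a playback position to seek targets using time marks."""

    def __init__(self, paragraph_starts_ms, words):
        # paragraph_starts_ms: sorted start times of paragraphs (or chapters)
        # words: list of (word, start_ms) pairs sorted by start time
        self.starts = list(paragraph_starts_ms)
        self.words = list(words)

    def next_paragraph(self, position_ms):
        """Start of the first paragraph after the current position, or None."""
        i = bisect_right(self.starts, position_ms)
        return self.starts[i] if i < len(self.starts) else None

    def previous_paragraph(self, position_ms):
        """Start of the paragraph before the one currently playing."""
        current = bisect_right(self.starts, position_ms) - 1
        return self.starts[max(current - 1, 0)]

    def seek_keyword(self, keyword, position_ms):
        """Start time of the next occurrence of keyword, or None."""
        wanted = keyword.lower()
        for word, start_ms in self.words:
            if start_ms > position_ms and word.lower() == wanted:
                return start_ms
        return None


nav = Navigator(
    [0, 5000, 12000],
    [("user", 100), ("Server", 5200), ("server", 13000)],
)
```

The same class works for chapters if chapter start times are passed instead of paragraph start times.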
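Text→mp3 solution architecture
[Diagram: the user sends a text URL from the Firefox plugin to the server. On the server, the UIMA framework connects the stages html→text, text→wav, wav→mp3, ID3 tags and zip. The server returns a link to an archive with the mp3 files.]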
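The stage chain in this architecture could be sketched as a list of functions that each enrich a shared job record. This does not use the UIMA API; every stage except the zip step is a placeholder, and all function names, dictionary keys and file names are hypothetical.

```python
import io
import zipfile


def html_to_text(job):
    # Placeholder: a real stage would download and parse job["url"].
    job["paragraphs"] = ["Example paragraph."]
    return job


def text_to_wav(job):
    # Placeholder: a real stage would call a speech synthesizer.
    job["wav"] = [b"wav-placeholder" for _ in job["paragraphs"]]
    return job


def wav_to_mp3(job):
    # Placeholder: a real stage would call an mp3 encoder.
    job["mp3"] = [b"mp3-placeholder" for _ in job["wav"]]
    return job


def write_id3_tags(job):
    # Placeholder: a real stage would prepend an ID3v2 tag to each file.
    job["mp3"] = [b"ID3" + data for data in job["mp3"]]
    return job


def make_zip(job):
    # Pack all generated files into one in-memory archive.
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for number, data in enumerate(job["mp3"], start=1):
            archive.writestr(f"part{number:03d}.mp3", data)
    job["archive"] = buffer.getvalue()
    return job


PIPELINE = [html_to_text, text_to_wav, wav_to_mp3, write_id3_tags, make_zip]


def run(url):
    """Pass a job through every stage in order and return the result."""
    job = {"url": url}
    for stage in PIPELINE:
        job = stage(job)
    return job


result = run("http://example.com/")
```

Keeping each stage behind the same call shape means a stage can be replaced or reordered without touching the others, which is roughly the role the framework plays in the diagram.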
Description of text→mp3 solution
• The user sends the URL of the desired website to the text→mp3 server with the Firefox plugin.
• A CGI script on the server passes the parameters on.
(The following server parts are connected via the UIMA framework.)
• html→text – downloads the web page and parses it into text paragraphs, headlines and images
• text→wav – converts text to audio and generates words with time marks
• wav→mp3 – encodes raw audio to mp3 format
• ID3 tags – writes synchronized text and other information into the ID3 tag
• zip – packs the directory with all generated data and sends a download link back to the user
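The html→text step above could look roughly like the following sketch, which splits a page into headlines, paragraphs and image references using Python's standard `html.parser`. It is an assumption about how such a stage might work, not the project's implementation; the class name `PageExtractor` and the output format are made up for the example, and downloading the page is omitted.

```python
from html.parser import HTMLParser

HEADLINE_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class PageExtractor(HTMLParser):
    """Collects ("headline" | "paragraph" | "image", value) items in page order."""

    def __init__(self):
        super().__init__()
        self.items = []
        self._kind = None   # kind of the block currently being read
        self._buffer = []   # text pieces of the current block

    def handle_starttag(self, tag, attrs):
        if tag in HEADLINE_TAGS:
            self._kind, self._buffer = "headline", []
        elif tag == "p":
            self._kind, self._buffer = "paragraph", []
        elif tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.items.append(("image", src))

    def handle_data(self, data):
        if self._kind:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if self._kind and (tag == "p" or tag in HEADLINE_TAGS):
            # Collapse runs of whitespace left over from the markup.
            text = " ".join("".join(self._buffer).split())
            if text:
                self.items.append((self._kind, text))
            self._kind = None


page = "<h1>News</h1><p>First <b>bold</b> text.</p><img src='a.png'><p>Second.</p>"
extractor = PageExtractor()
extractor.feed(page)
extractor.close()
```

Inline markup such as `<b>` is dropped while its text is kept, so each paragraph reaches the text→wav step as plain text.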
MP3 player
• intuitive user interface
• speech command interface
• written in Flash
• multiplatform: PC (Linux), PC (Windows), PDA, smartphones, carPC
[Video: player demonstration]
Project status
• text→mp3
  • server parts
    – text→wav (100%)
    – wav→mp3 (100%)
    – ID3 tags (100%)
    – connection via UIMA framework (70%)
    – html→text (50%)
    – zip (100%)
    – CGI scripts (0%)
  • client parts
    – Firefox plugin (80%)
• mp3 player
  • finished and fully functional
  • tested under Windows XP; testing on other platforms is not finished yet
  • usability tests are in progress