Some Thoughts on HPC in Natural Language Engineering
![Page 1: Some Thoughts on HPC in Natural Language Engineering](https://reader035.vdocument.in/reader035/viewer/2022062809/56815928550346895dc6512e/html5/thumbnails/1.jpg)
Some Thoughts on HPC in Natural Language Engineering
Steven Bird
University of Melbourne & University of Pennsylvania
![Page 2: Some Thoughts on HPC in Natural Language Engineering](https://reader035.vdocument.in/reader035/viewer/2022062809/56815928550346895dc6512e/html5/thumbnails/2.jpg)
Sponsorship
Natural Language Engineering: Integrating Parallel and Parametric Processing
Victorian Partnership for Advanced Computing Expertise Grant EPPNME092.2003
![Page 3: Some Thoughts on HPC in Natural Language Engineering](https://reader035.vdocument.in/reader035/viewer/2022062809/56815928550346895dc6512e/html5/thumbnails/3.jpg)
NLE Application Areas
Information Extraction, Information Retrieval, Authoring Tools, Language Analysis, Language Understanding, Knowledge Representation, Knowledge Discovery
Spoken Language Input, Written Language Input, Natural Language Generation, Spoken Output, Multilinguality, Multimodality, Discourse and Dialogue
Spoken dialogue systems, cross-language information retrieval, word-sense disambiguation, multi-document summarisation, natural language database interfaces
![Page 4: Some Thoughts on HPC in Natural Language Engineering](https://reader035.vdocument.in/reader035/viewer/2022062809/56815928550346895dc6512e/html5/thumbnails/4.jpg)
Some NLE Applications in Detail
Information extraction from broadcast news: tokenization, alignment, entity detection, coreference resolution, semantic mapping
Spoken language dialogue systems (SLDS): speech recognition, parsing, user modelling, discourse management, generation, synthesis
Language analysis: interlinear text annotation, lexicon development, morphosyntactic grammar development
![Page 5: Some Thoughts on HPC in Natural Language Engineering](https://reader035.vdocument.in/reader035/viewer/2022062809/56815928550346895dc6512e/html5/thumbnails/5.jpg)
Meta Activities
Discovery: What tools work with data in format X? What lexical resources exist for language Y?
Reuse: diverse implementation frameworks; component integration, wrapping, etc.
Training and evaluation: parametric and parallel processing; comparing systems running on the same data; gold standard vs theory comparison; analyzing interaction logs
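The "parametric and parallel processing" and "gold standard" items above can be illustrated with a small parameter sweep. This is only a sketch under stated assumptions: the detector, its `min_len` parameter, and the gold data are hypothetical stand-ins, and a local thread pool stands in for whatever parallel resource would really run the jobs.

```python
from concurrent.futures import ThreadPoolExecutor

# Toy gold standard (hypothetical data): token -> is it a named entity?
GOLD = [("The", False), ("Melbourne", True), ("node", False),
        ("VPAC", True), ("and", False), ("Penn", True)]


def toy_detector(token: str, min_len: int) -> bool:
    """Hypothetical component: capitalised tokens of at least min_len chars."""
    return token[0].isupper() and len(token) >= min_len


def evaluate(min_len: int) -> dict:
    """Score one parameter setting against the gold standard."""
    tp = fp = fn = 0
    for token, is_entity in GOLD:
        predicted = toy_detector(token, min_len)
        if predicted and is_entity:
            tp += 1
        elif predicted:
            fp += 1
        elif is_entity:
            fn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return {"min_len": min_len, "precision": precision, "recall": recall}


def sweep(values):
    """Run one evaluation per parameter value; the runs are independent."""
    with ThreadPoolExecutor() as pool:
        return list(pool.map(evaluate, values))  # results keep input order


if __name__ == "__main__":
    for row in sweep([1, 4, 5]):
        print(row)
```

Each setting is evaluated independently of the others, which is what makes this kind of experiment easy to farm out to many processors.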
![Page 6: Some Thoughts on HPC in Natural Language Engineering](https://reader035.vdocument.in/reader035/viewer/2022062809/56815928550346895dc6512e/html5/thumbnails/6.jpg)
Learn about NLE
This department hosts a mirror of the ACL digital anthology
50k pages, 40 years http://www.cs.mu.oz.au/acl/
![Page 7: Some Thoughts on HPC in Natural Language Engineering](https://reader035.vdocument.in/reader035/viewer/2022062809/56815928550346895dc6512e/html5/thumbnails/7.jpg)
SLDS Architecture
![Page 8: Some Thoughts on HPC in Natural Language Engineering](https://reader035.vdocument.in/reader035/viewer/2022062809/56815928550346895dc6512e/html5/thumbnails/8.jpg)
SLDS Components
![Page 9: Some Thoughts on HPC in Natural Language Engineering](https://reader035.vdocument.in/reader035/viewer/2022062809/56815928550346895dc6512e/html5/thumbnails/9.jpg)
Another SLDS Architecture
![Page 10: Some Thoughts on HPC in Natural Language Engineering](https://reader035.vdocument.in/reader035/viewer/2022062809/56815928550346895dc6512e/html5/thumbnails/10.jpg)
Observations
Common components, different arrangements
Multiple components for doing the same task
Most NLE components convert between information types:
Parser: from strings to trees
ASR: from speech to text
Summariser: from text to selected text
But: many processes benefit from other information sources (e.g. exploiting intonation in input)
Input and output can be aligned
Solution: multilayer annotations
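One way to picture the multilayer idea is a component that adds a new layer to a shared document rather than replacing its input, so later components can still consult earlier layers such as intonation. The sketch below is illustrative only: the layer names, the `Span` and `Document` types, and the rising-contour rule are assumptions, not part of the talk.

```python
from dataclasses import dataclass, field


@dataclass
class Span:
    start: float  # seconds into the signal
    end: float
    label: str


@dataclass
class Document:
    # layer name -> spans, all aligned to the same timeline
    layers: dict = field(default_factory=dict)


def overlapping(doc: Document, layer: str, start: float, end: float) -> list:
    """Spans in `layer` that overlap the interval [start, end)."""
    return [s for s in doc.layers.get(layer, [])
            if s.start < end and s.end > start]


def tag_dialogue_acts(doc: Document) -> Document:
    """Hypothetical component: adds a 'dialogue_act' layer and keeps the rest.

    An utterance overlapping a rising contour is labelled a question.
    """
    acts = []
    for utt in doc.layers.get("utterance", []):
        contours = overlapping(doc, "intonation", utt.start, utt.end)
        rising = any(c.label == "rising" for c in contours)
        acts.append(Span(utt.start, utt.end,
                         "question" if rising else "statement"))
    doc.layers["dialogue_act"] = acts
    return doc


def example_document() -> Document:
    """Made-up two-utterance example with three aligned layers."""
    return Document(layers={
        "word": [Span(0.0, 0.4, "you"), Span(0.4, 0.9, "called"),
                 Span(1.5, 1.9, "yes")],
        "utterance": [Span(0.0, 0.9, "u1"), Span(1.5, 1.9, "u2")],
        "intonation": [Span(0.6, 0.9, "rising"), Span(1.5, 1.9, "falling")],
    })
```

Because the word and intonation layers are still present after the component runs, input and output stay aligned through their shared time offsets.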
![Page 11: Some Thoughts on HPC in Natural Language Engineering](https://reader035.vdocument.in/reader035/viewer/2022062809/56815928550346895dc6512e/html5/thumbnails/11.jpg)
Multilayer annotations
![Page 12: Some Thoughts on HPC in Natural Language Engineering](https://reader035.vdocument.in/reader035/viewer/2022062809/56815928550346895dc6512e/html5/thumbnails/12.jpg)
Multilayer Annotations
![Page 13: Some Thoughts on HPC in Natural Language Engineering](https://reader035.vdocument.in/reader035/viewer/2022062809/56815928550346895dc6512e/html5/thumbnails/13.jpg)
Annotation Graphs
Labelled digraphs with timestamped nodes
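As a rough illustration of "labelled digraphs with timestamped nodes", the sketch below stores nodes as timestamps and arcs as typed, labelled edges between them. The class and method names are invented for this example and are not the AGTK API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Arc:
    src: int    # id of the start node
    dst: int    # id of the end node
    type: str   # annotation layer, e.g. "word" or "speaker"
    label: str  # annotation content


class AnnotationGraph:
    """Minimal sketch of a labelled digraph over timestamped nodes."""

    def __init__(self):
        self.times = {}  # node id -> timestamp in seconds
        self.arcs = []

    def add_node(self, node_id: int, time: float) -> None:
        self.times[node_id] = time

    def add_arc(self, src: int, dst: int, type_: str, label: str) -> Arc:
        if src not in self.times or dst not in self.times:
            raise KeyError("both end nodes must exist before adding an arc")
        arc = Arc(src, dst, type_, label)
        self.arcs.append(arc)
        return arc

    def arcs_of_type(self, type_: str) -> list:
        return [a for a in self.arcs if a.type == type_]

    def span(self, arc: Arc) -> tuple:
        """Time interval covered by an arc, read off its end nodes."""
        return (self.times[arc.src], self.times[arc.dst])


def example_graph() -> AnnotationGraph:
    """Two word arcs and one speaker arc over the same three nodes."""
    g = AnnotationGraph()
    for node_id, t in [(0, 0.0), (1, 0.4), (2, 0.9)]:
        g.add_node(node_id, t)
    g.add_arc(0, 1, "word", "you")
    g.add_arc(1, 2, "word", "called")
    g.add_arc(0, 2, "speaker", "A")
    return g
```

Different layers share nodes, so a speaker turn and the words inside it are aligned simply by pointing at the same timestamps.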
![Page 14: Some Thoughts on HPC in Natural Language Engineering](https://reader035.vdocument.in/reader035/viewer/2022062809/56815928550346895dc6512e/html5/thumbnails/14.jpg)
Annotation Graphs: complex example
AGTK: Annotation Graph Toolkit (library, applications), agtk.sourceforge.net
![Page 15: Some Thoughts on HPC in Natural Language Engineering](https://reader035.vdocument.in/reader035/viewer/2022062809/56815928550346895dc6512e/html5/thumbnails/15.jpg)
NLE and Grids
NLE applications are typically constructed out of numerous components, each component responsible for a specialised task, executed against large data sets
To use grids in NLE: subscribe to a model which allows automated discovery of data and components, flexible design of applications, coordination of execution, storage of results
Ideally: view the grid as a commodity, hidden from application developers
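Automated discovery of data and components amounts to querying metadata records. The sketch below is a guess at what such a lookup might look like: the field names are borrowed from Dublin Core elements (`title`, `type`, `format`, `language`), while the records themselves and the `discover` function are hypothetical.

```python
# Hypothetical metadata records using Dublin Core element names as keys.
RECORDS = [
    {"title": "toy-aligner", "type": "Software", "format": "text/xml", "language": "en"},
    {"title": "toy-tagger", "type": "Software", "format": "text/plain", "language": "en"},
    {"title": "toy-lexicon", "type": "Dataset", "format": "text/xml", "language": "fr"},
]


def discover(records, **criteria):
    """Return titles of records whose fields match every given criterion."""
    return [r["title"] for r in records
            if all(r.get(key) == value for key, value in criteria.items())]


if __name__ == "__main__":
    # "What tools work with data in format X?"
    print(discover(RECORDS, type="Software", format="text/xml"))
    # "What lexical resources exist for language Y?"
    print(discover(RECORDS, type="Dataset", language="fr"))
```

A real repository would sit behind a service interface, but the question being asked is the same kind of field match.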
![Page 16: Some Thoughts on HPC in Natural Language Engineering](https://reader035.vdocument.in/reader035/viewer/2022062809/56815928550346895dc6512e/html5/thumbnails/16.jpg)
Architectural Components
Data: language resources for analysis, e.g. Switchboard, 2400 annotated telephone conversations (26 CDs)
Software components: minimal individual functional units with a common interface specification, e.g. Annotation Server, Alignment, ASR, Data Source Packaging, Format Conversion, Text Annotation, Lexicon Server, Semantic Mapping
Metadata repositories: Dublin Core Application Profile for NLE resources
Application: data + components + processing instructions; declarative specification in XML
Grid service: computational and storage resources for application execution
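To make "data + components + processing instructions" concrete, here is a minimal sketch of a declarative XML specification and a runner that applies the named components in order. The element names, the component registry, and the dictionary-based common interface are all assumptions made for illustration; the slides do not show the actual schema, and here everything runs locally rather than on a grid.

```python
import xml.etree.ElementTree as ET

# Hypothetical application spec: inline data plus an ordered component list.
SPEC = """<application name="demo">
  <data>Grid services for language engineering</data>
  <component name="tokenize"/>
  <component name="lowercase"/>
</application>"""


# Common interface (assumed): each component maps a document dict to a new one.
def tokenize(doc: dict) -> dict:
    return {**doc, "tokens": doc["text"].split()}


def lowercase(doc: dict) -> dict:
    return {**doc, "tokens": [t.lower() for t in doc["tokens"]]}


REGISTRY = {"tokenize": tokenize, "lowercase": lowercase}


def run_application(spec_xml: str) -> dict:
    """Parse the spec and apply each named component in document order."""
    root = ET.fromstring(spec_xml)
    doc = {"text": (root.findtext("data") or "").strip()}
    for element in root.findall("component"):
        name = element.get("name")
        if name not in REGISTRY:
            raise KeyError(f"unknown component: {name}")
        doc = REGISTRY[name](doc)
    return doc


if __name__ == "__main__":
    print(run_application(SPEC)["tokens"])
```

Because the specification only names components and their order, the same description could in principle be handed to a scheduler that decides where each step executes.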
![Page 17: Some Thoughts on HPC in Natural Language Engineering](https://reader035.vdocument.in/reader035/viewer/2022062809/56815928550346895dc6512e/html5/thumbnails/17.jpg)
Architecture
![Page 18: Some Thoughts on HPC in Natural Language Engineering](https://reader035.vdocument.in/reader035/viewer/2022062809/56815928550346895dc6512e/html5/thumbnails/18.jpg)
Conclusion
Natural Language Engineering is an interesting test case for grid services:
many mature component technologies
applications that are both data and processor intensive
applications for building the multilingual information society of the future...