D-square (D-kwadraat)
Digital Databases and Tools for Dutch Dialect Dictionaries
Jos Swanenberg, Folkert de Vriend & Roeland van Hout
Topics
• Historical background
• Overview of project phases
• Conversion procedures
• New encoding for data
• End user access to the data
Volumes
1. Agricultural terminology
2. Other technical or craft terminologies
3. Common vocabulary
Macro structure WBD & WLD
Constituents:
- Lexical meaning (title, description of the concept)
- Lexical form (‘dutchified’ entry)
- Phonetic form
- Sources
- Geographical code (+ map)
Micro structure WBD & WLD
WBD & WLD
Example of WLD, volume 1:
1960-1980 Filing cards
1985-1995 Word processor, Genoveva
1995-2007 Databases + word proc.
2002 Online database WBD
2003-2007 D-square
History of automation
WBD & WLD
Filing cards:
WBD & WLD
Example of WLD, volume 1:
Online database WBD
www.ru.nl/dialect
Example from database: “Meikever” (Eng: “maybug”)
Example of WBD, volume 3
Online database, query
Online database, query result
[Diagram: data flow in WBD & WLD, running Analog > Digital > Digital > Analog]
Roles: Editors / Management; Users
Analog sources: Filing cards; Questionnaires Nijmegen and Leuven; Questionnaires (chiefly) Meertens
Raw data: Raw data in FileMaker Pro; Vol. III in FileMaker Pro
Edited data: Vol. I+II in MacWrite; (parts of) Vol. I+II in MS-Word; Vol. III in MS-Word; Edited data in XML
Enriched data: XML
Output for users: Website WBD/WLD with tools for searching and cartography; Online DB WBD (Polderland); SGV on CD (Polderland); Specialized print editions (dialect atlas or local dictionary)
1. Conversion to a new format
2. End user access to data
3. Enrichment of data
4. Data management
Overview phases D-square
Phase 1: Conversion to a new format
Reasoning behind new encoding
• XML, not relational database
• Tailored to WBD and WLD
• Flexible enough to be used for other dialect dictionaries
• Based on standard: LMF (ISO TC 37/SC 4)
Example from WBD, meikever
Example from database: “Meikever” (Eng: “maybug”)
Example XML-encoding
<LEXICON dialect="Brabants">
  <ENTRY>
    <META> … </META>
    <CONCEPT lang="dutch" ontol_id="492">Meikever</CONCEPT>
    <DATA>
      <VARIANT type="heteronym">Bakkertje
        <VARIANT type="lexical">bakkerke
          <VARIANT type="raw" import="diplomatic">bakkərkə
            <LOCATION source1="N83">K 178</LOCATION>
          </VARIANT>
        </VARIANT>
      </VARIANT>
    </DATA>
  </ENTRY>
  …
</LEXICON>
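The nested VARIANT elements in the "Meikever" example can be read as a chain from heteronym to lexical form to raw phonetic form to location. The following is a minimal sketch of how such an entry might be flattened into one record per attestation, for instance as a first step towards a format conversion. It assumes the element and attribute names shown in the example; the function names `own_text` and `flatten` and the record keys are hypothetical, and the META element is left out because its content is elided in the example.

```python
import xml.etree.ElementTree as ET

# Sample entry copied from the "Meikever" example; META is omitted here
# because its content is not shown in the source.
SAMPLE = """\
<LEXICON dialect="Brabants">
  <ENTRY>
    <CONCEPT lang="dutch" ontol_id="492">Meikever</CONCEPT>
    <DATA>
      <VARIANT type="heteronym">Bakkertje
        <VARIANT type="lexical">bakkerke
          <VARIANT type="raw" import="diplomatic">bakkərkə
            <LOCATION source1="N83">K 178</LOCATION>
          </VARIANT>
        </VARIANT>
      </VARIANT>
    </DATA>
  </ENTRY>
</LEXICON>
"""


def own_text(elem):
    """Return the text directly inside an element, before its first child."""
    return (elem.text or "").strip()


def flatten(xml_text):
    """Turn nested VARIANT elements into one flat record per LOCATION."""
    root = ET.fromstring(xml_text)
    rows = []
    for entry in root.iter("ENTRY"):
        concept = entry.find("CONCEPT")
        # Walk the three nesting levels: heteronym > lexical > raw.
        for heteronym in entry.findall("DATA/VARIANT"):
            for lexical in heteronym.findall("VARIANT"):
                for raw in lexical.findall("VARIANT"):
                    for loc in raw.findall("LOCATION"):
                        rows.append({
                            "dialect": root.get("dialect"),
                            "concept": own_text(concept),
                            "heteronym": own_text(heteronym),
                            "lexical": own_text(lexical),
                            "phonetic": own_text(raw),
                            "source": loc.get("source1"),
                            "location": own_text(loc),
                        })
    return rows


if __name__ == "__main__":
    for row in flatten(SAMPLE):
        print(row)
```

Each record repeats the concept and the higher-level variants, which is redundant but convenient for export to a table or for plotting locations on a map.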
Example from WALD
Example from dictionary of the dialects of Zeeland
Phase 2: end user access to data
Small scale survey
- Tools: Search engine, Cartographic tool, Format conversions.
- Enrichment: POS, morphemes (syllables)
- Links to other resources: Other dictionaries, questionnaires, FAND, MAND.
Difficulties to overcome
• Search engine
  - Getting from question to query (coaching needed). Is SmartMatch (fuzzy matching) helpful in this regard?
  - Speed of XML searching
• Cartography
  - Availability of base maps
• Links to other resources
  - Differences in interpretation
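To illustrate what fuzzy matching could offer a user who does not know the exact spelling of a headword, here is a minimal sketch. It does not show SmartMatch itself, whose workings are not described here; it uses Python's standard `difflib` as a stand-in. The headword list and the function name `suggest` are hypothetical.

```python
import difflib

# Hypothetical sample of headwords; a real list would come from the lexicon.
HEADWORDS = ["meikever", "bakkertje", "bakkerke", "molenaar"]


def suggest(query, headwords, n=3, cutoff=0.6):
    """Return up to n headwords that resemble the query, ignoring case."""
    by_key = {word.lower(): word for word in headwords}
    hits = difflib.get_close_matches(query.lower(), list(by_key), n=n, cutoff=cutoff)
    return [by_key[hit] for hit in hits]


if __name__ == "__main__":
    print(suggest("meikevr", HEADWORDS))    # misspelled query
    print(suggest("bakkerkje", HEADWORDS))  # query between two variants
```

A lower `cutoff` returns more candidates at the cost of more noise, so the threshold would need tuning against real user queries.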
Information about D-square
www.ru.nl/dialect
Questions?