database management
DESCRIPTION
Database Management. 2nd EUKLEMS Consortium Meeting, 9-11 June 2005, Helsinki This project is funded by the European Commission, Research Directorate General as part of the 6th Framework Programme, Priority 8, "Policy Support and Anticipating Scientific and Technological Needs". - PowerPoint PPT PresentationTRANSCRIPT
![Page 1: Database Management](https://reader036.vdocument.in/reader036/viewer/2022062422/56813ae6550346895da34671/html5/thumbnails/1.jpg)
Database Management
2nd EUKLEMS Consortium Meeting, 9-11 June 2005, HelsinkiThis project is funded by the European Commission, Research Directorate General as part of the 6th Framework Programme, Priority 8, "Policy Support and Anticipating Scientific and Technological Needs".
![Page 2: Database Management](https://reader036.vdocument.in/reader036/viewer/2022062422/56813ae6550346895da34671/html5/thumbnails/2.jpg)
EUKLEMS Database requirements
• Interface with Consortium inputs
• Produce ‘consistent’ KLEMS data from disparate data
• Store EUKLEMS database
• Generate Productivity estimates
• Store and document ‘Assumptions’ and Methods
• Interface with EUKLEMS public output
![Page 3: Database Management](https://reader036.vdocument.in/reader036/viewer/2022062422/56813ae6550346895da34671/html5/thumbnails/3.jpg)
Examples of Practical Problems (1)• Aggregation:
• Chain aggregate 2-digit output by industry to a 1-digit level
• Disaggregation:• Estimate computer mfg. and other machinery output given output of total
machinery mfg.
• Ensure that the results aggregate to the observed total.
• Employ available indicators (“pattern data”) of how to split the two.
• Balancing• Estimate investment by industry and by asset type.
• Observe detail in only 1 dimension at a time.• Estimates aggregated by type to equal the observed totals by industry.• Estimates aggregated by industry to equal the observed totals by type.• Employ available indicators (e.g. from other countries) to estimate the
detailed breakdown.
![Page 4: Database Management](https://reader036.vdocument.in/reader036/viewer/2022062422/56813ae6550346895da34671/html5/thumbnails/4.jpg)
Examples of Practical Problems (2)
• Concording• Converting data for timeseries breaks in national industry codes
• Converting data from national industry code to EUKLEMS definitions
• Cross-classification (technically the same as concording) • Estimating labor by ‘quality-level’ using data by occupation.
![Page 5: Database Management](https://reader036.vdocument.in/reader036/viewer/2022062422/56813ae6550346895da34671/html5/thumbnails/5.jpg)
Productivity Research Data System
• Programmed as application layer on top of SAS, designed to overcome practical data hurdles
• Data to be manipulated are coded in relational databases
• Identities coded in meta data
• SAS macros to manipulate the data make use of the meta data• Aggregation, disaggregation, balancing, and concording
• Specialized tools to estimate Capital stocks, labour services, capital services, TFP, Domar weighting, alternative deflation
• In addition:• Documentation
• reference manual (also on-line)• users guide
• User community (in statu nascendi)
![Page 6: Database Management](https://reader036.vdocument.in/reader036/viewer/2022062422/56813ae6550346895da34671/html5/thumbnails/6.jpg)
What is a relational database?
• Seperate data and meta-data
• Meta-data has number of dimensions, e.g.• Year: 1970, 1971,…..
• Country: A, B,C,..
• Industry: 1, 2, 3, ….
• Unit: nominal, real, deflator
• Type: investment in building, machinery,…,
• 1970_B_3_nom_mach: 5,870
![Page 7: Database Management](https://reader036.vdocument.in/reader036/viewer/2022062422/56813ae6550346895da34671/html5/thumbnails/7.jpg)
General lay-out of Analytical module
• 1. Collection of basic data• A. Basic data by country
• 2. Making meta-data (hierarchies, concordances)• B. Data in Prodsys readable format
• 3. Combining datasources: (dis)agg, concor, balancing• C. Input data KLEMS: full time-series and industries
• 4. Applying productivity tools: harmonization and experiment• D. Output data KLEMS
• 5. Selection of series for analytical database• E. Public analytical database
• Statistical module is same as analytical, except only data and procedures which are backed by NSO
NSO 1 2 3 4 5
Partner
E. Public output
A. Basic data
B. Readable
format
C. Input KLEMS
D. Output KLEMS
![Page 8: Database Management](https://reader036.vdocument.in/reader036/viewer/2022062422/56813ae6550346895da34671/html5/thumbnails/8.jpg)
Partner Involvement
• Use SAS/Prodsys or Do not use SAS/Prodsys? Advantages of using SAS/Prodsys is full flexibility to experiment, but steep learning curve and expensive
• Importantly: SAS/Prodsys is not needed for data delivery, only Excel!
![Page 9: Database Management](https://reader036.vdocument.in/reader036/viewer/2022062422/56813ae6550346895da34671/html5/thumbnails/9.jpg)
Data delivery standard
• Each table neeeds the following information• Description of source (publisher, NSO) and version (release
date)
• Classification Meta-data associated with each dimension of national database
• Concordance with EUK dimension lists
• Document with guidelines will be prepared
![Page 10: Database Management](https://reader036.vdocument.in/reader036/viewer/2022062422/56813ae6550346895da34671/html5/thumbnails/10.jpg)
Tree structure of data hierarchy: A as Grand Total with nodes (‘children’) B and C
each subdivided further down
A
B C
D E F G H
![Page 11: Database Management](https://reader036.vdocument.in/reader036/viewer/2022062422/56813ae6550346895da34671/html5/thumbnails/11.jpg)
Highest node on top, children indented,each node always immediately followed by itsown children:
A
.B
..D
..E
..F
.C
..G
..H
![Page 12: Database Management](https://reader036.vdocument.in/reader036/viewer/2022062422/56813ae6550346895da34671/html5/thumbnails/12.jpg)
EU KLEMS row tree structure
![Page 13: Database Management](https://reader036.vdocument.in/reader036/viewer/2022062422/56813ae6550346895da34671/html5/thumbnails/13.jpg)
Sequence of processing raw data
• Prepare metadata (hierarchy) and concordance
• Metadata and concordance are read into ProdSys
• Then data is read into ProdSys
• Metadata determine how original raw data gets ‘translated’ to SAS (table) version of original raw data
• Metadata + already present EU KLEMS metadata + concordance determine how SAS version of original raw data gets transformed into EU KLEMS data
• You don’t have to rearrange the actual data into this hierarchical structure; ProdSys assigns the data to the correct category in SAS once the category is part of the metadata.
![Page 14: Database Management](https://reader036.vdocument.in/reader036/viewer/2022062422/56813ae6550346895da34671/html5/thumbnails/14.jpg)
Concordances
• A raw data category may have to be split into several EU KLEMS categories
• We need to be able to determine how the split is made: weights
• Use pattern data to calculate these weights
![Page 15: Database Management](https://reader036.vdocument.in/reader036/viewer/2022062422/56813ae6550346895da34671/html5/thumbnails/15.jpg)
NACE1 (4-digit)-to-EUK concordance
nace1-4-digit-to-euk72.xls
![Page 16: Database Management](https://reader036.vdocument.in/reader036/viewer/2022062422/56813ae6550346895da34671/html5/thumbnails/16.jpg)
Part of Dutch IO-to-EUK concordance
![Page 17: Database Management](https://reader036.vdocument.in/reader036/viewer/2022062422/56813ae6550346895da34671/html5/thumbnails/17.jpg)
Next steps
• Document will be prepared on data delivery
• User guide for Prod Sys
• Prod Sys is still in development