www.uis.unesco.org

Building SDMX Data Structure Definitions based on a generic conceptual model for contents

Experience with the joint Eurostat-Unesco-OECD education statistics questionnaire

Michael Bruneforth, UNESCO Institute for Statistics

Expert Group on SDMX

m.bruneforth@uis.unesco.org

May 10, 2007, UN, Geneva

Overview

The world of international education data collections

Why build a conceptual model?

Steps to build the model

The model

From the model towards an SDMX data structure definition

The world of international education data collections

Instruments used in the system of international education data collections

The UNESCO-UIS / OECD / EUROSTAT (UOE) Data Collection on Education Statistics

• Excel-based questionnaire, organized in 31 worksheets

• 47 countries, 14,000+ data points

• Changes: 2003

The World Education Indicators Project (WEI)

• Based on UOE instruments, extended by 10 worksheets

• 16 countries, 15,000+ data points

• Examples at www.uis.unesco.org/publications/wei2006

The UIS Survey

• PDF-based e-questionnaire infrastructure, plus paper form

• All remaining countries, 5,000+ data points

• Examples at www.uis.unesco.org -> current surveys

Instruments used in the system of international education data collections (II)

[Diagram: the UOE, UIS and WEI instruments, with two links labelled "Can be transformed"]

Education Questionnaires: ever changing

1998

• Tables were introduced after ISCED 97 was adopted.

2000

• Redesign of Finance tables.

2001 – 2005: ???

2005

• Major redesign: tables redesigned, some tables split or combined.

2006

• In ENRL8a, ENRL8b and ENRL8c, the Caribbean countries are now included with Latin America instead of Northern America.

2007

• In table ENRL-7, three new sub-categories, “unknown residence”, “unknown prior education”, and “unknown citizenship”, have been added.

• In ENTR-2 a new row has been added to collect typical age of entry.

• In GRAD-1 and GRAD-3 a new row has been added to collect typical graduation age.

Why build a conceptual model?

Meta data

• Theoretical basis for describing data

• Visualization of data

• Validation of codes

Questionnaire design

• Improving internal consistency in questionnaires

• Maintaining the coding schemes:

» Avoiding random or ad-hoc data descriptions leading to inconsistent, incomprehensible systems (we need discipline as much as a model!)

Why use a conceptual model as the basis for SDMX?

A model describes a universe of questionnaires

• Consistency across questionnaires

• Consistency across tables

• Consistency across statistical units

• Facilitates adaptation of SDMX to changes to tables

» Typically no/few keys need to be changed, most new data can be defined using existing keys

A model can be used to describe indicators and derived data

• SDMX exchange of results (-> World Bank, MDG)

A model can be transformed into/from database definitions

• Use of existing meta data (efficiency)

• Avoid redundant information (less error prone)

• Basis to match national data to international SDMX definitions
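The claim that most questionnaire changes need few or no new keys can be illustrated with a small sketch. It assumes, for illustration only, a key with three dimensions and invented code lists; treating the 2007 sub-category "unknown citizenship" as a code of the `origin` dimension is likewise an assumption, not something the slides state.

```python
# Hypothetical, simplified key: the set of dimensions stays fixed.
DIMENSIONS = ("sex", "origin", "ISCED.level")

# Invented code lists, one per dimension.
CODE_LISTS = {
    "sex": {"Total", "Male", "Female"},
    "origin": {"Total", "National", "Foreign"},
    "ISCED.level": {"Total", "1", "2", "3", "4", "5", "6"},
}


def is_valid_key(key, code_lists):
    """True if the key names exactly the model dimensions and uses known codes."""
    return set(key) == set(DIMENSIONS) and all(
        key[dim] in code_lists[dim] for dim in DIMENSIONS
    )


# A data point introduced by a questionnaire change.
new_point = {"sex": "Total", "origin": "Unknown citizenship", "ISCED.level": "5"}
before = is_valid_key(new_point, CODE_LISTS)

# The change is absorbed by adding one code; no dimension is added or removed.
updated = {**CODE_LISTS, "origin": CODE_LISTS["origin"] | {"Unknown citizenship"}}
after = is_valid_key(new_point, updated)
```

In this sketch the new data point is rejected before the change and accepted afterwards, while the list of dimensions is untouched.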

Building the model

Step 1: Bo Sundgren’s analysis of the UIS Questionnaire

Step 2: Analysis of the relational database at UIS

Step 3: Correction / Expansion of Bo’s model

Step 4: Model verification 1: review of UOE questionnaires

Step 5a: Model verification 2: Transformation of UIS database model to conceptual model, automated creation of full code list

Step 5b: Model verification 2: Analysis of the relational database at OECD

Step 6: Creation of data structure definition based on existing meta data

[Diagram: the conceptual model]

Student
.count .partTimeFraction.sum x age x sex x origin (array) x previousEducation

is enrolled in -> Enrolment
- partTimeFraction x workmode x repeater x completer x entrant x (adjustment)

for -> ProgramExecution (Class, pedagogical unit, …)
.count x adult x grade x ISCED.field x location

of -> EducationProgram (Utility)
- startingAge - duration - Name x ISCED.level x ISCED.orientation x ISCED.destination x ISCED.degreePos

Institutions (education provider; non-instructional institutions; …)
x sector x InstType
provides -> ProgramExecution
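One way to read the diagram is as a set of linked record types. The following is a minimal Python sketch, not the UIS implementation: the class and field names mirror the labels in the diagram, while the use of dataclasses, plain string codes and the default code "Total" are assumptions made for illustration.

```python
from dataclasses import dataclass, field

TOTAL = "Total"  # assumed code meaning "no breakdown on this dimension"


@dataclass(frozen=True)
class EducationProgram:
    isced_level: str = TOTAL
    isced_orientation: str = TOTAL
    isced_destination: str = TOTAL
    isced_degree_pos: str = TOTAL


@dataclass(frozen=True)
class Institution:
    sector: str = TOTAL
    inst_type: str = TOTAL


@dataclass(frozen=True)
class ProgramExecution:
    program: EducationProgram = field(default_factory=EducationProgram)  # "of"
    provider: Institution = field(default_factory=Institution)  # "provides"
    adult: str = TOTAL
    grade: str = TOTAL
    isced_field: str = TOTAL
    location: str = TOTAL


@dataclass(frozen=True)
class Enrolment:
    execution: ProgramExecution = field(default_factory=ProgramExecution)  # "for"
    workmode: str = TOTAL
    repeater: str = TOTAL
    completer: str = TOTAL
    entrant: str = TOTAL


@dataclass(frozen=True)
class StudentCount:
    """A Student.count data point: a selection along the chain of objects."""
    enrolment: Enrolment = field(default_factory=Enrolment)  # "is enrolled in"
    age: str = TOTAL
    sex: str = TOTAL
    origin: str = TOTAL
    previous_education: str = TOTAL


# Illustrative selection: female students in lower secondary general programmes.
selection = StudentCount(
    sex="Female",
    enrolment=Enrolment(
        execution=ProgramExecution(
            program=EducationProgram(isced_level="2", isced_orientation="General")
        )
    ),
)
```

Every dimension that is not set explicitly keeps the assumed "Total" code, so a data point is described by stating only the breakdowns that apply to it.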

Example 1: Students and Repeaters

ENRL3: NUMBER OF STUDENTS AND REPEATERS (ISC123) IN GENERAL PROGRAMMES BY LEVEL OF EDUCATION, SEX AND GRADE

Header fields: COUNTRY; School year, data collection period (please indicate the dates in table ENRL1a); Sources; Methods

Columns 1 to 6 (TOTAL PUBLIC AND PRIVATE INSTITUTIONS, TOTAL FULL-TIME AND PART-TIME), in two blocks, Number of students and Number of repeaters, each by LEVEL OF EDUCATION:

• PRIMARY (ISC 1), UOE version: All educational programmes

• LOWER SECONDARY (ISC 2), UOE version: All general programmes

• UPPER SECONDARY (ISC 3), UOE version: All general programmes

Rows, Total males and females, Grade groups:

A1 Total: All grade groups (within ISC-Level) (A2toA12) [X]

A2 Grade 1 (within ISC-Level) (A14+A26)

A3 Grade 2 (within ISC-Level) (A15+A27)

A4 Grade 3 (within ISC-Level) (A16+A28)

A5 Grade 4 (within ISC-Level) (A17+A29)

A6 Grade 5 (within ISC-Level) (A18+A30)

A7 Grade 6 (within ISC-Level) (A19+A31)

A8 Grade 7 (within ISC-Level) (A20+A32)

A9 Grade 8 (within ISC-Level) (A21+A33)

A10 Grade 9 (within ISC-Level) (A22+A34)

A11 Grade 10 (within ISC-Level) (A23+A35)

A12 Grade unknown (within ISC-Level) (A24+A36)

Rows, Males, Grade groups:

A13 Total: All grade groups (within ISC-Level) (A14toA24)

A14 Grade 1 (within ISC-Level)

A15 Grade 2 (within ISC-Level) [Y]

A16 Grade 3 (within ISC-Level)

Student
.count .partTimeFraction.sum x age x sex x origin (array) x previousEducation

is enrolled in -> Enrolment
- partTimeFraction x workmode x repeater x completer x entrant x (adjustment)

for -> ProgramExecution (Class, pedagogical unit, …)
.count x adult x grade x ISCED.field x location

of -> EducationProgram (Utility)
- startingAge - duration - Name x ISCED.level: 2 x ISCED.orientation: General x ISCED.destination x ISCED.degreePos

Institutions (education provider; non-instructional institutions; …)
x sector x InstType
provides -> ProgramExecution

Count of lower secondary general students

Student
.count .partTimeFraction.sum x age x sex: Male x origin (array) x previousEducation

is enrolled in -> Enrolment
- partTimeFraction x workmode x repeater: Yes x completer x entrant x (adjustment)

for -> ProgramExecution (Class, pedagogical unit, …)
.count x adult x grade: 2 x ISCED.field x location

of -> EducationProgram (Utility)
- startingAge - duration - Name x ISCED.level: 2 x ISCED.orientation: General x ISCED.destination x ISCED.degreePos

Institutions (education provider; non-instructional institutions; …)
x sector x InstType
provides -> ProgramExecution

Count of male lower secondary general repeaters at grade 2
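The two cells of Example 1 differ only in the values selected on a few dimensions of the same model. A minimal sketch of that idea follows; the dimension names are taken from the diagram, whereas the flat `Object.dimension` naming, the `describe` helper and the default code "Total" are illustrative assumptions.

```python
# Dimensions per object, as labelled in the model diagram.
DIMENSIONS = {
    "Student": ["age", "sex", "origin", "previousEducation"],
    "Enrolment": ["workmode", "repeater", "completer", "entrant"],
    "ProgramExecution": ["adult", "grade", "ISCED.field", "location"],
    "EducationProgram": [
        "ISCED.level", "ISCED.orientation", "ISCED.destination", "ISCED.degreePos",
    ],
    "Institutions": ["sector", "InstType"],
}


def describe(measure, selected):
    """Return a full description: every dimension is 'Total' unless selected."""
    known = {f"{obj}.{dim}" for obj, dims in DIMENSIONS.items() for dim in dims}
    unknown = sorted(set(selected) - known)
    if unknown:
        raise KeyError(f"not in the model: {unknown}")
    description = {"measure": measure}
    for obj, dims in DIMENSIONS.items():
        for dim in dims:
            name = f"{obj}.{dim}"
            description[name] = selected.get(name, "Total")
    return description


lower_secondary_general = {
    "EducationProgram.ISCED.level": "2",
    "EducationProgram.ISCED.orientation": "General",
}

# Cell X: count of lower secondary general students.
cell_x = describe("Student.count", lower_secondary_general)

# Cell Y: count of male lower secondary general repeaters at grade 2.
cell_y = describe(
    "Student.count",
    {
        **lower_secondary_general,
        "Student.sex": "Male",
        "Enrolment.repeater": "Yes",
        "ProgramExecution.grade": "2",
    },
)

# The dimensions on which the two cells differ.
changed = sorted(name for name in cell_x if cell_x[name] != cell_y[name])
```

Here `changed` lists the three dimensions that separate cell Y from cell X, which is the information the highlighted values in the diagrams convey.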

Example 2: Students and Classes

Class1: AVERAGE CLASS SIZE BY LEVEL OF EDUCATION AND BY TYPE OF INSTITUTIONS

Header fields: Country; School year start (mm/yyyy); School year end (mm/yyyy); Sources; Methods

Columns (UOE version):

1 PRIMARY Education (ISC 1): All regular programmes

2 LOWER SECONDARY SCHOOLS (ISC 2): All general programmes

Rows by TYPE OF INSTITUTIONS:

TOTAL: Public and private institutions

A1 Average class size

A2 Number of students

A3 Number of classes

Public institutions

A4 Average class size

A5 Number of students [X]

A6 Number of classes [Y]

Count of lower secondary general classes and students, class size

Student
.count .partTimeFraction.sum x age x sex x origin (array) x previousEducation

is enrolled in -> Enrolment
- partTimeFraction x workmode x repeater x completer x entrant x (adjustment)

for -> ProgramExecution (Class, pedagogical unit, …)
.count x adult x grade x ISCED.field x location

of -> EducationProgram (Utility)
- startingAge - duration - Name x ISCED.level: 2 x ISCED.orientation: General x ISCED.destination x ISCED.degreePos

Institutions (education provider; non-instructional institutions; …)
x sector x InstType
provides -> ProgramExecution
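In this example the number of students and the number of classes are two counts taken at different objects of the model (Student and ProgramExecution) under the same selection, and the class size row can be derived from them. The sketch below assumes that average class size is simply students divided by classes; the function name and the figures are invented for illustration.

```python
def average_class_size(student_count, class_count):
    """Derive average class size from Student.count and ProgramExecution.count.

    Returns None when no classes are reported, so that a missing or zero
    denominator is not turned into a misleading figure.
    """
    if class_count is None or student_count is None or class_count <= 0:
        return None
    return student_count / class_count


# Invented figures for lower secondary general programmes, public institutions.
students = 12000  # Student.count
classes = 480     # ProgramExecution.count

size = average_class_size(students, classes)
no_classes = average_class_size(students, 0)
```

With these invented figures `size` is 25.0, and a selection with no reported classes yields `None` instead of a division error.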

Example 2: Students and Classes

ANNUAL INTAKE BY LEVEL OF EDUCATION AND PROGRAMME DESTINATION

TOTAL PUBLIC AND PRIVATE INSTITUTIONS, TOTAL FULL-TIME AND PART-TIME, Total males and females

Columns:

1 UPPER SECONDARY: ISCED 3

2 POST-SECONDARY NON-TERTIARY: ISCED 4

3 TERTIARY: ISCED 5A

4 TERTIARY: ISCED 5B

5 TERTIARY: ISCED 6

Rows, Enrolment:

A1 Total number of students enrolled (ENRL1, row A1) (A2+A3+A4)

Of which:

A2 New entrants (B1)

A3 Re-entrants

A4 Continuing students

Rows, New entrants:

B1 New entrants (from A2) (B2+B3)

Of which:

B2 With previous education at the other tertiary level

B3 Without any previous education at the tertiary level
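The row labels of this table carry their own consistency rules: A1 is the sum of A2, A3 and A4, B1 is the sum of B2 and B3, and A2 repeats B1. A minimal sketch of how such rules could be checked for one column follows; the rule tables are transcribed from the row labels, while the function name and the figures are invented, and missing-value codes are not handled.

```python
# Totals and their parts, as stated in the row labels.
ROW_FORMULAS = {
    "A1": ["A2", "A3", "A4"],
    "B1": ["B2", "B3"],
}

# Rows that must carry the same value ("A2 New entrants (B1)").
EQUAL_ROWS = [("A2", "B1")]


def check_column(values):
    """Return a list of messages, one per violated rule, for a single column."""
    problems = []
    for total, parts in ROW_FORMULAS.items():
        expected = sum(values[part] for part in parts)
        if values[total] != expected:
            problems.append(
                f"{total} = {values[total]}, but {' + '.join(parts)} = {expected}"
            )
    for first, second in EQUAL_ROWS:
        if values[first] != values[second]:
            problems.append(
                f"{first} = {values[first]} differs from {second} = {values[second]}"
            )
    return problems


# Invented figures for one column.
consistent = {"A1": 1000, "A2": 300, "A3": 50, "A4": 650, "B1": 300, "B2": 120, "B3": 180}
inconsistent = {**consistent, "B3": 170}

ok_report = check_column(consistent)
bad_report = check_column(inconsistent)
```

The first column passes all rules; in the second, the changed B3 value breaks only the B1 sum, which is reported as a single message.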

Count of new entrants to tertiary 5B with previous tertiary education

Student
.count .partTimeFraction.sum x age x sex x origin (array) x previousEducation: 5A

is enrolled in -> Enrolment
- partTimeFraction x workmode x repeater x completer x entrant: New entrant x (adjustment)

for -> ProgramExecution (Class, pedagogical unit, …)
.count x adult x grade x ISCED.field x location

of -> EducationProgram (Utility)
- startingAge - duration - Name x ISCED.level: 5 x ISCED.orientation: General x ISCED.destination: B x ISCED.degreePos

Institutions (education provider; non-instructional institutions; …)
x sector x InstType
provides -> ProgramExecution

[Diagram: the full model. Objects:]

Student
.count .partTimeFraction.sum x age x sex x origin (array) x previousEducation

Enrolment
- partTimeFraction x workmode x repeater x completer x entrant x (adjustment)

EducationStaff (Teacher, …)
.partTimeFraction.sum x age x sex x training

Engagement
.count - partTimeFraction x workmode x engagementType

ProgramExecution (Class, pedagogical unit, …)
.count x adult x grade x location

Institutions (education provider; non-instructional institutions; …)
x sector x InstType

EducationProgram (Utility)
- startingAge - duration - Name x ISCED.level x ISCED.orientation x ISCED.destination x ISCED.degreePos

EducationSystem (Utility)
- compulsoryEducationStart - compulsoryEducationEnd - academicYearBeginn - academicYearEnd - financialYearBeginn - country - currency

Funder (Governments, private entities, …)
x sector

Expenditure
.amount.sum x nature

Relationship labels in the diagram: is enrolled in; belongs to (twice); for (twice); isEngagedIn; of; provides; spends on; transfers to/spends on; receives transfer (twice)

[Diagram: finance detail. Objects:]

Institutions (education provider; non-instructional institutions; …)
x sector x InstType

Funder (Governments, private entities, …)
x sector

Expenditure
.amount.sum x nature

Households

Relationship labels in the diagram: spends on; transfers to/spends on (twice); receives transfer (three times)

Principles for generating the detailed model for individual data points

Use existing meta data

Avoid multiple capturing of questionnaire information

Ensure consistency with existing systems

Generate the detailed model for individual data points

Cell ID: 1052108025 (ENTR1-B2:4, Version 2005 to 2099)

Object: Student: Count

-age: Total -sex: Female -origin: All -previousEducation: 5A

enrolled in -> Object: Enrolment

-workmode: Total -repeater: NO -completer: NO -entrant: First time entrant

for -> Object: ProgramExecution

-adult: Total -grade: Total -ISCED.field: Total -location: Total

of -> Object: EducationProgram

-ISCED.level: 5 -ISCED.orientation: Total -ISCED.destination: B -ISCED.degreePos: First

provided by -> Object: Institution

-sector: Total -InstType: Instructional
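A cell description of this kind can be flattened into a single key, for example in the dot-separated form commonly used for SDMX series keys. The sketch below is hedged: the dimension order and the way labels are turned into codes (upper case, underscores for spaces) are assumptions for illustration, not the code lists of the actual data structure definition.

```python
# Description of cell 1052108025 (ENTR1-B2:4), transcribed from the slide.
# The flat "Object.dimension" names and their order are assumptions.
CELL_1052108025 = {
    "Student.age": "Total",
    "Student.sex": "Female",
    "Student.origin": "All",
    "Student.previousEducation": "5A",
    "Enrolment.workmode": "Total",
    "Enrolment.repeater": "NO",
    "Enrolment.completer": "NO",
    "Enrolment.entrant": "First time entrant",
    "ProgramExecution.adult": "Total",
    "ProgramExecution.grade": "Total",
    "ProgramExecution.ISCED.field": "Total",
    "ProgramExecution.location": "Total",
    "EducationProgram.ISCED.level": "5",
    "EducationProgram.ISCED.orientation": "Total",
    "EducationProgram.ISCED.destination": "B",
    "EducationProgram.ISCED.degreePos": "First",
    "Institution.sector": "Total",
    "Institution.InstType": "Instructional",
}

# Hypothetical dimension order of the data structure definition.
DIMENSION_ORDER = list(CELL_1052108025)


def to_code(label):
    """Turn a label into a hypothetical code: upper case, no spaces."""
    return label.strip().upper().replace(" ", "_")


def series_key(cell, order):
    """Join the codes of all dimensions, in the given order, with dots."""
    return ".".join(to_code(cell[dimension]) for dimension in order)


key = series_key(CELL_1052108025, DIMENSION_ORDER)
parts = key.split(".")
```

Because the key is built from the model description and not from the table position, the same key results wherever in a questionnaire the data point is collected.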

The basis: the UIS meta data (relational database description)

Example: UIS meta data (relational database codes, XML version)

What is needed beyond the model to get a complete data structure definition?

The data structure definition has to cope with data points collected more than once.

• Total number of primary students is collected in ENRL1a, ENRL1, ENRL3, ENRL4, CLASS1

The data structure definition has to cope with adjustments to the coverage of data.

• The count of students is collected with coverage adjusted to expenditure data.
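Data points collected more than once show up as several cells that share one model description. A minimal sketch of how they could be detected and compared follows; the tuple layout, the description strings and the figures are invented for illustration, and only the table names come from the slide.

```python
from collections import defaultdict


def find_duplicates(cells):
    """Group cells by model description; keep descriptions reported more than once.

    `cells` is an iterable of (table, description, value) tuples.
    """
    by_description = defaultdict(list)
    for table, description, value in cells:
        by_description[description].append((table, value))
    return {d: reports for d, reports in by_description.items() if len(reports) > 1}


def conflicting(duplicates):
    """Keep only the duplicates whose reported values disagree."""
    return {
        d: reports
        for d, reports in duplicates.items()
        if len({value for _, value in reports}) > 1
    }


# Invented figures; the description strings stand in for full model descriptions.
cells = [
    ("ENRL1a", "Student.count, ISCED.level 1", 1200000),
    ("ENRL1", "Student.count, ISCED.level 1", 1200000),
    ("ENRL3", "Student.count, ISCED.level 1", 1195000),
    ("CLASS1", "ProgramExecution.count, ISCED.level 2", 8400),
]

duplicates = find_duplicates(cells)
conflicts = conflicting(duplicates)
```

A data structure definition then needs either an extra dimension that distinguishes the collecting table or a rule for which report takes precedence; the sketch only identifies where that decision is required.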

Questions, comments?

Education content: Michael Bruneforth (m.bruneforth@uis.unesco.org)

IT: Brian Buffett (b.buffett@uis.unesco.org)

Thanks
