edufest 2015 at iit madras - presentation on technology driven transformation of education by mr....
TRANSCRIPT
© 2013-14 IBM Corporation
Technology Driven Transformation of Education
Shajith Ikbal, Ph.D.
Research Scientist @ IBM Research, India
Talk @ International Conference on Excellence in School Education, 15-May-2015
Education Transformation
The education sector is witnessing an unprecedented transformation driven by many factors:
– Relook into teaching pedagogy
– Content digitization
– Data and process digitization
– Rapid growth of the education industry
Education Industry is Rapidly Growing
India
– Hi-Ed GER: 12.4% (2009), 20.2% (2012), >30% (2020)
– 45M more 10th graders
– Will need 50K more colleges; 800 more universities (now: 350+)
eLearning CAGR (GSV Advisors): Global 23%; US 15%
eBooks account for 23% of US publisher sales; $6B (2012) to $16B (2016)
Additional Growth Projections
– 40% growth: between 2012 and 2017, expenditure on social learning and learning communities will grow 40% year on year
– Total global spend on education has increased by $1 trillion since 2012 and will continue to grow at 7% per annum
– Student demand for post-secondary education is expected to grow rapidly from the current 600+ million to over 1 billion by 2025
– 608m students; $5.5 trillion
– Information technology is driving 30% per annum growth in e-learning and online delivery, from $90bn annual spend in 2012 to $255bn in 2017
Teaching Methods
Traditionally
– One size fits all
Moving towards
– Personalized
– Adaptive
– Collaborative
– Online
– Blended
– Flipped classroom
– Teachers as facilitators
– Intelligent and interactive content
– …
Traditional Learning
[Diagram: two students log into the same Learning Management System and access biology content. One is good in biology and likes to read details; the other is weak in biology and learns quickly by example. Both receive similar content: "Carbon and energy requirements of the autotrophic organism are ..."]
One size does not fit everyone, hence the need for adaptive learning to enable personalized education.
Adaptive Learning
[Diagram: the same two students (one good in biology who likes to read detailed content, one weak in biology who learns quickly by examples and graphics) now receive personalized content from the Learning Management System, drawn from a Learning Object Repository. Sample content shown: "CARBON: Carbon is the chemical element with symbol C and atomic number 6." and "Carbon and energy requirements of the autotrophic organism are ...". Both students join a Collaborative and Social Learning Platform.]
Flipped Classroom
Massive Open Online Courses
Teachers of The Future
Students of The Future
Blended Learning
Intelligent Interactive Content
Classroom Will Learn You
Vast Amount of Education Data
As a result of digitization of educational data and processes
Learning Content repository
Assessment and Q/A Database
Learning Instructions (Curriculum Standards)
Knowledge Graph (aka Concept Graph)
Student Information System
School Management System
Performance/Grade Book Database
Attendance Database
…
Disparate sources
Interconnecting these sources (loosely coupled) gives a 360-degree view of student information, aligns learning content with the assessment and Q/A database, links the standard curriculum with learning content, etc. This enables adaptive education.
Use analytical tools, case management and collaborative capabilities to personalize learning programs
• Portals
• Dashboards
• Analytics
• Collaboration
• Mobile Devices
[Diagram components: Analytics; Student Information / CRM; Learning Content & Resources; UIs; Personalized Learning and Delivery]
Personalized Learning
Publications so far
• Shajith Ikbal, Ashay Tamhane, Bikram Sengupta, Malolan Chetlur, Saurav Ghosh and James Appleton, "On Early Prediction of Risks in Academic Performance for Students", to appear in IBM Journal of Research and Development, issue on Technologies for Education Transformation, 2015.
• Ashay Tamhane, Shajith Ikbal, Bikram Sengupta, Mayuri Duggirala and James Appleton, "Predicting Student Risks Through Longitudinal Analysis", in Proc. of 20th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD'14), New York, USA, 2014.
• Danish Contractor, Sumit Negi, Kashyap Popat, Shajith Ikbal, Balunaini Prasad, Sandeep Vedula, Sreekanth Kakarpathy, Bikram Sengupta and Vijay Kumar, "Smarter Learning Content Management Using the Learning Content Hub", to appear in IBM Journal of Research and Development, issue on Technologies for Education Transformation, 2015.
• Danish Contractor, Kashyap Popat, Shajith Ikbal, Sumit Negi, Bikram Sengupta and Mukesh Mohania, "Labeling Educational Content with Academic Learning Standards", in Proc. of SIAM International Conference on Data Mining (SDM'15), Vancouver, Canada, 2015.
• V. K. Reddy, L. Said, B. Sengupta, M. Chetlur, J. P. Costantino, A. Gopinath, S. Flynt, P. Balunaini and S. Vedula, "Personalized Learning Pathways: Enabling Interventions One Student at a Time", to appear in IBM Journal of Research and Development, issue on Technologies for Education Transformation, 2015.
• Sumit Negi, "Single Document Keyphrase Extraction Using Label Information", in Proc. of COLING'14, 2014.
• Malolan Chetlur, Ashay Tamhane, Vinay Kumar Reddy, Bikram Sengupta, Mohit Jain, Pongsakorn Sukjunnimit, Ramrao Wagh, "EduPaL: Enabling Blended Learning in Resource Constrained Environments", in Proc. of ACM DEV'14, 2014.
To talk about…
Student risk prediction
– Early prediction of risks in academic performance for students
Content analytics
– Linking learning content to curriculum standards
In collaboration with Gwinnett County Public Schools (GCPS)
– One of the largest school districts in the US
– In the state of Georgia
– 160+ schools
– 160,000+ students
Predicting Potential Risks in Academic Performance
Traditionally
– Teachers predict
– Using recent past academic results and experience with similar students in the past
– Negatives:
• Limited knowledge, no objective quantification
• Often does not leave enough time to apply an appropriate intervention
Now
– There is an opportunity to predict better and well ahead in time
– A student's longitudinal journey through K-12 is captured
– Data from thousands of past students is available
• Including academic history and non-academic attributes such as demography and behavior
Data from Gwinnett County Public Schools
One of the largest school districts in the US
– Data related to students, teachers and assessments from all constituent schools are collated into hundreds of tables in a central data warehouse.
– A snapshot of this warehouse was made available to IBM.
Specific Data Considered
Grades: 1 to 8 (primary and middle school)
Subjects: Mathematics, Science, Literature,
Tests:
– CRCT: Criterion-Referenced Competency Tests
– ITBS: Iowa Tests of Basic Skills
– CogAT: Cognitive Abilities Test
Test hierarchy: Test -> Sub-test -> Strand
Longitudinal view includes:
– Scores from all past grades, tests, sub-tests, and strands
~160,000 students; max. 516 scores per student
Many missing scores!
Prediction Task
Data Preparation:
– Target: a CRCT score < 800 is considered 'at-risk'; an ITBS score < 25 is 'at-risk'
– Features: all scores from grades below 8th grade + demography + behavior; many scores are missing
– Students are chosen such that at least 20% of features are present
– Missing features are mean-imputed
– Data size: CRCT: 58,707 students and 342 features; ITBS: 43,310 students and 282 features
– Experimental setup: 5-fold cross-validation
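The data preparation steps can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the pipeline used in the study: the function names, the toy rows, the 0.0 fallback for a column with no observed value, and the interleaved fold assignment are all hypothetical. Only the thresholds (800 and 25), the 20% coverage rule, mean imputation, and the 5 folds come from the slide.

```python
from statistics import mean

# Thresholds stated on the slide: a score below these is 'at-risk'.
AT_RISK_THRESHOLD = {"CRCT": 800, "ITBS": 25}


def label_at_risk(test, score):
    """Return 1 ('at-risk') if the score is below the test's threshold, else 0."""
    return 1 if score < AT_RISK_THRESHOLD[test] else 0


def filter_by_coverage(rows, min_fraction=0.2):
    """Keep students for whom at least `min_fraction` of features are present."""
    return [
        row for row in rows
        if sum(value is not None for value in row) >= min_fraction * len(row)
    ]


def mean_impute(rows):
    """Replace each missing value (None) with the mean of its column."""
    n_features = len(rows[0])
    means = []
    for j in range(n_features):
        observed = [row[j] for row in rows if row[j] is not None]
        # Assumption: a column with no observed value at all falls back to 0.0.
        means.append(mean(observed) if observed else 0.0)
    return [
        [means[j] if row[j] is None else row[j] for j in range(n_features)]
        for row in rows
    ]


def k_fold_indices(n_samples, k=5):
    """Return k (train_indices, test_indices) pairs covering all samples."""
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    return [
        ([idx for j, fold in enumerate(folds) if j != i for idx in fold], folds[i])
        for i in range(k)
    ]


# Toy data (hypothetical): 3 students x 5 features, None marks a missing score.
rows = [
    [810.0, None, 790.0, None, None],
    [None, None, None, None, None],   # no features present, so dropped
    [790.0, 20.0, None, None, None],
]
kept = filter_by_coverage(rows)
imputed = mean_impute(kept)
splits = k_fold_indices(n_samples=10, k=5)
```

In a stricter setup the column means would be computed on the training folds only, so that no information from the held-out fold leaks into the features.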
Prediction:
– Classifiers from IBM SPSS or WEKA: logistic regression, naïve Bayes, decision tree
– To predict: 'at-risk' and 'no-risk' students
Evaluation metric:
– ROC-AUC: area under the receiver operating characteristic curve (true positive rate vs. false positive rate)
– False positive rate at a true positive rate of 90% or more
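Both evaluation metrics can be computed from a classifier's risk scores. The sketch below is a plain-Python illustration with hypothetical function names and toy data; it is not the SPSS or WEKA implementation. It assumes label 1 means 'at-risk' and that a higher score means higher predicted risk.

```python
def roc_points(labels, scores):
    """Return ROC points (fpr, tpr), lowering the decision threshold step by step."""
    positives = sum(labels)
    negatives = len(labels) - positives
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Consume all examples tied at this score before emitting a point.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / negatives, tp / positives))
    return points


def roc_auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


def fpr_at_tpr(points, min_tpr=0.9):
    """Smallest false positive rate among operating points with TPR >= min_tpr."""
    return min(fpr for fpr, tpr in points if tpr >= min_tpr)


# Toy example (hypothetical): 3 at-risk and 3 no-risk students.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.2]
points = roc_points(labels, scores)
auc = roc_auc(points)             # 8 of 9 (at-risk, no-risk) pairs are ordered correctly
fpr90 = fpr_at_tpr(points, 0.9)   # one of three no-risk students is flagged
```

The second metric answers a practical question: if at least 90% of at-risk students must be caught, how many no-risk students are flagged unnecessarily?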
Risk Prediction Performance
Sample ROC curve
ROC-AUC for various classifiers
False positive rate for true positive rate >= 90%
Feature Importance
Scores are important, demography information helps
Recent past scores are the most important
Early Prediction
CRCT ITBS
At Grade 4, it is possible to predict for Grade 8 with reasonably high accuracy
Accuracy improves as more and more features are aggregated from lower grades
Content Analytics: Enrich Content through Automatic Meta-Tagging
What is Text Analytics?
Text Analytics (NLP*) describes a set of linguistic, statistical, and machine learning techniques that allow text to be analyzed and key information extracted.
*Natural Language Processing
Passive/flat content:
Chapter 1, Section 1 "Stoichiometry"
Gas stoichiometry deals with reactions involving gases, where the gases are at a known temperature, pressure, and volume, and can be assumed to be ideal gases. For gases, the volume ratio ……………….
Annotated/tagged content with metadata extracted from content:
- Subject: Chemistry
- Grade: High School
- Course: Organic Chemistry
- Instruction: Gas Stoichiometry
- Concept Density: 0.8
- Readability Score: Medium
- Illustrative Richness: 0.1
- Comprehension Burden: 0.44
- Key Terms: ideal gases, gas stoichiometry
**Underlined words are tagged automatically.
To know more about it click here
Meta-tagging
[Pipeline diagram labels: content in different formats; text extraction; language identification; comprehension burden scores; illustrative richness; other meta-data extraction; curriculum linking; ranked list of learning standards. Score ranges shown: 0 – 1 and 0 - 100]
Linking learning content to curriculum standards – An annotator
Given a collection of documents (educational content) and a learning standard, label the documents with instructions from the learning standard.
• Teachers and students need help to navigate large volumes of digital content and identify the right learning objects for specific concepts/instructions
• Content needs to be tagged with curriculum taxonomy to allow easy search and retrieval; a given document can be related to multiple instructions
Content Linking: Problem
Millions of learning objects (content documents 1, 2, …, M): learning material, instruction plans, assessments
Learning standards (curriculum): thousands of instructions (1, 2, 3, …, N), AKS (Academic Knowledge & Skills)
Example instruction: Explain and apply Newton's third law of motion
Content-Curriculum Linking: find content best suited to learn what is specified in the instruction
Automatic recommendation of learning objects for various instructions (AKSs)
• To help students / teachers find relevant learning material
• In adaptive learning systems
Links LCH metadata
Taxonomy of Learning Standards – AKS
[Tree diagram, sample nodes per level]
Grade=1, Grade=2, Grade=HS
Subject=Maths, Subject=Science, Subject=Language
Course=Geometry, Course=Algebra, Course=Statistics
Strand=Basic algebra, Strand=Linear algebra
Instruction=Solve linear equations
Given a content doc
– Find a match against all nodes in the tree
– Link to the best matching nodes at all levels: instruction, strand, course, subject, grade
Matching Content Document Against AKS Tree
Tree levels: Grades -> Subjects -> Courses -> Strands -> Instructions, each matched against the content document
• Valid path if
– Parent node score >= fraction of child score
• Sort based on
– Leaf node score
– Average of all node scores in path
Find a match against all nodes in the tree and link at all levels: instruction, strand, course, subject, grade
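The path rule and the two sort keys can be sketched as follows. This is an illustrative reading of the slide, not the deployed system: the value of `fraction` is not given in the talk (0.5 here is an assumption), the function names are hypothetical, the node scores are made up, and the names ending in "-X" or "-B" are placeholders.

```python
def is_valid_path(scores, fraction=0.5):
    """scores: match scores ordered from root (grade) to leaf (instruction).

    A path is valid if every parent scores at least `fraction` of its child.
    """
    return all(
        parent >= fraction * child
        for parent, child in zip(scores, scores[1:])
    )


def rank_paths(paths, fraction=0.5, by="leaf"):
    """paths: dict mapping a path (tuple of node names) to its node scores.

    Returns the valid paths, best first, sorted by leaf score or by the
    average of all node scores along the path.
    """
    valid = {p: s for p, s in paths.items() if is_valid_path(s, fraction)}
    if by == "leaf":
        def key(path):
            return valid[path][-1]
    elif by == "average":
        def key(path):
            return sum(valid[path]) / len(valid[path])
    else:
        raise ValueError("by must be 'leaf' or 'average'")
    return sorted(valid, key=key, reverse=True)


# Hypothetical scores for one content document against three tree paths.
linear = ("HS", "Maths", "Algebra", "Linear algebra", "Solve linear equations")
basic = ("HS", "Maths", "Algebra", "Basic algebra", "instruction-B")
science = ("HS", "Science", "course-X", "strand-X", "instruction-X")
paths = {
    linear: [0.6, 0.7, 0.8, 0.5, 0.9],
    basic: [0.6, 0.7, 0.8, 0.9, 0.7],
    science: [0.6, 0.05, 0.2, 0.3, 0.8],  # weak subject match, so the path is invalid
}
by_leaf = rank_paths(paths, by="leaf")
by_average = rank_paths(paths, by="average")
```

The validity check keeps a strong leaf match from being linked when its ancestors barely match, which is what removes the Science path in this toy example.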
How To Measure Match?
[Diagram: AKS instruction and content doc -> semantic similarity -> matching score]
Approach:
• Build an AKS dictionary to fill the lexical gap
– AKS -> extract key phrases -> expand key phrases
• Content -> extract key phrases
• Match AKS features with content features
Example:
– Instruction: relate temperature pressure and volume of gases to the behavior of gases
– Key phrases: { relate, temperature, pressure, volume, gases, behavior }
– Expanded lexicon: { Boyle's law, Charles's law, absolute temperature, ideal gas, proportional }
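One simple way to turn the matching step into a number is a weighted term overlap. The slide does not specify the similarity function, so everything below is an assumption made for illustration: the function name, the 0.5 weight for expanded terms, the normalization, and the content key phrases in the example.

```python
def match_score(aks_terms, expanded_terms, content_terms, expanded_weight=0.5):
    """Weighted share of AKS terms (direct and expanded) found in the content.

    Direct key phrases count 1.0 each; expanded-lexicon terms count
    `expanded_weight` each (a hypothetical choice).
    """
    content = {term.lower() for term in content_terms}
    direct = {term.lower() for term in aks_terms}
    expanded = {term.lower() for term in expanded_terms} - direct
    total = len(direct) + expanded_weight * len(expanded)
    if total == 0:
        return 0.0
    hits = len(direct & content) + expanded_weight * len(expanded & content)
    return hits / total


# Key phrases and expanded lexicon from the example on the slide.
aks_terms = ["relate", "temperature", "pressure", "volume", "gases", "behavior"]
expanded_terms = ["Boyle's law", "Charles's law", "absolute temperature",
                  "ideal gas", "proportional"]
# Hypothetical key phrases extracted from a content document.
content_terms = ["Boyle's law", "ideal gas", "pressure", "volume", "proportional"]

without_expansion = match_score(aks_terms, [], content_terms)
with_expansion = match_score(aks_terms, expanded_terms, content_terms)
```

In this toy case the document shares only two words with the instruction itself, and the expanded lexicon contributes three more matches, which is the lexical gap the dictionary is meant to fill.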
Dictionary Building – AKS Key Phrases Expansion
• Wikipedia
– Query 'key phrases' for wiki page title match, extract key words using TextRank, clean up noise using page 'category' info
• WordNet
– Extract words from synsets and the gloss (description) field
• Domain content: more reliable than the external knowledge sources
– Crawl/download Wikibooks, NCERT books, and similar domain data and build a Lucene index
– Query for AKS key words, extract key words from matching docs
[Diagram: instruction (AKS) key phrases are queried against knowledge sources (Wikipedia, WordNet, Wolfram, and domain content: school text books); relevant document snippets (Wi-D1 … Wi-Dn, D-D1 … D-Dn) yield relevant words term1, term2, …, termN, which form the dictionary (expanded lexicon)]
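The domain-content route can be illustrated with a small stand-in. The sketch below replaces the Lucene index with a linear scan and replaces key word extraction with simple term counts, so it shows only the shape of the idea (retrieve matching documents, then harvest their frequent terms). The function names, stop-word list, and sample documents are hypothetical.

```python
import re
from collections import Counter

# Small illustrative stop-word list (an assumption, not from the talk).
STOPWORDS = {"the", "of", "and", "a", "an", "is", "to", "in", "at", "for", "into"}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def expand_key_phrases(key_phrases, documents, top_k=5):
    """Return the most frequent non-seed terms from documents that mention a seed."""
    seeds = {phrase.lower() for phrase in key_phrases}
    counts = Counter()
    for doc in documents:
        tokens = tokenize(doc)
        if seeds & set(tokens):  # crude retrieval: any seed word appears
            counts.update(
                t for t in tokens if t not in seeds and t not in STOPWORDS
            )
    return [term for term, _ in counts.most_common(top_k)]


# Hypothetical domain snippets standing in for indexed textbook content.
documents = [
    "Boyle's law relates the pressure and volume of an ideal gas at constant temperature.",
    "Charles's law: the volume of an ideal gas is proportional to absolute temperature.",
    "Photosynthesis converts light energy into chemical energy.",
]
key_phrases = ["temperature", "pressure", "volume", "gases"]
top_terms = expand_key_phrases(key_phrases, documents, top_k=3)
all_terms = expand_key_phrases(key_phrases, documents, top_k=20)
```

Terms from the unrelated photosynthesis snippet never enter the dictionary because that document matches none of the seed key phrases.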
Architecture
Experiments : Data
Content documents for evaluation
• 'HS-Mathematics': 61 labeled documents
• 'HS-Science': 147 labeled documents
• Transcripts of video lectures from Khan Academy
– 30 transcripts of high school Mathematics
– 30 transcripts of high school Science
Experiments to evaluate labeling at:
• 'Strand (topic)' level (also looking into 'AKS instruction' level)
Manual labels
• Only AKS labels were mentioned
• So extracted all the paths in the tree that match the manual AKS key
Evaluation metrics
• For accuracy computation, compare:
– 'System generated list' vs. 'Manual list'
• System generated list, after ranking:
– Choose top-N rank list (N = 3, 5, 7, 10, 15, 20, entire)
• Accuracy measures:
– Minimal match accuracy @ N: a doc is accurate if at least 1 item in the manual list matches the system generated list of length N
– Full match accuracy @ N: a doc is accurate if all items in the manual list match the system generated list of length N
– Recall @ N: count of manual links that appear in the top-N list / total count
– Mean Reciprocal Rank (MRR): average inverse of the rank of the best ranked manual link in the entire rank list
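The four measures can be written down directly from their definitions. This is a minimal sketch with hypothetical function names and toy labels; it assumes each measure is computed per document and then averaged over documents, which the slide does not state explicitly.

```python
def minimal_match_at_n(system, manual, n):
    """1 if at least one manual label appears in the top-n system list, else 0."""
    return int(any(label in system[:n] for label in manual))


def full_match_at_n(system, manual, n):
    """1 if every manual label appears in the top-n system list, else 0."""
    return int(all(label in system[:n] for label in manual))


def recall_at_n(system, manual, n):
    """Share of manual labels that appear in the top-n system list."""
    return sum(label in system[:n] for label in manual) / len(manual)


def reciprocal_rank(system, manual):
    """Inverse rank of the best ranked manual label in the entire system list."""
    for rank, label in enumerate(system, start=1):
        if label in manual:
            return 1.0 / rank
    return 0.0


def mean_over_docs(metric, docs, **kwargs):
    """Average a per-document metric over (system_list, manual_list) pairs."""
    return sum(metric(s, m, **kwargs) for s, m in docs) / len(docs)


# Toy example (hypothetical strand labels): two documents.
docs = [
    (["s1", "s2", "s3", "s4"], ["s2", "s4"]),
    (["s5", "s1", "s2"], ["s5"]),
]
minimal_at_3 = mean_over_docs(minimal_match_at_n, docs, n=3)
full_at_3 = mean_over_docs(full_match_at_n, docs, n=3)
recall_at_3 = mean_over_docs(recall_at_n, docs, n=3)
mrr = mean_over_docs(reciprocal_rank, docs)
```

For the first document only "s2" is inside the top 3, so it counts as a minimal match but not as a full match, and its best manual label sits at rank 2.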
Results : High School Mathematics
Results on learning content labeled by curriculum experts
Results : High School Science
Results : Instruction level precision
Instruction level precision on labeled documents
Instruction level precision on video transcripts
PLP Pilot with GCPS: some facts
– Deployed on October 22nd 2014; the pilot ran till mid-December
– Fifteen K-12 District S teachers participated (some schools within two clusters within the district):
• Four ES Math teachers
• Five MS Science teachers
• Four MS Math teachers
• Two HS Math teachers
– ~1400 students available for the pilot
– 259 interventions created among 142 distinct students
• 56 distinct learning standards reflected in interventions (36 Math vs. 20 Science)
• 51 Science interventions
• 208 Math interventions
– Project activities included:
• Training with teachers: Oct 17th 2014
• Weekly chats were scheduled with each teacher
• Survey created for teachers; 11 teachers responded
• January 15th 2015: gathering with teachers
• Three research papers came out of FOAK, being published by IBM Research Journal:
– Personalized Learning Pathways: Enabling Interventions One Student at a Time
– Learning Content Tagging and Management: Using the Learning Content Hub
– On Early Prediction of Risks in Academic Performance for Students
Key Insights - Continued
Q17_S1: In what ways (if any) did PETALS help you with the intervention process (compared to the existing process of managing interventions)?
– Enabled quicker identification of students who need interventions
– Enabled more efficient administration of interventions through automation
– Increased transparency/accountability by reducing paper-work
– Helped achieve intervention outcomes (e.g. AKS concept mastery) by students more quickly
[Bar chart; horizontal axis from 0.00 to 1.00]
Key Insights - Continued
Q21_S1: 21. PETALS Data (Class Roster, Student Summary, Intervention Data)
Was the data presentation effective (y/n)? 100% Yes
Why?
What changes or improvement would you make?
– None
– Filters and Sorting
– More user friendly
– eClass Integration
[Bar chart; horizontal axis from 0 to 12]
Key Insights from Survey
Q24_S1: 24. If available, would you like to continue to use PETALS
Q22_S1: 22. Learning Content (the availability of it, information pertaining to it, its usage)
Q25_S1: 25. Would you recommend the tool to your colleagues?
[Three Yes/No pie charts: 83% / 17%, 50% / 50%, 83% / 17%]
Q14_S1: 14. How easy is it to create an intervention in PETALS?
[Pie chart: Very Easy, Easy, Moderately Difficult; 25%, 42%, 33%]