introduction our approach - github pages · 2020-01-28 · students with dyslexia perform just as...

1
1 DATA COLLECTION We have partnered with a pilot school system of 1200 children. Here we present a subset of 100 grades 1-6. This handwriting is “in the wild”, or non-standardized. We split the sample into lines and then split each line into patches, which are not necessarily letters, but transitions between letters. This allows our approach to identify the specific features of the handwriting itself and focus less on the letters. We compare patches from the different samples by running each patch through the network. Model description shown in Fig 2. Preliminary results suggest a 77.6% accuracy rate. We will continue adding new data and developing new methods to improve the detection accuracy. This is an iterative process, improving the accuracy of the model with every batch of new data. DYSLEXIA DETECTION PROCESS 2 PREPROCESSING 3 STANDARD CONVOLUTIONAL NEURAL NETWORK 4 ACCURACY ANALYSIS Can We (and Should We) Use AI to Detect Dyslexia in Children’s Handwriting? Katie Spoon 1,2 , David Crandall 1 , Katie Siek 1 , Marlyssa Fillmore 3 1 Indiana University Bloomington, 2 IBM Research, 3 The Academic Achievement Center Introduction Our Approach We Can; Should We? Contact Katie Spoon: [email protected] Reading and Writing are Linked 10.2 million kids in the US with dyslexia will not be diagnosed by the recommended age 2 Dyslexia is not tied to IQ, which is a common myth. Students with dyslexia perform just as well with accomodations. To get these accomodations, students meet with a school psychologist. However, there is a long waitlist and diagnosis is often delayed. If students are placed in the waitlist by the end of 2 nd grade, they improve their chances of graduating from high school 2 . However, teachers usually don’t have the training to detect LDs 3 and often detection comes too late. Fig 1. Standard dyslexia evaluations depend partially on writng. School pyschologists use many evaluation metrics (perhaps the most widely used method is the Barton method), including letter confusion, misspellings, and self-correction, among others (captilization, punctuation, content, pencil grip, etc.). Literacy is the best predictor of success later in life. Dyslexia is a learning disability (LD) characterized by a difficult to read or interpret words, letters, and symbols according to their shapes¹. Goal: Create a deep learning-based early screening tool to identify characteristics of dyslexic handwriting and place students into the diagnostic queue by the end of 2 nd grade 2% by end of 2 nd grade 48% by end of K-12 50% never diagnosed Fig 2. Our model. This model 4 was used for a different problem. It takes in a patch of handwriting. It has 5 convolutional layers, 3 max-pooling layers, and 2 fully-connected layers. Additional variants studied in our previous work include siamese networks (two different streams take patches from students with and without dyslexia) and the addition of a bayesian classifier to the output. We omit these variants for this work as accuracy improvements were not the focus here. Neural Network Architecture & Potential Feature Identification Fig 3. Dyslexic handwriting tends to be messier. TSNE is an ML visualization technique that maps a high-dimensional feature vector (in our case, from the last layer of a neural network) into 2D space. Points closer together are more similar. After plotting a sample of patches from 2 nd grade, it appears that while there is a lot of overlap, the dyslexic handwriting tends to be messier or harder to identify. dyslexic non-dyslexic Dangers of this Technology Bias: Some of the handwriting we receive labeled as not characteristic of dyslexia could be from a child with an undiagnosed reading or writing problem. Additionally, we need to investigate intersectional breakdowns of accuracy by gender, race, SES, etc. False negatives: While a false positive would refer a student for testing who may not need it, a false negative has more dire consequences. Future Work We will improve the model for detecting dyslexia after we utilize additional data, build a front-end application for teachers and parents, and employ data visualization techniques to better understand which features the network uses to make decisions. 1. Learning Disabilities Association of America (LDA) 2. National Center for Learning Disabilities (NCLD) 3. National Council on Teacher Quality (NCTQ) 4. Priya Dwivedi, English Deep Writer, (2018), Github repository, https://github.com/priya-dwivedi/Deep-Learning/tree/master/handwriting_recognition 5. Hernandez, D. J. (2012). Double jeopardy: How third-grade reading skills and poverty influence high school graduation. Baltimore, MD: The Annie E. Casey Foundation. 6. British Dyslexia Association We would also like to eventually expand the system to detect different types of LDs and to monitor English Language Learners, as the tool is not language-specific. Students who struggle to read typically also struggle to write. Students can be good readers with poor writing (dysgraphia), but typically not vice versa. One of the authors hand-labeled patches of handwriting—5100 of them—in various catego- ries of messiness: (1) illegible (cannot make out the letter or transition), (2) partially illegible (can make out a letter but not sure what it is), and (3) legible (can clearly tell the letter or transition). We re-validated the network and found that 84% of the patches marked illegible and 60% of the patches marked partially illegible were from samples of children with dyslexia. This seems to confirm that students with dyslexia have objectively "messier" handwriting than their peers, and the network is using "messiness". Quantifying “Messiness” Use outside of intended use: What is to stop someone from doing the same thing to try to distinguish varying levels of intelligence in children? It is difficult to know the scope of use that will follow, and even though this project intends to help, there is always future potential for someone else to do harm.

Upload: others

Post on 13-Jul-2020

0 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Introduction Our Approach - GitHub Pages · 2020-01-28 · Students with dyslexia perform just as well with accomodations. To get these accomodations, students meet with a school

1 DATA COLLECTIONWe have partnered with a pilot school system of 1200 children. Here we present a subset of

100 grades 1-6. This handwriting is “in the wild”, or non-standardized. We split the sample into lines and then

split each line into patches, which are not necessarily letters, but transitions between

letters. This allows our approach to identify the specific features of the

handwriting itself and focus less on the letters.

We compare patches from the different samples by running

each patch through the network. Model description

shown in Fig 2.

Preliminary results suggest a 77.6% accuracy rate. We will continue adding new data and developing new methods

to improve the detection accuracy.

This is an iterative process, improving the accuracy of the model

with every batch of new data.

DYSLEXIADETECTIONPROCESS

2 PREPROCESSING

3 STANDARDCONVOLUTIONAL

NEURAL NETWORK

4 ACCURACY ANALYSIS

Can We (and Should We) Use AI to Detect Dyslexia in Children’s Handwriting?Katie Spoon1,2, David Crandall1, Katie Siek1, Marlyssa Fillmore3

1Indiana University Bloomington, 2IBM Research, 3The Academic Achievement Center

Introduction Our Approach We Can; Should We?

ContactKatie Spoon: [email protected]

Reading and Writing are Linked

10.2 million kids in the US with dyslexiawill not be diagnosed by the recommended age2

Dyslexia is not tied to IQ, which is a common myth. Students with dyslexia perform just as well with accomodations. To get these accomodations, students meet with a school psychologist. However, there is a long waitlist and diagnosis is often delayed.

If students are placed in the waitlist by the end of 2nd grade, they improve their chances of graduating from high school2. However, teachers usually don’t have the training to detect LDs3 and often detection comes too late.

Fig 1. Standard dyslexia evaluations depend partially on writng. School pyschologists use many evaluation metrics (perhaps the most widely used method is the Barton method), including letter confusion, misspellings, and self-correction, among others (captilization, punctuation, content, pencil grip, etc.).

Literacy is the best predictor of success later in life. Dyslexia is a learning disability (LD) characterized by a difficult to read or interpret words, letters, and symbols according to their shapes¹.

Goal: Create a deep learning-based early screening tool to identify characteristics of dyslexic handwriting and place students into

the diagnostic queue by the end of 2nd grade

2% by end of 2nd grade

48% by end of K-12

50% never

diagnosed

Fig 2. Our model. This model4 was used for a different problem. It takes in a patch of handwriting. It has 5 convolutional layers, 3 max-pooling layers, and 2 fully-connected layers. Additional variants studied in our previous work include siamese networks (two different streams take patches from students with and without dyslexia) and the addition of a bayesian classifier to the output. We omit these variants for this work as accuracy improvements were not the focus here.

Neural Network Architecture & Potential Feature Identification

Fig 3. Dyslexic handwriting tends to be messier. TSNE is an ML visualization technique that maps a high-dimensional feature vector (in our case, from the last layer of a neural network) into 2D space. Points closer together are more similar. After plotting a sample of patches from 2nd grade, it appears that while there is a lot of overlap, the dyslexic handwriting tends to be messier or harder to identify.

dyslexicnon-dyslexic

Dangers of this Technology

Bias: Some of the handwriting we receive labeled as not characteristic of dyslexia could be from a child with an undiagnosed reading or writing problem. Additionally, we need to investigate intersectional breakdowns of accuracy by gender, race, SES, etc.

False negatives: While a false positive would refer a student for testing who may not need it, a false negative has more dire consequences.

Future WorkWe will improve the model for detecting dyslexia after we utilize additional data, build a front-end application for teachers and parents, and employ data visualization techniques to better understand which features the network uses to make decisions.

1. Learning Disabilities Association of America (LDA)2. National Center for Learning Disabilities (NCLD)3. National Council on Teacher Quality (NCTQ)4. Priya Dwivedi, English Deep Writer, (2018), Github repository, https://github.com/priya-dwivedi/Deep-Learning/tree/master/handwriting_recognition5. Hernandez, D. J. (2012). Double jeopardy: How third-grade reading skills and poverty influence high school graduation. Baltimore, MD: The Annie E. Casey Foundation.6. British Dyslexia Association

We would also like to eventually expand the system to detect different types of LDs and to monitor English Language Learners, as the tool is not language-specific.

Students who struggle to read typically also struggle to write. Students can be good readers with poor writing (dysgraphia), but typically not vice versa.

One of the authors hand-labeled patchesof handwriting—5100 of them—in various catego-ries of messiness: (1) illegible (cannot make out the letter or transition), (2) partially illegible (can make out a letter but not sure what it is), and (3)legible (can clearly tell the letter or transition).

We re-validated the network and found that 84% of the patches marked illegible and 60% of the patches marked partially illegible were from samples of children with dyslexia. This seems to confirm that students with dyslexia have objectively "messier" handwriting than their peers, and the network is using "messiness".

Quantifying “Messiness”

Use outside of intended use: What is to stop someone from doing the same thing to try to distinguish varying levels of intelligence in children? It is difficult to know the scope of use that will follow, and even though this project intends to help, there is always future potential for someone else to do harm.