![Page 1: Automated Classification of Medical Questions Using Semantic Parsing Techniques](https://reader035.vdocument.in/reader035/viewer/2022062722/56813a4a550346895da24006/html5/thumbnails/1.jpg)
Automated Classification of Medical Questions Using
Semantic Parsing Techniques
Paul E. Pancoast, MD
Arthur B. Smith, MS
Chi-Ren Shyu, PhD
University of Missouri-Columbia
Physicians Have Questions When They Treat Patients
• What is the best treatment for migraines in patients who are diabetic?
• How often should I repeat the TSH for this patient who is on Synthroid?
• When should I get an X-ray for this patient with low back pain?
Observational Studies of Physician Information Needs
• Covell – 1985 – Annals of Internal Medicine. Oct 1985;103(4):596-599.
• Osheroff – 1992 – Annals of Internal Medicine. Apr 1 1991;114(7):576-581.
• Gorman – 1994 – Medical Decision Making. Apr-Jun 1995;15(2):113-119.
• Ely – 1999 – BMJ. Aug 12 2000;321(7258):429-432.
Common Themes from Observational Studies
• Physicians have questions for 45-65% of all patients they see
• Physicians pursue only about 30% of those questions
• Physicians find answers to 80% of the questions they pursue
Collections of Questions
• Over 10,000 question strings collected – NLM, Ely, Vanderbilt, Duke, FPIN, U of Washington, Britain, Australia
• No good way to classify the questions
• No automated method of finding duplicate questions
Reasons to Automate Classification
• Organize collections of questions
• Improve accuracy of existing classification
• Find redundancy (duplicate questions)
• Find frequency of occurrence
Research Goal
Automate Classification of Medical Questions
• Question Type – based on semantic and syntactic information (this experiment)
• Question Meaning – based on the specific instantiations of semantic and syntactic information (subsequent experiments)
• Ultimately – to match questions directly with structured medical information
Study Overview
[Flow diagram. Labels: MU; Ely; 1101 Specific Questions; Domain Experts; 64 Categories; 170 Generic Question Strings; Semantic Group Sequence Patterns; Automated Classification of Specific Questions]
Ely Taxonomy
• Generic Category (64 total)
  – 1111
• Generic Question Strings (GQS)
  – What is the cause of symptom x?
  – What is the differential diagnosis of symptom x?
  – Could symptom x be condition y or be a result of condition y?
  – What is the likelihood that symptom x is coming from condition y?
Methods for This Study (Overview)
1. Extracted medical concepts from question strings using UMLS MRXNS table
2. Assigned each concept unique identifier (CUI) to a Semantic Group
3. Found Semantic Group Sequence (SGS) patterns using Apriori Algorithm (modified)
4. Matched SGS from specific questions to SGS in Ely’s generic question strings to assign the generic category
1. Extracted CUIs from question strings
3-word, 2-word, and 1-word window parser matching strings to MRXNS
– [How should I] treat acute pharyngitis?
– How [should I treat] acute pharyngitis?
– How should [I treat acute] pharyngitis?
– How should I [treat acute pharyngitis]?
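The windowing step can be sketched as follows. This is a minimal illustration, not the study's parser. It assumes that the 3-word window is slid across the whole question first, then the 2-word window, then the 1-word window, and that words already matched by a larger window are not matched again. `TOY_LEXICON` is a hypothetical two-entry stand-in for the MRXNS lookup, and it maps strings straight to the semantic types shown on the next slide, skipping the CUI step.

```python
import re

# Hypothetical stand-in for the MRXNS normalized-string lookup.
# Maps a normalized string directly to a UMLS semantic type.
TOY_LEXICON = {
    "acute pharyngitis": "T047",  # Disease or Syndrome
    "treat": "T061",              # Therapeutic or Preventive Procedure
}

def window_parse(question, lexicon, sizes=(3, 2, 1)):
    """Slide windows of decreasing size over the question.

    Returns (start_index, phrase, semantic_type) tuples in question order.
    Tokens claimed by a larger window are not matched again.
    """
    tokens = re.findall(r"[a-z0-9']+", question.lower())
    used = [False] * len(tokens)
    matches = []
    for size in sizes:
        for start in range(len(tokens) - size + 1):
            span = range(start, start + size)
            if any(used[i] for i in span):
                continue
            phrase = " ".join(tokens[start:start + size])
            if phrase in lexicon:
                matches.append((start, phrase, lexicon[phrase]))
                for i in span:
                    used[i] = True
    return sorted(matches)

print(window_parse("How should I treat acute pharyngitis?", TOY_LEXICON))
```

With this toy lexicon no 3-word window matches. The 2-word pass finds "acute pharyngitis" and the 1-word pass finds "treat".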
1. Extracted CUIs from question strings
How should I treat [acute pharyngitis]?
– Acute pharyngitis => UMLS semantic type T047
  • Disease or Syndrome
How should I [treat] ------------?
– Treat (treatment) => UMLS semantic type T061
  • Therapeutic or Preventive Procedure
2. Assigned CUIs to Semantic Groups
• Semantic Groups are aggregations of similar semantic types
• 27 Semantic Groups (from UMLS Semantic Network)
  – T047 is in 017 (PATH-PROC)
  – T061 is in 027 (THER)
• 39 additional, non-medical Semantic Groups (derived from general thesauri)
3. Found Semantic Group Sequence (SGS) patterns
• Example question:
  – How should I treat acute pharyngitis
  – 253 | 250 | 242 | 27 | 17 |
    • 253 – How/Why
    • 250 – Does/Can/Could/Should
    • 242 – I/You/He/She/We
    • 27 – treat (treatment)
    • 17 – acute pharyngitis
• Ran 3000 question strings through the parser and looked for recurrent patterns
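The example above can be reproduced with a small lookup. This is a sketch under stated assumptions: `GROUP_OF` holds only the entries shown on this slide, whereas the study's tables cover 27 medical and 39 non-medical groups. The greedy longest-match scan and the skipping of unmatched words are choices made for this illustration, not documented behavior of the study's parser.

```python
import re

# Phrase -> Semantic Group ID, using only the entries from the slide example.
GROUP_OF = {
    "how": 253, "why": 253,
    "does": 250, "can": 250, "could": 250, "should": 250,
    "i": 242, "you": 242, "he": 242, "she": 242, "we": 242,
    "treat": 27,               # treat (treatment)
    "acute pharyngitis": 17,
}

def semantic_group_sequence(question, group_of=GROUP_OF, max_window=3):
    """Return the Semantic Group Sequence (SGS) for a question string."""
    tokens = re.findall(r"[a-z0-9']+", question.lower())
    sequence = []
    i = 0
    while i < len(tokens):
        # Try the longest window first, then shrink (assumed behavior).
        for size in range(min(max_window, len(tokens) - i), 0, -1):
            phrase = " ".join(tokens[i:i + size])
            if phrase in group_of:
                sequence.append(group_of[phrase])
                i += size
                break
        else:
            i += 1  # no window matched here: skip the word in this sketch
    return sequence

print(semantic_group_sequence("How should I treat acute pharyngitis"))
```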
3. Found Semantic Group Sequence (SGS) patterns
• Example question:
  – How should I treat acute pharyngitis
  – 253 250 242 27 17
Matching patterns:
Semantic types     Support   Confidence
253 250 242 27     0.0398    0.5588
253 242 27         0.0409    0.5612
253 250 27         0.0477    0.5084
253 250 242        0.0712    0.7598
236 17             0.0613    0.5493
253 27             0.0691    0.5077
253 242            0.0728    0.5346
253 250            0.0938    0.6885
3% occurrence for support
50% incidence for confidence
3. Found Semantic Group Sequence (SGS) patterns
Support
the pattern of SGS occurs in at least 3% of all the questions parsed
Confidence
253 250 242 27 occurs 50% of the time when 253 250 242 is found
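The two measures can be sketched as below. This sketch assumes a pattern counts as present when its groups appear in the question in the same order, with gaps allowed. That reading is inferred from patterns such as 253 242 27 being listed as matches for 253 250 242 27 17. It also assumes confidence is the support of the full pattern divided by the support of the pattern without its last group. The function names and the toy sequences are illustrative only.

```python
def contains_in_order(sequence, pattern):
    """True if pattern occurs in sequence as an ordered subsequence (gaps allowed)."""
    remaining = iter(sequence)
    # Each membership test consumes the iterator, which enforces ordering.
    return all(group in remaining for group in pattern)

def support(pattern, sequences):
    """Fraction of parsed questions whose SGS contains the pattern."""
    hits = sum(contains_in_order(seq, pattern) for seq in sequences)
    return hits / len(sequences)

def confidence(pattern, sequences):
    """How often the full pattern occurs when its prefix (all but the last group) occurs."""
    prefix_support = support(pattern[:-1], sequences)
    if prefix_support == 0:
        return 0.0
    return support(pattern, sequences) / prefix_support

# Toy data, not the study's 3000 questions.
toy_sequences = [
    [253, 250, 242, 27, 17],
    [253, 250, 242, 17],
    [253, 27, 17],
    [242, 27],
]

print(support((253, 250, 242), toy_sequences))         # 0.5
print(confidence((253, 250, 242, 27), toy_sequences))  # 0.5
```

The tabulated values appear consistent with this reading: 0.0398 / 0.0712 is about 0.559, close to the 0.5588 confidence listed for 253 250 242 27.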
4. Matched SGS patterns in generic and specific questions
• Generic question:
  – How should I treat condition y?
Specific questions with some matching SGS patterns
  – How do I treat depression?
  – How do I manage Parkinsonism?
  – How do I treat acne?
  – How do I treat conjunctivitis?
  – How do I treat dementia?
  – How do I treat STD’s?
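One way to picture the matching step is sketched below. It proposes a generic category for a specific question when both sequences contain at least one of the mined patterns. The matching rule, the `min_shared` threshold, the category labels, and the group IDs 901 and 902 are all hypothetical, chosen only to make the example run. The slides do not spell out the study's exact criterion.

```python
def contains_in_order(sequence, pattern):
    """True if pattern occurs in sequence as an ordered subsequence (gaps allowed)."""
    remaining = iter(sequence)
    return all(group in remaining for group in pattern)

def candidate_categories(specific_sgs, generic_questions, frequent_patterns, min_shared=1):
    """Map each generic category to the number of frequent patterns it shares
    with the specific question, keeping categories with at least min_shared."""
    candidates = {}
    for category, generic_sgs in generic_questions:
        shared = sum(
            contains_in_order(specific_sgs, p) and contains_in_order(generic_sgs, p)
            for p in frequent_patterns
        )
        if shared >= min_shared:
            candidates[category] = max(shared, candidates.get(category, 0))
    return candidates

# Patterns taken from the slide's table; everything else is made up.
FREQUENT = [(253, 250, 242, 27), (253, 250, 242), (253, 27)]
GENERIC = [
    ("treat-condition", [253, 250, 242, 27, 17]),  # "How should I treat condition y?"
    ("cause-of-symptom", [901, 902, 17]),          # placeholder group IDs
]

# "How do I treat depression?" is assumed to parse to the same SGS as the example.
print(candidate_categories([253, 250, 242, 27, 17], GENERIC, FREQUENT))
```

A question that shares patterns with several generic strings would come back with several candidate categories.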
Results
• 1101 specific questions
• 20,710 total words
• 867 (2804 instances) did not match in MRXNS or in MRCON (MRXNW gave too many hits)
• The majority of these strings were mapped to an existing semantic type using ad-hoc stemming techniques
Results
• 7183 SGS patterns matched in specific and generic questions
• 204 (18%) specific questions had potential matches with generic questions (using SGS)
• 97 (10%) actual matches between specific and generic questions (using domain expert)
• 67 of these (using SGS) matched the same category assigned by Dr. Ely
Discussion
• 6% of specific question strings mapped to the generic category assigned by Ely (67/1101)
• 33% of those predicted to match by SGS patterns had matching generic categories
• 73% of specific question strings didn’t map to any generic category
• 45% of specific question strings (that did map to a generic category) mapped to more than one generic category
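The first two percentages can be recomputed from the counts on the Results slides. The short check below assumes that "those predicted to match by SGS patterns" refers to the 204 potential matches. The constant and function names are illustrative.

```python
TOTAL_SPECIFIC = 1101  # specific questions
SGS_POTENTIAL = 204    # specific questions with potential SGS matches
AGREE_WITH_ELY = 67    # SGS matches in the category Dr. Ely assigned

def pct(part, whole):
    """Percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)

overall_agreement = pct(AGREE_WITH_ELY, TOTAL_SPECIFIC)    # 6.1
agreement_among_hits = pct(AGREE_WITH_ELY, SGS_POTENTIAL)  # 32.8
potential_rate = pct(SGS_POTENTIAL, TOTAL_SPECIFIC)        # 18.5

print(overall_agreement, agreement_among_hits, potential_rate)
```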
Discussion
• Automatic Classification of Questions
  – SGS pattern matching can cluster questions with similar semantic and syntactic information
  – These clustered questions often have the same meaning
• Discrepancy in classification – SGS and Ely
  – Our model needs work
  – Ely classifications are not semantically-based
  – Ambiguity in Ely classifications
Why Questions Didn’t Match Categories
Generic Category   Assigned Category   Specific Question
Diagnosis          1111                What is the differential diagnosis of a rash?
Diagnosis          1121                What is the differential diagnosis of a rash?
35 questions are assigned to more than one category
Future Work
• Improve accuracy of model
  – Refine Semantic Groups
  – Use relevance feedback and Semantic Group weighting
  – Include part-of-speech tagging and syntactic parsing
  – Incorporate WordNet for non-medical terms
• Develop an indexing schema that represents the semantic groups and syntactic information as vectors in a high-level feature space model
Thank you!
Acknowledgements: This research was supported in part by National Library of Medicine Biomedical and Health Informatics Research Training Grant 2-T15-LM07089-11. Thanks also to Dr. John Ely for his willingness to share his raw questions and classification data.