Improving Interpretive Interfaces for Math Entry. Richard Zanibbi, Department of Computer Science, Rochester Institute of Technology

Post on 23-Dec-2015

Page 1:

Improving Interpretive Interfaces for Math Entry

Richard Zanibbi

Department of Computer Science

Rochester Institute of Technology

Page 2:

RIT Document and Pattern Recognition Lab (DPRL)

Goals:
1. Improve theory and tools for constructing and evaluating pattern recognition systems
2. Apply these to problems in document recognition and pen-based computing

Members:
• Richard Zanibbi
• Kurt Kluever (Master’s student)
• New members welcome!

http://www.cs.rit.edu/~rlaz/dprl.html

Page 3:

Current Directions:

1. Theory and Tools:
• Tools for recognition module integration and evaluation, such as the Recognition Strategy Language (Zanibbi et al.)
• Game-theoretic models of recognition problems and systems (e.g. for classifier combination)
• Machine learning algorithms for system optimization

2. Applications:
• Pen and image-based math entry (lab maintains open-source Freehand Formula Entry System (Smithies, Novins, Arvo, Zanibbi et al.))
• Optical character recognition (OCR)
• Image and text-based document retrieval
• “CAPTCHAs” (for distinguishing humans from ‘bots’)
• Table recognition, etc.

Page 4:

Interpretive Interfaces for Math Entry

Page 5:

Pen-Based Math Entry

Recognition Challenges
• Large number of symbols (e.g. > 500 in LaTeX), many similar in structure (e.g. 0 and O)
• Layout of symbols on baselines can be ambiguous
• Little redundancy
• Context influences symbol identity and layout interpretation

Page 6:

Example: Freehand Formula Entry System/DRACULAE

Contributors:

FFES first developed as an MSc project at the University of Otago, New Zealand (Smithies, Novins), using CIT tools of Jim Arvo et al. in 1998

Since then, contributors from Queen’s University (CA), Concordia University (CA), and around the world (CMU, UC Berkeley, companies and non-profits in California and France)

Page 7:

DRACULAE (Zanibbi, 2002)

“Diagram Recognition Application for Computer Understanding of Large Algebraic Expressions”

Page 8:

DRACULAE: Layout Classes for Symbols

Symbol name defines class membership.

Page 9:

DRACULAE Layout Analysis: Sketch

Algorithm:

1. Symbols assigned layout type (class) based on symbol identity

2. Sort symbols left-right on leftmost edge of Bounding Box

3. Create baseline structure tree with region node “Expression”

4. Recursively:

a) Search right-to-left, locate the leftmost (“start”) baseline (dominance rules for symbol layout class pairs)

b) From start symbol, search left-right in symbol list for symbols adjacent on baseline (**Zhang: fuzzy version)

c) Add baseline symbols as children of parent region node

d) Place non-baseline symbols in lists associated with region nodes (e.g. for superscript, subscript, below-left, etc.)

e) Apply a-d to each new region, until no new regions created
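The steps above can be sketched in a much-simplified form. This is not the DRACULAE implementation: the `Symbol` class, the `analyze` and `to_latex` functions, and the centroid test are all hypothetical, layout classes (step 1) and dominance rules (step 4a) are omitted, the leftmost symbol is assumed to start each baseline, and only superscript and subscript regions are handled.

```python
from dataclasses import dataclass


@dataclass
class Symbol:
    name: str
    left: float
    top: float      # y grows downward, as in image coordinates
    right: float
    bottom: float

    @property
    def centroid_y(self) -> float:
        return (self.top + self.bottom) / 2.0


def analyze(symbols, label="Expression"):
    """Build a simplified baseline structure tree for one region."""
    node = {"region": label, "baseline": []}
    if not symbols:
        return node
    # Step 2: sort left-to-right on the leftmost bounding-box edge.
    ordered = sorted(symbols, key=lambda s: s.left)
    # Simplified step 4a (assumption): the leftmost symbol starts the baseline.
    start = ordered[0]
    entries = [(start, [], [])]  # (baseline symbol, superscripts, subscripts)
    for sym in ordered[1:]:
        # Simplified step 4b: a symbol is on the baseline when its vertical
        # centroid falls within the start symbol's vertical extent.
        if start.top <= sym.centroid_y <= start.bottom:
            entries.append((sym, [], []))
        elif sym.centroid_y < start.top:
            entries[-1][1].append(sym)   # step 4d: above the baseline
        else:
            entries[-1][2].append(sym)   # step 4d: below the baseline
    # Steps 4c and 4e: attach baseline symbols, then recurse into new regions.
    for sym, supers, subs in entries:
        node["baseline"].append({
            "symbol": sym.name,
            "super": analyze(supers, "super") if supers else None,
            "subsc": analyze(subs, "subsc") if subs else None,
        })
    return node


def to_latex(node):
    """Flatten the tree into a LaTeX-like string."""
    parts = []
    for entry in node["baseline"]:
        text = entry["symbol"]
        if entry["subsc"]:
            text += "_{" + to_latex(entry["subsc"]) + "}"
        if entry["super"]:
            text += "^{" + to_latex(entry["super"]) + "}"
        parts.append(text)
    return " ".join(parts)


example = [
    Symbol("x", 0, 10, 10, 20),
    Symbol("2", 11, 4, 15, 10),
    Symbol("+", 18, 13, 24, 17),
    Symbol("y", 27, 10, 35, 24),
]
print(to_latex(analyze(example)))  # prints: x^{2} + y
```

The recursion terminates because each nested region receives strictly fewer symbols than its parent.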

Page 10:

Expanding the View…

Integration of scanned and pen-based expressions
Infty system, FFES prototype (impl. Josh Zimler 2006)

Long Term Goal: Flexible input and combination
Allow one to easily combine and then reformat/interpret
• LaTeX, eqn, etc.
• MATLAB, Mathematica, etc.
• Handwritten expressions (tablet/mouse)
• Scanned images of handwritten or typeset expressions
• “Vector drawing” interface input, e.g. as in Xpress (Pollanen et al.)

Page 11:

Other Math Entry Interfaces

Natural Log by Matsakis, Miller, and Viola (MIT)

JIMHR: (Java-Based) Interactive Math Handwriting Recognizer, a merge and port of FFES/DRACULAE and the Natural Log system by Joy-Gong Ho (Acuitus Corp., USA)

JMathNotes by Ernesto Tapia Rodriguez (Free University of Berlin)

Infty by M. Suzuki et al. (Kyushu University, Japan)

MathJournal by XThink Inc: first commercial pen-based math recognition system

MathPad by Joseph LaViola

Links available: http://www.cs.rit.edu/~rlaz

Page 12:

The Recognition Strategy Language (RSL)

Page 13:

Motivation: A high-level language for pattern recognition algorithms

Table Recognition Survey (Zanibbi et al. 2004)

Summarizes literature in terms of observations, transformations, and inferences.

Techniques studied characterized as making the following types of inferences (decisions):
• Parameter values (e.g. thresholds)
• Interpretation Model Operations:
  – Segmentation (identifying regions of interest in data)
  – Classification (assigning types to regions)
  – Relating regions (e.g. topology (adjacencies))
  – Rejecting segments, classes, and region relationships

(Unanswered) Question: How should we combine recognition modules in a complex math entry system?

Page 14:

Example: Simple Table Structure Recognition Algorithm (Part 1)

model regions
    Image Word Cell    % default:’Region’
    Row Column
end regions

model relations        % default:’contains’
    adjacent_right adjacent_below
end relations

recognition parameters
    sMaxRowSeparation 2       % millimetres
    sMaxColumnSeparation 2    % millimetres
    aResolution 300           % dpi; default
end parameters

Page 15:

strategy main
    adapt aResolution using getScanResolution()
        observing {Image} regions

    classify {Word} regions as {Cell}

    relate {Cell} regions with {adjacent_right}
        using defineRightAdjacency(sMaxRowSeparation,aResolution)

    segment {Cell} regions into {Row} regions
        using relationClosure()
        observing {adjacent_right} relations

    relate {Cell} regions with {adjacent_below}
        using defineLowerAdjacency(sMaxColumnSeparation,aResolution)

    segment {Cell} regions into {Column} regions
        using relationClosure()
        observing {adjacent_below} relations

    accept interpretations
end strategy

Slide annotations on the strategy: Trivial Decision; Observation Specification; External Decision Function; Decision type; Decision Function; Parameters

Input: Params, Graph with Image, Word regions (BBs)
Output: Cells, Rows, Cols
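The row-forming part of the strategy (relate cells with a right-adjacency relation, then segment by relation closure) can be sketched as follows. This is an illustrative Python analogue, not RSL or its decision functions: `mm_to_pixels`, `define_right_adjacency`, and `relation_closure` are hypothetical names, cells are assumed to be `(left, top, right, bottom)` bounding boxes in pixels, and adjacency is assumed to mean "vertically overlapping and within the maximum gap to the right".

```python
def mm_to_pixels(mm, dpi):
    """Convert millimetres to pixels (25.4 mm per inch)."""
    return mm / 25.4 * dpi


def define_right_adjacency(cells, max_sep_mm, dpi):
    """Return (i, j) pairs where cell j lies just to the right of cell i."""
    max_gap = mm_to_pixels(max_sep_mm, dpi)
    pairs = []
    for i, a in enumerate(cells):
        for j, b in enumerate(cells):
            if i == j:
                continue
            gap = b[0] - a[2]                        # left of b minus right of a
            overlaps = a[1] < b[3] and b[1] < a[3]   # vertical extents intersect
            if 0 <= gap <= max_gap and overlaps:
                pairs.append((i, j))
    return pairs


def relation_closure(count, pairs):
    """Group indices 0..count-1 into connected components of the relation."""
    parent = list(range(count))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i, j in pairs:
        parent[find(i)] = find(j)
    groups = {}
    for k in range(count):
        groups.setdefault(find(k), []).append(k)
    return sorted(groups.values())


cells = [(0, 0, 40, 20), (50, 0, 90, 20), (0, 40, 40, 60), (55, 42, 95, 58)]
adjacent_right = define_right_adjacency(cells, max_sep_mm=2, dpi=300)
rows = relation_closure(len(cells), adjacent_right)
print(rows)  # prints: [[0, 1], [2, 3]]
```

With the parameter values from Part 1 (2 mm at 300 dpi), the maximum gap is about 23.6 pixels. Columns would follow the same pattern with a below-adjacency relation.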

Page 16:

Running RSL Programs

1. Translate RSL Program to TXL (Using TXL)

2. Pass Input Graph (text file) to Program

3. Output (text files):
• Accepted Structures (interpretations)
• Log of all decisions and their outcomes

Page 17:

New Metrics Based on Hypothesis Histories: Historical Recall and Precision

[Figure: diagram relating Generated Hypotheses (A ∪ R), Recognition Targets (Correct Hypotheses), and False Negatives (F)]

Page 18:

Hypothesis History

Recall                 4/8 (50.0%)    2/8 (25.0%)    8/8 (100.0%)
Precision              4/12 (33.3%)   2/5 (40.0%)    8/8 (100.0%)
Historical Recall      4/8 (50.0%)    6/8 (75.0%)    8/8 (100.0%)
Historical Precision   4/12 (33.3%)   6/17 (35.3%)   8/19 (42.1%)
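A minimal sketch of how these four metrics could be computed, assuming A is the set of accepted hypotheses, R the set of rejected hypotheses, and A ∪ R all generated hypotheses, with conventional metrics counted over A and historical metrics counted over A ∪ R. These definitions are inferred from the slide labels and the counts in the table, and the function name and hypothesis identifiers are hypothetical. The example data reproduces the middle column (2/8, 2/5, 6/8, 6/17).

```python
from fractions import Fraction


def history_metrics(accepted, rejected, targets):
    """Return recall, precision, and their historical variants as Fractions."""
    generated = accepted | rejected          # A ∪ R: everything ever proposed
    correct_accepted = accepted & targets    # correct hypotheses still accepted
    correct_generated = generated & targets  # correct hypotheses ever generated
    return {
        "recall": Fraction(len(correct_accepted), len(targets)),
        "precision": Fraction(len(correct_accepted), len(accepted)),
        "historical_recall": Fraction(len(correct_generated), len(targets)),
        "historical_precision": Fraction(len(correct_generated), len(generated)),
    }


# Hypothetical hypothesis identifiers: t* are recognition targets, f* are not.
targets = {f"t{i}" for i in range(8)}
accepted = {"t0", "t1", "f0", "f1", "f2"}                    # 5 accepted, 2 correct
rejected = {"t2", "t3", "t4", "t5"} | {f"f{i}" for i in range(3, 11)}  # 12 rejected

for name, value in history_metrics(accepted, rejected, targets).items():
    print(f"{name}: {value} ({float(value):.1%})")
```

Under these definitions the historical metrics credit a system for correct hypotheses it generated but later rejected, which conventional recall and precision do not show.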

Page 19:

Cell Detection Results (Handley, 2001) RSL Re-implementation on Table ‘a038’ (UW-III)

*Inference times shown are those affecting cells

0: Input (words and lines)

1: Classify words as cells

16: Merge ‘horizontally close’ cells

35: Merge cells sharing column, row assignments. Nearly 50% of correct cells rejected; new correct cells also detected

47: Two cells merged producing column header ‘Total pore space (percent)’

51: Merge header cells bounded by two horizontal lines

83: Merge cells sharing line and white space separators

Page 20:

RSL and Math Entry

Proposal: “MIN” System
• New interface for math entry and offline experiments
• Use RSL to define recognition strategies, capture results.
• (Really): testbed for studying recognition algorithms and their intelligent combination, organization, and deployment in practice.

Goals:
• Compare different approaches to recognizing mathematical expressions (from input to output) represented in RSL
• Allow flexible training, combination, and alteration of various recognition strategies.
• Extend RSL to accommodate math and other problem domains more effectively, while remaining abstract

Page 21:

(Some) Relevant Journals and Conferences

Journals
• IEEE Trans. Pattern Analysis and Machine Intelligence
• Machine Learning
• Pattern Recognition
• Pattern Recognition Letters
• Artificial Intelligence
• Int’l J. Document Analysis and Recognition
• …

Conferences
• Int’l Conf. Machine Learning
• IEEE Computer Vision and Pattern Recognition
• Computational Learning Theory (COLT)
• Int’l Conf. Document Analysis and Recognition
• Int’l Work. Document Analysis Systems
• …

Page 22:

Thank you.

Questions?

Support:

GCCIS Department of Computer Science