Graphics Recognition – from Re-engineering to Retrieval
Karl Tombre, Bart Lamiroy
LORIA, France
Document Analysis in the IR era
Information is at the core of industrial strategies
A lot of digital or digitized information, but often in very “poor” formats
The challenge: not necessarily re-engineering documents, but enriching poorly structured information, adding a (limited) amount of semantics, building indexes
Purposes: browsing, navigation, indexing
DAR methods and tools are useful, but must be adapted
Specific challenges of large-scale IR applications
Genericity: we cannot necessarily build a complete and exhaustive a priori model of contextual knowledge (ontology)
Adaptability: various input data (scanned paper, PDF, DXF, HTML, GIF…), various resolutions
Robustness: “back-office” applications
Efficiency: online searching in heterogeneous data
Scaling: methods have to scale to an increasing number of symbols/features
DAR and IR
Media without (or with very little) contextual knowledge
Image-based indexing and retrieval, indexing of video sequences
Documents do explicitly convey information from one person to another person
Much more structure, syntax and semantics
DAR and IR – some examples
Indexing and/or searching scanned text without OCR
Similarities, signatures
Query or index on layout structure
Table spotting
Keyword spotting
…
What about Graphics Recognition?
Subfield of DAR, for graphics-rich documents
Numerous methods for various analysis and recognition problems
Raster-to-vector conversion
Text/graphics separation
Symbol recognition
Many specific technical areas: maps, architectural drawings, engineering drawings, diagrams and schematics, …
Graphics recognition methods
Text/graphics separation
Vectorization
Graphics recognition and IR applications
Usual text-based indexing and retrieval still useful
But need for access to other kinds of information:
Symbols
Text-drawing connections
Description-illustration connections
Some contributions
Syeda-Mahmood: maintenance drawings. IEEE Trans. on PAMI 21(8):737-751, Aug. 1999
Arias et al., Najman et al.: use of information contained in legend / title block. Proc. GREC’01, Kingston (Ontario, Canada), pp. 19-26, Sept. 2001
Samet & Soffer: symbols from legend. IEEE Trans. on PAMI 18(8):783-798, Aug. 1996
Müller & Rigoll: graphical retrieval in database of engineering drawings. Proc. ICDAR’99, Bangalore (India), pp. 697-700, Sept. 1999
Boose et al. (Boeing): Generation of Layered Illustrated Parts Drawings. Proc. GREC’03, Barcelona, pp. 139-144
Wishful thinking?
Symbol DB
Or even better…
Symbol recognition
Natural features for indexing and retrieval
Most methods work with known databases of reference symbols; what about interactive querying of arbitrary symbols?
From segmentation followed by recognition, to segmentation-free recognition, or segmenting while recognizing
Scalability
Efficiency / complexity
Discrimination power
Signatures
Before we move on: 1st contest on symbol recognition held last week
See IAPR TC10 homepage for further details
Image-based signatures
Compute invariant signatures on binary document image
F-signatures (ICDAR’01)
Radon transform: R-signatures [Tabbone & Wendling]
Ridgelets [Ramos Terrades & Valveny – GREC’03], aka wavelet transform of Radon transform
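As an illustration of an image-based signature, here is a minimal sketch of an R-signature-like descriptor. The slides do not give the formula; the sketch assumes the commonly cited definition, in which the squared Radon transform is integrated over ρ for each angle θ. The function name `r_signature`, the binning scheme and the normalization are assumptions made for this example, not the authors' implementation.

```python
import math

def r_signature(image, n_angles=180):
    """Approximate R-signature of a binary image given as rows of 0/1 values.

    For each angle theta, foreground pixels are projected onto the axis
    rho = x*cos(theta) + y*sin(theta) (a crude discrete Radon projection),
    and the squared projection is summed over rho.
    """
    points = [(x, y) for y, row in enumerate(image)
              for x, value in enumerate(row) if value]
    if not points:
        return [0.0] * n_angles

    # Centre on the centroid to reduce sensitivity to translation.
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    centred = [(x - cx, y - cy) for x, y in points]

    rho_max = max(math.hypot(x, y) for x, y in centred) or 1.0
    n_bins = 2 * math.ceil(rho_max) + 1

    signature = []
    for k in range(n_angles):
        theta = math.pi * k / n_angles
        c, s = math.cos(theta), math.sin(theta)
        projection = [0] * n_bins
        for x, y in centred:
            rho = x * c + y * s
            index = int((rho + rho_max) / (2 * rho_max) * (n_bins - 1) + 0.5)
            projection[index] += 1
        signature.append(sum(v * v for v in projection))

    # Normalise so that the signature does not depend on the pixel count.
    total = sum(signature)
    return [v / total for v in signature]
```

For a horizontal stroke, all pixels project onto the same ρ at θ = 90°, so the signature is larger there than at θ = 0°, where the pixels spread over distinct bins.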
R-signatures: detection of arrowheads [Girardeau & Tabbone]
DEA degree thesis, INPL, Nancy, Jul. 2002
R-signatures: another example [Girardeau & Tabbone]
Ridgelets [Ramos Terrades & Valveny – GREC’03]
Proc. GREC’03, Barcelona, pp. 202-211
Vector-based signatures
[Dosch & Lladós – GREC’03]
Based on set of basic graphical features:
Parallelism
Overlap
Collinearity
T- and V-junctions
Quality factor associated with the various relations
Match signatures of reference symbols with signatures of buckets
Proc. GREC’03, Barcelona, pp. 159-169
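To make the idea of a vector-based signature concrete, the sketch below counts pairwise relations between line segments. It is a simplified illustration, not the method of Dosch & Lladós: the quality factors, the overlap relation and the bucket partitioning are left out, and the function names and tolerances (`ang_tol`, `tol`) are assumptions.

```python
import math
from collections import Counter

def _orientation(seg):
    (x1, y1), (x2, y2) = seg
    return math.atan2(y2 - y1, x2 - x1) % math.pi  # undirected, in [0, pi)

def _angle_diff(a, b):
    d = abs(a - b)
    return min(d, math.pi - d)

def _dist_to_line(p, seg):
    # Distance from p to the infinite line through seg (seg must not be degenerate).
    (x1, y1), (x2, y2) = seg
    length = math.hypot(x2 - x1, y2 - y1)
    return abs((x2 - x1) * (y1 - p[1]) - (x1 - p[0]) * (y2 - y1)) / length

def _on_interior(p, seg, tol):
    # True if p lies on seg, away from both endpoints.
    (x1, y1), (x2, y2) = seg
    if _dist_to_line(p, seg) > tol:
        return False
    dx, dy = x2 - x1, y2 - y1
    t = ((p[0] - x1) * dx + (p[1] - y1) * dy) / (dx * dx + dy * dy)
    margin = tol / math.hypot(dx, dy)
    return margin < t < 1 - margin

def relation(a, b, ang_tol=math.radians(5), tol=1.0):
    """Classify the relation between two segments, or return None."""
    if _angle_diff(_orientation(a), _orientation(b)) <= ang_tol:
        if all(_dist_to_line(p, a) <= tol for p in b):
            return "collinear"
        return "parallel"
    if any(math.dist(p, q) <= tol for p in a for q in b):
        return "V-junction"
    if any(_on_interior(p, b, tol) for p in a) or any(_on_interior(p, a, tol) for p in b):
        return "T-junction"
    return None

def vector_signature(segments):
    """Count each relation type over all pairs of segments."""
    counts = Counter()
    for i in range(len(segments)):
        for j in range(i + 1, len(segments)):
            r = relation(segments[i], segments[j])
            if r is not None:
                counts[r] += 1
    return counts
```

A rectangle drawn with four segments yields four V-junctions and two parallel pairs; two symbols can then be compared through their counts.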
Towards symbol spotting
Pre-compute – or compute on the spot – a set of basic signatures
Can be sufficient for symbol spotting and retrieval
Followed by classical symbol recognition if more discrimination is needed
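The coarse-to-fine scheme above can be sketched as follows: rank pre-computed signatures by distance to the query, keep the best candidates, and optionally pass them to a finer recognizer. The function `spot`, the Euclidean distance and the `recognizer` callback are hypothetical choices for this sketch.

```python
import math

def spot(query_signature, index, k=3, recognizer=None):
    """Return up to k region ids, closest signature first.

    index      -- dict mapping a region id to its pre-computed signature
    recognizer -- optional callable(region_id) -> bool, standing in for a
                  classical symbol recognizer applied only to the shortlist
    """
    ranked = sorted(index, key=lambda rid: math.dist(query_signature, index[rid]))
    candidates = ranked[:k]
    if recognizer is not None:
        candidates = [rid for rid in candidates if recognizer(rid)]
    return candidates

# Hypothetical index of four regions with two-dimensional signatures.
index = {"r1": (0.9, 0.1), "r2": (0.1, 0.9), "r3": (0.8, 0.2), "r4": (0.5, 0.5)}
shortlist = spot((1.0, 0.0), index, k=2)
```

The expensive recognizer only runs on the shortlist, which is what keeps the approach tractable on large document bases.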
Symbol spotting [Jabari & Tabbone]: graph matching through probabilistic relaxation, with nodes = segments and edges = relations
DEA degree thesis, INPL, Nancy, Jul. 2003
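For readers unfamiliar with probabilistic relaxation, here is a minimal sketch of one classical form of it (a multiplicative relaxation-labeling update), applied to matching scene segments against model segments through their pairwise relations. It is not claimed to be the formulation of Jabari & Tabbone; the data layout, the binary compatibility and the function name are assumptions, and a real spotting system would also need a "no match" label for scene segments outside the symbol.

```python
def relaxation_match(model_rel, scene_rel, model_nodes, scene_nodes, n_iter=20):
    """Assign a model node to each scene node by relaxation labeling.

    model_rel / scene_rel map frozenset({a, b}) to a relation name.
    """
    # Start from uniform label probabilities for every scene node.
    p = {i: {m: 1.0 / len(model_nodes) for m in model_nodes} for i in scene_nodes}
    for _ in range(n_iter):
        new_p = {}
        for i in scene_nodes:
            scores = {}
            for lam in model_nodes:
                support = 1e-9  # avoids a zero denominator
                for j in scene_nodes:
                    rel = scene_rel.get(frozenset((i, j))) if j != i else None
                    if rel is None:
                        continue
                    # Neighbour j supports label lam through every label mu
                    # whose model relation with lam equals the scene relation.
                    for mu in model_nodes:
                        if mu != lam and model_rel.get(frozenset((lam, mu))) == rel:
                            support += p[j][mu]
                scores[lam] = p[i][lam] * support
            total = sum(scores.values())
            new_p[i] = {lam: v / total for lam, v in scores.items()}
        p = new_p
    return {i: max(p[i], key=p[i].get) for i in scene_nodes}

# Hypothetical model symbol with three segments, and a scene with the same structure.
model_rel = {frozenset("AB"): "V-junction", frozenset("BC"): "T-junction",
             frozenset("AC"): "parallel"}
scene_rel = {frozenset((0, 1)): "T-junction", frozenset((1, 2)): "V-junction",
             frozenset((0, 2)): "parallel"}
matching = relaxation_match(model_rel, scene_rel, ["A", "B", "C"], [0, 1, 2])
```

Each iteration reinforces the labels that agree with the neighbours' current labels, so consistent assignments gain probability while inconsistent ones fade.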
Symbol spotting [Jabari & Tabbone]: another example
Combining Text and Graphics
Extracting Text/Graphics relationships within document
Using Text matching for inter-document relationships
Transitive inter-document Graphics matching
No need for complex graphics matching
Restricted to well known document types
Example: continuation of Wiring Diagrams (Boeing) [Baum et al. – GREC’03]
Proc. GREC’03, Barcelona, pp. 132-138
Scan2XML Example
Proc. GREC’01, Kingston (Ontario, Canada), pp. 312-325
Indexing and Semantics
Signature + metric
Semantics = measured distance to signature
Applies only to homogeneous contexts
Pre-segmented images
Pre-determined image classes
Implicit application of domain knowledge ...
Semantics = Syntax
Example
Signature type A, metric M
Semantics1 = (1, 1)
Semantics2 = (, 2)
Signature value M( M(
Semantics = measurement to reference value
Heterogeneous Document Bases
Semantics do not have a unique syntax anymore
Syntax metrics may be context sensitive
Semantics = Syntax + Context
Context needs to be considered
Two different contexts from the automobile industry
Example
Context 1: signature type A, metric M
(1, 1) = Semantics1 = (1, 1)
(, 2) = Semantics2 = (, 2)
Context 2: signature type B, metric N
Signature value
What if M( and N(
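The point of the example can be sketched in code: the same item is interpreted through a signature, a metric and reference values, and all three belong to the context. The slide's symbols were lost, so everything concrete below is hypothetical: the reading of each semantics as a (reference value, tolerance) pair, the feature names, the numbers and the label `Semantics3`.

```python
import math

def interpret(item, context):
    """Label `item` within one context, or return None if nothing is close enough."""
    signature = context["signature"](item)
    best_label, best_distance = None, math.inf
    for label, (reference, tolerance) in context["references"].items():
        distance = context["metric"](signature, reference)
        if distance <= tolerance and distance < best_distance:
            best_label, best_distance = label, distance
    return best_label

# Hypothetical context 1: "signature type A" with "metric M".
context_1 = {
    "signature": lambda item: item["aspect_ratio"],
    "metric": lambda a, b: abs(a - b),
    "references": {"Semantics1": (1.0, 0.2), "Semantics2": (3.0, 0.5)},
}
# Hypothetical context 2: "signature type B" with "metric N".
context_2 = {
    "signature": lambda item: (item["strokes"], item["loops"]),
    "metric": math.dist,
    "references": {"Semantics3": ((4, 1), 1.0)},
}

item = {"aspect_ratio": 1.1, "strokes": 4, "loops": 1}
labels = {name: interpret(item, ctx)
          for name, ctx in (("context 1", context_1), ("context 2", context_2))}
```

Here both contexts accept the item but attach different labels to it, which is the situation the slide asks about: without knowing the context, the measurement alone does not determine the semantics.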
A step towards taking context into account (while consolidating existing approaches)
Component Algebra: Image Analysis = Pipeline
Syntax + algorithm = semantics
[Figure: pipeline alternating Data and Algorithm boxes; the data are labeled (syntax), (semantics), (semantics)]
Syntax and semantics need not be distinguished
Component Algebra
Components: known and implemented document analysis algorithms, taking input data from one domain, and producing data into another domain.
Application Context: set of all available Components.
Semantics: data sets needed by or produced by Components.
Component Algebra is a Graph
[Figure: graph linking Data nodes through Component nodes]
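Since the slides state that the formalism has no realization yet, the following is only a toy sketch of how such a graph could be represented: data domains are nodes, components are typed edges, and a breadth-first search finds a pipeline between two domains. The class `Component`, the domain names and the three toy components are all hypothetical.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable

@dataclass(frozen=True)
class Component:
    name: str
    source: str      # data domain consumed
    target: str      # data domain produced
    run: Callable    # the algorithm itself

def find_pipeline(components, start, goal):
    """Breadth-first search over data domains; returns a list of components or None."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        domain, path = queue.popleft()
        if domain == goal:
            return path
        for c in components:
            if c.source == domain and c.target not in seen:
                seen.add(c.target)
                queue.append((c.target, path + [c]))
    return None

def run_pipeline(path, data):
    for component in path:
        data = component.run(data)
    return data

# The application context is the set of available components (toy examples).
COMPONENTS = [
    Component("binarize", "gray_image", "binary_image",
              lambda img: [[1 if v < 128 else 0 for v in row] for row in img]),
    Component("list_foreground", "binary_image", "pixel_list",
              lambda img: [(x, y) for y, row in enumerate(img)
                           for x, v in enumerate(row) if v]),
    Component("count", "pixel_list", "pixel_count", len),
]

path = find_pipeline(COMPONENTS, "gray_image", "pixel_count")
result = run_pipeline(path, [[0, 200], [130, 10]])
```

Changing the set of components changes which paths exist, which is one way to read the statement that different contexts give different paths in the graph.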
Advantages
Each node is a semantic concept, semantic relationships are explicitly expressed.
Structure may support automatic reasoning and knowledge inference.
Context is embedded in components, different contexts give different paths in the graph.
Highly scalable and open architecture.
Bridge between signal-level document analysis and high-level document representation.
However ...
The formalism exists, the realization doesn't (yet)
What about parametrization?
How context independent can you get?
What about “guessing” context appropriateness?
How to design fully interoperable components?
Conclusion
A lot of DA methods, and more specifically GR methods, can be of direct use in IR, indexing and browsing applications
Specific challenges
Scaling and efficiency
Heterogeneous sets of documents
Incomplete domain knowledge
Symbol spotting
On-the-fly symbol searching
Sketch of open framework for including document semantics when context can be heterogeneous