citlab argus fer keyword search in historical handwritten...

24
CITlab ARGUS fer Keyword Search in Historical Handwritten Documents Thanks to Mauricio for giving the talk. Tobias Strauß Tobias Grüning Gundram Leifert Roger Labahn September 1, 2016 CITlab ARGUS fer Keyword Search in Historical Handwritten Documents | URO-CITlab 1 / 24

Upload: others

Post on 02-May-2020

7 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: CITlab ARGUS fer Keyword Search in Historical Handwritten ...imageclef.org/system/files/CITlab_ARGUS.pdf · CITlab ARGUS fer Keyword Search in Historical Handwritten Documents Thanks

CITlab ARGUS fer Keyword Search inHistorical Handwritten DocumentsThanks to Mauricio for giving the talk.

Tobias StraußTobias GrüningGundram LeifertRoger Labahn

September 1, 2016 CITlab ARGUS fer Keyword Search in Historical Handwritten Documents | URO-CITlab 1 / 24

Page 2: CITlab ARGUS fer Keyword Search in Historical Handwritten ...imageclef.org/system/files/CITlab_ARGUS.pdf · CITlab ARGUS fer Keyword Search in Historical Handwritten Documents Thanks

Handwritten Scanned Document Retrieval Task 2016• advanced keyword spotting task

Composition of the data sets provided by the organizers

#pages #lines #characters polygons baselines

training set 363 9645 65488 yes nodevelopment set 433 10589 80758 yes notest set 200 6355 — no yes

September 1, 2016 CITlab ARGUS fer Keyword Search in Historical Handwritten Documents | URO-CITlab 2 / 24

Page 3: CITlab ARGUS fer Keyword Search in Historical Handwritten ...imageclef.org/system/files/CITlab_ARGUS.pdf · CITlab ARGUS fer Keyword Search in Historical Handwritten Documents Thanks

System Overview

Polygon2BaseLine

Image LineInfos

BaseLine2Polygon

Preprocessing

Feature Generation

MDRNN

Training Decoding

external input

generation of consistent surrounding polygons

optical model (generation of confidence matrices)

advanced KWS (BBs)

September 1, 2016 CITlab ARGUS fer Keyword Search in Historical Handwritten Documents | URO-CITlab 3 / 24

Page 4: CITlab ARGUS fer Keyword Search in Historical Handwritten ...imageclef.org/system/files/CITlab_ARGUS.pdf · CITlab ARGUS fer Keyword Search in Historical Handwritten Documents Thanks

Polygon2BaseLine

Image LineInfos

BaseLine2Polygon

Preprocessing

Feature Generation

MDRNN

Training Decoding

external input

generation of consistent surrounding polygons

optical model (generation of confidence matrices)

advanced KWS (BBs)

September 1, 2016 CITlab ARGUS fer Keyword Search in Historical Handwritten Documents | URO-CITlab 4 / 24

Page 5: CITlab ARGUS fer Keyword Search in Historical Handwritten ...imageclef.org/system/files/CITlab_ARGUS.pdf · CITlab ARGUS fer Keyword Search in Historical Handwritten Documents Thanks

Polygon2BaseLine

1. Convolving the input image with Sobel-Kernels:

2. Evaluation of statistics on the sobel image to calculate the baselines

Module Output (Baselines) is calculated given the polygons

September 1, 2016 CITlab ARGUS fer Keyword Search in Historical Handwritten Documents | URO-CITlab 5 / 24

Page 6: CITlab ARGUS fer Keyword Search in Historical Handwritten ...imageclef.org/system/files/CITlab_ARGUS.pdf · CITlab ARGUS fer Keyword Search in Historical Handwritten Documents Thanks

Polygon2BaseLine

Image LineInfos

BaseLine2Polygon

Preprocessing

Feature Generation

MDRNN

Training Decoding

external input

generation of consistent surrounding polygons

optical model (generation of confidence matrices)

advanced KWS (BBs)

September 1, 2016 CITlab ARGUS fer Keyword Search in Historical Handwritten Documents | URO-CITlab 6 / 24

Page 7: CITlab ARGUS fer Keyword Search in Historical Handwritten ...imageclef.org/system/files/CITlab_ARGUS.pdf · CITlab ARGUS fer Keyword Search in Historical Handwritten Documents Thanks

BaseLine2Polygon

1. Work on the sobel image

2. For each baseline calculate a medial seam (by maximizing sobel activity)

3. For each medial seam calculate two separating seams by using dynamicprogramming to calculate an optimal path from ”left to right”

4. Merge both separating seams to get a surrounding polygon per baseline

Module Output (Polygons) is calculated given the baselines

September 1, 2016 CITlab ARGUS fer Keyword Search in Historical Handwritten Documents | URO-CITlab 7 / 24

Page 8: CITlab ARGUS fer Keyword Search in Historical Handwritten ...imageclef.org/system/files/CITlab_ARGUS.pdf · CITlab ARGUS fer Keyword Search in Historical Handwritten Documents Thanks

Polygon2BaseLine

Image LineInfos

BaseLine2Polygon

Preprocessing

Feature Generation

MDRNN

Training Decoding

external input

generation of consistent surrounding polygons

optical model (generation of confidence matrices)

advanced KWS (BBs)

September 1, 2016 CITlab ARGUS fer Keyword Search in Historical Handwritten Documents | URO-CITlab 8 / 24

Page 9: CITlab ARGUS fer Keyword Search in Historical Handwritten ...imageclef.org/system/files/CITlab_ARGUS.pdf · CITlab ARGUS fer Keyword Search in Historical Handwritten Documents Thanks

PreprocessingInput: Line image with surrounding polygons

Image normalization: contrast enhancement and global size normalization

Writing normalization: deskewing and deslanting

Output: Normalized line image to fixed image heightwith mainbody centered in the middle of the image

September 1, 2016 CITlab ARGUS fer Keyword Search in Historical Handwritten Documents | URO-CITlab 9 / 24

Page 10: CITlab ARGUS fer Keyword Search in Historical Handwritten ...imageclef.org/system/files/CITlab_ARGUS.pdf · CITlab ARGUS fer Keyword Search in Historical Handwritten Documents Thanks

Polygon2BaseLine

Image LineInfos

BaseLine2Polygon

Preprocessing

Feature Generation

MDRNN

Training Decoding

external input

generation of consistent surrounding polygons

optical model (generation of confidence matrices)

advanced KWS (BBs)

September 1, 2016 CITlab ARGUS fer Keyword Search in Historical Handwritten Documents | URO-CITlab 10 / 24

Page 11: CITlab ARGUS fer Keyword Search in Historical Handwritten ...imageclef.org/system/files/CITlab_ARGUS.pdf · CITlab ARGUS fer Keyword Search in Historical Handwritten Documents Thanks

Feature GenerationInput: Normalized line image

Output: 3-dimensional frequency feature

September 1, 2016 CITlab ARGUS fer Keyword Search in Historical Handwritten Documents | URO-CITlab 11 / 24

Page 12: CITlab ARGUS fer Keyword Search in Historical Handwritten ...imageclef.org/system/files/CITlab_ARGUS.pdf · CITlab ARGUS fer Keyword Search in Historical Handwritten Documents Thanks

Polygon2BaseLine

Image LineInfos

BaseLine2Polygon

Preprocessing

Feature Generation

MDRNN

Training Decoding

external input

generation of consistent surrounding polygons

optical model (generation of confidence matrices)

advanced KWS (BBs)

September 1, 2016 CITlab ARGUS fer Keyword Search in Historical Handwritten Documents | URO-CITlab 12 / 24

Page 13: CITlab ARGUS fer Keyword Search in Historical Handwritten ...imageclef.org/system/files/CITlab_ARGUS.pdf · CITlab ARGUS fer Keyword Search in Historical Handwritten Documents Thanks

Multidimensional Recurrent Neural Network

INPUT LAYER

FEATURE LAYER

2D LAYER 1

0D LAYER 2

2D LAYER 2

OUTPUT LAYER

Subsample 1

Subsample 2

Subsample 3

September 1, 2016 CITlab ARGUS fer Keyword Search in Historical Handwritten Documents | URO-CITlab 13 / 24

Page 14: CITlab ARGUS fer Keyword Search in Historical Handwritten ...imageclef.org/system/files/CITlab_ARGUS.pdf · CITlab ARGUS fer Keyword Search in Historical Handwritten Documents Thanks

Output of the Network: Confidence Matrix (ConfMat)Input: 3-dimensional frequency featureOutput: Confidence of a character at a specific position (high confidence = black)

Transcription: “contribute to exempt him from a burthen or a biu-”

September 1, 2016 CITlab ARGUS fer Keyword Search in Historical Handwritten Documents | URO-CITlab 14 / 24

Page 15: CITlab ARGUS fer Keyword Search in Historical Handwritten ...imageclef.org/system/files/CITlab_ARGUS.pdf · CITlab ARGUS fer Keyword Search in Historical Handwritten Documents Thanks

Polygon2BaseLine

Image LineInfos

BaseLine2Polygon

Preprocessing

Feature Generation

MDRNN

Training Decoding

external input

generation of consistent surrounding polygons

optical model (generation of confidence matrices)

advanced KWS (BBs)

September 1, 2016 CITlab ARGUS fer Keyword Search in Historical Handwritten Documents | URO-CITlab 15 / 24

Page 16: CITlab ARGUS fer Keyword Search in Historical Handwritten ...imageclef.org/system/files/CITlab_ARGUS.pdf · CITlab ARGUS fer Keyword Search in Historical Handwritten Documents Thanks

Network Training• The network is trained using CTC as error Function and an extension of

Nesterov’s Accelerated Gradient Descent with momentum µ = 0.9 forparameter update.• One epoch contains 10.000 training samples (lines) from the training set.

No lines of the devel or the test set were used for training at all.• A learning rate λ = 5e − 4 is used for 19 epochs using the original

images.• For 3 epochs noise to preprocess parameter and network activation are

added with the same learning rate.• For additional 19 epochs we set λ = 5e − 5 and add degradation (pixel

noise, blur, cross outs,...) to the image.

September 1, 2016 CITlab ARGUS fer Keyword Search in Historical Handwritten Documents | URO-CITlab 16 / 24

Page 17: CITlab ARGUS fer Keyword Search in Historical Handwritten ...imageclef.org/system/files/CITlab_ARGUS.pdf · CITlab ARGUS fer Keyword Search in Historical Handwritten Documents Thanks

Polygon2BaseLine

Image LineInfos

BaseLine2Polygon

Preprocessing

Feature Generation

MDRNN

Training Decoding

external input

generation of consistent surrounding polygons

optical model (generation of confidence matrices)

advanced KWS (BBs)

September 1, 2016 CITlab ARGUS fer Keyword Search in Historical Handwritten Documents | URO-CITlab 17 / 24

Page 18: CITlab ARGUS fer Keyword Search in Historical Handwritten ...imageclef.org/system/files/CITlab_ARGUS.pdf · CITlab ARGUS fer Keyword Search in Historical Handwritten Documents Thanks

Decoding I

Find words regionsFor any keywordωωω and any line image XXX

1. Let rrr(ωωω) := .*ωωω .*2. Find the most likely path πππ∗ matching rrr(ωωω)3. Calculate the likelihood

Ps:e(πππ∗(ωωω) | XXX ) :=

e∏t=s

yt,π∗t

ofωωω. (s, e) are determined by the subpath πππs:e which is related toωωω.4. Acceptωωω if Ps:e(πππ

∗(ωωω) | XXX ) exceeds a certain threshold

September 1, 2016 CITlab ARGUS fer Keyword Search in Historical Handwritten Documents | URO-CITlab 18 / 24

Page 19: CITlab ARGUS fer Keyword Search in Historical Handwritten ...imageclef.org/system/files/CITlab_ARGUS.pdf · CITlab ARGUS fer Keyword Search in Historical Handwritten Documents Thanks

Decoding II

HyphensConfMats of interest end on hyphen symbols such as -:=.

1. Find ConfMats matching rrr := .* [A-Za-z]+ [-:=]2. Combine those columns of the ConfMat which match [A-Za-z]+ with the

ConfMat of the next line3. Search in the resulting new ConfMat for keywords

September 1, 2016 CITlab ARGUS fer Keyword Search in Historical Handwritten Documents | URO-CITlab 19 / 24

Page 20: CITlab ARGUS fer Keyword Search in Historical Handwritten ...imageclef.org/system/files/CITlab_ARGUS.pdf · CITlab ARGUS fer Keyword Search in Historical Handwritten Documents Thanks

Decoding III

ScoreReestimate the likelihood of the neural net by:

s(ωωω) :=1N

PT (ωωω)PS(ωωω)

Ps:e(πππ∗(ωωω) | XXX )

N :=∑ωωω′∈A∗

PT (ωωω′)PS(ωωω′)

Ps:e(πππ∗(ωωω′) | XXX )

where PT (ωωω) is the uni-gram prior probability ofωωω in the corpus and PS(ωωω) isthe prior probability estimated by the network.

September 1, 2016 CITlab ARGUS fer Keyword Search in Historical Handwritten Documents | URO-CITlab 20 / 24

Page 21: CITlab ARGUS fer Keyword Search in Historical Handwritten ...imageclef.org/system/files/CITlab_ARGUS.pdf · CITlab ARGUS fer Keyword Search in Historical Handwritten Documents Thanks

Decoding IV

Multi word query ωωω1:n := ωωω1, . . . ,ωωωn

Search all wordsωωωi individually and combine their score by

s(ωωω1:n) :=

(n∏

i=1

s(ωωωi )

) 1n

.

September 1, 2016 CITlab ARGUS fer Keyword Search in Historical Handwritten Documents | URO-CITlab 21 / 24

Page 22: CITlab ARGUS fer Keyword Search in Historical Handwritten ...imageclef.org/system/files/CITlab_ARGUS.pdf · CITlab ARGUS fer Keyword Search in Historical Handwritten Documents Thanks

Decoding V

Implicit network priors PSEstimate the network priors PS by• abs: the real uni-grams PS(zzz) = PT (zzz)• prior: a constant PS(zzz) ∝ 1

• da: character priors PS(zzz) ∝(∏|zzz|

i=1 P(zi ))c

where P(zi ) is the character prior and 0 < c ≤ 1.

Results: global average precision (gAP) on segment level

abs prior da baseline

devel set 94.81 95.36 94.99 74.2test set 33.89 43.20 46.74 14.4

September 1, 2016 CITlab ARGUS fer Keyword Search in Historical Handwritten Documents | URO-CITlab 22 / 24

Page 23: CITlab ARGUS fer Keyword Search in Historical Handwritten ...imageclef.org/system/files/CITlab_ARGUS.pdf · CITlab ARGUS fer Keyword Search in Historical Handwritten Documents Thanks

Decoding VI

CTC vs. path probabilityCompare path probability Ps:e(πππ

∗(ωωω) | XXX ) with CTC-probability i.e. the sumover all feasible paths related toωωω in the range from s to e.

Results: gAP on segment level

abs prior dapath ctc path ctc path ctc baseline

devel set 94.81 94.89 95.36 95.42 94.99 95.04 74.2test set 33.89 33.98 43.20 43.72 46.74 47.13 14.4

September 1, 2016 CITlab ARGUS fer Keyword Search in Historical Handwritten Documents | URO-CITlab 23 / 24

Page 24: CITlab ARGUS fer Keyword Search in Historical Handwritten ...imageclef.org/system/files/CITlab_ARGUS.pdf · CITlab ARGUS fer Keyword Search in Historical Handwritten Documents Thanks

Thank you for your attention.

Sorry for not being here.

September 1, 2016 CITlab ARGUS fer Keyword Search in Historical Handwritten Documents | URO-CITlab 24 / 24