00198175

A hierarchical classifier for visual label identification based on binary features

R Massen and J. Gassler

In order to control the flow of photo cassettes in an automatic sorting system, a local photo printing laboratory has developed a unique moulded container which accepts and holds in a fixed position the different film cassette types currently in use: 24/36 roll films, pocket size and mini-pocket. The containers pass different automatic magazining stations equipped with robotic handlers. Each handler has the task of picking up a specific class of film and stacking it into a magazine for further processing. Our goal was the optical recognition of the film type by company brand, number of exposures (24/36 exposures) and sensitivity (100/200 ASA). 30 classes have to be recognized automatically in 0.5 s each. Some 10,000 films are classified per day.

The only cue available for an optical classification was the printed pattern on the label. As colour was not available, we chose 3 types of visual grey-level features for automatic classification:
a) the average brightness in a small window: “BRIGHT”
b) the presence of an edge in a small window: “EDGE”
c) the presence of a “4” or a “6” digit in a small window: “4”

All features are binarized in a first step: feature a) by comparing the average intensity in the 20 by 20 pixel window to an average in a much larger neighbourhood; feature b) by comparing the local variance in the small window against a fixed threshold; feature c) by matching the window to 2 templates with a “4” and a “6” reference digit and deciding for the highest degree of correlation. We thus obtain for each window a set of 3 binarized features expressing the presence of a light or dark background (BRIGHT), the presence of an edge (EDGE) and the presence of a “24” exposure film (“4”). If the classification is critical, we mark the window as “non-classifiable”, so that we actually have a ternary result for the BRIGHT, the EDGE and the “4” feature in every window.

The position and the shape of typically 30 windows are fixed interactively by looking at difference images and choosing those parts of an image which are the most powerful in discriminating several classes of films (Fig. 1).

Fig. 1 Ternary features for brightness, edge and “4” digit are measured in 20 fixed-size windows to classify 30 different types of film cassettes
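The ternary binarization of the BRIGHT and EDGE features described above can be sketched as follows. This is only a minimal illustration under stated assumptions: the function names, the default thresholds and the `margin` used to mark a borderline result as “non-classifiable” are hypothetical, the whole image stands in for the “much larger neighbourhood”, and the template matching for the “4” feature is left out.

```python
from statistics import mean, pvariance


def ternary(value, threshold, margin):
    """Compare value with threshold; return None when too close to call."""
    if abs(value - threshold) <= margin:
        return None  # borderline result: "non-classifiable"
    return value > threshold


def window_pixels(image, top, left, size):
    """Collect the grey levels of a size x size window from a 2-D list."""
    return [image[r][c]
            for r in range(top, top + size)
            for c in range(left, left + size)]


def bright_feature(image, top, left, size, margin=5.0):
    """BRIGHT: window average compared with a larger-neighbourhood average."""
    local = mean(window_pixels(image, top, left, size))
    # Assumption: the whole image serves as the larger neighbourhood.
    neighbourhood = mean(p for row in image for p in row)
    return ternary(local, neighbourhood, margin)


def edge_feature(image, top, left, size, threshold=400.0, margin=50.0):
    """EDGE: local variance in the window compared with a fixed threshold."""
    variance = pvariance(window_pixels(image, top, left, size))
    return ternary(variance, threshold, margin)
```

Each function returns `True`, `False` or `None`, so a window yields exactly the three-valued result used by the classifier; the paper itself uses 20 by 20 pixel windows.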

R. Massen, J. Gassler, Transfer Centre Constance for Image Processing, D-7750 Constance

The classification is based on the principle of “exclusion”. Every window feature can discriminate between two sets of classes, each set containing at least one member. “Non-classifiable” windows are simply discarded. The simplified situation in Fig. 2 demonstrates our approach. We want to classify 4 types of films: Kodak-24, Fuji-24, Fuji-36 and Agfa-100. A set of 6 feature windows is used: 3 BRIGHT windows (F14, F23, F34), 2 EDGE windows (F12, F13) and one “4” window (F24). Feature F14 is able to discriminate between class 1 and class 4, feature F23 between class 2 and class 3, etc.

[Fig. 2 diagram: class 1 (Kodak-24), class 2 (Fuji-24), class 3 (Agfa-100) and class 4 (Fuji-36), linked by BRIGHT, EDGE and “4” feature windows]

Fig. 2 Simplified example for the classification of 4 types of films using 6 windows.

We arrange the features Fij into a binary decision matrix (Fig. 3). If feature F12 is TRUE, it excludes the presence of class 2. If feature F12 is FALSE, it excludes the presence of class 1.

[Fig. 3 matrix: rows i and columns j run over the n classes; each cell Fij has an “Fij = false” and an “Fij = true” entry]

Fig. 3 Binary decision matrix for excluding a class depending on the binarised window feature
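The exclusion rule can be illustrated with a small sketch. The table below plays the role of the decision matrix: for each feature it names the class excluded when the feature is TRUE and the class excluded when it is FALSE. The entries for F12 and F14 follow the text; the entries for F13 and F34, like the names `DECISION_MATRIX` and `exclude`, are illustrative assumptions.

```python
# Feature name -> (class excluded if TRUE, class excluded if FALSE).
# F12 and F14 follow the text; F13 and F34 are assumed for illustration.
DECISION_MATRIX = {
    "F12": (2, 1),
    "F13": (3, 1),
    "F34": (4, 3),
    "F14": (1, 4),
}


def exclude(candidates, measured):
    """Remove from candidates every class excluded by a measured feature.

    measured maps a feature name to True, False or None, where None
    stands for a "non-classifiable" window and is simply discarded.
    """
    remaining = set(candidates)
    for name, value in measured.items():
        if value is None:
            continue  # non-classifiable window carries no information
        if_true, if_false = DECISION_MATRIX[name]
        remaining.discard(if_true if value else if_false)
    return remaining
```

With these assumed entries, the readings F12 = TRUE, F34 = TRUE and F13 = TRUE exclude classes 2, 4 and 3 in turn and leave class 1 as the only candidate.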

We still have a free choice of the window with which to start the classification procedure. We compute a measure of the power of discrimination D(Fij) (Dij for short) for each window feature, which takes into account the following a-priori knowledge: the joint probability of class i and class j, p(Cij); the total number of class pairs which can be discriminated by a single feature Fij, Nij; and the computing time needed to calculate a feature Fij, Tij:

Dij = p(Cij) * Nij / Tij

Features which point to classes with high absolute probability, which are able to discriminate between large sets of classes and which can be computed in a short time rank at the top. Classification starts in this ranking order. The features are only extracted at this moment, so that only features which are really used have to be computed. Fig. 4 shows the binary decision tree with the ranked features. After each decision (node in the graph), the remaining set of features is reordered in an updated ranking list and classification steps forward to the next node. The graph also shows that the set of features is redundant. This redundancy is used as an additional check by backtracing through the classification tree to check the consistency of the remaining features. If, for instance, we have found that a class 1 film is present through the inspection of the window features F12, F34 and F13, we backtrace through feature F14, which must be FALSE for a consistent result. Non-consistent results lead to a “reject” decision; the films are routed to a manual inspection station.
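The ranked, lazy classification with its consistency check might look roughly like the sketch below. It is not the original implementation: the probabilities, pair counts and computing times in `FEATURES` are invented so that the ranking reproduces the order F12, F34, F13 of the example, the exclusion entries other than F12 and F14 are assumed, and the re-ranking rule (only features that can still split the remaining candidates are considered) is an assumption, since the text does not say how the list is updated.

```python
# All numbers are hypothetical. if_true / if_false give the class excluded
# by a TRUE / FALSE reading; p, n, t stand for p(Cij), Nij and Tij.
FEATURES = {
    "F12": {"if_true": 2, "if_false": 1, "p": 0.4, "n": 1, "t": 1.0},
    "F34": {"if_true": 4, "if_false": 3, "p": 0.3, "n": 1, "t": 1.0},
    "F13": {"if_true": 3, "if_false": 1, "p": 0.2, "n": 1, "t": 1.0},
    "F14": {"if_true": 1, "if_false": 4, "p": 0.2, "n": 1, "t": 2.0},
    "F23": {"if_true": 3, "if_false": 2, "p": 0.1, "n": 1, "t": 2.0},
    "F24": {"if_true": 4, "if_false": 2, "p": 0.1, "n": 1, "t": 4.0},
}


def power(spec):
    """Discrimination power Dij = p(Cij) * Nij / Tij."""
    return spec["p"] * spec["n"] / spec["t"]


def excluded_class(spec, value):
    """Class ruled out by a TRUE or FALSE reading of one feature."""
    return spec["if_true"] if value else spec["if_false"]


def classify(features, measure, classes):
    """Return the winning class, or None for a "reject" decision.

    measure(name) extracts one feature and returns True, False or None;
    it is called only for features that are actually needed.
    """
    remaining = set(classes)
    unused = dict(features)
    while len(remaining) > 1:
        # Re-rank: keep features that still separate two remaining classes.
        useful = [name for name, s in unused.items()
                  if s["if_true"] in remaining and s["if_false"] in remaining]
        if not useful:
            return None  # no feature left to decide: reject
        best = max(useful, key=lambda name: power(unused[name]))
        spec = unused.pop(best)
        value = measure(best)  # feature is extracted only at this moment
        if value is not None:  # None: non-classifiable, discarded
            remaining.discard(excluded_class(spec, value))
    winner = next(iter(remaining))
    # Backtrace: a redundant feature must not exclude the winning class.
    for name, spec in unused.items():
        if winner in (spec["if_true"], spec["if_false"]):
            value = measure(name)
            if value is not None and excluded_class(spec, value) == winner:
                return None  # inconsistent result: reject
    return winner
```

For a film with readings F12 = TRUE, F34 = TRUE, F13 = TRUE and F14 = FALSE, this sketch inspects F12, F34 and F13, settles on class 1 and then confirms it with F14; F23 and F24 are never extracted. If F14 were TRUE instead, the result would be a reject.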

Fig. 4 Ranked classification tree for the example of Fig. 2 and Fig. 3

This classifier was implemented on a 68000 VME-bus vision system developed at the Transfer Centre and installed on-line. Classification took 400 ms on average, with typically a 92% rate of unambiguous correct classification, an 8% reject rate and an unmeasurable rate of misclassification. The rather high reject rate resulted from a significant number of films whose labels had been torn off or written on by the customer. It was no problem to classify those films by hand.
