330 IEEE TRANSACTIONS ON MULTIMEDIA, VOL. 12, NO. 4, JUNE 2010
In-Image Accessibility Indication
Meng Wang, Member, IEEE, Yelong Sheng, Bo Liu, and Xian-Sheng Hua, Member, IEEE
Abstract—About 8% of men and 0.8% of women suffer from colorblindness. Due to the loss of certain color information, regions or objects in several images cannot be recognized by these viewers, and this may degrade their perception and understanding of the images. This paper introduces an in-image accessibility indication scheme, which aims to automatically point out regions in which the content can hardly be recognized by colorblind viewers in a manually designed image. The proposed method first establishes a set of points around which the patches are not prominent enough for colorblind viewers due to the loss of color information. The inaccessible regions are then detected based on these points via a regularization framework. This scheme can be applied to check the accessibility of designed images, and consequently it can be used to help designers improve the images, such as by modifying the colors of several objects or components. To the best of our knowledge, this is the first work that attempts to detect regions with accessibility problems in images for colorblindness. Experiments are conducted on 1994 poster images, and empirical results have demonstrated the effectiveness of our approach.
Index Terms—Accessibility indication, colorblindness, poster image.
I. INTRODUCTION
COLORS play an important role in humans' perception and recognition of visual objects. They are perceived by humans with their cones absorbing photons and sending electrical signals to the brain. According to their peak sensitivity, the cones can be categorized into Long (L), Middle (M), and Short (S), which absorb long wavelengths, medium wavelengths, and short wavelengths, respectively. Consequently, light is perceived as three members $(L, M, S)$, where $L$, $M$, and $S$ represent the amounts of photons absorbed by the L-, M-, and S-cones, respectively. More formally, the color stimulus for a light can be computed as the integration over the wavelengths $\lambda$:

$L = \int \Phi(\lambda)\,\bar{l}(\lambda)\,d\lambda, \quad M = \int \Phi(\lambda)\,\bar{m}(\lambda)\,d\lambda, \quad S = \int \Phi(\lambda)\,\bar{s}(\lambda)\,d\lambda$   (1)

where $\Phi(\lambda)$ stands for the power spectral density of the light, and $\bar{l}$, $\bar{m}$, and $\bar{s}$ indicate the sensitivity functions of the L-, M-, and S-cones.
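The integration in (1) can be sketched numerically as follows. The Gaussian sensitivity curves, their peak wavelengths, and the sampling grid used here are illustrative placeholders rather than the true physiological cone fundamentals:

```python
import numpy as np

# Illustration of Eq. (1): each cone response is the integral of the
# light's power spectral density weighted by that cone's sensitivity.
# The Gaussian curves below are placeholders for the real fundamentals.
wavelengths = np.linspace(380.0, 720.0, 341)  # visible range, in nm
step = wavelengths[1] - wavelengths[0]

def sensitivity(peak, width):
    # Placeholder bell-shaped cone sensitivity centered at `peak` nm.
    return np.exp(-0.5 * ((wavelengths - peak) / width) ** 2)

l_bar = sensitivity(560.0, 50.0)  # L-cone, long wavelengths
m_bar = sensitivity(530.0, 45.0)  # M-cone, medium wavelengths
s_bar = sensitivity(420.0, 30.0)  # S-cone, short wavelengths

def cone_response(phi):
    # Riemann-sum approximation of the integrals in Eq. (1).
    return (np.sum(phi * l_bar) * step,
            np.sum(phi * m_bar) * step,
            np.sum(phi * s_bar) * step)

# An equal-energy spectrum stimulates all three cone types.
L, M, S = cone_response(np.ones_like(wavelengths))
```

With this sketch, a narrow long-wavelength (reddish) spectrum yields a much larger L response than S response, which is the imbalance that dichromatic vision can no longer discriminate.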
Colorblindness, formally known as color vision deficiency, is caused by the deficiency or lack of a certain type of cone. Dichromats are those who have only two types
Manuscript received July 24, 2009; revised December 18, 2009; accepted March 01, 2010. First published March 22, 2010; current version published May 14, 2010. The associate editor coordinating the review of this manuscript and approving it for publication was Prof. Abdulmotaleb El Saddik.
M. Wang and X.-S. Hua are with Microsoft Research Asia, Beijing 100096, China (e-mail: [email protected]; [email protected]).
Y. Sheng is with Beihang University, Beijing 100191, China (e-mail: [email protected]).
B. Liu is with the University of Science and Technology of China, Hefei 230027, China (e-mail: [email protected]).
Color versions of one or more of the figures in this paper are available onlineat http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TMM.2010.2046364
Fig. 1. (a) Three poster images that are collected from the Internet. (b) Views of a protanope (a viewer with a type of red-green colorblindness), generated using the algorithm in [3]. We can see that there are important regions or objects in the images that can hardly be recognized due to the loss of color information.
of cones, and they consist of protanopes, deuteranopes, and tritanopes, which correspond to the lack of L-cones, M-cones, and S-cones, respectively. Protanopes and deuteranopes have difficulty discriminating red from green, whereas tritanopes have difficulty discriminating blue from yellow.
Due to the loss of color information, high-quality images
for normal viewers may not be readily perceived by colorblind
viewers. However, many images or drawings that are designed with the intention of being appreciated by the general public, such as graphics, posters, and slides, do not take colorblind users into account. In fact, currently most standards about the
1520-9210/$26.00 © 2010 IEEE
WANG et al.: IN-IMAGE ACCESSIBILITY INDICATION 331
Fig. 2. Schematic illustration of in-image accessibility indication.
accessibility of multimedia focus on their access on the web
while ignoring the content of the media data [10]. For example,
Fig. 1 illustrates three poster images and their protanopic views
(i.e., what a protanope perceives). We can clearly see that there are important objects that can hardly be recognized by the colorblind viewer due to the loss of color information, and this will degrade the perception or understanding of these images for the colorblind viewer.
To deal with the problem, in this paper, we propose an ap-
proach that is able to indicate regions that encounter the acces-
sibility problem for colorblind viewers, i.e., the regions con-
tain information that may not be well perceived by colorblind
viewers (in the following discussion, we name them inacces-
sible regions for simplicity). This scheme can be applied in dif-
ferent scenarios, such as checking the accessibility of designed
images and helping designers avoid the accessibility problem
by making changes. Of course, a straightforward approach to
solving the accessibility problem is to directly show image designers the simulated colorblind view of the image and then let
the designers find if there are inaccessible regions. For example,
there is a plug-in named Vischeck [2] that can illustrate the
colorblind view of images in Photoshop, a well-known image editing application. However, this approach degrades the experience of designers since they need to check the designed images every time they are revised. This problem is even worse
for the design of slides, as it is labor-intensive to check each
slide. Therefore, in this work, we propose an in-image accessi-
bility indication approach, which automatically detects the in-
accessible regions in designed images. This scheme can help
designers find the problem more efficiently and then consider
changing the designs, such as modifying the colors of severalobjects or components.
To the best of our knowledge, this is the first work that attempts to indicate accessibility problems in images for colorblind viewers. The main scheme of our approach is shown in Fig. 2.
First, we compute the gradient maps of the original image and its
colorblind view. Then we perform an inaccessible point detection step to find a set of points around which the patches are not prominent enough for colorblind viewers due to the loss of color
information. The inaccessible regions are located based on these
points via a regularization framework. In this work, we focus
on protanopia and deuteranopia as most dichromats belong to
these two types, but our methods can also be extended to deal
with tritanopia. Existing studies also show that the perceptions
of protanopic and deuteranopic viewers are very close [3], [13]. Actually, most of these colorblind viewers are not aware of which type of colorblindness they have, and they only know they are red-green colorblind. So we will not distinguish these two types of colorblindness in our study.
The organization of the rest of this paper is as follows. In Section II, we provide a short review of the related work. In
Section III, we introduce the detailed accessibility indication
approach, including inaccessible point detection and inacces-
sible region location. Experimental results are presented in
Section IV. Finally, we conclude the paper in Section V.
II. RELATED WORK
There are extensive research efforts dedicated to helping colorblind viewers better access or enjoy visual documents. Clearly, understanding what colorblind viewers observe is a basis. Therefore, many works have been devoted to simulating colorblindness. Brettel et al. [3] proposed a method that transforms colors from the RGB space to the long, medium, short (LMS)
color space based on cone response and then modifies the re-
sponse of the deficient cones. This algorithm is widely adopted
by colorblindness simulation systems such as Vischeck [2] and IBM aDesigner's low-vision mode [1].
Yang et al. proposed an approach that is able to quantify colorblindness and a color compensation scheme that can enhance the perception of colorblind viewers [17], [18]. Several efforts have been dedicated to recoloring, which aims to help colorblind viewers
better recognize visual documents, such as images, videos, and
web pages [2], [11], [12], [16]. These methods usually analyze
the distribution of colors in a visual document, and then a map-
ping function is adopted to change the colors such that several
details in the document can be enhanced. Dougherty et al. [2]
proposed an image recoloring process named Daltonize, which
first increases the red/green contrast in the image and then uses
the red/green contrast information to adjust brightness and blue/
yellow contrast. Iaccarino et al. [6] proposed a simple recoloring method to improve the accessibility of web pages. Yang et al. [16] proposed a method which changes a monochromatic hue into another hue with less saturation for dichromats. Rasche et al. formulated the recoloring task as a dimensionality reduction problem, i.e., how to map the colors in a three-dimensional
space into a two-dimensional space that can be recognized by
colorblind viewers [13]. Huang et al. [5] proposed an image re-
coloring algorithm that keeps both the discriminative abilities of
colors and the naturalness of the image. In [7], Jefferson et al.
provided an interface to support interactive recoloring implementation for colorblind viewers. In [14] and [8], Wang et al. and Liu et al. proposed an efficient recoloring method which can even be applied in real-time video processing. Although several encouraging results have been shown in these recoloring efforts, as indicated in [14], the quality of many images can hardly be enhanced since 1-D color information has been lost in the colorblind
view. It is also worth noting that the recoloring approach and our
proposed scheme are essentially different. Our scheme aims to
help normal viewers better design and analyze images for col-
orblind viewers, i.e., it serves colorblind viewers via accommo-
dating normal viewers, whereas the recoloring approach directly
assists colorblind viewers in better perceiving several images.
III. IN-IMAGE ACCESSIBILITY INDICATION
As previously mentioned, the two main steps of in-image accessibility indication are inaccessible point detection and inaccessible region location. We introduce them in detail in the
following two subsections.
A. Inaccessible Point Detection
Inaccessible points are defined as the points around which the
patches are not prominent enough for colorblind viewers due to
the loss of color information. As noted by Marr, visual informa-
tion extracted by an observer from visual stimulus is conveyed
by changes perceived as gradients and edges [4], [9]. Therefore, we estimate the information loss as the difference of the gradient maps of the original image and its protanopic view (as previously mentioned, the protanopic and deuteranopic views have only a small difference, and thus here we only employ the protanopic view [3], [13]). As existing studies reveal that the information loss of protanopia and deuteranopia mainly comes from the
Fig. 3. (a) Gradient maps in the a* channel of the original images ($G_A$). (b) Gradient maps in the a* channel of the simulated colorblind views ($\tilde{G}_A$). (c) Difference of $G_A$ and $\tilde{G}_A$ ($G_A - \tilde{G}_A$). (d) Full gradient maps of the simulated colorblind views. (e) Detected inaccessible points. The original images and their colorblind views can be found in Fig. 1.
a* channel of the LAB color space [3], [5], we only estimate the gradient maps in this channel, which can be obtained as

$G_A(i) = \|\nabla A(i)\|$   (2)

$\tilde{G}_A(i) = \|\nabla \tilde{A}(i)\|$   (3)

where $A(i)$ and $G_A(i)$ are the values of the a* component and the gradient magnitude at the $i$-th pixel in the original image, and $\tilde{A}(i)$ and $\tilde{G}_A(i)$ are the corresponding values in its colorblind view. Therefore, the information loss around point $i$ can be estimated as $G_A(i) - \tilde{G}_A(i)$.
It is worth noting that we need to select the points around which the patches not only have significant information loss but also are not prominent in the colorblind view. If we only select points according to the information loss criterion, we may
obtain several points that can still be recognized by colorblind viewers even if there exists significant information loss.
Therefore, we also compute the full gradient map of the colorblind view of the image, denoted by $\tilde{G}$, which is the sum of the gradient maps of the L, a*, and b* channels, and the inaccessible point detection is accomplished based on the following criterion.

Criterion: Point $i$ is inaccessible if $G_A(i) - \tilde{G}_A(i) > T_1$ and $\tilde{G}(i) < T_2$, where $T_1$ and $T_2$ are two pre-defined thresholds.
Fig. 3 illustrates the gradient maps of the three exemplary
images illustrated in Fig. 1 as well as the detected inaccessible
points. To make the figures clear, we have normalized the gradient maps such that the maximum value of $G_A$, $\tilde{G}_A$, and $\tilde{G}$ for each image is 255. To generalize the method to deal with tritanopia, we just need to replace $G_A$ and $\tilde{G}_A$ with the full gradient maps of the original image and its colorblind view, respectively, and the region location step does not need to change.
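The detection step above can be sketched as follows, assuming the a* channels of the original image and of its simulated colorblind view are available as 2-D arrays; the function names and toy arrays are ours, not the paper's:

```python
import numpy as np

def gradient_magnitude(channel):
    # Per-pixel gradient magnitude of a 2-D channel via finite differences.
    gy, gx = np.gradient(channel.astype(float))
    return np.hypot(gx, gy)

def inaccessible_points(a_orig, a_cb, full_grad_cb, t1=15.0, t2=15.0):
    """Apply the inaccessibility criterion with the empirical T1 = T2 = 15.

    a_orig, a_cb: a* channel of the original image and its colorblind view.
    full_grad_cb: full gradient map of the colorblind view (L + a* + b*).
    """
    loss = gradient_magnitude(a_orig) - gradient_magnitude(a_cb)
    # Inaccessible: large information loss AND the patch is not
    # prominent in the colorblind view.
    return (loss > t1) & (full_grad_cb < t2)

# Toy example: a strong a* (red-green) edge that vanishes after simulation.
a_orig = np.zeros((8, 8)); a_orig[:, 4:] = 100.0  # sharp red-green edge
a_cb = np.zeros((8, 8))                           # edge lost for the viewer
full_cb = np.zeros((8, 8))                        # nothing else prominent
mask = inaccessible_points(a_orig, a_cb, full_cb)
```

If the same region is already prominent in the colorblind view (large `full_grad_cb`), the second condition of the criterion suppresses the detection, which matches the discussion above.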
Fig. 4. Greedy strategy to find the solution of (5).
B. Inaccessible Region Location
Based on the detected inaccessible points, we locate inaccessible regions with bounding boxes.1 The task is actually finding a set of regions $\mathcal{R} = \{r_1, r_2, \ldots, r_K\}$ that cover the inaccessible points. If we fix the number of regions $K$, the problem can be formulated as

$\min_{\mathcal{R}} \sum_{k=1}^{K} s(r_k)$   (4)

where $s(r_k)$ indicates the area size of the region $r_k$. The above
equation is straightforward since minimizing the size of the re-
gions is helpful in obtaining accurate indication. However, there
is a dilemma about the number of regions: more regions can lead
to more accurate indication, but it will be distractive for users.
Therefore, we add a regularizer on the number of regions in (4),
which thus turns to
$\min_{\mathcal{R}} \frac{1}{S} \sum_{k=1}^{K} s(r_k) + \lambda K$   (5)

where $S$ is the size of the whole image and $\lambda$ is a weighting
factor. The above optimization problem is difficult to solve,
since the solution space scales exponentially with the number
of inaccessible points. Therefore, we propose a greedy strategy
to obtain the solution of (5) with an incremental process. The algorithm is illustrated in Fig. 4.
Now we analyze this process. We can see that it works in an
incremental way and in each step two regions are merged, and
the selection of the two regions is optimal with respect to the
objective in (5). The number of regions is decided by traversing
all the possibilities and then selecting the one that minimizes
the objective in (5). This process is actually analogous to the agglomerative
1 In order to validate our approach, we have conducted a simple user study to investigate different presentation methods, including 1) indicating inaccessible regions with bounding boxes; 2) indicating inaccessible regions with bounding polygons; and 3) directly showing inaccessible points. Twelve persons who are familiar with poster design were involved in the study. For each person, we illustrate the detection results for the poster images with accessibility problems using different presentation methods (the dataset is introduced in Section IV), and then the person is asked to choose his/her preference. The study results show that ten of the 12 participants chose bounding boxes, and they agree that this method achieves a good tradeoff between accuracy, simplicity, and clearness.
clustering approach [15]. The difference is that, in the agglom-
erative clustering algorithm, two clusters are selected to merge
according to their distance, but in our method, we select two re-
gions based on the increase of area size after they are merged.
This is due to the difference of the objectives: the objective
of clustering is the minimization of distance between clusters, whereas our objective is the minimization of the sizes of the regions.
This method may not obtain the optimal solution of (5), but it is
computationally efficient and shows encouraging performance
in our experiments.
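A minimal sketch of this greedy merging, in the spirit of Fig. 4; the box representation, point list, and parameter names are our own, and the paper's exact procedure may differ in detail:

```python
# Greedy region location for objective (5): start with one unit bounding
# box per inaccessible point and repeatedly merge the pair of boxes whose
# union increases the covered area least, keeping the region count that
# minimizes  sum(area)/S + lam * K.

def union(a, b):
    # Smallest box (x0, y0, x1, y1) covering boxes a and b.
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))

def area(b):
    return (b[2] - b[0] + 1) * (b[3] - b[1] + 1)

def objective(boxes, S, lam):
    return sum(area(b) for b in boxes) / S + lam * len(boxes)

def locate_regions(points, S, lam=0.05):
    boxes = [(x, y, x, y) for x, y in points]
    best, best_obj = list(boxes), objective(boxes, S, lam)
    while len(boxes) > 1:
        # Merge the pair whose union grows the covered area the least.
        pairs = [(i, j) for i in range(len(boxes)) for j in range(i + 1, len(boxes))]
        i, j = min(pairs, key=lambda p: area(union(boxes[p[0]], boxes[p[1]]))
                                        - area(boxes[p[0]]) - area(boxes[p[1]]))
        merged = union(boxes[i], boxes[j])
        boxes = [b for k, b in enumerate(boxes) if k not in (i, j)] + [merged]
        obj = objective(boxes, S, lam)
        if obj < best_obj:   # remember the best region count seen so far
            best, best_obj = list(boxes), obj
    return best

# Two tight clusters of inaccessible points collapse into two boxes.
points = [(1, 1), (2, 2), (50, 50), (52, 51)]
regions = locate_regions(points, S=100 * 100, lam=0.05)  # two boxes remain
```

As in the analysis above, the pair selection is driven by area growth rather than by inter-cluster distance, which is exactly where this differs from standard agglomerative clustering.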
IV. EXPERIMENTS
A. Experimental Settings
We collect a poster image dataset as follows. First, we perform image search with the query "poster" as well as its translations in Dutch and Chinese on Google and collect 3000 returned results. Of course, some of the collected images are actually not posters and there are many duplicates. We then conduct a filtering process and obtain 1994 distinct poster images in
all. Fig. 5 illustrates several exemplary images. Every image is
scaled such that its width is 240 pixels. Fig. 6 illustrates the dis-
tributions of the R, G, and B components of the pixels in these
images as well as a distribution of their dominant colors. From
the figure, we can see that the images are diverse in colors. We
implement the accessibility indication algorithm on these images. The parameters $T_1$ and $T_2$ for inaccessible point detection are empirically set to 15, and the parameter $\lambda$ in (5) is set to
0.05. In the experiments, we also set a threshold in the indica-
tion of inaccessible regions: a region is ignored if it occupies
less than 25 pixels since, in most cases, it is caused by noise in the image and typically such a small region does not convey important information. Fig. 7 illustrates the detection results for
several images that encounter the accessibility problem.
B. Evaluation
Three red-green colorblind viewers and an experienced
poster designer with normal vision participated in the annotation of ground truths. The labeling process is as follows: each image is examined by the colorblind viewers and the designer, and then they have a discussion about the details of the image, such as objects and characters. If an image contains objects or regions that are clear for the normal viewer but can hardly be recognized by a colorblind viewer, it is labeled as disqualified.
Fig. 5. Several exemplary poster images.
Fig. 6. (a) Distributions of the R, G, and B components of the pixels in the poster images. (b) Distribution of images with different dominant colors.
In this way, the images are grouped into two classes, i.e., quali-
fied and disqualified. For simplicity, we denote them by class 0
and class 1, respectively. The labels show that 222 among the
1994 images are disqualified. This number also demonstrates
that such an accessibility indication tool is highly desired.
The detection results are shown in Table I, where $n(i, j)$ indicates the number of images that are labeled as class $i$ and predicted as class $j$. The precision and recall measurements of accessibility detection are 0.900 and 0.977, respectively.
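Here precision and recall follow from the confusion counts as $n(1,1)/(n(0,1)+n(1,1))$ and $n(1,1)/(n(1,0)+n(1,1))$, with "disqualified" as the positive class. The counts below are illustrative stand-ins chosen only to be consistent with the reported totals (1994 images, 222 disqualified), not the paper's actual Table I entries:

```python
def precision_recall(n):
    # n[i][j]: number of images labeled class i and predicted class j,
    # where class 1 (disqualified) is the positive class.
    tp, fp, fn = n[1][1], n[0][1], n[1][0]
    return tp / (tp + fp), tp / (tp + fn)

# Illustrative counts: 217 of the 222 disqualified images detected,
# with 24 qualified images falsely flagged (the totals sum to 1994).
counts = {0: {0: 1748, 1: 24}, 1: {0: 5, 1: 217}}
p, r = precision_recall(counts)  # roughly 0.900 and 0.977
```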
We also study the sensitivity of the two parameters $T_1$ and $T_2$. Figs. 8 and 9 illustrate the performance of accessibility detection when $T_1$ and $T_2$ vary from 10 to 20, respectively. From the results, we can see that the precision and recall measurements remain above 0.8 when these two parameters vary in a
wide range.
We then evaluate our region location approach for disqual-
ified images. Each located region is judged to be correct
or incorrect according to whether the content in the region
can hardly be recognized by colorblind viewers. The labeling
process is carried out via the discussion of the designer and the
three colorblind viewers, which is analogous to the labeling
of accessibility ground truths. The annotators also point out
if they consider there are regions that should be indicated but
missed by our algorithm. The statistical results show that there are 36 regions among the 242 indicated ones that are considered unsuitable by the annotators. In addition, the annotators
believe that there are 42 regions missed. As previously mentioned, the parameter $\lambda$ is used to achieve a tradeoff for the
dilemma on the number of regions, as fractional regions can
lead to more accurate indication but they will be distractive
for users. Fig. 10 illustrates the region location results with
different $\lambda$ for the three exemplary images. Here we have only
shown the simulated views, and the original images can be
found in Fig. 1. We can see that there will be too many fractional regions, and they are distractive, when $\lambda$ tends to be small, and a too great value of $\lambda$ will degrade the accuracy of the regions. As a weighting parameter in regularization, currently there is no method to automatically adjust it for each specific image, but our results show that a global setting of $\lambda = 0.05$
is already able to achieve fairly good results. We can estimate
that the precision and recall measurements of region location
are 0.870 and 0.918, respectively.
We then further categorize the poster dataset according to
the simplicity levels of the images. Each image is labeled to be
simple, complex, or neutral by a human according to the rich-
ness of its content. The labeling results show that the simple,
complex, and neutral subsets contain 627, 422, and 945 images, respectively. We then estimate the detection performance on these three subsets with the same parameter settings, namely
Fig. 7. Several examples of in-image accessibility indication. Here we have simultaneously illustrated the original images and their simulated views.
TABLE I
ACCESSIBILITY DETECTION RESULTS, WHERE $n(i, j)$ INDICATES THE NUMBER OF IMAGES THAT ARE LABELED AS CLASS $i$ AND PREDICTED AS CLASS $j$. HERE CLASS 0 AND CLASS 1 INDICATE QUALIFIED AND DISQUALIFIED, RESPECTIVELY
Fig. 8. Performance variation of accessibility detection with respect to $T_1$. Here the parameter $T_2$ is set to 15.
Fig. 9. Performance variation of accessibility detection with respect to $T_2$. Here the parameter $T_1$ is set to 15.
Fig. 10. Region location results with different $\lambda$. We can see that there will be too many fractional regions when $\lambda$ tends to be small, and they will be distractive for users. On the other hand, a too great value of $\lambda$ will degrade the accuracy of the regions. We set the parameter $\lambda$ to 0.05 in this work, and this value achieves a good compromise between the above two issues for most images.
$T_1 = T_2 = 15$. Table II illustrates the results. We also illustrate
the results achieved with optimal parameter settings for each
parameter, which are tuned by maximizing F-score that com-promises precision and recall. We can see that the setting of
TABLE II
ACCESSIBILITY DETECTION PERFORMANCE ON DIFFERENT SUBSETS. THE 2ND AND 3RD COLUMNS ILLUSTRATE THE PERFORMANCE WITH OUR EMPIRICAL PARAMETER SETTINGS AND OPTIMAL SETTINGS OF $T_1$ AND $T_2$. P AND R INDICATE PRECISION AND RECALL, RESPECTIVELY
TABLE IIIINACCESSIBLE REGION LOCATION PERFORMANCE ON DIFFERENT SUBSETS.
P AND R INDICATE PRECISION AND RECALL, RESPECTIVELY
achieves good results on each subset, and they
are close to the optimal results. Table III illustrates the performance of inaccessible region location on different subsets with the setting of $\lambda = 0.05$, and we can see that the precision and recall measurements are all above 0.8. This demonstrates that the
parameters are not very sensitive with respect to different data.
The computational cost of our approach mainly consists of two parts: one for inaccessible point detection and the other for inaccessible region location. The two costs scale with the number of image pixels and the number of inaccessible points, respectively. In our experiments, the time cost of processing one image is less than 40 ms on average on a PC with a 3.0-GHz Pentium 4 CPU and 1 GB of memory.
V. CONCLUSION
This paper introduces an in-image accessibility indication
scheme that aims to automatically point out regions that can
hardly be recognized by colorblind viewers in a manually
designed image. The proposed method first establishes a set
of points around which the patches are not prominent for
colorblind viewers due to the loss of color information. The
inaccessible regions are detected and indicated based on these
points via a regularization framework. The method is simple yet
effective, and it is able to process an image in less than 40 ms.
Experiments are conducted on a large set of poster images, and
empirical results have demonstrated the effectiveness of our
approach.
REFERENCES
[1] Accessibility Research: aDesigner. [Online]. Available: http://www.research.ibm.com/trl/projects/acc_tech/adesigner.htm
[2] Vischeck. [Online]. Available: http://www.vischeck.com
[3] H. Brettel, F. Vienot, and J. Mollon, "Computerized simulation of color appearance for dichromats," J. Opt. Soc. Amer., vol. 14, no. 10, pp. 2647–2655, 1997.
[4] R. Hong, C. Wang, Y. Ge, M. Wang, and X. Wu, "Salience preserving multi-focus image fusion," in Proc. Int. Conf. Multimedia and Expo, 2009, pp. 1663–1666.
[5] J. B. Huang, Y. C. Tseng, S. I. Wu, and S. J. Wang, "Information preserving color transformation for protanopia and deuteranopia," IEEE Signal Process. Lett., vol. 14, no. 10, pp. 711–714, Oct. 2007.
[6] G. Iaccarino, D. Malandrino, M. D. Percio, and V. Scarano, "Efficient edge-services for colorblind users," in Proc. Int. World Wide Web Conf., 2006, pp. 919–920.
[7] L. Jefferson and R. Harvey, "An interface to support color blind computer users," in Proc. SIGCHI, 2007, pp. 1535–1538.
[8] B. Liu, M. Wang, L. Yang, X. Wu, and X. S. Hua, "Efficient image and video recoloring," in Proc. Int. Conf. Multimedia and Expo, 2009, pp. 906–909.
[9] D. Marr, Vision. San Francisco, CA: Freeman, 1982.
[10] L. Moreno, P. Martinez, and B. Ruiz-Mezcua, "Disability standards for multimedia on the web," IEEE Multimedia, vol. 15, no. 4, pp. 52–54, Jan.–Mar. 2008.
[11] J. Nam, Y. M. Ro, Y. Huh, and M. Kim, "Visual content adaptation according to user perception characteristics," IEEE Trans. Multimedia, vol. 7, no. 3, pp. 435–445, Jun. 2005.
[12] K. Rasche, R. Geist, and J. Westall, "Detail preserving reproduction of color images for monochromats and dichromats," IEEE Comput. Graph. Appl., vol. 25, no. 3, pp. 22–30, Jul.–Aug. 2005.
[13] K. Rasche, R. Geist, and J. Westall, "Re-coloring images for gamuts of lower dimension," in Proc. Eurographics, 2005, pp. 423–432.
[14] M. Wang, B. Liu, and X. S. Hua, "Accessible image search," in Proc. ACM Multimedia, 2009, pp. 291–300.
[15] R. Xu and D. Wunsch II, "Survey of clustering algorithms," IEEE Trans. Neural Netw., vol. 16, no. 3, pp. 645–678, May 2005.
[16] S. Yang and Y. M. Ro, "Visual content adaptation for color vision deficiency," in Proc. Int. Conf. Image Processing, 2003, pp. 453–456.
[17] S. Yang, Y. M. Ro, E. K. Wong, and J. H. Lee, "Quantification and standardized description of color vision deficiency caused by anomalous trichromats—Part I: Simulation and measurement," EURASIP J. Image Video Process., vol. 2008, no. 1, pp. 1–9, 2008.
[18] S. Yang, Y. M. Ro, E. K. Wong, and J. H. Lee, "Quantification and standardized description of color vision deficiency caused by anomalous trichromats—Part II: Modeling and color compensation," EURASIP J. Image Video Process., vol. 2008, no. 1, pp. 1–12, 2008.