

A New Method for Automatic Clothing Tagging
Can We Do Better to Reduce Workload?

Sections: Utilizing Image-Click-Ads, Introduction, Conclusion, Automatic Clothing Tagging

A Tradeoff between Workload and Accuracy

- Low workload, low accuracy
- High workload, high accuracy

Low Workload, High Accuracy?

Manual Tagging Tool
Assisted Tagging Tool

While developing our image tagging system, we found that webmasters prefer low-workload tagging tools. However, the bounding-box tagging tool suffers from low accuracy, because it can only tell roughly where the object is.

On the other hand, the tool we developed for crowdsourcing workers can produce tags of high accuracy, but it also requires more work.

So the question we ask in this part is: can we develop a tagging tool that has a low workload while producing high-accuracy tags?

The Objective

- Automatically segment object regions in an image
- With an intuitive tagging interface
- For the purpose of achieving low workload and high accuracy in the tagging task

The Research Problem

To design a new automatic segmentation method for:
- Reducing workload
- Producing high-accuracy tagged images
- In the domain of clothing images

We focus on fashion images because they play a major role in the fashion market, e-commerce, and image-click-ads platforms.

Background: Image Parsing Models

Image Parsing (Semantic Segmentation)

Model | Unary Term | Pairwise Term
MRF | Probability of each node for each class | Local neighborhood knowledge
CRF | Probability of each node for each class | More complex prior knowledge

CRF is better

CRF example

Traditional methods for image parsing usually fall into two categories: modeling the problem as either an MRF or a CRF.

An MRF model usually encodes local neighborhood knowledge. It computes the label assignment for each pixel (the probability shown here) by minimizing an energy function. If we look at the factor graph for the basic MRF model, each predicted label depends only on the observed variable.
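To make the energy-minimization idea concrete, here is a minimal sketch of how the energy of one candidate labeling could be scored. It assumes a unary cost equal to the negative log-probability of the chosen label and a Potts-style pairwise cost (a constant penalty when two connected nodes disagree). The function name `mrf_energy`, the `smoothness` parameter, and all numbers are hypothetical; the slides do not give the exact energy function used.

```python
import numpy as np

def mrf_energy(labels, unary_prob, edges, smoothness=1.0):
    """Energy of a labeling: unary cost plus a Potts pairwise cost (assumed form)."""
    labels = np.asarray(labels)
    # Unary term: negative log-probability of the chosen label at each node.
    unary = -np.log(unary_prob[np.arange(len(labels)), labels]).sum()
    # Pairwise term: constant penalty for each edge whose endpoints disagree.
    pairwise = sum(smoothness for i, j in edges if labels[i] != labels[j])
    return unary + pairwise

# Three nodes in a chain, two classes (all numbers are made up).
unary_prob = np.array([[0.9, 0.1],
                       [0.6, 0.4],
                       [0.2, 0.8]])
edges = [(0, 1), (1, 2)]

smooth_labeling = mrf_energy([0, 0, 0], unary_prob, edges)
mixed_labeling = mrf_energy([0, 0, 1], unary_prob, edges)
print(smooth_labeling, mixed_labeling)
```

In this toy case the labeling that switches class at the last node has the lower energy, because the unary evidence there outweighs the single disagreement penalty. An actual system would search over labelings with a graph-cut solver rather than scoring them one by one.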

A CRF model can usually encode more complex prior knowledge, for example through the pairwise term shown here.

One way to distinguish a CRF model from an MRF is that a CRF model has no pure prior.

In recent years, CRFs have usually been considered to outperform MRFs because of their ability to encode more complex prior knowledge. However, in our work, we will show that this is not always true, especially for some particular problems, and fashion images are one such example.

Related Work: Clothing Parsing

- Color-category labels [Liu et al 2014]: e.g. blue shirt
- Data-driven method [Yamaguchi et al 2013]: retrieve similar images from the dataset
- CRF-based method [Yamaguchi et al 2012]
  - Unary term
  - Prior distribution: pairwise co-occurrence of clothing labels
  - Prior distribution: probability of neighboring pairs having the same label

fashion images == common images?

Clothing parsing for fashion images is the focus of our work.

However, all existing works ignore the characteristics of fashion images.

Drawbacks (1/5)
Problem 1: Fails to Preserve Local Contrast
Cause: large variations in configuration and appearance, resulting in unreliable prior distribution knowledge, a common problem of CRF.

The state-of-the-art clothing parsing method that we mainly referred to in our work was proposed by Yamaguchi et al. in 2012. It is a CRF-based method. Her model encodes two prior distributions learned from the training dataset.

Taking this image as an example, it learns that tops are near shorts, and that skin is near tops, shorts, and shoes.
Prior 1: e.g. tops are near shorts
Prior 2: e.g.

Drawbacks (2/5)
Problem 2: Background Spill
Cause: feature similarity

Drawbacks (3/5)
Problem 3: Incomplete Region Prediction
Cause: occlusion

Drawbacks (4/5)
Problem 4: Over-Smoothing of Infrequent Labels
Mentioned in [Yamaguchi et al 2012]
Cause: unbalanced labeled data (low frequency, small region)

Unbalanced Labeled Data

For example: Probability = 0.4

This problem was first mentioned in the work of Yamaguchi.

We think the cause of this problem is unbalanced labeled data, because infrequent labels, such as necklace and purse, usually appear at very low frequency and in very small regions.

Drawbacks (5/5)
Problem 5: High Computational Cost
Cause: hierarchical segmentation method



The Proposed Method

A variant of MRF (Re-Weighted MRF) that introduces:
- Background prior: addresses the background spill problem
- Occlusion prior: addresses the occlusion problem
- A re-weighted pairwise term: addresses the over-smoothing of infrequent labels
Integrated with the SLIC superpixel segmentation method:
- Addresses the high computational cost
Obtains better clothing parsing performance in:
- Parsing accuracy
- Processing efficiency

However, the design of the CRF-based method ignored many characteristics of fashion images.

First of all, clothing items have large variations in configuration and appearance. This means that not only tops can be near shorts, but blazers, t-shirts, sweaters, and even suits can be near shorts too. Also, a clothing item does not have a unified appearance. It could be white shorts, yellow shorts, or shorts with different color patterns.

Second, clothing usually has lots of layering, so one item can easily occlude another. Clothing images therefore naturally have lots of occlusions.

Third, people wearing clothing are usually positioned in the center of the photo.

Lastly, there are many infrequent labels learned from small regions, such as necklace and purse. In the parsing process, those infrequent-label regions are easily over-smoothed into neighboring regions.

According to these observations, we propose to use an MRF model with an occlusion prior, a background prior, and a re-weighted pairwise term to produce better parsing results.

In the following, I will demonstrate each of the problems in more detail and explain why the proposed method can resolve it.

The Proposed Method: Workflow

Workflow stages: pose estimation, superpixel, background prior, occlusion prior, Re-Weighted MRF

Two New Priors

Background Prior

Occlusion Prior
NON: neighbors of neighborhood
For V
For V*

We propose two new priors.

One is the occlusion prior. Because clothing images are naturally occluded, considering only neighboring nodes is not enough, so we should also take neighbors of neighbors into account. We do so by adding edges between each node and the neighbors of its neighbors, computed as the square of the graph and shown as the red lines.
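The graph-square step can be sketched with an adjacency matrix: two nodes are connected in the squared graph if they are at most two hops apart in the original one. This is a small illustration under the assumption that superpixel adjacency is stored as a dense 0/1 matrix; the helper name `neighbors_of_neighbors` and the four-node chain are hypothetical.

```python
import numpy as np

def neighbors_of_neighbors(adjacency):
    """Return the boolean adjacency of the squared graph (no self-loops)."""
    a = np.asarray(adjacency, dtype=int)
    # (A @ A)[i, j] > 0 means j can be reached from i in exactly two steps.
    two_step = (a @ a) > 0
    squared = (a > 0) | two_step
    np.fill_diagonal(squared, False)
    return squared

# A chain of four superpixels: 0 - 1 - 2 - 3
adjacency = np.array([[0, 1, 0, 0],
                      [1, 0, 1, 0],
                      [0, 1, 0, 1],
                      [0, 0, 1, 0]])

squared = neighbors_of_neighbors(adjacency)
# Edges present only in the squared graph (the "red lines" in the slide's terms).
added = squared & ~(adjacency > 0)
print(np.argwhere(added))
```

For the chain, the added edges link nodes 0 and 2 and nodes 1 and 3, while nodes 0 and 3 stay unconnected because they are three hops apart.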

The other is the background prior. Because people wearing clothing are salient figures positioned in the center of the image, the image border should be background.
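One way to act on this observation is to collect the superpixels that touch the image border and treat them as background candidates. The sketch below assumes a 2D integer map of superpixel ids (such as a superpixel segmentation would produce). The function name `border_superpixels` and the toy map are hypothetical, and how strongly these candidates are pushed toward the background label is not specified in the slides.

```python
import numpy as np

def border_superpixels(segments):
    """Return the ids of superpixels that touch the image border."""
    segments = np.asarray(segments)
    border = np.concatenate([
        segments[0, :],    # top row
        segments[-1, :],   # bottom row
        segments[:, 0],    # left column
        segments[:, -1],   # right column
    ])
    return np.unique(border)

# Toy 4x4 superpixel map: superpixel 4 sits in the center of the image.
segments = np.array([[0, 0, 1, 1],
                     [0, 4, 4, 1],
                     [2, 4, 4, 3],
                     [2, 2, 3, 3]])

background_seeds = border_superpixels(segments)
print(background_seeds)
```

Here superpixels 0 to 3 are flagged as background candidates, while the central superpixel 4 is left for the parser to label.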

Re-weighted Pairwise Term

Probability prediction: 0.9, 0.4

Original Pairwise Term

So we propose a re-weighted pairwise term, which adjusts the pairwise term based on the data term computation.

What this does is that when a patch has a small probability value for its predicted label, it tunes down the pairwise term as well.
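The down-weighting described above might look like the following sketch. The slides do not give the actual formula, so the scaling rule used here (multiply the original similarity weight by the weaker of the two predicted probabilities) is purely an assumption for illustration, as is the function name `reweighted_pairwise`. The values 0.9 and 0.4 echo the probabilities shown on the slide; the similarity weight 0.8 is made up.

```python
def reweighted_pairwise(similarity, confidence_i, confidence_j):
    """Scale a feature-similarity weight by the weaker unary confidence.

    Assumption: multiplying by the minimum confidence is illustrative only;
    the slides do not state the actual re-weighting formula.
    """
    return similarity * min(confidence_i, confidence_j)

original = 0.8  # hypothetical pairwise weight from feature similarity

confident_pair = reweighted_pairwise(original, 0.9, 0.9)
uncertain_pair = reweighted_pairwise(original, 0.9, 0.4)
print(confident_pair, uncertain_pair)
```

With this rule, an edge next to a patch predicted with probability 0.4 gets less than half the smoothing weight of an edge between two confident patches.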

Now this pairwise term is adjusted based on its neigh

Now, rather than blindly merging two neighboring regions together based on feature similarity, it considers the predicted data term for the neighboring region. If one region has a relatively low data term, it also tunes down the pairwise term accordingly.

Online Image-Click-Ads Tagging System: EyeDentifyIt 3.0

Here is a demo video showing the automatic tagging tool we developed using our clothing parsing method.

Evaluation: Training and Inference

Training data set
- Fashionista [Yamaguchi et al. 2012] in EyeDentifyIt
Feature vector
- RGB color [m X n X 3]
- CIELAB color [m X n X 3]
- Gabor feature [m X n X 4]
- Absolute 2D coordinates [m X n X 2]
- Relative 2D coordinates [m X n X 28]
Training and inference
- Logistic regression (liblinear library)
- Max-flow min-cut (gco-v3.0)

Quantitative Evaluation (1/2)

Compare Pixel ACC and MAGR among:
- MRF
- Reweighted Pairwise Term (RW)
- Reweighted Pairwise Term + Background Prior (RW+BP)
- Reweighted Pairwise Term + Background Prior + Occlusion Prior (RW+BP+OP)

Result 1: steady improvements with more priors
Result 2: RW reaches the best MAGR
(MAGR refers to average garment recall.)

Quantitative Evaluation (2/2)

Compare between:
- Re-Weighted MRF
- CRF [Yamaguchi et al 2012]
- Baseline: naively predict everything to be background

Method | Pixel Acc | MAGR | Training Time | Processing Time
Re-Weighted MRF | 90.5% | 63.0% | 631.8 sec | 5.2 sec
[Yamaguchi et al 2012] | 85.1% | 57.2% | 4546.7 sec | 81.5 sec
Baseline | 77.6% | 12.8% | N/A | N/A

Result 3: 5.4 percentage-point gain on pixel ACC, 5.8 percentage-point gain on MAGR, 86.1% reduction in training time, 93.6% reduction in processing time
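The figures in Result 3 can be re-derived from the table values. The short check below takes the accuracy gains as absolute differences and the time gains as relative reductions against the CRF method; the helper name `relative_reduction` is hypothetical.

```python
def relative_reduction(baseline, ours):
    """Percentage by which `ours` is smaller than `baseline`."""
    return 100.0 * (baseline - ours) / baseline

# Values copied from the results table.
pixel_acc_gain = 90.5 - 85.1                            # percentage points
magr_gain = 63.0 - 57.2                                 # percentage points
training_time_gain = relative_reduction(4546.7, 631.8)  # percent
processing_time_gain = relative_reduction(81.5, 5.2)    # percent

print(round(pixel_acc_gain, 1), round(magr_gain, 1))
print(round(training_time_gain, 1), round(processing_time_gain, 1))
```

Rounded to one decimal place, these reproduce the reported 5.4, 5.8, 86.1, and 93.6.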

Qualitative Evaluation (1/3)

Panels: CRF [Yamaguchi et al 2012], Re-Weighted MRF, MRF

Resolved:
- P1: Fails to preserve local contrast
- P4: Over-smoothing of infrequent labels

Qualitative Evaluation (2/3)

Panels: CRF [Yamaguchi et al 2012], RW+BP, RW+BP+OP

Resolved:
- P2: Background spill
- P3: Incomplete region prediction

Contributions

- Developed a new automated clothing parsing method
- Proposed a background prior and an occlusion prior to resolve the background spill and occlusion problems in clothing parsing
- Proposed a re-weighted pairwise term for the MRF model to preserve predictions of infrequent, small-region labels
- Demonstrated that MRF performs better than CRF when local knowledge is more trustworthy than the statistical model learned from the training dataset
- Integrated in EyeDentifyIt 3.0, driven by the Image-Click-Ads framework

Thank You

Questions and comments
