fashion classification and detection using convolutional...

FashionClassificationandDetectionUsingConvolutionalNeuralNetworks

Prateek Srivastava&Archit Arora

November10,2016

FashionClassificationProblem

• Given an image, identify the apparel a particular person is wearing

Applications

• Targetedads

• Productrecommendations

• IdentificationofsuspectsthroughCCTVfootage

• Brandadoptiontrendforclothingcompanies

Challenges

• Intra-classvariation• Viewpointandlocationalvariation• Scalevariation• Backgroundclutter

Challenges

Overviewoffashionclassificationprocess

• Datacollection• Dataaugmentation• Learningfromdata• Imageclassification&objectdetection

Datacollection

• UsedScrapy tooltosearchandcrawlMacy’swebsite• Generatedalabelleddatasetofaround3500imagesbelongingtothesecategories

Jacket Long-dress Poloshirts Suit Short-dress

T-shirt Sweater Trousers Shoes Robe

Shirt Jeans Swimsuit Shorts

14 Fashion Categories

Dataaugmentation/preprocessing

• Implementedtoimproverobustnessoftheclassifier• Effectsintroducedtoaccountfordeviationsfromtrainingdataset:

1. Backgroundnoise(whitebackgroundconvertedtonoisybackground)2. Randomjitter(Randomjittercontrast:ApplyPCAtoallpixelsintraining

set,samplea”coloroffset”alongprincipalcomponentandthenaddoffsettoallpixels)

3. Rotation4. Contrastchange5. Croptotacklescalevariances6. Resizingtotackleresolutionvariances7. Imagenormalization

(subtractingthemeanfromrespectiveRGBchannels)

• scipy,scikit-imagewereusedtoaddtheaboveeffects

HiddenLayer-1 HiddenLayer-2

OutputLayer

InputLayer(pixelsofeachimage)

Learningfromdata:Multi-layerperceptron

OutputLayer

InputLayer(pixelsofeachimage)

OutputLayer

InputLayer

Non-linearactivationfunctiontoidentify

complexdatapatterns.Eg:f(x)=max(0,x)(ReLu)

OutputLayer

InputLayer

OutputLayer

InputLayer

OutputLayer

InputLayer

Softmax functiontoassign

probabilitiestoclasses.

Learningfromdata:ConvolutionalNeuralNetworks(CNN)

[SlidestakenfromLec-7CS231NConvolutionalNeuralNetworksforComputerVision]

Imageinputasvolume

MainIssue:Multi-LayerPerceptrondoesnotscalewelltoimagedata.UseCNNs!

Depthoffilteristhesamethedepthofinputvolume

Evaluatesimagesimilaritywithfiltertemplate

Non-linearactivation

functionlikeReLu

Non-linearactivation

functionlikeReLu

StepsinvolvedinTrainingData

• DefineLossfunctionforminimization:Cross-entropyorNegativeLikelihoodused

𝐿 𝒙, 𝒚,𝑾 =1𝑛))𝐼{,-./}

log 𝑃(𝑌: = 𝑐|=

𝑥:) + 𝝀| 𝑾 |𝟐

• Usemini-batchSGDtominimizeloss• Initializeweights𝑾𝟎forallthelayers• Updaterule:

𝑾(𝒌G𝟏) = 𝑾(𝒌) − 𝒔𝒌(𝜵𝑾𝑳(𝑾))

Gradient(Difficulttocomputedirectly)

GradientComputationsbasedonComputationalGraphs

0.73𝟏

𝒑 = 𝒘𝟎𝒙𝟎

𝒒 = 𝒑+𝒘𝟏

𝒛 =𝟏

𝟏+ 𝒆R𝒒 =

𝑓 𝑥T,𝑤T,𝑤> =1

1 + 𝑒R(WXYXGWZ)Gradientof (2,1,1)at

ForwardPropagation

GradientComputationsbasedonComputationalGraphs

𝒑 = 𝟐

𝒛 = 0.73𝒒 = 𝟏

𝒅𝒛𝒅𝒒 =

𝒅𝒛𝒅𝒘𝟏

=𝒅𝒛𝒅𝒒

𝒅𝒒𝒅𝒘𝟏

𝒅𝒛𝒅𝒑 =

𝒅𝒛𝒅𝒒

𝒅𝒒𝒅𝒑

𝒅𝒛𝒅𝒙𝟎

=𝒅𝒛𝒅𝒑

𝒅𝒑𝒅𝒙𝟎

𝒅𝒛𝒅𝒘𝟎

=𝒅𝒛𝒅𝒑

𝒅𝒑𝒅𝒘𝟎

𝑓 𝑥T,𝑤T,𝑤> =1

1 + 𝑒R(WXYXGWZ)Gradientof (2,1,1)at

MainIdea:Calculateglobalgradientsusingchainruleonlocalgradients

BackwardPropagation

UsingPre-trainedmodels

• Inpractice,trainingCNNsfromscratchishardlyusedduetocomputationalconsiderations• Instead,pre-trainedmodelstrainedonotherdatasetareused• Mainidea:Lowlevelfeaturesarecommontoallimageclassificationproblems

• ForourprojectweusedVGG-16modeltrainedforImageNetclassificationchallengeon1000categories

VGG-16Architecture

• Input: 224x224 size images with 3 channels1. Convolutional (224x224x64)2. Convolutional (224x224x64)Max Pool (112x112x64)3. Convolutional (112x112x128)4. Convolutional (112x112x128)Max Pool (56x56x128)5. Convolutional (56x56x256)6. Convolutional (56x56x256)7. Convolutional (56x56x256)Max Pool (28x28x256)8. Convolutional (28x28x512)9. Convolutional (28x28x512)10. Convolutional (28x28x512)Max Pool (14x14x512)

11. Convolutional (14x14x512)12. Convolutional (14x14x512)13. Convolutional (14x14x512)Max Pool (7x7x512)14. Fully Connected (1x1x4096)15. Fully Connected (1x1x4096)16. Fully Connected (1x1x1000) Soft-max (1x1x1000)

Reducesspatialdimensions

Implementation:ModifiedVGG-16

Treatthepreviouslayers(weights)asfixedandtunetheparametersonthelast3fullyconnectedlayers

Implementation:ModifiedVGG-16

ImplementedCNNarchitectureusingTheano(library)andLasagne(wrapper)withCUDAGPUonAWSEC2server

Cross-entropylossversusnumberofiterationsusingNesterov momentumupdaterule

ResultsusingMacy’sdata

• Initially,resultswereobservedusingthetrain-testsplitwithinMacy’s

data(withoutdataaugmentation)

• ~92%accuracywasachieved

TrainingData TestData

• Resultswereexceptionallybad(~20%accuracy)whenusingthesametrainingdatatopredictatestdataofreallifeexamplesfromGoogleImages• Hence,therewasaneedfordataaugmentation

Resultsfortestdataafterdataaugmentation

• Thetestdataconsistedofaround60imagesfromGoogleImages

comprisingofrealworldphotos

• ~71%accuracywasachieved

Cross-entropylossversusnumberofiterationsusingNesterov momentumupdaterule

Resultsfortestdataafterdataaugmentation

NextSteps…

• Extendanalysistomultipleclothingobjectdetection• Approach:UseaslidingwindowapproachwithfastlinearclassifierstoidentifyregionsofinterestandapplyCNNs• DeterminespecificobjectlocationusingregressionheadwithaL2lossfunctioninsteadofclassificationheadwithcross-entropyloss

THANKYOU

Label Category

0 Jacket

1 T-Shirt

2 Shirt

3 Jeans

4 Long-dress

5 Shoes

6 Trousers

7 Polo-shirts

8 Robe

9 Short-dress

10 Shorts

11 Suit

12 Sweater

13 Swimsuit

fashion classification and detection using convolutional...

Documents

austin real estate data scrappy, analysis and interactive...

convolutional neural networks

training convolutional neural networksraul/rs/training...

metadata validation using a convolutional neural...

convolutional neural networks · 2017-10-10 ·...

image denoising using gaussian scale mixture...

visualizing and understanding convolutional neural ... -...

convolutional code

deep convolutional inverse graphics...

convolutional neural networks for fashion classiﬁcation...

bicycle routing - ned...

convolutional neural networks for object...

convolutional encoding

chapter 5 convolutional codes - chencode.cn€¦ · chapter...

convolutional codes

convolutional codes puncturing

convolutional neural networks analyzed via convolutional...

imagenet classification with deep deep convolutional...

using support vector machines, convolutional … support...

convolutional neural networks analyzed via convolutional ......