ar deepcaloriecam:anios app for food calorie estimation with augmented...

1
AR DeepCalorieCam:An iOS App for Food Calorie Estimation with Augmented Reality Conclusions [1] T. Miyazaki, G. Chaminda, D. Silva, and K. Aizawa. Image - based calorie content estimation for dietary assessment. In Proc. of IEEE ISM Workshop on Multimedia for Cooking and Eating Activities, 2011. [2] J. J. Chen and C. W. Ngo. Deep-based ingredient recognition for cooking recipe retrival. In Proc. of ACM International Conference Multimedia, 2016. [3] J. Y. Zhu, T.Park, P. Isola, A.A. Efros, Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks. In Proc. of IEEE International Conference on Computer Vision, 2017. [4] A. Odena, C. Olah, and J. Shlens. Conditional Image Synthesis With Auxiliary Classifier GANs. In Proc. of the 34th International Conference on Machine Learning, 2017. [5] I. Gulrajani, F. Ahmed, M. Arjovsky, V. Dumoulin, and A. Courville, Improved Training of Wasserstein GANs. Advances in Neural Information Processing Systems, 2017. We proposed food calorie estimation app with a multi-task CNN using Augmented Reality. Multi-task learning improved both food calorie and category estimation. - - , - ) - (- - Background: My Pet Project Introduction: Input Curry Fried Rice Gyudon Hiyashi Chuka Meet Supa Ramen Rice Soba Unagi Yakisoba [ Loss function ] Total loss : ! " =$ % &'()'* +, -./ $ 012 3402 ! 5 =$ ' &'()'* +, -./ $ 012 6074 +, ../ $ 112 Real Image In domain A Real Image In domain B Cycle Consistency Loss Method: conditional CycleGAN GANs are a kind generative models designed by Goodfellow et all in 2014. In a GAN setup, two differentiable functions, represented by neural networks, are locked in a game. !The Generator trying to maximize the probability of making the discriminator mistakes its inputs as real. !The Discriminator guiding the generator to produce more realistic images. !The Generator would capture the general training data distribution. !The Discriminator is always unsure of whether its inputs are real or not. The two players, the Generator and the Discriminator, have different roles in this framework. Input noise vector fake or real D G 100 dim Training set fake image real image The Generator tries to produce data that come from some probability distribution. The Discriminator, acts like a judge. It gets to decide if its input comes from the G or from the true training set. In summary, the game follows with: Food Image Transfer using Generative Adversarial Networks(GANs) Generator Discriminator Player 1 Player 2 In the perfect equilibrium, as a result, Objective: Transfer food images to multiple domains with high quality using the GAN method for dietary images Input Domain Select One-hot Verctor signal C [0, 0, 1, …, 0] Output Related work " : CycleGAN[3] ICCV2017 Related work # : ACGAN[4] ICML2017 [ Our network ] Capture the intuition that if they translate from one domain to the other and back again they should arrive at where they started. Cycle Consistency Loss(CCL): An approach for learning to translate an image from a source domain X to a target domain Y. They introduce a Cycle Consistency Loss to push F(G(X))≈X (and vice versa). Auxiliary Classifier Loss(ACL): Every generated sample has a corresponding class label New methods for the improved training of GANs for image synthesis. They introduce a Auxiliary Classifier Loss to make high quality image . fake or real Adversarial Loss class[0, 1, …, 0] Auxiliary Classifier Loss D fake image real image Fake Image In domain B Reconstructed Image Domain Select C[1, 0] Domain Select C[0, 1] fake or real Adversarial Loss class[1, 0] Auxiliary Classifier Loss D 8 9 8 9 [ DATASET ] Curry Fried Rice Gyudon Hiyashi Chuka Meet Supa Ramen Rice Soba Unagi Yakisoba For training our network, we collected 10 category food photos from the Twitter. Food 10 categories. A total of 230,053 images. 24,760 5,329 3,350 24,760 74,007 7,138 13,499 18,396 27,854 34,216 We proposed food image transfer using conditional CycleGAN. Conditional CycleGan can convert multiple domains while keeping the shape of the food. Result: Background & Objective Meal management apps enable us to record food calories. Some of them need human help for calorie estimation. Image-based food calorie estimation using recipe information with Search-based food calorie estimation with conventional features. Similar image retrieval with SURF and color histograms and so on. Calculate food calories from retrieved images’ calories. Method: Multi-task CNN for calorie estimation It is expected to improve the accuracy of each task. Multi-task estimation of food categories and food ingredients. Multi-task CNN[3] of food categories and food ingredients Multi-task learning improve both task’s performance. We propose regression-based method using CNN. We use multi-task CNN for calorie estimation. Related work " : Miyazaki et al.[1] 2011 Related work# : Chen and Ngo[2] 2016 Estimation target (calories) Conv layers ( VGG16 [4]) 335 kcal Potato salad Calorie estimation Category recognition fc6 fc7a fc8b fc7b fc8a Food calorie for one person. [ Our network ] $ 10: =−< = > log B > C >DE (softmax cross entropy) (2) Food Category loss: [ Loss function ] $ 102 GH ! GH -I ! -I ! GH = B−= = ! -I = B−= B is an estimated food calorie. = is ground-truth. absolute err. loss (1) Calorie estimation loss: relative err. loss ( λ is the weight on the loss. ) Nikujaga Pilaf Curry Fried rice Fried noodle Spaghetti Gratin Miso soup Stew Hamburg steak Cold tofu Sushi bowl Omelet with fried rice Potato salad Mixed rice For training our network, we collected calorie-annotated food photos from the online cooking recipe sites. [ DATASET ] Food 15 categories. A total of 4877 images. Application Demo: Augmented Reality [ Calorie Estimation with AR ] [Food Image Transfer using GANs] [ Calorie Estimation with AR ] [Food Image Transfer using GANs] 8 9 J K J E J L J C J K =1 J E =1 J L =1 J CNE =1 J C =1 [ Implementation ] (1) Train with Keras(backend TensorFlow) framework (2) Convert Keras model to CoreML model for iOS deployment (3) Display calorie estimation result using Apple ARKit framework [DeepCalorieCam] [AR DeepCalorieCam]

Upload: others

Post on 13-Sep-2020

3 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: AR DeepCalorieCam:AniOS App for Food Calorie Estimation with Augmented Realityimg.cs.uec.ac.jp/pub/conf17/180205tanno_2_ppt.pdf · 2018. 2. 3. · AR DeepCalorieCam:AniOS App for

AR DeepCalorieCam:An iOS App for Food Calorie Estimationwith Augmented Reality

Conclusions

[1] T. Miyazaki, G. Chaminda, D. Silva, and K. Aizawa. Image - based calorie content estimation for dietary assessment. In Proc. of IEEE ISM Workshop on Multimedia for Cooking and Eating Activities, 2011.[2] J. J. Chen and C. W. Ngo. Deep-based ingredient recognition for cooking recipe retrival. In Proc. of ACM International Conference Multimedia, 2016.[3] J. Y. Zhu, T.Park, P. Isola, A.A. Efros, Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks. In Proc. of IEEE International Conference on Computer Vision, 2017.[4] A. Odena, C. Olah, and J. Shlens. Conditional Image Synthesis With Auxiliary Classifier GANs. In Proc. of the 34th International Conference on Machine Learning, 2017.[5] I. Gulrajani, F. Ahmed, M. Arjovsky, V. Dumoulin, and A. Courville, Improved Training of Wasserstein GANs. Advances in Neural Information Processing Systems, 2017.

• We proposed food calorie estimation app with a multi-task CNN using Augmented Reality.• Multi-task learning improved both food calorie and category estimation.

- - , - )- ( - -

Background: My Pet Project Introduction:

Input Curry Fried Rice GyudonHiyashiChuka Meet Supa Ramen Rice Soba Unagi Yakisoba

[ Loss function ]Total loss :!" = $%

&'()'* + ,-./$0123402

!5 = $'&'()'* + ,-./$012

6074

+,../$112Real ImageIn domain A

Real ImageIn domain B

Cycle Consistency Loss

Method: conditional CycleGAN

GANs are a kind generative models designed by Goodfellow et all in 2014.In a GAN setup, two differentiable functions, represented by neural networks, are locked in a game.

The Generator trying to maximize the probability of making the discriminator mistakes its inputs as real.The Discriminator guiding the generator to produce more realistic images.

The Generator would capture the general training data distribution. The Discriminator is always unsure of whether its inputs are real or not.

The two players, the Generator and the Discriminator, have different roles in this framework.

Input

noisevector

fakeor

realDG

100dim

Trainingset

fakeimage

realimage

The Generator tries to produce data that come from some probability distribution.

The Discriminator, acts like a judge. It gets to decide if its input comes from the G or from the true training set.

In summary, the game follows with:

Food Image Transfer using Generative Adversarial Networks(GANs)

Generator

Discriminator

Player 1

Player 2

In the perfect equilibrium, as a result,

Objective: Transfer food images to multiple domains with high quality using the GAN method for dietary images �

��

Input

Domain Select One-hot Verctor signal

C [0, 0, 1, …, 0]

Output

Related work : CycleGAN[3] ICCV2017

Related work : ACGAN[4] ICML2017

[ Our network ]

Capture the intuition that if they translate from one domain to the otherand back again they should arrive at where they started.

Cycle Consistency Loss(CCL):

An approach for learning to translate an image from a source domain X to a target domain Y.

They introduce a Cycle Consistency Loss to push F(G(X))≈X (and vice versa).

Auxiliary Classifier Loss(ACL): Every generated sample has a corresponding class label

New methods for the improved training of GANs for image synthesis.They introduce a Auxiliary Classifier Loss

to make high quality image .fake or real

Adversarial Loss

class[0, 1, …, 0]

Auxiliary Classifier LossD

fakeimagereal

image

Fake ImageIn domain B

ReconstructedImageDomain Select

C[1, 0]Domain Select

C[0, 1]

fake or real

Adversarial Loss

class[1, 0]

Auxiliary Classifier Loss

D

89 89

[ DATASET ] Curry Fried Rice GyudonHiyashiChuka Meet Supa

Ramen Rice Soba Unagi Yakisoba

For training our network, we collected 10 category food photosfrom the Twitter.

Food 10 categories.A total of 230,053 images.

24,7605,3293,35024,76074,007

7,13813,49918,39627,85434,216

• We proposed food image transfer using conditional CycleGAN.• Conditional CycleGan can convert multiple domains while keeping the shape of the food.

Result:

Background & ObjectiveMeal management apps enable us to record food calories.Some of them need human help for calorie estimation.

Image-based food calorie estimation using recipe information with

Search-based food calorie estimation with conventional features.Similar image retrieval with SURF and color histograms and so on.Calculate food calories from retrieved images’ calories.

Method: Multi-task CNN for calorie estimation

It is expected to improve the accuracy of each task.

Multi-task estimation of food categories and food ingredients.Multi-task CNN[3] of food categories and food ingredientsMulti-task learning improve both task’s performance.

We propose regression-based method using CNN.

We use multi-task CNN for calorie estimation.

Related work : Miyazaki et al.[1] 2011

Related work : Chen and Ngo[2] 2016

Estimationtarget (calories)

Conv layers ( VGG16 [4])

335 kcal

Potato salad

Calorie estimation

Category recognition

fc6

fc7a

fc8bfc7b

fc8a

Food calorie for one person.

[ Our network ]

$10: = −<=> log B>

C

>DE

(softmax cross entropy)

(2) Food Category loss:

[ Loss function ]$102 = λGH!GH + λ-I!-I

!GH =B − == !-I = B − =

B is an estimated food calorie. = is ground-truth.

absolute err. loss

(1) Calorie estimation loss:

relative err. loss

( λ is the weight on the loss. )

Nikujaga

Pilaf Curry Fried rice Fried noodle Spaghetti

Gratin Miso soup Stew Hamburg steak

Cold tofu Sushi bowl Omelet with fried rice

Potato salad Mixed rice

For training our network, we collected calorie-annotated food photos from the online cooking recipe sites.

[ DATASET ]

Food 15 categories.A total of 4877 images.

Application Demo:

Augmented Reality

[ Calorie Estimation with AR ] [Food Image Transfer using GANs]

[ Calorie Estimation with AR ]

[Food Image Transfer using GANs]

89

JK JE JL JC

JK = 1

JE = 1

JL = 1

JCNE = 1

JC = 1

[ Implementation ](1) Train with Keras(backend TensorFlow) framework(2) Convert Keras model to CoreML model for iOS deployment(3) Display calorie estimation result using Apple ARKit framework

[DeepCalorieCam] [AR DeepCalorieCam]