Multilabel Deep Learning - Indico
TRANSCRIPT
Multilabel Classification and Deep Learning
Zachary Chase Lipton
Critical Review of RNNs: http://arxiv.org/abs/1506.00019
Learning to Diagnose: http://arxiv.org/abs/1511.03677
Conditional Generative RNNs: http://arxiv.org/abs/1511.03683
Outline
• Introduction to Multilabel Learning
• Evaluation
• Efficient Learning & Sparse Models
• Deep Learning for Multilabel Classification
• Classifying Multilabel Time Series with RNNs
Supervised Learning
• General problem: we desire a labeling function f : X → Y
• ERM principle: choose the model f̂ in the hypothesis class H that minimizes loss on the training sample S ∈ (X × Y)^n
• Most research assumes the simplest case: X = ℝ^d, Y = {0, 1}
• The real world is much messier
Binary Classification
y ∈ {0, 1}
Multiclass Classification
y ∈ {c_1, c_2, ..., c_L}
Multilabel Classification
y ⊆ {c_1, c_2, ..., c_L}
Why Multilabel?
• Superset of both BC and MC: y ∈ 2^L; BC when |L| = 1, MC when |y| = 1
• Natural for many real problems: clinical diagnosis, predicting purchases, auto-tagging news articles, activity recognition, object detection
• Easy to formulate: take L tasks and slap them together
Naive Baseline
• Binary relevance: separately train |L| classifiers f_l : X → {0, 1}
• Pros: simple to execute, easy to understand, a strong baseline
• Cons: computational cost (|L| × the work of a single classifier); leaves some information on the table (correlations between labels)
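The binary relevance baseline above can be sketched as one independent logistic regression per label; the helper names (`train_binary_relevance`, `predict`) and the plain vectorized gradient-descent loop are illustrative, not from the slides.

```python
import numpy as np

def train_binary_relevance(X, Y, lr=0.1, epochs=200):
    """Binary relevance: one independent logistic regression per label.
    X: (n, d) features; Y: (n, |L|) binary label matrix.
    Vectorized so all |L| models train in one matrix update."""
    n = X.shape[0]
    W = np.zeros((X.shape[1], Y.shape[1]))   # one weight column per label
    for _ in range(epochs):
        P = 1.0 / (1.0 + np.exp(-X @ W))     # per-label probabilities
        W -= lr * X.T @ (P - Y) / n          # logistic-loss gradient step
    return W

def predict(X, W, threshold=0.5):
    return (1.0 / (1.0 + np.exp(-X @ W)) >= threshold).astype(int)
```

Nothing here shares information across labels: each weight column sees only its own label's errors, which is exactly the correlation left on the table.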
Challenges
• Efficiency: develop classifiers whose time and space complexity do not scale with the number of labels
• Performance: make use of the extra labels to achieve better accuracy and generalization
• Evaluation: how do we evaluate a multilabel classifier’s performance across 10s, 100s, 1000s, or even 1M labels?
Outline
• Introduction to Multilabel Learning
• Evaluation
• Efficient Learning & Sparse Models
• Deep Learning for Multilabel Classification
• Classifying Multilabel Time Series with RNNs
Why not accuracy?
• Often extreme class imbalance: when a blind classifier gets 99.99%, it can be optimal to be uninformative
• Varying base rates across labels. E.g., in the MeSH dataset, "Human" applies to 71% of articles, "Platypus" to fewer than .0001%
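The blind-classifier failure mode is easy to demonstrate on a simulated rare label; the 0.01% base rate below is an arbitrary stand-in for the platypus example, not a figure from the slides.

```python
import numpy as np

# Hypothetical rare label: ~0.01% of a million examples are positive.
rng = np.random.default_rng(0)
y = (rng.random(1_000_000) < 0.0001).astype(int)

# A blind classifier predicts "negative" for everything.
blind = np.zeros_like(y)
accuracy = (blind == y).mean()   # ~0.9999 despite zero recall
```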
F1 Score
• Easy to calculate from the confusion matrix
• Harmonic mean of precision (tp / (tp + fp)) and recall (tp / (tp + fn)):
  F1 = 2·tp / (2·tp + fp + fn)
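The count form and the harmonic-mean form of F1 are algebraically identical; a minimal helper (the function name is illustrative):

```python
def f1_from_counts(tp, fp, fn):
    """F1 = 2*tp / (2*tp + fp + fn), i.e. the harmonic mean of
    precision = tp / (tp + fp) and recall = tp / (tp + fn)."""
    return 2 * tp / (2 * tp + fp + fn)

# e.g. tp=8, fp=2, fn=2: precision = recall = 0.8, so F1 = 0.8
```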
F1 given fixed base rate
Compared to Accuracy
Expected F1 for Uninformative Classifier
Multilabel Variations
• Micro: F1 calculated over all entries of the label matrix, pooled into a single confusion matrix

            Label 1   Label 2   Label 3   Label 4
Example 1   TP        FP        FN        TN
Example 2   FP        FP        FN        TP
Example 3   FN        TP        FN        FP
…           TN        TP        TP        TN
Macro F1
• Macro: F1 calculated separately for each label, then averaged

            Label 1   Label 2   Label 3   Label 4
Example 1   TP        FP        FN        TN
Example 2   FP        FP        FN        TP
Example 3   FN        TP        FN        FP
…           TN        TP        TP        TN
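The micro/macro distinction in the tables above can be sketched directly on a binary label matrix; the helper name `micro_macro_f1` is illustrative.

```python
import numpy as np

def micro_macro_f1(Y_true, Y_pred):
    """Micro-F1 pools tp/fp/fn over every cell of the label matrix;
    macro-F1 computes F1 per label (column), then averages."""
    tp = ((Y_pred == 1) & (Y_true == 1)).sum(axis=0).astype(float)
    fp = ((Y_pred == 1) & (Y_true == 0)).sum(axis=0).astype(float)
    fn = ((Y_pred == 0) & (Y_true == 1)).sum(axis=0).astype(float)
    micro = 2 * tp.sum() / (2 * tp.sum() + fp.sum() + fn.sum())
    per_label = 2 * tp / np.maximum(2 * tp + fp + fn, 1)  # guard empty labels
    return micro, per_label.mean()
```

Because micro pools counts, frequent labels dominate it; macro gives every column equal weight regardless of its base rate.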
Characterizing the Optimal Threshold
• The optimal threshold can be expressed in terms of the conditional probabilities of scores given labels
• When scores are calibrated probabilities, the optimal threshold is exactly half the F1 score it achieves
Problems with F1
• Sensitive to thresholding strategy
• Hard to tell who has the best algorithms and who is smart about thresholding
• Micro-F1 biased towards common labels
• Macro-F1 biased against them
Some Alternatives
• Any threshold choice implies a cost sensitivity: when you know the cost, specify it and use weighted accuracy
• AUC exhibits the same dynamic range for every label (an uninformative classifier gets 0.5, a perfect one gets 1)
• Macro-averaged AUC may give a better sense of performance across all labels, but high AUC on rare labels can be misleading: a classifier can achieve an AUC of .99 yet produce useless results for information retrieval
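Macro-averaged AUC can be sketched with the rank-sum (Mann-Whitney) formulation of AUC; the helper names `auc` and `macro_auc` are illustrative.

```python
import numpy as np

def auc(scores, labels):
    """AUC as the probability that a random positive outscores a
    random negative; ties count one half."""
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (len(pos) * len(neg))

def macro_auc(S, Y):
    """Average per-label AUC over the columns of score matrix S."""
    return np.mean([auc(S[:, l], Y[:, l]) for l in range(Y.shape[1])])
```

The O(pos × neg) pairwise comparison is fine for a sketch; production code would sort once and use ranks.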
Outline
• Introduction to Multilabel Learning
• Evaluation
• Efficient Learning & Sparse Models
• Deep Learning for Multilabel Classification
• Classifying Multilabel Time Series with RNNs
The Problem
• With many labels, binary relevance models can be huge and slow
• 10k labels × 1M features ≈ 80 GB of parameters (at 8 bytes per weight)
• We want compact models: fast to train and evaluate, cheap to store
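The 80 GB figure is just dense float64 weights, one per (feature, label) pair:

```python
# Back-of-the-envelope check of the slide's figure: a dense weight
# matrix with one float64 weight per (feature, label) pair.
labels = 10_000
features = 1_000_000
bytes_per_weight = 8  # float64
gigabytes = labels * features * bytes_per_weight / 1e9
# gigabytes == 80.0
```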
Linear Regression
• The bulk of the computation is label-agnostic: the inverse (XᵀX)⁻¹ in θ = (XᵀX)⁻¹Xᵀb is shared, so all labels can be solved at once via Θ = (XᵀX)⁻¹XᵀB
• Can do this especially fast when we reduce the dimensionality of X via SVD
• Problem: unsupervised dimensionality reduction loses the signal of rare features, which messes up rare labels
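The shared, label-agnostic computation can be sketched on toy random data; `np.linalg.solve` on the Gram matrix stands in for the slide's explicit inverse, which is the numerically preferable route.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, L = 100, 5, 3
X = rng.normal(size=(n, d))
B = rng.normal(size=(n, L))             # one target column per label

gram = X.T @ X                          # label-agnostic: computed once
theta = np.linalg.solve(gram, X.T @ B)  # (d, L): all L regressions at once
```

Adding another label only adds one column to B and one solve against the already-factored Gram matrix.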
Sparsity
• For auto-tagging tasks, features are often high-dimensional sparse bags of words or n-grams
• Datasets for web-scale information retrieval are large in the number of examples, so SGD is the default optimization procedure
• Absent regularization, the gradient is sparse and training is fast
• Regularization destroys the sparsity of the gradient
• When the numbers of features and labels are both large, dense stochastic updates are computationally infeasible
Regularization
• Goals: achieve model sparsity, prevent overfitting
• ℓ1 regularization induces sparse models
• ℓ2² regularization is thought to achieve more accurate models in practice
• Elastic net balances the two
Balancing Regularization with Efficiency
• To regularize while maintaining efficiency, we can use a lazy updating scheme, first described by Carpenter (2008)
• For each feature, remember the last time it was nonzero
• When a feature is next nonzero at some step t + k, perform a closed-form update that accounts for the k skipped regularization steps
• We derive lazy updates for elastic net regularization under both standard SGD and FoBoS (Duchi & Singer)
Lazy Updates for Elastic Net

Theorem 1. To bring the weight w_j current from time j to time k using SGD, the constant-time update is

w_j^{(k)} = \mathrm{sgn}\big(w_j^{(j)}\big)\Big[\,|w_j^{(j)}|\,\frac{P(k-1)}{P(j-1)} - P(k-1)\big(B(k-1) - B(j-1)\big)\Big]_+    (1)

where P(t) = (1 - \eta^{(t)}\lambda_2)\,P(t-1) with base case P(-1) = 1, and B(t) = \sum_{\tau=0}^{t} \eta^{(\tau)}\lambda_1 / P(\tau-1) with base case B(-1) = 0.

Theorem 2. A constant-time lazy update for FoBoS with elastic net regularization and a decreasing learning rate, bringing a weight current at time k from time j, is

w_j^{(k)} = \mathrm{sgn}\big(w_j^{(j)}\big)\Big[\,|w_j^{(j)}|\,\frac{\Phi(k-1)}{\Phi(j-1)} - \Phi(k-1)\,\lambda_1\big(\beta(k-1) - \beta(j-1)\big)\Big]_+    (2)

where \Phi(t) = \Phi(t-1)\cdot\frac{1}{1 + \eta^{(t)}\lambda_2} with base case \Phi(-1) = 1, and \beta(t) = \beta(t-1) + \eta^{(t)} / \Phi(t-1) with base case \beta(-1) = 0.
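Under the simplifying assumption of a constant learning rate η, the Theorem 1 bookkeeping can be sketched in a few lines of numpy; the helper name `lazy_update` and the index shift (P(t) stored at index t + 1) are implementation choices, not from the slides. A single call replaces all the pure-regularization steps a weight skipped between two of its data updates.

```python
import numpy as np

def lazy_update(w_j, j, k, P, B):
    """Bring a weight last touched at step j current to step k in O(1),
    per Theorem 1. Index shift: P[t + 1] holds P(t), B[t + 1] holds B(t),
    so P[0] = P(-1) = 1 and B[0] = B(-1) = 0."""
    scale = P[k] / P[j]             # P(k-1) / P(j-1)
    shrink = P[k] * (B[k] - B[j])   # P(k-1) * (B(k-1) - B(j-1))
    return np.sign(w_j) * max(abs(w_j) * scale - shrink, 0.0)

# Accumulate P(t) and B(t); constant eta is assumed here for brevity
# (a decreasing rate only changes the values fed into this loop).
eta, lam1, lam2, T = 0.1, 0.01, 0.05, 20
P = np.empty(T + 1)
B = np.empty(T + 1)
P[0], B[0] = 1.0, 0.0                     # base cases P(-1), B(-1)
for t in range(T):
    P[t + 1] = (1.0 - eta * lam2) * P[t]  # P(t) = (1 - eta*lam2) P(t-1)
    B[t + 1] = B[t] + eta * lam1 / P[t]   # B(t) = B(t-1) + eta*lam1/P(t-1)
```

The closed form agrees with applying the eager per-step elastic-net shrinkage (ℓ2 decay plus ℓ1 truncation) once per skipped step, which is what makes sparse training fast: features that are zero in an example cost nothing.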
Empirical Validation
• On the two largest datasets in the Mulan repository of multilabel datasets, we can train to convergence on a laptop in just minutes
• rcv1: 490× speedup; bookmarks: 20× speedup
Outline
• Introduction to Multilabel Learning
• Evaluation
• Efficient Learning & Sparse Models
• Deep Learning for Multilabel Classification
• Classifying Multilabel Time Series with RNNs
Performance
• Efficiency is nice, but we’d also like performance
• Neural networks can learn shared representations across labels
• This both regularizes each label’s model and exploits correlations between labels
• In extreme multilabel settings, a network may use significantly fewer parameters than logistic regression
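A shared-representation multilabel network is, at minimum, one hidden layer feeding |L| independent sigmoid outputs trained with binary cross-entropy; a forward-pass sketch with illustrative names:

```python
import numpy as np

def forward(X, W1, b1, W2, b2):
    """One shared hidden layer feeding |L| independent sigmoid outputs.
    Sigmoids rather than a softmax: labels are not mutually exclusive."""
    H = np.maximum(X @ W1 + b1, 0.0)             # shared ReLU representation
    return 1.0 / (1.0 + np.exp(-(H @ W2 + b2)))  # per-label probabilities

def bce(Y, P, eps=1e-9):
    """Mean binary cross-entropy over all (example, label) cells."""
    return -np.mean(Y * np.log(P + eps) + (1 - Y) * np.log(1 - P + eps))
```

Every label's gradient flows through the same W1, which is where the sharing (and the parameter savings over |L| separate linear models) comes from.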
Neural Network
Training with Backpropagation
• Goal: calculate the derivative of the loss function with respect to each parameter (weight) in the model
• Update the weights by following the gradient: w ← w − η · ∂L/∂w
Forward Pass
Backward Pass
Multilabel MLP
Outline
• Introduction to Multilabel Learning
• Evaluation
• Efficient Learning & Sparse Models
• Deep Learning for Multilabel Classification
• Classifying Multilabel Time Series with RNNs
To Model Sequential Data: Recurrent Neural Networks
Recurrent Net (Unfolded)
LSTM Memory Cell (Hochreiter & Schmidhuber, 1997)
LSTM Forward Pass
LSTM (full network)
Unstructured Input
Modeling Problems
• Examples: 10,401 episodes
• Features: 13 time series (sensor data, lab tests)
• Complications: Irregular sampling, missing values, varying-length sequences
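One common recipe for those complications, not necessarily the exact one used in this work, is to resample each channel onto a regular time grid by carrying the last observation forward, then pad the varying-length sequences to a fixed length; the helper name `forward_fill_resample` is illustrative.

```python
import numpy as np

def forward_fill_resample(times, values, grid):
    """Resample one irregularly sampled channel onto a regular grid by
    carrying the last observation at or before each grid point forward.
    Grid points before the first observation stay NaN (still missing)."""
    out = np.full(len(grid), np.nan)
    last, idx = np.nan, 0
    for i, t in enumerate(grid):
        while idx < len(times) and times[idx] <= t:
            last = values[idx]
            idx += 1
        out[i] = last
    return out
```

Remaining NaNs (before the first measurement) and length differences still need a policy, e.g. imputing a population mean and zero-padding to the longest sequence.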
How to model sequences?
• Markov models
• Conditional random fields
• Problem: these cannot model long-range dependencies
Simple Formulation
Target Replication
Auxiliary Targets
Results
Outline
• Introduction to Multilabel Learning
• Evaluation
• Efficient Learning & Sparse Models
• Deep Learning for Multilabel Classification
• Jointly Learning to Generate and Classify Beer Reviews
RNN Language Model
Past Supervised Approaches Relied on an Encoder-Decoder Model
Bridging Long Time Intervals with Concatenated Inputs
Example
A.5 FRUIT/VEGETABLE BEER <STR>On tap at the brewpub. A nice dark red color with a nice head that left a lot of lace on the glass. Aroma is of raspberries and chocolate. Not much depth to speak of despite consisting of raspberries. The bourbon is pretty subtle as well. I really don’t know that I find a flavor this beer tastes like. I would prefer a little more carbonization to come through. It’s pretty drinkable, but I wouldn’t mind if this beer was available. <EOS>
Character-based Classification
“Love the Strong Hoppy Flavor”
Thanks!
Critical Review of RNNs: http://arxiv.org/abs/1506.00019
Learning to Diagnose: http://arxiv.org/abs/1511.03677
Conditional Generative RNNs: http://arxiv.org/abs/1511.03683
Contact: [email protected] · zacklipton.com