weakly-supervised action segmentation with iterative soft...

1

Weakly-Supervised Action Segmentation with Iterative Soft Boundary Assignment Li Ding (MIT) Chenliang Xu (University of Rochester) [email protected] [email protected] take bowl pour cereals pour milk stir cereals take bowl pour cereals pour milk stir cereals take bowl pour cereals pour milk stir cereals take bowl pour cereals pour milk stir cereals pour cereals pour cereals pour milk take bowl pour cereals pour milk stir cereals pour cereals pour cereals pour milk training video transcript initialize target inference assignment new target … iterate till meeting stop criteria take bowl pour cereals pour milk stir cereals ground truth training inference squeeze orange pour juice transcript initialize target SIL SIL inference Δ> Δ> Δ< new transcript squeeze orange pour juice SIL SIL squeeze orange SIL frame (t) probability video feature 1D Conv + BN + ReLU + Pool feature 1D Conv + BN + ReLU + Pool feature feature feature Upsample + feature 1x1 Conv 1x1 Conv Upsample + 1x1 Conv (Frame-wise) FC + Softmax prediction Encoder Decoder 1D Conv 1D Conv 1D Conv Overview of the Weakly-Supervised Action Segmentation System Iterative Soft Boundary Assignment (ISBA) Temporal Convolutional Feature Pyramid Network (TCFPN) Experiment Conclusion • We iteratively train a temporal segmentation network with target generated from action transcript, and refine the action transcript based on the inference of current network. Code About • We propose ISBA as a novel training strategy for weakly- supervised sequence learning, and TCFPN as an advanced temporal convolutional network for supervised action recognition. • The whole system TCFPN+ISBA outperforms state-of-the- art on both action segmentation task and alignment tasks. • Our training strategy for weakly-supervised sequence learning is general and can be extended to other tasks, such as speech recognition, video segmentation, etc. • ISBA uses a simple-yet- effective algorithm, trying to generate a reasonable probabilistic distribution of different actions through numbers of iterations.

Upload: others

Post on 03-Jul-2020

3 views

Category:

Documents

0 download

Report

Download

Embed Size (px):

TRANSCRIPT

Page 1: Weakly-Supervised Action Segmentation with Iterative Soft ...liding/materials/ding2018weakly_poster.pdf · Li Ding (MIT) Chenliang Xu (University of Rochester) liding@mit.edu chenliang.xu@rochester.edu

Weakly-Supervised Action Segmentation with Iterative Soft Boundary Assignment

Li Ding (MIT) Chenliang Xu (University of Rochester)[email protected] [email protected]

take bowl pour cereals pour milk stir cereals

take bowl pour cereals pour milk stir cereals

take bowl pour cereals pour milk stir cereals

take bowl pour cereals pour milk stir cerealspour cereals pour cereals pour milk

take bowl pour cereals pour milk stir cerealspour cereals pour cereals pour milk

training

video

transcript

initialize target

inference

assignment

new target

… iterate till meeting stop criteria

take bowl pour cereals pour milk stir cereals

ground truth

training

inference

squeeze orange pour juice

transcript

initialize target

SIL SIL

inference

Δ > 𝜌

Δ > 𝜌 Δ < 𝜌

new transcript

squeeze orange pour juiceSIL SILsqueeze orangeSIL

frame (t)

probability

video feature

1D Conv + BN + ReLU + Pool

feature

1D Conv + BN + ReLU + Pool

feature

feature

feature

Upsample

+

feature

1x1 Conv

1x1 Conv

Upsample

+1x1 Conv

(Frame-wise) FC + Softmax

prediction

Encoder Decoder

1D Conv

1D Conv

1D Conv

Overview of the Weakly-Supervised Action Segmentation System

Iterative Soft Boundary Assignment (ISBA) Temporal Convolutional Feature Pyramid Network (TCFPN)

Experiment

Conclusion

• We iteratively train a temporal segmentation network with

target generated from action transcript, and refine the action

transcript based on the inference of current network.

Code

About

• We propose ISBA as a novel training strategy for weakly-

supervised sequence learning, and TCFPN as an

advanced temporal convolutional network for supervised

action recognition.

• The whole system TCFPN+ISBA outperforms state-of-the-

art on both action segmentation task and alignment tasks.

• Our training strategy for weakly-supervised sequence

learning is general and can be extended to other tasks,

such as speech recognition, video segmentation, etc.

• ISBA uses a simple-yet-

effective algorithm, trying

to generate a reasonable

probabilistic distribution of

different actions through

numbers of iterations.

Employee Forum - rochester.edu

Supporting Information · Supporting Information A Belt-Like One-Dimensional Dy Chain Exhibiting Slow Magnetic Relaxation Behavior Zhi Chen,*a,e Yanhua Lan,b Chenliang Su,a Yi-Quan

UR Financials Project - rochester.edu

NYSS-AAPT meeting SUNY - Brockport October 1, 2005 [email protected] manly/ Much ado about nothing: the 30 minute

Gila Balman Industrial background with over 33 years at Eastman Kodak Company in Patent Legal Tele: 585.273.4512 [email protected] Aeisha Salem

IT Leadership at the University of Rochester Stanley Wilder University of Rochester (585) 275 9328 [email protected]

Nussbaum’s Capabilities and Self-Determination ......& Cody R. DeHaan [email protected] Tadashi Hirai [email protected] Richard M. Ryan [email protected] 1 Department

Steven Manly Univ. of Rochester REU seminar June 19, 2008 [email protected] Intro to High-energy Physics

UBMS Application Updated 2018 - rochester.edu · worthwhile college portfolio ... Math/Science Love Math & Science? 01 Interested in going to college? ... ACKNOWLEDGEMENT CONSENT

[email protected] Jonathan Carroll-Nellenback

Switched Control Strategies of Aggregated …...energies Article Switched Control Strategies of Aggregated Commercial HVAC Systems for Demand Response in Smart Grids Kai Ma 1, Chenliang

UR Financials - rochester.edu · UR Financials User Group – February 2016 UR Financials Announcements • February’s Events – NCL Reporting Classes [February; Basic 2/10 , Grants

Research Article Dynamic Model Identification for 6 …downloads.hindawi.com/journals/jr/2015/471478.pdfResearch Article Dynamic Model Identification for 6-DOF Industrial Robots LiDing,

PROGRESS REPORT: 2018 2020 - rochester.edu

3:00 - 6:30 pm - rochester.edu · William Gblerkpor (UGhana, Archaeology and Heritage Studies): ‘Conserving a Cultural Fortress: Krobo Mountain, Ghana’

f [email protected] Abstract arXiv:2003.12628v1 [cs.LG] 27

46 ˇ˘ - rochester.edu · Leadership Cabinet, the Barbara J. Burger iZone Advisory Council Occupation: Senior philanthropic advisor, Rochester Area Community Foundation Emily Hart

Faculty Special Interest Group - rochester.edu

5 RochRev Sept11 Gazette - rochester.edu ROCHESTER REVIEW September–October 2011 AdAm FEnSTER Alumni Gazette 5_RochRev_Sept11_Gazette.indd 40 8/18/11 10:43 PM

University of Rochester July 28, 2005 [email protected] manly/main/manlyhome.htm Much ado about nothing: the

Class Notes - rochester.edu · CHICAgO:Hajim Mark goldstein ’78 was recognized with the John N. Wilder Award. Arts, Sciences & Engineering Dean’sMeDal ed chairmanHajim ’58,

Revisiting the mass-radius relation of super Earth with ... · Revisiting the mass-radius relation of super Earth with new ice EOS measurement Chenliang Huang 1, Zachary M. Grande1,

These are your - rochester.edu

Audio-Visual Event Localization in the Wild...Audio-Visual Event Localization in the Wild Yapeng Tian, Jing Shi, Bochen Li, Zhiyao Duan, and Chenliang Xu University of Rochester 1

The frequency and distribution of recent landslides …...liding triggered by earthquakes may be the most significant type of major landslide hazard in Puerto Rico in the long term

Presentación de PowerPoint · 2020. 7. 17. · 15 Li Ding Lawyer China Desk T. +34 917 816 160 [email protected] Experience: Senior Lawyer of ECIJA's China Desk. Li has extensive

Modeling Language Discrepancy for Cross-Lingual …...Modeling Language Discrepancy for Cross-Lingual Sentiment Analysis Qiang Chen1, Chenliang Li 2*, Wenjie Li3, and Yanxiang He 1

Audio-visual Source Association for String Ensembles ......Audio-visual Source Association for String Ensembles through Multi-modal Vibrato Analysis Bochen Li, Chenliang Xu, Zhiyao

5 RochRev Nov2011 Gazette - rochester.edu

College Town - rochester.edu

Leaving Your Own Island - rochester.edu

@rochester.edu arXiv:1903.11748v1 [cs.LG] 28 Mar 2019

A Capsule Network for Recommendation and Explaining What ... · A Capsule Network for Recommendation and Explaining What You Like and Dislike Chenliang Li1, Cong Quan2, Li Peng3,

nys inventory cip - rochester.edu fileschool district administrator ms ad2 ms - administration (sda) 13.0499 educational administration and supervision, other ... teaching and curriculum

Cell-cycle-specific Cellular Responses to Sonoporation · Pengfei Fan1*, Yi Zhang 1*, Xiasheng Guo 1*, Chenliang Cai 1, Maochen Wang 1, ... G2 and M (mitosis), is a sequential event