Automatic Sequential Pattern Mining in Data Streams
Koki Kawabata*, Yasuko Matsubara & Yasushi Sakurai
ISIR-AIRC, Osaka University
*Supported by SIGIR Student Travel Grants

Motivation
Given: time-evolving data streams
• e.g., IoT sensors/Web click logs
• contain multiple patterns
CIKM2019 Sakurai Lab. K.Kawabata et al. 2
Answer the following questions:
1. What kind of patterns?
2. How many patterns?
3. When do patterns change?
Requirements:
• Incremental
  • We cannot access all historical data
• Automatic
  • # of patterns is unknown in advance
  • without any parameter tuning
StreamScope: automatic & incremental approach
Demo movie
#1: Arm curl
#2: Rowing
#3: Intervals
#4: Side raise
#5: Push up

Outline
1. Motivation
2. Problem definition
3. Model
4. Streaming Algorithm
5. Experiments
6. Conclusions

Problem definition
[Figure: sensor stream with four dimensions (L-arm, R-arm, L-leg, R-leg); x-axis: Time, y-axis: Value; segments are labeled with regimes 1 to 4]
Given:
• Data stream $X$
Find:
1. Segment: $\mathcal{S}$
2. Regime: $\Theta$
3. Segment-membership: $\mathcal{F}$

Problem definition
Data stream: set of d-dimensional vectors
Given: $X = \{x_1, \dots, x_n\}$
[Figure: the same stream with $d = 4$ dimensions (L-arm, R-arm, L-leg, R-leg); x-axis: Time, y-axis: Value]

Problem definition
Segment: start/end positions of each pattern
Hidden: $\mathcal{S} = \{s_1, \dots, s_m\}$
[Figure: the same stream divided into $m = 8$ segments $s_1, \dots, s_8$]

Problem definition
Regime: segment groups
Hidden: $\Theta = \{\theta_1, \dots, \theta_r, \Phi\}$
[Figure: the same stream with its segments grouped into $r = 4$ regimes, labeled 1 to 4]

Problem definition
Segment-membership: regime-assignment
Hidden: $\mathcal{F} = \{f(1), \dots, f(m)\}$
e.g., $f(3) = 4$ (segment $s_3$ is assigned to regime $\theta_4$)
[Figure: the same stream with the regime number of each of the 8 segments written above it]

Problem definition
Given: d-dimensional data stream $X = \{x_1, \dots, x_n\}$
Find: compact description $\mathcal{C} = \{m, r, \mathcal{S}, \Theta, \mathcal{F}\}$ of $X$
• $m$ segments $\mathcal{S}$
• $r$ regimes $\Theta$
• segment-membership $\mathcal{F}$
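To make the compact description concrete, here is a minimal Python sketch of a container for $\mathcal{C}$. The class name `CompactDescription`, its field names, and the half-open `(start, end)` convention are assumptions made for illustration; they do not come from the talk. Regime parameters are kept as opaque dictionaries.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class CompactDescription:
    """Hypothetical container for C = {m, r, S, Theta, F}."""
    segments: List[Tuple[int, int]] = field(default_factory=list)  # S: (start, end), end exclusive
    regimes: List[dict] = field(default_factory=list)              # Theta: one parameter set per regime
    membership: List[int] = field(default_factory=list)            # F: 1-based regime index per segment

    @property
    def m(self) -> int:
        # Number of segments.
        return len(self.segments)

    @property
    def r(self) -> int:
        # Number of regimes.
        return len(self.regimes)

    def regime_of(self, t: int) -> int:
        """Return f(i) for the segment s_i that covers time tick t."""
        for (start, end), f in zip(self.segments, self.membership):
            if start <= t < end:
                return f
        raise ValueError(f"time {t} is not covered by any segment")


# Toy example: three segments that alternate between two regimes.
c = CompactDescription(
    segments=[(0, 100), (100, 250), (250, 400)],
    regimes=[{"name": "regime-1"}, {"name": "regime-2"}],
    membership=[1, 2, 1],
)
```

Deriving `m` and `r` from the lists, rather than storing them, keeps the counts consistent when segments or regimes are appended.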

Proposed model
Goal: find compact description $\mathcal{C}$ in a streaming setting
Challenges:
Q1. How can we represent regimes?
Idea (1): Hierarchical probabilistic model
Q2. How can we decide # of segments/regimes?
Idea (2): Model description cost

Idea (1): hierarchical probabilistic model
Q. How to describe patterns?
Idea: HMM-based probabilistic model
• 'within-regime' transitions: a hidden Markov model $\theta = \{\pi, A, B\}$
• 'across-regime' transitions: regime transition matrix $\Phi = \{\phi_{ij}\}_{i,j=1}^{r}$
[Figure: a data stream over time $t$ split into regimes (stand, walk, run); each regime $\theta_i$ is modeled by an HMM with states 1 to 3 and state transition probabilities $a_{jk}$]

Idea (1): hierarchical probabilistic model
Full model: $\Theta = \{\theta_1, \dots, \theta_r, \Phi\}$
• Single HMM parameters: $\theta_i = \{\pi_i, A_i, B_i\}$
• Regime transition matrix: $\Phi = \{\phi_{ij}\}_{i,j=1}^{r}$
[Figure: three regimes (stand, walk, run) connected by regime transitions $\phi_{ij}$; inside each regime, an HMM with states 1 to 3 and state transitions $a_{jk}$]
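The two-level structure can be illustrated by sampling from it. The sketch below is only an illustration under stated assumptions: observations are one-dimensional, the emission model $B$ is reduced to a list of per-state Gaussian means with unit variance, and the function name `sample_stream` is hypothetical. At each tick the regime-level matrix $\Phi$ decides whether to switch regimes; otherwise the state-level matrix $A$ of the current regime moves the hidden state.

```python
import random


def sample_stream(regimes, phi, n, seed=0):
    """Sample n observations from a two-level (regime / state) model.

    regimes: list of (pi, A, means), one HMM-like parameter set per regime
             (Gaussian emissions with unit variance are an assumption).
    phi:     r x r regime transition matrix; rows sum to 1.
    Returns (observations, regime label per tick).
    """
    rng = random.Random(seed)
    regime = 0
    pi, A, means = regimes[regime]
    state = rng.choices(range(len(pi)), weights=pi)[0]
    xs, labels = [], []
    for _ in range(n):
        xs.append(rng.gauss(means[state], 1.0))
        labels.append(regime)
        nxt = rng.choices(range(len(regimes)), weights=phi[regime])[0]
        if nxt != regime:
            # Across-regime transition: enter the new regime via its initial distribution.
            regime = nxt
            pi, A, means = regimes[regime]
            state = rng.choices(range(len(pi)), weights=pi)[0]
        else:
            # Within-regime transition: follow the regime's own state transition matrix.
            state = rng.choices(range(len(pi)), weights=A[state])[0]
    return xs, labels


# Toy parameters for two regimes with two states each.
stand = ([1.0, 0.0], [[0.9, 0.1], [0.1, 0.9]], [0.0, 0.5])
run = ([0.5, 0.5], [[0.5, 0.5], [0.5, 0.5]], [5.0, 8.0])
phi = [[0.98, 0.02], [0.02, 0.98]]
xs, labels = sample_stream([stand, run], phi, n=200)
```

A near-diagonal $\Phi$ produces long runs of the same regime, which is what makes segments visible in the stream.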

Idea (2): Incremental encoding scheme
Q. How to decide # of segments/regimes?
Idea: Minimum description length (MDL)
• Minimize the total description cost of a data stream
• Update 'optimal' # of segments/regimes

Idea (2): Incremental encoding scheme
Idea: Minimize total encoding cost
Good compression means good description
$\min \big( Cost_M(\mathcal{C}) + Cost_C(X \mid \mathcal{C}) \big)$
• $Cost_M(\mathcal{C})$: model cost
• $Cost_C(X \mid \mathcal{C})$: coding cost
[Figure: $Cost_M$, $Cost_C$, and total cost $Cost_T$ plotted against the # of regimes/segments $(r, m)$, from 1 to 10]
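The trade-off between model cost and coding cost can be tried on a toy problem. The sketch below is not the encoding scheme of the paper: it assumes equal-length pieces, a Gaussian coding cost in bits, and a flat `FLOAT_BITS = 32` bits per stored parameter, all of which are illustrative choices. It selects the number of pieces $k$ that minimizes the total cost.

```python
import math

FLOAT_BITS = 32  # assumed cost, in bits, of storing one floating-point parameter


def coding_cost(xs, mean, var):
    """Bits needed to encode xs under a Gaussian: -sum(log2 p(x))."""
    var = max(var, 1e-6)  # variance floor to avoid division by zero
    return sum(
        0.5 * math.log2(2 * math.pi * var) + (x - mean) ** 2 / (2 * var * math.log(2))
        for x in xs
    )


def total_cost(xs, k):
    """Cost_T = Cost_M + Cost_C for a fit with k equal-length pieces."""
    n = len(xs)
    bounds = [round(i * n / k) for i in range(k + 1)]
    cost_m = k * 2 * FLOAT_BITS  # one mean and one variance per piece
    cost_c = 0.0
    for lo, hi in zip(bounds, bounds[1:]):
        piece = xs[lo:hi]
        mean = sum(piece) / len(piece)
        var = sum((x - mean) ** 2 for x in piece) / len(piece)
        cost_c += coding_cost(piece, mean, var)
    return cost_m + cost_c


# Toy stream: 50 ticks around level 0, then 50 ticks around level 50.
xs = [float((i % 5) - 2) for i in range(50)] + [50.0 + (i % 5) - 2 for i in range(50)]
best_k = min(range(1, 11), key=lambda k: total_cost(xs, k))
```

With one piece the coding cost is high because a single Gaussian must cover both levels; with many pieces the model cost dominates. The minimum falls in between, at two pieces for this toy stream.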

Idea (2): Incremental encoding scheme
Q. How many new components does $\mathcal{C}$ need?
• Keep $\mathcal{C}$ compact!
• For the current window $X_c$: a new segment? A new regime? How many?
• Costs of a segment, a regime, and a state: details in paper

Streaming algorithms
• Algorithms
Main: StreamScope
Optimize/update parameter set $\mathcal{C}$
1. SegmentAssignment: identify regime transitions & segments
2. RegimeGeneration: estimate new regimes $\theta$

StreamScope
• Overview
[Figure: data stream $X$ over time $t$, the current window $X_c$ at its end, and the model $\mathcal{C}$]
1. Keep current window: $X_c = s_m \cup x_t$
• The latest segment, $s_m$
• New observations, $x_t, \dots$
2. Update model set $\mathcal{C}$
• Minimize $\Delta Cost_T(X_c \mid \mathcal{C})$
• Increase segments? (SegmentAssignment) vs. increase states/regimes? (RegimeGeneration)
3. Update $X_c$
• If pattern has changed

1. SegmentAssignment
Given:
• Observation $x_t$
• Model parameter set $\Theta = \{\theta_1, \dots, \theta_r, \Phi\}$
Find:
• Optimal cut point between regimes: $m, \mathcal{S}, \mathcal{F}$

1. SegmentAssignment
Overview
Dynamic programming algorithm to compute $P(x_t \mid \Theta)$
• Keep all candidate cut points $\mathcal{L} = \{L_1, L_2, \dots\}$, e.g., $L_i = \{2, 3\}$ or $L_i = \{4, 2\}$
[Figure: trellis of regimes $\theta_1, \theta_2, \theta_3$ over observations $x_1, \dots, x_7$ ($t \to$); possible switches between regimes are marked "switch?" and weighted by regime transition probabilities $\phi_{ij}$]

1. SegmentAssignment
Overview
Dynamic programming algorithm to compute $P(x_t \mid \Theta)$
• Keep all candidate cut points $\mathcal{L} = \{L_1, L_2, \dots\}$
• $\gamma$-guarantee: $\gamma \propto \mathrm{mean}(|s|)$
[Figure: the same trellis of regimes $\theta_1, \theta_2, \theta_3$ over observations $x_1, \dots, x_7$ ($t \to$), with a window of length $\gamma$ and a candidate $L_i = \{2, 3\}$]
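The dynamic programming idea can be sketched with a plain Viterbi pass over regimes. This is a simplified stand-in, not the SegmentAssignment algorithm itself: each regime is reduced to a single Gaussian `(mean, std)` instead of a full HMM, every entry of $\Phi$ is assumed to be strictly positive, and cut points are recovered by backtracking over a fixed window rather than by maintaining the candidate set $\mathcal{L}$ incrementally. The function names are hypothetical.

```python
import math


def log_gauss(x, mean, std):
    """Log density of a one-dimensional Gaussian."""
    return -0.5 * math.log(2 * math.pi * std ** 2) - (x - mean) ** 2 / (2 * std ** 2)


def assign_segments(xs, regimes, phi):
    """Most likely regime per time tick, plus the resulting cut points.

    regimes: list of (mean, std), one simplified model per regime.
    phi[i][j]: probability of moving from regime i to regime j (assumed > 0).
    """
    r = len(regimes)
    # Uniform prior over the regime of the first observation.
    score = [log_gauss(xs[0], *regimes[j]) - math.log(r) for j in range(r)]
    back = []
    for x in xs[1:]:
        prev, score, ptr = score, [], []
        for j in range(r):
            # Best predecessor regime for ending in regime j at this tick.
            i_best = max(range(r), key=lambda i: prev[i] + math.log(phi[i][j]))
            ptr.append(i_best)
            score.append(prev[i_best] + math.log(phi[i_best][j]) + log_gauss(x, *regimes[j]))
        back.append(ptr)
    # Backtrack from the best final regime.
    j = max(range(r), key=lambda k: score[k])
    path = [j]
    for ptr in reversed(back):
        j = ptr[j]
        path.append(j)
    path.reverse()
    cuts = [t for t in range(1, len(path)) if path[t] != path[t - 1]]
    return path, cuts


xs = [0.1, -0.2, 0.0, 5.1, 4.9, 5.2, 0.1]
regimes = [(0.0, 1.0), (5.0, 1.0)]
phi = [[0.9, 0.1], [0.1, 0.9]]
path, cuts = assign_segments(xs, regimes, phi)
# path == [0, 0, 0, 1, 1, 1, 0], cuts == [3, 6]
```

The off-diagonal entries of `phi` act as a switching penalty: a regime change is accepted only when the gain in likelihood outweighs the cost of the transition.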

2. RegimeGeneration
Given:
• Current window $X_c$
Find:
• New regimes: parameter set $\{m, r, \mathcal{S}, \Theta, \mathcal{F}\}$ for $X_c$

2. RegimeGeneration
1. Two-phase iterative approach
• Phase 1: split segments into 2 groups
• Phase 2: update the parameters of the 2 models ($\theta_1, \theta_2, \Phi$)
2. Recursively split new regimes
• While total cost can be reduced
[Figure: the current window $X_c$ is first split into regimes $\theta_1, \theta_2$, then recursively into $\theta_1, \theta_2, \theta_3, \dots$]
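The two-phase loop can be sketched as an alternating procedure. The version below is a deliberate simplification and not the RegimeGeneration algorithm itself: each of the two models is reduced to a single mean (the talk uses HMMs), fit is measured by squared error, and only one split is performed. A recursive version would split a group again only while the total cost keeps decreasing. The function name `split_segments` is hypothetical.

```python
def split_segments(segments, max_iter=20):
    """Split segments into 2 groups by alternating assignment and re-estimation.

    segments: list of lists of floats.
    Returns (groups, means), where groups[i] is 0 or 1.
    """
    seg_means = [sum(s) / len(s) for s in segments]
    means = [min(seg_means), max(seg_means)]  # initial guess for the two models
    groups = None
    for _ in range(max_iter):
        # Phase 1: assign each segment to the model that fits it better.
        new_groups = [
            min((0, 1), key=lambda g: sum((x - means[g]) ** 2 for x in s))
            for s in segments
        ]
        if new_groups == groups:
            break  # assignments are stable
        groups = new_groups
        # Phase 2: re-estimate each model from the segments assigned to it.
        for g in (0, 1):
            values = [x for s, a in zip(segments, groups) if a == g for x in s]
            if values:
                means[g] = sum(values) / len(values)
    return groups, means


segments = [[0.0, 0.2, -0.1], [5.0, 5.1], [0.1, -0.2], [4.8, 5.2, 5.0]]
groups, means = split_segments(segments)
# groups == [0, 1, 0, 1]
```

Assigning whole segments, rather than individual points, keeps each segment inside one regime, which matches the segment-membership $\mathcal{F}$ in the problem definition.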

Experiments
• We answer the following questions:
Q1. Effectiveness: How successful is it in discovering patterns?
Q2. Accuracy: How well does it find cut-points & regimes?
Q3. Scalability: How does it scale in terms of time & memory consumption?

Q1. Effectiveness - #Mocap
• MoCap sensor stream
[Figure: MoCap stream (L-arm, R-arm, L-leg, R-leg); x-axis: Time, y-axis: Value; segments are labeled with the discovered regimes 1 to 4]
#1 Going straight
#2 Stretching arms
#3 Stretching right arm
#4 Stretching left arm
StreamScope can find intuitive patterns automatically

Q1. Effectiveness - #Bicycle
• Bicycle dataset: acceleration (X, Y, Z)
[Figure: bicycle acceleration stream with five discovered regimes, #1 to #5]

Q2. Accuracy
• Segmentation accuracy (higher is better)
[Figure: bar chart of macro-F1 score on #Mocap, #Bicycle, and #Workout for StreamScope, AutoPlait, TICC-2, TICC-4, TICC-8, and pHMM]
Good accuracy compared with other methods

Q2. Accuracy
• Clustering accuracy (higher is better)
[Figure: bar chart of accuracy on #Mocap, #Bicycle, and #Workout for StreamScope, AutoPlait, TICC-2, TICC-4, TICC-8, and pHMM]
Good accuracy compared with other methods

Q3. Scalability
• Wall clock time vs. stream length
[Figure: wall clock time (s, log scale) vs. time ($\times 10^4$) for StreamScope, AutoPlait, TICC, and pHMM; annotated "100x"]
The complexity is independent of the data length

Q3. Scalability
• Memory space vs. stream length
[Figure: memory space (bytes, log scale) vs. time ($\times 10^4$) for StreamScope and an O(n) reference; annotated "100x"]
The complexity is independent of the data length

Conclusions
StreamScope has the following advantages:
Effective: Find optimal segments/regimes
Adaptive: Automatic and incremental
Scalable: It does not depend on data length

Thank you!