Data Stream Warehousing
TRANSCRIPT
![Page 2: Data Stream Warehousing](https://reader034.vdocument.in/reader034/viewer/2022051521/586ccba61a28abda3a8bfc3c/html5/thumbnails/2.jpg)
Outline
• What is Stream Warehousing?
• Case Study: Darkstar
• Technical Issues
  – Update propagation
  – Temporal consistency
  – Scheduling
![Page 3: Data Stream Warehousing](https://reader034.vdocument.in/reader034/viewer/2022051521/586ccba61a28abda3a8bfc3c/html5/thumbnails/3.jpg)
What is Stream Warehousing
• Data Stream Management System
  – Fast processing in main memory
    • Small data windows, 1-pass processing
  – Data reduction and alerting
• Data Warehouse
  – Long-term data storage
  – Curated data sets
  – Fuse data from many sources
  – Derived data products, complex analytics
• Data Stream Warehouse
  – A data warehouse with continual real-time updates.
![Page 4: Data Stream Warehousing](https://reader034.vdocument.in/reader034/viewer/2022051521/586ccba61a28abda3a8bfc3c/html5/thumbnails/4.jpg)
Data Feeds
Base Tables
Derived Data Tables
![Page 5: Data Stream Warehousing](https://reader034.vdocument.in/reader034/viewer/2022051521/586ccba61a28abda3a8bfc3c/html5/thumbnails/5.jpg)
Data Transformation in the Warehouse
![Page 6: Data Stream Warehousing](https://reader034.vdocument.in/reader034/viewer/2022051521/586ccba61a28abda3a8bfc3c/html5/thumbnails/6.jpg)
Building Application in the Warehouse
![Page 7: Data Stream Warehousing](https://reader034.vdocument.in/reader034/viewer/2022051521/586ccba61a28abda3a8bfc3c/html5/thumbnails/7.jpg)
Application: Darkstar
• AT&T Labs – Research project.
• Collect diverse and large-scale data feeds from network elements
• Use for
  – networking research,
  – data mining (e.g. correlate network events with failures),
  – alerting,
  – troubleshooting
• The network is a large and complex system
  – Not just IPv4.
• Argus
  – He Yan, Zihui Ge
• Ptolemy
  – Zihui Ge, Don Caldwell, Bill Beckett
![Page 8: Data Stream Warehousing](https://reader034.vdocument.in/reader034/viewer/2022051521/586ccba61a28abda3a8bfc3c/html5/thumbnails/8.jpg)
Darkstar: Mining Vast Amounts of Data
Network
Route monitors (OSPFmon, BGPmon)
Device service monitoring (CIQ, MTANet, STREAM)
Active service and connectivity monitoring
Syslog Config
SNMP Polling (router, link) Netflow
Deep Packet Inspection (DPI)
Alarms
Tickets
Authentication/ logging (tacacs)
Customer feedback – IVR, tickets, MTS
IP Backhaul Enterprise IP, VPNs
Ethernet Access
IPTV
Layer one
Mobility
![Page 9: Data Stream Warehousing](https://reader034.vdocument.in/reader034/viewer/2022051521/586ccba61a28abda3a8bfc3c/html5/thumbnails/9.jpg)
ARGUS: Detecting Service Issues…
• Goal: detect and isolate actionable anomaly events using comprehensive end-to-end performance measurements (e.g. GS tool)
• Sophisticated anomaly detection and heuristics
• Spatial localization
• Accurately accounts for service performance that varies considerably by time-of-day and location
• Impact:
  – Reduced detection time from days to approx. 15 mins for detecting data service issues
  – Operational nation-wide monitoring of data service performance for 3G and LTE (TCP retransmission, RTT, throughput from GS Tool)
![Page 10: Data Stream Warehousing](https://reader034.vdocument.in/reader034/viewer/2022051521/586ccba61a28abda3a8bfc3c/html5/thumbnails/10.jpg)
Approach: Mobility Localization Hierarchy
[Figure: hierarchy of Market, Sub-Market, SGSN, RNC, and SITE nodes, together with GGSNs; label: "Collect end-to-end Performance Data"]
![Page 11: Data Stream Warehousing](https://reader034.vdocument.in/reader034/viewer/2022051521/586ccba61a28abda3a8bfc3c/html5/thumbnails/11.jpg)
Case Example: Silent CE Overload Condition
• ARGUS detected event: 2 Columbia 3G Ericsson SGSNs impacting RNCs in West Virginia, Norfolk, and Richmond
• No other indication of issue
• Topology highlighted CE used by only impacted SGSNs
• RCA: “6148 48 port 1gig card is limited to a shared 1 gig bus for each set of 8 gig ports”
ARGUS alarm: clmamdorpn2 (TCP retransmissions)
CE Utilization flattening
![Page 12: Data Stream Warehousing](https://reader034.vdocument.in/reader034/viewer/2022051521/586ccba61a28abda3a8bfc3c/html5/thumbnails/12.jpg)
ARGUS As A General Capability…
Spike in call drop rate on MSC hrndvacxca1
RTT anomalies (SGSN level)
Outage start 5:30 GMT
First Anomaly 5:40 GMT
CTS Ticket Created 08:21 GMT
Social media (Twitter) NY outage
LA outage
Node metrics, active measurements (CBB, IPAG WIPM delay)…
Mobility customer tickets (Boston market – PE isolation)
![Page 13: Data Stream Warehousing](https://reader034.vdocument.in/reader034/viewer/2022051521/586ccba61a28abda3a8bfc3c/html5/thumbnails/13.jpg)
Ptolemy
Use network visualization and convenient data exploration to help network operators with network health monitoring and service problem troubleshooting
• 1. At-a-glance view of network topology and state
• Visualization to summarize important information on network health
• Color-coded
• Complementary to ticketing system – reporting issues below “alarming” status
http://ptolemy.research.att.com/
http://ptolemy.research.att.com/mobility
![Page 14: Data Stream Warehousing](https://reader034.vdocument.in/reader034/viewer/2022051521/586ccba61a28abda3a8bfc3c/html5/thumbnails/14.jpg)
Example 1: Japan Earthquake, March 11th 2011
Assess damage, identify remaining capacity
Loss of many links out of Japan. What’s left?
![Page 15: Data Stream Warehousing](https://reader034.vdocument.in/reader034/viewer/2022051521/586ccba61a28abda3a8bfc3c/html5/thumbnails/15.jpg)
Example 1: Japan Earthquake, March 11th 2011
Identify traffic shifts, no congestion
Increase in link load as traffic re-routed
Link load
![Page 16: Data Stream Warehousing](https://reader034.vdocument.in/reader034/viewer/2022051521/586ccba61a28abda3a8bfc3c/html5/thumbnails/16.jpg)
DataDepot
• Data warehousing system developed for stream warehousing
  – (Relatively) independent of the underlying database.
• Technologies for pushing updates through a warehouse
  – Update propagation
  – Temporal consistency
  – Real-time scheduling in a stream warehouse
  – Lukasz Golab, Vladislav Shkapenyuk
![Page 17: Data Stream Warehousing](https://reader034.vdocument.in/reader034/viewer/2022051521/586ccba61a28abda3a8bfc3c/html5/thumbnails/17.jpg)
Managing a Stream Warehouse
• Continually arriving data
Raw data
![Page 18: Data Stream Warehousing](https://reader034.vdocument.in/reader034/viewer/2022051521/586ccba61a28abda3a8bfc3c/html5/thumbnails/18.jpg)
Managing a Stream Warehouse
• Continually arriving data
• Is loaded into temporally partitioned base tables
Raw data
time
![Page 19: Data Stream Warehousing](https://reader034.vdocument.in/reader034/viewer/2022051521/586ccba61a28abda3a8bfc3c/html5/thumbnails/19.jpg)
Managing a Stream Warehouse
• Continually arriving data
• Is loaded into temporally partitioned base tables
• Updates propagate to higher level data products.
Raw data
time
![Page 20: Data Stream Warehousing](https://reader034.vdocument.in/reader034/viewer/2022051521/586ccba61a28abda3a8bfc3c/html5/thumbnails/20.jpg)
Incremental Updates
• Only propagate the increment.
• Update only those partitions whose sources have new data.
• How can we determine if a source partition has more recent data?
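The second bullet can be illustrated with a minimal sketch. It assumes base partitions are identified by their start time in seconds and that each derived partition covers a fixed one-hour range; `DERIVED_WIDTH` and `derived_partitions_to_update` are hypothetical names chosen for illustration, not part of DataDepot.

```python
# Hypothetical layout: derived partitions each cover one hour of base data.
DERIVED_WIDTH = 3600  # seconds (an assumption for this sketch)


def derived_partitions_to_update(updated_base_starts):
    """Map the start times of base partitions that received new data
    to the start times of the derived partitions that cover them."""
    # Round each base start time down to the enclosing derived partition.
    return sorted({start - start % DERIVED_WIDTH for start in updated_base_starts})


# Three base partitions in two different hours received new data,
# so only those two derived partitions need to be recomputed.
affected = derived_partitions_to_update([0, 300, 7500])
```

Only the derived partitions returned here are recomputed; all others are left untouched, which is the "increment" the slide refers to.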
![Page 21: Data Stream Warehousing](https://reader034.vdocument.in/reader034/viewer/2022051521/586ccba61a28abda3a8bfc3c/html5/thumbnails/21.jpg)
“make” doesn’t work
[Figure: tables B1, B2, S1, S2, and D, labeled with the numbers 1 to 5]
![Page 22: Data Stream Warehousing](https://reader034.vdocument.in/reader034/viewer/2022051521/586ccba61a28abda3a8bfc3c/html5/thumbnails/22.jpg)
“make” doesn’t work
[Figure: tables B1, B2, S1, S2, and D, labeled with numbers from 1 to 8]
![Page 23: Data Stream Warehousing](https://reader034.vdocument.in/reader034/viewer/2022051521/586ccba61a28abda3a8bfc3c/html5/thumbnails/23.jpg)
“make” doesn’t work
[Figure: tables B1, B2, S1, S2, and D, labeled with numbers from 1 to 11]
![Page 24: Data Stream Warehousing](https://reader034.vdocument.in/reader034/viewer/2022051521/586ccba61a28abda3a8bfc3c/html5/thumbnails/24.jpg)
Update Propagation
• We can build complex apps if we’re confident that all updates get propagated.
• 1st version of DD: used make-style algorithm
  – Not correct for complex configurations
• Developed update propagation theory
• 2nd version: has scheduling restrictions (read/write locks)
  – Led to poor real-time responsiveness
• 3rd version: no scheduling restrictions
  – Uses a small amount of additional metadata.
  – Similar to a vector timestamp.
  – SSDBM 2011
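The slides do not spell out the metadata used by the 3rd version, so the following is only a sketch of the general idea behind vector-timestamp-style bookkeeping, not the DataDepot algorithm. Each partition remembers, per source, which source version it last consumed, rather than comparing a single modification time as a make-style check would. The `Partition` class, the version counters, and the table layout are all assumptions made for illustration.

```python
class Partition:
    """A table partition with per-source version bookkeeping (illustrative only)."""

    def __init__(self, name, sources=()):
        self.name = name
        self.sources = list(sources)
        self.version = 0   # incremented whenever this partition's contents change
        self.seen = {}     # source name -> source version used by the last refresh

    def load(self):
        """Simulate new data arriving in this partition."""
        self.version += 1

    def is_stale(self):
        """True if any source has changed since this partition last consumed it."""
        return any(self.seen.get(src.name, 0) < src.version for src in self.sources)

    def refresh(self):
        """Record the source versions being consumed, then mark this partition changed."""
        snapshot = {src.name: src.version for src in self.sources}
        # ... recompute this partition from its sources here ...
        self.seen = snapshot
        self.version += 1


# Hypothetical dependency graph: S1 is derived from B1; D from S1 and B2.
b1, b2 = Partition("B1"), Partition("B2")
s1 = Partition("S1", [b1])
d = Partition("D", [s1, b2])

b1.load()
stale_before = (s1.is_stale(), d.is_stale())  # S1 is stale, D is not yet
s1.refresh()
d_stale_after_s1 = d.is_stale()               # the change has reached D's sources
```

Because staleness is decided per source, the check does not depend on the order in which updates are scheduled, which is the property the slide attributes to the version without scheduling restrictions.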
![Page 25: Data Stream Warehousing](https://reader034.vdocument.in/reader034/viewer/2022051521/586ccba61a28abda3a8bfc3c/html5/thumbnails/25.jpg)
Effects of Scheduling Restrictions
[Figure: two plots titled "Update propagation from CPU_RAW to CPU", one for the starting-timestamp update protocol and one for the interval-timestamp update protocol; x-axis: time; y-axis: seconds (0 to 1400); series: C_POLL_COUNTS, AGG_60_C]
![Page 26: Data Stream Warehousing](https://reader034.vdocument.in/reader034/viewer/2022051521/586ccba61a28abda3a8bfc3c/html5/thumbnails/26.jpg)
Consistency in a Stream Warehouse
• Traditional notion of consistency: a snapshot of the system.
• In a big, complex system, you can’t take a NOW snapshot.
• In most cases, you eventually reach a point where you are reasonably confident about the state of the system in the recent past.
• CIDR 2011
![Page 27: Data Stream Warehousing](https://reader034.vdocument.in/reader034/viewer/2022051521/586ccba61a28abda3a8bfc3c/html5/thumbnails/27.jpg)
Data Arrives in a Smear Over Time
Partially filled partition
Late arriving data
Very late data
![Page 28: Data Stream Warehousing](https://reader034.vdocument.in/reader034/viewer/2022051521/586ccba61a28abda3a8bfc3c/html5/thumbnails/28.jpg)
[Figure: "Number of windows per package"; x-axis: Time (seconds), 0 to 600000; y-axis: Number of Windows, 0 to 12]
![Page 29: Data Stream Warehousing](https://reader034.vdocument.in/reader034/viewer/2022051521/586ccba61a28abda3a8bfc3c/html5/thumbnails/29.jpg)
Many Data Feeds
• The value of a warehouse is the ability to correlate different streams of information.
  – Correlate periods of high packet loss on the active measurement probes with the router CPU and memory utilization on the routers on the path between the measurement probes.
• Different feeds have different time lags
  – Active measurements: 45 minutes, SNMP: 15 minutes
• Darkstar
  – Warehouse of network performance, configuration, and alert data
  – Used for research, billing, network troubleshooting
  – 100 data feeds, 700 tables as of December 2010.
![Page 30: Data Stream Warehousing](https://reader034.vdocument.in/reader034/viewer/2022051521/586ccba61a28abda3a8bfc3c/html5/thumbnails/30.jpg)
Query Stability
• How do I know when the data is stable enough to query?
• What is stable enough?
  – Data will never change.
  – Data won’t change much.
  – I’ll take whatever is there.
![Page 31: Data Stream Warehousing](https://reader034.vdocument.in/reader034/viewer/2022051521/586ccba61a28abda3a8bfc3c/html5/thumbnails/31.jpg)
Consistency Levels
• Punctuations on partitions that indicate completeness.
• Vagueness of real life means that they are best guesses.
• We use the following in our running examples
  – Open: The partition should have some data in it.
  – Closed: The partition will not change.
  – Complete: The partition will not change, and all data has been received.
    • E.g. we know that there are five packages per window, and they will arrive at most 1 hour late.
• Motivated by specific needs of DataDepot users.
• Closed is a guess
  – WeaklyClosed, StronglyClosed
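As a rough illustration of how a base partition might be labeled, the sketch below uses the slide's example of five packages per window arriving at most one hour late. The function name, its parameters, and the exact labeling rules are assumptions for illustration; the slides only give the informal definitions of the levels.

```python
def base_partition_levels(packages_received, window_end, now,
                          expected_packages=5, max_lateness=3600):
    """Return the set of consistency levels a base partition currently satisfies.

    Times are in seconds. The defaults follow the slide's example:
    five packages per window, each at most one hour late.
    """
    levels = set()
    if packages_received > 0:
        levels.add("Open")        # the partition has some data in it
    if now >= window_end + max_lateness:
        levels.add("Closed")      # the lateness bound has passed: no change expected
    if packages_received >= expected_packages:
        # Everything expected has arrived, so nothing more should change either.
        levels.update({"Closed", "Complete"})
    return levels


partial = base_partition_levels(3, window_end=1000, now=2000)
timed_out = base_partition_levels(3, window_end=1000, now=1000 + 3600)
full = base_partition_levels(5, window_end=1000, now=1500)
```

A set of levels is returned rather than a single label because, as the slides note, the levels do not always form a hierarchy.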
![Page 32: Data Stream Warehousing](https://reader034.vdocument.in/reader034/viewer/2022051521/586ccba61a28abda3a8bfc3c/html5/thumbnails/32.jpg)
Base Partition Consistency Levels
• Label each base table partition with a temporal consistency level.
• Use source-specific information to infer how certain we can be that all data for a partition has arrived.
  – Tends to be a hazy notion.
• Sometimes we have a hierarchy
  – Complete > StronglyClosed > Closed > Open
  – But not in general.
![Page 33: Data Stream Warehousing](https://reader034.vdocument.in/reader034/viewer/2022051521/586ccba61a28abda3a8bfc3c/html5/thumbnails/33.jpg)
Derived Table Consistency Inference
• Infer on a partition-wise basis, for each consistency level separately
• Simple rule: a partition has consistency level C if all source partitions have consistency level C.
• Can make use of the properties of the defining query to improve the inference.
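The simple rule can be sketched as a set intersection, assuming each source partition carries the set of consistency levels it currently satisfies. `infer_derived_levels` is a hypothetical helper, and the sketch ignores the query-specific refinements mentioned in the last bullet.

```python
def infer_derived_levels(source_levels):
    """Apply the simple rule: a derived partition has level C
    only if every one of its source partitions has level C.

    source_levels: one set of level names per source partition.
    """
    if not source_levels:
        return set()
    # Intersecting the sets checks each level separately across all sources.
    return set.intersection(*(set(levels) for levels in source_levels))


mixed = infer_derived_levels([{"Open", "Closed"}, {"Open"}])
all_closed = infer_derived_levels([{"Open", "Closed"}, {"Open", "Closed", "Complete"}])
```

Treating each level independently means the rule still works when the levels do not form a hierarchy.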
![Page 34: Data Stream Warehousing](https://reader034.vdocument.in/reader034/viewer/2022051521/586ccba61a28abda3a8bfc3c/html5/thumbnails/34.jpg)
Update Consistency
• We might know that some tables naturally require their partitions to have a particular consistency (update consistency) to be useful.
  – Router alerts: Open
  – Per-day usage summaries: Closed
• We can reduce update cost by only updating a partition if it would achieve a particular level of consistency
  – Per-day summary fed by 5-minute updates: 288 updates when only 1 is needed.
• Labeling all tables in the warehouse is an excessive burden on the DBA.
  – Label important final-result tables, and infer the update consistency for the others.
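The 288-versus-1 figure follows from a day holding 24 × 60 / 5 = 288 five-minute partitions. The simulation below is a minimal sketch that assumes each five-minute source partition is already Closed when it arrives and that the per-day partition becomes Closed only once all of its sources are; `count_updates` is a hypothetical name.

```python
SOURCES_PER_DAY = 24 * 60 // 5  # 288 five-minute partitions feed one per-day partition


def count_updates(gate_on_closed):
    """Count recomputations of one per-day partition over a full day.

    gate_on_closed=False: recompute after every source arrival.
    gate_on_closed=True:  recompute only when the update would make the
                          per-day partition Closed (all sources Closed).
    """
    closed = [False] * SOURCES_PER_DAY
    updates = 0
    for i in range(SOURCES_PER_DAY):
        closed[i] = True  # source partition i arrives (assumed Closed on arrival)
        if not gate_on_closed or all(closed):
            updates += 1
    return updates


ungated = count_updates(gate_on_closed=False)
gated = count_updates(gate_on_closed=True)
```

With gating, the per-day summary is computed once, after the last source partition closes, instead of once per arrival.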
![Page 35: Data Stream Warehousing](https://reader034.vdocument.in/reader034/viewer/2022051521/586ccba61a28abda3a8bfc3c/html5/thumbnails/35.jpg)
Consistency Levels
• Many consistency levels are possible.
• Closed is a guess.
  – WeaklyClosed: probably stable.
  – StronglyClosed: almost certainly stable.
• Other levels
  – MostlyClosed: Few values will change.
  – MostlyFull: Most expected records have arrived.
• The consistency levels might not form a hierarchy
Complete
StronglyClosed
WeaklyClosed
MostlyClosed
MostlyFull
Open
![Page 36: Data Stream Warehousing](https://reader034.vdocument.in/reader034/viewer/2022051521/586ccba61a28abda3a8bfc3c/html5/thumbnails/36.jpg)
Scheduling
• Need to schedule updates to avoid resource thrashing.
• Real-time scheduling problem: some very long jobs, some very short jobs.
  – Global scheduling is the most efficient
  – But, it is easy to generate infeasible task sets with low resource utilization using global scheduling.
• Catch-up processing can generate temporary overloads
  – Due to broken feeds, data quality debugging, etc.
  – Can’t discard updates during overload (unlike DSMS)
  – Need to perform catch-up without affecting real-time tasks.
![Page 37: Data Stream Warehousing](https://reader034.vdocument.in/reader034/viewer/2022051521/586ccba61a28abda3a8bfc3c/html5/thumbnails/37.jpg)
Conclusions
• Optimization, service quality, and security of large-scale, complex systems require a stream monitoring infrastructure.
• Data Stream Warehousing enables near real-time applications
  – Alerting and troubleshooting using near real-time and historical data
• Next steps:
  – Moving to cloud infrastructure