technical deep-dive in a column-oriented in-memory database
DESCRIPTION
The technical document to get more knowledge on In-Memory computing.TRANSCRIPT
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
1/130
Technical Deep-Dive in a
Column-OrientedIn-Memory DatabaseProf. Hasso Plattner
Stephan Mller
Research Group of Prof. Hasso Plattner
Hasso Plattner Institute for Software EngineeringUniversity of Potsdam
TEC215
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
2/130
Motivation
Prof. Hasso Plattner
Stephan Mller
Research Group of Prof. Hasso Plattner
Hasso Plattner Institute for Software EngineeringUniversity of Potsdam
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
3/130
All Areas have to taken into
account
Changed
Hardware
Advances in
data processing
(software)
Complex
Enterprise Applications
Our focus
3
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
4/130
Why a New Data
Management?! DBMS architecture has not changed over decades
Redesign needed to handle the changes in:
Hardware trends (CPU/cache/memory)
Changed workloads Data characteristics/amount
New application requirements
Some academic prototypes:
MonetDB, C-store, HyPer, HYRISE
Several database vendors picked up
the idea and have new databases in place
(e.g., SAP, Vertica, Greenplum, Oracle)
Buffer pool
Query engine
Traditional DBMS Architecture
4
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
5/130
A
Changes in Hardware
give an opportunity to re-think the assumptions of
yesterday because of what is possible today.
Main Memory becomes cheaper and larger
Multi-Core Architecture
(96 cores per server)
One blade ~$50.000 =
1 Enterprise Class Server
Parallel scaling across blades
64 bit address space
2TB in current servers
25GB/s per core
Cost-performance ratio rapidly
declining
Memory hierarchies
5
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
6/130
In the Meantime
Research as come up with
Column-oriented data organization(the column store)
Sequentialscans allow best bandwidth utilizationbetween CPU cores and memory
Independence of tuples within columns allows easypartitioning and therefore parallel processing
Lightweight Compression Reducing data amount, while..
Increasing processing speed through late materialization
And more, e.g., parallel scan/join/aggregation
several advance in software for processing data
+
6
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
7/130
Data Management
for
Enterprise Applications
Prof. Hasso Plattner
Stephan Mller
Research Group of Prof. Hasso Plattner
Hasso Plattner Institute for Software EngineeringUniversity of Potsdam
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
8/130
Challenge: Diverse
ApplicationsTransactional
Data Entry
Sources: Machines, Transactional Apps,
User Interaction, etc.
Text Analytics,
Unstructured Data
Sources: web, social, logs,
support systems, etc.
Event Processing,
Stream Data
Sources: machines, sensors,
high volume systems
Real-time Analytics,
Structured Data
Sources: Reporting, Classical
Analytics, planning, simulation
Data Management
CPUs
(multi-Core + Cache)
Main Memory
8
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
9/130
OLTP vs. OLAP
Modern enterprise resource planning (ERP) systems are
challenged by mixedworkloads, including OLAP-style
queries. For example:
OLTP-style: create sales order, invoice, accounting documents,
display customer master data or sales order
OLAP-style: dunning, available-to-promise, cross selling, operational
reporting (list open sales orders) But: Todays data management systems are optimized
eitherfor daily transactional oranalytical workloads
(storing their data along rows or columns)
Online TransactionProcessing
Online AnalyticalProcessing
9
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
10/130
Drawbacks of any
Separation
OLAP/sub system does not have the latestdata
OLAP/sub system does only have predefinedsubsetof the
data
Cost-intensiveETLprocess has to synch both systems
There is a lot of redundancy
Differentdataschemasintroduce complexity for applications
combining sources
10
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
11/130
Workarounds in OTLP
As in OLAP-systems OLTP-systems facilitate redundant data to
overcome shortcoming in todays data management
Materialized Views
Materialized Aggregates
Pre-computed and materialized result sets
Since the database has been the bottleneck, complex data
processing is done on application server
Simple SQL statements
Nested loop joins (SELECT . SELECT SINGLE ENDSELECT)
Batch-processing lead to
Long running business processes
Inflexibility (e.g. ATP rescheduling)
11
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
12/130
Enterprise Applications Have a
Specific Database Footprint
Today's Enterprise Applications
Complex processes
Increased data set (but real-world events driven)
Separated into transactional (OLTP) and analytical (OLAP) applications Enterprise Data Management
Wide schemas
Sparse data with limited domain
Workload consists of complex analytical queries
Workload characteristics
Set processing
Read access
Insert operations instead of updates12
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
13/130
Enterprise Data Characteristics
Enterprise data is sparse data:
Most tables are empty (~150 important tables)
Many columns are not used even once (~50%)
Many columns have a low cardinality of values
NULL values/default values are dominant
Sparse distribution facilitates high compression
0
12500
25000
37500
50000
0 1-100 100-1000 1000-10000 10K-100K 100K - 1M 1M-10M >10M
14157992513852685
6290
15553
46418
Tablesthatfitinto
cate
gory
Number of Records
13
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
14/130
Enterprise Data Characteristics
Enterprise data is sparse data:
Most tables are empty (~150 important tables)
Many columns are not used even once (~50%)
Many columns have a low cardinality of values NULL values/default values are dominant
Sparse distribution facilitates high compression
0%
10%
20%
30%
40%
50%
60%70%
80%
1 - 32 33 - 1023 1024 - 100000000
13%
9%
78%
24%
12%
64%
Number of Distinct Values
Inventory Management
Financial Accounting
%ofColumns
14
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
15/130
Enterprise Workloads are
Read-Mostly
0 %
10 %
20 %
30 %
40 %
50 %
60 %
70 %
80 %
90 %
100%
OLTP OLAP
Workload
0 %
10 %
20 %
30 %
40 %
50 %
60 %
70 %
80 %
90 %
100%
TPC-C
Workload
Select
Insert
Modification
Delete
Write:
Read:
Lookup
Table Scan
Range Select
Insert
Modification
Delete
Write:
Read:
Enterprise applications have evolved:
Not just OLAP vs. OLTP
Workload in enterprise applications constitutes of
Mainly read queries (OLTP 83%, OLAP 94%)
Many queries access large sets of data
15
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
16/130
Approach
Change overall data management system assumption
In-Memory only
Start with read-optimized data structures
Transactional features as needed
Vertically partitioned (column store)
CPU-cache optimized
Only one optimization objectivemain memory access
Rethink how enterprise application persistence is build Single data management system
No redundant data, no materialized views, cubes
Computational application logic closer to the database
(i.e. complex queries, stored procedures)Backup
Column
ROW TEXT
IN-Memory Column + Row
OLTP + OLAP + Text
16
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
17/130
In-Memory Data Processing
Prof. Hasso Plattner
Stephan Mller
Research Group of Prof. Hasso Plattner
Hasso Plattner Institute for Software EngineeringUniversity of Potsdam
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
18/130
Recap:
Memory Hierarchy
18
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
19/130
Recap:
Latency Numbers
L1 cache reference (cached data word) 0.5ns
Branch mispredict 5ns
L2 cache reference 7ns
Main memory reference 100ns 0.1s
Send 2K bytes over 1 Gbps network 20,000ns 20s
SSD random read 150,000ns 150s
Read 1 MB sequentially from memory 250,000ns 250s
Disk seek 10,000,000ns 10ms
Send packet CANetherlandsCA 150,000,000ns 150ms
19
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
20/130
In-Memory Data Processing
In DBMS, on disk as well as in memory,data processing is often:
Not CPU bound
But bandwidth bound
I/O Bottleneck
CPU could process data faster
Memory Access:
Not truly random (in the sense of constant latency) Data is read in blocks/cache lines
Even if only parts of a block are requested
Potential waste of bandwidth V1 V2 V3 V4 V5 V6 V7 V8 V9 V10
Cache Line 1 Cache Line 220
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
21/130
Data Layout in Main Memory
Prof. Hasso Plattner
Stephan Mller
Research Group of Prof. Hasso Plattner
Hasso Plattner Institute for Software EngineeringUniversity of Potsdam
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
22/130
Basics (1)
Memory in todays computers has a linear address layout:
addresses start at 0x0 and go to 0xFFFFFFFFFFFFFFFF for 64bit
Not every system is 64bit addressable, e.g. modern Intel
systems use only 48bits which allows up to 256TB RAM on asingle machine
Virtual memory allocated by the program can distribute over
this space
Each UNIX process has its own view on the address space Address translation is done in hardware by the CPU
22
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
23/130
Basics (2)
The memory layout is only linear, every higher-dimensional
access is mapped to this linear band.
23
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
24/130
Physical Data Representation
Rowstore: Rows are stored consecutively
Optimal for row-wise access (e.g. SELECT *)
Columnstore:
Columns are stored consecutively
Optimal for attribute focused access
(e.g. SUM, GROUP BY)
Note: concept is independentfrom storage type
But only in-memory implementation allows fast tuple
reconstruction in case of a column store
Doc
Num
Doc
Date
Sold-
To
Value
Status
Sales
Org
Row4
Row3
Row2
Row1
Row-Store Column-store
+
24
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
25/130
Row Data Layout
Data is stored tuple-wise
Leverage co-location of attributes for a single tuple
Low cost for reconstruction, but higher cost for sequential
scan of a single attribute
25
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
26/130
Columnar Data Layout
Data is stored attribute-wise
Leverage sequential scan-speed in main memory
for predicate evaluation
Tuple reconstruction is more expensive
26
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
27/130
Row-oriented storage
27
A1 B1 C1
A2 B2 C2
A3 B3 C3
A4 B4 C4
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
28/130
Row-oriented storage
28
A1 B1 C1
A2 B2 C2
A3 B3 C3
A4 B4 C4
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
29/130
Row-oriented storage
29
A1 B1 C1 A2 B2 C2
A3 B3 C3
A4 B4 C4
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
30/130
Row-oriented storage
30
A1 B1 C1 A2 B2 C2 A3 B3 C3
A4 B4 C4
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
31/130
Row-oriented storage
31
A1 B1 C1 A2 B2 C2 A3 B3 C3 A4 B4 C4
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
32/130
Column-oriented storage
32
A1 B1 C1
A2 B2 C2
A3 B3 C3
A4 B4 C4
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
33/130
Column-oriented storage
33
B1 C1
B2 C2
B3 C3
B4 C4
A1 A2 A3 A4
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
34/130
Column-oriented storage
34
A1 B1
C1
A2 B2
C2
A3 B3
C3
A4 B4
C4
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
35/130
Column-oriented storage
35
A1 B1 C1A2 B2 C2A3 B3 C3A4 B4 C4
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
36/130
Example: OLTP-Style Query
36
struct Tuple {
int a,b,c;
};
Tuple data[4];
fill(data);
Tuple third = data[3];
A1 B1 C1
A2 B2 C2
A3 B3 C3
A4 B4 C4
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
37/130
Example: OLTP-Style Query
37
struct Tuple {
int a,b,c;
};
Tuple data[4];
fill(data);
Tuple third = data[3];
A1 B1 C1
A2 B2 C2
A3 B3 C3
A4 B4 C4
Row Oriented Storage
Column Oriented Storage
Cache line
1 2
1 2 3
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
38/130
Example: OLAP-Style Query
38
struct Tuple {
int a,b,c;
};
Tuple data[4];
fill(data);
int sum = 0;
for(int i = 0;i
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
39/130
Example: OLAP-Style Query
39
struct Tuple {
int a,b,c;
};
Tuple data[4];
fill(data);
int sum = 0;
for(int i = 0;i
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
40/130
Mixed Workloads
40
A1 B1 C1
A2 B2 C2
A3 B3 C3
A4 B4 C4
A1 B1 C1
A2 B2 C2
A3 B3 C3
A4 B4 C4
OLTP-style queries OLAP-style queries
Mixed Workloads involve attribute- andentity-focused queries
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
41/130
Mixed Workloads:
Choosing the Layout
41
A1 B1 C1
A2 B2 C2
A3 B3 C3
A4 B4 C4
A1 B1 C1
A2 B2 C2
A3 B3 C3
A4 B4 C4
OLTP-style queries OLAP-style queries
Layout OLTP-Misses OLAP-Misses Mixed
Row 2 3 5
Column 3 1 4
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
42/130
Dictionary Encoding
Prof. Hasso Plattner
Stephan Mller
Research Group of Prof. Hasso Plattner
Hasso Plattner Institute for Software EngineeringUniversity of Potsdam
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
43/130
Motivation
Main memory access is the new bottleneck
Idea: TradeCPU time to compress and decompress data
Compression reduces number of I/O operations to main
memory
Leads to less cache misses due to more information on a cache line
Operation directly on compressed data
Offsetting with bit-encoded fixed-length data types
Based on limited value domain
43
D E d
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
44/130
Dictionary Encoding
Example8 billion humans
Attributes
first name
last name
gender
country
city
Birthday
200 byte Each attribute is dictionary
encoded
44
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
45/130
Sample Data
45
rec ID fname lname gender city country birthday
39 John Smith m Chicago USA 12.03.1964
40 Mary Brown f London UK 12.05.1964
41 Jane Doe f Palo Alto USA 23.04.1976
42 John Doe m Palo Alto USA 17.06.1952
43 Peter Schmidt m Potsdam GER 11.11.1975
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
46/130
Dictionary Encoding a Column
A column is split into a dictionary and an attribute vector
Dictionary stores all distinct values with implicit value ID
Attribute vector stores value IDs for all entries in the column
Position is stored implicitly Enables offsetting with bit-encoded fixed-length data types
46
Rec ID fname
39 John
40 Mary
41 Jane
42 John
43 Peter
Dictionary for fname
Value ID Value
23 John
24 Mary
25 Jane
26 Peter
Attribute Vector for fname
position Value ID
39 23
40 24
41 25
42 23
43 26
Q i D i
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
47/130
Querying Data using
Dictionaries
Search for Attribute Value
Search Value IDs for requested value in Dictionary
Scan Attribute Vector for Value ID
Replace Value IDs in result with corresponding
dictionary value
47
Dictionary for fname
Value ID Value
23 John
24 Mary
25 Jane
26 Peter
Attribute Vector for fname
position Value ID
39 23
40 24
41 25
42 23
43 26
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
48/130
Sorted Dictionary
Dictionary entries are sorted either by their numeric value or
lexicographically
Dictionary lookup complexity: O(log(n)) instead of O(n)
Dictionary entries can be compressed to reduce the amount
of required storage
Selection criteria with ranges are less expensive (order-
preserving dictionary)
48
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
49/130
Compression Rate
Depends on cardinality / entropy
Cardinality
Table cardinality: number of tuples in a relation
Column cardinality: number of distinct values in a column
Entropy
measure for information density
Entropy = column cardinality / table cardinality
49
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
50/130
Data Size Examples
50
Column Column
Cardinality
Entropy Item Size Plain Size Size with Dictionary
(Dictionary + Column)
First names 5 million
23 bit
6.25 10-4 49 Byte 365.10 GB
373,840 MB
234 MB + 21.42 GB
22,168 MB
Last names 8 million
23 bit
1 10-3 50 Byte 372.5 GB
381,470 MB
381 MB + 21.42 GB
22,316 MB
Gender 2
1 bit
2.5 10-10 1 Byte 7.45 GB
7,630 MB
2 Byte + 0.93 GB
954 MB
City 1 million
20 bit
1.25 10-4 49 Byte 365.08 GB
373,840 MB
46.73 MB + 18.62 GB
19,120 MB
Country 200
8 bit
2.5 10-8 49 Byte 365.08 GB
373,840 MB
6.09 KB + 7.45 GB
7,629 MB
Birthday 40,000
16 bit
5 10-6 2 Byte 14.90 GB
15,260
MB
76.29 KB + 14.90 GB
15,259 MB
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
51/130
In-Memory Database
Operations
Prof. Hasso Plattner
Stephan Mller
Research Group of Prof. Hasso Plattner
Hasso Plattner Institute for Software EngineeringUniversity of Potsdam
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
52/130
Scan Performance (1)
8 billion humans
Attributes
First Name
Last Name
Gender
Country
City
Birthday
200 byte
Question: How many men/women?
Assumed scan speed: 2MB/ms/core
52
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
53/130
Scan Performance (2)
53
Row Store Layout
Table size =
8 billion tuples x
200 bytes per tuple
1.6 TB
Scan through all rows
with 2MB/ms/core
800 seconds
with 1 core
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
54/130
Scan Performance (3)
54
Row Store Full Table Scan
Table size =
8 billion tuples x
200 bytes per tuple
1.6 TB
Scan through all rows
with 2MB/ms/core
800 seconds
with 1 core
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
55/130
Scan Performance (4)
55
8 billion cache accesses
64 byte
512 GB
Read with 2MB/ms/core256 seconds
with 1 core
Row Store Stride Access Gender
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
56/130
Scan Performance (5)
56
Column Store Layout
Table size
Attribute vectors: 91 GB
Dictionaries: 700 MB
Total: 92 GB Compression
factor: 17
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
57/130
Scan Performance (6)
57
Column Store Full Column Scan on Gender
Size of attribute vector
gender =
8 billion tuples x
1 bit per tuple1 GB
Scan through
attribute vector with
2MB/ms/core
0.5 seconds with 1 core
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
58/130
Scan Performance (7)
58
Column Store Full Column Scan on Birthday
Size of attribute vector
birthday =
8 billion tuples x
2 byte per tuple =16 GB
Scan through
column with
2MB/ms/core
8 seconds with 1 core
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
59/130
Scan PerformanceSummary
59
How many women, how many men?
Column
Store Row Store
Full tablescan
Strideaccess
Time inseconds
0.5 800 256
1,600xslower
512xslower
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
60/130
SELECT Example
SELECT first_name, last_name FROM world_populationWHERE country=IT and gender=m
id fname lname country gender
2394 Gianluigi Buffon Italy m
3010 Lena Gercke Germany f
3040 Mario Balotelli Italy m
3949 Manuel Neuer Germany m
4902 Lukas Podolski Germany m
20102 Klaas-Jan Huntelaar Netherlands m
60
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
61/130
Query Plan
Predicates are evaluated and generate position lists
Intermediate position lists are logically combined
Final position list is used for materialization
61
s s
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
62/130
Query Execution
Position
0
3
Position
0
2
3
4
5
Position
0
3
country = 3 (Italy)
gender = 1 (m)
AND
Value ID Dictionary
forcountry
0 Algeria
1 France
2 Germany
3 Italy
4 Netherlands
Value
ID
Dictionary
forgender
0 f
1 m
id fname lname country gender
2394 Gianluigi Buffon 3 1
3010 Lena Gercke 2 0
3040 Mario Balotelli 3 1
3949 Manuel Neuer 2 1
4902 Lukas Podolski 2 1
20102 Klaas-Jan Huntelaar 4 1
62
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
63/130
Tuple Reconstruction
Prof. Hasso Plattner
Stephan Mller
Research Group of Prof. Hasso Plattner
Hasso Plattner Institute for Software EngineeringUniversity of Potsdam
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
64/130
Tuple Reconstruction #1
All attributes are
stored
consecutively
200 byte4 cacheaccesses 64 byte
256 byte
Read with
2MB/ms/core
0.128
microseconds
with 1 core
Accessing a record in a row store
64
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
65/130
Tuple Reconstruction #2
All attributes are
stored in separate
columns
Implicit record IDsare used to
reconstruct rows
65
Virtual Record IDs
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
66/130
Tuple Reconstruction #3
1 cache access for
each attribute
6 cache accesses
64 byte384byte
Read with
2MB/ms/core
0.192
microseconds
with 1 core
66
Virtual Record IDs
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
67/130
Insert
Prof. Hasso Plattner
Stephan Mller
Research Group of Prof. Hasso Plattner
Hasso Plattner Institute for Software EngineeringUniversity of Potsdam
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
68/130
Example
rowID fname lname gender country city birthday
0 Martin Albrecht m GER Berlin 08-05-1955
1 Michael Berg m GER Berlin 03-05-1970
2 Hanna Schulze f GER Hamburg 04-04-19683 Anton Meyer m AUT Innsbruck 10-20-1992
4 Sophie Schulze f GER Potsdam 09-03-1977
... ... ... ... ... ... ...
world_population
INSERT INTO world_population
VALUES (Karen, Schulze, w, GER, Rostock, 06-20-2012)
68
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
69/130
INSERT (1) w/o new Dictionary entry
fname lname gender country city birthday
0 Martin Albrecht m GER Berlin 08-05-1955
1 Michael Berg m GER Berlin 03-05-1970
2 Hanna Schulze f GER Hamburg 04-04-1968
3 Anton Meyer m AUT Innsbruc
k
10-20-1992
4 Sophie Schulze f GER Potsdam 09-03-1977
... ... ... ... ... ... ...
INSERT INTO world_population VALUES (Karen, Schulze, w, GER, Rostock, 06-20-2012)
0 0
1 1
2 3
3 2
4 3
0 Albrecht
1 Berg
2 Meyer
3 Schulze
DAV
Attribute Vector (AV)
Dictionary (D)
69
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
70/130
0 Albrecht
1 Berg
2 Meyer
3 Schulze
1. Look-up on Dentry found
INSERT INTO world_population VALUES (Karen, Schulze, w, GER, Rostock, 06-20-2012)
Attribute Vector (AV)
Dictionary (D)
D
fname lname gender country city birthday
0 Martin Albrecht m GER Berlin 08-05-1955
1 Michael Berg m GER Berlin 03-05-1970
2 Hanna Schulze f GER Hamburg 04-04-1968
3 Anton Meyer m AUT Innsbruc
k
10-20-1992
4 Sophie Schulze f GER Potsdam 09-03-1977
... ... ... ... ... ... ...
AV
0 0
1 1
2 3
3 2
4 3
INSERT (1) w/o new Dictionary entry
70
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
71/130
0 0
1 1
2 3
3 2
4 3
5 3
1. Look-up on Dentry found
2. Append ValueID to AV
INSERT INTO world_population VALUES (Karen, Schulze, w, GER, Rostock, 06-20-2012)
Attribute Vector (AV)
Dictionary (D)
fname lname gender country city birthday
0 Martin Albrecht m GER Berlin 08-05-1955
1 Michael Berg m GER Berlin 03-05-1970
2 Hanna Schulze f GER Hamburg 04-04-1968
3 Anton Meyer m AUT Innsbruc
k
10-20-1992
4 Sophie Schulze f GER Potsdam 09-03-1977
5 Schulze
... ... ... ... ... ... ...
AV
0 Albrecht
1 Berg
2 Meyer
3 Schulze
D
INSERT (1) w/o new Dictionary entry
71
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
72/130
INSERT (2) with new Dictionary Entry I/II
fname lname gender country city birthday
0 Martin Albrecht m GER Berlin 08-05-1955
1 Michael Berg m GER Berlin 03-05-1970
2 Hanna Schulze f GER Hamburg 04-04-1968
3 Anton Meyer m AUT Innsbruck 10-20-1992
4 Sophie Schulze f GER Potsdam 09-03-1977
5 Schulze
... ... ... ... ... ... ...
INSERT INTO world_population VALUES (Karen, Schulze, w, GER, Rostock, 06-20-2012)
0 0
1 0
2 1
3 2
4 3
DAV
0 Berlin
1 Hamburg
2 Innsbruck
3 Potsdam
Attribute Vector (AV)
Dictionary (D)
72
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
73/130
1. Look-up on Dno entry found
INSERT INTO world_population VALUES (Karen, Schulze, w, GER, Rostock, 06-20-2012)
DAV
0 0
1 0
2 1
3 2
4 3
0 Berlin
1 Hamburg
2 Innsbruck
3 Potsdam
fname lname gender country city birthday
0 Martin Albrecht m GER Berlin 08-05-1955
1 Michael Berg m GER Berlin 03-05-1970
2 Hanna Schulze f GER Hamburg 04-04-1968
3 Anton Meyer m AUT Innsbruck 10-20-1992
4 Sophie Schulze f GER Potsdam 09-03-1977
5 Schulze
... ... ... ... ... ... ...
Attribute Vector (AV)
Dictionary (D)
INSERT (2) with new Dictionary Entry I/II
73
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
74/130
1. Look-up on Dno entry found
2. Append new value to D (no re-sorting necessary)
INSERT INTO world_population VALUES (Karen, Schulze, w, GER, Rostock, 06-20-2012)
0 Berlin
1 Hamburg
2 Innsbruck
3 Potsdam4 Rostock
D
fname lname gender country city birthday
0 Martin Albrecht m GER Berlin 08-05-1955
1 Michael Berg m GER Berlin 03-05-1970
2 Hanna Schulze f GER Hamburg 04-04-1968
3 Anton Meyer m AUT Innsbruck 10-20-1992
4 Sophie Schulze f GER Potsdam 09-03-1977
5 Schulze
... ... ... ... ... ... ...
Attribute Vector (AV)
Dictionary (D)
AV
0 0
1 0
2 1
3 24 3
INSERT (2) with new Dictionary Entry I/II
74
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
75/130
1. Look-up on Dno entry found
2. Append new value to D (no re-sorting necessary)
3. Append ValueID to AV
INSERT INTO world_population VALUES (Karen, Schulze, w, GER, Rostock, 06-20-2012)
fname lname gender country city birthday
0 Martin Albrecht m GER Berlin 08-05-1955
1 Michael Berg m GER Berlin 03-05-1970
2 Hanna Schulze f GER Hamburg 04-04-1968
3 Anton Meyer m AUT Innsbruck 10-20-1992
4 Sophie Schulze f GER Potsdam 09-03-1977
5 Schulze Rostock
... ... ... ... ... ... ...
Attribute Vector (AV)
Dictionary (D)
0 0
1 0
2 1
3 24 3
5 4
0 Berlin
1 Hamburg
2 Innsbruck
3 Potsdam4 Rostock
DAV
INSERT (2) with new Dictionary Entry I/II
75
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
76/130
0 Anton
1 Hanna
2 Martin
3 Michael4 Sophie
fname lname gender country city birthday
0 Martin Albrecht m GER Berlin 08-05-1955
1 Michael Berg m GER Berlin 03-05-1970
2 Hanna Schulze f GER Hamburg 04-04-1968
3 Anton Meyer m AUT Innsbruck 10-20-1992
4 Sophie Schulze f GER Potsdam 09-03-1977
5 Schulze Rostock
... ... ... ... ... ... ...
INSERT INTO world_population VALUES (Karen, Schulze, w, GER, Rostock, 06-20-2012)
DAV
0 2
1 3
2 1
3 04 4
Attribute Vector (AV)
Dictionary (D)
INSERT (2) with new Dictionary Entry I/II
76
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
77/130
1. Look-up on Dno entry found
INSERT INTO world_population VALUES (Karen, Schulze, w, GER, Rostock, 06-20-2012)
fname lname gender country city birthday
0 Martin Albrecht m GER Berlin 08-05-1955
1 Michael Berg m GER Berlin 03-05-1970
2 Hanna Schulze f GER Hamburg 04-04-1968
3 Anton Meyer m AUT Innsbruck 10-20-1992
4 Sophie Schulze f GER Potsdam 09-03-1977
5 Schulze Rostock
... ... ... ... ... ... ...
Attribute Vector (AV)
Dictionary (D)
0 Anton
1 Hanna
2 Martin
3 Michael4 Sophie
DAV
0 2
1 3
2 1
3 04 4
INSERT (2) with new Dictionary Entry II/II
77
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
78/130
1. Look-up on Dno entry found
2. Append new value to D
INSERT INTO world_population VALUES (Karen, Schulze, w, GER, Rostock, 06-20-2012)
fname lname gender country city birthday
0 Martin Albrecht m GER Berlin 08-05-1955
1 Michael Berg m GER Berlin 03-05-1970
2 Hanna Schulze f GER Hamburg 04-04-1968
3 Anton Meyer m AUT Innsbruck 10-20-1992
4 Sophie Schulze f GER Potsdam 09-03-1977
5 Schulze Rostock
... ... ... ... ... ... ...
Attribute Vector (AV)
Dictionary (D)
0 Anton
1 Hanna
2 Martin
3 Michael4 Sophie
5 Karen
DAV
0 2
1 3
2 1
3 04 4
INSERT (2) with new Dictionary Entry II/II
78
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
79/130
1. Look-up on Dno entry found
2. Append new value to D
3. Sort D
INSERT INTO world_population VALUES (Karen, Schulze, w, GER, Rostock, 06-20-2012)
fname lname gender country city birthday
0 Martin Albrecht m GER Berlin 08-05-1955
1 Michael Berg m GER Berlin 03-05-1970
2 Hanna Schulze f GER Hamburg 04-04-1968
3 Anton Meyer m AUT Innsbruck 10-20-1992
4 Sophie Schulze f GER Potsdam 09-03-1977
5 Schulze Rostock
... ... ... ... ... ... ...
Attribute Vector (AV)
Dictionary (D)
0 Anton
1 Hanna
2 Martin
3 Michael4 Sophie
5 Karen
D (old)AV
0 2
1 3
2 1
3 04 4
0 Anton
1 Hanna
2 Karen
3 Martin4 Michael
5 Sophie
D (new)
INSERT (2) with new Dictionary Entry II/II
79
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
80/130
1. Look-up on Dno entry found
2. Append new value to D
3. Sort D
4. Change ValueIDs in AV
INSERT INTO world_population VALUES (Karen, Schulze, w, GER, Rostock, 06-20-2012)
fname lname gender country city birthday
0 Martin Albrecht m GER Berlin 08-05-1955
1 Michael Berg m GER Berlin 03-05-1970
2 Hanna Schulze f GER Hamburg 04-04-1968
3 Anton Meyer m AUT Innsbruck 10-20-1992
4 Sophie Schulze f GER Potsdam 09-03-1977
5 Schulze Rostock
... ... ... ... ... ... ...
Attribute Vector (AV)
Dictionary (D)
AV (old)
0 Anton
1 Hanna
2 Karen
3 Martin4 Michael
5 Sophie
D (new)AV (new)
0 3
1 4
2 1
3 04 5
0 2
1 3
2 1
3 04 4
INSERT (2) with new Dictionary Entry II/II
80
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
81/130
1. Look-up on Dno entry found
2. Append new value to D
3. Sort D
4. Change ValueIDs in AV
5. Append new ValueID to AV
INSERT INTO world_population VALUES (Karen, Schulze, w, GER, Rostock, 06-20-2012)
fname lname gender country city birthday
0 Martin Albrecht m GER Berlin 08-05-1955
1 Michael Berg m GER Berlin 03-05-1970
2 Hanna Schulze f GER Hamburg 04-04-1968
3 Anton Meyer m AUT Innsbruck 10-20-1992
4 Sophie Schulze f GER Potsdam 09-03-1977
5 Karen Schulze Rostock
... ... ... ... ... ... ...
Attribute Vector (AV)
Dictionary (D)
0 Anton
1 Hanna
2 Karen
3 Martin4 Michael
5 Sophie
DAV
0 3
1 4
2 1
3 04 5
5 2
INSERT (2) with new Dictionary Entry II/II
81
RESULT
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
82/130
RESULT
rowID fname lname gender country city birthday
0 Martin Albrecht m GER Berlin 08-05-1955
1 Michael Berg m GER Berlin 03-05-1970
2 Hanna Schulze f GER Hamburg 04-04-1968
3 Anton Meyer m AUT Innsbruck 10-20-1992
4 Ulrike Schulze f GER Potsdam 09-03-1977
5 Karen Schulze f GER Rostock 06-20-2012
... ... ... ... ... ... ...
world_population
INSERT INTO world_population
VALUES (Karen, Schulze, w, GER, Rostock, 06-20-2012)
82
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
83/130
Insert-Only
Prof. Hasso Plattner
Stephan Mller
Research Group of Prof. Hasso Plattner
Hasso Plattner Institute for Software EngineeringUniversity of Potsdam
Facts about Insert Only
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
84/130
Facts about Insert-Only
Principles
Never delete any data
Invalidate outdated tuples instead
Logical Update is changed to an technical Insert/Append
Advantages
Gap-less time travel possible,
Legal requirements, e.g. for auditability can easily be met
Implicit logging
Snapshot Isolation and locking is simplified
Disadvantage: Increased memory consumption
but applied compression reduces overhead
84
Implementation possibilities
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
85/130
Implementation possibilities1. Point Representation
Store complete tuple on attribute change Save insert timestamp in column valid_from
Writes faster, reads slower
2. Interval representation
Store complete tuple on attribute change Update replaced tuple, store current timestamp in valid_to
Same timestamp is stored into valid_from in new tuple
Reads faster, writes slower
3. History reconstruction on demand
Update existing tuple on changes (not insert-only any longer)
Store outdated values in separate history table
Status updates can be done in-place with timestamps
Timestamps are not compressed85
Insert Only 1
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
86/130
Insert-Only 1Point Representation
Introduce one additional column: valid_from
Primary key is composed of id and valid_from
Insert is easy: valid_from = current timestamp
Select, Group By, Join: requires nested join to determine the valid_from
timestamp for each object
id fname lname gender country city birthday valid_from
0 Martin Albrecht m GER Berlin 08-05-1955 10-11-2011
1 Michael Berg m GER Berlin 03-05-1970 10-11-2011
2 Hanna Schulze f GER Hamburg 04-04-1968 10-11-2011
3 Anton Meyer m AUT Innsbruck 10-20-1992 10-11-2011
4 Ulrike Schulze f GER Potsdam 09-03-1977 10-11-2011
5 Sophie Schulze f GER Rostock 06-20-2012 10-11-2011
1 Michael Berg m GER Potsdam 03-05-1970 07-02-2012
... ... ... ... ... ... ... ...
Zacharias Perdopolus m GRE Athen 03-12-1979 10-11-20118109
86
Insert Only 2
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
87/130
Insert-Only 2Interval Representation
Introduce 2 additional columns: valid_from and valid_to
Primary key is composed of id and valid_from
Insert requires update of the formerly current tuple
Select, Group By, Join is easy: Where clause eliminates tuples out of range
Finding up-to-date entries can be supported by an additional bit-vector on
column valid_to
id fname lname gender country city birthday valid_from valid_to
0 Martin Albrecht m GER Berlin 08-05-1955 10-11-2011
1 Michael Berg m GER Berlin 03-05-1970 10-11-2011 07-02-2012
2 Hanna Schulze f GER Hamburg 04-04-1968 10-11-2011
3 Anton Meyer m AUT Innsbruck 10-20-1992 10-11-2011
4 Ulrike Schulze f GER Potsdam 09-03-1977 10-11-2011
5 Sophie Schulze f GER Rostock 06-20-2012 10-11-2011
1 Michael Berg m GER Potsdam 03-05-1970 07-02-2012
... ... ... ... ... ... ... ... ...
Zacharias Perdopolus m GRE Athen 03-12-1979 10-11-20118109
87
Snapshot Isolation
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
88/130
Snapshot Isolation Snapshot Isolation guarantees consistent reads during a
transaction, all reads retrieve the values that were active inthe moment the transaction started
Conflicts like lost updates may happen theoretically, but
are prevented through
Pre-write checks in the database or Application-level locks
wants to insert.
An alert is raised since the read
values are no longer valid.
transaction2
time
read1 insert1
read2
insert2
transaction1
transaction2
88
St t U d t
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
89/130
Status Updates
When updates of status fields are changed by replacement, do weneed to insert a new version of the tuple?
Insert Only would lead to an overhead (e.g. clearing in FI)
Most status fields are binary
Uncompressed in-place updates
with row timestamp
Unpaid Paid
t = 2009/06/30t = NULL
89
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
90/130
Handling Data Modifications
Prof. Hasso Plattner
Stephan Mller
Research Group of Prof. Hasso Plattner
Hasso Plattner Institute for Software EngineeringUniversity of Potsdam
Motivation
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
91/130
Motivation
Inserting new tuples directly into a compressed structure can be
expensive
New values can require reorganizing the dictionary
Number of bits required to encode all dictionary values can change,
attribute vector has to be reorganized
Deletion of tuples is expensive
All attribute vectors have to be reorganized, value IDs of following tuples
have to be moved
91
Differential Buffer
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
92/130
Differential Buffer
New values are written to a dedicated differential buffer (Delta)
Cache Sensitive B+ Tree (CSB+) used for faster search on Delta
DictionaryAttribute Vector
fname
(compressed)
Main Store
Dictionary
Attribute Vector
CSB+
fname
Differential Buffer/ Delta
WriteRead
world_population
0 0
1 1
2 1
3 3
4 2
5 1
0 Anton
1 Hanna
2 Michael
3 Sophie
0 Angela
1 Klaus
2 Andre
0 0
1 1
2 1
3 2
8 Billion entriesup to 50,000
entries
92
Diff ti l B ff
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
93/130
Differential Buffer
Inserts of new values are faster, because dictionary andattribute vector does not need to be resorted
Range select on differential buffer expensive, based on
unsorted dictionary
Differential Buffer requires more memory: no attribute vector compression
additional CSB+ Tree for dictionary
93
Tuple Lifetime
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
94/130
Differential Buffer
Main Store
Tuple Lifetime
recId fname lname gender country city birthday
0 Martin Albrecht m GER Berlin 08-05-
1955
1 Michael Berg m GER Berlin 03-05-
1970
2 Hanna Schulze f GER Hamburg 04-04-
1968
3 Anton Meyer m AUT Innsbruck 10-20-
1992
4 Ulrike Schulze f GER Potsdam 09-03-
1977
5 Sophie Schulze f GER Rostock 06-20-2012
... ... ... ... ... ... ...
8 * 109 Zacharias Perdopolus m GRE Athen 03-12-
1979
Main Table: world_population
Michael moves from Berlin to Potsdam
UPDATE world_population
SET city=Potsdam
WHERE fname= Michael ANDlname=Berg
94
Tuple Lifetime
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
95/130
Differential Buffer
Main Store
Tuple Lifetime
recId fname lname gender country city birthday
0 Martin Albrecht m GER Berlin 08-05-
1955
1 Michael Berg m GER Berlin 03-05-
1970
2 Hanna Schulze f GER Hamburg 04-04-
1968
3 Anton Meyer m AUT Innsbruck 10-20-
1992
4 Ulrike Schulze f GER Potsdam 09-03-
1977
5 Sophie Schulze f GER Rostock 06-20-2012
... ... ... ... ... ... ...
8 * 109 Zacharias Perdopolus m GRE Athen 03-12-
1979
UPDATE world_population
SET city=Potsdam
WHERE fname= Michael ANDlname=Berg
95
Main Table: world_population
Michael moves from Berlin to Potsdam
Tuple Lifetime
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
96/130
Tuple Lifetime
recId fname lname gender country city birthday
0 Martin Albrecht m GER Berlin 08-05-
1955
1 Michael Berg m GER Berlin 03-05-
1970
2 Hanna Schulze f GER Hamburg 04-04-
1968
3 Anton Meyer m AUT Innsbruck 10-20-
1992
4 Ulrike Schulze f GER Potsdam 09-03-
1977
5 Sophie Schulze f GER Rostock 06-20-2012
... ... ... ... ... ... ...
8 * 109 Zacharias Perdopolus m GRE Athen 03-12-
1979
UPDATE world_population
SET city=Potsdam
WHERE fname= Michael ANDlname=Berg
Differential Buffer
Main Store
0 Michael Berg m GER Potsdam 03-05-
1970
96
Main Table: world_population
Michael moves from Berlin to Potsdam
Tuple Lifetime
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
97/130
Tuple Lifetime
Problem: Tuples are now available in Main Store and Differential
Buffer
Tuples of a table are marked by a validity vector to reduce the
required amount of reorganization steps
Like an attribute vector for validity
Invalidated tuples stay in the database table, until the next
reorganization takes place
Search results are reduced using the validity vector
1 bit required per database tuple
97
Tuple Lifetime
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
98/130
recId fname lname gender country city birthday valid
0 Martin Albrecht m GER Berlin 08-05-
1955
1
1 Michael Berg m GER Berlin 03-05-
1970
0
2 Hanna Schulze f GER Hamburg 04-04-
1968
1
3 Anton Meyer m AUT Innsbruck 10-20-
1992
1
4 Ulrike Schulze f GER Potsdam 09-03-
1977
1
5 Sophie Schulze f GER Rostock 06-20-2012
1
... ... ... ... ... ... ...
8 * 109 Zacharias Perdopolus m GRE Athen 03-12-
1979
1
Tuple Lifetime
UPDATE world_population
SET city=Potsdam
WHERE fname= Michael ANDlname=Berg
0 Michael Berg m GER Potsdam 03-05-
1970
1 DifferentialBuffer
Main Store
98
Main Table: world_population
Michael moves from Berlin to Potsdam
Tuple Lifetime
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
99/130
Tuple Lifetime
UPDATE world_population
SET city=Potsdam
WHERE fname= Michael ANDlname=Berg
recId fname lname gender country city birthday valid
0 Martin Albrecht m GER Berlin 08-05-
1955
1
1 Michael Berg m GER Berlin 03-05-
1970
0
2 Hanna Schulze f GER Hamburg 04-04-
1968
1
3 Anton Meyer m AUT Innsbruck 10-20-
1992
1
4 Ulrike Schulze f GER Potsdam 09-03-
1977
1
5 Sophie Schulze f GER Rostock 06-20-2012
1
... ... ... ... ... ... ...
8 * 109 Zacharias Perdopolus m GRE Athen 03-12-
1979
1
0 Michael Berg m GER Potsdam 03-05-
1970
1 DifferentialBuffer
Main Store
99
Main Table: world_population
Michael moves from Berlin to Potsdam
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
100/130
Stored Procedures
Prof. Hasso PlattnerStephan Mller
Research Group of Prof. Hasso Plattner
Hasso Plattner Institute for Software EngineeringUniversity of Potsdam
Facts about
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
101/130
Stored Procedures Basically a procedural program stored in the database Written in a (vendor-) specific language (i.e. PL/SQL, Java, )
Usually support constructs like loops, conditions, i.e.
Takes parameters as input
Returns a result set Main usage: set operations that cannot be expressed in SQL
(or are very hard to express)
Additional usages:
Access control Data validation
Data conversion
101
Advantages
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
102/130
Advantages
Performance No data transfer between database and application server
Code reduction
Share stored procedure written in a generic way
Less code in the application layer
Improved security
No SQL injection possible in stored procedures
102
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
103/130
Implications on ApplicationDevelopment
Prof. Hasso PlattnerStephan Mller
Research Group of Prof. Hasso Plattner
Hasso Plattner Institute for Software EngineeringUniversity of Potsdam
How does it all come together?
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
104/130
How does it all come together?
1. Mixed Workload combining OLTP andanalytic-style queries Column-Stores are best suited for analytic-style queries
In-memory database enables fast tuple re-construction
In-memory column store allows aggregation on-the-fly
2. Sparse enterprise data Lightweight compressionschemes are optimal
Increases query execution
Improves feasibility of in-memory database
3. Mostly read workload Read-optimized stores provide best throughput
i.e. compressed in-memory column-store
Write-optimized store as delta partition
to handle data changes is sufficient
ChangedHardware
Advances indata processing
(software)
Complex
Enterprise Applications
Our focus
104
An In-Memory Database for
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
105/130
Enterprise Applications
In-Memory Database (IMDB)
Data resides permanently
in main memory
Main Memory is the
primary persistence Still: logging to disk/recovery
from disk
Main memory access is
the new bottleneck
Cache-conscious algorithms/data structures are crucial
(locality is king)
105
Main Memoryat Blade i
Log
SnapshotsPassive Data (History)
Non-VolatileMemory
RecoveryLoggingTimetravel
Dataaging
Query Execution Metadata TA Manager
Interface Services and Session Management
Distribution Layerat Blade i
Main Store Differential
Store
Active Data
MergeC
olumn
Column
Combined
Column
Column
Column
Combined
Column
Indexes
Inverted
ObjectData Guide
Simplified Application Development
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
106/130
p pp p
Traditional Column-oriented
Application
cache
Database
cache
Prebuilt
aggregates
Raw data
106
Fewer caches necessary No redundant data
(OLAP/OLTP, LiveCache)
No maintenance of
materialized views oraggregates
Minimal index
maintenance
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
107/130
Enterprise ApplicationPoCs
Prof. Hasso PlattnerStephan Mller
Research Group of Prof. Hasso Plattner
Hasso Plattner Institute for Software EngineeringUniversity of Potsdam
SAP ERP Financials on
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
108/130
In-Memory Technology
In-memory column database for an ERP system
Combined workload (parallel OLTP/OLAP queries)
Leverage in-memory capabilities to
Reduce amount of data
Aggregate on-the-fly
Run analytic-style queries (to replace materialized views)
Execute stored procedures
Use Case: SAP ERP Financials solution
Post and change documents
Display open items
Run dunning job
Analytical queries, such as balance sheet108
Current Financials Solutions
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
109/130
Current Financials Solutions
109
The Target Financials Solution
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
110/130
Only base tables, algorithms, and some indexes
110
Feasibility of Financials on In-
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
111/130
Memory Technology in 2009
Modifications on SAP Financials
Removed secondary indices, sum tables and pre-calculated and
materialized tables
Reduce code complexity and simplify locks
Insert Only to enable history (change document replacement) Added stored procedureswith business functionality
European division of a retailer
ERP 2005 ECC 6.0 EhP3
5.5 TB system database size Financials:
23million headers / 8 GB in main memory
252million items / 50 GB in main memory(including inverted indices for join attributes and insert only extension)
111
In-Memory Financialson
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
112/130
BKPF
accounting documents
BSEG
sum tables
secondary indices
dunning data
change documents
CDHDR
CDPOS
MHNK
MHNDBSAD
BSAK
BSAS
BSID
BSIK
BSIS
LFC1
KNC1
GLT0
SAP ERP
112
In-Memory Financialson
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
113/130
BKPF
accounting documents
BSEG
SAP ERP
113
Reduction by a Factor 10
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
114/130
DBMS IMDB
BKPF 8.7 GB 1.5 GB
BSEG 255 GB 50 GB
Secondary
Indices255 GB -
Sum Tables 0.55 GB -
Complete 519.25 GB 51.5 GB
263.7 GB 51.5 GB
114
Booking an accounting
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
115/130
document
Insert into BKPF and BSEG only Lack of updates reduces locks
115
Dunning Run
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
116/130
Dunning Run
Dunning run determines all open and due invoices
Customer defined queries on 250M records
Current system: 20 min
New logic: 1.5 sec
In-memory column store
Parallelized stored procedures
Simplified Financials
116
Why?
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
117/130
Why?
Being able to perform the dunning run in such a short timelowers TCO
Add more functionality!
Run other jobs in the meantime! - in a multi-tenancy cloud
setup hardware must be used wisely
117
Bring Application Logic
Cl h S
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
118/130
Closer to the Storage
Layer
Select accounts to be dunned, for each:
Select open account items from BSID, for each: Calculate due date
Select dunning procedure, level and area
Create MHNK entries
Create and write dunning item tables
118
Bring Application Logic
Cl h S
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
119/130
Closer to the Storage
Layer
Select accounts to be dunned, for each:
Select open account items from BSID, for each: Calculate due date
Select dunning procedure, level and area
Create MHNK entries
Create and write dunning item tables
1 SELECT
10000 SELECTs
10000 SELECTs
31000 Entries
119
Bring Application Logic
Cl h S
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
120/130
Closer to the Storage
Layer
Select accounts to be dunned, for each:
Select open account items from BSID, for each: Calculate due date
Select dunning procedure, level and area
Create MHNK entries
Create and write dunning item tables
1 SELECT
10000 SELECTs
10000 SELECTs
31000 Entries
One single
stored
procedure
executedwithin newDB
120
Bring Application Logic
Cl h S
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
121/130
Closer to the Storage
Layer
Select accounts to be dunned, for each:
Select open account items from BSID, for each: Calculate due date
Select dunning procedure, level and area
Create MHNK entries
Create and write dunning item tables
One single
stored
procedure
executedwithin newDB
Calculated on-the-fly
121
Factor: 800x AccelerationQuantity: 250 mio items, 380k open, 200k due
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
122/130
# Operation Original Version 1 Variant 2 Variant 3
1 Select open items 0.63s 1.01s(incl. T047 & KNB5
Join)
0.6s(incl. T047 & KNB5 Join)
2Due date,
dunning level27s
Deferred to
aggregation0.5s
3
Filter 1
(verify dunninglevels)
~19s 1.1s 0.5s
4
Filter 2
(check last
dunning)
~15s 0.8s 0.4s
5 Generate MHNK(aggregate)
done in #1 1.2s Done in #1
6Generate MHND
(execute filters)done in #1 140ms Done in #1
Total~20 minutes ~1 minute ~3.0s
(#3, #4 exec. in parallel)
~1.5s(#2, #3, #4 exec. in parallel)
AdditionalInformation
Hardware:4CPU
sx6cores,
256GBRAM
y p
122
Dunning Application
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
123/130
Dunning Application
123
Dunning Application
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
124/130
Dunning Application
124
Available-to-Promise Check
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
125/130
Available to Promise Check
Can I get enough quantities of a requested product on a desireddelivery date?
Goal: Analyze and validate the potential of in-memory andhighly parallel data processing for Available-to-Promise (ATP)
Challenges Dynamic aggregation
Instant rescheduling in minutes vs. nightly batch runs
Real-time and historical analytics
Outcome
Real-time ATP checks without materialized views Ad-hoc rescheduling
No materialized aggregates
125
In-Memory Available-to-
P i
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
126/130
Promise
126
Demand Planning
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
127/130
g
Flexible analysis of demandplanning data
Zooming to choose granularity
Filter by certain products orcustomers
Browse through time spans
Combination of location-basedgeo data with planning data in anin-memory database
External factors such as thetemperature, or the level ofcloudiness can be overlaid toincorporate them in planningdecisions
127
GORFID
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
128/130
HANA for Streaming Data Processing
Use Case: In-Memory RFID DataManagement
Evaluation of SAP OER
Prototypical implementation of: RFID Read Event Repository on HANA
Discovery Service on HANA (10 Billion datarecords with ca. 3 seconds response time)
Frontends for iPhone, iPad2
Key Findings:
HANA is suited for streaming data(using bulk inserts)
Analytics on streaming data is now possible
128
GORFID: Near Real-Time
C t
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
129/130
as a Concept
Bulk load every2-3 seconds:
> 50,000 inserts/s
129
Thanks!
-
5/28/2018 Technical Deep-Dive in a Column-Oriented in-Memory Database
130/130
Questions?
Online Lecture (starting Sept 3rd) http://openhpi.de