Introduction to Data-Oriented Design
DESCRIPTION
Slides for a talk I gave at IT Weekend Rivne, November 2014. And I'm too lazy to add comments for each slide :)
TRANSCRIPT
![Page 1: Introduction to Data-Oriented Design](https://reader033.vdocument.in/reader033/viewer/2022061216/54b20e8f4a79596f298b456e/html5/thumbnails/1.jpg)
Introduction to Data-Oriented Design
@YaroslavBunyak Senior Software Engineer, SoftServe
![Page 2: Introduction to Data-Oriented Design](https://reader033.vdocument.in/reader033/viewer/2022061216/54b20e8f4a79596f298b456e/html5/thumbnails/2.jpg)
Programming, M**********r Do you speak it?
![Page 3: Introduction to Data-Oriented Design](https://reader033.vdocument.in/reader033/viewer/2022061216/54b20e8f4a79596f298b456e/html5/thumbnails/3.jpg)
Story
![Page 4: Introduction to Data-Oriented Design](https://reader033.vdocument.in/reader033/viewer/2022061216/54b20e8f4a79596f298b456e/html5/thumbnails/4.jpg)
Sieve of Eratosthenes
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40
41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80
81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100
![Page 11: Introduction to Data-Oriented Design](https://reader033.vdocument.in/reader033/viewer/2022061216/54b20e8f4a79596f298b456e/html5/thumbnails/11.jpg)
Sieve of Eratosthenes
Simple algorithm
Easy to implement
![Page 12: Introduction to Data-Oriented Design](https://reader033.vdocument.in/reader033/viewer/2022061216/54b20e8f4a79596f298b456e/html5/thumbnails/12.jpg)
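Sieve of Eratosthenes
// Version 1: one int per number
int array[SIZE];
array[i] = 1;
if (array[i]) ...

// Version 2: one bit per number (unsigned avoids shifting into the sign bit)
unsigned bits[(SIZE + 31) / 32];
bits[i / 32] |= 1u << (i % 32);
if (bits[i / 32] & (1u << (i % 32))) ...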
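The two fragments above only show the storage and the mark/test operations. A minimal sketch of complete sieves built on each of them might look like the following. The function names `count_primes_array` and `count_primes_bits` are made up for illustration and are not from the talk, and the code assumes `limit` is small enough that `i * i` does not overflow `std::size_t`.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Version 1: one int per candidate number.
std::size_t count_primes_array(std::size_t limit) {
    if (limit < 2) return 0;
    std::vector<int> composite(limit + 1, 0);
    std::size_t count = 0;
    for (std::size_t i = 2; i <= limit; ++i) {
        if (composite[i]) continue;          // already crossed out
        ++count;                             // i is prime
        for (std::size_t j = i * i; j <= limit; j += i)
            composite[j] = 1;                // cross out multiples of i
    }
    return count;
}

// Version 2: one bit per candidate number, packed into 32-bit words.
std::size_t count_primes_bits(std::size_t limit) {
    if (limit < 2) return 0;
    std::vector<std::uint32_t> bits(limit / 32 + 1, 0);
    std::size_t count = 0;
    for (std::size_t i = 2; i <= limit; ++i) {
        if (bits[i / 32] & (1u << (i % 32))) continue;
        ++count;
        for (std::size_t j = i * i; j <= limit; j += i)
            bits[j / 32] |= 1u << (j % 32);  // extra shift and mask per access
    }
    return count;
}
```

Both functions return the same count. The bit version does more arithmetic per access but, with a 32-bit `int`, touches 32 times less memory.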
![Page 13: Introduction to Data-Oriented Design](https://reader033.vdocument.in/reader033/viewer/2022061216/54b20e8f4a79596f298b456e/html5/thumbnails/13.jpg)
Sieve of Eratosthenes
Simple algorithm
Easy to implement
But...
unexpected results
![Page 14: Introduction to Data-Oriented Design](https://reader033.vdocument.in/reader033/viewer/2022061216/54b20e8f4a79596f298b456e/html5/thumbnails/14.jpg)
Sieve of Eratosthenes
The second implementation (bitset) is 3-5x faster than the first (array)
Even though it actually does more work
![Page 15: Introduction to Data-Oriented Design](https://reader033.vdocument.in/reader033/viewer/2022061216/54b20e8f4a79596f298b456e/html5/thumbnails/15.jpg)
Why?!.
![Page 16: Introduction to Data-Oriented Design](https://reader033.vdocument.in/reader033/viewer/2022061216/54b20e8f4a79596f298b456e/html5/thumbnails/16.jpg)
Fast Forward
![Page 17: Introduction to Data-Oriented Design](https://reader033.vdocument.in/reader033/viewer/2022061216/54b20e8f4a79596f298b456e/html5/thumbnails/17.jpg)
...
• Years have passed
• I became a software engineer
• And one day...
![Page 18: Introduction to Data-Oriented Design](https://reader033.vdocument.in/reader033/viewer/2022061216/54b20e8f4a79596f298b456e/html5/thumbnails/18.jpg)
This Graph
[Graph: CPU/Memory performance]
Source: Computer Architecture: A Quantitative Approach, by John L. Hennessy, David A. Patterson, Andrea C. Arpaci-Dusseau
![Page 19: Introduction to Data-Oriented Design](https://reader033.vdocument.in/reader033/viewer/2022061216/54b20e8f4a79596f298b456e/html5/thumbnails/19.jpg)
This Table

| | 1980 | Modern PC | Improvement |
|---|---|---|---|
| Clock speed, MHz | 6 | 3000 | +500x |
| Memory size, MB | 2 | 2000 | +1000x |
| Memory bandwidth, MB/s | 13 | 7000 (read), 2000 (write) | +540x (read), +150x (write) |
| Memory latency, ns | 225 | ~70 | +3x |
| Memory latency, cycles | 1.4 | 210 | -150x |
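The last row of the table follows from the clock speed and latency rows. As a quick check, using only the numbers from the table:

```latex
\text{latency (cycles)} = \text{latency (ns)} \times \text{clock (MHz)} \times 10^{-3}

\text{1980:}\quad 225 \times 6 \times 10^{-3} = 1.35 \approx 1.4 \text{ cycles}

\text{Modern PC:}\quad 70 \times 3000 \times 10^{-3} = 210 \text{ cycles}

\frac{210}{1.35} \approx 156 \quad \text{(the table rounds this to } 150\times \text{ worse)}
```

So memory got about 3x faster in absolute terms, but roughly 150x slower when measured in the CPU cycles spent waiting for it.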
![Page 20: Introduction to Data-Oriented Design](https://reader033.vdocument.in/reader033/viewer/2022061216/54b20e8f4a79596f298b456e/html5/thumbnails/20.jpg)
Memory Hierarchy
• CPU registers
• Cache Level 1
• Cache Level 2
• RAM
• HDD
[Diagram: CPU with L1i cache, L1d cache and L2 cache; RAM; Disk]
![Page 21: Introduction to Data-Oriented Design](https://reader033.vdocument.in/reader033/viewer/2022061216/54b20e8f4a79596f298b456e/html5/thumbnails/21.jpg)
Distance Metaphor
• L1 cache: it's on your desk, pick it up.
• L2 cache: it's on the bookshelf in your office, get up out of the chair.
• Main memory: it's on the shelf in your garage downstairs, might as well get a snack while you're down there.
• Disk: it's in, um, California. Walk there. Walk back. Really.
http://hacksoflife.blogspot.com/2011/04/going-to-california-with-aching-in-my.html
![Page 22: Introduction to Data-Oriented Design](https://reader033.vdocument.in/reader033/viewer/2022061216/54b20e8f4a79596f298b456e/html5/thumbnails/22.jpg)
Fact
• Memory access is expensive
• CPU cycles are cheap
![Page 23: Introduction to Data-Oriented Design](https://reader033.vdocument.in/reader033/viewer/2022061216/54b20e8f4a79596f298b456e/html5/thumbnails/23.jpg)
Modern Programming
• High-level languages and abstractions
• OOP
  • everywhere!
  • objects scattered throughout the address space
  • memory access patterns are unpredictable
![Page 24: Introduction to Data-Oriented Design](https://reader033.vdocument.in/reader033/viewer/2022061216/54b20e8f4a79596f298b456e/html5/thumbnails/24.jpg)
Meet Data-Oriented Design
![Page 25: Introduction to Data-Oriented Design](https://reader033.vdocument.in/reader033/viewer/2022061216/54b20e8f4a79596f298b456e/html5/thumbnails/25.jpg)
Ideas
• code transforms data
• data >> code
• hardware is not a black box
![Page 26: Introduction to Data-Oriented Design](https://reader033.vdocument.in/reader033/viewer/2022061216/54b20e8f4a79596f298b456e/html5/thumbnails/26.jpg)
Program
data → xform → data
![Page 27: Introduction to Data-Oriented Design](https://reader033.vdocument.in/reader033/viewer/2022061216/54b20e8f4a79596f298b456e/html5/thumbnails/27.jpg)
Example 1: AoS vs SoA
// Array of structures (AoS)
struct Tile
{
    bool ready;
    Data pixels; // big chunk of data
};
Tile tiles[SIZE];
vs
// Structure of arrays (SoA)
struct Image
{
    bool ready[SIZE]; // hot data
    Data pixels[SIZE]; // cold data
};
Image image;
![Page 28: Introduction to Data-Oriented Design](https://reader033.vdocument.in/reader033/viewer/2022061216/54b20e8f4a79596f298b456e/html5/thumbnails/28.jpg)
Example 1: AoS vs SoA
for (int i = 0; i < SIZE; ++i)
{
    if (tiles[i].ready)
        draw(tiles[i].pixels);
}
vs
for (int i = 0; i < SIZE; ++i)
{
    if (image.ready[i])
        draw(image.pixels[i]);
}
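For readers who want to compile the two loops, here is a minimal self-contained sketch. The 1 KB `Data` payload, the stub `draw`, the use of `std::vector` instead of fixed arrays, and the names `draw_ready_aos` and `draw_ready_soa` are all assumptions made for illustration. The slides do not define them.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Hypothetical payload standing in for the slide's `Data` type.
struct Data {
    std::array<unsigned char, 1024> bytes{};
};

// Stand-in for the slide's draw(): it only reads one byte of the payload.
inline unsigned draw(const Data& pixels) { return pixels.bytes[0]; }

// Array of structures: every flag sits right next to its large payload.
struct Tile {
    bool ready = false;
    Data pixels;
};

// Structure of arrays: flags are packed together, payloads live elsewhere.
struct Image {
    std::vector<unsigned char> ready;  // hot data (unsigned char avoids vector<bool>)
    std::vector<Data> pixels;          // cold data
};

// Scanning the flags strides over a whole Tile (more than 1 KB) per element.
std::size_t draw_ready_aos(const std::vector<Tile>& tiles) {
    std::size_t drawn = 0;
    for (const Tile& t : tiles)
        if (t.ready) { draw(t.pixels); ++drawn; }
    return drawn;
}

// Scanning the flags walks a dense byte array; payloads are touched only when ready.
std::size_t draw_ready_soa(const Image& image) {
    std::size_t drawn = 0;
    for (std::size_t i = 0; i < image.ready.size(); ++i)
        if (image.ready[i]) { draw(image.pixels[i]); ++drawn; }
    return drawn;
}
```

Both functions do the same logical work. The difference is only in how far apart consecutive `ready` flags sit in memory.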
![Page 29: Introduction to Data-Oriented Design](https://reader033.vdocument.in/reader033/viewer/2022061216/54b20e8f4a79596f298b456e/html5/thumbnails/29.jpg)
Example 1: AoS vs SoA
[Diagram: AoS memory layout vs SoA memory layout]
![Page 30: Introduction to Data-Oriented Design](https://reader033.vdocument.in/reader033/viewer/2022061216/54b20e8f4a79596f298b456e/html5/thumbnails/30.jpg)
By The Way
• Memory is loaded in chunks, not single bytes
• One such chunk is called a cache line
• Typical size: 64 or 128 bytes
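To see what cache lines mean for the `ready` flags in Example 1, one can count how many lines a scan over the flags touches. This is a rough sketch under stated assumptions: a 64-byte line, an array that starts on a line boundary, and a one-byte flag at the start of each element. The helper name `lines_touched` is hypothetical.

```cpp
#include <cstddef>

// Assumed cache line size; the slide cites 64 or 128 bytes as typical.
const std::size_t kCacheLine = 64;

// Counts the distinct cache lines touched when reading one byte from each of
// `count` elements placed `stride` bytes apart, starting at a line boundary.
inline std::size_t lines_touched(std::size_t count, std::size_t stride) {
    std::size_t lines = 0;
    std::size_t last = static_cast<std::size_t>(-1);  // no line seen yet
    for (std::size_t i = 0; i < count; ++i) {
        std::size_t line = (i * stride) / kCacheLine;  // line index of element i
        if (line != last) {
            ++lines;
            last = line;
        }
    }
    return lines;
}
```

With 1024 elements, packed one-byte flags (stride 1) fit in 16 lines, while flags separated by a 1 KB payload (stride 1025) need 1024 lines, one per flag.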
![Page 31: Introduction to Data-Oriented Design](https://reader033.vdocument.in/reader033/viewer/2022061216/54b20e8f4a79596f298b456e/html5/thumbnails/31.jpg)
Example 1: AoS vs SoA
[Diagram: AoS memory layout vs SoA memory layout]
![Page 32: Introduction to Data-Oriented Design](https://reader033.vdocument.in/reader033/viewer/2022061216/54b20e8f4a79596f298b456e/html5/thumbnails/32.jpg)
Example 2: Existence
struct Image
{
    bool ready[SIZE];
    Data pixels[SIZE];
};
Image image;
vs
Data ready_pixels[N];
// N ≤ SIZE
![Page 33: Introduction to Data-Oriented Design](https://reader033.vdocument.in/reader033/viewer/2022061216/54b20e8f4a79596f298b456e/html5/thumbnails/33.jpg)
Example 2: Existence
for (int i = 0; i < SIZE; ++i)
{
    if (image.ready[i])
        draw(image.pixels[i]);
}
vs
for (int i = 0; i < N; ++i)
{
    draw(ready_pixels[i]);
}
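The slide does not show where `ready_pixels` comes from. One possible reading, sketched below, is to gather the ready items into a dense array once, so the draw loop needs no per-item flag test. The small `Data` struct, `collect_ready`, and `draw_all` are hypothetical names. For a large payload one would probably store indices or move the data rather than copy it.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical small payload standing in for the slide's `Data` type.
struct Data {
    int id = 0;
};

// Gather step: keep only the items whose flag is set.
std::vector<Data> collect_ready(const std::vector<unsigned char>& ready,
                                const std::vector<Data>& pixels) {
    std::vector<Data> ready_pixels;
    for (std::size_t i = 0; i < ready.size() && i < pixels.size(); ++i)
        if (ready[i]) ready_pixels.push_back(pixels[i]);
    return ready_pixels;
}

// Processing step: everything in the array exists, so there is no branch.
// Summing ids is a stand-in for calling draw() on each item.
int draw_all(const std::vector<Data>& ready_pixels) {
    int checksum = 0;
    for (const Data& d : ready_pixels) checksum += d.id;
    return checksum;
}
```

The point of the pattern is that membership in the array encodes the "ready" state, so the flag disappears from the hot loop.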
![Page 34: Introduction to Data-Oriented Design](https://reader033.vdocument.in/reader033/viewer/2022061216/54b20e8f4a79596f298b456e/html5/thumbnails/34.jpg)
Example 3: Locality
vector<float> numbers; // contiguous array
float sum = 0.0f;
for (float x : numbers)
    sum += x;
vs
list<float> numbers; // linked list, nodes scattered in memory
float sum = 0.0f;
for (float x : numbers)
    sum += x;
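The two loops are textually the same, which a small template makes explicit. This is a sketch only, and `sum_all` is a name invented here. The only difference between the two instantiations is how the container lays out its elements in memory.

```cpp
#include <list>
#include <vector>

// Sums any container of floats with the same source code.
// For std::vector the elements are contiguous, so the traversal is linear in memory.
// For std::list each step follows a pointer to a separately allocated node.
template <typename Container>
float sum_all(const Container& numbers) {
    float sum = 0.0f;
    for (float x : numbers)
        sum += x;
    return sum;
}
```

Calling `sum_all` on a `std::vector<float>` and on a `std::list<float>` with the same values returns the same result. Any speed difference comes from the access pattern, not from the algorithm.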
![Page 35: Introduction to Data-Oriented Design](https://reader033.vdocument.in/reader033/viewer/2022061216/54b20e8f4a79596f298b456e/html5/thumbnails/35.jpg)
Example 3: Locality
[Diagram: array memory layout vs linked-list memory layout]
![Page 36: Introduction to Data-Oriented Design](https://reader033.vdocument.in/reader033/viewer/2022061216/54b20e8f4a79596f298b456e/html5/thumbnails/36.jpg)
Advice
• Keep your data closer to registers and cache (hot data)
• Don’t touch what you don’t have to (cold data)
• Predictable access patterns (e.g. linear arrays) are good
• What’s good for memory is good for you
![Page 37: Introduction to Data-Oriented Design](https://reader033.vdocument.in/reader033/viewer/2022061216/54b20e8f4a79596f298b456e/html5/thumbnails/37.jpg)
DOD Patterns
• A to B transform
• In-place transform
• Existence based processing
• Data normalization
  • DB design says hello!
• Task, gather, dispatch, and more...
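The slide only names the patterns. As one possible reading of the first two, here is a sketch of the same scaling operation written as an A to B transform (read one array, write another) and as an in-place transform. The function names `scale_to` and `scale_in_place` are hypothetical.

```cpp
#include <algorithm>
#include <vector>

// A to B transform: input is read-only, output goes to a separate array.
std::vector<float> scale_to(const std::vector<float>& in, float k) {
    std::vector<float> out(in.size());
    std::transform(in.begin(), in.end(), out.begin(),
                   [k](float x) { return x * k; });
    return out;
}

// In-place transform: the same array is both input and output.
void scale_in_place(std::vector<float>& data, float k) {
    for (float& x : data)
        x *= k;
}
```

The A to B form leaves the input untouched, which makes it easy to hand different input ranges to different workers. The in-place form avoids the second buffer.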
![Page 38: Introduction to Data-Oriented Design](https://reader033.vdocument.in/reader033/viewer/2022061216/54b20e8f4a79596f298b456e/html5/thumbnails/38.jpg)
DOD Benefits
• Maximum performance
  • the CPU doesn’t wait and starve
• Easy to parallelize
  • data is grouped, transforms are separated
  • ready for parallel processing; OOP isn’t
• Simpler code
  • surprise!
![Page 39: Introduction to Data-Oriented Design](https://reader033.vdocument.in/reader033/viewer/2022061216/54b20e8f4a79596f298b456e/html5/thumbnails/39.jpg)
References: Memory
• Ulrich Drepper “What Every Computer Programmer Should Know About Memory”
• Kris Kaspersky “Техника оптимизации программ. Эффективное использование памяти” (in Russian: “Program Optimization Techniques. Effective Memory Usage”)
• Christer Ericson “Memory Optimization”
• Igor Ostrovsky “Gallery of Processor Cache Effects”
![Page 40: Introduction to Data-Oriented Design](https://reader033.vdocument.in/reader033/viewer/2022061216/54b20e8f4a79596f298b456e/html5/thumbnails/40.jpg)
References: DOD
• Noel Llopis “Data-Oriented Design”, Game Developer Magazine, September 2009
• Richard Fabian “Data-Oriented Design”, book draft http://www.dataorienteddesign.com/dodmain/
• Tony Albrecht “Pitfalls of Object-Oriented Programming”
• Niklas Frykholm “Practical Examples of Data Oriented Design”, also everything on http://bitsquid.blogspot.com/
• Mike Acton “Typical C++ Bullshit”
• Data Oriented Design @ Google+
![Page 41: Introduction to Data-Oriented Design](https://reader033.vdocument.in/reader033/viewer/2022061216/54b20e8f4a79596f298b456e/html5/thumbnails/41.jpg)
Bonus: Object or not?
Q: What is a table?
A: Flat top and 4 legs.
Q: Object? (OOP)
A: Yes.
Q: If we remove one leg, is it still an object?
A: …
DOD: There is no table :)
![Page 42: Introduction to Data-Oriented Design](https://reader033.vdocument.in/reader033/viewer/2022061216/54b20e8f4a79596f298b456e/html5/thumbnails/42.jpg)
Bonus: Object or not?
Q: You are modelling a pile of sand. Is it an object?
A: Yes.
Q: What is the threshold number of particles N at which a bunch of sand particles starts to form a pile? 10? 1000? 1000000?
(i.e. can we say that N particles are just a bunch of particles, but N+1 particles become a pile of sand?)
A: …
DOD: Sand particles are data.
![Page 43: Introduction to Data-Oriented Design](https://reader033.vdocument.in/reader033/viewer/2022061216/54b20e8f4a79596f298b456e/html5/thumbnails/43.jpg)
Thank You!
![Page 44: Introduction to Data-Oriented Design](https://reader033.vdocument.in/reader033/viewer/2022061216/54b20e8f4a79596f298b456e/html5/thumbnails/44.jpg)
Q?