Quantum Performance Effects - QCon
Copyright © 2018, Oracle and/or its affiliates. All rights reserved.
“Quantum” Performance Effects: Beyond The Core
Sergey Kuksenko
Java Platform Group, Oracle
October, 2018
Safe Harbor Statement
The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, timing, and pricing of any features or functionality described for Oracle’s products may change and remains at the sole discretion of Oracle Corporation.
About me
• Java/JVM Performance Engineer at Oracle, @since 2010
• Java/JVM Performance Engineer, @since 2005
• Java/JVM Engineer, @since 1996
System Under Test
• Intel® Core™ i5-5300U [2.3 GHz] 1x2x2
– μarch: Haswell
– launched: Q1’2015
• OS: Xubuntu 18.04 (64-bits) (4.15.0-36-generic)
• Java 8 (64-bits)
• Java 11 (64-bits)
Demo code
https://github.com/kuksenko/quantum2
• Required: JMH (Java Microbenchmark Harness)
– http://openjdk.java.net/projects/code-tools/jmh/
Demo 1: How to copy 2 Mbytes.
Demo 1
int[] a = new int[512*1024];
int[] b = new int[512*1024];

@Benchmark
public void arraycopy() {
    System.arraycopy(a, 0, b, 0, a.length);
}

@Benchmark
public void reversecopy() {
    for (int i = a.length - 1; i >= 0; i--) {
        b[i] = a[i];
    }
}
arraycopy: 740 μs
reversecopy: 300 μs ??
* Using Java 8
Conclusions?
• Oracle engineers - rubbish!
– I know how to copy faster!
Shared results within team
• What I got:
– arraycopy vs reversecopy: 740 vs 300 μs
• What Bob got (on some MacBook Pro):
– arraycopy vs reversecopy: 190 vs 185 μs
• What Alice got (she already migrated to JDK11):
– arraycopy vs reversecopy: 270 vs 280 μs
• What if we copy slightly less data, “2 Mbytes - 32 bytes”:
– arraycopy vs reversecopy: 280 vs 720 μs
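For reference, the two array lengths compared above can be worked out as follows. This is a minimal plain-Java sketch without JMH, so it only checks sizes and that both copy strategies give the same result; it says nothing about timing. The class name `CopySizes` and its helper methods are hypothetical and not part of the demo repository.

```java
import java.util.Arrays;

class CopySizes {
    static final int FULL = 512 * 1024;               // 512K ints * 4 bytes = 2 Mbytes
    static final int SHORTER = FULL - 32 / Integer.BYTES; // "2 Mbytes - 32 bytes" = 8 ints fewer

    // Same loop as the reversecopy benchmark, parameterised by the arrays.
    static void reverseCopy(int[] src, int[] dst) {
        for (int i = src.length - 1; i >= 0; i--) {
            dst[i] = src[i];
        }
    }

    // Copies with both strategies and reports whether the results agree.
    static boolean sameResult(int length) {
        int[] a = new int[length];
        Arrays.setAll(a, i -> i * 31);
        int[] viaArraycopy = new int[length];
        int[] viaLoop = new int[length];
        System.arraycopy(a, 0, viaArraycopy, 0, a.length);
        reverseCopy(a, viaLoop);
        return Arrays.equals(viaArraycopy, viaLoop);
    }

    public static void main(String[] args) {
        System.out.println("full:    " + FULL + " ints = " + (long) FULL * Integer.BYTES + " bytes");
        System.out.println("shorter: " + SHORTER + " ints = " + (long) SHORTER * Integer.BYTES + " bytes");
        System.out.println("copies agree: " + (sameResult(FULL) && sameResult(SHORTER)));
    }
}
```

The functional result is identical in every configuration; only the measured time differs, which is what makes the numbers above surprising.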
spent a billion on research
• macOS doesn’t support “Large Pages”!
– Ubuntu - “Transparent Huge Pages”
• G1 is the default GC since Java 9!
– Java 8 default GC - “ParallelOld”
Conclusions:
• Large Pages - Rubbish!
• G1 GC - Cool!
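When comparing results across machines like this, it helps to confirm which collector a given JVM is actually running. The sketch below uses the standard `java.lang.management` API to list the active collectors; the class name `WhichGc` is hypothetical, and the exact collector names printed depend on the JVM and its flags.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

class WhichGc {
    // Returns the names of the collectors the running JVM reports.
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        // The Java version matters here because the default collector changed in Java 9.
        System.out.println("Java " + System.getProperty("java.version") + ": " + collectorNames());
    }
}
```

Running it once per JDK under test makes it explicit whether two measurements were taken with the same collector.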
To Be Continued ...
Demo 2: How much data?
Demo 2: The Last Jedi Refactoring
public class MyData {
    private byte[] bytes;
    private int length;

    public MyData(int length) {
        this.bytes = new byte[length];
        this.length = length;
    }

    public int length() { return length; }

    public byte[] bytes() { return bytes; }
}
Demo 2: dataSize(MyData)
MyData[] data = new MyData[256];

@Setup
public void setup() {
    Random rnd = new Random();
    Arrays.setAll(data, i -> new MyData(512 * 1024 + rnd.nextInt(64 * 1024)));
}

@Benchmark
public int dataSize() {
    int s = 0;
    for (MyData a : data) {
        s += a.length();
    }
    return s;
}
Demo 2: dataSize(byte[])
byte[][] data = new byte[256][];

@Setup
public void setup() {
    Random rnd = new Random();
    Arrays.setAll(data, i -> new byte[512 * 1024 + rnd.nextInt(64 * 1024)]);
}

@Benchmark
public int dataSize() {
    int s = 0;
    for (byte[] a : data) {
        s += a.length;
    }
    return s;
}
Demo 2: results (Java 8)
DataSize(MyData)    145 ns
DataSize(byte[])    200 ns
What if we turn on G1? (-XX:+UseG1GC)
DataSize(MyData)    145 ns
DataSize(byte[])    13045 ns  ???
Demo 2: results
What if we turn off “Large Pages”?
ParallelOld GC:
DataSize(MyData)    145 ns
DataSize(byte[])    250 ns
G1 GC:
DataSize(MyData)    145 ns
DataSize(byte[])    635 ns  ??
Demo 2: Conclusions
Conclusions:
• Large Pages - Rubbish!
• G1 GC - Rubbish!
To Be Continued ...
Why are we here?
Caches, caches everywhere
Caches in numbers (Intel Core i5-5300U)
L1 - 32K, 8-way, latency: 4 cycles
L2 - 256K, 8-way, latency: 12 cycles
L3 - 3M, 12-way, latency: 35 (and more) cycles
- cache line - 64 bytes
Demo 3: memory access cost.
Demo 3: walking on memory
Node root;

@Benchmark
@OperationsPerInvocation(COUNT)
public int walk() {
    return forward(root, COUNT);
}

public int forward(Node node, int cnt) {
    for (int i = 0; i < cnt; i++) {
        node = node.next;
    }
    return node.value;
}
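The slide does not show `Node` or how the list is built. A minimal sketch, assuming a plain two-field node and a cycle linked in shuffled order so that each `next` hop lands somewhere unrelated to the previous one, could look like this. The names `WalkSketch` and `buildShuffledCycle` are hypothetical, and the real demo code may lay out its data differently.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

class WalkSketch {
    // Assumed shape of the node used by forward(); not taken from the slides.
    static final class Node {
        Node next;
        int value;
        Node(int value) { this.value = value; }
    }

    // Allocates n nodes, then links them into one cycle in shuffled order,
    // so the link order differs from the allocation order.
    static Node buildShuffledCycle(int n, long seed) {
        List<Node> nodes = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            nodes.add(new Node(i));
        }
        Collections.shuffle(nodes, new Random(seed));
        for (int i = 0; i < n; i++) {
            nodes.get(i).next = nodes.get((i + 1) % n);
        }
        return nodes.get(0);
    }

    // Same pointer-chasing loop as on the slide.
    static int forward(Node node, int cnt) {
        for (int i = 0; i < cnt; i++) {
            node = node.next;
        }
        return node.value;
    }

    public static void main(String[] args) {
        Node root = buildShuffledCycle(1000, 42L);
        // Walking exactly n steps around an n-node cycle ends at the start node.
        System.out.println(forward(root, 1000) == root.value);
    }
}
```

Varying `n` changes the working set, which is how a walk like this can be made to fit in one cache level or spill into the next.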
[Chart: cost of one node.next step; annotated values: 2.2 ns, 5.2 ns, 15.4 ns, 35 ns]
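These timings can be compared against the cache latencies listed earlier by converting cycles to nanoseconds. The sketch below assumes the core runs at its nominal 2.3 GHz (turbo or power saving would shift the numbers); the class name `LatencyNs` is hypothetical.

```java
class LatencyNs {
    static final double CLOCK_GHZ = 2.3; // nominal clock of the system under test (assumption: no turbo)

    // One cycle lasts 1 / GHz nanoseconds.
    static double cyclesToNanos(int cycles) {
        return cycles / CLOCK_GHZ;
    }

    public static void main(String[] args) {
        String[] level = {"L1", "L2", "L3"};
        int[] cycles = {4, 12, 35}; // latencies from the "Caches in numbers" slide
        for (int i = 0; i < cycles.length; i++) {
            System.out.printf("%s: %d cycles = %.1f ns%n", level[i], cycles[i], cyclesToNanos(cycles[i]));
        }
    }
}
```

Under that assumption the three levels come out at roughly 1.7, 5.2 and 15.2 ns, which is close to the first three measured values; the 35 ns figure has no counterpart in the cache table and plausibly reflects accesses that miss all three levels.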
What about HW prefetching?
Demo 3: different mix
2.7x
4x
12.6x
Demo 4: to split or not to split?
Demo 4: Good old Unsafe!
Unsafe UNSAFE;
long from; // page alignment

@Param({"-8", "-4", "-2", "0", "2", "4", "8"})
int offset; // offset in bytes

@Benchmark
public long getlong() {
    return UNSAFE.getLong(a, from + offset);
}

@Benchmark
public void putlong() {
    UNSAFE.putLong(a, from + offset, 42L);
}
Demo 4: Results
offset    getlong    putlong    unaligned data
-8        5.0        1.8
-4        19.1       17.8       Page Split!
0         5.0        1.8
60        5.2        2.5        Line Split!
64        5.0        1.8
time, ns/op
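Which offsets split can be checked with a little address arithmetic. The sketch below assumes a 64-byte cache line, a 4096-byte page, and a page-aligned base address; the class name `SplitCheck`, its methods, and the concrete base address are hypothetical.

```java
class SplitCheck {
    static final int LINE = 64;   // cache line size in bytes
    static final int PAGE = 4096; // assumed small-page size in bytes

    // True if the byte range [addr, addr + size) spans more than one block of the given size.
    static boolean crosses(long addr, int size, int block) {
        return Math.floorDiv(addr, (long) block) != Math.floorDiv(addr + size - 1, (long) block);
    }

    // Classifies an 8-byte access starting at addr.
    static String classify(long addr) {
        if (crosses(addr, Long.BYTES, PAGE)) return "page split";
        if (crosses(addr, Long.BYTES, LINE)) return "line split";
        return "no split";
    }

    public static void main(String[] args) {
        long from = 16L * PAGE; // hypothetical page-aligned base address
        for (int offset : new int[] {-8, -4, 0, 60, 64}) {
            System.out.println(offset + ": " + classify(from + offset));
        }
    }
}
```

An 8-byte access at offset -4 straddles the page boundary and one at offset 60 straddles a line boundary inside the page, matching the two slow rows in the table.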
Demo 4: Misalignment
But wait!
Java doesn’t have misaligned data!
There is no misaligned data,
but there are misaligned operations.
Demo 4: Misalignment
Java misaligned access:
• Unsafe/VarHandle
– Buffers
– Offheap
• SIMD instructions (SSE, AVX ...)
– HotSpot intrinsics (System.arraycopy, Arrays.fill ...)
– Automatic vectorization
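As one illustration of the "Buffers" route, an off-heap `ByteBuffer` accepts a `long` read or write at any byte index. This is a minimal sketch with a hypothetical class name `MisalignedBuffer`; it assumes the buffer's base address is at least 8-byte aligned, which is typical but not guaranteed, so index 3 is only likely to be misaligned.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class MisalignedBuffer {
    // Writes a long at an arbitrary byte index of an off-heap buffer and reads it back.
    static long roundTrip(int index, long value) {
        ByteBuffer buf = ByteBuffer.allocateDirect(64).order(ByteOrder.nativeOrder());
        buf.putLong(index, value); // absolute put: index is in bytes, no alignment required
        return buf.getLong(index);
    }

    public static void main(String[] args) {
        // Index 3 is not a multiple of 8, so this 8-byte access is an unaligned operation
        // relative to the start of the buffer.
        System.out.println(roundTrip(3, 42L));
    }
}
```

The access is functionally correct at any index; the point of the demo is that its cost can change when the eight bytes straddle a line or page boundary.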
Demo 4: Arrays.fill
int from; // alignment to page boundary
int size;
int offset;
byte[] a;

@Benchmark
public void fill() {
    Arrays.fill(a, from + offset, from + offset + size, (byte)42);
}
Demo 4: Arrays.fill, 512 bytes
Demo 5: upside down
Demo 5: matrix transpose
int size;
double[][] matrix = new double[size][size];

@Benchmark
public void transpose() {
    for (int i = 1; i < size; i++) {
        for (int j = 0; j < i; j++) {
            double tmp = matrix[i][j];
            matrix[i][j] = matrix[j][i];
            matrix[j][i] = tmp;
        }
    }
}
Demo 5: results (NxN)
N      time
253    88 μs
254    350 μs
255    94 μs
256    80 μs
Cache Associativity
Critical Stride
⟨Critical Stride⟩ = ⟨Cache Size⟩ / ⟨Associativity⟩
• L1 (32K, 8-way) ⇒ 4K
• L2 (256K, 8-way) ⇒ 32K
• L3 (3M, 12-way) ⇒ 256K
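The three strides above follow directly from the formula, and the same arithmetic can be applied to the rows of the transposed matrix. The sketch below is a hedged illustration: the 16-byte array header and 8-byte object alignment are assumptions (typical for 64-bit HotSpot with compressed class pointers), and the class name `CriticalStride` and its methods are hypothetical.

```java
class CriticalStride {
    // Critical stride = cache size / associativity.
    static int criticalStride(int cacheBytes, int ways) {
        return cacheBytes / ways;
    }

    // Footprint of one double[n] row, assuming a 16-byte array header
    // and 8-byte object alignment (assumptions, not measured).
    static int rowBytes(int n) {
        int raw = 16 + n * Double.BYTES;
        return (raw + 7) & ~7;
    }

    public static void main(String[] args) {
        System.out.println("L1: " + criticalStride(32 * 1024, 8));
        System.out.println("L2: " + criticalStride(256 * 1024, 8));
        System.out.println("L3: " + criticalStride(3 * 1024 * 1024, 12));
        for (int n = 253; n <= 256; n++) {
            System.out.println("double[" + n + "] row: " + rowBytes(n) + " bytes");
        }
    }
}
```

Under these assumptions a `double[254]` row occupies exactly 2048 bytes, half of the L1 critical stride, so if rows are laid out back to back every second row starts at the same offset within the stride. That would be consistent with N = 254 being the slow case in Demo 5, though the slides themselves do not spell out this calculation.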
Demo 2: How much data? (cont.)
“Critical stride” hit
Let’s count:
• G1 GC
– all arrays are aligned to 1M (and therefore to 256K, 32K, and 4K)
• ParallelOld GC
– 256 arrays ⇒ 254 different “index sets” in L3
– 256 arrays ⇒ 251 different “index sets” in L2
– 256 arrays ⇒ 62 different “index sets” in L1
– number of hits to L1 index sets: 10, 9, 8, 8, 8, 7, 7...
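Counting “index sets” can be sketched as follows. This is a simplified model that assumes a 64-byte cache line and a plain modulo mapping from address to set (a real L3 may hash addresses across slices); the class name and the sample addresses are made up for illustration:

```java
import java.util.HashSet;
import java.util.Set;

// Sketch: how many distinct cache sets do the start addresses of a group of arrays map to?
// Assumption: set index = (address % criticalStride) / lineSize, with a 64-byte line.
final class IndexSets {
    static final int LINE = 64; // assumed cache line size in bytes

    /** Set index of an address for a cache with the given critical stride. */
    static long setIndex(long address, long criticalStride) {
        return (address % criticalStride) / LINE;
    }

    /** Number of distinct set indexes hit by the given start addresses. */
    static int distinctSets(long[] addresses, long criticalStride) {
        Set<Long> seen = new HashSet<>();
        for (long a : addresses) {
            seen.add(setIndex(a, criticalStride));
        }
        return seen.size();
    }

    public static void main(String[] args) {
        // Hypothetical layout: 256 arrays, each starting on a 1M boundary.
        long[] aligned = new long[256];
        for (int i = 0; i < aligned.length; i++) {
            aligned[i] = 0x6c7200000L + i * (1L << 20);
        }
        System.out.println("L1 sets: " + distinctSets(aligned, 4 * 1024));
        System.out.println("L2 sets: " + distinctSets(aligned, 32 * 1024));
        System.out.println("L3 sets: " + distinctSets(aligned, 256 * 1024));
    }
}
```

With 1M-aligned arrays every start address lands in the same set at every level, which is the worst case for this kind of conflict.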
Demo 6: the rich get richer
Demo 6: Walking dead threads
@Benchmark
@Group("pair")
@OperationsPerInvocation(COUNT)
public int bob() {
    return forward(root, COUNT);
}

@Benchmark
@Group("pair")
@OperationsPerInvocation(COUNT)
public int alice() {
    return forward(root, COUNT);
}

Each thread has its own root and independent data.
Demo 6: 128K per thread
Iteration 1:
  bob:   5.246 ns/op
  alice: 5.241 ns/op
Iteration 2:
  bob:   5.254 ns/op
  alice: 5.272 ns/op
Iteration 3:
  bob:   5.233 ns/op
  alice: 5.244 ns/op
Iteration 4:
  bob:   5.244 ns/op
  alice: 5.232 ns/op
Demo 6: 1M per thread
Iteration 1:
  bob:   14.495 ns/op
  alice: 14.614 ns/op
Iteration 2:
  bob:   14.289 ns/op
  alice: 14.331 ns/op
Iteration 3:
  bob:   14.242 ns/op
  alice: 14.296 ns/op
Iteration 4:
  bob:   14.332 ns/op
  alice: 14.332 ns/op
Demo 6: 2M per thread
Iteration 1:
  bob:   17.199 ns/op
  alice: 48.845 ns/op
Iteration 2:
  bob:   46.777 ns/op
  alice: 20.850 ns/op
Iteration 3:
  bob:   17.046 ns/op
  alice: 48.686 ns/op
Iteration 4:
  bob:   46.422 ns/op
  alice: 20.704 ns/op
Fight for LLC!
$$\sim 1 \le \frac{\text{Total Working Set}}{\text{LLC size}} \le\ \sim 2.5$$
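The rule of thumb above (total working set over LLC size roughly between 1 and 2.5) can be checked against the three runs of this demo. A minimal sketch, assuming a 3M LLC as on the system under test and treating the approximate bounds as exact; the class and method names are illustrative:

```java
// Sketch: does the combined working set fall into the "fight for LLC" zone?
// Assumption: the approximate bounds ~1 and ~2.5 are used as hard limits here.
final class LlcContention {
    static double ratio(long totalWorkingSetBytes, long llcBytes) {
        return (double) totalWorkingSetBytes / llcBytes;
    }

    static boolean inContentionZone(long totalWorkingSetBytes, long llcBytes) {
        double r = ratio(totalWorkingSetBytes, llcBytes);
        return r >= 1.0 && r <= 2.5;
    }

    public static void main(String[] args) {
        final long K = 1024;
        final long M = 1024 * K;
        final long llc = 3 * M;
        long[] perThread = {128 * K, 1 * M, 2 * M};
        for (long size : perThread) {
            long total = 2 * size; // two threads: bob and alice
            System.out.println(size / K + "K per thread: ratio = "
                    + ratio(total, llc) + ", contention = " + inContentionZone(total, llc));
        }
    }
}
```

Only the 2M-per-thread case lands inside the range, which matches the run where bob and alice alternate between fast and slow iterations.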
Demo 7: Bytes histogram
Demo 7: count bytes frequency
byte[] source; // SIZE == 16 * K;

@Benchmark
public int[] count1() {
    int[] table = new int[256];
    for (byte v : source) {
        table[v & 0xFF]++;
    }
    return table;
}
13.7 μs
What if the data is unevenly distributed?
Results
Store Buffer
[Diagram: CPU ↔ L1 Cache, 4-5 clocks; a Store Buffer sits between the CPU and the L1 Cache]
Store Forwarding
Store A;
Load B;
Load A;

Store Buffer: Store A;

• Load B: no “B” in Store Buffer. Execute, even before the Store!
• Load A: “A” exists in Store Buffer. What to do?
Hit to “Store Buffer”
• Wait until “Store A” reaches L1 (expensive)
• Take value from Store Buffer (a.k.a. “Store Forwarding”)
Let’s do this
@Benchmark
public int[] count2() {
    int[] table0 = new int[256];
    int[] table1 = new int[256];
    for (int i = 0; i < source.length; ) {
        table0[source[i++] & 0xFF]++;
        table1[source[i++] & 0xFF]++;
    }
    for (int i = 0; i < 256; i++) {
        table0[i] += table1[i];
    }
    return table0;
}
... and this

@Benchmark
public int[] count4() {
    int[] table0 = new int[256];
    int[] table1 = new int[256];
    int[] table2 = new int[256];
    int[] table3 = new int[256];
    for (int i = 0; i < source.length; ) {
        table0[source[i++] & 0xFF]++;
        table1[source[i++] & 0xFF]++;
        table2[source[i++] & 0xFF]++;
        table3[source[i++] & 0xFF]++;
    }
    for (int i = 0; i < 256; i++) {
        table0[i] += table1[i] + table2[i] + table3[i];
    }
    return table0;
}
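The multi-table variants assume that the source length is a multiple of the number of tables (true for 16K). A minimal sketch of the same idea for an arbitrary table count, with a tail loop for leftover bytes; `countN` is a hypothetical helper that is not part of the talk, and whether the JIT compiles it as well as the hand-unrolled versions has not been measured here:

```java
import java.util.Arrays;
import java.util.Random;

// Sketch: spread consecutive increments over n tables so that back-to-back
// updates of the same counter are less likely, then merge the tables.
final class HistogramN {
    /** Baseline: single table, as in count1. */
    static int[] count1(byte[] source) {
        int[] table = new int[256];
        for (byte v : source) {
            table[v & 0xFF]++;
        }
        return table;
    }

    /** Hypothetical generalization of count2/count4 to n >= 1 tables. */
    static int[] countN(byte[] source, int n) {
        int[][] tables = new int[n][256];
        int limit = source.length - source.length % n;
        int i = 0;
        while (i < limit) {
            for (int t = 0; t < n; t++) {
                tables[t][source[i++] & 0xFF]++;
            }
        }
        while (i < source.length) { // leftover bytes when length % n != 0
            tables[0][source[i++] & 0xFF]++;
        }
        for (int t = 1; t < n; t++) {
            for (int v = 0; v < 256; v++) {
                tables[0][v] += tables[t][v];
            }
        }
        return tables[0];
    }

    public static void main(String[] args) {
        byte[] source = new byte[16 * 1024];
        new Random(42).nextBytes(source);
        for (int n : new int[] {1, 2, 4}) {
            boolean same = Arrays.equals(count1(source), countN(source, n));
            System.out.println(n + " table(s), same histogram: " + same);
        }
    }
}
```

This only checks that the variants produce the same histogram; timing them needs a proper benchmark harness such as the one used in the talk.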
Results
Demo 8: bytes ⇔ int
ByteBuffer buf = ByteBuffer.allocateDirect(4);

@Benchmark
public int bytesToInt() { // 13.2 ns
    buf.put(0, b0);
    buf.put(1, b1);
    buf.put(2, b2);
    buf.put(3, b3);
    return buf.getInt(0);
}

@Benchmark
public int intToBytes() { // 7.9 ns
    buf.putInt(0, i0);
    return buf.get(0) + buf.get(1) +
           buf.get(2) + buf.get(3);
}
Demo 8: Store Forwarding success
[Diagram: one int Store, then four byte Loads from the same four bytes]
Demo 8: Store Forwarding fail
[Diagram: four byte Stores, then one int Load covering the same four bytes]
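The two access patterns can be reproduced outside a benchmark harness to see exactly which stores and loads are involved. A minimal sketch, assuming the default big-endian byte order of a freshly allocated `ByteBuffer`; the class name and the sample values are illustrative, and this checks only the values, not the timing:

```java
import java.nio.ByteBuffer;

// Sketch of the two patterns from Demo 8.
// bytesToInt: four 1-byte stores followed by one 4-byte load (forwarding fails).
// intToBytes: one 4-byte store followed by four 1-byte loads (forwarding succeeds).
final class BytesIntDemo {
    private final ByteBuffer buf = ByteBuffer.allocateDirect(4);

    int bytesToInt(byte b0, byte b1, byte b2, byte b3) {
        buf.put(0, b0);
        buf.put(1, b1);
        buf.put(2, b2);
        buf.put(3, b3);
        return buf.getInt(0); // one wide load over four narrow stores
    }

    int intToBytes(int i0) {
        buf.putInt(0, i0);
        // four narrow loads, each fully contained in the one wide store
        return buf.get(0) + buf.get(1) + buf.get(2) + buf.get(3);
    }

    public static void main(String[] args) {
        BytesIntDemo demo = new BytesIntDemo();
        int packed = demo.bytesToInt((byte) 1, (byte) 2, (byte) 3, (byte) 4);
        System.out.println(Integer.toHexString(packed));
        System.out.println(demo.intToBytes(packed));
    }
}
```

A load can be served from the Store Buffer only when a single buffered store contains all the bytes it needs, which is why the wide load over four narrow stores is the slow case.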
Demo 1: back to arraycopy
Demo 1: looking into asm
arraycopy:
loop: vmovdqu -0x38(%rdi,%rdx,8),%ymm0
      vmovdqu %ymm0,-0x38(%rsi,%rdx,8)
      vmovdqu -0x18(%rdi,%rdx,8),%ymm1
      vmovdqu %ymm1,-0x18(%rsi,%rdx,8)
      add    $0x8,%rdx
      jle    loop

reversecopy:
loop: vmovdqu -0xc(%r8,%rbx,4),%ymm0
      vmovdqu %ymm0,-0xc(%r10,%rbx,4)
      add    $0xfffffff8,%ebx
      cmp    $0x6,%ebx
      jg     loop
Demo 1: What about memory layout?
• ParallelOld GC
– AddressOf(a) == 0x76d890628
– AddressOf(b) == 0x76da90638
– AddressOf(b) − AddressOf(a) == 2 MB + 16
• G1 GC
– AddressOf(a) == 0x6c7200000
– AddressOf(b) == 0x6c7500000
– AddressOf(b) − AddressOf(a) == 3 MB
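The two differences can be verified from the addresses on the slide. A minimal sketch; the class and method names are illustrative:

```java
// Sketch: express the distance between two array addresses as "<n>M + <rest>".
final class AddressDiff {
    static final long M = 1L << 20;

    static String describe(long addressA, long addressB) {
        long diff = addressB - addressA;
        return (diff / M) + "M + " + (diff % M);
    }

    /** Lower 12 bits of the distance, i.e. the distance modulo 4K. */
    static long low12(long addressA, long addressB) {
        return (addressB - addressA) & 0xFFF;
    }

    public static void main(String[] args) {
        // Addresses as reported on the slide.
        System.out.println("ParallelOld: " + describe(0x76d890628L, 0x76da90638L));
        System.out.println("G1:          " + describe(0x6c7200000L, 0x6c7500000L));
        System.out.println("ParallelOld low 12 bits: " + low12(0x76d890628L, 0x76da90638L));
        System.out.println("G1 low 12 bits:          " + low12(0x6c7200000L, 0x6c7500000L));
    }
}
```

The interesting part is the distance modulo 4K: 16 bytes under ParallelOld and 0 under G1.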
Demo 1: 4K-aliasing
HW uses the 12 lower bits of the address to detect Store Buffer conflicts.
• addresses differ by a multiple of 4K (same lower 12 bits)
• “Load” can’t bypass “Store”
• “Store Forwarding” can’t help: the addresses are different.
HW recovery:
– wait until “Store” is finished
– “clear pipeline” in case of speculation
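The conflict check described above can be modelled in a few lines. This is a deliberately simplified sketch, assuming that both accesses have the same width and that the hardware compares only the lower 12 bits of the byte ranges; real CPUs differ in the details, and the names below are hypothetical:

```java
// Simplified model of 4K-aliasing: a load looks like it depends on an earlier
// store when their byte ranges overlap modulo 4096 although the real addresses differ.
final class FourKAliasing {
    static final int PAGE = 4096;

    static boolean aliases(long loadAddr, long storeAddr, int width) {
        if (Math.abs(loadAddr - storeAddr) < width) {
            return false; // a real overlap: true dependency, not a false alias
        }
        long d = Math.floorMod(loadAddr - storeAddr, (long) PAGE);
        return d < width || d > PAGE - width;
    }

    public static void main(String[] args) {
        final long M = 1L << 20;
        final long a = 0; // only the relative layout matters in this model
        // 32-byte load at A + 32 against the preceding 32-byte store at B.
        System.out.println("B = A + 2M + 16: " + aliases(a + 32, a + 2 * M + 16, 32));
        System.out.println("B = A + 3M:      " + aliases(a + 32, a + 3 * M, 32));
    }
}
```

In this model the ParallelOld layout (distance 2M + 16) produces a false conflict on every iteration, while the G1 layout (distance 3M) does not.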
Demo 1: arraycopy trace
Load A;
Store B;
Load A + 32;
Store B + 32;
Load A + 64;
Store B + 64;
...
B == A + 2M + 16;

                      address % 4096
Load  A;              0
Store A + 2M + 16;    16
Load  A + 32;         32
Store A + 2M + 48;    48
Load  A + 64;         64
Store A + 2M + 80;    80
...
4K-aliasing
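The trace can be generated for any distance between the two arrays. A minimal sketch, assuming 32-byte copy steps as in the trace and an array A that starts at offset 0 within its 4K page; the class and method names are illustrative:

```java
// Sketch: offsets modulo 4096 of the loads and stores issued by a 32-byte-step copy.
final class ArraycopyTrace {
    static final int STEP = 32;   // bytes moved per load/store pair
    static final int PAGE = 4096;

    /** Row i holds {load offset, store offset} modulo 4096 for copy step i. */
    static long[][] pageOffsets(long a, long b, int steps) {
        long[][] rows = new long[steps][2];
        for (int i = 0; i < steps; i++) {
            rows[i][0] = (a + (long) i * STEP) % PAGE;
            rows[i][1] = (b + (long) i * STEP) % PAGE;
        }
        return rows;
    }

    public static void main(String[] args) {
        final long M = 1L << 20;
        long a = 0;
        long b = a + 2 * M + 16; // the ParallelOld layout from the slides
        for (long[] row : pageOffsets(a, b, 3)) {
            System.out.println("Load  % 4096 = " + row[0]);
            System.out.println("Store % 4096 = " + row[1]);
        }
    }
}
```

Each 32-byte load starts 16 bytes after the previous store modulo 4K, so it overlaps that store's range in the lower 12 bits.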
Demo 1: 1K copying
[Chart annotations: “Everything fine”, “Misaligned access”, “4K-aliasing”]
Demo 1: too many details
Demo 1: It’s not the end
Turned on “Large Pages”
Address difference is 1M
Demo 1: All together
Data copying performance depends on how the data is located in memory:
• Line split
• Page split
• 4K-aliasing
• “1M & large pages aliasing” (still didn’t find an explanation)
Conclusion
To read!
• “What Every Programmer Should Know About Memory”, Ulrich Drepper
• “Computer Architecture: A Quantitative Approach”, John L. Hennessy, David A. Patterson
• CPU vendors’ documentation
• http://www.agner.org/optimize/
• etc.
Thank you!
Q & A ?