turbo-geth: optimising ethereum client(s).ppt - edcon.io akhunovturbo-geth... · how it started 30...

22
Turbo-Geth - optimising Ethereum client(s) Alexey Akhunov Supported by a grant from Ethereum Foundation

Upload: duongmien

Post on 05-Oct-2018

214 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Turbo-Geth: Optimising Ethereum client(s).ppt - edcon.io AkhunovTurbo-Geth... · How it started 30 November 2017 > git clone git@github.com:ethereum/go-ethereum.git > cd go-ethereum

Turbo-Geth - optimising Ethereum client(s)

Alexey Akhunov

Supported by a grant from Ethereum Foundation

Page 2: Turbo-Geth: Optimising Ethereum client(s).ppt - edcon.io AkhunovTurbo-Geth... · How it started 30 November 2017 > git clone git@github.com:ethereum/go-ethereum.git > cd go-ethereum

How it started30 November 2017

> git clone [email protected]:ethereum/go-ethereum.git> cd go-ethereum> make> ./build/bin/geth

… 5 days later

> ./build/bin/geth --cpuprofile cpu.prof

… 3 hours later

> go tool pprof cpu.prof(pprof)> png

Page 3: Turbo-Geth: Optimising Ethereum client(s).ppt - edcon.io AkhunovTurbo-Geth... · How it started 30 November 2017 > git clone git@github.com:ethereum/go-ethereum.git > cd go-ethereum

How it started

Page 4: Turbo-Geth: Optimising Ethereum client(s).ppt - edcon.io AkhunovTurbo-Geth... · How it started 30 November 2017 > git clone git@github.com:ethereum/go-ethereum.git > cd go-ethereum

Persistent storage - data structuresData on a blockchain has to be stored in a tamper-proof data structures. Merkle tree is an implementation of tamper-proof list. Its main feature is short proof of membership:

?

proof of ‘?’ membership

?

hash of concatenation

?

Page 5: Turbo-Geth: Optimising Ethereum client(s).ppt - edcon.io AkhunovTurbo-Geth... · How it started 30 November 2017 > git clone git@github.com:ethereum/go-ethereum.git > cd go-ethereum

What about proof of non-membership?

How would we prove non-membership? In an ordinary Merkle tree, such proof consists of the entire tree, which is no good. Another alternative would be a sorted Merkle tree:

proof of ‘k’ non-membership

no ‘k’

a b e g h j l nh j l n

Page 6: Turbo-Geth: Optimising Ethereum client(s).ppt - edcon.io AkhunovTurbo-Geth... · How it started 30 November 2017 > git clone git@github.com:ethereum/go-ethereum.git > cd go-ethereum

Encoding modifications

a b e g h j l n

old root before deletion

new root after deletion

Simply sorting the list is not enough to efficiently encode modifications - one needs to preserve the structure of the tree to reuse most of the nodes

Page 7: Turbo-Geth: Optimising Ethereum client(s).ppt - edcon.io AkhunovTurbo-Geth... · How it started 30 November 2017 > git clone git@github.com:ethereum/go-ethereum.git > cd go-ethereum

Merkle radix tree (trie){‘cat’:4, ‘duck’:2, ‘dog’:4, ‘bug’:6}

c da ...b

c da ...b ... u... ...o

g h... ... t u... ... g h... ...

c da ...b

u v... ...

y z... 6 y z... 4

k l... ...

y z... 2

y z... 4

hash of concatenation:

c da ...b

does not actually store digits, but hashes from lower level in fixed-sized array (length = number of digits)

Page 8: Turbo-Geth: Optimising Ethereum client(s).ppt - edcon.io AkhunovTurbo-Geth... · How it started 30 November 2017 > git clone git@github.com:ethereum/go-ethereum.git > cd go-ethereum

Merkle radix tree (trie)Merkle radix tree supports compact encoding of modifications, short proofs of membership and non-memberships. However, if the keys are relatively long, representation tends to be quite inefficient:

Note: inefficiency comes from the fact that long chains of digits for the keys are encoded by the chains of tree nodes, 1 node per digit

V1

V2

V3 V4

V5

Page 9: Turbo-Geth: Optimising Ethereum client(s).ppt - edcon.io AkhunovTurbo-Geth... · How it started 30 November 2017 > git clone git@github.com:ethereum/go-ethereum.git > cd go-ethereum

Merkle Patricia tree (trie)PATRICIA stands for “Practical Algorithm To Retrieve Information Coded In Alphanumeric” and was invented in 1968 independently by Donald Morrison and Gernot Gwehenberger.

Patricia tree in Ethereum’s version addresses the inefficiency of radix tree by having 3 types of nodes:

branches 1,2: V1 leaves 3,4,1 extensions

fullNode in geth code

shortNode in geth code

Page 10: Turbo-Geth: Optimising Ethereum client(s).ppt - edcon.io AkhunovTurbo-Geth... · How it started 30 November 2017 > git clone git@github.com:ethereum/go-ethereum.git > cd go-ethereum

Merkle radix tree (trie)If we transform the radix tree from earlier to Patricia tree, it will look like this:

1,2: V1V1

V2

V3 V4

V5

3,4: V2

2,3,3: V3

3,4,1

V4

V5

Ethereum: radix 16 (0,1,2,...,b,c,d,e,f), all keys 256 bit == 32 bytes == 64 nibbles

Page 11: Turbo-Geth: Optimising Ethereum client(s).ppt - edcon.io AkhunovTurbo-Geth... · How it started 30 November 2017 > git clone git@github.com:ethereum/go-ethereum.git > cd go-ethereum

RLP - Recursive Length Prefix encodingRLP is able to encode two types of structures: byte arrays of arbitrary size and sequences of other RLP-encoded structures. Interpretation of RLP stream:

1st byte < 128

single byte as it is

128<= 1st byte < 184

byte array of length < 56

len+128

184<= 1st byte < 192

byte array of length >= 56

len(len)+183

len as BE int

192<= 1st byte < 248

sequence of RLPs with total serialised length < 56

ser.len+192

RLP1 RLP2

RLP3 ...

248<= 1st byte

sequence of RLPs with total serialised length >= 56

len(ser.len)+247

ser.len as BE int

RLP1 RLP2

RLP3 ...

Page 12: Turbo-Geth: Optimising Ethereum client(s).ppt - edcon.io AkhunovTurbo-Geth... · How it started 30 November 2017 > git clone git@github.com:ethereum/go-ethereum.git > cd go-ethereum

Persistence of Patricia tree in geth

1,2: V1

3,4: V2

2,3,3: V3V4

V5

Key Value

1,2: V1sha3(rlp( )) 1,2: V1rlp( )

sha3(rlp( )) rlp( )3,4: V2 3,4: V2

sha3(rlp( )) rlp( )

sha3(rlp( )) rlp( )2,3,3: V3 2,3,3: V3

sha3(rlp( )) rlp( )V4 V4

sha3(rlp( )) rlp( )V5 V5

sha3(rlp( )) rlp( )

sha3(rlp( )) rlp( )

3,4,1

3,4,1 3,4,1

sha3(rlp( )) rlp( )Root Hash

Page 13: Turbo-Geth: Optimising Ethereum client(s).ppt - edcon.io AkhunovTurbo-Geth... · How it started 30 November 2017 > git clone git@github.com:ethereum/go-ethereum.git > cd go-ethereum

Block 5’032’0911 = 160

16 = 161

256 = 162

Level 3

Level 2

Level 1

Level 4Level 5Level 6Level 7Level 8Level 9

Level 9, 10, 11, 12

4’096 = 163

65’536 = 164

7 41 1’023’387 25’141

5.7m 4.4m 2.1m 0.7m 117k 39’123

18m 1.1m 31k 628 11

2.3m 72k 86

144k 1’6753’484 67

Page 14: Turbo-Geth: Optimising Ethereum client(s).ppt - edcon.io AkhunovTurbo-Geth... · How it started 30 November 2017 > git clone git@github.com:ethereum/go-ethereum.git > cd go-ethereum

Persistence of Patricia tree in turbo-geth

1,2: V1

3,4: V2

2,3,3: V3V4

V5

Key Value

3,4,1

1,1,1,2,-blockNr V1

1,3,3,4,-blockNr V2

3,2,3,3,-blockNr V3

1,3,4,1,1,-blockNr V4

1,3,4,1,3,-blockNr V5

Goes here because it is sorted

Range query: 1,*,*,*,>=-blockNr

1,1,1,2,-blockNr V1

1,3,3,4,-blockNr V2

Page 15: Turbo-Geth: Optimising Ethereum client(s).ppt - edcon.io AkhunovTurbo-Geth... · How it started 30 November 2017 > git clone git@github.com:ethereum/go-ethereum.git > cd go-ethereum

State and Virtual Machine in geth

Virtual Machine

Transient state + journal

read

write

Data queries (RPC)

read

read

commit

read

commit

State ForestState History DB

Page 16: Turbo-Geth: Optimising Ethereum client(s).ppt - edcon.io AkhunovTurbo-Geth... · How it started 30 November 2017 > git clone git@github.com:ethereum/go-ethereum.git > cd go-ethereum

State and Virtual Machine in turbo-geth

Common Interface

Virtual Machine

Transient State + journal

read

writeData queries (RPC)

State History DB

State Tree (tip of the chain)

read

read

readcommit

Page 17: Turbo-Geth: Optimising Ethereum client(s).ppt - edcon.io AkhunovTurbo-Geth... · How it started 30 November 2017 > git clone git@github.com:ethereum/go-ethereum.git > cd go-ethereum

Other differencesBoltDB (B+Tree) instead of LevelDB (Log-Structured Merge)

Guided B+tree traversal is predominant DB workload

Large DB transaction buffer (Red-Black trees) with merge-traversals between buffer and DB

Cannot maintain multiple tips of the chain -> reorgs have to rewind history and the trie

Page 18: Turbo-Geth: Optimising Ethereum client(s).ppt - edcon.io AkhunovTurbo-Geth... · How it started 30 November 2017 > git clone git@github.com:ethereum/go-ethereum.git > cd go-ethereum

Full sync (fast sync not supported) time

Page 19: Turbo-Geth: Optimising Ethereum client(s).ppt - edcon.io AkhunovTurbo-Geth... · How it started 30 November 2017 > git clone git@github.com:ethereum/go-ethereum.git > cd go-ethereum

Full sync space

Page 20: Turbo-Geth: Optimising Ethereum client(s).ppt - edcon.io AkhunovTurbo-Geth... · How it started 30 November 2017 > git clone git@github.com:ethereum/go-ethereum.git > cd go-ethereum

Disk usage (block 4.952m)

Page 21: Turbo-Geth: Optimising Ethereum client(s).ppt - edcon.io AkhunovTurbo-Geth... · How it started 30 November 2017 > git clone git@github.com:ethereum/go-ethereum.git > cd go-ethereum

Current sync hotspotsBlocks 4661960-4829034 (2 Dec 2017 - 31 Dec 2017)

Page 22: Turbo-Geth: Optimising Ethereum client(s).ppt - edcon.io AkhunovTurbo-Geth... · How it started 30 November 2017 > git clone git@github.com:ethereum/go-ethereum.git > cd go-ethereum

Current sync hotspotsBlocks 4661960-4829034 (2 Dec 2017 - 31 Dec 2017)