High-Performance JDBC, Voxxed Bucharest 2016

Post on 15-Apr-2017

High-Performance JDBC

Vlad Mihalcea

Performance Facts

“More than half of application performance bottlenecks originate in the database”

AppDynamics - http://www.appdynamics.com/database/

Data access layers

Poor man’s JDBC

• High response time

• Low throughput

Photo by Amit Patel CC BY 2.0 https://www.flickr.com/photos/amitp/6069412747/

State of the art JDBC

• Low response time

• High throughput

Photo by zoetnet CC BY 2.0 https://www.flickr.com/photos/zoetnet/14288129197/

Response time

• connection acquisition time

• statements submission time

• statements execution time

• result set fetching time

• idle time prior to releasing the database connection

$T = t_{acq} + t_{req} + t_{exec} + t_{res} + t_{idle}$

Connection management

Connection acquisition overhead

Metric   DB_A (ms)   DB_B (ms)   DB_C (ms)   DB_D (ms)   HikariCP (ms)
min         11.174       5.441      24.468       0.860        0.001230
max        129.400      26.110      74.634      74.313        1.014051
mean        13.829       6.477      28.910       1.590        0.003458
p99         20.432       9.944      54.952       3.022        0.010263

Connection pooling

• Logical vs physical connections

• Lease vs create

• Release vs close
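The three distinctions above can be illustrated with a toy pool. This is only a sketch under stated assumptions, not how HikariCP or any other production pool works: `TinyPool`, `lease` and `idleCount` are hypothetical names, and the physical connections come from a caller-supplied factory. Each lease hands out a logical connection (a dynamic proxy) whose `close()` returns the physical connection to the pool instead of closing it.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.sql.Connection;
import java.sql.SQLException;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Supplier;

/** Hypothetical toy pool: illustrates lease/release only, not production-ready. */
final class TinyPool {
    private final BlockingQueue<Connection> idle;

    /** Creates all physical connections up front (the expensive step). */
    TinyPool(int size, Supplier<Connection> physicalConnectionFactory) {
        idle = new ArrayBlockingQueue<>(size);
        for (int i = 0; i < size; i++) {
            idle.add(physicalConnectionFactory.get());
        }
    }

    /** Leases a logical connection, waiting up to timeoutMs for a free one. */
    Connection lease(long timeoutMs) throws SQLException, InterruptedException {
        Connection physical = idle.poll(timeoutMs, TimeUnit.MILLISECONDS);
        if (physical == null) {
            throw new SQLException("Connection acquisition timed out");
        }
        AtomicBoolean released = new AtomicBoolean();
        InvocationHandler handler = (proxy, method, args) -> {
            if ("close".equals(method.getName())) {
                // Release, not close: the physical connection stays open.
                if (released.compareAndSet(false, true)) {
                    idle.offer(physical);
                }
                return null;
            }
            try {
                // Every other call is delegated to the physical connection.
                return method.invoke(physical, args);
            } catch (InvocationTargetException e) {
                throw e.getCause();
            }
        };
        return (Connection) Proxy.newProxyInstance(
            TinyPool.class.getClassLoader(),
            new Class<?>[] {Connection.class},
            handler);
    }

    /** Number of physical connections currently available for leasing. */
    int idleCount() {
        return idle.size();
    }
}
```

A real pool also validates connections, blocks use after release, and resets connection state, all of which this sketch leaves out.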

Connection pool sizing

FlexyPool

• Java EE

• Bitronix / Atomikos

• Apache DBCP / DBCP2

• C3P0

• HikariCP

• Tomcat CP

• Vibur DBCP

https://github.com/vladmihalcea/flexy-pool

FlexyPool

• concurrent connections histogram

• concurrent connection requests histogram

• connection acquisition time histogram

• connection lease time histogram

• maximum pool size histogram

• retry attempts histogram

FlexyPool – Concurrent connection requests

[Chart: connection requests (max, mean, p50, p95, p99) against sample time (index × 15 s)]

FlexyPool – Pool size growth

[Chart: max pool size (max, mean, p50, p95, p99) against sample time (index × 15 s)]

FlexyPool – Connection acquisition time

[Chart: connection acquisition time in ms (max, mean, p50, p95, p99) against sample time (index × 15 s)]

FlexyPool – Connection lease time

[Chart: connection lease time in ms (max, mean, p50, p95, p99) against sample time (index × 15 s)]

Statement Batching

statement.addBatch(
    "INSERT INTO post (title, version, id) " +
    "VALUES ('Post no. 1', 0, 1)");
statement.addBatch(
    "INSERT INTO post_comment (post_id, review, version, id) " +
    "VALUES (1, 'Post comment 1.1', 0, 1)");
// Submit both statements to the database as one batch
int[] updateCounts = statement.executeBatch();

Statement Batching (5k rows)

[Chart: time (ms) against batch size (1 to 1000) for DB_A, DB_B, DB_C and DB_D]

Oracle Statement batching

• For Statement and CallableStatement, the Oracle JDBC Driver doesn’t actually support batching, each statement being executed separately.

MySQL Statement batching

• By default, the MySQL JDBC driver doesn’t send the batched statements in a single request.

• The rewriteBatchedStatements connection property adds all batched statements to a String buffer.
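As a small illustration of the second bullet, the property can be set on the JDBC URL. This is a sketch: the `MySqlBatchUrl` class is a hypothetical helper, and the host, port and schema are placeholders supplied by the caller. Only the `rewriteBatchedStatements` property name comes from the slide.

```java
/** Hypothetical helper that builds a Connector/J URL with batch rewriting enabled. */
final class MySqlBatchUrl {

    /** Host, port and schema are caller-supplied placeholders. */
    static String url(String host, int port, String schema) {
        return "jdbc:mysql://" + host + ":" + port + "/" + schema
            + "?rewriteBatchedStatements=true";
    }
}
```

The resulting string would be passed to `DriverManager.getConnection` or to the data source configuration.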

Batch PreparedStatements

PreparedStatement postStatement = connection.prepareStatement(
    "INSERT INTO Post (title, version, id) VALUES (?, ?, ?)");

postStatement.setString(1, String.format("Post no. %1$d", 1));
postStatement.setInt(2, 0);
postStatement.setLong(3, 1);
postStatement.addBatch();

postStatement.setString(1, String.format("Post no. %1$d", 2));
postStatement.setInt(2, 0);
postStatement.setLong(3, 2);
postStatement.addBatch();

int[] updateCounts = postStatement.executeBatch();

Batch PreparedStatements

• SQL Injection Prevention

• Better performance

• Hibernate can batch statements automatically
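When batching by hand, a common pattern is to flush every N statements rather than accumulate everything in one batch. The following is a minimal sketch of that pattern, assuming the `post` table used earlier. `PostBatchInserter` and `insertPosts` are hypothetical names, and the method returns the number of `executeBatch()` calls only to make the effect visible.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

/** Hypothetical helper; names are illustrative, not from the talk. */
final class PostBatchInserter {

    /**
     * Inserts postCount rows, executing a batch every batchSize statements.
     * Returns how many times executeBatch() was called.
     */
    static int insertPosts(Connection connection, int postCount, int batchSize)
            throws SQLException {
        int batchExecutions = 0;
        try (PreparedStatement statement = connection.prepareStatement(
                "INSERT INTO post (title, version, id) VALUES (?, ?, ?)")) {
            for (int i = 1; i <= postCount; i++) {
                statement.setString(1, "Post no. " + i);
                statement.setInt(2, 0);
                statement.setLong(3, i);
                statement.addBatch();
                if (i % batchSize == 0) {
                    statement.executeBatch();
                    batchExecutions++;
                }
            }
            if (postCount % batchSize != 0) {
                // Flush the last, partially filled batch.
                statement.executeBatch();
                batchExecutions++;
            }
        }
        return batchExecutions;
    }
}
```

With 25 rows and a batch size of 10, the statement is executed three times instead of 25.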

Insert PreparedStatement batching (5k rows)

[Chart: time (ms) against batch size (1 to 1000) for DB_A, DB_B, DB_C and DB_D]

Update PreparedStatement batching (5k rows)

[Chart: time (ms) against batch size (1 to 1000) for DB_A, DB_B, DB_C and DB_D]

Delete PreparedStatement batching (5k rows)

[Chart: time (ms) against batch size (1 to 1000) for DB_A, DB_B, DB_C and DB_D]

Statement caching

Statement caching gain (one minute interval)

Database system   No caching throughput (SPM)   Caching throughput (SPM)   Percentage gain
DB_A                                  419 833                    507 286            20.83%
DB_B                                  194 837                    303 100            55.56%
DB_C                                  116 708                    166 443            42.61%
DB_D                                   15 522                     15 550             0.18%

Oracle server-side statement caching

• Hard parse

• Soft parse

• Bind peeking

• Adaptive cursor sharing (since 11g)

SQL Server server-side statement caching

• Execution plan cache

• Parameter sniffing

• Force recompile

SELECT *
FROM task
WHERE status = ?
OPTION(RECOMPILE);

PostgreSQL server-side statement caching

• Prior to 9.2 – execution plan caching

• 9.2 – optimization and planning are deferred

• The prepareThreshold connection property

MySQL server-side statement caching

• No execution plan cache

• Since Connector/J 5.0.5, PreparedStatements are emulated by default

• To activate server-side prepared statements:

  • useServerPrepStmts

  • cachePrepStmts

Client-side statement caching

• Recycling Statement, PreparedStatement or CallableStatement objects

• Reusing database cursors

Oracle implicit client-side statement caching

• Connection-level cache

• PreparedStatement and CallableStatement only

connectionProperties.put(
    "oracle.jdbc.implicitStatementCacheSize",
    Integer.toString(cacheSize)
);
dataSource.setConnectionProperties(
    connectionProperties
);

Oracle implicit client-side statement caching

• Can be disabled on a per statement basis

if (statement.isPoolable()) {
    statement.setPoolable(false);
}

Oracle explicit client-side statement caching

• Caches both metadata and execution state with data

OracleConnection oracleConnection =
    (OracleConnection) connection;
oracleConnection.setExplicitCachingEnabled(true);
oracleConnection.setStatementCacheSize(cacheSize);

Oracle explicit client-side statement caching

• Vendor-specific API

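// Look the statement up in the explicit cache by its key
PreparedStatement statement = oracleConnection
    .getStatementWithKey(SELECT_POST_KEY);
if (statement == null) {
    // Cache miss: prepare the statement as usual
    statement = connection.prepareStatement(SELECT_POST);
}
try {
    statement.setInt(1, 10);
    statement.execute();
} finally {
    // Return the statement to the cache under the same key
    ((OraclePreparedStatement) statement)
        .closeWithKey(SELECT_POST_KEY);
}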

SQL Server client-side statement caching

• Microsoft JDBC Driver 4.2 disableStatementPooling

• jTDS 1.3.1 – JDBC 3.0

JtdsDataSource jtdsDataSource =
    (JtdsDataSource) dataSource;
jtdsDataSource.setMaxStatements(cacheSize);

PostgreSQL client-side statement caching

• PostgreSQL JDBC Driver 9.4-1202 makes client-side statement caching connection-bound instead of statement-bound

• Configurable:

  • preparedStatementCacheQueries (default is 256)

  • preparedStatementCacheSizeMiB (default is 5 MiB)

• Statement.setPoolable(false) is not supported

MySQL client-side statement caching

• Configurable:

  • cachePrepStmts (default is false), required for server-side statement caching as well

  • prepStmtCacheSize (default is 25)

  • prepStmtCacheSqlLimit (default is 256)

• Statement.setPoolable(false) works for client-side statements only
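The properties listed above could be supplied as connection properties. The snippet below is a sketch of that configuration: the property names are the ones from the slides, while `MySqlStatementCachingConfig`, its method and the chosen cache sizes are hypothetical and would need tuning for a real workload.

```java
import java.util.Properties;

/** Hypothetical helper collecting the Connector/J statement caching properties. */
final class MySqlStatementCachingConfig {

    /** cacheSize and sqlLimit are caller-chosen values, not recommendations. */
    static Properties cachingProperties(int cacheSize, int sqlLimit) {
        Properties props = new Properties();
        // Use real server-side prepared statements instead of emulation.
        props.setProperty("useServerPrepStmts", "true");
        // Cache prepared statements per connection.
        props.setProperty("cachePrepStmts", "true");
        props.setProperty("prepStmtCacheSize", Integer.toString(cacheSize));
        props.setProperty("prepStmtCacheSqlLimit", Integer.toString(sqlLimit));
        return props;
    }
}
```

The returned `Properties` object can be passed, together with the JDBC URL, to `DriverManager.getConnection(url, props)`.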

ResultSet fetch size

• ResultSet - application-level cursor

statement.setFetchSize(fetchSize);

Oracle ResultSet fetch size

• Default fetch size is 10

• Oracle 10g and 11g JDBC Drivers preallocate memory for the maximum ResultSet size

  • VARCHAR2(4000) – allocates 8000 bytes (even for 1 character)

  • Memory buffers are recycled only when using Statement caching

• Oracle 12c allocates memory on demand

  • VARCHAR2(4000) – 15 bytes + the actual row column size

SQL Server ResultSet fetch size

• Adaptive buffering

• Only for the default read-only and forward-only ResultSet

• Updatable cursors use fixed data blocks

PostgreSQL ResultSet fetch size

• Fetch all – one database roundtrip

• Custom fetch size – database cursor

MySQL ResultSet fetch size

• Fetch all – one database roundtrip

• Streaming – only one record at a time
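A hedged sketch of setting a custom fetch size follows. `FetchSizeExample` and `countRows` are hypothetical names. Two driver behaviours are assumed here rather than taken from the slides: the PostgreSQL driver only uses a database cursor when auto-commit is disabled, and MySQL Connector/J switches to streaming when the fetch size is `Integer.MIN_VALUE` on a forward-only, read-only statement. Both should be checked against the driver documentation for the version in use.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

/** Hypothetical helper; the class and method names are not from the talk. */
final class FetchSizeExample {

    /** Iterates over the query result, hinting the driver to fetch fetchSize rows at a time. */
    static long countRows(Connection connection, String sql, int fetchSize)
            throws SQLException {
        boolean previousAutoCommit = connection.getAutoCommit();
        // Assumption: PostgreSQL honors the fetch size only inside a transaction.
        connection.setAutoCommit(false);
        try (PreparedStatement statement = connection.prepareStatement(
                sql, ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY)) {
            statement.setFetchSize(fetchSize);
            long rows = 0;
            try (ResultSet resultSet = statement.executeQuery()) {
                while (resultSet.next()) {
                    rows++;
                }
            }
            connection.commit();
            return rows;
        } finally {
            // Restore the caller's auto-commit setting.
            connection.setAutoCommit(previousAutoCommit);
        }
    }
}
```

Under the MySQL assumption above, passing `Integer.MIN_VALUE` as `fetchSize` would request streaming instead of fetching the whole result set at once.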

ResultSet fetch size (10k rows)

[Chart: time (ms) against fetch size (1, 10, 100, 1000, 10000) for DB_A, DB_B, DB_C and DB_D]

ResultSet size

• Avoid fetching data that is not required

• Hibernate generates the vendor-specific SQL syntax for limiting the ResultSet size

SQL:2008 ResultSet size limit

• Oracle 12c, SQL Server 2012 and PostgreSQL 8.4

SELECT
    pc.id AS pc_id, p.title AS p_title
FROM post_comment pc
INNER JOIN post p ON p.id = pc.post_id
ORDER BY pc_id
OFFSET ? ROWS
FETCH FIRST (?) ROWS ONLY;

Oracle ResultSet size limit

SELECT *
FROM (
    SELECT
        pc.id AS pc_id, p.title AS p_title
    FROM post_comment pc
    INNER JOIN post p ON p.id = pc.post_id
    ORDER BY pc_id
)
WHERE ROWNUM <= ?

SQL Server ResultSet size limit

SELECT
    TOP (?) pc.id AS pc_id, p.title AS p_title
FROM post_comment pc
INNER JOIN post p ON p.id = pc.post_id
ORDER BY pc_id

PostgreSQL and MySQL ResultSet size limit

SELECT
    pc.id AS pc_id, p.title AS p_title
FROM post_comment pc
INNER JOIN post p ON p.id = pc.post_id
ORDER BY pc_id
LIMIT ?

Statement max rows

• Vendor-independent syntax

• Might not influence the execution plan

• According to the documentation:

“If the limit is exceeded, the excess rows are silently dropped.”

statement.setMaxRows(maxRows);

Max size: 1 million vs 100 rows

[Chart: time (ms) for fetch all, fetch max rows and fetch limit, for DB_A, DB_B, DB_C and DB_D]

Fetching too many columns

• Fetching all columns (ORM tools)

SELECT *
FROM post_comment pc
INNER JOIN post p ON p.id = pc.post_id
INNER JOIN post_details pd ON p.id = pd.id

Fetching too many columns

• Fetching a custom SQL projection

SELECT pc.version
FROM post_comment pc
INNER JOIN post p ON p.id = pc.post_id
INNER JOIN post_details pd ON p.id = pd.id

Fetching too many columns performance impact

[Chart: time (ms) for all columns vs. custom projection, for DB_A, DB_B, DB_C and DB_D]

Processing Logic

• Hibernate defers connection acquisition

• Release connection as soon as possible
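One way to keep the idle time low with plain JDBC is to copy the needed data out of the ResultSet, close the connection, and only then run the application logic. The sketch below assumes a `post` table with a `title` column. `EarlyRelease`, `loadThenProcess` and the `processor` callback are hypothetical names introduced for illustration.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;
import javax.sql.DataSource;

/** Hypothetical helper; names and the query are illustrative only. */
final class EarlyRelease {

    /** Loads post titles, releases the connection, then runs the processing step. */
    static <R> R loadThenProcess(DataSource dataSource,
                                 Function<List<String>, R> processor)
            throws SQLException {
        List<String> titles = new ArrayList<>();
        try (Connection connection = dataSource.getConnection();
             PreparedStatement statement =
                 connection.prepareStatement("SELECT title FROM post");
             ResultSet resultSet = statement.executeQuery()) {
            while (resultSet.next()) {
                titles.add(resultSet.getString(1));
            }
        } // connection.close() runs here, so a pooled connection is released

        // Slow, non-database work happens without holding a connection.
        return processor.apply(titles);
    }
}
```

The try-with-resources block closes the ResultSet, the statement and the connection in reverse order before the processing callback starts.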

Questions and Answers

• Response time

• Connection management

• Batch updates

• Statement caching

• ResultSet fetching

• https://leanpub.com/high-performance-java-persistence
