Performance by Design

© 2009 Quest Software, Inc. ALL RIGHTS RESERVED Performance by Design Guy Harrison Director, R&D Melbourne www.guyharrison.net

Upload: guy-harrison

Post on 09-Jun-2015


Category:

Technology



DESCRIPTION

Oracle Performance by Design - presentation given at Oracle Open World 2009

TRANSCRIPT

Page 1: Performance By Design

© 2009 Quest Software, Inc. ALL RIGHTS RESERVED

Performance by Design

Guy Harrison

Director, R&D Melbourne

www.guyharrison.net

Page 2: Performance By Design


Introductions

Page 3: Performance By Design


Page 4: Performance By Design


Save the red-shirt Toad!

• The Red-shirt Toad is NOT expendable!

Page 5: Performance By Design


Core message

• Design limits performance

• Architecture maps requirements to design

• Make sure performance requirements are specified

• Make sure architecture allows for performance

• Make sure performance requirements are realized

Page 6: Performance By Design


Elements of Performance by Design

Methodology

•Define requirements

•Prototype

•Measurement and instrumentation

•Benchmarking

Database Design

•Logical and Physical

•Indexing, partitioning, clustering

•Denormalization

Application Architecture

•Minimize requests

•Optimize requests

Page 7: Performance By Design


Methodology

Requirements analysis

• Response time

• Throughput

• Data volumes

• Hardware budget

Prototype

• Data model

• Key transactions

• Data volumes

Benchmark

• Concurrency

• Transaction rates

• Data volumes

Page 8: Performance By Design


High performance can mean different things

Speed: response time

Page 9: Performance By Design


Efficiency: power consumption

Page 10: Performance By Design


Power: throughput

Page 11: Performance By Design


Not usually easy to change architectures

Page 12: Performance By Design


Poorly defined requirements lead to this:

Page 13: Performance By Design


The Twitter lesson

Page 14: Performance By Design


Twitter growth

Page 15: Performance By Design


“Twitter is, fundamentally, a messaging system. Twitter was not architected as a messaging system, however. For expediency's sake, Twitter was built with technologies and practices that are more appropriate to a content management system.”

Page 16: Performance By Design


Patterns of database performance

[Chart: growth curves for O(1), O(log n), O(n) and O(n²) patterns, plotted for x = 1 to 97 on a y axis from 0 to 120]

Hard to distinguish patterns at low levels
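The point of the chart can be sketched numerically. The following is a minimal illustration, not taken from the slides: the cost functions use arbitrary unit costs and the sample volumes are assumptions chosen only to show how close the four patterns sit at small volumes and how far apart they drift as volume grows.

```python
import math

# Cost models for the four patterns named on the slide.
# Unit costs are arbitrary assumptions, chosen only for illustration.
PATTERNS = {
    "O(1)": lambda n: 1.0,
    "O(log n)": lambda n: math.log2(n),
    "O(n)": lambda n: float(n),
    "O(n^2)": lambda n: float(n * n),
}

def costs(n):
    """Return the modelled cost of each pattern at volume n."""
    return {name: model(n) for name, model in PATTERNS.items()}

def spread(n):
    """Ratio between the most and least expensive pattern at volume n."""
    values = list(costs(n).values())
    return max(values) / min(values)

for n in (2, 10, 100, 1000):
    row = ", ".join(f"{name}={cost:,.1f}" for name, cost in costs(n).items())
    print(f"n={n:>5}: {row}  (spread {spread(n):,.0f}x)")
```

At a test volume of 2 the patterns differ by a factor of 4, which is easily lost in measurement noise; at 1,000 the same patterns differ by a factor of a million. This is why prototypes and benchmarks need realistic data volumes.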

Page 17: Performance By Design


Validating performance can’t wait...

Database (Tables, views, partitions, etc)

Middleware layer (J2EE)

UI Layer (HTML, JavaScript, Ajax)

User adoption and growth

SQLs

Page 18: Performance By Design


Page 19: Performance By Design


Database Design

Logical Modelling

• Normalize (enough but no further)

• Data types

• Artificial keys

Logical to physical

• Subtypes

• Table types (clustered, nested, heap)

• Nulls

• Denormalization

Indexing and physical storage

• Index and clustering strategies

• Partitioning

Page 20: Performance By Design


Normalize, but not too far!

"Make everything as simple as possible, but not simpler."

Page 21: Performance By Design


Other logical design thoughts

• Artificial keys

– Generally more efficient than long composite keys

• Null values

– Not a good idea if you intend to search for “unknown” or “incomplete” values

– Null should not mean something

– But beneficial as long as you don’t need to look for them

• Data types

– Constraints on precision can sometimes reduce row lengths

– Variable length strings usually better

– Carefully consider CLOBs vs long VARCHARs

Page 22: Performance By Design


Logical to Physical: Subtypes

“Customers are people too”

Page 23: Performance By Design


Indexing, clustering and weird table types

• Lots of options:

– B*-Tree index

– Bitmap index

– Hash cluster

– Index cluster

– Nested table

– Index Organized Table

• Most often useful:

– B*-Tree (concatenated) indexes

– Bitmap indexes

– Hash clusters

Page 24: Performance By Design


Page 25: Performance By Design


Concatenated index effectiveness

SELECT cust_id
FROM sh.customers c
WHERE cust_first_name = 'Connor'
AND cust_last_name = 'Bishop'
AND cust_year_of_birth = 1976;

Logical IO by index used:

• None: 1,459

• last name: 63

• last+first name: 6

• last,first,BirthYear: 4

• last,first,birthyear,id: 3

Page 26: Performance By Design


Concatenated indexing guidelines

• Create a concatenated index for columns from a table that appear together in the WHERE clause.

• If columns sometimes appear on their own in a WHERE clause, place them at the start of the index.

• The more selective a column is, the more useful it will be at the leading end of the index (better single key lookups).

• But indexes compress better when the leading columns are less selective (better scans).

• Index skip scans can make use of an index even if the leading columns are not specified, but it’s a poor second choice to a “normal” index range scan.
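The first guideline can be tried out in miniature. The sketch below is an assumption-laden stand-in rather than the slide's own test: it uses SQLite through Python's sqlite3 module instead of Oracle, a cut-down customers table with made-up data, and a hypothetical index name (cust_name_dob_ix). SQLite's EXPLAIN QUERY PLAN plays the role that an Oracle execution plan would.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE customers (
           cust_id            INTEGER PRIMARY KEY,
           cust_first_name    TEXT,
           cust_last_name     TEXT,
           cust_year_of_birth INTEGER)"""
)
# Made-up rows, only so the optimizer has something to plan against.
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?, ?)",
    [(i, f"First{i % 50}", f"Last{i % 200}", 1950 + i % 40) for i in range(5000)],
)

QUERY = """SELECT cust_id FROM customers
           WHERE cust_first_name = ? AND cust_last_name = ?
             AND cust_year_of_birth = ?"""

def plan(sql, params):
    """Return SQLite's plan text for a query (the 4th column holds the detail)."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[3] for row in rows)

args = ("First7", "Last7", 1957)
plan_before = plan(QUERY, args)   # no usable index yet, so a full scan

# Concatenated index over the columns that appear together in the WHERE clause.
conn.execute(
    """CREATE INDEX cust_name_dob_ix
       ON customers (cust_last_name, cust_first_name, cust_year_of_birth)"""
)
plan_after = plan(QUERY, args)    # the new index is now used for the lookup

print("before:", plan_before)
print("after: ", plan_after)
```

The exact plan wording differs between SQLite versions, and Oracle's optimizer weighs its choices differently, so treat this as a way to see the principle rather than as evidence about Oracle's logical IO figures.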

Page 27: Performance By Design


Bitmap indexes

Page 28: Performance By Design


Bitmap indexes

[Chart: elapsed time (s), from 0.01 to 100, against distinct values in table, from 1 to 1,000,000, for bitmap index, B*-Tree index and full table scan]

Page 29: Performance By Design


Page 30: Performance By Design


Bitmap join performance

SELECT SUM (amount_sold)
FROM customers JOIN sales s USING (cust_id)
WHERE cust_email='[email protected]';

Logical IO by access path:

• Bitmap join index: 68

• Bitmap index: 1,524

• Full table scan: 13,480

Page 31: Performance By Design


Index overhead

Logical reads required, by number of indexes:

• 1 (PK only): 1,191

• 2: 6,671

• 3: 8,691

• 4: 10,719

• 5: 12,727

• 6: 14,285

• 7: 16,316

Page 32: Performance By Design


Hash Cluster

• Cluster key determines physical location on disk

• Single IO lookup by cluster key

• Misconfiguration leads to overflow or sparse tables

Page 33: Performance By Design


Hash Cluster vs B-tree index

Logical reads for a lookup:

• B-tree index: 3

• Hash (hashkeys=100000, size=1000): 1

• Hash (hashkeys=1000, size=50): 9

Page 34: Performance By Design


Hash cluster table scan

Logical reads for a table scan:

• Heap table: 1,458

• Hash (hashkeys=100000, size=1000): 3,854

• Hash (hashkeys=1000, size=50): 1,716

Page 35: Performance By Design


Denormalization and partitioning

• Repeating groups – VARRAYS, nested tables

• Summary tables – Materialized Views, Result cache

• Horizontal partitioning – Oracle Partition Option

• In-line aggregations – Dimensions

• Derived columns – Virtual columns

• Vertical partitioning

• Replicated columns – triggers

Page 36: Performance By Design


Summary tables

• Aggregate queries on big tables are often the most expensive

• Pre-computing them makes a lot of sense

• Balance accuracy with overhead

[Diagram: options placed between accuracy and efficiency: aggregate query, MV stale tolerated, MV on COMMIT, manual summary, result set cache]
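The accuracy versus overhead trade-off of a manual summary can be sketched as follows. This is an illustration under assumptions, not the presenter's code: SQLite stands in for Oracle, and the sales and sales_summary tables, their columns and the refresh_summary helper are all hypothetical. An Oracle materialized view would automate the refresh step that is done by hand here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (sale_id INTEGER PRIMARY KEY, prod_id INTEGER, amount INTEGER)"
)
conn.executemany(
    "INSERT INTO sales (prod_id, amount) VALUES (?, ?)",
    [(i % 5, i) for i in range(1, 1001)],
)

def refresh_summary():
    """Rebuild the hypothetical sales_summary table from the detail rows."""
    conn.execute("DROP TABLE IF EXISTS sales_summary")
    conn.execute(
        "CREATE TABLE sales_summary AS "
        "SELECT prod_id, SUM(amount) AS total_amount FROM sales GROUP BY prod_id"
    )
    conn.commit()

def total_from_detail(prod_id):
    """Accurate but expensive: aggregate the detail rows every time."""
    sql = "SELECT SUM(amount) FROM sales WHERE prod_id = ?"
    return conn.execute(sql, (prod_id,)).fetchone()[0]

def total_from_summary(prod_id):
    """Cheap but possibly stale: read the pre-computed total."""
    sql = "SELECT total_amount FROM sales_summary WHERE prod_id = ?"
    return conn.execute(sql, (prod_id,)).fetchone()[0]

refresh_summary()
fresh = total_from_summary(1) == total_from_detail(1)

# A new sale makes the summary stale until the next refresh.
conn.execute("INSERT INTO sales (prod_id, amount) VALUES (1, 500)")
stale_gap = total_from_detail(1) - total_from_summary(1)

refresh_summary()
print("fresh after build:", fresh, "| gap while stale:", stale_gap)
```

How often to refresh is the design decision: refreshing on every change maximizes accuracy, while refreshing on a schedule maximizes efficiency.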

Page 37: Performance By Design


Vertical partitioning

Page 38: Performance By Design


Physical storage options

• LOB Storage

• PCTFREE

• Compression

• Block size

• Partitioning

Page 39: Performance By Design


Page 40: Performance By Design


Application Architecture and implementation

SQL Statement Management

• Reduce requests through application caching

• Reduce “hard” parsing using bind variables

Transaction design

• Minimize lock duration

• Optimistic and Pessimistic locking strategies

Network overhead

• Array fetch and Insert

• Stored procedures

Page 41: Performance By Design


The best SQL is no SQL

• Avoid asking for the same data twice.
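One way to avoid asking for the same data twice is a small application-side cache in front of the lookup. The sketch below assumes SQLite as a stand-in database and a hypothetical products table; the cache itself is Python's standard functools.lru_cache. It suits reference data that rarely changes.

```python
import functools
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (prod_id INTEGER PRIMARY KEY, prod_name TEXT)")
conn.executemany(
    "INSERT INTO products VALUES (?, ?)", [(1, "Widget"), (2, "Gadget")]
)

@functools.lru_cache(maxsize=1024)
def product_name(prod_id):
    """Look up a product name; the database is queried only on a cache miss."""
    row = conn.execute(
        "SELECT prod_name FROM products WHERE prod_id = ?", (prod_id,)
    ).fetchone()
    return row[0] if row else None

# Five lookups, but only two distinct keys.
names = [product_name(pid) for pid in (1, 2, 1, 1, 2)]
info = product_name.cache_info()
print(names)
print("database queries:", info.misses, "| served from cache:", info.hits)
```

If the underlying rows can change, call product_name.cache_clear() at a suitable point, since a cache like this has no way of noticing updates on its own.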

Page 42: Performance By Design


11g client side cache

• CLIENT_RESULT_CACHE_SIZE: this is the amount of memory each client program will dedicate to the cache.

• Use RESULT_CACHE hint or (11GR2) table property

• Optionally set the CLIENT_RESULT_CACHE_LAG

Elapsed time (ms):

• 11g client cache: 1,250

• Program caching: 1,438

• No caching: 6,265

Page 43: Performance By Design


Parse overhead

• It’s easy enough in most programming languages to create a unique SQL for every query:

Page 44: Performance By Design


Bind variables are preferred
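The code from these two slides did not survive in the transcript, so what follows is a substitute sketch rather than the original. It assumes SQLite via Python's sqlite3 module and a hypothetical customers table. SQLite uses ? as its placeholder where Oracle uses named binds such as :cust_id, and SQLite has no shared pool, so the sketch counts distinct statement texts as a proxy for the statements Oracle would have to hard parse.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (cust_id INTEGER PRIMARY KEY, cust_name TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(i, f"Customer {i}") for i in range(1, 101)],
)

literal_texts = set()   # distinct SQL texts produced by string concatenation
bound_texts = set()     # distinct SQL texts produced with a bind variable

for cust_id in range(1, 101):
    # Anti-pattern: the key is pasted into the text, so every statement is unique.
    sql = "SELECT cust_name FROM customers WHERE cust_id = " + str(cust_id)
    literal_texts.add(sql)
    conn.execute(sql).fetchone()

    # Preferred: one constant text; only the bound value changes.
    sql = "SELECT cust_name FROM customers WHERE cust_id = ?"
    bound_texts.add(sql)
    conn.execute(sql, (cust_id,)).fetchone()

print(len(literal_texts), "distinct statements without bind variables")
print(len(bound_texts), "distinct statement with a bind variable")
```

Binding values also removes the SQL injection risk that comes with concatenating user input into statement text.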

Page 45: Performance By Design


Parse overhead reduction

[Chart: elapsed time (ms), from 0 to 1,400, split into hard parse, other parse and other, for no bind variables, bind variables and CURSOR_SHARING]

Page 46: Performance By Design


Identifying similar SQLs

See force_matching.sql at www.guyharrison.net

Page 47: Performance By Design


Transaction design

• Optimistic vs. Pessimistic

[Diagram: duration of lock under each of the two strategies]

Page 48: Performance By Design


Using ORA_ROWSCN

• Setting ROWDEPENDENCIES on the table will reduce false failures, because ORA_ROWSCN is otherwise tracked per block rather than per row
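The optimistic pattern behind ORA_ROWSCN can be sketched without Oracle. In the example below, which is an assumption-based stand-in and not the slide's code, SQLite replaces Oracle and a hypothetical row_version column plays the part of ORA_ROWSCN: the row is read without a lock, and the update succeeds only if the version is unchanged since the read.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (acct_id INTEGER PRIMARY KEY, balance INTEGER, "
    "row_version INTEGER)"   # row_version: hypothetical stand-in for ORA_ROWSCN
)
conn.execute("INSERT INTO accounts VALUES (1, 100, 1)")
conn.commit()

def read_account(acct_id):
    """Read without locking; the caller keeps the version seen at read time."""
    return conn.execute(
        "SELECT balance, row_version FROM accounts WHERE acct_id = ?", (acct_id,)
    ).fetchone()

def optimistic_update(acct_id, new_balance, version_seen):
    """Update only if the row is unchanged since it was read; True on success."""
    cur = conn.execute(
        "UPDATE accounts SET balance = ?, row_version = row_version + 1 "
        "WHERE acct_id = ? AND row_version = ?",
        (new_balance, acct_id, version_seen),
    )
    conn.commit()
    return cur.rowcount == 1

balance_a, version_a = read_account(1)   # session A reads
balance_b, version_b = read_account(1)   # session B reads the same version

first = optimistic_update(1, balance_a - 30, version_a)    # A's update succeeds
second = optimistic_update(1, balance_b + 50, version_b)   # B's check fails
final_state = read_account(1)
print(first, second, final_state)
```

The losing session has to re-read and retry, which is the price of holding no lock while the user thinks. A pessimistic design would instead lock the row at read time, for example with SELECT ... FOR UPDATE in Oracle.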

Page 49: Performance By Design


Network – stored procedures

Page 50: Performance By Design


Network traffic example

[Chart: elapsed time (ms), from 0 to 1,800, for stored procedure versus Java client on a local host and a remote host. Bar values as extracted: 344, 1,703, 297, 313]

Page 51: Performance By Design


Array processing - Fetch

Page 52: Performance By Design


Network overhead – Array processing

[Chart: logical reads and network round trips, from 0 to 40,000, against array fetch size, from 0 to 140]
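The effect of array fetch size on the number of fetch calls can be sketched with the Python DB-API. This is an illustration under assumptions: SQLite runs in-process, so there is no real network here, and the sales table is hypothetical. With a client/server driver each fetch call would correspond roughly to one network round trip, which is the quantity the chart tracks.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (sale_id INTEGER PRIMARY KEY, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)", [(i, i % 100) for i in range(1, 10001)]
)

def fetch_all(array_size):
    """Fetch every row in batches of array_size; return (row count, fetch calls)."""
    cur = conn.cursor()
    cur.arraysize = array_size      # DB-API batch size used by fetchmany()
    cur.execute("SELECT sale_id, amount FROM sales")
    rows = calls = 0
    while True:
        batch = cur.fetchmany()     # one call returns up to arraysize rows
        calls += 1
        if not batch:               # an empty batch signals the end of the result
            break
        rows += len(batch)
    return rows, calls

for size in (1, 10, 100):
    rows, calls = fetch_all(size)
    print(f"array size {size:>3}: {rows} rows in {calls} fetch calls")
```

The gains flatten out as the array size grows, so very large array sizes mostly cost memory for little further reduction in calls.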

Page 53: Performance By Design


Array Insert (Java)

Page 54: Performance By Design


Array Insert: (.NET)

Page 55: Performance By Design


Array Insert – PL/SQL
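The Java, .NET and PL/SQL code from the array insert slides is not present in the transcript. As a language-neutral illustration of the same idea, the sketch below uses Python's DB-API executemany against SQLite; the sales table, the BATCH_SIZE value and the array_insert helper are all hypothetical. The equivalent techniques on the slides' platforms would be JDBC batching, ODP.NET array binding and PL/SQL FORALL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (sale_id INTEGER PRIMARY KEY, amount INTEGER)")

rows = [(i, i * 10) for i in range(1, 1001)]
BATCH_SIZE = 100   # hypothetical array size; tune it against your own measurements

def array_insert(data, batch_size):
    """Insert data in batches; return the number of batched calls issued."""
    calls = 0
    for start in range(0, len(data), batch_size):
        batch = data[start:start + batch_size]
        # One call carries the whole batch instead of one call per row.
        conn.executemany("INSERT INTO sales VALUES (?, ?)", batch)
        calls += 1
    conn.commit()
    return calls

batched_calls = array_insert(rows, BATCH_SIZE)
inserted = conn.execute("SELECT COUNT(*) FROM sales").fetchone()[0]
print(batched_calls, "calls inserted", inserted, "rows")
```

Against a remote database, 10 batched calls in place of 1,000 single-row calls means far fewer network round trips for the same data.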

Page 56: Performance By Design


Thank you

Page 57: Performance By Design


Geek quiz stuff:

• High probability answers (keep standing if):

• Know what Alice and Wally have in common

• You know the next number in this series 3 . 1 4

• Know what “M” is in E=MC²

Page 58: Performance By Design


• Know (or can work out) your age in hex

• Have an opinion about ST vs SW

• If you know who Leonard McCoy is

• Think there is an important distinction between Nerd and Geek

• Can quote Monty Python

• …. Other than dead parrot?

• You’ve ever watched Jerry Springer

Page 59: Performance By Design


• There are more networked devices in your house than people, pets and cars

• Know the names of two of Thomas the Tank Engine’s friends

• Know the names of any of Angelina and Brad’s babies

• Low probability answers (sit down if you):

• Have a Twitter account

• # Azure is your new favourite color

Page 60: Performance By Design


• You’ve ever played Zork

• You have a favourite Dr Who companion

• Your favourite is Sarah Jane

• Know your age in binary (or can work it out in your head)

• You are proficient in some form of assembler

Page 61: Performance By Design


• # You are proficient in some form of English

• There is a Rubik’s cube in your house

• Have your own domain

• Have ever been to Azeroth

• Who is

• Know who said “Dude I am not your nemesis”

Page 62: Performance By Design


• Worn a Star Trek or Star Wars costume

• Played a game that uses a non-six sided dice

• Get email on my phone – before getting out of bed

• Calculator watch

• Binary time piece

• Was on the internet prior to the WWW

Page 63: Performance By Design


• # Met my current partner on line

• Know the next thing in this sequence: Hydrogen, Helium, Lithium, Beryllium, ….

• Know what a Gigaquads in a megaquad is

Page 64: Performance By Design


• Saw a sci-fi movie more than twice at the movies

• You cleaned up at home before going to work