Post on 30-Mar-2015

Ten Commandments of TCP/IP Performance

Inside Products, Inc.
www.inside-products.com
(831) 659-8360
sales@inside-products.com

Nalini Elkins
Inside Products, Inc.

Nalini_elkins@inside-products.com

How Does TCP/IP Work?

• To find problems in TCP/IP, let’s start with thinking about what TCP/IP is.

• TCP/IP helps the network to get information - packets - from one place to another through some network equipment - routers, switches, etc.

• What's so hard about this?

Diagram: Packet 1 and Packet 2 moving across the network.

Networks are Complex

• There are hundreds of thousands, even millions of connections in a network

• Finding just which one has problems is a daunting task.

Network diagnostics involves decoding multiple layers of protocols: TCP, IP, MQSeries, HTTP, LDAP, LPR/LPD, FTP, TN3270, UDP, IPv6.

Ten Commandments

1. Thou shalt monitor thy application backlog queue

2. Thou shalt not kill thy network by many short connections

3. Thou shalt drop unused connections

4. Thou shalt honor thy TCP duplicate ACKs and thy retransmissions

5. Thou shalt relate thy TCP resets to the cause

6. Thou shalt not fail in watching thy TCP attempt fails

7. Thou shalt delve deeply into UDP no ports errors

8. Thou shalt address the reason for your IP address errors

9. Thou shalt not convert thy applications directly from multi-dropped SDLC

10. Thou shalt not use two packets when one will do

Thou shalt monitor thy application backlog queue

• How does the backlog queue work? Example below:

– Application can have 5 active connections

– 10 connections can be in the backlog queue

– When the 16th connection comes in, it is rejected

• How to monitor? See above

– Portion of Netstat All display

– SNMP MIB will also show queues

• SOMAXCONN
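The 5 + 10 example above can be sketched as a toy model. This is only an illustration of the counting, not how any real TCP/IP stack is implemented; the `Listener` class and its field names are hypothetical.

```python
from collections import deque


class Listener:
    """Toy model of a listener: active connections plus a backlog queue."""

    def __init__(self, max_active=5, backlog=10):
        self.max_active = max_active   # connections the application serves at once
        self.backlog = backlog         # connections allowed to wait in the queue
        self.active = []
        self.queue = deque()
        self.rejected = 0

    def connect(self, conn_id):
        """Return 'active', 'queued', or 'rejected' for a new connection request."""
        if len(self.active) < self.max_active:
            self.active.append(conn_id)
            return "active"
        if len(self.queue) < self.backlog:
            self.queue.append(conn_id)
            return "queued"
        self.rejected += 1
        return "rejected"


listener = Listener()
results = [listener.connect(n) for n in range(1, 17)]
# Connections 1-5 are active, 6-15 wait in the backlog, the 16th is rejected.
print(results[4], results[5], results[14], results[15])
```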

What is going on?

• User is a U.S. state government.

• Using a CICS accounting application, sometimes the user gets a ‘Connection Refused’; other times, the session initiation just hangs.

• Technical support has no idea what to do.

• Users are angry!

• The problem has gotten escalated to just below the governor’s office.

Backlog Queues

• We looked at the application backlog queues.

• We see that the backlog queues are being exceeded and connections are dropped.

Connection Refused…

• When the backlog queue is exceeded, then the users get a ‘Connection Refused’ message.

• This is actually better than….

• If the user is stuck in the backlog queue, then they just see an hourglass on the terminal and they appear to be hung!

• This is even more frustrating.

The Real Problem…

• The installation tried a number of ways to speed up the application.

• The vendor of the application is being contacted for assistance.

How to Find This!

- One way is via the z/OS SNMP MIB.

- The counts are also shown per socket on the Netstat All command. If you do a ‘Netstat all (IPAddr 0.0.0.0’ then you may see all the listener connections.

- But, remember, these are all just snapshots. You should be monitoring these continuously.
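To show why a single snapshot can mislead, here is a minimal sketch that keeps the peak backlog depth per listener across several samples. The sample format (a dictionary of listener name to current queue depth) and the listener names are made up for illustration; they are not the output format of Netstat or of the SNMP MIB.

```python
def backlog_peaks(snapshots):
    """Return the highest backlog depth seen per listener across all snapshots."""
    peaks = {}
    for snap in snapshots:
        for listener, depth in snap.items():
            peaks[listener] = max(depth, peaks.get(listener, 0))
    return peaks


def listeners_at_risk(snapshots, limit, threshold=0.8):
    """Names of listeners whose peak depth reached threshold * limit."""
    peaks = backlog_peaks(snapshots)
    return sorted(name for name, peak in peaks.items() if peak >= threshold * limit)


# Hypothetical samples taken at three different times.
samples = [
    {"CICS:3000": 2, "TN3270:23": 0},
    {"CICS:3000": 9, "TN3270:23": 1},
    {"CICS:3000": 3, "TN3270:23": 0},
]
print(listeners_at_risk(samples, limit=10))
```

Looking only at the last sample, the first listener appears healthy with a depth of 3; the peak of 9 shows up only when the samples are kept over time.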

SOMAXCONN

• SOMAXCONN is used in conjunction with the backlog queue value specified in application programs. When a socket connection request arrives at the TCP/IP stack while the server is busy processing a previous request, the new request is queued, up to the limit specified with the SOMAXCONN parameter.

• When that number is exceeded, TCP/IP connection requests will time out or get refused. The value specified in the backlog queue cannot exceed that of SOMAXCONN. If it does, no error is given; the value of SOMAXCONN is used instead. SOMAXCONN is set per listener. In other words, the SOMAXCONN value is not cumulative for all listener ports.

• SOMAXCONN: Specifies maximum length for the connection request queue created by the socket call listen().

• Sample: SOMAXCONN 10

• This is the length of the backlog queue.
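The capping rule can be written down in a few lines. This is a sketch of the rule as described above, not of the z/OS stack itself. Note that Python's `socket.SOMAXCONN` is a constant from the local platform and is unrelated to the value in a z/OS TCP/IP profile; it is shown only to illustrate that an application passes its requested backlog to `listen()`.

```python
import socket


def effective_backlog(requested, somaxconn):
    """The backlog actually used: the request, silently capped at SOMAXCONN."""
    return min(requested, somaxconn)


# An application asks for 50, but the profile says SOMAXCONN 10.
print(effective_backlog(50, 10))

# On a real socket the application simply passes its request to listen();
# the stack applies its own limit without reporting an error.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))   # port 0: let the system pick a free port
server.listen(50)
print("platform SOMAXCONN constant:", socket.SOMAXCONN)
server.close()
```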

Thou shalt not kill thy network by many short connections

• Each connection establishment requires a flow of packets

• If there is a lot of data flow, better to have a long connection with many packets than multiple short connections.
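A rough count makes the cost visible. The sketch below assumes a three-packet open (SYN, SYN-ACK, ACK) and a four-packet close (FIN, ACK, FIN, ACK); real traces may show fewer close packets when a FIN and an ACK are combined. The figure of 1,000 data packets is invented for the comparison.

```python
HANDSHAKE_PACKETS = 3   # SYN, SYN-ACK, ACK
TEARDOWN_PACKETS = 4    # FIN, ACK, FIN, ACK (assumed; can be fewer when combined)


def control_packets(connections):
    """Packets spent only on opening and closing connections."""
    return connections * (HANDSHAKE_PACKETS + TEARDOWN_PACKETS)


def total_packets(data_packets, connections):
    """Data packets plus the connection setup and teardown overhead."""
    return data_packets + control_packets(connections)


one_long = total_packets(1000, connections=1)
many_short = total_packets(1000, connections=500)
print(one_long, many_short)
```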

TCP Many New Connections

• We were led to the problem of possible unneeded sessions by noticing that hundreds of new connections were made and terminated for TCP port 23.

• We investigated the possible sources for this activity.

Many Connections for Port 23

• When we look at the IP addresses with connections to port 23, some stand out.

• IP address 10.111.1.190, in particular, was responsible for 53% of the connections.
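A breakdown like this one can be produced by counting connections per remote address. The sketch below uses a made-up log: only 10.111.1.190 and its 53% share come from the slide, and the other two addresses and their counts are hypothetical.

```python
from collections import Counter


def connection_share(remote_addresses):
    """Return (address, count, percent of total) sorted by count, highest first."""
    counts = Counter(remote_addresses)
    total = sum(counts.values())
    return [(addr, n, 100.0 * n / total) for addr, n in counts.most_common()]


# Hypothetical list of remote addresses seen connecting to port 23.
log = ["10.111.1.190"] * 53 + ["10.111.1.7"] * 27 + ["10.111.2.44"] * 20
shares = connection_share(log)
for addr, n, pct in shares:
    print(f"{addr:15} {n:4} {pct:5.1f}%")
```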

Thou shalt drop unused connections

1. Many unused connections come from Voice Response Units (VRU’s)

2. Others may be from scripts which use TN3270.

3. Minor modifications may help: fewer connections at longer intervals. (Instead of 100 connections every 3 minutes, 50 connections every 10 minutes.)

4. Investigate persistent connections, connection pooling.
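The effect of the change in item 3 is easy to quantify. A short calculation, assuming the script runs around the clock:

```python
MINUTES_PER_DAY = 24 * 60


def connections_per_day(batch_size, interval_minutes):
    """Connections opened per day when batch_size are opened every interval."""
    return batch_size * MINUTES_PER_DAY // interval_minutes


before = connections_per_day(100, 3)    # 100 connections every 3 minutes
after = connections_per_day(50, 10)     # 50 connections every 10 minutes
reduction_pct = 100 * (before - after) / before
print(before, after, reduction_pct)
```

Under that assumption the change takes the daily total from 48,000 connections to 7,200, a reduction of 85%.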

Thou shalt honor thy TCP duplicate ACKs and thy retransmissions

What do the dup acks and retransmissions have in common?

• The same subnet

• The same time of day

• The same socket application

• The same route - set of hardware

ACK 101

ACK 101

ACK 201
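In the sequence above, the second ACK 101 repeats the first. A simplified sketch of counting such repeats follows. It looks only at the acknowledgment numbers in one direction of one connection; a real trace analysis would also check that the repeated ACK carries no data and does not change the window.

```python
def count_duplicate_acks(ack_numbers):
    """Count ACKs that repeat the acknowledgment number of the previous ACK."""
    duplicates = 0
    previous = None
    for ack in ack_numbers:
        if ack == previous:
            duplicates += 1
        previous = ack
    return duplicates


print(count_duplicate_acks([101, 101, 201]))
```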

TCP Retransmits By Port

• Notice that port 23 is responsible for 96% of the retransmits.

• Let’s see about remote addresses

TCP Retransmits By Remote Address

• Five remote addresses are responsible for over 80% of the retransmits.

• Duplicate acknowledgments show a similar pattern to the retransmits.
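A question like "how few addresses account for most of the retransmits?" can be answered by accumulating counts from the largest down. The sketch below uses invented counts and documentation-range addresses (192.0.2.x); none of these figures come from the actual study.

```python
from collections import Counter


def addresses_covering(retransmits_by_address, target_pct=80.0):
    """Smallest set of top addresses whose retransmits reach target_pct of the total."""
    total = sum(retransmits_by_address.values())
    if total == 0:
        return []
    covered = 0
    chosen = []
    for addr, n in Counter(retransmits_by_address).most_common():
        chosen.append(addr)
        covered += n
        if 100.0 * covered / total >= target_pct:
            break
    return chosen


# Hypothetical retransmit counts per remote address.
counts = {
    "192.0.2.1": 30, "192.0.2.2": 20, "192.0.2.3": 15, "192.0.2.4": 10,
    "192.0.2.5": 8, "192.0.2.6": 5, "192.0.2.7": 4, "192.0.2.8": 3,
    "192.0.2.9": 3, "192.0.2.10": 2,
}
top = addresses_covering(counts)
print(len(top), top)
```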

Thou shalt relate thy TCP resets to the cause

• A RESET packet is sent by TCP to abort a connection.

• May or may not be a problem - closing an idle connection is proper

• On the other hand, if an application is refusing connections because it is out of resources then you may see many RESETs.

• Let’s look at the next commandment for an example.

Thou shalt not fail in watching thy TCP attempt fails

• It may be that there is a TCP application which is not active.

• The packet to the TCP port which is not active will be responded to with a TCP RESET packet.

• The count of TCP Attempt Fails will be incremented.

Diagram: a packet sent to TCP port 445 on the host is answered with a TCP Reset packet, and the count of TCP Attempt Fails is incremented.

TCP Resets
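The behavior of a port with no active application can be observed from the client side. The sketch below is a small local experiment, not a monitoring tool: it finds a loopback port with nothing listening and tries to connect. On most systems the stack answers the SYN with a RESET, which the socket API reports as a refused connection. The helper names are hypothetical.

```python
import socket


def find_closed_port():
    """Bind to a system-chosen port, then release it so nothing is listening there."""
    probe = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    probe.bind(("127.0.0.1", 0))
    port = probe.getsockname()[1]
    probe.close()
    return port


def connect_result(host, port, timeout=5.0):
    """Try to connect and report 'connected', 'refused', or 'other error'."""
    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client.settimeout(timeout)
    try:
        client.connect((host, port))
        return "connected"
    except ConnectionRefusedError:
        return "refused"        # the SYN was answered with a RESET
    except OSError:
        return "other error"    # includes a timeout if the SYN was silently dropped
    finally:
        client.close()


result = connect_result("127.0.0.1", find_closed_port())
print(result)
```

Each refused attempt of this kind is what the receiving host counts as an attempt fail in the sense described above.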

Thou shalt delve deeply into UDP no ports errors

• UDP No Ports is equivalent to TCP Attempt Fails

• It may be that there is a UDP application which is not active.

• If all UDP sockets are active, then it may be that UDP traffic is coming in at too high a rate for a particular port.

• We have seen this error to be correlated with ICMP Destination Unreachable SubType Port Unreachable error.

Diagram: a packet sent to UDP port 161 on the host is answered with ICMP Destination Unreachable (Port Unreachable), and the count of UDP No Ports is incremented.

UDP Port Unreachable

• In the case above, no application was listening on port 161, so this generated the ICMP error.

• Since this port happened to be for UDP, then it also generated a UDP No Ports error.

• If this is just a mistake and happens thousands of times a day because some application is not properly configured…
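The relationship between the two errors can be modeled in a few lines. This is a toy simulation of the counting, with a hypothetical `UdpHost` class; it does not send real packets. The ICMP values are the standard ones: type 3 is Destination Unreachable and code 3 within it is Port Unreachable.

```python
ICMP_DEST_UNREACHABLE = 3   # ICMP type: Destination Unreachable
ICMP_PORT_UNREACHABLE = 3   # ICMP code within that type: Port Unreachable


class UdpHost:
    """Toy host that counts UDP packets arriving for ports nobody listens on."""

    def __init__(self, listening_ports):
        self.listening_ports = set(listening_ports)
        self.udp_no_ports = 0

    def receive(self, dest_port):
        """Return None if delivered, else the (type, code) of the ICMP reply."""
        if dest_port in self.listening_ports:
            return None
        self.udp_no_ports += 1
        return (ICMP_DEST_UNREACHABLE, ICMP_PORT_UNREACHABLE)


host = UdpHost(listening_ports=[53])          # nothing is listening on port 161
replies = [host.receive(161) for _ in range(3)]
print(host.udp_no_ports, replies[0])
```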

Thou shalt address the reason for your IP address errors

• Many UDP applications send packets to a broadcast address.

• The mainframe does not recognize such addresses.

• The packets are dropped and noted as address errors.

• Such packets may also come from a router if routing is misconfigured.

• We have seen millions per day.

Diagram: a packet sent to 1.2.255.255 arrives at the host, which increments its count of Address Errors and its count of IP Discards.
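A sketch of the check involved, using Python's `ipaddress` module. It assumes a /16 network (1.2.0.0/16), which is a guess made only so that 1.2.255.255 is the subnet broadcast address; the local address 1.2.3.4 is also hypothetical. A destination the host does not recognize as its own is counted as an address error.

```python
import ipaddress


def is_subnet_broadcast(destination, network):
    """True if destination is the broadcast address of the given network."""
    net = ipaddress.ip_network(network)
    return ipaddress.ip_address(destination) == net.broadcast_address


def count_address_errors(destinations, local_addresses):
    """Count packets whose destination is not one of the host's own addresses."""
    local = {ipaddress.ip_address(a) for a in local_addresses}
    return sum(1 for d in destinations if ipaddress.ip_address(d) not in local)


print(is_subnet_broadcast("1.2.255.255", "1.2.0.0/16"))

# Hypothetical arrivals at a host whose only address is 1.2.3.4.
arrivals = ["1.2.255.255", "1.2.3.4", "1.2.255.255"]
errors = count_address_errors(arrivals, local_addresses=["1.2.3.4"])
print(errors)
```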

Misdirected Packets

• This analysis only lasted a few minutes.

• Hundreds of packets were sent from many UDP NetBios connections.

• Some were improperly configured SQL servers.

• These packets were dropped by the mainframe.

• Why clog up the network and make the mainframe do extra work for no reason?

Thou shalt not convert thy applications directly from multi-dropped SDLC

Multi-dropped SDLC link: it makes sense to have small packets so that no one dominates traffic.

Diagram: packets for PU 1, PU 2, PU 3, PU 2, PU 1, PU 3 interleaved on the shared link.

TCP virtual circuit between remote host and local host: small packets mean overhead.

• IP Header: 20 bytes

• TCP Header: 20 bytes

• Data: 8 bytes
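The overhead in the packet above can be worked out directly. The comparison payload of 1,460 bytes is an assumption (a common full-sized segment on Ethernet), as is the absence of IP or TCP options.

```python
IP_HEADER = 20    # bytes, no options
TCP_HEADER = 20   # bytes, no options


def header_overhead_pct(data_bytes):
    """Percentage of the packet taken up by IP and TCP headers."""
    headers = IP_HEADER + TCP_HEADER
    return 100.0 * headers / (headers + data_bytes)


small = round(header_overhead_pct(8), 1)      # the 8-byte packet above
large = round(header_overhead_pct(1460), 1)   # an assumed full-sized segment
print(small, large)
```

With 8 bytes of data, about 83% of each packet is header; with 1,460 bytes of data, the headers are under 3%.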

Sample Application

• This application was converted directly from SDLC.

• Suffered from poor response time.

Thou shalt not use two packets when one will do

Packet 1: data

Packet 2: 0 bytes data, PSH in TCP header

• PSH bit on in the header indicates that data transmission is complete.

• The PSH bit could have been turned on in Packet 1.

• Packet 2 does not need to be sent.

• A small mistake.

• If you do it a million times a day, it becomes a big mistake.
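A pattern like this can be searched for in a trace. The sketch below assumes the trace has already been reduced to (payload length, TCP flags) pairs for one direction of one connection; that input format is hypothetical. The PSH bit is 0x08 in the TCP flags byte and ACK is 0x10. The byte estimate assumes 40 bytes of headers per empty packet, with no options.

```python
PSH = 0x08   # PSH bit in the TCP flags byte
ACK = 0x10   # ACK bit in the TCP flags byte


def redundant_push_segments(segments):
    """Indexes of empty PSH segments that follow a data segment sent without PSH."""
    redundant = []
    for i in range(1, len(segments)):
        prev_len, prev_flags = segments[i - 1]
        length, flags = segments[i]
        if length == 0 and flags & PSH and prev_len > 0 and not prev_flags & PSH:
            redundant.append(i)
    return redundant


# Hypothetical trace: data without PSH, an empty PSH packet, then data with PSH.
trace = [(8, ACK), (0, ACK | PSH), (8, ACK | PSH)]
found = redundant_push_segments(trace)
print(found)

# Cost of a million such packets a day, at 40 header bytes each.
wasted_bytes_per_day = 1_000_000 * 40
print(wasted_bytes_per_day)
```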

Tuning TCP Saves Money

– Eliminate errors and unneeded traffic and benefit from:

• Lower CPU usage

• Less frequent hardware upgrades

• Lower costs for MIPS-based software charges

• Increased bandwidth availability

• Increased technical staff productivity

• Inside the Stack is the only TCP/IP monitor focused on problem solving and tuning.

• Data from a recent Network Health Check reveal TCP, UDP, ICMP, and listener errors for both systems.

• Over 2,000 errors per 3-minute interval.

• With tuning these numbers fall significantly.

• Errors contribute to TCP/IP SRB usage.

• After a Health Check and tuning efforts lasting 2-3 weeks, the listener and UDP errors for both systems have been completely eliminated.

• The ICMP errors for both systems are nearly eliminated.

• The TCP errors have been cut to 1/4 to 1/3 of what they used to be.

• TCP dropped from 2nd highest user of CPU to 4th highest user of CPU (SRB’s).

The Silent Killer

– You may not even realize you have problems with TCP/IP.

– Just as cholesterol in the heart can be a silent killer, retransmissions, excessive connections, and unneeded traffic can clog up the network.

– And… these problems are preventable!

How Can We Help?

• ESAI and Inside Products are TCP/IP specialists.

• We can help you with:

– Training

– Tools

– Consulting

Inside the Stack

• Inside the Stack provides:

– Real time monitoring

– Historical reports

– Alerting

– Connection monitoring

– TCP stack diagnostics

– There are hundreds of reports possible!

TCP Problem Finder

• The product most directed to the serious diagnostician: TCP Problem Finder allows you to:

– Find problems in diagnostic traces - which can consist of thousands or hundreds of thousands of packets

– See the exact flow in a connection from a high level overview or the details

– We use this product ourselves in consulting. IBM subcontracts to us to help with TCP problem resolution; we could not do it without TCP Problem Finder!

Network Health Check

When you are serious about tuning your network, our Network Health Check can help to:

• Identify response time problems for applications (host or network)

• Identify response time problems for individual connections (host or network)

• Identify congestion or network traffic errors on subnets

• Identify paging, queues, high CPU usage for TCP sockets or TCP address space

• Analyze TCP profile

• Identify paging, queues, high CPU usage for individual FTPs

• Identify routes and applications with packet fragmentation

• Identify excessive idle or hanging connections

• Identify connections in frequent error status

• Identify application configuration problems (keepalive required, etc)

TCP Classes

We offer many classes in TCP/IP including:

• Security, IPSec, Policy Agent

• IPv6 (Addressing, Multi-platform)

• TCP Tuning and Performance Analysis

• Trace Analysis and Diagnostics

For more information on:

– Inside the Stack

– TCP Problem Finder

– TCP Response Time Monitor

– Availability Checker

– Network Health Check

– TCP/IP classes

Coming soon!

– EE Health Check

Please contact us!

Contact Us!

1-831-659-8360 or

1-866-464-3724

sales@inside-products.com

sales@ESAIGroup.com

Australia: Blueline Software

UK: FitzSoftware

BENELUX: Adinsec BvBa
