Load Balancing and Intelligent Load Balancing Jesús González Escalation Engineer


Objectives

• Deep understanding of load balancing architecture

• Troubleshooting techniques

• Identify the root cause of any LB problem

Agenda

• Architecture
  • Data Collector
  • Updates - Examples
  • Performance Data Helper
  • Citrix and the PDH

• Where can it go wrong?
  • Black hole effect (Load Throttling - Intelligent Load Balancing)
  • Some servers get all the connections
  • Full load after installing an MUI

• Troubleshooting

Architecture

• Data Collector (DC) - Dynamic Store (DS)
• LmsSS.dll - IMA Subsystem
• MFRules.dll - XA counters
• LMS20Rules.dll - System counters
• PDH.dll - Performance Data Helper

Data Collector Updates

• The Data Collector is updated every 30 s
  • Only when the load evaluator value has changed by more than 5%
  • Every 5 minutes a full update is sent

• An update is also sent on every connection logon or logoff
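The update rules above can be sketched as a small decision function. This is only an illustration, not the IMA implementation: the names `Update` and `decideUpdate` are hypothetical, the function is assumed to be called at each 30 s check, and the 5% threshold is assumed to mean percentage points of the load evaluator value.

```cpp
#include <cassert>
#include <cstdlib>

// Values taken from the slide; the constant names are hypothetical.
constexpr int kFullUpdateSec    = 300;  // a full update goes out every 5 minutes
constexpr int kMinChangePercent = 5;    // smaller changes are not reported

enum class Update { None, Delta, Full };

// Decides what a server would send to the Data Collector at one 30 s check.
//   secondsSinceFullUpdate : time elapsed since the last full update
//   lastSentPercent        : load evaluator value last reported (0..100)
//   currentPercent         : load evaluator value right now (0..100)
//   sessionEvent           : true if a logon or logoff has just happened
Update decideUpdate(int secondsSinceFullUpdate, int lastSentPercent,
                    int currentPercent, bool sessionEvent)
{
    assert(lastSentPercent >= 0 && lastSentPercent <= 100);
    assert(currentPercent >= 0 && currentPercent <= 100);

    if (secondsSinceFullUpdate >= kFullUpdateSec) {
        return Update::Full;   // periodic full update
    }
    if (sessionEvent) {
        return Update::Delta;  // logon or logoff always triggers an update
    }
    if (std::abs(currentPercent - lastSentPercent) > kMinChangePercent) {
        return Update::Delta;  // change is bigger than 5%
    }
    return Update::None;       // change too small to report
}
```

Under these assumptions a move from 40% to 43% stays silent, while a move from 40% to 46% is reported at the next check.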

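Data Collector Updates: Server load - CPU Load example

[Figure: timeline of CPU samples on the XenApp server and the Server Load % sent to the Data Collector; time axis marked at 30 s, 60 s and 5 minutes. X marks a sample that has dropped out of the window.]

CPU samples in the window               Server Load %
15 20 23 25 23 50 98 34 45 56           38
 X 20 23 25 23 50 98 34 45 56 40        41
 X 23 25 23 50 98 34 45 56 40 12        40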
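The three Server Load % values in this example (38, 41, 40) equal the mean of the ten most recent CPU samples, rounded down. A minimal sketch of that sliding-window average follows. The function name `windowedLoad` is hypothetical, and the window size of ten is inferred from the example rather than stated on the slide.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Returns the truncated mean of the last `window` samples (CPU % values).
// If fewer samples exist, all of them are averaged; no samples gives 0.
int windowedLoad(const std::vector<int>& samples, std::size_t window)
{
    assert(window > 0);
    if (samples.empty()) {
        return 0;
    }
    const std::size_t count = std::min(window, samples.size());
    const int sum = std::accumulate(
        samples.end() - static_cast<std::ptrdiff_t>(count), samples.end(), 0);
    return sum / static_cast<int>(count);  // integer division rounds down
}
```

Feeding it the first ten samples gives 38. Appending 40 and then 12 moves the window along and gives 41 and 40.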

Data Collector Updates: XA CPU Load != Task Manager CPU Load

• CPU load

• XA = 25%

• Task Manager = 1%

Ctxnotif.dll

MFSrvSS.dll

LMSSS.DLL

Data Collector Updates: At Logon time - Default load evaluator

[Figure: diagram with the labels ICA Client, WI/XML, Data Collector, Dynamic Store, IMA, Server A, Server B and BIAS]

Performance Data Helper – API

The Performance Data Helper (PDH) interface calls the registry interface to retrieve performance data.

The PDH API is accessed through PDH.DLL.

http://msdn2.microsoft.com/en-us/library/aa373083.aspx

Performance Data Helper – C++ Example

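pdhStatus = PdhAddCounter(hQuery,                               // query handle from PdhOpenQuery
                          "\\Processor(0)\\% Processor Time",   // full counter path, given by name
                          0,                                    // user-defined value (unused)
                          &hCounter);                           // receives the counter handle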

YOU MUST CALL THE PERFORMANCE COUNTER BY ITS NAME

Citrix and the PDH

• \\Processor(_Total)\\% Processor Time

• \\System\\Context Switches/sec

• \\Memory\\% Committed Bytes In Use

• \\Memory\\Page Faults/sec

• \\Memory\\Pages/sec

• \\PhysicalDisk(_Total)\\Disk Bytes/sec

• \\PhysicalDisk(_Total)\\Disk Reads/sec

• \\PhysicalDisk(_Total)\\Disk Writes/sec

Performance Data Helper - Registry

Performance Data Helper – File system

English: Perfc009.dat, Perfh009.dat

German: Perfc007.dat, Perfh007.dat

Performance Data Helper - perfmon

Performance Data Helper - IMA

IMA (at start up)

Citrix and the PDH

Citrix also provides its own performance counters, registered under:

HKLM\SYSTEM\CurrentControlSet\Services\IMAService\Performance

Where can it go wrong?

Black hole effect

Problem

• At peak logon times a recently booted server gets all the connections and might become unresponsive

• Cause: the server is unable to update the DC

Solution

• Load Throttling

• Intelligent load balancing

HKLM\SOFTWARE\Citrix\IMA\LMS\

(DWORD) UseILB = 1

(DWORD) ILBMultiplier = 2

Before Intelligent Load Biasing

State                      Sessions   Current Load   Max Load
Initial                    0          0              10000
After 1st session comes    1          100            10000
After 2nd session comes    2          200            10000

Default BIAS = 10000/100 = 100

Intelligent Load Biasing

State                      Sessions   Current Load   Max Load   ILBMultiplier
Initial                    0          0              10000      2
After 1st session comes    1          5000           10000      2
After 2nd session comes    2          7500           10000      2

Load = [(Max Load - Current Load) / ILBMultiplier] + Current Load
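The two biasing schemes can be compared with a short sketch. The function names are hypothetical, integer arithmetic is assumed, and capping the default bias at Max Load is an assumption that the slides do not state.

```cpp
#include <algorithm>
#include <cassert>

constexpr int kMaxLoad = 10000;

// Default behaviour: every new session adds a fixed bias of 10000/100 = 100.
// Capping the result at kMaxLoad is an assumption, not taken from the slide.
int loadAfterDefaultBias(int currentLoad)
{
    return std::min(kMaxLoad, currentLoad + kMaxLoad / 100);
}

// Intelligent Load Biasing: every new session closes a fraction
// (1 / ilbMultiplier) of the gap between the current load and the maximum.
int loadAfterIlb(int currentLoad, int ilbMultiplier)
{
    assert(ilbMultiplier > 0);
    return (kMaxLoad - currentLoad) / ilbMultiplier + currentLoad;
}
```

With ILBMultiplier = 2 an idle server jumps to 5000 after one session and 7500 after two, so it quickly stops looking like the least loaded server. With the default bias it would only reach 100 and 200.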

Load Throttling - Low logon rate

[Figure: diagram with the labels ICA Client, Data Collector, Dynamic Store, Server A and Server B]

Server     Server Load   BIAS                     Next value shown
Server A   0             (10000-0)/2+0 = 5000     100
Server B   0             (10000-0)/2+0 = 5000     100

[Chart: server load at a low logon rate, ILB=1 versus ILB=0; load axis from 0 to 6000]

Load Throttling - Black hole effect

[Figure: diagram with the labels ICA Client, Data Collector, Dynamic Store, Server A and Server B]

Server starting at Server Load 0, BIAS per incoming session:

(10000-0)/2+0 = 5000
(10000-5000)/2+5000 = 7500
(10000-7500)/2+7500 = 8750

Server starting at Server Load 6000:

(10000-6000)/2+6000 = 8000
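The figures on this slide can be reproduced by replaying a burst of logons in which no real load update reaches the Data Collector. The sketch assumes that the Data Collector always picks the server with the lowest stored load and applies the ILB formula to it straight away. The function name `loadsAfterBurst` is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Replays `logons` consecutive logons against the loads held by the
// Data Collector and returns the loads it holds afterwards.
// Assumption: each logon goes to the least loaded server (first one on a tie).
std::vector<int> loadsAfterBurst(std::vector<int> loads, int logons,
                                 int ilbMultiplier)
{
    assert(!loads.empty() && ilbMultiplier > 0);
    const int maxLoad = 10000;
    for (int i = 0; i < logons; ++i) {
        auto least = std::min_element(loads.begin(), loads.end());
        // ILB: close 1/ilbMultiplier of the gap to the maximum load.
        *least += (maxLoad - *least) / ilbMultiplier;
    }
    return loads;
}
```

Starting from loads of 6000 and 0 with ILBMultiplier = 2, the idle server takes the first two logons (5000, then 7500), the busy server takes the third (8000) and the fourth goes back to the other one (8750). With a fixed bias of 100 the idle server would have to absorb 60 logons in a row before its stored load reached 6000.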

Load Throttling

[Chart: server load at peak logon times for ILBMultiplier = 2, ILBMultiplier = 10, ILBMultiplier = 20 and ILB=0; load axis from 0 to 12000, horizontal axis from 1 to 99]

Upgrade to W2K3 causes full load

Problem

• English W2K: \\System\\% Total Processor Time

• English W2K3: \\Processor(_Total)\\% Processor Time

Solution

• In W2K3, use \\Processor(_Total)\\% Processor Time

Some servers get all the connections

Problem

• One server gets all the connections

• Cause: a failure to read performance counters causes Load = 0

Solution

• A failure to read performance counters causes Load = 10000
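The difference between the two behaviours is the value reported when a counter cannot be read. A minimal sketch of the safer choice follows. The `CounterRead` type and `loadFromCounter` function are hypothetical, and mapping a percentage onto the 0 to 10000 scale by multiplying by 100 is an assumption.

```cpp
#include <cassert>

constexpr int kFullLoad = 10000;

// Result of one attempt to read a percentage counter (hypothetical type).
struct CounterRead {
    bool ok;       // false when the performance counter could not be read
    int  percent;  // 0..100, only meaningful when ok is true
};

// Converts a counter reading into a server load.
// An unreadable counter yields full load, so the server receives no new
// connections instead of looking idle and attracting all of them.
int loadFromCounter(const CounterRead& read)
{
    if (!read.ok) {
        return kFullLoad;
    }
    assert(read.percent >= 0 && read.percent <= 100);
    return read.percent * (kFullLoad / 100);
}
```

Reporting 0 on failure would make the broken server the most attractive target in the farm. Reporting 10000 takes it out of rotation until the counters can be read again.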

Full load after installing an MUI

Problem

• After installing an MUI, advanced load evaluators result in load = 10000

• Cause: the English counter names are not found when perfc007.dat (German) is used instead of perfc009.dat (English)

Solution

• HKLM\Software\Citrix\IMA\LMS\

EnableTranslation=1 (dword)

Full load after installing an MUI

• Jesús González in a German restaurant

English Menu (perfc009.dat)

22. Vegetable Soup

German Menu (perfc007.dat)

22. Gemüsesuppe
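The menu analogy amounts to a lookup through a shared item number: find the number of the English name, then read the entry with the same number from the other table. The sketch below shows that idea only. It is not how EnableTranslation is implemented, and the `NameTable` alias and `translate` function are hypothetical.

```cpp
#include <cassert>
#include <map>
#include <string>

// Item number -> name, as in a menu or a perfcXXX.dat name table.
using NameTable = std::map<int, std::string>;

// Translates an English name into the local language through the shared
// item number. Returns an empty string when either lookup fails.
std::string translate(const std::string& englishName,
                      const NameTable& english, const NameTable& local)
{
    for (const auto& entry : english) {
        if (entry.second == englishName) {
            const auto it = local.find(entry.first);
            return it != local.end() ? it->second : std::string();
        }
    }
    return std::string();
}
```

With the two menus from the slide, asking for "Vegetable Soup" resolves to item 22 and then to "Gemüsesuppe". Asking for a name that is not on the English menu returns an empty string, which corresponds to the "counter not found" case.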

Troubleshooting

[Figure: Data Collector (DC), LmsSS.dll, LMS20Rules.dll and PDH.dll; the counter \\Processor(_Total)\\% Processor Time is crossed out, resulting in Full Load]

Troubleshooting: Failing to read performance counters => full load

• Use procmon (filemon) while restarting IMA. Make sure the correct perfcXXX.dat file can be accessed.

• Consider rebuilding the performance counters: http://support.microsoft.com/kb/300956/en-us

• Check that no performance counters are disabled:
  [HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\PerfDisk\Performance]
  "Disable Performance Counters"=dword:00000001

• Use CDFControl to gather CDF traces: http://support.citrix.com/article/CTX111961

Troubleshooting: CDF Traces example

• LMSRuleDll_Interface::CreatePDHQuery() ERROR!!!!! rc = -1073738823

• -1073738823 -> FFFFFFFFC0000BB9 (HEX)

• http://msdn2.microsoft.com/en-us/library/aa373046.aspx
  "The specified counter could not be found"
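The conversion from the signed rc value in the trace to the hexadecimal PDH status code can be done with a few lines of code. The helper names below are hypothetical. The value 0xC0000BB9 is the PDH status PDH_CSTATUS_NO_COUNTER, and the leading FFFFFFFF on the slide is just the sign extension to 64 bits.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// Formats a signed 32-bit return code as eight upper-case hex digits.
std::string statusToHex(std::int32_t rc)
{
    // Reinterpreting as unsigned gives the two's complement bit pattern.
    const std::uint32_t bits = static_cast<std::uint32_t>(rc);
    char buffer[9];
    std::snprintf(buffer, sizeof buffer, "%08X",
                  static_cast<unsigned int>(bits));
    return std::string(buffer);
}

// True when rc is PDH_CSTATUS_NO_COUNTER (0xC0000BB9):
// "The specified counter could not be found".
bool isCounterNotFound(std::int32_t rc)
{
    return static_cast<std::uint32_t>(rc) == 0xC0000BB9u;
}
```

Passing -1073738823 from the trace yields "C0000BB9", which can then be looked up in the PDH error code list linked above.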

Summary

• Deep understanding of load balancing
  • BIAS, PDH…

• Understanding Intelligent load balancing
  • Load Throttling

• Typical scenarios
  • Full load versus 0 load, MUI…

• Troubleshooting techniques
  • So far these tips have been enough to resolve every load balancing problem that has reached EMEA tech support

Before you leave…

• Session surveys are available online at www.citrixsynergy.com starting Thursday, 7 October

• Provide your feedback and pick up a complimentary gift card at the registration desk

• Download presentations starting Friday, 15 October, from your My Organiser Tool located in your My Synergy Microsite event account