EngageOne Compose – Scalability White Paper Page 1 of 27
Customer Engagement
EngageOne® Compose
Performance and scalability whitepaper
EngageOne® Server
Contents
Introduction ............................................................................................................................................ 3
Executive Summary ................................................................................................................................ 4
Architectural Overview ........................................................................................................................... 5
Performance and scalability factors ....................................................................................................... 6
Real-time test scenarios ......................................................................................................................... 6
Reference Configuration ......................................................................................................................... 7
Scaling out............................................................................................................................................. 11
Scaling up .............................................................................................................................................. 16
Batch Processing ................................................................................................................................... 20
Software Platforms ............................................................................................................................... 20
Capacity Planning.................................................................................................................................. 22
1: Estimate the demand ................................................................................................................... 22
2: Estimate the number of application servers required ................................................................. 23
3: Define the database server and shared file system ..................................................................... 25
4: Validate the environment through testing .................................................................................. 27
Diagnostic and Tuning Techniques ....................................................................................................... 27
Introduction
This document describes the performance and scalability characteristics of the EngageOne Compose
server. It is intended to be used by customers and Professional Services staff who need to understand how
the system scales so that capacity can be properly planned.
Throughout this document performance refers to how quickly the system is able to process a request. This
is measured as the time taken for a web application to deliver a page to a user or for a web service to
respond to its caller.
Scalability refers to the system’s ability to increase its throughput whilst maintaining good response times.
For example, if the number of real-time users is doubled the system should be able to sustain twice as
much throughput without an increase in response time. A system with good scalability characteristics can
make use of additional hardware to support the increased load.
When measuring performance and scalability it is important to use test scenarios that represent the way in
which the system will be used in the real world. For EngageOne Compose there are four primary use
cases, divided into real-time and batch.
Real-time use cases
1. Interactive – Users of the EngageOne Interactive application or custom applications that behave in
a similar way. These users create one document at a time by filling in customer-specific data and
then submitting the document for composition. The Interactive application enables users to track
and update their documents in a task list.
2. On-demand – An external system calls the EngageOne web services to compose and deliver a
document.
Batch use cases
3. Accumulated Batch – A document created by an Interactive or On-demand user can have its
output type set to Batch. The system will add it to a collection of documents to be composed
later, as part of a batch. Documents accumulate until the batch is started either manually or by a
timer.
4. Non-accumulated Batch – The system composes a set of documents based on data in an input file.
There are no Interactive or On-demand users involved in the process.
Executive Summary
A reference hardware configuration was used to obtain the numbers below. The details of the hardware
are provided later in the document, but in summary it consisted of two application servers, one file server
and one database server. Increased throughput can be achieved by using larger hardware configurations.
Real-time workload
The table below summarizes the throughput and response times for the real-time scenarios using the
reference hardware.
Test scenario     Throughput (1)              Response time (2)   Max supported users (3)
1. Interactive    8 tps (2-server cluster)    300 ms              2,400 to 14,400
2. On-demand      27 tps (2-server cluster)   400 ms              8,100 to 48,600
Notes:
1. Throughput is expressed as transactions per second. A transaction represents one iteration of the
test scenario.
2. The response times are averages across all of the pages and web service calls in the scenario.
3. The number of supported users varies greatly depending on the workload of each user. The
numbers shown in the ‘Max supported users’ column represent the maximum number of
busy and light users respectively. A busy user creates one document every five minutes and a light
user creates one every 30 minutes.
Later sections of this document describe how the system scales out (by using more servers) and scales up
(by using servers with more cores).
The results show that the system scales out in a nearly linear way when executing the real-time scenarios.
This means that doubling the number of servers results in approximately double the capacity.
The system also scales up nearly linearly, to a maximum of eight cores per server. The system will use
more than eight cores if they are available, but the benefit is no longer linear. When deploying on
hardware with many more than eight cores it is therefore recommended to use virtual machines, each
with four to eight cores.
Batch workload
The batch process runs on a single server. Using the reference hardware configuration, batches of 100,000
documents were processed at the rates shown in the table below.
Batch type Number of docs Time taken Rate (per hour) Rate (per second)
Accumulated 100,000 35:38 minutes 168,382 47
Non-accumulated 100,000 28:00 minutes 214,286 60
Architectural Overview
The EngageOne Compose server is made up of a number of bundles, each of which has a particular role.
The system is designed so that each bundle can be scaled independently to meet the needs of a particular
environment. The bundles are as follows:
Security bundle – Responsible for the authentication and authorization of users and web service
callers.
Core bundle – Responsible for the EngageOne Interactive web application and web services.
Composition bundle – Responsible for composing documents.
Conversion bundle – Responsible for converting the format of attachments. Note that the
conversion bundle was not used in the tests described in this paper.
Batch bundle – Responsible for processing batches of documents and also for purging old
documents from the database and the file system.
The Core and Composition bundles are the most CPU intensive. A typical deployment will use one instance
of core and one instance of composition per application server. Generally, the security bundle is deployed
on two servers – a primary and a replica.
A minimum of two application servers should be used in a production environment for resilience purposes.
Depending on the load placed on the system additional application servers may be required. The
application servers require access to three shared resources: a database server (running Oracle or SQL
Server), a file system and an LDAP server.
The Reference Environment described later in this document is deployed as follows:
The reference system uses two application servers, each of which has the Security, Core and Composition
bundles installed. Additionally, one of the servers has the Batch bundle.
For testing purposes, a load generation tool, JMeter, replaces the real users in this environment. The load
is distributed evenly across the two application servers by a load balancer.
Performance and scalability factors
There are many factors that can influence the response times and throughput of an EngageOne Compose
deployment. These include:
Hardware – The system is capable of scaling out through the addition of extra servers and scaling
up by increasing the processing capacity of each server. There are two shared resources – the
database and the file system – and these must be sized appropriately for the anticipated load as
they cannot be scaled out.
Size and complexity of templates – Larger and more complex templates require more CPU cycles to
process them and use more storage and I/O bandwidth on the file system. The mix of templates
must be considered when sizing a system.
Different types of workload (Interactive, On-demand and Batch) affect the system in different
ways. The mix of workload is therefore an important factor when sizing a system.
Housekeeping processes such as purging work items and maintaining the database are important
for keeping a system running efficiently. The EngageOne Compose Administration Guide provides
advice on maintaining the system for best performance.
Real-time test scenarios
There are two test scenarios that represent the way the system is used by real-time users:
Interactive – This scenario simulates users manually creating documents through the EngageOne
Interactive application. As well as the creation of documents, this scenario also includes populating the
application’s home page with information about tasks that are relevant to the user.
This scenario includes approximately 12 seconds of think time per iteration. The think time is not intended
to be representative of real usage, as a real user would take several minutes to execute the steps in the scenario.
The user actions included in the Interactive scenario are:
Land on the EngageOne Interactive start page.
Log in
Repeat the following steps three times
o Go to home page (populates various task lists)
o Search for template and randomly select one of the three available
o Create a work item based on the template
o Load the ActiveX editor and fill in the required fields
o Search for delivery option and select an option with two channels
o Submit the document
o Return to home page (populates various task lists)
o Wait for the document to be composed
o Open the work item and confirm its delivery
Log off
On-demand – This scenario simulates an environment where an external application is submitting requests
to EngageOne Compose via its web service interface. Each request results in a document being composed
and delivered to two different channels. This scenario runs with no think time, therefore a new request is
made as soon as the previous one completes.
The web service requests made in this scenario are:
Search for template and randomly select one of the three available
Search for delivery option and select an option with two channels
Deliver document
Invalidate session
Batch test scenarios
Whilst the Interactive and On-demand scenarios are the primary focus of this document, the performance
of batch jobs is also tested. Accumulated Batch and Non-accumulated Batch are tested by timing the
total execution time of a manually initiated batch job.
Reference Configuration
The reference configuration consists of a defined set of hardware, software, templates and test scenarios.
Variations to this configuration can be used and the results can be compared with the reference. For
example, it is possible to measure the effect of doubling the number of application servers.
All of the results described in this document were obtained by testing on virtual hardware in Amazon Web
Services (AWS). Therefore, the reference hardware is defined in AWS terminology.
The reference hardware configuration is as follows:
Server purpose      Instance type                                                                  Number used   Notes
Application server  m5.2xl: 4 cores (8 vCPUs), 32GB RAM                                            2             The m5 family of virtual machine instances is designed for general-purpose workloads.
File server         m5.4xl: 8 cores (16 vCPUs), 400GB drive with 5,000 Provisioned IOPS            1             An m5.4xl server was chosen because it guarantees the availability of high I/O throughput to the disk. The file server does not require a large number of cores, but disk I/O is very important.
Database server     db.m4.2xl: 4 cores (8 vCPUs), 32GB RAM, 100GB drive, 2,000 Provisioned IOPS    1             AWS RDS instance running Oracle Enterprise Edition 12.1.
The test scenarios all use three different templates of varying size and complexity. Each iteration of the
scenario randomly chooses one of these templates, so each one is used for 1/3 of the iterations.
Template type Description
Letter Relatively simple. Composed document size is 150KB.
Policy document 20-page document. Composed size is 350KB.
Brochure Three pages long, with images. Composed size is 1.8MB.
Two real-time scenarios are executed as part of the reference configuration – the Interactive and
On-demand scenarios. The two scenarios are tested independently and the results are presented separately.
The tests involve ramping up the load over a period of time and measuring how the throughput and
response times vary with load. For convenience each test is one hour long. Several test runs were
executed with different levels of load to determine the amount of load required to fully saturate the
system.
The throughput for the Interactive and On-demand scenarios using the reference hardware was as follows:
As mentioned previously, a transaction in this context is a single iteration of the test scenario. Each
iteration includes searching for a template and submitting it to be composed. In addition, the Interactive
scenario includes populating the task lists on the user’s home page. The Interactive scenario therefore
does considerably more work during each iteration, which is why Interactive transactions per second (tps)
are much lower than the On-demand transactions per second.
The graph shows that for both test scenarios the throughput rises as the load increases, up to the system’s
saturation point. For the Interactive scenario the maximum throughput is approximately 9 tps and for
On-demand it is approximately 31 tps.
The chart below shows how the average response time changes as the load increases. The response times
shown are averages of all web service calls in the scenario (excluding those required to log on and off the
application).
It is clear that response times increase with load, which is to be expected. Plotting the response time
against the throughput can be helpful when estimating the “usable” capacity of a system. The usable
capacity represents the work that the system can do without encountering unreasonably long response
times.
This chart confirms that the system can deliver a certain level of throughput whilst maintaining good
response times, but if the system is asked to deliver slightly more throughput then the response times will
increase significantly. For the Interactive scenario the usable throughput is 8 tps (compared to its
maximum throughput, which is 9 tps). For the On-demand scenario the usable throughput is around 27
tps (compared to its maximum of 31 tps).
Throughout this paper all throughput numbers refer to usable throughput unless they are specifically
stated as being maximums.
The number of concurrent users that a system can support depends largely on the workload of those
users. For example, a busy user who is creating 12 documents per hour would use six times as much
system resource as a light user who creates only 2 documents per hour. Therefore, a given hardware
configuration can support six times as many light users as busy users. For any implementation it is
important to understand the number of users and their workload before attempting to size the hardware.
The busy and light users described here are just examples and they will not be applicable to all
implementations.
Multiplying the usable throughput by 3600 (number of seconds in an hour) and dividing by the average
number of documents per user per hour gives the maximum number of concurrent users supported for a
given scenario.
System capacity (tps) Number of busy users supported Number of light users supported
8 (Interactive) 8 * 3600 / 12 = 2400 users 8 * 3600 / 2 = 14400 users
27 (On-demand) 27 * 3600 / 12 = 8100 users 27 * 3600 / 2 = 48600 users
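The calculation above can be sketched as a small Python helper. The function name is illustrative, not part of the product; the throughput and per-user rates are the figures from the tables in this paper.

```python
def max_concurrent_users(usable_tps: float, docs_per_user_per_hour: float) -> int:
    """Maximum concurrent users = usable throughput (tps) * 3600 / docs per user per hour."""
    return int(usable_tps * 3600 / docs_per_user_per_hour)

# Interactive scenario: 8 tps usable throughput
print(max_concurrent_users(8, 12))   # busy users (12 docs/hour) -> 2400
print(max_concurrent_users(8, 2))    # light users (2 docs/hour) -> 14400

# On-demand scenario: 27 tps usable throughput
print(max_concurrent_users(27, 12))  # -> 8100
print(max_concurrent_users(27, 2))   # -> 48600
```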
It is best practice to size a system so that the anticipated maximum load does not saturate the system. As
a guideline, the system should never exceed 75% of its saturation point. See the capacity planning section
later in this document for more details.
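The 75% guideline can be expressed as a simple sizing rule: the system's saturation throughput should be at least the anticipated peak load divided by 0.75. A minimal sketch (the function name and example peak load are illustrative):

```python
def required_saturation_tps(peak_tps: float, headroom: float = 0.75) -> float:
    """Saturation throughput needed so the peak load stays below 75% of saturation."""
    return peak_tps / headroom

# A hypothetical 6 tps peak load needs a system that saturates at 8 tps or more.
print(required_saturation_tps(6))  # 8.0
```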
The graph below shows the CPU utilization for the application servers and the database servers in both test
scenarios. The “App” lines show the average of the two application servers.
This graph shows that the application servers’ CPUs reached nearly 90% for the On-demand scenario but it
was a little lower, around 75%, for the Interactive scenario. The situation for the database server is
different – during the On-demand scenario it reached about 45% and during Interactive it reached about
55%. This suggests that the Interactive scenario places more load on the database than the On-demand
scenario.
Another key resource is the shared file system, or Active Drive. All of the application servers write files to
this shared location and it is important that its disk system has sufficient bandwidth to cope with the load.
The bandwidth is measured in IO Operations Per Second (IOPS).
The graph below shows the IOPS consumed during the On-demand and Interactive scenario tests.
The graph shows that the On-demand scenario reached a peak of around 2500 IOPS whilst delivering a
usable throughput of 27 tps. This means that the system uses approximately 93 IOPS for each transaction-
per-second of throughput. This can be rounded up to 100 IOPS per tps for convenience.
The Interactive scenario reached a peak of around 1000 IOPS whilst delivering a usable throughput of 8
tps. The system therefore uses approximately 125 IOPS per tps for the Interactive scenario.
These numbers provide useful input to the capacity planning process, as described later in the document.
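These per-tps figures can be turned into a rough sizing rule for the shared file system. The sketch below uses the rounded values derived above (100 IOPS per tps for On-demand, 125 for Interactive); the function name is illustrative, not part of the product.

```python
# Approximate IOPS consumed per transaction-per-second, from the measurements above.
IOPS_PER_TPS = {"On-demand": 100, "Interactive": 125}

def required_iops(target_tps: float, workload: str) -> float:
    """Estimate the shared file system IOPS needed for a target throughput."""
    return target_tps * IOPS_PER_TPS[workload]

print(required_iops(27, "On-demand"))   # 2700.0 IOPS for the reference On-demand load
print(required_iops(8, "Interactive"))  # 1000.0 IOPS for the reference Interactive load
```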
Scaling out
Scaling out, or horizontal scaling, refers to the ability of a system to handle more load when additional
servers are added to the environment. In the case of EngageOne Compose it is possible to add extra
application servers to the cluster to achieve greater throughput. The database server and shared file
system must be sized appropriately so that they can handle the extra load.
Load tests were conducted with various numbers of application servers, as shown in the table below. In all
cases the specification of the application servers was the same as in the reference test. They were
m5.2xlarge instances, which have 4 cores and 32GB RAM. The file server for the shared file system (Active
Drive) was sized to handle a large amount of load and the same server was used in all test configurations.
However, the database server was sized for each specific test and three different sizes of database server
were used in total.
Number of application servers Database instance
1 db.m4.2xlarge (4 cores)
2 (Reference configuration) db.m4.2xlarge (4 cores)
4 db.m4.4xlarge (8 cores)
8 db.m4.10xlarge (20 cores)
(Note: There is no “8xlarge” instance type)
Note that a single-server test is included for completeness but in a production environment there must
always be at least two servers for resilience.
The two graphs below show the throughput obtained for both the Interactive and On-demand scenarios.
The scenarios are presented separately for clarity.
For both scenarios it is clear that doubling the number of application servers results in a significant
increase in throughput. As before, the response time can be plotted against the throughput to enable the
usable throughput to be estimated.
The usable throughput for each combination of scenario and hardware is as follows:
Number of app servers (4 cores each)   Database cores   Usable tps (Interactive)   Usable tps (On-demand)
1                                      4                5.5                        14
2 (Reference configuration)            4                8                          27
4                                      8                15                         53
8                                      20               39                         90
Plotting these numbers on a graph shows that the system has near-linear scalability: doubling the
number of application servers (and adjusting the size of the database server) gives the system almost
double the capacity.
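The near-linear trend can be checked directly from the table above. This sketch computes the speedup of each configuration relative to a single server (the figures are copied from the table; the code itself is purely illustrative):

```python
# Usable throughput (tps) by number of application servers, from the table above.
usable_tps = {
    "Interactive": {1: 5.5, 2: 8, 4: 15, 8: 39},
    "On-demand": {1: 14, 2: 27, 4: 53, 8: 90},
}

for scenario, results in usable_tps.items():
    base = results[1]
    for servers, tps in sorted(results.items()):
        print(f"{scenario}: {servers} server(s) -> {tps / base:.2f}x single-server throughput")
```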
The shape of the On-demand scenario line is as expected. It is almost straight but curves downwards
slightly at 8 nodes. This shows that the CPU capacity of the application servers is the limiting factor for the
On-demand scenario.
The shape of the Interactive scenario line is a little different. It curves up instead of down at 8 nodes. The
reason for this is that the Interactive scenario causes much more work in the database. The database used
for the 8-node tests had 20 cores (compared with 8 cores for the 4-node test). These additional cores
enable the database to process 2.5 times more work, and hence the overall throughput of the system
more than doubled when moving from 4 to 8 application servers. This shows that the CPU capacity of the
database is the limiting factor for the Interactive scenario.
The graph below shows the maximum CPU utilization of the application servers and the database server
for each hardware configuration.
A number of observations can be made from this graph.
On-demand scenario:
o The application server CPU was relatively high during all tests, ranging from about 90% in
the 2-server test to about 75% in the 8-server test. This shows that each application
server was working near to its limit, which is good. This suggests that additional On-demand
throughput could be achieved by adding more application servers.
o In the 2-server test the database server was loaded to about 50%. This might have a small
impact and slightly better throughput might have been possible with a larger database
instance.
Interactive scenario:
o With 2 and 4 application servers the database CPU utilization was around 55-60%. This
increased to around 90% when 8 application servers were used, even though the database
had 20 cores.
o This shows that the database is the limiting factor for the Interactive scenario and that
deploying more than 8 servers is unlikely to increase the overall capacity of the system.
Scaling up
Scaling up, or vertical scaling, refers to the ability of a system to handle more load when larger servers are
used. In the case of EngageOne Compose it is possible to obtain more throughput by deploying application
servers with additional cores. The database server and shared file system must be sized appropriately so
that they can handle the extra load.
Load tests were conducted using application servers with different numbers of cores, as shown in the table
below. All tests used a cluster of two application servers so that the results can be compared with the
reference test. The larger servers had more RAM than the smaller ones but none of the servers was
constrained by its RAM. The file server for the shared file system (Active Drive) was sized to handle a large
amount of load and the same server was used in all test configurations. However, the database server was
sized for each specific test and three different sizes of database server were used in total.
Note: EC2 “m5” instances were used. These instances are available with 2, 4, 8 and 24 cores. There is no
instance type with 16 cores. The RDS instance types have 4, 8 and 20 cores.
Cores per server Instance type Database instance
2 m5.xlarge db.m4.2xlarge (4 cores)
4 (Reference configuration) m5.2xlarge db.m4.2xlarge (4 cores)
8 m5.4xlarge db.m4.4xlarge (8 cores)
24 m5.12xlarge db.m4.10xlarge (20 cores)
The two graphs below show the throughput obtained for both the Interactive and On-demand scenarios.
The scenarios are presented separately for clarity.
The scale-up results are similar to the scale-out results – adding more cores significantly increases the
capacity of the system. Once again, the response times can be plotted against the throughput to enable
the usable throughput to be estimated.
The usable throughput for each combination of scenario and hardware is as follows:
Cores per app server (2 servers in cluster)   Database cores   Usable tps (Interactive)   Usable tps (On-demand)
2                                             4                7                          13
4 (Reference configuration)                   4                8                          27
8                                             8                19                         50
24                                            20               28                         95
Plotting these numbers on a graph shows that the system is able to make good use of the additional cores
when executing the On-demand scenario. When executing the Interactive scenario, the throughput
doubles when moving from four cores to eight cores but there is a much smaller increase when moving
from 8 to 24 cores. This is because the database was the limiting factor in the 24-core Interactive test, as
explained later.
Looking at the CPU utilization of the Application Servers and Database Server it is clear that the Database
Server is the bottleneck in the Interactive scenario but not in the On-demand scenario.
For the Interactive scenario, when the Application Servers have 8 cores each and the Database Server also
has 8 cores, the Database CPU runs at around 80%. The Database CPU stays at 80% when the Application
servers have 24 cores and the database has 20 cores. 80% is very high for a database CPU and this result
suggests that a larger Database Server might enable the system to sustain more throughput.
For the On-demand scenario, the Database Server remains between 30% and 50% in all tests. The CPUs of
the Application Servers are higher, ranging from around 60% up to 95%. This suggests that the Database
Server was sized appropriately for the workload.
Batch Processing
The previous sections have focused on the scalability of the real-time scenarios: Interactive and
On-demand. Batch scenarios were not included in the Scale-out and Scale-up sections because batch
processing does not scale in the same way.
The Batch bundle is usually installed on a single application server within the EngageOne Compose cluster.
This server may be shared with other bundles or it may be dedicated to batch processing.
Batch is often run during a quiet period, such as overnight. The results presented here were obtained by
running batches when there was no other work happening in the system.
The main unit of measurement for batch processing is the rate of document composition. This can be
expressed as documents per hour, per minute or per second. To be consistent with the other data
presented in this paper the unit will be documents per second.
Batch processing takes place on a single server and it uses a limited number of threads. Therefore, moving
it to a larger server or introducing additional servers has little effect on the rate of composition.
The Reference test environment was used, with the Batch bundle running on a single server. The batch
consisted of equal numbers of the three templates described earlier. The results were as follows:
Batch type Number of docs Time taken Rate (per hour) Rate (per second)
Accumulated 100,000 35:38 minutes 168,382 47
Non-accumulated 100,000 28:00 minutes 214,286 60
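The rates in the table follow directly from the measured times. A minimal sketch of the calculation (the function name is illustrative):

```python
def batch_rates(docs: int, minutes: int, seconds: int) -> tuple[int, int]:
    """Convert a batch run time into documents per hour and per second."""
    elapsed = minutes * 60 + seconds          # total elapsed seconds
    per_hour = docs * 3600 / elapsed
    return round(per_hour), round(per_hour / 3600)

print(batch_rates(100_000, 35, 38))  # Accumulated: (168382, 47)
print(batch_rates(100_000, 28, 0))   # Non-accumulated: (214286, 60)
```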
Software Platforms
The reference configuration uses Linux for the EngageOne servers and Oracle for the database.
EngageOne Compose also supports Windows servers and SQL Server databases. All tests presented in this
paper were carried out on Linux application servers. It is likely that Windows application servers
would give very similar results; this will be validated in a future test.
This section presents the results of running the reference hardware with SQL Server instead of Oracle. The
database server is the same specification as the Oracle server in the Reference test – RDS instance type
db.m4.2xl, which has 4 cores (8 vCPUs) and 32GB RAM with a 100GB drive with 2000 Provisioned IOPS.
SQL Server 2016 Standard Edition was used.
The first graph below compares the throughput of the On-demand and Interactive scenarios using the SQL
Server and Oracle databases.
It can be seen that the throughput of both configurations is very similar. The response time graph below
also shows very similar results for the two types of database.
The Reference configuration (using Oracle) gave usable throughputs of 8 tps for the Interactive Scenario
and 27 tps for the On-demand scenario. The graph below plots the response times against throughput for
the SQL Server tests and shows vertical lines at 8 and 27 tps. It can be seen that the usable throughput on
SQL Server is very similar to the reference results.
Capacity Planning
Capacity planning is the process of estimating the hardware required to support the anticipated workload
in a given deployment. There are four key steps to the process:
1. Estimate the demand placed on the system by its users and batch operations.
2. Using the data presented in this document, estimate the number of application servers required to
support the demand.
3. Determine the specification of the database server and shared file system required to support the
application servers.
4. Carry out tests to validate that the environment will have sufficient capacity to meet the needs of
its users.
1: Estimate the demand
The goal is to estimate the peak throughput that the system needs to support, measured in documents per
second. When the documents per second figure has been estimated it is straightforward to calculate the
hardware requirements.
Often the batch operations can be ignored because they will be run during quiet times of the day. If
batches are to be run at peak times it might be necessary to increase the capacity of the system beyond
the level needed for real-time workloads. Batch is ignored for the example calculations in this section.
Most organizations think about their document production in terms of user numbers or monthly or annual
production figures. This is a good starting point but additional information and some assumptions are
required to arrive at the documents per second figure.
Example 1
Organization A has 2000 back-office staff who will use EngageOne Interactive. The staff all work from
09:00 to 18:00 and they take a one-hour lunch break. On average each user will create 50 new documents
per day.
Average documents per second during working day =
  Documents per day per user          50
  Divided by hours worked per day     / 8
  Divided by seconds in an hour       / 3600
  Multiplied by number of users       x 2000
                                      = 3.5 documents per second
3.5 documents per second is the average, but there will be busy periods and quiet periods. The system
must be sized to easily cope with the busy periods. During the busy period(s) the demand may reach two
or three times the average demand. For this example, we will multiply the average by three and round up
the result, giving a peak load of 11 documents per second being created in EngageOne Interactive.
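The arithmetic above can be sketched as a small helper function (Python used for illustration; the peak factor of three is the assumption made in this example, not a fixed rule):

```python
import math

def peak_docs_per_second(docs_per_user_per_day, hours_worked_per_day,
                         num_users, peak_factor=3):
    """Estimate peak demand from average daily production.

    peak_factor is an assumption: busy periods are taken here to be
    three times the average, as in the worked example above.
    """
    # average docs/sec across the whole working day
    average = docs_per_user_per_day / hours_worked_per_day / 3600 * num_users
    # peak load, rounded up to a whole document per second
    return average, math.ceil(average * peak_factor)

average, peak = peak_docs_per_second(50, 8, 2000)
# average ≈ 3.5 documents per second; peak = 11 documents per second
```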
Example 2
Organization B has exactly the same number of back-office EngageOne Interactive users as Organization A
and they create the same number of documents per day. Their peak load from EngageOne Interactive is
therefore 11 documents per second.
Organization B also plans to integrate EngageOne Compose into their CRM system via the On-demand web
service interface. The CRM system supports the contact center where two shifts of users cover the period
from 07:00 to 22:00 seven days a week. Statistics show that the contact center creates around 50 million
documents per year and that half of these are created between 08:00-11:00 and 18:00-21:00 on
weekdays.
Average documents per second during busy hours =
  Total documents per year                                50,000,000
  Divided by two (half of all docs are in busy periods)   / 2
  Divided by number of weekdays in a year                 / 260
  Divided by number of busy hours in a day                / 6
  Divided by number of seconds in an hour                 / 3600
                                                          = 5 documents per second (rounded up from ≈4.5)
Even during the busy hours there will be periods of higher and lower demand so the system must be able
to cope with additional load. If we assume that the peak load is double the busy-period average, we arrive
at a figure of 10 on-demand documents per second at peak times.
There is some overlap between the busy periods of the back office and the contact center. Therefore, the
deployment needs to support 11 Interactive documents per second plus 10 on-demand documents per
second.
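The on-demand calculation follows the same pattern and can be sketched as follows (the round-up to 5 and the peak factor of two mirror the assumptions in the example; the names are illustrative):

```python
import math

def on_demand_peak(docs_per_year, busy_fraction, busy_days_per_year,
                   busy_hours_per_day, peak_factor=2):
    """Peak on-demand documents per second from annual volume.

    busy_fraction and peak_factor are assumptions taken from the
    worked example: half the volume falls in the busy hours, and
    peak load is taken to be double the busy-period average.
    """
    # average docs/sec during the busy hours only
    busy_average = (docs_per_year * busy_fraction
                    / busy_days_per_year / busy_hours_per_day / 3600)
    # round the busy-period average up, then apply the peak factor
    return math.ceil(busy_average) * peak_factor

on_demand_peak(50_000_000, 0.5, 260, 6)  # → 10 documents per second at peak
```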
2: Estimate the number of application servers required
The starting point is the number of documents per second achieved with the reference configuration,
which uses two application servers with four cores each.
The reference configuration was able to support 8 documents per second for the Interactive scenario or 27
documents per second for the On-demand scenario. Put another way, for each document created through
Interactive, the system can create approximately three on-demand documents. For simplicity, the capacity
of the reference system will be taken to be 8 Interactive documents per second. The number of on-
demand documents will be divided by three to obtain the equivalent number of Interactive documents.
The actual servers used in the deployment environment will probably not be the same as those used in the
reference configuration. The number of documents per second supported by the chosen servers must
therefore be estimated based on the number of cores per server and the type of processor used. The
scale-out results presented in this document show that the capacity of a server is approximately
proportional to the number of cores. However, it is less straightforward to compare the capabilities of
different types of processor and third-party data may need to be consulted.
The graph below shows the maximum throughput for the Interactive and On-demand scenarios for a given
number of cores. It shows that up to 16 cores the same throughput can be achieved by scaling out (more
servers) or scaling up (larger servers). If higher levels of throughput are required it is more efficient to add
more servers than to use larger servers.
Finally, it is not desirable to load the servers right up to their maximum capacity. It is recommended that
only 75% of the server capacity is taken into account for planning purposes.
Example 1
Organization A needs to support a peak load of 11 Interactive documents per second. The reference
configuration provides a maximum usable throughput of 8 documents per second, which is 4 per
application server.
Number of 4-core reference application servers required =
  Total user demand (Interactive docs per second)        11
  Divided by target capacity of each application server  / 3 [= 4 x 75%]
  Plus number of redundant servers (for resilience)      + 1
                                                         = 5 application servers (11 / 3 ≈ 3.7, rounded up to 4)
Using 5 application servers, the system would have a maximum capacity of 5 x 4 = 20 Interactive
documents per second which is significantly more than the 3.5 per second average. This means there is
plenty of “headroom” for busy periods and the system would still be able to cope with the loss of at least
one server.
Example 2
Organization B needs to support a peak load of 11 Interactive documents per second plus 10 on-demand
documents per second. Dividing the on-demand number by three gives the equivalent Interactive number,
which we round up to 4. This gives a total requirement of 11 + 4 = 15 Interactive documents per second.
Number of 4-core reference application servers required =
  Total user demand (Interactive docs per second)        15
  Divided by target capacity of each application server  / 3 [= 4 x 75%]
  Plus number of redundant servers (for resilience)      + 1
                                                         = 6 application servers
Organization B would therefore require 6 application servers of the reference specification, with 4 cores
each.
The organization may prefer to deploy a smaller number of servers, each with more cores. Each 8-core
application server should be able to deliver twice the throughput of a 4-core server. The calculation of
application servers would then be as follows:
Number of 8-core application servers required =
  Total user demand (Interactive docs per second)        15
  Divided by target capacity of each application server  / 6 [= 8 x 75%]
  Plus number of redundant servers (for resilience)      + 1
                                                         = 4 application servers (15 / 6 = 2.5, rounded up to 3)
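The sizing rule used in all three calculations can be expressed as one function (a sketch; the 75% headroom, the divide-by-three Interactive equivalence, and the single redundant server are the assumptions stated above):

```python
import math

def app_servers_required(interactive_tps, on_demand_tps=0,
                         per_server_capacity=4, headroom=0.75,
                         redundant_servers=1):
    """Number of application servers, per the sizing rule above.

    per_server_capacity=4 corresponds to the 4-core reference server;
    use 8 for an 8-core server. On-demand load is converted to its
    Interactive equivalent by dividing by three and rounding up.
    """
    demand = interactive_tps + math.ceil(on_demand_tps / 3)
    # only 75% of each server's capacity is counted for planning
    usable_per_server = per_server_capacity * headroom
    return math.ceil(demand / usable_per_server) + redundant_servers

app_servers_required(11)                             # Example 1 → 5
app_servers_required(11, 10)                         # Example 2, 4-core → 6
app_servers_required(11, 10, per_server_capacity=8)  # Example 2, 8-core → 4
```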
3: Define the database server and shared file system
There are several considerations for the database server:
• It must have enough processing power (CPU cores) to support the expected workload. Results
presented in this document have shown that the database server needs approximately one core
for every two cores in the application server cluster. For example, a cluster of four application
servers with four cores each has a total of 16 cores. Therefore, the database server should have
(at least) 8 cores.
• Its disk system must have enough I/O bandwidth so that it does not become a constraint. The
database disk I/O observed during the performance tests was approximately 60 IOPS per tps for
the Interactive scenario and approximately 20 IOPS per tps for the On-demand scenario. These
should be seen as absolute minimum figures and it is always better to provision disks with the
highest possible throughput capability.
• There must be enough storage available on the disk system. This document focuses on the
performance and scalability aspects of the system and does not provide guidance on disk space
requirements.
For the shared file system, there are two considerations:
• The file system must have enough I/O bandwidth so that it does not become a constraint. In most
deployments a SAN or NAS device incorporating SSD drives will be used. The file system disk I/O
observed during the performance tests was approximately 125 IOPS per tps for the Interactive
scenario and approximately 100 IOPS per tps for the On-demand scenario. Again, it is always
better to provision hardware with the highest possible throughput capability.
• The shared file system must have enough space to store all of the in-progress and completed
documents, batch jobs, and other files. This document focuses on the performance and scalability
aspects of the system and does not provide guidance on disk space requirements.
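The ratios above can be combined into a single sizing sketch (the 60/20 and 125/100 IOPS-per-tps figures are the observed minimums from the performance tests; treat the outputs as lower bounds and provision above them):

```python
import math

def size_infrastructure(interactive_tps, on_demand_tps, total_app_cores):
    """Minimum database cores and disk/file-system IOPS.

    Uses the ratios observed in the performance tests: one database
    core per two application cores; 60/20 IOPS per tps on the database
    disk and 125/100 IOPS per tps on the shared file system for the
    Interactive/On-demand scenarios respectively.
    """
    return {
        "db_cores": math.ceil(total_app_cores / 2),
        "db_iops": 60 * interactive_tps + 20 * on_demand_tps,
        "fs_iops": 125 * interactive_tps + 100 * on_demand_tps,
    }

size_infrastructure(11, 0, 16)
# Example 1 → {"db_cores": 8, "db_iops": 660, "fs_iops": 1375}
```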
Example 1
Organization A has estimated a peak load of 11 documents per second using the Interactive application.
This load can be handled by 4 x 4-core application servers (ignoring the fifth server, which is just for
redundancy).
The peak load of 11 Interactive documents per second can be handled by a database server with 8 cores
(16 cores divided by 2). This level of load will generate approximately 660 IOPS (60 IOPS multiplied by 11
Interactive documents per second). A database disk system with at least 1000 IOPS should therefore be provisioned.
11 Interactive documents per second would require approximately 1375 IOPS on the shared file system
(125 IOPS per tps multiplied by 11 documents per second). A file system with at least 2000 IOPS should
therefore be provisioned.
Example 2
Organization B has estimated a peak load equivalent to 15 documents per second using the Interactive
application. This load can be handled by 5 x 4-core application servers (ignoring the sixth server, which is
just for redundancy).
The peak load of 15 Interactive documents per second can be handled by a database server with 10 cores
(20 cores divided by 2). Depending on the available infrastructure, it might be more practical to deploy a
server with 12 or 16 cores. This level of load will generate approximately 900 IOPS (60 IOPS multiplied by
15 Interactive documents per second). A database disk system with at least 1200 IOPS should therefore be
provisioned, ideally 1500 IOPS, or more if available.
15 Interactive documents per second would require approximately 1875 IOPS on the shared file system
(125 IOPS per tps multiplied by 15 documents per second). A file system with at least 2500 IOPS should
therefore be provisioned.
4: Validate the environment through testing
There are many factors that can affect the capacity of a system. For example, the templates used in the
performance tests were designed to represent real world documents, but every organization has different
requirements and some templates will be more complex than others. More complex templates might
require more CPU during the composition phase and also may generate more I/O traffic in the shared file
system.
It is not possible to simulate every variation of a user’s workflow, and in some implementations the system
might be used in very different ways. It is therefore essential that some testing is carried out to validate
any assumptions made during the capacity planning process. The tests should use the templates from the
production system and test cases should be built to simulate the way the users work. If there are multiple
groups of users carrying out different types of work, test cases should be created for each type of user.
The techniques illustrated in this document can be used to measure the capacity of a “reference” system
and the results can then be extrapolated to validate the capacity planning assumptions.
Diagnostic and Tuning Techniques
It is often possible to identify a system’s bottleneck by looking at some basic resource information.
A key indicator is the CPU utilization on each of the servers. If the application servers’ CPUs are running at
more than 75% it suggests that additional application servers are required. If the database server’s CPU is
high it suggests that either the server is not large enough for the load or that some database maintenance
is required.
Measuring the disk I/O of the shared file system will help to identify if there is a constraint whilst reading
or writing to the disk. The number of bytes per second or IOPS should be measured if possible, and these
numbers should be compared with the stated capabilities of the file server, SAN or NAS.
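This first-pass triage could be scripted roughly as follows (names and structure are illustrative; the 75% threshold for the database CPU is an assumption, since the text above only describes it as "high"):

```python
def triage(app_cpu_percents, db_cpu_percent, fs_iops_observed, fs_iops_rated):
    """Rough bottleneck triage from basic resource measurements."""
    findings = []
    # application servers above 75% CPU suggest more servers are needed
    if any(cpu > 75 for cpu in app_cpu_percents):
        findings.append("application tier: CPU over 75%, add servers")
    # high database CPU suggests a larger server or overdue maintenance
    if db_cpu_percent > 75:
        findings.append("database: consider a larger server or maintenance")
    # shared file system running at its rated IOPS is a likely constraint
    if fs_iops_observed >= fs_iops_rated:
        findings.append("shared file system: I/O at rated capacity")
    return findings

triage([80, 78], 40, 900, 2000)
# → ["application tier: CPU over 75%, add servers"]
```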
If these basic measurements do not yield any useful results, more detailed information can be collected
from the Java Virtual Machines (JVMs) used by the EngageOne bundles. The EngageOne Compose
Administration Guide explains how to enable Java Management Extensions (JMX) so that third party tools
can monitor resource usage in the JVMs. The resources that can be monitored include:
• Memory usage
• CPU usage
• Threads
The database is a key, shared resource and it is important that it operates efficiently. Housekeeping tasks
should be carried out regularly and a DBA should ensure that appropriate maintenance jobs are run.
Vendor-specific monitoring tools such as SQL Server Profiler and Oracle Enterprise Manager can help to
identify performance issues in their respective databases.