unica flowchart performance
© 2011 IBM Corporation
Mark Griner, EMM Managing Consultant, Professional Services
Campaign Flowchart Design For Optimal Performance
Marketing Innovation Summit 2011
Agenda
IBM Unica Campaign Fundamental Features
– Table mapping
– Flowchart operation
– Custom macros
Design for Performance
– Design process
– Flowchart building blocks
– Campaign server throughput
Q & A
Target Audience
Organization Roles:
– Application owner
– Technical application lead
– Application administrator
Skill Sets:
– OS scripting
– SQL
– Advanced IBM Unica Campaign features
Responsibilities:
– Application configuration
– Advanced design and performance tuning
– Creating reusable application objects (catalogs, custom macros, etc.)
IBM Unica Campaign Fundamental Features
Table Mapping: Terminology
In Data Marts:
“Dimension” tables contain normalized lists of values (States, Zips, Products, etc)
“Fact” tables contain quantitative characteristics of business transactions with foreign key for dimensions
A table is either a fact or a dimension
In IBM Unica Campaign:
“Base” tables contain audience keys
– For a “normalized” base table, the audience column is the primary key
– “Denormalized” base tables contain multiple records with the same audience key
“Dimension” tables can be joined to base tables or other dimensions
A table can be mapped multiple times in different roles
The Role of Table Mappings
Providing IBM Unica Campaign with metadata about audience levels and table relationships
Controlling the complexity of user queries
Mapping at the right audience level(s) simplifies flowchart logic
Defining the structure of SQL generated by IBM Unica Campaign:
– Joins between base and dimension tables are executed in the database
– Joins between base tables are performed on IBM Unica Campaign server
– Types of join (inner, outer or automatic) between two tables
Table Mappings: Example
Audience Level: Customer
Customer (normalized base table; dimension of Transactions)
  CustID (PK), Name, Address, …
Transactions (de-normalized base table; dimension of Customer)
  TxnID (PK), CustID (FK), ProdID (FK), TimeID (FK), ChannelID (FK), Quantity, Amount, Discount
Product (dimension)
  ProdID (PK), SKU, Brand, …
Channel (dimension)
  ChannelID (PK), Description, …
Time (dimension)
  TimeID (PK), Date, …
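As a rough illustration of the mapping above, the sketch below uses Python's built-in sqlite3 module to run the kind of base-to-dimension join that would be executed in the database. The sample rows and the Amount > 40 condition are invented for illustration, and the SQL is hand-written; it is not the SQL that Campaign generates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customer (CustID INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Transactions (TxnID INTEGER PRIMARY KEY, CustID INTEGER, Amount REAL);
INSERT INTO Customer VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
INSERT INTO Transactions VALUES (10, 1, 50.0), (11, 1, 75.0), (12, 2, 20.0);
""")

# Customer plays the base table (audience key CustID) and Transactions its dimension.
# The join runs inside the database, so only audience keys come back to the caller.
rows = conn.execute("""
    SELECT DISTINCT c.CustID
    FROM Customer c
    JOIN Transactions t ON t.CustID = c.CustID
    WHERE t.Amount > 40
    ORDER BY c.CustID
""").fetchall()
customer_ids = [r[0] for r in rows]
print(customer_ids)
```

Only the list of matching audience keys leaves the database, which is the point of mapping Transactions as a dimension rather than joining two base tables on the server.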
Table Mapping: Example 2
Audience Level: Household
Household (normalized base table)
  HH ID (PK), Address, City, State, ZIP, …
Customer (dimension of HH)
  CustID (PK), First Name, Last Name, HH ID, …
Transactions (dimension of Customer)
  TxnID (PK), CustID (FK), ProdID (FK), TimeID (FK), ChannelID (FK), Quantity, Amount, Discount
Product (dimension of Transaction)
  ProdID (PK), SKU, Brand ID, …
Channel (dimension of Transaction)
  ChannelID (PK), Description, …
Time (dimension of Transaction)
  TimeID (PK), Date, …
Brand (dimension of Product)
  BrandID (PK), Brand Name, Description, …
Understanding Flowchart Operation (No in-db)
[Diagram: steps exchanged between the Campaign Server and the Database. Stages: Universe and Suppression Queries; Query for Derived Field Data; Query for Output Fields. Step labels: Process, Select Query, ID List Merge, Temp Table Creation, ID List Upload, ID Lists Retrieval, Data Retrieval, ID List Segment, DF Calc, Sorting, File Output.]
Understanding Flowchart Operation (With in-db)
[Diagram: steps exchanged between the Campaign Server and the Database. Stages: Universe and Suppression Queries; Query for Derived Field Data; Query for Output Fields. Step labels: Process, Insert into Select Query, Select Query, Merge Query, Data Retrieval, ID List Segment, DF Calc, Temp Table Creation, ID List Upload.]
Campaign Macros vs. Database Functions
Campaign GROUPBY Macros | SQL Server 2005 | Oracle | DB2 | Netezza | Teradata
Aggregation Functions
AvgOf | AVG | AVG | AVG | AVG | AVG
CountOf | COUNT | COUNT | COUNT | COUNT | COUNT
DistinctOf | COUNT (DISTINCT) | COUNT (DISTINCT) | COUNT (DISTINCT) | COUNT (DISTINCT) | COUNT (DISTINCT)
MaxOf | MAX | MAX | MAX | MAX | MAX
MinOf | MIN | MIN | MIN | MIN | MIN
SumOf | SUM | SUM | SUM | SUM | SUM
StdDevOf | STDDEV | STDDEV | STDDEV | STDDEV | STDDEV_POP
Analytic Functions
IndexOf | ROW_NUMBER | ROW_NUMBER | ROW_NUMBER | ROW_NUMBER | ROW_NUMBER
MedianOf | ROW_NUMBER / COUNT = 0.5 | PERCENTILE_DISC(0.5) | ROW_NUMBER / COUNT = 0.5 | ROW_NUMBER / COUNT = 0.5 | ROW_NUMBER / COUNT = 0.5
RankOf | RANK | DENSE_RANK | DENSE_RANK | DENSE_RANK | RANK *)
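The "ROW_NUMBER / COUNT = 0.5" entry for MedianOf is terse, so here is one possible reading of it, sketched with Python's sqlite3 module (window functions need SQLite 3.25 or later). The table contents are invented, and the exact SQL Campaign emits for MedianOf is not shown in the slides; this version picks the middle row per audience key, or the lower of the two middle rows when the count is even.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Transactions (CustID INTEGER, Amount INTEGER);
INSERT INTO Transactions VALUES (1, 10), (1, 30), (1, 20), (2, 5), (2, 7);
""")

MEDIAN_SQL = """
SELECT CustID, Amount
FROM (
    SELECT CustID, Amount,
           ROW_NUMBER() OVER (PARTITION BY CustID ORDER BY Amount) AS rn,
           COUNT(*)     OVER (PARTITION BY CustID)                 AS cnt
    FROM Transactions
) AS ranked
WHERE rn = (cnt + 1) / 2   -- integer division: middle row, or lower-middle for even counts
ORDER BY CustID
"""
medians = conn.execute(MEDIAN_SQL).fetchall()
print(medians)
```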
Custom Macro Definition Example
Tools -> Custom Macros
Name and parameter list
Description/Help
Expression
– SQL that selects audience keys and optional value
– Includes TEMPTABLE token and <parameters>
– Uses {} for conditional query generation
Expression type
Data source
Return value type
Custom Macro Usage Example
• Used in query definitions
• Listed in Query Helper
• Syntax help is available
• Top level Select SQL:
  select a.CustomerID, sum(a.UnitPrice*a.Quantity - a.Discount)
  from dbo.vOrderDetails a
  group by a.CustomerID
• Mid stream SQL:
  select a.CustomerID, sum(a.UnitPrice*a.Quantity - a.Discount)
  from dbo.vOrderDetails a, UAC_34 b
  where a.CustomerID=b.CustomerID
  group by a.CustomerID
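The two SQL variants above differ only in the join to the input cell's temp table, which is what the {} conditional section of a macro expression is for. The sketch below mimics that expansion in Python. The template spelling (the `<TempTable>` token and the brace handling) and the `expand_macro` helper are assumptions made for illustration; this is not Campaign's parser or its documented syntax.

```python
import re

# Hypothetical macro expression: the braced part is only wanted mid stream.
TEMPLATE = (
    "select a.CustomerID, sum(a.UnitPrice*a.Quantity - a.Discount) "
    "from dbo.vOrderDetails a{, <TempTable> b where a.CustomerID=b.CustomerID} "
    "group by a.CustomerID"
)

def expand_macro(template, temp_table=None):
    """Expand the template; keep braced text only when an input temp table exists."""
    def conditional(match):
        if temp_table is None:
            return ""  # top level select: no input cell, drop the join
        return match.group(1).replace("<TempTable>", temp_table)
    return re.sub(r"\{([^{}]*)\}", conditional, template)

top_level = expand_macro(TEMPLATE)
mid_stream = expand_macro(TEMPLATE, "UAC_34")
print(top_level)
print(mid_stream)
```

Calling the helper without a temp table reproduces the top level query, and passing UAC_34 reproduces the mid stream query that restricts the aggregation to the incoming cell.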
Using SQL Custom Macros with In DB Processing
03/31/2008 20:42:31.756 (4312) [I] [PROCESS] ONE_PROCESS_RUN_START Select2
...
03/31/2008 20:42:31.776 (4312) [I] [DB QUERY] [Select2] FIONA: DROP TABLE UAC_116_4
...
Calculating macro values:
03/31/2008 20:42:35.211 (4312) [I] [DB QUERY] [Select2] FIONA (thread 00000D08): CREATE TABLE UAC_116_4 NOLOGGING AS
SELECT AUTODCC.ACCT.INDIV_ID INDIV_ID, sum ( ACCT_BALANCE) Val
FROM AUTODCC.ACCT , UAC_116_3
WHERE AUTODCC.ACCT.INDIV_ID = UAC_116_3.INDIV_ID
GROUP BY AUTODCC.ACCT.INDIV_ID [sdbtacc:2387]
…
Using macro values:
03/31/2008 20:42:48.280 (4312) [I] [DB QUERY] [Select2] FIONA (thread 00000D98):
CREATE TABLE UAC_116_5 NOLOGGING AS SELECT DISTINCT UAC_116_4.INDIV_ID FROM UAC_116_4
INNER JOIN UAC_116_3 ON (UAC_116_4.INDIV_ID=UAC_116_3.INDIV_ID) WHERE ((UAC_116_4.VAL > 5000)) [sdbtacc:2387]
…
03/31/2008 20:42:54.168 (4312) [I] [DB QUERY] [Select2] FIONA (thread 00000B88): SELECT COUNT (*) FROM UAC_116_5 [sdbtacc:2387]
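Log excerpts like the one above can be mined for timing. The sketch below parses the timestamps of a few lines abbreviated from that excerpt and reports the gap between consecutive entries. Treating a gap as the approximate duration of the earlier statement is an assumption: it only holds if entries are written when a statement starts and nothing else is interleaved.

```python
import re
from datetime import datetime

# Lines abbreviated from the log excerpt above (SQL bodies trimmed).
LOG_LINES = [
    "03/31/2008 20:42:31.756 (4312) [I] [PROCESS] ONE_PROCESS_RUN_START Select2",
    "03/31/2008 20:42:35.211 (4312) [I] [DB QUERY] [Select2] FIONA (thread 00000D08): CREATE TABLE UAC_116_4 NOLOGGING AS",
    "03/31/2008 20:42:48.280 (4312) [I] [DB QUERY] [Select2] FIONA (thread 00000D98):",
    "03/31/2008 20:42:54.168 (4312) [I] [DB QUERY] [Select2] FIONA (thread 00000B88): SELECT COUNT (*) FROM UAC_116_5",
]

STAMP = re.compile(r"^(\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2}\.\d{3}) ")

def gaps_in_seconds(lines):
    """Return seconds elapsed between consecutive timestamped log lines."""
    times = [
        datetime.strptime(m.group(1), "%m/%d/%Y %H:%M:%S.%f")
        for m in map(STAMP.match, lines)
        if m  # skip continuation lines that carry no timestamp
    ]
    return [round((b - a).total_seconds(), 3) for a, b in zip(times, times[1:])]

gaps = gaps_in_seconds(LOG_LINES)
print(gaps)
```

In this excerpt the largest gap follows the CREATE TABLE that calculates the macro values, which suggests where tuning effort would pay off first.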
Design For Performance
Design for Performance Process
[Cycle diagram. Step labels: Design, Execute Flowchart, Collect Data, Analyze, Design/Config Change, Communicate.]
Design Best Practices
When designing individual flowcharts
– Distribute workload between AC and Database
  • Perform data intensive operations where they can be done faster
    – Table joins
    – Complex calculations
  • Minimize transfer of data between Campaign and databases
– Leverage available resources
  • Campaign
    – Virtual memory
    – Connections and query threads
  • Database
    – SQL structure (select distinct, order by)
    – Query optimizer (by collecting statistics, creating indices)
    – Bulk loader
    – Temp table use thresholds and temp table pooling
For improving overall Campaign throughput
– Optimize frequently scheduled and long running flowcharts first
– Consider consolidating a large number of similar user flowcharts into a “system flowchart”
Flowchart Building Blocks
Select Process
Process Configuration | In-DB Eligibility
Query with a single base table and dimensions | Yes
Query with multiple base tables | Yes (full outer join)
Derived fields | Yes (single SQL macro)
Persisted derived fields | No
SQL custom macros | Yes
Raw SQL | Yes (with <outputtemptable> token)
Using Max Cell Size | No
Select Process Recommendations
Control table joins with appropriate table mappings
If selection based on calculated fields is required
– Avoid calculations in top level selects
– Use SQL custom macros to keep calculations in the database
– Use Campaign macros to calculate on the Unica server
Beware of performance impact of limiting cell size
Extract Process
Process Configuration | In-DB Eligibility
Extract within one data source and no derived fields | Yes
Extract from one data source landing in a file or another data source | No
Derived fields | No
De-duping, sorting | No
Use Extract when you need records, not audience keys. For example:
• Pick up new transactions/responses from a touch point system
• Enrich, derive, aggregate data
• Move data from one location/format to another
Example of Extract
Extract Log
Creating extract table:
04/06/2011 13:26:13.081 (9532) [I] [DB QUERY] [Extract2] NorthWind (thread 00001C04): CREATE TABLE UAC_EX_1_8 (CustomerID nchar(5), OrderDate datetime, UnitPrice decimal(19,4), Quantity smallint, Discount real, Amount real) [sdbtacc:2663]
...
Populating extract table:
04/06/2011 13:26:14.094 (9532) [I] [DB QUERY] [Extract2] NorthWind (thread 0000219C): INSERT INTO UAC_EX_1_8(CustomerID, OrderDate, UnitPrice, Quantity, Discount, Amount) SELECT dbo.vOrderDetails.CustomerID, dbo.vOrderDetails.OrderDate, dbo.vOrderDetails.UnitPrice, dbo.vOrderDetails.Quantity, dbo.vOrderDetails.Discount, dbo.vOrderDetails.Amount FROM dbo.vOrderDetails WHERE (dbo.vOrderDetails.OrderDate > '1998-01-01') [sdbtacc:2663]
Extract Process Recommendations
Perform data selection and calculations in separate Extracts
If the size of the extract is small, use Unica macros for calculations
If the size of the extract is large
– Write extract results to a working table
– Create SQL macros to do calculations on data in the working table
– Use Select with Derived Fields to complete calculations
When the extract source does not have a common audience key
– Land extract results in the marketing data mart
– Join the extract to the Customer table as a dimension (on email and last name, for example)
– Snapshot extract results plus an audience key from the dimension to a work table
– Map the work table at a primary audience level
Merge Process
Merge is always eligible for in db processing
Merge (OR) is implemented as a UNION of the input cells' temp tables
Match (AND) is implemented as a JOIN of the input cells
Exclude is implemented by default using a NOT EXISTS construct
Exclude can also be performed by EXCEPT and NOT IN queries
Data source level parameters that control Exclude query structure:
– UseExceptForMerge=TRUE | FALSE
– UseNotInForMerge=TRUE | FALSE
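To make the three Exclude query shapes concrete, the sketch below runs each against two small stand-in temp tables using Python's sqlite3 module. The table names and rows are invented, and the SQL is hand-written rather than taken from Campaign's output; it only shows that the three constructs return the same keys when the data has no NULLs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE cell_in (CustID INTEGER);    -- stands in for the input cell temp table
CREATE TABLE cell_excl (CustID INTEGER);  -- stands in for the cell being excluded
INSERT INTO cell_in VALUES (1), (2), (3), (4);
INSERT INTO cell_excl VALUES (2), (4);
""")

QUERIES = {
    "not_exists": """
        SELECT i.CustID FROM cell_in i
        WHERE NOT EXISTS (SELECT 1 FROM cell_excl e WHERE e.CustID = i.CustID)
        ORDER BY i.CustID""",
    "except": """
        SELECT CustID FROM cell_in
        EXCEPT
        SELECT CustID FROM cell_excl
        ORDER BY CustID""",
    "not_in": """
        SELECT CustID FROM cell_in
        WHERE CustID NOT IN (SELECT CustID FROM cell_excl)
        ORDER BY CustID""",
}

results = {name: [r[0] for r in conn.execute(sql)] for name, sql in QUERIES.items()}
print(results)
```

The forms are not interchangeable in every case: NOT IN returns no rows if the excluded set contains a NULL, and EXCEPT also removes duplicates. Which one a given database optimizes best is something to measure rather than assume.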
Segment Process
Process Configuration | In-DB Eligibility
Segment by Field | No
Segment by Query | Same as Select
Segment by Query with Mutual Exclusivity | Same as Select
Segment Process Recommendation
Use a process configuration eligible for in-db optimization
Make sure the number of segmentation cells does not exceed the MaxQueryThreads setting
When using queries with a single base table in a mutually exclusive Segment, control the structure of de-duping SQL with parameters:
– UseExceptForMerge
– UseNotInForMerge
Sample Process
Not eligible for In DB Optimization
For random sampling in the database, use Segment with a SQL custom macro that invokes the database's random number generator
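A minimal sketch of in-database random sampling, using SQLite's random() through Python's sqlite3 module. The table, the 10% rate, and the modulo arithmetic are illustrative assumptions; other databases expose different random functions, and the SQL inside a real custom macro would need to follow that database's syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customer (CustID INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO Customer VALUES (?)", [(i,) for i in range(1, 1001)])

# random() yields a signed 64-bit integer per row; fold it into 0..99 and
# keep rows that land below 10, so roughly 10% of customers are sampled.
SAMPLE_SQL = """
    SELECT CustID FROM Customer
    WHERE (random() % 100 + 100) % 100 < 10
"""
sample = [r[0] for r in conn.execute(SAMPLE_SQL)]
print(len(sample))
```

The sample size varies from run to run because each row is drawn independently. The sampling happens where the data lives, so no ID list has to be pulled to the Campaign server first.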
Snapshot Process
Process Configuration Options | In-DB Eligibility
Output from a single base table and dimensions | Yes
Output from multiple base tables and dimensions | Yes
Output of Unica Generated Fields | Yes
Output of derived fields | Yes (single SQL macro)
Output de-duping | No
Audience Process
Not eligible for in database optimization
Convert processing to a single audience level by using base-dimension table mappings and GROUPBY Campaign or SQL macros
Or
– Replace Audience with a Snapshot that writes out both audience keys for the current selection
– Map in the Snapshot output table at a different audience level
– Use Select to pick the right keys from the output table
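The Snapshot-based alternative can be pictured with a small sqlite3 sketch. The `snapshot_out` table, its rows, and the "lowest CustID per household" rule are all hypothetical; the point is only that once both audience keys sit in one table, choosing keys at the other audience level is a plain query that can stay in the database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical Snapshot output holding both audience keys for the selection.
CREATE TABLE snapshot_out (HHID INTEGER, CustID INTEGER);
INSERT INTO snapshot_out VALUES (100, 1), (100, 2), (200, 3);
""")

# With the table mapped at household level, pick one customer per household.
# MIN(CustID) is an arbitrary rule chosen for this example.
picked = conn.execute("""
    SELECT HHID, MIN(CustID)
    FROM snapshot_out
    GROUP BY HHID
    ORDER BY HHID
""").fetchall()
print(picked)
```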
Mail List Process
Mail List process generates output list and logs to Contact History
Generation of output lists is consistent with Snapshot:
– When the source and target are in the same data source, list processing happens in the database
– When SQL macros are used to derive output fields, the processing happens in the database
Contact History logging will extract data to Campaign Server
Minimize the number of cells
Be cognizant of the effect that data driven offer personalization could have on performance
Response Process
Response process reads data from an “action table” and logs responses to Response History
Response “action table” data will be extracted to Campaign Server
Response process will create and index temp tables in the system tables data source
The complexity of queries that Response generates depends on the codes that are used for response attribution
Using treatment codes is the most efficient and precise way to track responses
Managing IBM Unica Campaign Server Throughput
Key Steps to Better Campaign Server Performance
1. Monitor Campaign server workload and performance
2. Analyze workload and identify flowcharts with highest impact (frequency, run time, resource consumption)
3. Schedule and stagger flowchart runs to avoid resource contention
4. Re-design individual flowcharts to improve performance
5. Aggregate processing performed by multiple similar user flowcharts (e.g. run by UDM) into one “system” flowchart
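Step 2 above can start from something as simple as ranking flowcharts by total run time per week. The sketch below does that in Python; the flowchart names and numbers are made up, and "frequency times average run time" is just one plausible impact measure, since the slides also mention resource consumption.

```python
from dataclasses import dataclass

@dataclass
class FlowchartStats:
    name: str             # hypothetical flowchart name
    runs_per_week: int
    avg_minutes: float

    @property
    def weekly_minutes(self):
        # Impact proxy: how much server time the flowchart uses per week.
        return self.runs_per_week * self.avg_minutes

stats = [
    FlowchartStats("weekly_newsletter", 1, 90.0),
    FlowchartStats("adhoc_counts", 25, 3.0),
    FlowchartStats("daily_suppression", 7, 40.0),
]

# Highest weekly load first: these are the candidates for redesign or rescheduling.
ranked = sorted(stats, key=lambda s: s.weekly_minutes, reverse=True)
for s in ranked:
    print(s.name, s.weekly_minutes)
```

In this made-up data the daily job outweighs the single longest run, which matches the earlier advice to optimize frequently scheduled flowcharts first.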
Questions?
Thank you!