

    SPECIAL REPORT:

    Implementing SAP HANA,

    an End-to-End Perspective

    by Jonathan Haun, Consulting Manager, Decision First Technologies; Christopher Hickman,

    Principal Consultant, Decision First Technologies; and Don Loden, Principal Consultant for BI,

    Decision First Technologies

    In this exclusive special report, get an in-depth, step-by-step look at the aspects of implementing

    business intelligence (BI) solutions on SAP HANA. Gain insight into the ways SAP Data Services

    extracts, cleanses, and loads data into data warehouses, and how these activities integrate with

    SAP HANA. Learn how SAP HANA acts as the focal point for storing and calculating data loaded

    into its in-memory tables. Finally, find out how SAP BusinessObjects BI 4.0 analyzes and visualizes the data

    stored in SAP HANA through mobile analytics, reports, and dashboards.


This report is produced by SAPexperts in cooperation with Decision First Technologies.

SAPexperts publishes the most valuable, independent SAP expertise on the web. Validated instruction, tutorials, tools, tips, and case studies are continuously added to help your entire team advance their skills and overcome the toughest SAP challenges.

A 12-month subscription to SAPexperts.com provides access to the most sought-after expertise from the world's top experts in SAP technology. It's like adding experts to your team who have all the answers, without the cost of hiring them. Learn how SAPexperts helps corporations and consultants, or get subscription rates for your team. To speak with an SAPexperts representative, call +1-781-751-8799.

This document is for your personal use only. Reproduction or distribution in any form is strictly prohibited without the permission of the publisher, SAPexperts. For information about SAPexperts, visit SAPexperts.com.

Copyright 2012 WIS Publishing | www.SAPexperts.com

Decision First Technologies

Decision First Technologies is a professional services company specializing

    in delivering end-to-end business intelligence solutions to its customers.

    An SAP gold channel partner and six-time SAP BusinessObjects Partner of

    the Year, DFT resells and offers certified consulting, training and technical

    support for all analytics as well as database and technology products in

the SAP BusinessObjects portfolio. DFT's comprehensive BI solutions

    allow users to access, analyze, and share information stored in multiple data sources, including SAP and

non-SAP environments. The company's data warehousing practice designs and implements enterprise solutions utilizing SAP Business Warehouse, appliances such as SAP HANA, or traditional relational

    database technologies. With more than 10 years of experience, DFT has helped hundreds of companies

    make better business decisions, dramatically reducing costs, increasing revenues, and boosting profits.

    For more information on Decision First, visit http://www.decisionfirst.com/ or follow on Twitter

    @DecisionFirst.

    Jonathan Haun

    Jonathan Haun has more than 12 years of information technology experience and has

    served as a developer and administrator within a diverse set of technologies. Over the

    past six years, he has served as a full time SAP BusinessObjects and SAP Data Services

    principal consultant for Decision First Technologies. He has gained valuable experience

    with SAP HANA based on his project experience and management of the Decision First

    Technologies SAP HANA lab in Atlanta, Georgia. He also writes the All Things BOBJ BI

    blog at http://bobj.sapbiblog.com. You can follow Jonathan on Twitter @jdh2n. You can

contact Jonathan by email at [email protected].

    Christopher Hickman

    Christopher Hickman is a principal consultant at Decision First Technologies. He has

experience in the SAP BusinessObjects suite of applications as well as ESRI's range

    of GIS applications (ArcGIS Desktop, Server, Business Analyst Desktop, BA Server,

etc.) and systems used to tie the two together (APOS, Centigon, Antivia), resulting in

    fully featured business and location intelligence systems. He is also heavily engaged in

    social media and the opportunities presented with the synergy generated from various

    social media utilities. You can follow Chris on Twitter @chickman72. You can contact

    Christopher by email at [email protected].

    Don Loden

    Don Loden is a principal consultant for BI at Decision First Technologies with full

life cycle data warehouse development experience in multiple verticals. He is an SAP-certified application associate on SAP BusinessObjects Data Integrator and

    he speaks globally at numerous SAP and ASUG conferences. He has more than 12

    years of information technology experience in the following areas: ETL architecture,

    development, and tuning, logical and physical data modeling, and mentoring on data

    warehouse and ETL concepts. You can follow Don on Twitter at @donloden. You can

    contact Don by email at [email protected].

You can contact the editors at SAPexperts by email at [email protected] or follow them on

    Twitter @SAPexperts.


    SPECIAL REPORT:

Implementing SAP HANA, an End-to-End Perspective

Managing SAP HANA with a Proper Data Model
    Start with a Good Data Model for SAP HANA
    Profiling Data with SAP Information Steward's Data Insight
    Loading Data into SAP HANA using SAP Data Services 4.0
SAP HANA Modeling Process
    SAP HANA Studio
    Schemas
    Packages
    Attribute Views
    Analytic Views
    Calculation Views
    Analytic Privileges
    Combining the Modeling Components to Produce Analytic Views and Calculation Views
Reporting and Analytics
    Connecting to SAP HANA
    SAP BusinessObjects IDT
    Using Visualization and Analytic Tools in SAP HANA
    Tool Connectivity Matrix
Conclusion


When you take a moment to think back to all the technical innovations that have occurred during the last 30 years, several thoughts come to mind. There was the invention of the Nintendo game console. By today's standards, it is not a technical wonder, but it did lead to the

    birth of a new market that paved the way for all the amazing

    game consoles and personal gaming devices that exist today.

    There was the invention of the Internet, which helped to

    essentially change the way we humans shop, communicate,

    share information, and collaborate. There was the invention

    of the smartphone, a device that put the power of the

    Internet in the palm of our hand in virtually every city in the

    world.

    Just imagine for a second what life would be like if

companies such as AOL, Apple, and Nintendo lacked the ability to develop these products and bring these technical wonders to market. Technical innovation is something that we have all come

    to expect, but how does one recognize when innovation will lead to fundamental change?

For those of us who have been working in the business intelligence (BI) arena for the past decade, the limitations of interacting with large quantities of data at speeds acceptable to business users have been a real challenge. The relational database technologies that had been a core component of our strategies were reaching a point of diminishing returns. No real innovation was being introduced by the main database vendors, or at least no innovation that offered major

    performance change. In large part, that was due to their need to support legacy solutions while

    attempting to provide perceived enhancements. Their strategy for innovation was slow and

    continually centered around the use of inefficient and increasingly expensive magnetic storage

    arrays.

In the meantime, SAP was struggling to find a solution to help its customers solve the steadily worsening performance problems associated with managing large volumes of SAP application

    data. In 2008, SAP began working on a pilot project to prove that the basic mechanisms and

    processes of a database could be re-developed, leveraging RAM and multi-core CPUs in a way that

    would revolutionize the capabilities of BI, analytics, and complex data processing. Based on first

    impressions and more than a year of experience working with SAP HANA, we believe SAP has

    developed an innovative data platform that will lead to revolutionary changes in BI and beyond.

    SAP HANA is a merger of software, hardware, and creative ideas that afforded SAP the

    opportunity to rethink the database platform. Because SAP had the opportunity to develop

    this technology from the ground up, without the constraints of the legacy relational database

    management system (RDBMS) vendors, innovation was an inevitable result of its efforts.

    Hardware had evolved to a state where RAM could be addressed in terabytes and CPU cores could

be numbered in the hundreds, all within a single blade chassis or server rack. When you combine

this with SAP HANA's ability to compress data in-memory, organizations had a viable solution for

    managing 40 to 120 terabytes of data on a platform that could produce query results so quickly

    that many questioned if what they were seeing was a hoax.

Key Concept >>

SAP HANA modeling is a process whereby a developer converts raw columnar tables into business-centric logical views, such as dimensions and measures. The result lets business consumers find their data elements, group by business elements, and filter and sort data. There are seven components behind SAP HANA modeling, each with its own function.

Will SAP HANA lead to fundamental change? In some regards, we are already seeing other


    database vendors update their solutions to be more like SAP HANA. For organizations that have

    already adopted SAP HANA, there is no question that it has changed the capabilities of analytics

    and data processing. Only time will be the true judge of SAP HANA, but all indications are that

    SAP has developed a solution that will lead BI into the next generation.

When organizations look to develop solutions on SAP HANA, there are three ways you can categorize the available solutions:

• The first way that organizations can use SAP HANA, while leveraging their investments in traditional SAP BI solutions, involves moving their BW environment, based on a legacy database, to SAP NetWeaver BW powered by SAP HANA.

• The second broad category of solutions can be characterized as rapid solutions based on a specific industry, business process, or line of business.

• The final category, and the main focus of this report, pertains to the ways organizations can use the SAP HANA database by moving data from multiple sources, in either batch or real time, into the SAP HANA in-memory database. In general terms, we label this final solution SAP HANA standalone.

For organizations that have years of experience and knowledge invested in the SAP NetWeaver BW platform, SAP NetWeaver BW powered by SAP HANA will prove to be the most straightforward and cost-effective SAP HANA-based solution available. Organizations will experience very few process or procedure changes with this solution, because for the most part only the underlying relational database that powers SAP NetWeaver BW 7.3 changes. However, there are specific optimizations whereby DataStore objects (DSOs) and InfoCubes can be converted to in-memory optimized versions. Under the covers, there are also several optimizations within the code that effectively push down processes that were previously handled at the application layer to the SAP HANA database. The net result is a substantial reduction in database storage requirements and query response times.

The SAP Web site (http://www.sap.com/solutions/technology/in-memory-computing-platform/hana/overview/index.epx) has a long list of prebuilt, accelerated solutions designed specifically to use SAP HANA. Each solution is tailored for a specific business process, line of business, or industry. The list includes, but is not limited to, SAP CO-PA Accelerator, SAP Finance and Controlling Accelerator, SAP Smart Meter Analytics, and SAP Sales Pipeline Analysis. As of October 2012, you can find just over 20 solutions available, but you should expect to see the list grow as SAP and its partners find innovative ways to use SAP HANA.

The final category of solutions centers on the SAP HANA standalone in-memory database. Those who have blazed the trail with traditional Enterprise Information Management (EIM) solutions will find the most comfort with this category. The solution includes the use of SAP BusinessObjects 4.0, SAP Data Services 4.0, and SAP HANA.

    SAP Data Services 4.0 provides all the features needed to support enterprise level data

    management. SAP Data Services is a proven tool for managing all aspects of EIM. It is used by

    thousands of companies to extract, cleanse, translate, model, and load data into data warehouses

    and data marts. With the release of version 4.x, it is tightly integrated with SAP HANA while

    maintaining support for almost every popular legacy RDBMS and business application on the


market. In short, it is an excellent tool for extracting data from both SAP and non-SAP sources. SAP HANA will serve as the engine for storing, aggregating, calculating, filtering, and forecasting the data loaded into its columnar or row store in-memory tables. BusinessObjects 4.0 provides the tools needed to analyze and visualize the data stored in SAP HANA. It includes a Swiss Army knife of tools that all have well-defined mechanisms to connect to the data on SAP HANA.

    As you continue to read this special report, we will walk you through all the aspects of implementing

    BI solutions on SAP HANA standalone using SAP Data Services to manage and load data, SAP

    Information Steward to profile and research data issues, SAP HANA to develop and manage multi-

    dimensional models, and the SAP BusinessObjects suite of tools to create mobile analytics, reports,

    and dashboards.

    For those reading this special report with little or no experience using SAP BusinessObjects

    or SAP Data Services, we hope to provide insight into how companies and Decision First

    Technologies have implemented successful solutions for over a decade using SAP BusinessObjects

    EIM and analytic best practices. For those looking to find more information on creating multi-

    dimensional models in SAP HANA, this special report will also provide you with valuable

    insight into that world.

    Managing SAP HANA with a Proper Data Model

SAP HANA provides such a powerful in-memory data platform that much more information is available, at speeds never seen before. That is why managing information appropriately is more important than ever. SAP HANA in a standalone configuration is truly a blank slate. There

    are no tables, no models, no views, and no data. You must not only get your data into SAP HANA

    but also plan and design the structures and strategy to house your data. In this portion of the

special report, we focus on managing data effectively using proper data modeling techniques, profiling and examining data with SAP Information Steward, and finally loading data into SAP

    HANA using SAP Data Services.

    Start with a Good Data Model for SAP HANA

    Data modeling in SAP HANA is quite similar to traditional data modeling with some subtle

differences. Data must be modeled into efficient structures that take full advantage of SAP HANA's in-memory structure and analytic modeling capabilities before presenting the data to reporting tools such as SAP BusinessObjects BI 4.0. In certain cases this deviates from traditional data

    modeling techniques.

Traditionally, star schemas have been used as the backbone of BI design, and this approach also works well as a baseline data model for SAP HANA. With a traditional RDBMS, your data is modeled into a star schema consisting of fact tables with measures and dimension tables with attributes to describe the data. Notice in Figure 1 that the Fact_Sales table has measures of units

    sold with foreign keys to the dimension tables: Dim_Date, Dim_Product, and Dim_Store. Data

    structured in this manner performs quickly and efficiently when joined in queries and presented

    to reporting tools.


Figure 1: Typical star schema example with one fact table and multiple dimensions
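Figure 1 is reproduced as an image; as a rough sketch of the structure it depicts (table and column names follow the figure's labels, and the data types are illustrative):

    CREATE TABLE Dim_Date    (Date_Id    INTEGER PRIMARY KEY, Calendar_Date DATE, Calendar_Year INTEGER);
    CREATE TABLE Dim_Product (Product_Id INTEGER PRIMARY KEY, Product_Name VARCHAR(100));
    CREATE TABLE Dim_Store   (Store_Id   INTEGER PRIMARY KEY, Store_Name   VARCHAR(100));

    -- The fact table holds the measure plus a foreign key to each dimension
    CREATE TABLE Fact_Sales (
        Date_Id    INTEGER REFERENCES Dim_Date (Date_Id),
        Product_Id INTEGER REFERENCES Dim_Product (Product_Id),
        Store_Id   INTEGER REFERENCES Dim_Store (Store_Id),
        Units_Sold INTEGER
    );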

    This is certainly a good starting structure for an SAP HANA data model, with a couple of

    exceptions. SAP HANA stores data either in rows or in a columnar format, so this degree of

normalization is not always necessary or even beneficial for certain types of queries. Both in our lab at Decision First Technologies and at client sites, we have seen better performance in some situations with SAP HANA by denormalizing or flattening data in certain fact tables when this

    flattened data is stored in column store tables.

When data is stored in a columnar table, repeating data has a greater likelihood of being stored only once, using run-length encoding. With this method, the values are sorted so that repeating values are likely to land next to each other, and run-length encoding counts the number of consecutive column elements with the same value. If consecutive values are the same, only one instance is stored.

    This is achieved by actually storing column data using two columns: one for the values as they

    appear in the table and another for a count of the use of those values. This encoding method

    yields good compression and the query response times are often better querying this type of

    structure with repeating data stored in a columnar table over data stored in relational row tables.
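As a simple illustration of the idea (the values are hypothetical):

    Region column, sorted:  CA  CA  CA  CA  NY  NY  TX
    Run-length encoded:     (CA, 4)  (NY, 2)  (TX, 1)

Seven stored values shrink to three value/count pairs, and the more a column repeats, the better the compression.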

    For example, in our tests both on client sites and in the Decision First lab, we have seen anywhere

    from six times to 16 times compression over traditional RDBMS structures, and the performance

    has been no less than incredible.

Another reason to stray from the traditional normalized approach to star schemas in SAP HANA for BI applications is join cost. Specifically, the cost of including range-based operations across two relational tables in the row engine is expensive due to the intermediate data being transferred from the columnar engine to the row engine. These types of operations are not available in the columnar engine, so they must occur in the row engine. You then pay a performance penalty for joining the data, which is referred to as join cost.


This repositioning of data at query runtime from the columnar engine to the row engine makes these types of operations much more costly from a performance standpoint. Take the star schema example in Figure 1. This is optimized for RDBMS structures, which work fine in SAP HANA. However, the performance cost of joining the two tables Dim_Date and Fact_Sales when running the following query (shown in Figure 2) is much greater when the heavy lifting is not performed by the column engine.

These are the kinds of decisions you must consider when modeling data for storage in SAP HANA. In some cases it makes sense to move from a traditional star schema modeling technique toward columnar modeling by using the columnar functions available in SAP HANA. Take the example in Figure 2, showing a typical star schema join between a sales fact table and a date dimension table.

Figure 2: Typical star schema join on sales and date

If the query were revised to use the SAP HANA EXTRACT function as shown in Figure 3, which is natively supported in the columnar engine in SAP HANA, you could avoid the join cost altogether by using a lightning-fast EXTRACT function to derive the necessary date values in real time rather than joining.

Figure 3: Using the columnar engine function EXTRACT to increase sales and date join performance
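Figures 2 and 3 are reproduced as images; the sketch below contrasts the two query styles they show. The table and column names are illustrative, and the second query assumes the fact table carries the order date itself, in line with the denormalization argument above:

    -- Figure 2 style: join to the date dimension, pushing work into the row engine
    SELECT d.Calendar_Year, SUM(f.Units_Sold)
    FROM Fact_Sales f
    INNER JOIN Dim_Date d ON f.Date_Id = d.Date_Id
    GROUP BY d.Calendar_Year;

    -- Figure 3 style: EXTRACT runs natively in the column engine, so no join is needed
    SELECT EXTRACT(YEAR FROM f.Order_Date) AS Calendar_Year, SUM(f.Units_Sold)
    FROM Fact_Sales f
    GROUP BY EXTRACT(YEAR FROM f.Order_Date);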

The query results come back faster by eliminating a whole transfer step, with the processing occurring in the more efficient column engine, in memory, using a built-in native SAP HANA function. This type of thinking is what fosters a discussion and a change in modeling data. This leads to the final topic to consider when modeling your data in SAP HANA: cardinality.


Simply put, cardinality refers to the uniqueness of data in a column or attribute of a table. There are three types of cardinality: high, normal, and low. Columns with high cardinality contain values that are mostly unique. For example, IDs are primary keys that are unique and therefore have high cardinality. State values, however, repeat in an address table (Figure 4).

Figure 4: Examples of high, normal, and low data cardinality

All new records in the Address table receive a new AddressID. This makes AddressID completely unique. Low cardinality is essentially the opposite: it refers to columns containing values that almost completely repeat. State data provides a good example of low cardinality, and such columns are typically carved off, or normalized, into separate tables, as the foreign key column StateProvinceID in the Address table shows in Figure 4. Normal cardinality refers to columns with values that are somewhat uncommon. Take shipping address values that relate to SalesOrderHeader records. Sales orders will most likely be shipped multiple times to the same address for the same customers, so there will likely be some repetition of these values in the SalesOrderHeader table.

This is why, in a traditional data model, the structure looks as it does in Figure 4. The address records would exist in a normalized structure, with an Address table holding a foreign key to SalesOrderHeader. Both low and normal cardinality conform to this modeling technique for traditional RDBMS databases, but this is entirely the wrong approach for loading data into SAP HANA.

Again, you must consider the join cost of reassembling the information at query runtime versus a more natural structure for a columnar engine. A more efficient data model for SAP HANA is shown in Figure 5. It merges both Address and State information with SalesOrderHeader and with SalesOrderDetail data to create one table in SAP HANA.


Figure 5: A merged sales table containing both address and state data in SAP HANA

One thing to notice in Figure 5, aside from the denormalized data, is the use of the float column store data type for all the amount fields. Normally, decimal data types would be used for their precision, but float data types accommodate a behavior that is unique to SAP HANA: SAP HANA requires the data type of the base column values to be able to cover, in size and precision, the maximum value of the data as it is rendered in aggregate operations.

This is especially important as datasets grow in size. For example, a decimal (19, 4) data type is fine at the individual record level in a table, but as a recordset is aggregated, the growth produces overflow errors that a decimal (19, 4) cannot cover. So you guard against this behavior by using floats for commonly calculated values, such as amount fields in base tables.
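A rough back-of-the-envelope illustration (the row counts and averages are hypothetical):

    DECIMAL(19,4) maximum value:  999,999,999,999,999.9999  (about 10^15)
    SUM over 2,000,000,000 rows averaging 1,000,000.0000 each = 2 x 10^15  ->  overflow
    FLOAT (double precision) maximum: about 1.8 x 10^308                   ->  no overflow

The float column trades exact decimal precision for the headroom that large aggregations need.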

This fact table is a poor choice for a traditional data model, as a traditional approach dictates multiple structures, and the join cost in a traditional RDBMS is helped greatly by providing indexes at all join points. However, in SAP HANA, the compression achieved by the column storage structure described earlier performs better than taking the time to join the separate tables in the row engine. The compression achieved by a column store table negates the gains of a traditional normalized structure.

We have discussed numerous examples of ways to model data and are almost ready to load the data and create these structures in SAP HANA using SAP Data Services 4.0. However, by not profiling the source data first, you may miss aspects of the data that could compromise the quality of your data. The last thing that you want in SAP HANA is really fast bad data, so you can ensure quality with data profiling in SAP Information Steward's Data Insight.


Profiling Data with SAP Information Steward's Data Insight

SAP Information Steward's Data Insight is a tool for quickly ascertaining a wealth of information about both data source tables and target tables. There are many profiling capabilities, including columns, addresses, dependency, redundancy, and uniqueness. Data Insight also has the capability to measure and record data quality over time by creating scorecards that are fully configurable to measure the quality aspects that are important to each individual company's business. It is important to note that Data Insight is only one application in SAP Information Steward. For the scope of this special report, we limit our focus to the profiling capabilities of Data Insight.

    Upon logging into SAP Information Steward, you land at the main application screen with the

    Data Insight application tab in focus, as seen in Figure 6. For the purpose of this special report,

    we have created both a project called HANA_Source and a connection to the source SQL Server

    database within this project.

Figure 6: Data Insight application on SAP Information Steward's main screen

To profile tables in this project, you click the project to launch the Workspace home screen, which is where you set up and run the profiling tasks against the tables. In our example for SAP HANA, we are loading both customer and address data with our sales data, so we need to take care to ensure that the addresses are good, verified United States Postal Service (USPS) addresses and that the customer and address data are of good quality before loading them into SAP HANA.

    To set up the column profile task, select the tables Address and Contact in the Workspace Home

    application tab. Select Columns for the profiling task from the pull-down menu as shown in

    Figure 7. After clicking Columns, you are prompted to click Save and Run Now. This executes the

    profiling job on the SAP Information Steward server, and the profile job runs the profile against

    the database tables. This is really all that you need to do to engage a column profile task.


Figure 7: Select the tables to profile in the Workspace Home and Columns from the pull-down menu

This takes care of column profiling, so we now turn our focus to address data. SAP Information Steward has the unique capability to run address profiling tasks using USPS-validated directories. It gives you information about your address data quickly, with just a few clicks and field settings. You can determine if an address in a record is a valid, deliverable address, if an address in a record is correctable using the Data Quality Management transforms in SAP Data Services, or if an address in a record is invalid and uncorrectable. A correctable address means that, according to the profile result, SAP Data Services has enough information available in the input record to a data quality job to adequately fix the address and ensure that it is deliverable by the USPS. All this is done with no coding using SAP Information Steward. Before this tool, that task was impossible without writing code.

To perform the address profile, select the Address table and Addresses from the Profile pull-down menu as represented in Figure 7. This launches the Define Addresses Task window shown in Figure 8. Using this screen, you assign or map the fields from your database table that correspond to the field mappings shown in the Define Address Task screen. In our example table, for the Address1 field in SAP Information Steward we have an AddressLine1 field, and for Address2 we have AddressLine2 in the database. Locality1-3 in SAP Information Steward refers to the city information and Region refers to state information, so those map to the City and StateProvince fields, respectively. PostalCode is the Zip code field, and a PostalCode field maps to this information. Upon filling out this form, you again click the Save and Run Now button to submit the address profiling task.


Figure 8: Map address attributes and click Save and Run Now

After the tasks finish in Information Steward, you have a lot of information about your source tables for the Data Services job. It helps you fix data quality issues in your code before the data is presented to the data model that you have set out to establish in SAP HANA. Let's consider the results of the column profile in Figure 9.

Figure 9: Results of the Data Insight Column Profile task


You can see from the results of this column profile task in Figure 9 that you have some work to do on the data before loading it into SAP HANA. There are some issues with names. It appears that some have been entered in proper case, as indicated in the Value column by the pattern Xxxxxx, and some in lower case, indicated by the pattern xxxxx (for example, the record of gomez). You need to standardize all of the names on proper or mixed case as well as run them through data cleansing transforms before loading them into SAP HANA.

Looking at the address profile results in Figure 10, it appears that you should cleanse the addresses, as there are quite a few correctable addresses that the Address_Cleanse transforms in SAP Data Services can fix. These are valuable repairs before you load the data for further presentation in SAP HANA. You are now ready to begin building your code in SAP Data Services to both build tables and load data into the model you've designed in SAP HANA.

Figure 10: Results of the Data Insight Address Profile task

    Loading Data into SAP HANA using SAP Data Services 4.0

After seeing the trouble that can arise from faulty addresses and faulty names, you are ready to craft the FACT_SALES_ORDER_DETAIL table structure that was presented in the data modeling section of this special report in Figure 5 and to load data into that structure. SAP Data Services is the only certified solution for loading third-party data into SAP HANA, and it is our vehicle for data loads. You can quickly create both row- and column-based tables in SAP HANA, thus both building and loading the model laid out in the examples above. To accomplish this, you first need to create Datastore connections to the source SQL Server database and the target SAP HANA system.


    Open the SAP Data Services Designer and browse to the Datastores tab in the Local Object Library

    on the bottom left portion of the screen. Right-click the white space to bring up the pop-up

    menu shown in Figure 11. Click New on the pop-up menu to launch the Create New Datastore

    configuration screen.

Figure 11: Click New to create Datastore connections to both the SQL Server source and SAP HANA target

In the Create New Datastore screen, you specify the settings as shown in Figure 12. Notice the ODBC Admin button on the screen. You need to create an ODBC connection to SAP HANA if you have not done so already. This is a standard ODBC connection just like any other data source, created using Windows Data Sources (ODBC) in the Windows control panel. The only thing slightly different is that you use the SAP HANA ODBC driver shipped with SAP HANA rather than a standard, Windows-supplied generic ODBC driver. This is similar to using an IBM ODBC driver to set up an IBM DB2 connection, much like other databases that are supported in SAP Data Services as ODBC connections. The SAP HANA ODBC driver is installed on the machine hosting the SAP Data Services job server.


Figure 12: Specifying new Datastore connection settings

    You now have your Datastores created and have established connections to the Microsoft SQL

    Server source database and the SAP HANA target system. All the components in SAP Data

    Services are ready to create the data flows necessary to build the FACT_SALES_ORDER_DETAIL

    table in SAP HANA.

However, it would not be wise to go directly from the source to the structure laid out in the data modeling section of this special report. What if you choose to include other data sources in your well-modeled Sales Order Header fact table in the future? By going straight from the source to SAP HANA, you are forced to use the primary key from the source table and to take the fields just as they are in the source. Usually, this is not desired in a reporting data structure.

    Dimensionally modeled star schema data marts or data warehouses should be divorced from the

    source and contain source-agnostic columns that represent business definitions and have source-

    agnostic primary and foreign key structures. The way to achieve a divorced storage structure is to

    use a staging database and create a surrogate (source-agnostic) primary key with a link back to the

    source primary key. To do this, you model a staging layer in SQL Server into your Data Services

process before moving data or creating structures in SAP HANA. Follow these steps to model a staging layer.


    Step 1. Create Staging with Surrogate Keys

Staging serves two functions in your load to SAP HANA. First, it divorces the source primary key structure from the keys that you create while loading to SAP HANA. This allows you to easily integrate other data sources in the future.

The second function of staging is to do all the manipulation or transformation of the data necessary to deal with the issues that were found earlier in profiling with SAP Information Steward. To do this, you use SAP Data Services to create a table called SALES_ORDER_DETAIL_STAGE. It has flattened or denormalized data from the following tables in your source database: SalesOrderHeader, SalesOrderDetail, Address, and StateProvince. The data in these tables will be merged into the target table to take advantage of the unique columnar engine properties of SAP HANA. This type of data structure performs better and serves as a proper foundation for exploiting the analytic modeling capabilities of SAP HANA. The fully realized data flow is depicted in Figure 13.
    Figure 13.

Figure 13: Create a SALES_ORDER_DETAIL_STAGE staging table

What's inside the data flow components? The first thing that the data flow does is join four disparate tables from the source database in the query transform labeled Query in Figure 13. You can see in Figure 14 how the joins are accomplished in SAP Data Services in the FROM tab of the query transform.


Figure 14: Join all tables together in the Query transform of the data flow DF_SALES_DETAIL_SG_I

Take note in Figure 14 that the native date fields from the source will be transformed in these data flows to varchar() fields in the format YYYYMMDD. This means a date field that looks like 09/01/2012 11:59:59 in the OrderDate source column should look like 20120901 in the staging table and in SAP HANA. The reason for this is that SAP HANA contains sophisticated built-in date-handling functionality, which we explore in the next section (analytic modeling) of this special report, and this varchar() format is what is required to take advantage of that functionality.
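As a minimal sketch of the kind of mapping expression involved (the column name follows the example; the format string is our assumption, consistent with the text):

    to_char(SALESORDERHEADER.ORDERDATE, 'YYYYMMDD')

This would turn a source value such as 09/01/2012 11:59:59 into the string 20120901.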

One last thing that is happening in the query transform in Figure 14 is that the first field, SALES_ORDER_DETAIL_ID, has a gen_row_num() function in the Mapping column of the query transform. This is the surrogate key, as the gen_row_num() function generates a row number for each record. The source table key SalesOrderID will also be mapped to the target table, so this staging table, SALES_ORDER_DETAIL_STAGE, will contain both the surrogate key and the source primary key. This table provides the link from the ultimate fact table in SAP HANA back to the source table.

    new column in the staging table. The other fields signify the business terms, not a direct link to

    any source. Take, for example, the OrderDate field. An OrderDate is an abstracted business conceptnow. It is no longer just a linked field to the source. The OrderDate stands source independent

    and represents an OrderDate business concept outside of just coming from this source. This

    concept is agnostic to the source and can be used independently to describe any OrderDate from

    any source. A new source has a new order date field that is mapped to this OrderDate field in the

    SALES_ORDER_DETAIL_STAGE table. Therefore, all the other attribute fields, such as OrderDate,

    are reused with the new source. It is the primary keys presence, along with the surrogate key, that

    provides the link back to any source table. This is the primary reason for taking the time to craft a

    staging layer for your load to SAP HANA.
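Putting these pieces together, a rough sketch of the staging table's shape (columns beyond those named in the text are illustrative):

    CREATE TABLE SALES_ORDER_DETAIL_STAGE (
        SALES_ORDER_DETAIL_ID INTEGER,      -- surrogate key from gen_row_num()
        SALES_ORDER_ID        INTEGER,      -- primary key carried over from the source
        ORDER_DATE            VARCHAR(8),   -- dates stored as YYYYMMDD strings
        ADDRESS_LINE1         VARCHAR(60),  -- flattened from the Address table
        STATE_PROVINCE        VARCHAR(50),  -- flattened from the StateProvince table
        LINE_TOTAL            FLOAT         -- float to avoid overflow in aggregates
    );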



    Another issue that arose in the data profiling is the validity of the addresses. You can use the

    USARegulatory_AddressCleanse transform in your data flow DF_SALES_DETAIL_SG_I (as shown

    in Figure 13) to correct the addresses. The address cleansing transforms are found in the Local

    Object Library under the Data Quality node as shown in Figure 15.

Figure 15: Where to find the USARegulatory_AddressCleanse transform

After placing the USARegulatory_AddressCleanse transform in the data flow, you configure both the input and output fields within the transform. The input fields map to the existing address fields coming from the source tables through the query transform. The address cleanse transform takes these field inputs and analyzes and corrects the physically stored addresses using SAP-supplied postal address directory files updated by the USPS. By using SAP Information Steward to quickly identify the address records to correct, you are able to use the address cleansing capabilities of SAP Data Services to effectively cleanse your records in the staging database.

    Now that you have your staging table SALES_ORDER_DETAIL_STAGE correctly populated, this

    table can link you back to the various sources that will be loaded over time. You are now ready to

    load the data to SAP HANA.

    Step 2. Move Data into SAP HANA and Create All Tables at Runtime in SAP Data Services

You have performed most of the heavy lifting in the staging data movements, and the load to SAP HANA is straightforward. You are essentially going to take your staging tables as a template, use the template table functionality within SAP Data Services to quickly create table structures, and load the data into SAP HANA. Template tables are handy tools. They take any recordset and craft a CREATE TABLE SQL statement against the target database. As soon as you have the structure for the table exactly as you wish, you can select a template table as the target for your data flow, as shown in Figure 16. The table structure will be created in SAP HANA at data flow runtime. After executing the Job_HANA_Load SAP Data Services job to run your DF_FACT_SALES_DETAIL data flow, you have your table structure created in SAP HANA.


Figure 16: Completed data flow in SAP Data Services to load the sales order detail into SAP HANA

The template table is a great way to quickly create the structure of the table in SAP HANA, but it may not perform as well as bulk loading data using SAP HANA's bulk loader. This is particularly important if you are loading a large table with millions of records. For smaller tables, you can stop at this point and use the template table to create the table structure and load the data, but with a larger table, such as FACT_SALES_ORDER_DETAIL, you probably want to explore the bulk loader options available from SAP HANA. To use the bulk loader capabilities within SAP Data Services, import the table into SAP Data Services as a standard table. To do this, right-click the template table in the Local Object Library that was created by running the job and data flow. The pop-up menu in Figure 17 appears.

Figure 17: Import the table in Data Services to get full standard-table functionality


After importing the table, you are free to set commit sizes or use the bulk loader by double-clicking the FACT_SALES_ORDER_DETAIL target table. This brings up the target table editor screen, in which you can specify many things about the load of the large FACT_SALES_ORDER_DETAIL table (Figure 18). Since you know this table is large, use the Bulk Load Options tab to control the maximum bind array size. Set it to 1,000,000 rows. This is a practical starting value that we have used with good results in our Decision First Technologies lab. The maximum bind array value acts like a commit size control in other target databases and batches the records together into larger groups for performance in large loading operations.

Figure 18: Use the target table editor to control the maximum bind array size

After carefully crafting your SAP Data Services job and data flows to load the FACT_SALES_ORDER_DETAIL table in SAP HANA, the only thing left to do is execute the job. Navigate to the Project Area in Designer, as shown in the top left corner of Figure 15. Right-click the job name, and select Execute Job from the pop-up menu. With data extracted, cleansed, and loaded

    into a series of SAP HANA columnar tables, you can now begin the process of developing multi-

    dimensional models or views based on those tables.

    SAP HANA Modeling Process

SAP HANA modeling is a process whereby a developer converts the raw columnar tables into business-centric logical views. Much like the process by which a legacy BusinessObjects customer would define a universe based on relational tables, modeling within SAP HANA allows columns of data to be defined as dimensions and measures. The result presents the data in a format that is more business intuitive, granting consumers an easy catalog in which to find their data elements, group by business elements, and filter and sort data.


    There are seven main components to SAP HANA modeling. Each component has a specific

    purpose and function. When these components are compiled together, the result provides a

    meaningful multi-dimensional representation of the data. The main components of modeling are

    the following:

    SAP HANA Studio

    Schemas

    Packages

    Attribute views

    Analytic views

    Calculation views

    Analytic privileges

Let's look at each component in more detail.

    SAP HANA Studio

SAP HANA Studio (Figure 19) is a Java-based client tool that allows developers and administrators to create models and manage the SAP HANA RDBMS. It is typically installed on a developer's desktop, and it is the basis for developing the rich, multi-dimensional models that are consumed by the various supported SAP BusinessObjects 4.0 reporting tools. It also contains a subset of tools for the SAP HANA database administrator (DBA). Developers use the interface to create packages, attribute views, analytic views, database views, calculation views, and analytic privileges. DBAs use the interface to manage security, roles, backups, tables, and views and to monitor the system.

Figure 19: SAP HANA Studio


    Schemas

Schemas (Figure 20) are directly associated with user accounts created by the SAP HANA DBA and are used to store row and columnar tables. Other objects, including views and procedures, are also stored in an SAP HANA schema. For each user created by the DBA or present by default in the system, a schema space exists that must be referenced when working with tables in SAP HANA. The term schema is not unique to SAP HANA; almost every RDBMS on the market incorporates this term per the schema definition standards set by the American National Standards Institute.

    Note

Schemas are secured in SAP HANA, so it is important that the developer's account and _SYS_BIC (the system account for managing SAP HANA models) have been granted SELECT rights before models can be developed or activated in SAP HANA Studio.
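A minimal sketch of such a grant (the schema name is hypothetical, and your privilege details may differ):

    GRANT SELECT ON SCHEMA "SALES_DATA" TO _SYS_BIC;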

Figure 20: Schemas

    When you create a table using SQL syntax in the SAP HANA Studio, you must reference the

    schema in the CREATE TABLE and DROP TABLE commands. The syntax of every table-related

    function always references the schema name (Figure 21).

Figure 21: CREATE TABLE and DROP TABLE commands
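Figure 21 is reproduced as an image; a minimal sketch of the pattern it shows, with hypothetical schema, table, and column names:

    CREATE COLUMN TABLE "SALES_DATA"."FACT_SALES_ORDER_DETAIL" (
        "SALES_ORDER_DETAIL_ID" INTEGER,
        "ORDER_DATE"            VARCHAR(8),
        "LINE_TOTAL"            FLOAT
    );

    DROP TABLE "SALES_DATA"."FACT_SALES_ORDER_DETAIL";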


    Packages

Packages are the first logical storage component of an SAP HANA model. Within a package you define one or more attribute views, analytic views, calculation views, or analytic privileges. Packages can be created in a hierarchical order for the purposes of security and logical ordering of components (Figure 22).

Figure 22: Package hierarchies

When you create your first package, you can give it a name, such as sales. Subsequent packages can be created using the <parentpackage>.<subpackage> naming convention. In Figure 23, we created a subpackage named northamerica. Because we wanted this package to exist under the sales package, we named it sales.northamerica. The dot or period in the name indicates that the package should be created as a child of the parent package sales. Creating a hierarchical package structure is important both for organizing modeling objects and for securing objects within packages.

Figure 23: Creating a package

    Attribute Views

    Attribute views are the logical dimension and hierarchy containers within an SAP HANA model.

    SAP HANA Studio allows you to create them by joining and filtering tables found in SAP HANA

    schemas. Attribute views are not required for an SAP HANA model, but before you can create an

    analytic view containing hierarchies, you must first create an attribute view. The end result of an

attribute view appears to be a single logical table or view of data.


Attribute views allow the developer to denormalize data by joining one or more tables, filtering one or more tables, or developing calculated attributes. Imagine you are developing a SQL view based on three tables that results in a record set containing all the information about customers who placed a sales order. Within this attribute view you would likely join tables such as Customer, Address, and Account. You can also filter the Customer table so that only active customer records are returned. The end result is a single, logical view of these tables that returns all the relevant customer information in a single unique row (Figure 24).

Figure 24: Components of an attribute view
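In SQL terms, the attribute view just described behaves like the following hypothetical view definition (all table and column names are illustrative only):

    -- A denormalized, single logical view of customer information
    SELECT c."CUSTOMER_ID",
           c."LAST_NAME",
           c."FIRST_NAME",
           a."CITY",
           a."COUNTRY",
           ac."ACCOUNT_NUMBER"
    FROM "SALES_DW"."CUSTOMER" c
    INNER JOIN "SALES_DW"."ADDRESS" a  ON c."ADDRESS_ID" = a."ADDRESS_ID"
    INNER JOIN "SALES_DW"."ACCOUNT" ac ON c."ACCOUNT_ID" = ac."ACCOUNT_ID"
    WHERE c."ACTIVE_FLAG" = 'Y';  -- filter: only active customer records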

    There are two main tabs within the interface that developers use to create an attribute view. The

    Data Foundation tab is used to define the joins, keys, and filters needed to create a complete

    attribute view. The Hierarchies tab is used to define hierarchies that are available to some of the SAP

    BusinessObjects reporting tools.

    The Data Foundation tab of the attribute view allows developers to denormalize a data set

    using joins, filters, and calculated attributes. The joins are defined as inner, left outer, right outer,

    referential, or text. If the developer right-clicks any column in a data foundation table, the user

    interface (UI) presents the option to create a filter. A filter at the foundation level is permanently

applied to the result sets and should only be used to remove records based on technical or

    business requirements.

    On the right side of the Data Foundation tab are the output columns. These columns are added by

    right-clicking a column within a table found on the Data Foundation tab. On the right-click menu,

    there is an option to Add as Attribute. Any value available on the output window is accessible

    anywhere the completed and activated attribute view is used.

    Another option available on the output windows is the derived column. You can derive attribute

    columns using the calculated attribute option. This useful feature allows developers to derive


    columns to support various reporting requirements (Figure 25). For example, you could

concatenate the customer's last and first names, separated by a comma. You can also use the if() and

now() functions and the CUSTOMER_EFFECTIVE_DATE field to create a calculated column that flags

    customers that have more than five years of history with your company.

Figure 25: Calculated attribute
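As a sketch, the two calculated attributes just described could be defined with expressions like these in the expression editor (the column names and the five-year threshold of 1,825 days are assumptions for illustration):

    "LAST_NAME" + ', ' + "FIRST_NAME"

    if(daysbetween("CUSTOMER_EFFECTIVE_DATE", now()) > 1825, 'Long-term', 'New')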

    When you define an attribute view, you select one or more columns and establish the attribute

    key (Figure 26). The attribute key is the basis for joining the attribute to an analytic foundation,

    which we discuss in more detail later. Developers can find the option to add an attribute key by

    right-clicking the table in the data foundation and selecting Add as a key attribute. It is important

that the values for this column be truly unique in the results. In traditional data modeling, developers

    define a primary key that signifies that all records are unique based on the column or columns

defined as the primary key. The same is true with an attribute view. When the attributes are joined

    within an analytic view, each record must be unique to prevent the duplication of records and

    subsequent over-aggregation of data.
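A quick way to test whether a candidate key column is truly unique is a query such as this sketch (hypothetical names); any rows returned indicate duplicates that would over-aggregate measures:

    SELECT "CUSTOMER_ID", COUNT(*) AS "ROW_COUNT"
    FROM "SALES_DW"."CUSTOMER"
    GROUP BY "CUSTOMER_ID"
    HAVING COUNT(*) > 1;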


Figure 26: Components of an attribute view

    Within an attribute view, developers can create hierarchies that can be directly used by tools,

    such as SAP BusinessObjects Analysis for Office and BusinessObjects Analysis for OLAP.

    Developers can find this option by clicking the Hierarchies tab (Figure 27). In future releases of

    SAP BusinessObjects 4.0, these hierarchies will also be accessible by SAP BusinessObjects Web

    Intelligence (also known as WebI) and possibly SAP BusinessObjects Crystal Reports for Enterprise

via direct binding to SAP HANA analytic views. Hierarchies add a logical order to data ranging from

    a narrow to a broad category.

Figure 27: Attribute hierarchies


    Hierarchies are useful when reporting needs require expand and collapse functionality for

    displaying key performance indicators and other measures. In Figure 28, you can see that the

    AccountNumber column contains a + sign, which indicates that there are child objects available.

In almost every line of business, you will find hierarchies that are useful for analyzing measures or key figures.

Figure 28: SAP BusinessObjects Analysis for OLAP

    There are four main options available when creating an attribute view in SAP HANA Studio

    (Figure 29):

• The standard attribute view type is just as the name implies. This is the type of attribute view developers choose when creating or deriving attributes based on existing tables stored in SAP HANA.

• Time-based attribute views are derived from pre-loaded date and time tables maintained by the SAP HANA system. When you create a time-based attribute view, you have the option to establish the calendar type, variant table, and granularity. Time-based attribute views are handy

    because they eliminate the need for an external tool to load and manage date and time

    tables.

• Developers use the derived attribute type to create aliases of existing attribute views. They are handy when your analytic foundation contains multiple foreign keys for various dates or times. For example, a typical sales_order_detail table likely contains three columns that represent the order_date, ship_date, and due_date. Each of the three columns contains a unique date that will be joined in the analytic foundation to three different date-based attributes. If you attempt to join all three columns to the same time-based attribute, you create a logical loop. The results of your model then display only transactions in which the order_date, ship_date, and due_date all occur on the same day. To overcome this issue, you must create a derived attribute based on an existing date-based attribute for each expected date key in your analytic foundation. Derived attributes are permanently fixed to their


parent attribute. Every change made to the parent is automatically reflected in each derived child attribute and its associated analytic views. Developers find them efficient when an attribute view alias is required.

• The final option when creating an attribute view is the Copy From option. This is different from the derived attribute in that a physical copy of an existing attribute view is created. The copy has no further association with its parent once the copy process is complete. This option is typically used when a developer wants to rename an existing attribute view without affecting the overall model.

Figure 29: Attribute view options

    Regardless of the type of attribute view you select, each attribute view is used within one or more

    analytic views to complete a multi-dimensional model. Once you have completed the design

    of your attribute view, click the save and activate icon to commit its definition to the metadata

    repository of SAP HANA (Figure 30).


Figure 30: Save and activate your attribute view

    Analytic Views

Analytic views are the heart of SAP HANA's multi-dimensional models. They bring together the

attribute views and are the basis for the measures or key figures that make up a multi-dimensional

    analytic model (Figure 31). In almost every circumstance, the analytic view is defined using a

    transactional columnar table. Transactional tables contain each record of activity within a line of

business. They can range from sales transactions to customer calls to units shipped.

Figure 31: Adding an attribute view to the data foundation


    If you are using SAP Data Services to extract, transform, and load (ETL) data into SAP HANA,

    and also following standard data modeling approaches, you will use fact tables as your analytic

foundation. If you are loading data without using an ETL process, transaction tables might

be more difficult to identify. With almost every transaction table, there is a general set of characteristics that you can use to recognize these types of tables. They typically contain dollar

    amounts or unit counts that occur over time or over a sequence of events.

    In the examples used in this report, the SALES_ORDER_DETAIL table is a perfect example. It

    contains three distinct dates and four columns that can be used as measures (Figure 32). Once

    joined with the attribute views, users can subtotal these amounts over fiscal and calendar dates,

    months, years, or quarters or by customers, states, regions, or countries.

Figure 32: Transaction tables

    When creating an analytic view, you must use a new or an existing package for storage and

    security. You specify the analytic view name and choose from the Create New or Copy From

    options (Figure 33). Note that you cannot change the name of an analytic view once it is saved

    and activated. However, developers can use the Copy From option to create a new version with a

    different name.


Figure 33: Creating an analytic view

    There are two main tabs within an analytic view. The Data Foundation tab is the starting point for

    designing an analytic view. It contains all the components needed to define the transaction or fact

table. The Logical View tab is used to define the joins between the data foundation and existing attribute views.

    On the right side, developers add one or more tables to the data foundation. Once the tables

    are added, developers define private attributes and measures by right-clicking each column and

    selecting the appropriate option (Figure 34).


Figure 34: Analytic view on the Data Foundation tab

    Private attributes are the columns used in joining to existing attribute views or for defining display

    attributes that do not exist in an attribute view. In most cases they are used to define a join path,

    but they are present in the output of any model and can be used for filtering, grouping, and

    sorting within analytic tools once the model is complete. Developers can also define filters that

    will be applied to any results generated by the final model.

    Developers typically filter the analytic view data foundation to eliminate records that should be

    excluded from any calculation based on the final model. For example, a transaction table might

    contain multiple order statuses and duplicate measure values for each status. From a business user

    point of view, only the final or confirmed order status is necessary for reporting. Using an analytic

view filter eliminates the statuses used in the workflow of entering, verifying, and confirming an

    order and only presents calculations on the records representing the final status of the order.

    From a technical perspective, developers need to filter the order status to prevent the model from

    over-aggregating the results. If an order has three statuses and subsequently three order-detail

    line records, only one record can be included in the results without triplicating the values of the

    measure.
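The SQL equivalent of such a foundation filter is simply a WHERE clause that keeps only the final status; this sketch uses hypothetical names:

    -- Without the status predicate, each order line would be counted once per
    -- workflow status, tripling the aggregated measure
    SELECT "ORDER_ID", SUM("LINE_TOTAL") AS "ORDER_TOTAL"
    FROM "SALES_DW"."SALES_ORDER_DETAIL"
    WHERE "ORDER_STATUS" = 'CONFIRMED'
    GROUP BY "ORDER_ID";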

    It is possible to include more than one table in the analytic view foundation. However, we caution

against this approach as it results in significant performance degradation when both tables contain millions of records. In almost all cases, it is better to model the data into a single table using SAP

    Data Services as data is loaded into SAP HANA. This not only simplifies the SAP HANA modeling

    tasks but also increases the query response times of any model.


    The Data Foundation output includes all the columns that are available for use on the Logical

    View tab. They consist of Attribute Views, Private Attributes, Calculated Attributes, Measures,

    Calculated Measures, Restricted Measures, Variables, and Input Parameters. The output columns

available in this view can be managed on both the Data Foundation and Logical View tabs (Figure 35). However, attribute view items will not be visible until the join work has been

    completed on the Logical View tab (Figure 34).

Figure 35: Analytic view columns

The Attribute Views section contains all the columns defined within attribute views that are joined to the

    foundation on the Logical View tab. Until you have added and joined the attribute views to your

    foundation, this section remains empty.

    Private attributes are those that you select in the foundation for joining on the Logical View tab.

    They represent columns that you can use for the display in the final model or with restricted

    measures. In any case, unless hidden, these values are available in the final model and appear as

though they are standard attributes.

    Calculated attributes allow for the manipulation of any attribute using SAP HANA formulas and

    functions. In most cases, we recommend that you design calculated attributes in the appropriate

attribute view. However, developers may sometimes find it necessary to concatenate, substring, or derive new output columns based on multiple private attributes or attribute view columns within

    the analytic view.

    Generally developers create them in the analytic view because the calculation spans multiple

    attribute views or private attributes. This is difficult to accomplish in the attribute view because

    the values might exist in disparate tables in the data model.


    Measures are defined by right-clicking columns in the foundation that will be aggregated in the

    final results of the model. SAP HANA analytic views only support the SUM, MAX, and MIN

    aggregation functions at this time. To perform more complex aggregations, you need to develop a

    calculation view, which we discuss later in this report.

    Calculated measures are defined in the output section of the analytic view. They represent

    calculations that involve static values or additional measures. For example, users might want to see

    the total value of an order less the shipping costs. This can be accomplished in calculated measures

    simply by subtracting the shipping costs from the sales order total. Developers can also define

ratios and percentages at this level, but they must consider how the consuming tools aggregate these values, as summing or averaging a ratio might occur at the reporting tool level.
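For instance, the net order value just described could be defined with an expression as simple as the following sketch (column names are hypothetical):

    "ORDER_TOTAL" - "SHIPPING_COST"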

    Restricted measures are a feature of SAP HANA models that allow the developer to define

    conditional aggregates. When defining restricted measures, the developer selects an existing

    attribute, defines an operator, and indicates a value to which it must be equal. For example,

    developers can define a measure that totals sales for 2003 and another that totals sales for 2004.

    When these values are aggregated and grouped on country, users can see total sales for 2003 and

    2004 for each country.
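In SQL terms, the two restricted measures just described behave like conditional aggregates, as in this hypothetical sketch:

    SELECT "COUNTRY",
           SUM(CASE WHEN "YEAR" = 2003 THEN "SALES_DOLLARS" END) AS "SALES_2003",
           SUM(CASE WHEN "YEAR" = 2004 THEN "SALES_DOLLARS" END) AS "SALES_2004"
    FROM "SALES_DW"."SALES_FACT"
    GROUP BY "COUNTRY";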

    Variables allow the developer to define single value, interval, or range filters within the analytic

    view. Any query that is executed against the published analytic view must satisfy any mandatory

    variables. This is a very useful feature if the developer intends for the result set to be limited

    for a specific date range, attribute, or other criteria. Note that most of the SAP BusinessObjects

    reporting tools do not recognize these variables at this time. However, we have been told by SAP

    that this functionality will be fully supported in the next few service pack releases. Variables are

    different from filters in that they are intended to be dynamic or changed based on the values

    selected from the input parameters. Filters, on the other hand, are hard coded and must be re-

    coded by developers when business requirements change.

    Variables work hand in hand with input parameters. These placeholder values allow developers

    to enhance the use of variables by allowing the executor of the query to insert a custom value

    upon execution. For example, each time the query is executed, the user interface requests that

    a beginning and ending fiscal year be entered to limit the results. When developers define input

    parameters, they must indicate the name, database data type, length, and scale. There is also an

    option to specify the default value of the input parameter if needed for the users.
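When querying an activated view over SQL, input parameter values are supplied with SAP HANA's PLACEHOLDER syntax. The view, parameter, and column names in this sketch are hypothetical:

    SELECT "YEAR", SUM("SALES_DOLLARS") AS "SALES"
    FROM "_SYS_BIC"."sales/AN_SALES"
        ('PLACEHOLDER' = ('$$P_YEAR_FROM$$', '2003'),
         'PLACEHOLDER' = ('$$P_YEAR_TO$$', '2004'))
    GROUP BY "YEAR";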

After the data foundation is defined, the second tab of the analytic view is the Logical View tab.

This tab is the basis for defining the joins between the analytic foundation and existing

    attribute views (Figure 36).


Figure 36: Logical view

    Developers add the existing attribute views either using the new analytic view wizard or by

    dragging them from the navigator pane on the far left side of the SAP HANA Studio modeling

    perspective. Attribute views are joined to the analytic foundation using the attribute key of the

    attribute view and the private attributes of the foundation. The basic inner, left outer, and right

    outer join types are all supported. Each join is assumed to use the equal operator, which limits the

    use of between, less than, or greater than joins.

There are also two additional join types, referential and text. Referential joins are the

    default join type. They offer better performance compared to inner joins assuming only a subset

    of attributes are queried in relation to the overall number of attributes defined in an analytic

    view. They act as an inner join but they are not enforced if attributes are not selected in a query.

    This is unlike the SAP HANA inner join in which attributes defined in the analytic foundation are

    enforced even when they are not selected in a query. In short, the referential join helps to reduce

    the number of expensive join operations by eliminating joins that are not relevant to any user

    defined query.

    However, the results of one query to the next might vary because the analytic foundation

    records will be excluded or included based on the inner joining of the various attribute views

    selected in the query. They should only be used if the referential integrity between the analytic

foundation table and all its attribute views is known to be sound. In database terms, a logical

foreign key constraint should exist. In layman's terms, every record in the analytic foundation

table should have a matching record in the attribute views. If this is not the case, a query by YEAR

    and SUM(SALES_DOLLARS) might return different results than a query on YEAR, CUSTOMER

    and SUM(SALES_DOLLARS) when a sales transaction record exists in the foundation that has no


    matching customer in the attribute view.
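To make the risk concrete, the following two hypothetical queries against an activated analytic view can return different totals under a referential join, because the second query forces the join to the customer attribute view and silently drops unmatched fact rows:

    SELECT "YEAR", SUM("SALES_DOLLARS")
    FROM "_SYS_BIC"."sales/AN_SALES"
    GROUP BY "YEAR";

    SELECT "YEAR", "CUSTOMER", SUM("SALES_DOLLARS")
    FROM "_SYS_BIC"."sales/AN_SALES"
    GROUP BY "YEAR", "CUSTOMER";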

    Text joins are used within attribute views. They are a special join type that allows developers to

join two tables when one contains characteristics and the other contains the characteristics' text in a specific language. Text joins were developed specifically to work with SAP ERP tables and the

SPRAS field to provide automatic translation of characteristics. Text joins act as an inner

    join, meaning that they will restrict the results based on matching records. There is also a special

dynamic language parameter. It is defined in the attribute view foundation join definition and is

automatically processed to filter the text to a specific language based on the locale of the

    user querying the attribute. In short, they are used to provide automatic multi-language support in

    query results.
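A text join behaves much like this hypothetical SQL, where SPRAS holds SAP's single-character language key ('E' for English) and, in a real text join, the value is supplied automatically from the user's locale:

    SELECT p."PRODUCT_ID", t."PRODUCT_DESCRIPTION"
    FROM "SALES_DW"."PRODUCT" p
    INNER JOIN "SALES_DW"."PRODUCT_TEXT" t
        ON p."PRODUCT_ID" = t."PRODUCT_ID"
    WHERE t."SPRAS" = 'E';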

    Based on the documentation, you can also establish the cardinality between tables to help the

    various SAP HANA engines quickly and accurately execute the analytic view. We have never

    noticed any difference in performance when changing the cardinality rules, but we have seen a

    model fail to activate if an attribute key is not truly unique. When viewing the interface from the

    Logical View tab, the same output columns and their various types are available. There is no real

difference in the output when switching between the Data Foundation and Logical View tabs. The only

    exception is that attribute views are only visible in either tab once they have been added to the

    model on the Logical View tab.

    Once developers have fully defined the model, they must save and activate the analytic view

    before it is available within the SAP HANA metadata repository (Figure 37). To save and activate

    the model, developers click the save and activate icon. Activation also validates that no rules have

    been violated within the design of the model. Developers should pay close attention to the Job Log

    window, as it indicates if there are any failures in the activation. If there are any failures, the font

    color changes to red, indicating that there was an issue in the attempt to activate the model.

Figure 37: Save and activate the analytic view


    Developers can double-click an item in the Job Log to open the Job Details window (Figure 38).

    Within this window, a detailed explanation is provided as to the issues that led to the activation

    failure. The same is true when a model is validated without activation.

Figure 38: Job Log details

    Calculation Views

    Calculation views are the basis for performing complex calculations, aggregations, and projections.

    It is difficult to describe the full functionality of calculation views, but they are generally used to

produce result sets that span multiple analytic views. A simpler example is using one to

produce a distinct count or to further filter and aggregate an analytic view for faster

processing. Calculation views can be used to produce a view of the data that spans multiple fact

    tables or contexts, similar to the way Web Intelligence and a universe manage multiple queries.

    In SAP BusinessObjects, the universe and Web Intelligence report engine overcome cross fact

    aggregation by passing multiple independent SQL statements to the relational database and then

merging the results within the report engine as if they were a single query. SAP HANA approaches

    this differently in that calculation views are used to merge data sets into a single logical view of

    the data. They incorporate a more set-based philosophy in working with data than you see in

    a traditional database view or procedure. SAP HANA can provide most of this functionality in

    a graphical UI (GUI) without the need to write hundreds of lines of SQL code. With that said,

    calculation views can also be based on script logic if needed.

    The calculation view UI is similar to that of the attribute view and analytic view. On the left side,

developers can create logical dataset workflows to guide SAP HANA in the processing of the data

sets. The center window contains details only on the objects selected from the left-side window. The

right-side window contains the output column definitions for each item selected from the left

    side. Each item selected from the left side produces a different view for both the center and right

    windows (Figure 39).


Figure 39: Calculation view overview

    For the purposes of this special report, we do not go into great detail on all the facets of

    calculation views. However, we do describe in general terms a solution in which calculation views

    are used to produce meaningful results.

    Take, for example, an analytic view that produces customer sales orders and another that produces

    customer product returns. The analytic view for each area would be capable of calculating results for

    not just products and dates, but also for customers, sales reps, distribution centers, and other facets.

    For the purposes of this solution you only need to use a few of those facets to produce the results.

Using a calculation view, you can develop a result set that compares the number of orders for a

    given product and subsequently the number of returns for that same set of products. To develop

this solution using a calculation view, you would start by adding both analytic views to the GUI. You would then project them to include only the columns needed to satisfy the requirements.

Projection is a process in SAP HANA in which developers can reduce the number of in-memory

    data blocks that are accessed by removing columns from an analytic view that are not needed

    within the calculation view. In most cases, projecting the analytic view increases the performance

    of the calculation view.

    Once they are projected, you can aggregate the results of the sales analytic view to include the

product, year, month, total units shipped, and a NULL value placeholder for products returned.

    Using the sales returns analytic view, you can aggregate the results to produce product, return

year, return month, total units returned, and a NULL placeholder for units shipped (Figure 40).

The purpose of the NULL value placeholders is to facilitate the subsequent UNION of the two

results. When performing a UNION, both result sets must have the same number of columns.


Figure 40: Setting a NULL column

    Within the aggregation of each set, you create a calculated column and set it to a NULL value

    (Figure 41).

Figure 41: Results of a projection and aggregation of two analytic views: products sold and products returned


Taking the results of each aggregation, you can then UNION the record sets. The results of the

UNION operation are only temporarily managed by SAP HANA and never returned directly as

the result set of the calculation view. However, it is important to logically understand what is

    happening within the sequence of calculations that produce the desired results (Figure 42).

Figure 42: UNION of the record sets

    Using the aggregate option within a calculation view, you can then aggregate the results again to

produce a single record set that displays the results as if they were stored together in the database

    (Figure 43).

Figure 43: Aggregation of UNION
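Expressed as SQL, the whole project-aggregate-union-aggregate workflow resembles this sketch; the view and column names are hypothetical:

    SELECT "PRODUCT", "YEAR", "MONTH",
           SUM("UNITS_SHIPPED")  AS "UNITS_SHIPPED",
           SUM("UNITS_RETURNED") AS "UNITS_RETURNED"
    FROM (
        -- Projection and aggregation of the sales analytic view
        SELECT "PRODUCT", "YEAR", "MONTH",
               SUM("UNITS_SHIPPED") AS "UNITS_SHIPPED",
               CAST(NULL AS INTEGER) AS "UNITS_RETURNED"
        FROM "_SYS_BIC"."sales/AN_SALES"
        GROUP BY "PRODUCT", "YEAR", "MONTH"
        UNION ALL
        -- Projection and aggregation of the returns analytic view
        SELECT "PRODUCT", "RETURN_YEAR", "RETURN_MONTH",
               CAST(NULL AS INTEGER),
               SUM("UNITS_RETURNED")
        FROM "_SYS_BIC"."sales/AN_RETURNS"
        GROUP BY "PRODUCT", "RETURN_YEAR", "RETURN_MONTH"
    ) u
    GROUP BY "PRODUCT", "YEAR", "MONTH";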

    The setup of such a calculation can be done completely using a GUI. Each object in the GUI

    represents a different dataset-based operation that can project, aggregate, UNION, aggregate,

and output a result set. Within the SAP HANA Studio, this is represented as a series of set-

    based operations (Figure 44). From a workflow standpoint, you are simply taking two datasets,

    aggregating each set, combining the two sets, and then aggregating the combined sets to produce a

    single result set (Figure 45).


Figure 44: An SAP HANA Studio calculation view workflow

Figure 45: Logical workflow of a calculation view


    Analytic Privileges

    Analytic privileges allow developers to define automatic, row-based filters based on an SAP HANA

user account. In general, we refer to this as row-level security. Analytic privileges can either protect data or automatically filter data for each SAP HANA logon. They are set up and stored with the

    same packages that are used to manage attribute views, analytic views, and calculation views.

    When defining an analytic privilege, the developer specifies one or more view objects to restrict.

    Once the objects are selected, they must then define the attribute to restrict. The final step of the

    process requires that a restriction be set up for that selected attribute. For example, an analytic

privilege can be set up to restrict the results of a calculation view to only the country of Great Britain

(Figure 46). Once the analytic privilege is saved and activated, the DBA can then assign it to an

    individual user or a database role (Figure 47).

Figure 46: Creating an analytic privilege

Figure 47: Assigning the privilege to a user
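Alongside the GUI shown in Figure 47, an activated analytic privilege can typically also be granted by calling a repository procedure. This is a sketch; the privilege path sales/AP_RESTRICT_GB and the user REPORT_USER are hypothetical:

    -- Grant an activated analytic privilege to a user
    CALL "_SYS_REPO"."GRANT_ACTIVATED_ANALYTICAL_PRIVILEGE"
        ('"sales/AP_RESTRICT_GB"', 'REPORT_USER');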


    Combining the Modeling Components to Produce Analytic Views and

    Calculation Views

Now that we have discussed all the main components of SAP HANA modeling, it is time to show how they all work together to produce usable, multi-dimensional data models. You start with

    loading data, using SAP Data Services 4.0, into SAP HANA schemas and tables. You then create

    attribute views to define all possible facets of your multi-dimensional model. Once the attribute

    views are created, you define the analytic view. You can also define measures and additional

    attribute views within the a