
IBM PureData for Analytics (IPDA)

Powered by Netezza technology

Multiple Schema Support

in Release 7.0.3

- including Best Practices

Netezza Performance Server 9 September 9, 2013


Table of Contents

1. Introduction
2. Referencing Database Objects in IPDA
3. Multiple Schema Support
    3.1. Compatibility Mode (default)
    3.2. Multi-Schema Mode
4. Configuring Multiple Schema Support
    4.1. Enable Multiple Schema Support
    4.2. Disable Multiple Schema Support
5. Multiple Schemas and User Connection Options
    5.1. Connect to the default Database Schema
    5.2. Connect to the default User Schema
    5.3. Connect to a specific User Schema (with Multiple Schema support Enabled)
6. Enabling Multiple Schema support on Existing Systems
7. Database Schemas
8. Managing Multiple Schemas
    8.1. Creating Schemas
    8.2. Changing Schemas
    8.3. Altering Schemas
    8.4. Display Database Schema
    8.5. Dropping Schemas
9. Abbreviated Database Object Naming Notation
10. Security
    10.1. GRANT/REVOKE Privileges
11. Application Considerations
    11.1. Migrating from other DBMS to IPDA
    11.2. Determining the Current Schema
    11.3. Dynamically Changing Databases
    11.4. Resolving SQL Statement Database Objects
12. Application Usage Types
    12.1. ETL / ELT
    12.2. System Administration
    12.3. Database Administration
    12.4. Development
    12.5. Business Intelligence
    12.6. Scientific/Labs/Class
    12.7. HA/Replication
13. Backup and Restore (BAR)
14. Upgrade / Downgrade
    14.1. Upgrade
    14.2. Downgrade
15. Catalog/Control Tables and other Database Objects
16. Stored Procedures / UDFs / UDAs
    16.1. Creating a Stored Procedure in a Schema
    16.2. Using Stored Procedures
    16.3. Functions / UDFs
    16.4. Procedure/Function Privileges
17. IPDA Utilities / Interfaces
    17.1. CLIENT CONNECTIVITY
    17.2. NZLOAD
    17.3. Other IPDA Utilities
18. Other
19. Conclusion



1. Introduction

IBM PureData for Analytics (IPDA) appliances, formerly Netezza, support SQL environments as defined by the ANSI/ISO standard for SQL. The standard defines an SQL environment as an object hierarchy made up of one or more catalogs (also known as databases), each containing one or more schemas, each containing one or more SQL objects such as tables, views, procedures, and functions. These objects are made up of a collection of one or more columns defined using SQL data types. A connection to an SQL environment results in a session that exists at the SQL object level, so a session exists within a schema that exists within a catalog.

Note: This document uses the common term database rather than the ANSI term catalog. The word schema is often used to refer to the structure of a database, its objects (tables, views, UDFs, and so on), and the rules for how those objects are related and organized in the database. In this document, schema refers to the SQL object class of schema. We use object to refer to SQL tables, views, procedures, routines, and functions.

Prior to release 7.0.3, only one schema was supported for every database. By default, the schema was the name of the database owner. All database objects were created within that one schema. To support ANSI SQL and be user friendly, the system allowed any value to be specified as the schema name in a two- or three-part name. To make object ownership easy to see for users of ODBC and JDBC, the owner of an SQL object was returned as its schema.

Beginning with release 7.0.3, multiple schema support can be enabled and, when enabled, multiple schemas can be created in each database. Multiple schemas help organize database objects and give users areas within the database for development and testing. Multiple schema support also lets applications consolidate multiple databases into a single database containing schemas for stage, production, archive, and so on. Because schemas are objects within the database, applications can read from any schema and write into any schema, which makes application development simpler. Migrating applications to IPDA from other database offerings that support multiple schemas is also made significantly easier.

IBM PureData for Analytics (IPDA) systems running release 7.0.3 and later can be configured to support either a single schema or multiple schemas per database. When configured to support only one schema per database, the system can be configured to ignore the schema element in SQL statements.

2. Referencing Database Objects in IPDA

IPDA systems implement SQL using the ANSI/ISO standard for SQL and support many, but not all, of the features specified in the latest standard revision. IPDA appliances implement an ANSI/ISO SQL (SQL) environment.

SQL specifies that all SQL objects be uniquely identified by a three-part name that consists of the database name, the schema name, and the object name.

IPDA follows the three-level SQL naming standard to refer to objects in a database:

• The database or catalog name

• The schema name

• The object name, which is the name of the table, view, synonym, routine, or other database object



An example select using the fully qualified 3 part name:

select * from database-name.schema-name.object-name

When a connection (session) is established to a database, default values are provided for current_catalog (database) and current_schema. Any SQL statement that is submitted with less than the full three-part name has the missing parts completed with the session defaults. This greatly simplifies the effort required to write SQL statements.

For example, given a database sales with schema demo and table customer, all of the following SQL statements return the same data.

Three part: sales(user1)=> SELECT * FROM sales.demo.customer;

Two part: sales(user1)=> SELECT * FROM demo.customer;

One part: sales(user1)=> SELECT * FROM customer;

In the above examples the default catalog is sales, and the default schema is demo.
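The completion rule above can be illustrated with a small model. This is only a sketch of the rule as described in this section, not IPDA code; the function name qualify and its parameters are hypothetical, and identifier case folding is ignored.

```python
def qualify(name, current_catalog, current_schema):
    """Complete a one-, two-, or three-part name with the session defaults.

    Hypothetical helper that models the rule described in the text:
    missing leading parts are taken from current_catalog and current_schema.
    """
    parts = name.split(".")
    if not 1 <= len(parts) <= 3 or "" in parts:
        raise ValueError(f"expected a one-, two-, or three-part name: {name!r}")
    defaults = [current_catalog, current_schema]
    # Prepend only the leading parts that the caller left out.
    missing = defaults[: 3 - len(parts)]
    return ".".join(missing + parts)


# Session connected to database sales with current schema demo.
for name in ("sales.demo.customer", "demo.customer", "customer"):
    print(qualify(name, "sales", "demo"))
```

All three calls yield the same fully qualified name, which is why the three SELECT statements above return the same data.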

IPDA supports the above naming convention with many valid SQL statements, including cross-database SQL

statements. For example, the following statements are all valid:

1. Truncate the contents of the CUSTOMER table:

sales(user1)=> TRUNCATE TABLE customer;

sales(user1)=> TRUNCATE TABLE demo.customer;

sales(user1)=> TRUNCATE TABLE sales.demo.customer;

2. Retrieve specific records from the CUSTOMER table in the SALES database:

sales(user1)=> SELECT * FROM customer WHERE sales.demo.customer.custid = 101;

sales(user1)=> SELECT * FROM sales.demo.customer WHERE custid = 101;

3. While connected to the SALES database:

• Retrieve the contents of the CUSTOMER table that is located in the ORDERS database, and insert into the

CUSTOMER table in the SALES database:

sales(user1)=> INSERT INTO customer SELECT * FROM orders.admin.customer;

• Join information in the CUSTOMER tables of both the SALES and ORDERS databases using their

CUSTID column:

sales(user1)=> SELECT * FROM customer sc, orders.admin.customer oc

WHERE sc.custid = oc.custid;



• Create a table named CUSTOMER in the SALES database, with the same definition as the CUSTOMER

table in the ORDERS database, and populate the new table:

sales(user1)=> CREATE TABLE customer AS SELECT * FROM orders.admin.customer;

Note that cross-database writes (DML and DDL statements) are not supported in IPDA. For example, the

following statement will return an error:

sales(user1)=> INSERT INTO orders.admin.customer SELECT * FROM customer;

Cross Database Access not supported for this type of command.

For DML statements, consider changing the query to a cross-database SELECT statement (which is supported)

while logged in to the target database. For example:

orders(user1)=> INSERT INTO customer SELECT * FROM sales.admin.customer;

As a best practice, when a query involves multiple tables, it is important to qualify column names as follows:

database.schema.table.column

or

table_alias.column

where table_alias is the table alias defined in the FROM clause. For example:

SELECT cst.custid FROM sales.admin.customer cst WHERE cst.lastname LIKE 'Smith';

This helps identify which column belongs to which table especially if the same names are used for tables and

columns. The use of aliases keeps the size of the SQL manageable.

SQL procedures, routines, and functions are SQL objects, and they support the use of multi-part names to locate the object. For procedures, routines, and functions referenced by a single-part name, IPDA tries to resolve the object's location using the current_schema, then the SQL_PATH (if specified), then the INFORMATION_SCHEMA, and finally the DEFINITION_SCHEMA. The first matching object name found in this search is used.

Note: Procedure, routine, and function name resolution can be time consuming and result in performance issues. Therefore, to avoid either incorrect resolution or search performance issues, as a best practice these objects should always be referenced by their fully qualified name: database-name.schema-name.object-name.

2.1 Double Dot notation

Since the introduction of cross-database access in version 3.1, the IPDA SQL syntax has provided a placeholder syntax that allows SQL users and developers to omit the schema in their SQL statements. This syntax consists of a database name, two dots, and the object name.

In a single-schema world names always resolve correctly, so the double dot notation saved time and space and fit the appliance model. Now, with support for multiple schemas, objects (primarily tables and views) can reside under many schemas, so the only way to ensure that a statement resolves as desired is to provide the schema name.

If the double dot notation is used, it is important to understand how the missing schema element is supplied. An example of the notation:



sales.user1(user1)=> SELECT * from orders..customer;

The two periods between orders and customer are the double dots.

When this statement executes the missing schema value will be filled in by either

the current_schema or the database default schema. When the database name matches

the current database name then the current schema is used. When the database name

does not match the current database name then the schema value is filled in with

the database default schema for the specified database.

For example: sales.user1(user1)=> SELECT * from sales..customer;

In this example the current_database sales matches so the current schema user1

fills in the missing schema, making the statement “SELECT * from

sales.user1.customer”.

When the database does not match the current database the database default schema for the specified database is

used.

For example: Given the database default schema for orders database is user5 and the

SQL:

sales.user1(user1)=> SELECT * from orders..customer;

In this example the current_database sales does not match the specified database

orders so the default schema for the orders database, user5, is used to fill in the

missing schema, making the statement “SELECT * from orders.user5.customer”.

Because the missing schema can be filled in with different values, we recommend that all SQL statements provide the schema value and avoid the use of the double dot notation. This will make migration to multiple schemas easier.
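The double dot rule described in this section can be sketched as follows. This is a simplified model, not IPDA code; the function name resolve_double_dot and the default_schemas mapping (database name to database default schema) are hypothetical, and names are assumed to be already lowercased.

```python
def resolve_double_dot(name, current_catalog, current_schema, default_schemas):
    """Fill in the schema omitted by the database..object notation.

    Hypothetical model: default_schemas maps each database name to its
    database default schema.
    """
    database, schema, obj = name.split(".")
    if schema:
        # The schema was supplied, so nothing needs to be filled in.
        return name
    if database == current_catalog:
        # Same database as the session: the current schema is used.
        schema = current_schema
    else:
        # Different database: that database's default schema is used.
        schema = default_schemas[database]
    return f"{database}.{schema}.{obj}"


# Session sales.user1; the default schema of the orders database is user5.
defaults = {"sales": "admin", "orders": "user5"}
print(resolve_double_dot("sales..customer", "sales", "user1", defaults))
print(resolve_double_dot("orders..customer", "sales", "user1", defaults))
```

The same text, database..customer, resolves through two different rules depending on the session, which is the reason for recommending explicit schema names.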

3. Schema Support

All IPDA releases support SQL’s three part names.

For all releases where multiple schema support is either not available or not enabled the schema element is not

validated, because there is only 1 schema.

In preparation for enabling multiple schema support by default in IPDA, IBM has introduced the option of enabling it in the 7.0.3 release. This gives partners and customers access to the functionality before IBM starts shipping systems with multiple schema support enabled by default. Nearly all partner applications that support IPDA already support multiple schemas, so for most partners no changes are required.



When an IPDA database is created it is created with a default schema named for the owner of the database. The

system variables current_catalog and current_schema are set to reflect the database and default schema. The

database owner is also the schema owner and has full privileges on both the database and schema.

When an SQL object such as a table is created, it is created in the current_catalog and current_schema. Combining the object name with the schema and database names results in a unique three-part name.

Privileges can be used to restrict access to databases, schemas and the objects in a schema.

3.1. Compatibility Mode (default)

One of the requirements with the introduction of multiple schema support was seamless backward compatibility.

Customers upgrading to IPDA release 7.0.3 must be able to function without any conflicts and with no

modification to their existing application/user code.

Compatibility mode allows customers with existing single schema systems to continue using the existing

database applications and processes with no modifications, following an upgrade to releases 7.0.3 and later.

In compatibility mode the schema is ignored and the new syntax supporting multiple schemas is disabled.

Compatibility mode is governed by the variable enable_schema_dbo_check in the /nz/data/postgresql.conf file; a value of 0 enables this mode. If the variable enable_schema_dbo_check is omitted, its default value is 0.

The variable enable_schema_dbo_check has been part of the IPDA product since release 3.1 and has been used to specify the action taken when processing a query with an invalid schema. In previous releases, a value of 0 caused IPDA to ignore the user-specified schema and use the default schema. A value of 1 caused a warning, but the schema was still ignored. A value of 2 caused the SQL statement to fail if the schema did not match.
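As an illustration of the default described above, the following sketch reads the setting from configuration text and falls back to 0 when the variable is omitted. It is not an IPDA utility; the function name schema_mode is hypothetical, and the name = value line format with # comments is an assumption about the file layout.

```python
def schema_mode(conf_text):
    """Return the enable_schema_dbo_check value found in conf_text.

    Hypothetical helper. Assumes 'name = value' lines with '#' comments;
    returns 0 (compatibility mode) when the variable is not set.
    """
    mode = 0
    for line in conf_text.splitlines():
        # Drop any trailing comment, then surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        if key.lower() == "enable_schema_dbo_check":
            mode = int(value)
    return mode


print(schema_mode("enable_schema_dbo_check = 2  # error mode\n"))
```

A caller could pass the contents of /nz/data/postgresql.conf to check which mode a system is configured for before the next restart.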

3.2. Multi-Schema Mode

In multi-schema mode each database can have multiple schemas. As a result, there can be multiple instances of an SQL object, such as a table, in the database. For example, in the sales database you can have one instance of the customer table under the production schema and another instance of customer under the stage schema. The instances are unique because they are in different schemas, and because both are in the same database, both can be written to. Being able to write to tables in different schemas gives many applications functionality equivalent to cross-database writes.

Multi-schema mode is enabled by specifying a value of 1 or 2 for the variable enable_schema_dbo_check. When enabled, all functionality relating to schemas, including the new syntax and access to the information schema, is enabled.

It is recommended that new customers, or new IPDA installations with no existing data, enable multi-schema mode with a value of 2 and begin using this feature. This will help simplify data migration from other systems that use multiple schemas.



3.3. Default schema

The ANSI standard requires that a user who connects without specifying a schema be connected to a vendor-defined default schema. IPDA provides an environment-level (appliance-level) option that selects this default: either the database default schema or a schema named identically to the session owner. The variable enable_user_schema in the /nz/data/postgresql.conf file determines which schema is selected: 0 = database default schema, 1 = user schema.

3.4. Upgrading existing systems to multiple schema.

After upgrading an existing appliance to release 7.0.3 multiple schema support can be enabled. We strongly

recommend existing customers phase in multiple schema support according to business requirements and with

full functional testing of existing applications and processes.

Refer to Section 6, Enabling Multiple Schema support on Existing Systems, for more information on migrating existing applications to release 7.0.3.

4. Configuring Multiple Schema Support

Multiple schema support in IPDA is enabled at the appliance (system) level. It cannot be enabled at the database level.

Enabling or disabling multiple schema support requires access to /nz/data/postgresql.conf and a restart of the NPS system. The Linux user nz typically owns the /nz/data/postgresql.conf file.

The admin and users granted the schema privilege can create and manage schemas within databases.

You can configure whether invalid schemas return a warning or an error. For example, you can configure IPDA

to return an error for any queries that specify an invalid/non-existent schema, or you can return a warning

message for queries that use a non-existent schema and allow the use of the default schema.

4.1. Enable Multiple Schema Support

To enable multiple schema support, use the following procedure:

1. Log in to the IPDA active host as the nz user. Change the value of the variable enable_schema_dbo_check in the /nz/data/postgresql.conf file to one of the following:

• 0

    • Disables multiple schema support.

• 1

    • Enables multiple schema support in warn mode.

    • Users can create, alter, set, and drop schemas.

    • However, if a query references an invalid schema, IPDA looks for the object in the default schema and, if found, continues processing and displays a WARNING message (Schema 'schema_name' does not exist).

• 2

    • Enables full support in error mode.

    • Users can create, alter, set, and drop schemas.

    • If a query references an invalid schema, the query fails and returns an ERROR (Schema 'schema_name' does not exist).

2. Run nzstop command to stop IPDA.

3. Run nzstart command to restart the IPDA software for the change to take effect.
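The three settings in step 1 differ only in how a schema name in a query is treated. The sketch below models that difference; it is a simplification for illustration, not IPDA code. The function name resolve_schema and its parameters are hypothetical, and the model looks only at whether the schema exists, not at whether the object is found in the default schema.

```python
def resolve_schema(mode, requested, existing, default):
    """Model how enable_schema_dbo_check treats a schema named in a query.

    Hypothetical helper. Returns (schema_used, message); schema_used is
    None when the query would fail.
    """
    if mode not in (0, 1, 2):
        raise ValueError(f"unsupported enable_schema_dbo_check value: {mode}")
    if mode == 0:
        # Multiple schema support disabled: the schema element is ignored.
        return default, None
    if requested in existing:
        return requested, None
    message = f"Schema '{requested}' does not exist"
    if mode == 1:
        # Warn mode: fall back to the default schema and keep processing.
        return default, "WARNING: " + message
    # Error mode: the query fails.
    return None, "ERROR: " + message


existing = {"admin", "stage"}
for mode in (0, 1, 2):
    print(mode, resolve_schema(mode, "bob", existing, "admin"))
```

Running the loop shows the same query being silently redirected, redirected with a warning, or rejected, depending on the configured value.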

4.2. Disable Multiple Schema Support

To disable multiple schema support, use the following procedure:

1. Move any SQL objects not in the default schema to the default schema. Note: some names may need to be

changed to keep object names unique!

2. Change the value of the variable enable_schema_dbo_check in /nz/data/postgresql.conf to 0.

3. Run nzstop command to stop IPDA.

4. Run nzstart command to restart the IPDA software for the change to take effect.

Use caution when you disable the functionality. If you disable multiple schema support without first moving all SQL objects to the default schema, the objects left behind have limited accessibility. The schemas and their objects still exist and can be referenced in SELECT statements, but they are not accessible to INSERT, UPDATE, or DELETE statements. In addition, schema commands are disabled. If multiple schema support is re-enabled, users can again set schemas and manage the existing objects.

As a best practice, avoid repeatedly enabling and disabling multiple schema support, so that you do not run into problems managing and accessing schemas.

5. Multiple Schemas and User Connection Options

When multiple schema support is enabled (enable_schema_dbo_check=1 or 2), if a user does not specify a

schema name to connect to when accessing a database, they will be connected to a default schema.

IPDA allows for two default connection behaviors: connect to the database default schema, or connect to the schema that matches the user name. The enable_user_schema variable, set in the /nz/data/postgresql.conf file, controls which of these two options IPDA uses.

enable_user_schema = FALSE, 0 or not defined results in a connection to the database default schema.

enable_user_schema = TRUE, 1 results in a connection to a schema named for the user. In the event a schema named for the user does not exist it will be created and the user will own the schema and be placed in it.



5.1. Connect to the default Database Schema

When you create a database, IPDA automatically creates a schema for that database. The name of this schema is

the name of the owner who created the database. This is known as the database default schema and each database

has one. For example, if user user1 creates a database sales1, there will be a default schema named user1 in the

sales1 database. You cannot drop the default schema in a database, but you can change the default schema using the ALTER DATABASE command and then drop the former default schema, which is no longer the default. For example:

sales.myschema(admin)=> ALTER DATABASE sales SET DEFAULT SCHEMA schema2;

sales.myschema(admin)=> SET SCHEMA schema2;

sales.schema2(admin)=> DROP SCHEMA myschema;

Note that in 7.0.3, the nzsql environment prompt has changed to include the schema name to which you are

currently connected. It has the format database-name.schema-name(user-name)=>. The above and earlier

examples reflect this change in the prompt.

You must own the database or have alter database privilege on the database to change the default database

schema.

Care should be taken whenever there is a need to change the default schema because of the following:

• All users who can access the database will now automatically have access to the new default schema;

however, they may not have access to the previous default schema!

• If a user requires access to the previous default schema, that user must now be explicitly granted

access to that schema and must specify the schema in their SQL.

• Any SQL that uses the Netezza database..table (double dot) notation uses the database default schema for cross-database access, so changing the default schema can result in “table not found” SQL errors.

5.2. Connect to the default User Schema

If the enable_user_schema variable is set to TRUE, and a user does not specify a schema when connecting to a

database, IPDA will connect that user to a schema that matches their user name. This is similar to how many

other DBMS work under these circumstances.

If a schema matching the user name does not exist in that database, IPDA automatically creates it. For example, if user user1 connects to a database named mydb, and there is no schema named user1 in mydb, a new schema named user1 is automatically created in mydb and the user is connected to that schema.

Note that when enable_user_schema variable is set to TRUE, users will not be connected to the database default

schema, unless it matches the user name.

5.3. Connect to a specific User Schema (with Multiple Schema support Enabled)

Starting with release 7.0.3, when multiple schema support is enabled, you can specify the schema to which you want to connect when using the command line tools. For example, the following nzsql command connects user user1 to the schema USER2 in the SALES database:

nzsql -d sales -schema user2 -u user1 -pw 123456

If you wish to avoid the -schema option with nzsql, you can set the environment variable NZ_SCHEMA to the specific schema prior to issuing the nzsql command.



With multiple schema support enabled, connected sessions can use the SET CATALOG command to change to a different database, and optionally schema, and the SET SCHEMA command to change to a different schema in the current database. For example:

sales.user1(user1)=> SET CATALOG orders;

orders.user1(user1)=> SET schema user2;

orders.user2(user1)=> SET schema orders.user1;

orders.user1(user1)=> SET schema sales.user1;

sales.user1(user1)=> SET CATALOG orders;

orders.user1(user1)=>

If multiple schema support is not enabled, -schema is ignored. If you do not specify -schema, the environment variable NZ_SCHEMA is used if it is set; otherwise, you are connected to either the schema that matches the user name (if enable_user_schema = TRUE) or the database default schema (if enable_user_schema = FALSE or is not set).
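The order of precedence described above can be summarized in a short model. This is a sketch for illustration only, not IPDA code; the function name initial_schema and its parameters are hypothetical, and it assumes that with multiple schema support disabled the session always lands in the single default schema.

```python
def initial_schema(user, db_default_schema, schema_option=None,
                   nz_schema=None, enable_user_schema=False,
                   multi_schema_enabled=True):
    """Model which schema a new nzsql session starts in.

    Hypothetical helper: schema_option stands for the -schema argument and
    nz_schema for the NZ_SCHEMA environment variable.
    """
    if not multi_schema_enabled:
        # Assumption: with one schema per database, -schema has no effect.
        return db_default_schema
    if schema_option:
        return schema_option
    if nz_schema:
        return nz_schema
    if enable_user_schema:
        # Schema named for the connecting user.
        return user
    return db_default_schema


print(initial_schema("user1", "admin", schema_option="user2"))
print(initial_schema("user1", "admin", nz_schema="stage"))
print(initial_schema("user1", "admin", enable_user_schema=True))
print(initial_schema("user1", "admin"))
```

The four calls walk down the precedence list: explicit option, environment variable, user schema, and finally the database default schema.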

There is no way to specify an initial schema other than the default when using ODBC, JDBC, or OLE DB, so the SET SCHEMA command must be used after connecting.

If you have multiple schemas in a database, as a best practice, you should always use the SET SCHEMA

command to connect to the schema that contains the objects that you want to describe. This can help avoid any

confusion or problems later.

As a best practice enable_user_schema = TRUE should be used when multiple schema support is enabled.

6. Enabling Multiple Schema support on Existing Systems

Current customers who want to enable multiple schema support on existing systems need to be aware of some existing behaviors.

Drivers for releases 4, 5, 6, and 7 prior to 7.0.3 report the object owner rather than the schema by default, to make identifying a table's owner easier. As a result, BI reports and ETL flows may well contain SQL that specifies the table owner as the schema. In single-schema mode the schema was ignored, so the object was always found.

When multiple schema support is enabled, all existing SQL objects remain in the default schema named for the database owner, while the SQL sent by BI and ETL tools will likely contain schema names that designate the table owner as the schema. Often the table owner will not match the database owner, and in these cases, with multiple schema support enabled, the SQL sent will likely fail because the table is not found under the schema, assuming the schema exists. If enable_user_schema is disabled and enable_schema_dbo_check = 1, there will likely not be a schema named for the table owner, so the server will look under the default schema and find the table; it will then issue an SQL warning about the invalid schema name, return the data, and continue processing.

However, as schemas are created, and if they are named for users, then at some point a BI or ETL object will

issue SQL that contains a valid schema. At that point the server will not find the table under the schema and

will return an error. To prevent this, all the objects in the database need to be moved into

schemas named for the object owners. This must be done for all databases on the system.

Below is an example script to create schemas and move objects.



====================================================================================================

#!/bin/sh
#
# This script is provided as-is. It does not cover all object types or all cases;
# it is provided as a starting point. @IBM 2013
#

# The script must run as the admin user.
if [ "${NZ_USER}x" != "adminx" ]; then
    echo "This script must be executed by user admin. ...export NZ_USER=admin"
    exit 1
fi

# Stop if multiple schema support is disabled.
if [ "`nzsql system admin -Atc "show enable_schema_dbo_check" 2>&1 | head -1`" = "NOTICE: ENABLE_SCHEMA_DBO_CHECK is 0" ]; then
    echo "Exiting! Multiple schema mode disabled: restart system after setting ENABLE_SCHEMA_DBO_CHECK 2"
    exit 1
fi

# Warn, but continue after a pause, if user schemas are not enabled.
if [ "`nzsql system admin -Atc "show enable_user_schema" 2>&1 | head -1`" = "NOTICE: ENABLE_USER_SCHEMA is off" ]; then
    echo "*
*
* Warning! ENABLE_USER_SCHEMA is off.
*
* Hit Control C to exit, wait to continue
*
*"
    sleep 10
fi

# PROCESS ALL DATABASES
for db in `nzsql -lAt | cut -d '|' -f1 | grep -iv system`
do
    # get all the elements we need to work with: tables views procedures ....
    objlist=`nzsql \"$db\" admin -Atc 'select distinct class from _v_objs_owned where database = current_catalog and class != ^SCHEMA^'`

    # don't process objects in the default schema
    dfschem=`nzsql \"$db\" admin -Atc 'select defschema from _v_database where database = current_catalog'`

    # one schema per object owner, excluding the owner of the default schema
    for ownr in `nzsql \"$db\" admin -Atc "select distinct owner from _V_OBJS_OWNED where database = current_catalog and class != ^SCHEMA^ and owner != ^$dfschem^"`
    do
        nzsql \"$db\" admin -c "create schema $ownr"
        for cls in $objlist
        do
            # objects of this class owned by $ownr that are not yet in the schema named $ownr
            for obj in `nzsql \"$db\" admin -Atc "select distinct objname from _V_OBJS_OWNED where database = current_catalog and class = ^${cls}^ and owner = ^$ownr^ and schema != ^$ownr^"`
            do
                # echo "owner: $ownr Class: $cls Object: $obj"
                # rename to a temporary name first, then into the owner's schema
                nzsql \"$db\" -eAtc "alter ${cls} \"$obj\" rename to \"${obj}x\"; alter ${cls} \"${obj}x\" rename to $ownr.\"${obj}\""
            done
        done
    done
done

====================================================================================================



7. Schemas

Beginning with IPDA release 7.0.3, when you create a database, initialize IPDA for the first time, or upgrade to

7.0.3, two internal (system-defined) schemas, along with their objects, are automatically created in every database:

• INFORMATION_SCHEMA (system schema with internal views)

• DEFINITION_SCHEMA (system schema with all system objects as seen in previous releases)

In addition, a schema that matches the user name of the database creator (default schema) must also exist, or will

be created. Therefore, there will always be at least 2 system schemas and 1 user schema. Previous IPDA releases had

only 1 schema.

Any user who can access the database automatically has access to the database default schema if IPDA is

configured with enable_user_schema = FALSE. Otherwise, users can only access a schema name that matches

their user name, wherein user objects can be created.

With multiple schema support, IPDA supports cross-schema write actions for schemas that reside in the same

database. For example:

sales.user1(user1)=> CREATE SEQUENCE myseq.seq1;

sales.user1(user1)=> CREATE TABLE orders AS SELECT * FROM user2.orders;

sales.user1(user1)=> INSERT INTO user2.sales SELECT * FROM sales;

However, cross-database writes are still not supported in 7.0.3. You therefore cannot have a transaction that spans

multiple databases.

8. Managing Multiple Schemas

You can create, alter, set, and drop schemas within a database if multiple schema support is enabled. You

must have the necessary privileges to execute the schema commands below.

8.1. Creating Schemas

To create a schema, use the new SQL standard CREATE SCHEMA command.

sales.user1(user1)=> CREATE SCHEMA myschema;

You automatically become the authorization user (owner) of the new schema.

The CREATE SCHEMA command also includes options to create tables and views, grant privileges, set user

authorizations, and search specific locations to resolve unqualified routine names. For example, the following

statement creates a new schema named s2, specifies the schema name s2 to resolve unqualified routine names,

sets the owner of the schema to user1, creates the table tab1 within the schema s2, and grants the SELECT

privilege to user user2 on the new table tab1:

CREATE SCHEMA s2 AUTHORIZATION user1 PATH 's2' CREATE TABLE tab1

(c1 int, c2 int, c3 varchar(10)) GRANT SELECT ON tab1 TO user2;

Note that if any one of the SQL commands in the statement above fails, the CREATE SCHEMA command fails

and everything is rolled back.



8.2. Changing Schemas

Creating a schema does not change your current schema by default. You must change your current schema using

the SET SCHEMA statement as follows:

sales.user1(user1)=> SET SCHEMA user2;

sales.user2(user1)=>

If the schema does not exist, the above command will display an error.

You have the option to specify the database name and change to it if the schema is in a different database. If you

specify database-name.schema-name to connect to a schema in a different database, the command implicitly executes

a SET CATALOG command to change to the database. If you include a database name and you are in a transaction,

the SET SCHEMA statement will fail because you cannot change databases within a transaction.

Important: You cannot change both the database and schema using SET CATALOG - however, you can change

both with the SET SCHEMA command using the two-level naming convention (database-name.schema-name)!

8.3. Altering Schemas

You can use the new ALTER SCHEMA statement to alter a non-system-defined schema and change its name, its

owner, or the contents of the schema path. For example:

• Change the name of the schema from user1 to user3:

sales.user1(user1)=> ALTER SCHEMA user1 RENAME TO user3;

• Change the owner of the schema from user1 to user2:

sales.user1(user1)=> ALTER SCHEMA user1 AUTHORIZATION TO user2;

• Set the schema path for schema user1 to search for objects within several other schema names:

sales.user1(user1)=> ALTER SCHEMA user1 SET PATH

'user2, user2, user3.user4';

The schema path specifies a comma-separated list of schemas that define a search path for routines such as user-

defined functions, aggregates, libraries, and stored procedures.

You cannot alter a schema if there are sessions that are connected to that schema. Also, you cannot use the ALTER

SCHEMA command to move the schema to a different database.

8.4. Display Database Schema

You can use the new SHOW SCHEMA command to list all of the schemas defined in the current database. For

example:

sales.user1(user1)=> SHOW SCHEMA;

Schema owners can only see information for their schemas and any schemas that they are privileged to see. The

admin user or the database owner can see all schemas in the database, including the system-defined schemas.

8.5. Dropping Schemas

If you are logged in as the admin user or are the schema/database owner, you can drop a schema using the new

DROP SCHEMA statement.

You must specify either CASCADE (drop all objects within the schema) or RESTRICT (do not drop the schema if

it contains objects) when dropping a schema. For example:

sales.admin(admin)=> DROP SCHEMA user1 CASCADE;



You cannot drop the default schema, the internal schemas, or a schema that is currently being accessed by other

users.

Note again that when you drop a schema with CASCADE, you drop all the objects defined within that schema! Therefore, as a

best practice, always use the RESTRICT option to avoid the accidental drop of important objects within a schema.
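
One way to make RESTRICT the habitual choice in scripts is to generate the statement through a helper that defaults to it. The sketch below only builds the statement text; the helper name drop_schema_sql is hypothetical, and how the statement is then submitted to nzsql is left to the caller.

```sh
# drop_schema_sql SCHEMA [CASCADE|RESTRICT]
# Prints a DROP SCHEMA statement; RESTRICT is used unless CASCADE is requested.
# (Hypothetical helper, not part of IPDA.)
drop_schema_sql() {
    mode=${2:-RESTRICT}
    case $mode in
        CASCADE|RESTRICT)
            echo "DROP SCHEMA $1 $mode;"
            ;;
        *)
            echo "mode must be CASCADE or RESTRICT" >&2
            return 1
            ;;
    esac
}

drop_schema_sql user1            # prints: DROP SCHEMA user1 RESTRICT;
drop_schema_sql user1 CASCADE    # prints: DROP SCHEMA user1 CASCADE;
```

With this default, a schema that still contains objects is only dropped when CASCADE is typed explicitly.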

9. Abbreviated Database Object Naming Notation

In IPDA the terms catalog and database are interchangeable, so the function current_catalog returns the

name of the current database.

In release 3.1, Netezza introduced cross database access and with it a shortcut notation known as DOT DOT or

double dot. This syntax made SQL statements easier to write and took advantage of the fact that Netezza had

only 1 schema per database. By design the missing schema name is filled in with the current_schema for current

database references and by the database default schema for non-current database references. In single schema

mode these are always the same. In multi-schema mode the values can be different and the behavior can change.

The following examples will help make the choice clear.

Database      Default Schema
--------      --------------
SALES         user1
ORDERS        user2

A user has the current_schema set to production.

The user first executes the following command:

sales.production(user1)=> SELECT * from SALES..CUSTOMER;

The above statement will translate into:

SELECT * from SALES.PRODUCTION.CUSTOMER;

Notice that the filled in schema value above is the current_schema! So the user will be referencing the object

CUSTOMER in the production schema of the SALES database!

The user next executes the following command:

sales.production(user1)=> SELECT * from ORDERS..CUSTOMER;

The above statement will translate into:

SELECT * from ORDERS.USER2.CUSTOMER;

Notice now that the filled in schema is the default for the ORDERS database. So the user will actually be

referencing the object CUSTOMER in the user2 schema of the ORDERS database!

Therefore, to avoid unexpected or unintentional consequences, effective with release 7.0.3 we recommend that the

use of the DOT DOT notation be abandoned to avoid unexpected query behavior.
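
The two cases above can be reproduced with a short shell model of the fill-in rule. It is a sketch for reasoning about existing SQL, not the server's implementation: the function name and arguments are assumptions, database names are compared case-insensitively, and the result is upper-cased only so that it matches the examples in this section.

```sh
# expand_dotdot REF CURRENT_DB CURRENT_SCHEMA TARGET_DEFAULT_SCHEMA
#   REF                    a reference such as SALES..CUSTOMER
#   CURRENT_DB             database the session is connected to
#   CURRENT_SCHEMA         the session's current_schema
#   TARGET_DEFAULT_SCHEMA  default schema of the database named in REF
expand_dotdot() {
    ref=$1
    current_db=$2
    current_schema=$3
    target_default=$4

    # Leave references without ".." untouched.
    case $ref in
        *..*) ;;
        *) echo "$ref"; return 0 ;;
    esac

    db=${ref%%..*}     # text before the double dot
    obj=${ref#*..}     # text after the double dot

    db_uc=$(echo "$db" | tr '[:lower:]' '[:upper:]')
    cur_uc=$(echo "$current_db" | tr '[:lower:]' '[:upper:]')

    if [ "$db_uc" = "$cur_uc" ]; then
        schema=$current_schema    # current database: use current_schema
    else
        schema=$target_default    # other database: use its default schema
    fi

    echo "$db.$schema.$obj" | tr '[:lower:]' '[:upper:]'
}

expand_dotdot SALES..CUSTOMER  sales production user1   # SALES.PRODUCTION.CUSTOMER
expand_dotdot ORDERS..CUSTOMER sales production user2   # ORDERS.USER2.CUSTOMER
```

Running existing queries through a model like this is a quick way to spot statements whose meaning would change once current_schema and the database default schema differ.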



10. Security

By default, the admin user has full access to all databases, schemas, and objects. With multiple schemas in 7.0.3,

the schema owner automatically has full access to all objects within the schema - even if those objects were not

created by that owner. In essence, the schema owner becomes the superuser for that schema!

However, the schema owner cannot grant object or administrative privileges within the schema unless the user is

explicitly granted privileges to do so.

When multiple schemas are enabled, a user who is granted access to a database automatically inherits access to

the default schema of that database. The only exception is when enable_user_schema is set to TRUE. In that

case, the user always connects to a schema that matches their user name, not the database default schema, if a

schema name is not specified. However, if necessary, a user can set their schema to the default or access the default

in an SQL statement.

Be aware that if you change the default schema for a database, users automatically inherit access to the new

default schema and may lose access to the previous default schema. If a user requires access to the

previous default schema, that user must be explicitly granted access to that schema.

10.1. GRANT/REVOKE Privileges

The GRANT and REVOKE SQL commands now include multiple schema support.

You can use a fully qualified object notation (database.schema.object) with GRANT/REVOKE statements to set the

scope of object privileges from any database:

• database is a specific database name or the keyword ALL to grant the privilege across all databases.

• schema is a specific schema name or the keyword ALL for all schemas in the specified database value.

• object is an object class such as TABLE, VIEW, and so on, or a specific object name.

Important: The keyword ALL applies the privilege to all the databases, or to all schemas in the specified database

if that keyword is specified at the schema level. It also means the privilege applies to all current and future

databases/schemas! If you do not specify a schema, the scope is the current schema. As a best practice, you should

always specify a specific database and schema to restrict the privilege.

You can grant privileges on an object to a user or a group using one of the qualified object notations. For example,

grant the SELECT privilege to user user3 on the table CUSTOMER, which is located in schema user2:

sales.user1(user1)=> GRANT SELECT ON user2.customer TO user3;

For the above type of privilege, the object must exist, and this privilege overrides any other defined privilege.

Since multiple schema support allows objects with the same name to reside in different schemas, the GRANT

statement now includes a TYPE clause to specify an object type such as DATABASE, SCHEMA, USER, or

GROUP. For example, to grant the SELECT privilege on all objects in the schema user1 to user user2:

sales.user1(user1)=> GRANT SELECT ON user1.customer TYPE SCHEMA TO user2;

IPDA would resolve user1.customer in the above statement to the customer table only if the SCHEMA

keyword was not specified. The TYPE option can therefore help to clarify the object in cases where different

objects could have the same name, and is strongly recommended to avoid unintentional privilege grants.



A new CREATE SCHEMA admin privilege has also been introduced to allow users to create schemas in a

database. This, and other admin privileges can be used to define the privilege scope. Previously, when you

granted a privilege in the SYSTEM database, it automatically became a global privilege – it applied to all

databases. Now, you can set the privilege scope as before by specifying the IN database.schema clause from

any database. For example, the following command grants user user1 the Create Schema privilege in all databases

and schemas:

sales.user1(user1)=> GRANT CREATE SCHEMA IN ALL.ALL TO user1;

The power of this is that with very few statements you can grant specific privileges to everyone; then, if need be,

you can override the privileges on a specific user/schema/database basis.

You can use the REVOKE statement to remove access privileges for a user, group, or all users. The range of

syntax supported is similar to that of the GRANT statement, including the examples above. However, as a best

practice, when revoking privileges, you should sign on to the same database and schema where you granted the

privileges, or use the fully qualified name forms that match the locations in which you granted the privileges.

11. Application Considerations

11.1. Migrating from other DBMS to IPDA

For users/applications migrating to IPDA from other DBMS which support multiple schemas, there are some

differences in the behavior that must be accounted for.

IPDA automatically creates a schema in a database, if one does not exist, when multi-schema mode is enabled

and the variable enable_user_schema is set to TRUE. This helps avoid object name collisions. For example, this

feature is useful when multiple users are testing one script which executes DDL and DML statements. Since each

invocation of the script will be in the schema of the executing user, multiple users can use the same script without

any changes. IPDA will create a schema for each user, if one does not exist, and since all tables, views, or other

database objects will be created in the schema of each user, there will not be any conflicts as long as all objects are

unique in that schema.

Note that IPDA is an appliance and therefore does not have an option to place certain data in specific storage

spaces/media. You therefore cannot place individual tables in different storage locations. All table records are

distributed across the pre-defined data slices of the system. However, IPDA does distribute rows with the same

distribution key value to the same data slice. Because IPDA does not support the specification of storage locations

at the object level, you cannot use schemas to distribute data, as is possible in other DBMS.

Applications migrating from other DBMS which do not support multiple schema, but wish to take advantage of

this feature, can do so relatively easily. You can logically organize the tables into different schemas. Then, to

avoid significant application changes, you can use synonyms to reference tables, in the same or different schema.

In many environments, there is a need to separate data amongst various user groups, even though the set of

tables and their definitions are the same. For example, development, quality assurance, and production would all

normally use the same table definition but different data sets. The production data and tables are considered

critical and would normally be in a separate database. Often, for cost reasons, the development and quality

assurance databases would be on the same IPDA appliance also. This containment of functionality within separate

databases ensures no accidental overlaps, thereby avoiding critical situations.

However, the overhead in maintaining multiple databases is generally more than managing multiple schemas.

Further, unlike DB2 and some other DBMS, IPDA can only be configured at the system level and not the database



level. Therefore, it is recommended that when migrating from other DBMS wherein many databases were

created to divide responsibilities, you should consider creating multiple schemas versus multiple databases to

reduce the management overhead. Code changes should not be required if objects were referenced using the

object-only freeform notation, since each group would still be isolated within their own schema. However, you

would now have the added capability to perform cross-schema writes if necessary.

Note that it is not uncommon for the development environment to have a different version (definition) of some

tables to that of production. As long as developers can only access objects in their respective schema, the different

table versions will not cause a problem. It would be up to the DBA to formally deploy the different object

versions between the development, quality assurance, and production schemas.

In environments where information for many clients is hosted in separate databases, the IPDA DBA can instead

create multiple schemas, one for each client, and then assign owners for each schema – including the default

schema. Then, if any additional permissions need to be assigned, it can be done on an individual schema basis.

This way, the DBA can quickly set up an environment that already has the default permissions discussed earlier,

which are natural to multiple schemas in release 7.0.3.

Important: As in Oracle and some other DBMS, switching schemas is easier than switching databases; therefore,

multiple schemas provide more flexibility. However, note that if the database becomes inaccessible, so do all

schemas in that database! Multiple schemas offer simplicity and flexibility, but at a cost of reduced availability.

As with many other DBMS, schemas provide the flexibility to add/remove users/applications easily. You can

perform the following functions when working with schemas:

• Change ownership of a schema using the ALTER SCHEMA command

• Allow multiple database users to share a single schema

• Set up a single schema that contains objects owned by multiple database users

• Move objects between schemas (but not between databases) using the ALTER SCHEMA command

• Set up permissions at the schema level instead of the object level

11.2. Determining the Current Schema

You can use one or more SET SCHEMA statements within a transaction or stored procedure. However, you will

fall back to the schema you were originally connected to once the transaction or stored procedure completes.

After executing one or more SET SCHEMA statements within a transaction or stored procedure, you can use the

new current_tx_schema built-in function to return the value of the current schema.

Note that the existing function current_schema returns the name of the schema a user is currently connected to

and working within. The difference between current_schema and current_tx_schema is that

current_tx_schema can only be used within a transaction or stored procedure, whereas current_schema

only works outside a stored procedure or transaction. Outside of their respective areas, these functions will return

null/incorrect values.

Both functions allow the user/application to query their current environment – as has been the case with

current_catalog when working with databases in previous IPDA releases. For example:

• Find the current schema within a transaction:

sales.user1(user1)=> SELECT CURRENT_TX_SCHEMA;

• Find the current schema outside a transaction:

sales.user1(user1)=> SELECT CURRENT_SCHEMA;



As a best practice, when working in multiple schemas, you should always check your current schema prior to

performing any major activities (loads, deletes, updates, etc). This ensures that DBMS tasks will be executed in

the correct environment; otherwise, serious and unexpected problems could result!
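
In a script, this check can be made explicit with a small guard that refuses to continue when the schema is not the expected one. The helper below is a hypothetical sketch: it only compares two names (case-insensitively, which is an assumption), and the commented nzsql call shows one possible way to obtain the actual value, modeled on the -Atc invocations used in the script in section 6.

```sh
# require_schema EXPECTED ACTUAL
# Returns 0 when the two schema names match (ignoring case), 1 otherwise.
# (Hypothetical helper, not part of IPDA.)
require_schema() {
    expected=$(echo "$1" | tr '[:lower:]' '[:upper:]')
    actual=$(echo "$2" | tr '[:lower:]' '[:upper:]')
    if [ "$expected" != "$actual" ]; then
        echo "Refusing to continue: current schema is $2, expected $1" >&2
        return 1
    fi
    return 0
}

# Possible use before a load (the nzsql call is illustrative and left commented out):
#   actual=`nzsql dw -Atc "select current_schema"`
#   require_schema prodschema "$actual" || exit 1
require_schema prodschema PRODSCHEMA && echo "schema check passed"
```

Placing the guard at the top of load, delete, and update scripts turns the best practice into a step that cannot be forgotten.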

11.3. Dynamically Changing Databases

You cannot use the SET CATALOG statement inside of a transaction to change the current database.

11.4. Resolving SQL Statement Database Objects

When you create or alter a schema, you have the option to specify the Schema Path in the definition. This path is a

list of schemas to search in order to resolve the location of a particular object referenced in an SQL statement.

For example, to resolve a stored procedure name, IPDA will first look in the current schema, then in the schema

search path, followed by the 2 internal schemas. As soon as a procedure of the same name is found in a schema,

the search will stop and that procedure will be used, irrespective of whether it is the correct procedure or not!

Because of this, the order specified in the schema path can become very important! Further, if you change your

schema using SET SCHEMA, you could have a completely different search path in the schema you transitioned

to. It is therefore crucial that the user/application understand these changes and account for them. Searching for

objects in an ordered fashion can not only affect performance, but it could result in the use of the wrong object,

leading to unexpected issues and incorrect results!
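
The first-match behavior can be illustrated with a toy model in shell. Everything here is made up for the illustration: the CATALOG contents, the function names, and the relative order of the two internal schemas (the text above does not say which is searched first). The schema path is passed as a space-separated list for simplicity, whereas the real path is comma-separated.

```sh
# Toy catalog: space-separated list of schema.routine pairs (illustrative data).
CATALOG="user1.load_sales shared.load_sales shared.audit definition_schema.audit"

# has_routine SCHEMA NAME: succeeds if the toy catalog contains SCHEMA.NAME.
has_routine() {
    for entry in $CATALOG; do
        [ "$entry" = "$1.$2" ] && return 0
    done
    return 1
}

# resolve_routine NAME CURRENT_SCHEMA "PATH SCHEMAS"
# Searches the current schema, then the path, then the internal schemas,
# and stops at the first schema that contains the routine.
resolve_routine() {
    name=$1
    current=$2
    path=$3
    for schema in $current $path information_schema definition_schema; do
        if has_routine "$schema" "$name"; then
            echo "$schema.$name"
            return 0
        fi
    done
    return 1
}

resolve_routine load_sales user1 "shared"   # user1.load_sales (current schema wins)
resolve_routine load_sales user2 "shared"   # shared.load_sales (found via the path)
resolve_routine audit      user2 "shared"   # shared.audit (hides definition_schema.audit)
```

The third call shows the risk described above: a routine placed earlier in the path silently hides one of the same name that appears later.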

Be aware that the search_path variable can also be defined in postgresql.conf to resolve database objects. This

variable specifies the order in which schemas are searched when an object is referenced by a simple name with no

schema component. The paths specified by this variable are appended to every schema path of all defined

schemas. Therefore, if you have a common set of schemas that you would like to be searched for every schema that

has been (or will be) created, you can set this variable accordingly.

You can use wildcards such as current_user, current_database, and current_schema inside the

search path. Searches can therefore be dynamic!

Important: search_path does not change your current schema even if the object is found in a different schema.

12. Application Usage Types

With the introduction of multiple schemas in release 7.0.3, a number of intriguing possibilities arise in the areas of

database design, application processing, and security.

You can take advantage of the major benefits of using multiple schemas, including:

• Easier Administration, by organizing objects into logical groupings and managing them at the schema

level, versus on an individual basis

• Effective Sharing, by grouping commonly used objects in a different schema and allowing the

respective users to access them from other schemas

• Enhanced Security, because security can now be maintained at the schema level

• Better Containment, because users in one database but in different schemas cannot interfere with

each other, resulting in fewer collisions

There are many application/user types spread out across many business environments which can benefit from

multiple schemas. Schemas can be used for duplication of object definitions and data, QA testing, developer



playgrounds, keeping copies of other databases locally, increasing security, logical database design, user/

application containment, and even to support different application versions.

Below are listed some of the major business environments/functions, and for each, a summary of how IPDA users

in each environment can benefit in the use of multiple schemas.

12.1. ETL / ELT

In most ETL environments, data is continuously sent from various data sources to an Enterprise Data Warehouse

or Data Mart(s). However, before the data can be loaded into the production warehouse tables, it must be

“transformed” (scrubbed).

Data Flows, together with Control Flows, can move and scrub the data in the source environment, in temporary

directories or other staging areas, or within the production warehouse database environment. If the scrubbing is

performed within the data warehouse database, like IPDA, it is best practice to create a new schema to perform

these transformations. After the data is scrubbed in the non-production schema tables, it can be loaded into the

warehouse production schema tables (fact and dimension).

If multiple users concurrently scrub the data received from different data sources, each user works on a subset of

the source data. Since many transformation scripts create temp (staging) tables to load and scrub the data, if the

enable_user_schema variable is set to TRUE, table creation and scrubbing will automatically occur in the

schema of each user - prior to loading into production schema tables. This will help avoid conflicts or accidental

deletion of records, and even catastrophic drops of production tables.

You also have the option to create schemas for each data source to be transformed, allowing scrubbing activities

to be independent and concurrent.

You could even consider creating a schema for each type of ETL/production table. For example, you could create

separate schemas for the fact, dimension, staging, temp, and other tables. The benefit of this is that the schema

would clearly identify the purpose of each set of tables, the users that need permissions for each schema, and help

with troubleshooting activities.

All the above multiple schema scenarios would work because cross-schema writes are supported in release 7.0.3!

Note that you will need to GRANT the INSERT privilege on the production tables to all users involved in the ETL

process. For example, using the new syntax of the GRANT option, the following will grant user user1 INSERT

privilege to all tables in schema prodschema:

dw.prodschema(user1)=> GRANT INSERT ON user1.fact_tbl TYPE SCHEMA TO user1;

However, as a best practice, you should ensure that those same users do not have other privileges on tables in the

production schema.

12.2. System Administration

Typically the system administrator does not have access to databases and their data. However, certain

institutions, such as banking, require a division of responsibilities because of government or other regulations.

Because of this, there could be some privileged and sensitive data stored in specific database tables. Division of

responsibilities requires that not even the DBA should have access to this data. It is therefore essential that these

tables are stored in the admin or an admin-only accessible schema, with no other data. Since the user admin has

full privileges and access to all database objects, you must use this account for only those individuals that require

access to that extremely sensitive data.



In these environments the DBA should initially use the admin account to create some IPDA users and groups,

including another administrative-level account to perform tasks such as user management, database

maintenance, and object creation and management. For example, to create an admin_users group that provides

object and administrative level privileges similar to the admin user:

1) Connect to the SYSTEM database as the admin user:

nzsql -d system -u admin -pw password

2) Create a group for your administrative users:

system.admin(admin)=> CREATE GROUP admin_users;

3) Grant the group all administrative permissions:

system.admin(admin)=> GRANT ALL ADMIN TO admin_users WITH GRANT OPTION;

4) Grant the group all object permissions EXCEPT schema:

system.admin(admin)=> GRANT ALL ON DATABASE, GROUP, SEQUENCE, SYNONYM, TABLE,

EXTERNAL TABLE, FUNCTION, AGGREGATE, USER, VIEW,

PROCEDURE, LIBRARY TO admin_users WITH GRANT OPTION;

5) Add users to the group to grant them the permissions of the group:

system.admin(admin)=> ALTER GROUP admin_users WITH USER adm_user1, adm_user2;

Then, once multi-schema mode has been enabled and the enable_user_schema variable set to TRUE, the DBA

should then relinquish the admin user/password account to the respective individual(s) who has a need to view

sensitive data. That privileged user can then setup an environment to house the sensitive data in an admin-only

accessible schema.

Although the above is not the most efficient solution, it does allow for the separation of responsibilities.

12.3. Database Administration

Schemas allow a database administrator (DBA) to separate database users from database object owners. The DBA

has more flexibility to protect sensitive objects in the database, and also to group logical entities together.

If you do not enable multi-schema mode, you will default to a pre-7.0.3 release IPDA environment: a single

schema system. This is essentially a public database schema, which does not offer grouping of objects,

containment for abnormal events, or better security. Single schema environments are recommended when there

are very few users who cooperate with each other, and when there is no need to separate sensitive data between

the users.

As mentioned earlier, it is recommended that new IPDA release 7.0.3 installations enable multiple schema

support using the variable enable_schema_dbo_check, located in the /nz/data/postgresql.conf file. This variable is

a system level variable implying multi-schema support is enabled for the whole IPDA system – including all

databases. It does not apply on a per database level. You should also set the variable enable_user_schema to

TRUE, allowing IPDA to automatically create schemas that match user names if they do not already exist.
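
Before restarting the system, it can be useful to confirm what the configuration file actually contains. The sketch below assumes the file uses plain "name = value" lines with "#" comments, which is an assumption about the file format; the helper name conf_value is hypothetical, and the demonstration runs against a temporary sample file rather than the real /nz/data/postgresql.conf.

```sh
# conf_value FILE KEY
# Prints the value of the last uncommented "KEY = value" line in FILE.
# (Hypothetical helper; assumes a simple name = value file format.)
conf_value() {
    sed -n "s/^[[:space:]]*$2[[:space:]]*=[[:space:]]*\([^#[:space:]]*\).*/\1/p" "$1" | tail -1
}

# Demonstration on a sample file (contents are illustrative only).
sample=$(mktemp)
cat > "$sample" <<'EOF'
# multiple schema support
#enable_user_schema = FALSE
enable_schema_dbo_check = 2
enable_user_schema = TRUE   # create user schemas automatically
EOF

echo "enable_schema_dbo_check: $(conf_value "$sample" enable_schema_dbo_check)"
echo "enable_user_schema:      $(conf_value "$sample" enable_user_schema)"
rm -f "$sample"
```

On a live system, the SHOW commands used in the script in section 6 report the values the server is actually running with, which may differ from the file until the system is restarted.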

Through schemas, a DBA can control access to critical objects that would otherwise be open to intentional or

unintentional, and potentially harmful, changes by users. Schemas are often considered a safety net and, as a best

practice, should be viewed as such. For example, scripts are often used in many database environments to perform

varied functions, including unloading and loading data, database maintenance, and periodic deployments. Users allowed

to execute these scripts to accomplish these functions should have limited permissions, often only inside their

schema, so that their actions are contained. This helps prevent any problems from affecting other users/data.



As a best practice, you should set up shared database objects in their own schema. Shared objects can include tables, views, stored procedures, functions, third-party objects, and so on. You will need to set up the necessary GRANTs to allow users working in other schemas to access objects in the shared schema. Users accessing objects in the shared schema must at a minimum use the schema.object notation, which may require code changes, or, more simply, include the shared schema name in their schema path or in the global search_path variable.
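
The shared-schema setup described above can be sketched as follows. This is a minimal illustration rather than a definitive procedure: the schema name shared, the table region, and the user user1 are hypothetical, and your environment may require privileges beyond the single SELECT grant shown.

```sql
-- Hypothetical names throughout: schema "shared", table "region", user "user1".
-- Run as a user who is allowed to create schemas in the current database.
CREATE SCHEMA shared;

CREATE TABLE shared.region (
    region_id   INTEGER,
    region_name VARCHAR(40)
);

-- Allow a user who works in another schema to read the shared table.
GRANT SELECT ON shared.region TO user1;

-- user1, connected to the same database, qualifies the object with its schema.
SELECT region_id, region_name FROM shared.region;
```

Qualifying the table as shared.region keeps the reference unambiguous even if user1 has a table of the same name in their own schema.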

Note that the IPDA database group named public includes all database users – they inherit all of that group’s

privileges. Therefore, once you adopt multiple schemas, if you had set up a default set of permissions for all IPDA user accounts in a previous release, you will need to revisit them and determine whether they would cause any security violations in your new multiple schema environment. You may have to revoke access from some or all users so that

they are constrained to their individual schemas.

Views can also be created in separate schemas to hide sensitive, or unnecessary, information that is located in the base tables. The base tables can be in the same or a different schema, but the view creator must have access to those tables (for example, GRANT SELECT ON table TO view_creator). Shared views that provide access to the non-sensitive data can be very useful and are easily managed. Since the underlying tables are never exposed, this setup provides another layer of abstraction and thereby increases security.
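
As a sketch of this view-based approach, the statements below expose only non-sensitive columns of a base table through a view in a shared schema. All names (schemas hr and shared, table employee, view employee_directory, users view_creator and analyst1) are hypothetical, and the example assumes view_creator is also permitted to create views in the shared schema.

```sql
-- Hypothetical names: base table hr.employee, view shared.employee_directory,
-- users view_creator and analyst1.

-- The view creator needs read access to the base table.
GRANT SELECT ON hr.employee TO view_creator;

-- Run as view_creator: expose only the non-sensitive columns.
CREATE VIEW shared.employee_directory AS
SELECT emp_id, emp_name, department
FROM hr.employee;   -- salary and other sensitive columns are left out

-- Other users are granted access to the view only, never to hr.employee.
GRANT SELECT ON shared.employee_directory TO analyst1;
```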

Restricting third party or connection pooling applications to one schema using the application id is considered a

best practice. Restricting users to different schemas based on their job title/function is another best practice

that should be considered. Both isolate the application/functions from other user activities and avoid unexpected

or catastrophic consequences.

12.4. Development

One important use of multiple schemas in a development environment is object versioning. For

example, many developers could create their own version of an object and work on it. Each developer can first

“pull” a related set of objects from a shared schema into their own schema. They can modify their own version of

the object(s) and, when they have completed testing, etc., they can “put” the new version of the object back into

the shared schema using an established standard. This setup ensures that the original shared object definitions are

always visible to others for use, and also allows everyone to immediately see changes once object definitions have

been updated in the shared schema.

Setting the schema path and the search_path variable would allow the above functionality to work seamlessly.

Also, since each individual developer is normally isolated in most development environments, the above should

not result in any conflicts. Developers usually work on a specific area of code/application; therefore, the

likelihood of two developers working on the same object is often minimal. Further, because a “put” of a new version

of an object into the shared schema would be controlled either through security (ALTER privilege) or some

internal standard, it is unlikely two developers will update the original schema at the same time.

Many development environments work using a modularized model. Creating separate schemas for each module

and ensuring developers work in their respective modules as a “team” could help avoid collisions/accidental

“changes”. Also, this would more clearly identify the objects used in each module. Any objects shared between

modules could be set up in a shared schema with the necessary privileges.

As discussed earlier, creating a schema for each team – developer, testing, QA, production – and ensuring

members of each team can only access their respective schema, can lead to efficient software development.

However, as the application/changes move through the various software lifecycle stages, only a single individual

should have the authority to “update” the objects in each team’s schema. Further, creating a backup schema at

each stage would help facilitate “rollbacks”.


Note that when developing applications, the number of objects within databases can easily run into hundreds, if

not thousands. It is under these circumstances that multiple schemas prove extremely beneficial. They help

organize the objects logically, provide the necessary security, and avoid possible problems; their benefits

outweigh their costs.

12.5. Business Intelligence

Business Intelligence (BI) applications and tools mine data within EDWs/Marts to obtain insights into what has

happened within an organization, and also what could be ahead. Since most ETL can be found in these same

environments, it is important to segregate the ETL and BI activities.

Creating multiple schemas for ETL activity is recommended, as are separate schemas for BI tools/applications. For example, simple reporting applications could be in their own schema, whilst applications that heavily mine the data to predict future activity/events could be in another schema. The key is that non-ETL activities should be segregated from ETL and from one another so that they do not interfere with each other.

Note that it is not necessarily the data that should be segregated into multiple schemas (although this can be done where practical), but especially the objects, such as stored procedures and functions, that are directly called/used by the BI tools.

Creating shared and private views in their respective schemas can also help segregate information for reports,

data mining, and/or loads. The use of views within multiple schemas can further enhance security where

exposure of underlying data needs to be restricted based on role.

Finally, warehouse environments that support ad-hoc mining could also benefit from using multiple schemas. You could set up a shared schema where data is simply read by all similar users; however, any temp objects that are created to support the analysis can be in the schema of the user. This avoids conflicts and allows separation of the permanent tables/objects from the “temp” objects that may be created as part of the mining process.
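
The ad-hoc mining pattern above might look like the following sketch. The schema shared, the table sales with its columns, and the analyst schema user1 are hypothetical names used only for illustration; the analyst is assumed to have SELECT access to the shared table and the right to create tables in their own schema.

```sql
-- Hypothetical names: shared.sales is read-only for analysts;
-- user1 is the analyst's own schema.

-- The working table lives in the analyst's schema; shared data is only read.
CREATE TABLE user1.top_customers AS
SELECT customer_id, SUM(amount) AS total_amount
FROM shared.sales
GROUP BY customer_id;

-- Remove the working table once the analysis is finished.
DROP TABLE user1.top_customers;
```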

12.6. Scientific/Labs/Class

Many scientific environments work with a suite of application packages to perform their analysis, and record their observations/test results in a DBMS. There are often many scientists and assistants recording data. Each would normally work in their own database environment but have access to a common set of public tables.

When these IPDA environments are upgraded to release 7.0.3, they should remain in compatibility mode, so that

they continue to perform their functions as before. However, if the users upload information using their own ids,

they could take advantage of multiple schemas, wherein they would simplify database management by recording

each test/observation in their own schema. This also provides them the flexibility to update/modify/delete data in

their own environments, without risk of affecting others.

Another scenario where multiple schemas would be useful is in the classroom environment. When students are

following the same lab/class exercise, they would normally create the same objects. In previous IPDA versions,

they would have to create separate databases to avoid conflicts. However, with multiple schemas in release 7.0.3,

the DBA/ADMIN could simply set enable_user_schema = TRUE, which would allow each student to work in

his/her schema without any conflicts.

12.7. HA/Replication

High-Availability (HA) replication environments often move data from a production environment to one or more secondary (backup) servers. Multiple schemas can be used on the primary/secondary servers to differentiate the data that is sent and received, and to serve multiple business purposes.


For example, you could create a production and a non-production schema in a database on the primary, both with the same tables. In real time, you could replicate/move the transactions from the production schema to the non-production schema. This provides the first level of high availability. The non-production schema can then replicate the data asynchronously to an HA server. Best of all, this non-production schema can serve as an operational data store that provides real-time mining of the data. This prevents conflicts in reading and replicating the data, and isolates the production schema from unexpected problems.

The data that is replicated to the HA server(s) can be received and moved into several schemas for

transformational, data mart, or other activities.

It is important to note that even though the IPDA appliance is a warehouse appliance, there is an increased

urgency to report/analyze the data; in effect, DSS environments are becoming comparable to OLTP in their

importance, availability, and concurrency.

13. Backup and Restore (BAR)

With the introduction of multiple schemas in Release 7.0.3, the nzbackup and nzrestore commands have

changed.

Now, when you back up a database using the nzbackup command, the backup includes the data for all the schemas in the database. However, backup of specific schema objects is not supported.

When you restore a database that includes multiple schemas, nzrestore will restore all of the schemas unless you

are restoring to a system that does not support schemas. In this case, the restore process attempts to restore all the

objects in the single, default schema for a database. If there are name conflicts, such as tables in different schemas

that use the same name, the restore reports an error.

Conversely, if you are restoring a database created on a system that did not use multiple schemas to a system that

does support multiple schemas, the restore should complete without error. The restore creates a schema that

matches the owner name of the database in the backup, and restores the objects to that new schema.

Any stored procedures that you create in any schema are backed up and restored by the IPDA backup and restore

operations. As a best practice, you should keep backup copies of your source CREATE OR REPLACE

PROCEDURE definitions in a safe location outside of the IPDA system. Make sure that you have recent backups

of your IPDA systems in the event that you need to recover from an accidental deletion or change, or to restore

IPDA services as part of a disaster recovery situation.

Note that as of the IPDA 7.0.3 release, schema-level backups and restores are not supported.

14. Upgrade / Downgrade

14.1. Upgrade

To support compatibility mode, the existing postgresql.conf settings, catalog tables, and views do not change when

upgrading to IPDA release 7.0.3 from a previous release.

However, it is recommended to ensure that multiple schema support is disabled prior to the upgrade; that is, verify that the variable enable_schema_dbo_check is set to 0. This will avoid any initial issues/conflicts related to multiple

schema support in release 7.0.3, and allow you time to plan and test your applications with a phased approach as

discussed earlier.


As part of the upgrade process to release 7.0.3, a parent record is created for functions and aggregates. A default

schema is also created for every user database during upgrade. This schema will have the same name as the

database owner – exactly as in prior versions. However, this default schema is now an explicit object.

If you changed the owner of the database in the prior version, the default schema name references the new owner. You can see the name of the new owner in the output of the SHOW SCHEMA command. However, if you change the owner of the database after the upgrade, the name of the default schema does not change. In release 7.0.3, you can change the name of the schema and its owner as separate entities, but these are two different operations, and changing the owner does not rename the default schema.

14.2. Downgrade

Once you create multiple schemas in a database, in release 7.0.3, you cannot automatically downgrade to a

previous release.

You must undo all your changes in release 7.0.3 to return to a single schema environment. You can either drop all

the new schemas except the default schema, which will drop all the objects, or move all objects that are in the

new schemas into the default schema and then drop the schemas.

You can use the ALTER TABLE statement to move a table from the current schema to a new schema. For

example, to move table CUSTOMER from the schema user1 to schema user2, you can execute the following

statement:

sales.user1(user1)=>ALTER TABLE customer RENAME TO user2.customer;

Be aware that you cannot change the database a table exists in! Use the CREATE TABLE AS … statement in the

new database environment to move the table over. This is similar to what was done in previous releases.
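
A minimal sketch of the CREATE TABLE AS approach is shown below. It assumes the session is connected to the target database (the name archive is hypothetical), that the source table is sales.user1.customer, and that the user has SELECT access to the source table from the target database.

```sql
-- Hypothetical target database "archive"; source table sales.user1.customer.
-- Run while connected to the target database: the source table is read
-- through its three-level name, and the copy is created in the current
-- database and schema.
CREATE TABLE customer AS
SELECT * FROM sales.user1.customer;

-- After verifying the copy, connect to the sales database and drop the
-- original if it is no longer needed:
-- DROP TABLE user1.customer;
```

The copy is a new table, so any privileges granted on the original have to be granted again on the copy.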

Also, you must undo changes to any postgresql.conf settings made in release 7.0.3. This includes any changes to the

enable_schema_dbo_check variable. The view version of client drivers must also be reverted, if it was changed. All

these are manual activities and not part of the downgrade process.

Note that the two internal schemas are automatically dropped during the downgrade process.

15. Catalog/Control Tables and other Database Objects

Release 7.0.3 introduces many updates to existing tables and views to support fully qualified names, multiple

schemas, and other improvements.

Important: Unless multiple schema support is enabled, you will not see any changes to catalog tables and views.

This is to facilitate seamless backward compatibility immediately after upgrading to IPDA release 7.0.3.

The _t_object table now has a new schema column. The table _t_database includes a column to store the

default schema.

In IPDA, all object references by name are resolved against the object table - _t_object. This table contains the

object id (oid), which can be used to join with other catalog tables to get specific details for that object. Since the

object table now also holds the schema oid, this will allow you to find more information on schemas when joining

with other catalog tables. Note that the _t_object table is the only place where the object oid and the schema oid

and their associated names are listed.


The security tables, _t_acl, _t_grpobj_priv, _t_usrobj_priv, all have a new column named schema.

A new class named Schema Class has been added to the _t_object table so that all schemas have a code that identifies them as schemas. A new table named _t_schema has been introduced to provide information on the definition of each schema, including the SQL path values.

Also, all system views now include two new columns to hold the physical schema name and schema object id

(oid).

Be aware that there are changes to system indexes in release 7.0.3. The unique index on relname in the _t_class

table has been removed since relname is no longer unique inside of a database. This is because you can have the same

table name in different schemas. Also, some of the indexes on the _t_object and _t_acl tables have been

updated to include the schema column.
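
To illustrate why relname can no longer be unique, the sketch below creates a table named customer in two schemas and then looks both up in the catalog. The schema names john and mary are illustrative. The query is an assumption in two respects: it presumes the _v_table system view exposes the schema name in a column called schema, and it presumes the default uppercase system case; check the view definition on your own system before relying on either.

```sql
-- Illustrative schemas john and mary, each holding a table named customer.
CREATE SCHEMA john;
CREATE SCHEMA mary;
CREATE TABLE john.customer (cust_id INTEGER, cust_name VARCHAR(50));
CREATE TABLE mary.customer (cust_id INTEGER, cust_name VARCHAR(50));

-- ASSUMPTION: _v_table has a column named "schema" holding the schema name,
-- and identifiers are stored in uppercase (default system case).
SELECT schema, tablename
FROM _v_table
WHERE tablename = 'CUSTOMER';
```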

Note: When you execute queries against the system catalog tables, for performance reasons, you should always

use the columns defined in the indexes in the WHERE clause of the SQL statement. Using indexed columns will

force an index scan; otherwise, a full scan will be performed.

In release 7.0.3, the history configuration version has incremented from 1 to 2. The version 2 history tables and

views have numerous updates to support multiple schemas, the ability for users to change databases and

schemas within a session, and timezone offsets. When you create a history database and enable history collection,

specify version 2 to make sure that you collect the complete information for activity on the 7.0.3 or later system.

Synonyms and sequences share the same naming restrictions as tables, views, and functions: their names must be unique within a schema. However, synonym names cannot be the same as the names of global objects such as databases, users, or groups. Also, the owner of the schema where a sequence is defined has full privileges on all user sequences in that schema. There is no need to grant any privileges to the owner.

16. Stored Procedures / UDFs / UDAs

16.1. Creating a Stored Procedure in a Schema

When you create a stored procedure using the CREATE [OR REPLACE] PROCEDURE command, the command

adds the procedure to the database and schema to which you are connected. Starting with version 7.0.3, you can

specify a name in the format schema-name.procedure-name to create a procedure in a different schema of the current

database. For example:

• To create a new procedure called cust_name in your current schema:

sales.user1(user1)=>CREATE OR REPLACE PROCEDURE cust_name()

RETURNS INT8 LANGUAGE NZPLSQL AS BEGIN_PROC BEGIN RAISE

NOTICE 'The customer name is John Doe'; END; END_PROC;

• To create a new procedure called cust_name in a different schema of the same database:

sales.user1(user1)=>CREATE OR REPLACE PROCEDURE user2.cust_name()

RETURNS INT8 LANGUAGE NZPLSQL AS BEGIN_PROC BEGIN RAISE

NOTICE 'The customer name is John Doe'; END; END_PROC;

You can use the same naming convention with the ALTER/DROP PROCEDURE statements to change/drop a

procedure in the same/different schema of the current database. For example, to change the owner for a

procedure in a different schema, enter:


sales.user1(user1)=>ALTER PROCEDURE user2.myproc(int4) OWNER TO john;

Note that you cannot create/change/drop a procedure in a different database.

Important: With multiple schemas, if you define stored procedures in a non-default schema, dropping that

schema will drop all the objects in that schema.
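
The effect described in this note can be sketched with the two forms of DROP SCHEMA. The schema name user2 is only an example, and the exact behavior of RESTRICT and CASCADE should be confirmed against the SQL command reference for your release before running either statement on a real system.

```sql
-- Example schema name user2.
-- RESTRICT is intended to refuse the drop while the schema still contains
-- objects, which makes it the safer first attempt.
DROP SCHEMA user2 RESTRICT;

-- CASCADE drops the schema together with everything defined in it,
-- stored procedures included. Shown commented out on purpose.
-- DROP SCHEMA user2 CASCADE;
```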

16.2. Using Stored Procedures

You can execute a procedure while connected to a database and schema in which it is defined, in other schemas of

the same database, or in other databases. Assuming you have the necessary privileges, the following examples

use various notations of the fully qualified format (database-name.schema-name.object-name) when calling a

procedure object that resides within the current/different database or schema:

sales.user1(user1)=>CALL calc_sales();

sales.user1(user1)=>EXEC user1.calc_sales();

sales.user1(user1)=>EXECUTE sales.user1.calc_sales();

sales.user1(user1)=>EXECUTE PROCEDURE calc_sales();

sales.user1(user1)=>EXEC orders.user2.calc_sales();

sales.user1(user1)=>EXEC user3.calc_sales();

You can also use the fully qualified object notation to set the scope of object privileges from any database - where

object is either the class PROCEDURE for all stored procedures, or a full signature such as

customer(VARCHAR(20)). You can do the same to set the scope of administrative privileges from any database.
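
Following the description above, granting execute privileges with the fully qualified notation might look like this sketch. The grantee john and the location sales.user1 are examples, and the exact form accepted for the PROCEDURE class should be verified against the GRANT command reference for your release; it is written here as an assumption based on the database-name.schema-name.object-name pattern.

```sql
-- Example grantee "john"; customer(VARCHAR(20)) is the signature quoted above.

-- Execute privilege on one specific procedure, identified by its full signature.
GRANT EXECUTE ON sales.user1.customer(VARCHAR(20)) TO john;

-- ASSUMPTION: the PROCEDURE class can be qualified the same way to cover
-- all stored procedures in that database and schema.
GRANT EXECUTE ON sales.user1.PROCEDURE TO john;
```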

16.3. Functions / UDFs

Similar in behavior to the CREATE TABLE statement, the CREATE FUNCTION statement will fail if a user-

defined function with the same name and signature already exists in the database, or in the same schema.

However, you can now also use any of the below SQL statements to change/create/drop the object in a different

schema of the current database:

• ALTER FUNCTION (schema.function)

• ALTER LIBRARY (schema.library)

• CREATE [OR REPLACE] AGGREGATE (schema.aggregate)

• CREATE [OR REPLACE] LIBRARY (schema.library)

• DROP AGGREGATE/FUNCTION/LIBRARY (schema.aggregate/function/library)

None of the above SQL statements can be used across a different database.

Note that when you create an aggregate or function, its signature (that is, its name and argument type list) must

be unique within a schema. No other aggregate or function can have the same name and argument type list in the

same database or schema. As a best practice, avoid creating duplicates of the same function in different schemas

and, instead, always use fully qualified names to reference functions in the different schemas. This can help

reduce the overhead to maintain multiple copies of the same function and ensure the correct UDF is executed.
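
As an illustration of this best practice, a single copy of a UDF kept in a shared schema can be called by its fully qualified name from any schema. The function normalize_name, the schema shared, and the table and column names below are hypothetical.

```sql
-- Hypothetical names: UDF sales.shared.normalize_name(VARCHAR),
-- table sales.user1.customer with columns cust_id and cust_name.

-- Qualifying the function with database and schema removes any doubt about
-- which copy is executed, regardless of the caller's current schema.
SELECT cust_id,
       sales.shared.normalize_name(cust_name) AS clean_name
FROM sales.user1.customer;
```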

Also, the nzudxcompile command now has the new schema option that specifies the schema in the database

where you want to register the object. If you do not specify a schema, and the NZ_SCHEMA environment

variable is not set, be aware the command uses the default schema for the database, unless IPDA is configured

with enable_user_schema = TRUE, which would then place the UDX in a schema that matches the user name.


16.4. Procedure/Function Privileges

By default, the admin user account has execute access to all stored procedures, user-defined functions (UDF), and

aggregates. The database owner has privileges to run these objects in the database. Other users can be given

privileges to run specific or all objects. Starting with release 7.0.3 the schema owner has full permission to these

objects within that schema. In addition, the object owner (user who created and registered the UDF) has

permission to manage and execute the stored procedures that he or she owns.

17. IPDA Utilities / Interfaces

17.1. CLIENT CONNECTIVITY

IPDA clients and CLIs now include support for multiple schemas.

There are multiple versions of system views for each IPDA driver. Drivers are structured in such a way that they

call different views internally. Clients on a release prior to 7.0.3 that connect to a 7.0.3 host may find some

command output confusing, because the older commands do not show schema information that can help to

uniquely identify objects.

In IPDA releases prior to 7.0.3, the default view is version 2, which returns the object owner as the schema field –

this is the name of the user who created the database. When moving to release 7.0.3, it is recommended to change

the views of the driver that ODBC and JDBC clients are using from version 2 to 3. This version will return the

schema field in the schema column attribute of those views. Since this is a system-wide change, the change will

affect all clients!

OLE DB and .NET clients should move from view version 1 to 2 to take advantage of the additional schema

information.

17.2. NZLOAD

The nzload command now includes the new -schema option to support multiple schemas. This option specifies the schema in which to load the table. If you do not specify the -schema option, IPDA uses the value of the NZ_SCHEMA environment variable; if that variable is not set either, the default schema for the database is used.

The nzload control file, which allows you to define load operations in a text file without having to specify the

options on the command line, also includes the schema option to support schema specific table loads. For

example, the following control file options define two data sets to load into two tables with the same name but in

different schemas – john.customer and mary.customer.

DATAFILE /home/operation/data/j_customer.dat
{
    Database sales
    Schema john
    TableName customer
    Delimiter '|'
    Logfile j_logfile.log
    Badfile j_customer.bad
}

DATAFILE /home/imports/data/m_customer.dat
{
    Database sales
    Schema mary
    TableName customer
    Delimiter '#'
    Logfile m_logfile.log
    Badfile m_customer.bad
}


The default nzlog and nzbad log files created by nzload now include a schema name such as

table.schema.database.nzlog and table.schema.database.nzbad. As a best practice, if you are running multiple nzload

jobs to load into a table, you should use unique names for your nzbad files.

Note: If your default system case is uppercase, IPDA displays lowercase table names as uppercase in nzlog files,

for example, CUSTOMER.USER1.SALES.nzlog and CUSTOMER.USER1.SALES.nzbad.

As a best practice, because table names are no longer unique within a database, you should always fully qualify

the objects you are loading to avoid data insertion into the wrong tables. This is especially true given that IPDA

will use the object in the default schema if the -schema option and NZ_SCHEMA are not set.

17.3. Other IPDA Utilities

Starting with IPDA release 7.0.3, there have been several changes to many of the IPDA utilities to support

multiple schemas:

• The NZADMIN GUI interface now has a Schemas option to display all schemas within a database.

• nzdumpschema now has the new schemaList option which specifies one or more schemas within the

database to extract information for. By default, the command extracts information for all the schemas

in a database. You can specify one or more schemas in a space-separated list to extract only the

contents of the specified schemas. Note that the command ignores the NZ_SCHEMA setting.

• You can now use the -schema option to specify the schema in which to groom the tables with nzreclaim. This option is compatible with the existing -db option, but not with the -allDbs option. If you specify -schema and -allTbls, the command grooms all the tables in the schema. If you do not specify a schema, IPDA uses NZ_SCHEMA; if that is not set either, the default schema for the database is used.

• The nzsession command, used to view and manage sessions, now also includes schema information.

18. Other

In release 7.0.3, many of the messages IPDA displays have been standardized. Because object names are no

longer unique, three-level names have been included in error message text (database-name.schema-name.object-

name). You therefore need all three names to uniquely identify an object.

Also, error messages are now more consistent. For example, if an object does not exist, a standard message will be

displayed, versus different flavors of the same message.

There is a new version of query history configuration – version 2. If you use the SET CATALOG / SET SCHEMA

statements, version 2 of query history will capture these statements. However, version 1 does not capture them, so if you use version 1 and later decide to switch to version 2, the statements issued in the meantime will be missing from the history. In effect, you will be losing data!

Because of this, after an upgrade, it is recommended that you switch to query history schema version 2, so that

you do not lose any data. IPDA does not migrate your version 1 data to version 2. Once you start using release

7.0.3, you will be collecting new data under version 2. It is therefore recommended to use the CREATE HISTORY CONFIGURATION command to create a version 2 configuration for query history logging after an upgrade.
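
A version 2 configuration could be created along the lines of the sketch below, using the full statement name CREATE HISTORY CONFIGURATION. The configuration name hist_v2, the history database histdb, the user histusr, and the password are hypothetical, the history database is assumed to exist already, and the clause values other than VERSION 2 are illustrative only; consult the command reference for the clauses your site needs.

```sql
-- Hypothetical names: configuration hist_v2, history database histdb,
-- history user histusr. Only VERSION 2 is the point of this example.
CREATE HISTORY CONFIGURATION hist_v2
    HISTTYPE QUERY
    DATABASE histdb
    USER histusr
    PASSWORD 'histusrpw'
    COLLECT QUERY, PLAN, TABLE, COLUMN
    VERSION 2;

-- Select the new configuration; it is expected to take effect only after
-- the system has been restarted.
SET HISTORY CONFIGURATION hist_v2;
```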

Important: As of the IPDA 7.0.2 GA release, IBM IPDA Analytics (INZA) is not supported.


19. Conclusion

In summary, just as folders and directories allow you to organize information, schemas allow the logical

separation of objects, but still allow the objects to work together as required. Schemas and their objects are

mutually dependent, and therefore enforce uniqueness in the database by forcing you to qualify database objects.

This makes development easier, simplifies security, permits object name reuse, avoids conflicts, and allows

objects to be manipulated independently of users.

The use of multiple schemas is encouraged as the benefits often outweigh the costs – however, overuse, or the

use of schemas for the wrong reasons, can result in additional and unnecessary overhead. You should therefore

always ask how the use of multiple schemas in your situation will be beneficial, prior to implementation.
