welcome to the data analytics toolkit powerpoint...

Welcome to the Data Analytics Toolkit PowerPoint presentation on EHR

architecture and meaningful use.

When data is collected and entered into the electronic health record, the data

is ultimately stored in a database. When analyzing the objectives related to

Meaningful Use, data needs to be extracted from the database. It is likely that

you’ve all worked with databases in some capacity. Most of you are probably

familiar with Microsoft Excel’s spreadsheets. This is a flat-file type of database.

However, it is unlikely that the data that must be acquired for Meaningful Use

is stored in a flat-file database. That is because this type of database has

several limitations: it can’t handle large datasets (an Excel worksheet holds

just over 1 million rows), it is not designed to be queried with a programming

language, and it is not multi-user friendly.

A better approach that has been adopted by most healthcare organizations is

to store the data in a relational database, which is defined in terms of the

relations among the data. These databases can handle very large datasets, can be

accessed and queried using a programming language, support multiple users,

and can be accessed and updated remotely.

A relational database is structured into tables, and each table represents an

entity, or noun. For instance, one table might have information about a

patient and therefore be referred to as the ‘Patient’ table. The adjectives

that describe the entity are known as attributes and are listed in the

table. For the patient table, the attributes may include gender, state of

residence, year of birth, name, etc. There are typically many different tables

and they often can be related to other tables based on common attributes. The

common attributes are known as relations and they are the verbs that connect

two entities. For instance, we might have a patients table and a medications

table, which lists all the medications patients are taking. Both tables might

share the common attribute of a patient identifier and therefore can be related.

The relation is similar to a verb in that a patient TAKES medication. The

relation between the patients table and medications table can be used for

combining data for queries.
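
The idea of combining two related tables in a query can be sketched with Python's standard-library sqlite3 module. This is only a minimal illustration, not the schema of any actual EHR: apart from PatientID, the column names and all of the sample rows are assumptions made for the example.

```python
import sqlite3

# In-memory database with two hypothetical tables that share PatientID.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Patient (
        PatientID INTEGER PRIMARY KEY,
        Gender TEXT,
        State TEXT
    );
    CREATE TABLE Medication (
        MedicationID INTEGER PRIMARY KEY,
        PatientID INTEGER,
        MedicationName TEXT
    );
    INSERT INTO Patient VALUES (1, 'F', 'MN'), (2, 'M', 'WI');
    INSERT INTO Medication VALUES (10, 1, 'metformin'), (11, 1, 'lisinopril');
""")

# The shared PatientID attribute is what lets the two tables be combined.
rows = conn.execute("""
    SELECT p.PatientID, p.Gender, m.MedicationName
    FROM Patient AS p
    JOIN Medication AS m ON m.PatientID = p.PatientID
    ORDER BY m.MedicationID
""").fetchall()

print(rows)  # [(1, 'F', 'metformin'), (1, 'F', 'lisinopril')]
```

Patient 2 does not appear in the result because that patient has no rows in the medication table, which is the behavior of an ordinary (inner) join.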

Here is a picture of an entity-relation diagram, which represents a relational

database. This database includes patient, medications, and diagnosis tables.

The medication and diagnosis tables can be related to the patient table

because they all share the attribute, “PatientID”. Similarly, the medications

table can be linked to the diagnosis table because they share the attribute,

“DiagnosisID”.

There are three different types of relationships that

can exist between tables. These types are

known as rules of cardinality.

The first type of relationship is a Many-to-Many

relationship. These types of relationships are very

common in the real world, but not in a relational

database. An example would be that many different

physicians may prescribe many different

medications. The problem with these types of

relationships is that storing them directly means

repeating data, which leads to a less efficient

database. A solution to this problem is to create a

third, intervening table called an intersection table.

This breaks the many-to-many relationship into two

one-to-many relationships.
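
The intersection-table approach can be sketched as follows, again using Python's sqlite3 module. The table names (Physician, Drug, Prescribes) and the sample rows are hypothetical and chosen only to mirror the physician and medication example above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Physician (PhysicianID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Drug (DrugID INTEGER PRIMARY KEY, Name TEXT);

    -- Intersection table: each row pairs exactly one physician with one drug,
    -- so it sits on the "many" side of two one-to-many relationships.
    CREATE TABLE Prescribes (
        PhysicianID INTEGER REFERENCES Physician(PhysicianID),
        DrugID INTEGER REFERENCES Drug(DrugID),
        PRIMARY KEY (PhysicianID, DrugID)
    );

    INSERT INTO Physician VALUES (1, 'Dr. A'), (2, 'Dr. B');
    INSERT INTO Drug VALUES (1, 'metformin'), (2, 'lisinopril');
    INSERT INTO Prescribes VALUES (1, 1), (1, 2), (2, 1);
""")

# One physician, many drugs.
drugs_per_physician = conn.execute("""
    SELECT ph.Name, COUNT(*)
    FROM Physician AS ph
    JOIN Prescribes AS pr ON pr.PhysicianID = ph.PhysicianID
    GROUP BY ph.PhysicianID
    ORDER BY ph.PhysicianID
""").fetchall()

# One drug, many physicians.
physicians_per_drug = conn.execute("""
    SELECT d.Name, COUNT(*)
    FROM Drug AS d
    JOIN Prescribes AS pr ON pr.DrugID = d.DrugID
    GROUP BY d.DrugID
    ORDER BY d.DrugID
""").fetchall()

print(drugs_per_physician)  # [('Dr. A', 2), ('Dr. B', 1)]
print(physicians_per_drug)  # [('metformin', 2), ('lisinopril', 1)]
```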

The second type of relationship is known as a

one-to-many relationship. This occurs when one instance

of an entity is related to one or more instances of another

entity. For example, one patient can have many

diagnoses. This is the most common type of

relationship in a relational database.

The last type of relationship is a one-to-one

relationship. This occurs when each instance of one entity is

related to one and only one instance of the other

entity. For example, one patient can only have one

date of birth. It is advised to combine entities into

one table if a one-to-one relationship exists.

In the entity-relation diagram shown on the previous

slide, there are three tables and three one-to-many

relationships. That is, one patient can have many

medications, one patient can have many diagnoses,

and one diagnosis can have many medications.

The type of relationship for a specific entity is represented as symbols on an

entity-relation diagram. If an entity has a relationship with another table where

one and only one record matches, this would be depicted as two straight lines.

If one or many records match that of another table, this is depicted as a

triangle and a line. A zero, one, or many relationship is depicted as a

triangle with a circle. Finally, a zero or one relationship is depicted as a line and

a circle. These symbols are important for interpreting an entity-relation

diagram and determining the type of relationship between two tables.

For instance, if we consider the entity-relation diagram shown previously, we

see that the relationship between the patient and medication tables is

one-to-many: each patient can be taking zero, one, or many medications, and

one and only one patient is assigned to each instance of a medication.

Keys are the attributes that link entities. A primary key is an attribute which can

uniquely identify a particular instance of an entity. For example, the primary

key for the Patient table shown previously is PatientID. It is important to realize

that a primary key must be distinct. Therefore, when considering this

characteristic for primary keys, would a patient’s full name be acceptable?

Probably not, as it may not be unique. A social security number may also not

be unique. A medical record number could be used; however, there are

instances where we find duplicate records and duplicate medical record

numbers for a single patient.
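
The uniqueness requirement can be checked directly against the data. The sketch below is a hypothetical illustration: the records, the FullName attribute, and the helper name is_unique are all made up for the example.

```python
from collections import Counter

# Hypothetical patient records; every value here is invented for illustration.
patients = [
    {"PatientID": 1, "FullName": "Maria Lopez", "YearOfBirth": 1980},
    {"PatientID": 2, "FullName": "John Smith", "YearOfBirth": 1975},
    {"PatientID": 3, "FullName": "John Smith", "YearOfBirth": 1990},
]


def is_unique(records, attribute):
    """Return True if no value of `attribute` appears in more than one record."""
    counts = Counter(record[attribute] for record in records)
    return all(count == 1 for count in counts.values())


print(is_unique(patients, "PatientID"))  # True
print(is_unique(patients, "FullName"))   # False: two patients share a name
```

A check like this only shows that an attribute is unique in the data at hand; whether it stays unique as new patients are added is a design question.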

When a table’s primary key is present in another table, this is known as a

foreign key. For instance, PatientID is present in both the medications and

diagnosis tables. Therefore, although PatientID is the primary key in the

patients table, it is also a foreign key in the medications and diagnosis tables.

The foreign keys are used to create a link between the different entities.

The primary keys are unique identifiers. Each table

has a primary key. The primary key in the patient

table is PatientID. The primary key in the

medications table is MedicationID, and the primary

key in the diagnosis table is DiagnosisID.

Foreign keys are shown in both the medications and

the diagnosis table. The foreign keys in the

medications table include PatientID and

DiagnosisID while the foreign keys in the diagnosis

table include PatientID.

Because the patient table does not have any

instances of medications or diagnoses, the patient

table does not have a foreign key. Consider the

one-to-many relationship. For each patient

there may be many diagnoses and medications.

Therefore, in order to link the patient table with the

other tables, we need to include the patient ID in the

medication and diagnosis tables. Anytime there is a

one-to-many relationship, the many side of that

relationship holds the foreign key.
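
As a rough sketch of how these keys might be declared, the three tables can be created in SQLite from Python with the foreign keys placed on the "many" side. Only the key columns named in this presentation are included, and the sample values are assumptions. Note that SQLite enforces foreign keys only when the foreign_keys pragma is turned on for the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
    -- The "one" side: no foreign keys.
    CREATE TABLE Patient (
        PatientID INTEGER PRIMARY KEY
    );
    -- The "many" side of patient-to-diagnosis holds PatientID as a foreign key.
    CREATE TABLE Diagnosis (
        DiagnosisID INTEGER PRIMARY KEY,
        PatientID INTEGER NOT NULL REFERENCES Patient(PatientID)
    );
    -- Medication is on the "many" side of two relationships.
    CREATE TABLE Medication (
        MedicationID INTEGER PRIMARY KEY,
        PatientID INTEGER NOT NULL REFERENCES Patient(PatientID),
        DiagnosisID INTEGER REFERENCES Diagnosis(DiagnosisID)
    );
    INSERT INTO Patient VALUES (1);
    INSERT INTO Diagnosis VALUES (50, 1);
    INSERT INTO Medication VALUES (100, 1, 50);
""")

# A medication row that points at a patient who does not exist is rejected.
try:
    conn.execute("INSERT INTO Medication VALUES (101, 999, 50)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(orphan_rejected)  # True
```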

Keys are used primarily for organizational purposes. Without

them, the tables would become cumbersome and impossible to link or

navigate. If you consider the way the data is stored in spreadsheet form, this

may become more apparent.

The patient table includes data on each of the patients. Each row has

information for one and only one patient. We have the gender, year of birth,

and state of residence for each patient.

However, because the PatientID also shows up in the diagnosis table, and

because each patient can have zero, one, or many diagnoses, we find that one

PatientID may show up there once, more than once, or not at all.

The medications table is very similar. The same PatientID may show up in the

medications table once, more than once, or not at all. Also, the same

medication for the same patient can be used for more than one diagnosis.

Therefore, the PatientID and MedicationID may match but the DiagnosisID

may differ for those rows of data.
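
The "once, more than once, or not at all" pattern can be seen by counting rows per patient. The following is a minimal sketch with made-up rows; a LEFT JOIN is used so that patients with no diagnoses still appear, with a count of zero.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Patient (PatientID INTEGER PRIMARY KEY);
    CREATE TABLE Diagnosis (
        DiagnosisID INTEGER PRIMARY KEY,
        PatientID INTEGER REFERENCES Patient(PatientID)
    );
    INSERT INTO Patient VALUES (1), (2), (3);
    -- Patient 1 has no diagnoses, patient 2 has one, patient 3 has two.
    INSERT INTO Diagnosis VALUES (50, 2), (51, 3), (52, 3);
""")

# LEFT JOIN keeps patients with no matching diagnosis rows (count of 0).
counts = conn.execute("""
    SELECT p.PatientID, COUNT(d.DiagnosisID)
    FROM Patient AS p
    LEFT JOIN Diagnosis AS d ON d.PatientID = p.PatientID
    GROUP BY p.PatientID
    ORDER BY p.PatientID
""").fetchall()

print(counts)  # [(1, 0), (2, 1), (3, 2)]
```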

A data dictionary is essential in order to fully understand the data elements

within a relational database. The data dictionary lists all of the attributes in

each table, and provides a brief description of the attribute, the data type (e.g.,

date/time, numerical), the length of the data in the field, and several other

fields that can provide information about the data.

For the example shown in this presentation, the data dictionary shows that the

patient table includes 5 columns of data. The PatientID is the primary key and

is a unique identifier data type which is never left blank. The definition of the

PatientID is a unique identifier for a patient record. Gender is stored as a

character that is 1 letter long and never left blank. The data is stored as “M” for

males and “F” for females. You can go through each of the other columns to see

their data type and descriptions.

The medication table includes 8 columns of data. The MedicationID is a unique

identifier for a patient medication and is the primary key. There are two foreign

keys: the PatientID, which is the unique identifier for the patient taking the

medication, and the DiagnosisID, which is the identifier for the diagnosis that the

provider linked to the medication.

The diagnosis table has a primary key called the DiagnosisID which is the

unique identifier for a patient diagnosis. The foreign key in this table is the

PatientID which is the unique identifier for the patient.
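
Part of a data dictionary can often be read back from the database itself. The sketch below assumes a simplified, hypothetical patient table (four columns rather than the five in the presentation's dictionary, with guessed column names and types) and uses SQLite's table_info pragma to list each column's name, declared type, whether it may be left blank, and whether it is the primary key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, simplified patient table; names and types are assumptions.
conn.execute("""
    CREATE TABLE Patient (
        PatientID TEXT NOT NULL PRIMARY KEY,
        Gender CHAR(1) NOT NULL CHECK (Gender IN ('M', 'F')),
        YearOfBirth INTEGER,
        State CHAR(2)
    )
""")

# PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk) per column.
dictionary = [
    {
        "column": name,
        "type": col_type,
        "required": bool(notnull),
        "primary_key": bool(pk),
    }
    for _cid, name, col_type, notnull, _default, pk in conn.execute(
        "PRAGMA table_info(Patient)"
    )
]
for entry in dictionary:
    print(entry)

# The rule that gender is stored as 'M' or 'F' is enforced by the CHECK constraint.
try:
    conn.execute("INSERT INTO Patient VALUES ('p1', 'X', 1980, 'MN')")
    gender_rejected = False
except sqlite3.IntegrityError:
    gender_rejected = True
```

The descriptions in a data dictionary (for example, what PatientID means) are not stored in the schema and still have to be documented by people.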

When considering EHR architecture, pay particular attention to the implications of

how the data is stored. The data that is needed for assessing the core and menu

objectives and clinical quality measures is derived from these databases.

Therefore, an understanding of relational databases is essential for

understanding the data and ensuring data quality.
