TRANSCRIPT
Welcome to the Data Analytics Toolkit PowerPoint presentation on EHR
architecture and meaningful use.
When data is collected and entered into the electronic health record, the data
is ultimately stored in a database. When analyzing the objectives related to
Meaningful Use, data needs to be extracted from the database. It is likely that
you’ve all worked with databases in some capacity. Most of you are probably
familiar with Microsoft Excel’s spreadsheets. This is a flat-file type of database.
However, it is unlikely that the data that must be acquired for Meaningful Use
is stored in a flat-file database. That is because there are several limitations
with this type of database: it can’t handle large datasets (more than about
1 million rows), it is not designed to be queried with a programming language,
and it is not multi-user friendly.
A better approach that has been adopted by most healthcare organizations is
to store the data in a relational database which can be defined in terms of the
relations of data. These databases can handle very large datasets, can be
accessed and queried using a programming language, support multiple users,
and can be accessed and updated remotely.
A relational database is structured into tables, and each table is named for an
entity, or noun. For instance, one table might have information about a
patient and therefore be referred to as the ‘Patient’ table. The adjectives
that describe the entity are known as attributes and are listed as columns in
the table. For the patients table, the attributes may include gender, state of
residence, year of birth, name, etc. There are typically many different tables
and they often can be related to other tables based on common attributes. The
common attributes are known as relations and they are the verbs that connect
two entities. For instance, we might have a patients table and a medications
table, which lists all the medications patients are taking. Both tables might
share the common attribute of a patient identifier and therefore can be related.
The relation is similar to a verb in that a patient TAKES medication. The
relation between the patients table and medications table can be used for
combining data for queries.
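As a minimal sketch of combining data through a shared patient identifier, the example below uses Python's built-in sqlite3 module with an in-memory database. The column names other than PatientID and MedicationID, and all of the sample rows, are assumptions made up for illustration; they are not taken from the database shown in the presentation.

```python
import sqlite3

# In-memory database with two small, hypothetical tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Patient (
        PatientID   INTEGER PRIMARY KEY,
        Gender      TEXT,
        YearOfBirth INTEGER
    );
    CREATE TABLE Medication (
        MedicationID   INTEGER PRIMARY KEY,
        PatientID      INTEGER,   -- shared attribute that relates the tables
        MedicationName TEXT
    );
    INSERT INTO Patient VALUES (1, 'F', 1950), (2, 'M', 1982);
    INSERT INTO Medication VALUES
        (10, 1, 'metformin'),
        (11, 1, 'lisinopril'),
        (12, 2, 'albuterol');
""")

# "A patient TAKES medication": join the two tables on the shared PatientID.
rows = conn.execute("""
    SELECT p.PatientID, p.Gender, m.MedicationName
    FROM Patient AS p
    JOIN Medication AS m ON m.PatientID = p.PatientID
    ORDER BY p.PatientID, m.MedicationID
""").fetchall()

for row in rows:
    print(row)
```

Each row of the result pairs a patient's attributes with one of that patient's medications, which is what the relation between the two tables makes possible.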
Here is a picture of an entity-relation diagram, which represents a relational
database. This database includes patient, medications, and diagnosis tables.
The medication and diagnosis tables can be related to the patient table
because they all share the attribute, “PatientID”. Similarly, the medications
table can be linked to the diagnosis table because they share the attribute,
“DiagnosisID”.
There are three different types of relationships that
can exist between tables. These different ways are
understood as rules of cardinality.
The first type of relationship is a Many-to-Many
relationship. These types of relationships are very
common in the real world, but not in a relational
database. An example would be that many different
physicians may prescribe many different
medications. The problem with these types of
relationships is that they lead to a less efficient
database. A solution to this problem is to create a
third, intervening table called an intersection table.
This breaks the many-to-many relationship into two
one-to-many relationships.
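The intersection-table idea can be sketched as follows, using Python's sqlite3 module. The Physician, Drug, and PhysicianDrug tables and their contents are hypothetical names chosen for this illustration and are not part of the example database in this presentation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Physician (PhysicianID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Drug      (DrugID      INTEGER PRIMARY KEY, Name TEXT);

    -- Intersection table: one row per (physician, drug) pairing.
    -- Each parent table has a one-to-many relationship with this table.
    CREATE TABLE PhysicianDrug (
        PhysicianID INTEGER REFERENCES Physician(PhysicianID),
        DrugID      INTEGER REFERENCES Drug(DrugID),
        PRIMARY KEY (PhysicianID, DrugID)
    );

    INSERT INTO Physician VALUES (1, 'Dr. A'), (2, 'Dr. B');
    INSERT INTO Drug VALUES (1, 'metformin'), (2, 'lisinopril');
    INSERT INTO PhysicianDrug VALUES (1, 1), (1, 2), (2, 1);
""")

# Walk through the intersection table to list who prescribes what.
pairs = conn.execute("""
    SELECT ph.Name, d.Name
    FROM PhysicianDrug AS pd
    JOIN Physician AS ph ON ph.PhysicianID = pd.PhysicianID
    JOIN Drug      AS d  ON d.DrugID       = pd.DrugID
    ORDER BY ph.PhysicianID, d.DrugID
""").fetchall()

for physician, drug in pairs:
    print(physician, "prescribes", drug)
```

Neither parent table needs to repeat rows to record the pairings; every pairing lives in the intersection table.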
The second type of relationship is known as a one-
to-many relationship. This occurs when an entity or
table is related to one or more instances of another
entity. For example, one patient can have many
diagnoses. This is the most common type of
relationship in a relational database.
The last type of relationship is a one-to-one
relationship. This occurs when both entities are
related by one and only one instance of the other
entity. For example, one patient can only have one
date of birth. It is advised to combine entities into
one table if a one-to-one relationship exists.
In the entity-relation diagram shown on the previous
slide, there are three tables and three one-to-many
relationships. That is, one patient can have many
medications, one patient can have many diagnoses,
and one diagnosis can have many medications.
The type of relationship for a specific entity is represented by symbols on an
entity-relation diagram. If an entity has a relationship with another table where
one and only one record matches, this is depicted as two straight lines.
If one or many records match that of another table, this is depicted as a
triangle and a line. A zero, one, or many relationship is depicted as a
triangle with a circle. Finally, a zero or one relationship is depicted as a line
and a circle. These symbols are important for interpreting an entity-relation
diagram and determining the type of relationship between two tables.
For instance, if we consider the entity-relation diagram shown previously, we
see that the relationship between the patient and medication tables is
one-to-many: a patient can have zero, one, or many medications, and
one and only one patient is assigned to each instance of a medication.
Keys are the attributes that link entities. A primary key is an attribute which can
uniquely identify a particular instance of an entity. For example, the primary
key for the Patient table shown previously is PatientID. It is important to realize
that a primary key must be distinct. Therefore, when considering this
characteristic for primary keys, would a patient’s full name be acceptable?
Probably not, as it may not be unique. A social security number may also not
be unique. A medical record number could be used; however, there are
instances where we find duplicate records and duplicate medical record
numbers for a single patient.
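One way to test whether an attribute could serve as a primary key is to look for values that appear more than once. The sketch below does this with Python's sqlite3 module. The PatientStaging table, its FullName and MRN columns, and the sample rows are all made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical staging table with no primary key declared yet.
    CREATE TABLE PatientStaging (FullName TEXT, MRN TEXT);
    INSERT INTO PatientStaging VALUES
        ('Maria Garcia', 'A001'),
        ('John Smith',   'A002'),
        ('John Smith',   'A003'),   -- a different patient with the same name
        ('Maria Garcia', 'A001');   -- a duplicate record for the same patient
""")

def duplicated_values(column):
    """Return (value, count) pairs for values that occur more than once.

    The column name is placed directly into the SQL text, so only pass
    trusted, hard-coded column names.
    """
    query = (
        f"SELECT {column}, COUNT(*) FROM PatientStaging "
        f"GROUP BY {column} HAVING COUNT(*) > 1 ORDER BY {column}"
    )
    return conn.execute(query).fetchall()

name_dupes = duplicated_values("FullName")
mrn_dupes = duplicated_values("MRN")
print("Repeated names:", name_dupes)
print("Repeated MRNs:", mrn_dupes)
```

Any column that returns rows from this check is not distinct as stored and could not be used as a primary key without first cleaning up the data.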
When a table’s primary key is present in another table, it is known in that
other table as a foreign key. For instance, PatientID is present in both the medications and
diagnosis tables. Therefore, although PatientID is the primary key in the
patients table it is also the foreign key in the medications and diagnosis tables.
The foreign keys are used to create a link between the different entities.
The primary keys are unique identifiers. Each table
has a primary key. The primary key in the patient
table is PatientID. The primary key in the
medications table is MedicationID, and the primary
key in the diagnosis table is DiagnosisID.
Foreign keys are shown in both the medications and
the diagnosis table. The foreign keys in the
medications table include PatientID and
DiagnosisID while the foreign keys in the diagnosis
table include PatientID.
Because the patient table does not have any
instances of medications or diagnoses, the patient
table does not have a foreign key. Consider the
one-to-many relationship. For each patient
there may be many diagnoses and medications.
Therefore, in order to link the patient table with the
other tables, we need to include the patient ID in the
medication and diagnosis tables. Anytime there is a
one-to-many relationship, the many side of that
relationship holds the foreign key.
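The key layout described here can be sketched with Python's sqlite3 module. Only the key columns named in the presentation are included; the sample values are assumptions. Note that SQLite enforces foreign keys only after they are switched on for the connection, and other database systems behave differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default

conn.executescript("""
    -- The "one" side: no foreign key.
    CREATE TABLE Patient (PatientID INTEGER PRIMARY KEY);

    -- The "many" sides hold the foreign keys.
    CREATE TABLE Diagnosis (
        DiagnosisID INTEGER PRIMARY KEY,
        PatientID   INTEGER NOT NULL REFERENCES Patient(PatientID)
    );
    CREATE TABLE Medication (
        MedicationID INTEGER PRIMARY KEY,
        PatientID    INTEGER NOT NULL REFERENCES Patient(PatientID),
        DiagnosisID  INTEGER REFERENCES Diagnosis(DiagnosisID)
    );

    INSERT INTO Patient VALUES (1);
    INSERT INTO Diagnosis VALUES (100, 1);
    INSERT INTO Medication VALUES (10, 1, 100);
""")

# A medication row that points at a patient who does not exist is refused.
try:
    conn.execute("INSERT INTO Medication VALUES (11, 999, NULL)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print("Orphan medication rejected:", rejected)
```

Declaring the foreign keys lets the database itself refuse rows that would break the link between tables.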
Keys are used primarily for organizational purposes. Without
them, the tables would become cumbersome and impossible to link or
navigate. If you consider the way the data is stored in spreadsheet form, this
may become more apparent.
The patient table includes data on each of the patients. Each row has
information for one and only one patient. We have the gender, year of birth,
and state of residence for each patient.
However, because the PatientID also shows up in the diagnosis table, and
because each patient can have zero, one, or many diagnoses, we find that one
PatientID may show up once, more than once, or not at all.
The medications table is very similar. The same PatientID may show up in the
medications table once, more than once, or not at all. Also, the same
medication for the same patient can be used for more than one diagnosis.
Therefore, the PatientID and MedicationID may match but the DiagnosisID
may differ for those rows of data.
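To see how often each PatientID appears on the "many" side, the following sketch counts diagnoses per patient with Python's sqlite3 module. The sample rows are assumptions chosen so that one patient has two diagnoses, one has a single diagnosis, and one has none.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Patient (PatientID INTEGER PRIMARY KEY);
    CREATE TABLE Diagnosis (
        DiagnosisID INTEGER PRIMARY KEY,
        PatientID   INTEGER
    );
    INSERT INTO Patient VALUES (1), (2), (3);
    INSERT INTO Diagnosis VALUES (100, 1), (101, 1), (102, 2);
""")

# LEFT JOIN keeps patients with no diagnosis rows; COUNT(d.DiagnosisID)
# ignores the NULLs produced for them, so they are reported with a count of 0.
counts = conn.execute("""
    SELECT p.PatientID, COUNT(d.DiagnosisID) AS DiagnosisCount
    FROM Patient AS p
    LEFT JOIN Diagnosis AS d ON d.PatientID = p.PatientID
    GROUP BY p.PatientID
    ORDER BY p.PatientID
""").fetchall()

for patient_id, n in counts:
    print(f"PatientID {patient_id} appears {n} time(s) in the diagnosis table")
```

An ordinary inner join would silently drop the patient with no diagnoses, which matters when a measure needs all patients in its denominator.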
A data dictionary is essential in order to fully understand the data elements
within a relational database. The data dictionary lists all of the attributes in
each table, and provides a brief description of the attribute, the data type (e.g.,
date/time, numerical), the length of the data in the field, and several other
fields that can provide information about the data.
For the example shown in this presentation, the data dictionary shows that the
patient table includes 5 columns of data. The PatientID is the primary key and
is a unique identifier data type which is never left blank. The definition of the
PatientID is a unique identifier for a patient record. Gender is stored as a
character that is 1 letter long and is not left blank. The data is stored as “M” for
male and “F” for female. You can go through each of the other columns to see
their data type and descriptions.
The medication table includes 8 columns of data. The MedicationID is a unique
identifier for a patient medication and is the primary key. There are two foreign
keys: the PatientID, which is the unique identifier for a patient taking the
medication, and the DiagnosisID, which is the identifier for the diagnosis that the
provider linked to the medication.
The diagnosis table has a primary key called the DiagnosisID which is the
unique identifier for a patient diagnosis. The foreign key in this table is the
PatientID which is the unique identifier for the patient.
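Part of a data dictionary can often be read from the database itself. As a rough sketch, the example below asks SQLite for the column metadata of a small Patient table using Python's sqlite3 module. The table definition is an assumption loosely based on the description above (YearOfBirth is a hypothetical column name), and the descriptive text in a real data dictionary still has to be written by people.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Patient (
        PatientID   INTEGER PRIMARY KEY,
        Gender      CHAR(1) NOT NULL,   -- 'M' or 'F'
        YearOfBirth INTEGER
    )
""")

# PRAGMA table_info returns one row per column:
# (cid, name, declared type, notnull flag, default value, primary-key position)
dictionary = [
    {
        "column": name,
        "type": col_type,
        "required": bool(notnull),
        "primary_key": bool(pk),
    }
    for _cid, name, col_type, notnull, _default, pk
    in conn.execute("PRAGMA table_info(Patient)")
]

for entry in dictionary:
    print(entry)
```

The output lists each column with its declared type, whether it was declared as never blank, and whether it is part of the primary key.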
When considering EHR architecture, pay particular attention to the implications of
how the data is stored. The data that is needed for assessing the core and menu
objectives and clinical quality measures is derived from these databases.
Therefore, an understanding of relational databases is essential for
understanding the data and ensuring data quality.