The Clinical Pathologist's Role in Genomics: Data Management
TRANSCRIPT
The Clinical Pathologist’s Role in Genomics: Data Management, Quality Assurance,
and Consultation
presented by Daniel Schwartz, MD, FCAP
SS114 Bringing Context and Consistency
to the Genomic Revolution September 22, 2004
3:30 – 5:00 PM
College of American Pathologists 2004. Materials are used with the permission of the faculty. SS114 Bringing Context and Consistency to the Genomic Revolution
CAP ’04 – The Pathologists’ Meeting™ September 19 - 22, 2004
Phoenix, AZ
Disclosure Statements
All speakers must disclose to the program audience any financial interest or relationship with the manufacturer(s) of any commercial product(s) that may be discussed in the educational presentation or with the manufacturer of a product that is directly competitive with a product discussed in the presentation. In addition, all speakers must disclose whether non-FDA approved uses of pharmaceutical products or medical devices are included in the presentation. The College of American Pathologists does not view the existence of these interests or uses as implying bias or decreasing the value to participants. The CAP, along with the Accreditation Council for Continuing Medical Education (ACCME), feels that this disclosure is important for the participants to form their own judgment about each presentation. The following speakers have financial interests/relationships or non-FDA approved uses to disclose:
Daniel Schwartz, MD, FCAP – Gene Logic, Inc.
The following speakers have no financial interests/relationships or non-FDA approved uses to disclose:
The following speakers have not provided information on financial interests/relationships or non-FDA approved uses:
Learning Objectives
Upon completion of this course, participants should be able to:
• Describe the information that is generated by gene arrays
• Discuss the limitations and opportunities of gene arrays in terms of mechanisms of disease, and describe some of the bioinformatics tools, their use and limitations, including contributions and applications of a standardized pathology terminology (SNOMED CT) to data collection and interpretation, and pathology archives as valuable genomic and proteomic resources
Copyright 2004 Gene Logic Inc. Page 1
The Clinical Pathologist’s Role in Genomics: Data Management, Quality Assurance, and Consultation
Dan Schwartz, MD, MSIS, FCAP
Gene Logic Inc.
Gaithersburg, MD
September 22, 2004
The clinician’s view of genomics
Genomics: “The large-scale investigation of the structure and function of genes.” *
Note the emphasis on large-scale.
Most common concern: Expression data on 30,000+ genes.
Less commonly noticed: Large quantity of clinico-pathologic data on tissues expressing those genes.
How can that data be:
• Managed?
• Verified?
• Conveyed meaningfully to non-clinicians?
*Drug Discovery and Development Magazine
Perceptions: Why these issues are overlooked…
Diagnosis: Necessary, familiar, personally rewarding, intellectually stimulating.
Molecular biology: Fascinating, new, challenging, sexy.
Data management: Boring, doesn’t require professional expertise, very unsexy.
Quality assurance: Reviewing other people’s work is less stimulating than primary diagnosis.
• Medicolegal issues are also raised
Consultation: Why me? This isn’t my area of expertise or interest.
…and why they shouldn’t be:
Genomic data are meaningless without the appropriate clinicopathologic context of samples from which they were derived.
• Clinicians can and should be part of any human genomic research or application.
Data management issues are incredibly challenging once you dig into them.
It is a pleasure and a privilege to learn from, and to teach, experts in other fields:
• Molecular biologists
• Software engineers and database analysts (among others)
• Bioinformaticians
• Many others
Your research colleagues will be infinitely grateful for your input (though they may not know it yet).
Clinical data management in genomics
Which data points should be captured?
What is the meaning of a data point in anatomic pathology?
How should this data be:
• Conceptualized?
• Extracted? (And from where?)
• Stored?
Why is it important to maintain a consistent system?
How can consistency be achieved?
Pathology data
Define a set of clinical features, anatomic descriptors, laboratory values, observations, examinations, etc. to be stored in association with each sample.
The set may (probably will) vary with the type of sample.
Define mandatory vs. optional data.
Define the units in which data are to be reported.
• Especially important when working with multiple accrual sites
Data types are not limited to the traditionally “pathologic.”
With respect to genomics, pathologists are medical experts, not just pathology experts.
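One way to make "mandatory vs. optional" and "which units" concrete is to write the data set definition down as data and check each incoming record against it. The following is a minimal sketch, not a description of any particular system: the field names, the units, and the record layout (each entry a dict with a "value" and an optional "unit") are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FieldSpec:
    name: str
    mandatory: bool
    unit: Optional[str] = None  # None means the field carries no unit


# Hypothetical data set definition for one sample type (names and units are assumptions).
COLON_TUMOR_FIELDS = [
    FieldSpec("anatomic_site", mandatory=True),
    FieldSpec("pathomorphology", mandatory=True),
    FieldSpec("serum_cea", mandatory=False, unit="ng/mL"),
    FieldSpec("sample_weight", mandatory=True, unit="mg"),
]


def validate_record(record: dict, specs: list) -> list:
    """Return a list of human-readable problems; an empty list means the record passes."""
    problems = []
    for spec in specs:
        entry = record.get(spec.name)
        if entry is None:
            if spec.mandatory:
                problems.append(f"missing mandatory field: {spec.name}")
            continue
        # Only fields that declare a unit are checked for one.
        if spec.unit is not None and entry.get("unit") != spec.unit:
            problems.append(f"{spec.name}: expected unit {spec.unit}, got {entry.get('unit')}")
    return problems
```

A record from an accrual site that reports CEA in a different unit, or omits the morphology, is flagged at entry rather than discovered later during analysis.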
An example of a minimal data set
Sample: Adenocarcinoma of colon
Data:
• Anatomic site (optional: which part of colon?)
• Pathomorphology (adenocarcinoma – subclassify?)
• Pathologic stage (which system?)
• Grade (can this be standardized?)
• Other diseases (present; past?)
• Medications (present; past?)
• CEA (serum; tissue by immunohistochemistry?)
• History of prior chemo/radiotherapy (or, not accept sample?)
• Family history (what are the relevant diseases?)
• CT scan, barium enema, other radiologic info (how to capture? pre-op, post-op, on primary, for mets?)
• Outcome data (response to chemotherapy? survival? other?)
• Text-based descriptions have great secondary value
Context vs. sample
Patient (“donor”) has adenocarcinoma of colon.
Sample is from the uninvolved area of the colon (i.e. morphologically normal).
How do you capture this information?
• Or, do you consider it important?
An added complication: Cancer arose in a setting of longstanding ulcerative colitis.
• How can you capture that as well?
• How do you convey to your colleagues that this is important information to consider?
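One possible answer to "how do you capture this?" is to store donor context and sample findings as separate but linked records, so that a morphologically normal sample never loses the diagnoses of the patient it came from. This is a sketch under that assumption; the class and field names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Donor:
    donor_id: str
    diagnoses: list = field(default_factory=list)  # context: what is known about the patient


@dataclass
class Sample:
    sample_id: str
    donor: Donor
    site: str
    morphology: str  # what the slide of this particular sample shows


def describe(sample: Sample) -> str:
    """Summarize a sample together with its donor context."""
    context = "; ".join(sample.donor.diagnoses) or "none recorded"
    return f"{sample.sample_id}: {sample.morphology} {sample.site} (donor context: {context})"


def is_unqualified_normal(sample: Sample) -> bool:
    """True only when the sample is morphologically normal and the donor has no recorded disease."""
    return sample.morphology == "morphologically normal" and not sample.donor.diagnoses
```

With this layout, the uninvolved colon from a donor with adenocarcinoma and ulcerative colitis is still retrievable as "normal colon", but anyone selecting control tissue can see, and filter on, the context.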
Anatomic Pathology data storage: Coding
Orderly data storage = effective data retrieval.
Implies the need for a standardized system for:
• Extracting pathology information, and…
• Coding it as data.
This system must include:
• A way to “tease out” the relevant information from:
− What is known about a sample
− What is known about the context in which it was obtained
• A way to encode this info so it can be found easily
• A way to ensure that the same coding schema is uniformly employed by all coders
Method for extracting information will vary among institutions: prospective vs. retrospective, batch vs. one-off, report review vs. real-time, pathologist vs. non-pathologist.
Which coding system to use is a matter of style and comprehensiveness.
• A controlled vocabulary is essential.
Why use a controlled vocabulary?
Standardizes terminology:
• Allows effective storage of data in tables
• Allows effective searching for data in database
• The alternative is free-text
Limits the way a concept can be expressed
• A good thing: Similar concepts are coded the same way (usually)
• A bad thing: Medical concepts don’t always fit into neat pigeonholes
Limits the size, complexity, and operational overhead of the database itself
• Quicker database response times
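The difference between free text and a controlled vocabulary can be illustrated in a few lines. The sketch below assumes a tiny homegrown vocabulary; the codes ("M-0001" and so on) are invented for the example and are not SNOMED identifiers.

```python
# Hypothetical local vocabulary: code -> the single permitted term for the concept.
VOCABULARY = {
    "M-0001": "adenocarcinoma",
    "M-0002": "nodular hyperplasia",
}

# Reverse lookup used at data entry; anything not listed is rejected rather than guessed.
TERM_TO_CODE = {term: code for code, term in VOCABULARY.items()}


def encode(term: str) -> str:
    """Translate an entered term to its code, or raise if the term is not in the vocabulary."""
    key = term.strip().lower()
    if key not in TERM_TO_CODE:
        raise ValueError(f"term not in controlled vocabulary: {term!r}")
    return TERM_TO_CODE[key]


def find_by_code(samples: dict, code: str) -> list:
    """Return the IDs of all samples stored under the given code, in sorted order."""
    return sorted(sid for sid, stored in samples.items() if stored == code)
```

With free text, entries such as "adeno CA" or "adenoca." would be missed by a search for "adenocarcinoma". Here such entries are refused at entry time, so a search by code returns every sample coded with that concept. The cost is the "bad thing" noted above: a finding that fits no term has nowhere to go until the vocabulary is extended.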
Coding system: SNOMED-CT
Among coding systems, SNOMED-CT is a very good choice. But it is not the only choice.
• Homegrown systems can work in some situations.
Consider the use of only a small subset of SNOMED concepts (“microvocabulary”).
• Represent only concepts in use: diminished complexity.
• Included concepts will depend on level of granularity needed.
Often many ways to represent the same concept in SNOMED.
• Sometimes subtle differences from one to the next.
• Usually, one concept will have many synonymous descriptions.
− You may need to choose the description that will be used – and then communicate it to end-users.
Microvocabulary examples and issues
CID 36195005 has synonymous descriptors:
• Hyperplastic nodule
• Nodular hyperplasia
• Nodular regenerative hyperplasia
Which description should be used, so that end-users can search for it?
The morphologic descriptor “nodular hyperplasia” is used in thyroid, prostate, and other organs.
• Associate the same [concept + description] for different diseases with a variant of the same general morphology?
• Or, should different concepts be used for different diseases, even though their morphologies are similar?
• Fundamentally: Are you coding a disease, a morphology, or do you need both to adequately convey meaning?
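The synonym problem can be handled by storing one preferred description per concept and resolving every accepted synonym to it. In the sketch below the concept ID and the three descriptors come from the slide; the choice of "Nodular hyperplasia" as the preferred description, and the decision to pair the morphology concept with an anatomic site, are assumptions made only to show one of the options raised above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Concept:
    concept_id: str
    preferred: str   # the one description shown to end-users
    synonyms: tuple  # other descriptions accepted at data entry


# Which descriptor is "preferred" is an arbitrary choice for this illustration.
MICROVOCABULARY = [
    Concept("36195005", "Nodular hyperplasia",
            ("Hyperplastic nodule", "Nodular regenerative hyperplasia")),
]


def build_index(concepts) -> dict:
    """Map every lower-cased description, preferred or synonym, to its concept."""
    index = {}
    for concept in concepts:
        for description in (concept.preferred, *concept.synonyms):
            index[description.lower()] = concept
    return index


INDEX = build_index(MICROVOCABULARY)


def code_finding(site: str, description: str) -> tuple:
    """Return (site, concept_id, preferred description) for an entered description."""
    concept = INDEX[description.strip().lower()]
    return (site, concept.concept_id, concept.preferred)
```

Under this design the thyroid and prostate findings share one morphology concept and differ only by site. If the diseases need to be distinguished by more than site, a separate disease code would have to be added alongside; the sketch does not settle that question.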
Philosophical issues
Is a tonsil normal or diseased?
Is morphologically normal colon from a patient with ulcerative colitis normal or diseased?
What about familial adenomatous polyposis coli?
Is morphologically normal tissue adjacent to a tumor really “normal?”
Is frontal lobe tissue from a schizophrenic patient “normal”?
Are low malignant potential tumors best classified as malignant or benign (if you classify tumors at all)?
Is the duodenum anatomically distinct enough from the jejunum/ileum to be coded separately?
Is leiomyoma a tumor of myometrium or of uterus?
How many different flavors of carcinoma should have distinct codes for one organ?
In general: How “granular” should coding be?
Creating and maintaining a system
There is an art to this.
Creating a system like this can be slow and painful – but the long-term payoff is great.
A simple spreadsheet can be an excellent tool for ensuring coding consistency.
Coding spreadsheet: Unfiltered
Coding spreadsheet: Partial filtering
Coding spreadsheet: Complete filtering
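The progression from unfiltered to partially filtered to completely filtered can be approximated in code. This is only a sketch: the column names, rows, and codes below are invented and are not those of the spreadsheet shown on the slides.

```python
import csv
import io

# Hypothetical coding spreadsheet, inlined as CSV so the sketch is self-contained.
CODING_SHEET = """organ,morphology,code
colon,adenocarcinoma,LOCAL-001
colon,normal,LOCAL-002
thyroid,nodular hyperplasia,LOCAL-003
prostate,nodular hyperplasia,LOCAL-003
"""


def load_rows(text: str) -> list:
    """Parse the sheet into a list of dicts keyed by column header."""
    return list(csv.DictReader(io.StringIO(text)))


def filter_rows(rows: list, **criteria) -> list:
    """Keep only rows whose columns equal every given criterion, like a spreadsheet filter."""
    return [row for row in rows if all(row[col] == val for col, val in criteria.items())]


rows = load_rows(CODING_SHEET)                                          # unfiltered
by_organ = filter_rows(rows, organ="colon")                             # partial filtering
exact = filter_rows(rows, organ="colon", morphology="adenocarcinoma")   # complete filtering
```

A coder who narrows the sheet to a single row before assigning a code is reusing an existing construct rather than inventing a new one, which is the consistency the spreadsheet is meant to provide.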
Microvocabulary maintenance
There must be administrative oversight of the lexicon - an individual, a committee, etc. - to ensure standardization.
There must be a method for updating the lexicon and promulgating updates to users.
There must be a method to ensure user access to the lexicon and tools to implement it.
There must be an error-checking mechanism, and someone authorized to use it to find and fix errors.
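An error-checking mechanism can be as simple as a periodic audit that compares stored records with the current lexicon. The sketch below assumes a lexicon that maps each code to one approved description; the codes, record layout, and messages are hypothetical.

```python
# Hypothetical lexicon: code -> the approved description for that code.
LEXICON = {"LOCAL-001": "adenocarcinoma", "LOCAL-003": "nodular hyperplasia"}


def audit(records: list, lexicon: dict) -> list:
    """Return (sample_id, problem) pairs for records that disagree with the lexicon."""
    findings = []
    for rec in records:
        approved = lexicon.get(rec["code"])
        if approved is None:
            findings.append((rec["sample_id"], f"unknown code {rec['code']}"))
        elif rec["description"] != approved:
            findings.append((rec["sample_id"], f"description should be '{approved}'"))
    return findings
```

The audit only reports; fixing the flagged records remains the job of whoever is authorized to do so.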
Microvocabulary maintenance
There is no BEST answer, just many GOOD answers; agree on one and stick with it.
• If you wish to merge your data with that of another organization or researcher, you must:
− Agree on standards in advance.
− Have a way of coordinating lexicon updates.
Add concepts and/or change descriptions carefully.
• Are you sure that construct (context + sample properties) has never been coded in your DB before?
Remove terms with GREAT care, if at all: doing so can break the referential integrity of the database!
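The referential-integrity risk is easy to demonstrate with Python's built-in sqlite3 module. The table layout, codes, and the "active" flag below are assumptions for illustration: once a sample references a lexicon term, the database refuses to delete that term, and marking the term inactive is one way to take it out of circulation without orphaning old data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys unless asked
conn.execute(
    "CREATE TABLE lexicon (code TEXT PRIMARY KEY, description TEXT NOT NULL, "
    "active INTEGER NOT NULL DEFAULT 1)"
)
conn.execute(
    "CREATE TABLE sample (sample_id TEXT PRIMARY KEY, "
    "code TEXT NOT NULL REFERENCES lexicon(code))"
)
conn.execute("INSERT INTO lexicon (code, description) VALUES ('LOCAL-001', 'adenocarcinoma')")
conn.execute("INSERT INTO sample VALUES ('S1', 'LOCAL-001')")
conn.commit()


def remove_term(code: str) -> bool:
    """Try to delete a term; return False if coded samples still reference it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("DELETE FROM lexicon WHERE code = ?", (code,))
        return True
    except sqlite3.IntegrityError:
        return False


def retire_term(code: str) -> None:
    """Safer alternative: keep the row so old samples stay valid, but mark it inactive."""
    with conn:
        conn.execute("UPDATE lexicon SET active = 0 WHERE code = ?", (code,))
```

A database without foreign-key enforcement would allow the delete and leave sample S1 pointing at a code that no longer exists, which is the failure the slide warns about.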
Pathology Quality Assurance
How do you know that the tissue submitted for analysis is what it purports to be?
Answer: Don’t take anyone’s word for it; check for yourself.
• Majority of errors are clerical and easily fixed.
• Samples should be handled with same respect as clinical specimens.
• You will save your colleagues much head-scratching.
Review of a slide is essential:
• Tissue directly adjacent to analyte.
• Pre-analytic frozen section of actual analyte.
The QC process
Accessioning checks
• State of tissue
• Correlation of data (donor ID, tissue blocks, inventory, pathology reports)
• Quantity of tissue
Pathology report review and data entry
• Coding (SNOMED-CT or other)
• Categorization
• Text annotation
• Other, as needed by your institution
Sample QC Process: Accessioning Checks
Review Sample Information for:
Site Patient ID
• Unique identifier assigned by collection site (if applicable)
Sample Matches
• Identify any samples from same patient
• Tumor-normal matches
Donor Demographics
- Age
- Gender
- Race / ethnicity
Required Data
• Date of collection
• Time from surgical excision to freezing
• Organ / anatomic tissue site
• Tissue category
• Tissue / pathologic diagnosis
• Patient primary and secondary diseases
• Tissue preparation method (Frozen, OCT, Paraffin, RNA, etc.)
• Weight of sample
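Two of the accessioning checks, required data and sample matches, lend themselves to automation. The sketch below is one possible shape for them: the dictionary keys are assumed names for the "Required Data" items, and pairing tumor with normal only when both donor and anatomic site agree is an assumption, not a rule stated on the slide.

```python
from collections import defaultdict

# Keys are hypothetical names for the items in the "Required Data" list.
REQUIRED = ("collection_date", "minutes_to_freezing", "anatomic_site", "tissue_category",
            "diagnosis", "donor_diseases", "preparation", "weight_mg")


def missing_fields(sample: dict) -> list:
    """Return the required fields that are absent or blank for one sample."""
    return [name for name in REQUIRED if sample.get(name) in (None, "")]


def tumor_normal_pairs(samples: list) -> list:
    """Pair tumor and normal samples that share a donor and an anatomic site."""
    groups = defaultdict(lambda: {"tumor": [], "normal": []})
    for s in samples:
        if s["tissue_category"] in ("tumor", "normal"):
            key = (s["donor_id"], s["anatomic_site"])
            groups[key][s["tissue_category"]].append(s["sample_id"])
    return [(t, n)
            for key in sorted(groups)
            for t in groups[key]["tumor"]
            for n in groups[key]["normal"]]
```

Checks like these catch the clerical errors early; they do not replace the slide review.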
Sample QC: Slide review
Review representative slide(s)
• Correlate with report
• Further text-based annotation
• Image capture and storage has tangible benefit
Criteria for rejection: The 100% standard
• If the pathologist is not 100% certain that the tissue to be analyzed is correctly identified and of adequate quality, it is rejected.
• Never “unreject” a sample.
• Be an independent arbiter.
When is a sample acceptable?
Pathology report available and interpretable.
All identifications are in line.
Pathology report matches slide review.
Slide shows tissue of interest is adequate in quantity and quality.
If any ambiguities, hold sample until resolved.
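The acceptance criteria, together with the "never unreject" rule, can be read as a small decision procedure. The sketch below is one interpretation, not the documented process: it assumes each criterion is recorded as True, False, or None (still ambiguous), that any failed criterion rejects the sample, and that any open ambiguity puts it on hold. The status and criterion names are hypothetical.

```python
from enum import Enum


class Status(Enum):
    PENDING = "pending"
    HELD = "held"
    ACCEPTED = "accepted"
    REJECTED = "rejected"


CRITERIA = ("report_interpretable", "ids_consistent", "report_matches_slide", "tissue_adequate")


def review(current: Status, checks: dict) -> Status:
    """Apply one round of review. Each check is True, False, or None (ambiguous or unknown)."""
    if current is Status.REJECTED:
        return Status.REJECTED  # a rejected sample is never "unrejected"
    values = [checks.get(name) for name in CRITERIA]
    if any(v is False for v in values):
        return Status.REJECTED
    if any(v is None for v in values):
        return Status.HELD  # hold until the ambiguity is resolved
    return Status.ACCEPTED
```

A held sample can later be accepted or rejected once the ambiguity is resolved, but nothing moves a sample out of the rejected state.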
Post-QC retrospective review
Feedback to collection sites (if applicable)
• Clarification of ambiguities
• Review of rejections
• Refinement of sample acquisition protocols
Post-publication checks: Lookbacks
• Genomics/bioinformatics
• User requests
• New samples from previous patients
• New clinical data available
• Previous samples when new, similar samples are encountered
Periodic, broad-based review of data
Quality Assurance Committee
Quality Assurance Committee
Meet regularly.
Request participation of senior investigators across disciplines.
Review questions from any source, regarding quality of samples and/or clinical data.
Be empowered to take corrective actions.
• Remove samples
• Recode samples
• Institute process improvements
Consultation
The “gray zones” of pathology (and medicine in general) may be difficult territory for non-physicians.
• Even SNOMED-CT can’t neatly bin every medical situation.
• Textual annotation is very helpful – if anyone reads it. And even then it may not be understood.
• It’s up to the pathologist to make the case for understanding the subtleties of clinicopathologic information…
• …and then to follow up by providing appropriate explanation and support.
Active vs. passive consultation
One mode is to wait for questions to arrive.
• Ever heard of the Maytag® repair man?
The clinical pathologist should actively engage researchers to anticipate problems before they arise.
Pre-formed “sample sets” are one answer.
…but in the long run, giving people fish is less effective than teaching them how to fish (in the gene pool…?)
Other forms of outreach
Correlate microarray data with immunohistochemical data (as a service).
Teach “Pathology 101” to individuals with no formal pathology background.
Know more about genomics than your colleagues know about pathology. Understand their concerns and issues.
For this purpose, don’t draw sharp boundaries between medical specialties. You are the medical expert.
Summary
The clinical pathologist has a vital and necessary role to play in genomics.
This role will evolve from ad hoc to formalized to (potentially) reimbursable.
The role of the pathologist is not as boring as it might at first appear…
• …and there is ample opportunity for learning on the fly.
The pathologist will be well-served to understand basic issues of genomics and database design/management.
Thank you!
Questions and comments?