Appendix F: The ETHER Annotator

In previous work we surveyed seven different open-source annotation tools: Anafora [26], the Brat Rapid Annotation Tool (BRAT) [27], the Extensible Human Oracle Suite of Tools (eHost) [28], the Event-based Text-mining of Health Electronic Records (ETHER)[2], the General Architecture for Text Engineering (GATE) [29], Knowtator [30], and the Multi-document Annotation Environment (MAE) [31].

The ETHER annotator was selected to perform clinical and temporal annotations of safety surveillance reports. This tool was selected as it was specifically designed to process safety surveillance reports, it has temporal pre-annotation capabilities, and it can be functionally extended by the FDA project team to meet our specific requirements. In addition, the FDA annotators have an existing familiarity with ETHER that will facilitate the annotation training process.

The ETHER Annotator graphic user interface is shown in Supplementary Figure 4. Areas of importance for annotators are the Narrative box (center), the Control panel (bottom right), the Feature Annotation box (bottom left), and the Time Annotation box (bottom middle).

Supplementary Figure 4: Screenshot of the ETHER Annotator graphic user interface. Text to be annotated can be highlighted and right clicked to open a tooltip of available Feature Types. Once a Feature Type has been selected, that annotation will appear as a row in the Feature Annotation box.

Narrative Box

The narrative box contains the free-text narrative for adverse event reports loaded into the ETHER annotator. Within the narrative box, users can highlight text and right-click to assign a specific Feature Type to a span of text.

Control Panel

The control panel allows users to check a box to highlight all existing clinical and temporal features. If selected, spans of free text in the narrative box will become highlighted, with different colors indicating different feature types.

Feature Annotation Box

This is the main workspace the annotators will use to adjust Feature text and edit annotated features. Individual features selected in the Feature Annotation box will become highlighted in the narrative box, thus allowing the user to step through the narrative and visualize their annotations. Each entry contains six columns:

ID: The feature identification number for that case that will be stored in the database.

Feature Text: This shows the text output that will be annotated. Feature text can be edited by single-clicking within the row.

TimeID: If a clinical feature has been linked to a temporal feature, then the TimeID of that temporal feature is shown in this box. The link text can be edited by clicking within the box, but a valid Time ID must still be provided in that case.

Type: Displays the selected feature type. A dropdown menu allows the user to edit the feature type without re-highlighting the text in the narrative box.

Relation: This is used to define the temporal association between the clinical feature and the temporal expression indicated by the Time ID. If the event occurs at the time indicated by the temporal expression, then no relation needs to be specified. However, for events occurring before or after the time of the temporal expression, or beginning or ending at that time, a relation type should be specified.

Comments: Allows the annotator to leave comments next to the annotation.

Control Buttons

Buttons that allow the user to save the annotated features, as well as add and delete highlighted clinical features. The Link Feature-Time button and the Delete Link button are shown below the Feature Annotation box and are used for the creation/deletion of Temporal Links between selected features in the Clinical and Temporal Feature Annotation boxes. This is further explained in the Temporal Expressions and Associations section.

Feature Pre-annotations

The “feature pre-annotation” checkbox runs the ETHER tool to identify potential annotations that the annotator may wish to include. The ETHER tool has been proven to successfully annotate clinical features in safety surveillance reports using a similar annotation model to the one outlined in this document. However, as the ETHER tool was not updated to include additions to the annotation model and guidelines, these pre-annotations should be carefully reviewed before accepting the annotations. Pre-annotations are marked with an asterisk which disappears after the user hits the save button.

Post-annotations

The “post-annotations” checkbox runs the feature text post-annotation algorithm, which identifies potential annotations based on annotations already saved by the annotator. Using this algorithm, a user can annotate a span of text and then run the algorithm to annotate every future occurrence of that text, which will be tagged with that feature type. Post-annotations are marked with an asterisk which disappears after the user hits the save button.
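The post-annotation behavior described above can be illustrated with a minimal sketch. This is not ETHER's actual implementation: the `Annotation` class, the `post_annotate` function, and the use of exact, case-sensitive string matching are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """Hypothetical record for one annotated span (not ETHER's schema)."""
    start: int               # offset of the first character in the narrative
    end: int                 # offset one past the last character
    text: str
    feature_type: str
    suggested: bool = False  # True for post-annotations awaiting review


def post_annotate(narrative: str, saved: list[Annotation]) -> list[Annotation]:
    """Suggest annotations for later occurrences of already-saved feature text.

    Assumption: matching is exact and case-sensitive, and only text after the
    saved annotation is searched ("every future occurrence").
    """
    taken = {(a.start, a.end) for a in saved}
    suggestions = []
    for ann in saved:
        pos = narrative.find(ann.text, ann.end)
        while pos != -1:
            span = (pos, pos + len(ann.text))
            if span not in taken:
                taken.add(span)
                suggestions.append(
                    Annotation(span[0], span[1], ann.text,
                               ann.feature_type, suggested=True)
                )
            pos = narrative.find(ann.text, span[1])
    return suggestions
```

Each suggestion inherits the feature type of the saved annotation it matched and stays flagged as `suggested` until the annotator reviews and saves it.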

Time Annotation Box

This is the main workspace the annotators will use to adjust temporal text and edit annotated features. By highlighting a span of text in the narrative and assigning it to a particular temporal expression type, new rows will be created in the time annotation list. Individual features selected in the Time Annotation box will become highlighted in the narrative box, thus allowing the user to step through the narrative and visualize their annotations. An initial exposure date is automatically populated from the Exposure Date structured field and serves as a temporal anchor for cases that may not provide absolute information. Each entry contains seven columns:

TimeID: The feature identification number for that case that will be stored in the database.

Time Text: This shows the text output that will be annotated. Time text can be edited by single-clicking within the row.

Type: Displays the selected expression type. A dropdown menu allows the user to edit the expression type without re-highlighting the text in the narrative box.

Date: This is the entry for the exact string date that should be specified for most temporal expressions. All dates should be in YYYY-MM-DD format, and partial dates should be filled with the ‘X’ character for any missing information.

RefID: This will be the Time ID of another temporal expression that the selected expression is linked to. Many expression types, like Relative and Time, should be linked to another temporal expression that provides the actual date or the context for the given expression. The link text can be edited by clicking within the box, but a valid Time ID must still be provided in that case.

Relation: This column is not used.

Comments: Allows the annotator to leave comments next to the annotation.
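The Date column convention (YYYY-MM-DD, with ‘X’ standing in for missing information) can be sketched as follows. The helper names are hypothetical, and the sketch assumes that a missing component is replaced entirely by ‘X’ characters; it does not check that months or days fall within calendar ranges.

```python
import re
from typing import Optional

# Assumption: each component is either all digits or all 'X' placeholders.
PARTIAL_DATE = re.compile(r"([0-9]{4}|XXXX)-([0-9]{2}|XX)-([0-9]{2}|XX)")


def format_partial_date(year: Optional[int] = None,
                        month: Optional[int] = None,
                        day: Optional[int] = None) -> str:
    """Build a YYYY-MM-DD string, filling unknown components with 'X'."""
    y = f"{year:04d}" if year is not None else "XXXX"
    m = f"{month:02d}" if month is not None else "XX"
    d = f"{day:02d}" if day is not None else "XX"
    return f"{y}-{m}-{d}"


def is_valid_date_entry(value: str) -> bool:
    """Check only the shape of the entry, not calendar validity."""
    return PARTIAL_DATE.fullmatch(value) is not None
```

For example, a report that states only the month and year of an event would be entered with the day left as `XX`.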

Control Buttons

Buttons that allow the user to save the annotated features, as well as add and delete highlighted features. The Link Time-Time button can be used to create the links between two temporal expressions. First, the annotator must select one temporal expression in the list, and then CTRL+Click a second temporal expression so that both rows are highlighted. Then, clicking the Link Time-Time button will create the link, represented by putting the TimeID of one expression in the RefID column of the other expression. The linking logic prioritizes certain expression types, so the button may sometimes create the link in the opposite direction from the one the annotator intended. In this case, the RefID column can be edited manually by double-clicking within the column space. Also, the Delete Link button next to the Link Time-Time button can be used to remove links from the selected temporal expression.
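The direction issue described above can be made concrete with a small sketch. The appendix does not state which expression types ETHER prioritizes, so the `ANCHOR_PRIORITY` table, the `TimeRow` class, the `t1`/`t2` identifiers, and the tie-breaking rule below are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical ranking: a higher number means the row is preferred as the
# anchor. The real ordering used by ETHER is not given in this appendix.
ANCHOR_PRIORITY = {"Date": 2, "Relative": 1, "Time": 1}


@dataclass
class TimeRow:
    """Hypothetical stand-in for one row of the Time Annotation box."""
    time_id: str
    expr_type: str
    ref_id: Optional[str] = None


def link_time_time(first: TimeRow, second: TimeRow) -> TimeRow:
    """Link two selected rows and return the row whose RefID was set."""
    p1 = ANCHOR_PRIORITY.get(first.expr_type, 0)
    p2 = ANCHOR_PRIORITY.get(second.expr_type, 0)
    # The higher-priority row becomes the anchor regardless of selection
    # order; on a tie, the first-selected row is the anchor (an assumption).
    if p2 > p1:
        anchor, dependent = second, first
    else:
        anchor, dependent = first, second
    dependent.ref_id = anchor.time_id
    return dependent


def delete_link(row: TimeRow) -> None:
    """Remove the link from the selected row."""
    row.ref_id = None
```

Under these assumptions, selecting the rows in either order yields the same link, which is why the result can differ from what the annotator intended and may need a manual edit of the RefID column.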

The Link Feature-Time button is found below the Feature Annotation Box and can be used to annotate temporal associations between clinical features and temporal expressions. The annotator must select one clinical feature from the list in the Feature Annotation Box and one temporal expression from the Time Annotation Box and then click the Link Feature-Time button. This will create the association, represented by putting the TimeID of the temporal expression in the TimeID column of the clinical feature. The annotator should then use the drop-down box in the Relation column to define the association type. If a clinical feature is associated with more than one temporal expression, multiple associations can be created by clicking the Link Feature-Time button again after selecting another temporal expression. This will create an extra row in the Feature Annotation Box showing the same clinical feature but with a separate TimeID and Relation column indicating a separate temporal association.
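The way repeated Feature-Time links produce extra rows can be sketched as below. The `FeatureRow` class and `link_feature_time` function are illustrative only; in particular, the assumption that the extra row reuses the same feature ID is not confirmed by this appendix.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass
class FeatureRow:
    """Hypothetical stand-in for one row of the Feature Annotation box."""
    feature_id: str
    text: str
    feature_type: str
    time_id: Optional[str] = None
    relation: Optional[str] = None
    comments: str = ""


def link_feature_time(rows: list[FeatureRow], feature_id: str,
                      time_id: str) -> FeatureRow:
    """Associate a clinical feature with a temporal expression.

    The first link fills the TimeID of the existing row. Each further link
    adds a copy of the row with its own TimeID and an empty Relation.
    """
    matches = [r for r in rows if r.feature_id == feature_id]
    if not matches:
        raise KeyError(feature_id)
    for row in matches:
        if row.time_id == time_id:
            return row  # already linked, nothing to add
    for row in matches:
        if row.time_id is None:
            row.time_id = time_id
            return row
    extra = replace(matches[0], time_id=time_id, relation=None, comments="")
    last = max(i for i, r in enumerate(rows) if r.feature_id == feature_id)
    rows.insert(last + 1, extra)
    return extra
```

After linking, the annotator would still set the Relation value on each row separately, since each row represents its own temporal association.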

Time Pre-annotations Checkbox

The “Time pre-annotations” checkbox will use the ETHER tool to identify potential temporal expression annotations that the annotator may wish to include. The ETHER tool has been proven to successfully annotate temporal expressions in safety surveillance reports using a similar annotation model to the one outlined in this document. However, as the ETHER tool was not updated to include additions to the annotation model and guidelines, these pre-annotations should be carefully reviewed before accepting the annotations. Pre-annotations are marked with an asterisk which disappears after the user hits the save button.

Annotation Walk-Through

Supplementary Figures 5-18 provide a walk-through of the annotation process for clinical features in a VAERS report using ETHER. Once the reports have been loaded into ETHER, the user is presented with the ETHER Annotation GUI shown in Supplementary Figure 5.

Supplementary Figure 5: The annotation graphic user interface for the ETHER annotator. The free text narrative is displayed in the “Narrative” box while the “Feature Annotation” box is empty as no features have been tagged.

Supplementary Figure 6: One of the first things an annotator may do is check the “Feature pre-annotations” box in the lower left corner. This populates the “Feature Annotation” box with clinical features derived from the rule-based ETHER algorithm. Pre-annotated features appear in the “Feature Annotation” box with an asterisk (“*”) next to the feature ID to identify the annotation as derived from the ETHER algorithm. Selecting a specific feature in the “Feature Annotation” box will highlight the feature text in the narrative so the user can see where that feature came from. If the user does not think the annotation is appropriate, they may correct the feature text or delete the feature using the “Delete Feature” button. Once the user is satisfied that ALL the pre-annotated features are appropriate, the user can click the “Save” button to save the annotations. Once saved, the pre-annotations lose the asterisk identification, so be sure that the features are correct before saving them. If the “Feature pre-annotations” box is unchecked, all features with an asterisk next to the feature ID will be removed from the “Feature Annotation” box. Be sure to check not only the feature text but also the feature type when using pre-annotations. As the ETHER algorithm was built using a slightly different annotation model, some of the features may not have the correct type. While the “pneumonia” feature type is correct in the above example, feature IDs f5-f9 have an incorrect feature type because the preceding word, “experienced”, is a trigger for the secondary diagnosis feature type. It is therefore important to be careful when using the “Feature pre-annotations” checkbox.
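The reason for reviewing pre-annotations before saving can be seen in a small sketch of the asterisk life cycle. The `Feature` class and both functions are hypothetical, and placing the asterisk directly after the feature ID is an assumption about the display.

```python
from dataclasses import dataclass


@dataclass
class Feature:
    """Hypothetical stand-in for one row of the Feature Annotation box."""
    feature_id: str
    text: str
    feature_type: str
    pre_annotated: bool = False  # True while the row still carries "*"

    def label(self) -> str:
        # Assumption: the asterisk is shown directly after the feature ID.
        return f"{self.feature_id}*" if self.pre_annotated else self.feature_id


def save(features: list[Feature]) -> None:
    """Saving clears the marker, so the origin of a row is no longer visible."""
    for feature in features:
        feature.pre_annotated = False


def uncheck_pre_annotations(features: list[Feature]) -> list[Feature]:
    """Unchecking the box removes only rows that still carry the marker."""
    return [f for f in features if not f.pre_annotated]
```

In this model, a pre-annotation that has been saved can no longer be removed by unchecking the box, because it has become indistinguishable from a manual annotation.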

Supplementary Figure 7: To annotate a span of text, highlight the text you wish to tag and right-click the highlighted text. A tool-tip will drop down showing the available clinical and temporal feature types that may be assigned. In this example the medical history tag (“MHx”) is assigned to the text, “chronic obstructive pulmonary disease”.

Supplementary Figure 8: Once a feature type is assigned, it will appear in the “Feature Annotation” box.

Supplementary Figure 9: Sometimes feature text will need to be reduced to provide clarity in the annotation. In this example, we need to take the phrase “hemogram and urinalysis were normal” and create two annotations: “hemogram normal” and “urinalysis normal”. The whole phrase should be highlighted because the start position will be the beginning of the word “hemogram” and the end position will be the last letter in “normal”. To add the highlighted text, right-click it and select the “Lab” feature type.

Supplementary Figure 10: The annotated feature will appear in the “Feature Annotation” box.

Supplementary Figure 11: Single left-clicking the feature text in the “Feature Annotation” box will allow the annotator to reduce the feature text. Highlight the unnecessary feature text and delete it from the feature text.

Supplementary Figure 12: The feature text for the feature has now been reduced to clarify the annotation.

Supplementary Figure 13: Despite the start and end position of the highlighted text in the “narrative” box, the feature text includes only “hemogram normal”.

Supplementary Figure 14: Alternatively, the annotator could simply highlight the starting word and type the word “normal” into the feature text. After clicking off the annotated feature, the ETHER annotator automatically tries to find the end position for the annotation based on the last word in the feature text. In this example, ETHER will search the rest of the sentence for the word “normal” and end the annotation in the same place as in Supplementary Figure 13. If the word is not found, the entire sentence will be highlighted and its boundaries will be used as the start and end positions.
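The end-position search described in this caption can be approximated with the sketch below. It is an illustration under stated assumptions, not ETHER's code: `resolve_span` is a hypothetical name, sentences are assumed to end at “.”, “!” or “?” (which will misfire on abbreviations and decimals), and the last word is matched as a whole word.

```python
import re


def resolve_span(narrative: str, start: int, feature_text: str) -> tuple[int, int]:
    """Return (start, end) offsets for an edited annotation.

    Searches from `start` to the end of the sentence for the last word of
    `feature_text`; falls back to the whole sentence if it is not found.
    """
    # Simplified sentence boundaries (an assumption, see the lead-in).
    sent_start = max(narrative.rfind(c, 0, start) for c in ".!?") + 1
    ends = [i for i in (narrative.find(c, start) for c in ".!?") if i != -1]
    sent_end = min(ends) + 1 if ends else len(narrative)

    words = feature_text.split()
    if words:
        pattern = r"\b" + re.escape(words[-1]) + r"\b"
        match = re.search(pattern, narrative[start:sent_end])
        if match:
            return start, start + match.end()

    # Fallback: highlight the entire sentence, skipping leading whitespace.
    while sent_start < start and narrative[sent_start].isspace():
        sent_start += 1
    return sent_start, sent_end
```

With the phrase from the example, highlighting “Hemogram” and typing “normal” into the feature text resolves to the span ending at “normal”, while a last word that does not occur in the sentence falls back to the full sentence.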

Supplementary Figure 15: Another feature of ETHER is post-annotation, which reduces the need to highlight the same phrase repeatedly throughout long reports. To start, tag the features you want ETHER to search for.

Supplementary Figure 16: ETHER searches the rest of the document for this feature text and assigns an identical feature type. In this example the word “pneumonia” was chosen as a primary diagnosis. The “Post-annotations” checkbox on the bottom left has been selected and a second occurrence of the text “pneumonia” has been found. Post-annotations are marked with a “+” sign to identify them as post-annotations. These “+” markers will disappear once the annotation has been saved.

Supplementary Figure 17: Once you have completed the clinical annotations for the report, click the “Save” button under the “Feature Annotation” box to save the annotations in this box to the ETHER database.

Supplementary Figure 18: If an annotator attempts to proceed to the next case before saving their annotations to the ETHER database, they will be met with the above popup menus. The one on the left appears first and asks whether the annotator wishes to save their annotations. The second checks whether there are text spans that match a saved annotation but are not themselves included in the saved annotations, and should prompt the annotator to review their work for missing annotations.

We hope this walk-through has helped annotators who are new to the annotation process better understand the steps followed when annotating clinical features.