
© D. Wong 2002

CS610 Database Term Project

PubMed Database to handle journal articles at Pharmaceutical Company, “Drug R Us”

ChangBin Won
wonc@cis.uab.edu

Contents

- Situation & Objectives
- E/R Diagram
- Relational Schema
- SQL schema
- XML DTD
- Example XML file
- Text User Interface
  - How to input (store) a journal article in XML format into the PubMed database (source code: XmlLoad.java)
  - How to query by PMID, keyword, or author (source code: QueryXml.java)

Situation & Objectives

Situation

- The pharmaceutical company, “Drug R Us”, researches new miracle drugs and remedies.
- The pharmaceutical researchers and developers want to store their research results (journal articles) in a company database, which will be called the PubMed database.
- They also want to find useful information related to their research in the PubMed database and in the Medline database, an existing public database that stores medical articles.
- The data (journal articles) in the PubMed database is generated, as needed, by the company scientists and used in the process of drug design.
- The company asked the IT department to create the PubMed database.

Objectives

- The IT department has to create and develop the PubMed database.
- The PubMed database should carry out the following tasks:
  1) Storing journal articles in XML file format
  2) Searching journal articles by PubMed ID (PMID), keyword, or author name
  3) Creating XML files from the search results

E/R Diagram

[Figure: E/R diagram]

Relational Schema

Conversion of Entity Sets

- PubMedData (ArticleID, IDtype, PubMedPubDate, PublicationStatus, PubStatus)
- PubMedlineCitation (MedlineID, PMID, DateCreated, DateCompleted, DateRevised, CitationSubSet)
- MedlineJournalInfo (NlmUniqueID, MedlineTA, MedlineCode, Country)
- Article (JournalISSN, JournalVolume, JournalIssue, MedlinePgn, PubDate, ArticleTitle, Abstract, Affiliation, Language, PublicationType)
- Authorlist (Author)
- Grants (GrantID, Acronym, Agency)
- ChemicalList (CASRegistryNumber, NameOfSubstance)
- MeshHeadingList (Descriptor, SubHeading, MajorTopicYN)

Conversion of Relationships

- PubMedDataOf (PubMedDataOfPMID, PubMedDataOfArticleID, PubMedDataOfIDtype)
- JournalInfoOf (JournalInfoOfNlmUniqueID, JournalInfoOfPMID)
- ArticleOf (ArticleOfNlmUniqueID, ArticleOfJournalISSN, ArticleOfJournalVolume, ArticleOfJournalIssue, ArticleOfMedlinePgn)
- WrittenBy (WrittenByAuthor, WrittenByJournalISSN, WrittenByJournalVolume, WrittenByJournalIssue, WrittenByMedlinePgn)
- GrantedBy (GrantedByGrantID, GrantedByJournalISSN, GrantedByJournalVolume, GrantedByJournalIssue, GrantedByMedlinePgn)
- HasChemicalList (HasChemicalListNameOfSubstance, HasChemicalListJournalISSN, HasChemicalListJournalVolume, HasChemicalListJournalIssue, HasChemicalListMedlinePgn)
- HasMeshHeadingList (HasMeshHeadingListDescriptor, HasMeshHeadingListJournalISSN, HasMeshHeadingListJournalVol, HasMeshHeadingListJournalIssue, HasMeshHeadingListMedlinePgn)

SQL schema

CREATE TABLE PubMedData (
    ArticleID          NUMBER(20) NOT NULL,
    IDtype             VARCHAR(9) NOT NULL
        CONSTRAINT IDTypeConstraint
        CHECK (IDtype IN ('pubmed', 'medline')),
    PubMedPubDate      DATE,
    PublicationStatus  VARCHAR(20)
        CONSTRAINT PublicationStatusConstraint
        CHECK (PublicationStatus IN ('ppublish', 'epublish', 'aheadofprint')),
    PubStatus          VARCHAR(9)
        CONSTRAINT PubStatusConstraint
        CHECK (PubStatus IN ('pubmed', 'medline')),
    CONSTRAINT PubMedDataKeyConstraint PRIMARY KEY (ArticleID, IDtype)
);

CREATE TABLE PubMedlineCitation (
    PMID            NUMBER(20) NOT NULL,
    MedlineID       NUMBER(20) NOT NULL UNIQUE,
    DateCreated     DATE,
    DateCompleted   DATE,
    DateRevised     DATE,
    CitationSubSet  VARCHAR(10),
    CONSTRAINT PubMedlineCitationKey PRIMARY KEY (PMID)
);

CREATE TABLE MedlineJournalInfo (
    NlmUniqueID  VARCHAR(20) NOT NULL,
    MedlineTA    VARCHAR(100),
    MedlineCode  VARCHAR(20) NOT NULL UNIQUE,
    Country      VARCHAR(50),
    CONSTRAINT MedlineJournalInfoKey PRIMARY KEY (NlmUniqueID)
);

CREATE TABLE Authorlist (
    Author  VARCHAR(50) NOT NULL
        CONSTRAINT AuthorlistKey PRIMARY KEY
);

CREATE TABLE Article (
    JournalISSN      VARCHAR(20) NOT NULL,
    JournalVolume    NUMBER(10) NOT NULL,
    JournalIssue     VARCHAR(10) NOT NULL,
    MedlinePgn       VARCHAR(20) NOT NULL,
    PubDate          DATE,
    ArticleTitle     VARCHAR(500),
    Abstract         VARCHAR(4000),
    Affiliation      VARCHAR(500),
    Language         VARCHAR(50),
    PublicationType  VARCHAR(50) DEFAULT 'Journal Article',
    CONSTRAINT ArticleKey
        PRIMARY KEY (JournalISSN, JournalVolume, JournalIssue, MedlinePgn)
);

CREATE TABLE Grants (
    GrantID  VARCHAR(20) NOT NULL,
    Acronym  VARCHAR(20),
    Agency   VARCHAR(20),
    CONSTRAINT GrantsKey PRIMARY KEY (GrantID)
);

CREATE TABLE ChemicalList (
    NameOfSubstance    VARCHAR(100) NOT NULL,
    CASRegistryNumber  VARCHAR(25) DEFAULT '0',
    CONSTRAINT ChemicalListKey PRIMARY KEY (NameOfSubstance)
);

CREATE TABLE MeshHeadingList (
    Descriptor    VARCHAR(100) NOT NULL,
    SubHeading    VARCHAR(100),
    MajorTopicYN  VARCHAR(5)
        CONSTRAINT MajorTopicYNConstraint
        CHECK (MajorTopicYN IN ('Y', 'N')),
    CONSTRAINT MeshHeadingListKey PRIMARY KEY (Descriptor)
);

CREATE TABLE PubMedDataOf (
    PubMedDataOfPMID       NUMBER(20),
    PubMedDataOfArticleID  NUMBER(20),
    PubMedDataOfIDtype     VARCHAR(9)
        CONSTRAINT PubMedDataOfIDtypeConstraint
        CHECK (PubMedDataOfIDtype IN ('pubmed', 'medline')),
    CONSTRAINT PubMedDataOfPMIDForeignKey
        FOREIGN KEY (PubMedDataOfPMID)
        REFERENCES PubMedlineCitation(PMID),
    CONSTRAINT PubMedDataOfOtherForeignKeys
        FOREIGN KEY (PubMedDataOfArticleID, PubMedDataOfIDtype)
        REFERENCES PubMedData(ArticleID, IDtype),
    CONSTRAINT PubMedDataOfKey
        PRIMARY KEY (PubMedDataOfArticleID, PubMedDataOfIDtype)
);

CREATE TABLE JournalInfoOf (
    JournalInfoOfNlmUniqueID  VARCHAR(20)
        CONSTRAINT JournalInfoOfNlmUniqueIDForeignKey
        REFERENCES MedlineJournalInfo(NlmUniqueID),
    JournalInfoOfPMID         NUMBER(20)
        CONSTRAINT JournalInfoOfPMIDForeignKey
        REFERENCES PubMedlineCitation(PMID),
    CONSTRAINT JournalInfoOfKey PRIMARY KEY (JournalInfoOfNlmUniqueID)
);

CREATE TABLE ArticleOf (
    ArticleOfNlmUniqueID    VARCHAR(20)
        CONSTRAINT ArticleOfNlmUniqueIDForeignKey
        REFERENCES MedlineJournalInfo(NlmUniqueID),
    ArticleOfJournalISSN    VARCHAR(20),
    ArticleOfJournalVolume  NUMBER(10),
    ArticleOfJournalIssue   VARCHAR(10),
    ArticleOfMedlinePgn     VARCHAR(20),
    CONSTRAINT ArticleOfKey PRIMARY KEY (ArticleOfNlmUniqueID),
    CONSTRAINT ArticleOfOtherForeignKeys
        FOREIGN KEY (ArticleOfJournalISSN, ArticleOfJournalVolume,
                     ArticleOfJournalIssue, ArticleOfMedlinePgn)
        REFERENCES Article (JournalISSN, JournalVolume, JournalIssue, MedlinePgn)
);

CREATE TABLE WrittenBy (
    WrittenByAuthor         VARCHAR(50)
        CONSTRAINT WrittenByAuthorForeignKey
        REFERENCES Authorlist(Author),
    WrittenByJournalISSN    VARCHAR(20),
    WrittenByJournalVolume  NUMBER(10),
    WrittenByJournalIssue   VARCHAR(10),
    WrittenByMedlinePgn     VARCHAR(20),
    CONSTRAINT WrittenByKey
        PRIMARY KEY (WrittenByAuthor, WrittenByJournalISSN, WrittenByJournalVolume,
                     WrittenByJournalIssue, WrittenByMedlinePgn),
    CONSTRAINT WrittenByOtherForeignKeys
        FOREIGN KEY (WrittenByJournalISSN, WrittenByJournalVolume,
                     WrittenByJournalIssue, WrittenByMedlinePgn)
        REFERENCES Article (JournalISSN, JournalVolume, JournalIssue, MedlinePgn)
);

CREATE TABLE GrantedBy (
    GrantedByGrantID        VARCHAR(20)
        CONSTRAINT GrantedByGrantIDForeignKey
        REFERENCES Grants(GrantID),
    GrantedByJournalISSN    VARCHAR(20),
    GrantedByJournalVolume  NUMBER(10),
    GrantedByJournalIssue   VARCHAR(10),
    GrantedByMedlinePgn     VARCHAR(20),
    CONSTRAINT GrantedByKey PRIMARY KEY (GrantedByGrantID),
    CONSTRAINT GrantedByOtherForeignKeys
        FOREIGN KEY (GrantedByJournalISSN, GrantedByJournalVolume,
                     GrantedByJournalIssue, GrantedByMedlinePgn)
        REFERENCES Article(JournalISSN, JournalVolume, JournalIssue, MedlinePgn)
);

CREATE TABLE HasChemicalList (
    HasChemicalListNameOfSubstance  VARCHAR(100)
        CONSTRAINT HasChemicalListNameOfSubstanceForeignKey
        REFERENCES ChemicalList(NameOfSubstance),
    HasChemicalListJournalISSN      VARCHAR(20),
    HasChemicalListJournalVolume    NUMBER(10),
    HasChemicalListJournalIssue     VARCHAR(10),
    HasChemicalListMedlinePgn       VARCHAR(20),
    CONSTRAINT HasChemicalListKey
        PRIMARY KEY (HasChemicalListNameOfSubstance, HasChemicalListJournalISSN,
                     HasChemicalListJournalVolume, HasChemicalListJournalIssue,
                     HasChemicalListMedlinePgn),
    CONSTRAINT HasChemicalListOtherForeignKeys
        FOREIGN KEY (HasChemicalListJournalISSN, HasChemicalListJournalVolume,
                     HasChemicalListJournalIssue, HasChemicalListMedlinePgn)
        REFERENCES Article(JournalISSN, JournalVolume, JournalIssue, MedlinePgn)
);

CREATE TABLE HasMeshHeadingList (
    HasMeshHeadingListDescriptor    VARCHAR(100)
        CONSTRAINT HasMeshHeadingListDescriptorForeignKey
        REFERENCES MeshHeadingList(Descriptor),
    HasMeshHeadingListJournalISSN   VARCHAR(20),
    HasMeshHeadingListJournalVol    NUMBER(10),
    HasMeshHeadingListJournalIssue  VARCHAR(10),
    HasMeshHeadingListMedlinePgn    VARCHAR(20),
    CONSTRAINT HasMeshHeadingListKey
        PRIMARY KEY (HasMeshHeadingListDescriptor, HasMeshHeadingListJournalISSN,
                     HasMeshHeadingListJournalVol, HasMeshHeadingListJournalIssue,
                     HasMeshHeadingListMedlinePgn),
    CONSTRAINT HasMeshHeadingListOtherForeignKeys
        FOREIGN KEY (HasMeshHeadingListJournalISSN, HasMeshHeadingListJournalVol,
                     HasMeshHeadingListJournalIssue, HasMeshHeadingListMedlinePgn)
        REFERENCES Article (JournalISSN, JournalVolume, JournalIssue, MedlinePgn)
);
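To illustrate how the relationship tables support the author search named in the objectives, the following is a minimal Java sketch, not the project's QueryXml.java (which is not shown in these slides). It assumes the Article and WrittenBy tables exactly as declared above. The class name AuthorQuerySketch and both method names are hypothetical.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: finds article titles for one author by joining
// WrittenBy to Article on the four columns of Article's primary key.
class AuthorQuerySketch {

    // Builds the parameterized SQL; the single '?' is the author name.
    static String articlesByAuthorSql() {
        return "SELECT a.ArticleTitle "
             + "FROM Article a JOIN WrittenBy w "
             + "ON w.WrittenByJournalISSN = a.JournalISSN "
             + "AND w.WrittenByJournalVolume = a.JournalVolume "
             + "AND w.WrittenByJournalIssue = a.JournalIssue "
             + "AND w.WrittenByMedlinePgn = a.MedlinePgn "
             + "WHERE w.WrittenByAuthor = ?";
    }

    // Runs the query on an already opened JDBC connection.
    static List<String> titlesByAuthor(Connection conn, String author) throws SQLException {
        List<String> titles = new ArrayList<>();
        try (PreparedStatement ps = conn.prepareStatement(articlesByAuthorSql())) {
            ps.setString(1, author);
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    titles.add(rs.getString("ArticleTitle"));
                }
            }
        }
        return titles;
    }
}
```

Binding the author name through a placeholder rather than concatenating it into the SQL text keeps user input out of the statement itself.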

XML DTD

cs610PubmedDTD.dtd

<!DOCTYPE PubMedArticle [

<!ELEMENT PubMedArticle (MedlineCitation, PubmedData)>

<!ELEMENT MedlineCitation (MedlineID,

PMID,

DateCreated,

DateCompleted,

DateRevised,

Article,

MedlineJournalInfo,

ChemicalList,

CitationSubset,

MeshHeadingList)>

<!ELEMENT MedlineID (#PCDATA)>

<!ELEMENT PMID (#PCDATA)>

<!ELEMENT DateCreated (#PCDATA)>

<!ELEMENT DateCompleted (#PCDATA)>

<!ELEMENT DateRevised (#PCDATA)>

<!ELEMENT Article (Journal,

ArticleTitle,

Pagination,

Abstract,

Affiliation,

AuthorList,

Language,

GrantList,

PublicationTypeList)>

<!ELEMENT Journal (ISSN, JournalIssue)>

<!ELEMENT ISSN (#PCDATA)>

<!ELEMENT JournalIssue (Volume, Issue, PubDate)>

<!ELEMENT Volume (#PCDATA)>

<!ELEMENT Issue (#PCDATA)>

<!ELEMENT PubDate (#PCDATA)>

<!ELEMENT ArticleTitle (#PCDATA)>

<!ELEMENT Pagination (MedlinePgn)>

<!ELEMENT MedlinePgn (#PCDATA)>

<!ELEMENT Abstract (AbstractText)>

<!ELEMENT AbstractText (#PCDATA)>

<!ELEMENT Affiliation (#PCDATA)>

<!ELEMENT AuthorList (Author*)>

<!ELEMENT Author (#PCDATA)>

<!ELEMENT Language (#PCDATA)>

<!ELEMENT GrantList (Grant)>

<!ELEMENT Grant (GrantID, Acronym, Agency)>

<!ELEMENT GrantID (#PCDATA)>

<!ELEMENT Acronym (#PCDATA)>

<!ELEMENT Agency (#PCDATA)>

<!ELEMENT PublicationTypeList (PublicationType)>

<!ELEMENT PublicationType (#PCDATA)>

<!ELEMENT MedlineJournalInfo (Country, MedlineTA, MedlineCode, NlmUniqueID)>

<!ELEMENT Country (#PCDATA)>

<!ELEMENT MedlineTA (#PCDATA)>

<!ELEMENT MedlineCode (#PCDATA)>

<!ELEMENT NlmUniqueID (#PCDATA)>

<!ELEMENT ChemicalList (Chemical*)>

<!ELEMENT Chemical (CASRegistryNumber, NameOfSubstance)>

<!ELEMENT CASRegistryNumber (#PCDATA)>

<!ELEMENT NameOfSubstance (#PCDATA)>

<!ELEMENT CitationSubset (#PCDATA)>

<!ELEMENT MeshHeadingList (MeshHeading*)>

<!ELEMENT MeshHeading (Descriptor, SubHeading)>

<!ELEMENT Descriptor (#PCDATA)>

<!ELEMENT SubHeading (#PCDATA)>

<!ATTLIST SubHeading MajorTopicYN (Y|N) #IMPLIED>

<!ELEMENT PubmedData (History, PublicationStatus, ArticleIdList)>

<!ELEMENT History (PubMedPubDate*)>

<!ELEMENT PubMedPubDate (#PCDATA)>

<!ATTLIST PubMedPubDate PubStatus (pubmed|medline) #IMPLIED>

<!ELEMENT PublicationStatus (#PCDATA)>

<!ELEMENT ArticleIdList (ArticleId*)>

<!ELEMENT ArticleId (#PCDATA)>

<!ATTLIST ArticleId IdType (pubmed|medline) #IMPLIED>

]>
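As an illustration of how a DTD like this can be enforced when loading a file, here is a minimal Java sketch using the JDK's built-in validating DOM parser. It is an assumption that the project validates at all, since XmlLoad.java is not shown. To stay self-contained, the sketch uses only the AuthorList fragment of the DTD as an internal subset. The class name DtdValidationSketch and its members are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

// Hypothetical helper: parses XML with DTD validation switched on and
// returns the validation problems (an empty list means the input is valid).
class DtdValidationSketch {

    // A two-declaration excerpt of the DTD, written as an internal subset.
    static final String DTD =
          "<!DOCTYPE AuthorList [\n"
        + "<!ELEMENT AuthorList (Author*)>\n"
        + "<!ELEMENT Author (#PCDATA)>\n"
        + "]>\n";

    static List<String> validate(String xml) {
        final List<String> problems = new ArrayList<>();
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true); // check the document against its DTD
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setErrorHandler(new ErrorHandler() {
                public void warning(SAXParseException e) { /* ignored in this sketch */ }
                public void error(SAXParseException e) { problems.add(e.getMessage()); }
                public void fatalError(SAXParseException e) throws SAXParseException { throw e; }
            });
            builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            // Malformed XML or parser setup failure is reported as a problem too.
            problems.add(String.valueOf(e.getMessage()));
        }
        return problems;
    }

    public static void main(String[] args) {
        System.out.println(validate(DTD + "<AuthorList><Author>K Keeling</Author></AuthorList>"));
        System.out.println(validate(DTD + "<AuthorList><Editor>X</Editor></AuthorList>"));
    }
}
```

Validity errors are collected rather than thrown so that a caller can report every violation in a file at once.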

Example XML file

pubmed2.xml

<?xml version = "1.0" encoding="UTF-8" standalone="no"?>

<!DOCTYPE PubMedArticle SYSTEM "http://www.dpo.uab.edu/~jaenni99/cs610PubmedDTD.dtd">

<PubMedArticle>

<MedlineCitation>

<MedlineID>21096906</MedlineID>

<PMID>11159948</PMID>

<DateCreated>22-FEB-2001</DateCreated>

<DateCompleted>05-APR-2001</DateCompleted>

<DateRevised>05-APR-2001</DateRevised>

<Article>

<Journal>

<ISSN>0964-6906</ISSN>

<JournalIssue>

<Volume>10</Volume>

<Issue>3</Issue>

<PubDate>01-FEB-2001</PubDate>

</JournalIssue>

</Journal>

<ArticleTitle>Gentamicin-mediated suppression of Hurler syndrome stop mutations restores a low level of alpha-L-iduronidase activity and reduces lysosomal glycosaminoglycan accumulation.</ArticleTitle>

<Pagination>

<MedlinePgn>291-9</MedlinePgn>

</Pagination>

<Abstract>

<AbstractText>Hurler syndrome is the most severe form of a lysosomal storage disease caused by loss of the enzyme alpha-L-iduronidase (encoded by the IDUA gene), which participates in the degradation of glycosaminoglycans (GAGs) within the lysosome. In some populations, premature stop mutations represent roughly two-thirds of the mutations that cause Hurler syndrome. In this study we investigated whether the aminoglycoside gentamicin can suppress stop mutations within the IDUA gene. We found that a Hurler syndrome fibroblast cell line heterozygous for the IDUA stop mutations Q70X and W402X showed a significant increase in alpha-L-iduronidase activity when cultured in the presence of gentamicin, resulting in the restoration of 2.8% of normal alpha-L-iduronidase activity. Determination of alpha-L-iduronidase protein levels by an immunoquantification assay indicated that gentamicin treatment produced a similar increase in alpha-L-iduronidase protein in Hurler cells. Both the alpha-L-iduronidase activity and protein level resulting from this treatment have previously been correlated with mild Hurler phenotypes. Although Hurler fibroblasts contain a much higher level of GAGs than normal, we found that gentamicin treatment reduced GAG accumulation in Hurler cells to a normal level. We also found that a reduced GAG level could be sustained for at least 2 days after gentamicin treatment was discontinued. The reduction in the GAG level was also reflected in a marked reduction in lysosomal vacuolation. Taken together, these results suggest that the suppression of premature stop mutations may provide an effective treatment for Hurler syndrome patients with premature stop mutations in the IDUA gene.</AbstractText>

</Abstract>

<Affiliation>Department of Human Genetics, University of Alabama at Birmingham, Birmingham, AL 35294-2170, USA.</Affiliation>

<AuthorList>

<Author>K Keeling</Author>

<Author>D Brooks</Author>

<Author>J Hopwood</Author>

<Author>P Li</Author>

<Author>J Thompson</Author>

<Author>D Bedwell</Author>

</AuthorList>

<Language>eng</Language>

<GrantList>

<Grant>

<GrantID>DK53090</GrantID>

<Acronym>DK</Acronym>

<Agency>NIDDK</Agency>

</Grant>

</GrantList>

<PublicationTypeList>

<PublicationType>Journal Article</PublicationType>

</PublicationTypeList>

</Article>

<MedlineJournalInfo>

<Country>England</Country>

<MedlineTA>Hum Mol Genet</MedlineTA>

<MedlineCode>BRC</MedlineCode>

<NlmUniqueID>9208958</NlmUniqueID>

</MedlineJournalInfo>

<ChemicalList>

<Chemical>

<CASRegistryNumber>0</CASRegistryNumber>

<NameOfSubstance>Antibiotics, Aminoglycoside</NameOfSubstance>

</Chemical>

<Chemical>

<CASRegistryNumber>0</CASRegistryNumber>

<NameOfSubstance>Codon, Terminator</NameOfSubstance>

</Chemical>

<Chemical>

<CASRegistryNumber>0</CASRegistryNumber>

<NameOfSubstance>Gentamicins</NameOfSubstance>

</Chemical>

<Chemical>

<CASRegistryNumber>0</CASRegistryNumber>

<NameOfSubstance>Glycosaminoglycans</NameOfSubstance>

</Chemical>

<Chemical>

<CASRegistryNumber>0</CASRegistryNumber>

<NameOfSubstance>Heat-Shock Proteins 70</NameOfSubstance>

</Chemical>

<Chemical>

<CASRegistryNumber>EC 3.2.1.76</CASRegistryNumber>

<NameOfSubstance>Iduronidase</NameOfSubstance>

</Chemical>

</ChemicalList>

<CitationSubset>AIM</CitationSubset>

<MeshHeadingList>

<MeshHeading>

<Descriptor>Antibiotics, Aminoglycoside</Descriptor>

<SubHeading MajorTopicYN="Y">pharmacology</SubHeading>

</MeshHeading>

<MeshHeading>

<Descriptor>Gentamicins</Descriptor>

<SubHeading MajorTopicYN="Y">pharmacology</SubHeading>

</MeshHeading>

<MeshHeading>

<Descriptor>Glycosaminoglycans</Descriptor>

<SubHeading MajorTopicYN="Y">metabolism</SubHeading>

</MeshHeading>

<MeshHeading>

<Descriptor>Iduronidase</Descriptor>

<SubHeading MajorTopicYN="Y">drug effects</SubHeading>

</MeshHeading>

<MeshHeading>

<Descriptor>Lysosomes</Descriptor>

<SubHeading MajorTopicYN="Y">drug effects</SubHeading>

</MeshHeading>

<MeshHeading>

<Descriptor>Mucopolysaccharidosis I</Descriptor>

<SubHeading MajorTopicYN="Y">enzymology</SubHeading>

</MeshHeading>

<MeshHeading>

<Descriptor>Nasal Mucosa</Descriptor>

<SubHeading MajorTopicYN="N">physiopathology</SubHeading>

</MeshHeading>

</MeshHeadingList>

</MedlineCitation>

<PubmedData>

<History>

<PubMedPubDate PubStatus="pubmed">13-FEB-2001</PubMedPubDate>

<PubMedPubDate PubStatus="medline">06-APR-2001</PubMedPubDate>

</History>

<PublicationStatus>ppublish</PublicationStatus>

<ArticleIdList>

<ArticleId IdType="pubmed">11159948</ArticleId>

<ArticleId IdType="medline">21096906</ArticleId>

</ArticleIdList>

</PubmedData>

</PubMedArticle>
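To make the structure of the example record concrete, here is a minimal sketch of reading a few of its fields with the JDK's DOM parser. It is not taken from the project's source code: the class name `PubMedXmlReadSketch`, its method names, and the trimmed `SAMPLE` string are illustrative assumptions, and only a handful of elements from the record above are kept.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper class, for illustration only.
class PubMedXmlReadSketch {

    // Trimmed copy of the example record; most elements are omitted.
    static final String SAMPLE =
          "<PubMedArticle><MedlineCitation><MeshHeadingList>"
        + "<MeshHeading><Descriptor>Lysosomes</Descriptor>"
        + "<SubHeading MajorTopicYN=\"Y\">drug effects</SubHeading></MeshHeading>"
        + "<MeshHeading><Descriptor>Nasal Mucosa</Descriptor>"
        + "<SubHeading MajorTopicYN=\"N\">physiopathology</SubHeading></MeshHeading>"
        + "</MeshHeadingList></MedlineCitation>"
        + "<PubmedData><ArticleIdList>"
        + "<ArticleId IdType=\"pubmed\">11159948</ArticleId>"
        + "<ArticleId IdType=\"medline\">21096906</ArticleId>"
        + "</ArticleIdList></PubmedData></PubMedArticle>";

    // Parse an XML string into a DOM tree.
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    // Map each ArticleId's IdType attribute to its text value.
    static Map<String, String> articleIds(Document doc) {
        Map<String, String> ids = new LinkedHashMap<>();
        NodeList nodes = doc.getElementsByTagName("ArticleId");
        for (int i = 0; i < nodes.getLength(); i++) {
            Element e = (Element) nodes.item(i);
            ids.put(e.getAttribute("IdType"), e.getTextContent().trim());
        }
        return ids;
    }

    // Collect descriptors whose SubHeading is flagged MajorTopicYN="Y".
    static List<String> majorTopics(Document doc) {
        List<String> topics = new ArrayList<>();
        NodeList headings = doc.getElementsByTagName("MeshHeading");
        for (int i = 0; i < headings.getLength(); i++) {
            Element heading = (Element) headings.item(i);
            Element sub = (Element) heading.getElementsByTagName("SubHeading").item(0);
            Element desc = (Element) heading.getElementsByTagName("Descriptor").item(0);
            if (sub != null && desc != null && "Y".equals(sub.getAttribute("MajorTopicYN"))) {
                topics.add(desc.getTextContent().trim());
            }
        }
        return topics;
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(SAMPLE);
        System.out.println("PMID: " + articleIds(doc).get("pubmed"));
        System.out.println("Major topics: " + majorTopics(doc));
    }
}
```

The same pattern (look up elements by tag name, then read text content and attributes) would extend to `Chemical`, `MedlineJournalInfo`, and the `History` dates.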

Text User Interface

How to input (store) a journal article in XML format into the PubMed database
- Source code: XmlLoad.java

XmlLoad( )                        // Constructor: set up the basic information needed to connect to the Oracle DB
connect( )                        // Connect to the Oracle DB
list( )                           // Display the XML file
initializePreparedStatements( )   // Initialize the prepared statements
loadPubMedArticle( )              // Load entities with their values and attributes
insertDataIntoOracleDB( )         // Insert the XML data into the Oracle DB
closePreparedStatements( )        // Close the prepared statements
disconnect( )                     // Disconnect
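The prepare, insert, and close steps listed for XmlLoad.java can be sketched with plain JDBC as follows. This is an illustration under stated assumptions, not the project's actual code: the class `XmlLoadFlowSketch`, the table `chemical`, and its column names are hypothetical, and the real SQL schema may differ.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical class, for illustration only.
class XmlLoadFlowSketch {

    // Assumed table and column names; adjust to the real SQL schema.
    static final String INSERT_CHEMICAL =
        "INSERT INTO chemical (pmid, cas_registry_number, name_of_substance) VALUES (?, ?, ?)";

    // Build one bind-parameter row per <Chemical> element.
    // Each input pair is {CASRegistryNumber, NameOfSubstance}.
    static List<String[]> chemicalRows(String pmid, List<String[]> chemicals) {
        List<String[]> rows = new ArrayList<>();
        for (String[] c : chemicals) {
            rows.add(new String[] { pmid, c[0], c[1] });
        }
        return rows;
    }

    // Prepare the statement, bind and batch every row, execute, then close.
    // All rows are committed together, or rolled back if any insert fails.
    static int insertRows(Connection conn, String sql, List<String[]> rows) throws SQLException {
        boolean oldAutoCommit = conn.getAutoCommit();
        conn.setAutoCommit(false);
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            for (String[] row : rows) {
                for (int i = 0; i < row.length; i++) {
                    ps.setString(i + 1, row[i]); // JDBC parameters are 1-based
                }
                ps.addBatch();
            }
            ps.executeBatch();
            conn.commit();
            return rows.size();
        } catch (SQLException e) {
            conn.rollback();
            throw e;
        } finally {
            conn.setAutoCommit(oldAutoCommit);
        }
    }
}
```

With a live connection (obtained, for example, through `DriverManager.getConnection` with a site-specific URL, user, and password), a call such as `insertRows(conn, INSERT_CHEMICAL, chemicalRows(pmid, chemicals))` would load all chemicals of one article in a single transaction. Using bind parameters rather than string concatenation also keeps substance names containing quotes or commas from breaking the SQL.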

1. java XmlLoad -xmlFile xmlFilename.xml

2.

3.
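The command in step 1 passes the input file through an `xmlFile` option. A minimal sketch of reading that option is shown below, assuming the flag is typed with an ASCII hyphen (`-xmlFile`); the class `XmlLoadArgsSketch` and its method are hypothetical and not the project's code.

```java
// Hypothetical class, for illustration only.
class XmlLoadArgsSketch {

    // Return the value that follows "-xmlFile", or null if the flag
    // or its value is missing.
    static String xmlFileArg(String[] args) {
        for (int i = 0; i < args.length - 1; i++) {
            if ("-xmlFile".equals(args[i])) {
                return args[i + 1];
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String file = xmlFileArg(args);
        if (file == null) {
            System.err.println("Usage: java XmlLoad -xmlFile xmlFilename.xml");
            return;
        }
        System.out.println("XML file to load: " + file);
    }
}
```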

How to query by PMID, keyword, or author name
- Source code: QueryXml.java

Select one of the options         // Search by '1' PMID, '2' keyword, or '3' author name
QueryXml( )                       // Constructor: set up the basic information needed to connect to the Oracle DB
connect( )                        // Connect to the Oracle DB
executeQuery( )                   // Search articles in the PubMed database according to the selected option and your query
ResultSetToXMLConverter( )        // Display the search results in XML format
ResultTOXMLfiles( )               // Save the search results in XML format
disconnect( )                     // Disconnect
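The option handling and the conversion of a result row to XML might look roughly like the sketch below. It is an illustration only: the class `QueryXmlSketch`, the SQL text, and the table and column names (`article`, `mesh_heading`, `author`) are assumptions, and treating a "keyword" as a MeSH descriptor is also a guess about the project's design.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical class, for illustration only.
class QueryXmlSketch {

    // Choose a parameterized query for option 1 (PMID), 2 (keyword), or 3 (author).
    // Table and column names are assumed, not taken from the project's schema.
    static String sqlFor(int option) {
        switch (option) {
            case 1:
                return "SELECT pmid, article_title FROM article WHERE pmid = ?";
            case 2:
                return "SELECT a.pmid, a.article_title FROM article a "
                     + "JOIN mesh_heading m ON m.pmid = a.pmid "
                     + "WHERE LOWER(m.descriptor) LIKE ?";
            case 3:
                return "SELECT a.pmid, a.article_title FROM article a "
                     + "JOIN author au ON au.pmid = a.pmid "
                     + "WHERE LOWER(au.last_name) LIKE ?";
            default:
                throw new IllegalArgumentException("Option must be 1, 2, or 3: " + option);
        }
    }

    // PMID is matched exactly; keyword and author use a case-insensitive substring match.
    static String bindValue(int option, String input) {
        String trimmed = input.trim();
        return option == 1 ? trimmed : "%" + trimmed.toLowerCase() + "%";
    }

    // Escape the characters that are not allowed in XML element text.
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // Render one result row (column name -> value) as a single XML element.
    static String rowToXml(String rowTag, Map<String, String> row) {
        StringBuilder sb = new StringBuilder();
        sb.append('<').append(rowTag).append('>');
        for (Map.Entry<String, String> e : row.entrySet()) {
            sb.append('<').append(e.getKey()).append('>')
              .append(escape(e.getValue()))
              .append("</").append(e.getKey()).append('>');
        }
        sb.append("</").append(rowTag).append('>');
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> row = new LinkedHashMap<>();
        row.put("PMID", "11159948");
        row.put("MedlineTA", "Hum Mol Genet");
        System.out.println(sqlFor(1));
        System.out.println(rowToXml("Article", row));
    }
}
```

In a real run, the string from `sqlFor` would be passed to `Connection.prepareStatement`, the value from `bindValue` bound with `setString(1, ...)`, and each row of the `ResultSet` copied into a map before calling `rowToXml`.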

1. java QueryXml

2. Select one option

3. Enter more information to search

4. Search articles in the PubMed database

5. Create XML files from the search results
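The final step, writing search results out as XML files, could be sketched as follows. The class `ResultFileSketch` and the one-file-per-PMID naming scheme are assumptions made for illustration; the slides do not say how the project names its output files.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical class, for illustration only.
class ResultFileSketch {

    // Write one result as UTF-8 XML into dir and return the file's path.
    // Assumed naming scheme: pmid_<PMID>.xml
    static Path saveResult(Path dir, String pmid, String xmlBody) throws IOException {
        Files.createDirectories(dir);
        Path out = dir.resolve("pmid_" + pmid + ".xml");
        String doc = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n" + xmlBody + "\n";
        Files.write(out, doc.getBytes(StandardCharsets.UTF_8));
        return out;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("pubmed-results");
        Path file = saveResult(dir, "11159948",
            "<PubMedArticle><PMID>11159948</PMID></PubMedArticle>");
        System.out.println("Wrote " + file);
    }
}
```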