The MammoGrid Project
An EU FP5 funded project
On behalf of the MammoGrid Consortium: CERN, Geneva; Mirada Solutions, Oxford; Universities of Oxford, Pisa, Sassari, West of England; University Hospitals, Cambridge (Addenbrooke’s) & Udine (Policlinico Universitario)
Tony Solomonides, UWE
Tony [email protected] [email protected] XIX NEC 15-20 September 2003XIX NEC 15-20 September 2003 page 2 / N
Key Staff
• University of Cambridge (Addenbrooke’s Hospital): Ruth Warren (Radiology)
• CERN: Roberto Amendolia (Project Leader), Predrag Buncic (AliEn), Josè Galvez
• Mirada Solutions: Ralph Highnam (SMF™), David Schottlander (MAS)
• University of Oxford: Mike Brady, Chris Tromans (Quality Control)
• University of Pisa: Pasquale Delogu, Evelina Fantacci (CADe)
• University of Sassari: Ubaldo Botigli, Piernicola Oliva (CADe)
• University of West of England, Bristol: Florida Estrella, Tamas Hauer, David Manset, Dmitri Rogulin (DICOM, Grid); Richard McClatchey (Technical Director); Mohammed Odeh (User Requirements); Tony Solomonides (User Requirements, Database, Dissemination)
• Policlinico Universitario di Universita degli Studi di Udine: Massimo Bazzocchi, Chiara Del Frate (Radiology)
Contents
1. Background & Motivation
   a. Breast screening programmes
   b. Mammography and its problems
2. Technologies & Grids
   a. Mammographic tools
   b. CRISTAL
   c. AliEn
3. MammoGrid
   a. Objectives
   b. Methods
4. Conclusions & questions
Breast Cancer Screening
• Breast cancer is a major problem:
– 12% lifetime risk of breast cancer
– 19% of cancer deaths are due to breast cancer
– 24% of all cancer cases are breast cancers
– in EU & USA 350,000 are diagnosed and 115,000 die annually
– but 73% of diagnosed cases survive 5 years or more
• Early diagnosis improves prognosis
Breast Cancer Screening - EU
• UK: covers women 50 – 64; established in 1987; 1.3m women screened annually at 92 centres (230 radiologists)
• Sweden, Finland, The Netherlands, Ireland, France and Germany have or are about to establish national programmes
• Italy: depends on where you live; expertise and areas covered by screening programmes do not necessarily correlate
• Variable availability and coverage in other countries
Breast Cancer Screening - UK
• Low tech, ‘rapid, high-throughput’ screening
• Basic frequency every three years
• Ideally, a full ‘series’ taken at screening consists of two images (cranio-caudal and medio-lateral oblique) for each breast – four in all
• Historically, only two CC images taken
• A radiologist may have 40 seconds to study the series
• A study in 2000 attributes a 6.4% contribution to reduced mortality to the screening programme; better treatments account for 14.9%
• More ‘interval cancers’ than anticipated
• Pressure is mounting to begin screening every 2 years
Breast Cancer Screening - Problems
• Sensitivity and specificity, which determine the percentages of false negatives and false positives respectively, are both unacceptably low
• In the US, 80% of biopsies are benign – error on the side of caution
• A combination of computer-aided detection and radiologist screening improves performance significantly
• There is a world-wide shortage of trained radiologists and radiographers (radiologic technicians)
• The work is perceived as ‘boring but risky’; in US 12% of malpractice lawsuits are against radiologists
• Over 20% of films are practically unavailable for re-use or comparison – misfiled, in transit or lost
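The sensitivity, specificity and benign-biopsy figures in the bullets above all derive from the same four counts. The following is a minimal sketch of that arithmetic; the counts are hypothetical, chosen only for illustration, and the function assumes every recalled case goes on to biopsy.

```python
def screening_metrics(tp, fp, tn, fn):
    """Return (sensitivity, specificity, benign biopsy fraction).

    tp / fn: cancers recalled / missed at screening.
    fp / tn: healthy women recalled / correctly cleared.
    Assumption: every recalled case (tp + fp) is biopsied.
    """
    sensitivity = tp / (tp + fn)      # share of cancers that are detected
    specificity = tn / (tn + fp)      # share of healthy women correctly cleared
    benign_fraction = fp / (tp + fp)  # share of biopsies that turn out benign
    return sensitivity, specificity, benign_fraction


# Hypothetical counts, not taken from any screening programme.
sens, spec, benign = screening_metrics(tp=20, fp=80, tn=9880, fn=20)
print(f"sensitivity={sens:.2f} specificity={spec:.3f} benign biopsies={benign:.0%}")
```

With these made-up counts, four out of five biopsies are benign even though specificity is above 99%, which is why a low cancer prevalence makes false positives so costly.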
Mammography - Problems
• Image quality (brightness/visibility and contrast/distinction) is affected by tube kVp and mAs, screen and film characteristics and automatic exposure control
• The interpretation of images is difficult; there are issues both about training and about experience
• There are ‘good practice’ guidelines, but radiographers have some latitude in the decisions they make
• In both UK and Italy quality control issues have arisen:
– Failure to record settings
– Inconsistency of settings
– Unusable images
• MammoGrid is concerned with film-screen systems; the move to all-digital mammography is slow
What is MammoGrid?
• EU project to prefigure a pan-European distributed database of mammographic images using GRID Technologies.
• Aim: To provide a demonstrator for use in epidemiological studies, quality control and validation of computer aided detection algorithms.
MammoGrid Objectives
1. To evaluate current Grids technologies and determine the requirements for Grid-compliance in a pan-European mammography database.
2. To implement the MammoGrid database, using novel Grid-compliant and Federated-Database technologies that will provide improved access to distributed data and will allow rapid deployment of software packages to operate on locally stored information.
3. To deploy enhanced versions of a standardization system that enables comparison of mammograms in terms of intrinsic tissue properties independently of scanner settings, and to explore its place in the context of medical image formats (DICOM).
4. To develop software tools to automatically extract image information that can be used to perform quality controls on the acquisition process of participating centers (e.g. average brightness, contrast).
5. To develop software tools to automatically extract tissue information that can be used to perform clinical studies (e.g. breast density, presence, number and location of micro-calcifications) in order to increase the performance of breast cancer screening programs.
6. To use the annotated information and the images in the database to benchmark the performance of the software described in points 3, 4 and 5.
7. To exploit the MammoGrid database and the algorithms to propose initial pan-European quality controls on mammographic acquisition and ultimately to provide a benchmarking system to third party algorithms.
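Objective 4 mentions average brightness and contrast as image information for quality control. As a rough illustration only, and not the project's software, the sketch below computes both for an image held as a 2D list of grey levels and flags centres whose mean brightness strays from the median across centres. The function names, the use of RMS contrast and the tolerance rule are all assumptions made for this example.

```python
from statistics import mean, median, pstdev


def image_quality_stats(pixels):
    """pixels: 2D list of grey levels. Returns (mean brightness, RMS contrast)."""
    flat = [value for row in pixels for value in row]
    brightness = mean(flat)
    contrast = pstdev(flat)  # RMS contrast: population std dev of grey levels
    return brightness, contrast


def centres_outside_tolerance(brightness_by_centre, tolerance):
    """Names of centres whose mean brightness differs from the
    cross-centre median by more than `tolerance` (hypothetical QC rule)."""
    reference = median(brightness_by_centre.values())
    return sorted(
        name
        for name, value in brightness_by_centre.items()
        if abs(value - reference) > tolerance
    )


brightness, contrast = image_quality_stats([[10, 20], [30, 40]])
flagged = centres_outside_tolerance({"A": 100, "B": 104, "C": 140}, tolerance=20)
print(brightness, contrast, flagged)
```

A real acquisition check would work on standardized images rather than raw grey levels, since raw brightness depends on the exposure settings the slides describe.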
MammoGrid Philosophy
• Project concentrates on applying emerging GRID technology rather than on developing it.
• It plans to implement a ‘lightweight’ (but fully functional) GRID and study its usage in hospitals
• It will draw heavily on other Grids projects e.g. DataGrid
• It will deliver a prototype federated database of mammograms in hospitals in the UK and Italy
• It will provide rapid feedback from the Hospital community
• And will inform the next generation of health grids developments
Why a Mammography Database?
• Improved reliability of screening and early diagnosis requires:
– better epidemiological understanding
– improved diagnostic tools
– enhanced quality control
– continuous training
– efficient management of data and records.
• Need to establish research and training repositories that contain sufficiently large statistical samples:
– MammoGrid-EU
– NDMA-US
– eDIAMonD-UK
– GPCalma-Italy
Technologies
1. Mammography
– SMF™ (Mirada)
– CADe (CALMA)
– DICOM (Medical Imaging Standard)
2. Distributed computation
– CRISTAL Database (CERN/UWE)
– AliEn GRID (CERN)
Standard Mammogram Form
The theory of SMF™
• Mirada’s Standard Mammogram Form (SMF™) measures the column of non-fatty tissue between the compression plate and the imaging surface.
• The SMF algorithm models the physics of image formation, including extrafocal radiation, scatter, grid effects, film-screen characteristics, etc.
• The contribution of the imaging system is factored out.
• The image is decomposed into fatty tissue and non-fatty tissue.
• The new representation gives a numerical value for the amount of non-fatty tissue at any point on the image.
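The idea of decomposing each pixel into fatty and non-fatty tissue can be illustrated with a deliberately simplified two-compartment model. This is not Mirada's SMF algorithm, which also models scatter, extrafocal radiation and film-screen response. The sketch assumes a monoenergetic beam, a known compressed thickness and Beer-Lambert attenuation; the coefficients `MU_FAT` and `MU_INT` are illustrative placeholders, not measured values.

```python
import math

MU_FAT = 0.5  # linear attenuation of fat, per cm (illustrative value only)
MU_INT = 0.8  # linear attenuation of non-fatty tissue, per cm (illustrative)


def transmitted_fraction(h_int, thickness):
    """Beer-Lambert transmission through h_int cm of non-fatty tissue,
    with the rest of the compressed thickness made up of fat."""
    h_fat = thickness - h_int
    return math.exp(-(MU_INT * h_int + MU_FAT * h_fat))


def non_fatty_thickness(fraction, thickness):
    """Invert the model: cm of non-fatty tissue in the column above a pixel."""
    attenuation = -math.log(fraction)
    h_int = (attenuation - MU_FAT * thickness) / (MU_INT - MU_FAT)
    return min(max(h_int, 0.0), thickness)  # clamp to the physical range


# Round trip: 2 cm of non-fatty tissue in a 5 cm compressed breast.
fraction = transmitted_fraction(2.0, 5.0)
print(round(non_fatty_thickness(fraction, 5.0), 6))
```

Because the recovered quantity is a tissue thickness rather than a grey level, two images of the same breast taken with different exposure settings should, under this idealized model, map to the same values.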
A normalized image
CMS Detector
ECAL
ECAL Barrel
ECAL production
[Diagram: production centres and what each contributes]
– SIC (China): Crystals
– Bogorodisk (Russia): Crystals
– IPN Lyon (France): Electronics
– LPNHE (France): Alveoli
– ENEA/INFN (Italy): Modules
– CERN (Switzerland): Modules & Supermodules
– CERN (Switzerland): SM in testbeam
– CERN (Switzerland): ECAL in CMS
– CEA/DAPNIA (France): Monitoring
• Parts are shipped between centres
• Must control overall construction
Integration via Meta-Objects
[UML class diagram. Classes: PartDefinition, CompositePartDef, ElementaryPartDef, PartCompositeMember, ActivityDefinition, ElementaryActDef, CompositeActDef, ActCompositeMember, Conditions. Association multiplicities shown: 0..n, 0..1, 1.]
CRISTAL 1 concept for integrated product (structure) and process (workflow) description
CRISTAL Object Layers
[Diagram: CRISTAL 2 Kernel Architecture]
– Model Layer: Item Class is described by Item Description Class (Type Object Pattern)
– Instance Layer: Item is described by Item Description (Type Object Pattern)
– Item is an instance of Item Class; Item Description is an instance of Item Description Class
– Columns: Data (Item side), Meta-Data (Item Description side)
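The Type Object pattern named in the diagram separates an object from a description object that defines its kind, so new kinds can be introduced as data rather than as new classes. The following is a minimal sketch of the pattern in general, not CRISTAL's actual classes or API; the `required_fields` attribute and the mammogram example are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ItemDescription:
    """Meta-data: says what an Item of this kind must contain."""
    name: str
    version: int
    required_fields: tuple


@dataclass
class Item:
    """Data: one concrete object, described by an ItemDescription."""
    description: ItemDescription
    values: dict = field(default_factory=dict)

    def missing_fields(self):
        """Fields the description requires that this item has not recorded."""
        return [f for f in self.description.required_fields if f not in self.values]


# A new version of the description is just another object, created at run time.
mammogram_v1 = ItemDescription("Mammogram", 1, ("view", "laterality"))
mammogram_v2 = ItemDescription("Mammogram", 2, ("view", "laterality", "kVp"))

old_item = Item(mammogram_v1, {"view": "CC", "laterality": "L"})
new_item = Item(mammogram_v2, {"view": "CC", "laterality": "L"})
print(old_item.missing_fields(), new_item.missing_fields())
```

Items created under version 1 remain valid against their own description while version 2 items are checked against the stricter one, which is one way descriptions and data can evolve on separate schedules.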
CRISTAL
• Data and metadata are versioned asynchronously
– Flexibility to cope with and propagate end-users’ changes
– Clear separation between object lifecycles of the domain:
• Metadata: What has to be done
• Data: What is being done
• Data and metadata models are largely domain independent
• ‘Design patterns’ have emerged from the process of abstraction
• Large volume of data can be managed
• Large distributed data processes can be tracked
DICOM Information Model
MammoGrid Object Model
Assessment Object Model
The MammoGrid Challenge
• Building this repository is not trivial because:
– Large numbers of exemplars are required.
– Cases must be obtained from many geographically remote locations.
– Data itself is large: 2 breasts × 2 views × 4K × 4K pix × 2 bytes = 128 Mbyte per patient per visit; with 1.5M women screened per year, that is ~200 Terabytes per year in the UK alone.
– Acquisition is highly variable: the same image may look different depending on machine and parameters. How do you compare?
– Patient privacy and data security are key.
– Many relevant items of metadata.
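The storage estimate in the list above can be checked with a few lines of arithmetic. The sketch reads "4K" as 4096 pixels and "Mbyte" as 2^20 bytes, which is an assumption about the slide's units; the annual total is reported in decimal Terabytes.

```python
BYTES_PER_PIXEL = 2
PIXELS_PER_SIDE = 4 * 1024     # "4K" read as 4096 pixels (assumption)
IMAGES_PER_VISIT = 2 * 2       # 2 breasts x 2 views
WOMEN_PER_YEAR_UK = 1_500_000  # figure quoted on the slide

bytes_per_image = PIXELS_PER_SIDE ** 2 * BYTES_PER_PIXEL
bytes_per_visit = bytes_per_image * IMAGES_PER_VISIT
bytes_per_year = bytes_per_visit * WOMEN_PER_YEAR_UK

print(f"per image: {bytes_per_image // 2**20} Mbyte")
print(f"per visit: {bytes_per_visit // 2**20} Mbyte")
print(f"per year:  {bytes_per_year / 1e12:.0f} Terabytes")
```

Under these assumptions one image is 32 Mbyte, one visit is 128 Mbyte, and a year of UK screening comes to roughly 201 Terabytes, consistent with the slide's "~200".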
GRID
Seeking a solution for resource sharing & coordinated problem solving in dynamic, multi-institutional virtual organizations.
Data GRID goal:
“Enable a geographically distributed community of thousands to pool their resources in order to perform sophisticated, computationally intensive analyses on petabytes of data”
Layered Grid Technologies
[Diagram: three layers, with axes labelled Data Control and Data Abstraction]
• Knowledge Grid: Data mining, visualisation, simulation, problem solving methods/environments …
• Information Grid: Metadata, middleware, intelligent retrieval, information modelling, warehousing, workflow …
• Data Grid: Distributed databases, streaming, near-line storage, large objects, access mechanisms, data staging …
AliEn Challenge
AliEn = Alice Environment
“Can we provide, building on top of available public domain and open source components and standards, a functional distributed computing infrastructure to the community of our users which will remain operational even if underlying technologies keep changing?”
Geneva, May 2001
Instead of using the Globus toolkit or waiting for DataGRID to deliver a re-packaged version of Globus, we decided to try a different path and use Web Services and related standards as the backbone of our GRID implementation.
AliEn Services in MammoGrid
• Authentication
– User’s credentials checked
• Resource Broker
– Job/algorithm scheduling
• Storage Element
– Data and file management
• File transfer
– Scheduled file transfers
Service Oriented Architecture
High Energy Physics vs. MammoGrid
• MammoGrid relies heavily on technologies developed primarily in the field of high energy physics.
• Similarities
– Large number of big files
– Files can be sensibly organized in directory tree
– Need to replicate and move file copies between sites
– Need to execute commands on the node which hosts data locally
• Difficulties
– Complexity of co-working in medical environment
– Lack of trained IT personnel
– Confidentiality
A GRID Infrastructure is ideal
• To test image-based clinical hypotheses, databases must:
– Be populated by a large number of cases
– Contain large files (1 mammogram ~ 32 Mbyte)
– Span geographically distributed repositories
– Cope with heterogeneous database formats
– Be accessible to co-workers
• Development / validation of medical image analysis solutions demands:
– Computationally expensive simulations
– Repeated runs for optimal parameter tuning
– Statistical test rigs
– Remote execution and maintenance
• Services (e.g. security) must be system-resident, invisible, generic
Federated System Solution
[Diagram: a GRID links four sites (Hospital Italy, Healthcare Institute, University Database, Hospital UK), each with its own Local Query and Local Analysis. Clinician’s Workstations send a Query and receive a Result. Sites hold shared meta-data and analysis-specific data.]
• Knowledge is stored alongside data
• Active (meta-)objects manage various versions of data and algorithms
• Small network bandwidth required
• Massively distributed data AND distributed analyses
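The federated arrangement in this slide, where each site runs its own local query and only results travel back to the clinician, can be sketched as follows. This is an illustration of the general idea, not MammoGrid code; the site contents, record fields and function names are hypothetical.

```python
def local_query(records, predicate):
    """Runs at one site: only the identifiers of matching cases leave it."""
    return [record["lfn"] for record in records if predicate(record)]


def federated_query(sites, predicate):
    """Send the same query to every site and merge the per-site results."""
    results = {}
    for site, records in sites.items():
        matches = local_query(records, predicate)
        if matches:
            results[site] = matches
    return results


# Hypothetical meta-data held at three sites.
SITES = {
    "Hospital UK": [
        {"lfn": "/uk/case-001", "age": 52, "view": "CC"},
        {"lfn": "/uk/case-002", "age": 61, "view": "MLO"},
    ],
    "Hospital Italy": [
        {"lfn": "/it/case-001", "age": 58, "view": "CC"},
    ],
    "University Database": [
        {"lfn": "/uni/case-001", "age": 47, "view": "MLO"},
    ],
}

matches = federated_query(SITES, lambda r: r["view"] == "CC" and r["age"] >= 50)
print(matches)
```

Because the predicate is evaluated where the data lives, only short lists of identifiers cross the network, which is the sense in which the slide says small network bandwidth is required.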
AliEn Virtual Organization
Multi-level Virtual Organization
Overall Grids Architecture
[Diagram: four sites (Udine, Oxford, CERN, Cambridge), each with a GridBox and a MammoGrid Data store, connected through the GRID over a VPN. Workstations running Mirada WST (‘‘MAS’’) are shown at two sites. Label: High Security Level.]
Stack of Grid Services
MammoGrid Transactions
[Diagram: transactions between Workstation, Local GridBox and ALIEN. Recoverable labels, grouped by step:]
• 1. Acquire New Image: DICOM File; DICOM SCU / DICOM SCP (C_STORE); Store on Local Store (Local Data Store, File Store); ADD (Trigger); Extract Meta-Data and Save Centrally (ALIEN, Meta-Data)
• 2. QUERY: Search Criteria; Query Database; List of matching results, including Logical File Names (LFN)
• PREPARE: List of Files that W/S is Interested In (LFNs); List of LFNs Sorted by Estimated Access Time
• RETRIEVE: LFN; Find PFN (Local LFN/PFN Mapping); PFN (dicom://ip:port:aetitle); DicomGet; DICOM SCU / DICOM SCP (C_GET); Remote Retrieval of Files to Local Data Store Cache (Local Cache); Requests to other grid servers
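The LFN and PFN labels in the MammoGrid Transactions diagram suggest a catalogue that maps logical file names to physical locations of the form dicom://ip:port:aetitle. The sketch below parses that PFN form and orders a list of LFNs so that locally held files come first, as a crude stand-in for "estimated access time". The catalogue contents, addresses and the cost rule are hypothetical; the real system's logic is not described on the slide.

```python
from typing import NamedTuple


class DicomPfn(NamedTuple):
    ip: str
    port: int
    aetitle: str


def parse_pfn(pfn):
    """Split a PFN of the form dicom://ip:port:aetitle into its parts.
    IPv6 addresses are not handled in this sketch."""
    prefix = "dicom://"
    if not pfn.startswith(prefix):
        raise ValueError(f"not a DICOM PFN: {pfn!r}")
    ip, port, aetitle = pfn[len(prefix):].split(":", 2)
    return DicomPfn(ip, int(port), aetitle)


def prepare(lfns, catalogue, local_ip):
    """Order LFNs by estimated access cost: local replicas before remote ones."""
    def cost(lfn):
        return 0 if parse_pfn(catalogue[lfn]).ip == local_ip else 1
    return sorted(lfns, key=cost)  # stable sort keeps the order within each group


# Hypothetical LFN -> PFN catalogue.
CATALOGUE = {
    "/mammogrid/udine/img-1": "dicom://10.0.0.2:104:UDINE",
    "/mammogrid/cambridge/img-7": "dicom://10.0.0.1:104:CAMBRIDGE",
    "/mammogrid/udine/img-3": "dicom://10.0.0.2:104:UDINE",
}

ordered = prepare(list(CATALOGUE), CATALOGUE, local_ip="10.0.0.1")
print(ordered)
```

Keeping the logical name separate from the physical location means a file can be replicated or moved between sites without changing the name that queries return.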
MammoGrid Queries
Main Deliverables and Milestones
• User Requirements Specification and Technical System Specification (published)
• Packaged medical imaging workstation with interface to GRID, secure GRID box (October 2003)
• Grid compliant SMF software (December 2003)
• Prototype GRID-compliant database and information infrastructure (March 2004)
• Application software (under development)
• Clinical Trials (late 2004 – end of project)
Conclusions and Further Research
• Distributed Health informatics is an important application area for Grids technologies – Health Grids
• Many similarities with High Energy Physics
• MammoGrid user requirements specified, but need rapid feedback from the user community
• Effective Grid deployment needed now
• Many questions subject to further research:
– How to resolve distributed queries?
– What role for meta-data?
– How to maintain secure, reliable data?
• MammoGrid: First results expected late 2003
The MammoGrid Project
Thank you