![Page 1: The Matsu Wheel](https://reader033.vdocument.in/reader033/viewer/2022042220/5ec6a8f0b673cc1e1f66086e/html5/thumbnails/1.jpg)
HyspIRI Symposium, 5 June, 2014
Maria Patterson, PhD
Open Science Data Cloud
Center for Data Intensive Science (CDIS)
University of Chicago
The Matsu Wheel: A Cloud-based Scanning Framework for Analyzing Large Volumes of Hyperspectral Data
![Page 2: The Matsu Wheel](https://reader033.vdocument.in/reader033/viewer/2022042220/5ec6a8f0b673cc1e1f66086e/html5/thumbnails/2.jpg)
The Open Science Data Cloud (OSDC) is an open-source, cloud-based infrastructure that allows scientists to manage, share, and analyze medium to large scientific datasets.
Anyone doing scientific research can apply for resources:
www.opensciencedatacloud.org
Maria Patterson (mtpatter@uchicago.edu), Center for Data Intensive Science, University of Chicago
![Page 3: The Matsu Wheel](https://reader033.vdocument.in/reader033/viewer/2022042220/5ec6a8f0b673cc1e1f66086e/html5/thumbnails/3.jpg)
User view: 1) login
![Page 4: The Matsu Wheel](https://reader033.vdocument.in/reader033/viewer/2022042220/5ec6a8f0b673cc1e1f66086e/html5/thumbnails/4.jpg)
User view: 2) launch virtual machine
![Page 5: The Matsu Wheel](https://reader033.vdocument.in/reader033/viewer/2022042220/5ec6a8f0b673cc1e1f66086e/html5/thumbnails/5.jpg)
User view: 3) run analysis
![Page 6: The Matsu Wheel](https://reader033.vdocument.in/reader033/viewer/2022042220/5ec6a8f0b673cc1e1f66086e/html5/thumbnails/6.jpg)
![Page 7: The Matsu Wheel](https://reader033.vdocument.in/reader033/viewer/2022042220/5ec6a8f0b673cc1e1f66086e/html5/thumbnails/7.jpg)
Project Matsu
• Joint effort between the Open Cloud Consortium (lead, Robert Grossman) and NASA (lead, Dan Mandl) to develop open source technology for cloud-based processing of satellite imagery to support earth sciences.
• The OSDC is used to process Earth Observing-1 (EO-1) satellite imagery from the Advanced Land Imager and Hyperion instruments and to make these data available to interested users.
• Namibia flood dashboard, WCPS
• Hadoop-based ‘Matsu Wheel’ data-scanning algorithm
![Page 8: The Matsu Wheel](https://reader033.vdocument.in/reader033/viewer/2022042220/5ec6a8f0b673cc1e1f66086e/html5/thumbnails/8.jpg)
Matsu Analytic Wheel
• NASA Goddard Space Flight Center: new data observed by Earth Observing-1 (EO-1) and downloaded to NASA
• OSDC Public Data Commons (GlusterFS): NASA images sent to the OSDC Public Data Commons cloud for permanent storage
• HDFS: data read into HDFS only once
• Wheel analytics run over data using MapReduce: contours + clusters, rare pixel finder, spectral blobs, supervised classifier, report generators
• Additional analytics plug in easily
• NoSQL database (Accumulo): metadata stored, analytic results stored
• Analytic reports generated by the Wheel are accessible via web browser
• Secondary analysis can be done from the analytic database
![Page 9: The Matsu Wheel](https://reader033.vdocument.in/reader033/viewer/2022042220/5ec6a8f0b673cc1e1f66086e/html5/thumbnails/9.jpg)
• The Wheel “watches” for new data to become available, using Apache Storm.
• When new data are detected, they are loaded into Hadoop’s distributed file system for analysis using MapReduce.
• The Wheel analytics run each night, and daily reports are available the morning after data are received.
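The watch-and-load step in these bullets can be sketched roughly as follows. This is a minimal polling sketch in plain Python, not the Apache Storm setup the Wheel uses, and a local directory stands in for HDFS. The function names `find_new_files` and `load_once` and the directory layout are hypothetical. It shows only the idea of detecting files not yet seen and copying each one into the analysis store exactly once.

```python
from pathlib import Path
import shutil


def find_new_files(incoming: Path, seen: set) -> list:
    """Return files in `incoming` whose names have not been seen before."""
    return sorted(
        p for p in incoming.iterdir() if p.is_file() and p.name not in seen
    )


def load_once(incoming: Path, store: Path, seen: set) -> list:
    """Copy each newly detected file into `store` (a stand-in for HDFS) once.

    `seen` is updated in place so that a later call skips files that were
    already loaded. Returns the names of the files loaded by this call.
    """
    store.mkdir(parents=True, exist_ok=True)
    loaded = []
    for path in find_new_files(incoming, seen):
        shutil.copy2(path, store / path.name)
        seen.add(path.name)
        loaded.append(path.name)
    return loaded
```

Calling `load_once` on a schedule (for example, before a nightly analytics run) would pick up only the files that arrived since the previous call.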
![Page 10: The Matsu Wheel](https://reader033.vdocument.in/reader033/viewer/2022042220/5ec6a8f0b673cc1e1f66086e/html5/thumbnails/10.jpg)
• The Wheel processes large volumes of data efficiently across many types of analysis by requiring only a common input format.
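One way to picture the common-input-format idea: each scene is read once and handed, in the same form, to every registered analytic. The sketch below is plain Python rather than the Hadoop MapReduce jobs the Wheel runs. The `Scene` record, the `ANALYTICS` registry, and the two toy analytics are hypothetical names invented for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Scene:
    """Hypothetical common input format: an id plus per-pixel band values."""
    scene_id: str
    pixels: List[List[float]]  # one list of band values per pixel


# Registry of analytics; each takes a Scene and returns a dict of results.
ANALYTICS: Dict[str, Callable[[Scene], dict]] = {}


def register(name: str):
    """Decorator that plugs a new analytic into the registry."""
    def decorator(func):
        ANALYTICS[name] = func
        return func
    return decorator


@register("mean_brightness")
def mean_brightness(scene: Scene) -> dict:
    values = [v for pixel in scene.pixels for v in pixel]
    return {"mean": sum(values) / len(values)}


@register("bright_pixel_count")
def bright_pixel_count(scene: Scene, threshold: float = 0.8) -> dict:
    count = sum(1 for pixel in scene.pixels if max(pixel) > threshold)
    return {"count": count}


def run_wheel(scenes: List[Scene]) -> dict:
    """Visit each scene once and apply every registered analytic to it."""
    results = {}
    for scene in scenes:
        for name, analytic in ANALYTICS.items():
            results[(scene.scene_id, name)] = analytic(scene)
    return results
```

Adding another analytic in this sketch means writing one more function that accepts a `Scene` and decorating it with `register`; the scanning loop itself does not change.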
![Page 11: The Matsu Wheel](https://reader033.vdocument.in/reader033/viewer/2022042220/5ec6a8f0b673cc1e1f66086e/html5/thumbnails/11.jpg)
![Page 12: The Matsu Wheel](https://reader033.vdocument.in/reader033/viewer/2022042220/5ec6a8f0b673cc1e1f66086e/html5/thumbnails/12.jpg)
Matsu Wheel Daily Reports
matsu-analytics.opensciencedatacloud.org
![Page 13: The Matsu Wheel](https://reader033.vdocument.in/reader033/viewer/2022042220/5ec6a8f0b673cc1e1f66086e/html5/thumbnails/13.jpg)
Matsu Wheel Daily Reports
![Page 14: The Matsu Wheel](https://reader033.vdocument.in/reader033/viewer/2022042220/5ec6a8f0b673cc1e1f66086e/html5/thumbnails/14.jpg)
Matsu Wheel Daily Reports
![Page 15: The Matsu Wheel](https://reader033.vdocument.in/reader033/viewer/2022042220/5ec6a8f0b673cc1e1f66086e/html5/thumbnails/15.jpg)
Matsu Wheel Daily Reports
![Page 16: The Matsu Wheel](https://reader033.vdocument.in/reader033/viewer/2022042220/5ec6a8f0b673cc1e1f66086e/html5/thumbnails/16.jpg)
Matsu Wheel Daily Reports
![Page 17: The Matsu Wheel](https://reader033.vdocument.in/reader033/viewer/2022042220/5ec6a8f0b673cc1e1f66086e/html5/thumbnails/17.jpg)
Matsu Wheel Daily Reports
![Page 18: The Matsu Wheel](https://reader033.vdocument.in/reader033/viewer/2022042220/5ec6a8f0b673cc1e1f66086e/html5/thumbnails/18.jpg)
Matsu Wheel Daily Reports
![Page 19: The Matsu Wheel](https://reader033.vdocument.in/reader033/viewer/2022042220/5ec6a8f0b673cc1e1f66086e/html5/thumbnails/19.jpg)
Matsu Wheel is open source
github.com/opencloudconsortium/matsu-project
![Page 20: The Matsu Wheel](https://reader033.vdocument.in/reader033/viewer/2022042220/5ec6a8f0b673cc1e1f66086e/html5/thumbnails/20.jpg)
New wheel analytic (beta):
Support Vector Machine (SVM) classifier
• A supervised machine learning classification algorithm
• Train the classifier by hand-classifying areas in a set of training images
• Beta classifier has 4 classes: clouds, dry land, vegetation, water
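As a rough illustration of supervised training on hand-labeled pixels, the sketch below fits a one-vs-rest linear classifier using hinge-loss subgradient steps (the hinge loss is the loss a linear SVM minimizes). It leaves out the regularization term, so it is a simplified stand-in rather than a full SVM, and it is not the Wheel's implementation. The two-band pixel values are made up for the example and are not real spectra; only the four class names come from the slide.

```python
CLASSES = ["clouds", "dry land", "vegetation", "water"]

# Made-up training pixels: (band_a, band_b) values with hand-assigned labels.
TRAINING = [
    ((0.90, 0.90), "clouds"), ((0.80, 0.85), "clouds"), ((0.85, 0.80), "clouds"),
    ((0.90, 0.10), "dry land"), ((0.80, 0.15), "dry land"), ((0.85, 0.20), "dry land"),
    ((0.10, 0.90), "vegetation"), ((0.15, 0.80), "vegetation"), ((0.20, 0.85), "vegetation"),
    ((0.10, 0.10), "water"), ((0.05, 0.10), "water"), ((0.10, 0.05), "water"),
]


def score(weights, pixel):
    """Linear score: two band weights plus a bias term."""
    return weights[0] * pixel[0] + weights[1] * pixel[1] + weights[2]


def train_binary(samples, positive, max_epochs=1000):
    """Fit weights for `positive` versus the rest with hinge-loss steps."""
    w = [0.0, 0.0, 0.0]
    for _ in range(max_epochs):
        updated = False
        for pixel, label in samples:
            y = 1.0 if label == positive else -1.0
            if y * score(w, pixel) < 1.0:  # hinge loss is active for this sample
                w = [w[0] + y * pixel[0], w[1] + y * pixel[1], w[2] + y]
                updated = True
        if not updated:  # every sample now has margin >= 1
            break
    return w


def train(samples):
    """Train one binary classifier per class (one-vs-rest)."""
    return {name: train_binary(samples, name) for name in CLASSES}


def classify(model, pixel):
    """Return the class whose binary classifier gives the highest score."""
    return max(model, key=lambda name: score(model[name], pixel))
```

For example, `classify(train(TRAINING), (0.85, 0.85))` labels a pixel that is bright in both made-up bands. In practice a library SVM with regularization and a kernel choice would replace `train_binary`.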
![Page 21: The Matsu Wheel](https://reader033.vdocument.in/reader033/viewer/2022042220/5ec6a8f0b673cc1e1f66086e/html5/thumbnails/21.jpg)
New wheel analytic (beta):
Support Vector Machine (SVM) classifier
![Page 22: The Matsu Wheel](https://reader033.vdocument.in/reader033/viewer/2022042220/5ec6a8f0b673cc1e1f66086e/html5/thumbnails/22.jpg)
Continuing work
• Adapt the SVM classifier regionally to geographic area (classes depend on geography)
• Incorporate the SVM classifier into the Matsu Wheel
• Additional wheel analytics
• Web Map Service and tiling using GeoServer
• Add additional data to the Wheel
What you can do
• Make your data available to Project Matsu
• Port your analysis tools and applications
• Use the Matsu cloud to facilitate making discoveries that require integrating multiple large datasets
• Contribute a Wheel analytic