HDF5 OPeNDAP Project Update and Demo

Upload: dolf

Post on 09-Jan-2016

DESCRIPTION

HDF5 OPeNDAP Project Update and Demo. MuQun Yang and Hyo-Kyung Lee (The HDF Group), James Gallagher (OPeNDAP, Inc.). OPeNDAP Review: a software framework that allows simple access to remote data; Data Access Protocol; client-server model; widely used by the Earth Science community.

TRANSCRIPT

  • HDF5 OPeNDAP Project Update and Demo

    MuQun Yang and Hyo-Kyung Lee (The HDF Group)

    James Gallagher (OPeNDAP, Inc.)

    HDF and HDF-EOS Workshop XII, 10/17/2008

  • OPeNDAP Review

    A software framework that allows simple access to remote data

    Data Access Protocol

    Client-server model

    Widely used by the Earth Science community

  • OPeNDAP Concept

    [Diagram: typical remote access] FTP/HTTP, Visualization tools (e.g. grads)

    [Diagram: OPeNDAP access] Remote Data (HDF4), Handler (hdf4_handler), Server (apache), DAP Protocol (via http), Client Library (libnc-dap), Visualization Tools (gradsdap), View Data (User)

  • OPeNDAP HDF5 Handler

    [Diagram] Remote HDF5, HDF5 Handler, Server (apache), DAP Protocol (via http), View Data (User)

  • Challenges

    Mapping HDF5 to DAP

    Compound Datatype

    Groups

    Object/Regional References

  • OPeNDAP HDF5 Handler with HDF-EOS5

    [Diagram] Remote HDF5, Remote HDF-EOS5, HDF5 Handler, Server (apache), DAP Protocol (via http), Client Library (libnc-dap), Visualization tools (grads), View Data (User)

  • Challenges: HDF-EOS5

    Grid with no geolocation data

    Clients expect Grid with geolocation data

    Some attributes stored as extremely long string(s) (e.g. StructMetadata.0)

    Clients expect structured attributes

  • Tweaks for HDF-EOS5

    Added two default HDF5 handler configuration options:

    --enable-eos-grid

    --enable-eos-meta

  • Challenges: OPeNDAP Clients

    Need special attributes on dataset.

    Need shared geolocation variables.

  • Tweaks for OPeNDAP Clients

    Added two optional handler configuration options:

    --enable-short-name

    --enable-CF

  • OPeNDAP HDF5 Handler with HDF-EOS5

    [Diagram] Remote HDF5, Remote HDF-EOS5, HDF5 Handler, HDF5 Handler w/ CF options, Server (apache), DAP Protocol (via http), Client Library (libnc-dap), Visualization tools (grads), View Data (User)

  • Day After Server Tweaks

    Finally, Happy Clients!

  • Problems of Tweaking HDF5 Handler

    Remember that we added two optional HDF5 handler configuration options:

    --enable-short-name

    --enable-CF

    Causes ambiguity among variable names (e.g. /GroupA/ozone vs. /GroupB/ozone)

    Drops some key attributes (e.g. StructMetadata, HDF_ROOT_GROUP)

  • HDF5-Friendly OPeNDAP Client Library

    [Diagram] Remote HDF5, HDF5 Handler, HDF5 Handler w/ CF options, Server (apache), DAP Protocol (via http), Client Library (libnc-dap), Visualization tools (grads), View Data (User)

    [Diagram] HDF5 Groups, HDF5-Friendly Client Library (liboc-dap), View Groups???

  • Example: Groups in HDF5

    Traditional OPeNDAP client library: It's an attribute that I don't understand. I'll ignore it.

    HDF5-Friendly OPeNDAP client library: I was waiting for this key attribute to re-construct HDF5

  • Example: Reference in HDF5

    Important for NPOESS

    Object / Regional Reference

    Map to DAP URL

  • Example: dap2h5

    A test application for the client library

    It can construct HDF5 from DAP output

    [Diagram] Remote HDF5 (Group/Ref.), HDF5 Handler, Server (apache), DAP Protocol (via http), HDF5-Friendly Client Library (liboc-dap), HDF5-F. Client Library App. (dap2h5), View Group/Ref., View Data (User)

  • One more reason: Help Clients to view Swath

    [Image labels] Our Client Library (prototype); No Latitude and Longitude; Courtesy of NASA

  • Visualizing HDF-EOS5 Grids

    [Diagram] Remote HDF5, Remote HDF-EOS5, HDF5 Handler, HDF5 Handler w/ CF option, Server (apache), DAP Protocol (via http), Client Library (libnc-dap), Visualization tools (grads), View Data (User)

    Visualizing HDF-EOS5 Swath

    [Diagram] Swath, Problem???, HDF5-Friendly Client Library (liboc-dap), Visualization Tools (gradsoc), View Swath

  • Demo: MLS swath via GrADS

    GrADS coupled with our client library

    The client library provides grid mapping from swath data

    GrADS displays swath through HDF-EOS5 specific client library API calls

  • Summary of Client Library (Prototype)

    Finished coding

    Tested with the GrADS visualization client

    Documented

    Working on a demo DAP to HDF5 tool by using the client library prototype

    Will test with NCL if time allows

  • Caution

    Our Client Library is ONLY A PROTOTYPE!

    It does NOT support all DAP data types

    It does NOT support all AURA files

    It does NOT support all Visualization clients

  • Summary

    HDF5 Access via OPeNDAP: is easy; is used by GES DISC to serve Aura files

    HDF5-Friendly OPeNDAP Client Library: serves HDF5 better (EOS swath); benefits visualization clients

  • Future Work

    HDF5 to DAP2 Mapping Document

    Release HDF5-friendly OPeNDAP Client Library (Prototype)

    URL: http://hdfgroup.org/projects/opendap

  • Credits

  • Acknowledgement

    This work was supported by Cooperative Agreements with the National Aeronautics and Space Administration (NASA) under NASA grants NNX06AC83A, NNX08A077A and NNX06AG75A.

    Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of NASA.

    We'll present the HDF5 OPeNDAP project update and demo. This is a joint project between The HDF Group and James Gallagher at OPeNDAP, Inc. We'll give an introduction to OPeNDAP and then show how HDF5 can be served effectively through OPeNDAP.

    DAP is simply a protocol, like FTP, and OPeNDAP is an open and free software implementation of DAP. It follows a client-server model and provides APIs for both sides. The biggest advantage of OPeNDAP is that it is very popular in the Earth Science community.

    This is how remote data can be accessed via OPeNDAP. Normally, users view local scientific data files with visualization tools like GrADS. With DAP, the same can be done for remote files. First, DAP provides both a generic server and a generic client. For each data format, such as HDF or NetCDF, the server needs an additional component (a handler) that maps the data into the standard DAP format. Next, once existing visualization tools are modified to use the OPeNDAP client library, they can show remote data on demand. The benefit of OPeNDAP is clear when you want to view a small dataset: you don't have to download the entire file from the server.
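The point about not downloading the entire file can be illustrated with a DAP2 constraint expression, which asks the server for a slice of one variable. The following is a minimal sketch, not code from the project: the server URL, file name, and variable name are made up for illustration, and the helper `dap2_subset_url` is hypothetical.

```python
from urllib.parse import quote


def dap2_subset_url(dataset_url, variable, slices, response="ascii"):
    """Build a DAP2 request URL for a hyperslab of one variable.

    slices holds one (start, stride, stop) tuple per dimension;
    stop is inclusive, as in DAP2 constraint expressions.
    """
    hyperslab = "".join(
        f"[{start}:{stride}:{stop}]" for start, stride, stop in slices
    )
    # Keep the bracket/colon syntax readable; escape anything unusual.
    constraint = quote(variable + hyperslab, safe="[]:/._")
    return f"{dataset_url}.{response}?{constraint}"


# Hypothetical dataset and variable, used only to show the URL shape.
url = dap2_subset_url(
    "http://example.org/opendap/data/sample.h5",
    "ozone",
    [(0, 1, 9), (0, 1, 4)],
)
print(url)
```

Only the requested 10 by 5 block of the hypothetical `ozone` variable would travel over the network; the rest of the file stays on the server.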

    This diagram explains the need for an HDF5 handler in OPeNDAP.

    The first effort was made in 2001, and we received a NASA grant in 2006 to make it a robust product. Mapping HDF5 to DAP required matching HDF5 objects with DAP objects. We enhanced the prototype server by adding support for HDF5 objects such as groups, compound datatypes and references. The first product was released in March 2008 through the OPeNDAP site.

    After creating the handler, we tried HDF-EOS5 files and met some challenges in visualizing the data via OPeNDAP. We tried to find a good solution.

    The datasets in the HDF-EOS5 files that NASA produces cannot be served directly through DAP in a form that visualization clients like. Since HDF5 data producers like NASA don't necessarily have to keep DAP in mind, they created HDF-EOS5 files in whatever way they found convenient.

    Thus, we believe it's our job to turn the raw HDF5 data into a form that clients will like. --enable-eos-grid processes the raw data into a Grid that clients can consume easily. --enable-eos-meta chops the long string into a better format that clients can handle.
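To give a feel for what turning one long metadata string into structured attributes involves, here is a minimal sketch. It is not the handler's implementation: the sample text is invented, modeled on the GROUP/END_GROUP layout used by HDF-EOS structural metadata, and `parse_odl` is a hypothetical helper.

```python
# Invented sample, modeled on the GROUP/END_GROUP layout of HDF-EOS metadata.
SAMPLE = """
GROUP=GridStructure
  GROUP=GRID_1
    GridName="ColumnAmountO3"
    XDim=1440
    YDim=720
  END_GROUP=GRID_1
END_GROUP=GridStructure
END
"""


def parse_odl(text):
    """Turn GROUP/OBJECT blocks with KEY=VALUE lines into nested dicts."""
    root = {}
    stack = [root]  # stack[-1] is the block currently being filled
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line == "END":
            continue
        key, _, value = line.partition("=")
        key, value = key.strip(), value.strip()
        if key in ("GROUP", "OBJECT"):
            child = {}
            stack[-1][value] = child
            stack.append(child)
        elif key in ("END_GROUP", "END_OBJECT"):
            stack.pop()
        else:
            # Values stay as strings; surrounding quotes are removed.
            stack[-1][key] = value.strip('"')
    return root


meta = parse_odl(SAMPLE)
print(meta["GridStructure"]["GRID_1"]["XDim"])
```

A client that receives the nested form can look up a grid dimension directly instead of scanning one very long string.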

    Some OPeNDAP clients are very picky about what a DAP server provides. They may even ask for something that the original HDF5 file doesn't have in its attributes. They simply reject what the standard DAP protocol allows.

    Thus, we provided some configuration options at installation time. These can make even the pickiest OPeNDAP client happy. However, enabling these options is risky because you may no longer be able to serve some HDF5 datasets through the server.

    Now we have succeeded in visualizing HDF-EOS5 files with the hdf5 handler tweaks.

    So after applying the tweaks for clients and HDF-EOS5, our server can serve many clients.

    There are some problems with the experimental optional tweaks. First, using short names can cause ambiguity. For example, two variables will both be named ozone once their names are shortened without the group path information. Second, some key attributes are dropped to help many clients visualize the data.

    For example, in the current hdf5 handler implementation, the group information cannot be served with the tweaked handler. So, if we switch back to the default handler, we need a special client library to handle the group information. This diagram illustrates the challenge and the need for an HDF5-friendly OPeNDAP client library.
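The short-name ambiguity mentioned above is easy to reproduce. The sketch below is illustrative only: `find_short_name_collisions` is a hypothetical helper, and the third path is made up to contrast with the two ozone variables from the slide.

```python
from collections import defaultdict


def find_short_name_collisions(paths):
    """Group full HDF5 paths by their last component and report clashes."""
    by_short = defaultdict(list)
    for path in paths:
        short = path.rsplit("/", 1)[-1]  # drop the group path
        by_short[short].append(path)
    return {s: sorted(p) for s, p in by_short.items() if len(p) > 1}


paths = ["/GroupA/ozone", "/GroupB/ozone", "/GroupA/temperature"]
collisions = find_short_name_collisions(paths)
print(collisions)  # {'ozone': ['/GroupA/ozone', '/GroupB/ozone']}
```

Any non-empty result means that shortening names would leave a client unable to tell the listed variables apart.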

    Here's one really good example. In DAP, an attribute plays a role like a comment in computer programming. Although DAP doesn't have any concept of a group, groups may play an important role in HDF5, such as disambiguating variables with the same name. The entire group structure is sent as a single attribute in DAP. Thus, if a user wants to reconstruct HDF5 using a DAP client, this group attribute is essential and an HDF5-friendly OPeNDAP client library should handle it properly.
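As a rough illustration of what handling the group attribute could mean on the client side, the sketch below walks a nested structure and lists the groups and datasets to recreate, parents first. The nested-dict layout is an assumption made for this example, not the format the handler actually emits, and `walk_groups` is a hypothetical helper.

```python
def walk_groups(tree, prefix=""):
    """Yield (kind, path) pairs, with each group before its members."""
    for name, child in tree.items():
        path = f"{prefix}/{name}"
        if isinstance(child, dict):
            yield ("group", path)
            yield from walk_groups(child, path)
        else:
            yield ("dataset", path)


# Assumed layout: dict values are groups, None marks a dataset.
group_attribute = {
    "GroupA": {"ozone": None},
    "GroupB": {"ozone": None, "Sub": {"co": None}},
}

plan = list(walk_groups(group_attribute))
for kind, path in plan:
    print(kind, path)
```

A tool in the spirit of dap2h5 could follow such a plan to create each group before writing the datasets inside it, which keeps /GroupA/ozone and /GroupB/ozone distinct.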

    Another good example is the reference in HDF5. There are two types of reference in HDF5: object and regional. Both can be mapped to a special data type called URL in DAP. However, the current OPeNDAP client library doesn't support dereferencing the URL. That is, there is no way for an OPeNDAP client to access the dataset that the URL points to. This is particularly important for NPOESS data, since it uses many HDF5 references.

    To prove our client library concept, we created a demo application called dap2h5. As a prototype, it has limited capability, but this demo shows how to reconstruct group information and datasets from DAP output successfully.

    Since it is a prototype, attributes on groups are ignored by dap2h5 at this point.

    There's another reason that we want to pursue an HDF5-friendly OPeNDAP client library.

    A swath is a 3-D scan of a very small region of the earth. Aura scientists are less interested in seeing 2-D Grid data than in seeing a vertical profile of a swath like this.

    However, among the six OPeNDAP clients that we tried, only one client could display a vertical profile of swath data properly.

    The main reason is that many DAP clients impose unnecessary restrictions on the format that the DAP protocol produces. One such restriction is the CF convention that we discussed before. Thus, our goal is to build a client library that lets visualization clients deal easily with the data output from DAP.

    To view a swath, you need to do some swath-to-grid mapping, and it can be done at the client library level. This is an added benefit of the client library. By modifying GrADS slightly to use our library, we could display MLS swath data directly.
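One simple way to picture swath-to-grid mapping is to average the swath samples that fall into each cell of a regular latitude/longitude grid. The sketch below does exactly that; it is not the algorithm used by the prototype library, `swath_to_grid` is a hypothetical helper, and the sample values are made up.

```python
import math


def swath_to_grid(points, lat_step, lon_step):
    """Average (lat, lon, value) samples into regular grid cells.

    Returns a dict mapping (lat_index, lon_index) to the mean value,
    with indices counted from latitude -90 and longitude -180.
    """
    sums = {}
    counts = {}
    for lat, lon, value in points:
        cell = (
            math.floor((lat + 90.0) / lat_step),
            math.floor((lon + 180.0) / lon_step),
        )
        sums[cell] = sums.get(cell, 0.0) + value
        counts[cell] = counts.get(cell, 0) + 1
    return {cell: sums[cell] / counts[cell] for cell in sums}


# Made-up samples: two fall in the same 1-degree cell, one in another.
samples = [(39.9, 116.4, 2.0), (39.6, 116.1, 4.0), (41.2, 116.4, 1.0)]
grid = swath_to_grid(samples, 1.0, 1.0)
print(grid)
```

Cells that receive no samples are simply absent from the result, so a display tool would need to treat them as missing data.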

    This demo shows the carbon monoxide level near Beijing before and during the Olympics. The CO level is high at first but later goes down significantly due to air quality control enforcement by the Chinese government. After the Olympics, it goes up again.

    Here's a summary of our effort on the client library.

    Unfortunately, it is only a prototype, so it cannot support everything that HDF5/HDF-EOS5 can provide through DAP.

    Here's a summary of our project.

    Providing OPeNDAP-style access to HDF5 is easy, efficient and cool. However, it can lose some information that the HDF5 file originally has, due to the nature of the DAP transformation. This requires either modifying DAP or creating an HDF5-friendly client. Since it's hard to modify the well-established DAP, we think it's easier to implement an HDF5-friendly OPeNDAP client library. We showed that, when done properly, the client library can serve HDF5 files better and help visualization clients visualize HDF-EOS5 data directly via DAP.

    Here's our future work.

    First, we'll provide a detailed document on the mapping between HDF5 and DAP.

    Second, we'll finish and release the HDF5-Friendly OPeNDAP Client Library prototype and test it with one more client, such as NCL.

    By the way, the library will remain only a prototype due to limited funding.

    We'd like to thank these people. They have given us the right direction in development, early access to data files and plenty of feedback.