

CCE LTER: Information Management (2004-2006)

K.S. Baker, S.J. Jackson, and J.R. Wanetick, 2005. Strategies Supporting Heterogeneous Data and Interdisciplinary Collaboration: Towards an Ocean Informatics Environment. Proceedings of the 38th Hawaii International Conference on System Sciences (HICSS38), IEEE Computer Society, 2-6 January 2005, Big Island, Hawaii.

H. Karasti, K.S. Baker, and E. Halkola, in press. Enriching the Notion of Data Curation in e-Science: Data Managing and Information Infrastructuring in the Long Term Ecological Research (LTER) Network. In M. Jirotka, R. Procter, T. Rodden, and G. Bowker (eds). Computer Supported Cooperative Work: An International Journal. Special Issue: Collaboration in e-Research.

K.S. Baker and K. Stocks, 2007. Building Environmental Information Systems: Myths and Interdisciplinary Lessons. Proceedings of the 40th Hawaii International Conference on System Sciences (HICSS), 3-6 January 2007, Big Island, Hawaii, pp. 1-10, IEEE Computer Society, New Brunswick, NJ.

K.S. Baker and F. Millerand, 2007. Articulation Work Supporting Information Infrastructure Design: Coordination, Categorization, and Assessment in Practice. Proceedings of the 40th Hawaii International Conference on System Sciences (HICSS), 3-6 January 2007, Big Island, Hawaii, pp. 1-10, IEEE Computer Society, New Brunswick, NJ.

Eventlogger

Karen S. Baker, Lynn Yarmey, Mason Kortz, Shaun Haber, Jerry Wanetick, and Florence Millerand
University of California San Diego, Scripps Institution of Oceanography and University of Quebec, Montreal

2.0

http://cce.lternet.edu

Information Infrastructure Strategies

Collaborative Environment: Partnerships

CCE Site Environment Field Environment

metadata work

incorporation & format

eventlogging

visualizing

sampling

web delivery

quality control

PRE-PROCESSING

local sharing

acquisition planning & data scoping

data transport

data exchange

repository storage

documenting

The California Current Ecosystem LTER is a coastal upwelling biome off the coast of California. An interdisciplinary group is working to understand and communicate the effects of long-term climate variability on the California Current pelagic ecosystem. The CCE site became part of the LTER network in 2004 and is based at the Scripps Institution of Oceanography - University of California, San Diego.

An automated Eventlog System is a step towards integrating diverse measurements through a common time and location stamp within a shared authoritative file created as data is collected. It is in development, prototyped on a series of cruises with design modifications introduced after each deployment.
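As an illustration of the idea of a common time and location stamp in one shared file, here is a minimal Python sketch. It is not the Eventlog System itself: the field names, the CSV layout, the instrument names, and the sample dates and positions are all assumptions made for this example.

```python
import csv
import io
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Event:
    """One logged sampling event (field names are illustrative assumptions)."""
    time_utc: datetime
    latitude: float   # decimal degrees, north positive
    longitude: float  # decimal degrees, east positive
    instrument: str
    action: str


FIELDS = ["time_utc", "latitude", "longitude", "instrument", "action"]


def append_event(log, event):
    """Append one event as a CSV row to the shared log (any text file object)."""
    writer = csv.writer(log, lineterminator="\n")
    writer.writerow([
        # Every row uses the same UTC stamp format, whatever the instrument.
        event.time_utc.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        f"{event.latitude:.4f}",
        f"{event.longitude:.4f}",
        event.instrument,
        event.action,
    ])


# Hypothetical usage: two instruments write to the same log.
log = io.StringIO()  # stands in for the shared file on the ship
log.write(",".join(FIELDS) + "\n")
append_event(log, Event(datetime(2006, 5, 10, 14, 30, tzinfo=timezone.utc),
                        33.5, -120.25, "CTD", "deploy"))
append_event(log, Event(datetime(2006, 5, 10, 15, 5, tzinfo=timezone.utc),
                        33.5, -120.25, "bongo net", "recover"))
print(log.getvalue())
```

Because every row carries the same stamp format, measurements from different instruments can later be joined on time and position without per-instrument parsing.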

The CCE LTER is building a contemporary information environment - Ocean Informatics - focusing on participant engagement, process-building, and local design. Ocean Informatics is a community of practice emerging to meet the challenges of articulating requirements and collaborative design in support of heterogeneous data collections and information management practices. Our focus is on developing processes that recognize intertwined technological, organizational, and social factors inherent to design work. Our goal is to create an adaptive information infrastructure that facilitates long-term science (Baker, Jackson, Wanetick, 2006).

Ocean Informatics Environment

LTER and Data Stewardship

Data Management Elements

quality assurance

DATA COLLECTING

data ingestion

PROCESSING

LTER IM/All-Scientists Meeting, Sep 2006

DATA SHARING

archival storage

Design Environment

Collaborative Data Environment

Who are the data stewards?

Are there other key elements in your design environment?

LTER represents a unique setting for data stewardship, characterized and challenged by a long-term science perspective coupled with an open data sharing policy for primary data and a highly distributed interdisciplinary collaborative environment. Understandings of the extent and scope of data stewardship are beginning to unfold.

Data stewardship involves recognizing the value of maintaining established data collections. Local data stewardship guides data management efforts, engagement, and planning for change alongside active scientific programs, so that data is available for local research, yet research and network science inform data handling and data management activities (Karasti, Baker and Halkola, in press).

Shared tools: apache, php, html, css

Shared tool use: forms, plotting, and visualization

Gather & share site content and data for participants, partners, and public

Process cruise page, grid calculators, dynamic elements such as mapper, bibliographic harvest file

Design and support of project web presence

Web http://cce.lternet.edu

Two-component architecture model: highly structured (i.e. automated navigation and data exchange) and less structured (table specific local needs)

Focus on open source architecture and tools: mySQL, php, templates, xml and eml

Database schema design and ties to web delivery

Establish a cross-project data system that is a flexible yet queryable data and metadata repository capable of online web delivery

DataZoo

Shared tools: mySQL, perl

Shared architecture: design for community-wide data and user needs using shared tools

Create reusable code and shared data structures; provide ease of data entry into system as well as immediate mapping to local and community standards and formats
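The mapping of local entries to community standards can be pictured with a small unit dictionary. The sketch below is in Python and is only a hypothetical illustration: the unit labels, the conversion factors chosen, and the function name `to_community_unit` are assumptions for this example, not the contents of the DataZoo dictionaries.

```python
# Hypothetical unit dictionary: local label -> (community label, multiplier).
# A value in the local unit times the multiplier gives the community unit.
UNIT_DICTIONARY = {
    "degC": ("celsius", 1.0),
    "deg C": ("celsius", 1.0),
    "m": ("meter", 1.0),
    "cm": ("meter", 0.01),
    "ug/L": ("microgramsPerLiter", 1.0),
    "mg/m3": ("microgramsPerLiter", 1.0),  # 1 mg per cubic meter = 1 ug per liter
}


def to_community_unit(value, local_unit):
    """Convert a value from a local unit label to its community unit."""
    try:
        community_unit, multiplier = UNIT_DICTIONARY[local_unit]
    except KeyError:
        # Unknown labels are rejected so they can be added to the dictionary
        # deliberately rather than passing through unmapped.
        raise KeyError(f"no mapping for local unit {local_unit!r}") from None
    return value * multiplier, community_unit


# Hypothetical usage at data entry time.
depth, depth_unit = to_community_unit(250, "cm")
print(depth, depth_unit)
```

Keeping the mapping in one shared dictionary means a new local label is registered once and then applies to every dataset that uses it.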

Bibliographic, media gallery, personnel directory, unit and attribute dictionaries and mappings

Create extensible information system elements

Shared Modules

Cruise planning, resource planning, science-information management joint planning

Partner with those carrying out infrastructure studies, social informatics, design studies, and ethnographic work

Design schema working group, user groups, and reading groups

Information exchange events, disk sharing, blogs, forums, shared storage arrays, web services

Examples

Research and prototype informatics techniques valuable for ongoing work with field research programs while establishing and maintaining a bidirectional dialogue between domain scientists and data workers

Introduce interdisciplinary collaborative projects with capacity to inform development of infrastructure

Design studio, design teams, and communities of practice

Identify shared resources and facilitate community building by focusing on communication and mutual learning framed by codesigning joint tasks

Infrastructure

Elicit and coordinate community-wide needs through dialogue, developing working standards, documentation, and self-assessment

Identify, design, develop, deploy, enact, and support improved data planning and processes

Data Work

Engage in participatory action research with social scientists with expertise in science studies, ethnography, communication, and human-computer interface

Identify and align technical, social, organizational and community elements that are part of the data processes

Articulation Work

Use collaborative design, comparative analysis, and shared resources

Augment and prototype shared services (storage, computation, web services) and collaborative technologies (content management systems, WebDAV remote mount disks)

Mechanisms

Build long-term (adaptive) information infrastructure

Enhance project data handling capabilities by creating a cross-project informatics environment

Goals

Sociotechnical Design Strategies

Collaborative Strategies

Elements

References

Short-Term Perspective        Long-Term Perspective
----------------------        ---------------------
Technology solution driven    Science inquiry driven
Digital maintainability       Data sustainability
Data deluge concerns          Data sharing concerns
Data grid structures          Information infrastructure arrangements
Metadata enactment            Data description development
Data curation procedures      Data stewardship practices

Scientific Timeframes and their Features (Karasti et al, 2006)