An Ecosystem Approach to Data Services and Digital Research Objects

Jim Myers (myersjd@umich.edu)

Managing Digital Research Objects in an Expanding Science Ecosystem, November 15, 2017

Cooperative agreement #OCI0940824

The National DATA SERVICE Consortium
Why do we need “Data Patriotism”?

The future needs your data!

“And so, my fellow Americans, ask not what your data can do for you, ask what you can do for your data”

Only you can prevent data loss!

Why?

• Researchers manage data on their own during projects and then we ask them to do it again:
  – When they are busy
  – In unfamiliar software, using different terminology
  – For the potential benefit of others
    • who aren’t yet ready, and
    • before the software to leverage rich data exists
  – While telling them they aren’t doing a good enough job…
  – Without giving them credit
  – Without giving them any guarantee of longevity for their data or assurance that their data will be part of the ecosystem

SEAD: Sustainable Environment – Actionable Data

• Started in October 2011 as part of the NSF DataNet program

• An international resource for sustainability science

• A provider of light-weight Data Services based on novel technical and business approaches:
  – Adopt an integrated lifecycle view to create value
  – Virtuous cycle
    • providing immediate, incremental value
    • supporting integration and extension
  – Supporting the long-tail of research
  – Scaling/low operating costs

http://sead-data.net/

Margaret Hedstrom, PI; Praveen Kumar, co-PI; Jim Myers, co-PI; Beth Plale, co-PI

SEAD Data Services: Start today!

Sign up and request a secure, branded Project Space in the cloud…

SEAD Data Services: Upload and Share

Drag-and-drop your data, or upload 100K+ files from disk

Preview, share, analyze…

Extracted Metadata

SEAD Data Services: Annotate, Link, Use!

Tags, Formal Vocabularies, Full Text Indexing

Relationship labels shown in the figure: Cited in, Has Correction, Uses Calibration, Generated Using, Uses Procedure

http://sedexp.net/wiki/p7

Your software can annotate for you with the RESTful API

Customize to use any community vocabulary(ies)
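
As a rough illustration of what software-driven annotation might look like, the sketch below builds a JSON-LD style annotation body in Python. Everything beyond the relationship labels and the URL taken from this slide is an assumption: the vocabulary namespace, the term IRIs, the dataset identifier, the tags, and the `build_annotation` helper are hypothetical. The actual SEAD REST API routes and payload format are not given in these slides, so no request is sent.

```python
import json

# Hypothetical vocabulary namespace; a project would substitute its own
# community vocabulary here.
EXAMPLE_VOCAB = "http://example.org/vocab/"

# Relationship labels taken from the slide, mapped to made-up term IRIs.
PREDICATES = {
    "Cited in": EXAMPLE_VOCAB + "citedIn",
    "Has Correction": EXAMPLE_VOCAB + "hasCorrection",
    "Uses Calibration": EXAMPLE_VOCAB + "usesCalibration",
    "Generated Using": EXAMPLE_VOCAB + "generatedUsing",
    "Uses Procedure": EXAMPLE_VOCAB + "usesProcedure",
}


def build_annotation(dataset_id, label, target, tags=()):
    """Return a JSON-LD style dict linking dataset_id to target via label."""
    if label not in PREDICATES:
        raise ValueError(f"unknown relationship: {label}")
    return {
        "@context": {
            label: {"@id": PREDICATES[label], "@type": "@id"},
            "tags": EXAMPLE_VOCAB + "tag",
        },
        "@id": dataset_id,
        label: target,
        "tags": sorted(tags),
    }


annotation = build_annotation(
    "urn:example:dataset-1",        # hypothetical dataset identifier
    "Uses Procedure",
    "http://sedexp.net/wiki/p7",    # URL shown on the slide
    tags={"flume", "sediment"},     # illustrative tags only
)
payload = json.dumps(annotation, indent=2)
print(payload)
```

The resulting JSON string is the kind of body a client could send to an annotation endpoint; swapping `EXAMPLE_VOCAB` for a community vocabulary is the only change needed to use different terms.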

IU SDA

SEAD Data Services: Publish and Catalog

• Push the buttons!
  – Find a Repository
  – Match Their Requirements
  – Submit Your Data

• Publication Includes:
  – Persistent Data ID (i.e. DOI) with discovery metadata
  – Repository-specific storage, or
  – Lightweight, standards-based package for long-term storage (BagIT, OAI-ORE, JSON-LD) @ IU, NDS
    • Web DOI landing page
    • Data, metadata, license, fixity info
  – Registration with DataOne Catalog
  – Branded “Published Data” page in your Space to link in your website
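
To make the "lightweight, standards-based package" and "fixity info" items concrete, here is a minimal sketch of a BagIt-style bag written with the Python standard library. It is not SEAD's packaging code: the helper names `make_bag` and `verify_bag`, the sample file, and the choice of SHA-256 and BagIt version 1.0 are assumptions, and the OAI-ORE/JSON-LD map that a published package would also carry is omitted.

```python
import hashlib
import tempfile
from pathlib import Path


def make_bag(bag_dir, files):
    """Write a minimal BagIt-style bag: payload under data/ plus a manifest.

    files maps a relative payload path (str) to its content (bytes).
    """
    bag = Path(bag_dir)
    bag.mkdir(parents=True, exist_ok=True)
    manifest_lines = []
    for rel_path, content in sorted(files.items()):
        target = bag / "data" / rel_path
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(content)
        digest = hashlib.sha256(content).hexdigest()
        manifest_lines.append(f"{digest}  data/{rel_path}")
    # Bag declaration file defined by the BagIt specification.
    (bag / "bagit.txt").write_text(
        "BagIt-Version: 1.0\nTag-File-Character-Encoding: UTF-8\n",
        encoding="utf-8",
    )
    (bag / "manifest-sha256.txt").write_text(
        "\n".join(manifest_lines) + "\n", encoding="utf-8"
    )
    return bag


def verify_bag(bag_dir):
    """Recompute each payload checksum and compare it with the manifest."""
    bag = Path(bag_dir)
    manifest = (bag / "manifest-sha256.txt").read_text(encoding="utf-8")
    for line in manifest.splitlines():
        if not line.strip():
            continue
        expected, rel_path = line.split(maxsplit=1)
        actual = hashlib.sha256((bag / rel_path).read_bytes()).hexdigest()
        if actual != expected:
            return False
    return True


with tempfile.TemporaryDirectory() as tmp:
    # Sample payload content is made up for illustration.
    bag = make_bag(tmp, {"readings.csv": b"site,dbh_cm\nA,45\n"})
    fixity_ok = verify_bag(bag)
    print(fixity_ok)
```

Because the bag is only files and checksums, it can be validated years later without the software that produced it, which is the property that keeps long-term maintenance small.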

Maintaining long-term access to SEAD-published data @ NDS requires maintaining ~2000 lines of code

http://doi.org/10.5967/M0Z0368W
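
The "Find a Repository / Match Their Requirements" step on this slide can be pictured as a simple rule check. The sketch below is one hypothetical way to express it; the `Repository` and `Submission` classes, the two kinds of rule (size limit and required metadata fields), and the repository names are invented for illustration and do not describe SEAD's actual matchmaker.

```python
from dataclasses import dataclass, field


@dataclass
class Repository:
    name: str
    max_total_bytes: int                       # largest submission accepted
    required_metadata: set = field(default_factory=set)


@dataclass
class Submission:
    total_bytes: int
    metadata: dict


def match(submission, repositories):
    """Return {repo name: list of unmet requirements}; empty list = match."""
    report = {}
    for repo in repositories:
        problems = []
        if submission.total_bytes > repo.max_total_bytes:
            problems.append("too large")
        missing = sorted(repo.required_metadata - set(submission.metadata))
        problems.extend(f"missing {key}" for key in missing)
        report[repo.name] = problems
    return report


# Hypothetical repositories and submission.
repos = [
    Repository("Repo A", 10**9, {"title", "creator"}),
    Repository("Repo B", 10**12, {"title", "license"}),
    Repository("Repo C", 10**12, {"title"}),
]
sub = Submission(
    total_bytes=5 * 10**9,
    metadata={"title": "Plot survey", "creator": "J. Doe"},
)
report = match(sub, repos)
for name, problems in report.items():
    print(name, "OK" if not problems else problems)
```

Reporting the unmet requirements, rather than a bare yes/no, tells the researcher what to fix before submitting.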

Rich Data Objects in Forest Research

Judy Cushing, Michelle Wallace, Noah Weiner, Nalini Nadkarni, Sharon McIntee, Anne McIntosh, Peter Lynn

SEAD: Jim Myers, Anna Ovchinnikova

Custom Databank Database Generator: DB, entry forms, dictionary, EML output

Populated Project Databases for 11 sites

Image Gallery: 1300+ field images and visualizations

CanopyView: Interactive visualization tool - tree structure, canopy coverage, db fields …

Project Website: Containing extensive metadata, documentation, software

Data Rescue Project: Plan and artifacts from the effort to organize and publish this research

~650 metadata entries (DC, PROV, ODM, custom) describing and linking data files in the collections and referencing external resources via DOI, ORCID, and URLs
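
A minimal sketch of how such entries might be held as subject/predicate/value records, using real Dublin Core Terms and W3C PROV predicate IRIs. The file names, ORCID iD, DOI, and the `describe` and `external_links` helpers are placeholders invented for this example, not records from the actual collection.

```python
from collections import namedtuple

# Namespace IRIs for Dublin Core Terms and W3C PROV-O.
DCTERMS = "http://purl.org/dc/terms/"
PROV = "http://www.w3.org/ns/prov#"

Entry = namedtuple("Entry", ["subject", "predicate", "value"])

# All subjects and values below are placeholders, not real records.
entries = [
    Entry("data/plots.csv", DCTERMS + "creator",
          "https://orcid.org/0000-0000-0000-0000"),
    Entry("data/plots.csv", PROV + "wasDerivedFrom", "data/field_notes.pdf"),
    Entry("data/plots.csv", DCTERMS + "references",
          "https://doi.org/10.0000/example"),
    Entry("data/field_notes.pdf", DCTERMS + "description",
          "Scanned field notebook"),
]


def describe(subject, entries):
    """Group every predicate/value pair recorded for one file."""
    result = {}
    for e in entries:
        if e.subject == subject:
            result.setdefault(e.predicate, []).append(e.value)
    return result


def external_links(entries):
    """Return values that point outside the collection (http/https URLs)."""
    return sorted(
        {e.value for e in entries
         if e.value.startswith(("http://", "https://"))}
    )


print(describe("data/plots.csv", entries))
print(external_links(entries))
```

Keeping every statement in the same three-part shape is what lets standard vocabularies and custom terms sit side by side in one collection.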

Manuals

Software Installers

Archival Formats

Location

Methods

Database Schema

Reports

Creators

Cyberinfrastructure development and 11 projects characterizing the composition, density, surface area, biomass, and spatial distribution of trees, saplings, and understory vegetation.

Queryable DB VMs

Visualizations

Local Lodging

Nearest Hospital

Locked Gates

Site Characterization

“Walk up from where you parked at the beginning of the plantation to around the first bend and look for a tree (~45 cm dbh) on the north side of the road that has a silver rectangular tag….”

Collection Analysis & QA

Examples from related projects

• http://terraref.org/ (LeBauer) – robotic field sensors and high-throughput phenotype analytics

• SEADTrain (Plale) – Internet-of-Things demonstration of direct publication of IoT data using RDA standards

• Sediment Experimentalists Network (SEN) Knowledge Base – equipment, method, facility information that can be linked with data published through SEAD

SEAD Data Services: Best-effort Operations

• Initial Community: Sustainability Research (Ecological, Social)
  – Large centers to grad students and county park managers
  – 3M+ files, 4 TB+, 40+ groups
  – Rescues & new data
  – 50+ Publications
    • <1 MB – 0.6 TB
    • 1 – 135K files
    • Cited links to/from high-impact journals
    • Basic metadata to rich provenance and documentation

• Related projects: >0.6 PB, 100’s of publications

• Continuing best-effort:
  – Open to researchers in long tail of research projects needing
    • hosted data services
    • core share/curate/publish capabilities for custom CI
  – Operating through voluntary contributions, related grants, and best-effort support from National Data Service members

An Ecosystem Approach

• Infrastructure should provide current value and support/catalyze the creation of new value by third parties (researchers, other infrastructure efforts)

• Some types of infrastructure play the role of keystone species in ecosystems, helping define and stabilize the character of the ecosystem (but not being the most visible or populous species)

• It’s time to replace FUD and Catch-22’s that limit adoption with base capabilities that can support an ecosystem-wide virtuous cycle!

James Myers (2017). Where's My Universal Data Browser? Research Data Management Implementations Workshop, Arlington, VA, Sept. 14-15, 2017

Thank you!

• Acknowledgements:
  – SEAD, NCED, SEN, NDS and other active projects that have provided guidance, feedback, and support

• For more information:
  – http://sead-data.net/
  – https://sead2.ncsa.illinois.edu/
  – http://www.nationaldataservice.org/

SEAD as Infrastructure:

Sharing, Curation, Publication, Reuse

Web App / REST API (Scala / MongoDB / ElasticSearch)

Data / MD

BrownDog Svcs (RabbitMQ)

EarthCube Geosemantics Services

SEAD Publishing: Matchmaking, Publishing, CI

Ecosystem Integration (Profiles, papers, catalogs, events, provenance, …)

REST API (Java / MongoDB)

Computational Environments

Tools

Apps

Browser

Long-term Repositories

Reference & SDA Publishers

DOI Landing Pages

Web app (Java / JavaScript)

Publishing Agent App

File System / Cloud Store

Domain Cyberinfrastructures

Profiles / Pubs

Profile Sources, SSO

Catalogs (DataOne, …)

Third-Party Repositories

SEAD Project Spaces (Clowder)

SEAD Interacts with:

• Projects & their websites
• Authentication services (Google, ORCID, local, …)
• Researcher Profile Services (ORCID, (VIVO), …)
• Data Sources (TerraPop, NEON, ‘any’, …)
• Data Processors (BrownDog, Geoserver, image/video players, …)
• Repositories (Dspace, Fedora, Cloud, openICPSR, …)
• Discovery Services (DataOne, DataCite, …)
• Applications/Services (R, ECube Geosemantics, VIC/DFC, …)
• National Data Service Universally Accessible Data Publications Pilot
  – sign up now!

-- without deep agreement on architectural/model details
-- with mechanisms to help interoperability/synthesis