why are e-infrastructures useful from a small business perspective?
DESCRIPTION
Slides of a talk at a seminar for the EuroRIs network (http://www.euroris-net.eu) of National Contact Points (NCPs) for EU funding programmes on Research Infrastructures.
TRANSCRIPT
Nikos Manouselis Agro-Know Technologies
Why are e-Infrastructures useful from a small business perspective?
intro
“The future belongs to the companies that turn data into products”
We help organizations and people to address societal and environmental challenges using solutions that are informed and enhanced by high-quality data
We develop and put into real practice end-to-end, modular solutions that transform data into meaningful knowledge and services
Our values
• use open data to solve meaningful societal challenges
• create a data-powered ecosystem that may bootstrap agricultural & food innovation
• embrace all data sources, formats & types relevant to agricultural research & innovation
• promote open source and open data
Our vision
To add value to the rich information available in the wide spectrum of agricultural and biodiversity sciences
To make it universally accessible, useful and meaningful, through innovative tools, services and applications
Unorganized Content in local and remote sites
Widgets
Authoring services
Data Discovery Services
Analytics services
Agro-Know Data Platform
Ingestion Translation Publication
Harvesting Blossom Cultivation
Organized and structured Content in local and remote DBs
Educational
Bibliographic
Other
Enrichment
Aggregate data from diverse sources
Works with different types of data
Prepare data for meaningful services
Educational
Bibliographic
data aggregation & sharing hub
• Value Generation Methods & Tools
– Green Learning Network (GLN) Data Pool
– Agricultural Bibliography Network (ABN) Data Pool
• Data Sharing Tools
– OER & educational pathways
– digital libraries & repositories
– digitized specimens & observations
– learning management systems
• Discovery Spaces
– Landing pages, Micro-sites, Web portals, Apps
• Innovation Methods & Tools
– Creativity Accelerator, Training curricula, Open Data Incubator
product families
why?
Resilience, flexibility and policies that favor R&D investment in staple food research and efficient input use will be the pillars on which future food security depends.
- FAO Report (http://www.fao.org/docrep/014/i2280e/i2280e10.pdf)
Key facts about agricultural trends
Agriculture is about to experience a “growth shock” in order to cover the exponentially increasing food needs of the global population
• All demographic and food demand projections suggest that, by 2050, the planet will face severe food crises due to our inability to meet agricultural demand – by 2050:
• 9.3 billion global population, 34% higher than today
• 70% of the world’s population will be urban, compared to 49% today
• food production (net of food used for biofuels) must increase by 70%
• According to these projections, and in order to achieve the forecasted food levels by 2050, a total investment of USD 83 billion per annum will be required
• A large part of this investment will need to be focused on R&D
Open Data in Agriculture
One of the most promising routes to agriculture modernisation is the provision of Open Data to all interested parties
• In an era of Big Data, one of the most promising routes to achieve R&D excellence in agriculture is Open Data, and in particular:
– provisioning,
– maintaining,
– enriching with relevant metadata and
– making openly available a vast amount of open agricultural data
• The use and wide dissemination of these data sets is strongly advocated by a number of global and national policy makers such as:
– The New Alliance for Food Security and Nutrition G-8 initiative
– FAO of the UN
– DEFRA & DFID in the UK
– USDA & USAID in the US
There is a tremendous global business opportunity for companies that can leverage open agricultural data and expose such data in real-world agricultural applications
at the core
• publications, theses, reports, other grey literature
• educational material and content, courseware
• primary data, such as measurements & observations
– structured, e.g. datasets as tables
– digitized, e.g. images, videos
• secondary data, such as processed elaborations
– e.g. dendrograms, pie charts, models
• provenance information, incl. authors, their organizations and projects
• experimental protocols & methods
• social data, tags, ratings, etc.
• …
research(+) content
• stats
• gene banks
• gis data
• blogs,
• journals
• open archives
• raw data
• technologies
• learning objects
• ………..
educators’ view
• stats
• gene banks
• gis data
• blogs,
• journals
• open archives
• raw data
• technologies
• learning objects
• ………..
researchers’ view
• stats
• gene banks
• gis data
• blogs,
• journals
• open archives
• raw data
• technologies
• learning objects
• ………..
practitioners’ view
• stats
• gene banks
• gis data
• blogs,
• journals
• open archives
• raw data
• technologies
• learning objects
• ………..
• aim is: promoting data sharing and consumption related to any research activity aimed at improving productivity and quality of crops
ICT for computing, connectivity, storage, instrumentation
research data infrastructures
Publisher, Date, Catalog, Subject, ID, Author, Title
we actually share metadata
…sometimes, data also included
metadata aggregations
• concerns viewing merged collections of metadata records from different sources
• useful when access is needed to specific supersets or subsets of networked collections
– records actually stored at the aggregator
– or queries distributed across virtually aggregated collections
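The two options listed above (records stored at the aggregator versus queries distributed over virtually aggregated collections) can be contrasted in a minimal Python sketch. Everything here is an assumption for illustration: the repository names, the record fields and the function names are hypothetical, and in-memory lists stand in for remote collections.

```python
from typing import Dict, List

Record = Dict[str, str]

# Hypothetical in-memory "repositories"; real sources would be remote services.
SOURCES: Dict[str, List[Record]] = {
    "repo_a": [
        {"id": "a1", "title": "Soil health basics"},
        {"id": "a2", "title": "Organic pest control"},
    ],
    "repo_b": [
        {"id": "b1", "title": "Soil erosion in vineyards"},
    ],
}


def matches(record: Record, term: str) -> bool:
    """Case-insensitive substring match on the title."""
    return term.lower() in record["title"].lower()


def build_central_store(sources: Dict[str, List[Record]]) -> List[Record]:
    """Option 1: copy every record into one store held by the aggregator."""
    store = []
    for name, records in sources.items():
        for record in records:
            store.append({**record, "source": name})
    return store


def search_central(store: List[Record], term: str) -> List[Record]:
    """Query the aggregator's own copy of the records."""
    return [r for r in store if matches(r, term)]


def search_federated(sources: Dict[str, List[Record]], term: str) -> List[Record]:
    """Option 2: send the query to each source and merge the answers."""
    results = []
    for name, records in sources.items():
        results.extend({**r, "source": name} for r in records if matches(r, term))
    return results


store = build_central_store(SOURCES)
central_hits = search_central(store, "soil")
federated_hits = search_federated(SOURCES, "soil")
```

Both paths return the same records here; in practice the stored copy answers faster but can go stale, while the distributed query is always current but depends on every source being reachable.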
typically look like this
Ternier et al., 2010
metadata aggregation tools
More than a harvester:
Validation Service, Repository Software, Registry Service, Harvester
Powered by
workflows with commonalities
Harvesting Validating Transforming
OAI target - XMLs
Indexing Storing
Automatic metadata generation
De-duplication service
XMLs
Triplification
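The workflow steps named above (harvesting, validating, transforming, de-duplication, triplification) can be sketched end to end in a few lines of Python. This is only an illustration under stated assumptions: the sample XML is hand-written to resemble a Dublin Core response from an OAI target, the `urn:example:` subject prefix is made up, and the rules used for validation and de-duplication are placeholders, not those of the tools shown on the slides.

```python
import xml.etree.ElementTree as ET

DC = "http://purl.org/dc/elements/1.1/"
NS = {"oai": "http://www.openarchives.org/OAI/2.0/", "dc": DC}

# Hand-written sample standing in for the XML returned by an OAI target.
SAMPLE_XML = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/"
    xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <ListRecords>
    <record><metadata><oai_dc:dc>
      <dc:identifier>rec-1</dc:identifier>
      <dc:title>  Composting at home </dc:title>
    </oai_dc:dc></metadata></record>
    <record><metadata><oai_dc:dc>
      <dc:identifier>rec-1</dc:identifier>
      <dc:title>Composting at home</dc:title>
    </oai_dc:dc></metadata></record>
    <record><metadata><oai_dc:dc>
      <dc:identifier>rec-2</dc:identifier>
    </oai_dc:dc></metadata></record>
  </ListRecords>
</OAI-PMH>"""


def harvest(xml_text):
    """Harvesting: parse the XML into plain dictionaries."""
    root = ET.fromstring(xml_text)
    return [
        {
            "identifier": rec.findtext(".//dc:identifier", default="", namespaces=NS),
            "title": rec.findtext(".//dc:title", default="", namespaces=NS),
        }
        for rec in root.iterfind(".//oai:record", NS)
    ]


def validate(records):
    """Validating: keep records that have both an identifier and a title."""
    return [r for r in records if r["identifier"].strip() and r["title"].strip()]


def transform(records):
    """Transforming: normalise whitespace in every field."""
    return [{k: " ".join(v.split()) for k, v in r.items()} for r in records]


def deduplicate(records):
    """De-duplication: keep the first record seen for each identifier."""
    seen, unique = set(), []
    for r in records:
        if r["identifier"] not in seen:
            seen.add(r["identifier"])
            unique.append(r)
    return unique


def escape_literal(text):
    """Escape backslashes and double quotes for an N-Triples-style literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def triplify(records):
    """Triplification: one title triple per record (subject URN is made up)."""
    return [
        f'<urn:example:{r["identifier"]}> <{DC}title> "{escape_literal(r["title"])}" .'
        for r in records
    ]


harvested = harvest(SAMPLE_XML)
clean = deduplicate(transform(validate(harvested)))
triples = triplify(clean)
```

Of the three sample records, one is dropped by validation (no title) and one by de-duplication, so a single triple remains. Indexing and storing are left out to keep the sketch short.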
typical problem: computing
typical problem: hosting
to curate & preserve we need
even when machinery exists there are problems
• hardware maintenance
• technical support
• interoperability limitations
– no APIs for the dissemination of data across systems
• hardware costs
the cloud approach
Students
Researchers
Academics
Storage and Processing Monitoring/Management/Allocation layer
Virtualization of Infrastructure Layer
Virtual Machines
Virtualized Infrastructures Management Layer
GUI tools and APIs
Cloud provider A Cloud provider B Cloud provider B
what can be hosted on the cloud
• Data storage & management tools
– APIs for content dissemination in large networks
• Processing & visualisation tools
• Metadata aggregation infra
• Search engines and apps for institutions or communities
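To make the idea of a hosted search service over aggregated metadata more concrete, here is a minimal Python sketch of an inverted index with an AND-style keyword search. The records, field names and function names are hypothetical, and a production service would rely on a dedicated search engine instead of an in-memory dictionary.

```python
from collections import defaultdict

# Hypothetical aggregated metadata records.
RECORDS = [
    {"id": "r1", "title": "Introduction to organic farming"},
    {"id": "r2", "title": "Organic viticulture handbook"},
    {"id": "r3", "title": "Soil biodiversity primer"},
]


def build_index(records):
    """Map each lower-cased title word to the ids of records containing it."""
    index = defaultdict(set)
    for record in records:
        for token in record["title"].lower().split():
            index[token].add(record["id"])
    return index


def search(index, query):
    """Return the sorted ids of records whose title contains every query word."""
    tokens = query.lower().split()
    if not tokens:
        return []
    hits = set.intersection(*(index.get(token, set()) for token in tokens))
    return sorted(hits)


index = build_index(RECORDS)
organic_ids = search(index, "organic")
farming_ids = search(index, "Organic farming")
```

A search API of the kind mentioned on the slides would wrap a function like `search` behind an HTTP endpoint, so that a data provider needs nothing more than a browser to use it.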
what data providers need
… only a browser and internet connection
examples
CASE 1: DATA MANAGEMENT TOOL OVER THE CLOUD
Educational Pathway Authoring Tool
today
in the cloud
comparing costs for hosting data management tool at own site and cloud
Cloud
• cloud hosting = 20 euros/month
• set up effort = 1 hr
• back up included
• Total for 5 years = 1200 euros
Hosting at institution
• 1 server + monitor + UPS = 1200 euros
• set up > 1 day effort or 100 euros
• hardware maintenance effort = difficult to define but significant
• Total for 5 years = 1300 euros + personnel for hardware maintenance + costs of unexpected HW breakdowns, e.g. supplier, hard disk
Costs of software support could be the same for both cases
After 5 years the HW should be renewed/upgraded
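The five-year totals quoted in this comparison can be checked with a few lines of Python. The figures are the ones given on the slide; the function names are ours, and the on-site total deliberately leaves out the maintenance personnel and breakdown costs that the slide describes as significant but hard to quantify.

```python
YEARS = 5


def cloud_total(monthly_fee_eur, years=YEARS):
    """Cloud hosting: a flat monthly fee, back up included."""
    return monthly_fee_eur * 12 * years


def on_site_total(hardware_eur, setup_eur):
    """Hosting at the institution: one-off hardware and set-up costs only.

    Maintenance personnel and unexpected breakdowns are NOT included.
    """
    return hardware_eur + setup_eur


cloud = cloud_total(20)              # 20 euros/month over 5 years
on_site = on_site_total(1200, 100)   # server + monitor + UPS, plus set-up
difference = on_site - cloud         # lower bound on the on-site premium
```

Even before the unquantified costs are added, the on-site option comes out 100 euros above the cloud option over the five-year period.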
CASE 2: GRID-POWERED MEGA DATA POOLS
today
we create data silos
CASE 3: SETTING UP SEARCH SERVICE/PORTAL OVER THE CLOUD
today
Metadata aggregator for educational content
Search API
Template customization: HTML, CSS, Ajax, JS
Aggregator
Educational collection management tool
Metadata aggregator for other data types
Search API
Data management tool
Institution
specialise & replicate (a lot!)
Metadata aggregator for educational content
Specialised API
Template customization: HTML, CSS, Ajax, JS
Cloud
Educational collection management tool
Metadata aggregator for other data types
Specialised API
Data management tool
widget in Facebook page
exploitation
Our aim
To create data-powered innovation ecosystems around organisations generating, managing & sharing digital collections+
Need: to cover a specific gap in a data-powered innovation ecosystem
Open data providers (cultural institutions, public sector, etc.)
Creative start-ups & industry
Innovative data-powered start-ups
VCs / angel investors
Incubators
Open Data Incubator
Data scientists, tech start-ups, etc.
missing component
• We work in focused efforts that will bring together and support three different groups of start-ups:
– Start-ups that process agro data (data science powered)
– Start-ups that build apps on agro data (agro data consumers, agro apps producers)
– Start-ups that develop innovative agro/ food products (agro apps consumers)
We want to create a new generation of domain-focused SMEs
Open Agro Data Incubation programme
Open Agro Data Hackathon
Open Agro Data Boot camp
Open Agro Data Meet Ups
Open Agro Data Investor Days
Open Agro Data Introductory Course
We believe that a community-powered, comprehensive, end-to-end, modular approach can greatly facilitate the process of attracting, selecting and incubating data-powered start-ups in the knowledge domain of agriculture
Open Data Incubator
• Abstract and generic
• Applicable to any knowledge domain
• Attractive to major stakeholders such as European a…
Open Agro Data Incubator
• A real-world, tangible proof-of-concept for the Open Data Incubator
• Applicable to the Agro-Biodiversity knowledge domains
• Attractive to sustainability incubators, investors, and stakeholders
we believe that it can be generalised
summing up