reconnect webinar : research classification in … understand and report on their portfolios, ......


Upload: vuongnga

Post on 21-May-2018

TRANSCRIPT

a portfolio company of!

CASRAI ReConnect Webinar: Research classification 2

ReConnect Webinar: Research classification in practice 2

March 18, 2014

ÜberResearch helps funding and research institutions better understand and report on their portfolios,

looking internally and comparing globally.

www.uberresearch.com

Research classification in practice: Stamping or Understanding!

About ÜberResearch

ÜberResearch: About us

•  Team has 10 years of experience delivering tools and services for funding and research institutions

•  Over 20 development partners

•  Active member of CASRAI, ORCID, CrossRef / FundRef, TAG

•  Example: UberWizard for ORCID

•  Portfolio company of Digital Science, the younger sibling of the Nature Publishing Group

Experienced, international team

Part of the Digital Science family

About ÜberResearch

ÜberResearch: What we provide

Tools and services

•  Portfolio analysis and reporting

•  Reviewer identification

•  Categorization tools and support

•  Integrations – systems and content

•  Leveraging a growing global award database

•  1.2m grants from 60+ funders

•  Covering more than $600,000,000,000 of historic funding

Today’s contents & goals

1.  Recap from webinar 1

2.  Sharing experiences with existing classification systems

–  Gerry Lawson (NERC) on the RCUK classification

–  CADRO – a disease-specific classification

3.  Making Research Classification operational

–  Semantic research classification

–  Use cases and document sets

–  Implementing and operating research classification systems

–  Discussion

1. RECAP & QUESTIONS FROM WEBINAR 1

Different Levels of Classification Systems

Classifications across all of science (e.g. Australian and New Zealand Standard Research Classification (ANZSRC), CASRAI Standard Classification Scheme)

Some are discipline-specific: Health Research Classification System (HRCS), ICD10, the NIH’s Research, Condition, and Disease Categorization (RCDC) or MeSH used as a classification.

Some are domain-specific for narrower topic areas, like Common Alzheimer's Disease Research Ontology (CADRO) for Alzheimer’s; Common Scientific Outline (CSO) for Cancer, etc.

Funder & organization specific and one of the three levels above: RCUK Subject Classifications, RCDC from NIH, DFG Disciplinary Classification System

Even divisional and personal ones – classification categories to plan and structure programs etc.

Summary ‘How Applied’?

Figure: comparison of four ways of applying a classification (manual; thesaurus only indexing / automatic; training set; semantic / automatic), each rated from low to high on quality of coding, costs, applicability to various document sets, ability to mature with science, and breadth of use cases.

Key takeaways from webinar 1

Use case

Classification schema (own, existing, one, many)

Application routine (manual, semantic)

http://gtr.rcuk.ac.uk/

gtrdocs.hackpad.com

Numbers of Classes ‘originating’ with each Council

Council   Research Subject   Research Topic
AHRC      14                 138
BBSRC     9                  77
EPSRC     24                 150
ESRC      14                 24
NERC      8                  50
STFC      9                  43
Total     78                 482

NERC Second Level Classification (Topics): Proportional Awards for starts each calendar year, 50 NERC Core Topics only

NERC Top Level Classification (Subjects): Proportional Awards for starts each calendar year, Top 30 Subjects

IADRP – International Alzheimer’s Disease Research Portfolio

•  Collaborative effort of the US NIH National Institute on Aging (NIA) and the Alzheimer’s Association (AA)

•  To help coordinate and plan Alzheimer’s research

–  …enable strategic coordination of efforts among funding agencies but it will also be a tremendous resource for the research community, and the public at large. Source: NIA

•  Began in 2010

•  Public database available now, and adding funders globally

IADRP’s classification scheme – “CADRO”

•  CADRO – Common Alzheimer’s Disease Research Ontology

•  A unified classification system to enable comparative analysis

•  Jointly developed by the NIA and AA

•  Created from a study of NIA and AA funded projects from FY04-FY10

•  Exemplar of a deep / granular scheme

•  Seven top-level categories, a three-tiered scheme to capture the complete range of AD research – basic, translational and clinical – and AD research-related resources.

•  Manually coded, with one code per project

•  Coding performed by NIA IADRP, AA staff, and the staff of the other participating funders with help from the CADRO Coding Guidelines.

Sources: NIA, NIA CADRO

Source: IADRP

An example three-level code, using a topic recently in the news – a possible blood test for AD:

Code: “B.1.b”:

•  Category B. Diagnosis, Assessment, and Disease Monitoring

•  1. Fluid Biomarkers

•  b. Blood Biomarkers
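
A three-level code like “B.1.b” can be split mechanically into its tiers. The sketch below is illustrative only: the names `TieredCode`, `parse_code`, `describe` and the `LABELS` lookup are assumptions made for this example, and the lookup holds only the three labels quoted above, not the full CADRO scheme.

```python
from typing import NamedTuple


class TieredCode(NamedTuple):
    category: str   # top-level category letter, e.g. "B"
    topic: str      # second-level number, e.g. "1"
    subtopic: str   # third-level letter, e.g. "b"


def parse_code(code: str) -> TieredCode:
    """Split a three-level code such as "B.1.b" into its tiers."""
    parts = code.strip().split(".")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected three dot-separated tiers, got {code!r}")
    return TieredCode(*parts)


# Only the labels from the example above; a real lookup would hold the whole scheme.
LABELS = {
    ("B",): "Diagnosis, Assessment, and Disease Monitoring",
    ("B", "1"): "Fluid Biomarkers",
    ("B", "1", "b"): "Blood Biomarkers",
}


def describe(code: str) -> list[str]:
    """Return the label of each tier, from broadest to most specific."""
    tiers = parse_code(code)
    return [LABELS[tuple(tiers[:depth])] for depth in (1, 2, 3)]
```

Keying the lookup on the path from the top level down means a second-level “1” under a different category would not be confused with “B.1”.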

IADRP – classification use cases

A portfolio analysis tool to:

•  Track changes in the AD research landscape over time

•  Identify research gaps and areas of overlap within and across AD funding agencies

•  Identify collaborative opportunities aimed at advancing AD research and alleviating the socioeconomic burden of this devastating disease.

Source: NIA

Source: nia.nih.gov

2. SEMANTIC RESEARCH CLASSIFICATION

Two non-exclusive options when creating semantic classification

1.  Modelling an existing classification with semantic expressions

–  E.g. RCDC (done by us), CASRAI classification etc.

–  Starting with an existing one is perfect - since a common understanding exists already

–  It is just ‘translating’ it to a ‘machine readable’ definition

2.  Building a new category schema

–  Requires same process leading to common understanding

–  Semantic modelling can help to inform the discussion process

–  Result is a ‘machine readable’ classification

–  More work compared to modelling an existing system

Continuum from Search to Categorisation

Search

•  Quick overview

•  High flexibility

•  Challenge of long tail of irrelevant results

•  Flat

•  ‘Instant’ category

Classification

•  Agreed on definitions

•  ‘ordering’ a discipline or field

•  Hierarchy

•  ‘static’ category

The difference is who is involved, how it is defined and how much time is spent on the query / definition. Not more.

RCDC Example – Modelling a Classification Semantically

•  RCDC is a classification from the NIH, using thesauri as a basis for a classification

•  Normally requires a text mining technology called Fingerprinting (Collexis)

•  ÜberResearch remodelled the RCDC category system to make it applicable to other funders’ portfolios and datasets

•  Complex definitions requiring terms to be above certain thresholds, combination of terms etc.
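
To make “terms above certain thresholds, combination of terms” concrete, here is a minimal sketch of one way such a machine-readable category definition could look. It is not the RCDC definition format and not the Collexis Fingerprinting API: the `CategoryRule` class, the term names and the weights are all invented for illustration, and a document is assumed to arrive as a mapping of thesaurus term to weight.

```python
from dataclasses import dataclass, field


@dataclass
class CategoryRule:
    name: str
    # Every term here must reach its minimum weight (the "above threshold" part).
    required: dict[str, float] = field(default_factory=dict)
    # At least one of these terms must be present (the "combination" part).
    any_of: set[str] = field(default_factory=set)

    def matches(self, fingerprint: dict[str, float]) -> bool:
        """fingerprint maps a thesaurus term to its weight in one document."""
        for term, minimum in self.required.items():
            if fingerprint.get(term, 0.0) < minimum:
                return False
        if self.any_of and not any(fingerprint.get(t, 0.0) > 0 for t in self.any_of):
            return False
        return True


# Hypothetical rule and documents, made up for this example.
rule = CategoryRule(
    name="Atherosclerosis",
    required={"atherosclerosis": 0.3},
    any_of={"artery", "plaque"},
)
strong_match = {"atherosclerosis": 0.6, "plaque": 0.2}
weak_mention = {"atherosclerosis": 0.1, "plaque": 0.5}
no_context = {"atherosclerosis": 0.6}
```

With these inputs `rule.matches(strong_match)` is true, while `weak_mention` fails the threshold and `no_context` fails the combination requirement. Raising or lowering the threshold is how a subject matter expert would trade false positives against the long tail of less relevant documents.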

Challenges when designing a semantic category…

•  Common understanding (internal and external) of the classification term – that is, do all topic specialists agree which grants should be in/out of this set? Definitions matter.

•  Are you expecting grants to appear in multiple sets? Do you mind ‘double counting’?

–  Report on amount of funding on Atherosclerosis and on the statin related research

•  2700 projects on Atherosclerosis, 403 on Statin

•  72 are overlapping. Where should they be assigned? Or can they be double counted?

•  Accept that there can never be a definitive answer – there will always be grants where their inclusion or exclusion is a moot point.
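
The double-counting question can be checked with simple set arithmetic. In the sketch below the grant IDs are synthetic integers chosen only to reproduce the counts quoted above (2700, 403, 72 shared); the function name `overlap_report` is an assumption for this example.

```python
def overlap_report(set_a: set[int], set_b: set[int]) -> dict[str, int]:
    """Compare counting distinct projects with summing the two categories."""
    return {
        "in_both": len(set_a & set_b),
        "distinct": len(set_a | set_b),
        "sum_of_categories": len(set_a) + len(set_b),
    }


# Synthetic grant IDs, not real data: 2700 and 403 projects sharing 72 IDs.
atherosclerosis = set(range(0, 2700))
statin = set(range(2628, 3031))

report = overlap_report(atherosclerosis, statin)
# report == {"in_both": 72, "distinct": 3031, "sum_of_categories": 3103}
```

Summing the two categories reports 3103 projects, while only 3031 distinct projects exist. Whether the 72 shared projects are assigned to one set or counted in both is a reporting decision that should be stated alongside the figures.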

Challenges when designing a semantic category…

False positive

Long tail of less relevant documents

Get all documents which are relevant

It is clear that…. Subject Matter Expertise…

•  Is still required…

•  Semantic categories are just a way to preserve it…

3. USE CASES, RESEARCH CLASSIFICATIONS AND DOCUMENT SETS

Automation allows more content to be tagged…

Grant applications, reports

Patents

Publications

Own grants

Other grant portfolios

Clinical trials

Which document sets for which use cases?

Use cases:

•  Understanding own portfolio in detail

•  Categorising portfolio for science policy level

•  Comparing and aligning activities with other funders

•  Reporting on research funding

•  Planning – define ‘grand challenges’

•  Reviewer identification

•  Connecting output / evaluation data with grants

•  Gap analysis / trend analysis

•  Report financial side of research funding

•  Impact / outcome analysis

Document sets:

•  Internal documents (applications, progress reports)

•  Own awards

•  Global awards

•  Publications (general and related)

•  Patents

•  Clinical trials

4. OPERATIONALIZING RESEARCH CLASSIFICATION

Content to be tagged…

•  Integration of content requires

–  Collecting content

–  Harmonizing it

–  Disambiguating it across data sources (in addition to ORCID)

–  Integrating it into one database

–  Building the analytical layer

–  Building the process support

Diagram labels: Grants, Publications, Other grants; Classification 1, Classification 2

Applying research classifications

•  Potentially developing a bespoke classification to suit your needs

•  Potentially leveraging existing implementations of semantic representations of classifications (RCDC, CASRAI etc.)

•  Implementing the routines to automatically apply the categories

Key takeaways from webinar 1&2

Use case

Classification schema (own, existing, one, many)

Application routine (manual, semantic)

Content to be integrated (applications, grants, other funders’ portfolios, publications, patents, trials)

Implementation and operation

Imagine a Babel Fish for Research Classifications…*

•  One system

•  All content

•  All classification systems with automatic routines to assign them

•  Analytical views to get insights in seconds

•  Report consistently leveraging the above

•  … all other use cases served as well from the same data and same application…

* “Mapping global health research investments, time for new thinking - A Babel Fish for research data” (Terry et al. 2012)

Our Babel Fish has to learn a few classifications still, but…

•  Shared database

•  Classification systems already modelled / in process

•  Support ranging from ad hoc searches to modelling classifications

•  Supports the ability to share them with others

•  22 development partners (mostly funders) helping us to define the features and functions

•  Realising savings for all funders by taking care of the basics as a shared solution

•  Launched together with ORCID at the beginning of February

•  Free and open tool that lets researchers add their grants from many funders in one wizard

•  Growing database of 1.2m grants from 60+ funders

•  Funders can benefit from it by adding their grants at no cost

•  Contribution to drive ORCID adoption and the representation of grants in the ORCID records

Free and open tool…

5. DISCUSSION & QUESTIONS

Thank you!

Christian Herzog, Ashlea Higgs, Steve Leicht