data sharing and analytics in research and learning
TRANSCRIPT
Data sharing and analytics in research and learning
Chair: Professor Martin Hall
01/05/2023
Introduction
Professor Martin Hall
Learning analytics: progress and solutions
Niall Sclater and Michael Webb, Jisc
Learning analytics: Progress & Solutions
Niall Sclater & Michael Webb, Jisc
@sclater @michaeldwebb
06/07/2016
“learning analytics is the measurement, collection, analysis and reporting of data about learners and their contexts, for purposes of understanding and optimising learning and the environments in which it occurs”
SoLAR – Society for Learning Analytics Research
» Problems identified in 2nd week of semester
» Interventions include:
› Posting signal on student’s home page
› Emailing or texting them
› Arranging a meeting
» Courses that deploy signals see consistently better grades
» Students on Signals sought help earlier and more frequently
Early alert and student success
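As a rough illustration of the early-alert idea on this slide, the sketch below maps two engagement measures to a traffic-light signal. It is not the Signals algorithm: the field names, the `min_logins` and `pass_mark` thresholds, and the two-rule logic are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class StudentActivity:
    """Hypothetical per-student record; field names are illustrative only."""
    student_id: str
    vle_logins_last_week: int   # assumed engagement measure
    current_grade_pct: float    # marks so far, 0-100


def signal(activity: StudentActivity, min_logins: int = 3, pass_mark: float = 40.0) -> str:
    """Return 'red', 'amber' or 'green' from two illustrative thresholds."""
    low_engagement = activity.vle_logins_last_week < min_logins
    low_grade = activity.current_grade_pct < pass_mark
    if low_engagement and low_grade:
        return "red"      # both indicators are below threshold
    if low_engagement or low_grade:
        return "amber"    # exactly one indicator is below threshold
    return "green"        # neither indicator is below threshold
```

A real system would calibrate thresholds against historical outcomes and route "red" and "amber" results to the interventions listed above.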
Recommender systems
Desire2Learn Degree Compass
Adaptive learning
The Brightspace LeaP adaptive learning engine
Curriculum design
» A key piece of learning content is not being accessed by most students
» Some students are not participating well in collaborative work
» A particular minority group is underperforming in an aspect of the curriculum
» Students across several discussion groups are making only minimal contributions to their forums
» Total hits is strongest predictor of success
» Assessment activity hits is second
» Metrics relating to current effort (esp. VLE usage) are much better predictors of success than historical or demographic data.
(John Whitmer)
California State University - Chico
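One simple way to compare candidate predictors such as total hits or assessment activity hits is to correlate each with the final grade. The sketch below computes a Pearson correlation from scratch; it is an illustrative assumption about method, not the analysis used in the Chico study, and any input lists are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)
```

Calling `pearson(total_hits, final_grades)` and `pearson(prior_gpa, final_grades)` on an institution's own data would give a first, rough comparison of current-effort and historical predictors.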
“a student with average intelligence who works hard is just as likely to get a good grade as a student that has above-average intelligence but does not exert any effort” (Pistilli & Arnold, 2010)
» Predictive early alert model transferred to different institutions
» Around 75% of at-risk students were identified
» Most significant predictors were:
› Marks on course so far
› GPA
› Current academic standing
(Jayaprakash et al.)
Marist College, New York
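The "around 75% of at-risk students were identified" figure reads as a recall measure: the share of genuinely at-risk students that the model flagged. A minimal sketch of that calculation follows; the function name and the set-based inputs are assumptions for illustration, not taken from the Marist work.

```python
def recall(actual_at_risk: set, flagged: set) -> float:
    """Fraction of genuinely at-risk students that the model flagged."""
    if not actual_at_risk:
        raise ValueError("no at-risk students to evaluate against")
    # Students who were both at risk and flagged are the true positives
    true_positives = actual_at_risk & flagged
    return len(true_positives) / len(actual_at_risk)
```

Recall alone says nothing about false alarms, so a model transferred between institutions would normally be checked on precision as well.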
Retention in England
» 178,100 students aged 16-18 failed to finish post-secondary school qualifications they started in the 2012/13 academic year
› costing £814 million a year - 12 per cent of all government spending on post-16 education and skills (Centre for Economic and Social Inclusion)
» 8% of undergraduates drop out in their first year of study
› This costs universities around £33,000 per student
» Students with 340 UCAS points or above were considerably less likely (4%) than those with fewer UCAS points (9%) to leave their courses without their award
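The figures on this slide support some back-of-envelope arithmetic. The sketch below derives an average cost per non-completing 16-18 learner from the quoted totals, and the first-year drop-out cost for a cohort. The cohort size of 1,000 is a hypothetical input, and both function names are illustrative.

```python
def cost_per_non_completer(total_cost_gbp: int, learners: int) -> float:
    """Average cost per learner who did not finish, from the quoted totals."""
    return total_cost_gbp / learners


def first_year_dropout_cost(cohort_size: int, dropout_pct: int, cost_per_student_gbp: int) -> int:
    """Cost to a university of first-year drop-outs for a given cohort."""
    # Integer arithmetic keeps the result exact for whole-percent rates
    dropouts = cohort_size * dropout_pct // 100
    return dropouts * cost_per_student_gbp


# Slide figures: £814 million across 178,100 learners
per_learner = cost_per_non_completer(814_000_000, 178_100)

# Slide figures: 8% drop-out at £33,000 each; the 1,000-student cohort is an assumption
cohort_cost = first_year_dropout_cost(1_000, 8, 33_000)
```

On these inputs the average works out at roughly £4,570 per learner, and the hypothetical 1,000-student cohort loses 80 students at a cost of £2,640,000.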
Attainment in England
» 70% of students reporting a parent with HE qualifications achieved an upper degree, as against 64% of students reporting no parent with HE qualifications
»Overall, 70% of White students and 52% of BME students achieved an upper degree
Jisc Effective Learning Analytics project
» Expressions of interest: 85+
» Engaged in activity: 35
» Discovery to Sept 16: agreed (28), completed (18), reported (11)
» Learning Analytics Pre-Implementation: (12)
» Learning Analytics Implementation: (7)
Effective learning analytics programme
ECAR Analytics Maturity Index for Higher Education
UK Learning Analytics Network
Group: 2 Consent
Name: Adverse impact of opting out on individual
Question: If a student is allowed to opt out of data collection and analysis could this have a negative impact on their academic progress?
Main type: Ethical
Importance: 1
Responsibility: Analytics Committee

Group: 7 Action
Name: Conflict with study goals
Question: What should a student do if the suggestions are in conflict with their study goals?
Main type: Ethical
Importance: 3
Responsibility: Student

Group: 8 Adverse impact
Name: Oversimplification
Question: How can institutions avoid overly simplistic metrics and decision making which ignore personal circumstances?
Main type: Ethical
Importance: 1
Responsibility: Educational researcher

86 issues in 9 groups
Available from Effective learning analytics blog: analytics.jiscinvolve.org
jisc.ac.uk/guides/code-of-practice-for-learning-analytics
Times Higher, 25 Feb. 2016
Discovery Phase
Implementation process
1. Workshop
2. Self-assessment
3. Institutional Readiness
4. Signed-up for Service
5. Implementation Support
» 2016 - 17
Discovery readiness questionnaire
• Culture and Vision
• Strategy and Investment
• Structure and governance
• Technology and data
• Skills
Guidelines / checklist
Culture and Organisation Setup
• Decide on institutional aims for learning analytics
• Senior management approval obtained and a project lead nominated
• Undertake the readiness assessment
• Decision on learning analytics products to pilot
• Legal and ethical considerations in hand
• Address readiness recommendations
• Data processing agreement signed
• Select student groups for the pilot and engage staff/students
Technical setup
• Learning records warehouse setup
• Extract student data to UDD and upload to LRW
• Historical data extracted from the VLE and SRS and uploaded to the LRW
• VLE plugin installed and live data being uploaded
• View in data explorer to check the data is valid
• Contact Jisc to start implementation
Architecture
Project partners
Learning Analytics architecture
Unified data definitions
Service: Dashboards
Visual tools to allow lecturers, module leaders, senior staff and support staff to view:
» student engagement
» cohort comparisons
» etc…
Based on either commercial tools from Tribal (Student Insight) or open source tools from Unicon/Marist (OpenDashBoard)
Service: Alert and intervention system
Tools to allow management of interactions with students once risk has been identified:
» case management
» intervention management
» data fed back into model
» etc…
Based on open source tools from Unicon/Marist (Student Success Plan)
Service: Student App
» Comparative
» Social
» Gamified
» Private by default
» Usable standalone
» Uncluttered
jisc.ac.uk
Michael [email protected] @michaeldwebb
analytics.jiscinvolve.org
Niall [email protected] @sclater
Reading analytics
Clifford Lynch, CNI
»AWAITING CONTENT
Sharing data safely and its re-use for analytics
David Fergusson, The Francis Crick Institute
The Francis Crick Institute
Sharing Data Safely and re-use for analytics
David Fergusson
Introduction
Challenges for “big data” science in the UK
Distributed Data Sets
Distributed computing resources
Separate authentication/authorization mechanisms
Researchers want to combine and synthesise data
How do we do this?
Example
Dr David Fergusson, Head of Scientific Computing, Francis Crick Institute
Challenges of providing shared platforms for staff from existing institutes
– CRUK London Research Institute
– National Institute for Medical Research
Compute and data requirements for 1,250 scientists working in biomed
– In a central London building
Direction of travel towards more and wider collaboration, requirement for controlled sharing of sensitive data
Photo credit: Francis Crick Institute
Example
Dr Jeremy Yates, STFC DiRAC & SKA:
› The National e-Infrastructure for research & innovation
– A 60,000 foot view
– Democratisation & Aggiornamento
› Moving to a more cloud-centric view of scientific computing
› Scientific computing that is not just “HPC”
› Changing the culture around Research Software Engineering
› Making industrial access to facilities the norm
› Inter-disciplinary science – blockers and enablers
Image credit: Courtesy of EPSRC
Addressing the problem
SafeShare – shared secure authorisation/authentication
Shared Data Centre(s) – avoid costly/insecure moving of data
eMedlab – collaborative science/shared operations model
UK e-Infrastructure
A new bottom up approach
[Diagram: People’s National eInfrastructure. Labels: Uganda; Medical Bioinformatics; Business and local government; ESRC £64M; MRC £120M; SECURE]
What has worked?
Consolidation through collaboration
Swansea: One system supporting Farr Wales, ADRC Wales, MRC CLIMB, Dementia Platform UK
Scotland: EPCC supporting Farr Scotland and ADRC Scotland, leveraging expertise from Archer, UK-RDF
Leeds: ARC supporting Farr HeRC, Leeds Med Bio, Consumer Data RC
Slough DC: eMedLab, Imperial Med Bio, KCL bio cluster
Jisc network: Safe Share
JISC SafeShare
John Chapman, Deputy head, information security, Jisc
The safe share project
About Jisc » Assent
Assent:
Single, unifying technology that enables you to effectively manage and control access to a wide range of web and non-web services and applications.
These include cloud infrastructures, High Performance Computing, Grid Computing and commonly deployed services such as email, file store, remote access and instant messaging.
About Jisc » Safe Share
Safe Share:
Providing and building services on encrypted VPN infrastructure between organisations
Enhanced confidentiality and integrity requirements per ISO27001
Requirement to move electronic health data securely and support research collaboration
Working with biomedical researchers at Farr Institute, MRC Medical Bioinformatics initiative, ESRC Administrative Data Centres
The safe share project
• What: a pilot project enabling the secure exchange of data collected by Government and the NHS using an encrypted overlay over the Janet network to facilitate appropriate analysis between project sites
• AND reusing existing services to increase authentication for researchers
• Why: easier, secure access to research data to further knowledge of diseases and ill health to improve medical treatments in the long-term
• When: running from November 2014 – March 2017
Background
• Substantial investment in medical and administrative data research to generate benefits to society from the appropriate analysis of data collected by Government and the NHS
• E.g. to further knowledge of disease and ill health to improve medical treatments
Challenges
• Health data, and other routinely collected data on people’s lives, are very personal and sensitive
• Significant numbers of ethical, consensual and practical hurdles to making appropriate use of the sensitive data for research
Drivers
• Requirement for connectivity to move and access electronic health data securely
• Challenge to give public confidence that data is appropriately protected
• Provide economies of scale in secure connectivity
• Jisc management and funding of £960k to pilot potential solutions with the aim of developing a service in 2016/17
Partners
University of Bristol
Cardiff University
University of Leeds
Swansea University
University of Edinburgh
UCL
Francis Crick Institute
University of Oxford
University of Southampton
University of Manchester
St Andrews University
The Farr Institute
The MRC Medical Bioinformatics initiative
The Administrative Data Research Network
Authentication, Authorisation and Accounting Infrastructure (AAAI)
Use Cases:
• HeRC, N8 HPC – access between facilities using home institution credentials
• eMedLab – partners will be able to use a common AAAI to access this new system (for analysis of for instance human genome data, medical images, clinical, psychological and social data)
• Swansea University Health Informatics Group – investigating Moonshot as an authentication mechanism to allow use of home institution credentials
• University of Oxford: to enable researchers to use home institution credentials for authentication to request access to datasets for studies e.g. into dementia
Example “service slice”: Farr Institution
[Diagram labels: LAN; safe share core; Janet, internet or other network; Farr trusted environments; safe share router at edge]
UK Academic
Shared Data Centre
Shared data centre
£900K investment from HEFCE
Anchor tenants:
– Francis Crick Institute
– King’s College London
– London School of Economics
– Queen Mary University of London
– Wellcome Trust Sanger Institute
– University College London
Potential cost-saving/resource benefits
Jisc Shared Datacentre is already a cost saving
eMedLab award, and need for quick spend, gave impetus to UCL, KCL, QMUL, Sanger, LSE and Crick to identify off-site datacentre hosting (Slough)
– Anchor tenants get price reduction based on volume of space used
Procurement led by Jisc
Datacentre connected to Janet network (Jisc investment)
Improved PUE; Slough 1.25 cf ~2 for HEI datacentre (UCL save ~£2M p.a.)
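Power usage effectiveness (PUE) is total facility energy divided by IT equipment energy, so for the same IT load the PUE figures quoted above imply a proportional energy saving. The sketch below works that through. The function names and the 1,000 kWh IT load are illustrative assumptions, and the result is an energy ratio only; it does not attempt to reproduce the ~£2M figure.

```python
def facility_energy(it_energy_kwh: float, pue: float) -> float:
    """Total facility energy for a given IT load, using PUE = total / IT."""
    return it_energy_kwh * pue


def saving_fraction(pue_old: float, pue_new: float) -> float:
    """Fractional reduction in total energy for the same IT load."""
    return 1 - pue_new / pue_old


# Slide figures: ~2 for a typical HEI datacentre, 1.25 at Slough
fraction_saved = saving_fraction(2.0, 1.25)

# Hypothetical IT load of 1,000 kWh, compared at both sites
hei_total = facility_energy(1_000, 2.0)
slough_total = facility_energy(1_000, 1.25)
```

With these inputs the same IT load draws 1,250 kWh at Slough against 2,000 kWh in a typical HEI datacentre, a 37.5% reduction in total energy.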
Datacentre Connection Topology
N3/PSNH/PSN
eMedLab
Collaborative science
Shared Operation
Objectives - Flexibility
• To help generate new insights and clinical outcomes by combining data from diverse sources and disciplines
• Bring computing workloads to the data, minimising the need for costly data movements
• To allow customised use of resources
• To enable innovative ways of working collaboratively
• To allow a distributed support model
Institutional Collaboration
Support team
eMedLab academy
• Training via CDFs and courses
• Promote collaborations via “Labs”
eMedLab infrastructure
• Shared computer cluster
• Integrate and exchange heterogeneous data
• Methods and insights across diseases
eMedLab is a hub
6+1 partners
3 data types
electronic health records
genomic
images
3 expertises
clinician scientists
analytics
basic science
3 disease areas
rare
cancer
cardio
>6M patients
What is eMedLab?
Distributed/Federated support (What has worked/savings ..)
eMedLab Ops team (shared team)
Knowledge sharing/transfer (inc. developing UK industrial capacity – OCF/OpenStack)
[Diagram: “Support” label repeated six times]
Many projects, same challenges
Information governance
Secure data transfer
User management
AAAI
Working with Janet to explore how to support most/all projects
Cultural Barriers Challenges
Finance – government funding with spend window of 1 year only
+ Mitigated by use of efficient procurement teams and framework agreements
+ Working closely with vendors to ensure tight time targets met
- Drain on (unfunded) project management and finance team resources
Regulatory challenge
+ Mitigated by clear policies, governance, supported by training
+ Changing EU data protection legislation
- Risk of bad PR and/or data leaks
People
+ Everyone is open, collaborative, generous with time and knowledge
eMedLab production service Projects
• UCL & WTSI - Enabling Collaborative Medical Genomics Analysis Using Arvados – Javier Herrero
• Crick KCL UCL - A scalable and flexible collaborative eMedLab cancer genomics cluster to share large-scale datasets and computational resources – Peter van Loo
• UCL QMUL Farr - Creating and exploiting research datamart using i2b2 and novel data-driven methods - Spiros Denaxas
• LSHTM & QMUL - An evaluation of a genomic analysis tools VM on eMedLab, applied to infectious disease projects at the LSHTM using data from EBI and Sanger & Genetic Analysis of UK Biobank Data - Taane Clark & Helen Warren
• UCL & ICH - The HIGH-5 Programme - High definition, in-depth phenotyping at GOSH, plus related projects - Phil Beales & Hywel Williams & Chela James
eMedLab enables projects
eMedLab brings data and expertise together across diseases (potential)
• Mechanisms of cancer diversity and genome instability
• Better understanding of biomarkers
• DARWIN Clinical Trial to target clonal drivers
Cancer evolution and heterogeneity (Swanton & Van Loo)
• Cancers evolve heterogeneously
• Diverse driver mutations and instability mechanisms
• TracerX: Track lung cancer evolution
• Data: genomes, MRI, molecular pathology
• Who: clinicians, statisticians, evolutionary biologists
People
Alan Real, Bob Day, Bruno Silva, Clare Gryce, David Fergusson, Emily Jefferson, Jacky Pallas, Jeremy Sharp, John Ainsworth, John Chapman, Jonathan Monk, Mark Parsons, Ric Passey, Richard Christie, Rhys Smith, Simon Thompson, Simon Thompson, Spiros Denaxas, Stephen Newhouse, Steve Pavis, Tanvi Desai, Tim Cutts and others…
Data sharing and analytics in research and learning
Chair: Phil Richards, Jisc