

Aalborg Universitet

Video Life Cycle Data Management (VDM)

The management of video data from educational research – Final Report (extended version)

Otrel-Cass, Kathrin; Andersen, Bjarne; Offersgaard, Lene; Eckardt, Max Roald

Creative Commons License: CC BY-NC-SA 4.0

Publication date: 2018

Document Version: Version created as part of publication process; publisher's layout; not normally made publicly available

Link to publication from Aalborg University

Citation for published version (APA): Otrel-Cass, K., Andersen, B., Offersgaard, L., & Eckardt, M. R. (2018). Video Life Cycle Data Management (VDM): The management of video data from educational research – Final Report (extended version). Aalborg University.

General rights: Copyright and moral rights for the publications made accessible in the public portal are retained by the authors and/or other copyright owners and it is a condition of accessing publications that users recognise and abide by the legal requirements associated with these rights.

● Users may download and print one copy of any publication from the public portal for the purpose of private study or research.
● You may not further distribute the material or use it for any profit-making activity or commercial gain.
● You may freely distribute the URL identifying the publication in the public portal.

Take down policy: If you believe that this document breaches copyright please contact us at [email protected] providing details, and we will remove access to the work immediately and investigate your claim.

Downloaded from vbn.aau.dk on: September 28, 2020


Video Life Cycle Data Management (VDM)

The management of video data from educational research – Final Report (extended version)

October 2018
Kathrin Otrel-Cass, Bjarne Andersen, Lene Offersgaard, Max Eckardt


VDM – Final Report

1. The project and the infrastructure

The overall intention of this pilot project was to explore the data management processes and to develop and test infrastructures and resources that support research with video technology (from more to less sophisticated levels of analysis), particularly relevant to educational research, because:

● Video has become a standard research tool amongst educational researchers and is frequently shared;
● Video data captures highly sensitive data of and about people, including vulnerable groups like children;
● Video data requires that researchers safely collect, store, analyse, discard, and share video data (see recent updates to European data protection law, GDPR https://tinyurl.com/yb8eda6d).

What comprises video data: video, audio files, and field notes (i.e. associated Word documents), metadata, ethical procedures (data management plans, informed consent). It includes raw, annotated and coded data.

The aim was to explore, develop, pilot, and evaluate how existing research infrastructure could be used and extended to support educational research using video. In particular, the project explored existing infrastructure, including Kaltura, Edumedia2, CLARIN.dk, DMPonline, and data.deic.dk (now sciencedata.dk), to identify extensions and develop and test prototype modifications.

Project organisation and execution: The project involved team members from the following institutions: Aalborg University (AAU), The Royal Danish Library (KB), Copenhagen University (KU), and The University of Southern Denmark (SDU). In addition, experts from the following institutions were involved for piloting purposes: Waikato University, New Zealand, University of Oslo, Norway, and Aarhus University, Denmark. The project involved researchers and technical experts. The project was managed by a central team with expertise in different aspects to do with video data research/storage, headed by Kathrin Otrel-Cass (AAU), Bjarne Andersen (KB), Lene Offersgaard (KU), and Max Roald Eckardt (SDU). Each institutional team leader was supported by an institutional team consisting of either researcher/s, technical staff, or both. The project worked alongside a project timeline, held regular management board meetings and whole team meetings, and consulted with members of the Danish e-Infrastructure (DeiC) to ensure the successful execution of the project. Since work on the Kaltura Mediaspace was a key aspect in the project, the team was grateful for the support provided by Kaltura, who sponsored the licenses, organised the installation of the platform, and provided on-call support. In order to execute some technical work to do with the implementation and set-up of a web interface, the project also involved experts from Reload, an IT solutions company.

The project aims were executed by drawing on real test cases (video data), provided by the research teams at AAU, KU, and SDU to ensure relevance for developing the prototype or any other resources. This means that examinations of existing infrastructure (for example: https://DMPonline.deic.dk) were always conducted in context. The project process involved the different steps Analyse-Define-Ideate-Prototype-Test as illustrated in Figure 1.

Figure 1: Design process in the VDM project


Academic rooting

Since this project was concerned with supporting the Danish research community, it was important to ground the project in academic expertise. This was achieved by building on a national team of experts representing nested and connected research teams and initiatives situated in digital humanities research. The academic research teams involved in the project were connected to VILA - The Video Research Lab at Aalborg University; the video lab of the University of Southern Denmark at campus Kolding; and the section Center for Language Technology at Institute for Nordic Research at Copenhagen University.

2. Achievements of the pilot project

A key aim in the project was to develop new infrastructure for sharing video for research purposes based on existing platforms. For that reason, we explored: Kaltura, Edumedia2, CLARIN.dk, DMPonline, and data.deic.dk (now sciencedata.dk).

After an initial examination by the team, it was determined that Edumedia2 was unsuitable as a test platform, so the project concentrated on working with the functionalities of a new and clean instance of Kaltura Mediaspace sponsored by Kaltura. data.deic.dk was also explored; however, although it represents a secure platform for the sharing of project data at a national level, it was identified as unsuitable to support the sharing of data and shared working with video files, since international collaborators experience difficulties accessing the site. Efforts were therefore invested into exploring and developing the prototype for this project with the infrastructures Kaltura Mediaspace, Clarin.dk and DMPonline. The results of the pilot project will be presented by the following topics that represent the nature and focus of the different work packages:

1. Needs Analysis
2. Design of Prototype
3. Implementation of Prototype
4. Pilot Testing of Prototype

2.1. Needs Analysis

Step one in this project was a literature review to identify what aspects academic research had identified as necessary for video based educational research. The specific focus on this kind of research had to do with it being a) a highly popular research method in the learning sciences, and b) research that often deals with highly sensitive data, thus demanding high levels of care in regards to the collection, processing, sharing, and storing of data. The literature review formed the basis for the needs analysis. The literature review identified the following key steps in video based research: Data collection; Documentation and Metadata; Ethics and Legal Compliance; Storage and Backup; Selection and Preservation; Data Sharing; and Responsibilities and Resources.

Based on three contextual cases supplied by the academic partners AAU, KU, and SDU, the project team developed a video life cycle management flowchart (see Figure 2 below or visit https://bit.ly/2yDl2fZ) to identify the different steps video based researchers would typically need to consider, from planning to long term archiving. As part of this process the researchers went through a dialogical process supported by card-sorting (identifying the different steps) to provide the input for the design of the flowchart.


Figure 2: Video Life Cycle Flow Chart

The needs analysis was used for the Design of the Prototype (2.2) to identify key tools researchers require and functionalities that a sharing platform would need. The needs analysis formed the basis for further developmental activities in this project.

Activities | Deliverables
Stakeholder analysis (stakeholders: individual, systems, roles) | State of the Art on (video) data management planning within humanistic research (see appendix 1)
User cases / Process description of scenarios (maps stakeholders to services) | Stakeholder Analysis, Process Description (software, scenarios); this was achieved by using the three case studies
Specification of infrastructure and data management plan | Input into video life cycle data management flowchart
Software Catalog including the integration of the video data management with the national DIGHUMLAB network | Needs analysis based on test cases identifying software needs to then select those with the potential for development (see appendix 2)

Table 1: Activities and Deliverables Needs Analysis

2.2. Design of Prototype

The design of the prototype of infrastructure services had the aim to set up infrastructure that supports researchers’ work through the video data lifecycle phases utilising existing infrastructure, including data.DeiC.dk, as part of the storage practices.


The design work was shaped by the development of a prototype mock-up interface (wireframes) that facilitated the discussions around what was viable and useful and qualified the prioritisation when it came to the actual implementation around the prototype.

The mock-up interface played an important role since it served as inspiration, including for possible future developments, as a kind of “front-door” leading researchers working with video in the relevant directions. It also resulted in prompting the preparation of more static guidance that can be implemented at a “front-door”, and the prototype “manage video research projects”-interface (developed in WP4) could be implemented within this frame as well instead of being a stand-alone interface.

Based on the workflow analysis done in WP2, an initial mock-up user interface was designed and internally evaluated. This mock-up facilitated the discussions about the design of the actual working prototype.

Our mock-up interface design aimed at maximising usability with regard to operational management of data. The feedback from scientists in our field research had indicated that they would not endorse any overhead in administering data usage rights. Therefore, our user interface design broke down user rights management into three categories:

1. denial or permission to use data
2. purpose of use is for either research, teaching, or both
3. the domain limitation: (user whitelist, institution, country)

To manage usage rights for a file or a group of files, a user of our prototype would relate to each of the categories, which would produce a new rule. These rules for sharing files stack as a cascade of inclusions and exclusions of users across institutions and countries/areas. The example below illustrates a wide permission of a file being used in Denmark, Norway, and Sweden with a supplement of Aarhus University. This cascade of permissions allows federated logins from Aarhus University to use the data even when they log in from another country. Also, in the case of reviewed permissions, users from different domains can be discriminated and respected with individual rules.

Mock-up user interface design for managing file sharing rules

The implementation displayed below is based on NodeJS, ExpressJS, and VueJS. It was planned to be connected to the Kaltura API to administer sharing permissions on a file and channel basis. This proved to be more complicated than initial probes into Kaltura’s development environment made it seem. In our initial research we anticipated that Kaltura’s nodeJS API would comply with their comprehensive documentation. However, the ported nodeJS version was substantially different and incongruous with its documentation, and thus it was very hard to actually implement working functionality.

Figure 3: Manage File Permissions
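The rule cascade described above can be illustrated with a minimal sketch. It is not the mock-up's actual code (which was based on NodeJS and VueJS); the names `User`, `Rule`, `is_allowed` and `EXAMPLE_RULES`, the country codes, the "research" purpose in the example, and the "last matching rule wins, default deny" semantics are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    user_id: str
    institution: str
    country: str  # country the user logs in from


@dataclass(frozen=True)
class Rule:
    allow: bool            # category 1: permission (True) or denial (False)
    purposes: frozenset    # category 2: "research", "teaching", or both
    users: frozenset = frozenset()         # category 3: user whitelist
    institutions: frozenset = frozenset()  # category 3: institutions
    countries: frozenset = frozenset()     # category 3: countries

    def matches(self, user: User, purpose: str) -> bool:
        """A rule applies when the purpose fits and any domain limit covers the user."""
        if purpose not in self.purposes:
            return False
        return (
            user.user_id in self.users
            or user.institution in self.institutions
            or user.country in self.countries
        )


def is_allowed(rules, user: User, purpose: str) -> bool:
    """Walk the cascade in order (assumed: last matching rule wins, default deny)."""
    decision = False
    for rule in rules:
        if rule.matches(user, purpose):
            decision = rule.allow
    return decision


# Wide permission for Denmark, Norway and Sweden, supplemented by Aarhus University.
EXAMPLE_RULES = [
    Rule(True, frozenset({"research"}), countries=frozenset({"DK", "NO", "SE"})),
    Rule(True, frozenset({"research"}), institutions=frozenset({"Aarhus University"})),
]
```

With these example rules, a federated login from Aarhus University is accepted even from a country outside the list, while a user from another institution in that same country is not.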


Based on the evaluation of the initial mock-up interface and the original needs analysis, a second generation working prototype interface was needed for managing the concept “research projects” as well as “project members”. This was a necessary step since the data model in Kaltura with the frontend of Mediaspace does not embody this. The evaluation showed that using Kaltura's “channels” functionality and sharing possibilities as a technical solution to implement project and member concepts was not obvious and operative for the users.

● Kaltura's Mediaspace should serve as a frontend for
  ○ uploading content: both video as well as other additional material (documents)
  ○ streaming access to content, based on secure sharing in 4 different “levels”
  ○ downloading of material in original format
  ○ adding of extra metadata defined by a project

Following the needs analysis, two further resources were developed. Since they were not identified as crucial for fulfilling the aims of the project, these resources were not tested with users. The two resources include dataflowtoolkit.dk and an interactive Informed Consent Form for Video Research.

DataFlowToolkit.dk

In response to the examination of the DMPonline template, the project identified the need to provide guidance that would support the data management process for researchers working with video. While the project had access to the DMPonline sandbox through DeiC, we decided not to add new guidance text. This is mainly due to the fact that we realised that this guidance would stay hidden for researchers not actively working with a data management plan within DMPonline.

DataFlowToolkit was simply produced through the content management system (CMS) on WordPress. A specific theme dedicated to ‘Qualitative video data’ provides specific information about topics such as ethics, metadata, and more.

Figure 4: Screenshot DataFlowToolKit (https://dataflowtoolkit.dk/object.php?otypeid=441&id=441)

This is a first generation resource and can be developed further. Further details and the background for the first version of dataflowtoolkit.dk are available here: https://vidensportal.deic.dk/da/News/Data_managment_i_praksis (doi:10.7146/aul.243.174)

Informed Consent Form for Video Research

An interactive Informed Consent Form was developed to specifically support researchers working with video. This template was simply produced utilising Google Forms (available in Danish http://bit.ly/2K6SqA9 and English https://tinyurl.com/y76p2qpw) to scaffold the different steps and considerations researchers should be taking into account when consent is sought from participants, including from vulnerable participants (e.g. children). This provides a relatively simple process prompting a researcher to consider various aspects (it also includes links to important background documents to do with ethical conduct), so that by the end a letter can be produced that is tailored to suit the requirements of a particular project. This template was also a first generation working prototype and was finished before the EU’s General Data Protection Regulation (GDPR) was implemented.

Figure 5: Screenshot of landing page of interactive Informed Consent Form for Video Research (Danish Version)

Activities | Deliverables
IT architecture / Design of prototype | Interactive mock-up for setting file and channel permissions. The prototype informs about a possible way in with minimal administrative overhead. The mock-up is kept in a code repository at the Royal Danish Library. Actual working prototype designed based on discussions around the mock-up as well as going through the requirements, comparing what was desired with what was doable within the technical framework of Kaltura as well as the time available for the prototype implementation.
Finalizing the requirements for the upgrade of the existing Edumedia.dk platform | An upgrade of edumedia.dk would require: reduction of administrative overhead to comply with GDPR laws; possibility to share video data and other relevant files through the same interface; reduction of administrative overhead for setting sharing permissions; possibility to generate shareable versions from any uploaded video files; possibility to supply public URLs to individual video files for embedding in online publications; possibility to add and edit VDM-metadata
Design templates for data management plans | A review of the templates offered by https://DMPonline.deic.dk/, which resulted in a commented version of the DEFF DM-template. These ideas were incorporated into dataflowtoolkit.dk
Additional research support tool | Interactive Informed Consent Form for Video Research (Danish and English Version)

Table 2: Activities and Deliverables Design of Prototype


2.3. Implementation of prototype

The delivered prototype interface for managing video research projects was developed by an external company, Reload, in close cooperation with KB. Reload and KB have worked together on Kaltura-based projects before, which is why Reload knew the Kaltura API up front and could step in and help the implementation when a critical resource had to leave this project for another position at SDU.

Reload implemented a new web application for the basic functionality of “starting a new video research project” as well as managing members of the research project. The new prototype application was developed in pure Javascript using the React.js framework as well as the standard Kaltura APIs (http://www.kaltura.com/api_v3/testmeDoc/).

The decision to include an external group (Reload) for the development of the web application halfway through the project timeline had initially not been anticipated and meant that we needed to focus on key functionality because of budget limitations. For this reason the decision was made to skip the implementation of WAYF-login; instead we used a dedicated login service from Kaltura. This meant that all users and testers within the project had to be added manually to the Kaltura platform before they could log in. Since WAYF-login is working well in other known frontends to Kaltura, we did not need a proof of concept for it. Furthermore, the dedicated login method made it easy to include researchers without a WAYF-login in the pilot test.

Figure 6: Login box for the prototype

Figure 7: Working with a project in the prototype interface


In the prototype web interface four new Kaltura channels were added when a new project was started. These channels were set up with different rights settings, and the correct members were added to each of the necessary channels, to have the four levels of sharing of content working right.
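As a rough illustration of this project set-up step, the sketch below creates one channel per sharing level for a new project. It does not call the Kaltura API and is not the Reload implementation; the level names in `SHARING_LEVELS` are placeholders (the report only states that there are four levels and that one channel is public), and `Channel` and `create_project` are hypothetical names.

```python
from dataclasses import dataclass, field

# Placeholder level names: the report only states that there are four
# levels of sharing and that one of the channels is public.
SHARING_LEVELS = ("level-1", "level-2", "level-3", "public")


@dataclass
class Channel:
    name: str
    level: str
    members: set = field(default_factory=set)


def create_project(title, members_by_level):
    """Return one channel per sharing level; all four channels are always created."""
    unknown = set(members_by_level) - set(SHARING_LEVELS)
    if unknown:
        raise ValueError(f"unknown sharing levels: {sorted(unknown)}")
    return [
        Channel(
            name=f"{title} ({level})",
            level=level,
            members=set(members_by_level.get(level, ())),
        )
        for level in SHARING_LEVELS
    ]
```

Creating all four channels unconditionally mirrors the change made after the first pilot, where channels could no longer be deselected.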

The work with Kaltura was originally planned to be done at the existing edumedia.dk (also called Edumedia2) platform that is also based on Kaltura. Budget was allocated in the project for additional licenses as well as hardware to be able to add another transcoding server to the platform and additional storage to cope with the pilot testing. The dialogue with Kaltura resulted in the company’s interest in the VDM project and they agreed to sponsor the project by donating an extra license for Kaltura Mediaspace, including the required backend services. Thus, we did not have to work directly on the edumedia.dk platform, which could have presented problems for its existing users, but could work on a completely separate platform.

The Kaltura Mediaspace interface was extended and re-worked with a focus in the prototype on:

● Additional metadata capabilities
● Skinning to illustrate the VDM-project (http://vdm.statsbiblioteket.dk)
● Removing standard features not relevant to the VDM-project

The prototype went through two iterations, and the following table presents the issues identified in prototype 1 and how they were addressed in the second generation prototype.

Issues identified in Pilot Test 1 | Issues addressed/fixed for second pilot prototype
Difficulties finding relevant information on how to use the system. | Links to guides and information were added to the header menu in MediaSpace.
Workflow was based on four channels in MediaSpace. It was possible to deselect some of the channels in the VDM-admin-interface, which resulted in problems later. | The VDM-admin-interface was changed to make all channels mandatory. This also mitigated a request to redirect a “create channel”-function in MediaSpace to the VDM-admin-interface.
Users were unsure about how to delete a video. | This was made explicit in the pilot user guide.
Conflicting terminology between the VDM-admin-interface and MediaSpace regarding security settings on videos and channels. | The terminology in VDM-admin was changed.
Download of a video was disabled in the first pilot. | This was discussed and enabled only for the owner of a video.
Uncertainty on how to upload a video. | Better descriptions were made and unused and confusing features in MediaSpace were removed.
More intuitive access to a user’s own video. | Moving a link to the main menu header.
Public channels were not public. | Security settings for all four channels were revised.

Table 3: Issues identified and addressed in Prototype 1

Implementation of integration with third party system (Clarin)

Integrating Clarin.dk was planned from the beginning as an example to test how video handling could be connected to other relevant platforms that hold video and/or metadata about video.

The integration used the Kaltura API to share a link to the video data stored in Kaltura. This link was used in CLARIN.dk to expose the sharable video data with resource metadata available in CLARIN.dk. It is important to note that only videos in Kaltura that are in public channels and marked for publishing can be exposed through CLARIN.dk.

Metadata that are exposed for a video are created in two steps. First, some of the metadata for the video inside Kaltura is fetched and marked for publishing. Second, a user gets the option to extend these metadata with more information. The metadata that are added in the second step focus on describing the videos in a way that allows external users, not being familiar with the data beforehand, to understand the provenance and content of the videos.

After the metadata has been curated for publishing in the CLARIN.dk curation workflow, the video metadata is assigned a PID which can then be used to cite the data in a persistent way.

CLARIN.dk then exposes the metadata via OAI-PMH, allowing others to harvest the metadata. The metadata will be harvested by the European CLARIN metadata aggregator “CLARIN Virtual Language Observatory” (vlo.clarin.eu).
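To make the harvesting step more concrete, the sketch below builds a standard OAI-PMH `ListRecords` request URL. The verb and the `metadataPrefix`, `from` and `set` arguments are defined by the OAI-PMH protocol; the function name `list_records_url` and the base URL used in the example are assumptions, not the actual CLARIN.dk endpoint, and the sketch only constructs the URL without contacting any server.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", from_date=None, set_spec=None):
    """Build an OAI-PMH ListRecords request URL for a repository endpoint."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        params["from"] = from_date  # harvest only records changed since this date
    if set_spec:
        params["set"] = set_spec    # restrict the harvest to one set
    return f"{base_url}?{urlencode(params)}"


# Hypothetical endpoint, used only to show the shape of the request.
EXAMPLE_URL = list_records_url("https://repository.example.org/oai", from_date="2018-10-01")
```

A harvester such as an aggregator would issue this request periodically and follow the resumption tokens returned by the repository to page through the records.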

To work at production level the following points should be observed:

● Implement an administration interface where a system administrator can configure when to harvest the videos marked for publishing and which part of the metadata to harvest.
● Write a user guide that explains the different steps in the publishing process.
● It should also be documented very clearly in the VDM pilot interface how to publish the sharable videos and that completing the metadata as part of the CLARIN.dk curation workflow is needed to make the data citeable with a persistent identifier (PID).

The integration with CLARIN.dk can be seen as a proof-of-concept for the suggested architecture for publishing shareable videos. It is important to stress that only very selected data in the Kaltura backend can be expected to be shared, as the videos are likely to be sensitive. It should also be checked that the CLARIN.dk and Kaltura integration does not allow for access to video collections that have to be protected as sensitive data.

Activities | Deliverables
Design of prototype | Documented in the project WIKI at the Royal Danish Library
Implement prototype interface | Working interface hosted at a cloud service managed by the company Reload. Code base for the prototype is kept at the Royal Danish Library
Implement adjustments to Kaltura Mediaspace | Working interface hosted at the Royal Danish Library (http://vdm.statsbiblioteket.dk). The design and functionality is documented as screenshots in appendix 3.
Second round of design and implement | The design and implementation steps were repeated based on feedback from the first round of testing.

Table 4: Activities and Deliverables Implementation of Prototype

2.4. Pilot testing of prototype

Pilot testing was conducted in two rounds:

Pilot 1 Test

The first pilot tested the prototype version 1 and was carried out in July-September 2018 by researchers from the three research teams involved in the project (see deliverables). Five tests were carried out in total. The individual feedback from the testers was collected and summarised in a list of 16 issues (see deliverables). These issues were analysed and prioritised by importance from a user’s perspective and the ability to implement the change within the scope of the project.

Based on the feedback, the majority of issues were addressed by improving the prototype version 2, or by providing clearer guidance to the user about the functionality and the capabilities of the application. A few comments asked for changes too advanced for the prototype to be solved within the scope of the project. Comments on the handling of issues are described in section 2.3.


Pilot 2 Test

The improved prototype version 2 was released at the end of September and immediately tested by external testers. Researchers from Aarhus University and two international universities, the University of Oslo, Norway and University of Waikato, New Zealand, tested the second prototype. The researchers represented different user profiles: a researcher with a technical background, a researcher focussing on educational research, and a researcher with focus on the dissemination of research. The results from this round were also summarised. The main issues to pay attention to when going beyond the prototype are also highlighted below.

The overall feedback was very positive based on the intention to provide a video research platform that handles access rights, storage, and enables small editing (trimming) and sharing of videos in a flexible setup using “projects” with specific rights.

Issues to take into consideration are:

● To ensure that sensitive data can be handled safely and access logged, a re-implementation of the frontend and the code accessing the Kaltura backend will be needed.
● WAYF user access should be included to ease the login for new users, and to ensure the identity and affiliation of users accessing possibly sensitive research data.
● Easy linking to consent forms or allowing for details about the content of the consent forms: what is allowed for which purposes, e.g. “Can be shared in project group”, “Can be used in education”.
● Consider having a kind of workflow built into the system: upload (batch or single file), adding (detailed) metadata, curation by other person (PI or data steward) and publication/release of data for use.
● User guide, help text and tips will be required together with a general introduction to what the system is focussing on. Text has to be easy to understand.
● The interface layout has to go through a review using fewer different fonts and symbols.
● The search function should be more advanced, making it easy for users to find material based on different kinds of searches, e.g. on tags, names that almost match.
● Users have different needs and wishes about working with metadata. Perhaps the suggestion of having a metadata template to pre-fill for a project could solve this issue.
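The workflow suggested by the testers (upload, metadata, curation by another person, release) could be modelled as a small state machine. The sketch below is only one possible reading of that suggestion; the stage names, the `advance` function and the rule that the curator must differ from the uploader are assumptions derived from the wording “curation by other person”.

```python
from enum import Enum


class Stage(Enum):
    UPLOADED = "uploaded"    # batch or single file upload done
    DESCRIBED = "described"  # detailed metadata added
    CURATED = "curated"      # checked by a PI or data steward
    RELEASED = "released"    # published / released for use


NEXT_STAGE = {
    Stage.UPLOADED: Stage.DESCRIBED,
    Stage.DESCRIBED: Stage.CURATED,
    Stage.CURATED: Stage.RELEASED,
}


def advance(stage, actor, uploader):
    """Move an item to the next stage; curation must be done by someone else."""
    if stage not in NEXT_STAGE:
        raise ValueError("released items have no further stage")
    if stage is Stage.DESCRIBED and actor == uploader:
        raise PermissionError("curation must be done by another person")
    return NEXT_STAGE[stage]
```

Keeping the allowed transitions in one table makes it easy to add further steps, for example a consent check before release, without touching the function itself.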

WAYF integration was not implemented in the VDM prototype, but using the WAYF login service is known technology for both The Royal Library and the CLARIN group from University of Copenhagen. No major obstacles are expected to implement WAYF login.

Furthermore, the implementation did not address all aspects of handling sensitive data, nor the logging required by GDPR. We did not conduct thorough security tests to explore if sensitive data could be accessed.

Activities | Deliverables
Performing pilot tests using WAYF-login access (see notes above) | Test templates for user test in phase 1; Test template for user test in phase 2; Test template for CLARIN pilot testing; Prototype Diagrams Notes (appendix 4); Notes on CLARIN pilot implementation (appendix 5)
Evaluation of pilot test and pilot results report | Summary report of test phase 1 (appendix 6); Summary report of test phase 2 (appendix 7)

Table 5: Activities and Deliverables Pilot Testing

3. Synthesis of findings and outcomes of the project

In this section we will present the synthesis of our findings. This synthesis is based on the comparison of the actual performance of the platform with the potential or desired performance. We have described "where we are" in this pilot project (at the present state) and will show now "where we want to be" (our target state). This allows us also to present functionalities that have been left out, and functionalities which would need to be developed for a working video sharing platform.

Thepilotprojectsetouttoexploreandexpandthefollowingfunctionalities

1. Explorationofexistinginfrastructure,including:Kaltura,Edumedia2,DMPonlinetoidentifyextensionsanddevelop&testprototypemodifications.

2. Metadataschemasthatfittheneedsofvideobasedresearchers3. Sharingmechanismsthatwillallowgroupsofresearcherstosharesensitivedatawitheachotherinasecureway4. Persistentidentifiers(PIDs)usingtheDanPID5. LongtermpreservationoforiginalvideomaterialandattachedmetadatausingtheNationalBitrepositorySoftware6. Webinterface7. Downloadfunctionalitywithinthisinterface8. Integrationwithclarin.dkasatestcase9. Simpleanalyticstobeabletoreportusageoftheplatformtorelevantinstitutions10. Addedtranscodingservertoincreasethecapacity11. Extrastoragespace12. Sharingplatformwithsimplevideoeditingcapabilities13. Legalframeworkandnecessary legalagreementsbetweenendusersandtheRoyalDanishLibraryforthesecure

depositofsensitivedata.TheVDMprojectwas an exploratory project andmanaged to carry outmost of theplannedwork. Twoof theoriginallyanticipated aims were not possible to be developed. However, those are also significant findings and outcomes of theproject:

● Long-term preservation in the National Bitrepository and usage of DanPID as persistent identifiers: The analysis of existing infrastructures, including Kaltura as the main technical platform in focus in this project, showed that Kaltura is not suitable for long-term preservation, so this part of the original plan was skipped. Instead, a permanent identifier was added for publication purposes through the test integration with the CLARIN.dk infrastructure.

● Legal framework and necessary legal agreements between end users and the Royal Danish Library for the secure deposit of sensitive data: This was not implemented. Unfortunately, the Royal Danish Library lost its internal legal advisor before this part of the project could be carried out, and the position could not be filled in time.

Recommendations as a direct outcome of this pilot project, to be implemented for a production-ready environment:

● Metadata was discussed and published to show both the researchers' and the developers' points of view (see publications). A simple version of VDM metadata was implemented in the resulting prototype.

● WAYF-login: Simple and known technology. It has been integrated with other front ends in other projects and provides security layers, including updates of passwords etc. One future challenge may arise if data is shared with researchers outside of Denmark who have no WAYF-login. eduGAIN may allow international logins in such a case. However, this may raise questions about how projects and storage are funded and who would pay for the service, especially if storage requirements are very large.

● Separate front end instead of Kaltura MediaSpace: This project has tweaked Kaltura MediaSpace in many ways and through this has identified certain limitations. We find that Kaltura MediaSpace is a generalised front end that for this purpose lacks some features and in other areas has too many features. These limitations of Kaltura MediaSpace are not changed easily, if at all. To build a well-functioning and user-friendly interface for working with video research data, the resources used to tweak MediaSpace further could be better used to implement a new tailored interface from scratch. The new implementation should include the functionality of the "manage video research project" prototype interface (the Reload prototype development). We estimate that an initial version of such a new interface will take several man-months of expert developers' work.

● PIDs: Implementation of PIDs for publishing/sharing. This was only implemented in the prototype implementation for the CLARIN platform. Handle technology is a well-known, standard technology, so it would be possible in the future, in a tailored interface, to add PIDs to relevant content inside the Kaltura platform. This could be either for all content or only for parts of the content, e.g. content on the "publication" channel of a project, so that it can be linked with a permanent identifier in a research paper.


● Guidance inside DMPonline: During the pilot work we reviewed DMPonline and developed a list of recommendations that could be added to the DMPonline guidance. We recommend, however, that necessary information could stay hidden within the DMPonline template, while guidance inside DMPonline could link directly to other information portals so that text snippets are maintained only once (for example, what is contained in the DataFlowToolkit or the Interactive Informed Consent Template).

● Further work with DataFlowToolkit: DataFlowToolkit can be easily extended with more information on how to handle data. One initiative could centre on writing extended and more detailed text connecting themes (metadata, security etc.) and processes (create, edit etc.). This would enhance the guidance. If necessary, a split could be made into different data object types, e.g. separating guidance for data from "simple" video observation from guidance for more advanced technology like 360 cameras and the like.

● Legal framework specific to the setup of a working version of a video sharing platform. Questions to be addressed: who owns what kind of rights to which kind of data on the platform, both in the shorter time frame and in connection with any long-term preservation functionality? Video data for educational research is highly sensitive data, so the new GDPR rules need to be addressed.

● Front door landing page: To be able to collect and publish valuable resources in a single place. This could potentially be implemented using a standard CMS, depending on the amount of information and the number of editors. It could perhaps be part of existing knowledge portals like the DeiC eScience Vidensportal (https://www.deic.dk/da/vidensportal). An advanced version would include personalised information depending on institutional relations.

4. Sustainability and vision for permanent infrastructure

We see that a sustainable future implementation of a sharing platform for video-based research has to first establish financial responsibility for running such a service at national level. It would seem natural that DeiC could play a major role in the setup and running, and the Royal Danish Library with regard to long-term preservation. We recommend that the latter should also be discussed with the National Archives and other relevant parties. DIGHUMLAB might also play a role in the organisation of such a service, e.g. on service desk, training etc. DIGHUMLAB or DeiC could also play a role in providing the platform for support services that provide guidance on video research more generally, video-based data management, and ethical practicalities (how to prepare informed consent). For those auxiliary services the VDM project has already produced resources that can act as blueprints, to be used as is or to be updated and further developed (see example ethics template). These activities could be developed by some of the existing partners in the VDM project. Since the vision is to develop infrastructure that supports researchers, it is advisable to involve researchers not only to inform the development of such tools, but also to raise the awareness of the research community on how to work with a video sharing platform, observe the codes of conduct regarding the handling of sensitive data, and manage video data at a sophisticated level.

The VDM project team

Aalborg University

Kathrin Otrel-Cass (project leader, lead AAU), Lone Dirckinck-Holmfeld, Thomas Ryberg, Karsten Kryger Hansen

The Royal Danish Library

Bjarne Andersen (lead KB), Frank Lund, Erik Bertelsen, Kåre F. Christiansen

Copenhagen University

Lene Offersgaard (lead KU), Mitchell John Seaton, Costanza Navarretta

University of Southern Denmark

Max Roald Eckardt (lead SDU), Julia Ruser


List of appendices

Appendix 1 VDM project publications: State of the art and VDM publications

Appendix 2 VDM Needs Analysis

Appendix 3 Screenshots of Prototype for video sharing

Appendix 4 VDM Prototype Diagrams Notes

Appendix 5 VDM-CLARIN-pilot_Short_notes_about_implementation

Appendix 6 TestPhase1_UserTest_Summary

Appendix 7 TestPhase2_UserTest_Summary_20181113