

Abstracts DHBenelux 2017 conference
Tuesday 4 July 2017

Session A

1. Coin Production in the Low Countries, fourteenth century to the present
Rombert Stapel (1), Jaco Zuijderduijn (2), Jan Lucassen (1), Kerim Meijer (1)

(1) International Institute for Social History, Amsterdam, Netherlands
(2) Lund University, Lund, Sweden

This project collects, combines and makes available data on mint production in the Low Countries (Netherlands, Belgium, Luxembourg) and has developed a web application to query and visualize the data, which is also linked to a digital map of (changing) historical boundaries in the Low Countries from 1100 to the present (available in Linked Open Data). It provides scholars with a user-friendly approach to large datasets, and allows them access to such variables as regional production figures and coin denominations.

Introduction

Monetization is a key concept in economics and in economic history. Throughout history currencies were a crucial element of economic exchange: first in the form of metal coins, which made up the lion’s share of currencies and were widely used in everyday transactions. Paper money emerged only much later: before the First World War very few ordinary people would ever have seen paper money. Finally, nowadays non-material book money has become much more important than currencies, and with the onset of mobile banking and virtual currencies such as Bitcoin, it seems future generations will see far less currency than people in the past.

Historical societies depended as much on media of exchange as we do today: coins and paper money helped a great deal in realizing everyday transactions, as did various forms of credit. Coin production figures are of crucial importance for understanding development in the long run.1 The study of coins, their quantity, denominations and use (e.g. in wage payments), and of monetary policy in general, provides important insight into economic and social history, and this project provides historians with a firm quantitative basis for their research.

In this paper, we will present the project and its goals, give an overview of the process of data collection and the web application we built to query and visualize the data (including geospatial visualizations), and provide some of the results for historical research that stem from our dataset.

Project

Coin Production in the Low Countries, fourteenth century to the present provides an overview of coin production figures covering many centuries. Of course we deal with omissions: not all mint accounts go back to the fourteenth century, and not all administration has survived. The website allows for an overview of the mint house data we have at our disposal at the moment, and visualizes the missing data. Coin Production in the Low Countries, fourteenth century to the present also does not pretend to be the final dataset: like any other dataset it reflects the data that has been collected and made available up until now. Although we are confident we cover the vast majority of the coins minted in the Low Countries, some new or overlooked sources may emerge in the future; we are likely to make additions in time. The dataset represents the data we presently have, and is a tool to be used by scholars looking for variables related to coin production.

1 Jan Lucassen and Jaco Zuijderduijn, ‘Coins, currencies, and credit instruments. Media of exchange in economic and social history’, Tijdschrift voor sociale en economische geschiedenis 11 (2014) 1-13.

Our goal for this project was to take the aforementioned datasets, check the validity of the collected data, select and/or (re)calculate the relevant variables for our project, combine the different datasets, and present our selected variables in a web application which allows the user to query and visualize the data.

Manual web application

Figure 1. Number of coins minted in Flanders between 1334 and 1700, organised per alloy (status: November 2016).

In the web application2, the user can query the data and create (and export) their own subsets. Different queries and selections can be made at the top left. This includes the possibility to display the Value in denier groot, a common coin used as money of account, in hourly wages. The query starts by clicking ‘Run’. There are three tabs: ‘Table’, ‘Chart’, and ‘Map’. The variables in the table and chart can be adjusted freely on the right. The map needs some further introduction. At the moment, the map is used to give a rough indication of how complete our dataset is for particular mint houses and authorities in time. We have turned to the works by Hugo Vanhoudt and H. Enno van Gelder, supplemented with data from our own datasets, to determine the years of activity of mint houses and authorities.3

2 https://datasets.socialhistory.org/dataverse/coinproduction/search/.

The colour of a region that minted coins (e.g. Duchy of Brabant) depends on the number of years (in a particular query) in which we know that region was minting coins, and on the number of those years for which we have actual production figures in our dataset. This also applies to the mint houses, where we have used pie charts. For this purpose, we have created a GIS map of all major authorities in the Low Countries in time.4 This means that borders will change with time and mint houses will pop up and disappear.5 A slider in the top left corner of the map allows the user to change the years. On the right of the application, different options regarding the map can be selected, choosing whether the colours and pie charts should change instantaneously with the slider or not.
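The colouring rule described above can be read as a simple ratio: of the years in the queried period in which an authority or mint house is known to have been active, the share for which production figures exist. The following is only a minimal sketch of that reading, not the web application's actual code; the names `Coverage` and `coverage` and all year values are hypothetical and invented for illustration.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Coverage:
    """Data availability for one authority or mint house within a query window."""
    active_years: int      # years in the window in which minting is known to have taken place
    documented_years: int  # of those years, the ones with production figures in the dataset

    @property
    def share(self) -> float:
        # Share of active years that are documented; 0.0 when there was no known activity.
        if self.active_years == 0:
            return 0.0
        return self.documented_years / self.active_years


def coverage(active: Iterable[int], documented: Iterable[int],
             start: int, end: int) -> Coverage:
    """Compare known years of activity with documented years, limited to start..end inclusive."""
    window = set(range(start, end + 1))
    active_in_window = set(active) & window
    # Only count documented years that fall inside the window and in known active years.
    documented_in_window = set(documented) & active_in_window
    return Coverage(len(active_in_window), len(documented_in_window))


# Invented example values, for illustration only (not taken from the dataset).
example = coverage(active=range(1430, 1440),
                   documented=[1432, 1433, 1435, 1450],
                   start=1432, end=1435)
print(f"{example.documented_years}/{example.active_years} years documented ({example.share:.0%})")
```

Under this assumption, the `share` value would drive the colour of a region, and the two counts would give the slices of a mint house's pie chart.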

Figure 2. Map of the Low Countries (1432) with percentage of data availability in that year (status: November 2016).

3 H. Vanhoudt, Atlas der munten van België van de Kelten tot heden (Heverlee 2007, 2nd edition); H.E. van Gelder, De Nederlandse munten (Utrecht 2002, 8th edition).

4 For some important disclaimers regarding these GIS maps, see the introduction at http://hdl.handle.net/10622/HPIC74/.

5 This process was visualized in a movie of the period 1100-2016, where each frame is a year: http://hdl.handle.net/10622/5KGG1T.


2. Mapping the Place: “De Krook Quarter”
Piraye Hacıgüzeller, Sally Chambers, Christophe Verbruggen and Hans Blomme
Ghent Centre for Digital Humanities, Ghent University

The presentation will elaborate on a new project that the Ghent Centre for Digital Humanities (GhentCDH) is starting to carry out, “Mapping the Place: ‘De Krook Quarter’”, which involves “deep mapping” of a historical district in Ghent. In the presentation, the context, framework, workflow and impact of the project will be described and discussed.

The objective of the “Mapping the Place” project is to harness the well-demonstrated power of cartography as a participatory tool (Perkins 2007). Specifically, the project aims to contribute to the participatory governance of cultural heritage in Europe through “deep mapping” a district in Ghent (Belgium) that embodies place-based heritage such as Vooruit (a people’s palace established in 1913 that has been turned into a vibrant international contemporary arts centre), the Minard Theatre, De Krook (the newly built city library and digital innovation centre) and the adjoining former Wintercircus, and the surrounding streets (Kuiperskaai) that used to connect a Latin Quarter and red light district. In collaboration with the heritage institutions responsible for the management of these places, GhentCDH will employ a variety of participatory mapping tools and methodologies in order to involve a range of communities in a deep mapping project.

Deep maps are “thick spatial descriptions” of places that break away from the Cartesian paradigm in cartography. The latter, also known as “Western scientific mapping” (Pickles 2004; see Turnbull 1996), limits both the content and the methods of mapping, as it traditionally aims to map only empirically observable phenomena, which alone are considered to constitute reality. Deep maps, on the other hand, inspired by the concept of “thick description” coined by anthropologist Clifford Geertz (Bodenhamer et al. 2015), are based on a much more flexible and fruitful definition of what can constitute a map and what constitutes places, aiming to bring together a large and rich array of spatial qualities. Deep mapping is even more promising today, as digital cartography opens up many possibilities to collect and crowdsource new types of geospatial information and to visualise, integrate and analyse it in novel ways with the help of technologies such as geographical information systems, virtual and augmented reality, and real-time mapping.

The participatory deep map of Ghent, displayed in De Krook and Vooruit, will be an innovative, open-ended, multi-vocal and largely digital cartographic process that will bring together geographical information, sensual experiences, memories, oral histories, creative narratives, emotions, knowledges, imaginations, practices and events. The map is planned to be produced through the following five types of activity:

a) playful community mapping exercises (Pinder 1996; 2005) will be organised for diverse groups in order to carry out a certain cartographic task (e.g. mapping an area), and their knowledge and experiences of the places will be revealed in the process through their interaction (e.g. Grasseni 2004);
b) a digital online crowdsourcing platform for heritage places will be created where people can enter cartographic information (see Perkins 2013);
c) geospatial data on people’s emotions (http://biomapping.net/), movement, sound and smell will be collected in real time and converted into data sculptures or paintings by artists (see, e.g., www.refikanadol.com/);
d) multi-layered geographic information systems and three-dimensional virtual reality displays will be installed in De Krook, allowing diverse groups of visitors to annotate their experiences and knowledge about the heritage places on which the deep mapping project focuses;
e) (non-)digital map-based or map-aided games (e.g. geocaching) will be designed, developed and/or employed in order to facilitate conversation about the heritage places in question between diverse groups of people, as well as to inform them about and engage them with these places.

The layers of the participatory deep map will be distributed across many locales in De Krook, comprising a geographical information systems component, virtual reality room, game room, exhibition room, digital sculpture and painting rooms, screens for real-time mapping, and computers with access to the digital crowdsourcing platform.

References

Bodenhamer, D.J., Corrigan, J. & Harris, T.M. eds., 2015. Deep maps and spatial narratives, Indiana: Indiana University Press.

Grasseni, C., 2004. Skilled landscapes: mapping practices of locality. Environment and Planning D: Society and Space, 22, pp. 699–717.

Perkins, C., 2007. Community mapping. The Cartographic Journal, 44(2), pp. 127–137.

Perkins, C., 2013. Plotting practices and politics: (Im)mutable narratives in OpenStreetMap. Transactions of the Institute of British Geographers, 39(2), pp. 304–317.

Pickles, J., 2004. A history of spaces: Cartographic reason, mapping and the geo-coded world, London & New York: Routledge.

Pinder, D., 1996. Subverting cartography: the situationists and maps of the city. Environment and Planning A, 28, pp. 405–427.

Pinder, D., 2005. Arts of urban exploration. Cultural Geographies, 12(4), pp. 383–411.

Turnbull, D., 1996. Cartography and science in early modern Europe: mapping the construction of knowledge spaces. Imago Mundi, 48, pp. 5–24.

3. Cinemas on the Move: A geospatial analysis of the role of traveling cinemas in the Dutch cinema landscape
Jolanda Visser, Julia Noordegraaf and Ivan Kisjes
University of Amsterdam

The emergence of the cinema as a new cultural industry at the dawn of the twentieth century has had a significant impact on the social, cultural and economic infrastructures of modernizing societies. Cinema’s technological and cultural innovation, combined with economic competition, significantly reconfigured the role and place of entertainment culture in public life. Besides being an economic factor of importance, it also has literally “taken place” in urban and rural infrastructures, transforming the organization and experience of modern public space.

The ways in which cinema has taken place in Dutch public space have been the subject of a number of studies. Some focus on the history of specific cinema theatres and the urban context in which they function (Visser 2012; Noordegraaf et al. 2016). Others have investigated national and local cinema networks and focused on the organization and economics of the industry (Dibbets 1980 & 2006; Oort 2016). Yet other studies focused on the ways in which movies reached their audiences and how this correlates with specific religious and ideological orientations (Boter and Pafort-Overduin 2009), or studied the popularity of certain genres or stars (Van Beusekom 2013). In addition, a comprehensive database has been created that facilitates data-driven research on national Dutch film culture.6

At the same time, though, the study of the role of cinema in modern public life has focused primarily on urban contexts. When plotting the locations of cinemas from the Cinema Context database on a map, it appears that the majority of cinemas are located in urbanized areas. In fact, there were cinema screenings in less urbanized areas as well; these areas were frequented by traveling cinemas. At present, the role and impact of these traveling cinemas in Dutch cinema culture remain entirely unknown. In this paper, we present the results of the very first study of the impact of traveling cinemas on Dutch film culture. Using a combination of network and geospatial analysis software, the paper contributes: 1. new insights into the way cinema as a leisure industry contributed to the shaping of modern Dutch identity; and 2. a reflection on the affordances and limitations of GIS and network analysis tools for (cinema) historical research.

6 www.cinemacontext.nl

Central Question

Our research aims to establish the role and place of traveling cinemas in the Dutch, post-WWII cinema landscape. What was the relation between the permanent and traveling cinemas, in terms of geographical distribution, market share, and distribution and exhibition practices? In order to answer this question, we approach the Dutch cinema landscape as a network with socio-economic (distribution, consumption) and cultural (programming) dimensions. In order to analyse this network, we combine a geospatial analysis of the network of permanent and traveling cinemas and owners/exhibitors in The Netherlands in 1949 with an in-depth case study of one particular section of this market. This approach allows us to combine a macrosocial analysis of the role of traveling cinemas in the national cinema market with an analysis of the contextual features that explain causality in one specific case (Ragin 1987).

Method

For the research, we adopted a two-tiered approach. First, we extended the data on the location of permanent cinemas and their owners in the Cinema Context database with newly assembled data on the places frequented by traveling cinemas. Then, we mapped these cinemas according to their typologies, distinguishing between permanent theatres, theatres with occasional screenings and traveling cinemas in QGIS. This resulted in a geospatial analysis of the organization of the Dutch industry that, for the first time, includes data on traveling cinemas.
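As an illustration of this first tier, the sketch below shows one way in which venues tagged with the three typologies could be written out as GeoJSON, a format QGIS can open as a vector layer. It is a hedged example only: the abstract does not describe the actual data model or export route, so the function names, the `cinema_type` property, the type labels, the venue names and the coordinates are all assumptions made for illustration.

```python
import json

# Hypothetical labels for the three typologies distinguished in the text.
CINEMA_TYPES = {"permanent", "occasional", "traveling"}


def to_feature(name, lon, lat, cinema_type):
    """Build one GeoJSON point feature; GeoJSON stores coordinates as [longitude, latitude]."""
    if cinema_type not in CINEMA_TYPES:
        raise ValueError(f"unknown cinema type: {cinema_type!r}")
    return {
        "type": "Feature",
        "geometry": {"type": "Point", "coordinates": [lon, lat]},
        "properties": {"name": name, "cinema_type": cinema_type},
    }


def to_feature_collection(rows):
    """Wrap (name, lon, lat, cinema_type) rows in a GeoJSON FeatureCollection."""
    return {"type": "FeatureCollection",
            "features": [to_feature(*row) for row in rows]}


# Invented venues and coordinates, for illustration only.
rows = [
    ("Venue A", 5.80, 53.20, "permanent"),
    ("Venue B", 5.95, 53.10, "occasional"),
    ("Venue C", 6.05, 53.05, "traveling"),
]
geojson_text = json.dumps(to_feature_collection(rows), indent=2)
```

Saved to a file, such a layer could then be styled by the `cinema_type` attribute so that the three kinds of venue are visually distinguished on the map.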

Second, the networks of cinema exhibitors of permanent and traveling cinemas have been analyzed by processing the data on theatres and owners/exhibitors in Gephi. The resulting graph allowed us to recognize the influence of cinema chains as well as individual, non-networked entrepreneurs. By projecting these data on historical maps in QGIS, we could compare the geographical distribution of different types of cinemas with the network of cinema owners/exhibitors. We identified a number of clusters where permanent cinemas and mobile cinemas were related and used this analysis to select one case for further, in-depth analysis of film flows within a cinema chain with a traveling department. The selected case study tracks the film flows of the cinema chain of Joh. Miedema and his competitors in the Northern province of Friesland in 1949.
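The distinction between cinema chains and non-networked entrepreneurs can be illustrated with a small sketch that groups cinemas by owner. This is not the Gephi workflow used in the study; it assumes, purely for illustration, that a "chain" is any owner/exhibitor linked to two or more cinemas and that every other cinema is an isolate. All function names and sample records are hypothetical.

```python
from collections import defaultdict


def group_by_owner(cinemas):
    """Map each owner/exhibitor to the list of cinemas linked to them."""
    by_owner = defaultdict(list)
    for cinema, owner in cinemas:
        by_owner[owner].append(cinema)
    return dict(by_owner)


def split_chains_and_isolates(cinemas):
    """Assumption: an owner with two or more cinemas forms a chain; the rest are isolates."""
    chains, isolates = {}, []
    for owner, venues in group_by_owner(cinemas).items():
        if len(venues) > 1:
            chains[owner] = sorted(venues)
        else:
            isolates.extend(venues)
    return chains, sorted(isolates)


def chain_share(cinemas):
    """Fraction of all cinemas that belong to a chain (0.0 for an empty input)."""
    chains, isolates = split_chains_and_isolates(cinemas)
    in_chains = sum(len(venues) for venues in chains.values())
    total = in_chains + len(isolates)
    return in_chains / total if total else 0.0


# Invented (cinema, owner) pairs, for illustration only.
sample = [("Cinema A", "Owner X"), ("Cinema B", "Owner X"),
          ("Cinema C", "Owner Y"), ("Cinema D", "Owner Z")]
chains, isolates = split_chains_and_isolates(sample)
print(chains, isolates, chain_share(sample))
```

A share computed this way is the kind of figure that could be compared with a statement such as "half of the cinemas belonged to a cinema chain", although the threshold for what counts as a chain is a modelling choice.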

Results

Some of the data sets used already existed (Cinema Context database), some had to be partly digitized (census data) and some had to be created (film programming, traveling cinema locations and screenings). In the first phase of the project the data on the cinemas and the networks of cinemas were combined. The first results showed the geographical networks of Dutch permanent cinemas in relation to the network of owners/exhibitors. In general, as also shown by Dibbets (1980), one can conclude that half of the cinemas belonged to a cinema chain, leaving the other half as isolates.

After adding the mobile cinema networks, we identified a clear geographical distribution for exhibitors of a cinema chain with a traveling department, among others in the provinces of Friesland and Drenthe. The selected case study focused on the network of Joh. Miedema in Friesland, which comprised 10 permanent cinemas surrounded by places he claimed for his mobile department. It appears he used these mobile screening locations to construct a buffer zone around the permanent cinemas in his chain, to ward off competition from other owners in the region. Reconstructing film programming practices within that network and comparing them to those of his competitors in the province of Friesland in 1949 provides new insights into the economics of a cinema chain with a traveling department, the socio-economic and cultural context of the various sites visited, and patterns of taste. Based on the first results of this research, the benefits and pitfalls of the combined use of Gephi and QGIS will also be evaluated.

References

Beusekom, Ansje van. “Distributing, programming and recycling Asta Nielsen films in the Netherlands, 1911-1920.” In Importing Asta Nielsen: The international film star in the making 1910-1914, edited by Martin Loiperdinger & Uli Jung, 259-272. New Barnet, Herts UK: John Libbey/KINtop, 2013.

Boter, Jaap, and Clara Pafort-Overduin. “Compartementalisation and Its Influence on Film Distribution and Exhibition in The Netherlands, 1934-1936.” In Digital Tools in Media Studies: Analysis and Research: An Overview, edited by Michael Ross, Manfred Grauer, and Bernhard Freisleben, 55–68. Bielefeld: Transcript Verlag, 2009.

Dibbets, Karel. “Bioscoopketens in Nederland: Economische concentratie en geografische spreiding van een bedrijfstak, 1928-1977.” Doctoraalscriptie, Universiteit van Amsterdam, 1980. Online: http://kd.home.xs4all.nl/home/Karel%20Dibbets%20%20Bioscoopketens%20in%20Nederland%201980.pdf

Dibbets, Karel. “Het Taboe van de Nederlandse Filmcultuur: Neutraal in Een Verzuild Land.” Tijdschrift Voor Mediageschiedenis 9, no. 2 (2006): 46–64.

Hallam, Julia, and Les Roberts, eds. Locating the Moving Image: New Approaches to Film and Place, 2014.

Horak, Laura. “Using Digital Maps to Investigate Cinema History.” In The Arclight Guidebook to Media History and the Digital Humanities, edited by Charles R. Acland and Eric Hoyt, 65–102. Falmer: Reframe Books, 2016.

Noordegraaf, Julia, Opgenhaffen, Loes, & Bakker, Norbert. “Cinema Parisien 3D: 3D Visualisation as a Tool for the History of Cinemagoing”. Alphaville, 11 (2016): 45-61.

Oort, Thunnis van. “Industrial Organization of Film Exhibitors in the Low Countries: Comparing the Netherlands and Belgium, 1945–1960.” Historical Journal of Film, Radio and Television (March 17, 2016): 1–24. Online first: http://dx.doi.org/10.1080/01439685.2016.1157294

Oort, Thunnis van. “‘Coming up This Weekend’: Ambulant Film Exhibition in the Netherlands”. (Forthcoming).

Ragin, Charles C. The Comparative Method: Moving beyond Qualitative and Quantitative Strategies. Berkeley, CA: University of California Press, 1987.

Visser, Jolanda, Samen naar The Movies – 100 jaar Bioscoop op de Haarlemmerdijk 161, The Movies Art House Cinemas and Film Distribution Amsterdam: 2012.


Session B

1. Soft skills in hard places: the changing face of DH training in European research infrastructures
Jennifer Edmond, Trinity College Dublin
Vicky Garnett, Trinity College Dublin

Research infrastructures are becoming an increasingly distinct presence in the landscape of the digital humanities, creating unique research ecosystems that interact with, but remain distinct from, the traditional university-based ones. It is a research sector still very much in the process of defining itself, however, in particular in the arts and humanities, not only in terms of how exactly infrastructures support research but also in terms of how a word with such “hard” connotations (conjuring up images of roads and bridges) can encompass the many “soft” resources and skills, from data to know-how, that we now recognise as a part of infrastructural provision for research in Europe. This tension is already present in how research infrastructure is defined, with some camps preferring to fall back on long lists of elements infrastructure may or may not comprise, such as data, services and tools, while others remain more theoretical, placing them in the role of “mediating” (Badenoch and Fickers, 2010) or “below the level of the work” (Edwards et al., 2012). Regardless of how we conceptualise it, however, infrastructure is undeniable as a rising presence, with a growing impact on how research is conceptualised and carried out, how research results are communicated and shared, and how the potential scale of a humanities project can be conceptualised.

There is one element in this landscape of change that has steadfastly remained based within the universities, however: that is the manner in which new generations of researchers are formed, through training and education. Some of the reasons for this lie in the need for specialised procedures, staff, resources and expertise to deliver formal educational programmes, a layer of provision that research infrastructures seldom have. Indeed, it is the lack of this layer that most distinctly differentiates activities of the research infrastructure from those of the more familiar academic context. As we continue to develop our understanding of what it means to ‘teach’ the digital humanities (e.g. Fyfe, 2011; Hirsch, ed., 2012; or Bellamy, 2012), however, we need also to reconsider the utility, responsibility and potential contributions of actors other than universities in this process, and how we integrate them into recognised learning pathways. It is not that infrastructures do not offer training opportunities, just that the paradigm informing much of this training has historically been founded upon a narrower conceptualisation of the added value of the infrastructural space for creating and sharing unique knowledge. As such, projects and platforms would traditionally create materials to assist users approaching specific tools developed or hosted by the infrastructure, serving a very narrow conceptualisation of the user and his or her needs.

There has been an increasing number of examples of the infrastructural community expanding their activities to fill spaces less easily addressed by traditional, formal, course- and institution-based training contexts, however. Hands-on training with specific collections or objects, or using transnational access to build skills, for example, are mechanisms that have been developed to great effect by infrastructures, as has the model of partnering with other organisations to deliver credit-bearing programmes. These are mechanisms that have arisen in part because of the opportunities that exist, for example, when researchers work in close proximity to specific scientific instruments, as in the fields of cultural heritage and preservation, but have also arisen as accidents of design. Many research infrastructure funding schemes include fixed elements drawn directly from the longer tradition of infrastructure development in the fields of science and technology, mechanisms that do not necessarily fit humanities modes of work or interaction.


Even as such programmes remained largely ad hoc extensions of the originating user-support model of training, they exposed the potential of research infrastructures not only as places that support research, but as places where unique knowledge was being created, and where this knowledge could and should be shared. The development of a theoretical understanding of the strengths of research infrastructures, what knowledge they contribute to digital humanities, and how this knowledge could be more systematically shared has been a primary goal of the training programme of the PARTHENOS (Pooling Activities, Resources and Tools for Heritage E-research Networking, Optimization and Synergies, http://www.parthenos-project.eu/) cluster project, itself a collaboration between a number of research infrastructures and their affiliated projects.

As an infrastructure cluster, PARTHENOS is charged with deepening understanding of what infrastructure is and how common activities can be better aligned for maximal benefit to researchers between the communities that have built landmark research infrastructures at European level. The PARTHENOS training framework seeks first and foremost to make a distinction between research work that does and that does not engage with data and service infrastructures such as the PARTHENOS partners represent. At the next level, the framework seeks to address the digital humanities not only as a set of domains, but also as a set of roles and actors, following up on the work of the DigCurv project (http://www.digcurv.gla.ac.uk/). By reconceptualising a didactic system from the first principles of who might need digital infrastructure and what they might need to know or be able to do, PARTHENOS has been able to create bespoke training materials that draw from the unique experiences within research infrastructures and the unique knowledge they create. The materials exist within a simple but evolving framework, addressing experience levels from the novice (for example: “What is an Infrastructure”), to the intermediate (for example: “Management Challenges in Research Infrastructures”) and advanced (for example: “Introduction to Infrastructures as Collaborations”) levels. Modules are designed to build bridges between potential users and the entire context of research infrastructures and how they operate, from answering basic questions about what resources are available and how they operate, through to much more fundamental explorations of the opportunities and challenges that exist in this environment, issues that even expert practitioners struggle to define and address.

The paper will embed a presentation of PARTHENOS’s work in a theoretical discussion of the role of research infrastructures in the development of skills and careers in the digital humanities. It will give an overview of some of the practical interventions the project has made to address the thorny issues of developing training and education programmes outside of the academy, including awareness raising, foresight work, embedding in higher education, partnerships and accreditation. Working in concert with its constituent partners (the DARIAH, CLARIN and E-RIHS Research Infrastructures, as well as their partner projects, such as CENDARI, EHRI, ARIADNE, and IPERION CH), the PARTHENOS team is testing the potential for infrastructural knowledge, for its transmission as materials for self-directed use by independent learners and trainers, and for its capacity to be integrated in the programmes of universities and professional organisations alike. Through this programme of engagement PARTHENOS will bring an extended horizon for training not only to research infrastructures and their users, but to all of digital humanities.

References

Badenoch, A., and A. Fickers, Materializing Europe: Transnational Infrastructures and the Project of Europe (Palgrave Macmillan, 2010)

Bellamy, Craig, ‘The Sound of Many Hands Clapping: Teaching the Digital Humanities through Virtual Research Environment (VREs)’, Digital Humanities Quarterly, 6 (2012)

Edwards, Paul N., Knobel, Cory P., Jackson, Steven J., and Bowker, Geoffrey C., Understanding Infrastructure: Dynamics, Tensions, and Design <http://hdl.handle.net/2027.42/49353> [accessed 16 November 2012]

Fyfe, Paul, ‘Digital Pedagogy Unplugged’, Digital Humanities Quarterly, 5 (2011)

Hirsch, Brett D., Digital Humanities Pedagogy: Practices, Principles and Politics (Cambridge: Open Book Publishers, 2012) <http://www.openbookpublishers.com/product/161/digital-humanities-pedagogy--practices--principles-and-politics> [accessed 7 April 2017]

2. Ranke.2 - How to Get Digital Source Criticism on the Teaching Agenda
Stefania Scagliola, C2DH (Centre for Contemporary and Digital History), University of Luxembourg

Abstract

The term Ranke.2 refers to the need to reassess Leopold von Ranke’s method for historical source criticism, in the light of the impact of digitization and the world wide web on the position of the archive and the craft of the historian. It is also the proposed title of a platform for lessons on digital source criticism, a project that is being developed at the Centre for Contemporary and Digital History at the University of Luxembourg.

While a number of scholars have successfully addressed various theoretical and epistemological implications of the digital turn for the historical craft, little is known about how this subject is dealt with in the realm of teaching. This paper pleads for an assessment of the concept of Digital Source Criticism from the perspective of Digital Humanities Pedagogy. It starts off with some reflections on why and how Ranke’s concept has to be reconsidered. Then it discusses whether source criticism can still be regarded as a specific historical method. The third section of the paper is an account of a small-scale exploration among humanities scholars involved in teaching at the humanities faculty of the University of Luxembourg. They were asked to share their understanding of how digital source criticism should be taught. The paper concludes with a plea for integrating small-scale DH interventions into the traditional historical curriculum.

‘Everything has changed and everything has stayed the same’

With the arrival of digitally-based ‘fake news’ and the inability of sections of the public to distinguish it from the ‘real thing’, the vital importance of digital source criticism should be evident. What is less evident is how it affects the craft of the historian. Historians educated in the 21st century are witnessing the consolidation of the ‘digital turn’ with profound consequences for the historical profession. The German scholar Leopold von Ranke was responsible for an earlier radical change in scholarly practice in the 19th century: he introduced the so-called ‘archival turn’. He also introduced the concept of the ‘seminar’ and encouraged a new generation of aspiring scholars to visit numerous archives, scrutinize and compare documents, and trace back the identity and motives of the author and the circumstances under which a document came into existence. Ranke made a distinction between ‘external’ source criticism, which focuses on the creation, appearance and alleged or real authenticity of a source, and ‘internal’ source criticism, which evaluates the evidential value that can be attributed to a particular source. This new approach became widespread and problematized the tradition of ‘universal histories’, based on broad philosophical concepts and ideas about the evolution of mankind. Rigorous fact-checking took the place of myth-making. Ranke’s innovation in the second half of the 19th century coincided with the period of modern state formation and the creation of national archives. It gradually became the backbone of professional history, with a strong orientation towards the archive as the guardian of authenticity and historical relevance (Risbjerg Eskildsen 2008).

We now live in a globalized world with cultural and disciplinary boundaries that are blurred, with digital technology that has permeated academic research practice, and with the opportunity to copy, alter and remix data with relative ease. It therefore comes as no surprise that concern about the origin, authenticity and value of historical sources in digital form is increasing (Jones and Hafner 2012). How this has affected the historical profession and what changes need to be introduced has been discussed by several scholars (Fickers 2012, Sternfeld 2014, Zaagsma 2014, Föhr 2015). They plead for a critical reflection on the nature of sources in digital form and for an investment in digital skills to enable students and practitioners to apply digital tools in a professional manner and understand their potential, bias and limits. Critical reading and thinking are no longer enough in terms of safeguards, but have to be complemented with a more technical and mathematical understanding of digital phenomena (Scagliola 2016).

In addition to the traditional Rankian inquiry into the context in which a historical source came into existence, two additional processes of creation and possible manipulation need to be scrutinized. The first involves identifying alterations and loss of context that occur during the transformation from analog source to digital object (Fickers 2012, Treleani 2013). Transparency should be the norm, as to who was involved in the chain of digitization, what choices were made and what tools were used. If this is absent, the scholar must have enough contextual and technical knowledge to be able to identify and reconstruct this gap to the extent possible and evaluate how it may influence the historical interpretation of the object.

The second process relates to a better understanding of the algorithm-based selection bias of search engines, since these increasingly determine our frame of reference and have also penetrated academic library systems (Van Dijck 2010, Vaidhyanathan 2009). It looks as if our earlier dependency on the policy of the national archive with regard to granting access to documents based on national security and other concerns has been substituted by a dependency on the biggest stakeholder in search technology: Google. The merits and perils of algorithm-based search technologies have been the object of academic debates and have led to reflections on the epistemology of the digital environment (Wouters et al. 2013, Liu 2014). However, these remain limited discussions between the ‘usual suspects within the community of DH scholars’. They do not seem to matter enough to push for reforming, if not revolutionizing, the curriculum.

Crap-Detection or Digital Philology?

The question we face is how to go about adjusting and adapting the classical humanities curriculum to the requirements of 21st century academic research. Where do we start? Should we make a distinction between general academic digital skills and those that are calibrated for specific fields of research such as history?

When observing the learning subject ‘methods of research’, which is often taught in the first year of a humanities bachelor curriculum, one gains the impression that with the ‘Googlization of knowledge’ and the more general digitization of information (Vaidhyanathan 2009), topics that in the past belonged to distinctive (sub-)fields of research such as critical media studies, information science, literacy studies and education studies are now more and more alike. This calls for a renegotiation of boundaries and a specification of what is distinctive about history.

When we look at the realm of education, the call for training young people in assessing the trustworthiness of what they consult and of what they engage with through social media is a recurrent feature. There are many initiatives aiming at making the use of digital media less dangerous for the novice in the field (Scanlon 2014, Cartelli 2013, Bellanca 2010). The writer Howard Rheingold has re-introduced Hemingway’s journalistic principles for ‘crap-detection’, and points to the importance of web resources that give advice on how to detect false information (Rheingold 2013).

However, when students enter academia with the intent to explore the world of historical narratives, philosophical concepts and general cultural heritage, will the possession of general critical media literacy be enough to avoid pitfalls? It seems that some special skills are needed. In addition to being able to distinguish fake from real, they should also be able to trace back the history of the various versions of a document. This philological inquiry in a digital environment requires understanding the back end of a digital document and sometimes requires applying forensic software to detect the trail of binary digits that each manipulation has left. Moreover, Web 2.0 and forthcoming Web 3.0 technology also require students and academics to be able to express their thoughts and insights in other ways than writing a text in the form of an essay. Therefore, digital source criticism, when applied to history, involves more than a mere critical reading of digital sources and writing of articles that are published online. It entails the active application of tools to trace and detect changes, and to create digital content. It is not just one more method as part of a wider repertoire of the historian’s craft; it is a new concept of conducting historical research. This has serious implications for what needs to be put in practice and consequences for its relationship to the existing curriculum. That curriculum has an established status with ingrained social practices, maintained by lecturers who have invested effort in it. Changing these practices requires patience and diplomacy.

On the Verge of Transformation

Passive if not active resistance among lecturers when trying to introduce digital methods in the humanities is not uncommon. This is often seen as being an instinctive reaction to protect established positions of power and expertise (Scanlon 2013, De Jong et al. 2011). Fear of new technologies and distrust of rosy promises about what such technologies can do also play a role. Another obstructive element can be the rigid organizational structure of traditional academic teaching, which is based on the time span of lectures of just one or two hours. This hardly leaves space for learning new skills, let alone experimenting (Henderson and Romeo 2013).

To explore the space for the subject of Digital Source Criticism at the Faculty of Humanities of the University of Luxembourg, a small-scale user study was conducted.7 The Faculty is a salient environment for testing interest in Digital Source Criticism, as it is experiencing considerable institutional changes. As of October 2016, the new Centre for Contemporary and Digital History has been established, which will take up innovative research and teaching in close collaboration with its former basis, the Institute of History.

The first part of the user study consisted of a presentation of the envisioned format for lessons on DSC during the main meeting of the Institute of History, followed by a survey.

The plan is to create an appealing video essay around a particular data type in which the digital version is problematized and compared to its analog version. Subsequently students have to read literature and conduct research, and finally create a digital publication or object with a similar type of data with the help of digital tools. The survey to collect feedback on this format was sent out to 40 colleague historians, a mix of professors, lecturers and Ph.D. students. This yielded nine benevolent responses, which all stressed the importance of the topic, but also the existing limitations to integrating it into their lessons, due to lack of expertise and time, and of space within the limits of the prescribed ICTS.

7 The consultation of lecturers is work in progress; it should be completed in the coming months and should yield a more solid foundation for designing and realizing Ranke.2, the new teaching platform on Digital Source Criticism.

The next step was to organize focus groups with colleagues from the new centre. Four meetings were held with three to four participants, a mix of junior and senior colleagues. In addition, a few face-to-face interviews were held. The background of the participants varied; most of them were historians, among whom media studies was overrepresented. Special weight was given to the feedback of an information scientist and of two historians specialized in digital methods, all three with ample teaching experience. Again, they were first shown the presentation on the ideal-typical format of the Digital Source Criticism lesson, after which three main questions were presented:

I. In what way is digital source criticism relevant for your research?
II. What do you regard as necessary digital skills for students (basic, academic, specific for historians)?
III. What would you choose to integrate in your courses: the video essay, the assignments, the hands-on component or a combination?

The feedback to the presentation and questions was in most cases recorded and later transcribed. In a few cases notes were jotted down during the interview. The most salient concerns and preferences that came out of the consultations are summarized below:

- The level of digital literacy when entering the university

The level of competences is too diverse because of a lack of systematic coverage of the topic in secondary education. An entrance test should be considered, so that the gaps can be covered with individual training units.

- Limited time

Digital literacy and competences to deal with digital data are best taught in collaborative projects, which take up time because of the need to teach skills. Think of how much time it takes to learn to write according to academic standards. At the same time, lecturers of thematic courses consider digital source criticism as a topic that belongs to the subject ‘research methods’, a subject with a limited number of hours in the curriculum which is offered only once, most often in the first year of a bachelor. Most teaching is thematic and not about methods.

- The ‘branding’ of the term Digital Source Criticism is problematic

Creating a special term for this type of source criticism suggests it is a different and new practice. A lecturer of ‘methods of research’ suggested using the generic term Source Criticism, which can be applied to any source, regardless of whether it is in analogue or digital form.

- There is a need for continuity in the ‘framing’ of the problem

Some lecturers of media studies stated that giving too much attention to the transformation from analog to digital risks obscuring the many transformations and manipulations that already occur between analog media (e.g. in the process of editing newsreels). They prefer to frame the subject in a more general way, e.g. ‘reflecting on transformations’.

- The majority of researchers and PhD students work with non-digitized sources

Taking into account how many lecturers and researchers work with thematic subjects and with data and literature that are not digitized, it would be disproportionate to place Digital Source Criticism, a methodological topic, as a central subject on the curriculum. The principle of ‘hybrid’ research cultures should be emphasized, as it connects better to the dominant teaching practice.

Conclusion

To address such concerns, a smart communication strategy should be considered in which ‘digital source criticism’ is presented as a ‘hybrid concept’ that encompasses both differences and continuities in dealing with source criticism. What could be considered is to substitute the principle of a series of lessons that would take up much of the time in the curriculum with smaller teaching units with a digital component. These could be complementary in a thematic course, and more central in a methodological subject. A way to support this approach would be to follow the pedagogical principle of the SAMR model, which stands for Substitute, Augment, Modify, Redefine. It was designed to gradually integrate technology into the curriculum (Puentedura 2014). The process starts with merely substituting tasks that have to be completed manually with a technology, and then gradually adding technological components to familiarize new users with the possibilities that they offer. The outcome of this gradual process should be a redefinition of the original task.

This SAMR model approach is currently being considered as an instrument to realize the envisioned transition. At the same time, however, master and PhD students will be immersed in intensive DH collaborative courses with experimental components at the new centre.

The policy of combining gradual change with immersive and experimental learning could be the solution to create a common ground among different generations of historians and future generations of students of history.

References

James A. Bellanca (2010), 21st Century Skills: Rethinking How Students Learn, Solution Tree Press. See also: http://www.p21.org/about-us/our-history

Catherine Francis Brooks (2016), ‘Disciplinary convergence and interdisciplinary curricula for students in an information society’, in: Innovations in Education and Teaching International, http://www.tandfonline.com/toc/riie20/current

Antonio Cartelli (ed.) (2013), Fostering 21st Century Digital Literacy and Technical Competency, Information Science Reference.

José van Dijck (2010), ‘Search engines and the production of academic knowledge’, International Journal of Cultural Studies, 13(6). doi:10.1177/1367877910376582

Andreas Fickers (2012), ‘Towards A New Digital Historicism? Doing History in the Age of Abundance’, VIEW Journal of European Television History and Culture, 1(1).

Pascal Föhr (2015), ‘Poster "Historical Source Criticism in the Digital Age"’, Historical Source Criticism, 31 March 2015, http://hsc.hypotheses.org/328

Michael Henderson and Geoff Romeo (2016), Teaching and Digital Technologies: Big Issues and Critical Questions, Cambridge University Press.

Rodney H. Jones and Christoph A. Hafner (2012), Understanding Digital Literacies: A Practical Introduction, Routledge.

De Jong, Ordelman, Scagliola (2011), ‘Audio-visual Collections and the User Needs of Scholars in the Humanities; a Case for Co-Development’, Proceedings of Supporting Digital Humanities, Copenhagen. http://files.beeldengeluid.nl/pdf/r-en-d_audio-visual-collections-and-userneeds_dejong-ordelman-scagliola_20111117.pdf

Alan Liu (2014), ‘Theses on the Epistemology of the Digital: Advice For the Cambridge Centre for Digital Knowledge’, http://liu.english.ucsb.edu/theses-on-the-epistemology-of-the-digital-page

Ruben Puentedura (2014), SAMR and TPCK: A Hands-On Approach to Classroom Practice, http://www.hippasus.com/rrpweblog/archives/000140.html

Howard Rheingold (2013), http://rheingold.com/2013/crap-detection-mini-course/, retrieved 1-5-2017.

Kasper Risbjerg Eskildsen (2008), ‘Leopold Ranke’s archival turn: location and evidence in modern historiography’, Modern Intellectual History, 5(3), pp. 425–453. doi:10.1017/S1479244308001753

Eileen Scanlon (2014), ‘Scholarship in the digital age: Open educational resources, publication and public engagement’, British Journal of Educational Technology, 45: 12–23. doi:10.1111/bjet.12010

Matteo Treleani (2013), ‘Recontextualisation; ce que les média numériques font aux documents audiovisuels’, in: Réseaux, 1 (no 177), http://www.cairn.info/publications-de-Treleani-Matteo--99590.htm

Stefania Scagliola (2016), ‘Digital Source Criticism in the 21st Century: Reconsidering Ranke’s Principle in the Digital Age’, blog Digital History Lab, August 2016. http://www.dhlab.lu/blog-post/digital-source-criticism-inthe-21st-century-reconsidering-rankes-principles-in-the-digital-age/

Joshua Sternfeld (2014), ‘Historical Understandings in the Quantum Age’, Journal of Digital Humanities, Vol 3, nr. 2, http://journalofdigitalhumanities.org/3-2/historical-understanding-in-thequantum-age/

Siva Vaidhyanathan (2009), ‘The Googlization of Universities’, in: The NEA 2009 Almanac of Higher Education, http://www.nea.org/assets/img/PubAlmanac/ALM_09_06.pdf

Paul Wouters, Anne Beaulieu, Andrea Scharnhorst and Sally Wyatt (eds) (2013), Virtual Knowledge: Experimenting in the Humanities and the Social Sciences.

Gerben Zaagsma, ‘On Digital History’, BMGN - Low Countries Historical Review 128/4 (2013) 3-29.

3. Individual presentation: Video essays and the new possibilities for film criticism and pedagogy
Irina Trocan, Cinema and Media PhD, National University of Film and Theatre Bucharest

The shift of film criticism to the online sphere in recent years has led to a number of mutations, including the increase in popularity of a relatively new format: the video essay. Roughly an audiovisual version of film criticism, a mode of analysis that employs the discussed object (the cinematic work) directly, the video essay quotes the film even as it deconstructs it. It can therefore be easier to grasp without necessarily being simplified as discourse (a seven-minute clip can be as rich and thoughtful as a long-form essay) and allows for the survival of intelligent film criticism in a rather dyslexic cultural environment.

The aim of this presentation is to summarize the current state of video essays and their aesthetic and didactic possibilities. In 2017, the history of video essays is simultaneously too short and too long. Since the form is roughly a decade old in popular view, in order to discern its influences, one would have to look beyond the practice itself to examine either the more timeworn tradition of essay cinema (the non-narrative films of Chris Marker, Jean-Luc Godard, Harun Farocki) or the audiovisual histories and TV broadcasts on the subject of cinema, Mark Cousins' The Story of Film: An Odyssey or A Personal Journey with Martin Scorsese through the American Cinema being popular examples. However, a decade of video-essay-making is also long enough for the form to have experienced its first moments of crisis and for attempts to theorize it to become increasingly difficult and dangerously reductive. For instance, video essays made circa 2014 were problematic in their over-reliance on voice-over (i.e. audio commentary of the author overlapped with the images), whereas in 2017, being aimed at social media distribution, several of them adopt the irrelevant/muted audio, text-on-screen format, thus placing all the weight on the visual component; faced with the newer pattern, commenters have gone from pleading for less voice-over to asking for more of it. This constantly changing media landscape makes it urgent to develop strategies for aesthetic evaluation and curation of video essays; otherwise, the overproduction of online content will obscure the best ones and the more provocative possibilities of the form.

Essential (though understated) production guidelines

Due to their ability to quote from film with no need of processing it into a new language, popular video essays are often made from immediately striking fragments: striking film imagery (as in Stanley Kubrick films), dialogues (Aaron Sorkin-scripted one-liners), or even blatant juxtapositions (comparing two stylistically similar films in a split-screen, with the aim of proving just how much the later film borrows from the earlier, usually canonic one). However, their range of subjects largely overlaps with that of cinephile/pop online criticism: overviews of a certain artist's filmography, a certain genre, film festival, national cinema, trend of technical evolution in film craft.

There are already a few prominent platforms for launching video essays, which provide video-essayists with opportunities (on-the-job training, access to necessary media) even as they sometimes limit their creative options. The first and already most controversial is the video-on-demand platform Fandor with its annexed publication, Keyframe; others are the BFI/Sight & Sound website; the Netherlands-based platform Filmkrant; MUBI (also annexed to a VOD platform), and the most academic-oriented, [in]Transition (which is more similar to a distributor than a producer, to borrow terminology from the film market).

While there are also video-essayist 'superstars' with distinctive styles, for the sake of brevity, I will only focus on the institutional guidelines which they must follow. Studying these authors' work over several years proves that, even in this seemingly lax working process, shifting editorial demands can have a significant impact on what they produce and how widely it circulates. I would further argue that the formative training of the video-essayists (whether they are filmmakers, critics, academics) is itself only partly relevant to the rigor or whimsicality of their videographic criticism. Although the format is in rapid development and expansion, and making a video essay is hypothetically accessible to anyone who owns a computer and editing software, hierarchies and mandatory style markers can easily be traced among the most well-known video essays made to date, which once again indicates that the total creative freedom of the Internet is merely a utopian dream.

Challenges to the development of video essays

The difficulties of this new form tend to be pragmatic, since video essays depend on very precarious factors. The first is their survival and continued availability in the online sphere, which the recent Fandor scandal, involving the withdrawal of several hundred video essays, has proved tenuous. The second is the legal circumstance of their right to exist, namely the Fair Use copyright exception: this states that clips of artwork can be used by individuals without permission and copyright ownership as long as the ultimate purpose is different from the straightforward exploitation of the material. A note on Fair Use in the brochure The Videographic Essay: Criticism in Sound and Image ends with a disclaimer that its authors merely offer peer advice: they are not, nor do they claim to be, lawyers.

Video essays as study material

Among the most remarkable feats of video essays is the popularization of film as theory, or audiovisual thinking. As Volker Pantenburg points out in his comparative study of Farocki and Godard, theory has thus far been predominantly linguistic, even when it is self-reflexive and proposes a break with the dominant “amalgam of structuralism, Lacanian psychoanalysis, post-structuralism, and Marxism”. As Pantenburg puts it, “writing against the film theories of the 1970s continues to assume a clear distinction between the films on the one side and their analysis and theorization on the other.”

Similarly, in his 2012 essay Visualization Methods for Media Studies, Lev Manovich could be talking about video essays when using terms like “collection montage” and claiming there is a future in visualization of media artifacts when grouping them by intrinsic, yet-unarticulated features: “the most important question, which is still unresolved, is how to combine distant and close readings”. For this, video essays could be a powerful tool of scholarship and a more complex way of conveying information than written language.

Bibliography

Eric Faden, Catherine Grant, Kevin B. Lee, Jason Mittell, The Videographic Essay: Criticism in Sound and Image, caboose books

Pantenburg, Volker (2015), Farocki/Godard: Film as Theory (Film Culture in Transition), Amsterdam University Press

Wees, William C. (1993), Recycled Images: The Art and Politics of Found Footage Films, Anthology Film Archives

Manovich, Lev (2001), The Language of New Media, MIT Press

Manovich, Lev (2012), Museum Without Walls, Art History Without Names: Visualization Methods for Humanities and Media Studies, manovich.net

Witt, Michael (2013), Jean-Luc Godard, Cinema Historian, Indiana University Press

Session C

1. The Pyramid of Conscientious Digital Humanities Research: how to get a ‘general idea of what you should be seeing’
Serge ter Braake, University of Amsterdam

‘The only way to know if your results are useful or wildly off the mark is to have a general idea of what you should be seeing.’8

The question of how to cope with a massive number of digital humanities texts, and the tools to process them, has led to publications on ‘algorithmic criticism’, ‘tool criticism’ and ‘data criticism’. What these publications have in common is the quest for a conscientious way to deal with tools and data, balanced with humanist domain knowledge and methodologies.9 Humanities texts can be poems that were written after a sudden burst of inspiration, well-crafted texts on the history of an empire, the innermost thoughts of a diary writer or conscientiously crafted bookkeeping accounts of long-gone rulers. The field of Digital Humanities tends to treat these texts quite badly. Texts are ripped out of their original contexts, chopped into pieces, linked to other texts, and used for analyses that go far beyond their original intentions.

Depending on the research question of the individual researcher, or research group, this ‘text ransacking’ is not necessarily a bad thing. Digital Humanities can, should, and does ask questions that go beyond the scope of texts that could be studied intensely by one human being. There are, however, plenty of dangers involved in using digital tools without really knowing what exactly they do. First of all there is the question of when we know enough of what a tool does to perform conscientious digital analyses. Secondly there is the question of whether we keep (enough) in touch with the material we study with digital methods. Where lies the domain knowledge threshold that is necessary to deal with digital data carefully? At what point do we have a ‘general idea of what we should be seeing’?

The danger of ‘black box tooling’ is increasingly getting attention.10 The danger of losing touch with the original source material requires some further explanation. For some humanities scholars, digital humanities research mainly extends the work they are already doing: same kind of data, larger approaches. When Father Roberto Busa initiated the Index Thomisticus in the 1940s, he obviously was already familiar with the work of Thomas Aquinas. When literary scholars want to study the language use in the works of Jane Austen we may assume they have already read quite a bit of Austen. These scholars certainly already have a general idea of what they could be seeing. When historians use large newspaper archives for digital research, however, including different newspapers spanning numerous decades, things become more complex. Historians are often experts on one or several historical topics, with the necessary archival sources attached to them. Few historians are experts on a wide variety of historical newspapers. This problem is enlarged by the way digital tools deal with these newspapers. Text is transformed into ‘data’, taken away from the page and its surroundings, and transformed together with other pieces of text into an aggregated result.11

8 Megan R. Brett, ‘Topic Modeling: A Basic Introduction’, Journal of Digital Humanities, vol. 2, nr. 1, Winter 2012.
9 To cite only a few: On ‘algorithmic criticism’ the slightly dated but still insightful: S. Ramsay, Reading Machines: Toward an Algorithmic Criticism (Chicago 2011). On data criticism: Frederick W. Gibbs and Trevor J. Owens, ‘The Hermeneutics of Data and Historical Writing (2012 revision)’, in: Jack Dougherty and Kristen Nawrotzki eds., Writing History in the Digital Age (Michigan, 2013). On tool criticism: S. ter Braake, A.S. Fokkens, N. Ockeloen and C. van Son, ‘Digital History: towards new methodologies’ in: Bozic, Mendel-Gleason, Debruyne and O’Sullivan eds., 2nd IFIP Workshop on Computational History and Data-Driven Humanities (2016).
10 See for example the Tool Criticism Workshop in Amsterdam: http://event.cwi.nl/toolcriticism/; Albert Meroño-Peñuela, Ashkan Ashkpour, Marieke van Erp, Kees Mandemakers, Leen Breure, Andrea Scharnhorst, Stefan Schlobach, Frank van Harmelen, ‘Semantic Technologies for Historical Research: A Survey’, Semantic Web Journal, Volume 6, Number 6 (2015) 539-564; Ter Braake et al., ‘Digital History’.

The questions I want to address here are:

1. When does a researcher know enough of a tool to use it conscientiously?
2. When does a researcher know his or her material well enough to use digital tools for distant reading analyses?

And finally, springing forth from this:

3. At what point do we decide that the answers to 1) and 2) are not cost-efficient anymore? At what point should we decide that a ‘simple’ tool and close reading practices are more practical for humanist research than complicated tools used on large datasets?

If we want to visualise the interplay between researcher, algorithm, tool, interface and data, then we can come to a pyramid of conscientious digital humanities research, as visualised below. On top there is the humanist researcher, with all of his or her presuppositions acquired from prior knowledge. This researcher will mostly be working with an interface, but also has to understand the tool behind the interface and the data and algorithms behind the tool. If the humanist lacks a sufficient grasp of either the computer algorithms or the data that is used, the results that are provided by the tool through an interface may be misinterpreted, or significant errors may not be spotted.

11 For example the ShiCo tool, tracing concepts through time: https://github.com/NLeSC/ShiCo. See for reflections on the loss of context C. Jeurgens, ‘The Scent of the Digital Archive: Dilemmas with Archive Digitisation’, BMGN - Low Countries Historical Review 128(4) (2013) pp. 30–54.

In short, there should be a ‘general idea’ of what we could be seeing, both by knowing the tool and the data. In this presentation, I will present a proposal, a step-by-step plan, of what could be done to reach this general understanding by taking the example of my own research on concept drift in De Gids and Vaderlandsche Letteroefeningen, two nineteenth-century journals dealing with all kinds of topics of general interest. These steps include: 1) manual close reading; 2) digital close reading; 3) digital analysis; 4) criticism of the results; 5) reflection on steps 1 and 2: were they sufficient? 6) reflection on step 3: was this tool the best to use for this purpose?

When going through this cycle these questions should always be considered: at what point are the requirements for conscientious digital humanities research too high to be worth the effort? At what point is the pyramid too costly? When is it more efficient, and in fact conscientious, to settle for a ‘simpler’ tool? At what threshold should the digital make room again for more traditional humanities?

2. This is my ground truth, tell me yours: Potentials of multiple annotations for digital humanities
Berit Janssen, Meertens Institute, Amsterdam and Institute for Logic, Language and Computation, University of Amsterdam

Many methods in digital humanities rely on computational methods, which may be trained on a set of reference annotations, also referred to as ground truth. However, human judgements are rarely unanimous: this led to research into how information from human judges can be best combined to increase knowledge of the “true” relationships in data (e.g., Dong et al., 2014). However, in many domains, for instance in music information retrieval, it may be assumed that multiple annotator judgements may form equally valid interpretations of data such as music similarity or chord estimation (Koops et al., 2016; Schedl et al., 2014). The present contribution shows how multiple annotations can be used to reveal human strategies and knowledge by investigating how annotators may agree or disagree on different subgroups in data.

As an example, I present a dataset of annotations on phrase similarity in 360 Dutch folk songs.12 These folk songs are categorized into 26 groups of variants, or tune families. Three annotators worked independently to give labels to phrases within tune families. The labels consisted of a letter combined with a number, with which annotators could indicate similarity in three categories: “almost identical” (same letter and number), “related but varied” (same letter but different number), and “different” (different letter and number). The annotators did not agree on phrase similarity at all times, but with Fleiss’ κ = 0.71 (Fleiss & Cohen, 1973), the agreement was substantial.
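The labelling scheme and the agreement statistic can be illustrated with a minimal sketch. It assumes that a label is a single letter followed by a number (e.g. "A0"), that a differing letter alone marks two phrases as "different", and that agreement is measured over the similarity category each annotator implies for every pair of phrases; the abstract does not state how κ was actually computed, and the labels and phrase ids below are invented for illustration.

```python
from collections import Counter
from itertools import combinations


def similarity(label_a: str, label_b: str) -> str:
    """Translate two phrase labels (letter + number, e.g. "A0") into a category."""
    if label_a[0] != label_b[0]:
        return "different"            # different letter
    if label_a[1:] == label_b[1:]:
        return "almost identical"     # same letter and same number
    return "related but varied"       # same letter, different number


def fleiss_kappa(ratings: list[list[str]]) -> float:
    """Fleiss' kappa for items that were each rated by the same number of raters."""
    n_items, n_raters = len(ratings), len(ratings[0])
    counts = [Counter(row) for row in ratings]
    categories = {category for row in ratings for category in row}
    # Observed agreement: share of agreeing rater pairs, averaged over items.
    observed = sum(
        (sum(c[k] ** 2 for k in categories) - n_raters) / (n_raters * (n_raters - 1))
        for c in counts
    ) / n_items
    # Chance agreement, from the overall proportion of each category.
    expected = sum(
        (sum(c[k] for c in counts) / (n_items * n_raters)) ** 2 for k in categories
    )
    if expected == 1.0:  # only one category was ever used
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical labels from three annotators for four phrases of one tune family.
annotations = [
    {"p1": "A0", "p2": "A0", "p3": "A1", "p4": "B0"},
    {"p1": "A0", "p2": "A1", "p3": "A1", "p4": "B0"},
    {"p1": "A0", "p2": "A0", "p3": "B0", "p4": "C0"},
]
pairs = list(combinations(["p1", "p2", "p3", "p4"], 2))
# One row per phrase pair, one similarity category per annotator.
ratings = [[similarity(a[p], a[q]) for a in annotations] for p, q in pairs]
kappa = fleiss_kappa(ratings)
```

Working on phrase pairs rather than on raw labels is a design choice of this sketch: annotators are free to pick different letters for the same phrase, so only the relation between two labels is comparable across annotators.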

12 Available from liederenbank.nl/mtc

The dataset was used to evaluate pattern matching algorithms: these algorithms compared each phrase in the dataset against the melodies within the tune family from which the query phrase was taken, and returned a match score. For evaluation purposes, the three annotations were combined through a majority vote: if two or more annotators had given any phrase in a given variant the same label as that of the query phrase, the variant was considered to contain an instance of the phrase, which a pattern matching algorithm should find (cf. Janssen, van Kranenburg & Volk, 2017).
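The majority-vote rule described above can be sketched as follows. The data layout (one dictionary of phrase id to label per annotator) and all phrase ids and labels are assumptions made for this illustration, not the format of the published dataset.

```python
def annotator_finds_instance(labels: dict[str, str], query: str,
                             variant_phrases: list[str]) -> bool:
    """True if this annotator gave any phrase of the variant the query's label."""
    return any(labels[phrase] == labels[query] for phrase in variant_phrases)


def majority_vote(all_labels: list[dict[str, str]], query: str,
                  variant_phrases: list[str], min_votes: int = 2) -> bool:
    """True if at least `min_votes` annotators see the query phrase in the variant."""
    votes = sum(
        annotator_finds_instance(labels, query, variant_phrases)
        for labels in all_labels
    )
    return votes >= min_votes


# Hypothetical annotations: query phrase "q", variant 1 = (v1, v2), variant 2 = (w1, w2).
all_labels = [
    {"q": "A0", "v1": "A0", "v2": "B0", "w1": "B0", "w2": "B1"},
    {"q": "A0", "v1": "A1", "v2": "A0", "w1": "A0", "w2": "C0"},
    {"q": "C0", "v1": "A0", "v2": "B0", "w1": "D0", "w2": "E0"},
]
variant_1_relevant = majority_vote(all_labels, "q", ["v1", "v2"])  # two annotators agree
variant_2_relevant = majority_vote(all_labels, "q", ["w1", "w2"])  # only one annotator
```

Each annotator's labels are compared only with that annotator's own label for the query phrase, so the vote does not require annotators to have used the same letters.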

The added value of combining multiple annotations is that, next to the evaluation of pattern matching algorithms, the annotators themselves may also be compared to the majority vote. This comparison shows that individual annotators agree with the majority vote in around 87% of cases: they miss about 10% of the relevant phrase instances, and find about 10% irrelevant occurrences, as compared with the majority vote. Flexer and Grill (2016) showed how such inter-rater disagreement introduces an upper bound for various tasks in music information retrieval.
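A comparison of one annotator against the majority vote might be computed along the following lines. This is only a sketch: the abstract does not give the denominators behind its percentages, so the definitions used here (misses relative to the majority's relevant instances, irrelevant finds relative to everything the annotator marked) are assumptions, and the decisions are toy values.

```python
def compare_to_majority(annotator: list[bool], majority: list[bool]) -> dict[str, float]:
    """Agreement, miss rate and irrelevant-find rate of one annotator vs. the majority."""
    pairs = list(zip(annotator, majority))
    hits = sum(a and m for a, m in pairs)             # found and relevant
    missed = sum((not a) and m for a, m in pairs)     # relevant but not found
    spurious = sum(a and (not m) for a, m in pairs)   # found but not relevant
    agreeing = sum(a == m for a, m in pairs)
    return {
        "agreement": agreeing / len(pairs),
        # Assumed denominator: all instances the majority considers relevant.
        "missed": missed / (hits + missed) if hits + missed else 0.0,
        # Assumed denominator: all instances this annotator marked.
        "irrelevant": spurious / (hits + spurious) if hits + spurious else 0.0,
    }


# Toy decisions for five (query phrase, variant) combinations.
annotator_decisions = [True, True, False, False, True]
majority_decisions = [True, False, True, False, True]
report = compare_to_majority(annotator_decisions, majority_decisions)
```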

The current work presents a way to learn from inter-rater disagreement: the dataset is categorized into tune families, which form homogeneous groups of melodies with high distinctiveness between groups. An analysis of the distribution of disagreement with the majority vote over tune families reveals that individual annotators disagree with the majority vote in different ways, such that some tune families lead to few disagreements for one annotator, but many disagreements for another annotator. This differs from the errors produced by the three best-performing pattern matching algorithms: they show similar trends over the tune families, such that a tune family in which one algorithm produces many irrelevant results will also be more difficult to handle for other algorithms. This suggests that the strategies of the compared pattern matching algorithms may be similar, while the annotators bring different strategies to the table.
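The per-family analysis can be sketched as a simple grouping step: for each annotator, count how often his or her decision departs from the majority vote within each tune family. The family names and decisions below are invented for illustration, and the sketch does not reproduce the analysis actually used in the study.

```python
from collections import defaultdict


def disagreement_by_family(
    decisions: list[tuple[str, bool, bool]],
) -> dict[str, float]:
    """Share of decisions per tune family in which the annotator departs from the majority.

    Each entry of `decisions` is (tune_family, annotator_decision, majority_decision).
    """
    totals: dict[str, int] = defaultdict(int)
    disagreements: dict[str, int] = defaultdict(int)
    for family, annotator_decision, majority_decision in decisions:
        totals[family] += 1
        disagreements[family] += annotator_decision != majority_decision
    return {family: disagreements[family] / totals[family] for family in totals}


# Hypothetical decisions of two annotators over two tune families.
annotator_1 = [
    ("family_1", True, True), ("family_1", False, False),
    ("family_2", True, False), ("family_2", False, True),
]
annotator_2 = [
    ("family_1", False, True), ("family_1", True, False),
    ("family_2", False, False), ("family_2", True, True),
]
profile_1 = disagreement_by_family(annotator_1)
profile_2 = disagreement_by_family(annotator_2)
```

In this toy case the two profiles mirror each other: the first annotator departs from the majority only in the second family, the second annotator only in the first, which is the kind of pattern the paragraph above describes for human annotators.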

References

Dong, X. L., Gabrilovich, E., Heitz, G., Horn, W., Murphy, K., Sun, S., & Zhang, W. (2014). From data fusion to knowledge fusion. Proceedings of the VLDB Endowment, 7(10), 881-892.

Fleiss, J. L., & Cohen, J. (1973). The equivalence of weighted kappa and the intraclass correlation coefficient as measures of reliability. Educational and Psychological Measurement, 33(3), 613-619.

Flexer, A., & Grill, T. (2016). The Problem of Limited Inter-rater Agreement in Modelling Music Similarity. Journal of New Music Research, 45(3), 239-251.

Janssen, B., van Kranenburg, P. & Volk, A. (2017, in press). Finding occurrences of melodic segments in folk songs employing symbolic similarity measures. Journal of New Music Research.

Koops, Hendrik Vincent, et al. "Integration And Quality Assessment Of Heterogeneous Chord Sequences Using Data Fusion." International Society for Music Information Retrieval Conference. 2016.

Schedl, M., Gómez, E., & Urbano, J. (2014). Music information retrieval: Recent developments and applications. Foundations and Trends in Information Retrieval, 8(2-3), 127-261.

3. Digital History Projects as Boundary Objects

Max Kemman, University of Luxembourg, max.kemman@uni.lu

Digital history is concerned with the incorporation of digital methods in historical research practices. Thus, digital history aims to use methods, concepts, or tools from other disciplines to the benefit of historical research, making it a form of methodological interdisciplinarity (Klein, 2014). This requires expertise of different facets, such as history, technology, and data management, and as a result many digital history activities are a collaboration of scholars and professionals from different backgrounds.


Such collaborations would fit Svensson’s characterisation of digital humanities as a fractioned trading zone (Svensson, 2011, 2012). Simply stated, this means first that digital humanities functions as heterogeneous collaborations, i.e., with participants from different disciplinary backgrounds, and second that the participants act voluntarily.

In this paper, we will investigate these two aspects in the context of digital history to understand how digital history projects function as heterogeneous collaborations, and what the participants’ incentives are for entering such collaborations.

We will look at digital history projects as boundary objects, a concept developed by Leigh Star and Griesemer to describe an object that maintains a common identity among the different participants, yet is shaped individually according to disciplinary needs (Star and Griesemer, 1989; Star, 2010). This concept could be used, for example, to refer to the tool under development, or the data on which the tool and historian will work. However, in this paper we will approach the project itself as a boundary object; the project binds the participants together, and all participants subscribe to a common description of the project’s goals, while at the same time the participants shape the project according to their own needs. As one digital history project coordinator described it in an interview:

“[Y]ou have a research idea, and you fit that to the call you’re applying to, and then you get funding… And if you then hire researchers, yes they too have their own idea of course, and their own line of research they’re working on, and they try to fit that in the research project.”

This leads us to investigate the incentives for collaboration. Writing about interdisciplinary collaboration in digital history almost always underscores its positive or even necessary effects (e.g. Eijnatten et al., 2013; Hitchcock, 2014; Sternfeld, 2011). However, such collaboration is not trivial and requires dedication and investments from all involved, e.g. as shown by Siemens (2009; 2012). In order to investigate the activities of individual participants we will follow the work of Weedman (1998) on incentives for collaborations between earth scientists and computer scientists. For several digital history projects based in the Benelux, we have interviewed the participants and inquired about their reasons for joining the project, their individual goals with the project, and the expected effects of their participation after the project has ended. For example, in an interview one historian noted about their project:

“[W]e’re supposed to be advising the team developing the tool. And trying to then carry out research on a specific case study. And so originally it was like wow we’re going to be able to use the tool, but very quickly it became clear ok actually probably we’re not going to be able to use the tool.”

By looking into the incentives of all the participants of a project, we will unpack the trading zones of digital history projects, to gain an understanding of how heterogeneous, interdisciplinary collaborations work, and how participants shape these collaborations. This will allow us to look into why a situation such as the one described above by this historian occurs, and how individual shaping of the project can lead to this. Moreover, we will argue that these incentives go beyond disciplinary boundaries, which means that the trading zone in a digital history project is more complex than the (in)famous Two Cultures as described by C.P. Snow.

This research is part of PhD research on how the interdisciplinary interactions in digital history affect the practices of historians on a methodological and epistemological level (Kemman, 2016). By unpacking digital history projects, we aim to gain better insight into how digital history functions as a coordination of practices between historians and collaborators from different backgrounds, and how individual incentives shape this coordination.


References

Eijnatten, J. van, Pieters, T., and Verheul, J. (2013). Big Data for Global History: The Transformative Promise of Digital Humanities. BMGN - Low Countries Historical Review, 128(4): 55–77.

Hitchcock, T. (2014). Big Data, Small Data and Meaning. Available from: http://historyonics.blogspot.co.uk/2014/11/big-data-small-data-and-meaning_9.html.

Kemman, M. (2016). Dimensions of Digital History Collaborations. DHBenelux. Belval, Luxembourg.

Klein, J. T. (2014). Interdisciplining Digital Humanities: Boundary Work in an Emerging Field. University of Michigan Press, online edition.

Leigh Star, S. (2010). This is Not a Boundary Object: Reflections on the Origin of a Concept. Science, Technology & Human Values, 35(5): 601–617.

Leigh Star, S. and Griesemer, J. R. (1989). Institutional Ecology, ‘Translations’ and Boundary Objects: Amateurs and Professionals in Berkeley’s Museum of Vertebrate Zoology, 1907-39. Social Studies of Science, 19(3): 387–420.

Siemens, L. (2009). ‘It’s a team if you use “reply all”’: An exploration of research teams in digital humanities environments. Literary and Linguistic Computing, 24(2): 225–233.

Siemens, L. and INKE Research Group (2012). From Writing the Grant to Working the Grant: An Exploration of Processes and Procedures in Transition. Scholarly and Research Communication, 3(1).

Sternfeld, J. (2011). Archival theory and digital historiography: Selection, search, and metadata as archival processes for assessing historical contextualization. American Archivist, 74(2): 544–575.

Svensson, P. (2011). The digital humanities as a humanities project. Arts and Humanities in Higher Education, 11(1-2): 42–60.

Svensson, P. (2012). Beyond the Big Tent. In Gold, M. K., editor, Debates in the Digital Humanities. University of Minnesota Press, online edition.

Weedman, J. (1998). The Structure of Incentive: Design and Client Roles in Application-Oriented Research. Science, Technology & Human Values, 23(3): 315–345.


Session D

1. Modelling and Analyzing Character Networks in Recent Dutch Literature

Roel Smeets (PhD candidate), Radboud University Nijmegen, Department of Literary and Cultural Studies

Keywords: social network analysis, character networks, Digital Literary Studies, Dutch literature

Character relations

When we interpret novels we are influenced by (hierarchical) relations between characters. These relations are not neutral, but value-laden: e.g. the way in which we connect Clarissa with Richard is of major importance for our interpretation of the gender relations in Mrs Dalloway (1925). In literary studies, character relations have therefore lain at the foundation of a variety of critical studies on literature (e.g. Minnaard 2010, Song 2015). A basic premise in such criticism is that ideological biases are exposed in the (hierarchical) relations between representations of certain groups (i.e. gender, ethnicity, social class).

Close reading – the common, traditional method in literary studies – is well suited for fine-grained analyses of the nuances and subtleties of character relations, but falls short when it comes to finding patterns among character relations or testing hypotheses on character relations in larger bodies of literary texts (cf. Stronks 2013).

Social Network Analysis

In computational linguistics, a broadening range of research has been carried out in recent years on the computational analysis of social networks in (literary) texts (e.g. Elson et al. 2010, Karsdorp et al. 2012). On the basis of automated, computational models, character relations of all kinds are formalized and mapped in large amounts of texts. Although in its infancy, this branch of research shows that social networks can in fact be reliably extracted automatically from narrative texts (Van de Camp 2016), and relationships can also be classified accurately by computational models trained on examples, e.g. as being romantic (Karsdorp et al. 2015).

The current research project departs from the hypothesis that a computational approach to character relations can reveal (hierarchical) patterns between characters in literary texts in a more data-driven and empirically informed way. In order to test this hypothesis, experiments are being conducted with different forms of social network analysis of characters in a corpus of 170 recent Dutch literary novels. The two major methodological challenges are:

1. to define the nodes that constitute the social network of a novel
2. to define and to weigh the relations between the nodes

The first methodological challenge is about doing a form of character detection: NLP techniques such as Named Entity Recognition and Resolution, pronominal resolution and coreference resolution come to mind. However, automatic character detection in literary texts is far from a straightforward classification task (Vala et al. 2015).

The second methodological challenge is about finding a way to decide when and how two or more characters in a text ‘interact’. When Franco Moretti in his famous book Distant Reading (2013) made a character network of Shakespeare’s Hamlet, he did that on the basis of occurrences of character X (the addressee) in the lines of character Y (the speaker). Novels are fundamentally different from dramatic plays in that respect: characters in novels usually don’t speak to each other in a direct way, and the definition and weighing of character interaction therefore requires a different approach.


Top-down and bottom-up approach

In this talk I will argue that a practical combination of manually gathered data and computational analysis can yield insight into patterns between character relations in recent Dutch literature. Instead of using a bottom-up approach of character detection, I will start top-down using a predefined list of names of characters from each novel in my corpus. Furthermore, I will use manually gathered data from earlier research to ascribe demographic features to the characters that constitute the nodes of the network (Van der Deijl et al. 2016). As such, it will be possible to relate demographic backgrounds of characters to their respective place in the character network of the novel. More data are currently being gathered manually from the research corpus: thematic relations such as family, friend, lover, colleague and enemy, which will be used to depict the nature of the relations between the characters in the corpus.

I will demonstrate in this talk how manually gathered data (demographic features and thematic relations) can be used for defining both the nodes of the network and the nature of the relations between the nodes. Moreover, I will show how a top-down approach based on manually gathered data can be complemented and enriched by a bottom-up, computational analysis of co-occurrences, which will be used for weighing the relations (or: interactions) between the character nodes. The co-occurrence analysis will consist of precisely delineated textual windows (on the sentence level), which will be searched for tokens (variants of names, pronouns) referring to specific character entities that occur adjacent to tokens belonging to other character entities.
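A sentence-level co-occurrence count of this kind could look roughly like the sketch below. It is only an illustration under stated assumptions: the alias dictionary stands in for the predefined character list, the regular-expression sentence splitter is a crude placeholder, pronouns are not resolved at all, and the example text and names are invented rather than taken from the corpus.

```python
import re
from collections import Counter
from itertools import combinations


def cooccurrence_weights(text, aliases):
    """Count how often two characters are mentioned in the same sentence.

    `aliases` maps a character id to its name variants (hypothetical,
    manually compiled list). Returns a Counter keyed by sorted id pairs.
    """
    # Naive sentence split on ., ! or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    patterns = {
        char: re.compile(r"\b(?:" + "|".join(map(re.escape, names)) + r")\b")
        for char, names in aliases.items()
    }
    weights = Counter()
    for sentence in sentences:
        present = sorted(c for c, p in patterns.items() if p.search(sentence))
        for pair in combinations(present, 2):
            weights[pair] += 1
    return weights


# Invented example text and character list.
toy_aliases = {"anna": ["Anna", "An"], "piet": ["Piet"], "mo": ["Mo"]}
toy_text = (
    "Anna met Piet in Utrecht. Piet called An later. "
    "Mo stayed home. Anna and Mo argued while Piet watched."
)
print(cooccurrence_weights(toy_text, toy_aliases))
```

The resulting counts can serve as edge weights, while the manually gathered demographic features and thematic relations would be attached to the nodes and edges separately.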

References

Camp, Matje van de. 2016. A link to the past: Constructing Historical Social Networks from Unstructured Data. PhD thesis, Tilburg University (Tilburg School for Humanities).

Deijl, Lucas van der, Pieterse, Saskia, Prinse, Marion & Smeets, Roel. 2016. ‘Mapping the Demographic Landscape of Characters in Recent Dutch Prose: A Quantitative Approach to Literary Representation.’ In: Journal of Dutch Literature (7:1).

Elson, David, Dames, Nicholas & McKeown, Kathleen. 2010. ‘Extracting Social Networks from Literary Fiction’. In: Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics (ACL 2010), Uppsala.

Karsdorp, Folgert, Kranenburg, Peter van, Meder, Theo & Antal Van den Bosch. 2012. ‘Casting a spell: Identification and ranking of actors in folktales.’ In: F. Mambrini, M. Passarotti, and C. Sporleder (eds.), Proceedings of the Second Workshop on Annotation of Corpora for Research in the Humanities (ACRH-2), pp. 39–50.

Karsdorp, Folgert, Kestemont, Mike, Schöch, Christof, & Bosch, Antal van den. 2015. ‘The Love Equation: Computational Modeling of Romantic Relationships in French Classical Drama.’ In: Proceedings of the Sixth International Workshop on Computational Models of Narrative, pp. 98-107.

Minnaard, Liesbeth. 2010. ‘The Spectacle of an Intercultural Love Affair: Exoticism in Van Deyssel's Blank en geel’. In: Journal of Dutch Literature (1:1).

Moretti, Franco. 2013. Distant Reading. London: Verso.

Song, Angeline M. G. 2015. A Postcolonial Woman’s Encounter With Moses and Miriam. New York: Palgrave Macmillan US.

Stronks, Els. 2013. ‘De afstand tussen close en distant. Methoden en vraagstellingen in computationeel letterkundig onderzoek’. In: Tijdschrift Voor Nederlandse Taal- en Letterkunde (4).

Vala, Hardik, Jurgens, David, Piper, Andrew & Ruths, Derek. 2015. ‘Mr. Bennet, his coachman, and the Archbishop walk into a bar but only one of them gets recognized: On the difficulty of detecting characters in literary texts.’ In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pages 769–774, Lisbon, Portugal, Association for Computational Linguistics.

2. Spinozist discourse in Dutch textual culture (1660-1720): A computational approach to the dissemination of the Radical Enlightenment

Lucas van der Deijl, University of Amsterdam

Lia van Gemert, University of Amsterdam

Erik van Zummeren, University of Amsterdam

Contact: [email protected]

Keywords: Spinozism, Radical Enlightenment, topic modeling, discourse analysis, text mining

Since the linguistic turn, the term ‘discourse’ has been an important instrument for many humanities scholars (Bové 1995). It has become common practice to study cultural history through the language and discussions in which it was mediated. Currently, the growing availability of digitised historical material provides new ways and scales to study historical discourses, which have been recognised by digital humanities scholars at an early stage (Olsen & Harvey 1988). However, digital approaches to historical corpora face the problem that the often loosely defined term ‘discourse’ is not easy to formalise. In traditional literary studies, the very lack of definition is inherent in the influential post-structuralist paradigm that reinvented the term, in which meaning is considered ‘indefinite’ by definition. Within this tradition, discursive elements are measured through both manifest and latent semantic relations, with an equal focus on what is said and what is left out, forgotten or suppressed. Quantitative methods, by contrast, require a more reductive understanding of what a discourse comprises (e.g. Jockers 2013; Ramsay 2011). They primarily rely on information represented in computationally measurable text elements, which challenges the traditional use of the term. Digital Humanities thus promise new opportunities for cultural history, but also require a critical translation of traditional methodology.

A dominant approach in the study of intellectual discourses focuses on concepts (e.g. Mandelbaum 1965; Lovejoy 2001; Kuukkanen 2008). Philosophers and computational linguists have created models and methods in order to account computationally for conceptual change or drift through time (Betti & Van den Berg 2014; Kenter et al. 2015). Secondly, studies that employ digital text analysis to approach historical discourses often use ‘topics’ as a representation or indication of discursive patterns in large text corpora (e.g. Nelson 2010). Topic modeling is a useful technology for narrowing down a research corpus into a selection that could be of interest to the researcher. The method also allows tracing the evolution of dominant themes over time. It is especially useful when the researcher has no strong intuitions about the corpus: the power of topic modeling is its independence from assumptions (Underwood 2012). The use of topics as a measure for ‘discourse’ in the traditional sense is, however, problematic. A topic is formally defined as a ‘distribution [of words] over a vocabulary’ and is no more than a set of words that are statistically likely to co-occur in a given text (Blei 2012). A discourse in the Foucauldian sense comprises (historical) values, shared assumptions, ‘common sense’, associations, automated modes of writing and thinking, which constitute and regulate power relations through language and intertextuality (e.g. Foucault 1977; Bové 1995). When following Foucault’s notion of discourse, collocations – the basic linguistic element for topic modeling – could be misleading. The operationalisation of discourses through topics may be intuitive, but is theoretically far from evident.


The study of the dissemination of concepts and discourses is especially relevant in the context of the so-called Radical Enlightenment, a movement of proto-Enlightenment intellectual innovation in which Spinoza played a key role (Israel 2001; Jacob 1981; Krop 2014). As a result of the explosive theological and scientific debates that threatened the stability of the Republic throughout the seventeenth century, radical discourses that challenged orthodox-Calvinist doctrine were firmly suppressed through censorship and prosecution of authors, publishers and printers (Israel 1997). In spite (or because) of this censorship, radical discourses circulated ‘underground’, in clandestine publications and circuits (cf. Darnton 1982). Many cultural historians have also indicated how authors communicated radical ideas indirectly and ambiguously through literary genres such as novels and pornography (Van Bunge 2003; Elias 1974; Leemans 2002; Wortel 2006). The Foucauldian meaning of ‘discourse’ as a possible means for the reinforcement of power relations becomes evident during the Radical Enlightenment.

Rather than elaborating on the theoretical difference between topics, concepts and discourses on an abstract level, this paper demonstrates it through a case study. It presents computer-assisted discourse analysis as an approach to a specific historical question: how did Spinozist philosophy disseminate into a ‘Spinozist’ discourse in early modern Dutch textual culture (1660-1720)? In this study, Spinozist philosophy was reduced to a set of characteristic concepts (cf. De Bolla 2013), which were identified through tf-idf¹³ frequency analyses and then refined by hand. The concepts were represented as networks of co-occurring words in seventeenth-century Dutch translations of eight works written by the philosopher, translated by Pieter Balling (?–1664) and J.H. Glazemaker (1620-1682) (Thijssen-Schouten 1967; Steenbakkers 1999).¹⁴ These conceptual networks were used as a measure to identify Spinozist ‘discourse’ in a corpus of 500 texts published between 1660 and 1720. For pragmatic reasons, the vocabularies were assumed to be stable, but this paper addresses possible advancements based on the literature on conceptual and linguistic drift (Betti & Van den Berg 2014; Kenter et al. 2015). Also, conventional procedures applied in computational intellectual history were modified in order to reduce the problems caused by spelling variation in historical Dutch (e.g. in Herbelot et al. 2012; Tangherlini & Leonard 2013).
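To make the tf-idf step concrete, the sketch below ranks the terms of one document against a small background collection. It is a generic textbook formulation (relative term frequency times the natural logarithm of the inverse document frequency), not necessarily the weighting variant used in this study; the function names and the toy token lists are invented for illustration, and no spelling normalisation is applied.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Compute tf-idf scores for every term in every document.

    `documents` maps a document name to a non-empty list of tokens.
    tf = term count / document length; idf = ln(N / document frequency).
    """
    n_docs = len(documents)
    doc_freq = Counter()
    for tokens in documents.values():
        doc_freq.update(set(tokens))
    scores = {}
    for name, tokens in documents.items():
        counts = Counter(tokens)
        scores[name] = {
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        }
    return scores


def top_terms(scores, name, k=3):
    """Return the k highest-scoring terms of one document (ties by term)."""
    return sorted(scores[name], key=lambda t: (-scores[name][t], t))[:k]


# Invented toy tokens; a real run would use the tokenised translations.
toy_docs = {
    "target": ["god", "zelfstandigheid", "natuur", "zelfstandigheid"],
    "other1": ["god", "kerk", "genade"],
    "other2": ["god", "natuur", "schip"],
}
toy_scores = tf_idf(toy_docs)
print(top_terms(toy_scores, "target", k=2))
```

A term that occurs in every document receives a score of zero, which is why such a ranking tends to surface vocabulary characteristic of the target texts; the candidate list would still need the manual refinement described above.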

The results obtained through the concept-oriented ‘top down’ approach are contrasted with a more ‘bottom up’ transformation of the corpus based on topic modeling. This paper evaluates the differences between both approximations of Spinozist discourse and shows how Spinozist texts unknown to the computer were successfully identified and described. Based on these results, it formulates a working hypothesis on the dissemination of Spinozist discourse in Dutch textual culture and advances the debate on the resonance of (Radical) Enlightenment ideas with computational results (Darnton 1982; Israel 2001; Leemans 2002; Edelstein 2010 etc.).

References

Betti, A. & H. van den Berg, ‘Modelling the History of Ideas’. British Journal for the History of Philosophy 22 (2014) 4: 812-835.

Blei, D., ‘Probabilistic Topic Models’. Communications of the ACM 55 (2012) 4: 77-84.

Bolla, P. de, The Architecture of Concepts. The Historical Formation of Human Rights. New York 2013.

13 ‘term frequency – inverse document frequency’.

14 Korte verhandeling van God, de mensch en deszelvs welstand (1660-1661); Renatus Des Cartes Beginzelen der wysbegeerte, I en II bewezen (1664); Aanhangzel, over-natuirkundige gedachten (1664); Handeling van de verbetering van 't verstant (1667); Zedekunst, In vijf delen onderscheiden (1677); Brieven Van verscheide geleerde Mannen Aan B.d.S (1677); Staatkundige verhandeling (1677); De Rechtzinnige Theologant, of godgeleerde staatkundige verhandeling (1693).


Bové, P.A., ‘Discourse’. In: F. Lentricchia & T. McLaughlin, Critical Terms for Literary Study. Chicago 1995: 50-64.

Bunge, W. van, ‘Philopater, de radicale Verlichting en het einde van de Eindtijd’. Mededelingen van de Stichting Jacob Campo Weyerman 26 (2003): 10-19.

Darnton, R., The literary underground of the Old Regime. Cambridge (MA) 1982.

Elias, W., ‘Het spinozistische erotisme van Adriaan Beverland’. Tijdschrift voor de Studie van de Verlichting 2 (1974): 283-320.

Edelstein, D., The Enlightenment. A genealogy. Chicago 2010.

Foucault, M., ‘The Archeology of Knowledge and the Discourse on Language’. Trans. A. Sheridan. New York 1977.

Gemert, L. van, ‘Stenen in het mozaïek. De vroegmoderne Nederlandse roman als internationaal fenomeen’. Tijdschrift voor Nederlandse Taal- en Letterkunde 124 (2008) 1: 20-30.

Herbelot, A., E. von Redecker, J. Müller, ‘Distributional techniques for philosophical enquiry’. Proceedings of the 6th EACL Workshop on Language Technology for Cultural Heritage, Social Sciences, and Humanities. Avignon 2012: 45-54.

Israel, J., ‘The banning of Spinoza’s works in the Dutch Republic’. In: C. Berkvens-Stevelinck et al. (eds.), The emergence of tolerance in the Dutch Republic. Leiden 1997.

Israel, J., Radical Enlightenment. New York 2001.

Jacob, M.C., The radical Enlightenment. Pantheists, freemasons and republicans. London 1981.

Jockers, M., Macroanalysis. Digital Methods and Literary History. Urbana 2013.

Kenter, T.M., M. Wevers, P. Huijnen & M. de Rijke, ‘Ad Hoc Monitoring of Vocabulary Shifts over Time’. Proceedings of the 24th ACM International Conference on Information and Knowledge Management. Melbourne 2015.

Krop, H., Spinoza. Een paradoxale icoon van Nederland. Amsterdam 2014.

Kuukkanen, J.M., ‘Making Sense of Conceptual Change’. History and Theory 47 (2008): 351-372.

Leemans, I., Het woord is aan de onderkant. Radicale ideeën in Nederlandse pornografische romans 1670-1700. Nijmegen 2002.

Lovejoy, A.O., ‘The Historiography of Ideas’. Proceedings of the American Philosophical Society 78 (1938): 529-543.

Lovejoy, A.O., The Great Chain of Being. A Study of the History of an Idea. Cambridge, MA/London 2001 [1964].

Mandelbaum, M., ‘The History of Ideas. Intellectual History, and the History of Philosophy’. History and Theory 5 (1965): 33-66.

Nelson, R.K., ‘Mining the Dispatch’, 2010. [http://dsl.richmond.edu/dispatch/pages/home]

Olsen, M. & L.G. Harvey, ‘Computers in Intellectual History: Lexical Statistics and the Analysis of Political Discourse’. The Journal of Interdisciplinary History 18 (1988) 3: 449-464.

Ramsay, S., Reading Machines. Towards an Algorithmic Criticism. Urbana: University of Illinois Press, 2011.


Siebrand, S.J., Spinoza and the Netherlands. An inquiry into the early reception of his philosophy. Dissertation Rijksuniversiteit Groningen 1980.

Steenbakkers, P.M.L., ‘Benedictus de Spinoza. Een overzicht.’ Filosofie 9 (1999) 6: 4-14.

Tangherlini, T.R. & P. Leonard, ‘Trawling in the Sea of the Great Unread: Sub-corpus topic modeling and Humanities research’. Poetics 41 (2013) 6: 725-749.

Thijssen-Schouten, C.L., Uit de Republiek der Letteren. Elf studiën op het gebied der ideeëngeschiedenis van de Gouden Eeuw. Den Haag 1967.

Underwood, T., ‘Topic modeling made just simple enough’. Online 2012. [https://tedunderwood.com/2012/04/07/topic-modeling-made-just-simple-enough/]

Wortel, D., ‘Vrouwen in mannenkleren en Spinoza. De Kloekmoedige Land- en Zee-Heldin (1682) als verpakking van de filosofie van Spinoza’. In: Spiegel der Letteren 48 (2006): 27-55.


Session E

1. Building a Conceptual Architecture and Data Model to address the Sustainable Data Integration Problem

George Bruseker, Maria Theodoridou, Martin Doerr (ICS-FORTH)

Research Infrastructures (RI) seeking to provide a unified resource set to their user community tend to begin with the elaboration of a new model for unifying a domain of discourse and then seek out the institutional and political support to undertake mappings to the defined common structure. These projects are undertaken with the critical aim of facilitating broad resource access within the domain of interest. Such projects, however, notably face strong challenges both in terms of defining an adequate model and, then, in sustaining a mapping and aggregation process which is unavoidably time-consuming and expensive. While such resource integration projects undoubtedly serve a crucial role in research environments, an essential aspect of this process seems to be consistently overlooked. Data are fundamentally heterogeneous in nature (a state that cannot be avoided) and are in a process of continuous potential or actual change. Further, actors managing resources change composition, status and activities. This quickly creates the potential for obsolescence of any integrated data environment as the indexed resources inevitably change.

It seems, then, that value can be had from a new approach that focuses on making integration sustainable and useful in the long run by modelling and managing the integration process itself. By modelling this meta-meta level and providing a data structure for the tracking of the same, we argue, it is possible to provide the necessary management structures for building ground-up and on-demand aggregation which will meet the aims of this process both in the present and into the future. This paper will outline the proposal of a new conceptual architecture to support highly scalable integration activities for developing ever more integrated pools of resources and a conceptual model capable of representing the data required to drive this process.

The proposed conceptual architecture has at its core a registry that is a logically if not physically distinct data structure that holds data pertaining to the activities of RIs and their members themselves, the resources they provide and the manner in which they do so. The registry maintains the picture of who has and does what and where resources are, as well as their level of compatibility with other resources. The data requirements of this registry are extremely light in order to form as little a barrier as possible to participation in such a service by potential partners. The basic functional relationships that are tracked to allow the long-term management and control of resources are: part-of, metadata-of and indexed-by. Additional metadata is only requested in order to help disambiguate entities in the registry and to support its readability by the operators. In the proposed architecture, source metadata and data as well as their multiple mappings remain in a content cloud which can be either data held by trusted providers who guarantee their maintenance or, otherwise, can be copied into a stable storage facility at the time of registration. The registry is intended to enable decisions with regard to the management of data, based on the high-level view of resources, wherever they may reside across the data cloud. Such decisions could include: identifying datasets for an integration, identifying gaps in coverage, connecting orphaned datasets to appropriate curators, following up with service providers with regard to availability/quality of service etc.
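As a rough illustration of how lightweight such a registry could be, the sketch below stores entities and the three tracked relationships as simple triples and answers one management question from them. It is a hypothetical toy structure, not the Parthenos implementation: the class, its method names, the entity kinds and the identifiers are all assumptions made for this example.

```python
from dataclasses import dataclass, field

# The three functional relationships named in the text.
RELATIONS = {"part-of", "metadata-of", "indexed-by"}


@dataclass
class Registry:
    """Toy registry: entity ids with a kind, plus relationship triples."""
    entities: dict = field(default_factory=dict)  # id -> kind
    links: set = field(default_factory=set)       # (subject, relation, object)

    def register(self, entity_id, kind):
        self.entities[entity_id] = kind

    def relate(self, subject, relation, obj):
        if relation not in RELATIONS:
            raise ValueError(f"unknown relation: {relation}")
        if subject not in self.entities or obj not in self.entities:
            raise KeyError("both entities must be registered first")
        self.links.add((subject, relation, obj))

    def without_relation(self, kind, relation):
        """List entities of `kind` that are not the subject of `relation`."""
        linked = {s for (s, r, _) in self.links if r == relation}
        return sorted(
            e for e, k in self.entities.items()
            if k == kind and e not in linked
        )


# Invented identifiers for illustration only.
registry = Registry()
registry.register("dataset-1", "dataset")
registry.register("dataset-2", "dataset")
registry.register("index-A", "service")
registry.relate("dataset-1", "indexed-by", "index-A")
print(registry.without_relation("dataset", "indexed-by"))
```

A query of this shape is one way an operator might spot datasets that have not yet been picked up by any index, which is one reading of the coverage gaps mentioned above.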

In order to support the proposed architecture, it is necessary to propose a new conceptual model describing integration processes themselves. This is the function of the Parthenos Model. Built on an analysis of the registries of existing RIs, it aims to model the fundamental resources and relations that are of interest to manage in integration. This process identified a number of fundamental entities, the study of whose relations drove the model development. These are: services, projects, datasets, software, and actors. What was of interest in the model was to understand the nature of these objects not as such but as they play a role within RIs. Taking this scope into account allowed for strong analytic distinctions of the high-level entities of interest, delivering a compact model of approximately 38 classes and 50 relations.

Particular modelling challenges include defining the functional role of services and collections. Service plays a central, if often overlooked, role in RI discourse. It is what binds assets to actors and allows for effective communication between agents on a scientific and technical level. A particular challenge was to model service beyond the scope of e-services and to understand the full range of its meaning. This led to the definition of service as a willingness and ability for someone to take action to the benefit of some other agent. Modelling service at this generic level and then providing high-level classes for hosting, curating and e-services allows a highly flexible description of the various kinds of service RIs provide to their members, notably including non-IT related services. Another particular modelling challenge was to address the perennial question of what constitutes a ‘collection’. The fact of the plurality of an object is easily modelled through part-of relations, but this misses an aspect of the phenomenon that ‘collection’ tries to express. Consideration of this question in relation to the context of service allowed for a highly useful new conceptualization, distinguishing persistent and volatile digital objects. The former are static information objects whose identity is fixed at the bit level and have an objectively identifiable existence over time from their structure. A volatile digital object, however, has no fixed identity in itself, since it undergoes continuous change and modification. It inherits an identity from the fact that it is an object under curation, the activity of a curation service, undertaken with some specific plan. By making reference to the service of curation and its plan, we can identify volatile digital objects or ‘collections’ over time.

The proposed shift in focus from domain modelling to modelling of RI integration processes themselves is currently being tested within the Parthenos Project, where the architecture and model are being implemented. The model is being developed and validated through an iterative process of mapping from the participating RIs’ registries to the model for integration in the registry. The mapping process is being undertaken using the X3ML toolkit for writing declarative mappings. Once populated, the registry will be used to get an overview of the integrated resource capacities of the joined RIs and determine appropriate deep-level integrations. The technologies to run the aggregation and the subsequent VREs are provided through the gCube and D4Science systems. To date, the model has shown itself robust against basic revision and flexible enough to describe this high-level management picture of integration.


2. Improving data quality in Europeana by designing extensiveEDMrecords-TheUniversitätsbibliothekHeidelbergstudycasePierre-EdouardBarrault,ValentineCharles,AntoineIsaac(EuropeanaFoundation,PrinsWillem-Alexanderhof5,2595BE,TheHague,TheNetherlands)

IntroductionForthispaper,wehaveworkedonimprovingtheresultsofmappingprocessfromtheMETS15toEDM16schemas,formetadatarecordsassociatedwithculturalheritageobjects.WechosetopresentthecaseoftheUniversitätsbibliothekHeidelberg17,whichwasfoundedin1386andisGermany'soldestuniversityandoneoftheworld'soldestsurvivinguniversities.Itsmagnificentcollectionofabout25000records18containsparchments19andearlyprintedbooksfromthe14thcenturyuntilModernAge,orbooks,magazinesandnewspapersfromthe19thandonward,invariouslanguagesincludingFrench20,German,ItalianorSpanish.Itiswithoutanydoubtasolidaccomplishmentforanoldbookdigitizationproject,demonstratingthevalueaddedfromrespectingbothcontentintegritythankstohighdigitizationstandardscoupledwiththeIIIFframework,andinformationalqualitythroughrich,highly-structured,opendata.Inaddition,theinstitutionproposesitscollectionundertheCreativeCommons-Attribution,ShareAlike(BY-SA)openlicense,allowingforfreere-use21.

On the other hand, Europeana Collections²² is a European platform partnering with cultural institutions to centralize, in an open online database, all metadata and content related to cultural heritage objects available across Europe. The platform acts as a search engine to explore these collections, offers a set of curated channels focused on specific themes, and also makes several Web services available that can be used by developers, creatives and researchers for accessing and re-using digital cultural resources.

Prior to this experiment, the collection of the Universitätsbibliothek Heidelberg in Europeana was based on harvests of the institution's OAI-PMH server, which exposed metadata under the ESE schema. We used to receive limited metadata records in which multiple values for a given field were mapped into only one instance of this field. Fields such as dc:date, dc:type and dc:subject were affected. Packing several strings into a single metadata field with separators prevents the Europeana automatic semantic enrichment from detecting the appropriate string and enriching the record based on the matching string. Other shortcomings stemmed from the lack of language attributes or relevant hierarchical data.
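The packed-field problem described above can be illustrated with a minimal sketch. It assumes that values were joined with ";" or "|" (the actual separators are not stated here), and the function names and the sample subject values are hypothetical.

```python
import re

# Assumption: values were packed with ";" or "|"; the real separators are not given.
SEPARATORS = re.compile(r"\s*[;|]\s*")


def split_field_values(raw_value):
    """Split one packed field string into a list of clean, non-empty values."""
    return [part.strip() for part in SEPARATORS.split(raw_value) if part.strip()]


def expand_record(record, multi_valued_fields=("dc:date", "dc:type", "dc:subject")):
    """Return a copy of the record in which every field holds a list of values."""
    expanded = {}
    for field, value in record.items():
        if field in multi_valued_fields:
            expanded[field] = split_field_values(value)
        else:
            expanded[field] = [value]
    return expanded


# Invented sample record, for illustration only.
packed = {"dc:title": "Le Sifflet", "dc:subject": "Satire; Caricature ; Journal"}
expanded = expand_record(packed)
```

With one value per field instance, each string can be matched separately by an enrichment step, which is the effect the richer mapping aims for.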

15 See http://www.loc.gov/standards/mets/mets-schemadocs.html
16 See http://pro.europeana.eu/share-your-data/data-guidelines/edm-documentation
17 See http://www.uni-heidelberg.de/index.html
18 Europeana records for this institution: http://www.europeana.eu/portal/en/search?view=grid&q=PROVIDER%3A%22Universit%C3%A4tsbibliothek+Heidelberg%22&per_page=96
19 See Heidelberger Schicksalsbuch (Heidelberg Book of Fate), 1491: http://www.europeana.eu/portal/en/record/07932/diglit_cpg832
20 See Le Sifflet: journal humoristique de la famille (Le Sifflet: humorous family newspaper), 1872: http://www.europeana.eu/portal/en/record/07931/diglit_sifflet1872.html?q=PROVIDER%3A%22Universit%C3%A4tsbibliothek+Heidelberg%22
21 See http://creativecommons.org/licenses/by-sa/4.0/
22 See http://www.europeana.eu/portal/en


IIIF implementation

We focused our work on this specific provider in the hope of improving its collections, which were already available in Europeana Collections, with the IIIF²³ features they had implemented on their side. This open technological framework can be implemented within content management systems to enable deep visualisation features (zoom, crop, effects), and to make image sharing easier on the Web.

The main aim of this experiment was to implement IIIF metadata elements, which were not present in the data previously submitted by this institution to the Europeana Collections database. After investigating the available data on the institution’s side, we decided to harvest METS records, as this was a much richer metadata source regarding both IIIF core elements and metadata range and quality.

Data quality

Even if metadata improvements are not always obvious on a result page in the Europeana Collections portal, they nevertheless have a strong impact on search and overall findability. Ingestion of reliable data therefore helps ensure a cohesive experience for its users, from

In the case of digital cultural heritage, high-quality datasets could be defined as sets of standardised (such as LOD resources), granular, specific, relevant and consistent metadata, associated with high quality visualisation standards. The nature of the records themselves should obviously be taken into consideration when defining the overall strategy. For instance, OCR²⁴ techniques would make sense in the case of text documents, while focusing on high digitization standards would better suit photographs. Data quality is nevertheless critical to support users' focused discovery scenarios²⁵, and a long-term strategy to improve it should be considered by default by any cultural institution, as a lever to reach a wider audience.

By using another metadata source from the Universitätsbibliothek Heidelberg, we refined and improved the overall data quality by relying on Linked Open Data resources from the GND authority vocabulary maintained by the German National Library²⁶, which were available in the original METS records. We therefore included, as systematically as possible, the provided URIs of resources related to agents, concepts and places. This approach follows LOD implementation best practices: only links to resources are provided in the ingested records, and then Europeana de-references them, fetching all the available metadata for each provided URI.

We also applied stricter conditions to the mapping in order to preserve the semantic precision and granularity of the original data as much as possible. This was done by choosing more specific metadata fields, and rejecting irrelevant ones. We focused on core metadata elements related to typology, format, temporal and geographical information. We also created an ad hoc description field in order to provide more physical location information to users.

Further normalization was done for agents related to these records (e.g. creators and contributors), which were previously sent without any role distinction. We disambiguated the mapping of these

23 See http://iiif.io/
24 See https://en.wikipedia.org/wiki/Optical_character_recognition
25 Most Europeana users rely on the search functionality, and 59% of them use extra filtering options to refine their search. More than half of the users search items based on specific geographical location. (Source: Europeana Collections Online Survey, April 2016)
26 See http://www.dnb.de/EN/Standardisierung/GND/gnd_node.html


elements using the MARC Relators codes²⁷ originally embedded in the METS records, such as “aut” that represents “Author”. The codes were used to identify the agents as creators or contributors, and then were normalized into strings to be directly incorporated into the resulting EDM records as additional metadata.
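A minimal sketch of this role disambiguation follows. Only the “aut” = “Author” pair comes from the text; the other codes are taken from the MARC relator list, and the rule deciding which codes count as creators is an assumption made purely for illustration. The field names and sample agents are hypothetical.

```python
# Labels follow the MARC relator list; only "aut" = "Author" is given in the text.
RELATOR_LABELS = {
    "aut": "Author",
    "edt": "Editor",
    "ill": "Illustrator",
    "trl": "Translator",
}

# Assumption: only authors are treated as creators. The actual rule is not stated.
CREATOR_CODES = {"aut"}


def map_agent(name, relator_code):
    """Assign an agent to a creator or contributor field and spell out its role."""
    field = "dc:creator" if relator_code in CREATOR_CODES else "dc:contributor"
    label = RELATOR_LABELS.get(relator_code, relator_code)  # fall back to the raw code
    return {"field": field, "name": name, "role_label": label}


# Placeholder agents, not taken from real records.
agents = [("Agent A", "aut"), ("Agent B", "ill")]
mapped = [map_agent(name, code) for name, code in agents]
```

Keeping the human-readable label next to the field choice mirrors the two uses of the codes described above: role distinction and additional descriptive metadata.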

Finally, hierarchical relationships that were not made available in the original conversion were represented in the new metadata. We focused on records for individual journals encompassed in bigger volumes, and mapped the relevant metadata (references to parent and child records) within hierarchical fields. This enabled a better experience for end users thanks to the display of a widget dedicated to browsing hierarchical resources, either by following their order or by the parent resource they belong to.

Results

The first outcome of this work is an extensive report presenting this case study, which serves as data guidelines available in the Pro section of Europeana Collections²⁸. However, our results rely on both qualitative and quantitative achievements.

The overall data improvement provides Europeana users (creatives, researchers and the curious) with higher quality results, allowing them to tailor their experience even further from the main public access. Specific data reuse or data mining scenarios also benefit from such an experiment, thanks to Europeana’s REST API²⁹. In addition, the compatibility with the IIIF framework ensures a seamless user experience through extended visualisation features. This can be carried over into more advanced applications by directly reusing the aggregated IIIF metadata from Europeana, e.g. within Digital Humanities visualisation projects.

Finally, the updated datasets did not necessarily grow in size in terms of records. But instead of the former rule of one thumbnail per record (for about 25K records), the newly added IIIF metadata now enables Europeana’s viewer to fetch more than 3.5M high-resolution pictures (more than 1600px wide) from all the connected JSON manifests³⁰.

3. Easing Access to Linked Data Resources for Digital Humanities Scholars
Albert Meroño-Peñuela¹ and Rinke Hoekstra¹,²
¹ Computer Science Department, Vrije Universiteit Amsterdam, NL {albert.merono, rinke.hoekstra}@vu.nl
² Faculty of Law, University of Amsterdam, NL

Abstract. Semantic Web technology comprises a variety of languages, standards and practices that, over the last two decades, have facilitated the emergence of the Linked Open Data (LOD) Cloud, a global Web graph of more than 100 billion interconnected statements [1]. Datasets in this LOD cloud cover a

27 See http://www.loc.gov/marc/relators/
28 See http://pro.europeana.eu/share-your-data/data-guidelines/edm-case-studies/the-universitaetsbibliothek-heidelberg-case-study
29 See http://labs.europeana.eu/api/introduction
30 See http://iiif.io/api/annex/notes/jsonld/#greedy-compaction-of-terms


variety of domains, including geography, government, life sciences, linguistics, media, publications and social networking. Despite this success integrating data on the Web, Semantic Web technology is still very present at every level of the LOD cloud. This includes the early layer of accessing Linked Data; that is, the mechanism by which users select and grab the data they consider for their applications or analyses. Accessing Linked Data requires certain technical skills (mostly involving understanding of the Resource Description Framework (RDF) [6] and the SPARQL [7] query language, but also others such as SQUIN [3] or Linked Data Fragments [8]) that very often exclude potential users. In the digital humanities, many scholars lack this technical knowledge, and consequently miss a great deal of LOD sources of interest to them. This includes, but is not limited to, multiple linked datasets on historical statistics (e.g. CEDAR [2], CLARIAH [4]), museum collections (e.g. Amsterdam, British Museum, Smithsonian), linguistic resources (e.g. lexinfo, BabelNet), and media (e.g. MusicBrainz, BBC, New York Times, Linked Movie Database). Although these scholars are becoming more and more tech savvy, deep knowledge of technology should not be a strict requirement for accessing Linked Data. In order to address this issue, we propose grlc [5], a Linked Data access server that uses SPARQL queries stored anywhere on the Web to generate comprehensive, well documented, neatly organized, and provenance-trusted API specifications. Such APIs make any Linked Data actionable, making access to Linked Data sources easy, repeatable and shareable with one single URI entry point. grlc relies on the Swagger UI³¹, an OpenAPI³² frontend, to present these APIs to the user as an intuitive user interface. In this demo, we will show how grlc can help ease the traditionally high technical requirements to access Linked Data. We will illustrate this with several running use cases in CLARIAH³³, a Dutch national project to build digital infrastructure for the humanities.
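The idea of turning a stored SPARQL query into an API operation can be sketched as follows. This is an illustration under stated assumptions, not grlc's implementation: it assumes that query metadata is written in comment lines starting with "#+" and that variables starting with "?_" are exposed as API parameters, which should be checked against the grlc documentation. The endpoint URL, the query and the function name are hypothetical.

```python
import re

# Hypothetical query file; the endpoint and the query itself are placeholders.
QUERY_FILE = """\
#+ summary: List datasets whose title contains a keyword
#+ endpoint: http://example.org/sparql
PREFIX dct: <http://purl.org/dc/terms/>
SELECT ?dataset ?title WHERE {
  ?dataset dct:title ?title .
  FILTER(CONTAINS(LCASE(?title), LCASE(?_keyword)))
}
"""


def describe_operation(query_text):
    """Collect the metadata comments and the parameter names of one query."""
    description = {}
    for line in query_text.splitlines():
        if line.startswith("#+ "):
            # Split on the first colon only, so URLs in the value stay intact.
            key, _, value = line[3:].partition(":")
            description[key.strip()] = value.strip()
    # Assumed convention: variables written as ?_name become API parameters.
    description["parameters"] = sorted(set(re.findall(r"\?_+(\w+)", query_text)))
    return description


operation = describe_operation(QUERY_FILE)
```

A description of this kind carries what an OpenAPI frontend needs to present the query as a documented operation with a fillable parameter, so that the user never has to edit SPARQL directly.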

Keywords: Linked Data, API, REST, SPARQL, #LD, Web Data access, middleware, OpenAPI

References

1. Abele, A., McCrae, J.P., Buitelaar, P., Jentzsch, A., Cyganiak, R.: Linking Open Data cloud diagram. http://lod-cloud.net/ (2017)

2. CEDAR Project, http://www.cedar-project.nl/

3. Hartig, O.: Squin: A traversal based query execution system for the web of linked data. In: Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data. pp. 1081–1084. SIGMOD ’13, ACM, New York, NY, USA (2013), http://doi.acm.org/10.1145/2463676.2465231

4. Hoekstra, R., Meroño-Peñuela, A., Dentler, K., Rijpma, A., Zijdeman, R., Zandhuis, I.: An Ecosystem for Linked Humanities Data. In: Proceedings of the 1st Workshop on Humanities in the Semantic Web (WHiSe 2016), ESWC 2016 (2016)

5. Meroño-Peñuela, A., Hoekstra, R.: grlc Makes GitHub Taste Like Linked Data APIs. In: The Semantic Web: ESWC 2016 Satellite Events, Heraklion, Crete, Greece, May 29 – June 2, 2016, Revised Selected Papers. pp. 342–353. Springer (2016)

6. The World Wide Web Consortium (W3C): Resource Description Framework (RDF). http://www.w3.org/RDF/

31 See http://swagger.io/swagger-ui/
32 See https://www.openapis.org/
33 See http://www.clariah.nl/en/


7. The World Wide Web Consortium (W3C): SPARQL Query Language for RDF. http://www.w3.org/TR/rdf-sparql-query/

8. Verborgh, R., Sande, M.V., Colpaert, P., Coppens, S., Mannens, E., van de Walle, R.: Web-Scale Querying through Linked Data Fragments. In: Proceedings of the 7th Workshop on Linked Data on the Web (LDOW 2014), WWW 2014 (2014)


Session F

1. The Nederlab research environment: an update
Hennie Brugman & Antal van den Bosch
Meertens Institute, [email protected]

Nederlab³⁴ (Brugman, 2016) is a five year long 'NWO groot' project building a research infrastructure for primarily historians and literary, linguistic and cultural scholars. Building this infrastructure involves activities in three main tracks:

1. Acquisition, harmonisation/semantic mapping, text enrichment and metadata curation of a substantial number of existing (historical) Dutch digital text collections of our academic and cultural heritage partners in the Benelux.

2. Improving the quality of the output of existing language processing tools when they are applied to historical Dutch texts from 800 until present.

3. Building a virtual research environment with a powerful search backend for exploration, search and analysis of metadata and annotated text from our very large aggregated and integrated collections (Brouwer, 2016).

We are currently in the last year of our project. Therefore, in our contribution we would like to take the opportunity to evaluate to what extent we have been able to implement our original, ambitious, project use cases. We intend to support this evaluation with a demonstration at DHBenelux 2017.

In general, we expect to have processed between twenty and thirty collections by the end of our project and to have made those available to the research community. At the moment of writing this, we have reached a total of almost ten billion words of annotated text, accessible through our online Virtual Research Environment, the 'research portal'³⁵. During the last year of our project we are carrying out a number of scientific pilot projects in an open call, to test the usability of this VRE and the Nederlab collections, and to add extensions based on real user needs.

Below we will zoom in on our original categories of use cases.

1. Detecting the onset of change

When do new concepts occur for the first time? Or new word forms? Or word combinations (collocations)?

By the end of our project we will have collection data for all periods between 800 and present time, thereby enabling full diachronic searches. Our Nederlab research portal is able to visualise time distributions over all hits found for specific queries, for both document and hit counts and showing absolute as well as relative frequencies (for example, show the number of occurrences of 'vliegtuig' - airplane - for each year). The system supports complex queries for sequential patterns over multiple parallel annotation layers using the Corpus Query Language (Christ, 1994), a query language introduced by the Corpus WorkBench (CWB) and regularly used in our domain (e.g. by SketchEngine, MTAS, BlackLab). Nederlab uses Multi Tier Annotation Search (MTAS)³⁶. Searching for patterns using CQL, in combination with grouping of results, enables researchers to investigate word combinations and how

34 www.nederlab.nl
35 www.nederlab.nl/onderzoeksportaal
36 https://meertensinstituut.github.io/mtas/


often they occur, for specific periods in time. For example, it is possible to query for the most frequent nouns used in sentences containing the lemma 'varen', for each century, to investigate potential shifts in meaning over time (in this case from 'go' to 'go by boat').
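The difference between absolute and relative frequencies in such time distributions can be made concrete with a short sketch. The counts below are invented and are not Nederlab figures; the function names and the per-million scaling are assumptions made for the example.

```python
def relative_frequencies(hits_per_year, tokens_per_year, per=1_000_000):
    """Return hits per `per` tokens for every year with a known, non-empty corpus."""
    result = {}
    for year, hits in sorted(hits_per_year.items()):
        total = tokens_per_year.get(year, 0)
        if total > 0:
            result[year] = hits * per / total
    return result


def first_attested(hits_per_year):
    """Return the earliest year with at least one hit, or None if there is none."""
    years = [year for year, hits in hits_per_year.items() if hits > 0]
    return min(years) if years else None


# Invented counts for illustration only.
hits = {1900: 0, 1910: 12, 1920: 90}
tokens = {1900: 2_000_000, 1910: 3_000_000, 1920: 4_500_000}

freqs = relative_frequencies(hits, tokens)
onset = first_attested(hits)
```

Normalising by corpus size per year matters because the amount of text per period is uneven; a rising absolute count may only reflect a larger sub-corpus.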

2. Establishing the spread of changes

How do such changes spread, over time, over places, from one text type to another, from one author to another?

Our system allows users to search for words or patterns and visualise the results as distributions over many metadata dimensions, even over multiple dimensions simultaneously (e.g. time and genre). It is also possible to directly compare time distributions for different search terms simultaneously (using a 'trends' visualisation, e.g. 'mensch' versus 'mens') (Tjong Kim Sang, 2016).

3. Finding connections and networks

Find and investigate motives using semantic word fields around concepts. Establish relations between persons and places.

We already support expansion of queries with historical variants using a web service built around the Dutch historical lexicon by the Instituut voor de Nederlandse Taal (INT). We intend to generalize and extend this query expansion mechanism to include semantic expansion and expansion with user defined domain lexica. We will do this in collaboration with a number of our ongoing scientific pilot projects. An example of such a domain lexicon is a semantic lexicon containing emotion words.

Networks of persons and places can be charted on the basis of the named entities that were added to our corpus during the enrichment process. We use CQL searching in combination with grouping functionality to do this (e.g. list the most frequently mentioned persons in sentences or paragraphs containing the location 'Deventer').

4. Detecting similarities and differences between texts

Investigate reuse of text fragments among authors. Compare texts or text collections with corpus analysis tools.

For individual texts or for any subcollection of texts from our complete corpus we can query for statistics. We can determine total numbers of documents, tokens and types, but also the mean and median number of words per document. In fact, our system can return complete word count distributions that can be directly visualised. Other statistics that are supported are numbers of sentences, paragraphs, divisions and heads, and frequency lists over words or over any of the annotation layers, for any subcollection of our corpus. All of these statistics and lists can in principle be used to compare text documents or complete document collections. All statistics can also be exported for further analysis in external tools, such as R.

Conclusion

After a number of years of constructing the foundations of our infrastructure, the project is now at a stage where we can start using it for real research pilots or projects. Although there is substantial room for improvement on many aspects of our products, our initial aims are within reach.

References

Brouwer, Matthijs, Hennie Brugman, Marc Kemps-Snijders (2016). ‘MTAS: A Solr/Lucene based Multi Tier Annotation Search solution’, CLARIN Annual Conference 2016, Aix-en-Provence, France, 26-28 October 2016.


Brugman, Hennie, Martin Reynaert, Nicoline van der Sijs, René van Stipriaan, Erik Tjong Kim Sang, Antal van den Bosch (2016). ‘Nederlab: Towards a Single Portal and Research Environment for Diachronic Dutch Text Corpora’, in: Proceedings of LREC (10th edition of the Language Resources and Evaluation Conference), 23-28 May 2016, Portorož (Slovenia), pp. 1277-1281.

Christ, O. (1994). A Modular and Flexible Architecture for an Integrated Corpus Query System. In Proceedings of COMPLEX ’94: 3rd Conference on Computational Lexicography and Text Research, Budapest.

Tjong Kim Sang, Erik (2016). 'Finding Rising and Falling Words', in: Proceedings of the COLING 2016 workshop Language Technology Resources and Tools for Digital Humanities, ACL, Osaka, Japan, 2016. http://ifarm.nl/erikt/papers/lt4dh2016.pdf

2. Modeling the evolution of languages through text mining
A proposed methodology applied to the transition between Latin and Romance vernaculars

Florian Cafiero and Remy Verdo

The mechanisms at stake in the passage from a “dilated diasystem”, where a language becomes more and more complex, to a “disconnected diasystem”, where two distinct linguistic systems appear in the same cultural system, seem to be a well-studied problem.

For instance, several models have been presented in the past decades to describe the evolution from Latin to Romance vernaculars. The first model proposed to address this question is Ernst Pulgram’s (Pulgram, 1950: 462). In this pioneering work, the Latin language is however represented according to the traditional written vs. oral distinction, which does not allow a very detailed analysis. Its deterministic approach might also lead to some inaccuracies, the language being considered as always further from “old Latin” as time goes by. In 1986, Walter Berschin (Berschin, 1986: 148) proposed a more comprehensive, two-sided diachronic model. The concept of vulgar Latin is more refined here, as it includes both written and spoken language. Yet, here too, vulgar Latin is separated from literary Latin, the “stylistical level” (Stilhöhe) of which, even when it is at its lowest, never crosses, or even touches, the curve of oral language. What is more, this vulgar Latin is supposed to evolve linearly, as in Pulgram’s works. The curve modeling literary Latin seems to represent the sole evolution of the higher register of language that is observed. It disregards the co-existence of different registers of language in literary Latin, and ignores their articulation with vulgar Latin. Last, we can only regret the absence of data taken from “diplomatic” texts, in which stylistic and pragmatic efforts are also to be noticed.

Hence, those studies raise a few problems. They do not address the problem of registers well: they use very broad distinctions and overlook the possibility that different language registers could be used at the same time, even in the same text. They are also “expert views”, based on the author’s extensive experience rather than on a systematic analysis of the texts.

We thus propose a methodology to systematically study the evolution of a language from one form to another, taking into account our remark on registers. This methodology involves computerized statistical analysis and artificial intelligence but should not be seen as an automated process disconnected from the linguist’s analysis. On the contrary, it has been designed as a way to extend the way of thinking of a particular expert. It makes it possible to partially re-create the expert's own point of view, and to apply it to a large amount of text that would otherwise take too long to analyze.

The first step consists of a “traditional” linguistic analysis of a selection of texts, aimed at differentiating several registers used inside the texts of one’s period of interest.


Our sample corpus consists of three hagiographical texts and twenty-one diplomatic texts. Our three hagiographical texts were written in the later Merovingian or the early Carolingian age (ca. 650-780), then rewritten during the Carolingian Renaissance (from 780 to the death of Charles the Bald in 877, or so). The diplomatic texts are 21 original Frankish royal charters dating from ca. 665 to 868. Most of them record a judgment. Originally part of the great collection of the monastery of Saint-Denis, they are kept in the French National Archives.

We isolate five language registers in this sample corpus, consistent with Michel Banniard’s works (Banniard, 2008), and we design a table of criteria to characterize them.

We then go through a calibration phase. We try to apply various computing methods that can help isolate the different language registers used in the various texts of the corpus, or inside one text of the corpus. This initially calls for unsupervised methods, as we would not want to influence the outcome of the computations. The statistical analysis could reveal divisions we ignored or highlight unnoticed phenomena. We try to implement clustering algorithms such as k-means, hierarchical clustering, and various neural networks. We then compare the performance of those algorithms with supervised algorithms, where our sample corpus is used as training data.

Crucial for those analyses is the way we choose to present the texts to our algorithms. Lemmatizing the texts would remove too much information. Here, even small variations, such as written form variations, are likely to be significant. They can sometimes be even more significant than the grammatical structure of the texts itself. This is why we apply our computations to two types of version of our corpus’ texts. In the first versions, the texts are treated as a list of words, without any further treatment, or with a selection of the most frequent words. In the second versions, the texts are treated as n-grams (for 8>n>3), without any further treatment, or with a selection of the most frequent forms. N-grams can demonstrate great performance here, as they implicitly take into account the structure of the sentences, here, which word comes after which.
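The n-gram representation can be sketched as follows. The sketch assumes character n-grams (the text does not say whether character or word n-grams are meant) and reads 8>n>3 as n from 4 to 7. The function names and the sample string are placeholders, not corpus material.

```python
from collections import Counter


def char_ngrams(text, n_min=4, n_max=7):
    """Yield every character n-gram of the text for n_min <= n <= n_max."""
    for n in range(n_min, n_max + 1):
        for start in range(len(text) - n + 1):
            yield text[start:start + n]


def top_ngrams(text, k=3, n_min=4, n_max=7):
    """Return the k most frequent n-grams with their counts."""
    return Counter(char_ngrams(text, n_min, n_max)).most_common(k)


# Placeholder string for illustration only.
sample = "in dei nomine"
four_grams = list(char_ngrams(sample, 4, 4))
```

Because the n-grams run across spaces, an n-gram such as "in d" records both a spelling and a transition between two words, which is how word order enters the representation without any lemmatization.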

We compare all those findings with our own “expert” model designed on our sample, and select the solution that gives the most accurate division into registers.

We then run the selected algorithm on an extended corpus, formed by a large selection of texts written during the same period (650-877), and follow the evolution of the registers across time on this broader corpus. Finally, we assess the overall consistency of these results with the model we designed by analyzing our first sample.

BIBLIOGRAPHY

Michel Banniard, « Du latin des illettrés au roman des lettrés : la question des niveaux de langue en France (viiie-xiie siècle) », in Zwischen Babel und Pfingsten: Sprachdifferenzen und Gesprächsverständigung in der Vormoderne (9.-16. Jh.): Akten der 3. deutsch-französischen Tagung des Arbeitskreises « Gesellschaft und individuelle Kommunikation in der Vormoderne » (GIK) in Verbindung mit dem Historischen Seminar der Univ. Luzern, Höhnscheid (Kassel), 16-19 nov. 2006, Peter von Moos ed., Münster, 2008 (« Gesellschaft und individuelle Kommunikation in der Vormoderne », 1), p. 269-286.

W. Berschin, Biographie und Epochenstil im lateinischen Mittelalter, Stuttgart, t. 3: Karolingische Biographie, 750-920, 1991.

Piera Molinelli, « Per una sociolinguistica del latino », in Latin vulgaire – latin tardif : actes du VIIe colloque international sur le latin vulgaire et tardif (Séville, 02-06 septembre 2003), éd. Carmen Arias Abellán, Séville: Universidad de Sevilla, 2006, p. 463-474.


Giovanni Polara, « Problemi di ortografia e di interpunzione nei testi latini di età carolina », Grafia e interpunzione del latino nel Medioevo (Roma, sept. 1984), éd. Alfonso Maieru, Rome, 1987.

Ernst Pulgram, « Spoken and written Latin » , Language. Journal of the Linguistic Society of America, t. 26, 1950.

3. Experiments in fine-grained entity typing for Dutch
Marieke van Erp and Piek Vossen
Computational Lexicology and Terminology Lab, Vrije Universiteit Amsterdam

Introduction

Many entity recognition approaches classify recognised entities into a limited set of coarse-grained entity types [1]. However, fine-grained entity types are more useful for deeper natural language analysis and end-user tasks, in particular in the digital humanities domain where entity linking (grounding an entity in a knowledge base) is not possible. For example, while standard named entity recognition may determine that an entity is a person, knowing whether that entity is a writer or a politician is important for populating a database of persons with particular occupations. Currently, fine-grained entity typing has only been investigated for English. In this abstract, we present a fine-grained entity typing system for Dutch using training data extracted from Wikipedia and DBpedia. Our system achieves performance comparable to English, with an F1 measure of .90 on 59 types and .57 on 269 types.

Approach

Our approach to generating training data is inspired by [2] and [3]. In [2], the training data is generated using Wikipedia, where the wikilink anchor text is extracted as an entity mention and mapped to its corresponding Freebase entity types. We also take the Wikipedia wikilinks, anchor text and surrounding text, but instead of linking them to Freebase, we link them to DBpedia [4]. The advantage of DBpedia is that it is based on Wikipedia; therefore there is a direct link available between a wikilink and DBpedia through a mappings file.³⁷

Feature name    Description                                         Example
Mention         The entity phrase                                   San Francisco
Head            The syntactic head of the entity phrase             Francisco
Non-head        The non-head tokens in the entity phrase            San
Entity-shape    The word shape of the words in the entity phrase    Aaa Aaaaaaaa
Trigrams        Character trigrams in the entity head               _Fr Fra ran anc nci cis isc sco co_
Word before     The word before the entity phrase                   te
Word after      The word after the entity phrase                    Californië

37 http://downloads.dbpedia.org/2016-04/core-i18n/nl/wikipedia_links_nl.ttl.bz2


Table 1: Description of the extracted features

We base our feature vectors on [3], where we leave out the dependency and topic related features due to processing constraints. This results in the features displayed in Table 1.
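The features of Table 1 can be sketched for a tokenised sentence as below. This is an illustrative reading, not the released software: the head is approximated by the last token of the mention (a real system would use a syntactic parse), the shape function maps each character rather than collapsing runs, and the token list and function names are hypothetical.

```python
def word_shape(token):
    """Map uppercase letters to 'A', lowercase letters to 'a' and digits to '0'."""
    shape = []
    for ch in token:
        if ch.isupper():
            shape.append("A")
        elif ch.islower():
            shape.append("a")
        elif ch.isdigit():
            shape.append("0")
        else:
            shape.append(ch)
    return "".join(shape)


def char_trigrams(token):
    """Return the character trigrams of a token padded with '_' on both sides."""
    padded = "_" + token + "_"
    return [padded[i:i + 3] for i in range(len(padded) - 2)]


def extract_features(tokens, start, end):
    """Build the feature dictionary for the mention tokens[start:end]."""
    mention = tokens[start:end]
    head = mention[-1]  # assumption: the last token stands in for the syntactic head
    return {
        "mention": " ".join(mention),
        "head": head,
        "non_head": mention[:-1],
        "shape": " ".join(word_shape(token) for token in mention),
        "trigrams": char_trigrams(head),
        "word_before": tokens[start - 1] if start > 0 else None,
        "word_after": tokens[end] if end < len(tokens) else None,
    }


# Hypothetical token sequence around the example mention from Table 1.
tokens = ["te", "San", "Francisco", "Californië"]
features = extract_features(tokens, 1, 3)
```

The context words and the trigrams are what allow a classifier to guess a type for a mention it has never seen, since they do not depend on the mention string itself.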

To compare our results to those in previous work, we mapped the DBpedia type hierarchy to the entity typing hierarchy used in [2] and [3]. Out of the 86 types that were present, 9 types could not be mapped to the DBpedia type hierarchy.³⁸ As not all types are present in the dataset, we only find 59 of the types from previous work in our dataset. We also perform a series of experiments with the full DBpedia type hierarchy, resulting in an experiment with 269 types to predict.

As there are no fine-grained entity typing datasets available for Dutch yet, we split the generated dataset into ⅔ for training and ⅓ for testing. This results in about 1 million instances for training on the set with 59 entity types, and 2 million on the set with 269 entity types.

We use the FastText algorithm [5, 6]³⁹ to train our type prediction model. This algorithm learns representations for character n-grams, and words are represented as the sum of the n-gram vectors. This helps in covering morphologically rich languages, words that do not occur often and potentially entity mentions that do not occur in the training corpus.

Experiments and Results

We first evaluate our approach on the entity types from previous work (rows 2-6 in Table 2). At Level 1, coarse-grained entity types (person, location, organisation, and other) are evaluated. These are the same high-level types that are present in most named entity classification tasks. At Level 2, the finer-grained entity types that are directly below these are evaluated (e.g. person/artist and organisation/company). At Level 3, super fine-grained types are evaluated, for which we still achieve a macro F1 of .90 (e.g. person/artist/music and organisation/company/news).

Types                                            Precision   Recall   F1
Level 1: 4 types                                 .98         .98      .98
Level 2: 33 types                                .92         .90      .91
Level 3: 24 types                                .89         .91      .90
Overall (59 types)                               .93         .88      .90
Overall only dark entities (59 types)            .67         .56      .60
DBpedia types (269)                              .68         .52      .57
DBpedia types, only dark entities (269 types)    .50         .41      .44

Table 2: Precision, recall and macro-average F1
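For readers unfamiliar with macro-averaging, the following sketch shows one common way to compute the three scores: per-type precision, recall and F1 are calculated first and then averaged with equal weight per type. It assumes one gold type and one predicted type per mention, which may differ from the evaluation set-up used for Table 2; the labels and the function name are invented for illustration.

```python
def macro_scores(gold, predicted):
    """Return macro-averaged precision, recall and F1 over all observed types."""
    labels = sorted(set(gold) | set(predicted))
    precisions, recalls, f1s = [], [], []
    for label in labels:
        pairs = list(zip(gold, predicted))
        tp = sum(1 for g, p in pairs if g == label and p == label)
        fp = sum(1 for g, p in pairs if g != label and p == label)
        fn = sum(1 for g, p in pairs if g == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    count = len(labels)
    return sum(precisions) / count, sum(recalls) / count, sum(f1s) / count


# Invented toy labels, not taken from the dataset.
gold = ["person", "person", "location", "organisation"]
pred = ["person", "location", "location", "organisation"]
macro_p, macro_r, macro_f1 = macro_scores(gold, pred)
```

Because every type counts equally, rare types weigh as much as frequent ones, which is one reason scores can fall when the type inventory grows from 59 to 269.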

38 The types we could not map were the following: location/structure/government, organization/stock exchange, other/health, other/living thing, other/product/car, other/product/computer, person/education, person/education/student, person/education/teacher
39 https://github.com/facebookresearch/fastText


We also evaluated the approach on only dark entities (i.e. entity mentions that were not present in the training data).⁴⁰ Here we see that the scores drop to an F1 of .60, which is in line with previous research [7]. It is unlikely that there is no overlap between the training and test data, but this issue deserves further investigation.

Furthermore, we see that the results for the DBpedia type hierarchy containing 269 types are significantly lower, but there is less training data available for those and not all 685 DBpedia types are covered. This is partly a result of the mappings file only containing the most specific DBpedia type; for example, http://nl.dbpedia.org/resource/Old_Amsterdam is listed as having type ‘Cheese’ in the mappings file, but its superclass ‘Food’ is not present.

Conclusions and Future Work

We have presented an approach and experiments for fine-grained entity typing for Dutch which can be particularly interesting for collecting information about entities in digital humanities sources. Our results are on par with previous work for English and our software is available at https://github.com/cltl/multilingual-finegrained-entity-typing.

For future work, we aim to test the approach on historical datasets such as the NIOD “Getuigen Verhalen” dataset and Biografisch Portaal. We also intend to compile a subset of most relevant types for the digital humanities domain and provide a trained model for reuse by humanities researchers.

References:

[1] Nadeau, D., Sekine, S.: A survey of named entity recognition and classification. Lingvisticae Investigationes 30(1), 3–26 (2007)

[2] Ling, X., Weld, D.S.: Fine-grained entity recognition. In: AAAI (2012)

[3] Gillick, D., Lazic, N., Ganchev, K., Kirchner, J., Huynh, D.: Context-dependent fine-grained entity type tagging. In: arXiv (2014)

[4] Bizer, C., Lehmann, J., Kobilarov, G., Auer, S., Becker, C., Cyganiak, R., Hellmann, S.: DBpedia - a crystallization point for the web of data. Web Semantics: science, services and agents on the world wide web 7(3), 154–165 (2009)

[5] Bojanowski, P., Grave, E., Joulin, A., Mikolov, T.: Enriching word vectors with subword information. Tech. rep., arXiv (2016), https://arxiv.org/abs/1607.04606

[6] Joulin, A., Grave, E., Bojanowski, P., Mikolov, T.: Bag of tricks for efficient text classification. Tech. rep., arXiv (2016), https://arxiv.org/abs/1607.01759

[7] Yaghoobzadeh, Y., Schütze, H.: Corpus-level fine-grained entity typing using contextual information. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing. pp. 715–725. Association for Computational Linguistics, Lisbon, Portugal (17-21 September 2015)

40 Whilst we made sure the training data and test data were separate on the instance level, popular entities can still be mentioned in both datasets


Session G

1. Predicting familial risk of dyslexia by applying machine learning to infant vocabulary data
Ao Chen*¹,², Frank Wijnen², Charlotte Koster³, Hugo Schnack¹
¹ Department of Psychiatry, Brain Center Rudolf Magnus, University Medical Center Utrecht, Utrecht, the Netherlands
² Institute of Linguistics, Utrecht University, Utrecht, the Netherlands
³ Center for Language and Cognition Groningen, University of Groningen, Groningen, the Netherlands

Background
The combination of rapid progress in the development of computational tools, such as machine learning, and the growing availability of digitized data in language research (e.g., the DANS data archive) and tools to assess these data (e.g., via CLARIAH), has made it possible to investigate language acquisition in an automated way and on a large scale (we used 22,000 vocabulary scores in our study). In this study, we applied a machine learning algorithm to vocabulary data to map the pattern of vocabulary development in individual children. We investigated whether individual differences between children in word knowledge across word classes (e.g., nouns, pronouns, helping verbs) can be used to detect whether a child is at risk of developing dyslexia. Early detection of developmental dyslexia, a specific reading disorder, will enable interventions at an early age, before the onset of formal reading and spelling instruction. Although deviations in early speech/language development have frequently been related to (risk of) dyslexia (van der Leij et al., 2013), none of these markers has been successfully used to predict later language/literacy performance at the individual level. Machine learning is a technique capable of discovering patterns in data to make such predictions. In the past decade machine learning has been successfully employed in, e.g., medicine and the humanities. Recent examples include the prediction of disease course in psychosis (Koutsouleris et al., 2016) and the attribution of the Dutch anthem to a writer who had not previously been considered as its author (Kestemont et al., 2016). The aim of this study was to investigate whether early vocabulary development can be used to predict whether or not an infant is at risk of dyslexia.

Method
We investigated early vocabulary development in two large, independent samples of children between 17 and 35 months of age: children at familial risk of dyslexia (FR; N=495) and typically developing children (TD; N=498). The Dutch version of the MacArthur-Bates Communicative Development Inventory (Words and Sentences) (N-CDI; Fenson et al., 1993) was used to measure each infant’s vocabulary development. This was done by counting the number of words he/she knew in 22 word categories. Together, these 22 counts (the so-called features) formed the feature vector representing the subject. We trained a linear support vector machine (SVM; Vapnik, 1999) to predict at-risk status at the individual level, based on these feature vectors. SVM is a supervised machine learning technique that is able to find patterns in the input data (word counts in 22 categories, in our case) that are related to some output measure (in our case: belonging to the FR or TD group). The training procedure results in a model that optimally predicts for (new) subjects to which group they belong. This prediction is based on the weighted sum of the input variables, where the weights are the result of the optimization procedure during training.
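The weighted-sum prediction described above can be illustrated with a minimal sketch. The category names, weights, intercept and word counts below are invented for illustration (only three of the 22 categories are shown); they are not the values learned in the study, and the training step itself is not reproduced here.

```python
from typing import Dict

# Hypothetical weights for three of the 22 word categories (illustrative only).
# A negative weight means: knowing more words in this category lowers the FR score.
WEIGHTS: Dict[str, float] = {
    "helping_verbs": -0.30,
    "prepositions_and_locations": -0.20,
    "nouns": 0.01,
}
BIAS = 1.0  # hypothetical intercept


def decision_value(counts: Dict[str, int]) -> float:
    """Weighted sum of the word counts plus the intercept."""
    return sum(WEIGHTS[name] * counts.get(name, 0) for name in WEIGHTS) + BIAS


def predict_group(counts: Dict[str, int]) -> str:
    """Return 'FR' when the decision value is positive, otherwise 'TD'."""
    return "FR" if decision_value(counts) > 0 else "TD"


# Two invented children with different vocabulary profiles.
child_a = {"helping_verbs": 1, "prepositions_and_locations": 1, "nouns": 20}
child_b = {"helping_verbs": 6, "prepositions_and_locations": 5, "nouns": 40}
print(predict_group(child_a), predict_group(child_b))
```

In a trained linear SVM the weights and intercept come out of the optimization; the prediction step keeps this simple form.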

Performance of our prediction model was assessed by the percentage of FR subjects that were correctly classified as FR (sensitivity), the percentage of TD subjects correctly classified as TD (specificity) and the balanced accuracy (mean of sensitivity and specificity).
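The three performance measures can be computed directly from true and predicted group labels. The sketch below treats FR as the positive class; the function name and the toy labels are illustrative assumptions, not the study's data.

```python
from typing import Sequence, Tuple


def classification_rates(
    y_true: Sequence[str], y_pred: Sequence[str]
) -> Tuple[float, float, float]:
    """Return (sensitivity, specificity, balanced accuracy), with 'FR' as positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == "FR" and p == "FR" for t, p in pairs)  # FR correctly classified
    fn = sum(t == "FR" and p == "TD" for t, p in pairs)  # FR missed
    tn = sum(t == "TD" and p == "TD" for t, p in pairs)  # TD correctly classified
    fp = sum(t == "TD" and p == "FR" for t, p in pairs)  # TD flagged as FR
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity, (sensitivity + specificity) / 2


# Invented labels: 3 of 4 FR subjects and 2 of 4 TD subjects are classified correctly.
y_true = ["FR", "FR", "FR", "FR", "TD", "TD", "TD", "TD"]
y_pred = ["FR", "FR", "FR", "TD", "TD", "TD", "FR", "FR"]
sens, spec, bal_acc = classification_rates(y_true, y_pred)
print(sens, spec, bal_acc)
```

Balanced accuracy is less sensitive to unequal group sizes than plain accuracy, although the two groups in this study are of nearly equal size.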

The model’s generalizability was tested using cross-validation. In this setup the model is trained and tested on different subsamples.
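The abstract does not state which cross-validation scheme was used. As one common possibility, the sketch below assumes k-fold cross-validation and only shows how subjects could be divided so that every subject is tested exactly once by a model that was not trained on it. The function name, fold count and seed are illustrative assumptions.

```python
import random
from typing import List, Tuple


def k_fold_indices(
    n_subjects: int, k: int, seed: int = 0
) -> List[Tuple[List[int], List[int]]]:
    """Split subject indices into k (train, test) pairs with disjoint test sets."""
    indices = list(range(n_subjects))
    random.Random(seed).shuffle(indices)  # fixed seed keeps the split reproducible
    folds = [indices[i::k] for i in range(k)]  # k folds of near-equal size
    splits = []
    for i, test in enumerate(folds):
        # Training set: every fold except the one held out for testing.
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, test))
    return splits


splits = k_fold_indices(n_subjects=10, k=5)
for train, test in splits:
    print(sorted(test), len(train))
```

Each model would be fitted on `train` and evaluated on `test`, and the per-fold results averaged.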

Results
There was a specific age period, 18-20 months, in which the model was able to predict at-risk (FR) status. At 19-20 months of age, the cross-validation accuracy was 68% (p<0.01), with sensitivity being 70% and specificity being 67%. In the other age groups the accuracy was lower and not significant.

Not all 22 features contributed to the same extent to the discrimination between the FR and TD subjects at age 19-20 months. The weights of 5 word categories were significantly different from zero. The categories ‘helping verbs’ and ‘prepositions and locations’ contributed most. The model had learnt from the data that knowing fewer words in these categories at this age is a significant marker for being at family risk.

Conclusion
Machine learning methods are promising techniques for separating FR and TD children at an early age, before they start reading. There is a sensitive window in which the difference between FR and TD is most evident. The model also indicated the word categories in which FR infants know (on average) fewer words than TD infants. It should be noted that we did not predict the manifestation of dyslexia, but only elevated risk. We will follow these children up, and the ultimate goal is to train a model that is able to discriminate, at an early age, between the FR children who develop dyslexia and those who do not.

References
CLARIAH. http://www.clariah.nl

DANS. https://dans.knaw.nl

Fenson, L., Dale, P. S., Reznick, J. S., Thal, D., Bates, E., Hartung, J. P., et al. (1993). The MacArthur Communicative Development Inventories: User’s Guide and Technical Manual. San Diego, CA: Singular Publishing Group.

Kestemont M, Stronks E, De Bruin M, De Winkel T. Van wie is het Wilhelmus? (2016 Dec) Amsterdam University Press.

Koutsouleris N, Kahn RS, Chekroud AM, Leucht S, Falkai P, Wobrock T, Derks EM, Fleischhacker WW, Hasan A. Multisite prediction of 4-week and 52-week treatment outcomes in patients with first-episode psychosis: a machine learning approach. Lancet Psychiatry. 2016 Oct;3(10):935-946. doi:10.1016/S2215-0366(16)30171-7.

van der Leij, A., van Bergen, E., van Zuijen, T., de Jong, P., Maurits, N., and Maassen, B. (2013). Precursors of developmental dyslexia: an overview of the longitudinal dutch dyslexia programme study. Dyslexia 19, 191–213. doi:10.1002/dys.1463.

Vapnik, VN. (1999). An overview of statistical learning theory. Neural Networks, IEEE Transactions on, 10(5), 988-999. doi:10.1109/72.788640.

2. The Dictionary of the Southern Dutch Dialects (DSDD): Designing a Virtual Research Environment for digital lexicographical research
Prof. dr. Jacques Van Keymeulen
Ghent University, Belgium

The southern Dutch dialect area consists of four dialect groups: (1) the Flemish dialects, spoken in French Flanders (France), West and East Flanders (Belgium) and Zeeland Flanders (The Netherlands); (2) the Brabantic dialects, spoken in Antwerp and Flemish Brabant (Belgium) and Northern Brabant (The Netherlands); (3) the Limburgian dialects (spoken in the Limburg provinces of Belgium and The Netherlands); (4) the Zeeland dialects, spoken in Zeeland and Goeree-Overflakkee (the Netherlands).

The dialect vocabulary of the Flemish, Brabantic and Limburgian dialects is collected in three regional dictionaries (WVD, WBD and WLD respectively), which are set up according to the same plan, conceived by prof. A. Weijnen (Nijmegen): they are onomasiologically arranged and published in thematic fascicles. Contrary to their titles, these dictionaries are to be considered as geographically-orientated inventories of word usage, and not as dictionaries proper, since it is impossible to describe meaning in an onomasiologically arranged dictionary. They are atlases, not dictionaries! We retain the word ‘dictionaries’, however, since the three projects are traditionally known as such.

Figure 1: Research areas of the 4 regional dialect dictionaries of the southern Dutch area

The three dictionaries describe the vocabulary of the traditional dialects of the first half of the twentieth century in the southern part of the Dutch language area, in a joint international and inter-university project. The WBD, the 'mother' of the two other projects, was finished in 2005; the WLD was completed in 2008. They were compiled at the University of Nijmegen and the University of Leuven. The WVD started 12 years later than its sister projects (in 1972 at Ghent University, by prof. W. Pée) and will continue until about 2019.

The dictionaries were set up in parallel in order to make possible the aggregation of the data, thus fulfilling the objectives of the founders of the projects. To that effect, in 2016 a consortium of 11 linguists, computer scientists, digital humanities experts and geographers was created supporting the project “Dictionary of the Southern Dialects” (DSDD). It aims at the aggregation and standardization of the three comprehensive dialect lexicographic databases into one DSDD-database (to which hopefully the alphabetically arranged WZD will be added in the future). In particular, dialectologists from Ghent University work closely with the Ghent Centre for Digital Humanities (GhentCDH) to prepare the ground for the aggregation of the three Southern Dutch dialect databases and their exploitation via a Virtual Research Environment for digital lexicographical research. The Ghent team will work closely with the Instituut voor de Nederlandse Taal with regard to the technical and linguistic sustainability of the DSDD. Through this collaboration interoperability with CLARIN will also be ensured. The DSDD is additionally a pilot project of DARIAH-BE Belgium.

The DSDD Virtual Research Environment will enable a research programme with new research questions, particularly in the field of quantitative lexicology and geographical analysis. During the project 2-3 research use cases will be developed to test the applicability of the newly aggregated DSDD for digital scholarship. For example:

1. What systematic lexico-geographical patterns do the southern Dutch dialects show? Do they coincide with the traditional ones, based on phonology (see De Vriendt 2012)? Are there geographical patterns in semantics?

2. In order to explore the geographical spreading of several dialectology concepts and to link them to “Kloeke plaatscodes” (which are used in linguistic research for mapping/linking dialectology concepts to geographical regions), a set of generic building blocks for automatic atlas/heatmap generation will be developed. Segmentation and clustering techniques can be run over the generated atlases/heatmaps in order to automatically detect the homogeneity (or heterogeneity) of a particular dialectology concept. Furthermore, spatial querying techniques will be supported in order to geographically search/explore these kinds of dialectology concepts.

3. Cluster analysis and exploration of the linkage (and visualization) of linguistic data with synchronic and diachronic extralinguistic data of all kinds.
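To make the notion of homogeneity in use case 2 concrete, the sketch below scores a concept by how evenly its lexical variants are spread over the recorded places. It is a simple entropy-based stand-in, not the segmentation and clustering techniques the project plans to use. The place codes are placeholders rather than real Kloeke codes, and the records are invented.

```python
import math
from collections import Counter
from typing import Iterable, Tuple

Record = Tuple[str, str, str]  # (place code, concept, lexical variant)

# Invented records; "X001" etc. are placeholder place codes, not real Kloeke codes.
RECORDS = [
    ("X001", "potato", "patat"),
    ("X002", "potato", "patat"),
    ("X003", "potato", "aardappel"),
    ("X004", "potato", "patat"),
    ("X001", "butterfly", "vlinder"),
    ("X002", "butterfly", "vlinder"),
]


def homogeneity(records: Iterable[Record], concept: str) -> float:
    """1.0 if every place uses the same variant; 0.0 if the variants are evenly split."""
    counts = Counter(variant for _, c, variant in records if c == concept)
    total = sum(counts.values())
    if total == 0:
        raise ValueError(f"no records for concept {concept!r}")
    if len(counts) == 1:
        return 1.0
    # Shannon entropy of the variant distribution, normalised by its maximum.
    entropy = -sum((n / total) * math.log(n / total) for n in counts.values())
    return 1.0 - entropy / math.log(len(counts))


print(homogeneity(RECORDS, "butterfly"))
print(round(homogeneity(RECORDS, "potato"), 3))
```

This score ignores where the places lie on the map; a spatial analysis would additionally need the coordinates attached to each place code.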

By the end of the project, the project will a) make the newly aggregated DSDD available via a user-friendly website and b) enable the DSDD for digital scholarship. To this end, a professionally designed, user-friendly web application, or Virtual Research Environment, will be created, including an application programming interface (API) for data export. The exported data will be analysed with existing digital research tools (e.g. for geo-visualisation, qualitative lexicology and dialectometry) to validate the research case studies described above.

At the DHBenelux Conference, we will present the plan for the aggregation and the structure of the database, and dwell on the different ‘editorial’ problems that have to be solved. The different dictionaries/databases were indeed composed over a very long period of time, at different places (Nijmegen, Leuven, Ghent) and by different editors; hence a great number of inconsistencies arose over time. In order to compose an aggregated DSDD-database, a number of standardization activities have to be carried out. Additionally, we will present the initial results of the Virtual Research Environment requirements analysis.

3. Establishing interdisciplinary dialogue: conducting a qualitative investigation into linguistic requirements for Natural Language Generation
Emma Clarke and Owen Conlan

Background
Dialogue systems, commonly referred to as chatbots, are becoming increasingly popular. In 2016, chatbot was shortlisted as word of the year by Oxford Dictionaries41 and platforms such as Facebook Messenger2 are frequently utilised to communicate updates or information, sell products or provide services. While the goal of a dialogue system which communicates naturally with its user appeared to have been ‘within reach’ as far back as 2001 (Rambow et al., 2001), current Natural Language Generation (NLG) research approaches continue to have limitations when it comes to the ‘natural-ness’ of their interactions (LeCun et al., 2015) (Reiter and Dale, 2006) (Manning and Schütze, 1999). Thus, the NLG field is looking to move towards more natural conversational interfaces by taking influence from natural human speech and as dialogue systems become more human-like, the interspersion of persuasive language within them will become more applicable. Some prior research has been carried out on the development of persuasive dialogue systems (Prakken, 2009) (Parsons et al., 2003) (Walton and Krabbe, 1995). Most recently, Hiraoka et al. (2016) observed that “these persuasive dialogue systems are in their first stages of development, and are far from the abilities of their human counterparts, both in terms of persuasive ability, and also ability to achieve user satisfaction”. The focus of this research project is the language of persuasion, namely rhetorical devices. We believe that in order to understand the requirements of the NLG community in this area, the establishment of cross-disciplinary conversation is essential.

Challenge
The nuances of human speech such as sarcasm, slang and wordplay, and the human ability to process and understand these subtleties, make them equally fascinating and frustrating for researchers in the areas of natural language processing, understanding and generation. A major challenge faced by Natural Language Generation (NLG) researchers is how to incorporate linguistic understanding into NLG systems in order to generate more natural sounding language. This challenge is expected to persist in the next generation of natural language systems (Dale, 2016) (LeCun et al., 2015) (Ward and DeVault, 2015) (Gartner, n.d.).

Often lacking in dialogue systems and NLG research is linguistic expertise, presented in an understandable form, that dissects natural elements of human speech, particularly elements which are difficult for machines to learn. Ward and DeVault (2015) highlight this interdisciplinary engagement in their ‘Ten Challenges in Highly-Interactive Dialog Systems’.

As interdisciplinary research becomes more prevalent, the requirement for computer science practitioners to engage with non-technical researchers from diverse backgrounds will increase. Dale (2016) also refers to cross-disciplinary conversations and encourages dialogue systems developers to access the expertise of the computational linguistics community, in which research into discourse phenomena has been on-going since the inception of the field. Dale (2016) presents an encouraging call to action: “If we want to have better conversations with machines, we stand to benefit from having better conversations among ourselves.”

Approach
The overall aim of this research (fig. 1) is to establish an approach to understanding how rhetorical devices function in natural human speech in order to propose a method which can be built into practical NLG applications such as dialogue systems (chatbots). The work will draw upon structured rather than random influence by observing the usage of these linguistic strategies for persuasion in human speech. From these observations, a TEI schema has been customised in order to mark up a set of rhetorical devices within a corpus.

41 https://en.oxforddictionaries.com/word-of-the-year/word-of-the-year-2016
2 https://www.messenger.com/

Figure 1

This paper will present findings on the central component of the diagram above: the cross-disciplinary engagement with NLG practitioners in order to develop a pragmatic approach to incorporating persuasive language into dialogue systems. We explore how a customised TEI schema is used in semi-structured interviews with NLG researchers (an ongoing, iterative process). Based on qualitative findings from the interviews, the schema is revised and amended to incorporate requirements and suggestions. The final schema will ultimately be used to mark up and annotate speeches from the corpus so that they can be added to NLG systems as part of the system training.
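The abstract does not describe the customised schema itself. Purely as an illustration of what marked-up rhetorical devices could look like and how they might be read back for system training, the sketch below uses the standard TEI `seg` element with `type` and `subtype` attributes. The attribute values, the sample sentences and the function name are hypothetical and not taken from the authors' schema.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical TEI-style fragment; the attribute values are invented for illustration.
SAMPLE = """<text xmlns="http://www.tei-c.org/ns/1.0">
  <body>
    <p>
      <seg type="rhetoricalDevice" subtype="anaphora">We will listen. We will learn. We will act.</seg>
      <seg type="rhetoricalDevice" subtype="rhetoricalQuestion">Who among us would do otherwise?</seg>
    </p>
  </body>
</text>"""

TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}


def count_devices(xml_text: str) -> Counter:
    """Count annotated rhetorical devices by subtype."""
    root = ET.fromstring(xml_text)
    segs = root.findall(".//tei:seg[@type='rhetoricalDevice']", TEI_NS)
    return Counter(seg.get("subtype") for seg in segs)


print(count_devices(SAMPLE))
```

Annotations of this kind can be exported as labelled spans, which is the form both rule-based and learning-based NLG pipelines typically consume.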

Method
A series of semi-structured interviews is being carried out in which ten NLG practitioners are asked questions in order to understand current and future requirements of NLG applications such as dialogue systems.

In the course of each interview, the TEI schema is presented and the suggestions of the NLG practitioners sought. The interviews are recorded and the resulting outcomes are analysed using ATLAS.ti software. The results are then summarised to create an overall picture of NLG researcher requirements.

Outcomes (to date)
The process outlined above is ongoing at the time of submission. However, preliminary findings from the interviews can be summarised as follows:

• Both template-driven and deep learning systems use annotated data. In a rule-based approach, annotations are used to help further engineer features by hand while a deep learning approach uses annotation to help learn and understand structure.

• There is an emerging question in NLG research about how to deal with sentence structure and nuance. Increasingly, researchers are using marked up text to help systems learn higher order structures.

• Pattern-matching alone is not a robust enough approach.

• A very clear annotation schema that marks up features of rhetorical devices would be useful for NLG researchers working in the area of persuasion.

Conclusion
The aim of this research is to engage in an interdisciplinary conversation with NLG practitioners. The process of engagement and the findings from the interviews will be presented in this paper.

References
Dale, R., 2016. The return of the chatbots. Nat. Lang. Eng. 22, 811–817.

Gartner, n.d. Gartner’s 2016 Hype Cycle for Emerging Technologies Identifies Three Key Trends That Organizations Must Track to Gain Competitive Advantage [WWW Document]. URL http://www.gartner.com/newsroom/id/3412017 (accessed 11.24.16).

Hiraoka, T., Neubig, G., Sakti, S., Toda, T., Nakamura, S., 2016. Construction and analysis of a persuasive dialogue corpus, in: Situated Dialog in Speech-Based Human-Computer Interaction. Springer, pp. 125–138.

LeCun, Y., Bengio, Y., Hinton, G., 2015. Deep learning. Nature 521, 436–444.

Manning, C.D., Schütze, H., 1999. Foundations of statistical natural language processing. MIT Press, Cambridge, Mass.; London.

Parsons, S., Wooldridge, M., Amgoud, L., 2003. Properties and complexity of some formal inter-agent dialogues. J. Log. Comput. 13, 347–376.

Prakken, H., 2009. Models of persuasion dialogue, in: Argumentation in Artificial Intelligence. Springer, pp. 281–300.

Rambow, O., Bangalore, S., Walker, M., 2001. Natural language generation in dialog systems, in: Proceedings of the First International Conference on Human Language Technology Research. Association for Computational Linguistics, pp. 1–4.

Reiter, E., Dale, R., 2006. Building natural language generation systems, Digitally printed 1st pbk. version. ed, Studies in natural language processing. Cambridge University Press, Cambridge, U.K.; New York.

Walton, D., Krabbe, E.C., 1995. Commitment in dialogue: Basic concepts of interpersonal reasoning. SUNY press.

Ward, N.G., DeVault, D., 2015. Ten challenges in highly-interactive dialog systems, in: AAAI Spring Symposium on Turn-Taking and Coordination in Human-Machine Interaction.

Session H

1. Getting the Bigger Picture: Exploratory Search and Narrative Creation for Media Research into Disruptive Events
dr. Berber Hagedoorn, University of Groningen, Research Centre for Media Studies and Journalism
dr. Sabrina Sauer, University of Groningen, Research Centre for Media Studies and Journalism

Introduction
Digital Humanities centres on questions that are raised by and answered with digital tools in the Humanities. At the same time, it interrogates the value and limitations of digital methods in Humanities’ disciplines. While it is important to understand how digital technologies can offer new avenues for Humanities research, it is equally essential to understand, and therefore be able to interpret, ‘the user side’ of Digital Humanities: specifically, how Humanities researchers appropriate and domesticate search tools to ask and answer new questions, and apply digital methods. Previous user research in Digital Humanities concentrates on assessing, for example, how and why Digital Humanities benefits from studies into user needs and behaviour (Warwick, 2012), user requirement research, as well as participatory design research (Kemman & Kleppe, 2014).

Exploratory search is crucial for Humanities researchers who draw upon media materials in their research. Audio-visual, online and digital sources are in abundance, scattered across different platforms, and changing daily in our contemporary landscape. Supporting researchers' explorations becomes even more important when scholars study media events. A ‘media event’ is an event with a specific narrative that gives the event its meaning, and is in contemporary societies increasingly recognized as non-planned or disruptive. Disruptive media events, such as the ‘sudden’ rise of populist politicians, terrorist attacks or environmental disasters, are shocking and unexpected, making them difficult to interpret. This leads to problems for media researchers who analyse how narratives construct different political, economic or cultural meanings around such events. Previous research argues that media events should always be viewed in relation to their wider political and sociocultural contexts. Events, as they unfold in the media, may correspond to long-term social phenomena, and the way in which such events are ‘constructed’ has particular connotations (Jiménez-Martínez, 2016). Specific actors (newscasters, governments, institutions) use media events to build narratives in line with their own political, economic or cultural purposes. Media researchers also build narratives around events; prior research underlines the importance of visualizing, constructing and storing of narratives during the information navigation to contextualize material (Akker et al., 2011; Kruijt, 2016; De Leeuw, 2012). Offering media researchers the ability to explore and create lucid narratives about media events therefore greatly supports their interpretative work.

This paper proposes to add to this body of research by presenting the insights of a cross-disciplinary user study that involves, broadly speaking, researchers studying audio-visual materials, in a co-creative design process, set to fine-tune and further develop a digital tool that supports Humanities’ research through exploratory search. This paper focuses on how researchers, in both academic and professional settings, use digital search technologies in their daily work practices to discover and explore digital audio-visual archival material. We focus specifically on three user groups, namely (1) Media Studies researchers, (2) Humanities researchers that use audio-visual materials as a source and (3) Media professionals. These user groups are the foreseen end users of the tool, because they create audiovisual narratives for their respective work purposes. We set up co-creative design sessions with 74 participants (group 1: 24; group 2: 40; group 3: 10) to observe and reflect on the practices of media researchers in terms of how they interact with search tools to explore, access and retrieve digitized audio-visual material, in order to interpret, and in some cases re-use, this material in new audio-visual productions.

Methodology
In our user study, we employ a user-centred design methodology to evaluate and fine-tune the exploratory search tool DIVE+ media browser. It offers events-driven exploration of digital heritage material, where events are prominent building blocks in the creation of narrative backbones (De Boer et al., 2015), and links a variety of different media sources and collections. DIVE+ offers intuitive exploration of media events at different levels of detail. It connects media objects, subjects (“concepts”), events, and persons to aid in the formulation of research questions, and to contextualize the former into overarching narratives and timelines. Our main research question throughout the case study is: how does exploratory search support media researchers in their study of how media events are constructed across different media and instilled with specific cultural or political meanings? To be able to answer this question, we study how media researchers construct navigation paths via exploratory search and, by means of user studies, evaluate the role of narratives in (1) learning and (2) research. In this process, we compare DIVE+ to other online search tools.

The user study observes media researchers as they use DIVE+ to explore media events, across 3 stages: (1) research question formulation; (2) DIVE+ use; and (3) comparative user evaluations of the DIVE+ browser against other online search tools. The collected data, consisting of both qualitative (observational and focus group) data and logging data gathered during user testing, provides insights about how media researchers search and explore digital audio-visual archives. We utilize a case study approach, which combines grounded theory (that fosters an understanding of how researchers interpret and create narratives) with usability methodologies, such as work task evaluations. This, first of all, allows us to draw conclusions about how search tools and digital technologies co-construct the researcher’s professional practice. Second, the data helps us probe the question of how the ‘digitality’ of search and retrieval shapes the practice of media research and, by extension, creative processes.

The research presented in this paper takes an interdisciplinary approach: it combines insights from Media Studies, as well as from Information Studies and Science and Technology Studies, and integrates ideas about narrative creation, search practices, and overarching notions about how users and technologies co-construct meaning. Therefore the presented research does not focus on how Digital Humanities’ tools have an impact on researchers’ practices, but rather analyses how researchers make use of search tools. We subsequently (1) draw conclusions about scholarly practice and the role of search technologies for digitized audio-visual materials therein; and (2) present lessons learned on how to optimize the search tool that is used, in order to improve its performance.

Acknowledgments
The authors would like to thank the anonymous reviewers of the first version of this abstract for their helpful comments and suggestions. This research was supported by the Netherlands Institute for Sound and Vision (partially in the context of Berber Hagedoorn as Sound and Vision Researcher in Residence in 2016-7) and the Netherlands Organisation for Scientific Research (NWO) under project number CI-14-25 as part of the MediaNow project. This research was also supported by CLARIAH, Common Lab Infrastructure of Arts and Humanities, in the context of the Research Pilot Narrativizing Disruption: How exploratory search can support media researchers to interpret ‘disruptive’ media events as lucid narratives (https://www.clariah.nl/projecten/research-pilots/nardis), CLARIAH-project number CC17-13. All content represents the opinion of the authors, which is not necessarily shared or endorsed by their respective employers and/or sponsors.

Bibliography
Akker, C. van den, Legêne, S., Erp, M. van, Aroyo, L., Segers, R., Meij, L. van der, Ossenbruggen, J. van, Schreiber, G., Wielinga, B., Oomen, J., Jacobs, G. (2011). Digital Hermeneutics: Agora and the Online Understanding of Cultural Heritage Categories and Subject Descriptors. WebSci 11, Koblenz, Germany.

Boer, V. de, Oomen, J., Inel, O., Aroyo, L., Staveren, E. van, Helmich, W., & Beurs, D. de. (2015). DIVE into the Event-Based Browsing of Linked Historical Media. Web Semantics: Science, Services and Agents on the World Wide Web, 35(3), 152–158.

De Leeuw, S. (2012). European Television History Online: History and Challenges. VIEW Journal of European Television History and Culture, 1(1), 3–11.

Jiménez-Martínez, C. (2016). Integrative disruption: the rescue of the 33 Chilean miners as a live media event. In: Fox, A., (ed.) Global Perspectives on Media Events in Contemporary Society. IGI Publishers, Hershey, USA, 60-77.

Katz, E., and Liebes, T. (2007). ‘No More Peace!’: How Disaster, Terror and War Have Upstaged Media Events. International Journal of Communication 1, 157-166.

Kemman, M., and Kleppe, M. (2014). "User Required? On the Value of User Research in the Digital Humanities." Selected Papers from the CLARIN 2014 Conference, October 24-25, 2014, Soesterberg, The Netherlands. No. 116. Linköping University Electronic Press.

Kruijt, M. (2016). Supporting Exploratory Search with Features, Visualizations, and Interface Design: A Theoretical Framework. University of Amsterdam.

Warwick, C. (2012). "Studying users in Digital Humanities." Digital Humanities in practice, 1-21.

2. Bias in the analysis of multilingual legislative speech
Laura Hollink, Astrid van Aggelen, Jacco van Ossenbruggen
Centrum Wiskunde & Informatica, Amsterdam

In this paper we investigate the application of natural language processing tools to the multilingual proceedings of the European Parliament. This work is part of a study in which we explore (1) how subcorpora in different languages may lead to different conclusions about the political landscape, (2) how to determine what a potential language-related bias originates from, and (3) to what extent we can limit or even prevent an unwanted language-bias.

Parliamentary speech has been used to study party positions [1, 2, 3], issue selection [4, 5, 6, 7] and the level of disagreement within a debate [8]. Many studies have moved away from manual coding (which is done in e.g. [4, 5]) and instead position speech texts on one or more (latent) dimensions in statistical models based on relative word frequencies [1, 2, 3, 6, 7, 8], often in combination with basic pre-processing steps such as stemming and stop-word removal. These models and tools, while imperative to analyse bigger datasets, add a source of errors and bias. One source of potential bias comes from the fact that the tools used perform differently on different languages. Considering that the aforementioned studies were carried out on the European, Irish, US, Spanish, Norwegian and Swedish legislatures, the comparability and reproducibility of the results for different languages is unclear.

In the European Parliament, the spoken accounts appear in (currently) 24 languages. Here, the uncertainty stems not only from tools that perform differently on each language, but also from the fact that the availability of data in each language varies. Members of Parliament (MEPs) are free to speak in any of the official languages. Speeches are sometimes translated into (some) other languages, depending on prioritization within the EP, specific translation-requests of the members and (supposedly) budgetary constraints. Thus, we are left with 24 subcorpora of varying size, one per language, including both original and translated speech.

The need to study language-effects in this context has been recognised before. Proksch et al. [3] reported a modest language-effect42 in their study of party positions in the European Parliament, which they ascribed to translation rather than actual differences in position taking between three countries. However, while the overall effect may be small, we argue that specific local effects could still lead to significant biases in the results. For example, French translations of German texts seemed to systematically get a more neutral position than the original text, while the opposite was not the case. It is important to realise that the proceedings of the European Parliament are not only a corpus for researchers. Residents of the European Union have a right to access these documents in order to make informed votes and to hold the MEPs accountable43. This right would be compromised when French speaking citizens come to different conclusions about what has been discussed than German speaking citizens. Our aim is to gain insight into how working with subcorpora in different languages may lead to different conclusions about the political landscape.

In this study, we use the data provided by the Talk of Europe project [9], in which speech transcripts and all available translations were crawled from the website of the EP44, and converted to the semantic web format RDF. Data is available from 1999 to 2015 and contains around 300K speeches in 22K debates. We apply topic detection to six language-specific subcorpora of the proceedings of the European Parliament: German, English, French, Italian, Spanish and Dutch. We use the JEX software developed by the European Commission's Joint Research Centre, which learns multi-label categorisation rules from documents that were previously manually indexed using the multilingual Eurovoc thesaurus [10]. The advantage of using this tool over, for instance, widely used topic modeling approaches such as LDA [11], is that the output is directly comparable across languages: the tool uses a single thesaurus, Eurovoc, to classify documents in each language, and concepts in the Eurovoc thesaurus have labels in all languages. In a later stage of the study, we plan to include other topic detection techniques, and widen the scope to all EU languages.

Over 2000 distinct Eurovoc topics were detected in the six subcorpora. The frequency distributions over topics vary per language. Figure 1 visualises the distance between languages. We use Kullback–Leibler divergence [12], a non-symmetric measure for the difference between two distributions. A higher score, visualized as a redder colour, signifies a greater distance. For example, Italian and French are relatively close, while Spanish and German are far apart. There are four hypotheses as to what these differences originate from:

1. MEPs speaking one language indeed speak about different topics than their colleagues who speak in another language.

2. There is a bias in the selection of speeches that are being translated.

3. There is a bias in how certain topics are translated, e.g. translators use more ambiguous or polarized language.

4. The topic detection tool works differently on one language than on another.

42 A correlation coefficient ranging between 0.86 and 0.93 when comparing party positions derived from texts in German, French and English [3].
43 Regulation (EC) No 1049/2001 of the European Parliament and of the Council
44 http://www.europarl.europa.eu


Figure 1: Heatmap of differences between topic distributions in languages.
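The language-to-language distances behind a heatmap of this kind can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the topic labels are invented toy data, the function names are hypothetical, and the additive smoothing is an assumption of this sketch (the abstract does not say how zero counts were handled).

```python
import math
from collections import Counter


def topic_distribution(topic_labels, vocabulary, smoothing=1.0):
    """Turn a list of topic labels into a smoothed probability distribution.

    Additive smoothing (an assumption of this sketch) keeps every
    probability positive, so the divergence below is always finite.
    """
    counts = Counter(topic_labels)
    total = len(topic_labels) + smoothing * len(vocabulary)
    return {t: (counts[t] + smoothing) / total for t in vocabulary}


def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q), measured in bits."""
    return sum(p[t] * math.log2(p[t] / q[t]) for t in p if p[t] > 0)


# Hypothetical topic assignments per language (toy data, not from the study).
topics_by_language = {
    "fr": ["nuclear weapons", "fisheries", "fisheries", "budget"],
    "it": ["nuclear weapons", "fisheries", "budget", "budget"],
    "de": ["budget", "budget", "budget", "fisheries"],
}
vocabulary = sorted({t for labels in topics_by_language.values() for t in labels})
distributions = {
    lang: topic_distribution(labels, vocabulary)
    for lang, labels in topics_by_language.items()
}

# Full matrix with one cell per ordered language pair; because the measure
# is non-symmetric, the cell (a, b) generally differs from the cell (b, a).
distances = {
    (a, b): kl_divergence(distributions[a], distributions[b])
    for a in distributions
    for b in distributions
}
```

Since the divergence is not symmetric, a heatmap built from such a matrix needs both triangles; averaging the two directions would give a symmetric variant, at the cost of no longer being the measure cited as [12].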

In our presentation, we will tackle this issue from two sides. Firstly, we compare different subsets of topics based on whether or not speeches were translated, and to which languages, to explore hypotheses 1 and 2. Then, to study hypothesis 4 (and to a lesser extent hypothesis 3) we zoom in to topics that appear to be particularly distinctive between languages, and compare the topic annotations to what was actually said in the debates. As an example of the latter method, Figure 2 shows the differences in frequency of the detected topics “nuclear weapons” and “nuclear energy”. Remarkably, only French and Italian speeches seem to be about nuclear weapons, while English and Spanish speeches are often about nuclear energy. As a comparison, Figure 3 plots the occurrences of the phrases “nuclear weapons” and “nuclear energy” (and translations thereof) in the raw speech texts. Here, part of the effect is gone, suggesting an error of the topic annotation software, while part of the effect remains: German texts indeed seem to talk less about both nuclear weapons and nuclear energy.

With this study, we aim to contribute to the discussion about systematic methods for tool criticism and source criticism in a complex multilingual context like the European Parliament.

Figure 2: Frequency of topics in debates.


Figure 3: Frequency of phrases in debate texts.

References

[1] Benoit, Kenneth, and Michael Laver. Estimating Irish Party Positions Using Computer Wordscoring: The 2002 Elections. Irish Political Studies Vol. 18, Iss. 1, 2003.

[2] Laver, Michael J., Kenneth R. Benoit, and John Garry. Extracting Policy Positions from Political Texts Using Words as Data. American Political Science Review 97(2): 311–31, 2003.

[3] Proksch, S.-O. and Slapin, J. B. Position Taking in European Parliament Speeches, British Journal of Political Science, 40(3), pp. 587–611, 2010.

[4] Hanna Bäck, Marc Debus & Jochen Müller. Who Takes the Parliamentary Floor? The Role of Gender in Speech-making in the Swedish Riksdag. Political Research Quarterly 67: 504–518, 2014.

[5] Markus Baumann. Constituency Demands and Limited Supplies: Comparing Personal Issue Emphases in Co-sponsorship of Bills and Legislative Speech. Scandinavian Political Studies, Vol. 39, issue 4, pp. 366-387, 2016.

[6] Pardos-Prado, Sergi, and Iñaki Sagarzazu. The Political Conditioning of Subjective Economic Evaluations: The Role of Party Discourse. British Journal of Political Science 46(4), 799-823, 2016.

[7] Kevin M. Quinn, Burt L. Monroe, Michael Colaresi, Michael H. Crespin, Dragomir R. Radev. An automated method of topic-coding legislative speech over time with application to the 105th-108th US Senate. Midwest Political Science Association Meeting. 2006.

[8] Benjamin E. Lauderdale, Alexander Herzog. Measuring Political Positions from Legislative Speech. Polit Anal; 24(3): 374-394, 2016.

[9] Astrid van Aggelen, Laura Hollink, Max Kemman, Martijn Kleppe, and Henri Beunders. The debates of the european parliament as linked open data. Semantic Web, 8(2): 271–281, 2017.

[10] Pouliquen Bruno, Steinberger Ralf, Camelia Ignat. Automatic Annotation of Multilingual Text Collections with a Conceptual Thesaurus. In Proceedings of the Workshop Ontologies and Information Extraction at the Summer School The Semantic Web and Language Technology - Its Potential and Practicalities (EUROLAN'2003). Bucharest, Romania, 28 July - 8 August 2003.

[11] Blei, David M., Ng, Andrew Y., Jordan, Michael I. Lafferty, John, ed. Latent Dirichlet Allocation. Journal of Machine Learning Research. 3(4–5): pp. 993–1022, 2003.

[12] Kullback, S., Leibler, R. A. On information and sufficiency. Annals of Mathematical Statistics. 22(1): 79–86, 1951.


Session I

Cultural Heritage Data for Research: A Europeana Research Panel

Nienke van Schaverbeke, Head of Europeana Collections
Marjolein de Vos, Europeana Data Partner Services
Dr. Agiatis Benardou, Digital Curation Unit, R.C. "Athena", Institute for the Management of Information Systems

Panel members:

Nienke van Schaverbeke - Head of Europeana Collections - session Chair

Dr. Agiatis Benardou - Digital Curation Unit, R.C. "Athena", Institute for the Management of Information Systems - Researcher Needs Management

1 Member of our Board from a research network (http://research.europeana.eu/blogpost/europeana-research-advisory-board-established) - TBC

Marjolein de Vos - Europeana, Digitised Medieval Manuscripts Maps - Data Quality

Dr. Caroline Ardrey - University of Birmingham - Europeana Grants Winner

Dr. Dana Mustata - University of Groningen. Academic in a digital humanities related field, outsider to Europeana - TBC

Cultural Heritage Data for Research: A Europeana Research Panel

In this panel, members of the Europeana Research Advisory Board, Europeana Data Partner Services, one of the Research Grants winners and, importantly, an academic external to Europeana will present and discuss the value of Europe’s cultural heritage data for research in the humanities and social sciences, and the ways in which Europeana Research is promoting and enabling its use. The panel is part of a larger discussion going on about making cultural heritage available for research and the opportunities, challenges, and considerations involved in this.

In short, the panel will focus on the following points:

• Europeana Research - Objectives & Achievements
• Relationship to other research networks and infrastructures (DARIAH, CLARIN, EHRI, Parthenos etc)
• Researcher needs and community engagement
• Data aggregation and quality improvement
• Using Europeana data in research

Europeana Research was established as a link between cultural heritage institutions and researchers. We recognize that undertaking research on the digitised content of Europe’s galleries, museums, libraries, and archives has huge potential that should be exploited. But issues with regard to licensing, interoperability, and access can often impede the re-use of that data in research. Europeana Research aims to help with these issues, liberating cultural heritage for meaningful academic re-use. We work on a series of activities to enhance and increase the use of Europeana data for research, and develop the content, capacity, and impact of Europeana, by fostering collaborations between Europeana and the cultural heritage and research sector, as well as liaising with other digital research infrastructures and networks.

Europeana Research is governed by an Advisory Board comprising renowned digital humanities experts who help us grow and strengthen services for DH researchers. In the first section of the panel we will highlight our main objectives and greatest achievements, such as the Research Grants Programme.

Following this introduction, one of our panel members, a representative from a research network that we collaborate with and an academic who is not connected to Europeana will expand and elaborate on this relationship between their network and Europeana, and the value thereof.

Since our target audience are research communities in the humanities and the social sciences, it is vital to understand their heterogeneous needs vis-à-vis their information behaviour and their interaction with digital content. In this part of the panel, we will go into detail about how we come to understand the needs of our users, how to cater to them, and how we continuously develop and further this understanding and adapt to the requirements.

With more than 54 million objects from 40 countries and in a variety of languages, the Europeana portal contains a substantial amount of data to manage. The Data Partner Services team not only works continuously on ingesting new data for the portal, but also invests time in evaluating and improving existing data. We make data quality plans with aggregators and direct providers to further the findability and granularity of the records in the portal. Furthermore, there is a specially assigned Data Quality Committee that works on refining and expanding the Europeana Data Model. During this part of the panel, we will talk about the work that is being done from the metadata perspective on data quality, the importance of understanding researchers' needs for this, and the value of cultural heritage data for research.

In 2016 the Europeana Research Grants Programme was launched, in which Digital Humanities researchers were encouraged to apply with a project where Europeana data would be central in answering their research question. The unprecedented success of this call for proposals shows us how important it is to make heritage data available; the variety in ideas shows us the range of potential of what is in the portal. To further illustrate and strengthen the points that will be mentioned in the panel, one of the winners of the Europeana Research Grants Programme 2016 will discuss her project as a showcase of Europeana data re-use for research and the potential offered to research communities through open access, clear licensing, and adequate digital tools.

After providing short explanations on the points mentioned in this proposal, we will encourage discussion from the panel and the audience on these matters. These could lead to valuable insights for Europeana Research in the wider discussion of opening up cultural heritage for the research community. We also welcome suggestions for Europeana Research’s future activities and improving services.


Session J

Text mining in practice: A discussion on user-applied text mining techniques in historical research.
Language: English, Duration: 60 minutes

In this panel we look at the application of text mining techniques in historical research. In recent years, text mining has come within reach of any vaguely computer-literate scholar. The growing availability of large digital text collections leads to growing abilities to apply digital and quantitative approaches to the study of historical texts. Commonly used languages and statistical environments such as Python and R offer applicable software solutions for free. This has liberated historians and other humanities scholars from the shackles of time-consuming and often expensive programming work by hired external programmers.

Techniques like topic modelling, word embeddings, sentiment and emotion mining are increasingly being used in the humanities and social sciences. Historians, political scientists, sociologists and others now have the opportunity to use advanced text mining techniques on large datasets from their desktops. Although still mostly experimental, the potential gains now appear enormous.

It is often claimed that this enables researchers to study concepts and developments in longitudinal, systematic and quantitative ways that were impossible before. But what do these digital techniques really add to more traditional approaches? How can traditional approaches and innovative digital methodologies be paired in a meaningful and enriching manner? Does quantitative text analysis primarily provide context to existing knowledge, or is it a radical departure from what went before?

We believe that quantitative text analysis could well prove to be a dramatic, agenda-setting change. As yet, however, several problems need to be addressed. First, most of the techniques involved are less than a decade old, researchers are scattered among departments and disciplines, and there is as yet no overarching discussion about best practices, pitfalls and problems with methodology, nor has even a shared platform to discuss basic technical problems been established. There is a distinct need for a better exchange of information and sharing of experience, both inside and outside the world of digital humanities.

A second problem that needs to be addressed is the slow advancement of new techniques in published research outside the narrow digital humanities world. Anecdotal evidence suggests that leading journals in the humanities, political and social sciences are not particularly keen on papers using text-mining methodologies. This unwillingness is at least in part inspired by the problem mentioned above. There are few established norms to evaluate the validity of new techniques. On the other hand, conservatism may also play a role.

A third problem, which also impacts publication opportunities, is that the bulk of publications using text-mining techniques are still primarily about text mining. The corpora used, and the research questions asked, in many cases still seem peripheral to technological glitz. It is of course useful to investigate the technical opportunities that new techniques have to offer, but for the wider dissemination of these techniques it will probably prove necessary to tackle existing research problems in various fields and show that this particular field of the digital humanities has something to offer to the study of history.

We propose to discuss these problems with a mixed panel of experienced text mining researchers from different (sub-)disciplines. Our central goal is to discuss practices for validation of techniques and methodologies. We want to come up with a proposal for integrating text mining techniques in historical research practice in a meaningful, substantive, and contributive way, and pave the way for the move of text mining into common research practice, beyond the current hype.

Chair:

• Dr. Ralf Futselaar (EUR/NIOD)

Panel members:

• Dr. Jesse de Does (IvdNT)
• Prof. dr. Yasuto Nakano (KGU, Japan)
• Dr. Martijn Schoonvelde (VU)
• Milan van Lange, MA (NIOD/UU)


Session K

Mapping Historical Leiden: The Creation of a Digital Atlas

Organiser: Arie van Steensel, University of Groningen

Panellist: Jaap Evert Abrahamse, Cultural Heritage Agency

Speakers: Ellen Gehring, Erfgoed Leiden en Omstreken; Roos van Oosten, Leiden University; Arie van Steensel, University of Groningen

The digital revolution has rendered maps even more useful for all kinds of purposes, such as navigating, locating services, or geotagging activities. Moreover, a growing array of digital technologies, applications and platforms offer new research opportunities for scholars in the humanities, for whom maps are both a source about the past and a tool to study the past, and they allow heritage organisations to unlock, visualise and analyse diverse historical and archaeological data and objects innovatively on the basis of geographical relations. It is beyond doubt that the spatial encoding of objects and textual information offers a new framework of analysis and enables us to better explore the experiences and meanings of space and place in the past (footnote 45). Tools, maps and data are often readily available for the study of the more recent past, but this is less the case for the pre-modern period. In general, it requires a considerable time investment to develop historical Geo Information Systems (GIS) and online mapping platforms. These efforts, however, pay off in the long run, since these applications open a whole range of new research opportunities and novel ways to present and visualise research results (footnote 46).

This panel presents and critically discusses the first results of the Mapping Historical Leiden project, which aims to develop a dynamic digital atlas of the pre-modern city of Leiden. The first phase of this project, a collaboration between historians, archaeologists and Leiden’s heritage organisation (Erfgoed Leiden en Omstreken), was recently completed (the first version of the atlas is accessible online at hlk.erfgoedleiden.nl, in Dutch). The mapping tool still requires further technical improvements to make it easier to upload and analyse additional data, and more geocoded datasets will become available in the coming months. The tool enables users to link, identify and search data across place and time, rather than providing static snapshots of the urban space in the past.

Apart from its technical resources and aspects, the mapping tool’s research possibilities will be demonstrated by two case studies: one on the relation between space and wealth in sixteenth-century Leiden, and the other on the city’s sanitary infrastructure in the early modern period. Together, these presentations will offer an opportunity to discuss the possibilities of digital mapping tools and the value of collaboration between scholars and specialists from the heritage sector in the field of digital humanities, but also the practical and technical challenges of historical GIS and potential pitfalls of partnerships.

45 See, for example, Anne Kelly Knowles and Amy Hillier, eds., Placing History: How Maps, Spatial Data, and GIS Are Changing Historical Scholarship (Redlands, Calif: ESRI Press, 2008); David J. Bodenhamer, John Corrigan, and Trevor M. Harris, eds., The Spatial Humanities: GIS and the Future of Humanities Scholarship (Bloomington: Indiana University Press, 2010); Alexander von Lünen and Charles Travis, eds., History and GIS: Epistemologies, Considerations and Reflections (Dordrecht: Springer, 2013); Ian N. Gregory and A. Geddes, eds., Toward Spatial Humanities: Historical GIS and Spatial History (Bloomington: Indiana University Press, 2014).

46 See, for example, Onno Boonstra and Gerrit Bloothooft, eds., Tijd en ruimte: nieuwe toepassingen van GIS in de alfawetenschappen (Utrecht: Matrijs, 2009); a theme issue of PCA. Post Classical Archaeologies 2 (2012) on GIS for archaeologists and historians; Hélène Noizet, Boris Bove, and Laurent Jacques Costa, eds., Paris de parcelles en pixels: analyse géomatique de l’espace parisien médiéval et moderne (Saint-Denis: Presses Universitaires de Vincennes, 2013); Nicholas Terpstra and Colin Rose, eds., Mapping Space, Sense, and Movement in Florence: Historical GIS and the Early Modern City (New York: Routledge, 2016).

Presentation 1 (Ellen Gehring): One Size Fits All? Developing a Multi-Functional Digital Mapping Tool

Building a cutting-edge map application for scholars, heritage managers and the general public is a major challenge in technical and methodological terms. Mapping Historical Leiden has overcome some of the barriers, and this presentation focuses on the technical aspects of the mapping tool. Crucial for the project, for example, was the development of a so-called historical geocoder, which makes it possible to link different geometric forms and to define their relations. Apart from technicalities, it will be further shown how very diverse data can be standardised through an advanced use of databases to ensure meaningful spatial analyses. The code of the mapping tool is available as open source, and since it is unnecessary for others to reinvent the wheel, it will finally be explained how the tool can be utilised in other contexts.

Presentation 2 (Arie van Steensel): Wealth and Place in Late Medieval Leiden: a Parcel-Based Analysis

Leiden has a unique source, the so-called Book of Waterways and Streets, which contains about a hundred cadastral maps that were drawn for fiscal purposes in the second half of the sixteenth century. In this presentation, it will first be demonstrated how these maps were turned into a georeferenced base map. Secondly, it will be shown how this sixteenth-century pre-cadastral map can be used to analyse the relation between wealth and space in the city of Leiden at a parcel level, resulting in a more refined understanding of the complex relationship between occupation, wealth and place, which challenges common assumptions about the social geography of premodern cities and towns. The main point to be made is that historical GIS makes it possible to reinterpret sources that inform us about the importance of space and locality in structuring human interactions, as well as to present these data in an attractive and accessible way.

Presentation 3 (Roos van Oosten): Was sanitary infrastructure a privilege?

Scholars have generally accepted that sanitary infrastructure was the privilege of the wealthy few. However, with the uncovering of hundreds of cesspits and water supply facilities in the town of Leiden in the past decades, this assumption can now be tested for different time periods. In order to investigate the question of accessibility to sanitary arrangements, the archaeologically documented sanitary structures must be plotted and a financial valuation attached to them. Socio-economic data based on tax registers are available from about 1600, which will be most useful in this venture. Furthermore, thanks to HISGIS, we also have access to socio-economic data from 1832, which will allow us to establish a long-term perspective on the development of Leiden’s sanitary infrastructure.


Session L

1. Was the Ferguut written by one or two authors?
Theo Meder, Gosse Bouma, Hannah Mars, Trudy Havinga (RUG)

In 1989, Willem Kuiper published his thesis on the Middle Dutch romance Ferguut, in which he concluded that the romance was written by two authors. Kuiper showed differences in writing style at all levels (rhyme, syntax, vocabulary, spelling) and concluded this was no coincidence. According to Kuiper, the first author translated the Old French Fergus by Guillaume le Clerc, approximately until vs. 2592, whereafter the second author completed the second half without a French example, in the spirit of Fergus but in his own words. Nowhere in the text is there a clear reference to a dual authorship (cf. the Roman van Walewein), but the style break halfway through the text was nevertheless something that a scholar like Eelco Verwijs noticed as well. Other researchers questioned or denied the finding that the Ferguut was written by two authors, like W.J.A. Jonckbloet, and after the appearance of Kuiper’s thesis also Bart Besamusca and Mike Kestemont. With the thesis of Kestemont we entered the era of e-humanities. Whereas Kuiper had to do his quantitative style analysis by hand, today the programming language R in collaboration with the stylometric program Stylo can perform the job much faster, more thoroughly and completely unbiased (Stylo doesn’t know or care what texts it gets presented and what the outcome may be, whereas human researchers may be influenced by preconceived ideas). In its analysis, the software not only takes all the differences into account (like Kuiper did), but all the similarities as well, even at levels that writers and readers are hardly aware of, such as word order and the use of function words. At this level every author leaves his most personal fingerprint behind.

Somewhat cautiously, Kestemont finally assumes that Ferguut was written by one author, who as a translator drew on a different register than as a free writer. Because Ferguut plays no prominent role in the investigation of Kestemont, we want to zoom in more closely on this particular romance. The central question: is the Ferguut written by one or two authors?

In order to investigate whether the two parts of the Ferguut are stylistically similar, we compare the similarity between the two parts of the Ferguut with the similarity between two or three parts of other ‘randomly’ selected Middle Dutch texts from around the same period and region, most of them dealing with courtly life. Seven texts we know to have been written by a single author; an eighth text we know to have been written by two authors. We involve the following texts in the analysis: Ferguut, Beatrijs, De Borchgravinne van Vergi, Lanceloet en het hert met de witte voet, Van den vos Reynaerde by Willem (the Aernout mentioned in the preface is the author of an Old-French Renart tranche), three poems (a deliberate misfit) by Willem van Hildegaersberch (Van den Serpent, Van den Paep die sijn Baeck gestolen wert, Van den Wijnvaet) and De Roman van Walewein. For this experiment we looked at the complete texts, and cut up the longer texts into two or three even pieces in case there were no clear textual divisions. All the editions had to be thoroughly cleaned and converted to txt format.

Only De Roman van Walewein is most certainly written by two authors: about two-thirds of the total number of verses were written by Penninc (vs. 1–7.880), and the last part was written by Pieter Vostaert (vs. 7.881–11.198). For the analysis we therefore cut this text into three pieces, so that the third part is written by Vostaert. As an experiment, we cut Van den Vos Reynaerde in three even pieces. The other longer texts we cut into two even pieces. Ferguut is cut at the location where the style transition should occur, so the place where the second author took over from the first, according to Kuiper. All these texts and fragments are then presented to Stylo for analysis. In this way, we can compare the similarity between the two parts of the Ferguut with the similarity between the two/three parts of a number of texts that we know are written by a single author, and the three parts of a text which we know was written by two authors. If the stylometric analysis shows that the two parts of the Ferguut look as much alike as two parts of the texts of one author, and resemble each other more than the first two and the third part of the Walewein, that indicates that the Ferguut was also written by one author. If the analysis shows that the two parts of the Ferguut look less alike than the two parts of the texts of one author, and just as much, or less than the three parts of the Walewein together, this may indicate that the Ferguut is written by two authors.
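The comparison logic described above (are two parts of one text closer to each other than to parts of other texts?) can be illustrated with a small sketch. It uses Burrows' Delta over word n-gram frequencies as the distance; that choice is an assumption made here for illustration, since the abstract does not state which distance measure was configured in Stylo. The sample strings are invented placeholders, not Middle Dutch source text, and all function names are hypothetical.

```python
from collections import Counter
from statistics import mean, pstdev


def word_ngrams(text, n=1):
    """Split a text on whitespace and return its word n-grams."""
    words = text.lower().split()
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


def relative_frequencies(text, n=1):
    """Relative frequency of every word n-gram in one text."""
    grams = word_ngrams(text, n)
    counts = Counter(grams)
    return {g: c / len(grams) for g, c in counts.items()}


def burrows_delta(samples, n=1, top=50):
    """Pairwise Burrows' Delta between named text samples."""
    freqs = {name: relative_frequencies(text, n) for name, text in samples.items()}
    # Features: the most frequent n-grams across the whole collection.
    totals = Counter()
    for f in freqs.values():
        totals.update(f)
    features = [g for g, _ in totals.most_common(top)]
    # z-score every feature across samples; skip features without variance.
    z = {name: [] for name in samples}
    for g in features:
        values = [freqs[name].get(g, 0.0) for name in samples]
        sd = pstdev(values)
        if sd == 0:
            continue
        mu = mean(values)
        for name, v in zip(samples, values):
            z[name].append((v - mu) / sd)
    names = list(samples)
    # Delta is the mean absolute difference between two z-score profiles.
    return {
        (a, b): mean(abs(x - y) for x, y in zip(z[a], z[b]))
        for a in names for b in names if a < b
    }


# Placeholder samples: two parts of a hypothetical text A, one part of text B.
samples = {
    "A_part1": "the knight rode to the court and the king saw the knight",
    "A_part2": "the king rode to the court and the knight saw the king",
    "B_part1": "then she went home then she sang then she slept",
}
deltas = burrows_delta(samples)
```

In this toy collection the two parts of text A end up much closer to each other than either is to text B, which is the pattern the abstract treats as evidence for single authorship. Real texts would need many more tokens per sample for the z-scores to be meaningful.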

In the graph above, based on word tri-grams, Stylo shows what many already expected: all novels and writers are clustering neatly together (N.B.: the same happens with word bi-grams and with character bi-grams and tri-grams. As one can see in the graph, in hindsight the full texts need not have been included, but we wanted to be very sure we would not encounter any nasty surprises). The three parts of the Reynaert are stylistically most alike, the two parts of the Beatrijs mostly resemble each other, Vergi part 1 looks most like Vergi part 2 etc. Also the two parts of Ferguut stylistically match each other rather than any other text. Even the exemplum, the jest and the song of Hildegaersberch share the style of one and the same author. Only Walewein exhibits the expected deviation: Part 3 wanders off and positions itself somewhere between Ferguut and Reynaert, rather than next to the other parts of the Walewein. This graph of the stylometric analysis justifies no other conclusion than that the Walewein is written by two authors, but Ferguut by one author. Furthermore, it shows that the three Arthurian romances and Reynaert cluster together, and the courtly, religious and moralistic texts stand together separately. We experimented with all kinds of different parameters, but the results (practically) remained the same. Rolling delta yielded nothing conclusive. Only cutting up the Ferguut in even smaller pieces and clustering them resulted in the style differences that Kuiper discovered, based on small pieces of comparison, but deprived of a long-term similarity overview over the text material.


Reservations can be made for the techniques used: stylometrics works better with longer texts than shorter ones, stylometrics works better on Standard Modern Dutch than on Middle Dutch texts with their unstable spelling, stylometrics works better on Middle Dutch rhyme pairs, all the editions should be either diplomatic or critical or in any other way normalized/standardized etcetera.

Still, all things considered, based on multiple stylometric examinations, Stylo sees more similarities than differences between the two parts of the Ferguut, both on the level of word order and the use of function words, traits that are considered to be rather personal for each author. The Ferguut is most probably written by one author. In writing the second half of the text, the author may, also stylistically, have been inspired by the fairy tale known as ATU 314A The Shepherd and the Three Giants, which was present in the Old-French Fergus as well. What we already knew about Walewein is confirmed: the last part of the romance shows more stylistic differences than similarities compared to other romances like the Reynaert and even Ferguut, and therefore Walewein was written by two authors. Finally, it is good to know now that one author could have several stylistic registers: one for when he translated, and one for when he freely retold a story.

References

B. Besamusca: ‘De Vlaamse opdrachtgevers van Middelnederlandse literatuur: een literair-historisch probleem’, in: De nieuwe taalgids 84 (1991), p. 150-162.

A.Th. Bouwman: Reinaert en Renart. Het dierenepos Van den vos Reynaerde vergeleken met de Oudfranse Roman de Renart. 2 parts, Amsterdam 1991.

W. Bisschop & E. Verwijs (eds.): Willem van Hildegaersberch: Gedichten. ’s-Gravenhage 1870.

K.H. van Dalen-Oskam: De stijl van R. Amsterdam 2013.

T. Dekker, J. van der Kooi & T. Meder: Van Aladdin tot Zwaan kleef aan. Lexicon van sprookjes: ontstaan, ontwikkeling, variaties. Nijmegen 1997.

M. Draak (ed.): Lanceloet en het hert met de witte voet. 6th imprint, Den Haag 1979.

M. Eder, J. Rybicki & M. Kestemont: ‘Stylometry with R: a package for computational analyses’, in: The R Journal (2016), as download: https://journal.r-project.org/archive/accepted/eder-rybicki-kestemont.pdf

G.A. van Es (ed.): De jeeste van Walewein en het schaakbord. Zwolle 1957.

J.D. Janssens, R. van Daele & V. Uyttersprot (eds.): Van den Vos Reynaerde. Het Comburgse handschrift. 2nd imprint, Leuven 1998.

W.J.A. Jonckbloet (ed.): Beatrijs. Eene sproke uit de XIIIe eeuw. Den Haag 1841.

W.J.A. Jonckbloet: Geschiedenis der Nederlandsche letterkunde. 4th imprint, Groningen 1888, part 1.

M. Kestemont: Het gewicht van de auteur. Stylometrische auteursherkenning in Middelnederlandse literatuur. Gent 2013.

P. de Keyser (ed.): De Borchgravinne van Vergi. Antwerpen 1943.

W. Kuiper: Die riddere metten witten scilde. Oorsprong, overlevering en auteurschap van de Middelnederlandse Ferguut, gevolgd door een diplomatische editie en een diplomatisch glossarium. Amsterdam 1989.

E. Rombauts, N. de Paepe & M.J.M. de Haan (eds.): Ferguut. Den Haag 1982.


E. Stamatatos: ‘A survey of modern authorship attribution methods’, in: Journal of the Association for Information Science and Technology 60 (2008) 3, p. 538–556.

H.-J. Uther: The Types of International Folktales. A Classification and Bibliography. 3 volumes. Helsinki 2004.

2. Stylometry applied to book preferences
Peter Boot

Introduction

One of the oldest and most active fields in Digital Humanities is authorship attribution. It has been shown many times that writers have a characteristic style that can be used to tell them apart (e.g. Burrows, 2002). It is also well known that word usage can be used to predict personality characteristics (e.g. Noecker, Ryan, & Juola, 2013). Personality characteristics in turn are related to preferences in different art forms (e.g. Cantador, Fernández-Tobías, Bellogín, Kosinski, & Stillwell, 2013). This suggests that, as one would hope, the stylistic differences whereby we tell authors apart (such as differences in function word usage) are not just meaningless preferences for one function word over another, but are related to artistic preference, in a way that is still to be clarified.

This paper, continuing earlier work (Boot, 2014), tries to contribute to that clarification, in that it will remove the middle term (the personality characteristics) and show that there is a direct relation between the words that people use and their preferences in art, in this case, for books. The writers that I study here are the writers of book reviews, not books. In the first section, I will use book reviews and ratings from book discussion sites and show correlations between word usage and book ratings. In the second section, I will take an exploratory approach and create a clustering of reviewers by word usage. For the two clusters, I will then look at their preferred word usage, as well as the word usage in the book descriptions of their preferred books.

Correlations between word usage and ratings

The data that the paper uses were collected from a number of Dutch book discussion sites. These sites include hebban.nl, lezerstippenlezers.be, bol.com and the now defunct sites watleesjij.nu and dizzie.nl.

The correlations were computed as follows: I selected reviews from users who had written at least 100,000 characters, excluding some users with multiple accounts. I computed relative word frequencies in their reviews, and normalized the results (center around zero and divide by the standard deviation). In order to remove words with thematic links to books (murder, war, castle, love) I limited the computation to words defined as function words in the Dutch LIWC 2007 dictionary (Boot, Zijlstra, & Geenen, 2017, in press). For the same users I retrieved the book ratings and created a matrix of users by rating, excluding books that were rated only once. I computed the bias corrected distance correlation (a multivariate generalization of the correlation coefficient, see Székely & Rizzo, 2013) between the two matrices, and repeated that computation for reviews in all genres, in literature and in the literary thriller. The results are given in the first row of Table 1.
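The two computational steps in this paragraph, column-wise normalisation and a correlation between two user-aligned matrices, can be sketched as below. Note the swap: for brevity this sketch implements the plain sample distance correlation, not the bias corrected variant the paper reports, so its values would differ from those in Table 1. The function names and the toy matrices are hypothetical, and missing ratings are not handled.

```python
import math
from statistics import mean, pstdev


def zscore_columns(matrix):
    """Centre each column around zero and divide by its standard deviation."""
    columns = list(zip(*matrix))
    stats = [(mean(c), pstdev(c)) for c in columns]
    return [
        [(v - mu) / sd if sd else 0.0 for v, (mu, sd) in zip(row, stats)]
        for row in matrix
    ]


def _centred_distances(matrix):
    """Pairwise Euclidean distances between rows, double-centred."""
    n = len(matrix)
    d = [[math.dist(a, b) for b in matrix] for a in matrix]
    row_means = [sum(r) / n for r in d]
    grand_mean = sum(row_means) / n
    return [
        [d[i][j] - row_means[i] - row_means[j] + grand_mean for j in range(n)]
        for i in range(n)
    ]


def distance_correlation(x, y):
    """Plain (not bias corrected) sample distance correlation.

    x and y are matrices whose rows describe the same users in the
    same order; the numbers of columns may differ.
    """
    a, b = _centred_distances(x), _centred_distances(y)
    n = len(x)

    def dcov2(p, q):
        return sum(p[i][j] * q[i][j] for i in range(n) for j in range(n)) / n**2

    denom = math.sqrt(dcov2(a, a) * dcov2(b, b))
    if denom == 0:
        return 0.0
    return math.sqrt(max(dcov2(a, b), 0.0) / denom)


# Toy example: four users, two function-word frequencies, three ratings each.
word_freqs = [[0.031, 0.012], [0.027, 0.015], [0.040, 0.009], [0.035, 0.011]]
ratings = [[4, 3, 5], [3, 3, 4], [5, 2, 5], [4, 2, 5]]
dcor = distance_correlation(zscore_columns(word_freqs), ratings)
```

Distance correlation lies between 0 and 1 and, unlike a per-column Pearson coefficient, compares the two matrices as wholes, which is why it suits a users-by-words versus users-by-books comparison.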

To be absolutely sure that no content-aspects of the reviews were reflected in the word usage, I repeated the computation using Part-of-speech-tags. The texts were tagged using Treetagger and instead of the relative word frequencies I used relative frequencies of POS bigrams. The results are given in the second row of the table.
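As a small illustration of the feature used in this second computation, relative POS-bigram frequencies can be derived from a tag sequence as follows. The tagging step itself is left out: the tag list below is an invented placeholder standing in for tagger output, and the function name is hypothetical.

```python
from collections import Counter


def pos_bigram_frequencies(tags):
    """Relative frequencies of adjacent POS-tag pairs in one tagged text."""
    bigrams = list(zip(tags, tags[1:]))
    counts = Counter(bigrams)
    total = len(bigrams)
    return {bg: c / total for bg, c in counts.items()} if total else {}


# Hypothetical tag sequence; in practice it would come from a POS tagger.
tags = ["PRON", "VERB", "DET", "NOUN", "PRON", "VERB", "ADJ"]
frequencies = pos_bigram_frequencies(tags)
```

Because only the tags are counted, two reviews with entirely different vocabulary can yield the same feature vector, which is what makes this representation free of content words.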


Table 1: Correlations with p-values

                                   All genres          Literature          Literary thriller
                                   189 reviewers,      41 reviewers,       32 reviewers,
                                   166 reviews (avg.)  126 reviews (avg.)  88 reviews (avg.)
function words (200) vs. ratings   0.20 (0.000)        0.16 (0.000)        0.41 (0.000)
POS bigrams (100) vs. ratings      0.16 (0.000)        0.10 (0.002)        0.22 (0.000)

It is hard to interpret these correlation sizes, but it is clear that there are very significant correlations between function word usage and book ratings. The fact that these correlations persist even when looking at POS bigrams shows that the relation is to some extent based purely on linguistic style, not on content. Why sequences of POS-tags should be related to literary preference is an intriguing question that this paper will not solve.

Exploratory analysis

To get a feel for what this correlation might mean in terms of real reviews and ratings, I created a clustering based on function word usage for a group of reviewers. I removed a few outliers and was left with two clusters, cluster 1 containing 20 reviewers and cluster 2 containing 11.

I then looked at their reviews and preferred books. A sample of reviews from cluster 1 showed their informal, direct and very personal writing, characteristics that were much less prominent in cluster 2. This impression is confirmed when looking at contrastive keywords in the reviews of both clusters. The 20 keywords with the largest effect size (Gabrielatos & Marchi, 2011) for both clusters are shown in table 2. It is clear cluster 1 prefers the first person, cluster 2 has more interest in writing.

Table 2

Cluster   Preferred review words

1         thought (was of the opinion), very, because, completely, me, actually, therefore, read (past part.), beautiful, after all, had, have (1st pers. sing.), am, I, very, all, good, otherwise, yet, again

2         writer (fem.), writer, novel, reader, years, under, know, these, characters, one, between, gives, second, the, them, of, until, end, in, who
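The text does not say which effect-size metric was used to rank the contrastive keywords. One candidate is the percentage difference of normalized frequencies (%DIFF) discussed in the cited keyness literature; the sketch below assumes that metric. The function name, the `floor` parameter used to avoid division by zero, and the token lists are hypothetical.

```python
from collections import Counter


def percent_diff(study_tokens, reference_tokens, floor=1e-9):
    """%DIFF per word: relative frequency in the study corpus versus the reference corpus."""
    study = Counter(study_tokens)
    reference = Counter(reference_tokens)
    scores = {}
    for word, count in study.items():
        nf_study = count / len(study_tokens)
        # Words absent from the reference corpus get a tiny stand-in frequency.
        nf_reference = max(reference[word] / len(reference_tokens), floor)
        scores[word] = (nf_study - nf_reference) / nf_reference * 100
    return scores


# Made-up token lists standing in for the reviews of the two clusters.
cluster_1 = "i i i very me the".split()
cluster_2 = "the writer the novel i reader".split()

scores = percent_diff(cluster_1, cluster_2)
top_keywords = sorted(scores, key=scores.get, reverse=True)
```

Sorting by the score and keeping the first 20 words would give a list comparable in form to the table above, although a real analysis would normally also filter on statistical significance or minimum frequency.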

Turning to the ratings, while there were many books that were rated significantly higher by one of the groups, the preferences were hard to understand in terms of taste. Ratings summed by genre didn’t show a very clear picture either. It was only when looking at contrastive word usage in the (publisher-provided) book descriptions for books read by either cluster that a clearer picture emerged.

Table 3

Cluster   Keywords in preferred book descriptions

1         thriller, investigation, police, murdered, murder, case, body, someone, further, secret, above, know, very, sits, very, disappeared, within, nothing, appears, found, become, part, truth, books, there, something, else

2         in which, without, about, parents, family, city, big stories, last, exist, us, we, writer, history, love, country, tells, century, novel, Netherlands, war

Here it becomes clear that cluster 1 prefers thrillers and police novels, while cluster 2 has a less-focussed interest in family, writing and the country. It is worthwhile to repeat that these clusters of content words result from clustering reviewers on the basis of function words.

Conclusion

Taken together, the correlations and the exploratory analysis show that there is a relation between the function words that people use and their preferences for books. This relation still holds at the level of part-of-speech tags. This clearly shows that the word usage that helps tell authors apart is to some extent related to artistic preference. A possible explanation would be that the reviewers unconsciously imitate the books they read in their use of function words. That seems unlikely, among other reasons because the effect is also visible when we just look at the reviews in a single genre (second and third column of table 1). The more likely explanation is that function word usage is at least in part determined by artistic preference and related personality characteristics. The ‘fingerprint’ metaphor that is often used in this context, with its suggestion of an essentially random identifier, unlikely to be related to artistic preference, must therefore be considered as inappropriate.

Literature

Boot, P. (2014). Dimensions of literary appreciation. Word use and ratings on a book discussion site. Digital Humanities 2014. Retrieved from http://dharchive.org/paper/DH2014/Paper-825.xml

Boot, P., Zijlstra, H., & Geenen, R. (2017, in press). The Dutch translation of the Linguistic Inquiry and Word Count (LIWC) 2007 dictionary. Dutch Journal of Applied Linguistics, 6(1).

Burrows, J. (2002). ‘Delta’: A measure of stylistic difference and a guide to likely authorship. Literary and Linguistic Computing, 17(3), 267-287.

Cantador, I., Fernández-Tobías, I., Bellogín, A., Kosinski, M., & Stillwell, D. (2013). Relating Personality Types with User Preferences in Multiple Entertainment Domains. Proceedings of the 1st Workshop on Emotions and Personality in Personalized Services (EMPIRE 2013), at the 21st Conference on User Modeling, Adaptation and Personalization (UMAP 2013).

Gabrielatos, C., & Marchi, A. (2011). Keyness: Matching metrics to definitions. Theoretical-methodological challenges in corpus approaches to discourse studies - and some ways of addressing them.

Noecker, J., Ryan, M., & Juola, P. (2013). Psychological profiling through textual analysis. Literary and Linguistic Computing, 28(3), 382-387.

Székely, G. J., & Rizzo, M. L. (2013). The distance correlation t-test of independence in high dimension. Journal of Multivariate Analysis, 117, 193-213.

3. Corpus enrichment for 17th century Dutch: a pilot study

Feike Dietz 1, Marjo van Koppen 2, Irene Kramer 1 and Marijn Schraagen 2

1 Institute for Cultural Inquiry, 2 Utrecht Institute of Linguistics OTS, Utrecht University

1 Introduction

The Dutch language in the 17th century was a mixture of fading linguistic properties from the preceding language phase, Middle Dutch, and upcoming new ways to construct words and sentences. Within these language dynamics we observe a type of language variation that has rarely been addressed before: variation within individual language users (intra-author variation). The aim of the current project is to describe and analyse in detail the linguistic and literary/rhetorical contexts in which intra-author variation occurs. As a prerequisite, the data needs to be annotated linguistically, using part of speech (POS) information and (morpho-)syntactic structure, and sociolinguistically, describing various factors that influence language use.

In a pilot project we restrict our research to the letters of the famous Dutch author and politician P.C. Hooft, written between 1600 and 1638. This collection is relatively large (approximately 800 letters, ∼300,000 words) and contains sociolinguistic variation in type of correspondent and type of letter. The corpus can be used, i.a., to study the loss of negative concord in Dutch, which is observed in Hooft’s letters from this period (Paardekooper, 2016).

As a starting point for obtaining POS tags, the Adelheid tagger for Middle Dutch (van Halteren and Rem, 2013) is used. Because the tagger is trained on Middle Dutch, the results are not highly accurate for 17th century texts. Therefore, a correction procedure for POS-tags and lemmas is performed by human annotators. Additionally, the annotators provide the necessary sociolinguistic information about letters and correspondents. When annotation is completed, a detailed and systematic analysis of linguistic phenomena will become feasible.

2 Approach

The source data is available in a diplomatic edition (Van Tricht, 1976). We use this edition after separating Hooft’s original seventeenth century texts from the metadata (page numbers, footnotes, annotations).

Figure 1: Example of the newly developed annotation tool

2.1 Part-of-Speech tagging

A collaboration with the Nederlab project (Brugman et al., 2016) is established to increase availability of the enriched corpus, by including the POS tagging and sociolinguistic metadata in the Nederlab research infrastructure. The integration necessitates conversion of the CRM tagset used by Adelheid to the CGN tagset used by Nederlab. Additionally, the tagging needs to be represented in the FoLiA XML format for linguistic annotation (van Gompel and Reynaert, 2013). The CRM tagset is more extensive than CGN, notably in the use of surface form features such as form-e (words ending in -e). Surface form features are related to case marking, which is an important aspect in the study of linguistic variation in 17th century Dutch. Therefore, we decided to keep these features in the mapping to CGN tags (see Figure 1).
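To make the idea of keeping surface form features concrete, the sketch below splits a tag of the kind shown in the annotation guideline examples, such as N(ev,non-nom,form-s), into its part of speech, its ordinary features and its surface form feature. The function name and the returned structure are hypothetical; this is not the project's conversion code, and the actual CRM to CGN mapping rules are not reproduced here.

```python
import re

# A tag is a POS label followed by a comma-separated feature list in parentheses.
TAG_PATTERN = re.compile(r"^(?P<pos>[A-Za-z]+)\((?P<features>[^()]*)\)$")


def parse_tag(tag):
    """Split a tag into POS label, ordinary features and surface form feature."""
    match = TAG_PATTERN.match(tag.strip())
    if match is None:
        raise ValueError(f"unrecognized tag: {tag!r}")
    features = [f.strip() for f in match.group("features").split(",") if f.strip()]
    surface = [f[len("form-"):] for f in features if f.startswith("form-")]
    return {
        "pos": match.group("pos"),
        "features": [f for f in features if not f.startswith("form-")],
        # None when the tag carries no form-* feature.
        "surface_form": surface[0] if surface else None,
    }


noun = parse_tag("N(ev,non-nom,form-s)")
article = parse_tag("LID(bep,form-n)")
```

Keeping the surface form in a separate field means a later mapping step can translate the ordinary features while passing the form-* value through unchanged.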

2.2 Sociolinguistic tagging

A key hypothesis in intra-author variation is the influence of sociological factors on linguistic choices. To evaluate this hypothesis systematically, all letters are being annotated with the following information:

• Goal: express thanks, ask advice, recommend, invite
• Topic: politics, religion, personal affairs, administration
• For individual correspondents:
  o name, gender, year of birth and death
  o status of correspondent as literary author
  o relation to Hooft: family members, literary friends, politicians, etc.
• For group correspondents:
  o name
  o domain: government, financial or legal institutions, civil associations
• Letter structure: greeting, introduction, narratio, closing formulas
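The annotation categories listed above could be represented as a small record structure, sketched below. All class and field names are hypothetical and chosen only to mirror the list; the project's annotation tool may store this information differently. The example values are placeholders, not data from the corpus.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Union


@dataclass
class IndividualCorrespondent:
    name: str
    gender: str
    year_of_birth: Optional[int]
    year_of_death: Optional[int]
    is_literary_author: bool
    relation_to_hooft: str  # e.g. family member, literary friend, politician


@dataclass
class GroupCorrespondent:
    name: str
    domain: str  # e.g. government, financial or legal institution, civil association


@dataclass
class LetterAnnotation:
    goal: str   # e.g. express thanks, ask advice, recommend, invite
    topic: str  # e.g. politics, religion, personal affairs, administration
    correspondent: Union[IndividualCorrespondent, GroupCorrespondent]
    structure: List[str] = field(default_factory=list)  # letter parts in order


# Placeholder record; none of these values come from the actual letters.
example = LetterAnnotation(
    goal="ask advice",
    topic="administration",
    correspondent=GroupCorrespondent(name="Institution A", domain="government"),
    structure=["greeting", "introduction", "narratio", "closing formulas"],
)
```

Separating individual and group correspondents into two types keeps fields such as gender or year of birth from being left empty for institutions.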

2.3 Annotation process

A tool has been developed (see Figure 1) to perform POS and sociolinguistic annotation in an efficient way. A pool of annotators is available for the task, who will perform partly overlapping annotations to allow for agreement measurements. The annotation process is currently ongoing. A protocol has been developed to guide the post-correction process (see Figure 2 for examples).
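The text does not name the agreement measure that will be applied to the overlapping annotations. One common choice for two annotators labelling the same tokens is Cohen's kappa, sketched below under that assumption. The function name and the label sequences are hypothetical.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two annotators on the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty item list")
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    # Agreement expected if both annotators labelled independently at their own rates.
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Made-up POS labels assigned by two annotators to the same four tokens.
annotator_1 = ["N", "N", "LID", "LID"]
annotator_2 = ["N", "N", "LID", "N"]
kappa = cohens_kappa(annotator_1, annotator_2)
```

With more than two annotators per letter, a multi-rater measure would be needed instead; that choice is left open here.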

Figure 2: Annotation guideline examples

Comparative and superlative adjectives are annotated individually. This rule is also applied for irregular adverbs, such as veel, meer, meest and wel/goed, beter, best. As an example, minste in the sentence below (1634, Van Tricht p. 527) receives a separate lemma minst:

. . . waer aen het minste deel niet en zal hebben, Me Joffre.

Nominatives and non-nominatives are differentiated. We chose not to denominate dative, genitive, accusative and ablative. Instead, the surface form, related to case marking, is annotated. An example from 1633 (Van Tricht p. 437):

Veel gelux [N(ev,non-nom,form-s)] met . . . den [LID(bep,form-n)] jongen [N(ev,non-nom,form-n)] Arnout, dien god geeve ’t lof des [LID(bep,form-s)] geenen nae te ijvren, daer hij den naem af draeght.

3 Analysis

In related work (Kramer, 2016) the use of negation by Hooft has been studied manually. Kramer shows that Hooft uses mostly single negation in different syntactical environments (subclauses, inversion, main clauses, local negation, V1 (verb-initial) sentences). Additionally, the negation particle niet can be used as alternative for the noun nothing. Furthermore, Hooft uses bipartite negation in almost all syntactical environments as well (all except in V1). In Kramer’s research, not one environment seemed to particularly ask for the use of bipartite negation. This research, however, encompassed only 107 letters. The fully annotated corpus will allow a more quantitative analysis, as well as a larger range and higher level of detail of linguistic phenomena.

Nobels and Rutten (2014) note the influence of gender and social class on negation (p. 41): ‘while single negation spread from the north to the south, it also turned into a social variant, as the upper ranks in society and male letter writers seemed to be quicker to pick up on the incoming variant than the lower ranks and female letter writers’. Nobels and Rutten (2014) also note (p. 43) that traditions in letter writing affect linguistic development: ‘fixed formulae were memorized as a whole (or copied) by writers from any social background. These fixed formulae occur in certain parts of the letters, mostly in the beginning and the ending’. With the current annotation effort, this type of observation can be studied systematically.

References

Brugman, H., Reynaert, M., van der Sijs, N., van Stipriaan, R., Tjong Kim Sang, E., and van den Bosch, A. (2016). Nederlab: Towards a single portal and research environment for diachronic Dutch text corpora. In Proceedings of LREC 2016.

van Gompel, M. and Reynaert, M. (2013). FoLiA: A practical XML format for linguistic annotation - a descriptive and comparative study. Computational Linguistics in the Netherlands Journal, 3:63–81.

van Halteren, H. and Rem, M. (2013). Dealing with orthographic variation in a tagger-lemmatizer for fourteenth century Dutch charters. Language Resources and Evaluation, 47(4):1233–1259.

Kramer, I. (2016). Variatie in negatie, een syntactische en retorische analyse van het gebruik van enkele en tweeledige negatie in de brieven van P.C. Hooft van 1633 tot 1638 aan Joost Baek en Tesselschade Roemersdochter Visser. BA thesis, Universiteit Utrecht.

Nobels, J. and Rutten, G. (2014). Language norms and language use in seventeenth-century Dutch: negation and the genitive. In Rutten, G., editor, Norms and usage in language history, 1600-1900. A sociolinguistic and comparative perspective, pages 21–48. John Benjamins Publishing Company.

Paardekooper, P. (2016). Bloei en ondergang van onbeperkt ne/en, vooral dat bij niet-woorden. Neerlandistiek.nl.

van Tricht, H. (1976). De briefwisseling van Pieter Corneliszoon Hooft. Tjeenk Willink/Noorduijn.