
Archiving websites containing streaming media: the Music Composer Project

Howard Besser, NYU
http://besser.tsoa.nyu.edu/howard/Talks/

Besser-IIPC 13/11/2018 1

Archiving websites containing streaming media: the Music Composer Project

•  The Problem with Heritrix and Archive-It
•  The Project
    –  Our Technical Collaboration
    –  Our Collaboration with Content Creators & restrictions
    –  Architectures & Workflows
    –  How things look
    –  Evaluation
•  Impact beyond this Project

•  Caveat I: This is an in-progress report; the project is unfinished
•  Caveat II: I am not involved in system architecture & hand-offs, so may not be able to answer detailed questions in these areas

PROBLEMS WITH HERITRIX AND ARCHIVE-IT

Archive-It

•  The leading application/service for curated web archiving in North America
•  Run by the Internet Archive, and is much more targeted and curated than their Wayback Machine
•  Is based on crawler software developed by IA (Heritrix) in 2003-2004
•  Is very poor at capturing streaming audio or video, as well as inserting it properly into a composed web page

Archive-It Issues w/ Streaming Media

Archive-It screenshots generated as part of our project

•  By Lorena Ramirez-López

Archive-It Issues w/ Streaming Media
Firefox version 39.0. Screenshot of Tarik O’Regan’s site taken 2015/10/05

Archive-It Issues w/ Streaming Media
Firefox version 39.0. Screenshot of Ted Hearne’s website taken 2015/10/05

Some sources of streaming issues

•  Problems with capturing resources residing on 3rd party services (YouTube, Vimeo, SoundCloud)
•  Problems with how faithfully the A/V materials are captured and placed by Archive-It
•  Problems with websites generated through site building platforms such as Squarespace

Other Issues we’re trying to solve

•  Discovering URLs generated by JavaScript

THE PROJECT

Archiving Composer Websites
http://www.nyu.edu/about/news-publications/news/2015/03/27/nyu-libraries-to-team-with-internet-archive-to-preserve-high-quality-musical-content-on-the-web.html

•  Collect, preserve, & make available Websites of Composers
•  $480,000 grant from Mellon in 2015 to NYU Library/MIAP/Internet Archive
•  Dealing with the issue that contemporary composer websites go up and down (and also incorporate relationship-building between composer and fans)
•  Addressing the problems of collecting streaming media
•  Also selectively collecting high-quality versions that are used to generate the streams, and allowing future researchers to see/hear the higher quality versions

Archiving Composer Websites

•  Develop good and ongoing relationships between Libraries and Composers
•  Develop Trust
    –  for developing collections, and continuing to add to them
    –  for Policy reasons
•  Examine what types of errors take place
    –  how faithfully audiovisual materials are being captured
    –  how resources that reside on third-party web services (YouTube, Vimeo, SoundCloud) are (not) displayed within Archive-It’s interface
    –  Issues with websites generated through site building platforms such as Squarespace
•  Find ways to fix those errors

Some methods used

•  Began with NPR’s list of 100 important composers under 40, and augmented the list with faculty and librarian suggestions
•  Identified website infrastructures encountered and created a classification matrix

Website Infrastructure encountered

Project Team

•  Jefferson Bailey (Internet Archive)
•  Howard Besser (MIAP)
•  Lori Donovan (Internet Archive)
•  April Hathcock (Lib/ScholComm)
•  Nicole Greenhouse (Lib/ACM)
•  Carol Kassel (Lib/DLTS)
•  Scott Statland (MIAP)
•  Donald Mennerich (Lib/ACM/DLTS)
•  David Millman (Lib/DLTS)
•  Courtney Mumma (Internet Archive)
•  Robin Preiss (Lib/AFC)
•  Lorena Ramirez (MIAP) --- special thanks!
•  Michael Stoller (Lib/C&RS)
•  Kent Underwood (Lib/AFC)
•  Chela Scott Weber (Lib/AFC) -- departed

OUR TECHNICAL COLLABORATION: CRAWLING

NYU/IA Collaboration

Traditional Crawlers

•  Archive-It and other web archives use Heritrix
•  Follow links, capture most web content
•  Less successful with streaming video and dynamic content executed in the browser
•  Umbra helps

BROZZLER!

“browser” | “crawler” = BROZZLER

Logo: Noah Levitt

Brozzler System Architecture v1

Brozzler Model

•  job: collection of seeds
•  seed: principal unit of crawl configuration
    –  one browser works on one seed at a time (politeness)
    –  seed has its own configuration, also inherits from parent job
•  page: atomic unit of crawling from brozzler perspective
•  url: only browsers, warcprox have to deal with every url
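The job / seed / page relationship on the Brozzler Model slide can be sketched as a small data model. This is only an illustration of the inheritance idea (a seed's own settings override those of its parent job), not brozzler's actual code or schema. The class names, the `effective_config` method, and the config keys `time_limit_s` and `max_hops` are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Job:
    """A job is a collection of seeds plus job-wide configuration."""
    name: str
    config: dict = field(default_factory=dict)
    seeds: list = field(default_factory=list)

    def add_seed(self, url, **overrides):
        # Each seed keeps only its own overrides; the rest comes from the job.
        seed = Seed(url=url, job=self, config=dict(overrides))
        self.seeds.append(seed)
        return seed


@dataclass
class Seed:
    """A seed is the principal unit of crawl configuration."""
    url: str
    job: Job = field(repr=False, compare=False)
    config: dict = field(default_factory=dict)

    def effective_config(self):
        # Seed-level settings win; anything unset is inherited from the job.
        return {**self.job.config, **self.config}


@dataclass
class Page:
    """A page is the atomic unit of crawling; it belongs to one seed."""
    url: str
    seed: Seed
    hops_from_seed: int = 0


# Hypothetical job-wide settings, with one seed overriding max_hops.
job = Job("composers", config={"time_limit_s": 3600, "max_hops": 2})
seed = job.add_seed("http://www.bitrosie.com", max_hops=1)
page = Page("http://www.bitrosie.com/", seed)
print(seed.effective_config())
```

Individual URLs are deliberately absent from this model: per the slide, only the browser and warcprox deal with every URL.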

Warcprox: WARC-writing http proxy

•  man-in-the-middle for https
•  asynchronous: WarcWriterThread
    –  writes warc records
    –  saves deduplication info
    –  updates statistics
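The asynchronous writer pattern named on the Warcprox slide can be sketched with a queue and a background thread. This is a simplified stand-in, not warcprox's implementation: it keeps records in memory instead of writing real WARC files, and the class name `WriterThread`, the tuple layout, and the `example.org` URLs are assumptions made for illustration.

```python
import hashlib
import queue
import threading


class WriterThread(threading.Thread):
    """Drains captured responses from a queue so the proxy never waits on disk."""

    def __init__(self, inbox):
        super().__init__(daemon=True)
        self.inbox = inbox
        self.records = []        # stand-in for WARC records written to disk
        self.seen_digests = {}   # deduplication info: payload digest -> first URL
        self.stats = {"urls": 0, "bytes": 0, "duplicates": 0}

    def run(self):
        while True:
            item = self.inbox.get()
            if item is None:     # sentinel: no more captures
                break
            url, payload = item
            digest = hashlib.sha1(payload).hexdigest()
            if digest in self.seen_digests:
                # Same payload seen before: record a pointer, not the bytes.
                self.records.append(("revisit", url, self.seen_digests[digest]))
                self.stats["duplicates"] += 1
            else:
                self.records.append(("response", url, payload))
                self.seen_digests[digest] = url
            self.stats["urls"] += 1
            self.stats["bytes"] += len(payload)


inbox = queue.Queue()
writer = WriterThread(inbox)
writer.start()

# The proxy side only enqueues and goes back to serving the browser.
inbox.put(("http://example.org/a.mp3", b"audio-bytes"))
inbox.put(("http://example.org/b.mp3", b"audio-bytes"))  # identical payload
inbox.put(None)
writer.join()
print(writer.stats)
```

The point of the design is that the three duties listed on the slide (writing records, saving deduplication info, updating statistics) all happen off the request path.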

Other pieces

•  python wayback
•  RethinkDB (distributed document store)

Stream capture relies on youtube-dl
https://rg3.github.io/youtube-dl/supportedsites.html

OUR COLLABORATION WITH CONTENT CREATORS, IP ISSUES

Young Composers Corpus

•  Began with NPR’s 2011 list of “100 Composers Under 40”
•  91 of 100 have own self-contained sites
•  Within a year of starting we had written agreements with 165 Composers (25 of them from NPR’s list)
•  Planned to recruit 10 of them for enhanced archiving (uncompressed; better than what is on website)
    –  This will require an added appendix to contract/agreement (which may involve dark archiving and/or restricted access)

Building relationships with Composers

•  Engage them with the idea of preserving their Website
•  Are they willing to give us richer versions of content on their site?
•  Are they willing to make all (or just part) of the content freely accessible? Do they want to embargo some content in a dark archive?
•  Donor Agreement/Contract

Donor Agreement/Contract

•  Worked on this with lawyers for well over a year
•  Have had fairly stable language in it and many contracts already signed and returned
•  Does default to allowing us complete rights for reformatting and for allowing researchers to see/hear all high quality versions at minimum on-site
    –  And thus far all Composers contacted have agreed to those principles (but not necessarily to the contractual language)

Contract Intro tentative language

•  NYU and Composer wish to establish long-term preservation of the materials listed at the highest possible quality. The Parties wish to enter into this Agreement to establish guidelines and standards with regard to ongoing and future library processes related to such preservation.

Elements in the Contract

•  What is being acquired
•  Terms of Transfer
•  Terms of user Access
•  Rights & Responsibilities (both NYU & Composer)
•  Appendix describing each item (format, content, amount, other pertinent descriptors)
•  Appendix with Access Restrictions

4 possible Levels of Streaming Access

•  Available for copy-protected streaming from the NYU Libraries’ website with unrestricted access by the general public.
•  Available for copy-protected streaming from the NYU Libraries’ website
    –  with access limited to registered NYU faculty and students and
    –  to external researchers with eligibility to use NYU Libraries’ archival resources according to NYU Libraries’ general access policies, with password authentication, on or off campus.
•  Available for copy-protected streaming on NYU Libraries premises, at designated workstations, with access mediated by NYU Libraries personnel.
•  Not available for streaming until a designated future date.
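The four access levels could be encoded as a simple policy check along the following lines. This is a hedged sketch, not the project's implementation: the names `AccessLevel`, `Request`, and `can_stream` are hypothetical, and the rule that an embargoed item becomes streamable once its date has passed is an assumption, since the slide does not say which level applies afterwards.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import Optional


class AccessLevel(Enum):
    PUBLIC = 1          # general public, via the NYU Libraries' website
    AUTHENTICATED = 2   # NYU faculty/students and eligible external researchers
    ON_PREMISES = 3     # designated workstations, mediated by library personnel
    EMBARGOED = 4       # no streaming until a designated future date


@dataclass
class Request:
    authenticated: bool = False   # passed password authentication
    on_premises: bool = False     # sitting at a designated workstation
    today: date = date(2018, 11, 13)


def can_stream(level: AccessLevel, req: Request,
               embargo_until: Optional[date] = None) -> bool:
    """Return True if this request may stream an item at the given level."""
    if level is AccessLevel.PUBLIC:
        return True
    if level is AccessLevel.AUTHENTICATED:
        return req.authenticated
    if level is AccessLevel.ON_PREMISES:
        return req.on_premises
    # EMBARGOED: assumed to open once the designated date has been reached.
    return embargo_until is not None and req.today >= embargo_until


print(can_stream(AccessLevel.AUTHENTICATED, Request(authenticated=True)))
```

Keeping the level as data on each item, for example in the contract's access-restrictions appendix, would let the same check serve a request arriving from either Archive-It or a Finding Aid.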

Tentative pieces of the Contract

•  The uncompressed master files of Materials licensed for inclusion will be made available to the Libraries to enable the research and development of higher quality tools and processes for archiving on the Web and successor technology. The resultant high-quality copies of Composer’s website, incorporating the best quality media files, will be preserved as historical documents in the archive, which will be accessible worldwide on the Web or successor technology as a storehouse of cultural memory and a vehicle for research and scholarship. Composer retains existing rights to his or her Materials, subject to the license granted in this Agreement.

Tentative pieces of the Contract

•  non-exclusive worldwide, perpetual, irrevocable, royalty-free right to produce, use, copy, and distribute Derivative Works
•  strictly limited to reformatted digital files or to excerpts and abridgements (such as thumbnails) created for the technical purposes of building, preserving, and providing access to the Web archive over the World Wide Web or its successor
•  may be used only for the non-profit educational and research purposes provided under this Agreement
•  Agreement does not affect or transfer any copyrights or other intellectual property rights

ARCHITECTURE & WORKFLOWS

Architecture & Workflows

•  The Finding Aids are generated from ArchivesSpace (which contains rich metadata)
•  There is an overall Composers Finding Aid, as well as a separate Finding Aid for each composer (listing inventory and web archives, and link to assets)
•  Web archive is stored in Archive-It; richer content in NYU Repository
•  Connections built off of ArchivesSpace back-end API Demo Site

Software & Service Components

•  IA’s Archive-It
•  NYU digital library internal components
    –  Aeon for workflow management
    –  ArchivesSpace
    –  EAD

Unfinished Development work

•  Supplying a separate audio player?
•  Still working on precise forms of navigation between ArchivesSpace, Archive-It, and richer content within NYU’s digital repository
•  What will be on the workstation for items that need to be looked at on-site?
•  Issues with streams that were not captured
•  Example of work done on IA’s API

Interim work on API to IA

•  What IA needs from NYU API
    –  API URL
    –  Credentials (username, password) -> Authentication Token()
    –  Repository ID
    –  Resource ID
•  What IA will return as JSON array
    –  Unit Title
    –  Creator
    –  Data Expression
    –  Extent Statement
    –  Tech Characteristics
    –  [Something Based on Access Restriction, i.e. can it be streamed]???
•  We Speak Etruscan, 1993 May 21, 23.5 MB, 1 AIFF file Stereo uncompressed 16 bit/44.1K
•  The Dream of Innocence III, 1998 March 26, 150 MB, 1 AIFF file Stereo uncompressed 16 bit/44.1K
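To make the exchange on the API slide concrete, here is a rough sketch of the request parameters and a JSON array built from the slide's two example items. It makes no network call and does not reflect the real NYU or IA API. The key spellings, the placeholder URL and credentials, and the mapping of each example's date, size, and format onto "Data Expression", "Extent Statement", and "Tech Characteristics" are all assumptions.

```python
import json

# What the caller supplies, per the slide (every value is a placeholder).
request = {
    "api_url": "https://example.org/api",
    "username": "ia-user",
    "password": "not-a-real-password",
    "repository_id": 0,
    "resource_id": 0,
}


def item(unit_title, data_expression, extent_statement,
         tech_characteristics, creator=None):
    """One element of the returned JSON array, keyed by the slide's field names."""
    return {
        "unit_title": unit_title,
        "creator": creator,  # not given in the slide's examples
        "data_expression": data_expression,
        "extent_statement": extent_statement,
        "tech_characteristics": tech_characteristics,
    }


response = [
    item("We Speak Etruscan", "1993 May 21", "23.5 MB",
         "1 AIFF file Stereo uncompressed 16 bit/44.1K"),
    item("The Dream of Innocence III", "1998 March 26", "150 MB",
         "1 AIFF file Stereo uncompressed 16 bit/44.1K"),
]

payload = json.dumps(response, indent=2)
print(payload)
```

The still-undecided access-restriction field (the bracketed item on the slide) is left out here rather than guessed at.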

HOW THINGS MAY LOOK

Query paths still under development

One option for User Queries

•  User browses through Archive-It
•  User sees that A/V content exists (and in some cases, it will include richer content, but some of that might be access-restricted)
•  Archive-It hands off user to NYU (either directly to A/V content, or to Finding Aid)

One option for Queries

One option for high quality content

•  On archived website page listing composer’s content, user sees a message that higher quality content is available, with:
    –  Access restrictions, if applicable
    –  Link to relevant finding aid
    –  (looking like following image)

Demo from API side
http://composers.dlib.nyu.edu/

From the Library Finding Aid side
http://dlib.nyu.edu/findingaids/html/fales/mss_479/

From the Library Finding Aid side (cont)

From the Library Finding Aid side (Container List)

From the Library Finding Aid side
http://dlib.nyu.edu/findingaids/html/fales/mss_460/dscaspace_7951feea619b6c41436c556e0674d1c8.html

From the Archive-It side
https://archive-it.org/collections/7872

From the Archive-It side
https://archive-it.org/collections/7872?q=http%3A%2F%2Fwww.bitrosie.com&show=SeedVideos&fc=seedId%3A1157594
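The second Archive-It URL above is the collection URL plus a percent-encoded query string. The sketch below rebuilds it with Python's standard library and decodes it again. The parameter names `q`, `show`, and `fc` are read off the slide's URL; what each one means to Archive-It is inferred rather than documented here, and the helper name `seed_video_url` is hypothetical.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

COLLECTION = "https://archive-it.org/collections/7872"


def seed_video_url(seed_url, seed_id):
    """Build the collection URL that lists videos captured for one seed."""
    query = urlencode({
        "q": seed_url,                 # ':' and '/' become %3A and %2F
        "show": "SeedVideos",
        "fc": f"seedId:{seed_id}",
    })
    return f"{COLLECTION}?{query}"


url = seed_video_url("http://www.bitrosie.com", 1157594)
print(url)

# Decoding recovers the readable values from the encoded query string.
params = parse_qs(urlsplit(url).query)
print(params["q"][0], params["fc"][0])
```

Decoding shows that the link targets a single seed (`http://www.bitrosie.com`, seed ID 1157594) within collection 7872.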

From any direction, user might need to authenticate

SOME OTHER INTERNAL TRACKING

Crawl Records

EVALUATION

Evaluation for Improvement

•  Composers and their satisfaction with the ways in which audiences will be able to view archives of their websites (improving usability)
•  Researchers, and whether the content and functionality of these web archives works for them (content presentation)
•  Tweaking what we do in order to better serve Creators and Researchers
•  Finding out whether captures really worked

Findings still being analyzed

•  Streaming captures appear more successful, but we still experience some streaming capture problems
•  Need further exploration to see the precise cause of the crawler/capture issues (& rectify them if possible)

Crawler Issues (broken header links)

Crawler Issues (failed video capture)

Crawler Issues (video capture failure)

Crawler Issues (Flash video issue)

Crawler Issues (video captured without audio)

Crawler Issues (broken video links)

Crawler Issues (1 audio not captured)

Crawler Issues (audio not captured)

Crawler Issues (audio failure & anchor problem)

Crawler Issues (partial capture failure)

Crawler Issues (incomplete loading)

Crawler Issues (Capture issues)

Crawler Issues (unknown problems)

Crawler Issues

•  Campjulie.com:
    –  Any capture date: Very slow load time made it hard to tell whether the site was working or not, so some subjects gave up. [Site owner says this is inherent to site, so might not be a capture problem.]
    –  Discrepancies between when one hop out is captured or not.
•  Kmariekim.com:
    –  Sep 26, 2017 capture (latest capture): Attempts to play music from archived tumblr page from various platforms (youtube, soundcloud, etc.).
•  Bitrosie.com:
    –  All capture dates: links take roughly 5 minutes (assumed broken at first)
•  Adelefournet.com/video/:
    –  Sep 12, 2017 capture: Video error after roughly 10 seconds. Stops playing "Berets of Mary Jean Place", and starts playing another video with opening title "Barranco District, Lima, Peru". The rest of the videos on the page do not play. Link to "Berets of Mary Jean Place" on the Internet Archive also plays incorrect video ("Barranco District, Lima, Peru").
•  Michael Robinson archived website: Error message

Evaluation Results

•  The subjects were basically satisfied with the captures, but had very many suggestions for improvements with labeling, searching, display, and performance. Most also wanted additional functionality.
•  Many of the subjects were confused between captured sites and the Finding Aids for them. In addition, the words “Papers of” in collection titles baffled people when they were looking for recordings, not papers.
•  Both users and site owners were unclear about the scope of content that had actually been collected. One site owner expressed disappointment that reviews that they linked to were not captured. And only one subject figured out how to navigate to a suggested “live web” page that had not been archived.

Functionality requested by users

•  Most subjects wanted more metadata displayed. Examples included: displaying a description of the Composers Project and likely contents on the initial start page; display of audio/video run-time instead of file size; description, thumbnails, excerpts for material restricted to onsite use (so that they could decide whether or not they really needed to make a site visit); more fields shown in various displays (both in lists and in links to essence).
•  Both site owners responded positively to the idea of providing a site map with a collapsing menu of links.
•  Most subjects wanted a search box. And most wanted to be able to immediately sort a multi-column display list by any column of their choosing.
•  One subject found it misleading when a restricted object linked to a new page.
•  One site owner preferred that their digital objects be organized by project, rather than in an undifferentiated list of every digital object on their site.
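The request to show run-time instead of file size is straightforward for uncompressed audio, because size and duration are tied together by sample rate, channel count, and bit depth. The sketch below applies that arithmetic to the two AIFF examples from the API slide (stereo, 16 bit, 44.1 kHz). It assumes 1 MB = 1,000,000 bytes and ignores file headers, so the results are estimates; the function names are hypothetical.

```python
def pcm_runtime_seconds(size_bytes, sample_rate=44_100, channels=2,
                        bits_per_sample=16):
    """Estimate duration of uncompressed PCM audio from its size in bytes."""
    bytes_per_second = sample_rate * channels * bits_per_sample // 8
    return size_bytes / bytes_per_second


def as_clock(seconds):
    """Format a duration as minutes:seconds, dropping fractions of a second."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes}:{secs:02d}"


# Sizes taken from the API slide's examples; MB assumed to mean 10**6 bytes.
for title, megabytes in [("We Speak Etruscan", 23.5),
                         ("The Dream of Innocence III", 150)]:
    runtime = pcm_runtime_seconds(megabytes * 1_000_000)
    print(title, as_clock(runtime))
```

For compressed formats this shortcut does not hold, and run-time would have to come from the file's own metadata.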

Functionality requested for local workstations

•  Ability to take screen grabs
•  Access to additional browser window
•  Preview frame when scrubbing (fast forwarding) through video material
•  Use of their own laptop or another window
•  Display of timecode
•  And 2 subjects specifically requested the
    –  ability to slow video/audio file to transcribe
    –  ability to drop pin/attach notes to specific point in video/audio file

IMPACT BEYOND THIS PROJECT

Impact Beyond this Project

•  There will be an alternative to Heritrix for capturing streaming media, and Archive-It will ideally be able to better handle streaming media, and display it in proper context
•  We will have architectures and workflows for Archive-It to interact with richer local resources (as well as examples of how interaction and navigation can proceed between Archive-It, ArchivesSpace, Finding Aids, and an internal digital repository)
•  Models for interaction between creators and collecting organizations will have been developed (incl. donor agreements)
•  We have preserved 100+++ websites of young composers

Archiving websites containing streaming media: the Music Composer Project

•  http://besser.tsoa.nyu.edu/howard/Talks/
•  http://www.nyu.edu/about/news-publications/news/2015/03/27/nyu-libraries-to-team-with-internet-archive-to-preserve-high-quality-musical-content-on-the-web.html
•  http://archive.org/~nlevitt/reveal.js/
•  http://composers.dlib.nyu.edu/
•  https://rg3.github.io/youtube-dl/supportedsites.html