Mashups: embedding content from other websites, mostly maps. Ex.: http://dinby.dk
In the archive: no map, just a "black hole". No solution.
netarkivet
Flash. Ex.:
http://www.b.dk/billedeserier/skoejtekongerne
http://viborg-folkeblad.dk/foto/galleri-volume-and-dance-ix-2011-i-tinghallen
In the archive: the Flash player keeps trying to load; only thumbnails are visible.
Sound (radio)
Streaming does not work.
mp3 files: test harvesting with a new template, Default_order_with_xml-extraction_10levels (2 levels would be enough). Ex.:
den2radio.dk -> http://feed.podcastmachine.com/podcasts/70/mp3.rss
Midifiles.dk -> http://www.midifiles.dk/articlelist.51
Solution: creation of a new template.
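Why XML extraction helps here can be sketched in a few lines: the mp3 links sit inside the podcast feed as enclosure elements, so a harvester that parses the feed reaches the audio files within one or two hops. The sketch below is only an illustration and is not how the harvester template itself is implemented; the function name enclosure_urls and the sample feed (with example.org URLs) are hypothetical.

```python
import xml.etree.ElementTree as ET


def enclosure_urls(rss_text):
    """Return the url attribute of every <enclosure> element in an RSS 2.0 feed."""
    root = ET.fromstring(rss_text)
    urls = []
    for enclosure in root.iter("enclosure"):
        url = enclosure.get("url")
        if url:  # skip enclosures without a url attribute
            urls.append(url)
    return urls


# Hypothetical feed content, for illustration only.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example podcast</title>
    <item>
      <title>Episode 1</title>
      <enclosure url="http://example.org/audio/episode1.mp3" type="audio/mpeg" length="123"/>
    </item>
    <item>
      <title>Episode 2</title>
      <enclosure url="http://example.org/audio/episode2.mp3" type="audio/mpeg" length="456"/>
    </item>
  </channel>
</rss>"""

if __name__ == "__main__":
    for url in enclosure_urls(SAMPLE_FEED):
        print(url)
```

Front page -> feed -> mp3 file is two levels, which matches the remark that 2 levels would be enough.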
Video
Harvested, ex.: Trier.gyldendal.dk -> http://lmp.lynxmedia.dk/Trier/intro_til_temaplayer/1181299957075374541/export_popup?serverinfo=1181299892540916715&skin=Trier/Trier&mode=clip#
Not harvested, ex.:
Folketinget (http://www.ft.dk/webtv/video/20111/salen/14.aspx?as=1) -> (live) streaming
http://kino.dk -> streaming
Jp.dk -> http://jp.dk/jptv/
More and more news sites display videos on their pages.
More than 50% of the problems we have are about video content.
Another big issue: login/password-protected content
1. Scenario: the website owner gives access to our harvesters' IP addresses -> that works fine (ex.: mediawatch.dk)
2. Scenario: the website owner delivers login and password -> a developer task *)
Last but not least: Facebook and Twitter... never-ending stories **)
*) Password content: 3 methods
with cookies (does not work any more)
html-login: no solution
http-login: addition to the template
Ex. http login: finanswatch.dk, template addition:
<newObject name="finanswatch_login_1" class="org.archive.crawler.datamodel.credential.HtmlFormCredential">
  <string name="credential-domain">finanswatch.dk/login</string>
  <string name="login-uri">https://secure.finanswatch.dk/mainLogin?hidden=true&amp;mode=</string>
  <string name="http-method">POST</string>
  <map name="form-items">
    <string name="j_username">[email protected]</string>
    <string name="j_password">netarkivet</string>
    <string name="_spring_security_remember_me"/>
    <string name="loginButton">Log ind</string>
    <string name="spring-security-redirect">https://secure.finanswatch.dk/mainLogin?hidden=true&amp;mode=</string>
  </map>
</newObject>
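To make the form-items map more concrete: on login, the field names and values are sent to the login-uri as a URL-encoded POST body. The sketch below shows that encoding with the Python standard library. It is an illustration, not the harvester's own code; the function name login_post_body is hypothetical, and USERNAME and PASSWORD are placeholders standing in for the real credentials.

```python
from urllib.parse import urlencode


def login_post_body(form_items):
    """Encode form fields as an application/x-www-form-urlencoded body."""
    return urlencode(form_items)


# Field names taken from the template addition; the values for the first two
# fields are placeholders (assumption), not real credentials.
FORM_ITEMS = {
    "j_username": "USERNAME",
    "j_password": "PASSWORD",
    "_spring_security_remember_me": "",
    "loginButton": "Log ind",
    "spring-security-redirect": "https://secure.finanswatch.dk/mainLogin?hidden=true&mode=",
}

if __name__ == "__main__":
    print(login_post_body(FORM_ITEMS))
```

An empty element such as _spring_security_remember_me is still sent, with an empty value.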
**) Facebook: special template, under ongoing revision (following Facebook changes)
Twitter: the Next button does not work in the archive. Solution: harvesting 6 times a day (front page). Twitter harvests no longer work at all since Twitter began putting #! into the URLs of the Twitter profiles.
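One likely reason the #! URLs defeat harvesting: everything after the # is a URL fragment, which stays in the browser and is never sent to the server, so every profile URL of this form resolves to the same request. The sketch below shows the split using the Python standard library; the function names and the profile names are hypothetical.

```python
from urllib.parse import urlsplit


def requested_url(url):
    """Return the part of the URL that is actually sent to the server."""
    return urlsplit(url)._replace(fragment="").geturl()


def fragment_part(url):
    """Return the fragment, which only client-side script ever sees."""
    return urlsplit(url).fragment


if __name__ == "__main__":
    # Hypothetical profile names, for illustration only.
    for url in ("http://twitter.com/#!/profile_a", "http://twitter.com/#!/profile_b"):
        print(requested_url(url), "| fragment:", fragment_part(url))
```

Both example URLs lead to the same request for the front page; the profile content is then loaded by script, which the harvester does not run.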