This is the presentation I gave at the Internet Archive's "Make Books Apparent" meeting, held in San Francisco, October 19-20, 2009. The meeting was mainly about our exciting new project, called BookServer, a distributed lending & vending system over the Internet.


  • 1. Hello. My name is George Oates, and I'm leading the Open Library project.

2. About 6 months ago. Redesigning everything, and I thought I'd tell you a little bit about that.

3. First Steps: listen to people; answer help emails; meet team in person (met in San Francisco in June); streamline deploys (1 button!); redraw sitemap; refocus on core; dream a little (ask silly questions, assess competition); get acclimatized.

4. Understand relationships: what have we got, and how does it all inter-relate? Any relationship can be made into a hyperlink.

5. Reach into the network. - We've also arranged a little Flickr integration, so if people take photos of books, they can link them to Open Library records. We're not using them yet. - As you may have noticed last night, we also added a link from Internet Archive book pages into Open Library. We reckon that's almost doubled our modest traffic. (About 250k unique IPs per day.)

6. Challenges: dense library metadata; designed for classic institutional search/retrieve practice; data is dry, sometimes poor quality; no insight into the community; distributed team (US, India, UK). - So one thing I began doing was reading and answering enquiries that come to [email protected] (this is a good thing to have new people do for a while). - I found that some questions repeated themselves and there was a key mismatch in understanding what Open Library was about, e.g. people would write in to us asking us to correct errors, not knowing they were able to do it themselves.

7. There are 4 Agatha Christies in this list, 2 of which appear to the eye to be identical. Computers have trouble recognising that these authors are the same woman. It's easy for humans to do. How could we build a UI to help people help us to merge these duplicates?

8. What have we got?
Loads of data (23 million records); small user base (< 20,000); small team (6 people); small architecture (12 servers); good framework (infogami). Certainly there are challenges to trying to make use of a large but shallow dataset, but Open Library has lots of advantages in terms of a small team & system being able to change rapidly. This flexibility will hopefully help us.

9. Began experimenting with the data we have to try to see the catalog landscape. What do we already have that we're not showing to people yet? Look at all these subjects! These timeframes! How can we make use of them?

10. Look at all these new links! ISBN -> Publisher names -> Show me all the books this publisher has published... Show me all the subjects related to cheese... Add links and hey presto! You're bouncing around the catalog.

11. What if? Adjacent books. Not efficiency, but effectiveness (conversation broker, records improve over time) - Shirky. Not a purchasing engine, but a library. As an exercise, it's fun to ask what might happen if there were no search box on Open Library. Could you still use it?

12. Changing the look of the logo will hopefully encourage people to come inside and look around. Break the conventional library look and try to warm it up a little... We are literally open - not only at the software level, but also in that all of Open Library's records are editable, by anyone.

13. Add a Book? So, let's take a look at one of the key UIs on Open Library - how to add a new record. This is the current form. Basically just a web UI to a pretty dense, librarian-centric form. A lot of the fields are difficult for non-librarians to complete - a definite barrier to entry for both adding new records and editing existing things.

14. The idea is to break it into two steps. This is step 1. The most important thing to do is to make it feel easy to add a record. This first step also gathers enough info to allow us to do a decent search for any existing records.
If we find a match, we can direct people towards the Edit view of that record. If there's no match, we move on...

15. Step 2 is a massive form. There's no way to hide that, basically. All the fields are potentially useful. What we can do is organize the info a little, so related things (the physical object, pagination) are grouped together. We're also going to try adding a tabbed view to soften the blow a little. Also, hopefully, adopting a conversational tone with the form labels might help direct people a little more about the sort of data we want.

16. It would be awesome if we could start to collect excerpts from books. A personal touch from people about particular bits they've enjoyed and why. Also, these excerpts could be indexed to help boost books in our search.

17. Links, links, links.... This networked catalog is all about how many things we can connect books to. This is the principle of metadata giving records a sort of surface tension to keep them from sinking into the depths.

18. Those first 3 tabs (About, Excerpts, Links) are about the Work level of our records. In this first version, we're going to try not worrying about exposing this slightly weird metadata-y thing called Work to visitors, but still attempt to collect data at the Work-y level. There's a specific tab just for Editions too, which contains fields mainly about publishing info and the physical (or virtual!) object itself.

19. Another experiment we're looking forward to trying is about identifiers. We're not particularly concerned about canonical identifiers. Perhaps it's a waste of time to wait for one, so instead, we're going to try to attach as many ID types to our records as we can. (This list is just a braindump - not active yet.) The idea is that people could add a URL or actual identifier and Open Library would just do the right thing. A suggestion (after this presentation was delivered) was that people could ping Open Library with an identifier, not even knowing what TYPE of ID it is.
Perhaps Open Library could help triangulate this query towards a book record. Record laundering.

20. Key Features. History: activity, life, cause, effect; notifications thereof. List(s): more small, ad hoc collections; public / private; exportable (ad hoc catalogs). - Planning two features that play off the strengths of the underlying wiki: History & Lists. - AD HOC (so, BookServer feeds should be expected to be ad hoc. No point in trying to agree on a hierarchy etc. for feeds. Waste of time.)

21. We're excited about how we might improve the display and linkage from the history of our records. They are another source of connections into and around the catalog, so we should activate them where we can to connect to people, subjects, publishers, even dates. See everything that happened on Open Library on May the 4th, 2009. Version 1 probably won't be quite this robust :)

22. Tension?

23. I'm not sure how much we're going to be able to assist the Library of Congress.

24. Small Collections. Catalogues to & from book lovers who may or may not be professional librarians. Effective & Personal; Inefficient & Charming, Detailed. Looking to integrate cool cataloging services like Koha, Delicious Monster - Anyone?? It was only last night I met a woman who is cataloguing a business's library of some 1,100 books. She said she was looking on Open Library for a way to upload a CSV file to us. We should do that, and note it on each edition's history. (*Note: Design that CSV and get it online!)

25. History - there was some talk about timestamps yesterday. Being able to slice things by time will only increase in importance as the web gets older, so I'd suggest putting timestamps on anything you can think of.

26. Substrate: any surface on which a plant or animal lives or on which a material sticks.

27. What if we position library records like that?

28. "Build it so anyone can contribute any amount." - Clay Shirky

29. The act of adding a book to a library catalog is a bit like playing Tetris.
30. Librarians are (very clever) humans too. And everyone who's responsible for putting books into a traditional catalogue must work within patterns. Patterns that have grown semantically remarkable and deeply complex.

31. "But here's a question for you, let's say you have an 856 URL to full text for a serial. And you know what date ranges it covers. What sub-field would you put that in? $3 or $z? I see it in both." - Jonathan Rochkind, Bibliographic Wilderness. I'm glad I don't have to either ask or answer this question.

32. "Library metadata is diabolically rational." - Karen Coyle

33. Hic sunt dracones. Detail from a map of the East Indies showing, outlined in pink, the first European discoveries along the Cape York Peninsula. Early in 1606, towards the northern tip of the peninsula, Willem Jansz made here what was almost certainly the first landing by Europeans in Australia. This map first appeared in 1635 and was reprinted unchanged until 1664.

34. Here be dragons. Detail from a map of the East Indies showing, outlined in pink, the first European discoveries along the Cape York Peninsula. Early in 1606, towards the northern tip of the peninsula, Willem Jansz made here what was almost certainly the first landing by Europeans in Australia. This map first appeared in 1635 and was reprinted unchanged until 1664.

35. This is one of the few maps in the eighteenth century devoted entirely to Australia. Jacques Bellin was hydrographer to the French King Louis XV. He has added a hypothetical coast line joining Australia, New Guinea and Tasmania - a note says that this is included without proof. It is further suggested that New Zealand might be part of the great southern continent.

36. I wonder
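
A rough sketch of the slide 19 idea of pinging Open Library with an identifier without knowing what type it is. The Python below only guesses between a URL, an ISBN-10 and an ISBN-13, using the standard ISBN check-digit rules. The function name `guess_identifier_type` and the labels it returns are made up for illustration; this is not Open Library code, and a real version would presumably need to recognise many more ID types.

```python
import re
from urllib.parse import urlparse


def _is_isbn10(s: str) -> bool:
    # ISBN-10: nine digits plus a check character (digit or X, where X = 10).
    if not re.fullmatch(r"\d{9}[\dX]", s):
        return False
    # Weights run 10, 9, ..., 1; the weighted sum must be divisible by 11.
    total = sum((10 - i) * (10 if ch == "X" else int(ch)) for i, ch in enumerate(s))
    return total % 11 == 0


def _is_isbn13(s: str) -> bool:
    # ISBN-13: thirteen digits, weights alternate 1, 3, 1, 3, ...
    if not re.fullmatch(r"\d{13}", s):
        return False
    total = sum(int(ch) * (3 if i % 2 else 1) for i, ch in enumerate(s))
    return total % 10 == 0


def guess_identifier_type(raw: str) -> str:
    """Hypothetical helper: return 'url', 'isbn_13', 'isbn_10' or 'unknown'."""
    text = raw.strip()
    parsed = urlparse(text)
    if parsed.scheme in ("http", "https") and parsed.netloc:
        return "url"
    # Drop spaces and hyphens, which people type inconsistently in ISBNs.
    compact = re.sub(r"[\s-]", "", text).upper()
    if _is_isbn13(compact):
        return "isbn_13"
    if _is_isbn10(compact):
        return "isbn_10"
    return "unknown"
```

Anything that comes back as `unknown` could still be tried against the other ID types attached to records, which is where the "triangulate" part would come in.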