exploring museum collections online: the quantitative method

Post on 30-Jun-2015

DESCRIPTION

A slightly adjusted (better described) version of the presentation given at Museums and the Web 2008.

TRANSCRIPT

Exploring Museum Collections Online: The Quantitative Method

Frankie Roberto, Science Museum

frankie@frankieroberto.com

It all started at a ‘Mashup workshop’ here...

If you want to mash up museum object data, you’ll come across 3 problems:

1. Getting It - museum data isn’t always accessible.

2. Structure - the format of the data varies across museums.

3. Dodgy Data - data is often full of errors, typos and incomplete fields.

There are 3 traditionally-advocated solutions to these problems:

1. Getting It = APIs

2. Structure = Metadata standards

3. Dodgy Data = Hard Work (data entry)

These may be good solutions, but they’re all hypothetical, and rely on other people doing things sometime in the future.

I’m not interested in perfection, I just want data that’s...

Good Enough (Assez bon)

...and I’ll return to this slide if we start to get bogged down in a search for perfection.

So here are my alternative solutions to the 3 problems...

1. Getting It = Screen Scraping

Or...

1. Getting It = making a Freedom of Information request
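Screen scraping, the first of these two options, might look something like the sketch below. It is only an illustration: the sample page, its tags and the field labels are invented here, and a real museum site will need its own parsing rules.

```python
from html.parser import HTMLParser

# Hypothetical object page; real museum pages will be structured differently.
SAMPLE_PAGE = """
<html><body>
  <h1 class="object-title">Flintlock pistol</h1>
  <dl>
    <dt>Place made</dt><dd>London, England</dd>
    <dt>Acquired</dt><dd>1935-04</dd>
  </dl>
</body></html>
"""


class ObjectPageParser(HTMLParser):
    """Collects the <h1> title and each <dt>/<dd> pair from one object page."""

    def __init__(self):
        super().__init__()
        self.record = {}
        self._tag = None    # tag whose text we are currently inside
        self._label = None  # most recent <dt> text, waiting for its <dd>

    def handle_starttag(self, tag, attrs):
        if tag in ("h1", "dt", "dd"):
            self._tag = tag

    def handle_endtag(self, tag):
        if tag == self._tag:
            self._tag = None

    def handle_data(self, data):
        text = data.strip()
        if not text or self._tag is None:
            return
        if self._tag == "h1":
            self.record["title"] = text
        elif self._tag == "dt":
            self._label = text
        elif self._tag == "dd" and self._label is not None:
            self.record[self._label] = text
            self._label = None


def scrape_object_page(html):
    """Return a flat dict of whatever fields the page exposes."""
    parser = ObjectPageParser()
    parser.feed(html)
    return parser.record


print(scrape_object_page(SAMPLE_PAGE))
```

The parser keeps whatever labels the page uses; tidying them up is left to the mapping step.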

2. Structure = Crude data mapping

[Diagram: Museum X, Museum Y and Museum Z each pass through (some logic) into A Simple Format]
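The "(some logic)" step could be as small as one function per museum that renames fields into a shared flat record. The sketch below assumes two made-up source layouts; apart from place_made, every column and key name is hypothetical.

```python
# Target: one flat dict per object, with the same keys for every museum.
SIMPLE_FIELDS = ("museum", "title", "place_made", "acquired", "method")


def from_museum_x(row):
    # Museum X (hypothetical): spreadsheet-style column names.
    return {
        "museum": "Museum X",
        "title": row.get("Object Name"),
        "place_made": row.get("Place"),
        "acquired": row.get("Date Acquired"),
        "method": row.get("Acquisition"),
    }


def from_museum_y(row):
    # Museum Y (hypothetical): nested record, e.g. parsed from XML.
    acquisition = row.get("acquisition") or {}
    return {
        "museum": "Museum Y",
        "title": row.get("name"),
        "place_made": row.get("place_made"),
        "acquired": acquisition.get("date"),
        "method": acquisition.get("method"),
    }


MAPPERS = {"x": from_museum_x, "y": from_museum_y}


def to_simple_format(source, row):
    """Run the museum-specific mapper, then force every field to a string."""
    record = MAPPERS[source](row)
    # Crude on purpose: missing values become empty strings, not errors.
    return {field: str(record.get(field) or "").strip() for field in SIMPLE_FIELDS}


print(to_simple_format("x", {"Object Name": " Flintlock pistol ", "Place": "London, England"}))
```

Adding a museum then means writing one more small mapper rather than agreeing on a metadata standard first.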

3. Dodgy Data = Just say...

Good Enough (Assez bon)

I’m also interested in how we’re displaying museum collections data online.

This is the usual approach:

Or like this:

Or this...

...basically, an image with a description.

Which is all well and good, but doesn’t give you much of a sense of this...

The objects as a collection.

So I sent an FOI request to a bunch of museums.

...and this was the response:

Museum                      Granted?  Response
British Museum              No        “3-4 days and over £1000”
Imperial War Museum         No        none
Museum of London            No        none
National Gallery            No        none
National Maritime Museum    Yes       50 Excel spreadsheets
National Museums Liverpool  No        “More than 2.5 days”
National Portrait Gallery   No        “Information on website”
Natural History Museum      No        “70m specimens”
Royal Armouries             Yes       50MB CSV file
Sir John Soane’s Museum     Yes       Word document
Tate Galleries              No        none
Victoria & Albert Museum    Yes       2.9GB XML file on DVD
Wallace Collection          No        none

This is the data I was after:

Who? - collector/curator

What? - object type

Where? - country of origin

When? - year of acquisition

How? - acquisition method

And these are some examples of the data I got...

Who?

hoped-for data: “Henry Wellcome”, “curator x”

actual response: no data

What?

target data: tag-style descriptors in the form ‘this is an x’

actual data: categories in the style of ‘Horological Instruments’, ‘Coins & Commemorative Medals’, ‘Jewellery’

What? Mapping process

Category                  Object-type tag
‘Firearm’                 firearm
‘Edged weapon’            weapon
‘Furniture’               item of furniture
‘Glass’                   piece of glassware
‘Merchant Ship Plans’     ship plan
‘Miscellaneous Antiques’  antique

Would it be better to try and parse the title?
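The category-to-tag mapping above is essentially a lookup table. A minimal sketch, assuming categories arrive as plain strings and that anything unmapped is simply left untagged (that fallback is my assumption, not something the slides specify):

```python
# The six pairs shown on the slide; a real table would be much longer.
CATEGORY_TO_TAG = {
    "Firearm": "firearm",
    "Edged weapon": "weapon",
    "Furniture": "item of furniture",
    "Glass": "piece of glassware",
    "Merchant Ship Plans": "ship plan",
    "Miscellaneous Antiques": "antique",
}

# Case-insensitive index, so 'FURNITURE' and 'furniture' both match.
_LOOKUP = {category.casefold(): tag for category, tag in CATEGORY_TO_TAG.items()}


def object_type_tag(category):
    """Return the tag for a museum category, or None if it is not mapped."""
    key = category.strip().strip("‘’'\"").strip()
    return _LOOKUP.get(key.casefold())


print(object_type_tag("‘Merchant Ship Plans’"))
print(object_type_tag("Horological Instruments"))
```

With the tag in hand, a sentence such as "this is a ship plan" can be generated for each object.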

Where?

target data: country-level location (list of countries from ISO website)

Where?

Some examples of the actual data:

[Bar chart: number of objects per place string, axis from 0 to 2,250]

“Netherlands”
“Clydebank, Cumbria, England”
“Barrow, Cumbria, England”
“Germany”
“Glasgow, Strathclyde, Scotland”
“Newcastle upon Tyne, Tyne and Wear, England”
“Birkenhead, Merseyside, England”
“France”
“Sheernet Dockyard, Isle of Sheppey, Kent, England”
“Dockyard, Chatham, Kent, England”
“England”
“Dockyard, Portsmouth, Hampshire, England”
“Dockyard, Devonport, Devon, England”
“London, England”

Where?

National Maritime Museum: 1,496 unique place_made strings

Tricky cases:

“probably Germany”
“Asia”
“possibly: Chatham Dockyard”
“USSR”
“Far East”
“Continental Europe”
“Arabia”
“Italy or England”
“Persia”
“Czechoslovakia”

This is where you have to ignore some data and say...

Good Enough (Assez bon)
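One "good enough" way to reduce a place_made string to a country is to take the text after the last comma and keep it only if it is on the country list. The sketch below uses a tiny stand-in for the ISO list; folding England and Scotland into "United Kingdom" and stripping "probably"/"possibly" are my assumptions, and everything that still fails to match is ignored.

```python
import re

# Tiny illustrative subset; a real run would load the full ISO country list.
COUNTRIES = {"Netherlands", "Germany", "France", "United Kingdom"}

# Assumption: UK constituent nations are folded into a single country.
ALIASES = {"England": "United Kingdom", "Scotland": "United Kingdom"}

# Leading hedges such as "probably " or "possibly: ".
HEDGE = re.compile(r"^(probably|possibly)\s*:?\s*", re.IGNORECASE)


def country_of(place_made):
    """Return a country name for a place string, or None to ignore it."""
    text = place_made.strip().strip("“”\":")
    text = HEDGE.sub("", text)
    last = text.split(",")[-1].strip()
    last = ALIASES.get(last, last)
    return last if last in COUNTRIES else None


for raw in ["Dockyard, Chatham, Kent, England", "probably Germany",
            "Italy or England", "USSR", "Far East"]:
    print(raw, "->", country_of(raw))
```

Regions, former states and either/or values all come back as None, which is the "ignore some data" step in practice.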

When?

target data: year of acquisition (this should be easy)

actual data, and the year read from it:

1905 → 1905
1935-04 → 1935
12/09/06 → 2006 (possibly)
13/04/09 → 1909 (probably)
28/06/96 → 1996? 1986?
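The ambiguity in those dates can be made explicit by returning every plausible year rather than guessing one. This is a rough sketch: the two accepted formats, the choice of centuries and the cut-off year of 2008 are all assumptions on my part.

```python
import re


def candidate_years(raw, latest=2008, centuries=(1900, 2000)):
    """Return the plausible acquisition years for one raw date string.

    A four-digit year at the start is taken at face value. A dd/mm/yy date
    yields one candidate per century, dropping any that fall after `latest`.
    """
    text = raw.strip()

    match = re.match(r"^(\d{4})(?:\D|$)", text)
    if match:
        return [int(match.group(1))]

    match = re.match(r"^\d{1,2}/\d{1,2}/(\d{2})$", text)
    if match:
        two_digits = int(match.group(1))
        return [c + two_digits for c in centuries if c + two_digits <= latest]

    return []  # unrecognised format: ignore it


for raw in ["1905", "1935-04", "12/09/06", "13/04/09", "28/06/96"]:
    print(raw, "->", candidate_years(raw))
```

A single candidate can be used directly; two candidates mean the record needs a tie-break (or gets left out).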

How?

anticipated: donation / purchase / loan

real data: gift / purchase / bequest / transfer / transfer from MOD / transfer; gift / purchase; transfer / loan / Sale / deposit / Acceptance in Lieu of Tax / Acquisition / Exchange / presented / museum copy / Allocated by the Naval War Trophies Committee
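To get from the real values back to something like the anticipated three, each raw string can be split on ";" and each part looked up in a small table. Which terms count as a donation is a judgment call, so the table below is an assumption; anything not listed falls into "other".

```python
# Assumption: which raw terms fold into which bucket is a judgment call.
BUCKETS = {
    "gift": "donation",
    "bequest": "donation",
    "presented": "donation",
    "purchase": "purchase",
    "loan": "loan",
}


def acquisition_methods(raw):
    """Map one raw method string to a sorted list of coarse buckets."""
    buckets = set()
    for part in raw.split(";"):
        term = part.strip().casefold()
        if term:
            buckets.add(BUCKETS.get(term, "other"))
    return sorted(buckets)


for raw in ["gift", "transfer; gift", "purchase; transfer",
            "Acceptance in Lieu of Tax"]:
    print(raw, "->", acquisition_methods(raw))
```

Compound values such as "transfer; gift" keep both buckets rather than forcing a single answer.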

Lessons from politics

This site uses publicly-available House of Commons data. Would someone create a MuseumsCollectForYou.com?

Issues

• All objects are counted equally.

• How can we add photographs?

• Should we incorporate user interactions and annotations?

Where next?

• Prototype site online: http://www.museum-collections.org

• Get data from more museums?

• I’m happy to share the data

• Whose role is it to do this stuff?

• Would it also work for private collections (eBay addicts)?

Thanks! (presentation originally given at Museums and the Web 2008, Montreal)

See also my written paper: http://www.archimuse.com/mw2008/papers/roberto/roberto.html
