how is data made? from dataset literacy to data infrastructure literacy

Post on 07-Aug-2015


How is Data Made? From Dataset Literacy to Data Infrastructure Literacy

30th June, Web Science 2015, University of Oxford

Jonathan Gray | jonathangray.org | @jwyg

Some thoughts on data literacy beyond the dataset.

Not just reading and using datasets.

Thinking critically and constructively about their contexts of production.

Bigger picture: role of data in society.

Not just literacies to read and use datasets, but literacies to read and shape data infrastructures.

What is data literacy?

What is data?

A metaphor.

Data and photography.

Jonathan Gray (2012) “What Data Can and Cannot Do”. The Guardian. Available at: http://www.theguardian.com/news/datablog/2012/may/31/data-journalism-focused-critical


Early optimism about veracity and fidelity of photography.

“We cannot conceive of a more impartial and truthful witness than the sun, as its light stamps and seals the similitude of the wound on the photograph put before the jury; it would be more accurate than the memory of witnesses, and as the object of all evidence is to show truth, why should not this dumb witness show it?”

– Franklin v. State of Georgia, 69 Ga. 36; 1882 Ga

Critical literacy around photography.

Critical literacy to read images:

• How is the camera set up to take shots?
• What is captured and how?
• What is not captured?
• How does equipment mediate the image?
• Selection, framing, arrangement, post-production?

Instead of the camera, the elaborate sprawl of public information systems.

Data infrastructures as socio-technical systems.

What do they measure or capture, and how?

But datasets are not photographs.

Specificities of data infrastructures.

Datasets are heterogeneous.

Datasets are generated by a mixture of social and technical processes, including e.g.:

• Laws and policies
• Administrative protocols
• Registration procedures
• Instruments and equipment
• Software systems
• Financial audits
• Feedback systems
• Management systems
• Metadata from digital services
• Standards bodies/standardisation procedures

Data literacy is not just about knowing how to use data analysis software or understanding statistics.

But also understanding the methods, rationales, assumptions, definitions, technologies and institutions through which datasets were generated.

Democratising the data revolution.

Not just liberalising access to the informational by-products of public institutions.

But also bringing data infrastructures back into realm of democratic political life.

Recent examples.

1. Beneficial ownership advocacy.
2. “Statactivism” and counting the uncounted.


Gray, J. & Davies, T. (2015) “Fighting Phantom Firms in the UK: From Opening Up Datasets to Reshaping Data Infrastructures?”. Available at SSRN: http://ssrn.com/abstract=2610937

In case of campaigning around company ownership, the disclosure of existing datasets was not enough.

Civil society organisations had to undertake a more creative, sustained and holistic engagement with shaping and influencing the development of data infrastructures as socio-technical systems.

This included research and advocacy around:

• Costs, functionalities and user interfaces of software systems that would run the register;
• Changes to primary and secondary legislation;
• Additional administrative requirements and their impacts on different actors inside and outside the public sector.

Campaigners had to look beyond the question of what information is released, towards the question of what information is collected and generated by the public sector in the first place, and how this information is generated through data infrastructures.


“Statactivism”

Bruno, I., Didier, E. and Vitale, T. (2014) “Statactivism: Forms of Action between Disclosure and Affirmation”. Available at SSRN: http://ssrn.com/abstract=2466882

Not just blanket critique or withdrawal of quantification and “metrification”.

Highlighting limitations of existing forms of measurement and proposing alternatives.

For example, gender equality, climate change, working conditions and health.

What should be measured and how?

What is not currently being measured?

Recent examples from data journalism.

New “action repertoires” for civil society actors to shape data infrastructures.

To what extent do data infrastructures address needs and interests of civil society actors?

How to broaden the publics that shape data as well as the publics that use it?

Legal, social and technical measures for making open data initiatives more responsive to concerns of civil society?

ROUTE TO PA: http://routetopa.eu

DEMOCRATISING PUBLIC INFORMATION: FROM OPENING UP DATASETS TO RESHAPING DATA INFRASTRUCTURES?

JONATHAN GRAY

JULY 2015

Question of what is measured and how.

But also who uses information, and how information acts.

From “information as resource” to “information as agent”.

(Sandra Braman, Change of State, MIT Press, 2009)

“Participatory data infrastructures”

In conclusion…

Going beyond focus on literacy with datasets, towards literacy with data infrastructures through which they are generated.

Role of data infrastructures in addressing global challenges - from climate change to tax base erosion.

Data infrastructures as crucial part of democratic politics in 21st century.

Jonathan Gray | jonathangray.org | @jwyg
