Schemas for the Real World [Madison RubyConf 2013]


Upload: carina-c-zona

Post on 18-Dec-2014


DESCRIPTION

Social app development challenges us to code for users' personal world. Users are pushing back against ill-fitted assumptions about their identity, including name, gender, sexual orientation, important relationships, and other attributes they value. How can we balance users' realities with an app's business requirements? Facebook, Google+, and others are grappling with these questions. Resilient approaches arise from an app's own foundation. Discover schemas' influence over codebase, UX, and development itself. Learn how we can use schemas to both inspire users and generate data we need as developers.

META
Where: Madison Ruby Conference 2013 (Madison, Wisconsin, USA)
Date: August 23, 2013
Video: http://www.confreaks.com/videos/2627-madisonruby2013-schemas-for-the-real-world

TRANSCRIPT

Schemas for the Real World

Carina C. Zona, @cczona

Well, hello. [As Jim noted,] my name is Carina C. Zona. You can find me all over the internet as 'cczona'.

I'm a developer. I'm also a sex educator. I volunteer for an organization called San Francisco Sex Information, or SFSI. SFSI has been operating a phone hotline for over 40 years. Our mission is to answer any question: accurately, confidentially, and without judgement. You'd think that people would ask all sorts of questions. And they do. But the majority of questions boil down to something very simple and universal: Am I Normal? Do I fit into the world?

So, here I have these two roles: developer and sex educator. They may seem completely different to you. Not to me. I think a lot about how they overlap.

-- xkcd 940

And they do. Quite a lot.

"Imagine walking through the world knowing that everyone's first assumptions about how you see yourself, who you love, and what feels right for you are completely wrong. Now imagine signing up for a cool website and then being required to select an option from a drop-down menu that doesn't include anything that represents you.

You'll feel defeated. You'll want to argue that whatever they think they're learning from that drop-down menu, it's not really true. You'll want to tell them that they're adding to your humiliation by making you do this. You'll want to tell them that they're missing a huge part of you..."

-- Sarah Dopp

Users are giving pushback to assumptions that leave them out. Social apps in particular are being pressed to adjust. Facebook, Google, and others have been dealing with these questions for years, and are still working it out. So if you feel out of your depth, you're not the only one.

Normalization

When developers talk about normalization, we're talking about databases. About computer science. But when dealing with human attributes, we're inherently dealing in the social sciences too.

Construction of social norms

Sociological normalization: [READ]

Conflating those works fine if you only want users from among the select few who belong to the idealized norm. But most of us are going for a broad userbase.

Database Normalization

Mirror real-world concepts and their interrelationships

What we lose track of is one of the core tenets of database normalization. We're supposed to [READ]. When the database is in tension with people's own real world -- then it's not people who need to be flexible. It's us.

Why IS this hard? Because at first glance this stuff looks easy. It's just forms. We've done those a million times.

But we're getting tripped up by some flawed premises.

"Gender is one of those things everyone thinks they understand, but most people don't. Like Inception."

-- Sam Killermann

First, a premise that deeply personal stuff about humans can be reduced to lists.

"'Hey, this is just a system I can figure out easily' is also a problem among engineers first diving into the stock market."

-- xkcd 592

Second, assumptions that canonical lists for these exist. Or at least SURELY must be creatable.

And the third problem is our faith that the first two problems here can be easily solved. Just add more list items. [SCROLL for a while] Easy to wind up looking foolish. Without even having solved the problem.

Social Networks

What IS a social network? What is it in purely human terms?

Social

It involves trust. Individuals revealing WHO THEY ARE, WHAT THEY VALUE, and WHO THEY CARE about. It's as personal as we get. This is the REAL LIFE that our apps are meant to replicate and build upon.

What relationships are we fostering between person and app? What relationships are we accidentally inhibiting or denying?

In real life, "I know your personhood better than you" sounds presumptuous.

In real life, "Your existence isn't possible" sounds clueless.

In real life, "Who you are is invalid" sounds arrogant.

Data modeling is psychology, and it is philosophy. It reflects individuals' beliefs about reality rather than reflecting reality itself. It rejects reality's complexity and richness. That's NOT what we set out to do. It's not the COMMUNITY we seek to build.

[READ]

"Be conservative in what you do, be liberal in what you accept from others."

-- Postel's Law

Postel's Law reminds us to [READ]. ARE we taking to heart that wisdom?

[PAUSE FOR END OF SECTION]

Schemas, plural

There are two kinds of schemas.

Mental Schema

• Pre-conceived ideas
• Framework for representing some aspect of the world
• System of organizing & perceiving new information

Mental schemas are [READ].

Database Schema

• Structure described in the database's language
• Blueprint for database construction
• Describes how the real world is being modeled

Database schemas are closely related to that. A database schema is a mental schema translated into blueprints for a database.

These are simply frontend manifestations of various individuals' MENTAL schemas. And [SLOW DOWN] when we look at them closely, we can see [BRIEF PAUSE] that schemas are the foundation [BRIEF PAUSE] for expressing

deeply intimate

things that are [READ]

important relationships

Schemas are the foundation for expressing things [READ]

self-image

Schemas are the foundation for expressing things [READ]

Schemas define the user experience

Schemas define user experience.

Our schemas are leaving people behind

& UX

[READ] We can fix that.

What benefit will the user notice?

By asking ourselves, "What benefit will the USER NOTICE?" THIS is not equivalent to "How will the user benefit?" Because THAT is a question which grants us too much latitude. To ASSUME that what we want is of course to their benefit, because it's gonna help us deliver a product that's Awesome.

[READ AGAIN] focuses us on EXPERIENCES.

Evaluating from user perspective gives us focus

There isn't ONE WAY. There isn't ONE RIGHT ANSWER. Because the NATURE of our dilemma is that users vary. And so do their social worlds. So we always need to understand varying approaches. How they fit different contexts. What tradeoffs they bring. AND the obvious, boring question: which options best serve THIS app's business requirements?

[Slide diagram: form elements (Checkbox, Radio button, Select menu, Ranges; Text input, Textarea) set against response styles (Coerced, Corrective, Discretionary, Guided)]

Checkboxes, radio buttons, select menus, ranges: these each imply that EVERY possible value can be -- and is -- included. If any value isn't included there, it's the real world being rejected because it didn't match up with our mental schema.

Checking a box is a one-step action. Entering a text string is not. So a freeform solution, such as a textarea or text input, isn't automatically exciting. BUT it can bring visible, meaningful benefit.

11 years

At MetaFilter, gender has been a text field for over a decade.

"...the early crowd at MeFi were often programmers and they hated the idea of dirty data collection..."

-- Matt Haughey, founder

Initially some SHUDDERED at the thought. Because [READ].

All girl
XY
Fella
Queer
Fembot
Alto
It depends
MYOB
Dangly bits
Chicklet
Innie, not outie
Convex
Sideburns
Ambisextrous
Member of the patriarchy

For about 5 seconds. Then they jumped on board. Because they could be creative and silly.

I speak using the male gender when required by language
50% quintessential tomboy, 50% total girly-girl

AND because they could express this THING about themselves fully. With authentic voice.

Transgender
Genderqueer

[PAUSE]

"The freeform gender field is something I cherish about MetaFilter."

-- MeFi user

That text field grew into a beloved institution. Whatever you, as user, choose to put into that field says something revealing about who YOU are.

"It was one of the earliest indications I'd landed in the right place."

-- MeFi user

That you're allowed to put in ANYTHING -- or put in NOTHING AT ALL -- says something revealing about how MetaFilter envisions community. That schema's trust in users was the foundation for MetaFilter users to ASK to share themselves even more.

In 2010, Diaspora likewise turned gender into a text field. Just like on MetaFilter, users felt set free by it.

There's an important distinction here, though. MetaFilter is closed source, it's one developer, and it's entirely in English. Diaspora needed to meet a different set of needs. It's open source, and it's international.

Not amused

So some of its developers are not amused. It's fair to object that this approach wreaks chaos on internationalization of pronouns. [BEAT] And here's what I've got to say to that:

Internationalization is hellish, period.

Internationalization of pronouns is hellish, period. You get no safety net in this. Constraining gender options doesn't solve translation of languages. Don't set yourself up for trying to deal with internationalization this way. It will fail you.

What's your legal gender?

• Indeterminate
• X
• Sex Not Specified

And if you really think that boiling this question down to "What's your legal gender?" is an easy out, let's talk about internationalization of that too. Because this is the year that "indeterminate" has become a legal gender on German birth certificates, and that "X" is a legal gender on Australian passports. In New South Wales, Australia, "Sex Not Specified" is yet another legally-recognized gender. Binary gender is, today, APP FAIL.

"...the most complicated thing I've ever spent a lot of time learning about. And I've spent a lot of time learning about quantum mechanics."

-- Randall Munroe, xkcd

Randall Munroe, best known for xkcd, has examined the issue of pronouns many times, in depth, for English-language projects. He says it's [READ]. So here's what he ultimately concluded:

Ask

Ask.

Which pronouns do you prefer?
he / him / himself / his / his
she / her / herself / hers / her
they / them / themself / theirs / their
it / it / itself / its / its
by name

Asking straight up "Which pronouns do you prefer?" is truly the best he could come up with.

Which pronouns do you prefer?
he / him / himself / his / his
she / her / herself / hers / her
they / them / themself / theirs / their
personal name
other: ____________

Or maybe slightly refining, like these.

And I know you may be looking at that 3rd row and protesting.

Our English teachers told us "they" and "their" aren't singular, right? Nope. They're plural AND SINGULAR, and have been for 400 years. Take the word of Jane Austen, Shakespeare, Chaucer, Lewis Carroll. Modern English authorities agree on this one too. It's excellent English -- and excellent SOCIAL -- to use it when people ask us to.
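As a rough illustration of how the pronoun question above could be carried into code, here is a minimal Ruby sketch. Everything named here (`PronounSet`, `PRONOUN_SETS`, `pronouns_for`, `notify_line`) is hypothetical and not from the talk; the three built-in sets simply mirror the rows on the slide, with "personal name" and "other" handled as fallbacks.

```ruby
# Hypothetical sketch: pronoun sets as data, mirroring the slide's rows.
# Fields: subject, object, reflexive, possessive pronoun, possessive determiner.
PronounSet = Struct.new(:subject, :object, :reflexive, :possessive, :determiner)

PRONOUN_SETS = {
  "he"   => PronounSet.new("he", "him", "himself", "his", "his"),
  "she"  => PronounSet.new("she", "her", "herself", "hers", "her"),
  "they" => PronounSet.new("they", "them", "themself", "theirs", "their")
}.freeze

# "Personal name" option: use the person's name in place of every pronoun.
def name_as_pronouns(name)
  PronounSet.new(name, name, name, "#{name}'s", "#{name}'s")
end

# Resolve a stored preference.
# - custom: a user-supplied PronounSet (the "other ____" option)
# - preference: one of the PRONOUN_SETS keys, or nil if the user did not answer
# When nothing usable is stored, fall back to the name rather than guessing.
def pronouns_for(name:, preference: nil, custom: nil)
  return custom if custom
  PRONOUN_SETS.fetch(preference) { name_as_pronouns(name) }
end

# Example of a sentence template that works for any set.
def notify_line(pronouns)
  "Tell #{pronouns.object} that #{pronouns.determiner} photo was posted."
end

puts notify_line(pronouns_for(name: "Sam", preference: "they"))
puts notify_line(pronouns_for(name: "Sam"))
```

The design choice worth noticing is that an unanswered question never triggers a guess: the code falls back to the person's name instead of assuming a set.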

Even Facebook has taken note that it's okay. They've been using this construction for at least 5 years and are doing fine :-)

"Home is the place where, when you have to go there, they have to take you in."

-- Robert Frost

We aspire to make our social apps feel like home. We can do that by making development choices that say, "Hey user, we GET you. Be as individual and messy as you want to. We can handle _you_ being _you_."

Social Research

When collecting data on people, we're in a different realm: social sciences. If you want to run useful analytics about personal attributes and behavior, then data collection needs to meet at least the two minimum criteria for social science research.

Exhaustive: every possible option

A field's values must include every possible option.

Mutually Exclusive: no overlap exists between them

And the field's options must be mutually exclusive.
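The two criteria above can be checked mechanically against sample respondents. The following is a minimal Ruby sketch under stated assumptions: the option labels, the rules attached to them, and the sample people are all invented for illustration, and `audit_options` is a hypothetical helper, not something from the talk.

```ruby
# Hypothetical sketch: audit a set of form options against sample respondents.
# Each option is a label plus a rule saying whether it applies to a person.
OPTIONS = {
  "Single"            => ->(person) { person[:partners].zero? },
  "In a relationship" => ->(person) { person[:partners] == 1 },
  "Married"           => ->(person) { person[:legally_married] }
}.freeze

# Invented sample data.
RESPONDENTS = [
  { name: "A", partners: 0, legally_married: false },
  { name: "B", partners: 1, legally_married: true },
  { name: "C", partners: 2, legally_married: false },
  { name: "D", partners: 0, legally_married: true } # separated, still legally married
].freeze

# Returns the people no option covers (the list is not exhaustive)
# and the people more than one option covers (not mutually exclusive).
def audit_options(options, respondents)
  uncovered = []
  overlapping = []
  respondents.each do |person|
    matches = options.select { |_label, rule| rule.call(person) }.keys
    uncovered << person if matches.empty?
    overlapping << person if matches.size > 1
  end
  { uncovered: uncovered, overlapping: overlapping }
end

REPORT = audit_options(OPTIONS, RESPONDENTS)
puts "Not covered by any option: #{REPORT[:uncovered].map { |p| p[:name] }.join(', ')}"
puts "Covered by several options: #{REPORT[:overlapping].map { |p| p[:name] }.join(', ')}"
```

Even this tiny invented sample fails both criteria, which is the point the talk goes on to make about real option lists.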

How many social apps have both of those bare minimum criteria covered?

We've seen this one enough. At least that set of values seems covered fully.

But it's not.

Still lacking.

Trying hard. But still inadequate to express the real world's complexity.

We do realize that the real world has great variety. Of course we do. But it's hard to know how to handle it.

What user experience does this schema drive us toward?

It's easy to figure, "Eh, we can refactor later." But initial choices set differing user experiences in motion. Thinking at the outset about the real world's variety and complexity starts us asking early questions that set the foundation for social app-building. Ask: "What user experience does this schema drive us toward?"

Data doesn't have to be for analysis

It's easy to get into the habit of structuring data for easy analysis. But we can choose to look at human data from a different perspective. Step back. Wallow in the user's perspective.

Data can be sheer expressiveness

Data can be sheer expressiveness. Data that has character, individualism, distinctiveness.

What we want:
• Structured
• Predictable
• Relational
• Indexed
• Exhaustive
• Easy analytics
• Data-driven decisions

What we get:
• Validations, exceptions
• Conditionals, partials
• Premature optimization
• Cultural variability
• Individual POV
• Moving target
• Decisions based on false premises

As developers, we have a vision of what a good codebase should be, and not be.

[READ]

In the most sensible of ways, we often arrive at solutions that are factually TRUTHY while far removed from real-life utility.

What is your religion, if any?

ARIS is the largest ongoing survey of Americans' religious identification. It asks this simple OPEN-ENDED question: [slide]

Which nets over 100 unique answers. Which, if you're making a form based on that, is tricky. There's no form element that makes it easy for our users to pick themselves out of a list of so many possibilities.

ARIS found, though, that these could be compressed into 13 major categories. More manageable list, right? We could use that for a form. But, eh, a lot of those are edge cases. We'd rather focus on genuinely major groupings.

What is your religion, if any?
Christian: 76%
Other: 4%
None: 15%
Don't Know or Refused: 5%

Which brings us down to this. At least three quarters of Americans are Christian. Done. Every other religion would just be clutter. Edge cases.

And then there's this sort of crummy data with it. We'd probably assign a nil value for more than one of these categories, right?

What is your religion?
Christian: 76%
Other: 4%
n/a: 20%

Fixed. Which focuses attention on a problem here. 1 in 5 are not useful answers from an advertiser's perspective, or for our own analytics. The NILS -- gotta go.

What is your religion?
Christian: 76%
Other: 24%

So what we're left with is a good, clear list. It covers ALL the big stuff. When you get reductive enough for Americans' religion, it's a binary. Which, from a storage standpoint, is great. Booleans. Score!

Religion

But we don't do THIS. It covers the biggest categories. But oh, how it leaves people out.

People aren't edge cases

People aren't edge cases.

And they're pushing back on apps that treat them that way.

[LONG BEAT. LONG.] [BREATH] [CHANGE GEARS]

So. Reductive has big problems. So does scaling upward.

Balancing between approaches

As engineers, it's instinctually uncomfortable to DELIBERATELY NOT STRUCTURE DATA for easy analysis. I feel ya. I really do. This freaks me out too. But again, the foundational question is "What benefit will the user notice?" What identities are okay with us?

And if necessary, we CAN strike a middle ground. This is where guided response comes in. Autosuggest.

[Slide diagram: Checkbox, Radio, Select; Textarea, Text; Required, Optional; Autosuggest; Minimal Suggest]

When there's a subset of values that you're most interested in, do minimal suggest instead. Autosuggest using just the handful of values you care about.

Structured data from those who want to give it. Free form to incite expressiveness in those who want THAT.
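A minimal Ruby sketch of "minimal suggest" might look like the following. The suggestion list and the helper names `suggest` and `accept` are assumptions made for illustration; the only idea taken from the talk is that a handful of values is offered while any free-form answer, or no answer, is still accepted.

```ruby
# Hypothetical sketch: suggest from a small list, but never restrict the answer.
SUGGESTED = ["female", "male"].freeze # the handful of values the app cares about

# Suggestions for what the user has typed so far (case-insensitive prefix match).
def suggest(typed, suggestions = SUGGESTED)
  prefix = typed.to_s.strip.downcase
  return [] if prefix.empty?
  suggestions.select { |candidate| candidate.start_with?(prefix) }
end

# What gets stored: the user's own text, untouched apart from trimming.
# A blank answer is stored as nil, meaning "chose not to say".
def accept(submitted)
  text = submitted.to_s.strip
  text.empty? ? nil : text
end

p suggest("f")            # offers the matching suggested value
p suggest("gen")          # nothing to suggest, and that is fine
p accept("genderqueer")   # stored exactly as typed
p accept("   ")           # stored as nil
```

Suggestions only ever shorten typing for those who want the common values; they never override or reject what the user submits.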

Unguided Text

Of those who use MetaFilter's gender field, 40% of responses are "f", "m", "female", "male".

[READ] So structured data IS there. This can be a balanced solution in many cases where you're willing to tolerate some ambiguity. Of course there are tradeoffs. Data quantity is lower. Freed to opt out of providing personal info, many do. On the other hand, data quality should improve.

Optional Select

60% of Facebook users select a relationship status.

It's fine to mix and match here. Find the right approach for your users and your app's business objectives. Facebook, for instance, makes relationship status completely optional, but coercive for those who do opt in to setting a value. Most users do opt in: 60% of them select a relationship status.

We want everyone to feel excited about what we've built

The bottom line is that [READ].

We want users to feel passionate about their involvement

[READ]

Analytics, investments & monetization are based on a premise that data is accurate

[READ]

BUT if the data has been collected by coercive approaches, the RISK is that --

this premise is garbage. Some people are lying, because lying has been made a requirement for getting past the barriers. So conclusions drawn from that bad data can misdirect decision-making about the next stage of development.

[Slide diagram: Checkbox, Radio, Select; Textarea, Text; Required; Corrective; Discretionary; Guided]

The restrictive options, that stuff at bottom left: those don't actually have to be marked required.

ADD COLUMN gender VARCHAR(6) NOT NULL DEFAULT 'female'

But the way we set up schema often embeds assumptions that we should, and we will. So we do. An attribute that is ambiguously named is destined to become a form field whose data wanders away from its original intent, and code that misunderstands how to use it. A field that's not allowed to be null is destined to be mandatory. A field that is assigned a maximum length is an ASSERTION that all possible values are knowable and fit within it.

ADD COLUMN gender_identity VARCHAR NOT NULL

BOOM. This is the foundation for a whole different user experience. And the cool ninja move was that we decided to DO LESS. Make this stuff flexible upfront. Optimize storage later. Decide what's valid later.
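To make the point about embedded assumptions concrete, here is a small Ruby sketch. `Column` is a toy stand-in invented for this example, not a real database or Rails API. Treating the flexible column as nullable is my assumption, following the remark that a non-nullable field is destined to be mandatory.

```ruby
# Hypothetical sketch: a toy column definition and the values it would admit.
Column = Struct.new(:name, :null, :limit) do
  # Would a value get past this column's constraints?
  def accepts?(value)
    return null if value.nil?              # NULL is allowed only if the column says so
    limit.nil? || value.length <= limit    # a max length rejects anything longer
  end
end

# Mirrors the first slide: required, six characters at most.
RIGID = Column.new("gender", false, 6)

# Assumed flexible variant: optional, no length limit.
FLEXIBLE = Column.new("gender_identity", true, nil)

answers = ["female", "genderqueer", "50% quintessential tomboy, 50% total girly-girl", nil]

answers.each do |answer|
  puts "#{answer.inspect}: rigid=#{RIGID.accepts?(answer)} flexible=#{FLEXIBLE.accepts?(answer)}"
end
```

The rigid definition turns away every answer except the one its author already had in mind, including the choice not to answer at all.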

[Slide diagram: Checkbox, Radio, Select; Textarea, Text; Required, Optional; Alteration; Guided]

So there you have it: a discretionary field. Whether to respond is left up to the user.

t.string :relationship_status, :null => true

As developers, we may look upon THIS expression as redundant, completely unnecessary. Duh, null is true by default. But making that explicit is a communication to the team, and to your future self. It's a statement of intent. It's documenting a product decision.

Facebook: Single, In a relationship, Engaged, Married, It's complicated, Open relationship

What would a canonical set of relationship statuses look like? Three years ago, Facebook figured this list was pretty good. Arguably pretty progressive too, right? Users disagreed. Strongly.

Facebook: Single, In a relationship, Engaged, Married, It's complicated, Open relationship, Widowed, Separated, Divorced, Civil union, Domestic partnership

Under pressure, Facebook nearly doubled the options in just two years' time.

Google+: Single, In a relationship, Engaged, Married, It's complicated, Open relationship, Widowed, Separated, Divorced, Civil union, Domestic partnership, I don't want to say

While Google+ has largely adopted that list, it has not included Separated or Divorced. Notice that they also added something choice: opting out of labeling.

Allowing users to identify their relationships with labels of greater personal significance: that's being driven by people rejecting a user experience that isn't working for them.

It's a scope

How did some statuses seem universal, while others weren't? Naming a thing creates scope. The assumed validity of a field's values gets constrained as soon as the field is named.

marital_status: Single, In a relationship, Engaged, Married, It's complicated, Open relationship, Widowed, Separated, Divorced, Civil union, Domestic partnership, I don't want to say

Marital status, for instance, might lead to a list similar to this one. A person is assumed to be either unmarried, preparing to be married, currently married, or formerly married.

legal_marital_status: Single, In a relationship, Engaged, Married, It's complicated, Open relationship, Widowed, Separated, Divorced, Civil union, Domestic partnership, I don't want to say

relationship_status: Single, In a relationship, Engaged, Married, It's complicated, Open relationship, Widowed, Separated, Divorced, Civil union, Domestic partnership, I don't want to say

Whereas relationship status might lead to a list more like this one. In which one either has a current relationship -- or is defined by the absence of any.

singleness_status

As soon as you name a field, you define its paradigm and possibilities. There are important differences in what each of these envisions and is able to measure. Naming fields -- with great specificity upfront -- makes analyses more powerful later.

singleness_rating

No chance whatsoever ... Suuuuuper duper single

1000

If you change the name, you shift the paradigm and possibilities. There's an important difference in what these set out to measure. Naming fields -- with great specificity upfront -- makes analyses more powerful later.

It's Complicated, In a Relationship, Married, Divorced, Widowed, Single

We go through life experiencing many relationships. They don't all have pre-determined labels. And new relationship identities don't necessarily leave old ones behind.

"I like to be truthful, and 'It's Complicated' is really deceiving. It is not complicated. I am separated from my husband, who I am still legally married to."

-- Facebook user

[READ] Why should anyone have to feel like they're required to disavow a relationship that's meaningful for them?

It cuts deeply sometimes. Is a widow only "Widowed" until she resumes dating? Why does she have to be confronted with a decision like that just to use someone's app?

Okay, this one. Spot the fatal flaw. [WAIT]

buffer overflow

An open relationship is definitionally a 1:Many join -- WITH the usual engineering understanding, that is: the actual number of relationships may be 0, 1, OR many.

Choose (one): evasive, inauthentic

But it's not funny from the personal perspective. This schema forces the person to either LOOK EVASIVE or else BE INAUTHENTIC. This is what happens when we try to throw more labels at the problem instead of re-examining the schema's assumptions.
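One way to picture "re-examining the schema's assumptions" is to model relationships as a collection rather than a single status. The Ruby sketch below is an assumption-laden illustration: `Person`, `Relationship`, and the sample names and labels are all invented, and free-text labels are just one possible choice.

```ruby
# Hypothetical sketch: a person has zero, one, or many relationships,
# each carrying a label the user chooses for themselves.
Relationship = Struct.new(:with, :label)

class Person
  attr_reader :name, :relationships

  def initialize(name)
    @name = name
    @relationships = [] # starts empty; nothing is assumed about the person
  end

  # Adding a relationship never replaces or invalidates an existing one.
  def add_relationship(with, label)
    @relationships << Relationship.new(with, label)
    self
  end

  def relationship_labels
    @relationships.map(&:label)
  end
end

robin = Person.new("Robin")
robin.add_relationship("Alex", "separated, still legally married")
robin.add_relationship("Jo", "dating")

puts robin.relationship_labels.inspect
```

With a collection, someone can be separated from one person and dating another without having to pick which fact to erase.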

Facebook relationship status by user age

srsly?

When we don't give people means to express what's real and important for them, they hack around us. Are a quarter of Facebook's 13-year-olds really married? Probably not. What they probably have is a BEST FRIEND, and "married" is the nearest equivalent available for them to express "THIS PERSON is my most important relationship."

The dogged pursuit of nice, clean, CRUNCHABLE data takes us to such wrong places. And makes our users want to work against a system that doesn't acknowledge their lives' reality.

We build or break community with each line of code

As developers, our lines of code build -- or break -- an app's community. So it's worth making conscious choices as we go.

Modeling the real world is complex

[READ] And that's okay.

Assuming we know who users are surrenders opportunity to learn who they are

But [READ]. Early constraints in schema NET crappy, misleading data. [BEAT] So keep constraints out of human and social schema, at least at first. Gather enough initial response to do some data mining. Watch that data for a while. Keep an eye out for emergent trends. What DISCOVERIES can you make? How can they shape the app's growth and evolution?
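As a rough sketch of what "watching the data for emergent trends" could look like, the Ruby below counts free-text answers and surfaces recurring values that are not already on a known list. The helper names, the sample responses, and the threshold are all assumptions for illustration.

```ruby
# Hypothetical sketch: find recurring free-text answers we had not anticipated.

# Light normalization only: trim, lowercase, collapse internal whitespace.
def normalize(answer)
  answer.to_s.strip.downcase.gsub(/\s+/, " ")
end

# Returns [value, count] pairs for answers seen at least min_count times
# that are not in the known list, most frequent first.
def emergent_values(responses, known:, min_count: 2)
  counts = Hash.new(0)
  responses.each do |response|
    value = normalize(response)
    counts[value] += 1 unless value.empty?
  end
  counts.reject { |value, count| known.include?(value) || count < min_count }
        .sort_by { |value, count| [-count, value] }
end

# Invented sample of raw responses, including blanks and a skipped answer.
RESPONSES = ["Queer", "queer ", "F", "m", "It depends", "queer", "Alto", "it  depends", nil, ""].freeze
KNOWN = ["f", "m", "female", "male"].freeze

emergent_values(RESPONSES, known: KNOWN).each do |value, count|
  puts "#{value}: #{count}"
end
```

Nothing here changes what users entered; the raw answers stay as they are, and the tally is only a lens for deciding what the app might support next.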

Quality

Data quality improves when lies are merely optional, not required.

Specificity

Data becomes like the real world itself: rich and specific. So we can unearth the patterns that are undetectable when data is generic.

Adaptability

We get to make discoveries and re-envision possibilities. We adapt quickly.

Loyalty

And their response to all this? Engagement. Passion. LOYALTY.

THOSE are foundations for great user experiences. [LOOK UP. FINALITY.]

@cczona, cczona@gmail.com, http://cczona.com

Questions? Conversation?

Oh Hell Yeah

Carina C. Zona

Resources

Data Science & Information Architecture
• Sociological normalization
• Database normalization
• Using Machine Learning On Social Networks To Figure Out What You Should Read On The Web
• NoSQL Data Modeling Techniques
• Data & Reality, by William Kent
• Bad Data Handbook, Q. Ethan McCallum
• Data Science of the Facebook World

Relationships
• Does Facebook Hurt Relationships?
• Facebook Adds LGBT-Friendly Relationship Status Options
• Facebook Targeting by Relationship Status & Workplace
• Your Facebook Relationship Status: It's Complicated
• Gay Marriage: The Database Engineering Perspective

Sex & Genders
• "Disalienation: Why Gender is a Text Field on Diaspora"
• "Gender & Drop Down Menus"
• "Sex & Gender"
• "Bucket Gender"
• Recommendations for Inclusive Data Collection of Trans People

Names
• Falsehoods Programmers Believe About Names
• Your Last Name Contains Invalid Characters
• W3C Internationalization: Personal Names Around the World
• Spanish Names
• Chinese Names

More
• Redesigning the Country Selector
• American Religious Identification Survey: Summary Report 2009
• Linguistic Potluck: Crowdsourcing Internationalization in Rails

Image Credits

Postsecret, Facebook, OKCupid, Google+, MetaFilter, Diaspora, Flickr, FetLife, Kotangle, cutestpaw.com, hdwallpapers.in, xkcd

Many Thanks

Chiu-Ki Chan
Estelle Weyl
Heather Rivers
Heroku
Jeremy Dunck
Josh Susser
Michele Titolo
NIRD LLC
Renée De Voursney
Sarah Mei
San Francisco Sex Information
Yoz Grahame

Get in Touch

Carina C. Zona, @cczona
http://cczona.com
cczona@gmail.com
http://slideshare.net/cczona
http://linkedin.com/in/cczona

Page 2: Schemas for the Real World [Madison RubyConf 2013]

cczona

mdashxkcd 940

And they do Quite a lot

cczona

Imagine walking through the world knowing that everyonersquos first assumptions about how you see yourself who you love and what feels right for you are completely wrong Now imagine signing up for a cool website and then being required to select an option from a drop-down menu that doesnrsquot include anything that represents you

Yoursquoll feel defeated Yoursquoll want to argue that whatever they think theyrsquore learning from that drop-down menu itrsquos not really true Yoursquoll want to tell them that theyrsquore adding to your humiliation by making you do this Yoursquoll want to tell them that theyrsquore missing a huge part of youhellip

mdashSarah Dopp

Users are giving pushback to assumptions that leave them out Social apps in particular are being pressed to adjust Facebook Google and others have been dealing with these questions for years and are still working it out So if you feel out of depth youre not the only one

Normalization

When developers talk about normalization were talking about databases About computer science But when dealing with human attributes were inherently dealing in the social sciences too

cczona

Construction of social norms

Sociological normalization [READ]

Conflating those works fine if you only want users from among the select few who belong to the idealized norm But most of us are going for broad userbase

cczona

Database Normalization

Mirror real-world concepts and their interrelationships

What we lose track of is one of the core tenets of database normalization Were supposed to [read] When the database is in tension with peoples own real world -- then its not people who need to be flexible Its us

Why IS this hard Because at first glance this stuff looks easy Its just forms Weve done those a million times

cczona

But were getting tripped up by some flawed premises

cczona

Gender is one of those things everyone thinks they understand but most people dont Like Inception

mdash Sam Killerman

First a premise that deeply personal stuff about humans can be reduced to lists

cczona

Hey this is just a system I can figure out easily is also a problem among engineers first diving into the stock market

mdashxkcd 592

Second assumptions that canonical lists for these exist Or are at least SURELY must be createable

cczona

And the third problem is our faith that the first two problems here can be easily solved Just add more list items[SCROLL for a while] Easy to wind up looking foolish Without even having solved the problem

Social Networks

What IS a social network What is it in purely human terms

Social

It involves trust Individuals revealing WHO THEY ARE WHAT THEY VALUE and WHO THEY CARE about Its as personal as we get This is the REAL LIFE that our apps are meant to replicate and build upon

What relationships are we fostering between person and app What relationships are we accidentally inhibiting or denying

cczona

In real life

I know your personhood better than you

sounds presumptuous

cczona

In real life

sounds presumptuous

cczona

In real life

Your existence isnt possible

sounds clueless

cczona

In real life

sounds clueless

cczona

In real life

Who you are is invalid

sounds arrogant

cczona

In real life

sounds arrogant

cczona

Data modeling is psychology and it is philosophy It reflect individuals beliefs about reality rather than reflecting reality itself It rejects realitys complexity and richness Thats NOT what we set out to do Its not the COMMUNITY we seek to build

cczona

[READ]

cczona

Be conservative in what you do be liberal in what you accept from others

-Postels Law

Postels Law reminds us to [READ]ARE we taking to heart that wisdom

[PAUSE FOR END OF SECTION]

cczona

Schemas, plural

There are two kinds of schemas.

Mental Schema

• Pre-conceived ideas
• Framework for representing some aspect of the world
• System of organizing & perceiving new information

Mental schemas are [READ].

Database Schema

• Structure described in the database's language
• Blueprint for database construction
• Describes how the real world is being modeled

Database schemas are closely related to that. A database schema is a mental schema translated into blueprints for a database.

These are simply frontend manifestations of various individuals' MENTAL schemas. And [SLOW DOWN] when we look at them closely, we can see [BRIEF PAUSE] that schemas are the foundation [BRIEF PAUSE] for expressing

deeply intimate

things that are [READ]

important relationships

Schemas are the foundation for expressing things [READ]

self-image

Schemas are the foundation for expressing things [READ]

Schemas define the user experience

Schemas define user experience.

Our schemas & UX are leaving people behind

[read] We can fix that.

What benefit will the user notice?

By asking ourselves, "What benefit will the USER NOTICE?" THIS is not equivalent to "How will the user benefit?" Because THAT is a question which grants us too much latitude to ASSUME that what we want is, of course, to their benefit, because it's gonna help us deliver a product that's Awesome.

[READ AGAIN] focuses us on EXPERIENCES.

Evaluating from user perspective gives us focus

There isn't ONE WAY. There isn't ONE RIGHT ANSWER. Because the NATURE of our dilemma is that users vary. And so do their social worlds. So we always need to understand varying approaches. How they fit different contexts. What tradeoffs they bring. AND the obvious, boring question: which options best serve THIS app's business requirements?

Checkbox, Radio button, Select menu, Ranges

Coerced, Discretionary, Guided

Corrective, Text Input, Textarea

Checkboxes, radio buttons, select menus, ranges: these each imply that EVERY possible value can be -- and is -- included. If any value isn't included there, it's the real world being rejected because it didn't match up with our mental schema.

Checkbox, Radio, Select

Coerced, Discretionary, Guided

Corrective, Text Input, Textarea

Checking a box is a one-step action. Entering a text string is not. So a freeform solution such as a textarea or text input isn't automatically exciting. BUT it can bring visible, meaningful benefit.

11 years

At MetaFilter, gender has been a text field for over a decade.

"...the early crowd at MeFi were often programmers and they hated the idea of dirty data collection..."

- Matt Haughey, founder

Initially some SHUDDERED at the thought. Because [READ].

All girl

XY

Fella

Queer

Fembot

Alto

It depends

MYOB

Dangly bits

Chicklet

Innie not outie

Convex

Sideburns

Ambisextrous

Member of the patriarchy

For about 5 seconds. Then they jumped on board. Because they could be creative and silly.

"I speak using the male gender when required by language"

"50% quintessential tomboy, 50% total girly-girl"

AND because they could express this THING about themselves fully. With authentic voice.

Transgender, Genderqueer

[PAUSE]

"The freeform gender field is something I cherish about Metafilter."

- MeFi user

That text field grew into a beloved institution. Whatever you, as user, choose to put into that field says something revealing about who YOU are.

"It was one of the earliest indications I'd landed in the right place."

- MeFi user

That you're allowed to put in ANYTHING -- or put in NOTHING AT ALL -- says something revealing about how MetaFilter envisions community. That schema's trust in users was the foundation for MetaFilter users to ASK to share themselves even more.

In 2010, Diaspora likewise turned gender into a text field. Just like on MetaFilter, users felt set free by it.

There's an important distinction here, though. MetaFilter is closed source, it's one developer, and it's entirely in English. Diaspora needed to meet a different set of needs. It's open source, and it's international.

Not amused

So some of its developers are not amused. It's fair to object that this approach wreaks chaos on internationalization of pronouns. [BEAT] And here's what I've got to say to that:

Internationalization is hellish, period

Internationalization of pronouns is hellish, period. You get no safety net in this. Constraining gender options doesn't solve translation of languages. Don't set yourself up for trying to deal with internationalization this way. It will fail you.

What's your legal gender?

• Indeterminate
• X
• Sex Not Specified

And if you really think that boiling this question down to "What's your legal gender?" is an easy out, let's talk about internationalization of that, too. Because this is the year that "Indeterminate" has become a legal gender on German birth certificates, and that "X" is a legal gender on Australian passports. In New South Wales, Australia, "Sex Not Specified" is yet another legally recognized gender. Binary gender is, today, APP FAIL.

"...the most complicated thing I've ever spent a lot of time learning about. And I've spent a lot of time learning about quantum mechanics."

- Randall Munroe, xkcd

Randall Munroe, best known as xkcd, has examined the issue of pronouns many times, in depth, for English language projects. He says it's [READ]. So here's what he ultimately concluded:

Ask

Ask.

Which pronouns do you prefer?

he / him / himself / his / his
she / her / herself / hers / her
they / them / themself / theirs / their
it / it / itself / its / its
by name

Asking straight up, "Which pronouns do you prefer?" is truly the best he could come up with.

Which pronouns do you prefer?

he / him / himself / his / his
she / her / herself / hers / her
they / them / themself / theirs / their
personal name
other ____________

Or maybe slightly refining, like these.

And I know you may be looking at that 3rd row and protesting.

Our English teachers told us "they" and "their" aren't singular, right? Nope. They're plural AND SINGULAR, and have been for 400 years. Take the word of Jane Austen, Shakespeare, Chaucer, Lewis Carroll. Modern English authorities agree on this one, too. It's excellent English -- and excellent SOCIAL -- to use it when people ask us to.

Even Facebook has taken note that it's okay. They've been using this construction for at least 5 years, and are doing fine :-)
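
One way to act on "Ask" in code: store the answer as a pronoun set and build sentences from it, rather than deriving pronouns from a gender value. This is only a sketch in plain Ruby. The names PronounSet, PRESETS, by_name, and photo_notice are hypothetical, and so is the custom set at the end. The column order follows the rows on the slide.

```ruby
# Hypothetical sketch: pronouns are stored as data the user chose.
PronounSet = Struct.new(:subject, :object, :reflexive, :possessive, :determiner)

# Presets mirror the rows on the slide; "other" is typed in by the user.
PRESETS = {
  "he"   => PronounSet.new("he", "him", "himself", "his", "his"),
  "she"  => PronounSet.new("she", "her", "herself", "hers", "her"),
  "they" => PronounSet.new("they", "them", "themself", "theirs", "their")
}.freeze

# "Personal name" option: repeat the name instead of using a pronoun.
def by_name(name)
  PronounSet.new(name, name, name, "#{name}'s", "#{name}'s")
end

# Build a notification without assuming anything about the user.
def photo_notice(name, pronouns)
  "#{name} updated #{pronouns.determiner} profile photo."
end

puts photo_notice("Sam", PRESETS["they"])    # Sam updated their profile photo.
puts photo_notice("Alex", by_name("Alex"))   # Alex updated Alex's profile photo.

# An "other ____" answer is just another set, entered by the user.
custom = PronounSet.new("ze", "hir", "hirself", "hirs", "hir")
puts photo_notice("Kit", custom)             # Kit updated hir profile photo.
```

The point of the design is that nothing downstream needs to know a gender at all; the sentence template only needs the five forms.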

"Home is the place where, when you have to go there, they have to take you in."

- Robert Frost

We aspire to make our social apps feel like home. We can do that by making development choices that say, "Hey user, we GET you. Be as individual and messy as you want to. We can handle _you_ being _you_."

Social Research

When collecting data on people, we're in a different realm: social sciences. If you want to run useful analytics about personal attributes and behavior, then data collection needs to meet at least the two minimum criteria for social science research.

Exhaustive: Every possible option

A field's values must include every possible option.

Mutually Exclusive: No overlap exists between them

And the field's options must be mutually exclusive.

How many social apps have both of those bare minimum criteria covered?
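
Both criteria can be checked mechanically against real people. The sketch below is an illustration only: the three options, their rules, and the sample people are all hypothetical, chosen to show one overlap and one gap.

```ruby
# Hypothetical sketch: test a field's options against real people.
# Each option is a rule; each person is a hash of facts about them.
OPTIONS = {
  "Single"            => ->(person) { person[:partners].zero? && !person[:legally_married] },
  "In a relationship" => ->(person) { person[:partners] == 1 },
  "Married"           => ->(person) { person[:legally_married] }
}.freeze

# Names of every option that fits this person.
def matching_options(person)
  OPTIONS.select { |_name, rule| rule.call(person) }.keys
end

# Exhaustive: every person fits at least one option.
def exhaustive?(people)
  people.all? { |person| matching_options(person).size >= 1 }
end

# Mutually exclusive: no person fits more than one option.
def mutually_exclusive?(people)
  people.all? { |person| matching_options(person).size <= 1 }
end

people = [
  { partners: 0, legally_married: false },  # fits "Single" only
  { partners: 1, legally_married: true },   # fits two options at once
  { partners: 2, legally_married: false }   # fits none of the options
]

puts exhaustive?(people)          # false
puts mutually_exclusive?(people)  # false
```

Three plausible people are enough to fail both tests, which is the problem the next slides illustrate.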

We've seen this one enough. At least that set of values seems covered fully.

But it's not.

Still lacking.

Trying hard. But still inadequate to express the real world's complexity.

We do realize that the real world has great variety. Of course we do. But it's hard to know how to handle it.

What user experience does this schema drive us toward?

It's easy to figure, "Eh, we can refactor later." But initial choices set differing user experiences in motion. Thinking at the outset about the real world's variety and complexity starts us asking early questions that set the foundation for social app-building. Ask: "What user experience does this schema drive us toward?"

Data doesn't have to be for analysis

It's easy to get into the habit of structuring data for easy analysis. But we can choose to look at human data from a different perspective. Step back. Wallow in the user's perspective.

Data can be sheer expressiveness

Data can be sheer expressiveness. Data that has character, individualism, distinctiveness.

What we want:

• Structured
• Predictable
• Relational
• Indexed
• Exhaustive
• Easy analytics
• Data-driven decisions

What we get:

• Validations, exceptions
• Conditionals, partials
• Premature optimization
• Cultural variability
• Individual POV
• Moving target
• Decisions based on false premises

As developers, we have a vision of what a good codebase should be, and not be.

[READ]

In the most sensible of ways, we often arrive at solutions that are factually TRUTHY while far removed from real-life utility.

What is your religion, if any?

ARIS is the largest ongoing survey of Americans' religious identification. It asks this simple, OPEN-ENDED question: [slide]

Which nets over 100 unique answers. Which, if you're making a form based on that, is tricky. There's no form element that makes it easy for our users to pick themselves out of a list of so many possibilities.

ARIS found, though, that these could be compressed into 13 major categories. More manageable list, right? We could use that for a form. But, eh, a lot of those are edge cases. We'd rather focus on genuinely major groupings.

What is your religion, if any?

Christian 76%
Other 4%
None 15%
Don't Know or Refused 5%

Which brings us down to this. At least three quarters of Americans are Christian. Done. Every other religion would just be clutter. Edge cases.

And then there's this sort of crummy data with it. We'd probably assign a nil value for more than one of these categories, right?

What is your religion?

Christian 76%
Other 4%
n/a 20%

Fixed. Which focuses attention on a problem here: 1 in 5 are not useful answers from an advertiser's perspective, or for our own analytics. The NILS -- gotta go.

What is your religion?

Christian 76%
Other 24%

So what we're left with is a good, clear list. It covers ALL the big stuff. When you get reductive enough, for Americans, religion is a binary. Which, from a storage standpoint, is great. Booleans! Score!
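
The arithmetic of that reduction is easy to replay. A small sketch, using only the percentages shown on the slides; the collapse helper is a hypothetical name, not anything from ARIS.

```ruby
# Percentages as shown on the slides above.
SURVEY = {
  "Christian" => 76,
  "Other" => 4,
  "None" => 15,
  "Don't Know or Refused" => 5
}.freeze

# Merge several categories into one bucket, summing their shares.
def collapse(data, into:, from:)
  kept = data.reject { |name, _share| from.include?(name) }
  kept.merge(into => from.sum { |name| data.fetch(name) })
end

# Step 1: treat "None" and "Don't Know or Refused" as nil-ish.
step1 = collapse(SURVEY, into: "n/a", from: ["None", "Don't Know or Refused"])
# Step 2: fold the nils into "Other".
step2 = collapse(step1, into: "Other", from: ["Other", "n/a"])

p step1  # {"Christian"=>76, "Other"=>4, "n/a"=>20}
p step2  # {"Christian"=>76, "Other"=>24}
```

Each step still sums to 100, so nothing looks wrong in the totals. What disappears is the ability to tell apart the 24 percent now filed under a single label.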

Religion

But we don't do THIS. It covers the biggest categories. But oh, how it leaves people out.

People aren't edge cases

People aren't edge cases.

And they're pushing back on apps that treat them that way.

[LONG BEAT. LONG.] [BREATH] [CHANGE GEARS]

So: reductive has big problems. So does scaling upward.

Balancing between approaches

As engineers, it's instinctually uncomfortable to DELIBERATELY NOT STRUCTURE DATA for easy analysis. I feel ya. I really do. This freaks me out, too. But again, the foundational question is, "What benefit will the user notice?" What identities are okay with us?

And if necessary, we CAN strike a middle ground. This is where guided response comes in. Autosuggest.

Checkbox, Radio, Select

Textarea, Text

Required, Optional, Autosuggest

Minimal Suggest

When there's a subset of values that you're most interested in, do minimal suggest instead. Autosuggest using just the handful of values you care about.

Structured data from those who want to give it. Freeform to incite expressiveness in those who want THAT.
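
A minimal suggest can be very small. The sketch below assumes a hypothetical list of four suggested values and two hypothetical helpers, suggest and accept. The suggestions are hints only and never restrict what gets saved.

```ruby
# Hypothetical sketch of "minimal suggest": offer a few values, accept anything.
SUGGESTIONS = ["female", "male", "genderqueer", "transgender"].freeze

# Suggestions that start with what the user has typed so far.
def suggest(typed)
  prefix = typed.to_s.strip.downcase
  return [] if prefix.empty?
  SUGGESTIONS.select { |value| value.start_with?(prefix) }
end

# Any text is accepted as typed; blank input is stored as nil (no answer).
def accept(typed)
  text = typed.to_s.strip
  text.empty? ? nil : text
end

p suggest("fe")      # ["female"]
p suggest("Fembot")  # []
p accept("Fembot")   # "Fembot"
p accept("   ")      # nil
```

Someone who wants a common value gets it in a keystroke or two; someone who wants "Fembot" is not told it is invalid.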

Unguided Text: Of those who use MetaFilter's gender field, 40% of responses are "f", "m", "female", "male"

[read] So structured data IS there. This can be a balanced solution in many cases where you're willing to tolerate some ambiguity. Of course, there are tradeoffs. Data quantity is lower. Freed to opt out of providing personal info, many do. On the other hand, data quality should improve.
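
Pulling the structured part out of a freeform field does not require changing what anyone typed. A sketch, with a hypothetical mapping table and a made-up sample of responses (some borrowed from the MetaFilter slide); the 40% figure above is MetaFilter's, not something this code computes.

```ruby
# Hypothetical sketch: derive structured values from freeform responses
# while leaving the stored text untouched.
CANONICAL = {
  "f" => "female", "female" => "female",
  "m" => "male",   "male"   => "male"
}.freeze

# Canonical value for a response, or nil when it is genuinely freeform.
def canonical(response)
  CANONICAL[response.to_s.strip.downcase]
end

# Fraction of non-blank responses that map onto a canonical value.
def structured_share(responses)
  answered = responses.reject { |r| r.to_s.strip.empty? }
  return 0.0 if answered.empty?
  answered.count { |r| canonical(r) }.fdiv(answered.size)
end

sample = ["F", "male", "Fembot", "It depends", "m", nil, "MYOB"]
p structured_share(sample)  # 0.5
```

Blank answers are left out of the denominator, since opting out is a valid response rather than a failed one.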

Optional Select: 60% of Facebook users select a relationship status

It's fine to mix and match here. Find the right approach for your users and your app's business objectives. Facebook, for instance, makes relationship status completely optional, but coercive for those who do opt in to setting a value. Most users do opt in: 60% of them select a relationship status.

We want everyone to feel excited about what we've built

The bottom line is that [read].

We want users to feel passionate about their involvement

[read]

Analytics, investments, & monetization are based on a premise that data is accurate

[read]

BUT if the data has been collected by coercive approaches, the RISK is that--

this premise is garbage. Some people are lying, because lying has been made a requirement for getting past the barriers. So conclusions drawn from that bad data can misdirect decision-making about the next stage of development.

Checkbox, Radio, Select

Textarea, Text

Required

Corrective

Discretionary, Guided

The restrictive options, that stuff at bottom left: those don't actually have to be marked required.

ADD COLUMN gender VARCHAR(6) NOT NULL DEFAULT 'female'

But the way we set up schema often embeds assumptions that we should, and we will. So we do. An attribute that is ambiguously named is destined to become a form field whose data wanders away from its original intent, and code that misunderstands how to use it. A field that's not allowed to be null is destined to be mandatory. A field that is assigned a maximum length is an ASSERTION that all possible values are knowable and fit within it.
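
To see what a six-character limit asserts, run it against answers people actually gave. The responses below are taken from the MetaFilter slide earlier in the talk; MAX_LENGTH stands in for the VARCHAR(6) limit. This is a plain Ruby illustration, not a database test.

```ruby
# Responses taken from the MetaFilter slide earlier in the talk.
RESPONSES = [
  "All girl", "XY", "Fella", "Queer", "Fembot", "Alto",
  "It depends", "MYOB", "Member of the patriarchy"
].freeze

MAX_LENGTH = 6  # the limit asserted by VARCHAR(6)

# Split responses into those the column could hold and those it could not.
fits, too_long = RESPONSES.partition { |r| r.length <= MAX_LENGTH }

p fits      # ["XY", "Fella", "Queer", "Fembot", "Alto", "MYOB"]
p too_long  # ["All girl", "It depends", "Member of the patriarchy"]
```

The limit was sized for "female". A third of this small sample does not fit within it.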

ADD COLUMN gender_identity VARCHAR NOT NULL

BOOM. This is the foundation for a whole different user experience. And the cool ninja move was that we decided to DO LESS. Make this stuff flexible upfront. Optimize storage later. Decide what's valid later.

Checkbox, Radio, Select

Textarea, Text

Required, Optional

Alteration

Guided

So there you have it: a discretionary field. Whether to respond is left up to the user.

t.string :relationship_status, :null => true

As developers, we may look upon THIS expression as redundant, completely unnecessary. Duh, null is true by default. But making that explicit is a communication to the team, and to your future self. It's a statement of intent. It's documenting a product decision.

Facebook

Single
In a relationship
Engaged
Married
It's complicated
Open relationship

What would a canonical set of relationship statuses look like? Three years ago, Facebook figured this list was pretty good. Arguably pretty progressive, too, right? Users disagreed. Strongly.

Facebook

Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Separated
Divorced
Civil union
Domestic partnership

Under pressure, Facebook nearly doubled the options in just two years' time.

Google+

Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Separated
Divorced
Civil union
Domestic partnership
I don't want to say

While Google+ has largely adopted that list, it has not included Separated or Divorced. Notice that they also added something choice: opt out of labeling.

Allowing users to identify their relationships with labels of greater personal significance: that's being driven by people rejecting a user experience that isn't working for them.

It's a scope

How did some statuses seem universal, while others weren't? Naming a thing creates scope. The assumed validity of a field's values gets constrained as soon as the field is named.

marital_status

Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Separated
Divorced
Civil union
Domestic partnership
I don't want to say

Marital status, for instance, might lead to a list similar to this one. A person is assumed to be either unmarried, preparing to be married, currently married, or formerly married.

legal_marital_status

Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Separated
Divorced
Civil union
Domestic partnership
I don't want to say

relationship_status

Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Separated
Divorced
Civil union
Domestic partnership
I don't want to say

Whereas relationship status might lead to a list more like this one. In which one either has a current relationship -- or is defined by the absence of any.

singleness_status

As soon as you name a field, you define its paradigm and possibilities. There are important differences in what each of these envisions and is able to measure. Naming fields -- with great specificity, upfront -- makes analyses more powerful later.

singleness_rating

"No chance whatsoever" ... "Suuuuuper duper single"

1000

If you change the name, you shift the paradigm and possibilities. There's an important difference in what these set out to measure. Naming fields -- with great specificity, upfront -- makes analyses more powerful later.

It's Complicated, In a Relationship, Married, Divorced, Widowed, Single

We go through life experiencing many relationships. They don't all have pre-determined labels. And new relationship identities don't necessarily leave old ones behind.

"I like to be truthful, and 'It's Complicated' is really deceiving. It is not complicated. I am separated from my husband, who I am still legally married to."

- Facebook user

[READ] Why should anyone have to feel like they're required to disavow a relationship that's meaningful for them?

It cuts deeply sometimes. Is a widow only "Widowed" until she resumes dating? Why does she have to be confronted with a decision like that just to use someone's app?

Okay, this one. Spot the fatal flaw. [WAIT]

buffer overflow

An open relationship is definitionally a 1:Many join -- WITH the usual engineering understanding, that is, that the actual number of relationships may be 0, 1, OR many.

Choose (one): evasive, inauthentic

But it's not funny from the personal perspective. This schema forces the person to either LOOK EVASIVE or else BE INAUTHENTIC. This is what happens when we try to throw more labels at the problem instead of re-examining the schema's assumptions.
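
Re-examining the assumption might look like this: a person holds zero, one, or many relationships, each with its own label, instead of one status slot. A sketch in plain Ruby. Person, Relationship, and add_relationship are hypothetical names, and the example people are invented.

```ruby
# Hypothetical sketch: relationships as a collection, not a single status.
Relationship = Struct.new(:label, :with)

class Person
  attr_reader :name, :relationships

  def initialize(name)
    @name = name
    @relationships = []  # empty is valid: nothing has to be declared
  end

  # The label is the user's own wording; the other party is optional.
  def add_relationship(label, with = nil)
    @relationships << Relationship.new(label, with)
    self
  end

  def labels
    @relationships.map(&:label)
  end
end

pat = Person.new("Pat")
pat.add_relationship("Widowed", "Lee").add_relationship("Dating")

p pat.labels                # ["Widowed", "Dating"]
p Person.new("Kim").labels  # []
```

With a collection, a new relationship is added alongside the old ones, so nobody has to disavow one in order to declare another.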

Facebook relationship status by user age

srsly? When we don't give people means to express what's real and important for them, they hack around us. Are a quarter of Facebook's 13-year-olds really married? Probably not. What they probably have is a BEST FRIEND, and "married" is the nearest equivalent available for them to express "THIS PERSON is my most important relationship."

The dogged pursuit of nice, clean, CRUNCHABLE data takes us to such wrong places. And makes our users want to work against a system that doesn't acknowledge their lives' reality.

We build or break community with each line of code

As developers, our lines of code build -- or break -- an app's community. So it's worth making conscious choices as we go.

Modeling the real world is complex

[READ] And that's okay.

Assuming we know who users are surrenders opportunity to learn who they are

But [read]. Early constraints in schema NET crappy, misleading data. [beat] So keep constraints out of human and social schema, at least at first. Gather enough initial response to do some data mining. Watch that data for a while. Keep an eye out for emergent trends. What DISCOVERIES can you make? How can they shape the app's growth and evolution?
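
Watching for emergent trends can start as a plain frequency count over whatever people typed. A sketch only: the emergent_values helper, its min_count threshold, and the sample responses are all hypothetical.

```ruby
# Hypothetical sketch: surface values that recur in unconstrained responses.
def emergent_values(responses, min_count: 2)
  cleaned = responses.compact.map { |r| r.strip.downcase }.reject(&:empty?)
  counts = cleaned.each_with_object(Hash.new(0)) { |value, tally| tally[value] += 1 }
  # Keep values seen at least min_count times, most frequent first.
  counts.select { |_value, n| n >= min_count }
        .sort_by { |value, n| [-n, value] }
        .to_h
end

responses = ["Queer", "queer", "best friend", "Best Friend ", "best friend", "Alto", nil, ""]
p emergent_values(responses)  # {"best friend"=>3, "queer"=>2}
```

A value like "best friend" showing up repeatedly is the kind of discovery a fixed list would have hidden. It can then inform what the app offers next.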

Quality

Data quality improves when lies are merely optional, not required.

Specificity

Data becomes like the real world itself: rich and specific. So we can unearth the patterns that are undetectable when data is generic.

Adaptability

We get to make discoveries and re-envision possibilities.

We adapt quickly.

Loyalty

And their response to all this? Engagement. Passion. LOYALTY.

THOSE are foundations for great user experiences. [LOOK UP. FINALITY.]

Questions? Conversation?

Oh Hell Yeah!

Carina C. Zona
@cczona
cczona@gmail.com
http://cczona.com

Resources

Data Science & Information Architecture

• Sociological normalization
• Database normalization
• Using Machine Learning On Social Networks To Figure Out What You Should Read On The Web
• NoSQL Data Modeling Techniques
• Data & Reality, by William Kent
• Bad Data Handbook, Q. Ethan McCallum
• Data Science of the Facebook World

Relationships

• Does Facebook Hurt Relationships?
• Facebook Adds LGBT-Friendly Relationship Status Options
• Facebook Targeting by Relationship Status & Workplace
• Your Facebook Relationship Status: It's Complicated
• Gay Marriage: The Database Engineering Perspective

Sex & Genders

• "Disalienation: Why Gender is a Text Field on Diaspora"
• "Gender & Drop Down Menus"
• "Sex & Gender"
• "Bucket Gender"
• Recommendations for Inclusive Data Collection of Trans People

Names

• Falsehoods Programmers Believe About Names
• Your Last Name Contains Invalid Characters
• W3C Internationalization: Personal Names Around the World
• Spanish Names
• Chinese Names

More

• Redesigning the Country Selector
• American Religious Identification Survey Summary Report, 2009
• Linguistic Potluck: Crowdsourcing Internationalization in Rails

Image Credits

Postsecret
Facebook
OKCupid
Google+
Metafilter
Diaspora
Flickr
FetLife
Kotangle
cutestpaw.com
hdwallpapers.in
xkcd

Many Thanks

Chiu-Ki Chan
Estelle Weyl
Heather Rivers
Heroku
Jeremy Dunck
Josh Susser
Michele Titolo
NIRD LLC
Renée De Voursney
Sarah Mei
San Francisco Sex Information
Yoz Grahame

Get in Touch

Carina C. Zona
@cczona
http://cczona.com
cczona@gmail.com
http://slideshare.net/cczona
http://linkedin.com/in/cczona


Why IS this hard Because at first glance this stuff looks easy Its just forms Weve done those a million times

cczona

But were getting tripped up by some flawed premises

cczona

Gender is one of those things everyone thinks they understand but most people dont Like Inception

mdash Sam Killerman

First a premise that deeply personal stuff about humans can be reduced to lists

cczona

Hey this is just a system I can figure out easily is also a problem among engineers first diving into the stock market

mdashxkcd 592

Second assumptions that canonical lists for these exist Or are at least SURELY must be createable

cczona

And the third problem is our faith that the first two problems here can be easily solved Just add more list items[SCROLL for a while] Easy to wind up looking foolish Without even having solved the problem

Social Networks

What IS a social network What is it in purely human terms

Social

It involves trust Individuals revealing WHO THEY ARE WHAT THEY VALUE and WHO THEY CARE about Its as personal as we get This is the REAL LIFE that our apps are meant to replicate and build upon

What relationships are we fostering between person and app What relationships are we accidentally inhibiting or denying

cczona

In real life

I know your personhood better than you

sounds presumptuous

cczona

In real life

sounds presumptuous

cczona

In real life

Your existence isnt possible

sounds clueless

cczona

In real life

sounds clueless

cczona

In real life

Who you are is invalid

sounds arrogant

cczona

In real life

sounds arrogant

cczona

Data modeling is psychology and it is philosophy It reflect individuals beliefs about reality rather than reflecting reality itself It rejects realitys complexity and richness Thats NOT what we set out to do Its not the COMMUNITY we seek to build

cczona

[READ]

cczona

Be conservative in what you do be liberal in what you accept from others

-Postels Law

Postels Law reminds us to [READ]ARE we taking to heart that wisdom

[PAUSE FOR END OF SECTION]

cczona

Schemas plural

There are two kinds of schemas

cczona

Mental Schema

bull Pre-conceived ideas

bull Framework for representing some aspect of the world

bull System of organizing amp perceiving new information

Mental Schema are [READ]

cczona

Database Schema

bull Structure described in the databases language

bull Blueprint for database construction

bull Describes how the real world is being modeled

Database schemas are closely related to that Its a mental schema translated into blueprints for a database

cczona

These are simply frontend manifestations of various individuals MENTAL schemas And [SLOW DOWN] when we look at them closely we can see [BRIEF PAUSE] that schemas are foundation [BRIEF PAUSE] for expressing

cczona

deeply intimate

things that are [READ]

cczona

important relationships

Schemas are foundation for expressingthings [READ]

cczona

self-image

Schemas are foundation for expressingthings [READ]

cczona

Schemas define the

user experience

Schemas define user experience

cczona

Our schemas are leaving people behind

amp UX

[read]We can fix that

cczona

What benefit will the user notice

By asking ourselves What benefit will the USER NOTICE THIS is not equivalent to How will the user benefit Because THAT is a question which grants us too much latitude To ASSUME that what we want is of course to their benefit because its gonna help us deliver a product thats Awesome

[READ AGAIN] focuses us on EXPERIENCES

cczona

Evaluating from user perspective gives us focus

There isnt ONE WAY There isnt ONE RIGHT ANSWER Because the NATURE of our dilemma is that users vary And so do their social worlds So we always need to understand varying approaches How they fit different contexts What tradeoffs they bring AND the obvious boring question which options best serve THIS apps business requirements

cczona

CheckboxRadio buttonSelect menuRanges

Coerced DiscretionaryGuided

Corrective Text InputTextarea

Checkboxes radio buttons select menus ranges these each imply that EVERY possible value can -- and is -- included If any value isnt included there its real world being rejected because it didnt match up with our mental schema

cczona

CheckboxRadioSelect

Coerced DiscretionaryGuided

Corrective Text InputTextarea

Checking a box is a one step action Entering a text string is not So a freeform solution such as a textarea or text input isnt automatically exciting BUT they can bring visible meaningful benefit

cczona

11 years

At MetaFilter gender has been a text field for over a decade

cczona

hellipthe early crowd at MeFi were often programmers and they hated the idea of dirty data collectionhellip

mdashMatt Haughey founder

Initially some SHUDDERED at the thought Because [READ]

cczona

All girl

XY

Fella

Queer

Fembot

Alto

It depends

MYOB

Dangly bits

Chicklet

Innie not outie

Convex

Sideburns

Ambisextrous

Member of the patriarchy

For about 5 seconds Then they jumped on board Because they could be creative and silly

cczona

I speak using the male gender when required by language 50 quintessential

tomboy 50 total girly-girl

AND because they could express this THING about themselves fully With authentic voice

cczona

TransgenderGenderqueer

[PAUSE]

cczona

The freeform gender field is something I cherish about Metafilter

mdashMeFi user

That text field grew into a beloved institution Whatever you as user choose to put into that field says something revealing about who YOU are

cczona

It was one of the earliest indications Id landed in the right place

mdashMeFi user

That youre allowed to put in ANYTHING -- or put in NOTHING AT ALL -- says something revealing about what how MetaFilter envisions community That schemas trust in users was the foundation for Metafilter users to ASK to share themselves even more

cczona

In 2010 Diaspora likewise turned gender into a text field Just like on MetaFilter users felt set free by it

Theres an important distinction here though Metafilter is closed source its one developer and its entirely in English Diaspora needed to meet a different set of needs Its open source and its international

cczona

Not amused

So some of its developers are not amused Its fair to object that this approach wreaks chaos on internationalization of pronouns [BEAT] And heres what Ive got to say to that

cczona

Internationalization is hellish period

Internationalization of pronouns is hellish period You get no safety net in this Constraining gender options doesnt solve translation of languages Dont set yourself up for trying to deal with internationalization this way It will fail you

cczona

Whats your legal gender

bull Indeterminate

bull X

bull Sex Not Specified

And if you really thing that boiling this question down to Whats your legal gender is an easy out lets talk about internationaliztion of that too Because this is the year that indeterminate has become a legal gender on German birth certificates and that X is a legal gender on Australian passports In New South Wales Australia Sex Not Specified is yet another legally-recognized gender Binary gender is today APP FAIL

cczona

the most complicated thing Irsquove ever spent a lot of time learning about

And Irsquove spent a lot of time learning about quantum mechanics

mdashRandall Munroe xkcd

Randall Munroe best known as xkcd has examined the issue of pronouns many times in depth for English language projects He says its [READ] So hes what he ultimately concluded

cczona

Ask

Ask

cczona

Which pronouns do you prefer

hehimhimselfhishis

sheherherselfhersher

theythemthemselftheirstheir

itititselfitsits

by name

Asking straight up Which Pronouns Do You Prefer is truly the best he could come up with

cczona

Which pronouns do you prefer

hehimhimselfhishis

sheherherselfhersher

theythemthemselftheirstheir

personal name

other ____________

Or maybe slightly refining like these

And I know you may be looking at that 3rd row and protesting

cczona

Which pronouns do you prefer

hehimhimselfhishis

sheherherselfhersher

theythemthemselftheirstheir

personal name

other ____________

Our english teachers told us they and their arent singular right Nope Theyre plural AND SINGULAR and have been for 400 years Take the word of Jane Austen Shakespeare Chaucer Lewis Carroll Modern english authorities agree on this one too Its excellent English -- and excellent SOCIAL -- to use it when people ask us to

Even Facebook has taken note that it's okay. They've been using this construction for at least 5 years and are doing fine :-)

Home is the place where, when you have to go there, they have to take you in.

- Robert Frost

We aspire to make our social apps feel like home. We can do that by making development choices that say, "Hey user, we GET you. Be as individual and messy as you want to. We can handle _you_ being _you_."

Social Research

When collecting data on people, we're in a different realm: social sciences. If you want to run useful analytics about personal attributes and behavior, then data collection needs to meet at least the two minimum criteria for social science research.

Exhaustive

Every possible option

A field's values must include every possible option.

Mutually Exclusive

No overlap exists between them

And the field's options must be mutually exclusive.

How many social apps have both of those bare-minimum criteria covered?
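Those two criteria are easy to state as checks against real answers. Here is a rough sketch, assuming we have somehow collected everything each respondent says applies to them. The option list and the sample answers are invented for illustration.

```ruby
require "set"

# Exhaustive: every observed answer is one of the offered options.
def exhaustive?(options, answers)
  answers.flatten.all? { |answer| options.include?(answer) }
end

# Mutually exclusive: no respondent needs more than one option at once.
def mutually_exclusive?(answers)
  answers.all? { |picked| picked.size <= 1 }
end

options = Set["Single", "In a relationship", "Married"]

# Each inner array is everything one (hypothetical) respondent says applies.
answers = [
  ["Married"],
  ["Married", "In a relationship"],  # overlap: both labels are true at once
  ["Widowed"]                        # not offered at all
]

puts exhaustive?(options, answers)   # false
puts mutually_exclusive?(answers)    # false
```

A select menu built from `options` would silently fail both tests for two of these three people.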

We've seen this one enough. At least that set of values seems covered fully.

But it's not.

Still lacking.

Trying hard. But still inadequate to express the real world's complexity.

We do realize that the real world has great variety. Of course we do. But it's hard to know how to handle it.

What user experience does this schema drive us toward?

It's easy to figure, "Eh, we can refactor later." But initial choices set differing user experiences in motion. Thinking at the outset about the real world's variety and complexity starts us asking early questions that set the foundation for social app-building. Ask: "What user experience does this schema drive us toward?"

Data doesn't have to be for analysis

It's easy to get into the habit of structuring data for easy analysis. But we can choose to look at human data from a different perspective. Step back. Wallow in the user's perspective.

Data can be sheer expressiveness

Data can be sheer expressiveness. Data that has character, individualism, distinctiveness.

What we want

• Structured

• Predictable

• Relational

• Indexed

• Exhaustive

• Easy analytics

• Data-driven decisions

What we get

• Validations, exceptions

• Conditionals, partials

• Premature optimization

• Cultural variability

• Individual POV

• Moving target

• Decisions based on false premises

As developers, we have a vision of what a good codebase should be, and not be.

[READ]

In the most sensible of ways, we often arrive at solutions that are factually TRUTHY while far removed from real-life utility.

What is your religion, if any?

ARIS is the largest ongoing survey of Americans' religious identification. It asks this simple OPEN-ENDED question: [slide]

Which nets over 100 unique answers. Which, if you're making a form based on that, is tricky. There's no form element that makes it easy for our users to pick themselves out of a list of so many possibilities.

ARIS found, though, that these could be compressed into 13 major categories. More manageable list, right? We could use that for a form. But eh, a lot of those are edge cases. We'd rather focus on genuinely major groupings.

What is your religion, if any?

Christian 76%

Other 4%

None 15%

Don't Know or Refused 5%

Which brings us down to this. At least three-quarters of Americans are Christian. Done. Every other religion would just be clutter. Edge cases.

And then there's this sort of crummy data with it. We'd probably assign a nil value for more than one of these categories, right?

What is your religion?

Christian 76%

Other 4%

n/a 20%

Fixed. Which focuses attention on a problem here: 1 in 5 are not useful answers from an advertiser's perspective, or for our own analytics. The NILS -- gotta go.

What is your religion?

Christian 76%

Other 24%

So what we're left with is a good, clear list. It covers ALL the big stuff. When you get reductive enough, for Americans religion is a binary. Which from a storage standpoint is great. Booleans! Score!
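The arithmetic of that slide-by-slide reduction can be sketched in a few lines. The percentages are the ones on the slides. The two folding steps and the variable names are just one way of writing down what the slides do.

```ruby
# Percentages as shown on the slides above.
survey = {
  "Christian" => 76, "Other" => 4, "None" => 15, "Don't Know or Refused" => 5
}

# Step 1: treat "None" and "Don't Know or Refused" as nil-like and merge them.
nil_like = ["None", "Don't Know or Refused"]
with_na = survey.each_with_object(Hash.new(0)) do |(answer, pct), out|
  out[nil_like.include?(answer) ? "n/a" : answer] += pct
end

# Step 2: fold everything that is not the largest group into "Other".
binary = with_na.each_with_object(Hash.new(0)) do |(answer, pct), out|
  out[answer == "Christian" ? "Christian" : "Other"] += pct
end

with_na.each { |answer, pct| puts "#{answer}: #{pct}%" }
binary.each  { |answer, pct| puts "#{answer}: #{pct}%" }
```

Each step still sums to 100, which is exactly why it feels safe. What disappears is every distinction inside the final 24%.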

Religion

But we don't do THIS. It covers the biggest categories. But oh, how it leaves people out.

People aren't edge cases

People aren't edge cases.

And they're pushing back on apps that treat them that way.

[LONG BEAT LONG] [BREATH] [CHANGE GEARS]

So: reductive has big problems. So does scaling upward.

Balancing between approaches

As engineers, it's instinctually uncomfortable to DELIBERATELY NOT STRUCTURE DATA for easy analysis. I feel ya. I really do. This freaks me out too. But again, the foundational question is "What benefit will the user notice?" What identities are okay with us?

And if necessary, we CAN strike a middle ground. This is where guided response comes in: autosuggest.

[Slide diagram: Checkbox, Radio, Select | Textarea, Text | Required, Optional | Autosuggest]

Minimal Suggest

When there's a subset of values that you're most interested in, do minimal suggest instead. Autosuggest using just the handful of values you care about.

Structured data from those who want to give it. Free form to incite expressiveness in those who want THAT.
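A minimal-suggest field could look something like this on the server side. This is a sketch only: the `SUGGESTIONS` list is a hypothetical handful of values, and `suggest` and `accept` are names invented here. The point is that suggestions guide typing but never restrict what gets stored.

```ruby
# The handful of values the app cares about (hypothetical list).
SUGGESTIONS = ["female", "male", "genderqueer", "transgender"].freeze

# Offer matches for what the user has typed so far; never restrict it.
def suggest(typed)
  prefix = typed.strip.downcase
  return [] if prefix.empty?
  SUGGESTIONS.select { |s| s.start_with?(prefix) }
end

# Whatever the user submits is stored as-is; blank means "chose not to say".
def accept(typed)
  value = typed.to_s.strip
  value.empty? ? nil : value
end

p suggest("Fe")       # ["female"]
p suggest("All")      # []
p accept("All girl")  # "All girl"
p accept("  ")        # nil
```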

Unguided Text

Of those who use MetaFilter's gender field,

40% of responses are "f", "m", "female", "male"

[read] So structured data IS there. This can be a balanced solution in many cases where you're willing to tolerate some ambiguity. Of course, there are tradeoffs. Data quantity is lower: freed to opt out of providing personal info, many do. On the other hand, data quality should improve.
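To see how structured data can be recovered from an unguided text field, here is a small sketch. The `CANONICAL` map and the sample responses are invented for illustration (they are not MetaFilter's data). Anything the map does not recognize is kept verbatim rather than discarded.

```ruby
# Map common spellings onto a canonical value; leave everything else untouched.
CANONICAL = {
  "f" => "female", "female" => "female",
  "m" => "male",   "male"   => "male"
}.freeze

def bucket(response)
  CANONICAL[response.strip.downcase]  # nil when the answer is free-form
end

responses = ["F", "male", "Fembot", "m", "It depends", "female"]  # made-up sample

structured, freeform = responses.partition { |r| bucket(r) }
share = (100.0 * structured.size / responses.size).round

puts "structured: #{structured.map { |r| bucket(r) }.tally}"
puts "free-form kept verbatim: #{freeform.inspect}"
puts "#{share}% of this sample is directly analyzable"
```

The analysis happens after collection, so the form never has to tell anyone their answer is invalid.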

Optional Select

60% of Facebook users select a relationship status

It's fine to mix and match here. Find the right approach for your users and your app's business objectives. Facebook, for instance, makes relationship status completely optional, but coercive for those who do opt in to setting a value. Most users do opt in: 60% of them select a relationship status.

We want everyone to feel excited about what we've built

The bottom line is that [read]

We want users to feel passionate about their involvement

[read]

Analytics investments & monetization are based on a premise that data is accurate

[read]

BUT if the data has been collected by coercive approaches, the RISK is that--

this premise is garbage. Some people are lying, because lying has been made a requirement for getting past the barriers. So conclusions drawn from that bad data can misdirect decision-making about the next stage of development.

[Slide diagram: Checkbox, Radio, Select | Textarea, Text | Required | Corrective | Discretionary, Guided]

The restrictive options, that stuff at bottom left: those don't actually have to be marked required.

ADD COLUMN gender VARCHAR(6) NOT NULL DEFAULT 'female'

But the way we set up a schema often embeds assumptions that we should, and we will. So we do. An attribute that is ambiguously named is destined to become a form field whose data wanders away from its original intent, and code that misunderstands how to use it. A field that's not allowed to be null is destined to be mandatory. A field that is assigned a maximum length is an ASSERTION that all possible values are knowable and fit within it.
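To make those embedded assertions visible, here is a plain-Ruby stand-in for that column definition. It is only a model of the three constraints (not null, six characters, default value). Real databases differ in whether an over-long value is rejected or truncated; this sketch assumes rejection. `store_gender` is a name made up for the example.

```ruby
# Stand-in for the column above: NOT NULL, at most 6 characters, default 'female'.
MAX_LENGTH = 6
DEFAULT    = "female"

def store_gender(input)
  value = input.nil? ? DEFAULT : input  # silence gets turned into an answer
  raise ArgumentError, "too long: #{value}" if value.length > MAX_LENGTH
  value
end

puts store_gender(nil)     # female
puts store_gender("male")  # male

begin
  store_gender("genderqueer")
rescue ArgumentError => e
  puts "rejected: #{e.message}"  # rejected: too long: genderqueer
end
```

A person who said nothing is recorded as "female", and a person who said something true is told it does not fit.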

ADD COLUMN gender_identity VARCHAR NOT NULL

BOOM. This is the foundation for a whole different user experience. And the cool ninja move was that we decided to DO LESS. Make this stuff flexible upfront. Optimize storage later. Decide what's valid later.

[Slide diagram: Checkbox, Radio, Select | Textarea, Text | Required, Optional | Alteration | Guided]

So there you have it: a discretionary field. Whether to respond is left up to the user.

t.string :relationship_status, :null => true

As developers, we may look upon THIS expression as redundant, completely unnecessary. Duh, null is true by default. But making that explicit is a communication to the team and to your future self. It's a statement of intent. It's documenting a product decision.

Facebook

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

What would a canonical set of relationship statuses look like? Three years ago, Facebook figured this list was pretty good. Arguably pretty progressive too, right? Users disagreed. Strongly.

Facebook

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

Under pressure, Facebook nearly doubled the options in just two years' time.

Google+

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I don't want to say

While Google+ has largely adopted that list, it has not included Separated or Divorced. Notice that they also added something choice: opt out of labeling.

Allowing users to identify their relationships with labels of greater personal significance. That's being driven by people rejecting a user experience that isn't working for them.

It's a scope

How did some statuses seem universal while others weren't? Naming a thing creates scope. The assumed validity of a field's values gets constrained as soon as the field is named.

marital_status

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I don't want to say

"Marital status", for instance, might lead to a list similar to this one. A person is assumed to be either unmarried, preparing to be married, currently married, or formerly married.

legal_marital_status

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I don't want to say

"Marital status", for instance, might lead to a list similar to this one. A person is assumed to be either unmarried, preparing to be married, currently married, or formerly married.

relationship_status

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I don't want to say

Whereas "relationship status" might lead to a list more like this one. In which one either has a current relationship -- or is defined by the absence of any.

singleness_status

As soon as you name a field, you define its paradigm and possibilities. There are important differences in what each of these envisions and is able to measure. Naming fields -- with great specificity, upfront -- makes analyses more powerful later.

singleness_rating

No chance whatsoever Suuuuuper duper single

1000

If you change the name, you shift the paradigm and possibilities. There are important differences in what these collect and set out to measure. Naming fields -- with great specificity, upfront -- makes analyses more powerful later.

It's Complicated | In a Relationship

Married | Divorced

Widowed | Single

We go through life experiencing many relationships. They don't all have pre-determined labels. And new relationship identities don't necessarily leave old ones behind.

I like to be truthful, and "It's Complicated" is really deceiving. It is not complicated. I am separated from my husband, who I am still legally married to.

- Facebook user

[READ] Why should anyone have to feel like they're required to disavow a relationship that's meaningful for them?

It cuts deeply sometimes. Is a widow only "Widowed" until she resumes dating? Why does she have to be confronted with a decision like that just to use someone's app?

Okay, this one. Spot the fatal flaw. [WAIT]

buffer overflow

An open relationship is definitionally a 1:Many join, WITH the usual engineering understanding, that is, that the actual number of relationships may be 0, 1, OR many.

Choose (one): evasive, inauthentic

But it's not funny from the personal perspective. This schema forces the person to either LOOK EVASIVE or else BE INAUTHENTIC. This is what happens when we try to throw more labels at the problem instead of re-examining the schema's assumptions.
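Re-examining the assumption might start with dropping the single status column altogether. Below is a rough sketch of relationships as a 0, 1, or many collection, each with a label the person chooses. `Person`, `Relationship` and `add_relationship` are hypothetical names for this illustration, not anyone's production model.

```ruby
Relationship = Struct.new(:label, :with)

class Person
  attr_reader :name, :relationships

  def initialize(name)
    @name = name
    @relationships = []  # 0, 1, or many; there is no single status column
  end

  # The label is free text chosen by the person; naming a partner is optional.
  def add_relationship(label, with = nil)
    @relationships << Relationship.new(label, with)
    self
  end

  def labels
    relationships.map(&:label)
  end
end

sam = Person.new("Sam")
sam.add_relationship("Married", "Alex").add_relationship("Separated", "Alex")

p Person.new("Kit").labels  # []
p sam.labels                # ["Married", "Separated"]
```

Here "married" and "separated" can both be true, and saying nothing is just an empty list.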

Facebook relationship status by user age

srsly

When we don't give people means to express what's real and important for them, they hack around us. Are a quarter of Facebook's 13-year-olds really married? Probably not. What they probably have is a BEST FRIEND, and "married" is the nearest equivalent available for them to express "THIS PERSON is my most important relationship."

The dogged pursuit of nice, clean, CRUNCHABLE data takes us to such wrong places. And makes our users want to work against a system that doesn't acknowledge their lives' reality.

We build or break community with each line of code

As developers, our lines of code build -- or break -- an app's community. So it's worth making conscious choices as we go.

Modeling the real world is complex

[READ] And that's okay.

Assuming we know who users are surrenders opportunity to learn who they are

But [read]. Early constraints in schema NET crappy, misleading data. [beat] So keep constraints out of human and social schema, at least at first. Gather enough initial response to do some data mining. Watch that data for a while. Keep an eye out for emergent trends. What DISCOVERIES can you make? How can they shape the app's growth and evolution?
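Watching for emergent trends can start very small. This sketch counts normalized free-text answers and reports the frequent ones the app does not model yet. The `KNOWN` list, the threshold, and the sample responses are all invented for illustration.

```ruby
# What the app already models (hypothetical list).
KNOWN = ["single", "married", "in a relationship"].freeze

# Count normalized free-text answers; report frequent ones not modeled yet.
def emergent(responses, min_count: 2)
  counts = responses.map { |r| r.strip.downcase }.tally
  counts.select { |value, n| n >= min_count && !KNOWN.include?(value) }
        .sort_by { |value, n| [-n, value] }
end

responses = ["Married", "Best friends", "best friends ", "Separated",
             "separated", "BEST FRIENDS", "Single", "It's complicated"]  # invented sample

emergent(responses).each { |value, n| puts "#{value}: #{n}" }
# best friends: 3
# separated: 2
```

What to do with a discovery like "best friends" is a product decision. The schema's job is only to make the discovery possible.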

Quality

Data quality improves when lies are merely optional, not required.

Specificity

Data becomes like the real world itself: rich and specific. So we can unearth the patterns that are undetectable when data is generic.

Adaptability

We get to make discoveries and re-envision possibilities.

We adapt quickly.

Loyalty

And their response to all this? Engagement. Passion. LOYALTY.

THOSE are foundations for great user experiences. [LOOK UP, FINALITY]

@cczona | cczona@gmail.com | http://cczona.com

Questions? Conversation?

Oh Hell Yeah

Carina C. Zona

Resources

Data Science & Information Architecture

• Sociological normalization

• Database normalization

• Using Machine Learning On Social Networks To Figure Out What You Should Read On The Web

• NoSQL Data Modeling Techniques

• Data & Reality, by William Kent

• Bad Data Handbook, by Q. Ethan McCallum

• Data Science of the Facebook World

Relationships

• Does Facebook Hurt Relationships?

• Facebook Adds LGBT-Friendly Relationship Status Options

• Facebook Targeting by Relationship Status & Workplace

• Your Facebook Relationship Status: It's Complicated

• Gay Marriage: The Database Engineering Perspective

Sex & Genders

• "Disalienation: Why Gender is a Text Field on Diaspora"

• "Gender & Drop Down Menus"

• "Sex & Gender"

• "Bucket Gender"

• Recommendations for Inclusive Data Collection of Trans People

Names

• Falsehoods Programmers Believe About Names

• Your Last Name Contains Invalid Characters

• W3C Internationalization: Personal Names Around the World

• Spanish Names

• Chinese Names

More

• Redesigning the Country Selector

• American Religious Identification Survey Summary Report 2009

• Linguistic Potluck: Crowdsourcing Internationalization in Rails

Image Credits

Postsecret

Facebook

OKCupid

Google+

Metafilter

Diaspora

Flickr

FetLife

Kotangle

cutestpaw.com

hdwallpapers.in

xkcd

Many Thanks

Chiu-Ki Chan

Estelle Weyl

Heather Rivers

Heroku

Jeremy Dunck

Josh Susser

Michele Titolo

NIRD LLC

Renée De Voursney

Sarah Mei

San Francisco Sex Information

Yoz Grahame

Get in Touch

Carina C. Zona

@cczona

http://cczona.com

cczona@gmail.com

http://slideshare.net/cczona

http://linkedin.com/in/cczona


Normalization

When developers talk about normalization, we're talking about databases. About computer science. But when dealing with human attributes, we're inherently dealing in the social sciences too.

Construction of social norms

Sociological normalization [READ]

Conflating those works fine if you only want users from among the select few who belong to the idealized norm. But most of us are going for a broad userbase.

Database Normalization

Mirror real-world concepts and their interrelationships

What we lose track of is one of the core tenets of database normalization. We're supposed to [read]. When the database is in tension with people's own real world -- then it's not people who need to be flexible. It's us.

Why IS this hard? Because at first glance this stuff looks easy. It's just forms. We've done those a million times.

But we're getting tripped up by some flawed premises.

Gender is one of those things everyone thinks they understand, but most people don't. Like Inception.

- Sam Killermann

First, a premise that deeply personal stuff about humans can be reduced to lists.

"Hey, this is just a system I can figure out easily" is also a problem among engineers first diving into the stock market.

- xkcd 592

Second, assumptions that canonical lists for these exist. Or at least SURELY must be creatable.

And the third problem is our faith that the first two problems here can be easily solved. Just add more list items. [SCROLL for a while] Easy to wind up looking foolish. Without even having solved the problem.

Social Networks

What IS a social network? What is it in purely human terms?

Social

It involves trust. Individuals revealing WHO THEY ARE, WHAT THEY VALUE, and WHO THEY CARE about. It's as personal as we get. This is the REAL LIFE that our apps are meant to replicate and build upon.

What relationships are we fostering between person and app? What relationships are we accidentally inhibiting or denying?

In real life

"I know your personhood better than you"

sounds presumptuous

In real life

"Your existence isn't possible"

sounds clueless

In real life

"Who you are is invalid"

sounds arrogant

Data modeling is psychology, and it is philosophy. It reflects individuals' beliefs about reality, rather than reflecting reality itself. It rejects reality's complexity and richness. That's NOT what we set out to do. It's not the COMMUNITY we seek to build.

[READ]

Be conservative in what you do, be liberal in what you accept from others.

- Postel's Law

Postel's Law reminds us to [READ]. ARE we taking that wisdom to heart?

[PAUSE FOR END OF SECTION]

Schemas, plural

There are two kinds of schemas.

Mental Schema

• Pre-conceived ideas

• Framework for representing some aspect of the world

• System of organizing & perceiving new information

Mental schemas are [READ]

Database Schema

• Structure described in the database's language

• Blueprint for database construction

• Describes how the real world is being modeled

Database schemas are closely related to that. A database schema is a mental schema translated into blueprints for a database.

These are simply frontend manifestations of various individuals' MENTAL schemas. And [SLOW DOWN] when we look at them closely, we can see [BRIEF PAUSE] that schemas are the foundation [BRIEF PAUSE] for expressing

deeply intimate

things that are [READ]

important relationships

Schemas are the foundation for expressing things [READ]

self-image

Schemas are the foundation for expressing things [READ]

Schemas define the user experience

Schemas define user experience.

Our schemas & UX are leaving people behind

[read] We can fix that.

What benefit will the user notice?

By asking ourselves, "What benefit will the USER NOTICE?" THIS is not equivalent to "How will the user benefit?" Because THAT is a question which grants us too much latitude. To ASSUME that what we want is of course to their benefit, because it's gonna help us deliver a product that's Awesome.

[READ AGAIN] focuses us on EXPERIENCES.

Evaluating from user perspective gives us focus

There isn't ONE WAY. There isn't ONE RIGHT ANSWER. Because the NATURE of our dilemma is that users vary. And so do their social worlds. So we always need to understand varying approaches. How they fit different contexts. What tradeoffs they bring. AND the obvious, boring question: which options best serve THIS app's business requirements?

[Slide diagram: Checkbox, Radio button, Select menu, Ranges | Text Input, Textarea | Coerced, Corrective, Guided, Discretionary]

Checkboxes, radio buttons, select menus, ranges: these each imply that EVERY possible value can be -- and is -- included. If any value isn't included there, it's the real world being rejected because it didn't match up with our mental schema.

[Slide diagram: Checkbox, Radio, Select | Text Input, Textarea | Coerced, Corrective, Guided, Discretionary]

Checking a box is a one-step action. Entering a text string is not. So a freeform solution such as a textarea or text input isn't automatically exciting. BUT they can bring visible, meaningful benefit.

11 years

At MetaFilter, gender has been a text field for over a decade.

"...the early crowd at MeFi were often programmers and they hated the idea of dirty data collection..."

- Matt Haughey, founder

Initially, some SHUDDERED at the thought. Because [READ]

All girl

XY

Fella

Queer

Fembot

Alto

It depends

MYOB

Dangly bits

Chicklet

Innie not outie

Convex

Sideburns

Ambisextrous

Member of the patriarchy

For about 5 seconds. Then they jumped on board. Because they could be creative and silly.

I speak using the male gender when required by language

50% quintessential tomboy, 50% total girly-girl

AND because they could express this THING about themselves fully. With authentic voice.

Transgender

Genderqueer

[PAUSE]

The freeform gender field is something I cherish about Metafilter.

- MeFi user

That text field grew into a beloved institution. Whatever you as a user choose to put into that field says something revealing about who YOU are.

It was one of the earliest indications I'd landed in the right place.

- MeFi user

That you're allowed to put in ANYTHING -- or put in NOTHING AT ALL -- says something revealing about how MetaFilter envisions community. That schema's trust in users was the foundation for MetaFilter users to ASK to share themselves even more.

In 2010, Diaspora likewise turned gender into a text field. Just like on MetaFilter, users felt set free by it.

There's an important distinction here, though. MetaFilter is closed source, it's one developer, and it's entirely in English. Diaspora needed to meet a different set of needs. It's open source, and it's international.

Not amused

So some of its developers are not amused. It's fair to object that this approach wreaks chaos on internationalization of pronouns. [BEAT] And here's what I've got to say to that:

Internationalization is hellish, period

Internationalization of pronouns is hellish period You get no safety net in this Constraining gender options doesnt solve translation of languages Dont set yourself up for trying to deal with internationalization this way It will fail you

cczona

Whats your legal gender

bull Indeterminate

bull X

bull Sex Not Specified

And if you really thing that boiling this question down to Whats your legal gender is an easy out lets talk about internationaliztion of that too Because this is the year that indeterminate has become a legal gender on German birth certificates and that X is a legal gender on Australian passports In New South Wales Australia Sex Not Specified is yet another legally-recognized gender Binary gender is today APP FAIL

cczona

the most complicated thing Irsquove ever spent a lot of time learning about

And Irsquove spent a lot of time learning about quantum mechanics

mdashRandall Munroe xkcd

Randall Munroe best known as xkcd has examined the issue of pronouns many times in depth for English language projects He says its [READ] So hes what he ultimately concluded

cczona

Ask

Ask

cczona

Which pronouns do you prefer

hehimhimselfhishis

sheherherselfhersher

theythemthemselftheirstheir

itititselfitsits

by name

Asking straight up Which Pronouns Do You Prefer is truly the best he could come up with

cczona

Which pronouns do you prefer

hehimhimselfhishis

sheherherselfhersher

theythemthemselftheirstheir

personal name

other ____________

Or maybe slightly refining like these

And I know you may be looking at that 3rd row and protesting

cczona

Which pronouns do you prefer

hehimhimselfhishis

sheherherselfhersher

theythemthemselftheirstheir

personal name

other ____________

Our english teachers told us they and their arent singular right Nope Theyre plural AND SINGULAR and have been for 400 years Take the word of Jane Austen Shakespeare Chaucer Lewis Carroll Modern english authorities agree on this one too Its excellent English -- and excellent SOCIAL -- to use it when people ask us to

cczona

Even Facebook has taken note that its okay Theyve been using this construction for at least 5 years and are doing fine -)

cczona

Home is the place where when you have to go there they have to take you in

mdashRobert Frost

We aspire to make our social apps feel like home We can do that by making development choices that say Hey user we GET you Be as individual and messy as you want to We can handle _you_ being _you_

Social Research

When collecting data on people were in a different realm Social sciences If you want to run useful analytics about personal attributes and behavior then data collection needs to meet at least the two minimum criteria for social science research

ExhaustiveEvery possible option

A fields values must include every possible option

Mutually ExclusiveNo overlap exists between them

And the fields options must be mutually exclusive

How many social apps have both of those bare minimum criteria covered

cczona

Weve seen this one enough At least that set of values seems covered fully

cczona

`

But its not

cczona

Still lacking

cczona

Trying hard But still inadequate to express the real worlds complexity

We do realize that the real world has great variety Of course we do But its hard to know how to handle it

cczona

What user experience does this schema drive us toward

Its easy to figure Eh we can refactor later But initial choices set differing user experiences in motion Thinking at the outset about the real worlds variety and complexity starts us asking early questions that set foundation for social app-building Ask What user experience does this schema drive us toward

cczona

Data doesnrsquot have to be for analysis

Its easy to get into the habit of structuring data for easy analysis But we can choose to look at human data from a different perspective Step back Wallow in the users perspective

cczona

Data can be sheerexpressiveness

Data can be sheer expressiveness Data that has character individualism distinctiveness

cczona

What we want What we get

bull Structured

bull Predictablebull Validations exceptions

bull Conditionals partials

bull Relational

bull Indexed bull Premature optimization

bull Exhaustivebull Cultural variability

bull Individual POV

bull Moving target

bull Easy analytics

bull Data-driven decisionsbull Decisions based on false

premises

As developers we have a vision of what a good codebase should be and not be

[READ]

In the most sensible of ways we often arrive at solutions that are factually TRUTHY while far removed from real life utility

cczona

What is your religion if any

ARIS is the largest ongoing survey of Americans religious identification It asks this simple OPEN-ENDED question [slide]

Which nets over 100 unique answers Which if youre making a form based on that is tricky Theres no form element that makes it easy for our users to pick themselves out of a list of so many possibilities

cczona

ARIS found though that these could be compressed into 13 major categories More manageable list right We could use that for a form But eh a lot of those are edge cases Wed rather want to focus on genuinely major groupings

cczona

What is your religion if any

Christian 76

Other 4

None 15

Dont Know or Refused 5

Which brings us down to this At least a quarter of Americans are Christian Done Every other religion would just be clutter Edge cases

And then theres this sort of crummy data with it Wed probably assign nil value for more than one of these categories right

cczona

What is your religion

Christian 76

Other 4

na 20

Fixed Which focuses attention on a problem here 1 in 5 are not useful answers from an advertisers perspective or for out own analytics The NILS -- gotta go

cczona

What is your religion

Christian 76

Other 24

So what were left with is a good clear list It covers ALL the big stuff When you get reductive enough for Americans religion it a binary Which from a storage standpoint is great Booleans Score

cczona

Religion

But we dont do THIS It covers the biggest categories But oh how it leaves people out

cczona

People arent edge cases

People arent edge cases

And theyre pushing back on apps that treat them that way

[LONG BEAT LONG] [BREATH] [CHANGE GEARS]

So Reductive has big problems So does scaling upward

cczona

Balancing between approaches

As engineers its instinctually uncomfortable to DELIBERATELY NOT STRUCTURE DATA for easy analysis I feel ya I really do This freaks me too But again the foundational question is What benefit will the user notice What identities are okay with us

cczona

And if necessary we CAN strike a middleground This is where guided response comes in Autosuggest

cczona

CheckboxRadioSelect

TextareaText

Required OptionalAutosuggest

Minimal Suggest

When theres a subset of values that youre most interested in do minimal suggest instead Autosuggest using just the handful of values you care about

Structured data from those who want to give it Free form to incite expressiveness in those who want THAT

Unguided TextOf those who use MetaFilters gender field

40 of responses are f m female male

[read] So structured data IS there This can be a balanced solution in many cases where youre willing to tolerate some ambiguity Of course there are tradeoffsData quantity is lower Freed to opt out of proving personal info many doOn the other hand data quality should improve

Optional Select60 of Facebook users select a relationship status

Its fine to mix and match here Find the right approach for your users and your apps business objectives Facebook for instance makes relationship status completely optional but coercive for those who do opt-in to setting a value Most users do opt-in 60 of them select a relationship status

cczona

We want everyone to feel excited what weve

built

The bottom line is that [read]

cczona

We want users to feel passionate about their

involvement

[read]

cczona

Analytics investments amp monetization are

based on a premise that data is accurate

[read]

BUT the data has been collected by coercive approaches the RISK is that--

cczona

this premise is garbage Some people are lying because lying has been made requirement for getting past the barriers So conclusions drawn from that bad data can misdirect decision-making about the next stage of development

cczona

CheckboxRadioSelect

TextareaText

Required

Corrective

DiscretionaryGuided

The restrictive options that stuff at bottom left those dont actually have to be marked required

cczona

ADD column gender NOT NULL VARCHAR(6) DEFAULT female

But the way we setup schema often embeds assumptions that we should and we will So we do An attribute that is ambiguously named is destined to become a form field whose data wanders away from its original intent and code that misunderstands how to use it A field thats not allowed to be null is destined to be mandatory A field that is assigned a maximum length is an ASSERTION that all possible values are knowable and fit within it

cczona

ADD column gender_identity NOT NULL VARCHAR

BOOM This is foundation for a whole different user experience And the cool ninja move was that we decided to DO LESS Make this stuff flexible upfront Optimize storage later Decide whats valid later

cczona

[Slide diagram: Checkbox, Radio, Select; Textarea, Text; Required, Optional; Alteration; Guided]

So there you have it: a discretionary field. Whether to respond is left up to the user.

cczona

t.string :relationship_status, :null => true

As developers, we may look upon THIS expression as redundant, completely unnecessary. Duh, null is true by default. But making that explicit is a communication to the team and to your future self. It's a statement of intent. It's documenting a product decision.

cczona

Facebook

Single
In a relationship
Engaged
Married
It's complicated
Open relationship

What would a canonical set of relationship statuses look like? Three years ago, Facebook figured this list was pretty good. Arguably pretty progressive too, right? Users disagreed. Strongly.

cczona

Facebook

Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Separated
Divorced
Civil union
Domestic partnership

Under pressure, Facebook nearly doubled the options in just two years' time.

cczona

Google+

Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Separated
Divorced
Civil union
Domestic partnership
I don't want to say

While Google+ has largely adopted that list, it has not included Separated or Divorced. Notice that they also added something: choice. Opt out of labeling.

cczona

Allowing users to identify their relationships with labels of greater personal significance. That's being driven by people rejecting a user experience that isn't working for them.

cczona

It's a scope

How did some statuses seem universal while others weren't? Naming a thing creates scope. The assumed validity of a field's values gets constrained as soon as the field is named.

cczona

marital_status

Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Separated
Divorced
Civil union
Domestic partnership
I don't want to say

Marital status, for instance, might lead to a list similar to this one. A person is assumed to be either unmarried, preparing to be married, currently married, or formerly married.

cczona

legal_marital_status

Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Separated
Divorced
Civil union
Domestic partnership
I don't want to say

Marital status, for instance, might lead to a list similar to this one. A person is assumed to be either unmarried, preparing to be married, currently married, or formerly married.

cczona

relationship_status

Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Separated
Divorced
Civil union
Domestic partnership
I don't want to say

Whereas relationship status might lead to a list more like this one, in which one either has a current relationship -- or is defined by the absence of any.

cczona

singleness_status

As soon as you name a field, you define its paradigm and possibilities. There are important differences in what each of these envisions and is able to measure. Naming fields -- with great specificity upfront -- makes analyses more powerful later.

cczona

singleness_rating

No chance whatsoever Suuuuuper duper single

1000

If you change the name, you shift the paradigm and possibilities. There's an important difference in what these set out to measure. Naming fields -- with great specificity upfront -- makes analyses more powerful later.

cczona

It's Complicated
In a Relationship
Married
Divorced
Widowed
Single

We go through life experiencing many relationships. They don't all have pre-determined labels. And new relationship identities don't necessarily leave old ones behind.

cczona

"I like to be truthful, and 'It's Complicated' is really deceiving. It is not complicated. I am separated from my husband, who I am still legally married to."

-- Facebook user

[READ] Why should anyone have to feel like they're required to disavow a relationship that's meaningful for them?

It cuts deeply sometimes. Is a widow only "Widowed" until she resumes dating? Why does she have to be confronted with a decision like that just to use someone's app?

cczona

Okay, this one. Spot the fatal flaw. [WAIT]

cczona

buffer overflow

An open relationship is definitionally a 1:Many join -- WITH the usual engineering understanding, that is, that the actual number of relationships may be 0, 1, OR many.
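
One way to picture the alternative, sketched in plain Ruby with hypothetical names (`Person`, `Relationship`, `add_relationship` are not from the talk): relationships become a collection belonging to a person, so zero, one, or many can coexist, each with whatever label the user supplies.

```ruby
# Hypothetical model: a person has zero, one, or many relationships,
# instead of a single relationship_status value.
Relationship = Struct.new(:label, :with)

class Person
  attr_reader :name, :relationships

  def initialize(name)
    @name = name
    @relationships = [] # empty is a valid state, not a missing answer
  end

  # Appends a relationship and returns self so calls can be chained.
  def add_relationship(label, with: nil)
    @relationships << Relationship.new(label, with)
    self
  end
end

# Names and labels below are invented for illustration.
alex = Person.new("Alex")
alex.add_relationship("married", with: "Sam")
    .add_relationship("dating", with: "Jo")

puts alex.relationships.map { |r| "#{r.label} (#{r.with})" }.join(", ")
```

Because the collection can hold more than one entry, nobody has to pick which relationship to disavow in order to fill in the form.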

cczona

Choose (one): evasive, inauthentic

But it's not funny from the personal perspective. This schema forces the person to either LOOK EVASIVE or else BE INAUTHENTIC. This is what happens when we try to throw more labels at the problem instead of re-examining the schema's assumptions.

cczona

Facebook relationship status by user age

srsly?

When we don't give people means to express what's real and important for them, they hack around us. Are a quarter of Facebook's 13-year-olds really married? Probably not. What they probably have is a BEST FRIEND, and "married" is the nearest equivalent available for them to express "THIS PERSON is my most important relationship."

The dogged pursuit of nice, clean, CRUNCHABLE data takes us to such wrong places. And makes our users want to work against a system that doesn't acknowledge their lives' reality.

cczona

We build or break community with each line of code

As developers, our lines of code build -- or break -- an app's community. So it's worth making conscious choices as we go.

cczona

Modeling the real world is complex

[READ] And that's okay.

cczona

Assuming we know who users are surrenders opportunity to learn who they are

But [read]. Early constraints in schema NET crappy, misleading data. [beat] So keep constraints out of human and social schema, at least at first. Gather enough initial response to do some data mining. Watch that data for a while. Keep an eye out for emergent trends. What DISCOVERIES can you make? How can they shape the app's growth and evolution?
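
As a rough illustration of "watch that data for a while," the following plain-Ruby sketch tallies unconstrained free-text answers after only light normalization. The responses are invented for the example, and trimming plus downcasing is an assumption about what counts as "the same answer," not a rule from the talk.

```ruby
# Hypothetical free-text responses, invented for illustration.
responses = ["F", "female", " Female ", "m", "Male", "genderqueer",
             "MYOB", "", nil, "it depends", "Genderqueer"]

# Light normalization only: trim and downcase. Nothing is rejected;
# blank answers are simply left out of the tally.
normalized = responses.map { |r| r.to_s.strip.downcase }.reject(&:empty?)

# Count how often each distinct answer appears.
counts = normalized.each_with_object(Hash.new(0)) { |value, h| h[value] += 1 }

# Most frequent first, alphabetical within ties.
counts.sort_by { |value, count| [-count, value] }.each do |value, count|
  puts format("%-12s %d", value, count)
end

answered = normalized.size
puts "answered: #{answered} of #{responses.size}"
```

Validation rules, if any turn out to be needed, can then be written against values people actually gave rather than values assumed in advance.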

cczona

Quality

Data quality improves when lies are merely optional, not required.

cczona

Specificity

Data becomes like the real world itself: rich and specific. So we can unearth the patterns that are undetectable when data is generic.

cczona

Adaptability

We get to make discoveries and re-envision possibilities.

We adapt quickly.

cczona

Loyalty

And their response to all this? Engagement. Passion. LOYALTY.

cczona

THOSE are foundations for great user experiences. [LOOK UP FINALITY]

Questions? Conversation? Oh Hell Yeah

Carina C. Zona
@cczona
cczona@gmail.com
http://cczona.com

cczona

Resources

cczona

Data Science & Information Architecture

• Sociological normalization
• Database normalization
• Using Machine Learning On Social Networks To Figure Out What You Should Read On The Web
• NoSQL Data Modeling Techniques
• Data & Reality, by William Kent
• Bad Data Handbook, Q. Ethan McCallum
• Data Science of the Facebook World

cczona

Relationships

• Does Facebook Hurt Relationships?
• Facebook Adds LGBT-Friendly Relationship Status Options
• Facebook Targeting by Relationship Status & Workplace
• Your Facebook Relationship Status: It's Complicated
• Gay Marriage: The Database Engineering Perspective

cczona

Sex & Genders

• “Disalienation: Why Gender is a Text Field on Diaspora”
• “Gender & Drop Down Menus”
• “Sex & Gender”
• “Bucket Gender”
• Recommendations for Inclusive Data Collection of Trans People

cczona

Names

• Falsehoods Programmers Believe About Names
• Your Last Name Contains Invalid Characters
• W3C Internationalization: Personal Names Around the World
• Spanish Names
• Chinese Names

cczona

More

• Redesigning the Country Selector
• American Religious Identification Survey Summary Report 2009
• Linguistic Potluck: Crowdsourcing Internationalization in Rails

cczona

Image Credits

Postsecret
Facebook
OKCupid
Google+
Metafilter
Diaspora
Flickr
FetLife
Kotangle
cutestpaw.com
hdwallpapers.in
xkcd

cczona

Many Thanks

Chiu-Ki Chan
Estelle Weyl
Heather Rivers
Heroku
Jeremy Dunck
Josh Susser
Michele Titolo
NIRD LLC
Renée De Voursney
Sarah Mei
San Francisco Sex Information
Yoz Grahame

cczona

Get in Touch

Carina C. Zona
@cczona
http://cczona.com
cczona@gmail.com
http://slideshare.net/cczona
http://linkedin.com/in/cczona


cczona

Construction of social norms

Sociological normalization [READ]

Conflating those works fine if you only want users from among the select few who belong to the idealized norm. But most of us are going for a broad userbase.

cczona

Database Normalization

Mirror real-world concepts and their interrelationships

What we lose track of is one of the core tenets of database normalization. We're supposed to [read]. When the database is in tension with people's own real world -- then it's not people who need to be flexible. It's us.

Why IS this hard? Because at first glance, this stuff looks easy. It's just forms. We've done those a million times.

cczona

But we're getting tripped up by some flawed premises.

cczona

"Gender is one of those things everyone thinks they understand, but most people don't. Like Inception."

-- Sam Killerman

First, a premise that deeply personal stuff about humans can be reduced to lists.

cczona

"'Hey, this is just a system I can figure out easily' is also a problem among engineers first diving into the stock market."

-- xkcd 592

Second, assumptions that canonical lists for these exist. Or at least SURELY must be createable.

cczona

And the third problem is our faith that the first two problems here can be easily solved. Just add more list items! [SCROLL for a while] Easy to wind up looking foolish. Without even having solved the problem.

Social Networks

What IS a social network? What is it, in purely human terms?

Social

It involves trust. Individuals revealing WHO THEY ARE, WHAT THEY VALUE, and WHO THEY CARE about. It's as personal as we get. This is the REAL LIFE that our apps are meant to replicate and build upon.

What relationships are we fostering between person and app? What relationships are we accidentally inhibiting or denying?

cczona

In real life

I know your personhood better than you

sounds presumptuous

cczona

In real life

sounds presumptuous

cczona

In real life

Your existence isn't possible

sounds clueless

cczona

In real life

sounds clueless

cczona

In real life

Who you are is invalid

sounds arrogant

cczona

In real life

sounds arrogant

cczona

Data modeling is psychology, and it is philosophy. It reflects individuals' beliefs about reality rather than reflecting reality itself. It rejects reality's complexity and richness. That's NOT what we set out to do. It's not the COMMUNITY we seek to build.

cczona

[READ]

cczona

"Be conservative in what you do, be liberal in what you accept from others."

-- Postel's Law

Postel's Law reminds us to [READ]. ARE we taking to heart that wisdom?

[PAUSE FOR END OF SECTION]

cczona

Schemas, plural

There are two kinds of schemas.

cczona

Mental Schema

• Pre-conceived ideas
• Framework for representing some aspect of the world
• System of organizing & perceiving new information

Mental Schema are [READ]

cczona

Database Schema

• Structure described in the database's language
• Blueprint for database construction
• Describes how the real world is being modeled

Database schemas are closely related to that. It's a mental schema translated into blueprints for a database.

cczona

These are simply frontend manifestations of various individuals' MENTAL schemas. And [SLOW DOWN] when we look at them closely, we can see [BRIEF PAUSE] that schemas are the foundation [BRIEF PAUSE] for expressing

cczona

deeply intimate

things that are [READ]

cczona

important relationships

Schemas are the foundation for expressing things [READ]

cczona

self-image

Schemas are the foundation for expressing things [READ]

cczona

Schemas define the

user experience

Schemas define user experience

cczona

Our schemas & UX are leaving people behind

[read] We can fix that.

cczona

What benefit will the user notice

By asking ourselves, "What benefit will the USER NOTICE?" THIS is not equivalent to "How will the user benefit?" Because THAT is a question which grants us too much latitude. To ASSUME that what we want is, of course, to their benefit, because it's gonna help us deliver a product that's Awesome.

[READ AGAIN] focuses us on EXPERIENCES

cczona

Evaluating from user perspective gives us focus

There isn't ONE WAY. There isn't ONE RIGHT ANSWER. Because the NATURE of our dilemma is that users vary. And so do their social worlds. So we always need to understand varying approaches. How they fit different contexts. What tradeoffs they bring. AND the obvious, boring question: which options best serve THIS app's business requirements?

cczona

[Slide diagram: Checkbox, Radio button, Select menu, Ranges; Text Input, Textarea; Coerced; Corrective; Discretionary; Guided]

Checkboxes, radio buttons, select menus, ranges: these each imply that EVERY possible value can be -- and is -- included. If any value isn't included there, it's the real world being rejected because it didn't match up with our mental schema.

cczona

[Slide diagram: Checkbox, Radio, Select; Text Input, Textarea; Coerced; Corrective; Discretionary; Guided]

Checking a box is a one-step action. Entering a text string is not. So a freeform solution such as a textarea or text input isn't automatically exciting. BUT they can bring visible, meaningful benefit.

cczona

11 years

At MetaFilter, gender has been a text field for over a decade.

cczona

"…the early crowd at MeFi were often programmers and they hated the idea of dirty data collection…"

-- Matt Haughey, founder

Initially some SHUDDERED at the thought. Because [READ]

cczona

All girl

XY

Fella

Queer

Fembot

Alto

It depends

MYOB

Dangly bits

Chicklet

Innie not outie

Convex

Sideburns

Ambisextrous

Member of the patriarchy

For about 5 seconds. Then they jumped on board. Because they could be creative and silly.

cczona

I speak using the male gender when required by language

50% quintessential tomboy, 50% total girly-girl

AND because they could express this THING about themselves fully. With authentic voice.

cczona

Transgender

Genderqueer

[PAUSE]

cczona

"The freeform gender field is something I cherish about Metafilter"

-- MeFi user

That text field grew into a beloved institution. Whatever you, as user, choose to put into that field says something revealing about who YOU are.

cczona

"It was one of the earliest indications I'd landed in the right place"

-- MeFi user

That you're allowed to put in ANYTHING -- or put in NOTHING AT ALL -- says something revealing about how MetaFilter envisions community. That schema's trust in users was the foundation for MetaFilter users to ASK to share themselves even more.

cczona

In 2010, Diaspora likewise turned gender into a text field. Just like on MetaFilter, users felt set free by it.

There's an important distinction here, though. MetaFilter is closed source, it's one developer, and it's entirely in English. Diaspora needed to meet a different set of needs. It's open source and it's international.

cczona

Not amused

So some of its developers are not amused. It's fair to object that this approach wreaks chaos on internationalization of pronouns. [BEAT] And here's what I've got to say to that.

cczona

Internationalization is hellish, period.

Internationalization of pronouns is hellish, period. You get no safety net in this. Constraining gender options doesn't solve translation of languages. Don't set yourself up for trying to deal with internationalization this way. It will fail you.

cczona

What's your legal gender?

• Indeterminate
• X
• Sex Not Specified

And if you really think that boiling this question down to "What's your legal gender?" is an easy out, let's talk about internationalization of that too. Because this is the year that "indeterminate" has become a legal gender on German birth certificates, and that "X" is a legal gender on Australian passports. In New South Wales, Australia, "Sex Not Specified" is yet another legally-recognized gender. Binary gender is, today, APP FAIL.

cczona

"…the most complicated thing I've ever spent a lot of time learning about. And I've spent a lot of time learning about quantum mechanics."

-- Randall Munroe, xkcd

Randall Munroe, best known as xkcd, has examined the issue of pronouns many times, in depth, for English-language projects. He says it's [READ]. So here's what he ultimately concluded:

cczona

Ask

Ask

cczona

Which pronouns do you prefer?

he / him / himself / his / his
she / her / herself / hers / her
they / them / themself / theirs / their
it / it / itself / its / its
by name

Asking straight up, "Which pronouns do you prefer?" is truly the best he could come up with.

cczona

Which pronouns do you prefer?

he / him / himself / his / his
she / her / herself / hers / her
they / them / themself / theirs / their
personal name
other ____________

Or maybe slightly refining, like these.

And I know you may be looking at that 3rd row and protesting.

cczona

Which pronouns do you prefer?

he / him / himself / his / his
she / her / herself / hers / her
they / them / themself / theirs / their
personal name
other ____________

Our English teachers told us "they" and "their" aren't singular, right? Nope. They're plural AND SINGULAR, and have been for 400 years. Take the word of Jane Austen, Shakespeare, Chaucer, Lewis Carroll. Modern English authorities agree on this one too. It's excellent English -- and excellent SOCIAL -- to use it when people ask us to.
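
A small plain-Ruby sketch of how the pronoun choices on the slide could be stored and used. The five columns follow the slide's order (subject, object, reflexive, possessive pronoun, possessive determiner). `PronounSet`, `PRONOUNS`, and both helper methods are hypothetical names, and the free-form "other" option is deliberately left out here, since it would need the user to supply all five forms.

```ruby
# Pronoun sets in the slide's order: subject, object, reflexive,
# possessive pronoun, possessive determiner.
PronounSet = Struct.new(:subject, :object, :reflexive, :possessive, :determiner)

PRONOUNS = {
  "he"   => PronounSet.new("he", "him", "himself", "his", "his"),
  "she"  => PronounSet.new("she", "her", "herself", "hers", "her"),
  "they" => PronounSet.new("they", "them", "themself", "theirs", "their")
}.freeze

# The "personal name" option: use the name everywhere instead of a pronoun.
def name_as_pronouns(name)
  PronounSet.new(name, name, name, "#{name}'s", "#{name}'s")
end

# Falls back to the person's name when no listed set was chosen.
def pronouns_for(choice, name)
  PRONOUNS.fetch(choice) { name_as_pronouns(name) }
end

# "Robin" is an invented example user.
set = pronouns_for("they", "Robin")
puts "Robin updated #{set.determiner} profile."

set = pronouns_for(nil, "Robin")
puts "Robin updated #{set.determiner} profile."
```

Falling back to the name means the app never has to guess a pronoun for someone who did not pick one.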

cczona

Even Facebook has taken note that it's okay. They've been using this construction for at least 5 years and are doing fine :-)

cczona

"Home is the place where, when you have to go there, they have to take you in."

-- Robert Frost

We aspire to make our social apps feel like home. We can do that by making development choices that say, "Hey user, we GET you. Be as individual and messy as you want to. We can handle _you_ being _you_."

Social Research

When collecting data on people, we're in a different realm: social sciences. If you want to run useful analytics about personal attributes and behavior, then data collection needs to meet at least the two minimum criteria for social science research.

Exhaustive: Every possible option

A field's values must include every possible option.

Mutually Exclusive: No overlap exists between them

And the field's options must be mutually exclusive.

How many social apps have both of those bare minimum criteria covered?
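
The two criteria can be checked mechanically once you know which options truthfully apply to each person. This plain-Ruby sketch assumes such data exists; the method names and the sample answers are hypothetical and invented for illustration.

```ruby
# Each respondent lists every option that truthfully applies to them.
# Exhaustive: nobody is left with zero options.
def exhaustive?(answers)
  answers.all? { |applicable| applicable.size >= 1 }
end

# Mutually exclusive: nobody fits more than one option.
def mutually_exclusive?(answers)
  answers.all? { |applicable| applicable.size <= 1 }
end

# Hypothetical survey results, invented for illustration.
answers = [
  ["Married"],
  ["Married", "Separated"], # two labels are true at once
  [],                       # none of the labels fit
  ["Single"]
]

puts "exhaustive? #{exhaustive?(answers)}"
puts "mutually exclusive? #{mutually_exclusive?(answers)}"
```

A single person with no fitting option, or with two, is enough to fail the corresponding criterion.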

cczona

We've seen this one enough. At least that set of values seems covered fully.

cczona

But it's not.

cczona

Still lacking.

cczona

Trying hard. But still inadequate to express the real world's complexity.

We do realize that the real world has great variety. Of course we do. But it's hard to know how to handle it.

cczona

What user experience does this schema drive us toward?

It's easy to figure, "Eh, we can refactor later." But initial choices set differing user experiences in motion. Thinking at the outset about the real world's variety and complexity starts us asking early questions that set the foundation for social app-building. Ask: what user experience does this schema drive us toward?

cczona

Data doesn't have to be for analysis

It's easy to get into the habit of structuring data for easy analysis. But we can choose to look at human data from a different perspective. Step back. Wallow in the user's perspective.

cczona

Data can be sheer expressiveness

Data can be sheer expressiveness. Data that has character, individualism, distinctiveness.

cczona

What we want:

• Structured
• Predictable
• Relational
• Indexed
• Exhaustive
• Easy analytics
• Data-driven decisions

What we get:

• Validations, exceptions
• Conditionals, partials
• Premature optimization
• Cultural variability
• Individual POV
• Moving target
• Decisions based on false premises

As developers, we have a vision of what a good codebase should be, and not be.

[READ]

In the most sensible of ways, we often arrive at solutions that are factually TRUTHY while far removed from real-life utility.

cczona

What is your religion, if any?

ARIS is the largest ongoing survey of Americans' religious identification. It asks this simple, OPEN-ENDED question: [slide]

Which nets over 100 unique answers. Which, if you're making a form based on that, is tricky. There's no form element that makes it easy for our users to pick themselves out of a list of so many possibilities.

cczona

ARIS found, though, that these could be compressed into 13 major categories. More manageable list, right? We could use that for a form. But eh, a lot of those are edge cases. We'd rather focus on genuinely major groupings.

cczona

What is your religion, if any?

Christian: 76%
Other: 4%
None: 15%
Don't Know or Refused: 5%

Which brings us down to this. At least three quarters of Americans are Christian. Done. Every other religion would just be clutter. Edge cases.

And then there's this sort of crummy data with it. We'd probably assign a nil value for more than one of these categories, right?

cczona

What is your religion?

Christian: 76%
Other: 4%
n/a: 20%

Fixed. Which focuses attention on a problem here: 1 in 5 are not useful answers from an advertiser's perspective, or for our own analytics. The NILS -- gotta go.

cczona

What is your religion?

Christian: 76%
Other: 24%

So what we're left with is a good, clear list. It covers ALL the big stuff. When you get reductive enough, for Americans, religion is a binary. Which from a storage standpoint is great. Booleans. Score.
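
The arithmetic of that reduction, using the percentages shown on the slides, can be written out in a few lines of Ruby. The hash names (`survey`, `step1`, `step2`) are only illustrative; the sketch just shows how much distinguishable information each "cleanup" step discards.

```ruby
# Percentages as shown on the slides.
survey = {
  "Christian" => 76,
  "Other" => 4,
  "None" => 15,
  "Don't Know or Refused" => 5
}

# Step 1: fold the two nil-like answers into a single n/a bucket.
step1 = {
  "Christian" => survey["Christian"],
  "Other" => survey["Other"],
  "n/a" => survey["None"] + survey["Don't Know or Refused"]
}

# Step 2: fold n/a into Other, leaving what is effectively a boolean.
step2 = {
  "Christian" => step1["Christian"],
  "Other" => step1["Other"] + step1["n/a"]
}

p step1
p step2
puts "answers no longer distinguishable from one another: #{step2["Other"]}%"
```

The totals still add up to 100 at every step, which is exactly why the loss is easy to overlook.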

cczona

Religion

But we don't do THIS. It covers the biggest categories. But oh, how it leaves people out.

cczona

People aren't edge cases

People aren't edge cases.

And they're pushing back on apps that treat them that way.

[LONG BEAT LONG] [BREATH] [CHANGE GEARS]

So: Reductive has big problems. So does scaling upward.

cczona

Balancing between approaches

As engineers, it's instinctually uncomfortable to DELIBERATELY NOT STRUCTURE DATA for easy analysis. I feel ya. I really do. This freaks me out too. But again, the foundational question is: What benefit will the user notice? What identities are okay with us?

cczona

And if necessary, we CAN strike a middle ground. This is where guided response comes in. Autosuggest.

cczona

[Slide diagram: Checkbox, Radio, Select; Textarea, Text; Required; Optional; Autosuggest; Minimal Suggest]

When there's a subset of values that you're most interested in, do minimal suggest instead. Autosuggest using just the handful of values you care about.

Structured data from those who want to give it. Free form to incite expressiveness in those who want THAT.
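
A minimal-suggest field might behave like the following plain-Ruby sketch: it offers a short list of suggestions as the user types, yet stores whatever the user finally submits, including nothing at all. `SUGGESTED`, `suggestions_for`, and `record` are hypothetical names, and the suggested values are just examples.

```ruby
# Hypothetical handful of values the app cares most about.
SUGGESTED = ["female", "male", "genderqueer", "transgender"].freeze

# Offer suggestions that start with what the user has typed so far.
def suggestions_for(typed)
  prefix = typed.to_s.strip.downcase
  return [] if prefix.empty?
  SUGGESTED.select { |value| value.start_with?(prefix) }
end

# Accept whatever the user finally submits; only note whether it
# happens to match one of the suggested values.
def record(answer)
  text = answer.to_s.strip
  return { value: nil, structured: false } if text.empty?
  { value: text, structured: SUGGESTED.include?(text.downcase) }
end

p suggestions_for("fe")
p record("Female")
p record("50% tomboy, 50% girly-girl")
p record("")
```

The `structured` flag gives analytics something to filter on without ever turning a free-form answer into a validation error.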

Unguided Text

Of those who use MetaFilter's gender field, 40% of responses are f, m, female, male

[read] So structured data IS there This can be a balanced solution in many cases where youre willing to tolerate some ambiguity Of course there are tradeoffsData quantity is lower Freed to opt out of proving personal info many doOn the other hand data quality should improve

Optional Select60 of Facebook users select a relationship status

Its fine to mix and match here Find the right approach for your users and your apps business objectives Facebook for instance makes relationship status completely optional but coercive for those who do opt-in to setting a value Most users do opt-in 60 of them select a relationship status

cczona

We want everyone to feel excited what weve

built

The bottom line is that [read]

cczona

We want users to feel passionate about their

involvement

[read]

cczona

Analytics investments amp monetization are

based on a premise that data is accurate

[read]

BUT the data has been collected by coercive approaches the RISK is that--

cczona

this premise is garbage Some people are lying because lying has been made requirement for getting past the barriers So conclusions drawn from that bad data can misdirect decision-making about the next stage of development

cczona

CheckboxRadioSelect

TextareaText

Required

Corrective

DiscretionaryGuided

The restrictive options that stuff at bottom left those dont actually have to be marked required

cczona

ADD column gender NOT NULL VARCHAR(6) DEFAULT female

But the way we setup schema often embeds assumptions that we should and we will So we do An attribute that is ambiguously named is destined to become a form field whose data wanders away from its original intent and code that misunderstands how to use it A field thats not allowed to be null is destined to be mandatory A field that is assigned a maximum length is an ASSERTION that all possible values are knowable and fit within it

cczona

ADD column gender_identity NOT NULL VARCHAR

BOOM This is foundation for a whole different user experience And the cool ninja move was that we decided to DO LESS Make this stuff flexible upfront Optimize storage later Decide whats valid later

cczona

CheckboxRadioSelect

TextareaText

Required Optional

Alteration

Guided

So there you have it a discretionary field Whether to respond is left up to the user

cczona

tstring relationship_status null =gt true

As developers we may upon THIS expression as redundant completely unnecessary Duh null is true by default But making that explicit is a communication to the team and to your future self Its a statement of intent Its documenting a product decision

cczona

FacebookSingle

In a relationship

Engaged

Married

Its complicated

Open relationship

What would a canonical set of relationship statuses look like Three years ago Facebook figured this list was pretty good Arguably pretty progressive too right Users disagreed Strongly

cczona

FacebookSingle

In a relationship

Engaged

Married

Its complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

Under pressure Facebook nearly doubled the options in just two years time

cczona

Google+Single

In a relationship

Engaged

Married

Its complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I dont want to say

While Google+ has largely adopted that list it has not included Separated or Divorced Notice that they also added something choice Opt out of labeling

cczona

Allowing users to identify their relationships with labels of greater personal significanceThats being driven by people rejecting a user experience that isnt working for them

cczona

Its a scope

How did some status seem universal while others werent Naming a thing creates scope The assumed validity of a fields values get constrained as soon as the field is named

cczona

marital_statusSingle

In a relationship

Engaged

Married

Its complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I dont want to say

Marital status for instance might lead to a list similar to this one A person is assumed to be either unmarried preparing to be married currently married or formerly married

cczona

legal_marital_statusSingle

In a relationship

Engaged

Married

Its complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I dont want to say

Marital status for instance might lead to a list similar to this one A person is assumed to be either unmarried preparing to be married currently married or formerly married

cczona

relationship_statusSingle

In a relationship

Engaged

Married

Its complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I dont want to say

Whereas relationship status might lead to a list more like this one In which one either has a current relationship -- or is defined by the absence of any

cczona

singleness_status

As soon as you name a field you define its paradigm and possibilities There are important difference in what each of these envision and are able to measure Naming fields -- with great specificity upfront -- Makes analyses more powerful later

cczona

singleness_rating

No chance whatsoever Suuuuuper duper single

1000

If you change the name you shift the paradigm and possibilities Theres important difference in these what these collect set out to measure Naming fields -- With great specificity upfront --Makes analyses more powerful later

cczona

Its ComplicatedIn a Relationship

Married Divorced

WidowedSingle

We go through life experiencing many relationships They dont all have pre-determined labels And new relationship identities dont necessarily leave old ones behind

cczona

I like to be truthful and Its Complicated is really deceiving It is not complicated I am separated from my husband who I am still legally married to

mdashFacebook user

[READ] Why should anyone have to feel like theyre required to disavowing a relationship thats meaningful for them

It cuts deeply sometimes Is a widow only Widowed until she resumes dating Why does she have to be confronted with a decision like that just to use someones app

cczona

Okay this one Spot the fatal flaw [WAIT]

cczona

buffer overflow

An open relationship is definitionally a 1Many join mdash WITH the usual engineering understanding that is that actual number of relationships may be 0 1 OR many

cczona

Choose (one) evasive inauthentic

But its not funny from the personal perspective This schema forces the person to either LOOK EVASIVE or else BE INAUTHENTIC This is what happens when we try to throw more labels at the problem instead of re-examining the schemas assumptions

cczona

Facebook relationship status by user age

srslyWhen we dont give people means to express whats real and important for them they hack around us Are a quarter of Facebooks 13 year olds really married Probably not What they probably have is a BEST FRIEND and married is nearest equivalent available for them to express THIS PERSON is my most important relationship

The dogged pursuit of nice clean CRUNCHABLE data takes us to such wrong places And makes our users want to work against a system that doesnt acknowledge their lives reality

cczona

We build or break community with each

line of code

As developers our lines of code build -- or break -- an apps community So its worth making conscious choices as we go

cczona

Modeling the real world is complex

[READ] And thats okay

cczona

Assuming we know who users are surrenders

opportunity to learn who they are

But [read] Early constraints in schema NET crappy misleading data[beat]So keep constraints out of human and social schema at least at first Gather enough initial response to do some data mining Watch that data for a while Keep an eye out for emergent trends What DISCOVERIES can you make How can they shape the apps growth and evolution

cczona

Quality

Data quality improves when lies are merely optional not required

cczona

Specificity

Data becomes like the real world itself rich and specific So we can unearth the patterns that are undetectable when data is generic

cczona

Adaptability

We get to make discoveries and re-envision possibilities

We adapt quickly

cczona

Loyalty

And their response to all this Engagement Passion LOYALTY

cczona

THOSE are foundations for great user experiences [LOOK UP FINALITY]

cczonacczonagmailcomhttpcczonacom

QuestionsConversation

Oh Hell Yeah

Carina C Zona

cczona

Resources

cczona

Data Science amp Information ArchitecturebullSociological normalization

bullDatabase normalization

bullUsing Machine Learning On Social Networks To Figure Out What You Should Read On The Web

bullNoSQL Data Modeling Techniques

bullData amp Reality by William Kent

bull Bad Data Handbook Q Ethan McCallum

bullData Science of the Facebook World

cczona

Relationships

bullDoes Facebook Hurt Relationships

bullFacebook Adds LGBT-Friendly Relationship Status Options

bullFacebook Targeting by Relationship Status amp Workplace

bullYour Facebook Relationship Status Its Complicated

bullGay Marriage The Database Engineering Perspective

cczona

Sex amp Genders

bullldquoDisalienation Why Gender is a Text Field on Diasporardquo

bull ldquoGender amp Drop Down Menusrdquo

bullldquoSex amp Genderrdquo

bullldquoBucket Genderrdquo

bullRecommendations for Inclusive Data Collection of Trans People

cczona

Names

bullFalsehoods Programmers Believe About Names

bullYour Last Name Contains Invalid Characters

bullW3C Internationalization Personal Names Around the World

bullSpanish Names

bullChinese Names

cczona

More

bullRedesigning the Country Selector

bullAmerican Religious Identification Survey Summary Report 2009

bullLinguistic Potluck Crowdsourcing Internationalization in Rails

cczona

Image Credits

Postsecret

Facebook

OKCupid

Google+

Metafilter

Diaspora

Flickr

FetLife

Kotangle

cutestpaw.com

hdwallpapers.in

xkcd

cczona

Many Thanks

Chiu-Ki Chan, Estelle Weyl

Heather Rivers, Heroku

Jeremy Dunck, Josh Susser

Michele Titolo, NIRD LLC

Renée De Voursney, Sarah Mei

San Francisco Sex Information, Yoz Grahame

cczona

Get in Touch

Carina C. Zona

@cczona

http://cczona.com

cczona@gmail.com

http://slideshare.net/cczona

http://linkedin.com/in/cczona


cczona

Database Normalization

Mirror real-world concepts and their interrelationships

What we lose track of is one of the core tenets of database normalization. We're supposed to [read]. When the database is in tension with people's own real world -- then it's not people who need to be flexible. It's us.

Why IS this hard? Because at first glance, this stuff looks easy. It's just forms. We've done those a million times.

cczona

But we're getting tripped up by some flawed premises.

cczona

"Gender is one of those things everyone thinks they understand, but most people don't. Like Inception."

-- Sam Killermann

First, a premise that deeply personal stuff about humans can be reduced to lists.

cczona

"'Hey, this is just a system I can figure out easily' is also a problem among engineers first diving into the stock market."

-- xkcd 592

Second, assumptions that canonical lists for these exist. Or at least SURELY must be creatable.

cczona

And the third problem is our faith that the first two problems here can be easily solved. "Just add more list items." [SCROLL for a while] Easy to wind up looking foolish. Without even having solved the problem.

Social Networks

What IS a social network? What is it, in purely human terms?

Social

It involves trust. Individuals revealing WHO THEY ARE, WHAT THEY VALUE, and WHO THEY CARE about. It's as personal as we get. This is the REAL LIFE that our apps are meant to replicate and build upon.

What relationships are we fostering between person and app? What relationships are we accidentally inhibiting or denying?

cczona

In real life

I know your personhood better than you

sounds presumptuous

cczona

In real life

sounds presumptuous

cczona

In real life

Your existence isn't possible

sounds clueless

cczona

In real life

sounds clueless

cczona

In real life

Who you are is invalid

sounds arrogant

cczona

In real life

sounds arrogant

cczona

Data modeling is psychology, and it is philosophy. It reflects individuals' beliefs about reality, rather than reflecting reality itself. It rejects reality's complexity and richness. That's NOT what we set out to do. It's not the COMMUNITY we seek to build.

cczona

[READ]

cczona

"Be conservative in what you do, be liberal in what you accept from others."

-- Postel's Law

Postel's Law reminds us to [READ]. ARE we taking to heart that wisdom?

[PAUSE FOR END OF SECTION]

cczona

Schemas plural

There are two kinds of schemas

cczona

Mental Schema

• Pre-conceived ideas

• Framework for representing some aspect of the world

• System of organizing & perceiving new information

Mental Schema are [READ]

cczona

Database Schema

• Structure described in the database's language

• Blueprint for database construction

• Describes how the real world is being modeled

Database schemas are closely related to that. It's a mental schema translated into blueprints for a database.

cczona

These are simply frontend manifestations of various individuals' MENTAL schemas. And [SLOW DOWN] when we look at them closely, we can see [BRIEF PAUSE] that schemas are foundation [BRIEF PAUSE] for expressing

cczona

deeply intimate

things that are [READ]

cczona

important relationships

Schemas are foundation for expressing things [READ]

cczona

self-image

Schemas are foundation for expressing things [READ]

cczona

Schemas define the

user experience

Schemas define user experience

cczona

Our schemas are leaving people behind

& UX

[read] We can fix that.

cczona

What benefit will the user notice?

By asking ourselves, "What benefit will the USER NOTICE?" THIS is not equivalent to "How will the user benefit?" Because THAT is a question which grants us too much latitude. To ASSUME that what we want is, of course, to their benefit, because it's gonna help us deliver a product that's Awesome.

[READ AGAIN] focuses us on EXPERIENCES.

cczona

Evaluating from user perspective gives us focus

There isn't ONE WAY. There isn't ONE RIGHT ANSWER. Because the NATURE of our dilemma is that users vary. And so do their social worlds. So we always need to understand varying approaches. How they fit different contexts. What tradeoffs they bring. AND the obvious, boring question: which options best serve THIS app's business requirements?

cczona

Checkbox, Radio button, Select menu, Ranges

Coerced, Discretionary, Guided

Corrective, Text Input, Textarea

Checkboxes, radio buttons, select menus, ranges: these each imply that EVERY possible value can be -- and is -- included. If any value isn't included there, it's the real world being rejected because it didn't match up with our mental schema.

cczona

Checkbox, Radio, Select

Coerced, Discretionary, Guided

Corrective, Text Input, Textarea

Checking a box is a one step action. Entering a text string is not. So a freeform solution such as a textarea or text input isn't automatically exciting. BUT they can bring visible, meaningful benefit.

cczona

11 years

At MetaFilter, gender has been a text field for over a decade.

cczona

"...the early crowd at MeFi were often programmers and they hated the idea of dirty data collection..."

-- Matt Haughey, founder

Initially some SHUDDERED at the thought. Because [READ].

cczona

All girl

XY

Fella

Queer

Fembot

Alto

It depends

MYOB

Dangly bits

Chicklet

Innie not outie

Convex

Sideburns

Ambisextrous

Member of the patriarchy

For about 5 seconds. Then they jumped on board. Because they could be creative and silly.

cczona

"I speak using the male gender when required by language"

"50% quintessential tomboy, 50% total girly-girl"

AND because they could express this THING about themselves fully. With authentic voice.

cczona

Transgender

Genderqueer

[PAUSE]

cczona

"The freeform gender field is something I cherish about Metafilter."

-- MeFi user

That text field grew into a beloved institution. Whatever you as user choose to put into that field says something revealing about who YOU are.

cczona

"It was one of the earliest indications I'd landed in the right place."

-- MeFi user

That you're allowed to put in ANYTHING -- or put in NOTHING AT ALL -- says something revealing about how MetaFilter envisions community. That schema's trust in users was the foundation for Metafilter users to ASK to share themselves even more.

cczona

In 2010, Diaspora likewise turned gender into a text field. Just like on MetaFilter, users felt set free by it.

There's an important distinction here, though. Metafilter is closed source, it's one developer, and it's entirely in English. Diaspora needed to meet a different set of needs. It's open source, and it's international.

cczona

Not amused

So some of its developers are not amused. It's fair to object that this approach wreaks chaos on internationalization of pronouns. [BEAT] And here's what I've got to say to that:

cczona

Internationalization is hellish, period.

Internationalization of pronouns is hellish, period. You get no safety net in this. Constraining gender options doesn't solve translation of languages. Don't set yourself up for trying to deal with internationalization this way. It will fail you.

cczona

What's your legal gender?

• Indeterminate

• X

• Sex Not Specified

And if you really think that boiling this question down to "What's your legal gender?" is an easy out, let's talk about internationalization of that, too. Because this is the year that "indeterminate" has become a legal gender on German birth certificates, and that "X" is a legal gender on Australian passports. In New South Wales, Australia, "Sex Not Specified" is yet another legally-recognized gender. Binary gender is, today, APP FAIL.

cczona

"the most complicated thing I've ever spent a lot of time learning about. And I've spent a lot of time learning about quantum mechanics."

-- Randall Munroe, xkcd

Randall Munroe, best known as xkcd, has examined the issue of pronouns many times in depth for English language projects. He says it's [READ]. So here's what he ultimately concluded:

cczona

Ask

Ask

cczona

Which pronouns do you prefer?

he / him / himself / his / his

she / her / herself / hers / her

they / them / themself / theirs / their

it / it / itself / its / its

by name

Asking straight up, "Which Pronouns Do You Prefer?", is truly the best he could come up with.

cczona

Which pronouns do you prefer?

he / him / himself / his / his

she / her / herself / hers / her

they / them / themself / theirs / their

personal name

other ____________

Or maybe slightly refining, like these.

And I know you may be looking at that 3rd row and protesting.

cczona

Which pronouns do you prefer?

he / him / himself / his / his

she / her / herself / hers / her

they / them / themself / theirs / their

personal name

other ____________

Our English teachers told us "they" and "their" aren't singular, right? Nope. They're plural AND SINGULAR, and have been for 400 years. Take the word of Jane Austen, Shakespeare, Chaucer, Lewis Carroll. Modern English authorities agree on this one, too. It's excellent English -- and excellent SOCIAL -- to use it when people ask us to.
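For what it's worth, here is one way the "just ask" approach could look in code. It is only a sketch in plain Ruby, not anything taken from xkcd or Facebook: `PronounSet`, `PRESETS`, `pronouns_for`, and `photo_caption` are names made up for illustration, and the caption text is invented.

```ruby
# Field order follows the slide: subject, object, reflexive,
# possessive pronoun, possessive determiner.
PronounSet = Struct.new(:subject, :object, :reflexive, :possessive, :determiner)

# The preset rows offered on the form (hypothetical constant name).
PRESETS = {
  "he"   => PronounSet.new("he", "him", "himself", "his", "his"),
  "she"  => PronounSet.new("she", "her", "herself", "hers", "her"),
  "they" => PronounSet.new("they", "them", "themself", "theirs", "their")
}.freeze

# choice is "he", "she", "they", "name", or "other".
# "other" needs the user's own PronounSet passed as custom:.
def pronouns_for(name, choice, custom: nil)
  case choice
  when "name"
    # Refer to the person by name instead of by pronoun.
    PronounSet.new(name, name, name, "#{name}'s", "#{name}'s")
  when "other"
    custom || raise(ArgumentError, "choice 'other' needs a custom PronounSet")
  else
    PRESETS.fetch(choice)
  end
end

# Example of app copy that depends on the stored preference.
def photo_caption(name, pronouns)
  "#{name} updated #{pronouns.determiner} profile photo."
end

puts photo_caption("Sam", pronouns_for("Sam", "they"))
```

The point of the design is that the app copy reads from whatever the person chose, including a set they typed in themselves under "other", rather than deriving pronouns from a gender column.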

cczona

Even Facebook has taken note that it's okay. They've been using this construction for at least 5 years, and are doing fine :-)

cczona

"Home is the place where, when you have to go there, they have to take you in."

-- Robert Frost

We aspire to make our social apps feel like home. We can do that by making development choices that say, "Hey user, we GET you. Be as individual and messy as you want to. We can handle _you_ being _you_."

Social Research

When collecting data on people, we're in a different realm: social sciences. If you want to run useful analytics about personal attributes and behavior, then data collection needs to meet at least the two minimum criteria for social science research.

Exhaustive: Every possible option

A field's values must include every possible option.

Mutually Exclusive: No overlap exists between them

And the field's options must be mutually exclusive.

How many social apps have both of those bare minimum criteria covered?
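Those two criteria can be checked mechanically against real answers. The following is a rough sketch in plain Ruby under invented assumptions: `audit_options`, the three option rules, and the sample answers are all hypothetical, chosen only to show one answer that no option covers and one that two options cover.

```ruby
# Checks a field's options against observed answers.
# options maps an option label to a rule deciding whether an answer fits it.
def audit_options(options, answers)
  unmatched = []    # answers no option covers: the list is not exhaustive
  overlapping = []  # answers several options cover: not mutually exclusive

  answers.each do |answer|
    hits = options.select { |_label, fits| fits.call(answer) }.keys
    unmatched << answer if hits.empty?
    overlapping << answer if hits.size > 1
  end

  {
    exhaustive: unmatched.empty?,
    mutually_exclusive: overlapping.empty?,
    unmatched: unmatched,
    overlapping: overlapping
  }
end

# Illustrative rules and answers, invented for this sketch.
OPTION_RULES = {
  "Single"            => ->(a) { a[:partners] == 0 },
  "In a relationship" => ->(a) { a[:partners] == 1 },
  "Married"           => ->(a) { a[:married] }
}.freeze

SAMPLE_ANSWERS = [
  { partners: 0, married: false },  # fits "Single" only
  { partners: 1, married: true },   # fits "In a relationship" AND "Married"
  { partners: 2, married: false }   # fits nothing on the list
].freeze

REPORT = audit_options(OPTION_RULES, SAMPLE_ANSWERS)
puts "exhaustive: #{REPORT[:exhaustive]}"
puts "mutually exclusive: #{REPORT[:mutually_exclusive]}"
```

With these sample answers both checks come back false, which is the situation the question above is asking about.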

cczona

We've seen this one enough. At least that set of values seems covered fully.

cczona


But it's not.

cczona

Still lacking

cczona

Trying hard. But still inadequate to express the real world's complexity.

We do realize that the real world has great variety. Of course we do. But it's hard to know how to handle it.

cczona

What user experience does this schema drive us toward?

It's easy to figure, "Eh, we can refactor later." But initial choices set differing user experiences in motion. Thinking at the outset about the real world's variety and complexity starts us asking early questions that set foundation for social app-building. Ask: "What user experience does this schema drive us toward?"

cczona

Data doesn't have to be for analysis

It's easy to get into the habit of structuring data for easy analysis. But we can choose to look at human data from a different perspective. Step back. Wallow in the user's perspective.

cczona

Data can be sheer expressiveness

Data can be sheer expressiveness. Data that has character, individualism, distinctiveness.

cczona

What we want:

• Structured

• Predictable

• Relational

• Indexed

• Exhaustive

• Easy analytics

• Data-driven decisions

What we get:

• Validations, exceptions

• Conditionals, partials

• Premature optimization

• Cultural variability

• Individual POV

• Moving target

• Decisions based on false premises

As developers, we have a vision of what a good codebase should be and not be.

[READ]

In the most sensible of ways, we often arrive at solutions that are factually TRUTHY while far removed from real life utility.

cczona

What is your religion, if any?

ARIS is the largest ongoing survey of Americans' religious identification. It asks this simple OPEN-ENDED question: [slide]

Which nets over 100 unique answers. Which, if you're making a form based on that, is tricky. There's no form element that makes it easy for our users to pick themselves out of a list of so many possibilities.

cczona

ARIS found, though, that these could be compressed into 13 major categories. More manageable list, right? We could use that for a form. But eh, a lot of those are edge cases. We'd rather focus on genuinely major groupings.

cczona

What is your religion, if any?

Christian 76%

Other 4%

None 15%

Don't Know or Refused 5%

Which brings us down to this. At least three quarters of Americans are Christian. Done. Every other religion would just be clutter. Edge cases.

And then there's this sort of crummy data with it. We'd probably assign nil value for more than one of these categories, right?

cczona

What is your religion?

Christian 76%

Other 4%

n/a 20%

Fixed. Which focuses attention on a problem here: 1 in 5 are not useful answers from an advertiser's perspective, or for our own analytics. The NILS -- gotta go.

cczona

What is your religion?

Christian 76%

Other 24%

So what we're left with is a good, clear list. It covers ALL the big stuff. When you get reductive enough for Americans' religion, it's a binary. Which from a storage standpoint is great. Booleans. Score.

cczona

Religion

But we don't do THIS. It covers the biggest categories. But oh, how it leaves people out.

cczona

People aren't edge cases

People aren't edge cases.

And they're pushing back on apps that treat them that way.

[LONG BEAT LONG] [BREATH] [CHANGE GEARS]

So. Reductive has big problems. So does scaling upward.

cczona

Balancing between approaches

As engineers, it's instinctually uncomfortable to DELIBERATELY NOT STRUCTURE DATA for easy analysis. I feel ya. I really do. This freaks me too. But again, the foundational question is: What benefit will the user notice? What identities are okay with us?

cczona

And if necessary, we CAN strike a middle ground. This is where guided response comes in. Autosuggest.

cczona

Checkbox, Radio, Select

Textarea, Text

Required, Optional, Autosuggest

Minimal Suggest

When there's a subset of values that you're most interested in, do "minimal suggest" instead. Autosuggest using just the handful of values you care about.

Structured data from those who want to give it. Free form to incite expressiveness in those who want THAT.
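A minimal sketch of "minimal suggest" in plain Ruby might look like the following. The `SUGGESTED` list and the `suggest` and `accept` helpers are hypothetical names, and the four suggested values are only an example of "the handful of values you care about". The key assumption is that suggestions never act as validation: whatever the person types is accepted, including nothing.

```ruby
# The handful of values the app is most interested in (illustrative only).
SUGGESTED = ["female", "male", "genderqueer", "transgender"].freeze

# Returns the suggestions that start with what the user has typed so far.
def suggest(typed)
  prefix = typed.to_s.strip.downcase
  return [] if prefix.empty?
  SUGGESTED.select { |value| value.start_with?(prefix) }
end

# Accepts any text at all. A blank answer is stored as nil, not rejected.
def accept(input)
  text = input.to_s.strip
  text.empty? ? nil : text
end

puts suggest("fe").inspect
puts accept("It depends").inspect
```

People who want the quick structured answer get it in a keystroke or two, and everyone else still gets a text field.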

Unguided Text

Of those who use MetaFilter's gender field,

40% of responses are "f", "m", "female", "male"

[read] So structured data IS there. This can be a balanced solution in many cases where you're willing to tolerate some ambiguity. Of course there are tradeoffs. Data quantity is lower. Freed to opt out of providing personal info, many do. On the other hand, data quality should improve.
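To make "structured data IS there" concrete, here is a small sketch in plain Ruby of tallying a free-text field after the fact. `CANONICAL` and `summarize` are hypothetical names, and the sample responses are a few of the entries quoted earlier plus some blanks, not MetaFilter's actual data. The original text is never overwritten; the tally is just a view over it.

```ruby
# Spellings we are willing to bucket together (an assumption of this sketch).
CANONICAL = {
  "f" => "female", "female" => "female",
  "m" => "male",   "male"   => "male"
}.freeze

# Counts answers per bucket; anything unrecognized is counted as :freeform.
def summarize(responses)
  given = responses.map { |r| r.to_s.strip }.reject(&:empty?)
  counts = Hash.new(0)
  given.each { |text| counts[CANONICAL.fetch(text.downcase, :freeform)] += 1 }
  { answered: given.size, skipped: responses.size - given.size, counts: counts }
end

SAMPLE_RESPONSES = ["F", "male", "Fembot", "", nil, "m", "It depends"].freeze
SUMMARY = summarize(SAMPLE_RESPONSES)
puts SUMMARY.inspect
```

Blank answers show up as `skipped` rather than being forced into a bucket, which is the quantity-for-quality tradeoff described above.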

Optional Select

60% of Facebook users select a relationship status

It's fine to mix and match here. Find the right approach for your users and your app's business objectives. Facebook, for instance, makes relationship status completely optional, but coercive for those who do opt in to setting a value. Most users do opt in: 60% of them select a relationship status.

cczona

We want everyone to feel excited about what we've built

The bottom line is that [read].

cczona

We want users to feel passionate about their involvement

[read]

cczona

Analytics, investments & monetization are based on a premise that data is accurate

[read]

BUT if the data has been collected by coercive approaches, the RISK is that--

cczona

this premise is garbage. Some people are lying, because lying has been made a requirement for getting past the barriers. So conclusions drawn from that bad data can misdirect decision-making about the next stage of development.

cczona

Checkbox, Radio, Select

Textarea, Text

Required

Corrective

Discretionary, Guided

The restrictive options, that stuff at bottom left: those don't actually have to be marked required.

cczona

ADD COLUMN gender VARCHAR(6) NOT NULL DEFAULT 'female'

But the way we set up schema often embeds assumptions that we should and we will. So we do. An attribute that is ambiguously named is destined to become a form field whose data wanders away from its original intent, and code that misunderstands how to use it. A field that's not allowed to be null is destined to be mandatory. A field that is assigned a maximum length is an ASSERTION that all possible values are knowable and fit within it.

cczona

ADD COLUMN gender_identity VARCHAR NULL

BOOM. This is foundation for a whole different user experience. And the cool ninja move was that we decided to DO LESS. Make this stuff flexible upfront. Optimize storage later. Decide what's valid later.
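Here is a rough way to see what those column definitions assert, sketched in plain Ruby rather than in a real database. The `Column` struct and its `rejections` method are invented for illustration, and "flexible" is assumed here to mean a column that allows nulls and sets no length limit. The sample values come from the MetaFilter list earlier in the talk.

```ruby
# A stand-in for a column definition: name, maximum length, and nullability.
Column = Struct.new(:name, :max_length, :nullable) do
  # Returns the reasons a value would be refused; empty means it can be stored.
  def rejections(value)
    return (nullable ? [] : ["#{name} may not be blank"]) if value.nil?
    return [] if max_length.nil? || value.length <= max_length
    ["#{name} is longer than #{max_length} characters"]
  end
end

# Six characters, never null: the constrained definition.
CONSTRAINED = Column.new("gender", 6, false)
# No length limit, null allowed: the flexible definition (as assumed above).
FLEXIBLE = Column.new("gender_identity", nil, true)

["female", "Genderqueer", "It depends", nil].each do |value|
  puts "#{value.inspect}: #{CONSTRAINED.rejections(value).inspect} vs #{FLEXIBLE.rejections(value).inspect}"
end
```

Every refusal from the constrained column is a person being told their answer is invalid, or being made to answer at all. The flexible column refuses nothing, and what counts as valid can be decided later from the data.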

cczona

Checkbox, Radio, Select

Textarea, Text

Required, Optional

Alteration

Guided

So there you have it: a discretionary field. Whether to respond is left up to the user.

cczona

t.string :relationship_status, :null => true

As developers, we may look upon THIS expression as redundant, completely unnecessary. "Duh, null is true by default." But making that explicit is a communication to the team and to your future self. It's a statement of intent. It's documenting a product decision.

cczona

Facebook

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

What would a canonical set of relationship statuses look like? Three years ago, Facebook figured this list was pretty good. Arguably pretty progressive too, right? Users disagreed. Strongly.

cczona

Facebook

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

Under pressure, Facebook nearly doubled the options in just two years' time.

cczona

Google+

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I don't want to say

While Google+ has largely adopted that list, it has not included Separated or Divorced. Notice that they also added something choice: Opt out of labeling.

cczona

Allowing users to identify their relationships with labels of greater personal significance? That's being driven by people rejecting a user experience that isn't working for them.

cczona

It's a scope

How did some statuses seem universal while others weren't? Naming a thing creates scope. The assumed validity of a field's values gets constrained as soon as the field is named.

cczona

marital_status

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I don't want to say

Marital status, for instance, might lead to a list similar to this one. A person is assumed to be either unmarried, preparing to be married, currently married, or formerly married.

cczona

legal_marital_status

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I don't want to say

Marital status, for instance, might lead to a list similar to this one. A person is assumed to be either unmarried, preparing to be married, currently married, or formerly married.

cczona

relationship_status

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I don't want to say

Whereas relationship status might lead to a list more like this one. In which one either has a current relationship -- or is defined by the absence of any.

cczona

singleness_status

As soon as you name a field, you define its paradigm and possibilities. There are important differences in what each of these envisions and is able to measure. Naming fields -- with great specificity upfront -- makes analyses more powerful later.

cczona

singleness_rating

No chance whatsoever Suuuuuper duper single

1000

If you change the name, you shift the paradigm and possibilities. There are important differences in what these set out to measure. Naming fields -- with great specificity upfront -- makes analyses more powerful later.

cczona

It's Complicated, In a Relationship

Married, Divorced

Widowed, Single

We go through life experiencing many relationships. They don't all have pre-determined labels. And new relationship identities don't necessarily leave old ones behind.

cczona

"I like to be truthful, and 'It's Complicated' is really deceiving. It is not complicated. I am separated from my husband, who I am still legally married to."

-- Facebook user

[READ] Why should anyone have to feel like they're required to disavow a relationship that's meaningful for them?

It cuts deeply sometimes. Is a widow only "Widowed" until she resumes dating? Why does she have to be confronted with a decision like that just to use someone's app?

cczona

Okay, this one. Spot the fatal flaw. [WAIT]

cczona

buffer overflow

An open relationship is definitionally a 1:Many join -- WITH the usual engineering understanding, that is, that the actual number of relationships may be 0, 1, OR many.

cczona

Choose (one): evasive, inauthentic

But it's not funny from the personal perspective. This schema forces the person to either LOOK EVASIVE or else BE INAUTHENTIC. This is what happens when we try to throw more labels at the problem instead of re-examining the schema's assumptions.
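One way to re-examine that assumption is to stop treating relationship status as a single value per person. The sketch below, in plain Ruby, models relationships as a collection that can hold zero, one, or many current entries. `Relationship`, `Profile`, `add_relationship`, and `current_labels` are all hypothetical names, and the people in the example are invented.

```ruby
# One entry per relationship, instead of one status column per person.
Relationship = Struct.new(:label, :with, :current)

class Profile
  attr_reader :relationships

  def initialize
    @relationships = []
  end

  # label is whatever the person calls it; with: is optional.
  def add_relationship(label, with: nil, current: true)
    @relationships << Relationship.new(label, with, current)
    self
  end

  # Zero, one, or many relationships can be current at once.
  def current_labels
    @relationships.select(&:current).map(&:label)
  end
end

PROFILE = Profile.new
PROFILE.add_relationship("Widowed", with: "Pat")
PROFILE.add_relationship("In a relationship", with: "Lee")
puts PROFILE.current_labels.inspect
```

Nobody has to drop "Widowed" in order to add "In a relationship", and an open relationship is simply more than one current entry.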

cczona

Facebook relationship status by user age

srsly

When we don't give people means to express what's real and important for them, they hack around us. Are a quarter of Facebook's 13 year olds really married? Probably not. What they probably have is a BEST FRIEND, and "married" is the nearest equivalent available for them to express "THIS PERSON is my most important relationship."

The dogged pursuit of nice, clean, CRUNCHABLE data takes us to such wrong places. And makes our users want to work against a system that doesn't acknowledge their lives' reality.

cczona

We build or break community with each line of code

As developers, our lines of code build -- or break -- an app's community. So it's worth making conscious choices as we go.

cczona

Modeling the real world is complex

[READ] And that's okay.

cczona

Assuming we know who users are surrenders opportunity to learn who they are


Page 7: Schemas for the Real World [Madison RubyConf 2013]

cczona

But were getting tripped up by some flawed premises

cczona

Gender is one of those things everyone thinks they understand but most people dont Like Inception

mdash Sam Killerman

First a premise that deeply personal stuff about humans can be reduced to lists

cczona

Hey this is just a system I can figure out easily is also a problem among engineers first diving into the stock market

mdashxkcd 592

Second assumptions that canonical lists for these exist Or are at least SURELY must be createable

cczona

And the third problem is our faith that the first two problems here can be easily solved Just add more list items[SCROLL for a while] Easy to wind up looking foolish Without even having solved the problem

Social Networks

What IS a social network What is it in purely human terms

Social

It involves trust Individuals revealing WHO THEY ARE WHAT THEY VALUE and WHO THEY CARE about Its as personal as we get This is the REAL LIFE that our apps are meant to replicate and build upon

What relationships are we fostering between person and app What relationships are we accidentally inhibiting or denying

cczona

In real life

I know your personhood better than you

sounds presumptuous

cczona

In real life

sounds presumptuous

cczona

In real life

Your existence isnt possible

sounds clueless

cczona

In real life

sounds clueless

cczona

In real life

Who you are is invalid

sounds arrogant

cczona

In real life

sounds arrogant

cczona

Data modeling is psychology and it is philosophy It reflect individuals beliefs about reality rather than reflecting reality itself It rejects realitys complexity and richness Thats NOT what we set out to do Its not the COMMUNITY we seek to build

cczona

[READ]

cczona

Be conservative in what you do be liberal in what you accept from others

-Postels Law

Postels Law reminds us to [READ]ARE we taking to heart that wisdom

[PAUSE FOR END OF SECTION]

cczona

Schemas plural

There are two kinds of schemas

cczona

Mental Schema

bull Pre-conceived ideas

bull Framework for representing some aspect of the world

bull System of organizing amp perceiving new information

Mental Schema are [READ]

cczona

Database Schema

bull Structure described in the databases language

bull Blueprint for database construction

bull Describes how the real world is being modeled

Database schemas are closely related to that Its a mental schema translated into blueprints for a database

cczona

These are simply frontend manifestations of various individuals MENTAL schemas And [SLOW DOWN] when we look at them closely we can see [BRIEF PAUSE] that schemas are foundation [BRIEF PAUSE] for expressing

cczona

deeply intimate

things that are [READ]

cczona

important relationships

Schemas are foundation for expressingthings [READ]

cczona

self-image

Schemas are foundation for expressingthings [READ]

cczona

Schemas define the

user experience

Schemas define user experience

cczona

Our schemas are leaving people behind

amp UX

[read]We can fix that

cczona

What benefit will the user notice

By asking ourselves What benefit will the USER NOTICE THIS is not equivalent to How will the user benefit Because THAT is a question which grants us too much latitude To ASSUME that what we want is of course to their benefit because its gonna help us deliver a product thats Awesome

[READ AGAIN] focuses us on EXPERIENCES

cczona

Evaluating from user perspective gives us focus

There isnt ONE WAY There isnt ONE RIGHT ANSWER Because the NATURE of our dilemma is that users vary And so do their social worlds So we always need to understand varying approaches How they fit different contexts What tradeoffs they bring AND the obvious boring question which options best serve THIS apps business requirements

cczona

CheckboxRadio buttonSelect menuRanges

Coerced DiscretionaryGuided

Corrective Text InputTextarea

Checkboxes radio buttons select menus ranges these each imply that EVERY possible value can -- and is -- included If any value isnt included there its real world being rejected because it didnt match up with our mental schema

cczona

CheckboxRadioSelect

Coerced DiscretionaryGuided

Corrective Text InputTextarea

Checking a box is a one step action Entering a text string is not So a freeform solution such as a textarea or text input isnt automatically exciting BUT they can bring visible meaningful benefit

cczona

11 years

At MetaFilter gender has been a text field for over a decade

cczona

hellipthe early crowd at MeFi were often programmers and they hated the idea of dirty data collectionhellip

mdashMatt Haughey founder

Initially some SHUDDERED at the thought Because [READ]

cczona

All girl

XY

Fella

Queer

Fembot

Alto

It depends

MYOB

Dangly bits

Chicklet

Innie not outie

Convex

Sideburns

Ambisextrous

Member of the patriarchy

For about 5 seconds Then they jumped on board Because they could be creative and silly

cczona

I speak using the male gender when required by language 50 quintessential

tomboy 50 total girly-girl

AND because they could express this THING about themselves fully With authentic voice

cczona

TransgenderGenderqueer

[PAUSE]

cczona

The freeform gender field is something I cherish about Metafilter

mdashMeFi user

That text field grew into a beloved institution Whatever you as user choose to put into that field says something revealing about who YOU are

cczona

It was one of the earliest indications Id landed in the right place

mdashMeFi user

That youre allowed to put in ANYTHING -- or put in NOTHING AT ALL -- says something revealing about what how MetaFilter envisions community That schemas trust in users was the foundation for Metafilter users to ASK to share themselves even more

cczona

In 2010 Diaspora likewise turned gender into a text field Just like on MetaFilter users felt set free by it

Theres an important distinction here though Metafilter is closed source its one developer and its entirely in English Diaspora needed to meet a different set of needs Its open source and its international

cczona

Not amused

So some of its developers are not amused Its fair to object that this approach wreaks chaos on internationalization of pronouns [BEAT] And heres what Ive got to say to that

cczona

Internationalization is hellish period

Internationalization of pronouns is hellish period You get no safety net in this Constraining gender options doesnt solve translation of languages Dont set yourself up for trying to deal with internationalization this way It will fail you

cczona

Whats your legal gender

bull Indeterminate

bull X

bull Sex Not Specified

And if you really thing that boiling this question down to Whats your legal gender is an easy out lets talk about internationaliztion of that too Because this is the year that indeterminate has become a legal gender on German birth certificates and that X is a legal gender on Australian passports In New South Wales Australia Sex Not Specified is yet another legally-recognized gender Binary gender is today APP FAIL

"the most complicated thing I've ever spent a lot of time learning about.

And I've spent a lot of time learning about quantum mechanics."

-- Randall Munroe, xkcd

Randall Munroe, best known for xkcd, has examined the issue of pronouns many times, in depth, for English-language projects. He says it's [READ]. So here's what he ultimately concluded:

Ask

Ask.

Which pronouns do you prefer?

he / him / himself / his / his

she / her / herself / hers / her

they / them / themself / theirs / their

it / it / itself / its / its

by name

Asking straight up, "Which pronouns do you prefer?" is truly the best he could come up with.

Which pronouns do you prefer?

he / him / himself / his / his

she / her / herself / hers / her

they / them / themself / theirs / their

personal name

other ____________

Or maybe slightly refining, like these.

And I know you may be looking at that 3rd row and protesting.

Our English teachers told us "they" and "their" aren't singular, right? Nope. They're plural AND SINGULAR, and have been for 400 years. Take the word of Jane Austen, Shakespeare, Chaucer, Lewis Carroll. Modern English authorities agree on this one too. It's excellent English -- and excellent SOCIAL -- to use it when people ask us to.
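A minimal sketch of how the answers to that pronoun question might be stored and used, in plain Ruby with no framework. `PronounSet`, `PRESETS`, and `pronouns_for` are hypothetical names, not something from the slides; the five forms per set follow the slide rows (subject, object, reflexive, possessive pronoun, possessive determiner).

```ruby
# Five grammatical forms per set, in the same order as the slide rows.
PronounSet = Struct.new(:subject, :object, :reflexive, :possessive, :determiner)

PRESETS = {
  "he"   => PronounSet.new("he", "him", "himself", "his", "his"),
  "she"  => PronounSet.new("she", "her", "herself", "hers", "her"),
  "they" => PronounSet.new("they", "them", "themself", "theirs", "their")
}.freeze

# choice: a preset key, "name" (use the personal name), or "other".
# custom: five user-supplied forms, only consulted when choice is "other".
def pronouns_for(choice, name:, custom: nil)
  return PRESETS.fetch(choice) if PRESETS.key?(choice)
  return PronounSet.new(*custom) if choice == "other" && custom && custom.size == 5

  # "personal name", or anything we cannot resolve: fall back to the name
  # rather than guessing a gendered form.
  PronounSet.new(name, name, name, "#{name}'s", "#{name}'s")
end

alex = pronouns_for("they", name: "Alex")
puts "#{alex.subject.capitalize} updated #{alex.determiner} profile."
```

The fallback branch is the design choice worth noticing: when the app doesn't know, it uses the person's name instead of picking a pronoun for them.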

Even Facebook has taken note that it's okay. They've been using this construction for at least 5 years and are doing fine. :-)

"Home is the place where, when you have to go there, they have to take you in."

-- Robert Frost

We aspire to make our social apps feel like home. We can do that by making development choices that say, "Hey user, we GET you. Be as individual and messy as you want to. We can handle _you_ being _you_."

Social Research

When collecting data on people, we're in a different realm: social sciences. If you want to run useful analytics about personal attributes and behavior, then data collection needs to meet at least the two minimum criteria for social science research.

Exhaustive: Every possible option

A field's values must include every possible option.

Mutually Exclusive: No overlap exists between them

And the field's options must be mutually exclusive.

How many social apps have both of those bare minimum criteria covered?
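To make those two criteria concrete, here is a rough sketch in plain Ruby. The option rules and the three sample people are invented for illustration, and `matching_labels`, `exhaustive?`, and `mutually_exclusive?` are hypothetical helper names. The idea: a list is exhaustive if every person fits at least one option, and mutually exclusive if nobody fits more than one.

```ruby
# Each option is a label plus a rule saying whether it applies to a person.
# The rules and sample people below are invented for illustration.
OPTIONS = {
  "Single"    => ->(p) { p[:partners].zero? },
  "Married"   => ->(p) { p[:legally_married] },
  "Separated" => ->(p) { p[:legally_married] && p[:living_apart] }
}.freeze

# Labels from `options` that apply to one person.
def matching_labels(options, person)
  options.select { |_label, rule| rule.call(person) }.keys
end

# Exhaustive: every person matches at least one option.
def exhaustive?(options, people)
  people.all? { |p| matching_labels(options, p).size >= 1 }
end

# Mutually exclusive: no person matches more than one option.
def mutually_exclusive?(options, people)
  people.all? { |p| matching_labels(options, p).size <= 1 }
end

PEOPLE = [
  { partners: 0, legally_married: false, living_apart: false },
  { partners: 1, legally_married: true,  living_apart: true  }, # married AND separated
  { partners: 2, legally_married: false, living_apart: false }  # fits nothing
].freeze

puts "exhaustive? #{exhaustive?(OPTIONS, PEOPLE)}"
puts "mutually exclusive? #{mutually_exclusive?(OPTIONS, PEOPLE)}"
```

With this sample, the list fails both checks: one person fits two labels, and one fits none.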

We've seen this one enough. At least that set of values seems covered fully.

But it's not.

Still lacking.

Trying hard. But still inadequate to express the real world's complexity.

We do realize that the real world has great variety. Of course we do. But it's hard to know how to handle it.

What user experience does this schema drive us toward?

It's easy to figure, "Eh, we can refactor later." But initial choices set differing user experiences in motion. Thinking at the outset about the real world's variety and complexity starts us asking early questions that set the foundation for social app-building. Ask: "What user experience does this schema drive us toward?"

Data doesn't have to be for analysis

It's easy to get into the habit of structuring data for easy analysis. But we can choose to look at human data from a different perspective. Step back. Wallow in the user's perspective.

Data can be sheer expressiveness

Data can be sheer expressiveness. Data that has character, individualism, distinctiveness.

What we want

• Structured

• Predictable

• Relational

• Indexed

• Exhaustive

• Easy analytics

• Data-driven decisions

What we get

• Validations, exceptions

• Conditionals, partials

• Premature optimization

• Cultural variability

• Individual POV

• Moving target

• Decisions based on false premises

As developers, we have a vision of what a good codebase should be, and not be.

[READ]

In the most sensible of ways, we often arrive at solutions that are factually TRUTHY while far removed from real-life utility.

What is your religion, if any?

ARIS is the largest ongoing survey of Americans' religious identification. It asks this simple, OPEN-ENDED question: [slide]

Which nets over 100 unique answers. Which, if you're making a form based on that, is tricky. There's no form element that makes it easy for our users to pick themselves out of a list of so many possibilities.

ARIS found, though, that these could be compressed into 13 major categories. More manageable list, right? We could use that for a form. But, eh, a lot of those are edge cases. We'd rather focus on genuinely major groupings.

What is your religion, if any?

Christian 76%

Other 4%

None 15%

Don't Know or Refused 5%

Which brings us down to this. At least three quarters of Americans are Christian. Done. Every other religion would just be clutter. Edge cases.

And then there's this sort of crummy data with it. We'd probably assign a nil value for more than one of these categories, right?

What is your religion?

Christian 76%

Other 4%

n/a 20%

Fixed. Which focuses attention on a problem here: 1 in 5 are not useful answers from an advertiser's perspective, or for our own analytics. The NILs -- gotta go.

What is your religion?

Christian 76%

Other 24%

So what we're left with is a good, clear list. It covers ALL the big stuff. When you get reductive enough, for Americans, religion is a binary. Which, from a storage standpoint, is great. Booleans! Score!

Religion

But we don't do THIS. It covers the biggest categories. But oh, how it leaves people out.

People aren't edge cases

People aren't edge cases.

And they're pushing back on apps that treat them that way.

[LONG BEAT LONG] [BREATH] [CHANGE GEARS]

So: reductive has big problems. So does scaling upward.

Balancing between approaches

As engineers, it's instinctually uncomfortable to DELIBERATELY NOT STRUCTURE DATA for easy analysis. I feel ya. I really do. This freaks me out too. But again, the foundational question is, "What benefit will the user notice?" What identities are okay with us?

And if necessary, we CAN strike a middle ground. This is where guided response comes in. Autosuggest.

Checkbox, Radio, Select

Textarea, Text

Required, Optional, Autosuggest

Minimal Suggest

When there's a subset of values that you're most interested in, do minimal suggest instead: autosuggest using just the handful of values you care about.

Structured data from those who want to give it. Free form to incite expressiveness in those who want THAT.
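Minimal suggest could look something like this on the server side. This is only a sketch in plain Ruby: the `SUGGESTED` list, `suggest`, and `accept` are assumptions made up for illustration, not code from any of the apps mentioned. Suggestions come from a short list, but whatever the user actually types is what gets kept.

```ruby
# The handful of values the app cares most about (an assumed list).
SUGGESTED = ["female", "male", "genderqueer", "transgender"].freeze

# Suggestions whose start matches what the user has typed so far.
def suggest(typed, candidates = SUGGESTED)
  prefix = typed.to_s.strip.downcase
  return [] if prefix.empty?

  candidates.select { |c| c.start_with?(prefix) }
end

# Accept whatever the user submits. Blank input is stored as nil (no answer);
# everything else is kept exactly as entered apart from surrounding whitespace.
def accept(input)
  value = input.to_s.strip
  value.empty? ? nil : value
end

p suggest("fe")
p accept("  It depends ")
p accept("")
```

Note that `accept` never checks the input against `SUGGESTED`. The list guides; it doesn't gate.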

Unguided Text: Of those who use MetaFilter's gender field,

40% of responses are "f", "m", "female", "male"

[read] So structured data IS there. This can be a balanced solution in many cases where you're willing to tolerate some ambiguity. Of course, there are tradeoffs. Data quantity is lower: freed to opt out of providing personal info, many do. On the other hand, data quality should improve.
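Here is one way the structured part might be pulled back out of an unguided text field after the fact. It's a sketch in plain Ruby; `ALIASES` and `summarize` are hypothetical names, and the sample responses are invented (chosen so the share happens to mirror the slide's figure), not real MetaFilter data.

```ruby
# Spellings we are willing to fold together. Anything else is left as written.
ALIASES = { "f" => "female", "female" => "female",
            "m" => "male",   "male"   => "male" }.freeze

# Returns [counts_by_bucket, share_of_responses_that_matched_an_alias].
def summarize(responses)
  answered = responses.map { |r| r.to_s.strip }.reject(&:empty?)
  counts = Hash.new(0)
  answered.each { |r| counts[ALIASES.fetch(r.downcase, :freeform)] += 1 }

  matched = answered.size - counts[:freeform]
  share = answered.empty? ? 0.0 : matched.to_f / answered.size
  [counts, share]
end

sample = ["F", "male", "Fembot", "m", nil, "It depends", "female ", "Alto", "MYOB", "XY", "Queer"]
counts, share = summarize(sample)
puts counts.inspect
puts format("%.0f%% structured", share * 100)
```

The stored values are never rewritten. Normalizing happens only in the analysis step, so the person's own words stay intact.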

Optional Select: 60% of Facebook users select a relationship status

It's fine to mix and match here. Find the right approach for your users and your app's business objectives. Facebook, for instance, makes relationship status completely optional, but coercive for those who do opt in to setting a value. Most users do opt in: 60% of them select a relationship status.

We want everyone to feel excited about what we've built

The bottom line is that [read]

We want users to feel passionate about their involvement

[read]

Analytics, investments & monetization are based on a premise that data is accurate

[read]

BUT if the data has been collected by coercive approaches, the RISK is that--

this premise is garbage. Some people are lying, because lying has been made a requirement for getting past the barriers. So conclusions drawn from that bad data can misdirect decision-making about the next stage of development.

Checkbox, Radio, Select

Textarea, Text

Required

Corrective

Discretionary, Guided

The restrictive options, that stuff at bottom left: those don't actually have to be marked required.

ADD COLUMN gender VARCHAR(6) NOT NULL DEFAULT 'female'

But the way we set up schema often embeds assumptions that we should, and we will. So we do. An attribute that is ambiguously named is destined to become a form field whose data wanders away from its original intent, and code that misunderstands how to use it. A field that's not allowed to be null is destined to be mandatory. A field that is assigned a maximum length is an ASSERTION that all possible values are knowable and fit within it.

ADD column gender_identity NOT NULL VARCHAR

BOOM. This is the foundation for a whole different user experience. And the cool ninja move was that we decided to DO LESS. Make this stuff flexible upfront. Optimize storage later. Decide what's valid later.
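A toy illustration of what those column constraints assert, in plain Ruby. `Column` here is a made-up stand-in, not a database or Rails API, and the "flexible" column assumes the do-less version drops the length limit, the default, and the not-null requirement, which is how the notes read.

```ruby
# A toy stand-in for a column definition; not a real database or Rails API.
Column = Struct.new(:name, :max_length, :nullable, :default) do
  # Value actually stored when the user submits `input`.
  def store(input)
    value = input.nil? ? default : input
    raise ArgumentError, "#{name} is required" if value.nil? && !nullable
    if max_length && value && value.length > max_length
      raise ArgumentError, "#{name} is longer than #{max_length} characters"
    end
    value
  end
end

constrained = Column.new("gender", 6, false, "female")
flexible    = Column.new("gender_identity", nil, true, nil)

puts constrained.store(nil).inspect   # silence becomes an answer nobody gave
puts flexible.store(nil).inspect      # silence stays silence
puts flexible.store("50% quintessential tomboy, 50% total girly-girl")
begin
  constrained.store("genderqueer")
rescue ArgumentError => e
  puts e.message
end
```

Each constraint on the first column turns into behavior the user feels: a default speaks for them, a length limit rejects them.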

Checkbox, Radio, Select

Textarea, Text

Required, Optional

Alteration

Guided

So there you have it: a discretionary field. Whether to respond is left up to the user.

t.string :relationship_status, :null => true

As developers, we may look upon THIS expression as redundant, completely unnecessary. Duh, null is true by default. But making that explicit is a communication to the team and to your future self. It's a statement of intent. It's documenting a product decision.

Facebook

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

What would a canonical set of relationship statuses look like? Three years ago, Facebook figured this list was pretty good. Arguably pretty progressive too, right? Users disagreed. Strongly.

Facebook

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

Under pressure, Facebook nearly doubled the options in just two years' time.

Google+

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I don't want to say

While Google+ has largely adopted that list, it has not included Separated or Divorced. Notice that they also added something choice: opt out of labeling.

Allowing users to identify their relationships with labels of greater personal significance: that's being driven by people rejecting a user experience that isn't working for them.

It's a scope

How did some statuses seem universal while others weren't? Naming a thing creates scope. The assumed validity of a field's values gets constrained as soon as the field is named.

marital_status

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I don't want to say

Marital status, for instance, might lead to a list similar to this one. A person is assumed to be either unmarried, preparing to be married, currently married, or formerly married.

legal_marital_status

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I don't want to say

relationship_status

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I don't want to say

Whereas relationship status might lead to a list more like this one. In which one either has a current relationship -- or is defined by the absence of any.

singleness_status

As soon as you name a field, you define its paradigm and possibilities. There are important differences in what each of these envision and are able to measure. Naming fields -- with great specificity upfront -- makes analyses more powerful later.

singleness_rating

No chance whatsoever

Suuuuuper duper single

1000

If you change the name, you shift the paradigm and possibilities. There's an important difference in what these collect and set out to measure. Naming fields -- with great specificity upfront -- makes analyses more powerful later.

It's Complicated, In a Relationship

Married, Divorced

Widowed, Single

We go through life experiencing many relationships. They don't all have pre-determined labels. And new relationship identities don't necessarily leave old ones behind.

"I like to be truthful, and 'It's Complicated' is really deceiving. It is not complicated. I am separated from my husband, who I am still legally married to."

-- Facebook user

[READ] Why should anyone have to feel like they're required to disavow a relationship that's meaningful for them?

It cuts deeply sometimes. Is a widow only "Widowed" until she resumes dating? Why does she have to be confronted with a decision like that just to use someone's app?

Okay, this one. Spot the fatal flaw. [WAIT]

buffer overflow

An open relationship is definitionally a 1:Many join -- WITH the usual engineering understanding, that is, that the actual number of relationships may be 0, 1, OR many.

Choose (one): evasive, inauthentic

But it's not funny from the personal perspective. This schema forces the person to either LOOK EVASIVE or else BE INAUTHENTIC. This is what happens when we try to throw more labels at the problem instead of re-examining the schema's assumptions.
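Re-examining the assumption could be as small as this: a profile holds zero, one, or many relationships instead of exactly one status. A sketch in plain Ruby; `Relationship`, `Profile`, and their methods are hypothetical names for illustration, not any app's real model.

```ruby
# One entry per relationship the user chooses to describe. Labels are the
# user's own words; nothing here is a fixed enum.
Relationship = Struct.new(:label, :with)

class Profile
  attr_reader :relationships

  def initialize
    @relationships = []   # zero, one, or many
  end

  def add_relationship(label, with: nil)
    @relationships << Relationship.new(label, with)
    self
  end

  # A display string that never forces a single choice.
  def relationship_summary
    return "(not shared)" if relationships.empty?

    relationships.map { |r| r.with ? "#{r.label} (#{r.with})" : r.label }.join("; ")
  end
end

profile = Profile.new
profile.add_relationship("Married", with: "Pat").add_relationship("Separated")
puts profile.relationship_summary
puts Profile.new.relationship_summary
```

With a collection, "married and separated" or "widowed and dating" stops being a choice between looking evasive and being inauthentic.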

Facebook relationship status by user age

srsly?

When we don't give people means to express what's real and important for them, they hack around us. Are a quarter of Facebook's 13-year-olds really married? Probably not. What they probably have is a BEST FRIEND, and "married" is the nearest equivalent available for them to express "THIS PERSON is my most important relationship."

The dogged pursuit of nice, clean, CRUNCHABLE data takes us to such wrong places. And makes our users want to work against a system that doesn't acknowledge their lives' reality.

We build or break community with each line of code

As developers, our lines of code build -- or break -- an app's community. So it's worth making conscious choices as we go.

Modeling the real world is complex

[READ] And that's okay.

Assuming we know who users are surrenders opportunity to learn who they are

But [read]. Early constraints in schema NET crappy, misleading data. [beat] So keep constraints out of human and social schema, at least at first. Gather enough initial response to do some data mining. Watch that data for a while. Keep an eye out for emergent trends. What DISCOVERIES can you make? How can they shape the app's growth and evolution?

Quality

Data quality improves when lies are merely optional, not required.

Specificity

Data becomes like the real world itself: rich and specific. So we can unearth the patterns that are undetectable when data is generic.

Adaptability

We get to make discoveries and re-envision possibilities.

We adapt quickly.

Loyalty

And their response to all this? Engagement. Passion. LOYALTY.

THOSE are foundations for great user experiences. [LOOK UP. FINALITY]

cczona@gmail.com   http://cczona.com

Questions? Conversation?

Oh, Hell Yeah!

Carina C. Zona

Resources

Data Science & Information Architecture

• Sociological normalization

• Database normalization

• Using Machine Learning On Social Networks To Figure Out What You Should Read On The Web

• NoSQL Data Modeling Techniques

• Data & Reality, by William Kent

• Bad Data Handbook, Q. Ethan McCallum

• Data Science of the Facebook World

Relationships

• Does Facebook Hurt Relationships?

• Facebook Adds LGBT-Friendly Relationship Status Options

• Facebook Targeting by Relationship Status & Workplace

• Your Facebook Relationship Status: It's Complicated

• Gay Marriage: The Database Engineering Perspective

Sex & Genders

• "Disalienation: Why Gender is a Text Field on Diaspora"

• "Gender & Drop Down Menus"

• "Sex & Gender"

• "Bucket Gender"

• Recommendations for Inclusive Data Collection of Trans People

Names

• Falsehoods Programmers Believe About Names

• Your Last Name Contains Invalid Characters

• W3C Internationalization: Personal Names Around the World

• Spanish Names

• Chinese Names

More

• Redesigning the Country Selector

• American Religious Identification Survey Summary Report 2009

• Linguistic Potluck: Crowdsourcing Internationalization in Rails

Image Credits

Postsecret

Facebook

OKCupid

Google+

Metafilter

Diaspora

Flickr

FetLife

Kotangle

cutestpaw.com

hdwallpapers.in

xkcd

Many Thanks

Chiu-Ki Chan

Estelle Weyl

Heather Rivers

Heroku

Jeremy Dunck

Josh Susser

Michele Titolo

NIRD LLC

Renée De Voursney

Sarah Mei

San Francisco Sex Information

Yoz Grahame

Get in Touch

Carina C. Zona

@cczona

http://cczona.com

cczona@gmail.com

http://slideshare.net/cczona

http://linkedin.com/in/cczona


cczona

legal_marital_statusSingle

In a relationship

Engaged

Married

Its complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I dont want to say

Marital status for instance might lead to a list similar to this one A person is assumed to be either unmarried preparing to be married currently married or formerly married

cczona

relationship_statusSingle

In a relationship

Engaged

Married

Its complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I dont want to say

Whereas relationship status might lead to a list more like this one In which one either has a current relationship -- or is defined by the absence of any

cczona

singleness_status

As soon as you name a field you define its paradigm and possibilities There are important difference in what each of these envision and are able to measure Naming fields -- with great specificity upfront -- Makes analyses more powerful later

cczona

singleness_rating

No chance whatsoever Suuuuuper duper single

1000

If you change the name you shift the paradigm and possibilities Theres important difference in these what these collect set out to measure Naming fields -- With great specificity upfront --Makes analyses more powerful later

cczona

Its ComplicatedIn a Relationship

Married Divorced

WidowedSingle

We go through life experiencing many relationships They dont all have pre-determined labels And new relationship identities dont necessarily leave old ones behind

cczona

I like to be truthful and Its Complicated is really deceiving It is not complicated I am separated from my husband who I am still legally married to

mdashFacebook user

[READ] Why should anyone have to feel like theyre required to disavowing a relationship thats meaningful for them

It cuts deeply sometimes Is a widow only Widowed until she resumes dating Why does she have to be confronted with a decision like that just to use someones app

cczona

Okay this one Spot the fatal flaw [WAIT]

cczona

buffer overflow

An open relationship is definitionally a 1Many join mdash WITH the usual engineering understanding that is that actual number of relationships may be 0 1 OR many

cczona

Choose (one) evasive inauthentic

But its not funny from the personal perspective This schema forces the person to either LOOK EVASIVE or else BE INAUTHENTIC This is what happens when we try to throw more labels at the problem instead of re-examining the schemas assumptions

cczona

Facebook relationship status by user age

srslyWhen we dont give people means to express whats real and important for them they hack around us Are a quarter of Facebooks 13 year olds really married Probably not What they probably have is a BEST FRIEND and married is nearest equivalent available for them to express THIS PERSON is my most important relationship

The dogged pursuit of nice clean CRUNCHABLE data takes us to such wrong places And makes our users want to work against a system that doesnt acknowledge their lives reality

cczona

We build or break community with each

line of code

As developers our lines of code build -- or break -- an apps community So its worth making conscious choices as we go

cczona

Modeling the real world is complex

[READ] And thats okay

cczona

Assuming we know who users are surrenders

opportunity to learn who they are

But [read] Early constraints in schema NET crappy misleading data[beat]So keep constraints out of human and social schema at least at first Gather enough initial response to do some data mining Watch that data for a while Keep an eye out for emergent trends What DISCOVERIES can you make How can they shape the apps growth and evolution

cczona

Quality

Data quality improves when lies are merely optional not required

cczona

Specificity

Data becomes like the real world itself rich and specific So we can unearth the patterns that are undetectable when data is generic

cczona

Adaptability

We get to make discoveries and re-envision possibilities

We adapt quickly

cczona

Loyalty

And their response to all this Engagement Passion LOYALTY

cczona

THOSE are foundations for great user experiences [LOOK UP FINALITY]

cczonacczonagmailcomhttpcczonacom

QuestionsConversation

Oh Hell Yeah

Carina C Zona

cczona

Resources

cczona

Data Science amp Information ArchitecturebullSociological normalization

bullDatabase normalization

bullUsing Machine Learning On Social Networks To Figure Out What You Should Read On The Web

bullNoSQL Data Modeling Techniques

bullData amp Reality by William Kent

bull Bad Data Handbook Q Ethan McCallum

bullData Science of the Facebook World

cczona

Relationships

bullDoes Facebook Hurt Relationships

bullFacebook Adds LGBT-Friendly Relationship Status Options

bullFacebook Targeting by Relationship Status amp Workplace

bullYour Facebook Relationship Status Its Complicated

bullGay Marriage The Database Engineering Perspective

cczona

Sex amp Genders

bullldquoDisalienation Why Gender is a Text Field on Diasporardquo

bull ldquoGender amp Drop Down Menusrdquo

bullldquoSex amp Genderrdquo

bullldquoBucket Genderrdquo

bullRecommendations for Inclusive Data Collection of Trans People

cczona

Names

bullFalsehoods Programmers Believe About Names

bullYour Last Name Contains Invalid Characters

bullW3C Internationalization Personal Names Around the World

bullSpanish Names

bullChinese Names

cczona

More

bullRedesigning the Country Selector

bullAmerican Religious Identification Survey Summary Report 2009

bullLinguistic Potluck Crowdsourcing Internationalization in Rails

cczona

Image CreditsPostsecret

Facebook

OKCupid

Google+

Metafilter

Diaspora

Flickr

FetLife

Kotangle

cutestpawcom

hdwallpapersin

xkcd

cczona

Many ThanksChiu-Ki ChanEstelle Weyl

Heather RiversHeroku

Jeremy DunckJosh Susser

Michele TitoloNIRD LLC

Reneacutee De VoursneySarah Mei

San Francisco Sex InformationYoz Grahame

cczona

Get in Touch

Carina C Zonacczona

httpcczonacom

cczonagmailcom

httpslidesharenetcczona

httplinkedincomincczona


cczona

"Hey, this is just a system I can figure out easily" is also a problem among engineers first diving into the stock market.

-- xkcd #592

Second: assumptions that canonical lists for these exist. Or at least SURELY must be createable.

cczona

And the third problem is our faith that the first two problems here can be easily solved. Just add more list items. [SCROLL for a while] Easy to wind up looking foolish. Without even having solved the problem.

Social Networks

What IS a social network? What is it in purely human terms?

Social

It involves trust. Individuals revealing WHO THEY ARE, WHAT THEY VALUE, and WHO THEY CARE about. It's as personal as we get. This is the REAL LIFE that our apps are meant to replicate and build upon.

What relationships are we fostering between person and app? What relationships are we accidentally inhibiting or denying?

cczona

In real life

I know your personhood better than you

sounds presumptuous

cczona

In real life

sounds presumptuous

cczona

In real life

Your existence isn't possible

sounds clueless

cczona

In real life

sounds clueless

cczona

In real life

Who you are is invalid

sounds arrogant

cczona

In real life

sounds arrogant

cczona

Data modeling is psychology, and it is philosophy. It reflects individuals' beliefs about reality, rather than reflecting reality itself. It rejects reality's complexity and richness. That's NOT what we set out to do. It's not the COMMUNITY we seek to build.

cczona

[READ]

cczona

Be conservative in what you do, be liberal in what you accept from others.

-- Postel's Law

Postel's Law reminds us to [READ]. ARE we taking to heart that wisdom?

[PAUSE FOR END OF SECTION]

cczona

Schemas, plural

There are two kinds of schemas.

cczona

Mental Schema

• Pre-conceived ideas
• Framework for representing some aspect of the world
• System of organizing & perceiving new information

Mental Schema are [READ]

cczona

Database Schema

• Structure described in the database's language
• Blueprint for database construction
• Describes how the real world is being modeled

Database schemas are closely related to that. A database schema is a mental schema translated into blueprints for a database.

cczona

These are simply frontend manifestations of various individuals' MENTAL schemas. And [SLOW DOWN] when we look at them closely, we can see [BRIEF PAUSE] that schemas are foundation [BRIEF PAUSE] for expressing

cczona

deeply intimate

things that are [READ]

cczona

important relationships

Schemas are foundation for expressing things [READ]

cczona

self-image

Schemas are foundation for expressing things [READ]

cczona

Schemas define the user experience

Schemas define user experience.

cczona

Our schemas & UX are leaving people behind

[read] We can fix that.

cczona

What benefit will the user notice?

By asking ourselves, "What benefit will the USER NOTICE?" THIS is not equivalent to "How will the user benefit?" Because THAT is a question which grants us too much latitude to ASSUME that what we want is of course to their benefit, because it's gonna help us deliver a product that's Awesome.

[READ AGAIN] focuses us on EXPERIENCES.

cczona

Evaluating from user perspective gives us focus

There isn't ONE WAY. There isn't ONE RIGHT ANSWER. Because the NATURE of our dilemma is that users vary. And so do their social worlds. So we always need to understand varying approaches. How they fit different contexts. What tradeoffs they bring. AND the obvious, boring question: which options best serve THIS app's business requirements?

cczona

Checkbox, Radio button, Select menu, Ranges
Coerced, Discretionary, Guided
Corrective, Text Input, Textarea

Checkboxes, radio buttons, select menus, ranges: these each imply that EVERY possible value can be -- and is -- included. If any value isn't included there, it's the real world being rejected because it didn't match up with our mental schema.

cczona

Checkbox, Radio, Select
Coerced, Discretionary, Guided
Corrective, Text Input, Textarea

Checking a box is a one-step action. Entering a text string is not. So a freeform solution such as a textarea or text input isn't automatically exciting. BUT they can bring visible, meaningful benefit.

cczona

11 years

At MetaFilter, gender has been a text field for over a decade.

cczona

"...the early crowd at MeFi were often programmers and they hated the idea of dirty data collection..."

-- Matt Haughey, founder

Initially some SHUDDERED at the thought. Because [READ]

cczona

All girl

XY

Fella

Queer

Fembot

Alto

It depends

MYOB

Dangly bits

Chicklet

Innie not outie

Convex

Sideburns

Ambisextrous

Member of the patriarchy

For about 5 seconds. Then they jumped on board. Because they could be creative and silly.

cczona

I speak using the male gender when required by language

50% quintessential tomboy, 50% total girly-girl

AND because they could express this THING about themselves fully. With authentic voice.

cczona

Transgender
Genderqueer

[PAUSE]

cczona

"The freeform gender field is something I cherish about MetaFilter."

-- MeFi user

That text field grew into a beloved institution. Whatever you, as user, choose to put into that field says something revealing about who YOU are.

cczona

"It was one of the earliest indications I'd landed in the right place."

-- MeFi user

That you're allowed to put in ANYTHING -- or put in NOTHING AT ALL -- says something revealing about how MetaFilter envisions community. That schema's trust in users was the foundation for MetaFilter users to ASK to share themselves even more.

cczona

In 2010, Diaspora likewise turned gender into a text field. Just like on MetaFilter, users felt set free by it.

There's an important distinction here, though. MetaFilter is closed source, it's one developer, and it's entirely in English. Diaspora needed to meet a different set of needs. It's open source and it's international.

cczona

Not amused

So some of its developers are not amused. It's fair to object that this approach wreaks chaos on internationalization of pronouns. [BEAT] And here's what I've got to say to that:

cczona

Internationalization is hellish, period.

Internationalization of pronouns is hellish, period. You get no safety net in this. Constraining gender options doesn't solve translation of languages. Don't set yourself up for trying to deal with internationalization this way. It will fail you.

cczona

What's your legal gender?
• Indeterminate
• X
• Sex Not Specified

And if you really think that boiling this question down to "What's your legal gender?" is an easy out, let's talk about internationalization of that too. Because this is the year that "indeterminate" has become a legal gender on German birth certificates, and that "X" is a legal gender on Australian passports. In New South Wales, Australia, "Sex Not Specified" is yet another legally-recognized gender. Binary gender is, today, APP FAIL.

cczona

"the most complicated thing I've ever spent a lot of time learning about. And I've spent a lot of time learning about quantum mechanics."

-- Randall Munroe, xkcd

Randall Munroe, best known as xkcd, has examined the issue of pronouns many times, in depth, for English language projects. He says it's [READ] So here's what he ultimately concluded:

cczona

Ask

Ask

cczona

Which pronouns do you prefer?
he / him / himself / his / his
she / her / herself / hers / her
they / them / themself / theirs / their
it / it / itself / its / its
by name

Asking straight up, "Which pronouns do you prefer?" is truly the best he could come up with.

cczona

Which pronouns do you prefer?
he / him / himself / his / his
she / her / herself / hers / her
they / them / themself / theirs / their
personal name
other ____________

Or maybe slightly refining, like these.

And I know you may be looking at that 3rd row and protesting.

cczona

Which pronouns do you prefer?
he / him / himself / his / his
she / her / herself / hers / her
they / them / themself / theirs / their
personal name
other ____________

Our English teachers told us "they" and "their" aren't singular, right? Nope. They're plural AND SINGULAR, and have been for 400 years. Take the word of Jane Austen, Shakespeare, Chaucer, Lewis Carroll. Modern English authorities agree on this one too. It's excellent English -- and excellent SOCIAL -- to use it when people ask us to.
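A form like the one on these slides needs somewhere to keep the answer. What follows is a minimal Ruby sketch, not any site's actual implementation: `PronounSet`, `PRESETS`, and `pronouns_for` are hypothetical names, and the field order simply follows the slide rows.

```ruby
# Hypothetical sketch: every name here is illustrative, not from a real app.
# Field order follows the slide rows, e.g. they / them / themself / theirs / their.
PronounSet = Struct.new(:subject, :object, :reflexive, :possessive, :determiner)

PRESETS = {
  "he"   => PronounSet.new("he", "him", "himself", "his", "his"),
  "she"  => PronounSet.new("she", "her", "herself", "hers", "her"),
  "they" => PronounSet.new("they", "them", "themself", "theirs", "their")
}.freeze

# choice: a preset key, "other" (with five custom words), or anything else
# (including nil), which falls back to using the person's name.
def pronouns_for(choice, name:, custom: nil)
  return PRESETS.fetch(choice) if PRESETS.key?(choice)
  return PronounSet.new(*custom) if choice == "other" && custom && custom.size == 5

  possessive = "#{name}'s"
  PronounSet.new(name, name, name, possessive, possessive)
end
```

The "other ____" row maps to the `custom:` argument, so an answer the presets never anticipated (say, an invented five-word set) is stored as given rather than rejected. Leaving the question unanswered falls back to the person's name instead of guessing.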

cczona

Even Facebook has taken note that it's okay. They've been using this construction for at least 5 years and are doing fine :-)

cczona

"Home is the place where, when you have to go there, they have to take you in."

-- Robert Frost

We aspire to make our social apps feel like home. We can do that by making development choices that say, "Hey user, we GET you. Be as individual and messy as you want to. We can handle _you_ being _you_."

Social Research

When collecting data on people, we're in a different realm: social sciences. If you want to run useful analytics about personal attributes and behavior, then data collection needs to meet at least the two minimum criteria for social science research.

Exhaustive: Every possible option

A field's values must include every possible option.

Mutually Exclusive: No overlap exists between them

And the field's options must be mutually exclusive.

How many social apps have both of those bare minimum criteria covered?
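Both criteria can be checked mechanically against a sample of real answers. The Ruby sketch below is only an illustration under stated assumptions: the two option lists and the sample people are made up for the example, and `exhaustive?` and `mutually_exclusive?` are hypothetical helper names. A sample can only disprove exhaustiveness, never prove it.

```ruby
# Hypothetical option lists for a single-choice relationship field.
SHORT_OPTIONS = ["Single", "In a relationship", "Engaged", "Married",
                 "It's complicated", "Open relationship"].freeze
LONG_OPTIONS = (SHORT_OPTIONS + ["Widowed", "Separated", "Divorced",
                                 "Civil union", "Domestic partnership"]).freeze

# Each inner array is one made-up person's real-life statuses.
SAMPLE_PEOPLE = [
  ["Married"],
  ["Married", "Separated"],          # still legally married, living apart
  ["Widowed", "In a relationship"],
  ["Single"]
].freeze

# Exhaustive (for this sample): every status anyone holds is on the list.
def exhaustive?(options, people)
  people.all? { |statuses| statuses.all? { |s| options.include?(s) } }
end

# Mutually exclusive (for this sample): nobody matches more than one option.
def mutually_exclusive?(options, people)
  people.all? { |statuses| (statuses & options).size <= 1 }
end
```

With these samples the short list fails `exhaustive?` (nobody can say "Separated" or "Widowed"), and the long list passes it but fails `mutually_exclusive?`, because one person is both "Married" and "Separated". Adding options fixes one criterion by breaking the other.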

cczona

We've seen this one enough. At least that set of values seems covered fully.

cczona

But it's not.

cczona

Still lacking.

cczona

Trying hard. But still inadequate to express the real world's complexity.

We do realize that the real world has great variety. Of course we do. But it's hard to know how to handle it.

cczona

What user experience does this schema drive us toward?

It's easy to figure, "Eh, we can refactor later." But initial choices set differing user experiences in motion. Thinking at the outset about the real world's variety and complexity starts us asking early questions that set foundation for social app-building. Ask: "What user experience does this schema drive us toward?"

cczona

Data doesn't have to be for analysis

It's easy to get into the habit of structuring data for easy analysis. But we can choose to look at human data from a different perspective. Step back. Wallow in the user's perspective.

cczona

Data can be sheer expressiveness

Data can be sheer expressiveness. Data that has character, individualism, distinctiveness.

cczona

What we want
• Structured
• Predictable
• Relational
• Indexed
• Exhaustive
• Easy analytics
• Data-driven decisions

What we get
• Validations, exceptions
• Conditionals, partials
• Premature optimization
• Cultural variability
• Individual POV
• Moving target
• Decisions based on false premises

As developers, we have a vision of what a good codebase should be, and not be.

[READ]

In the most sensible of ways, we often arrive at solutions that are factually TRUTHY while far removed from real-life utility.

cczona

What is your religion, if any?

ARIS is the largest ongoing survey of Americans' religious identification. It asks this simple, OPEN-ENDED question. [slide]

Which nets over 100 unique answers. Which, if you're making a form based on that, is tricky. There's no form element that makes it easy for our users to pick themselves out of a list of so many possibilities.

cczona

ARIS found, though, that these could be compressed into 13 major categories. More manageable list, right? We could use that for a form. But, eh, a lot of those are edge cases. We'd rather focus on genuinely major groupings.

cczona

What is your religion, if any?
Christian: 76%
Other: 4%
None: 15%
Don't Know or Refused: 5%

Which brings us down to this. At least three quarters of Americans are Christian. Done. Every other religion would just be clutter. Edge cases.

And then there's this sort of crummy data with it. We'd probably assign nil value for more than one of these categories, right?

cczona

What is your religion?
Christian: 76%
Other: 4%
n/a: 20%

Fixed. Which focuses attention on a problem here: 1 in 5 are not useful answers from an advertiser's perspective, or for our own analytics. The NILS -- gotta go.

cczona

What is your religion?
Christian: 76%
Other: 24%

So what we're left with is a good, clear list. It covers ALL the big stuff. When you get reductive enough, for Americans, religion is a binary. Which, from a storage standpoint, is great. Booleans! Score.
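The arithmetic behind those three slides is worth spelling out, because each "cleanup" step is just addition that erases a distinction. A small Ruby sketch follows, using the percentages from the slides; `collapse` is a hypothetical helper written for this illustration.

```ruby
# Fold the shares of the `from` labels into the `into` label.
def collapse(shares, from:, into:)
  merged = shares.reject { |label, _| from.include?(label) }
  merged[into] = merged.fetch(into, 0) + from.sum { |label| shares.fetch(label) }
  merged
end

# Percentages as shown on the slide.
survey = {
  "Christian" => 76,
  "Other" => 4,
  "None" => 15,
  "Don't Know or Refused" => 5
}

# Step 1: treat two different answers as the same "nil".
with_nils = collapse(survey, from: ["None", "Don't Know or Refused"], into: "n/a")

# Step 2: make the nils disappear into "Other".
binary = collapse(with_nils, from: ["n/a"], into: "Other")
```

Here `with_nils` holds 76 / 4 / 20 and `binary` holds 76 / 24. The totals still add up to 100 at every step, which is exactly why the loss is easy to miss: "None", "Refused", and every non-Christian religion have become one indistinguishable value.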

cczona

Religion

But we don't do THIS. It covers the biggest categories. But oh, how it leaves people out.

cczona

People aren't edge cases

People aren't edge cases.

And they're pushing back on apps that treat them that way.

[LONG BEAT LONG] [BREATH] [CHANGE GEARS]

So: reductive has big problems. So does scaling upward.

cczona

Balancing between approaches

As engineers, it's instinctually uncomfortable to DELIBERATELY NOT STRUCTURE DATA for easy analysis. I feel ya. I really do. This freaks me out too. But again, the foundational question is "What benefit will the user notice?" What identities are okay with us?

cczona

And if necessary, we CAN strike a middle ground. This is where guided response comes in. Autosuggest.

cczona

Checkbox, Radio, Select
Textarea, Text
Required, Optional, Autosuggest
Minimal Suggest

When there's a subset of values that you're most interested in, do minimal suggest instead. Autosuggest using just the handful of values you care about.

Structured data from those who want to give it. Free form to incite expressiveness in those who want THAT.
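Minimal suggest can be sketched in a few lines of Ruby. This is an assumption-laden illustration, not a real autosuggest widget: `SUGGESTED`, `suggestions_for`, and `accept` are hypothetical names, and the two suggested values are just an example of "the handful you care about".

```ruby
# The handful of values the app happens to care about (illustrative).
SUGGESTED = ["female", "male"].freeze

# Offer suggestions that start with what the user has typed so far.
def suggestions_for(typed, suggested = SUGGESTED)
  prefix = typed.to_s.strip.downcase
  return [] if prefix.empty?

  suggested.select { |value| value.start_with?(prefix) }
end

# Whatever the user finally submits is stored as written.
# An empty answer becomes nil: no response, not a coerced default.
def accept(typed)
  text = typed.to_s.strip
  text.empty? ? nil : text
end
```

Typing "f" offers "female", and typing "genderq" offers nothing, yet `accept` still stores "Genderqueer" exactly as entered. The suggestion list guides; it never gates.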

Unguided Text

Of those who use MetaFilter's gender field, 40% of responses are "f", "m", "female", "male"

[read] So structured data IS there. This can be a balanced solution in many cases where you're willing to tolerate some ambiguity. Of course, there are tradeoffs. Data quantity is lower: freed to opt out of providing personal info, many do. On the other hand, data quality should improve.

Optional Select

60% of Facebook users select a relationship status

It's fine to mix and match here. Find the right approach for your users and your app's business objectives. Facebook, for instance, makes relationship status completely optional, but coercive for those who do opt in to setting a value. Most users do opt in: 60% of them select a relationship status.

cczona

We want everyone to feel excited about what we've built

The bottom line is that [read]

cczona

We want users to feel passionate about their involvement

[read]

cczona

Analytics, investments & monetization are based on a premise that data is accurate

[read]

BUT when the data has been collected by coercive approaches, the RISK is that--

cczona

this premise is garbage. Some people are lying because lying has been made a requirement for getting past the barriers. So conclusions drawn from that bad data can misdirect decision-making about the next stage of development.

cczona

Checkbox, Radio, Select
Textarea, Text
Required
Corrective
Discretionary, Guided

The restrictive options, that stuff at bottom left: those don't actually have to be marked required.

cczona

ADD column gender NOT NULL VARCHAR(6) DEFAULT 'female'

But the way we set up schema often embeds assumptions that we should, and we will. So we do. An attribute that is ambiguously named is destined to become a form field whose data wanders away from its original intent, and code that misunderstands how to use it. A field that's not allowed to be null is destined to be mandatory. A field that is assigned a maximum length is an ASSERTION that all possible values are knowable and fit within it.

cczona

ADD column gender_identity NOT NULL VARCHAR

BOOM. This is foundation for a whole different user experience. And the cool ninja move was that we decided to DO LESS. Make this stuff flexible upfront. Optimize storage later. Decide what's valid later.
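To make the contrast concrete, here is a rough Ruby sketch of what each column definition asserts about acceptable answers. It models the constraints in plain Ruby rather than a real database, the function names are hypothetical, and it assumes (from the notes about a discretionary field) that the flexible column allows no answer at all.

```ruby
MAX_LENGTH = 6 # mirrors VARCHAR(6) on the first slide

# Constrained column: a value is required and must fit in six characters.
def fits_constrained_column?(value)
  !value.nil? && value.length <= MAX_LENGTH
end

# Flexible column (assumed nullable, no length limit): any string, or nothing.
def fits_flexible_column?(value)
  value.nil? || value.is_a?(String)
end

# Sample answers; nil stands for a user who chose not to respond.
answers = ["female", "male", "Genderqueer", "It depends", nil]

rejected_by_constrained = answers.reject { |a| fits_constrained_column?(a) }
rejected_by_flexible    = answers.reject { |a| fits_flexible_column?(a) }
```

The constrained definition turns away "Genderqueer", "It depends", and the non-answer, while the flexible one turns away nothing. Which answers count as valid can still be decided later, once there is real data to look at.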

cczona

Checkbox, Radio, Select
Textarea, Text
Required, Optional
Alteration
Guided

So there you have it: a discretionary field. Whether to respond is left up to the user.

cczona

t.string :relationship_status, :null => true

As developers, we may look upon THIS expression as redundant, completely unnecessary. Duh, null is true by default. But making that explicit is a communication to the team, and to your future self. It's a statement of intent. It's documenting a product decision.

cczona

Facebook
Single
In a relationship
Engaged
Married
It's complicated
Open relationship

What would a canonical set of relationship statuses look like? Three years ago, Facebook figured this list was pretty good. Arguably pretty progressive, too, right? Users disagreed. Strongly.

cczona

Facebook
Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Separated
Divorced
Civil union
Domestic partnership

Under pressure, Facebook nearly doubled the options in just two years' time.

cczona

Google+
Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Separated
Divorced
Civil union
Domestic partnership
I don't want to say

While Google+ has largely adopted that list, it has not included Separated or Divorced. Notice that they also added something: choice. Opt out of labeling.

cczona

Allowing users to identify their relationships with labels of greater personal significance. That's being driven by people rejecting a user experience that isn't working for them.

cczona

It's a scope

How did some statuses seem universal while others weren't? Naming a thing creates scope. The assumed validity of a field's values gets constrained as soon as the field is named.

cczona

marital_status
Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Separated
Divorced
Civil union
Domestic partnership
I don't want to say

Marital status, for instance, might lead to a list similar to this one. A person is assumed to be either unmarried, preparing to be married, currently married, or formerly married.

cczona

legal_marital_status
Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Separated
Divorced
Civil union
Domestic partnership
I don't want to say

Marital status, for instance, might lead to a list similar to this one. A person is assumed to be either unmarried, preparing to be married, currently married, or formerly married.

cczona

relationship_status
Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Separated
Divorced
Civil union
Domestic partnership
I don't want to say

Whereas relationship status might lead to a list more like this one, in which one either has a current relationship -- or is defined by the absence of any.

cczona

singleness_status

As soon as you name a field, you define its paradigm and possibilities. There are important differences in what each of these envisions and is able to measure. Naming fields -- with great specificity, upfront -- makes analyses more powerful later.

cczona

singleness_rating

No chance whatsoever Suuuuuper duper single

1000

If you change the name, you shift the paradigm and possibilities. There's an important difference in what these set out to measure. Naming fields -- with great specificity, upfront -- makes analyses more powerful later.

cczona

It's Complicated
In a Relationship
Married
Divorced
Widowed
Single

We go through life experiencing many relationships. They don't all have pre-determined labels. And new relationship identities don't necessarily leave old ones behind.

cczona

"I like to be truthful, and 'It's Complicated' is really deceiving. It is not complicated. I am separated from my husband, who I am still legally married to."

-- Facebook user

[READ] Why should anyone have to feel like they're required to disavow a relationship that's meaningful for them?

It cuts deeply sometimes. Is a widow only Widowed until she resumes dating? Why does she have to be confronted with a decision like that, just to use someone's app?

cczona

Okay, this one. Spot the fatal flaw. [WAIT]

cczona

buffer overflow

An open relationship is definitionally a 1:Many join -- WITH the usual engineering understanding, that is, that the actual number of relationships may be 0, 1, OR many.

cczona

Choose (one): evasive, inauthentic

But it's not funny from the personal perspective. This schema forces the person to either LOOK EVASIVE, or else BE INAUTHENTIC. This is what happens when we try to throw more labels at the problem, instead of re-examining the schema's assumptions.
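One assumption worth re-examining is that a person has exactly one status. The Ruby sketch below shows a profile that holds zero, one, or many relationship labels in the person's own words. It is a minimal illustration only: `Profile` and its methods are hypothetical names, not any real app's model.

```ruby
# Hypothetical sketch: a profile holding 0, 1, or many relationship labels.
class Profile
  attr_reader :relationship_labels

  def initialize
    @relationship_labels = [] # nothing is required up front
  end

  # Adds a label as the person wrote it; earlier labels are kept, not replaced.
  def add_relationship_label(label)
    text = label.to_s.strip
    @relationship_labels << text unless text.empty? || @relationship_labels.include?(text)
    self
  end
end

widow = Profile.new
widow.add_relationship_label("Widowed").add_relationship_label("In a relationship")

separated = Profile.new
separated.add_relationship_label("Married").add_relationship_label("Separated")

undeclared = Profile.new # no labels at all is a valid state
```

Nobody here is forced to pick between looking evasive and being inauthentic: the widow keeps "Widowed" when she starts dating, the separated spouse can say both true things, and the person who says nothing is simply left alone.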

cczona

Facebook relationship status by user age

srsly?

When we don't give people means to express what's real and important for them, they hack around us. Are a quarter of Facebook's 13-year-olds really married? Probably not. What they probably have is a BEST FRIEND, and "married" is the nearest equivalent available for them to express "THIS PERSON is my most important relationship".

The dogged pursuit of nice, clean, CRUNCHABLE data takes us to such wrong places. And makes our users want to work against a system that doesn't acknowledge their lives' reality.

cczona

We build or break community with each line of code

As developers, our lines of code build -- or break -- an app's community. So it's worth making conscious choices as we go.

cczona

Modeling the real world is complex

[READ] And that's okay.

cczona

Assuming we know who users are surrenders opportunity to learn who they are

But [read] Early constraints in schema NET crappy, misleading data. [beat] So keep constraints out of human and social schema, at least at first. Gather enough initial response to do some data mining. Watch that data for a while. Keep an eye out for emergent trends. What DISCOVERIES can you make? How can they shape the app's growth and evolution?
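Watching unconstrained data for emergent trends can start very simply. The Ruby sketch below counts free-text answers after light normalization; the sample answers are invented for illustration, `normalize` and `tally_responses` are hypothetical helpers, and the normalization is applied only for counting, never to what is stored.

```ruby
# Normalize only for counting; the stored answer stays as the user wrote it.
def normalize(value)
  value.to_s.strip.downcase
end

# Returns [answer, count] pairs, most frequent first (ties sorted by answer).
def tally_responses(responses)
  counts = Hash.new(0)
  responses.each do |raw|
    value = normalize(raw)
    next if value.empty? # unanswered: skip it, don't coerce it

    counts[value] += 1
  end
  counts.sort_by { |value, count| [-count, value] }
end

# Invented sample of free-text answers, including two non-answers.
sample = ["Female", "female ", "f", "genderqueer", "MYOB", "male", "Genderqueer", nil, ""]
trends = tally_responses(sample)
```

Even in this tiny sample, an answer that no preset list offered ("genderqueer") shows up as often as the expected one. That is the kind of discovery a constrained field would have hidden.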

cczona

Quality

Data quality improves when lies are merely optional, not required.

cczona

Specificity

Data becomes like the real world itself: rich and specific. So we can unearth the patterns that are undetectable when data is generic.

cczona

Adaptability

We get to make discoveries and re-envision possibilities.

We adapt quickly.

cczona

Loyalty

And their response to all this? Engagement. Passion. LOYALTY.

cczona

THOSE are foundations for great user experiences. [LOOK UP FINALITY]

Questions? Conversation?
Oh Hell Yeah
Carina C. Zona
@cczona
cczona@gmail.com
http://cczona.com

cczona

Resources

cczona

Data Science & Information Architecture
• Sociological normalization
• Database normalization
• Using Machine Learning On Social Networks To Figure Out What You Should Read On The Web
• NoSQL Data Modeling Techniques
• Data & Reality, by William Kent
• Bad Data Handbook, Q. Ethan McCallum
• Data Science of the Facebook World

cczona

Relationships

• Does Facebook Hurt Relationships?
• Facebook Adds LGBT-Friendly Relationship Status Options
• Facebook Targeting by Relationship Status & Workplace
• Your Facebook Relationship Status: It's Complicated
• Gay Marriage: The Database Engineering Perspective

cczona

Sex & Genders
• "Disalienation: Why Gender is a Text Field on Diaspora"
• "Gender & Drop Down Menus"
• "Sex & Gender"
• "Bucket Gender"
• Recommendations for Inclusive Data Collection of Trans People

cczona

Names

• Falsehoods Programmers Believe About Names
• Your Last Name Contains Invalid Characters
• W3C Internationalization: Personal Names Around the World
• Spanish Names
• Chinese Names

cczona

More

• Redesigning the Country Selector
• American Religious Identification Survey Summary Report 2009
• Linguistic Potluck: Crowdsourcing Internationalization in Rails

cczona

Image Credits

Postsecret

Facebook

OKCupid

Google+

Metafilter

Diaspora

Flickr

FetLife

Kotangle

cutestpaw.com

hdwallpapers.in

xkcd

cczona

Many Thanks
Chiu-Ki Chan
Estelle Weyl
Heather Rivers
Heroku
Jeremy Dunck
Josh Susser
Michele Titolo
NIRD LLC
Renée De Voursney
Sarah Mei
San Francisco Sex Information
Yoz Grahame

cczona

Get in Touch

Carina C. Zona
@cczona
http://cczona.com
cczona@gmail.com
http://slideshare.net/cczona
http://linkedin.com/in/cczona

Page 10: Schemas for the Real World [Madison RubyConf 2013]

cczona

And the third problem is our faith that the first two problems here can be easily solved Just add more list items[SCROLL for a while] Easy to wind up looking foolish Without even having solved the problem

Social Networks

What IS a social network? What is it, in purely human terms?

Social

It involves trust. Individuals revealing WHO THEY ARE, WHAT THEY VALUE, and WHO THEY CARE about. It's as personal as we get. This is the REAL LIFE that our apps are meant to replicate and build upon.

What relationships are we fostering between person and app? What relationships are we accidentally inhibiting or denying?

cczona

In real life

I know your personhood better than you

sounds presumptuous

cczona

In real life

sounds presumptuous

cczona

In real life

Your existence isn't possible

sounds clueless

cczona

In real life

sounds clueless

cczona

In real life

Who you are is invalid

sounds arrogant

cczona

In real life

sounds arrogant

cczona

Data modeling is psychology and it is philosophy. It reflects individuals' beliefs about reality, rather than reflecting reality itself. It rejects reality's complexity and richness. That's NOT what we set out to do. It's not the COMMUNITY we seek to build.

cczona

[READ]

cczona

Be conservative in what you do, be liberal in what you accept from others.

-Postel's Law

Postel's Law reminds us to [READ]. ARE we taking to heart that wisdom?

[PAUSE FOR END OF SECTION]

cczona

Schemas plural

There are two kinds of schemas

cczona

Mental Schema

• Pre-conceived ideas

• Framework for representing some aspect of the world

• System of organizing & perceiving new information

Mental Schema are [READ]

cczona

Database Schema

• Structure described in the database's language

• Blueprint for database construction

• Describes how the real world is being modeled

Database schemas are closely related to that. It's a mental schema translated into blueprints for a database.

cczona

These are simply frontend manifestations of various individuals' MENTAL schemas. And [SLOW DOWN] when we look at them closely, we can see [BRIEF PAUSE] that schemas are foundation [BRIEF PAUSE] for expressing

cczona

deeply intimate

things that are [READ]

cczona

important relationships

Schemas are foundation for expressing things [READ]

cczona

self-image

Schemas are foundation for expressing things [READ]

cczona

Schemas define the

user experience

Schemas define user experience

cczona

Our schemas are leaving people behind

& UX

[read] We can fix that.

cczona

What benefit will the user notice?

By asking ourselves, "What benefit will the USER NOTICE?" THIS is not equivalent to "How will the user benefit?" Because THAT is a question which grants us too much latitude. To ASSUME that what we want is of course to their benefit, because it's gonna help us deliver a product that's Awesome.

[READ AGAIN] focuses us on EXPERIENCES.

cczona

Evaluating from user perspective gives us focus

There isn't ONE WAY. There isn't ONE RIGHT ANSWER. Because the NATURE of our dilemma is that users vary. And so do their social worlds. So we always need to understand varying approaches. How they fit different contexts. What tradeoffs they bring. AND the obvious, boring question: which options best serve THIS app's business requirements?

cczona

Checkbox, Radio button, Select menu, Ranges

Coerced, Discretionary, Guided

Corrective, Text Input, Textarea

Checkboxes, radio buttons, select menus, ranges: these each imply that EVERY possible value can be -- and is -- included. If any value isn't included there, it's real world being rejected because it didn't match up with our mental schema.

cczona

Checkbox, Radio, Select

Coerced, Discretionary, Guided

Corrective, Text Input, Textarea

Checking a box is a one-step action. Entering a text string is not. So a freeform solution such as a textarea or text input isn't automatically exciting. BUT they can bring visible, meaningful benefit.

cczona

11 years

At MetaFilter, gender has been a text field for over a decade.

cczona

…the early crowd at MeFi were often programmers and they hated the idea of dirty data collection…

-Matt Haughey, founder

Initially some SHUDDERED at the thought. Because [READ]

cczona

All girl

XY

Fella

Queer

Fembot

Alto

It depends

MYOB

Dangly bits

Chicklet

Innie not outie

Convex

Sideburns

Ambisextrous

Member of the patriarchy

For about 5 seconds. Then they jumped on board. Because they could be creative and silly.

cczona

I speak using the male gender when required by language

50% quintessential tomboy, 50% total girly-girl

AND because they could express this THING about themselves fully. With authentic voice.

cczona

Transgender

Genderqueer

[PAUSE]

cczona

The freeform gender field is something I cherish about Metafilter.

-MeFi user

That text field grew into a beloved institution. Whatever you, as user, choose to put into that field says something revealing about who YOU are.

cczona

It was one of the earliest indications I'd landed in the right place.

-MeFi user

That you're allowed to put in ANYTHING -- or put in NOTHING AT ALL -- says something revealing about how MetaFilter envisions community. That schema's trust in users was the foundation for Metafilter users to ASK to share themselves even more.

cczona

In 2010, Diaspora likewise turned gender into a text field. Just like on MetaFilter, users felt set free by it.

There's an important distinction here, though. Metafilter is closed source, it's one developer, and it's entirely in English. Diaspora needed to meet a different set of needs. It's open source and it's international.

cczona

Not amused

So some of its developers are not amused. It's fair to object that this approach wreaks chaos on internationalization of pronouns. [BEAT] And here's what I've got to say to that:

cczona

Internationalization is hellish. Period.

Internationalization of pronouns is hellish. Period. You get no safety net in this. Constraining gender options doesn't solve translation of languages. Don't set yourself up for trying to deal with internationalization this way. It will fail you.

cczona

What's your legal gender?

• Indeterminate

• X

• Sex Not Specified

And if you really think that boiling this question down to "What's your legal gender?" is an easy out, let's talk about internationalization of that too. Because this is the year that "indeterminate" has become a legal gender on German birth certificates, and that "X" is a legal gender on Australian passports. In New South Wales, Australia, "Sex Not Specified" is yet another legally-recognized gender. Binary gender is, today, APP FAIL.

cczona

…the most complicated thing I've ever spent a lot of time learning about.

And I've spent a lot of time learning about quantum mechanics.

-Randall Munroe, xkcd

Randall Munroe, best known as xkcd, has examined the issue of pronouns many times, in depth, for English language projects. He says it's [READ]. So here's what he ultimately concluded:

cczona

Ask

Ask

cczona

Which pronouns do you prefer?

he / him / himself / his / his

she / her / herself / hers / her

they / them / themself / theirs / their

it / it / itself / its / its

by name

Asking straight up, "Which Pronouns Do You Prefer?" is truly the best he could come up with.

cczona

Which pronouns do you prefer?

he / him / himself / his / his

she / her / herself / hers / her

they / them / themself / theirs / their

personal name

other ____________

Or maybe slightly refining, like these.

And I know you may be looking at that 3rd row and protesting.

cczona

Which pronouns do you prefer?

he / him / himself / his / his

she / her / herself / hers / her

they / them / themself / theirs / their

personal name

other ____________

Our English teachers told us "they" and "their" aren't singular, right? Nope. They're plural AND SINGULAR, and have been for 400 years. Take the word of Jane Austen, Shakespeare, Chaucer, Lewis Carroll. Modern English authorities agree on this one too. It's excellent English -- and excellent SOCIAL -- to use it when people ask us to.
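One way to act on "ask" is to store the answer as data and let copy be generated from it, instead of deriving pronouns from a gender field. The following is only a sketch: `PronounSet`, `PRONOUN_SETS`, `pronouns_for`, and `photo_notice` are hypothetical names, not anything from Facebook, xkcd, or the slides. The five columns follow the order on the slide.

```ruby
# Hypothetical sketch: pronoun sets stored as data, in the slide's column order
# (subject / object / reflexive / possessive pronoun / possessive determiner).
PronounSet = Struct.new(:subject, :object, :reflexive, :possessive, :determiner)

PRONOUN_SETS = {
  "he"   => PronounSet.new("he", "him", "himself", "his", "his"),
  "she"  => PronounSet.new("she", "her", "herself", "hers", "her"),
  "they" => PronounSet.new("they", "them", "themself", "theirs", "their")
}.freeze

# choice is a key of PRONOUN_SETS, a PronounSet typed in through the
# "other" field, or anything else (treated as "personal name").
def pronouns_for(name, choice)
  return choice if choice.is_a?(PronounSet)
  return PRONOUN_SETS.fetch(choice) if PRONOUN_SETS.key?(choice)

  # "personal name": repeat the name rather than guessing a pronoun.
  PronounSet.new(name, name, name, "#{name}'s", "#{name}'s")
end

# Example of generated copy that depends on the stored preference.
def photo_notice(name, choice)
  set = pronouns_for(name, choice)
  "#{name} updated #{set.determiner} profile photo."
end
```

The design choice here is that an unknown or missing answer falls back to the person's name, so the app never has to assert a pronoun nobody gave it.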

cczona

Even Facebook has taken note that it's okay. They've been using this construction for at least 5 years and are doing fine :-)

cczona

Home is the place where, when you have to go there, they have to take you in.

-Robert Frost

We aspire to make our social apps feel like home. We can do that by making development choices that say, "Hey user, we GET you. Be as individual and messy as you want to. We can handle _you_ being _you_."

Social Research

When collecting data on people, we're in a different realm: social sciences. If you want to run useful analytics about personal attributes and behavior, then data collection needs to meet at least the two minimum criteria for social science research.

Exhaustive: Every possible option

A field's values must include every possible option.

Mutually Exclusive: No overlap exists between them

And the field's options must be mutually exclusive.

How many social apps have both of those bare minimum criteria covered?
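Those two criteria can be checked mechanically against a sample of real people. A minimal sketch, assuming each option can be written as a predicate over what we know about a person; `check_options`, `STATUS_OPTIONS`, and `SAMPLE_PEOPLE` are made up for illustration, and the sample data is invented, not survey data.

```ruby
# Hypothetical sketch: test a set of options against sample people.
# options maps a label to a lambda that says whether the label fits a person.
def check_options(options, people)
  uncovered = []    # people no option fits   -> list is not exhaustive
  overlapping = []  # people several options fit -> not mutually exclusive
  people.each do |person|
    matches = options.select { |_label, fits| fits.call(person) }.keys
    uncovered << person if matches.empty?
    overlapping << person if matches.size > 1
  end
  { exhaustive: uncovered.empty?, mutually_exclusive: overlapping.empty?,
    uncovered: uncovered, overlapping: overlapping }
end

STATUS_OPTIONS = {
  "Single"            => ->(p) { p[:partners].zero? },
  "In a relationship" => ->(p) { p[:partners] == 1 && !p[:married] },
  "Married"           => ->(p) { p[:married] }
}.freeze

SAMPLE_PEOPLE = [
  { partners: 0, married: false }, # fits "Single" only
  { partners: 1, married: true },  # fits "Married" only
  { partners: 0, married: true },  # separated: fits "Single" AND "Married"
  { partners: 2, married: false }  # fits nothing on the list
].freeze

REPORT = check_options(STATUS_OPTIONS, SAMPLE_PEOPLE)
```

With this invented sample the list fails both tests: one person has no option and one has two.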

cczona

We've seen this one enough. At least that set of values seems covered fully.

cczona

But it's not.

cczona

Still lacking

cczona

Trying hard. But still inadequate to express the real world's complexity.

We do realize that the real world has great variety. Of course we do. But it's hard to know how to handle it.

cczona

What user experience does this schema drive us toward?

It's easy to figure, "Eh, we can refactor later." But initial choices set differing user experiences in motion. Thinking at the outset about the real world's variety and complexity starts us asking early questions that set foundation for social app-building. Ask: "What user experience does this schema drive us toward?"

cczona

Data doesn't have to be for analysis

It's easy to get into the habit of structuring data for easy analysis. But we can choose to look at human data from a different perspective. Step back. Wallow in the user's perspective.

cczona

Data can be sheer expressiveness

Data can be sheer expressiveness. Data that has character, individualism, distinctiveness.

cczona

What we want:

• Structured

• Predictable

• Relational

• Indexed

• Exhaustive

• Easy analytics

• Data-driven decisions

What we get:

• Validations, exceptions

• Conditionals, partials

• Premature optimization

• Cultural variability

• Individual POV

• Moving target

• Decisions based on false premises

As developers, we have a vision of what a good codebase should be and not be.

[READ]

In the most sensible of ways, we often arrive at solutions that are factually TRUTHY while far removed from real life utility.

cczona

What is your religion, if any?

ARIS is the largest ongoing survey of Americans' religious identification. It asks this simple, OPEN-ENDED question: [slide]

Which nets over 100 unique answers. Which, if you're making a form based on that, is tricky. There's no form element that makes it easy for our users to pick themselves out of a list of so many possibilities.

cczona

ARIS found, though, that these could be compressed into 13 major categories. More manageable list, right? We could use that for a form. But eh, a lot of those are edge cases. We'd rather want to focus on genuinely major groupings.

cczona

What is your religion, if any?

Christian 76%

Other 4%

None 15%

Don't Know or Refused 5%

Which brings us down to this. At least three quarters of Americans are Christian. Done. Every other religion would just be clutter. Edge cases.

And then there's this sort of crummy data with it. We'd probably assign nil value for more than one of these categories, right?

cczona

What is your religion?

Christian 76%

Other 4%

n/a 20%

Fixed. Which focuses attention on a problem here: 1 in 5 are not useful answers from an advertiser's perspective, or for our own analytics. The NILS -- gotta go.

cczona

What is your religion?

Christian 76%

Other 24%

So what we're left with is a good, clear list. It covers ALL the big stuff. When you get reductive enough, for Americans' religion, it's a binary. Which, from a storage standpoint, is great. Booleans! Score!
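The arithmetic of that reduction is easy to replay. A small sketch using only the percentages on the slides; `ARIS_SHARES` and `collapse` are hypothetical names, and the merge rules are the ones the slides walk through.

```ruby
# Percentages as shown on the slide (they sum to 100).
ARIS_SHARES = {
  "Christian" => 76, "Other" => 4, "None" => 15, "Don't Know or Refused" => 5
}.freeze

# Merge categories: mapping says which label each old label folds into;
# labels not mentioned keep their own name.
def collapse(shares, mapping)
  shares.each_with_object(Hash.new(0)) do |(label, pct), merged|
    merged[mapping.fetch(label, label)] += pct
  end
end

# Step 1: treat "None" and "Don't Know or Refused" as nil-ish answers.
STEP_ONE = collapse(ARIS_SHARES, { "None" => "n/a", "Don't Know or Refused" => "n/a" })
# Step 2: the nils "gotta go", so fold them into "Other".
STEP_TWO = collapse(STEP_ONE, { "n/a" => "Other" })
```

`STEP_TWO` ends up with just two keys, and the 15 points who answered "None" are now indistinguishable from everyone else labeled "Other". The totals still add up; the meaning is what got lost.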

cczona

Religion

But we don't do THIS. It covers the biggest categories. But oh, how it leaves people out.

cczona

People aren't edge cases

People aren't edge cases.

And they're pushing back on apps that treat them that way.

[LONG BEAT LONG] [BREATH] [CHANGE GEARS]

So. Reductive has big problems. So does scaling upward.

cczona

Balancing between approaches

As engineers, it's instinctually uncomfortable to DELIBERATELY NOT STRUCTURE DATA for easy analysis. I feel ya. I really do. This freaks me too. But again, the foundational question is "What benefit will the user notice?" What identities are okay with us?

cczona

And if necessary, we CAN strike a middle ground. This is where guided response comes in. Autosuggest.

cczona

Checkbox, Radio, Select

Textarea, Text

Required, Optional, Autosuggest

Minimal Suggest

When there's a subset of values that you're most interested in, do minimal suggest instead. Autosuggest using just the handful of values you care about.

Structured data from those who want to give it. Free form to incite expressiveness in those who want THAT.
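Minimal suggest can be sketched in a few lines: offer a short list while the person types, but accept whatever they finally enter. Everything named here (`SUGGESTED`, `suggestions`, `accept`) is an assumption for illustration, and the four suggested values are just an example handful, not a recommendation.

```ruby
# Hypothetical handful of values the app cares most about.
SUGGESTED = %w[female male genderqueer transgender].freeze

# Offer only suggestions that start with what was typed so far.
def suggestions(typed, suggested = SUGGESTED)
  prefix = typed.strip.downcase
  return [] if prefix.empty?

  suggested.select { |value| value.start_with?(prefix) }
end

# Accept anything: suggestions guide, they never validate.
# A blank answer is stored as nil, meaning "chose not to say".
def accept(typed)
  value = typed.strip
  value.empty? ? nil : value
end
```

The point of keeping `accept` separate from `suggestions` is that the list shapes the common answers without ever rejecting an uncommon one.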

Unguided Text: Of those who use MetaFilter's gender field,

40% of responses are "f", "m", "female", "male"

[read] So structured data IS there. This can be a balanced solution in many cases where you're willing to tolerate some ambiguity. Of course, there are tradeoffs. Data quantity is lower. Freed to opt out of providing personal info, many do. On the other hand, data quality should improve.
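Finding out how much structure is hiding in a free-form field takes only a normalize-and-count pass. A sketch under the assumption that responses arrive as raw strings (or nil when left blank); `STRUCTURED` and `structured_share` are hypothetical names, and the sample below is invented, not MetaFilter data.

```ruby
# The four answers the slide counts as "structured".
STRUCTURED = %w[f m female male].freeze

# Share of answered responses that normalize to one of STRUCTURED.
# Blank and nil responses are treated as "did not answer" and left out.
def structured_share(responses)
  answered = responses.map { |r| r.to_s.strip.downcase }.reject(&:empty?)
  return 0.0 if answered.empty?

  answered.count { |r| STRUCTURED.include?(r) }.to_f / answered.size
end

# Invented sample: two structured answers out of four answered.
SAMPLE_RESPONSES = ["F", "male", "Fembot", "It depends", nil, ""].freeze
```

Running the same count over real responses is a cheap way to see whether the structured subset is big enough for the analysis you had in mind.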

Optional Select: 60% of Facebook users select a relationship status

It's fine to mix and match here. Find the right approach for your users and your app's business objectives. Facebook, for instance, makes relationship status completely optional, but coercive for those who do opt in to setting a value. Most users do opt in: 60% of them select a relationship status.

cczona

We want everyone to feel excited about what we've built

The bottom line is that [read]

cczona

We want users to feel passionate about their involvement

[read]

cczona

Analytics, investments & monetization are based on a premise that data is accurate

[read]

BUT if the data has been collected by coercive approaches, the RISK is that--

cczona

this premise is garbage. Some people are lying because lying has been made a requirement for getting past the barriers. So conclusions drawn from that bad data can misdirect decision-making about the next stage of development.

cczona

Checkbox, Radio, Select

Textarea, Text

Required

Corrective

Discretionary, Guided

The restrictive options, that stuff at bottom left: those don't actually have to be marked required.

cczona

ADD COLUMN gender VARCHAR(6) NOT NULL DEFAULT 'female'

But the way we set up schema often embeds assumptions that we should, and we will. So we do. An attribute that is ambiguously named is destined to become a form field whose data wanders away from its original intent, and code that misunderstands how to use it. A field that's not allowed to be null is destined to be mandatory. A field that is assigned a maximum length is an ASSERTION that all possible values are knowable and fit within it.
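To make those assertions visible, here is a toy model of a column in plain Ruby. It is not how any database or Rails itself behaves internally: `Column`, `store`, `STRICT`, and `LOOSE` are hypothetical, and real databases differ on whether an over-long value raises or is truncated. `LOOSE` reflects one reading of "do less" (nullable, no length limit, no default), which is an assumption.

```ruby
# Toy model of a column definition and the assumptions it enforces.
Column = Struct.new(:name, :null, :limit, :default, keyword_init: true) do
  # Returns the value that would be stored, or raises if a constraint rejects it.
  def store(value)
    value = default if value.nil?   # the default silently answers for the user
    raise ArgumentError, "#{name} is required" if value.nil? && !null
    if value && limit && value.length > limit
      raise ArgumentError, "#{name} exceeds #{limit} characters"
    end
    value
  end
end

# Mirrors the slide: VARCHAR(6) NOT NULL DEFAULT 'female'.
STRICT = Column.new(name: "gender", null: false, limit: 6, default: "female")

# Assumed "do less" version: nothing is required, nothing is cut off.
LOOSE = Column.new(name: "gender_identity", null: true, limit: nil, default: nil)
```

With `STRICT`, a person who says nothing is recorded as "female" and a person who types "genderqueer" is rejected; with `LOOSE`, silence stays nil and "genderqueer" is stored as typed.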

cczona

ADD column gender_identity NOT NULL VARCHAR

BOOM. This is foundation for a whole different user experience. And the cool ninja move was that we decided to DO LESS. Make this stuff flexible upfront. Optimize storage later. Decide what's valid later.

cczona

Checkbox, Radio, Select

Textarea, Text

Required, Optional

Alteration

Guided

So there you have it: a discretionary field. Whether to respond is left up to the user.

cczona

t.string :relationship_status, :null => true

As developers, we may look upon THIS expression as redundant, completely unnecessary. Duh, null is true by default. But making that explicit is a communication to the team and to your future self. It's a statement of intent. It's documenting a product decision.

cczona

Facebook

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

What would a canonical set of relationship statuses look like? Three years ago, Facebook figured this list was pretty good. Arguably pretty progressive too, right? Users disagreed. Strongly.

cczona

Facebook

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

Under pressure, Facebook nearly doubled the options in just two years' time.

cczona

Google+

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I don't want to say

While Google+ has largely adopted that list, it has not included Separated or Divorced. Notice that they also added something choice: opt out of labeling.

cczona

Allowing users to identify their relationships with labels of greater personal significance. That's being driven by people rejecting a user experience that isn't working for them.

cczona

It's a scope

How did some statuses seem universal while others weren't? Naming a thing creates scope. The assumed validity of a field's values gets constrained as soon as the field is named.

cczona

marital_status

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I don't want to say

Marital status, for instance, might lead to a list similar to this one. A person is assumed to be either unmarried, preparing to be married, currently married, or formerly married.

cczona

legal_marital_status

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I don't want to say

Marital status, for instance, might lead to a list similar to this one. A person is assumed to be either unmarried, preparing to be married, currently married, or formerly married.

cczona

relationship_status

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I don't want to say

Whereas relationship status might lead to a list more like this one. In which one either has a current relationship -- or is defined by the absence of any.

cczona

singleness_status

As soon as you name a field, you define its paradigm and possibilities. There are important differences in what each of these envision and are able to measure. Naming fields -- with great specificity, upfront -- makes analyses more powerful later.

cczona

singleness_rating

No chance whatsoever Suuuuuper duper single

1000

If you change the name, you shift the paradigm and possibilities. There's important difference in what these set out to measure. Naming fields -- with great specificity, upfront -- makes analyses more powerful later.

cczona

It's Complicated, In a Relationship

Married, Divorced

Widowed, Single

We go through life experiencing many relationships. They don't all have pre-determined labels. And new relationship identities don't necessarily leave old ones behind.

cczona

I like to be truthful, and "It's Complicated" is really deceiving. It is not complicated. I am separated from my husband, who I am still legally married to.

-Facebook user

[READ] Why should anyone have to feel like they're required to disavow a relationship that's meaningful for them?

It cuts deeply sometimes. Is a widow only "Widowed" until she resumes dating? Why does she have to be confronted with a decision like that just to use someone's app?

cczona

Okay, this one. Spot the fatal flaw. [WAIT]

cczona

buffer overflow

An open relationship is definitionally a 1:Many join -- WITH the usual engineering understanding, that is, that actual number of relationships may be 0, 1, OR many.

cczona

Choose (one): evasive, inauthentic

But it's not funny from the personal perspective. This schema forces the person to either LOOK EVASIVE or else BE INAUTHENTIC. This is what happens when we try to throw more labels at the problem, instead of re-examining the schema's assumptions.
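Re-examining the assumption here means dropping the idea that a person has exactly one status. A minimal sketch, with hypothetical `Relationship` and `Profile` classes and free-text labels; it is one possible shape, not how Facebook or any other app models this.

```ruby
# Each relationship carries its own label and, optionally, who it is with.
Relationship = Struct.new(:label, :partner)

class Profile
  attr_reader :name, :relationships

  def initialize(name)
    @name = name
    @relationships = [] # zero, one, or many: nothing is forced
  end

  # Labels are free text; adding a new one never overwrites an old one.
  def add_relationship(label, partner = nil)
    @relationships << Relationship.new(label, partner)
    self
  end

  def summary
    return "#{name} has not listed any relationships" if relationships.empty?

    parts = relationships.map do |r|
      r.partner ? "#{r.label} (#{r.partner})" : r.label
    end
    "#{name}: #{parts.join(', ')}"
  end
end
```

A person who is widowed and dating, or separated and still legally married, can list both without disavowing either, and a person who lists nothing is not assigned a status by default.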

cczona

Facebook relationship status by user age

srsly

When we don't give people means to express what's real and important for them, they hack around us. Are a quarter of Facebook's 13 year olds really married? Probably not. What they probably have is a BEST FRIEND, and "married" is nearest equivalent available for them to express "THIS PERSON is my most important relationship."

The dogged pursuit of nice, clean, CRUNCHABLE data takes us to such wrong places. And makes our users want to work against a system that doesn't acknowledge their lives' reality.

cczona

We build or break community with each line of code

As developers, our lines of code build -- or break -- an app's community. So it's worth making conscious choices as we go.

cczona

Modeling the real world is complex

[READ] And that's okay.

cczona

Assuming we know who users are surrenders opportunity to learn who they are

But [read] Early constraints in schema NET crappy, misleading data. [beat] So keep constraints out of human and social schema, at least at first. Gather enough initial response to do some data mining. Watch that data for a while. Keep an eye out for emergent trends. What DISCOVERIES can you make? How can they shape the app's growth and evolution?

cczona

Quality

Data quality improves when lies are merely optional, not required.

cczona

Specificity

Data becomes like the real world itself: rich and specific. So we can unearth the patterns that are undetectable when data is generic.

cczona

Adaptability

We get to make discoveries and re-envision possibilities.

We adapt quickly.

cczona

Loyalty

And their response to all this? Engagement. Passion. LOYALTY.

cczona

THOSE are foundations for great user experiences. [LOOK UP FINALITY]

@cczona
cczona@gmail.com
http://cczona.com

Questions? Conversation?

Oh Hell Yeah

Carina C. Zona

cczona

Resources

cczona

Data Science & Information Architecture

• Sociological normalization

• Database normalization

• Using Machine Learning On Social Networks To Figure Out What You Should Read On The Web

• NoSQL Data Modeling Techniques

• Data & Reality, by William Kent

• Bad Data Handbook, Q. Ethan McCallum

• Data Science of the Facebook World

cczona

Relationships

• Does Facebook Hurt Relationships?

• Facebook Adds LGBT-Friendly Relationship Status Options

• Facebook Targeting by Relationship Status & Workplace

• Your Facebook Relationship Status: It's Complicated

• Gay Marriage: The Database Engineering Perspective

cczona

Sex & Genders

• "Disalienation: Why Gender is a Text Field on Diaspora"

• "Gender & Drop Down Menus"

• "Sex & Gender"

• "Bucket Gender"

• Recommendations for Inclusive Data Collection of Trans People

cczona

Names

• Falsehoods Programmers Believe About Names

• Your Last Name Contains Invalid Characters

• W3C Internationalization: Personal Names Around the World

• Spanish Names

• Chinese Names

cczona

More

• Redesigning the Country Selector

• American Religious Identification Survey Summary Report 2009

• Linguistic Potluck: Crowdsourcing Internationalization in Rails

cczona

Image Credits

Postsecret

Facebook

OKCupid

Google+

Metafilter

Diaspora

Flickr

FetLife

Kotangle

cutestpaw.com

hdwallpapers.in

xkcd

cczona

Many Thanks

Chiu-Ki Chan

Estelle Weyl

Heather Rivers

Heroku

Jeremy Dunck

Josh Susser

Michele Titolo

NIRD LLC

Renée De Voursney

Sarah Mei

San Francisco Sex Information

Yoz Grahame

cczona

Get in Touch

Carina C. Zona, @cczona

http://cczona.com

cczona@gmail.com

http://slideshare.net/cczona

http://linkedin.com/in/cczona

Page 11: Schemas for the Real World [Madison RubyConf 2013]

Social Networks

What IS a social network What is it in purely human terms

Social

It involves trust Individuals revealing WHO THEY ARE WHAT THEY VALUE and WHO THEY CARE about Its as personal as we get This is the REAL LIFE that our apps are meant to replicate and build upon

What relationships are we fostering between person and app What relationships are we accidentally inhibiting or denying

cczona

In real life

I know your personhood better than you

sounds presumptuous

cczona

In real life

sounds presumptuous

cczona

In real life

Your existence isnt possible

sounds clueless

cczona

In real life

sounds clueless

cczona

In real life

Who you are is invalid

sounds arrogant

cczona

In real life

sounds arrogant

cczona

Data modeling is psychology and it is philosophy It reflect individuals beliefs about reality rather than reflecting reality itself It rejects realitys complexity and richness Thats NOT what we set out to do Its not the COMMUNITY we seek to build

cczona

[READ]

cczona

Be conservative in what you do be liberal in what you accept from others

-Postels Law

Postels Law reminds us to [READ]ARE we taking to heart that wisdom

[PAUSE FOR END OF SECTION]

cczona

Schemas plural

There are two kinds of schemas

cczona

Mental Schema

bull Pre-conceived ideas

bull Framework for representing some aspect of the world

bull System of organizing amp perceiving new information

Mental Schema are [READ]

cczona

Database Schema

bull Structure described in the databases language

bull Blueprint for database construction

bull Describes how the real world is being modeled

Database schemas are closely related to that Its a mental schema translated into blueprints for a database

cczona

These are simply frontend manifestations of various individuals MENTAL schemas And [SLOW DOWN] when we look at them closely we can see [BRIEF PAUSE] that schemas are foundation [BRIEF PAUSE] for expressing

cczona

deeply intimate

things that are [READ]

cczona

important relationships

Schemas are foundation for expressingthings [READ]

cczona

self-image

Schemas are foundation for expressingthings [READ]

cczona

Schemas define the

user experience

Schemas define user experience

cczona

Our schemas are leaving people behind

amp UX

[read]We can fix that

cczona

What benefit will the user notice

By asking ourselves What benefit will the USER NOTICE THIS is not equivalent to How will the user benefit Because THAT is a question which grants us too much latitude To ASSUME that what we want is of course to their benefit because its gonna help us deliver a product thats Awesome

[READ AGAIN] focuses us on EXPERIENCES

cczona

Evaluating from user perspective gives us focus

There isnt ONE WAY There isnt ONE RIGHT ANSWER Because the NATURE of our dilemma is that users vary And so do their social worlds So we always need to understand varying approaches How they fit different contexts What tradeoffs they bring AND the obvious boring question which options best serve THIS apps business requirements

cczona

CheckboxRadio buttonSelect menuRanges

Coerced DiscretionaryGuided

Corrective Text InputTextarea

Checkboxes radio buttons select menus ranges these each imply that EVERY possible value can -- and is -- included If any value isnt included there its real world being rejected because it didnt match up with our mental schema

cczona

CheckboxRadioSelect

Coerced DiscretionaryGuided

Corrective Text InputTextarea

Checking a box is a one step action Entering a text string is not So a freeform solution such as a textarea or text input isnt automatically exciting BUT they can bring visible meaningful benefit

cczona

11 years

At MetaFilter gender has been a text field for over a decade

cczona

hellipthe early crowd at MeFi were often programmers and they hated the idea of dirty data collectionhellip

mdashMatt Haughey founder

Initially some SHUDDERED at the thought Because [READ]

cczona

All girl

XY

Fella

Queer

Fembot

Alto

It depends

MYOB

Dangly bits

Chicklet

Innie not outie

Convex

Sideburns

Ambisextrous

Member of the patriarchy

For about 5 seconds Then they jumped on board Because they could be creative and silly

cczona

I speak using the male gender when required by language 50 quintessential

tomboy 50 total girly-girl

AND because they could express this THING about themselves fully With authentic voice

cczona

TransgenderGenderqueer

[PAUSE]

cczona

The freeform gender field is something I cherish about Metafilter

mdashMeFi user

That text field grew into a beloved institution Whatever you as user choose to put into that field says something revealing about who YOU are

cczona

It was one of the earliest indications Id landed in the right place

mdashMeFi user

That youre allowed to put in ANYTHING -- or put in NOTHING AT ALL -- says something revealing about what how MetaFilter envisions community That schemas trust in users was the foundation for Metafilter users to ASK to share themselves even more

cczona

In 2010 Diaspora likewise turned gender into a text field Just like on MetaFilter users felt set free by it

Theres an important distinction here though Metafilter is closed source its one developer and its entirely in English Diaspora needed to meet a different set of needs Its open source and its international

cczona

Not amused

So some of its developers are not amused Its fair to object that this approach wreaks chaos on internationalization of pronouns [BEAT] And heres what Ive got to say to that

cczona

Internationalization is hellish period

Internationalization of pronouns is hellish period You get no safety net in this Constraining gender options doesnt solve translation of languages Dont set yourself up for trying to deal with internationalization this way It will fail you

cczona

Whats your legal gender

bull Indeterminate

bull X

bull Sex Not Specified

And if you really thing that boiling this question down to Whats your legal gender is an easy out lets talk about internationaliztion of that too Because this is the year that indeterminate has become a legal gender on German birth certificates and that X is a legal gender on Australian passports In New South Wales Australia Sex Not Specified is yet another legally-recognized gender Binary gender is today APP FAIL

cczona

the most complicated thing Irsquove ever spent a lot of time learning about

And Irsquove spent a lot of time learning about quantum mechanics

mdashRandall Munroe xkcd

Randall Munroe best known as xkcd has examined the issue of pronouns many times in depth for English language projects He says its [READ] So hes what he ultimately concluded

cczona

Ask

Ask

cczona

Which pronouns do you prefer

hehimhimselfhishis

sheherherselfhersher

theythemthemselftheirstheir

itititselfitsits

by name

Asking straight up Which Pronouns Do You Prefer is truly the best he could come up with

cczona

Which pronouns do you prefer

hehimhimselfhishis

sheherherselfhersher

theythemthemselftheirstheir

personal name

other ____________

Or maybe slightly refining like these

And I know you may be looking at that 3rd row and protesting

cczona

Which pronouns do you prefer

hehimhimselfhishis

sheherherselfhersher

theythemthemselftheirstheir

personal name

other ____________

Our english teachers told us they and their arent singular right Nope Theyre plural AND SINGULAR and have been for 400 years Take the word of Jane Austen Shakespeare Chaucer Lewis Carroll Modern english authorities agree on this one too Its excellent English -- and excellent SOCIAL -- to use it when people ask us to

cczona

Even Facebook has taken note that its okay Theyve been using this construction for at least 5 years and are doing fine -)

cczona

Home is the place where when you have to go there they have to take you in

mdashRobert Frost

We aspire to make our social apps feel like home We can do that by making development choices that say Hey user we GET you Be as individual and messy as you want to We can handle _you_ being _you_

Social Research

When collecting data on people were in a different realm Social sciences If you want to run useful analytics about personal attributes and behavior then data collection needs to meet at least the two minimum criteria for social science research

ExhaustiveEvery possible option

A fields values must include every possible option

Mutually ExclusiveNo overlap exists between them

And the fields options must be mutually exclusive

How many social apps have both of those bare minimum criteria covered

cczona

Weve seen this one enough At least that set of values seems covered fully

cczona

`

But its not

cczona

Still lacking

cczona

Trying hard But still inadequate to express the real worlds complexity

We do realize that the real world has great variety Of course we do But its hard to know how to handle it

cczona

What user experience does this schema drive us toward

Its easy to figure Eh we can refactor later But initial choices set differing user experiences in motion Thinking at the outset about the real worlds variety and complexity starts us asking early questions that set foundation for social app-building Ask What user experience does this schema drive us toward

cczona

Data doesn't have to be for analysis

It's easy to get into the habit of structuring data for easy analysis. But we can choose to look at human data from a different perspective. Step back. Wallow in the user's perspective.

cczona

Data can be sheer expressiveness

Data can be sheer expressiveness. Data that has character, individualism, distinctiveness.

cczona

What we want:
• Structured
• Predictable
• Relational
• Indexed
• Exhaustive
• Easy analytics
• Data-driven decisions

What we get:
• Validations, exceptions
• Conditionals, partials
• Premature optimization
• Cultural variability
• Individual POV
• Moving target
• Decisions based on false premises

As developers, we have a vision of what a good codebase should be, and not be.

[READ]

In the most sensible of ways, we often arrive at solutions that are factually TRUTHY while far removed from real-life utility.

cczona

What is your religion, if any?

ARIS is the largest ongoing survey of Americans' religious identification. It asks this simple, OPEN-ENDED question: [slide]

Which nets over 100 unique answers. Which, if you're making a form based on that, is tricky. There's no form element that makes it easy for our users to pick themselves out of a list of so many possibilities.

cczona

ARIS found, though, that these could be compressed into 13 major categories. More manageable list, right? We could use that for a form. But, eh, a lot of those are edge cases. We'd rather focus on genuinely major groupings.

cczona

What is your religion, if any?

Christian: 76%
Other: 4%
None: 15%
Don't Know or Refused: 5%

Which brings us down to this. At least three quarters of Americans are Christian. Done. Every other religion would just be clutter. Edge cases.

And then there's this sort of crummy data with it. We'd probably assign a nil value for more than one of these categories, right?

cczona

What is your religion?

Christian: 76%
Other: 4%
n/a: 20%

Fixed. Which focuses attention on a problem here: 1 in 5 are not useful answers, from an advertiser's perspective or for our own analytics. The NILS -- gotta go.

cczona

What is your religion?

Christian: 76%
Other: 24%

So what we're left with is a good, clear list. It covers ALL the big stuff. When you get reductive enough, for Americans religion is a binary. Which, from a storage standpoint, is great. Booleans! Score!

cczona

Religion

But we don't do THIS. It covers the biggest categories. But, oh, how it leaves people out.
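The reduction walked through on the religion slides can be traced in a few lines. This is only an illustrative sketch: the percentages are the ones shown on the slides, while the `collapse` helper and the constant name are made up for this example.

```ruby
# Percentages as shown on the slides.
ARIS_SUMMARY = {
  "Christian" => 76,
  "Other" => 4,
  "None" => 15,
  "Don't Know or Refused" => 5
}.freeze

# Merge the named categories into a single bucket, summing their shares.
# Everything that distinguished the merged categories is discarded.
def collapse(shares, merged_name, *names)
  kept = shares.reject { |name, _| names.include?(name) }
  kept.merge(merged_name => shares.values_at(*names).sum)
end

# "None" and "Don't Know or Refused" both become n/a (15 + 5 = 20).
step1 = collapse(ARIS_SUMMARY, "n/a", "None", "Don't Know or Refused")

# The nils "gotta go", so they are folded into Other (4 + 20 = 24).
step2 = collapse(step1, "Other", "Other", "n/a")

p step1
p step2
puts "Categories left: #{step2.size}"
```

Each call is tidy on its own. Chained together, they turn over 100 distinct answers into a single boolean, and the people who said "None" are now indistinguishable from the people who declined to answer.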

cczona

People aren't edge cases

People aren't edge cases.

And they're pushing back on apps that treat them that way.

[LONG BEAT. LONG.] [BREATH] [CHANGE GEARS]

So: reductive has big problems. So does scaling upward.

cczona

Balancing between approaches

As engineers, it's instinctually uncomfortable to DELIBERATELY NOT STRUCTURE DATA for easy analysis. I feel ya. I really do. This freaks me out, too. But again, the foundational question is, "What benefit will the user notice?" What identities are okay with us?

cczona

And if necessary, we CAN strike a middle ground. This is where guided response comes in: autosuggest.

cczona

Checkbox / Radio / Select

Textarea / Text

Required | Optional | Autosuggest

Minimal Suggest

When there's a subset of values that you're most interested in, do minimal suggest instead: autosuggest using just the handful of values you care about.

Structured data from those who want to give it. Free form to incite expressiveness in those who want THAT.
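Here is one way minimal suggest could look on the server side, as a sketch rather than a recipe. The suggestion list and both method names are hypothetical. The important property is that suggestions only guide: whatever the user finally submits is stored as-is, and a blank answer is simply no answer.

```ruby
# Hypothetical handful of values the app most wants in structured form.
SUGGESTIONS = ["female", "male", "genderqueer", "transgender"].freeze

# Offer only the suggestions that start with what has been typed so far.
def suggest(typed, suggestions = SUGGESTIONS)
  prefix = typed.to_s.strip.downcase
  return [] if prefix.empty?
  suggestions.select { |value| value.start_with?(prefix) }
end

# Accept anything the user submits; blank input becomes nil (no answer).
def accept(submitted)
  text = submitted.to_s.strip
  text.empty? ? nil : text
end

p suggest("Fe")        # narrows to the matching suggestion
p suggest("x")         # nothing to suggest, and that is fine
p accept(" Fembot ")   # stored in the user's own words
p accept("")           # opting out is allowed
```

Nothing here validates against the suggestion list. That is the design choice: people who want the quick, structured answer get it in a keystroke or two, and everyone else still has a text field.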

Unguided Text: of those who use MetaFilter's gender field,

40% of responses are "f", "m", "female", "male"

[read] So structured data IS there. This can be a balanced solution in many cases where you're willing to tolerate some ambiguity. Of course there are tradeoffs. Data quantity is lower: freed to opt out of providing personal info, many do. On the other hand, data quality should improve.
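To get at the structured data hiding inside a free-text field, something like the following could work. It is a minimal sketch: the bucket mapping and method names are assumptions, and the sample answers mix the common spellings with a few of the MetaFilter values quoted in this talk.

```ruby
# Hypothetical mapping from common free-text spellings to analysis buckets.
BUCKETS = {
  "f" => "female", "female" => "female",
  "m" => "male",   "male"   => "male"
}.freeze

# Bucket for a response, or nil when it doesn't match a known spelling.
def bucket_for(response)
  BUCKETS[response.to_s.strip.downcase]
end

# Fraction of non-blank responses that land in a known bucket.
def structured_share(responses)
  answered = responses.map { |r| r.to_s.strip }.reject(&:empty?)
  return 0.0 if answered.empty?
  answered.count { |r| bucket_for(r) }.fdiv(answered.size)
end

sample = ["F", "male", "Fembot", "", "m ", "It depends", nil, "Alto"]
puts structured_share(sample)
```

The original answers stay untouched in storage. Bucketing happens at analysis time, so the mapping can change as you learn more without ever asking users to re-answer.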

Optional Select: 60% of Facebook users select a relationship status

It's fine to mix and match here. Find the right approach for your users and your app's business objectives. Facebook, for instance, makes relationship status completely optional, but coercive for those who do opt in to setting a value. Most users do opt in: 60% of them select a relationship status.

cczona

We want everyone to feel excited about what we've built

The bottom line is that [read]

cczona

We want users to feel passionate about their involvement

[read]

cczona

Analytics, investments & monetization are based on a premise that data is accurate

[read]

BUT if the data has been collected by coercive approaches, the RISK is that--

cczona

this premise is garbage. Some people are lying, because lying has been made a requirement for getting past the barriers. So conclusions drawn from that bad data can misdirect decision-making about the next stage of development.

cczona

Checkbox / Radio / Select

Textarea / Text

Required

Corrective

Discretionary | Guided

The restrictive options, that stuff at bottom left: those don't actually have to be marked required.

cczona

ADD COLUMN gender VARCHAR(6) NOT NULL DEFAULT 'female'

But the way we set up schema often embeds assumptions that we should, and we will. So we do. An attribute that is ambiguously named is destined to become a form field whose data wanders away from its original intent, and code that misunderstands how to use it. A field that's not allowed to be null is destined to be mandatory. A field that is assigned a maximum length is an ASSERTION that all possible values are knowable and fit within it.
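That last assertion is easy to test against real answers. The sketch below takes the 6-character limit from the slide and runs it over a few of the gender-field values quoted elsewhere in this talk; the constant and method names are hypothetical.

```ruby
MAX_LENGTH = 6 # the limit asserted by VARCHAR(6) on the slide

# A few gender-field answers quoted elsewhere in this talk.
REAL_ANSWERS = [
  "female", "male", "Fella", "Queer",
  "Genderqueer", "It depends", "Member of the patriarchy"
].freeze

# Would this answer survive the column's length limit?
def fits_column?(answer, max = MAX_LENGTH)
  answer.length <= max
end

fits, rejected = REAL_ANSWERS.partition { |answer| fits_column?(answer) }

puts "Fits in #{MAX_LENGTH} characters: #{fits.join(', ')}"
puts "Rejected or truncated: #{rejected.join(', ')}"
```

The limit was sized for exactly one word, "female", and by no coincidence the only answers that fit are the ones the schema's author already expected.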

cczona

ADD column gender_identity NOT NULL VARCHAR

BOOM! This is the foundation for a whole different user experience. And the cool ninja move was that we decided to DO LESS. Make this stuff flexible upfront. Optimize storage later. Decide what's valid later.

cczona

Checkbox / Radio / Select

Textarea / Text

Required | Optional

Alteration

Guided

So there you have it: a discretionary field. Whether to respond is left up to the user.

cczona

t.string :relationship_status, :null => true

As developers, we may look upon THIS expression as redundant, completely unnecessary. "Duh, null is true by default." But making that explicit is a communication to the team and to your future self. It's a statement of intent. It's documenting a product decision.

cczona

Facebook

Single
In a relationship
Engaged
Married
It's complicated
Open relationship

What would a canonical set of relationship statuses look like? Three years ago, Facebook figured this list was pretty good. Arguably pretty progressive, too, right? Users disagreed. Strongly.

cczona

Facebook

Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Separated
Divorced
Civil union
Domestic partnership

Under pressure, Facebook nearly doubled the options in just two years' time.

cczona

Google+

Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Separated
Divorced
Civil union
Domestic partnership
I don't want to say

While Google+ has largely adopted that list, it has not included Separated or Divorced. Notice that they also added something choice: opt out of labeling.

cczona

Allowing users to identify their relationships with labels of greater personal significance? That's being driven by people rejecting a user experience that isn't working for them.

cczona

It's a scope

How did some statuses seem universal, while others didn't? Naming a thing creates scope. The assumed validity of a field's values gets constrained as soon as the field is named.

cczona

marital_status

Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Separated
Divorced
Civil union
Domestic partnership
I don't want to say

"Marital status", for instance, might lead to a list similar to this one. A person is assumed to be either unmarried, preparing to be married, currently married, or formerly married.

cczona

legal_marital_status

Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Separated
Divorced
Civil union
Domestic partnership
I don't want to say

"Marital status", for instance, might lead to a list similar to this one. A person is assumed to be either unmarried, preparing to be married, currently married, or formerly married.

cczona

relationship_status

Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Separated
Divorced
Civil union
Domestic partnership
I don't want to say

Whereas "relationship status" might lead to a list more like this one. In which one either has a current relationship -- or is defined by the absence of any.

cczona

singleness_status

As soon as you name a field, you define its paradigm and possibilities. There are important differences in what each of these envisions and is able to measure. Naming fields -- with great specificity upfront -- makes analyses more powerful later.

cczona

singleness_rating

No chance whatsoever Suuuuuper duper single

1000

If you change the name, you shift the paradigm and possibilities. There's an important difference in what these set out to measure. Naming fields -- with great specificity upfront -- makes analyses more powerful later.

cczona

It's Complicated | In a Relationship

Married | Divorced

Widowed | Single

We go through life experiencing many relationships. They don't all have pre-determined labels. And new relationship identities don't necessarily leave old ones behind.

cczona

"I like to be truthful, and 'It's Complicated' is really deceiving. It is not complicated. I am separated from my husband, who I am still legally married to."

-- Facebook user

[READ] Why should anyone have to feel like they're required to disavow a relationship that's meaningful for them?

It cuts deeply sometimes. Is a widow only "Widowed" until she resumes dating? Why does she have to be confronted with a decision like that, just to use someone's app?

cczona

Okay, this one. Spot the fatal flaw. [WAIT]

cczona

buffer overflow

An open relationship is definitionally a 1:Many join -- WITH the usual engineering understanding, that is, that the actual number of relationships may be 0, 1, OR many.

cczona

Choose (one): evasive, inauthentic

But it's not funny from the personal perspective. This schema forces the person to either LOOK EVASIVE or else BE INAUTHENTIC. This is what happens when we try to throw more labels at the problem instead of re-examining the schema's assumptions.
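Re-examining the assumption might mean dropping the single status column in favor of a collection. The sketch below is plain Ruby with hypothetical class and method names, not any app's actual model: a person has zero, one, or many relationships, each labeled in the person's own words.

```ruby
# Hypothetical model: one row per relationship instead of one status per person.
Relationship = Struct.new(:label, :partner_name)

class Person
  attr_reader :name, :relationships

  def initialize(name)
    @name = name
    @relationships = [] # zero is a perfectly valid count
  end

  # Adding a relationship never replaces an earlier one.
  def add_relationship(label, partner_name = nil)
    @relationships << Relationship.new(label, partner_name)
    self
  end

  # Labels are derived on demand; nothing forces the person to pick just one.
  def relationship_labels
    @relationships.map(&:label)
  end
end

user = Person.new("Example user")
user.add_relationship("Widowed", "late spouse").add_relationship("Dating")
p user.relationship_labels
```

With this shape, the widow who starts dating again is not asked to overwrite anything, and an open relationship is just a collection with more than one entry.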

cczona

Facebook relationship status by user age

srsly?

When we don't give people means to express what's real and important for them, they hack around us. Are a quarter of Facebook's 13-year-olds really married? Probably not. What they probably have is a BEST FRIEND, and "married" is the nearest equivalent available for them to express "THIS PERSON is my most important relationship."

The dogged pursuit of nice, clean, CRUNCHABLE data takes us to such wrong places. And makes our users want to work against a system that doesn't acknowledge their lives' reality.

cczona

We build or break community with each line of code

As developers, our lines of code build -- or break -- an app's community. So it's worth making conscious choices as we go.

cczona

Modeling the real world is complex

[READ] And that's okay.

cczona

Assuming we know who users are surrenders opportunity to learn who they are

But [read]. Early constraints in schema NET crappy, misleading data. [beat] So keep constraints out of human and social schema, at least at first. Gather enough initial response to do some data mining. Watch that data for a while. Keep an eye out for emergent trends. What DISCOVERIES can you make? How can they shape the app's growth and evolution?
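Watching for emergent trends can start very simply. The following is a sketch under assumptions: the method name, the `min_count` threshold, and the sample answers are all invented for illustration, and it relies on `Enumerable#tally` (Ruby 2.7 or later).

```ruby
# Count normalized free-text answers and return the ones given by at least
# `min_count` people, most common first (ties sorted alphabetically).
def emergent_values(responses, min_count: 2)
  counts = responses
    .map { |r| r.to_s.strip.downcase }
    .reject(&:empty?)
    .tally
  counts.select { |_, n| n >= min_count }
        .sort_by { |value, n| [-n, value] }
        .to_h
end

# Hypothetical answers collected from an unconstrained text field.
answers = ["Best friend", "best friend ", "Married", "partner",
           "Partner", "PARTNER", nil, ""]

p emergent_values(answers)
```

Values that keep showing up are candidates for suggestions, new features, or at least a conversation. They are discoveries a dropdown would never have surfaced, because nobody could have typed them.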

cczona

Quality

Data quality improves when lies are merely optional, not required.

cczona

Specificity

Data becomes like the real world itself: rich and specific. So we can unearth the patterns that are undetectable when data is generic.

cczona

Adaptability

We get to make discoveries and re-envision possibilities.

We adapt quickly.

cczona

Loyalty

And their response to all this? Engagement. Passion. LOYALTY.

cczona

THOSE are foundations for great user experiences. [LOOK UP. FINALITY.]

@cczona
cczona@gmail.com
http://cczona.com

Questions? Conversation?

Oh Hell Yeah

Carina C Zona

cczona

Resources

cczona

Data Science & Information Architecture
• Sociological normalization
• Database normalization
• Using Machine Learning On Social Networks To Figure Out What You Should Read On The Web
• NoSQL Data Modeling Techniques
• Data & Reality, by William Kent
• Bad Data Handbook, by Q. Ethan McCallum
• Data Science of the Facebook World

Relationships
• Does Facebook Hurt Relationships?
• Facebook Adds LGBT-Friendly Relationship Status Options
• Facebook Targeting by Relationship Status & Workplace
• Your Facebook Relationship Status: It's Complicated
• Gay Marriage: The Database Engineering Perspective

Sex & Genders
• "Disalienation: Why Gender is a Text Field on Diaspora"
• "Gender & Drop Down Menus"
• "Sex & Gender"
• "Bucket Gender"
• Recommendations for Inclusive Data Collection of Trans People

Names
• Falsehoods Programmers Believe About Names
• Your Last Name Contains Invalid Characters
• W3C Internationalization: Personal Names Around the World
• Spanish Names
• Chinese Names

More
• Redesigning the Country Selector
• American Religious Identification Survey Summary Report 2009
• Linguistic Potluck: Crowdsourcing Internationalization in Rails

cczona

Image Credits
Postsecret
Facebook
OKCupid
Google+
Metafilter
Diaspora
Flickr
FetLife
Kotangle
cutestpaw.com
hdwallpapers.in
xkcd

Many Thanks
Chiu-Ki Chan
Estelle Weyl
Heather Rivers
Heroku
Jeremy Dunck
Josh Susser
Michele Titolo
NIRD LLC
Renée De Voursney
Sarah Mei
San Francisco Sex Information
Yoz Grahame

Get in Touch
Carina C. Zona
@cczona
http://cczona.com
cczona@gmail.com
http://slideshare.net/cczona
http://linkedin.com/in/cczona

Page 12: Schemas for the Real World [Madison RubyConf 2013]

Social

It involves trust. Individuals revealing WHO THEY ARE, WHAT THEY VALUE, and WHO THEY CARE about. It's as personal as we get. This is the REAL LIFE that our apps are meant to replicate and build upon.

What relationships are we fostering between person and app? What relationships are we accidentally inhibiting or denying?

cczona

In real life

I know your personhood better than you

sounds presumptuous

cczona

In real life

sounds presumptuous

cczona

In real life

Your existence isn't possible

sounds clueless

cczona

In real life

sounds clueless

cczona

In real life

Who you are is invalid

sounds arrogant

cczona

In real life

sounds arrogant

cczona

Data modeling is psychology, and it is philosophy. It reflects individuals' beliefs about reality, rather than reflecting reality itself. It rejects reality's complexity and richness. That's NOT what we set out to do. It's not the COMMUNITY we seek to build.

cczona

[READ]

cczona

"Be conservative in what you do, be liberal in what you accept from others."

-- Postel's Law

Postel's Law reminds us to [READ]. ARE we taking to heart that wisdom?

[PAUSE FOR END OF SECTION]

cczona

Schemas, plural

There are two kinds of schemas.

cczona

Mental Schema

• Pre-conceived ideas
• Framework for representing some aspect of the world
• System of organizing & perceiving new information

Mental schemas are [READ]

cczona

Database Schema

• Structure described in the database's language
• Blueprint for database construction
• Describes how the real world is being modeled

Database schemas are closely related to that. A database schema is a mental schema translated into blueprints for a database.

cczona

These are simply frontend manifestations of various individuals' MENTAL schemas. And [SLOW DOWN] when we look at them closely, we can see [BRIEF PAUSE] that schemas are the foundation [BRIEF PAUSE] for expressing

cczona

deeply intimate

things that are [READ]

cczona

important relationships

Schemas are the foundation for expressing things [READ]

cczona

self-image

Schemas are the foundation for expressing things [READ]

cczona

Schemas define the

user experience

Schemas define user experience

cczona

Our schemas & UX are leaving people behind

[read] We can fix that.

cczona

What benefit will the user notice?

By asking ourselves, "What benefit will the USER NOTICE?" THIS is not equivalent to "How will the user benefit?" Because THAT is a question which grants us too much latitude: to ASSUME that what we want is of course to their benefit, because it's gonna help us deliver a product that's Awesome.

[READ AGAIN] focuses us on EXPERIENCES.

cczona

Evaluating from user perspective gives us focus

There isn't ONE WAY. There isn't ONE RIGHT ANSWER. Because the NATURE of our dilemma is that users vary. And so do their social worlds. So we always need to understand varying approaches: how they fit different contexts, what tradeoffs they bring, AND the obvious, boring question: which options best serve THIS app's business requirements?

cczona

Checkbox / Radio button / Select menu / Ranges

Coerced | Discretionary | Guided

Corrective | Text Input / Textarea

Checkboxes, radio buttons, select menus, ranges: these each imply that EVERY possible value can be -- and is -- included. If any value isn't included there, it's the real world being rejected because it didn't match up with our mental schema.

cczona

Checkbox / Radio / Select

Coerced | Discretionary | Guided

Corrective | Text Input / Textarea

Checking a box is a one-step action. Entering a text string is not. So a freeform solution, such as a textarea or text input, isn't automatically exciting. BUT they can bring visible, meaningful benefit.

cczona

11 years

At MetaFilter, gender has been a text field for over a decade.

cczona

"...the early crowd at MeFi were often programmers and they hated the idea of dirty data collection..."

-- Matt Haughey, founder

Initially, some SHUDDERED at the thought. Because [READ]

cczona

All girl

XY

Fella

Queer

Fembot

Alto

It depends

MYOB

Dangly bits

Chicklet

Innie not outie

Convex

Sideburns

Ambisextrous

Member of the patriarchy

For about 5 seconds. Then they jumped on board. Because they could be creative and silly.

cczona

I speak using the male gender when required by language

50% quintessential tomboy, 50% total girly-girl

AND because they could express this THING about themselves fully. With authentic voice.

cczona

Transgender

Genderqueer

[PAUSE]

cczona

"The freeform gender field is something I cherish about Metafilter"

-- MeFi user

That text field grew into a beloved institution. Whatever you, as user, choose to put into that field says something revealing about who YOU are.

cczona

"It was one of the earliest indications I'd landed in the right place"

-- MeFi user

That you're allowed to put in ANYTHING -- or put in NOTHING AT ALL -- says something revealing about how MetaFilter envisions community. That schema's trust in users was the foundation for Metafilter users to ASK to share themselves even more.

cczona

In 2010, Diaspora likewise turned gender into a text field. Just like on MetaFilter, users felt set free by it.

There's an important distinction here, though. Metafilter is closed source, it's one developer, and it's entirely in English. Diaspora needed to meet a different set of needs. It's open source, and it's international.

cczona

Not amused

So some of its developers are not amused. It's fair to object that this approach wreaks chaos on internationalization of pronouns. [BEAT] And here's what I've got to say to that:

cczona

Internationalization is hellish, period

Internationalization of pronouns is hellish, period. You get no safety net in this. Constraining gender options doesn't solve translation of languages. Don't set yourself up for trying to deal with internationalization this way. It will fail you.

cczona

What's your legal gender?

• Indeterminate
• X
• Sex Not Specified

And if you really think that boiling this question down to "What's your legal gender?" is an easy out, let's talk about internationalization of that, too. Because this is the year that "indeterminate" has become a legal gender on German birth certificates, and that "X" is a legal gender on Australian passports. In New South Wales, Australia, "Sex Not Specified" is yet another legally recognized gender. Binary gender is, today, APP FAIL.

cczona

"the most complicated thing I've ever spent a lot of time learning about. And I've spent a lot of time learning about quantum mechanics."

-- Randall Munroe, xkcd

Randall Munroe, best known as xkcd, has examined the issue of pronouns many times, in depth, for English-language projects. He says it's [READ]. So here's what he ultimately concluded:

cczona

Ask

Ask

cczona

Which pronouns do you prefer?

he / him / himself / his / his
she / her / herself / hers / her
they / them / themself / theirs / their
it / it / itself / its / its
by name

Asking straight up, "Which pronouns do you prefer?" is truly the best he could come up with.

cczona

Which pronouns do you prefer?

he / him / himself / his / his
she / her / herself / hers / her
they / them / themself / theirs / their
personal name
other ____________

Or maybe slightly refining, like these.

And I know you may be looking at that 3rd row and protesting.

cczona

Which pronouns do you prefer?

he / him / himself / his / his
she / her / herself / hers / her
they / them / themself / theirs / their
personal name
other ____________

Our English teachers told us "they" and "their" aren't singular, right? Nope. They're plural AND SINGULAR, and have been for 400 years. Take the word of Jane Austen, Shakespeare, Chaucer, Lewis Carroll. Modern English authorities agree on this one, too. It's excellent English -- and excellent SOCIAL -- to use it when people ask us to.
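In code, the pronoun question above might be stored as a set of five forms per person rather than as a gender. This is a minimal sketch: the rows come from the slide, while the struct, its field names, and the helper methods are hypothetical, and using the name for every form in the "personal name" case is a simplification.

```ruby
# The five forms shown in each row of the slide.
PronounSet = Struct.new(:subject, :object, :reflexive, :possessive, :determiner)

# Presets taken from the slide's first three rows.
PRESETS = {
  "he"   => PronounSet.new("he", "him", "himself", "his", "his"),
  "she"  => PronounSet.new("she", "her", "herself", "hers", "her"),
  "they" => PronounSet.new("they", "them", "themself", "theirs", "their")
}.freeze

# "personal name": use the person's name wherever a pronoun would go.
def name_as_pronouns(name)
  PronounSet.new(name, name, name, "#{name}'s", "#{name}'s")
end

# Resolve a stored choice: a preset key, :name, or a custom set ("other").
# Anything unrecognized, including no answer, falls back to the name.
def pronouns_for(choice, name:)
  case choice
  when PronounSet then choice
  when :name      then name_as_pronouns(name)
  else PRESETS.fetch(choice) { name_as_pronouns(name) }
  end
end

def byline(name, choice)
  set = pronouns_for(choice, name: name)
  "#{name} updated #{set.determiner} profile."
end

puts byline("Sam", "they")
puts byline("Sam", :name)
puts byline("Sam", PronounSet.new("ze", "hir", "hirself", "hirs", "hir"))
```

Falling back to the person's name when no choice is stored means the app never has to guess, which keeps the field discretionary.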

cczona

Even Facebook has taken note that its okay Theyve been using this construction for at least 5 years and are doing fine -)

cczona

Home is the place where when you have to go there they have to take you in

mdashRobert Frost

We aspire to make our social apps feel like home We can do that by making development choices that say Hey user we GET you Be as individual and messy as you want to We can handle _you_ being _you_

Social Research

When collecting data on people were in a different realm Social sciences If you want to run useful analytics about personal attributes and behavior then data collection needs to meet at least the two minimum criteria for social science research

ExhaustiveEvery possible option

A fields values must include every possible option

Mutually ExclusiveNo overlap exists between them

And the fields options must be mutually exclusive

How many social apps have both of those bare minimum criteria covered

cczona

Weve seen this one enough At least that set of values seems covered fully

cczona

`

But its not

cczona

Still lacking

cczona

Trying hard But still inadequate to express the real worlds complexity

We do realize that the real world has great variety Of course we do But its hard to know how to handle it

cczona

What user experience does this schema drive us toward

Its easy to figure Eh we can refactor later But initial choices set differing user experiences in motion Thinking at the outset about the real worlds variety and complexity starts us asking early questions that set foundation for social app-building Ask What user experience does this schema drive us toward

cczona

Data doesnrsquot have to be for analysis

Its easy to get into the habit of structuring data for easy analysis But we can choose to look at human data from a different perspective Step back Wallow in the users perspective

cczona

Data can be sheerexpressiveness

Data can be sheer expressiveness Data that has character individualism distinctiveness

cczona

What we want What we get

bull Structured

bull Predictablebull Validations exceptions

bull Conditionals partials

bull Relational

bull Indexed bull Premature optimization

bull Exhaustivebull Cultural variability

bull Individual POV

bull Moving target

bull Easy analytics

bull Data-driven decisionsbull Decisions based on false

premises

As developers we have a vision of what a good codebase should be and not be

[READ]

In the most sensible of ways we often arrive at solutions that are factually TRUTHY while far removed from real life utility

cczona

What is your religion if any

ARIS is the largest ongoing survey of Americans religious identification It asks this simple OPEN-ENDED question [slide]

Which nets over 100 unique answers Which if youre making a form based on that is tricky Theres no form element that makes it easy for our users to pick themselves out of a list of so many possibilities

cczona

ARIS found though that these could be compressed into 13 major categories More manageable list right We could use that for a form But eh a lot of those are edge cases Wed rather want to focus on genuinely major groupings

cczona

What is your religion if any

Christian 76

Other 4

None 15

Dont Know or Refused 5

Which brings us down to this At least a quarter of Americans are Christian Done Every other religion would just be clutter Edge cases

And then theres this sort of crummy data with it Wed probably assign nil value for more than one of these categories right

cczona

What is your religion

Christian 76

Other 4

na 20

Fixed Which focuses attention on a problem here 1 in 5 are not useful answers from an advertisers perspective or for out own analytics The NILS -- gotta go

cczona

What is your religion

Christian 76

Other 24

So what were left with is a good clear list It covers ALL the big stuff When you get reductive enough for Americans religion it a binary Which from a storage standpoint is great Booleans Score

cczona

Religion

But we dont do THIS It covers the biggest categories But oh how it leaves people out

cczona

People arent edge cases

People arent edge cases

And theyre pushing back on apps that treat them that way

[LONG BEAT LONG] [BREATH] [CHANGE GEARS]

So Reductive has big problems So does scaling upward

cczona

Balancing between approaches

As engineers its instinctually uncomfortable to DELIBERATELY NOT STRUCTURE DATA for easy analysis I feel ya I really do This freaks me too But again the foundational question is What benefit will the user notice What identities are okay with us

cczona

And if necessary we CAN strike a middleground This is where guided response comes in Autosuggest

cczona

CheckboxRadioSelect

TextareaText

Required OptionalAutosuggest

Minimal Suggest

When theres a subset of values that youre most interested in do minimal suggest instead Autosuggest using just the handful of values you care about

Structured data from those who want to give it Free form to incite expressiveness in those who want THAT

Unguided TextOf those who use MetaFilters gender field

40 of responses are f m female male

[read] So structured data IS there This can be a balanced solution in many cases where youre willing to tolerate some ambiguity Of course there are tradeoffsData quantity is lower Freed to opt out of proving personal info many doOn the other hand data quality should improve

Optional Select60 of Facebook users select a relationship status

Its fine to mix and match here Find the right approach for your users and your apps business objectives Facebook for instance makes relationship status completely optional but coercive for those who do opt-in to setting a value Most users do opt-in 60 of them select a relationship status

cczona

We want everyone to feel excited what weve

built

The bottom line is that [read]

cczona

We want users to feel passionate about their

involvement

[read]

cczona

Analytics investments amp monetization are

based on a premise that data is accurate

[read]

BUT the data has been collected by coercive approaches the RISK is that--

cczona

this premise is garbage Some people are lying because lying has been made requirement for getting past the barriers So conclusions drawn from that bad data can misdirect decision-making about the next stage of development

cczona

CheckboxRadioSelect

TextareaText

Required

Corrective

DiscretionaryGuided

The restrictive options that stuff at bottom left those dont actually have to be marked required

cczona

ADD column gender NOT NULL VARCHAR(6) DEFAULT female

But the way we setup schema often embeds assumptions that we should and we will So we do An attribute that is ambiguously named is destined to become a form field whose data wanders away from its original intent and code that misunderstands how to use it A field thats not allowed to be null is destined to be mandatory A field that is assigned a maximum length is an ASSERTION that all possible values are knowable and fit within it

cczona

ADD column gender_identity NOT NULL VARCHAR

BOOM This is foundation for a whole different user experience And the cool ninja move was that we decided to DO LESS Make this stuff flexible upfront Optimize storage later Decide whats valid later

cczona

CheckboxRadioSelect

TextareaText

Required Optional

Alteration

Guided

So there you have it a discretionary field Whether to respond is left up to the user

cczona

tstring relationship_status null =gt true

As developers we may upon THIS expression as redundant completely unnecessary Duh null is true by default But making that explicit is a communication to the team and to your future self Its a statement of intent Its documenting a product decision

cczona

FacebookSingle

In a relationship

Engaged

Married

Its complicated

Open relationship

What would a canonical set of relationship statuses look like Three years ago Facebook figured this list was pretty good Arguably pretty progressive too right Users disagreed Strongly

cczona

FacebookSingle

In a relationship

Engaged

Married

Its complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

Under pressure Facebook nearly doubled the options in just two years time

cczona

Google+Single

In a relationship

Engaged

Married

Its complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I dont want to say

While Google+ has largely adopted that list it has not included Separated or Divorced Notice that they also added something choice Opt out of labeling

cczona

Allowing users to identify their relationships with labels of greater personal significanceThats being driven by people rejecting a user experience that isnt working for them

cczona

Its a scope

How did some status seem universal while others werent Naming a thing creates scope The assumed validity of a fields values get constrained as soon as the field is named

cczona

marital_statusSingle

In a relationship

Engaged

Married

Its complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I dont want to say

Marital status for instance might lead to a list similar to this one A person is assumed to be either unmarried preparing to be married currently married or formerly married

cczona

legal_marital_statusSingle

In a relationship

Engaged

Married

Its complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I dont want to say

Marital status for instance might lead to a list similar to this one A person is assumed to be either unmarried preparing to be married currently married or formerly married

cczona

relationship_statusSingle

In a relationship

Engaged

Married

Its complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I dont want to say

Whereas relationship status might lead to a list more like this one In which one either has a current relationship -- or is defined by the absence of any

cczona

singleness_status

As soon as you name a field you define its paradigm and possibilities There are important difference in what each of these envision and are able to measure Naming fields -- with great specificity upfront -- Makes analyses more powerful later

cczona

singleness_rating

No chance whatsoever Suuuuuper duper single

1000

If you change the name you shift the paradigm and possibilities Theres important difference in these what these collect set out to measure Naming fields -- With great specificity upfront --Makes analyses more powerful later

cczona

It's Complicated, In a Relationship

Married, Divorced

Widowed, Single

We go through life experiencing many relationships. They don't all have pre-determined labels. And new relationship identities don't necessarily leave old ones behind.

cczona

"I like to be truthful, and 'It's Complicated' is really deceiving. It is not complicated. I am separated from my husband, who I am still legally married to."

-- Facebook user

[READ] Why should anyone have to feel like they're required to disavow a relationship that's meaningful for them?

It cuts deeply sometimes. Is a widow only "Widowed" until she resumes dating? Why does she have to be confronted with a decision like that just to use someone's app?

cczona

Okay, this one. Spot the fatal flaw. [WAIT]

cczona

buffer overflow

An open relationship is definitionally a 1:Many join -- WITH the usual engineering understanding that the actual number of relationships may be 0, 1, OR many.
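One way to picture that 0, 1, or many shape is the rough Ruby sketch below. It is not taken from the talk or from any real app: the `Person` and `Relationship` names, and the idea of storing a free-text label per relationship, are assumptions made purely for illustration.

```ruby
# Hypothetical model: a person holds zero, one, or many relationships,
# each with its own label, instead of a single status column.
Relationship = Struct.new(:partner_name, :label)

class Person
  attr_reader :name, :relationships

  def initialize(name)
    @name = name
    @relationships = [] # 0, 1, or many entries are all valid
  end

  # The label is whatever the user says it is; nothing is checked against a list.
  def add_relationship(partner_name, label)
    @relationships << Relationship.new(partner_name, label)
    self
  end

  def labels
    @relationships.map(&:label)
  end
end

sam = Person.new("Sam")
sam.add_relationship("Alex", "Widowed")
sam.add_relationship("Jo", "Dating")

pat = Person.new("Pat") # no relationships recorded, and that is fine too
```

With a collection, a new relationship does not have to overwrite an older one that still matters to the person.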

cczona

Choose (one): evasive, inauthentic

But it's not funny from the personal perspective. This schema forces the person to either LOOK EVASIVE or else BE INAUTHENTIC. This is what happens when we try to throw more labels at the problem instead of re-examining the schema's assumptions.

cczona

Facebook relationship status by user age

srsly?

When we don't give people means to express what's real and important for them, they hack around us. Are a quarter of Facebook's 13-year-olds really married? Probably not. What they probably have is a BEST FRIEND, and "married" is the nearest equivalent available for them to express "THIS PERSON is my most important relationship".

The dogged pursuit of nice, clean, CRUNCHABLE data takes us to such wrong places. And it makes our users want to work against a system that doesn't acknowledge their lives' reality.

cczona

We build or break community with each line of code

As developers, our lines of code build -- or break -- an app's community. So it's worth making conscious choices as we go.

cczona

Modeling the real world is complex

[READ] And that's okay.

cczona

Assuming we know who users are surrenders opportunity to learn who they are

But [read]. Early constraints in schema NET crappy, misleading data. [beat] So keep constraints out of human and social schema, at least at first. Gather enough initial response to do some data mining. Watch that data for a while. Keep an eye out for emergent trends. What DISCOVERIES can you make? How can they shape the app's growth and evolution?
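As a rough illustration of watching unconstrained data for emergent trends, the sketch below tallies free-text answers after only light normalization. The sample responses and the `normalize` helper are made up for this example; a real app would read its own stored values.

```ruby
# Hypothetical free-text responses, including blanks and an unanswered (nil) entry.
responses = ["Female", "female ", "F", "Queer", "male", "M", "It depends", nil, "", "queer"]

# Light normalization only: trim whitespace and ignore case.
# Blank answers are counted as opt-outs rather than thrown away.
def normalize(raw)
  text = raw.to_s.strip.downcase
  text.empty? ? :no_answer : text
end

# Count how often each normalized answer appears.
counts = responses.each_with_object(Hash.new(0)) do |raw, tally|
  tally[normalize(raw)] += 1
end

# Most frequent answers first.
ranked = counts.sort_by { |_answer, count| -count }
```

Nothing is rejected at collection time; any grouping or validation can be decided later, once the tallies show what people actually write.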

cczona

Quality

Data quality improves when lies are merely optional, not required.

cczona

Specificity

Data becomes like the real world itself: rich and specific. So we can unearth the patterns that are undetectable when data is generic.

cczona

Adaptability

We get to make discoveries and re-envision possibilities.

We adapt quickly.

cczona

Loyalty

And their response to all this? Engagement. Passion. LOYALTY.

cczona

THOSE are foundations for great user experiences. [LOOK UP. FINALITY]

@cczona, cczona@gmail.com, http://cczona.com

Questions? Conversation?

Oh Hell Yeah

Carina C. Zona

cczona

Resources

cczona

Data Science & Information Architecture

• Sociological normalization

• Database normalization

• Using Machine Learning On Social Networks To Figure Out What You Should Read On The Web

• NoSQL Data Modeling Techniques

• Data & Reality, by William Kent

• Bad Data Handbook, Q. Ethan McCallum

• Data Science of the Facebook World

cczona

Relationships

• Does Facebook Hurt Relationships?

• Facebook Adds LGBT-Friendly Relationship Status Options

• Facebook Targeting by Relationship Status & Workplace

• Your Facebook Relationship Status: It's Complicated

• Gay Marriage: The Database Engineering Perspective

cczona

Sex & Genders

• "Disalienation: Why Gender is a Text Field on Diaspora"

• "Gender & Drop Down Menus"

• "Sex & Gender"

• "Bucket Gender"

• Recommendations for Inclusive Data Collection of Trans People

cczona

Names

• Falsehoods Programmers Believe About Names

• Your Last Name Contains Invalid Characters

• W3C Internationalization: Personal Names Around the World

• Spanish Names

• Chinese Names

cczona

More

• Redesigning the Country Selector

• American Religious Identification Survey: Summary Report 2009

• Linguistic Potluck: Crowdsourcing Internationalization in Rails

cczona

Image Credits

Postsecret

Facebook

OKCupid

Google+

Metafilter

Diaspora

Flickr

FetLife

Kotangle

cutestpaw.com

hdwallpapers.in

xkcd

cczona

Many Thanks

Chiu-Ki Chan

Estelle Weyl

Heather Rivers

Heroku

Jeremy Dunck

Josh Susser

Michele Titolo

NIRD LLC

Renée De Voursney

Sarah Mei

San Francisco Sex Information

Yoz Grahame

cczona

Get in Touch

Carina C. Zona

@cczona

http://cczona.com

cczona@gmail.com

http://slideshare.net/cczona

http://linkedin.com/in/cczona


cczona

In real life

I know your personhood better than you

sounds presumptuous

cczona

In real life

sounds presumptuous

cczona

In real life

Your existence isn't possible

sounds clueless

cczona

In real life

sounds clueless

cczona

In real life

Who you are is invalid

sounds arrogant

cczona

In real life

sounds arrogant

cczona

Data modeling is psychology, and it is philosophy. It reflects individuals' beliefs about reality rather than reflecting reality itself. It rejects reality's complexity and richness. That's NOT what we set out to do. It's not the COMMUNITY we seek to build.

cczona

[READ]

cczona

Be conservative in what you do; be liberal in what you accept from others.

-- Postel's Law

Postel's Law reminds us to [READ]. ARE we taking to heart that wisdom?

[PAUSE FOR END OF SECTION]

cczona

Schemas, plural

There are two kinds of schemas.

cczona

Mental Schema

• Pre-conceived ideas

• Framework for representing some aspect of the world

• System of organizing & perceiving new information

Mental schemas are [READ].

cczona

Database Schema

• Structure described in the database's language

• Blueprint for database construction

• Describes how the real world is being modeled

Database schemas are closely related to that. A database schema is a mental schema translated into blueprints for a database.

cczona

These are simply frontend manifestations of various individuals' MENTAL schemas. And [SLOW DOWN] when we look at them closely, we can see [BRIEF PAUSE] that schemas are foundation [BRIEF PAUSE] for expressing

cczona

deeply intimate

things that are [READ]

cczona

important relationships

Schemas are foundation for expressing things [READ]

cczona

self-image

Schemas are foundation for expressing things [READ]

cczona

Schemas define the user experience

Schemas define user experience.

cczona

Our schemas & UX are leaving people behind

[read] We can fix that.

cczona

What benefit will the user notice?

By asking ourselves "What benefit will the USER NOTICE?" THIS is not equivalent to "How will the user benefit?" Because THAT is a question which grants us too much latitude to ASSUME that what we want is of course to their benefit, because it's gonna help us deliver a product that's Awesome.

[READ AGAIN] focuses us on EXPERIENCES.

cczona

Evaluating from user perspective gives us focus

There isn't ONE WAY. There isn't ONE RIGHT ANSWER. Because the NATURE of our dilemma is that users vary. And so do their social worlds. So we always need to understand varying approaches: how they fit different contexts, what tradeoffs they bring, AND the obvious, boring question: which options best serve THIS app's business requirements?

cczona

Checkbox, Radio button, Select menu, Ranges

Coerced, Discretionary, Guided

Corrective, Text Input, Textarea

Checkboxes, radio buttons, select menus, ranges: these each imply that EVERY possible value can be -- and is -- included. If any value isn't included there, it's the real world being rejected because it didn't match up with our mental schema.

cczona

Checkbox, Radio, Select

Coerced, Discretionary, Guided

Corrective, Text Input, Textarea

Checking a box is a one-step action. Entering a text string is not. So a freeform solution such as a textarea or text input isn't automatically exciting. BUT it can bring visible, meaningful benefit.

cczona

11 years

At MetaFilter, gender has been a text field for over a decade.

cczona

"...the early crowd at MeFi were often programmers and they hated the idea of dirty data collection..."

-- Matt Haughey, founder

Initially some SHUDDERED at the thought. Because [READ]

cczona

All girl

XY

Fella

Queer

Fembot

Alto

It depends

MYOB

Dangly bits

Chicklet

Innie not outie

Convex

Sideburns

Ambisextrous

Member of the patriarchy

For about 5 seconds. Then they jumped on board. Because they could be creative and silly.

cczona

I speak using the male gender when required by language

50% quintessential tomboy, 50% total girly-girl

AND because they could express this THING about themselves fully. With authentic voice.

cczona

Transgender

Genderqueer

[PAUSE]

cczona

"The freeform gender field is something I cherish about Metafilter"

-- MeFi user

That text field grew into a beloved institution. Whatever you as a user choose to put into that field says something revealing about who YOU are.

cczona

"It was one of the earliest indications I'd landed in the right place"

-- MeFi user

That you're allowed to put in ANYTHING -- or put in NOTHING AT ALL -- says something revealing about how MetaFilter envisions community. That schema's trust in users was the foundation for MetaFilter users to ASK to share themselves even more.

cczona

In 2010, Diaspora likewise turned gender into a text field. Just like on MetaFilter, users felt set free by it.

There's an important distinction here, though. MetaFilter is closed source, it's one developer, and it's entirely in English. Diaspora needed to meet a different set of needs. It's open source and it's international.

cczona

Not amused

So some of its developers are not amused. It's fair to object that this approach wreaks chaos on internationalization of pronouns. [BEAT] And here's what I've got to say to that:

cczona

Internationalization is hellish. Period.

Internationalization of pronouns is hellish. Period. You get no safety net in this. Constraining gender options doesn't solve translation of languages. Don't set yourself up for trying to deal with internationalization this way. It will fail you.

cczona

What's your legal gender?

• Indeterminate

• X

• Sex Not Specified

And if you really think that boiling this question down to "What's your legal gender?" is an easy out, let's talk about internationalization of that too. Because this is the year that "indeterminate" has become a legal gender on German birth certificates, and that "X" is a legal gender on Australian passports. In New South Wales, Australia, "Sex Not Specified" is yet another legally-recognized gender. Binary gender is, today, APP FAIL.

cczona

"the most complicated thing I've ever spent a lot of time learning about.

And I've spent a lot of time learning about quantum mechanics."

-- Randall Munroe, xkcd

Randall Munroe, best known for xkcd, has examined the issue of pronouns many times, in depth, for English-language projects. He says it's [READ]. So here's what he ultimately concluded:

cczona

Ask

Ask

cczona

Which pronouns do you prefer?

he / him / himself / his / his

she / her / herself / hers / her

they / them / themself / theirs / their

it / it / itself / its / its

by name

Asking straight up, "Which pronouns do you prefer?", is truly the best he could come up with.
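To make the five-form rows on that slide concrete, here is a small Ruby sketch. It is only an illustration under stated assumptions: the `PronounSet` struct, the `pronouns_for` helper, the sample message, and the handling of "by name" (every form falls back to the person's name) are all hypothetical, not something the talk or any particular app specifies.

```ruby
# Five grammatical forms per set, in the order shown on the slide:
# subject, object, reflexive, possessive pronoun, possessive determiner.
PronounSet = Struct.new(:subject, :object, :reflexive, :possessive, :determiner)

PRONOUNS = {
  "he"   => PronounSet.new("he", "him", "himself", "his", "his"),
  "she"  => PronounSet.new("she", "her", "herself", "hers", "her"),
  "they" => PronounSet.new("they", "them", "themself", "theirs", "their"),
  "it"   => PronounSet.new("it", "it", "itself", "its", "its")
}.freeze

# Assumption: any other choice (such as "by name") means no pronoun at all,
# so every form falls back to the user's own name.
def pronouns_for(choice, name)
  PRONOUNS.fetch(choice) do
    PronounSet.new(name, name, name, "#{name}'s", "#{name}'s")
  end
end

# Example of a sentence an app might generate from the stored choice.
def updated_profile_message(name, choice)
  "#{name} updated #{pronouns_for(choice, name).determiner} profile."
end
```

The stored value is the user's own answer to the question; the app never has to infer pronouns from a gender field.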

cczona

Which pronouns do you prefer?

he / him / himself / his / his

she / her / herself / hers / her

they / them / themself / theirs / their

personal name

other: ____________

Or maybe slightly refining, like these.

And I know you may be looking at that 3rd row and protesting.

cczona

Which pronouns do you prefer?

he / him / himself / his / his

she / her / herself / hers / her

they / them / themself / theirs / their

personal name

other: ____________

Our English teachers told us "they" and "their" aren't singular, right? Nope. They're plural AND SINGULAR, and have been for 400 years. Take the word of Jane Austen, Shakespeare, Chaucer, Lewis Carroll. Modern English authorities agree on this one too. It's excellent English -- and excellent SOCIAL -- to use it when people ask us to.

cczona

Even Facebook has taken note that it's okay. They've been using this construction for at least 5 years and are doing fine :-)

cczona

"Home is the place where, when you have to go there, they have to take you in."

-- Robert Frost

We aspire to make our social apps feel like home. We can do that by making development choices that say, "Hey user, we GET you. Be as individual and messy as you want to. We can handle _you_ being _you_."

Social Research

When collecting data on people, we're in a different realm: social sciences. If you want to run useful analytics about personal attributes and behavior, then data collection needs to meet at least the two minimum criteria for social science research.

Exhaustive: Every possible option

A field's values must include every possible option.

Mutually Exclusive: No overlap exists between them

And the field's options must be mutually exclusive.
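The two criteria can be phrased as simple checks. The Ruby sketch below is a hedged illustration, not a research method: the option list, the sample self-descriptions, and the helper names are invented for this example.

```ruby
# Hypothetical option list offered by a form.
OPTIONS = ["Single", "Married", "Divorced", "Widowed"].freeze

# Hypothetical data: the labels each person says truly describe them.
self_descriptions = [
  ["Married"],
  ["Widowed", "Single"],     # two offered options both fit
  ["Domestic partnership"],  # no offered option fits
  ["Divorced"]
]

# Which of the offered options apply to one person.
def matches(labels)
  labels & OPTIONS
end

# Exhaustive: every person finds at least one option that fits.
def exhaustive?(people)
  people.all? { |labels| matches(labels).any? }
end

# Mutually exclusive: no person fits more than one option.
def mutually_exclusive?(people)
  people.all? { |labels| matches(labels).size <= 1 }
end
```

For the sample data both checks come back false, which is the situation the talk describes: the list neither covers everyone nor keeps its categories apart.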

How many social apps have both of those bare minimum criteria covered?

cczona

We've seen this one enough. At least that set of values seems covered fully.

cczona

But it's not.

cczona

Still lacking

cczona

Trying hard. But still inadequate to express the real world's complexity.

We do realize that the real world has great variety. Of course we do. But it's hard to know how to handle it.

cczona

What user experience does this schema drive us toward?

It's easy to figure, "Eh, we can refactor later." But initial choices set differing user experiences in motion. Thinking at the outset about the real world's variety and complexity starts us asking early questions that set foundation for social app-building. Ask: "What user experience does this schema drive us toward?"

cczona

Data doesn't have to be for analysis

It's easy to get into the habit of structuring data for easy analysis. But we can choose to look at human data from a different perspective. Step back. Wallow in the user's perspective.

cczona

Data can be sheer expressiveness

Data can be sheer expressiveness. Data that has character, individualism, distinctiveness.

cczona

What we want

• Structured

• Predictable

• Relational

• Indexed

• Exhaustive

• Easy analytics

• Data-driven decisions

What we get

• Validations, exceptions

• Conditionals, partials

• Premature optimization

• Cultural variability

• Individual POV

• Moving target

• Decisions based on false premises

As developers, we have a vision of what a good codebase should be, and not be.

[READ]

In the most sensible of ways, we often arrive at solutions that are factually TRUTHY while far removed from real-life utility.

cczona

What is your religion, if any?

ARIS is the largest ongoing survey of Americans' religious identification. It asks this simple, OPEN-ENDED question: [slide]

Which nets over 100 unique answers. Which, if you're making a form based on that, is tricky. There's no form element that makes it easy for our users to pick themselves out of a list of so many possibilities.

cczona

ARIS found, though, that these could be compressed into 13 major categories. More manageable list, right? We could use that for a form. But, eh, a lot of those are edge cases. We'd rather focus on genuinely major groupings.

cczona

What is your religion, if any?

Christian 76%

Other 4%

None 15%

Don't Know or Refused 5%

Which brings us down to this. At least three quarters of Americans are Christian. Done. Every other religion would just be clutter. Edge cases.

And then there's this sort of crummy data with it. We'd probably assign a nil value for more than one of these categories, right?

cczona

What is your religion?

Christian 76%

Other 4%

n/a 20%

Fixed. Which focuses attention on a problem here: 1 in 5 are not useful answers from an advertiser's perspective, or for our own analytics. The NILS -- gotta go.

cczona

What is your religion?

Christian 76%

Other 24%

So what we're left with is a good, clear list. It covers ALL the big stuff. When you get reductive enough, for Americans, religion is a binary. Which from a storage standpoint is great. Booleans! Score!

cczona

Religion

But we don't do THIS. It covers the biggest categories. But oh, how it leaves people out.

cczona

People aren't edge cases

People aren't edge cases.

And they're pushing back on apps that treat them that way.

[LONG BEAT. LONG] [BREATH] [CHANGE GEARS]

So. Reductive has big problems. So does scaling upward.

cczona

Balancing between approaches

As engineers, it's instinctually uncomfortable to DELIBERATELY NOT STRUCTURE DATA for easy analysis. I feel ya. I really do. This freaks me too. But again, the foundational question is: What benefit will the user notice? What identities are okay with us?

cczona

And if necessary, we CAN strike a middle ground. This is where guided response comes in. Autosuggest.

cczona

Checkbox, Radio, Select

Textarea, Text

Required, Optional, Autosuggest

Minimal Suggest

When there's a subset of values that you're most interested in, do minimal suggest instead. Autosuggest using just the handful of values you care about.

Structured data from those who want to give it. Free form to incite expressiveness in those who want THAT.
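A minimal-suggest field might behave roughly like the Ruby sketch below. The suggestion list and the `suggest` and `accept` helpers are hypothetical names chosen for this example; the point is only that suggestions guide without ever restricting what gets stored.

```ruby
# Hypothetical short list: only the handful of values the app cares most about.
SUGGESTIONS = ["female", "male", "genderqueer", "transgender"].freeze

# Offer suggestions that start with what has been typed so far (case-insensitive).
def suggest(typed)
  prefix = typed.to_s.strip.downcase
  return [] if prefix.empty?
  SUGGESTIONS.select { |s| s.start_with?(prefix) }
end

# Whatever the user finally submits is kept as typed; a blank answer becomes nil.
def accept(submitted)
  text = submitted.to_s.strip
  text.empty? ? nil : text
end
```

Users who pick a suggestion produce structured values; everyone else keeps their own words.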

Unguided Text

Of those who use MetaFilter's gender field, 40% of responses are "f", "m", "female", "male"

[read] So structured data IS there. This can be a balanced solution in many cases where you're willing to tolerate some ambiguity. Of course there are tradeoffs. Data quantity is lower. Freed to opt out of providing personal info, many do. On the other hand, data quality should improve.

Optional Select

60% of Facebook users select a relationship status

It's fine to mix and match here. Find the right approach for your users and your app's business objectives. Facebook, for instance, makes relationship status completely optional, but coercive for those who do opt in to setting a value. Most users do opt in: 60% of them select a relationship status.
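That "optional, but coercive once you opt in" combination can be sketched as a single validation rule. The Ruby below is illustrative only: the list is the six-option one quoted in this talk, and `valid_relationship_status?` is a hypothetical helper, not Facebook's code.

```ruby
# The shorter status list quoted in the talk, used here as sample data.
STATUSES = ["Single", "In a relationship", "Engaged", "Married",
            "It's complicated", "Open relationship"].freeze

# nil means the user chose not to answer, which is always acceptable.
# Any non-nil value must come from the list.
def valid_relationship_status?(value)
  value.nil? || STATUSES.include?(value)
end
```

Leaving the field empty is valid; only a value outside the list is refused.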

cczona

We want everyone to feel excited about what we've built

The bottom line is that [read].

cczona

We want users to feel passionate about their involvement

[read]

cczona

Analytics, investments, & monetization are based on a premise that data is accurate

[read]

BUT if the data has been collected by coercive approaches, the RISK is that--

cczona

this premise is garbage. Some people are lying because lying has been made a requirement for getting past the barriers. So conclusions drawn from that bad data can misdirect decision-making about the next stage of development.

cczona

Checkbox, Radio, Select

Textarea, Text

Required

Corrective

Discretionary, Guided

The restrictive options, that stuff at bottom left: those don't actually have to be marked required.

cczona

ADD COLUMN gender VARCHAR(6) NOT NULL DEFAULT 'female'

But the way we set up schema often embeds assumptions that we should, and we will. So we do. An attribute that is ambiguously named is destined to become a form field whose data wanders away from its original intent, and code that misunderstands how to use it. A field that's not allowed to be null is destined to be mandatory. A field that is assigned a maximum length is an ASSERTION that all possible values are knowable and fit within it.
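To see what a six-character limit asserts, the Ruby sketch below checks a few of the self-descriptions quoted earlier in the talk against it. This only compares string lengths; what a real database does with an over-length value (reject it or truncate it) depends on the engine and its settings, so that part is left out.

```ruby
# Stand-in for the column's limit; the 6 comes from VARCHAR(6) on the slide.
MAX_LENGTH = 6

# A few self-descriptions quoted earlier in the talk.
answers = ["Queer", "Fembot", "It depends", "Transgender", "Genderqueer"]

# Split answers into those that fit the limit and those that cannot be stored whole.
fits, too_long = answers.partition { |answer| answer.length <= MAX_LENGTH }
```

Most of the sample answers land in `too_long`, so the length limit alone already excludes them before any other rule is applied.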

cczona

ADD COLUMN gender_identity VARCHAR NOT NULL

BOOM. This is foundation for a whole different user experience. And the cool ninja move was that we decided to DO LESS. Make this stuff flexible upfront. Optimize storage later. Decide what's valid later.

cczona

Checkbox, Radio, Select

Textarea, Text

Required, Optional

Alteration

Guided

So there you have it: a discretionary field. Whether to respond is left up to the user.

cczona

t.string :relationship_status, :null => true

As developers, we may look upon THIS expression as redundant, completely unnecessary. "Duh, null is true by default." But making that explicit is a communication to the team and to your future self. It's a statement of intent. It's documenting a product decision.

cczona

Facebook

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

What would a canonical set of relationship statuses look like? Three years ago, Facebook figured this list was pretty good. Arguably pretty progressive too, right? Users disagreed. Strongly.

cczona

Facebook

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

Under pressure, Facebook nearly doubled the options in just two years' time.

cczona

Google+

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I don't want to say

While Google+ has largely adopted that list, it has not included Separated or Divorced. Notice that they also added something choice: opting out of labeling.

cczona


Page 14: Schemas for the Real World [Madison RubyConf 2013]

cczona

In real life

sounds presumptuous

cczona

In real life

Your existence isnt possible

sounds clueless

cczona

In real life

sounds clueless

cczona

In real life

Who you are is invalid

sounds arrogant

cczona

In real life

sounds arrogant

cczona

Data modeling is psychology and it is philosophy It reflect individuals beliefs about reality rather than reflecting reality itself It rejects realitys complexity and richness Thats NOT what we set out to do Its not the COMMUNITY we seek to build

cczona

[READ]

cczona

Be conservative in what you do be liberal in what you accept from others

-Postels Law

Postels Law reminds us to [READ]ARE we taking to heart that wisdom

[PAUSE FOR END OF SECTION]

cczona

Schemas plural

There are two kinds of schemas

cczona

Mental Schema

bull Pre-conceived ideas

bull Framework for representing some aspect of the world

bull System of organizing amp perceiving new information

Mental Schema are [READ]

cczona

Database Schema

bull Structure described in the databases language

bull Blueprint for database construction

bull Describes how the real world is being modeled

Database schemas are closely related to that Its a mental schema translated into blueprints for a database

cczona

These are simply frontend manifestations of various individuals MENTAL schemas And [SLOW DOWN] when we look at them closely we can see [BRIEF PAUSE] that schemas are foundation [BRIEF PAUSE] for expressing

cczona

deeply intimate

things that are [READ]

cczona

important relationships

Schemas are foundation for expressingthings [READ]

cczona

self-image

Schemas are foundation for expressingthings [READ]

cczona

Schemas define the

user experience

Schemas define user experience

cczona

Our schemas are leaving people behind

amp UX

[read]We can fix that

cczona

What benefit will the user notice

By asking ourselves What benefit will the USER NOTICE THIS is not equivalent to How will the user benefit Because THAT is a question which grants us too much latitude To ASSUME that what we want is of course to their benefit because its gonna help us deliver a product thats Awesome

[READ AGAIN] focuses us on EXPERIENCES

cczona

Evaluating from user perspective gives us focus

There isnt ONE WAY There isnt ONE RIGHT ANSWER Because the NATURE of our dilemma is that users vary And so do their social worlds So we always need to understand varying approaches How they fit different contexts What tradeoffs they bring AND the obvious boring question which options best serve THIS apps business requirements

cczona

CheckboxRadio buttonSelect menuRanges

Coerced DiscretionaryGuided

Corrective Text InputTextarea

Checkboxes radio buttons select menus ranges these each imply that EVERY possible value can -- and is -- included If any value isnt included there its real world being rejected because it didnt match up with our mental schema

cczona

CheckboxRadioSelect

Coerced DiscretionaryGuided

Corrective Text InputTextarea

Checking a box is a one step action Entering a text string is not So a freeform solution such as a textarea or text input isnt automatically exciting BUT they can bring visible meaningful benefit

cczona

11 years

At MetaFilter gender has been a text field for over a decade

cczona

hellipthe early crowd at MeFi were often programmers and they hated the idea of dirty data collectionhellip

mdashMatt Haughey founder

Initially some SHUDDERED at the thought Because [READ]

cczona

All girl

XY

Fella

Queer

Fembot

Alto

It depends

MYOB

Dangly bits

Chicklet

Innie not outie

Convex

Sideburns

Ambisextrous

Member of the patriarchy

For about 5 seconds Then they jumped on board Because they could be creative and silly

cczona

I speak using the male gender when required by language 50 quintessential

tomboy 50 total girly-girl

AND because they could express this THING about themselves fully With authentic voice

cczona

TransgenderGenderqueer

[PAUSE]

cczona

The freeform gender field is something I cherish about Metafilter

mdashMeFi user

That text field grew into a beloved institution Whatever you as user choose to put into that field says something revealing about who YOU are

cczona

It was one of the earliest indications Id landed in the right place

mdashMeFi user

That youre allowed to put in ANYTHING -- or put in NOTHING AT ALL -- says something revealing about what how MetaFilter envisions community That schemas trust in users was the foundation for Metafilter users to ASK to share themselves even more

cczona

In 2010, Diaspora likewise turned gender into a text field. Just like on MetaFilter, users felt set free by it.

There's an important distinction here, though. Metafilter is closed source, it's one developer, and it's entirely in English. Diaspora needed to meet a different set of needs. It's open source, and it's international.

cczona

Not amused

So some of its developers are not amused. It's fair to object that this approach wreaks chaos on internationalization of pronouns. [BEAT] And here's what I've got to say to that:

cczona

Internationalization is hellish. Period.

Internationalization of pronouns is hellish. Period. You get no safety net in this. Constraining gender options doesn't solve translation of languages. Don't set yourself up for trying to deal with internationalization this way. It will fail you.

cczona

What's your legal gender?

• Indeterminate

• X

• Sex Not Specified

And if you really think that boiling this question down to "What's your legal gender?" is an easy out, let's talk about internationalization of that, too. Because this is the year that "indeterminate" has become a legal gender on German birth certificates, and that "X" is a legal gender on Australian passports. In New South Wales, Australia, "Sex Not Specified" is yet another legally-recognized gender. Binary gender is, today, APP FAIL.

cczona

"the most complicated thing I've ever spent a lot of time learning about.

And I've spent a lot of time learning about quantum mechanics."

-- Randall Munroe, xkcd

Randall Munroe, best known as xkcd, has examined the issue of pronouns many times, in depth, for English language projects. He says it's [READ]. So here's what he ultimately concluded:

cczona

Ask

Ask

cczona

Which pronouns do you prefer?

he / him / himself / his / his

she / her / herself / hers / her

they / them / themself / theirs / their

it / it / itself / its / its

by name

Asking straight up, "Which pronouns do you prefer?", is truly the best he could come up with.

cczona

Which pronouns do you prefer?

he / him / himself / his / his

she / her / herself / hers / her

they / them / themself / theirs / their

personal name

other ____________

Or maybe slightly refining, like these.

And I know you may be looking at that 3rd row and protesting.

cczona

Which pronouns do you prefer?

he / him / himself / his / his

she / her / herself / hers / her

they / them / themself / theirs / their

personal name

other ____________

Our English teachers told us "they" and "their" aren't singular, right? Nope. They're plural AND SINGULAR, and have been for 400 years. Take the word of Jane Austen, Shakespeare, Chaucer, Lewis Carroll. Modern English authorities agree on this one, too. It's excellent English -- and excellent SOCIAL -- to use it when people ask us to.
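One way to hold the answer to that question is as a set of five forms per choice, mirroring the rows on the slide. This is only a sketch in plain Ruby: `PronounSet`, `PRONOUNS`, `by_name` and `profile_update` are hypothetical names made up for illustration, not taken from any app mentioned in this talk.

```ruby
# Five grammatical forms per choice, in the slide's order:
# subject / object / reflexive / possessive pronoun / possessive determiner.
PronounSet = Struct.new(:subject, :object, :reflexive, :possessive, :determiner)

PRONOUNS = {
  "he"   => PronounSet.new("he", "him", "himself", "his", "his"),
  "she"  => PronounSet.new("she", "her", "herself", "hers", "her"),
  "they" => PronounSet.new("they", "them", "themself", "theirs", "their")
}.freeze

# "Personal name" option: use the name itself wherever a pronoun would go.
def by_name(name)
  PronounSet.new(name, name, name, "#{name}'s", "#{name}'s")
end

# Hypothetical notification text that needs a possessive determiner.
def profile_update(display_name, pronouns)
  "#{display_name} updated #{pronouns.determiner} profile."
end

puts profile_update("Sam", PRONOUNS.fetch("they"))
puts profile_update("Sam", by_name("Sam"))
```

The "other ____" row would fit the same shape: the user supplies the five forms, and the app builds a `PronounSet` from them.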

cczona

Even Facebook has taken note that it's okay. They've been using this construction for at least 5 years, and are doing fine :-)

cczona

"Home is the place where, when you have to go there, they have to take you in."

-- Robert Frost

We aspire to make our social apps feel like home. We can do that by making development choices that say, "Hey user, we GET you. Be as individual and messy as you want to. We can handle _you_ being _you_."

Social Research

When collecting data on people, we're in a different realm: social sciences. If you want to run useful analytics about personal attributes and behavior, then data collection needs to meet at least the two minimum criteria for social science research.

Exhaustive: Every possible option

A field's values must include every possible option.

Mutually Exclusive: No overlap exists between them

And the field's options must be mutually exclusive.

How many social apps have both of those bare minimum criteria covered?
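Those two criteria can be checked mechanically against real answers. The sketch below assumes each respondent is described by the list of labels that are actually true of them. The `audit_options` helper and the sample data are hypothetical, made up for illustration.

```ruby
# An option list is exhaustive if every respondent matches at least one option,
# and mutually exclusive if no respondent matches more than one.
def audit_options(options, respondents)
  uncovered   = respondents.select { |truths| (truths & options).empty? }
  overlapping = respondents.select { |truths| (truths & options).size > 1 }
  {
    exhaustive: uncovered.empty?,
    mutually_exclusive: overlapping.empty?,
    uncovered: uncovered,
    overlapping: overlapping
  }
end

options = ["Single", "In a relationship", "Married"]

# Illustrative respondents only; not real survey data.
respondents = [
  ["Single"],
  ["Married", "Separated"],          # exactly one listed option applies
  ["Married", "In a relationship"],  # two listed options apply at once
  ["Widowed"]                        # no listed option applies
]

result = audit_options(options, respondents)
puts "exhaustive? #{result[:exhaustive]}"
puts "mutually exclusive? #{result[:mutually_exclusive]}"
```

With this sample, the list fails both tests: one person has nothing to pick, and another has two equally true answers for a field that accepts only one.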

cczona

We've seen this one enough. At least that set of values seems covered fully.

cczona


But it's not.

cczona

Still lacking

cczona

Trying hard. But still inadequate to express the real world's complexity.

We do realize that the real world has great variety. Of course we do. But it's hard to know how to handle it.

cczona

What user experience does this schema drive us toward?

It's easy to figure, "Eh, we can refactor later." But initial choices set differing user experiences in motion. Thinking at the outset about the real world's variety and complexity starts us asking early questions that set the foundation for social app-building. Ask: "What user experience does this schema drive us toward?"

cczona

Data doesn't have to be for analysis

It's easy to get into the habit of structuring data for easy analysis. But we can choose to look at human data from a different perspective. Step back. Wallow in the user's perspective.

cczona

Data can be sheer expressiveness

Data can be sheer expressiveness. Data that has character, individualism, distinctiveness.

cczona

What we want:

• Structured

• Predictable

• Relational

• Indexed

• Exhaustive

• Easy analytics

• Data-driven decisions

What we get:

• Validations, exceptions

• Conditionals, partials

• Premature optimization

• Cultural variability

• Individual POV

• Moving target

• Decisions based on false premises

As developers, we have a vision of what a good codebase should be, and not be.

[READ]

In the most sensible of ways, we often arrive at solutions that are factually TRUTHY while far removed from real-life utility.

cczona

What is your religion, if any?

ARIS is the largest ongoing survey of Americans' religious identification. It asks this simple, OPEN-ENDED question: [slide]

Which nets over 100 unique answers. Which, if you're making a form based on that, is tricky. There's no form element that makes it easy for our users to pick themselves out of a list of so many possibilities.

cczona

ARIS found, though, that these could be compressed into 13 major categories. More manageable list, right? We could use that for a form. But, eh, a lot of those are edge cases. We'd rather focus on genuinely major groupings.

cczona

What is your religion, if any?

Christian 76%

Other 4%

None 15%

Don't Know or Refused 5%

Which brings us down to this. More than three quarters of Americans are Christian. Done. Every other religion would just be clutter. Edge cases.

And then there's this sort of crummy data with it. We'd probably assign a nil value for more than one of these categories, right?

cczona

What is your religion?

Christian 76%

Other 4%

n/a 20%

Fixed. Which focuses attention on a problem here: 1 in 5 are not useful answers from an advertiser's perspective, or for our own analytics. The NILS -- gotta go.

cczona

What is your religion?

Christian 76%

Other 24%

So what we're left with is a good, clear list. It covers ALL the big stuff. When you get reductive enough for Americans' religion, it's a binary. Which, from a storage standpoint, is great. Booleans! Score!
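The arithmetic of that reduction is worth seeing step by step. This sketch starts from the four shares on the slide and folds categories together the way the last three slides do. The `merge` helper is a hypothetical name made up for illustration.

```ruby
# Shares (in percent) from the four-row slide.
aris = {
  "Christian" => 76,
  "Other" => 4,
  "None" => 15,
  "Don't Know or Refused" => 5
}

# Fold the `from` labels into a single `into` label, summing their shares.
def merge(shares, from, into)
  merged = shares.reject { |label, _| from.include?(label) }
  merged[into] = merged.fetch(into, 0) + shares.values_at(*from).sum
  merged
end

step1 = merge(aris, ["None", "Don't Know or Refused"], "n/a")  # 4 rows become 3
step2 = merge(step1, ["n/a"], "Other")                         # 3 rows become 2

puts step1.inspect
puts step2.inspect
```

The totals still add up to 100 at every step. What disappears is the ability to tell "None" from "Refused" from every non-Christian religion, which are now one value.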

cczona

Religion

But we don't do THIS. It covers the biggest categories. But, oh, how it leaves people out.

cczona

People aren't edge cases

People aren't edge cases.

And they're pushing back on apps that treat them that way.

[LONG BEAT. LONG.] [BREATH] [CHANGE GEARS]

So. Reductive has big problems. So does scaling upward.

cczona

Balancing between approaches

As engineers, it's instinctually uncomfortable to DELIBERATELY NOT STRUCTURE DATA for easy analysis. I feel ya. I really do. This freaks me out, too. But again, the foundational question is "What benefit will the user notice?" What identities are okay with us?

cczona

And if necessary, we CAN strike a middle ground. This is where guided response comes in. Autosuggest.

cczona

Checkbox / Radio / Select

Textarea / Text

Required / Optional / Autosuggest

Minimal Suggest

When there's a subset of values that you're most interested in, do minimal suggest instead. Autosuggest using just the handful of values you care about.

Structured data from those who want to give it. Free form to incite expressiveness in those who want THAT.

Unguided Text: Of those who use MetaFilter's gender field,

40% of responses are "f", "m", "female", "male"

[read] So structured data IS there. This can be a balanced solution in many cases where you're willing to tolerate some ambiguity. Of course there are tradeoffs. Data quantity is lower: freed to opt out of providing personal info, many do. On the other hand, data quality should improve.
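Here is one way free text and structured data can coexist: keep exactly what the user typed, and derive a structured value only when the text happens to match the handful of values you care about. A minimal sketch in plain Ruby. `SUGGESTED`, `interpret` and `structured_share` are hypothetical names, and the sample answers are illustrative, not MetaFilter's data.

```ruby
# The handful of values the app cares about, keyed by what users tend to type.
SUGGESTED = {
  "f" => "female", "female" => "female",
  "m" => "male",   "male"   => "male"
}.freeze

# Keep the raw answer as written; add a structured value only on a match.
def interpret(raw)
  text = raw.to_s.strip
  return { raw: nil, structured: nil } if text.empty?  # opting out is allowed
  { raw: text, structured: SUGGESTED[text.downcase] }
end

# Fraction of answered responses that yielded a structured value.
def structured_share(responses)
  answered = responses.map { |r| interpret(r) }.reject { |r| r[:raw].nil? }
  return 0.0 if answered.empty?
  answered.count { |r| r[:structured] }.fdiv(answered.size)
end

sample = ["f", "Male", "Fembot", "It depends", "", nil]
puts structured_share(sample)
```

Nothing the user wrote is thrown away or overwritten. The structured column is a by-product for analytics, and blank stays blank.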

Optional Select: 60% of Facebook users select a relationship status

It's fine to mix and match here. Find the right approach for your users and your app's business objectives. Facebook, for instance, makes relationship status completely optional, but coercive for those who do opt in to setting a value. Most users do opt in: 60% of them select a relationship status.

cczona

We want everyone to feel excited about what we've built

The bottom line is that [read]

cczona

We want users to feel passionate about their involvement

[read]

cczona

Analytics, investments, & monetization are based on a premise that data is accurate

[read]

BUT if the data has been collected by coercive approaches, the RISK is that--

cczona

this premise is garbage. Some people are lying, because lying has been made a requirement for getting past the barriers. So conclusions drawn from that bad data can misdirect decision-making about the next stage of development.

cczona

Checkbox / Radio / Select

Textarea / Text

Required

Corrective

Discretionary / Guided

The restrictive options, that stuff at bottom left: those don't actually have to be marked required.

cczona

ADD COLUMN gender VARCHAR(6) NOT NULL DEFAULT 'female'

But the way we set up schema often embeds assumptions that we should, and we will. So we do. An attribute that is ambiguously named is destined to become a form field whose data wanders away from its original intent, and code that misunderstands how to use it. A field that's not allowed to be null is destined to be mandatory. A field that is assigned a maximum length is an ASSERTION that all possible values are knowable and fit within it.

cczona

ADD column gender_identity NOT NULL VARCHAR

BOOM. This is foundation for a whole different user experience. And the cool ninja move was that we decided to DO LESS. Make this stuff flexible upfront. Optimize storage later. Decide what's valid later.
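To see what those column constraints assert, it helps to run a few real answers past them. The sketch below is a plain-Ruby stand-in for a database column, not actual database behavior: the `Column` struct and its `stored` method are hypothetical. The nullable, unbounded `flexible` column is my assumption about what "DO LESS" looks like. Whether a real database raises an error or silently truncates an over-long value depends on the database and its settings.

```ruby
# Hypothetical model of a column's constraints: length limit, nullability, default.
Column = Struct.new(:name, :max_length, :nullable, :default) do
  # What ends up in the row when the user submits `value` (nil = left blank).
  def stored(value)
    value = default if value.nil?
    return :rejected if value.nil? && !nullable
    return :rejected if max_length && value && value.length > max_length
    value
  end
end

strict   = Column.new("gender", 6, false, "female")        # the first slide's column
flexible = Column.new("gender_identity", nil, true, nil)   # assumption: no limit, blank allowed

answers = ["female", "Queer", "Genderqueer", "It depends", nil]

answers.each do |answer|
  puts "#{answer.inspect}: strict=#{strict.stored(answer).inspect} " \
       "flexible=#{flexible.stored(answer).inspect}"
end
```

The strict column turns two honest answers into errors, and turns "left blank" into "female" without asking. The flexible column stores what it was given, including nothing at all.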

cczona

Checkbox / Radio / Select

Textarea / Text

Required / Optional

Alteration

Guided

So there you have it: a discretionary field. Whether to respond is left up to the user.

cczona

t.string :relationship_status, :null => true

As developers, we may look upon THIS expression as redundant, completely unnecessary. "Duh, null is true by default." But making that explicit is a communication to the team, and to your future self. It's a statement of intent. It's documenting a product decision.

cczona

Facebook

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

What would a canonical set of relationship statuses look like? Three years ago, Facebook figured this list was pretty good. Arguably pretty progressive, too, right? Users disagreed. Strongly.

cczona

Facebook

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

Under pressure, Facebook nearly doubled the options in just two years' time.

cczona

Google+

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I don't want to say

While Google+ has largely adopted that list, it has not included Separated or Divorced. Notice that they also added something choice: opt out of labeling.

cczona

Allowing users to identify their relationships with labels of greater personal significance

That's being driven by people rejecting a user experience that isn't working for them.

cczona

It's a scope

How did some statuses seem universal, while others weren't? Naming a thing creates scope. The assumed validity of a field's values gets constrained as soon as the field is named.

cczona

marital_status

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I don't want to say

Marital status, for instance, might lead to a list similar to this one. A person is assumed to be either unmarried, preparing to be married, currently married, or formerly married.

cczona

legal_marital_status

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I don't want to say

Marital status, for instance, might lead to a list similar to this one. A person is assumed to be either unmarried, preparing to be married, currently married, or formerly married.

cczona

relationship_status

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I don't want to say

Whereas relationship status might lead to a list more like this one. In which one either has a current relationship -- or is defined by the absence of any.

cczona

singleness_status

As soon as you name a field, you define its paradigm and possibilities. There are important differences in what each of these envisions and is able to measure. Naming fields -- with great specificity upfront -- makes analyses more powerful later.

cczona

singleness_rating

No chance whatsoever Suuuuuper duper single

1000

If you change the name, you shift the paradigm and possibilities. There's an important difference in what these set out to collect and measure. Naming fields -- with great specificity upfront -- makes analyses more powerful later.

cczona

It's Complicated / In a Relationship

Married / Divorced

Widowed / Single

We go through life experiencing many relationships. They don't all have pre-determined labels. And new relationship identities don't necessarily leave old ones behind.

cczona

"I like to be truthful, and 'It's Complicated' is really deceiving. It is not complicated. I am separated from my husband, who I am still legally married to."

-- Facebook user

[READ] Why should anyone have to feel like they're required to disavow a relationship that's meaningful for them?

It cuts deeply sometimes. Is a widow only "Widowed" until she resumes dating? Why does she have to be confronted with a decision like that just to use someone's app?

cczona

Okay, this one. Spot the fatal flaw. [WAIT]

cczona

buffer overflow

An open relationship is definitionally a 1:Many join -- WITH the usual engineering understanding, that is, that the actual number of relationships may be 0, 1, OR many.

cczona

Choose (one): evasive / inauthentic

But it's not funny from the personal perspective. This schema forces the person to either LOOK EVASIVE or else BE INAUTHENTIC. This is what happens when we try to throw more labels at the problem, instead of re-examining the schema's assumptions.
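Re-examining the assumption could look like this: stop treating relationship status as one value per person, and let a profile hold zero, one, or many relationships, each with a label the user chooses. A minimal sketch in plain Ruby. The `Profile` and `Relationship` names and the sample people are hypothetical, made up for illustration.

```ruby
# One relationship: a user-chosen label, optionally naming the other person.
Relationship = Struct.new(:label, :with)

class Profile
  attr_reader :name, :relationships

  def initialize(name)
    @name = name
    @relationships = []  # 0, 1, or many
  end

  # Adding a relationship never replaces an existing one.
  def add_relationship(label, with = nil)
    @relationships << Relationship.new(label, with)
    self
  end

  def labels
    relationships.map(&:label)
  end
end

profile = Profile.new("Alex")
  .add_relationship("Married", "Jordan")
  .add_relationship("Separated", "Jordan")
  .add_relationship("Dating", "Riley")

puts profile.labels.join(", ")
```

With this shape, "separated from my husband, who I am still legally married to" is two rows rather than a forced choice, and someone with nothing to say simply has an empty list.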

cczona

Facebook relationship status by user age

srsly

When we don't give people means to express what's real and important for them, they hack around us. Are a quarter of Facebook's 13 year olds really married? Probably not. What they probably have is a BEST FRIEND, and "married" is the nearest equivalent available for them to express "THIS PERSON is my most important relationship."

The dogged pursuit of nice, clean, CRUNCHABLE data takes us to such wrong places. And makes our users want to work against a system that doesn't acknowledge their lives' reality.

cczona

We build or break community with each line of code

As developers, our lines of code build -- or break -- an app's community. So it's worth making conscious choices as we go.

cczona

Modeling the real world is complex

[READ] And that's okay.

cczona

Assuming we know who users are surrenders opportunity to learn who they are

But [read]. Early constraints in schema NET crappy, misleading data. [beat] So keep constraints out of human and social schema, at least at first. Gather enough initial response to do some data mining. Watch that data for a while. Keep an eye out for emergent trends. What DISCOVERIES can you make? How can they shape the app's growth and evolution?
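Watching unconstrained data for emergent trends can start very simply: tally what people actually wrote, and look at the values you did not anticipate. A minimal sketch in plain Ruby. `tally_responses` and `emergent` are hypothetical helper names, and the sample answers are illustrative only.

```ruby
# Count free-text answers, ignoring case and surrounding whitespace.
# Returns [value, count] pairs, most frequent first.
def tally_responses(responses)
  counts = Hash.new(0)
  responses.each do |raw|
    key = raw.to_s.strip.downcase
    next if key.empty?  # blank answers are not counted
    counts[key] += 1
  end
  counts.sort_by { |value, n| [-n, value] }
end

# Values that recur but are not in the list we expected.
def emergent(responses, known, min_count: 2)
  tally_responses(responses).reject do |value, n|
    known.include?(value) || n < min_count
  end
end

answers = ["Single", "single", "Best friends", "best friends ", "Married", "Poly"]
puts emergent(answers, ["single", "married"]).inspect
```

A value that keeps recurring outside the expected list is a discovery about your users, and a candidate for the next round of suggestions.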

cczona

Quality

Data quality improves when lies are merely optional, not required.

cczona

Specificity

Data becomes like the real world itself: rich and specific. So we can unearth the patterns that are undetectable when data is generic.

cczona

Adaptability

We get to make discoveries and re-envision possibilities.

We adapt quickly.

cczona

Loyalty

And their response to all this? Engagement. Passion. LOYALTY.

cczona

THOSE are foundations for great user experiences. [LOOK UP. FINALITY]

@cczona / cczona@gmail.com / http://cczona.com

Questions

Conversation

Oh Hell Yeah

Carina C Zona

cczona

Resources

cczona

Data Science & Information Architecture

• Sociological normalization

• Database normalization

• Using Machine Learning On Social Networks To Figure Out What You Should Read On The Web

• NoSQL Data Modeling Techniques

• Data & Reality, by William Kent

• Bad Data Handbook, by Q. Ethan McCallum

• Data Science of the Facebook World

cczona

Relationships

• Does Facebook Hurt Relationships?

• Facebook Adds LGBT-Friendly Relationship Status Options

• Facebook Targeting by Relationship Status & Workplace

• Your Facebook Relationship Status: It's Complicated

• Gay Marriage: The Database Engineering Perspective

cczona

Sex & Genders

• "Disalienation: Why Gender is a Text Field on Diaspora"

• "Gender & Drop Down Menus"

• "Sex & Gender"

• "Bucket Gender"

• Recommendations for Inclusive Data Collection of Trans People

cczona

Names

• Falsehoods Programmers Believe About Names

• Your Last Name Contains Invalid Characters

• W3C Internationalization: Personal Names Around the World

• Spanish Names

• Chinese Names

cczona

More

• Redesigning the Country Selector

• American Religious Identification Survey Summary Report 2009

• Linguistic Potluck: Crowdsourcing Internationalization in Rails

cczona

Image Credits

Postsecret

Facebook

OKCupid

Google+

Metafilter

Diaspora

Flickr

FetLife

Kotangle

cutestpaw.com

hdwallpapers.in

xkcd

cczona

Many Thanks

Chiu-Ki Chan

Estelle Weyl

Heather Rivers

Heroku

Jeremy Dunck

Josh Susser

Michele Titolo

NIRD LLC

Renée De Voursney

Sarah Mei

San Francisco Sex Information

Yoz Grahame

cczona

Get in Touch

Carina C. Zona, @cczona

http://cczona.com

cczona@gmail.com

http://slideshare.net/cczona

http://linkedin.com/in/cczona

Page 15: Schemas for the Real World [Madison RubyConf 2013]

cczona

In real life

Your existence isn't possible

sounds clueless

cczona

In real life

sounds clueless

cczona

In real life

Who you are is invalid

sounds arrogant

cczona

In real life

sounds arrogant

cczona

Data modeling is psychology, and it is philosophy. It reflects individuals' beliefs about reality, rather than reflecting reality itself. It rejects reality's complexity and richness. That's NOT what we set out to do. It's not the COMMUNITY we seek to build.

cczona

[READ]

cczona

"Be conservative in what you do, be liberal in what you accept from others."

-- Postel's Law

Postel's Law reminds us to [READ]. ARE we taking to heart that wisdom?

[PAUSE FOR END OF SECTION]

cczona

Schemas plural

There are two kinds of schemas

cczona

Mental Schema

• Pre-conceived ideas

• Framework for representing some aspect of the world

• System of organizing & perceiving new information

Mental Schema are [READ]

cczona

Database Schema

• Structure described in the database's language

• Blueprint for database construction

• Describes how the real world is being modeled

Database schemas are closely related to that. It's a mental schema translated into blueprints for a database.

cczona

These are simply frontend manifestations of various individuals' MENTAL schemas. And [SLOW DOWN] when we look at them closely, we can see [BRIEF PAUSE] that schemas are foundation [BRIEF PAUSE] for expressing

cczona

deeply intimate

things that are [READ]

cczona

important relationships

Schemas are foundation for expressing things [READ]

cczona

self-image

Schemas are foundation for expressing things [READ]

cczona

Schemas define the user experience

Schemas define user experience.

cczona

Our schemas are leaving people behind

& UX

[read] We can fix that.

cczona

What benefit will the user notice?

By asking ourselves, "What benefit will the USER NOTICE?" THIS is not equivalent to "How will the user benefit?" Because THAT is a question which grants us too much latitude. To ASSUME that what we want is of course to their benefit, because it's gonna help us deliver a product that's "Awesome".

[READ AGAIN] focuses us on EXPERIENCES.

cczona

Evaluating from user perspective gives us focus

There isnt ONE WAY There isnt ONE RIGHT ANSWER Because the NATURE of our dilemma is that users vary And so do their social worlds So we always need to understand varying approaches How they fit different contexts What tradeoffs they bring AND the obvious boring question which options best serve THIS apps business requirements

cczona

CheckboxRadio buttonSelect menuRanges

Coerced DiscretionaryGuided

Corrective Text InputTextarea

Checkboxes radio buttons select menus ranges these each imply that EVERY possible value can -- and is -- included If any value isnt included there its real world being rejected because it didnt match up with our mental schema

cczona

CheckboxRadioSelect

Coerced DiscretionaryGuided

Corrective Text InputTextarea

Checking a box is a one step action Entering a text string is not So a freeform solution such as a textarea or text input isnt automatically exciting BUT they can bring visible meaningful benefit

cczona

11 years

At MetaFilter gender has been a text field for over a decade

cczona

hellipthe early crowd at MeFi were often programmers and they hated the idea of dirty data collectionhellip

mdashMatt Haughey founder

Initially some SHUDDERED at the thought Because [READ]

cczona

All girl

XY

Fella

Queer

Fembot

Alto

It depends

MYOB

Dangly bits

Chicklet

Innie not outie

Convex

Sideburns

Ambisextrous

Member of the patriarchy

For about 5 seconds Then they jumped on board Because they could be creative and silly

cczona

I speak using the male gender when required by language 50 quintessential

tomboy 50 total girly-girl

AND because they could express this THING about themselves fully With authentic voice

cczona

TransgenderGenderqueer

[PAUSE]

cczona

The freeform gender field is something I cherish about Metafilter

mdashMeFi user

That text field grew into a beloved institution Whatever you as user choose to put into that field says something revealing about who YOU are

cczona

It was one of the earliest indications Id landed in the right place

mdashMeFi user

That youre allowed to put in ANYTHING -- or put in NOTHING AT ALL -- says something revealing about what how MetaFilter envisions community That schemas trust in users was the foundation for Metafilter users to ASK to share themselves even more

cczona

In 2010 Diaspora likewise turned gender into a text field Just like on MetaFilter users felt set free by it

Theres an important distinction here though Metafilter is closed source its one developer and its entirely in English Diaspora needed to meet a different set of needs Its open source and its international

cczona

Not amused

So some of its developers are not amused Its fair to object that this approach wreaks chaos on internationalization of pronouns [BEAT] And heres what Ive got to say to that

cczona

Internationalization is hellish period

Internationalization of pronouns is hellish period You get no safety net in this Constraining gender options doesnt solve translation of languages Dont set yourself up for trying to deal with internationalization this way It will fail you

cczona

Whats your legal gender

bull Indeterminate

bull X

bull Sex Not Specified

And if you really thing that boiling this question down to Whats your legal gender is an easy out lets talk about internationaliztion of that too Because this is the year that indeterminate has become a legal gender on German birth certificates and that X is a legal gender on Australian passports In New South Wales Australia Sex Not Specified is yet another legally-recognized gender Binary gender is today APP FAIL

cczona

the most complicated thing Irsquove ever spent a lot of time learning about

And Irsquove spent a lot of time learning about quantum mechanics

mdashRandall Munroe xkcd

Randall Munroe best known as xkcd has examined the issue of pronouns many times in depth for English language projects He says its [READ] So hes what he ultimately concluded

cczona

Ask

Ask

cczona

Which pronouns do you prefer

hehimhimselfhishis

sheherherselfhersher

theythemthemselftheirstheir

itititselfitsits

by name

Asking straight up Which Pronouns Do You Prefer is truly the best he could come up with

cczona

Which pronouns do you prefer

hehimhimselfhishis

sheherherselfhersher

theythemthemselftheirstheir

personal name

other ____________

Or maybe slightly refining like these

And I know you may be looking at that 3rd row and protesting

cczona

Which pronouns do you prefer

hehimhimselfhishis

sheherherselfhersher

theythemthemselftheirstheir

personal name

other ____________

Our english teachers told us they and their arent singular right Nope Theyre plural AND SINGULAR and have been for 400 years Take the word of Jane Austen Shakespeare Chaucer Lewis Carroll Modern english authorities agree on this one too Its excellent English -- and excellent SOCIAL -- to use it when people ask us to

cczona

Even Facebook has taken note that its okay Theyve been using this construction for at least 5 years and are doing fine -)

cczona

Home is the place where when you have to go there they have to take you in

mdashRobert Frost

We aspire to make our social apps feel like home We can do that by making development choices that say Hey user we GET you Be as individual and messy as you want to We can handle _you_ being _you_

Social Research

When collecting data on people were in a different realm Social sciences If you want to run useful analytics about personal attributes and behavior then data collection needs to meet at least the two minimum criteria for social science research

ExhaustiveEvery possible option

A fields values must include every possible option

Mutually ExclusiveNo overlap exists between them

And the fields options must be mutually exclusive

How many social apps have both of those bare minimum criteria covered

cczona

Weve seen this one enough At least that set of values seems covered fully

cczona

`

But its not

cczona

Still lacking

cczona

Trying hard But still inadequate to express the real worlds complexity

We do realize that the real world has great variety Of course we do But its hard to know how to handle it

cczona

What user experience does this schema drive us toward

Its easy to figure Eh we can refactor later But initial choices set differing user experiences in motion Thinking at the outset about the real worlds variety and complexity starts us asking early questions that set foundation for social app-building Ask What user experience does this schema drive us toward

cczona

Data doesnrsquot have to be for analysis

Its easy to get into the habit of structuring data for easy analysis But we can choose to look at human data from a different perspective Step back Wallow in the users perspective

cczona

Data can be sheerexpressiveness

Data can be sheer expressiveness Data that has character individualism distinctiveness

cczona

What we want What we get

bull Structured

bull Predictablebull Validations exceptions

bull Conditionals partials

bull Relational

bull Indexed bull Premature optimization

bull Exhaustivebull Cultural variability

bull Individual POV

bull Moving target

bull Easy analytics

bull Data-driven decisionsbull Decisions based on false

premises

As developers we have a vision of what a good codebase should be and not be

[READ]

In the most sensible of ways we often arrive at solutions that are factually TRUTHY while far removed from real life utility

cczona

What is your religion if any

ARIS is the largest ongoing survey of Americans religious identification It asks this simple OPEN-ENDED question [slide]

Which nets over 100 unique answers Which if youre making a form based on that is tricky Theres no form element that makes it easy for our users to pick themselves out of a list of so many possibilities

cczona

ARIS found though that these could be compressed into 13 major categories More manageable list right We could use that for a form But eh a lot of those are edge cases Wed rather want to focus on genuinely major groupings

cczona

What is your religion if any

Christian 76

Other 4

None 15

Dont Know or Refused 5

Which brings us down to this At least a quarter of Americans are Christian Done Every other religion would just be clutter Edge cases

And then theres this sort of crummy data with it Wed probably assign nil value for more than one of these categories right

cczona

What is your religion

Christian 76

Other 4

na 20

Fixed Which focuses attention on a problem here 1 in 5 are not useful answers from an advertisers perspective or for out own analytics The NILS -- gotta go

cczona

What is your religion

Christian 76

Other 24

So what were left with is a good clear list It covers ALL the big stuff When you get reductive enough for Americans religion it a binary Which from a storage standpoint is great Booleans Score

cczona

Religion

But we dont do THIS It covers the biggest categories But oh how it leaves people out

cczona

People arent edge cases

People arent edge cases

And theyre pushing back on apps that treat them that way

[LONG BEAT LONG] [BREATH] [CHANGE GEARS]

So Reductive has big problems So does scaling upward

cczona

Balancing between approaches

As engineers its instinctually uncomfortable to DELIBERATELY NOT STRUCTURE DATA for easy analysis I feel ya I really do This freaks me too But again the foundational question is What benefit will the user notice What identities are okay with us

cczona

And if necessary we CAN strike a middleground This is where guided response comes in Autosuggest

cczona

CheckboxRadioSelect

TextareaText

Required OptionalAutosuggest

Minimal Suggest

When theres a subset of values that youre most interested in do minimal suggest instead Autosuggest using just the handful of values you care about

Structured data from those who want to give it Free form to incite expressiveness in those who want THAT

Unguided TextOf those who use MetaFilters gender field

40 of responses are f m female male

[read] So structured data IS there This can be a balanced solution in many cases where youre willing to tolerate some ambiguity Of course there are tradeoffsData quantity is lower Freed to opt out of proving personal info many doOn the other hand data quality should improve

Optional Select60 of Facebook users select a relationship status

Its fine to mix and match here Find the right approach for your users and your apps business objectives Facebook for instance makes relationship status completely optional but coercive for those who do opt-in to setting a value Most users do opt-in 60 of them select a relationship status

cczona

We want everyone to feel excited what weve

built

The bottom line is that [read]

cczona

We want users to feel passionate about their

involvement

[read]

cczona

Analytics investments amp monetization are

based on a premise that data is accurate

[read]

BUT the data has been collected by coercive approaches the RISK is that--

cczona

this premise is garbage Some people are lying because lying has been made requirement for getting past the barriers So conclusions drawn from that bad data can misdirect decision-making about the next stage of development

cczona

CheckboxRadioSelect

TextareaText

Required

Corrective

DiscretionaryGuided

The restrictive options that stuff at bottom left those dont actually have to be marked required

cczona

ADD column gender NOT NULL VARCHAR(6) DEFAULT female

But the way we setup schema often embeds assumptions that we should and we will So we do An attribute that is ambiguously named is destined to become a form field whose data wanders away from its original intent and code that misunderstands how to use it A field thats not allowed to be null is destined to be mandatory A field that is assigned a maximum length is an ASSERTION that all possible values are knowable and fit within it

ADD COLUMN gender_identity VARCHAR NOT NULL

BOOM. This is the foundation for a whole different user experience. And the cool ninja move was that we decided to DO LESS. Make this stuff flexible upfront. Optimize storage later. Decide what's valid later.

Checkbox, Radio, Select

Textarea, Text

Required, Optional

Alteration

Guided

So there you have it: a discretionary field. Whether to respond is left up to the user.

t.string :relationship_status, :null => true

As developers, we may look upon THIS expression as redundant, completely unnecessary. Duh, null is true by default. But making that explicit is a communication to the team and to your future self. It's a statement of intent. It's documenting a product decision.

Facebook

Single
In a relationship
Engaged
Married
It's complicated
Open relationship

What would a canonical set of relationship statuses look like? Three years ago, Facebook figured this list was pretty good. Arguably pretty progressive too, right? Users disagreed. Strongly.

Facebook

Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Separated
Divorced
Civil union
Domestic partnership

Under pressure, Facebook nearly doubled the options in just two years' time.

Google+

Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Separated
Divorced
Civil union
Domestic partnership
I don't want to say

While Google+ has largely adopted that list, it has not included Separated or Divorced. Notice that they also added something: choice. Opt out of labeling.

Allowing users to identify their relationships with labels of greater personal significance

That's being driven by people rejecting a user experience that isn't working for them.

It's a scope

How did some statuses seem universal while others weren't? Naming a thing creates scope. The assumed validity of a field's values gets constrained as soon as the field is named.

marital_status

Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Separated
Divorced
Civil union
Domestic partnership
I don't want to say

Marital status, for instance, might lead to a list similar to this one. A person is assumed to be either unmarried, preparing to be married, currently married, or formerly married.

legal_marital_status

Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Separated
Divorced
Civil union
Domestic partnership
I don't want to say

Marital status, for instance, might lead to a list similar to this one. A person is assumed to be either unmarried, preparing to be married, currently married, or formerly married.

relationship_status

Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Separated
Divorced
Civil union
Domestic partnership
I don't want to say

Whereas relationship status might lead to a list more like this one. In which one either has a current relationship -- or is defined by the absence of any.

singleness_status

As soon as you name a field, you define its paradigm and possibilities. There are important differences in what each of these envisions and is able to measure. Naming fields -- with great specificity upfront -- makes analyses more powerful later.

singleness_rating

No chance whatsoever Suuuuuper duper single

1000

If you change the name, you shift the paradigm and possibilities. There's an important difference in what these set out to collect and measure. Naming fields -- with great specificity upfront -- makes analyses more powerful later.

It's Complicated, In a Relationship

Married, Divorced

Widowed, Single

We go through life experiencing many relationships. They don't all have pre-determined labels. And new relationship identities don't necessarily leave old ones behind.

I like to be truthful, and It's Complicated is really deceiving. It is not complicated. I am separated from my husband, who I am still legally married to.

-- Facebook user

[READ] Why should anyone have to feel like they're required to disavow a relationship that's meaningful for them?

It cuts deeply sometimes. Is a widow only Widowed until she resumes dating? Why does she have to be confronted with a decision like that just to use someone's app?

Okay, this one. Spot the fatal flaw. [WAIT]

buffer overflow

An open relationship is definitionally a 1:Many join -- WITH the usual engineering understanding, that is, that the actual number of relationships may be 0, 1, OR many.
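A minimal sketch of what "0, 1, OR many" could look like in application code, assuming a hypothetical `Profile` class that holds a list of relationships instead of a single status value. The class, method, and partner names are all invented for illustration.

```ruby
# Hypothetical model: a profile holds zero, one, or many relationships,
# each with a label the user chose themselves.
Relationship = Struct.new(:label, :partner)

class Profile
  attr_reader :relationships

  def initialize
    @relationships = []
  end

  # Adding a relationship never replaces an existing one.
  def add_relationship(label, partner = nil)
    @relationships << Relationship.new(label, partner)
    self
  end

  def labels
    @relationships.map(&:label)
  end
end

profile = Profile.new
profile.add_relationship("Married", "Alex").add_relationship("In a relationship", "Sam")
puts profile.labels.inspect
```

With a collection rather than a single column, a new relationship does not overwrite an old one, and an empty list is as valid as a long one.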

Choose (one): evasive, inauthentic

But it's not funny from the personal perspective. This schema forces the person to either LOOK EVASIVE or else BE INAUTHENTIC. This is what happens when we try to throw more labels at the problem instead of re-examining the schema's assumptions.

Facebook relationship status by user age

srsly

When we don't give people means to express what's real and important for them, they hack around us. Are a quarter of Facebook's 13-year-olds really married? Probably not. What they probably have is a BEST FRIEND, and married is the nearest equivalent available for them to express: THIS PERSON is my most important relationship.

The dogged pursuit of nice, clean, CRUNCHABLE data takes us to such wrong places. And makes our users want to work against a system that doesn't acknowledge their lives' reality.

We build or break community with each line of code

As developers, our lines of code build -- or break -- an app's community. So it's worth making conscious choices as we go.

Modeling the real world is complex

[READ] And that's okay.

Assuming we know who users are surrenders opportunity to learn who they are

But [read] Early constraints in schema NET crappy, misleading data. [beat] So keep constraints out of human and social schema, at least at first. Gather enough initial response to do some data mining. Watch that data for a while. Keep an eye out for emergent trends. What DISCOVERIES can you make? How can they shape the app's growth and evolution?
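Watching unconstrained data for emergent trends can start very simply. The following sketch counts free-text answers after light normalization. The method name and the sample answers are made up for illustration, and `tally` assumes Ruby 2.7 or newer.

```ruby
# Count free-text answers after light normalization (trim + downcase),
# most common first. Blank answers are skipped rather than treated as errors.
def response_counts(responses)
  responses
    .map { |r| r.to_s.strip.downcase }
    .reject(&:empty?)
    .tally
    .sort_by { |value, count| [-count, value] }
end

# Made-up sample answers to an unconstrained relationship field.
sample = ["Best friend", "best friend ", "Married", nil, "Partnered", "partnered", "BEST FRIEND"]
response_counts(sample).each { |value, count| puts "#{count}  #{value}" }
```

A label that keeps climbing this list is a candidate for a suggestion or a first-class option later, decided from what users actually said rather than from a guess made up front.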

Quality

Data quality improves when lies are merely optional, not required.

Specificity

Data becomes like the real world itself: rich and specific. So we can unearth the patterns that are undetectable when data is generic.

Adaptability

We get to make discoveries and re-envision possibilities.

We adapt quickly.

Loyalty

And their response to all this? Engagement. Passion. LOYALTY.

THOSE are foundations for great user experiences. [LOOK UP FINALITY]

@cczona
cczona@gmail.com
http://cczona.com

Questions? Conversation?

Oh Hell Yeah

Carina C. Zona

Resources

Data Science & Information Architecture

• Sociological normalization
• Database normalization
• Using Machine Learning On Social Networks To Figure Out What You Should Read On The Web
• NoSQL Data Modeling Techniques
• Data & Reality, by William Kent
• Bad Data Handbook, Q. Ethan McCallum
• Data Science of the Facebook World

Relationships

• Does Facebook Hurt Relationships?
• Facebook Adds LGBT-Friendly Relationship Status Options
• Facebook Targeting by Relationship Status & Workplace
• Your Facebook Relationship Status: It's Complicated
• Gay Marriage: The Database Engineering Perspective

Sex & Genders

• "Disalienation: Why Gender is a Text Field on Diaspora"
• "Gender & Drop Down Menus"
• "Sex & Gender"
• "Bucket Gender"
• Recommendations for Inclusive Data Collection of Trans People

Names

• Falsehoods Programmers Believe About Names
• Your Last Name Contains Invalid Characters
• W3C Internationalization: Personal Names Around the World
• Spanish Names
• Chinese Names

More

• Redesigning the Country Selector
• American Religious Identification Survey Summary Report 2009
• Linguistic Potluck: Crowdsourcing Internationalization in Rails

Image Credits

Postsecret
Facebook
OKCupid
Google+
Metafilter
Diaspora
Flickr
FetLife
Kotangle
cutestpaw.com
hdwallpapers.in
xkcd

Many Thanks

Chiu-Ki Chan
Estelle Weyl
Heather Rivers
Heroku
Jeremy Dunck
Josh Susser
Michele Titolo
NIRD LLC
Renée De Voursney
Sarah Mei
San Francisco Sex Information
Yoz Grahame

Get in Touch

Carina C. Zona
@cczona
http://cczona.com
cczona@gmail.com
http://slideshare.net/cczona
http://linkedin.com/in/cczona

Page 16: Schemas for the Real World [Madison RubyConf 2013]

In real life

sounds clueless

In real life

Who you are is invalid

sounds arrogant

In real life

sounds arrogant

Data modeling is psychology, and it is philosophy. It reflects individuals' beliefs about reality, rather than reflecting reality itself. It rejects reality's complexity and richness. That's NOT what we set out to do. It's not the COMMUNITY we seek to build.

[READ]

Be conservative in what you do; be liberal in what you accept from others.

-- Postel's Law

Postel's Law reminds us to [READ]. ARE we taking to heart that wisdom?

[PAUSE FOR END OF SECTION]

Schemas, plural

There are two kinds of schemas.

Mental Schema

• Pre-conceived ideas
• Framework for representing some aspect of the world
• System of organizing & perceiving new information

Mental Schema are [READ]

Database Schema

• Structure described in the database's language
• Blueprint for database construction
• Describes how the real world is being modeled

Database schemas are closely related to that. It's a mental schema translated into blueprints for a database.

These are simply frontend manifestations of various individuals' MENTAL schemas. And [SLOW DOWN] when we look at them closely, we can see [BRIEF PAUSE] that schemas are foundation [BRIEF PAUSE] for expressing

deeply intimate

things that are [READ]

important relationships

Schemas are foundation for expressing things [READ]

self-image

Schemas are foundation for expressing things [READ]

Schemas define the user experience

Schemas define user experience.

Our schemas are leaving people behind

& UX

[read] We can fix that.

What benefit will the user notice?

By asking ourselves, What benefit will the USER NOTICE? THIS is not equivalent to, How will the user benefit? Because THAT is a question which grants us too much latitude. To ASSUME that what we want is, of course, to their benefit, because it's gonna help us deliver a product that's Awesome.

[READ AGAIN] focuses us on EXPERIENCES.

Evaluating from user perspective gives us focus

There isn't ONE WAY. There isn't ONE RIGHT ANSWER. Because the NATURE of our dilemma is that users vary. And so do their social worlds. So we always need to understand varying approaches. How they fit different contexts. What tradeoffs they bring. AND the obvious, boring question: which options best serve THIS app's business requirements?

Checkbox, Radio button, Select menu, Ranges

Coerced, Discretionary, Guided

Corrective, Text Input, Textarea

Checkboxes, radio buttons, select menus, ranges: these each imply that EVERY possible value can be -- and is -- included. If any value isn't included there, it's the real world being rejected, because it didn't match up with our mental schema.

Checkbox, Radio, Select

Coerced, Discretionary, Guided

Corrective, Text Input, Textarea

Checking a box is a one-step action. Entering a text string is not. So a freeform solution, such as a textarea or text input, isn't automatically exciting. BUT they can bring visible, meaningful benefit.

11 years

At MetaFilter, gender has been a text field for over a decade.

...the early crowd at MeFi were often programmers, and they hated the idea of dirty data collection...

-- Matt Haughey, founder

Initially, some SHUDDERED at the thought. Because [READ]

All girl
XY
Fella
Queer
Fembot
Alto
It depends
MYOB
Dangly bits
Chicklet
Innie not outie
Convex
Sideburns
Ambisextrous
Member of the patriarchy

For about 5 seconds. Then they jumped on board. Because they could be creative and silly.

I speak using the male gender when required by language

50% quintessential tomboy, 50% total girly-girl

AND because they could express this THING about themselves fully. With authentic voice.

Transgender

Genderqueer

[PAUSE]

The freeform gender field is something I cherish about Metafilter

-- MeFi user

That text field grew into a beloved institution. Whatever you, as user, choose to put into that field says something revealing about who YOU are.

It was one of the earliest indications I'd landed in the right place

-- MeFi user

That you're allowed to put in ANYTHING -- or put in NOTHING AT ALL -- says something revealing about how MetaFilter envisions community. That schema's trust in users was the foundation for Metafilter users to ASK to share themselves even more.

In 2010, Diaspora likewise turned gender into a text field. Just like on MetaFilter, users felt set free by it.

There's an important distinction here, though. Metafilter is closed source, it's one developer, and it's entirely in English. Diaspora needed to meet a different set of needs. It's open source and it's international.

Not amused

So some of its developers are not amused. It's fair to object that this approach wreaks chaos on internationalization of pronouns. [BEAT] And here's what I've got to say to that:

Internationalization is hellish. Period.

Internationalization of pronouns is hellish. Period. You get no safety net in this. Constraining gender options doesn't solve translation of languages. Don't set yourself up for trying to deal with internationalization this way. It will fail you.

What's your legal gender?

• Indeterminate
• X
• Sex Not Specified

And if you really think that boiling this question down to "What's your legal gender?" is an easy out, let's talk about internationalization of that too. Because this is the year that "indeterminate" has become a legal gender on German birth certificates, and that "X" is a legal gender on Australian passports. In New South Wales, Australia, "Sex Not Specified" is yet another legally-recognized gender. Binary gender is, today, APP FAIL.

...the most complicated thing I've ever spent a lot of time learning about. And I've spent a lot of time learning about quantum mechanics.

-- Randall Munroe, xkcd

Randall Munroe, best known as xkcd, has examined the issue of pronouns many times, in depth, for English language projects. He says it's [READ]. So here's what he ultimately concluded:

Ask

Ask.

Which pronouns do you prefer?

he / him / himself / his / his
she / her / herself / hers / her
they / them / themself / theirs / their
it / it / itself / its / its
by name

Asking straight up, "Which pronouns do you prefer?" is truly the best he could come up with.

Which pronouns do you prefer?

he / him / himself / his / his
she / her / herself / hers / her
they / them / themself / theirs / their
personal name
other ____________

Or maybe slightly refining, like these.

And I know you may be looking at that 3rd row and protesting.

Which pronouns do you prefer?

he / him / himself / his / his
she / her / herself / hers / her
they / them / themself / theirs / their
personal name
other ____________

Our English teachers told us "they" and "their" aren't singular, right? Nope. They're plural AND SINGULAR, and have been for 400 years. Take the word of Jane Austen, Shakespeare, Chaucer, Lewis Carroll. Modern English authorities agree on this one too. It's excellent English -- and excellent SOCIAL -- to use it when people ask us to.

Even Facebook has taken note that it's okay. They've been using this construction for at least 5 years and are doing fine :-)
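Storing the answer to "Which pronouns do you prefer?" as data, instead of deriving pronouns from a gender column, might look roughly like the sketch below. `PronounSet`, `PRONOUN_SETS`, and `pronouns_for` are hypothetical names, and the fallback to the person's name for unknown choices is an assumption of this example, not something the slides specify.

```ruby
# Column order follows the slide: subject, object, reflexive,
# possessive pronoun, possessive determiner.
PronounSet = Struct.new(:subject, :object, :reflexive, :possessive, :determiner)

PRONOUN_SETS = {
  "he"   => PronounSet.new("he", "him", "himself", "his", "his"),
  "she"  => PronounSet.new("she", "her", "herself", "hers", "her"),
  "they" => PronounSet.new("they", "them", "themself", "theirs", "their")
}.freeze

# choice is one of the keys above, "name" (always use the person's name),
# or "other" with a user-supplied PronounSet. Unknown choices fall back to the name.
def pronouns_for(name, choice, custom = nil)
  return custom if choice == "other" && custom
  PRONOUN_SETS.fetch(choice) do
    PronounSet.new(name, name, name, "#{name}'s", "#{name}'s")
  end
end

set = pronouns_for("Sam", "they")
puts "Sam updated #{set.determiner} profile."
```

Copy such as notifications can then be written against the five slots, and the "other" row stays open-ended because the user supplies the whole set.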

Home is the place where, when you have to go there, they have to take you in.

-- Robert Frost

We aspire to make our social apps feel like home. We can do that by making development choices that say: Hey user, we GET you. Be as individual and messy as you want to. We can handle _you_ being _you_.

Social Research

When collecting data on people, we're in a different realm: social sciences. If you want to run useful analytics about personal attributes and behavior, then data collection needs to meet at least the two minimum criteria for social science research.

Exhaustive

Every possible option

A field's values must include every possible option.

Mutually Exclusive

No overlap exists between them

And the field's options must be mutually exclusive.

How many social apps have both of those bare minimum criteria covered?
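The two criteria can be checked mechanically against real answers. This is a rough sketch under one assumption: that we can ask each person for every label they feel applies to them. The method names and the sample data are illustrative only.

```ruby
# Each response is the list of labels a person says apply to them.
# options is the list the form actually offers.

# Exhaustive: everyone finds at least one offered option that applies.
def exhaustive?(options, responses)
  responses.all? { |labels| (labels & options).any? }
end

# Mutually exclusive: nobody finds more than one offered option that applies.
def mutually_exclusive?(options, responses)
  responses.all? { |labels| (labels & options).size <= 1 }
end

options   = ["Single", "In a relationship", "Married", "Widowed"]
responses = [["Married"], ["Separated"], ["Widowed", "In a relationship"]]

puts exhaustive?(options, responses)         # false: "Separated" is not offered
puts mutually_exclusive?(options, responses) # false: two offered options apply to one person
```

Adding "Separated" to the list would repair exhaustiveness for this sample but not mutual exclusivity, which is the same trap as throwing more labels at a single-choice field.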

We've seen this one enough. At least that set of values seems covered fully.

But it's not.

Still lacking.

Trying hard. But still inadequate to express the real world's complexity.

We do realize that the real world has great variety. Of course we do. But it's hard to know how to handle it.

What user experience does this schema drive us toward?

It's easy to figure, "Eh, we can refactor later." But initial choices set differing user experiences in motion. Thinking at the outset about the real world's variety and complexity starts us asking early questions that set the foundation for social app-building. Ask: What user experience does this schema drive us toward?

Data doesn't have to be for analysis

It's easy to get into the habit of structuring data for easy analysis. But we can choose to look at human data from a different perspective. Step back. Wallow in the user's perspective.

Data can be sheer expressiveness

Data can be sheer expressiveness. Data that has character, individualism, distinctiveness.

What we want

• Structured
• Predictable
• Relational
• Indexed
• Exhaustive
• Easy analytics
• Data-driven decisions

What we get

• Validations, exceptions
• Conditionals, partials
• Premature optimization
• Cultural variability
• Individual POV
• Moving target
• Decisions based on false premises

As developers, we have a vision of what a good codebase should be, and not be.

[READ]

In the most sensible of ways, we often arrive at solutions that are factually TRUTHY while far removed from real-life utility.

What is your religion, if any?

ARIS is the largest ongoing survey of Americans' religious identification. It asks this simple, OPEN-ENDED question: [slide]

Which nets over 100 unique answers. Which, if you're making a form based on that, is tricky. There's no form element that makes it easy for our users to pick themselves out of a list of so many possibilities.

ARIS found, though, that these could be compressed into 13 major categories. More manageable list, right? We could use that for a form. But, eh, a lot of those are edge cases. We'd rather focus on genuinely major groupings.

What is your religion, if any?

Christian 76%
Other 4%
None 15%
Don't Know or Refused 5%

Which brings us down to this. At least three quarters of Americans are Christian. Done. Every other religion would just be clutter. Edge cases.

And then there's this sort of crummy data with it. We'd probably assign a nil value for more than one of these categories, right?

What is your religion?

Christian 76%
Other 4%
n/a 20%

Fixed. Which focuses attention on a problem here: 1 in 5 are not useful answers from an advertiser's perspective, or for our own analytics. The NILS -- gotta go.

What is your religion?

Christian 76%
Other 24%

So what we're left with is a good, clear list. It covers ALL the big stuff. When you get reductive enough, for Americans, religion is a binary. Which, from a storage standpoint, is great. Booleans. Score.
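The arithmetic behind that reduction is easy to reproduce, and doing so shows how much gets folded away. In this sketch the percentages are the ones shown on the slides; the `collapse` helper and the constant name are hypothetical.

```ruby
# Percentages as shown on the slides above.
ARIS_SHARES = {
  "Christian" => 76, "Other" => 4, "None" => 15, "Don't Know or Refused" => 5
}.freeze

# Fold every label not in keep into a single "Other" bucket.
def collapse(shares, keep)
  shares.each_with_object(Hash.new(0)) do |(label, percent), merged|
    merged[keep.include?(label) ? label : "Other"] += percent
  end
end

puts collapse(ARIS_SHARES, ["Christian"]).inspect
```

The collapsed "Other" bucket is 4 + 15 + 5 = 24, so nearly a quarter of respondents, with very different answers, end up indistinguishable from one another.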

Religion

But we don't do THIS. It covers the biggest categories. But oh, how it leaves people out.

People aren't edge cases

People aren't edge cases.

And they're pushing back on apps that treat them that way.

[LONG BEAT LONG] [BREATH] [CHANGE GEARS]

So, reductive has big problems. So does scaling upward.

Balancing between approaches

As engineers, it's instinctually uncomfortable to DELIBERATELY NOT STRUCTURE DATA for easy analysis. I feel ya. I really do. This freaks me out too. But again, the foundational question is: What benefit will the user notice? What identities are okay with us?

And if necessary, we CAN strike a middle ground. This is where guided response comes in. Autosuggest.

Checkbox, Radio, Select

Textarea, Text

Required, Optional, Autosuggest

Minimal Suggest

When there's a subset of values that you're most interested in, do minimal suggest instead. Autosuggest using just the handful of values you care about.
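A minimal-suggest field can be sketched in a few lines: offer a short list as hints while still accepting whatever the user types. The list contents and the method names below are assumptions made for illustration, not a prescribed set of values.

```ruby
# Hypothetical short list: the handful of values the app cares most about.
SUGGESTED_VALUES = ["Female", "Male", "Genderqueer", "Transgender"].freeze

# Suggestions matching what has been typed so far (case-insensitive prefix).
def suggestions(typed, candidates = SUGGESTED_VALUES)
  prefix = typed.to_s.strip.downcase
  return [] if prefix.empty?
  candidates.select { |c| c.downcase.start_with?(prefix) }
end

# Whatever the user submits is kept as written; suggestions never validate.
def accepted_value(typed)
  text = typed.to_s.strip
  text.empty? ? nil : text
end

puts suggestions("fe").inspect
puts accepted_value(" Fembot ").inspect
```

The design choice is that the suggestion list only guides. It plays no part in validation, so people who pick a suggestion yield structured data and everyone else keeps their own words.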

Page 17: Schemas for the Real World [Madison RubyConf 2013]

cczona

In real life

Who you are is invalid

sounds arrogant

cczona

In real life

sounds arrogant

cczona

Data modeling is psychology and it is philosophy It reflect individuals beliefs about reality rather than reflecting reality itself It rejects realitys complexity and richness Thats NOT what we set out to do Its not the COMMUNITY we seek to build

cczona

[READ]

cczona

Be conservative in what you do be liberal in what you accept from others

-Postels Law

Postels Law reminds us to [READ]ARE we taking to heart that wisdom

[PAUSE FOR END OF SECTION]

cczona

Schemas, plural

There are two kinds of schemas.

Mental Schema

• Pre-conceived ideas
• Framework for representing some aspect of the world
• System of organizing & perceiving new information

Mental schemas are [READ].

Database Schema

• Structure described in the database's language
• Blueprint for database construction
• Describes how the real world is being modeled

Database schemas are closely related to that. A database schema is a mental schema translated into blueprints for a database.

These are simply frontend manifestations of various individuals' MENTAL schemas. And [SLOW DOWN] when we look at them closely, we can see [BRIEF PAUSE] that schemas are the foundation [BRIEF PAUSE] for expressing

deeply intimate

things that are [READ].

important relationships

Schemas are the foundation for expressing things [READ].

self-image

Schemas are the foundation for expressing things [READ].

Schemas define the user experience

Schemas define user experience.

Our schemas & UX are leaving people behind

[read] We can fix that.

What benefit will the user notice?

By asking ourselves, "What benefit will the USER NOTICE?" THIS is not equivalent to "How will the user benefit?" Because THAT is a question which grants us too much latitude to ASSUME that what we want is, of course, to their benefit, because it's gonna help us deliver a product that's Awesome.

[READ AGAIN] focuses us on EXPERIENCES.

Evaluating from user perspective gives us focus

There isn't ONE WAY. There isn't ONE RIGHT ANSWER. Because the NATURE of our dilemma is that users vary. And so do their social worlds. So we always need to understand varying approaches. How they fit different contexts. What tradeoffs they bring. AND the obvious, boring question: which options best serve THIS app's business requirements?

Checkbox, Radio button, Select menu, Ranges

Coerced, Discretionary, Guided

Corrective, Text Input, Textarea

Checkboxes, radio buttons, select menus, ranges: these each imply that EVERY possible value can be -- and is -- included. If any value isn't included there, it's the real world being rejected because it didn't match up with our mental schema.

Checkbox, Radio, Select

Coerced, Discretionary, Guided

Corrective, Text Input, Textarea

Checking a box is a one-step action. Entering a text string is not. So a freeform solution such as a textarea or text input isn't automatically exciting. BUT they can bring visible, meaningful benefit.

11 years

At MetaFilter, gender has been a text field for over a decade.

"...the early crowd at MeFi were often programmers and they hated the idea of dirty data collection..."

-Matt Haughey, founder

Initially, some SHUDDERED at the thought. Because [READ].

All girl

XY

Fella

Queer

Fembot

Alto

It depends

MYOB

Dangly bits

Chicklet

Innie not outie

Convex

Sideburns

Ambisextrous

Member of the patriarchy

For about 5 seconds. Then they jumped on board. Because they could be creative and silly.

I speak using the male gender when required by language

50% quintessential tomboy, 50% total girly-girl

AND because they could express this THING about themselves fully. With authentic voice.

Transgender

Genderqueer

[PAUSE]

"The freeform gender field is something I cherish about Metafilter."

-MeFi user

That text field grew into a beloved institution. Whatever you, as user, choose to put into that field says something revealing about who YOU are.

"It was one of the earliest indications I'd landed in the right place."

-MeFi user

That you're allowed to put in ANYTHING -- or put in NOTHING AT ALL -- says something revealing about how MetaFilter envisions community. That schema's trust in users was the foundation for MetaFilter users to ASK to share themselves even more.

In 2010, Diaspora likewise turned gender into a text field. Just like on MetaFilter, users felt set free by it.

There's an important distinction here, though. MetaFilter is closed source, it's one developer, and it's entirely in English. Diaspora needed to meet a different set of needs. It's open source, and it's international.

Not amused

So some of its developers are not amused. It's fair to object that this approach wreaks chaos on internationalization of pronouns. [BEAT] And here's what I've got to say to that:

Internationalization is hellish, period.

Internationalization of pronouns is hellish, period. You get no safety net in this. Constraining gender options doesn't solve translation of languages. Don't set yourself up for trying to deal with internationalization this way. It will fail you.

What's your legal gender?

• Indeterminate
• X
• Sex Not Specified

And if you really think that boiling this question down to "What's your legal gender?" is an easy out, let's talk about internationalization of that, too. Because this is the year that "indeterminate" has become a legal gender on German birth certificates, and that "X" is a legal gender on Australian passports. In New South Wales, Australia, "Sex Not Specified" is yet another legally-recognized gender. Binary gender is, today, APP FAIL.

"the most complicated thing I've ever spent a lot of time learning about. And I've spent a lot of time learning about quantum mechanics."

-Randall Munroe, xkcd

Randall Munroe, best known as xkcd, has examined the issue of pronouns many times, in depth, for English language projects. He says it's [READ]. So here's what he ultimately concluded:

Ask

Ask.

Which pronouns do you prefer?

he / him / himself / his / his
she / her / herself / hers / her
they / them / themself / theirs / their
it / it / itself / its / its
by name

Asking straight up, "Which pronouns do you prefer?" is truly the best he could come up with.

Which pronouns do you prefer?

he / him / himself / his / his
she / her / herself / hers / her
they / them / themself / theirs / their
personal name
other ____________

Or maybe slightly refining, like these.

And I know you may be looking at that 3rd row and protesting.

Our English teachers told us "they" and "their" aren't singular, right? Nope. They're plural AND SINGULAR, and have been for 400 years. Take the word of Jane Austen, Shakespeare, Chaucer, Lewis Carroll. Modern English authorities agree on this one, too. It's excellent English -- and excellent SOCIAL -- to use it when people ask us to.
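One way the pronoun question could be carried into code, as a rough sketch only. The names PronounSet, PRONOUN_SETS, pronouns_for and status_line are hypothetical, and the fallback to singular "they" for an unanswered question is an assumption, not something the slides specify.

```ruby
# Hypothetical names throughout; a sketch, not any real app's API.
# Columns follow the slide's order: he / him / himself / his / his.
PronounSet = Struct.new(:subject, :object, :reflexive, :possessive, :determiner)

PRONOUN_SETS = {
  "he"   => PronounSet.new("he", "him", "himself", "his", "his"),
  "she"  => PronounSet.new("she", "her", "herself", "hers", "her"),
  "they" => PronounSet.new("they", "them", "themself", "theirs", "their")
}.freeze

# choice: "he", "she", "they", "name", "other", or nil (unanswered).
# custom: a PronounSet the user supplied after choosing "other".
def pronouns_for(choice, name:, custom: nil)
  case choice
  when "name"
    # "personal name": repeat the name instead of using a pronoun.
    PronounSet.new(name, name, name, "#{name}'s", "#{name}'s")
  when "other"
    custom || PRONOUN_SETS["they"]
  else
    # Assumption: unanswered or unrecognized falls back to singular "they".
    PRONOUN_SETS.fetch(choice, PRONOUN_SETS["they"])
  end
end

def status_line(name, pronouns)
  "#{name} updated #{pronouns.determiner} profile."
end

puts status_line("Sam", pronouns_for("they", name: "Sam"))  # Sam updated their profile.
puts status_line("Sam", pronouns_for("name", name: "Sam"))  # Sam updated Sam's profile.
```

The point of the sketch is that copy is written against the pronoun set the user picked, never against a gender value.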

Even Facebook has taken note that it's okay. They've been using this construction for at least 5 years and are doing fine :-)

"Home is the place where, when you have to go there, they have to take you in."

-Robert Frost

We aspire to make our social apps feel like home. We can do that by making development choices that say, "Hey user, we GET you. Be as individual and messy as you want to. We can handle _you_ being _you_."

Social Research

When collecting data on people, we're in a different realm: social sciences. If you want to run useful analytics about personal attributes and behavior, then data collection needs to meet at least the two minimum criteria for social science research.

Exhaustive: Every possible option

A field's values must include every possible option.

Mutually Exclusive: No overlap exists between them

And the field's options must be mutually exclusive.

How many social apps have both of those bare minimum criteria covered?
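To make the two criteria concrete, here is a minimal sketch that checks a list of options against what is actually true of a few people. The people are made up, and the helper names uncovered and overlapping are hypothetical.

```ruby
# A sketch under stated assumptions; the data below is invented.
OPTIONS = ["Single", "In a relationship", "Married", "Separated", "Divorced"].freeze

# Each person is described by everything that is true of them.
PEOPLE = {
  "A" => ["Married"],
  "B" => ["Married", "Separated"],
  "C" => ["Best friends"]
}.freeze

# Exhaustive fails for anyone whom no option describes.
def uncovered(people, options)
  people.select { |_who, facts| (facts & options).empty? }.keys
end

# Mutually exclusive fails for anyone whom more than one option describes.
def overlapping(people, options)
  people.select { |_who, facts| (facts & options).size > 1 }.keys
end

p uncovered(PEOPLE, OPTIONS)    # ["C"]
p overlapping(PEOPLE, OPTIONS)  # ["B"]
```

Person C shows the list is not exhaustive; person B shows it is not mutually exclusive. A single-choice field built on this list would misrecord both of them.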

We've seen this one enough. At least that set of values seems covered fully.

But it's not.

Still lacking.

Trying hard. But still inadequate to express the real world's complexity.

We do realize that the real world has great variety. Of course we do. But it's hard to know how to handle it.

What user experience does this schema drive us toward?

It's easy to figure, "Eh, we can refactor later." But initial choices set differing user experiences in motion. Thinking at the outset about the real world's variety and complexity starts us asking early questions that set the foundation for social app-building. Ask: "What user experience does this schema drive us toward?"

Data doesn't have to be for analysis

It's easy to get into the habit of structuring data for easy analysis. But we can choose to look at human data from a different perspective. Step back. Wallow in the user's perspective.

Data can be sheer expressiveness

Data can be sheer expressiveness. Data that has character, individualism, distinctiveness.

What we want:

• Structured
• Predictable
• Relational
• Indexed
• Exhaustive
• Easy analytics
• Data-driven decisions

What we get:

• Validations, exceptions
• Conditionals, partials
• Premature optimization
• Cultural variability
• Individual POV
• Moving target
• Decisions based on false premises

As developers, we have a vision of what a good codebase should be, and not be.

[READ]

In the most sensible of ways, we often arrive at solutions that are factually TRUTHY while far removed from real-life utility.

What is your religion, if any?

ARIS is the largest ongoing survey of Americans' religious identification. It asks this simple, OPEN-ENDED question: [slide]

Which nets over 100 unique answers. Which, if you're making a form based on that, is tricky. There's no form element that makes it easy for our users to pick themselves out of a list of so many possibilities.

ARIS found, though, that these could be compressed into 13 major categories. More manageable list, right? We could use that for a form. But eh, a lot of those are edge cases. We'd rather focus on genuinely major groupings.

What is your religion, if any?

Christian 76%
Other 4%
None 15%
Don't Know or Refused 5%

Which brings us down to this. At least three quarters of Americans are Christian. Done. Every other religion would just be clutter. Edge cases.

And then there's this sort of crummy data with it. We'd probably assign a nil value for more than one of these categories, right?

What is your religion?

Christian 76%
Other 4%
n/a 20%

Fixed. Which focuses attention on a problem here: 1 in 5 are not useful answers from an advertiser's perspective, or for our own analytics. The NILS -- gotta go.

What is your religion?

Christian 76%
Other 24%

So what we're left with is a good, clear list. It covers ALL the big stuff. When you get reductive enough, for Americans, religion is a binary. Which from a storage standpoint is great. Booleans! Score!
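The arithmetic of that reduction, as a small sketch. The percentages are the ones on the slides; the helper collapse and the constant names are hypothetical.

```ruby
# Shares from the slides, in percent.
ARIS_SHARES = {
  "Christian" => 76, "Other" => 4, "None" => 15, "Don't Know or Refused" => 5
}.freeze

# Merge the categories named in `from` into the single category `into`,
# summing their shares. (Hypothetical helper, for illustration only.)
def collapse(shares, from, into)
  kept = shares.reject { |label, _share| from.include?(label) }
  kept[into] = kept.fetch(into, 0) + shares.values_at(*from).sum
  kept
end

# Step 1: both kinds of non-answer become "n/a".
WITH_NILS = collapse(ARIS_SHARES, ["None", "Don't Know or Refused"], "n/a")
# Step 2: the nils get folded into "Other", leaving a boolean.
BINARY = collapse(WITH_NILS, ["n/a"], "Other")

puts "#{ARIS_SHARES.size} -> #{WITH_NILS.size} -> #{BINARY.size} categories"
puts "Other is now #{BINARY['Other']}% of respondents"
```

Every step still sums to 100, so nothing looks wrong in a report. What has gone missing is the difference between "none", "refused" and every non-Christian religion, for 24% of respondents.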

Religion

But we don't do THIS. It covers the biggest categories. But oh, how it leaves people out.

People aren't edge cases

People aren't edge cases.

And they're pushing back on apps that treat them that way.

[LONG BEAT LONG] [BREATH] [CHANGE GEARS]

So: reductive has big problems. So does scaling upward.

Balancing between approaches

As engineers, it's instinctually uncomfortable to DELIBERATELY NOT STRUCTURE DATA for easy analysis. I feel ya. I really do. This freaks me, too. But again, the foundational question is "What benefit will the user notice?" What identities are okay with us?

And if necessary, we CAN strike a middle ground. This is where guided response comes in. Autosuggest.

Checkbox, Radio, Select

Textarea, Text

Required, Optional, Autosuggest

Minimal Suggest

When there's a subset of values that you're most interested in, do minimal suggest instead. Autosuggest using just the handful of values you care about.

Structured data from those who want to give it. Free form to incite expressiveness in those who want THAT.
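A minimal sketch of what "minimal suggest" could look like on the server side. The SUGGESTED list and the helper names suggest and accept are hypothetical; which handful of values an app cares about is a product decision.

```ruby
# Hypothetical: the few values this imaginary app most wants to analyze.
SUGGESTED = ["female", "male", "genderqueer", "transgender"].freeze

# Offer only suggestions that start with what has been typed so far.
def suggest(typed, values = SUGGESTED)
  prefix = typed.to_s.strip.downcase
  return [] if prefix.empty?
  values.select { |value| value.start_with?(prefix) }
end

# A suggestion is only an offer: whatever the user submits is stored as
# typed. Blank input is stored as nil, meaning "no answer given".
def accept(input)
  text = input.to_s.strip
  text.empty? ? nil : text
end

p suggest("fe")      # ["female"]
p accept("Fembot")   # "Fembot"
p accept("   ")      # nil
```

Nothing here validates against the list. That is the difference between guiding a response and coercing one.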

Unguided Text: Of those who use MetaFilter's gender field, 40% of responses are "f", "m", "female", "male"

[read] So structured data IS there. This can be a balanced solution in many cases where you're willing to tolerate some ambiguity. Of course, there are tradeoffs. Data quantity is lower: freed to opt out of providing personal info, many do. On the other hand, data quality should improve.
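How the structured part might be pulled back out of a freeform field, as a sketch. The BUCKETS mapping covers only the four short forms named on the slide; the SAMPLE responses are invented for illustration and are not MetaFilter data.

```ruby
# Only the short forms named on the slide are bucketed; everything else
# is left exactly as the user wrote it.
BUCKETS = {
  "f" => "female", "female" => "female",
  "m" => "male",   "male"   => "male"
}.freeze

# Returns "female", "male", or nil when the response is its own thing.
def bucket(response)
  BUCKETS[response.to_s.strip.downcase]
end

# Percentage of non-blank responses that land in a bucket.
def structured_share(responses)
  answered = responses.map { |r| r.to_s.strip }.reject(&:empty?)
  return 0.0 if answered.empty?
  (100.0 * answered.count { |r| bucket(r) } / answered.size).round(1)
end

# Invented sample, not real data.
SAMPLE = ["F", "male", "Fembot", "It depends", "m", "", "Alto"].freeze
puts structured_share(SAMPLE)  # 50.0
```

The raw text stays in the column. Bucketing happens at analysis time, so the mapping can change later without asking anyone to re-enter anything.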

Optional Select: 60% of Facebook users select a relationship status

It's fine to mix and match here. Find the right approach for your users and your app's business objectives. Facebook, for instance, makes relationship status completely optional, but coercive for those who do opt in to setting a value. Most users do opt in: 60% of them select a relationship status.

We want everyone to feel excited about what we've built

The bottom line is that [read].

We want users to feel passionate about their involvement

[read]

Analytics, investments, & monetization are based on a premise that data is accurate

[read]

BUT if the data has been collected by coercive approaches, the RISK is that--

this premise is garbage. Some people are lying, because lying has been made a requirement for getting past the barriers. So conclusions drawn from that bad data can misdirect decision-making about the next stage of development.

Checkbox, Radio, Select

Textarea, Text

Required

Corrective

Discretionary, Guided

The restrictive options, that stuff at bottom left: those don't actually have to be marked "required".

ADD COLUMN gender VARCHAR(6) NOT NULL DEFAULT 'female'

But the way we set up schema often embeds assumptions that we should, and we will. So we do. An attribute that is ambiguously named is destined to become a form field whose data wanders away from its original intent, and code that misunderstands how to use it. A field that's not allowed to be null is destined to be mandatory. A field that is assigned a maximum length is an ASSERTION that all possible values are knowable and fit within it.
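What that maximum-length assertion does to real answers, as a small sketch. The limit comes from the VARCHAR(6) on the slide and the values are ones quoted earlier in this talk; the helper name fits_and_rejects is hypothetical.

```ruby
COLUMN_LIMIT = 6 # from VARCHAR(6) on the slide; "female" is exactly 6 characters

# Split values into those that fit the column and those that would be
# rejected (or truncated, depending on the database and its settings).
def fits_and_rejects(values, limit = COLUMN_LIMIT)
  values.partition { |value| value.length <= limit }
end

MEFI_VALUES = ["Queer", "Alto", "Fembot", "It depends", "Genderqueer", "Transgender"].freeze

fits, rejects = fits_and_rejects(MEFI_VALUES)
puts "fits:    #{fits.join(', ')}"
puts "rejects: #{rejects.join(', ')}"
```

Half of this short list never makes it into the column, and nobody wrote a validation to decide that. The column width decided it.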

ADD column gender_identity NOT NULL VARCHAR

BOOM. This is the foundation for a whole different user experience. And the cool ninja move was that we decided to DO LESS. Make this stuff flexible upfront. Optimize storage later. Decide what's valid later.

Checkbox, Radio, Select

Textarea, Text

Required, Optional

Alteration

Guided

So there you have it: a discretionary field. Whether to respond is left up to the user.

t.string :relationship_status, :null => true

As developers, we may look upon THIS expression as redundant, completely unnecessary. Duh, null is true by default. But making that explicit is a communication to the team and to your future self. It's a statement of intent. It's documenting a product decision.

Facebook

Single
In a relationship
Engaged
Married
It's complicated
Open relationship

What would a canonical set of relationship statuses look like? Three years ago, Facebook figured this list was pretty good. Arguably pretty progressive too, right? Users disagreed. Strongly.

Facebook

Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Separated
Divorced
Civil union
Domestic partnership

Under pressure, Facebook nearly doubled the options in just two years' time.

Google+

Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Separated
Divorced
Civil union
Domestic partnership
I don't want to say

While Google+ has largely adopted that list, it has not included Separated or Divorced. Notice that they also added something choice: opt out of labeling.

Allowing users to identify their relationships with labels of greater personal significance: that's being driven by people rejecting a user experience that isn't working for them.

It's a scope

How did some statuses seem universal while others weren't? Naming a thing creates scope. The assumed validity of a field's values gets constrained as soon as the field is named.

marital_status

Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Separated
Divorced
Civil union
Domestic partnership
I don't want to say

Marital status, for instance, might lead to a list similar to this one. A person is assumed to be either unmarried, preparing to be married, currently married, or formerly married.

legal_marital_status

Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Separated
Divorced
Civil union
Domestic partnership
I don't want to say

Marital status, for instance, might lead to a list similar to this one. A person is assumed to be either unmarried, preparing to be married, currently married, or formerly married.

relationship_status

Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Separated
Divorced
Civil union
Domestic partnership
I don't want to say

Whereas relationship status might lead to a list more like this one. In which one either has a current relationship -- or is defined by the absence of any.

singleness_status

As soon as you name a field, you define its paradigm and possibilities. There are important differences in what each of these envision and are able to measure. Naming fields -- with great specificity, upfront -- makes analyses more powerful later.

singleness_rating

No chance whatsoever
Suuuuuper duper single

1000

If you change the name, you shift the paradigm and possibilities. There's an important difference in what these collect and set out to measure. Naming fields -- with great specificity, upfront -- makes analyses more powerful later.

It's Complicated, In a Relationship, Married, Divorced, Widowed, Single

We go through life experiencing many relationships. They don't all have pre-determined labels. And new relationship identities don't necessarily leave old ones behind.

"I like to be truthful and 'It's Complicated' is really deceiving. It is not complicated. I am separated from my husband, who I am still legally married to."

-Facebook user

[READ] Why should anyone have to feel like they're required to disavow a relationship that's meaningful for them?

It cuts deeply sometimes. Is a widow only Widowed until she resumes dating? Why does she have to be confronted with a decision like that just to use someone's app?

Okay, this one. Spot the fatal flaw. [WAIT]

buffer overflow

An open relationship is definitionally a 1:Many join -- WITH the usual engineering understanding, that is, that the actual number of relationships may be 0, 1, OR many.

Choose (one): evasive, inauthentic

But it's not funny from the personal perspective. This schema forces the person to either LOOK EVASIVE or else BE INAUTHENTIC. This is what happens when we try to throw more labels at the problem instead of re-examining the schema's assumptions.
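One way to re-examine that assumption is to model relationships as a collection instead of a single status value. This is only a sketch in plain Ruby; Profile and Relationship are hypothetical names, and a real app would more likely use a join table.

```ruby
# Hypothetical model: a profile has zero, one, or many relationships,
# each with its own label in the user's own words.
Relationship = Struct.new(:with, :label)

class Profile
  attr_reader :relationships

  def initialize
    @relationships = []
  end

  # Adding a relationship never replaces an earlier one.
  def add_relationship(with, label)
    @relationships << Relationship.new(with, label)
    self
  end

  def labels
    @relationships.map(&:label)
  end
end

widow = Profile.new
widow.add_relationship("late spouse", "Widowed")
widow.add_relationship("someone new", "Dating")

puts widow.labels.join(", ")           # Widowed, Dating
puts Profile.new.relationships.size    # 0
```

With this shape, nobody has to pick between two things that are both true, and an empty list is a valid answer.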

Facebook relationship status by user age

srsly

When we don't give people means to express what's real and important for them, they hack around us. Are a quarter of Facebook's 13 year olds really married? Probably not. What they probably have is a BEST FRIEND, and "married" is the nearest equivalent available for them to express "THIS PERSON is my most important relationship".

The dogged pursuit of nice, clean, CRUNCHABLE data takes us to such wrong places. And makes our users want to work against a system that doesn't acknowledge their lives' reality.

We build or break community with each line of code

As developers, our lines of code build -- or break -- an app's community. So it's worth making conscious choices as we go.

Modeling the real world is complex

[READ] And that's okay.

Assuming we know who users are surrenders opportunity to learn who they are

But [read]. Early constraints in schema NET crappy, misleading data. [beat] So keep constraints out of human and social schema, at least at first. Gather enough initial response to do some data mining. Watch that data for a while. Keep an eye out for emergent trends. What DISCOVERIES can you make? How can they shape the app's growth and evolution?

Quality

Data quality improves when lies are merely optional, not required.

Specificity

Data becomes like the real world itself: rich and specific. So we can unearth the patterns that are undetectable when data is generic.

Adaptability

We get to make discoveries and re-envision possibilities.

We adapt quickly.

Loyalty

And their response to all this? Engagement. Passion. LOYALTY.

THOSE are foundations for great user experiences. [LOOK UP FINALITY]

@cczona
cczona@gmail.com
http://cczona.com

Questions? Conversation?

Oh Hell Yeah

Carina C. Zona

Resources

Data Science & Information Architecture

• Sociological normalization
• Database normalization
• Using Machine Learning On Social Networks To Figure Out What You Should Read On The Web
• NoSQL Data Modeling Techniques
• Data & Reality, by William Kent
• Bad Data Handbook, Q. Ethan McCallum
• Data Science of the Facebook World

Relationships

• Does Facebook Hurt Relationships?
• Facebook Adds LGBT-Friendly Relationship Status Options
• Facebook Targeting by Relationship Status & Workplace
• Your Facebook Relationship Status: It's Complicated
• Gay Marriage: The Database Engineering Perspective

Sex & Genders

• Disalienation: Why Gender is a Text Field on Diaspora
• Gender & Drop Down Menus
• Sex & Gender
• Bucket Gender
• Recommendations for Inclusive Data Collection of Trans People

Names

• Falsehoods Programmers Believe About Names
• Your Last Name Contains Invalid Characters
• W3C Internationalization: Personal Names Around the World
• Spanish Names
• Chinese Names

More

• Redesigning the Country Selector
• American Religious Identification Survey Summary Report 2009
• Linguistic Potluck: Crowdsourcing Internationalization in Rails

Image Credits

Postsecret
Facebook
OKCupid
Google+
Metafilter
Diaspora
Flickr
FetLife
Kotangle
cutestpaw.com
hdwallpapers.in
xkcd

Many Thanks

Chiu-Ki Chan
Estelle Weyl
Heather Rivers
Heroku
Jeremy Dunck
Josh Susser
Michele Titolo
NIRD LLC
Renée De Voursney
Sarah Mei
San Francisco Sex Information
Yoz Grahame

Get in Touch

Carina C. Zona
@cczona
http://cczona.com
cczona@gmail.com
http://slideshare.net/cczona
http://linkedin.com/in/cczona

Page 18: Schemas for the Real World [Madison RubyConf 2013]

cczona

In real life

sounds arrogant

cczona

Data modeling is psychology and it is philosophy It reflect individuals beliefs about reality rather than reflecting reality itself It rejects realitys complexity and richness Thats NOT what we set out to do Its not the COMMUNITY we seek to build

cczona

[READ]

cczona

Be conservative in what you do be liberal in what you accept from others

-Postels Law

Postels Law reminds us to [READ]ARE we taking to heart that wisdom

[PAUSE FOR END OF SECTION]

cczona

Schemas plural

There are two kinds of schemas

cczona

Mental Schema

bull Pre-conceived ideas

bull Framework for representing some aspect of the world

bull System of organizing amp perceiving new information

Mental Schema are [READ]

cczona

Database Schema

bull Structure described in the databases language

bull Blueprint for database construction

bull Describes how the real world is being modeled

Database schemas are closely related to that Its a mental schema translated into blueprints for a database

cczona

These are simply frontend manifestations of various individuals MENTAL schemas And [SLOW DOWN] when we look at them closely we can see [BRIEF PAUSE] that schemas are foundation [BRIEF PAUSE] for expressing

cczona

deeply intimate

things that are [READ]

cczona

important relationships

Schemas are foundation for expressingthings [READ]

cczona

self-image

Schemas are foundation for expressingthings [READ]

cczona

Schemas define the

user experience

Schemas define user experience

cczona

Our schemas are leaving people behind

amp UX

[read]We can fix that

cczona

What benefit will the user notice

By asking ourselves What benefit will the USER NOTICE THIS is not equivalent to How will the user benefit Because THAT is a question which grants us too much latitude To ASSUME that what we want is of course to their benefit because its gonna help us deliver a product thats Awesome

[READ AGAIN] focuses us on EXPERIENCES

cczona

Evaluating from user perspective gives us focus

There isnt ONE WAY There isnt ONE RIGHT ANSWER Because the NATURE of our dilemma is that users vary And so do their social worlds So we always need to understand varying approaches How they fit different contexts What tradeoffs they bring AND the obvious boring question which options best serve THIS apps business requirements

cczona

CheckboxRadio buttonSelect menuRanges

Coerced DiscretionaryGuided

Corrective Text InputTextarea

Checkboxes radio buttons select menus ranges these each imply that EVERY possible value can -- and is -- included If any value isnt included there its real world being rejected because it didnt match up with our mental schema

cczona

CheckboxRadioSelect

Coerced DiscretionaryGuided

Corrective Text InputTextarea

Checking a box is a one step action Entering a text string is not So a freeform solution such as a textarea or text input isnt automatically exciting BUT they can bring visible meaningful benefit

cczona

11 years

At MetaFilter gender has been a text field for over a decade

cczona

hellipthe early crowd at MeFi were often programmers and they hated the idea of dirty data collectionhellip

mdashMatt Haughey founder

Initially some SHUDDERED at the thought Because [READ]

cczona

All girl

XY

Fella

Queer

Fembot

Alto

It depends

MYOB

Dangly bits

Chicklet

Innie not outie

Convex

Sideburns

Ambisextrous

Member of the patriarchy

For about 5 seconds Then they jumped on board Because they could be creative and silly

cczona

I speak using the male gender when required by language 50 quintessential

tomboy 50 total girly-girl

AND because they could express this THING about themselves fully With authentic voice

cczona

TransgenderGenderqueer

[PAUSE]

cczona

The freeform gender field is something I cherish about Metafilter

mdashMeFi user

That text field grew into a beloved institution Whatever you as user choose to put into that field says something revealing about who YOU are

cczona

It was one of the earliest indications Id landed in the right place

mdashMeFi user

That youre allowed to put in ANYTHING -- or put in NOTHING AT ALL -- says something revealing about what how MetaFilter envisions community That schemas trust in users was the foundation for Metafilter users to ASK to share themselves even more

cczona

In 2010 Diaspora likewise turned gender into a text field Just like on MetaFilter users felt set free by it

Theres an important distinction here though Metafilter is closed source its one developer and its entirely in English Diaspora needed to meet a different set of needs Its open source and its international

cczona

Not amused

So some of its developers are not amused Its fair to object that this approach wreaks chaos on internationalization of pronouns [BEAT] And heres what Ive got to say to that

cczona

Internationalization is hellish period

Internationalization of pronouns is hellish period You get no safety net in this Constraining gender options doesnt solve translation of languages Dont set yourself up for trying to deal with internationalization this way It will fail you

cczona

Whats your legal gender

bull Indeterminate

bull X

bull Sex Not Specified

And if you really thing that boiling this question down to Whats your legal gender is an easy out lets talk about internationaliztion of that too Because this is the year that indeterminate has become a legal gender on German birth certificates and that X is a legal gender on Australian passports In New South Wales Australia Sex Not Specified is yet another legally-recognized gender Binary gender is today APP FAIL

cczona

the most complicated thing Irsquove ever spent a lot of time learning about

And Irsquove spent a lot of time learning about quantum mechanics

mdashRandall Munroe xkcd

Randall Munroe best known as xkcd has examined the issue of pronouns many times in depth for English language projects He says its [READ] So hes what he ultimately concluded

cczona

Ask

Ask

cczona

Which pronouns do you prefer

hehimhimselfhishis

sheherherselfhersher

theythemthemselftheirstheir

itititselfitsits

by name

Asking straight up Which Pronouns Do You Prefer is truly the best he could come up with

cczona

Which pronouns do you prefer

hehimhimselfhishis

sheherherselfhersher

theythemthemselftheirstheir

personal name

other ____________

Or maybe slightly refining like these

And I know you may be looking at that 3rd row and protesting

cczona

Which pronouns do you prefer

hehimhimselfhishis

sheherherselfhersher

theythemthemselftheirstheir

personal name

other ____________

Our english teachers told us they and their arent singular right Nope Theyre plural AND SINGULAR and have been for 400 years Take the word of Jane Austen Shakespeare Chaucer Lewis Carroll Modern english authorities agree on this one too Its excellent English -- and excellent SOCIAL -- to use it when people ask us to

cczona

Even Facebook has taken note that its okay Theyve been using this construction for at least 5 years and are doing fine -)

cczona

Home is the place where when you have to go there they have to take you in

mdashRobert Frost

We aspire to make our social apps feel like home We can do that by making development choices that say Hey user we GET you Be as individual and messy as you want to We can handle _you_ being _you_

Social Research

When collecting data on people were in a different realm Social sciences If you want to run useful analytics about personal attributes and behavior then data collection needs to meet at least the two minimum criteria for social science research

ExhaustiveEvery possible option

A fields values must include every possible option

Mutually ExclusiveNo overlap exists between them

And the fields options must be mutually exclusive

How many social apps have both of those bare minimum criteria covered

cczona

Weve seen this one enough At least that set of values seems covered fully

cczona

`

But its not

cczona

Still lacking

cczona

Trying hard But still inadequate to express the real worlds complexity

We do realize that the real world has great variety Of course we do But its hard to know how to handle it

cczona

What user experience does this schema drive us toward

Its easy to figure Eh we can refactor later But initial choices set differing user experiences in motion Thinking at the outset about the real worlds variety and complexity starts us asking early questions that set foundation for social app-building Ask What user experience does this schema drive us toward

cczona

Data doesnrsquot have to be for analysis

Its easy to get into the habit of structuring data for easy analysis But we can choose to look at human data from a different perspective Step back Wallow in the users perspective

cczona

Data can be sheerexpressiveness

Data can be sheer expressiveness Data that has character individualism distinctiveness

cczona

What we want What we get

bull Structured

bull Predictablebull Validations exceptions

bull Conditionals partials

bull Relational

bull Indexed bull Premature optimization

bull Exhaustivebull Cultural variability

bull Individual POV

bull Moving target

bull Easy analytics

bull Data-driven decisionsbull Decisions based on false

premises

As developers we have a vision of what a good codebase should be and not be

[READ]

In the most sensible of ways we often arrive at solutions that are factually TRUTHY while far removed from real life utility

cczona

What is your religion if any

ARIS is the largest ongoing survey of Americans religious identification It asks this simple OPEN-ENDED question [slide]

Which nets over 100 unique answers Which if youre making a form based on that is tricky Theres no form element that makes it easy for our users to pick themselves out of a list of so many possibilities

cczona

ARIS found though that these could be compressed into 13 major categories More manageable list right We could use that for a form But eh a lot of those are edge cases Wed rather want to focus on genuinely major groupings

cczona

What is your religion if any

Christian 76

Other 4

None 15

Dont Know or Refused 5

Which brings us down to this At least a quarter of Americans are Christian Done Every other religion would just be clutter Edge cases

And then theres this sort of crummy data with it Wed probably assign nil value for more than one of these categories right

cczona

What is your religion

Christian 76

Other 4

na 20

Fixed Which focuses attention on a problem here 1 in 5 are not useful answers from an advertisers perspective or for out own analytics The NILS -- gotta go

What is your religion?

• Christian: 76%
• Other: 24%

So what we're left with is a good, clear list. It covers ALL the big stuff. When you get reductive enough, for Americans, religion is a binary. Which, from a storage standpoint, is great. Booleans! Score!
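The successive reductions on these slides are simple arithmetic, and it can help to see how little survives them. A minimal Ruby sketch, using the percentages shown on the slides; the `collapse` helper and the variable names are hypothetical, introduced only for illustration:

```ruby
# Percentages as shown on the slide.
ARIS = { "Christian" => 76, "Other" => 4, "None" => 15, "Don't Know or Refused" => 5 }

# Hypothetical helper: fold the categories listed in `from` into the bucket `into`.
def collapse(counts, into:, from:)
  merged = counts.reject { |label, _| from.include?(label) }
  merged[into] = merged.fetch(into, 0) + from.sum { |label| counts.fetch(label) }
  merged
end

# "We'd probably assign nil for more than one of these categories."
step1 = collapse(ARIS, into: "n/a", from: ["None", "Don't Know or Refused"])
# => {"Christian"=>76, "Other"=>4, "n/a"=>20}

# "The NILS -- gotta go."
step2 = collapse(step1, into: "Other", from: ["n/a"])
# => {"Christian"=>76, "Other"=>24}
```

Two tidy steps, and 100+ distinct answers have become one boolean. Nothing in the code objects, which is rather the point.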

Religion

But we don't do THIS. It covers the biggest categories. But, oh, how it leaves people out.

People aren't edge cases

People aren't edge cases.

And they're pushing back on apps that treat them that way.

[LONG BEAT. LONG.] [BREATH] [CHANGE GEARS]

So: reductive has big problems. So does scaling upward.

Balancing between approaches

As engineers, it's instinctually uncomfortable to DELIBERATELY NOT STRUCTURE DATA for easy analysis. I feel ya. I really do. This freaks me out too. But again, the foundational question is: What benefit will the user notice? What identities are okay with us?

And if necessary, we CAN strike a middle ground. This is where guided response comes in: autosuggest.

[Slide diagram. Labels: Checkbox, Radio, Select; Textarea, Text; Required, Optional; Autosuggest; Minimal Suggest]

When there's a subset of values that you're most interested in, do minimal suggest instead. Autosuggest using just the handful of values you care about.

Structured data from those who want to give it. Free form, to incite expressiveness in those who want THAT.
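Minimal suggest can be sketched in a few lines of plain Ruby. This is only an illustration under stated assumptions: the `SUGGESTED` list, `suggestions_for`, and `stored_value` are hypothetical names, and the list contents are placeholders for whatever handful of values an app actually cares about.

```ruby
# Hypothetical short list: just the handful of values the app cares about.
SUGGESTED = ["female", "male", "genderqueer", "transgender"].freeze

# Suggestions whose beginning matches what the user has typed so far.
def suggestions_for(typed, list = SUGGESTED)
  prefix = typed.to_s.strip.downcase
  return [] if prefix.empty?
  list.select { |value| value.start_with?(prefix) }
end

# Whatever the user finally submits is stored as given.
# A blank answer is stored as nil: no answer, not a wrong answer.
def stored_value(submitted)
  text = submitted.to_s.strip
  text.empty? ? nil : text
end
```

The suggestions only guide. `stored_value("Fembot")` keeps "Fembot" even though it is nowhere in the list, so people who want to give structured data can, and people who want to be expressive are not corrected.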

Unguided Text: Of those who use MetaFilter's gender field, 40% of responses are "f", "m", "female", "male"

[read] So structured data IS there. This can be a balanced solution in many cases where you're willing to tolerate some ambiguity. Of course, there are tradeoffs. Data quantity is lower. Freed to opt out of providing personal info, many do. On the other hand, data quality should improve.
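To see how structured data can be recovered from an unguided text field, here is a small Ruby sketch. The four spellings come from the slide; the `canonical` and `structured_share` helpers and the ten-item sample are hypothetical and are not MetaFilter's data.

```ruby
# The spellings the slide names as recognizably structured answers.
STRUCTURED = { "f" => "female", "female" => "female", "m" => "male", "male" => "male" }.freeze

# Map a free-form answer to a canonical label, or nil when it is something else.
def canonical(answer)
  STRUCTURED[answer.to_s.strip.downcase]
end

# Fraction of responses that carry a canonical label.
def structured_share(responses)
  return 0.0 if responses.empty?
  responses.count { |r| canonical(r) }.fdiv(responses.size)
end

# Made-up sample for illustration only.
sample = ["F", "male", "Fembot", "MYOB", "m ", "Alto", "Queer", "It depends", "female", "Convex"]
structured_share(sample) # => 0.4
```

The original answers stay untouched in storage; the canonical label is derived at analysis time, so nothing expressive is lost to get the structured subset.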

Optional Select: 60% of Facebook users select a relationship status

It's fine to mix and match here. Find the right approach for your users and your app's business objectives. Facebook, for instance, makes relationship status completely optional, but coercive for those who do opt in to setting a value. Most users do opt in: 60% of them select a relationship status.

We want everyone to feel excited about what we've built

The bottom line is that [read]

We want users to feel passionate about their involvement

[read]

Analytics, investments & monetization are based on a premise that data is accurate

[read]

BUT if the data has been collected by coercive approaches, the RISK is that--

this premise is garbage. Some people are lying, because lying has been made a requirement for getting past the barriers. So conclusions drawn from that bad data can misdirect decision-making about the next stage of development.

[Slide diagram. Labels: Checkbox, Radio, Select; Textarea, Text; Required; Corrective; Discretionary; Guided]

The restrictive options, that stuff at bottom left: those don't actually have to be marked required.

ADD COLUMN gender VARCHAR(6) NOT NULL DEFAULT 'female'

But the way we set up schema often embeds assumptions that we should, and we will. So we do. An attribute that is ambiguously named is destined to become a form field whose data wanders away from its original intent, and code that misunderstands how to use it. A field that's not allowed to be null is destined to be mandatory. A field that is assigned a maximum length is an ASSERTION that all possible values are knowable and fit within it.
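Those three assertions (not null, maximum length, default) can be made visible with a plain-Ruby stand-in for a column definition. This is a sketch, not a database API: `Column` and `store` are hypothetical, and real databases differ in whether an over-long value raises or is truncated.

```ruby
# Plain-Ruby stand-in for a column's constraints; not a real database API.
Column = Struct.new(:name, :limit, :null, :default) do
  # Returns the value the column would hold, or raises if it cannot hold it.
  def store(value)
    value = default if value.nil?
    raise ArgumentError, "#{name} cannot be null" if value.nil? && !null
    raise ArgumentError, "#{name} too long: #{value.inspect}" if limit && value && value.length > limit
    value
  end
end

# The column from the slide: six characters, never null, defaults to "female".
strict = Column.new("gender", 6, false, "female")
# A hypothetical unconstrained alternative: no limit, null allowed, no default.
loose  = Column.new("gender", nil, true, nil)

strict.store("male") # => "male"
strict.store(nil)    # => "female"  (silence is recorded as an answer)
loose.store(nil)     # => nil       (silence stays silence)
```

With the strict column, `strict.store("genderqueer")` raises, because eleven characters do not fit the assertion that six are enough. The loose column stores it as given.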

ADD COLUMN gender_identity VARCHAR NOT NULL

BOOM. This is the foundation for a whole different user experience. And the cool ninja move was that we decided to DO LESS. Make this stuff flexible upfront. Optimize storage later. Decide what's valid later.

[Slide diagram. Labels: Checkbox, Radio, Select; Textarea, Text; Required, Optional; Alteration; Guided]

So there you have it: a discretionary field. Whether to respond is left up to the user.

t.string :relationship_status, :null => true

As developers, we may look upon THIS expression as redundant, completely unnecessary. Duh, null is true by default. But making that explicit is a communication to the team, and to your future self. It's a statement of intent. It's documenting a product decision.

Facebook
• Single
• In a relationship
• Engaged
• Married
• It's complicated
• Open relationship

What would a canonical set of relationship statuses look like? Three years ago, Facebook figured this list was pretty good. Arguably pretty progressive too, right? Users disagreed. Strongly.

Facebook
• Single
• In a relationship
• Engaged
• Married
• It's complicated
• Open relationship
• Widowed
• Separated
• Divorced
• Civil union
• Domestic partnership

Under pressure, Facebook nearly doubled the options in just two years' time.

Google+
• Single
• In a relationship
• Engaged
• Married
• It's complicated
• Open relationship
• Widowed
• Separated (not included)
• Divorced (not included)
• Civil union
• Domestic partnership
• I don't want to say

While Google+ has largely adopted that list, it has not included Separated or Divorced. Notice that they also added something choice: opting out of labeling.

Allowing users to identify their relationships with labels of greater personal significance: that's being driven by people rejecting a user experience that isn't working for them.

It's a scope

How did some statuses seem universal while others weren't? Naming a thing creates scope. The assumed validity of a field's values gets constrained as soon as the field is named.

marital_status

[Slide: the same twelve options, under the field name marital_status]

"Marital status", for instance, might lead to a list similar to this one. A person is assumed to be either unmarried, preparing to be married, currently married, or formerly married.

legal_marital_status

[Slide: the same twelve options, under the field name legal_marital_status]

relationship_status

[Slide: the same twelve options, under the field name relationship_status]

Whereas "relationship status" might lead to a list more like this one. In which one either has a current relationship -- or is defined by the absence of any.

singleness_status

As soon as you name a field, you define its paradigm and possibilities. There are important differences in what each of these envisions and is able to measure. Naming fields -- with great specificity, upfront -- makes analyses more powerful later.

singleness_rating

No chance whatsoever ... Suuuuuper duper single

1000

If you change the name, you shift the paradigm and possibilities. There's an important difference in what these set out to measure. Naming fields -- with great specificity, upfront -- makes analyses more powerful later.

It's Complicated · In a Relationship · Married · Divorced · Widowed · Single

We go through life experiencing many relationships. They don't all have pre-determined labels. And new relationship identities don't necessarily leave old ones behind.

"I like to be truthful, and 'It's Complicated' is really deceiving. It is not complicated. I am separated from my husband, who I am still legally married to."
-- Facebook user

[READ] Why should anyone have to feel like they're required to disavow a relationship that's meaningful for them?

It cuts deeply sometimes. Is a widow only "Widowed" until she resumes dating? Why does she have to be confronted with a decision like that just to use someone's app?

Okay, this one. Spot the fatal flaw. [WAIT]

buffer overflow

An open relationship is definitionally a 1:Many join -- WITH the usual engineering understanding, that is, that the actual number of relationships may be 0, 1, OR many.

Choose (one): evasive, inauthentic

But it's not funny from the personal perspective. This schema forces the person to either LOOK EVASIVE or else BE INAUTHENTIC. This is what happens when we try to throw more labels at the problem, instead of re-examining the schema's assumptions.
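One way to re-examine that assumption is to stop treating status as a single column and model it as the 1:Many it really is. A minimal plain-Ruby sketch; `Person`, `Relationship`, and every name and label below are hypothetical, and a real app would more likely use its ORM's association mechanism.

```ruby
# Hypothetical model: a person has zero, one, or many relationships,
# each with its own free-form label, instead of a single status column.
Relationship = Struct.new(:label, :with)

class Person
  attr_reader :name, :relationships

  def initialize(name)
    @name = name
    @relationships = []
  end

  # Returns self so calls can be chained.
  def add_relationship(label, with: nil)
    @relationships << Relationship.new(label, with)
    self
  end

  # Labels only, for display; an empty list is a valid answer too.
  def relationship_labels
    @relationships.map(&:label)
  end
end

pat = Person.new("Pat")
        .add_relationship("married", with: "Sam")
        .add_relationship("in a relationship", with: "Lee")
pat.relationship_labels # => ["married", "in a relationship"]
```

Nobody has to choose between looking evasive and being inauthentic: the count can be 0, 1, or many, and a new relationship does not overwrite an old one.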

Facebook relationship status by user age

srsly?

When we don't give people means to express what's real and important for them, they hack around us. Are a quarter of Facebook's 13-year-olds really married? Probably not. What they probably have is a BEST FRIEND, and "married" is the nearest equivalent available for them to express "THIS PERSON is my most important relationship".

The dogged pursuit of nice, clean, CRUNCHABLE data takes us to such wrong places. And makes our users want to work against a system that doesn't acknowledge their lives' reality.

We build or break community with each line of code

As developers, our lines of code build -- or break -- an app's community. So it's worth making conscious choices as we go.

Modeling the real world is complex

[READ] And that's okay.

Assuming we know who users are surrenders opportunity to learn who they are

But [read]. Early constraints in schema NET crappy, misleading data. [beat] So keep constraints out of human and social schema, at least at first. Gather enough initial response to do some data mining. Watch that data for a while. Keep an eye out for emergent trends. What DISCOVERIES can you make? How can they shape the app's growth and evolution?
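Watching unconstrained data for emergent trends does not need heavy tooling to start. A rough Ruby sketch, assuming answers arrive as free-form strings (or nil); `emergent_values`, the `min_count` threshold, and the sample answers are all hypothetical.

```ruby
# Count free-form answers after light normalization (trim, lowercase),
# skip blanks, and return the values seen at least min_count times,
# most frequent first.
def emergent_values(responses, min_count: 2)
  tally = Hash.new(0)
  responses.each do |raw|
    key = raw.to_s.strip.downcase
    tally[key] += 1 unless key.empty?
  end
  tally.select { |_, n| n >= min_count }
       .sort_by { |value, n| [-n, value] }
       .to_h
end

# Made-up answers for illustration only.
answers = ["Separated", "separated ", nil, "Widowed", "best friends", "Best Friends", "separated", ""]
emergent_values(answers) # => {"separated"=>3, "best friends"=>2}
```

Labels nobody thought to offer surface on their own, and what to validate or suggest can be decided later from what people actually wrote.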

Quality

Data quality improves when lies are merely optional, not required.

Specificity

Data becomes like the real world itself: rich and specific. So we can unearth the patterns that are undetectable when data is generic.

Adaptability

We get to make discoveries, and re-envision possibilities.

We adapt quickly.

Loyalty

And their response to all this? Engagement. Passion. LOYALTY.

THOSE are foundations for great user experiences. [LOOK UP. FINALITY.]

@cczona · cczona@gmail.com · http://cczona.com

Questions? Conversation?

Oh Hell Yeah

Carina C. Zona

Resources

Data Science & Information Architecture
• Sociological normalization
• Database normalization
• Using Machine Learning On Social Networks To Figure Out What You Should Read On The Web
• NoSQL Data Modeling Techniques
• Data & Reality, by William Kent
• Bad Data Handbook, by Q. Ethan McCallum
• Data Science of the Facebook World

Relationships
• Does Facebook Hurt Relationships
• Facebook Adds LGBT-Friendly Relationship Status Options
• Facebook Targeting by Relationship Status & Workplace
• Your Facebook Relationship Status: It's Complicated
• Gay Marriage: The Database Engineering Perspective

Sex & Genders
• "Disalienation: Why Gender is a Text Field on Diaspora"
• "Gender & Drop Down Menus"
• "Sex & Gender"
• "Bucket Gender"
• Recommendations for Inclusive Data Collection of Trans People

Names
• Falsehoods Programmers Believe About Names
• Your Last Name Contains Invalid Characters
• W3C Internationalization: Personal Names Around the World
• Spanish Names
• Chinese Names

More
• Redesigning the Country Selector
• American Religious Identification Survey: Summary Report 2009
• Linguistic Potluck: Crowdsourcing Internationalization in Rails

Image Credits: Postsecret, Facebook, OKCupid, Google+, Metafilter, Diaspora, Flickr, FetLife, Kotangle, cutestpaw.com, hdwallpapers.in, xkcd

Many Thanks: Chiu-Ki Chan, Estelle Weyl, Heather Rivers, Heroku, Jeremy Dunck, Josh Susser, Michele Titolo, NIRD LLC, Renée De Voursney, Sarah Mei, San Francisco Sex Information, Yoz Grahame

Get in Touch

Carina C. Zona
@cczona
http://cczona.com
cczona@gmail.com
http://slideshare.net/cczona
http://linkedin.com/in/cczona


Data modeling is psychology, and it is philosophy. It reflects individuals' beliefs about reality, rather than reflecting reality itself. It rejects reality's complexity and richness. That's NOT what we set out to do. It's not the COMMUNITY we seek to build.

[READ]

"Be conservative in what you do; be liberal in what you accept from others."
-- Postel's Law

Postel's Law reminds us to [READ]. ARE we taking that wisdom to heart?

[PAUSE FOR END OF SECTION]

Schemas, plural

There are two kinds of schemas.

Mental Schema
• Pre-conceived ideas
• Framework for representing some aspect of the world
• System of organizing & perceiving new information

Mental schemas are [READ]

Database Schema
• Structure described in the database's language
• Blueprint for database construction
• Describes how the real world is being modeled

Database schemas are closely related to that. A database schema is a mental schema translated into blueprints for a database.

These are simply frontend manifestations of various individuals' MENTAL schemas. And [SLOW DOWN] when we look at them closely, we can see [BRIEF PAUSE] that schemas are the foundation [BRIEF PAUSE] for expressing

deeply intimate

things that are [READ]

important relationships

Schemas are the foundation for expressing things [READ]

self-image

Schemas are the foundation for expressing things [READ]

Schemas define the user experience

Schemas define user experience.

Our schemas & UX are leaving people behind

[read] We can fix that.

What benefit will the user notice?

By asking ourselves "What benefit will the USER NOTICE?" THIS is not equivalent to "How will the user benefit?" Because THAT is a question which grants us too much latitude: to ASSUME that what we want is, of course, to their benefit, because it's gonna help us deliver a product that's Awesome.

[READ AGAIN] focuses us on EXPERIENCES.

Evaluating from user perspective gives us focus

There isn't ONE WAY. There isn't ONE RIGHT ANSWER. Because the NATURE of our dilemma is that users vary. And so do their social worlds. So we always need to understand varying approaches. How they fit different contexts. What tradeoffs they bring. AND the obvious, boring question: which options best serve THIS app's business requirements?

[Slide diagram. Labels: Checkbox, Radio button, Select menu, Ranges; Text Input, Textarea; Coerced, Discretionary; Guided, Corrective]

Checkboxes, radio buttons, select menus, ranges: these each imply that EVERY possible value can be -- and is -- included. If any value isn't included there, it's the real world being rejected because it didn't match up with our mental schema.

[Slide diagram. Labels: Checkbox, Radio, Select; Text Input, Textarea; Coerced, Discretionary; Guided, Corrective]

Checking a box is a one-step action. Entering a text string is not. So a freeform solution, such as a textarea or text input, isn't automatically exciting. BUT it can bring visible, meaningful benefit.

11 years

At MetaFilter, gender has been a text field for over a decade.

"…the early crowd at MeFi were often programmers, and they hated the idea of dirty data collection…"
-- Matt Haughey, founder

Initially, some SHUDDERED at the thought. Because [READ]

All girl

XY

Fella

Queer

Fembot

Alto

It depends

MYOB

Dangly bits

Chicklet

Innie not outie

Convex

Sideburns

Ambisextrous

Member of the patriarchy

For about 5 seconds. Then they jumped on board. Because they could be creative, and silly.

I speak using the male gender when required by language

50% quintessential tomboy, 50% total girly-girl

AND because they could express this THING about themselves fully. With authentic voice.

Transgender

Genderqueer

[PAUSE]

"The freeform gender field is something I cherish about Metafilter."
-- MeFi user

That text field grew into a beloved institution. Whatever you, as a user, choose to put into that field says something revealing about who YOU are.

"It was one of the earliest indications I'd landed in the right place."
-- MeFi user

That you're allowed to put in ANYTHING -- or put in NOTHING AT ALL -- says something revealing about how MetaFilter envisions community. That schema's trust in users was the foundation for MetaFilter users to ASK to share themselves even more.

In 2010, Diaspora likewise turned gender into a text field. Just like on MetaFilter, users felt set free by it.

There's an important distinction here, though. MetaFilter is closed source, it's one developer, and it's entirely in English. Diaspora needed to meet a different set of needs. It's open source, and it's international.

Not amused

So some of its developers are not amused. It's fair to object that this approach wreaks chaos on internationalization of pronouns. [BEAT] And here's what I've got to say to that:

Internationalization is hellish, period.

Internationalization of pronouns is hellish, period. You get no safety net in this. Constraining gender options doesn't solve translation of languages. Don't set yourself up for trying to deal with internationalization this way. It will fail you.

What's your legal gender?
• Indeterminate
• X
• Sex Not Specified

And if you really think that boiling this question down to "What's your legal gender?" is an easy out, let's talk about internationalization of that too. Because this is the year that "indeterminate" has become a legal gender on German birth certificates, and that "X" is a legal gender on Australian passports. In New South Wales, Australia, "Sex Not Specified" is yet another legally recognized gender. Binary gender is, today, APP FAIL.

"…the most complicated thing I've ever spent a lot of time learning about. And I've spent a lot of time learning about quantum mechanics."
-- Randall Munroe, xkcd

Randall Munroe, best known for xkcd, has examined the issue of pronouns many times, in depth, for English-language projects. He says it's [READ]. So here's what he ultimately concluded:

Ask

Ask.

Which pronouns do you prefer?
• he / him / himself / his / his
• she / her / herself / hers / her
• they / them / themself / theirs / their
• it / it / itself / its / its
• by name

Asking straight up "Which pronouns do you prefer?" is truly the best he could come up with.

Which pronouns do you prefer?
• he / him / himself / his / his
• she / her / herself / hers / her
• they / them / themself / theirs / their
• personal name
• other ____________

Or maybe slightly refining, like these.

And I know you may be looking at that 3rd row and protesting.

Our English teachers told us "they" and "their" aren't singular, right? Nope. They're plural AND SINGULAR, and have been for 400 years. Take the word of Jane Austen, Shakespeare, Chaucer, Lewis Carroll. Modern English authorities agree on this one too. It's excellent English -- and excellent SOCIAL -- to use it when people ask us to.

Even Facebook has taken note that it's okay. They've been using this construction for at least 5 years, and are doing fine :-)
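The "ask" approach maps onto a small data structure. The Ruby sketch below stores the five forms in the order the slide lists them and falls back to the person's name; `PronounSet`, `pronouns_for`, `by_name`, and the sample sentence are hypothetical, and the custom set in the usage note is only an example of what someone might enter under "other".

```ruby
# Five forms per set, in the slide's order: subject, object, reflexive,
# possessive pronoun, possessive determiner.
PronounSet = Struct.new(:subject, :object, :reflexive, :possessive, :determiner)

PRONOUNS = {
  "he"   => PronounSet.new("he", "him", "himself", "his", "his"),
  "she"  => PronounSet.new("she", "her", "herself", "hers", "her"),
  "they" => PronounSet.new("they", "them", "themself", "theirs", "their")
}.freeze

# "personal name" option: use the name wherever a pronoun would go.
def by_name(name)
  PronounSet.new(name, name, name, "#{name}'s", "#{name}'s")
end

# A known key, a user-supplied set for "other", or else the person's name.
def pronouns_for(choice, name:, custom: nil)
  return custom if choice == "other" && custom
  PRONOUNS.fetch(choice) { by_name(name) }
end

# Uppercase only the first character, leaving the rest of a name intact.
def sentence_start(word)
  word[0].upcase + word[1..-1]
end

def updated_profile(set)
  "#{sentence_start(set.subject)} updated #{set.determiner} profile."
end

updated_profile(pronouns_for("they", name: "Alex"))          # => "They updated their profile."
updated_profile(pronouns_for("personal name", name: "Alex")) # => "Alex updated Alex's profile."
```

For "other", the caller passes whatever the user typed, for example `pronouns_for("other", name: "Alex", custom: PronounSet.new("ze", "hir", "hirself", "hirs", "hir"))`. The sample sentence is in the past tense on purpose, so verb agreement does not vary by pronoun; as the talk says, none of this solves internationalization.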

"Home is the place where, when you have to go there, they have to take you in."
-- Robert Frost

We aspire to make our social apps feel like home. We can do that by making development choices that say, "Hey user, we GET you. Be as individual and messy as you want to. We can handle _you_ being _you_."

Social Research

When collecting data on people, we're in a different realm: social sciences. If you want to run useful analytics about personal attributes and behavior, then data collection needs to meet at least the two minimum criteria for social science research.

Exhaustive: Every possible option

A field's values must include every possible option.

Mutually Exclusive: No overlap exists between them

And the field's options must be mutually exclusive.

How many social apps have both of those bare minimum criteria covered?
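Both criteria can be checked mechanically against real answers. In this Ruby sketch each respondent is represented by the labels that truthfully apply to them; the `audit` helper, the option list, and the four respondents are hypothetical examples built from situations the talk describes.

```ruby
# An option list is exhaustive if every respondent finds at least one option
# that applies, and mutually exclusive if no respondent finds more than one.
def audit(options, respondents)
  matches = respondents.map { |labels| labels & options }
  {
    exhaustive: matches.none?(&:empty?),
    mutually_exclusive: matches.none? { |m| m.size > 1 }
  }
end

options = ["Single", "In a relationship", "Married", "It's complicated"]
respondents = [
  ["Married"],
  ["Married", "Separated"],          # separated, but still legally married
  ["Married", "In a relationship"],  # open relationship
  ["Widowed"]                        # no option offered at all
]

audit(options, respondents)
# => {:exhaustive=>false, :mutually_exclusive=>false}
```

A single select over this list fails both tests with only four people in the room.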

We've seen this one enough. At least that set of values seems covered fully.

But it's not.

Still lacking.

Trying hard. But still inadequate to express the real world's complexity.

We do realize that the real world has great variety. Of course we do. But it's hard to know how to handle it.

What user experience does this schema drive us toward?

It's easy to figure, "Eh, we can refactor later." But initial choices set differing user experiences in motion. Thinking at the outset about the real world's variety and complexity starts us asking early questions that set the foundation for social app-building. Ask: "What user experience does this schema drive us toward?"

cczona

Data doesnrsquot have to be for analysis

Its easy to get into the habit of structuring data for easy analysis But we can choose to look at human data from a different perspective Step back Wallow in the users perspective

cczona

Data can be sheerexpressiveness

Data can be sheer expressiveness Data that has character individualism distinctiveness

cczona

What we want What we get

bull Structured

bull Predictablebull Validations exceptions

bull Conditionals partials

bull Relational

bull Indexed bull Premature optimization

bull Exhaustivebull Cultural variability

bull Individual POV

bull Moving target

bull Easy analytics

bull Data-driven decisionsbull Decisions based on false

premises

As developers we have a vision of what a good codebase should be and not be

[READ]

In the most sensible of ways we often arrive at solutions that are factually TRUTHY while far removed from real life utility

cczona

What is your religion if any

ARIS is the largest ongoing survey of Americans religious identification It asks this simple OPEN-ENDED question [slide]

Which nets over 100 unique answers Which if youre making a form based on that is tricky Theres no form element that makes it easy for our users to pick themselves out of a list of so many possibilities

cczona

ARIS found though that these could be compressed into 13 major categories More manageable list right We could use that for a form But eh a lot of those are edge cases Wed rather want to focus on genuinely major groupings

cczona

What is your religion if any

Christian 76

Other 4

None 15

Dont Know or Refused 5

Which brings us down to this At least a quarter of Americans are Christian Done Every other religion would just be clutter Edge cases

And then theres this sort of crummy data with it Wed probably assign nil value for more than one of these categories right

cczona

What is your religion

Christian 76

Other 4

na 20

Fixed Which focuses attention on a problem here 1 in 5 are not useful answers from an advertisers perspective or for out own analytics The NILS -- gotta go

cczona

What is your religion

Christian 76

Other 24

So what were left with is a good clear list It covers ALL the big stuff When you get reductive enough for Americans religion it a binary Which from a storage standpoint is great Booleans Score

cczona

Religion

But we dont do THIS It covers the biggest categories But oh how it leaves people out

cczona

People arent edge cases

People arent edge cases

And theyre pushing back on apps that treat them that way

[LONG BEAT LONG] [BREATH] [CHANGE GEARS]

So Reductive has big problems So does scaling upward

cczona

Balancing between approaches

As engineers its instinctually uncomfortable to DELIBERATELY NOT STRUCTURE DATA for easy analysis I feel ya I really do This freaks me too But again the foundational question is What benefit will the user notice What identities are okay with us

cczona

And if necessary we CAN strike a middleground This is where guided response comes in Autosuggest

cczona

CheckboxRadioSelect

TextareaText

Required OptionalAutosuggest

Minimal Suggest

When theres a subset of values that youre most interested in do minimal suggest instead Autosuggest using just the handful of values you care about

Structured data from those who want to give it Free form to incite expressiveness in those who want THAT

Unguided TextOf those who use MetaFilters gender field

40 of responses are f m female male

[read] So structured data IS there This can be a balanced solution in many cases where youre willing to tolerate some ambiguity Of course there are tradeoffsData quantity is lower Freed to opt out of proving personal info many doOn the other hand data quality should improve

Optional Select60 of Facebook users select a relationship status

Its fine to mix and match here Find the right approach for your users and your apps business objectives Facebook for instance makes relationship status completely optional but coercive for those who do opt-in to setting a value Most users do opt-in 60 of them select a relationship status

cczona

We want everyone to feel excited what weve

built

The bottom line is that [read]

cczona

We want users to feel passionate about their

involvement

[read]

cczona

Analytics investments amp monetization are

based on a premise that data is accurate

[read]

BUT the data has been collected by coercive approaches the RISK is that--

cczona

this premise is garbage Some people are lying because lying has been made requirement for getting past the barriers So conclusions drawn from that bad data can misdirect decision-making about the next stage of development

cczona

CheckboxRadioSelect

TextareaText

Required

Corrective

DiscretionaryGuided

The restrictive options that stuff at bottom left those dont actually have to be marked required

cczona

ADD column gender NOT NULL VARCHAR(6) DEFAULT female

But the way we setup schema often embeds assumptions that we should and we will So we do An attribute that is ambiguously named is destined to become a form field whose data wanders away from its original intent and code that misunderstands how to use it A field thats not allowed to be null is destined to be mandatory A field that is assigned a maximum length is an ASSERTION that all possible values are knowable and fit within it

cczona

ADD column gender_identity NOT NULL VARCHAR

BOOM This is foundation for a whole different user experience And the cool ninja move was that we decided to DO LESS Make this stuff flexible upfront Optimize storage later Decide whats valid later

cczona

CheckboxRadioSelect

TextareaText

Required Optional

Alteration

Guided

So there you have it a discretionary field Whether to respond is left up to the user

cczona

tstring relationship_status null =gt true

As developers we may upon THIS expression as redundant completely unnecessary Duh null is true by default But making that explicit is a communication to the team and to your future self Its a statement of intent Its documenting a product decision

cczona

FacebookSingle

In a relationship

Engaged

Married

Its complicated

Open relationship

What would a canonical set of relationship statuses look like Three years ago Facebook figured this list was pretty good Arguably pretty progressive too right Users disagreed Strongly

cczona

FacebookSingle

In a relationship

Engaged

Married

Its complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

Under pressure Facebook nearly doubled the options in just two years time

cczona

Google+Single

In a relationship

Engaged

Married

Its complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I dont want to say

While Google+ has largely adopted that list it has not included Separated or Divorced Notice that they also added something choice Opt out of labeling

cczona

Allowing users to identify their relationships with labels of greater personal significanceThats being driven by people rejecting a user experience that isnt working for them

cczona

Its a scope

How did some status seem universal while others werent Naming a thing creates scope The assumed validity of a fields values get constrained as soon as the field is named

cczona

marital_statusSingle

In a relationship

Engaged

Married

Its complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I dont want to say

Marital status for instance might lead to a list similar to this one A person is assumed to be either unmarried preparing to be married currently married or formerly married

cczona

legal_marital_statusSingle

In a relationship

Engaged

Married

Its complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I dont want to say

Marital status for instance might lead to a list similar to this one A person is assumed to be either unmarried preparing to be married currently married or formerly married

cczona

relationship_statusSingle

In a relationship

Engaged

Married

Its complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I dont want to say

Whereas relationship status might lead to a list more like this one In which one either has a current relationship -- or is defined by the absence of any

cczona

singleness_status

As soon as you name a field you define its paradigm and possibilities There are important difference in what each of these envision and are able to measure Naming fields -- with great specificity upfront -- Makes analyses more powerful later

cczona

singleness_rating

No chance whatsoever Suuuuuper duper single

1000

If you change the name you shift the paradigm and possibilities Theres important difference in these what these collect set out to measure Naming fields -- With great specificity upfront --Makes analyses more powerful later

cczona

Its ComplicatedIn a Relationship

Married Divorced

WidowedSingle

We go through life experiencing many relationships They dont all have pre-determined labels And new relationship identities dont necessarily leave old ones behind

cczona

I like to be truthful and Its Complicated is really deceiving It is not complicated I am separated from my husband who I am still legally married to

mdashFacebook user

[READ] Why should anyone have to feel like theyre required to disavowing a relationship thats meaningful for them

It cuts deeply sometimes Is a widow only Widowed until she resumes dating Why does she have to be confronted with a decision like that just to use someones app

cczona

Okay this one Spot the fatal flaw [WAIT]

cczona

buffer overflow

An open relationship is definitionally a 1Many join mdash WITH the usual engineering understanding that is that actual number of relationships may be 0 1 OR many

cczona

Choose (one) evasive inauthentic

But its not funny from the personal perspective This schema forces the person to either LOOK EVASIVE or else BE INAUTHENTIC This is what happens when we try to throw more labels at the problem instead of re-examining the schemas assumptions

cczona

Facebook relationship status by user age

srslyWhen we dont give people means to express whats real and important for them they hack around us Are a quarter of Facebooks 13 year olds really married Probably not What they probably have is a BEST FRIEND and married is nearest equivalent available for them to express THIS PERSON is my most important relationship

The dogged pursuit of nice clean CRUNCHABLE data takes us to such wrong places And makes our users want to work against a system that doesnt acknowledge their lives reality

cczona

We build or break community with each

line of code

As developers our lines of code build -- or break -- an apps community So its worth making conscious choices as we go

cczona

Modeling the real world is complex

[READ] And thats okay

cczona

Assuming we know who users are surrenders

opportunity to learn who they are

But [read] Early constraints in schema NET crappy misleading data[beat]So keep constraints out of human and social schema at least at first Gather enough initial response to do some data mining Watch that data for a while Keep an eye out for emergent trends What DISCOVERIES can you make How can they shape the apps growth and evolution

Quality

Data quality improves when lies are merely optional, not required.

Specificity

Data becomes like the real world itself: rich and specific. So we can unearth the patterns that are undetectable when data is generic.

Adaptability

We get to make discoveries and re-envision possibilities.

We adapt quickly.

Loyalty

And their response to all this? Engagement. Passion. LOYALTY.

THOSE are foundations for great user experiences. [LOOK UP. FINALITY.]

@cczona
cczona@gmail.com
http://cczona.com

Questions? Conversation?

Oh Hell Yeah

Carina C. Zona

Resources

Data Science & Information Architecture

• Sociological normalization
• Database normalization
• Using Machine Learning On Social Networks To Figure Out What You Should Read On The Web
• NoSQL Data Modeling Techniques
• Data & Reality, by William Kent
• Bad Data Handbook, by Q. Ethan McCallum
• Data Science of the Facebook World

Relationships

• Does Facebook Hurt Relationships?
• Facebook Adds LGBT-Friendly Relationship Status Options
• Facebook Targeting by Relationship Status & Workplace
• Your Facebook Relationship Status: It's Complicated
• Gay Marriage: The Database Engineering Perspective

Sex & Genders

• "Disalienation: Why Gender is a Text Field on Diaspora"
• "Gender & Drop Down Menus"
• "Sex & Gender"
• "Bucket Gender"
• Recommendations for Inclusive Data Collection of Trans People

Names

• Falsehoods Programmers Believe About Names
• Your Last Name Contains Invalid Characters
• W3C Internationalization: Personal Names Around the World
• Spanish Names
• Chinese Names

More

• Redesigning the Country Selector
• American Religious Identification Survey Summary Report 2009
• Linguistic Potluck: Crowdsourcing Internationalization in Rails

Image Credits

Postsecret
Facebook
OKCupid
Google+
Metafilter
Diaspora
Flickr
FetLife
Kotangle
cutestpaw.com
hdwallpapers.in
xkcd

Many Thanks

Chiu-Ki Chan
Estelle Weyl
Heather Rivers
Heroku
Jeremy Dunck
Josh Susser
Michele Titolo
NIRD LLC
Renée De Voursney
Sarah Mei
San Francisco Sex Information
Yoz Grahame

Get in Touch

Carina C. Zona
@cczona
http://cczona.com
cczona@gmail.com
http://slideshare.net/cczona
http://linkedin.com/in/cczona


[READ]

Be conservative in what you do, be liberal in what you accept from others.

-- Postel's Law

Postel's Law reminds us to [READ]. ARE we taking that wisdom to heart?

[PAUSE FOR END OF SECTION]

Schemas, plural

There are two kinds of schemas.

Mental Schema

• Pre-conceived ideas
• Framework for representing some aspect of the world
• System of organizing & perceiving new information

Mental schemas are [READ]

Database Schema

• Structure described in the database's language
• Blueprint for database construction
• Describes how the real world is being modeled

Database schemas are closely related to that. A database schema is a mental schema translated into blueprints for a database.

These are simply frontend manifestations of various individuals' MENTAL schemas. And [SLOW DOWN] when we look at them closely, we can see [BRIEF PAUSE] that schemas are the foundation [BRIEF PAUSE] for expressing

deeply intimate

things that are [READ]

important relationships

Schemas are the foundation for expressing things [READ]

self-image

Schemas are the foundation for expressing things [READ]

Schemas define the user experience

Schemas define user experience.

Our schemas are leaving people behind

& UX

[read] We can fix that.

What benefit will the user notice?

By asking ourselves, "What benefit will the USER NOTICE?" THIS is not equivalent to "How will the user benefit?" Because THAT is a question which grants us too much latitude to ASSUME that what we want is, of course, to their benefit, because it's gonna help us deliver a product that's Awesome.

[READ AGAIN] focuses us on EXPERIENCES.

Evaluating from user perspective gives us focus

There isn't ONE WAY. There isn't ONE RIGHT ANSWER. Because the NATURE of our dilemma is that users vary. And so do their social worlds. So we always need to understand varying approaches: how they fit different contexts, what tradeoffs they bring, AND the obvious boring question: which options best serve THIS app's business requirements?

Checkbox, Radio button, Select menu, Ranges

Coerced, Discretionary, Guided, Corrective

Text Input, Textarea

Checkboxes, radio buttons, select menus, ranges: these each imply that EVERY possible value can be -- and is -- included. If any value isn't included there, it's the real world being rejected because it didn't match up with our mental schema.

Checkbox, Radio, Select

Coerced, Discretionary, Guided, Corrective

Text Input, Textarea

Checking a box is a one-step action. Entering a text string is not. So a freeform solution, such as a textarea or text input, isn't automatically exciting. BUT it can bring visible, meaningful benefit.

11 years

At MetaFilter, gender has been a text field for over a decade.

"…the early crowd at MeFi were often programmers, and they hated the idea of dirty data collection…"

-- Matt Haughey, founder

Initially, some SHUDDERED at the thought. Because [READ]

All girl
XY
Fella
Queer
Fembot
Alto
It depends
MYOB
Dangly bits
Chicklet
Innie, not outie
Convex
Sideburns
Ambisextrous
Member of the patriarchy

For about 5 seconds. Then they jumped on board. Because they could be creative and silly.

I speak using the male gender when required by language
50% quintessential tomboy, 50% total girly-girl

AND because they could express this THING about themselves fully. With authentic voice.

Transgender
Genderqueer

[PAUSE]

"The freeform gender field is something I cherish about Metafilter."

-- MeFi user

That text field grew into a beloved institution. Whatever you, as a user, choose to put into that field says something revealing about who YOU are.

"It was one of the earliest indications I'd landed in the right place."

-- MeFi user

That you're allowed to put in ANYTHING -- or put in NOTHING AT ALL -- says something revealing about how MetaFilter envisions community. That schema's trust in users was the foundation for MetaFilter users to ASK to share themselves even more.

In 2010, Diaspora likewise turned gender into a text field. Just like on MetaFilter, users felt set free by it.

There's an important distinction here, though. MetaFilter is closed source, it's one developer, and it's entirely in English. Diaspora needed to meet a different set of needs. It's open source, and it's international.

Not amused

So some of its developers are not amused. It's fair to object that this approach wreaks chaos on internationalization of pronouns. [BEAT] And here's what I've got to say to that:

Internationalization is hellish, period.

Internationalization of pronouns is hellish, period. You get no safety net in this. Constraining gender options doesn't solve translation of languages. Don't set yourself up for trying to deal with internationalization this way. It will fail you.

What's your legal gender?

• Indeterminate
• X
• Sex Not Specified

And if you really think that boiling this question down to "What's your legal gender?" is an easy out, let's talk about internationalization of that too. Because this is the year that "indeterminate" has become a legal gender on German birth certificates, and that "X" is a legal gender on Australian passports. In New South Wales, Australia, "Sex Not Specified" is yet another legally-recognized gender. Binary gender is, today, APP FAIL.

"…the most complicated thing I've ever spent a lot of time learning about. And I've spent a lot of time learning about quantum mechanics."

-- Randall Munroe, xkcd

Randall Munroe, best known as xkcd, has examined the issue of pronouns many times, in depth, for English-language projects. He says it's [READ]. So here's what he ultimately concluded:

Ask

Ask.

Which pronouns do you prefer?

he / him / himself / his / his
she / her / herself / hers / her
they / them / themself / theirs / their
it / it / itself / its / its
by name

Asking straight up, "Which pronouns do you prefer?" is truly the best he could come up with.

Which pronouns do you prefer?

he / him / himself / his / his
she / her / herself / hers / her
they / them / themself / theirs / their
personal name
other ____________

Or maybe slightly refining, like these.

And I know you may be looking at that 3rd row and protesting.

Our English teachers told us "they" and "their" aren't singular, right? Nope. They're plural AND SINGULAR, and have been for 400 years. Take the word of Jane Austen, Shakespeare, Chaucer, Lewis Carroll. Modern English authorities agree on this one too. It's excellent English -- and excellent SOCIAL -- to use it when people ask us to.
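One way the refined pronoun question could be carried into code is sketched below in plain Ruby. The three pronoun sets come from the slide; the `pronouns_for` helper, its keyword arguments, and the fallback to the person's name are assumptions made for illustration.

```ruby
# Pronoun sets taken from the slide, in the order
# subject / object / reflexive / possessive pronoun / possessive adjective.
PRONOUN_SETS = {
  "he"   => %w[he him himself his his],
  "she"  => %w[she her herself hers her],
  "they" => %w[they them themself theirs their]
}.freeze

# Hypothetical helper: returns the five forms for a user's choice.
# choice: "he", "she", "they", "name", or "other"
# custom: the five forms the user typed in when choice is "other"
# Anything unrecognized falls back to using the person's name.
def pronouns_for(choice, name:, custom: nil)
  case choice
  when "name"  then [name, name, name, "#{name}'s", "#{name}'s"]
  when "other" then custom || pronouns_for("name", name: name)
  else PRONOUN_SETS.fetch(choice) { pronouns_for("name", name: name) }
  end
end

subject, _object, _reflexive, _possessive, adjective =
  pronouns_for("they", name: "Robin")
puts "#{subject.capitalize} updated #{adjective} profile."
```

Falling back to the name rather than guessing a pronoun keeps the app from asserting something the user never said.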

Even Facebook has taken note that it's okay. They've been using this construction for at least 5 years and are doing fine :-)

"Home is the place where, when you have to go there, they have to take you in."

-- Robert Frost

We aspire to make our social apps feel like home. We can do that by making development choices that say, "Hey user, we GET you. Be as individual and messy as you want to. We can handle _you_ being _you_."

Social Research

When collecting data on people, we're in a different realm: social sciences. If you want to run useful analytics about personal attributes and behavior, then data collection needs to meet at least the two minimum criteria for social science research.

Exhaustive: Every possible option

A field's values must include every possible option.

Mutually Exclusive: No overlap exists between them

And the field's options must be mutually exclusive.

How many social apps have both of those bare minimum criteria covered?
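Those two criteria can be checked mechanically against what people actually report. The sketch below is a hedged illustration in plain Ruby: the `audit_options` helper, the option list, and the sample answers are all hypothetical.

```ruby
# Hypothetical check of a closed option list against what people report.
# options:  the values a form offers
# observed: one array per respondent, listing every value (offered or
#           free-text) that the respondent says applies to them
def audit_options(options, observed)
  # Exhaustive fails if anyone reported a value the form does not offer.
  unmatched = observed.flatten.uniq - options
  # Mutually exclusive fails if anyone fits more than one offered value.
  overlapping = observed.select { |answers| (answers & options).size > 1 }
  {
    exhaustive: unmatched.empty?,
    mutually_exclusive: overlapping.empty?,
    unmatched: unmatched,
    overlapping: overlapping
  }
end

options  = ["Single", "In a relationship", "Married"]
observed = [["Married"], ["Married", "In a relationship"], ["Separated"]]
report = audit_options(options, observed)
puts report.inspect
```

A check like this only works if free-text answers were collected at some point, which is one more reason to gather unconstrained responses first.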

We've seen this one enough. At least that set of values seems covered fully.

But it's not.

Still lacking.

Trying hard. But still inadequate to express the real world's complexity.

We do realize that the real world has great variety. Of course we do. But it's hard to know how to handle it.

What user experience does this schema drive us toward?

It's easy to figure, "Eh, we can refactor later." But initial choices set differing user experiences in motion. Thinking at the outset about the real world's variety and complexity starts us asking early questions that set the foundation for social app-building. Ask: "What user experience does this schema drive us toward?"

Data doesn't have to be for analysis

It's easy to get into the habit of structuring data for easy analysis. But we can choose to look at human data from a different perspective. Step back. Wallow in the user's perspective.

Data can be sheer expressiveness

Data can be sheer expressiveness. Data that has character, individualism, distinctiveness.

What we want

• Structured
• Predictable
• Relational
• Indexed
• Exhaustive
• Easy analytics
• Data-driven decisions

What we get

• Validations, exceptions
• Conditionals, partials
• Premature optimization
• Cultural variability
• Individual POV
• Moving target
• Decisions based on false premises

As developers, we have a vision of what a good codebase should be, and not be.

[READ]

In the most sensible of ways, we often arrive at solutions that are factually TRUTHY while far removed from real-life utility.

What is your religion, if any?

ARIS is the largest ongoing survey of Americans' religious identification. It asks this simple, OPEN-ENDED question: [slide]

Which nets over 100 unique answers. Which, if you're making a form based on that, is tricky. There's no form element that makes it easy for our users to pick themselves out of a list of so many possibilities.

ARIS found, though, that these could be compressed into 13 major categories. More manageable list, right? We could use that for a form. But eh, a lot of those are edge cases. We'd rather focus on genuinely major groupings.

What is your religion, if any?

Christian 76%
Other 4%
None 15%
Don't Know or Refused 5%

Which brings us down to this. At least three quarters of Americans are Christian. Done. Every other religion would just be clutter. Edge cases.

And then there's this sort of crummy data with it. We'd probably assign a nil value for more than one of these categories, right?

What is your religion?

Christian 76%
Other 4%
n/a 20%

Fixed. Which focuses attention on a problem here: 1 in 5 are not useful answers from an advertiser's perspective, or for our own analytics. The NILS -- gotta go.

What is your religion?

Christian 76%
Other 24%

So what we're left with is a good, clear list. It covers ALL the big stuff. When you get reductive enough, for Americans, religion is a binary. Which from a storage standpoint is great. Booleans! Score!
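The arithmetic of that reduction is easy to reproduce. The sketch below uses the percentages from the slides; the `collapse` helper is a hypothetical name introduced only to show how each merge step erases labels.

```ruby
# Percentages from the slides above.
survey = {
  "Christian" => 76,
  "Other" => 4,
  "None" => 15,
  "Don't Know or Refused" => 5
}

# Merge the named categories into a single bucket called `into`.
def collapse(data, merged_keys, into)
  kept = data.reject { |key, _| merged_keys.include?(key) }
  merged_total = data.values_at(*merged_keys).sum
  kept.merge(into => kept.fetch(into, 0) + merged_total)
end

step1 = collapse(survey, ["None", "Don't Know or Refused"], "n/a")
step2 = collapse(step1, ["Other", "n/a"], "Other")

puts step1.inspect
puts step2.inspect
puts "Share without a label of their own: #{100 - step2["Christian"]}%"
```

The totals still add up to 100 at every step, which is exactly why the loss is easy to miss: the numbers look clean while nearly a quarter of respondents no longer have a category that describes them.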

Religion

But we don't do THIS. It covers the biggest categories. But oh, how it leaves people out.

People aren't edge cases

People aren't edge cases.

And they're pushing back on apps that treat them that way.

[LONG BEAT. LONG.] [BREATH] [CHANGE GEARS]

So, reductive has big problems. So does scaling upward.

Balancing between approaches

As engineers, it's instinctually uncomfortable to DELIBERATELY NOT STRUCTURE DATA for easy analysis. I feel ya. I really do. This freaks me out too. But again, the foundational question is "What benefit will the user notice?" What identities are okay with us?

And if necessary, we CAN strike a middle ground. This is where guided response comes in: autosuggest.

Checkbox, Radio, Select

Textarea, Text

Required, Optional, Autosuggest

Minimal Suggest

When there's a subset of values that you're most interested in, do minimal suggest instead. Autosuggest using just the handful of values you care about.

Structured data from those who want to give it. Free form to incite expressiveness in those who want THAT.
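Minimal suggest can be sketched in a few lines of plain Ruby. Everything here is an assumption for illustration: the `SUGGESTED` values, the `suggest` and `accept` helper names, and the choice of prefix matching are not taken from any particular app.

```ruby
# Hypothetical "minimal suggest": offer a handful of values as the user
# types, but accept whatever they finally enter, including nothing at all.
SUGGESTED = ["female", "male", "genderqueer", "transgender"].freeze

# Returns the suggested values that start with what has been typed so far.
def suggest(prefix, suggestions = SUGGESTED)
  return [] if prefix.nil? || prefix.strip.empty?
  needle = prefix.strip.downcase
  suggestions.select { |value| value.start_with?(needle) }
end

# Stores the user's own words; only blank input becomes nil.
def accept(input)
  text = input.to_s.strip
  text.empty? ? nil : text
end

puts suggest("ge").inspect
puts accept("Innie, not outie").inspect
puts accept("   ").inspect
```

The suggestion list nudges toward values that are easy to analyze, while `accept` never rejects an answer for being off the list.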

Unguided Text

Of those who use MetaFilter's gender field, 40% of responses are "f", "m", "female", "male"

[read] So structured data IS there. This can be a balanced solution in many cases where you're willing to tolerate some ambiguity. Of course there are tradeoffs. Data quantity is lower. Freed to opt out of providing personal info, many do. On the other hand, data quality should improve.

Optional Select

60% of Facebook users select a relationship status

It's fine to mix and match here. Find the right approach for your users and your app's business objectives. Facebook, for instance, makes relationship status completely optional, but coercive for those who do opt in to setting a value. Most users do opt in: 60% of them select a relationship status.

We want everyone to feel excited about what we've built

The bottom line is that [read]

We want users to feel passionate about their involvement

[read]

Analytics, investments & monetization are based on a premise that data is accurate

[read]

BUT if the data has been collected by coercive approaches, the RISK is that--

this premise is garbage. Some people are lying because lying has been made a requirement for getting past the barriers. So conclusions drawn from that bad data can misdirect decision-making about the next stage of development.

Checkbox, Radio, Select

Textarea, Text

Required

Corrective

Discretionary, Guided

The restrictive options, that stuff at bottom left: those don't actually have to be marked required.

ADD COLUMN gender NOT NULL VARCHAR(6) DEFAULT 'female'

But the way we set up schema often embeds assumptions that we should, and we will. So we do. An attribute that is ambiguously named is destined to become a form field whose data wanders away from its original intent, and code that misunderstands how to use it. A field that's not allowed to be null is destined to be mandatory. A field that is assigned a maximum length is an ASSERTION that all possible values are knowable and fit within it.
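To make those embedded assumptions concrete, here is a plain-Ruby stand-in for two column definitions. It only imitates the constraints for illustration and is not how any particular database behaves; the `Column` struct, its `store` method, and the "flexible" definition (no length limit, nullable, no default) are assumptions for the sketch.

```ruby
# Plain-Ruby imitation of column constraints, for illustration only.
Column = Struct.new(:name, :max_length, :nullable, :default) do
  # Returns [stored_value, problem]; problem is nil when the value fits.
  def store(value)
    value = default if value.nil?
    return [nil, "#{name} cannot be blank"] if value.nil? && !nullable
    return [nil, nil] if value.nil?
    if max_length && value.length > max_length
      return [nil, "#{name} is longer than #{max_length} characters"]
    end
    [value, nil]
  end
end

strict   = Column.new("gender", 6, false, "female")
flexible = Column.new("gender_identity", nil, true, nil)

p strict.store(nil)             # silence is turned into an answer
p strict.store("genderqueer")   # a real answer is turned away
p flexible.store(nil)           # silence stays silence
p flexible.store("genderqueer") # the user's own word is kept
```

With the strict definition, a user who says nothing is recorded as "female" and a user who says "genderqueer" is rejected; the flexible definition stores exactly what was or was not said.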

ADD column gender_identity NOT NULL VARCHAR

BOOM. This is the foundation for a whole different user experience. And the cool ninja move was that we decided to DO LESS. Make this stuff flexible upfront. Optimize storage later. Decide what's valid later.

Checkbox, Radio, Select

Textarea, Text

Required, Optional

Alteration

Guided

So there you have it: a discretionary field. Whether to respond is left up to the user.

t.string :relationship_status, :null => true

As developers, we may look upon THIS expression as redundant, completely unnecessary. "Duh, null is true by default." But making that explicit is a communication to the team and to your future self. It's a statement of intent. It's documenting a product decision.

Facebook

Single
In a relationship
Engaged
Married
It's complicated
Open relationship

What would a canonical set of relationship statuses look like? Three years ago, Facebook figured this list was pretty good. Arguably pretty progressive too, right? Users disagreed. Strongly.

Facebook

Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Separated
Divorced
Civil union
Domestic partnership

Under pressure, Facebook nearly doubled the options in just two years' time.

Google+

Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Separated
Divorced
Civil union
Domestic partnership
I don't want to say

While Google+ has largely adopted that list, it has not included Separated or Divorced. Notice that they also added something choice: opting out of labeling.

Allowing users to identify their relationships with labels of greater personal significance: that's being driven by people rejecting a user experience that isn't working for them.

It's a scope

How did some statuses seem universal while others didn't? Naming a thing creates scope. The assumed validity of a field's values gets constrained as soon as the field is named.

marital_status

Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Separated
Divorced
Civil union
Domestic partnership
I don't want to say

Marital status, for instance, might lead to a list similar to this one. A person is assumed to be either unmarried, preparing to be married, currently married, or formerly married.

legal_marital_status

Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Separated
Divorced
Civil union
Domestic partnership
I don't want to say

Marital status, for instance, might lead to a list similar to this one. A person is assumed to be either unmarried, preparing to be married, currently married, or formerly married.

relationship_status

Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Separated
Divorced
Civil union
Domestic partnership
I don't want to say

Whereas relationship status might lead to a list more like this one. In which one either has a current relationship -- or is defined by the absence of any.

singleness_status

As soon as you name a field, you define its paradigm and possibilities. There are important differences in what each of these envisions and is able to measure. Naming fields -- with great specificity upfront -- makes analyses more powerful later.

singleness_rating

No chance whatsoever / Suuuuuper duper single

1000

If you change the name, you shift the paradigm and possibilities. There's an important difference in what these set out to collect and measure. Naming fields -- with great specificity upfront -- makes analyses more powerful later.

It's Complicated
In a Relationship
Married
Divorced
Widowed
Single

We go through life experiencing many relationships. They don't all have pre-determined labels. And new relationship identities don't necessarily leave old ones behind.

"I like to be truthful, and 'It's Complicated' is really deceiving. It is not complicated. I am separated from my husband, who I am still legally married to."

-- Facebook user

[READ] Why should anyone have to feel like they're required to disavow a relationship that's meaningful for them?

It cuts deeply sometimes. Is a widow only "Widowed" until she resumes dating? Why does she have to be confronted with a decision like that just to use someone's app?

Okay, this one. Spot the fatal flaw. [WAIT]


Page 21: Schemas for the Real World [Madison RubyConf 2013]

cczona

Be conservative in what you do be liberal in what you accept from others

-Postels Law

Postels Law reminds us to [READ]ARE we taking to heart that wisdom

[PAUSE FOR END OF SECTION]

cczona

Schemas plural

There are two kinds of schemas

cczona

Mental Schema

bull Pre-conceived ideas

bull Framework for representing some aspect of the world

bull System of organizing amp perceiving new information

Mental Schema are [READ]

cczona

Database Schema

bull Structure described in the databases language

bull Blueprint for database construction

bull Describes how the real world is being modeled

Database schemas are closely related to that Its a mental schema translated into blueprints for a database

cczona

These are simply frontend manifestations of various individuals MENTAL schemas And [SLOW DOWN] when we look at them closely we can see [BRIEF PAUSE] that schemas are foundation [BRIEF PAUSE] for expressing

cczona

deeply intimate

things that are [READ]

cczona

important relationships

Schemas are foundation for expressingthings [READ]

cczona

self-image

Schemas are foundation for expressingthings [READ]

cczona

Schemas define the

user experience

Schemas define user experience

cczona

Our schemas are leaving people behind

amp UX

[read]We can fix that

cczona

What benefit will the user notice

By asking ourselves What benefit will the USER NOTICE THIS is not equivalent to How will the user benefit Because THAT is a question which grants us too much latitude To ASSUME that what we want is of course to their benefit because its gonna help us deliver a product thats Awesome

[READ AGAIN] focuses us on EXPERIENCES

cczona

Evaluating from user perspective gives us focus

There isnt ONE WAY There isnt ONE RIGHT ANSWER Because the NATURE of our dilemma is that users vary And so do their social worlds So we always need to understand varying approaches How they fit different contexts What tradeoffs they bring AND the obvious boring question which options best serve THIS apps business requirements

cczona

CheckboxRadio buttonSelect menuRanges

Coerced DiscretionaryGuided

Corrective Text InputTextarea

Checkboxes radio buttons select menus ranges these each imply that EVERY possible value can -- and is -- included If any value isnt included there its real world being rejected because it didnt match up with our mental schema

cczona

CheckboxRadioSelect

Coerced DiscretionaryGuided

Corrective Text InputTextarea

Checking a box is a one step action Entering a text string is not So a freeform solution such as a textarea or text input isnt automatically exciting BUT they can bring visible meaningful benefit

cczona

11 years

At MetaFilter gender has been a text field for over a decade

cczona

hellipthe early crowd at MeFi were often programmers and they hated the idea of dirty data collectionhellip

mdashMatt Haughey founder

Initially some SHUDDERED at the thought Because [READ]

cczona

All girl

XY

Fella

Queer

Fembot

Alto

It depends

MYOB

Dangly bits

Chicklet

Innie not outie

Convex

Sideburns

Ambisextrous

Member of the patriarchy

For about 5 seconds Then they jumped on board Because they could be creative and silly

cczona

I speak using the male gender when required by language 50 quintessential

tomboy 50 total girly-girl

AND because they could express this THING about themselves fully With authentic voice

cczona

TransgenderGenderqueer

[PAUSE]

cczona

The freeform gender field is something I cherish about Metafilter

mdashMeFi user

That text field grew into a beloved institution Whatever you as user choose to put into that field says something revealing about who YOU are

cczona

It was one of the earliest indications Id landed in the right place

mdashMeFi user

That youre allowed to put in ANYTHING -- or put in NOTHING AT ALL -- says something revealing about what how MetaFilter envisions community That schemas trust in users was the foundation for Metafilter users to ASK to share themselves even more

cczona

In 2010 Diaspora likewise turned gender into a text field Just like on MetaFilter users felt set free by it

Theres an important distinction here though Metafilter is closed source its one developer and its entirely in English Diaspora needed to meet a different set of needs Its open source and its international

cczona

Not amused

So some of its developers are not amused Its fair to object that this approach wreaks chaos on internationalization of pronouns [BEAT] And heres what Ive got to say to that

cczona

Internationalization is hellish period

Internationalization of pronouns is hellish period You get no safety net in this Constraining gender options doesnt solve translation of languages Dont set yourself up for trying to deal with internationalization this way It will fail you

cczona

Whats your legal gender

bull Indeterminate

bull X

bull Sex Not Specified

And if you really thing that boiling this question down to Whats your legal gender is an easy out lets talk about internationaliztion of that too Because this is the year that indeterminate has become a legal gender on German birth certificates and that X is a legal gender on Australian passports In New South Wales Australia Sex Not Specified is yet another legally-recognized gender Binary gender is today APP FAIL

cczona

the most complicated thing Irsquove ever spent a lot of time learning about

And Irsquove spent a lot of time learning about quantum mechanics

mdashRandall Munroe xkcd

Randall Munroe best known as xkcd has examined the issue of pronouns many times in depth for English language projects He says its [READ] So hes what he ultimately concluded

cczona

Ask

Ask

cczona

Which pronouns do you prefer

hehimhimselfhishis

sheherherselfhersher

theythemthemselftheirstheir

itititselfitsits

by name

Asking straight up Which Pronouns Do You Prefer is truly the best he could come up with

cczona

Which pronouns do you prefer

hehimhimselfhishis

sheherherselfhersher

theythemthemselftheirstheir

personal name

other ____________

Or maybe slightly refining like these

And I know you may be looking at that 3rd row and protesting

cczona

Which pronouns do you prefer

hehimhimselfhishis

sheherherselfhersher

theythemthemselftheirstheir

personal name

other ____________

Our english teachers told us they and their arent singular right Nope Theyre plural AND SINGULAR and have been for 400 years Take the word of Jane Austen Shakespeare Chaucer Lewis Carroll Modern english authorities agree on this one too Its excellent English -- and excellent SOCIAL -- to use it when people ask us to

cczona

Even Facebook has taken note that its okay Theyve been using this construction for at least 5 years and are doing fine -)

cczona

Home is the place where when you have to go there they have to take you in

mdashRobert Frost

We aspire to make our social apps feel like home We can do that by making development choices that say Hey user we GET you Be as individual and messy as you want to We can handle _you_ being _you_

Social Research

When collecting data on people were in a different realm Social sciences If you want to run useful analytics about personal attributes and behavior then data collection needs to meet at least the two minimum criteria for social science research

ExhaustiveEvery possible option

A fields values must include every possible option

Mutually ExclusiveNo overlap exists between them

And the fields options must be mutually exclusive

How many social apps have both of those bare minimum criteria covered

We've seen this one enough. At least that set of values seems covered fully.

But it's not.

Still lacking.

Trying hard. But still inadequate to express the real world's complexity.

We do realize that the real world has great variety. Of course we do. But it's hard to know how to handle it.

What user experience does this schema drive us toward?

It's easy to figure, "Eh, we can refactor later." But initial choices set differing user experiences in motion. Thinking at the outset about the real world's variety and complexity starts us asking early questions that set the foundation for social app-building. Ask: "What user experience does this schema drive us toward?"

Data doesn't have to be for analysis

It's easy to get into the habit of structuring data for easy analysis. But we can choose to look at human data from a different perspective. Step back. Wallow in the user's perspective.

Data can be sheer expressiveness

Data can be sheer expressiveness. Data that has character, individualism, distinctiveness.

What we want

• Structured
• Predictable
• Relational
• Indexed
• Exhaustive
• Easy analytics
• Data-driven decisions

What we get

• Validations, exceptions
• Conditionals, partials
• Premature optimization
• Cultural variability
• Individual POV
• Moving target
• Decisions based on false premises

As developers, we have a vision of what a good codebase should be, and not be.

[READ]

In the most sensible of ways, we often arrive at solutions that are factually TRUTHY while far removed from real-life utility.

What is your religion, if any?

ARIS is the largest ongoing survey of Americans' religious identification. It asks this simple, OPEN-ENDED question. [slide]

Which nets over 100 unique answers. Which, if you're making a form based on that, is tricky. There's no form element that makes it easy for our users to pick themselves out of a list of so many possibilities.

ARIS found, though, that these could be compressed into 13 major categories. More manageable list, right? We could use that for a form. But, eh, a lot of those are edge cases. We'd rather focus on genuinely major groupings.

What is your religion, if any?

Christian: 76%

Other: 4%

None: 15%

Don't Know or Refused: 5%

Which brings us down to this. At least three quarters of Americans are Christian. Done. Every other religion would just be clutter. Edge cases.

And then there's this sort of crummy data with it. We'd probably assign a nil value for more than one of these categories, right?

What is your religion?

Christian: 76%

Other: 4%

n/a: 20%

Fixed. Which focuses attention on a problem here: 1 in 5 are not useful answers from an advertiser's perspective, or for our own analytics. The NILs -- gotta go.

What is your religion?

Christian: 76%

Other: 24%

So what we're left with is a good, clear list. It covers ALL the big stuff. When you get reductive enough, for Americans, religion is a binary. Which, from a storage standpoint, is great. Booleans! Score!

Religion

But we don't do THIS. It covers the biggest categories. But oh, how it leaves people out.

People aren't edge cases

People aren't edge cases.

And they're pushing back on apps that treat them that way.

[LONG BEAT LONG] [BREATH] [CHANGE GEARS]

So: reductive has big problems. So does scaling upward.

Balancing between approaches

As engineers, it's instinctually uncomfortable to DELIBERATELY NOT STRUCTURE DATA for easy analysis. I feel ya. I really do. This freaks me out too. But again, the foundational question is "What benefit will the user notice?" What identities are okay with us?

And if necessary, we CAN strike a middle ground. This is where guided response comes in. Autosuggest.

[Slide diagram: Checkbox / Radio / Select; Textarea / Text; Required; Optional; Autosuggest; Minimal Suggest]

When there's a subset of values that you're most interested in, do minimal suggest instead. Autosuggest using just the handful of values you care about.

Structured data from those who want to give it. Free form to incite expressiveness in those who want THAT.
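Minimal suggest might look roughly like this on the server side. The sketch is in plain Ruby with no framework; the `SUGGESTED` values and both helper names are assumptions for illustration only.

```ruby
# Hypothetical sketch of "minimal suggest": offer a handful of values as
# completions, but accept whatever the user finally types.
SUGGESTED = ["female", "male", "genderqueer", "transgender"].freeze

# Suggestions whose beginning matches what has been typed so far.
def suggestions_for(typed, suggested = SUGGESTED)
  prefix = typed.strip.downcase
  return [] if prefix.empty?
  suggested.select { |value| value.start_with?(prefix) }
end

# Stores free text untouched; only blank input becomes nil (no answer).
def stored_value(input)
  text = input.to_s.strip
  text.empty? ? nil : text
end

puts suggestions_for("fe").inspect
puts stored_value("It depends").inspect
puts stored_value("   ").inspect
```

The suggestion list only guides typing. It never acts as a validation list, so an answer outside the handful is stored exactly as written.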

Unguided Text: Of those who use MetaFilter's gender field, 40% of responses are "f", "m", "female", "male"

[read] So structured data IS there. This can be a balanced solution in many cases where you're willing to tolerate some ambiguity. Of course, there are tradeoffs. Data quantity is lower. Freed to opt out of providing personal info, many do. On the other hand, data quality should improve.
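A figure like that 40% can be measured after the fact, without constraining the field. Here is a rough sketch, assuming responses arrive as raw strings; `canonical_share` and the sample answers are invented for illustration and are not MetaFilter data.

```ruby
# Hypothetical sketch: measure how much of a free-text field already falls
# into a small set of values, without rejecting anything else.
CANONICAL = %w[f m female male].freeze

def canonical_share(responses, canonical = CANONICAL)
  # Normalize case and whitespace; blank or missing answers are not counted.
  answered = responses.map { |r| r.to_s.strip.downcase }.reject(&:empty?)
  return 0.0 if answered.empty?
  matches = answered.count { |r| canonical.include?(r) }
  matches.to_f / answered.size
end

# Made-up sample answers.
sample = ["F", "male", "Fembot", "It depends", "m ", nil, "Queer", ""]
share = canonical_share(sample)
puts format("%.0f%% of answered responses match", share * 100)
```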

Optional Select: 60% of Facebook users select a relationship status

It's fine to mix and match here. Find the right approach for your users and your app's business objectives. Facebook, for instance, makes relationship status completely optional, but coercive for those who do opt in to setting a value. Most users do opt in: 60% of them select a relationship status.

We want everyone to feel excited about what we've built

The bottom line is that [read]

We want users to feel passionate about their involvement

[read]

Analytics, investments & monetization are based on a premise that data is accurate

[read]

BUT if the data has been collected by coercive approaches, the RISK is that--

this premise is garbage. Some people are lying, because lying has been made a requirement for getting past the barriers. So conclusions drawn from that bad data can misdirect decision-making about the next stage of development.

[Slide diagram: Checkbox / Radio / Select; Textarea / Text; Required; Corrective; Discretionary; Guided]

The restrictive options, that stuff at bottom left: those don't actually have to be marked required.

ADD COLUMN gender VARCHAR(6) NOT NULL DEFAULT 'female'

But the way we set up schema often embeds assumptions that we should, and we will. So we do. An attribute that is ambiguously named is destined to become a form field whose data wanders away from its original intent, and code that misunderstands how to use it. A field that's not allowed to be null is destined to be mandatory. A field that is assigned a maximum length is an ASSERTION that all possible values are knowable and fit within it.

ADD COLUMN gender_identity VARCHAR NOT NULL

BOOM. This is the foundation for a whole different user experience. And the cool ninja move was that we decided to DO LESS. Make this stuff flexible upfront. Optimize storage later. Decide what's valid later.
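To make those assertions concrete, here is a small stand-in for a column definition in plain Ruby. It is not tied to any real database or to Rails; `Column` and its `store` method are hypothetical. The flexible column is modelled as nullable, which is an assumption based on the discretionary field the talk arrives at.

```ruby
# Hypothetical stand-in for a column definition, to show what each
# constraint asserts about users. Not tied to any real database.
Column = Struct.new(:name, :null, :limit, :default) do
  # Returns the value that would be stored, or raises if the column rejects it.
  def store(value)
    value = default if value.nil?
    raise ArgumentError, "#{name} is mandatory" if value.nil? && !null
    if limit && value && value.length > limit
      raise ArgumentError, "#{name} does not fit in #{limit} characters"
    end
    value
  end
end

strict   = Column.new("gender", false, 6, "female")
flexible = Column.new("gender_identity", true, nil, nil)

puts strict.store(nil).inspect             # the default answers for the user
puts flexible.store(nil).inspect           # no answer stays no answer
puts flexible.store("genderqueer").inspect # stored as written
begin
  strict.store("genderqueer")              # longer than 6 characters
rescue ArgumentError => e
  puts e.message
end
```

The strict column never records "no answer": it silently fills in a default, and it turns away any value longer than six characters.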

[Slide diagram: Checkbox / Radio / Select; Textarea / Text; Required; Optional; Alteration; Guided]

So there you have it: a discretionary field. Whether to respond is left up to the user.

t.string :relationship_status, :null => true

As developers, we may look upon THIS expression as redundant, completely unnecessary. "Duh, null is true by default." But making that explicit is a communication to the team and to your future self. It's a statement of intent. It's documenting a product decision.

Facebook

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

What would a canonical set of relationship statuses look like? Three years ago, Facebook figured this list was pretty good. Arguably pretty progressive too, right? Users disagreed. Strongly.

Facebook

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

Under pressure, Facebook nearly doubled the options in just two years' time.

Google+

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

Widowed

Separated [not included]

Divorced [not included]

Civil union

Domestic partnership

I don't want to say

While Google+ has largely adopted that list, it has not included Separated or Divorced. Notice that they also added something choice: opt out of labeling.

Allowing users to identify their relationships with labels of greater personal significance? That's being driven by people rejecting a user experience that isn't working for them.

It's a scope

How did some statuses seem universal while others weren't? Naming a thing creates scope. The assumed validity of a field's values gets constrained as soon as the field is named.

marital_status

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I don't want to say

Marital status, for instance, might lead to a list similar to this one. A person is assumed to be either unmarried, preparing to be married, currently married, or formerly married.

legal_marital_status

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I don't want to say

Marital status, for instance, might lead to a list similar to this one. A person is assumed to be either unmarried, preparing to be married, currently married, or formerly married.

relationship_status

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I don't want to say

Whereas relationship status might lead to a list more like this one. In which one either has a current relationship -- or is defined by the absence of any.

singleness_status

As soon as you name a field, you define its paradigm and possibilities. There are important differences in what each of these envisions and is able to measure. Naming fields -- with great specificity, upfront -- makes analyses more powerful later.

singleness_rating

No chance whatsoever / Suuuuuper duper single

1000

If you change the name, you shift the paradigm and possibilities. There's an important difference in what these set out to collect and measure. Naming fields -- with great specificity, upfront -- makes analyses more powerful later.

It's Complicated / In a Relationship / Married / Divorced / Widowed / Single

We go through life experiencing many relationships. They don't all have pre-determined labels. And new relationship identities don't necessarily leave old ones behind.

"I like to be truthful, and 'It's Complicated' is really deceiving. It is not complicated. I am separated from my husband, who I am still legally married to."

-- Facebook user

[READ] Why should anyone have to feel like they're required to disavow a relationship that's meaningful for them?

It cuts deeply sometimes. Is a widow only "Widowed" until she resumes dating? Why does she have to be confronted with a decision like that, just to use someone's app?

Okay, this one. Spot the fatal flaw. [WAIT]

buffer overflow

An open relationship is definitionally a 1:Many join -- WITH the usual engineering understanding, that is, that the actual number of relationships may be 0, 1, OR many.

Choose (one): evasive / inauthentic

But it's not funny from the personal perspective. This schema forces the person to either LOOK EVASIVE or else BE INAUTHENTIC. This is what happens when we try to throw more labels at the problem, instead of re-examining the schema's assumptions.
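One way to re-examine that assumption is to model relationships as a collection (zero, one, or many) instead of a single status column. This is only a sketch in plain Ruby; the `Person` and `Relationship` classes, their fields, and the sample names are all hypothetical.

```ruby
# Hypothetical sketch: relationships as a collection (0, 1, or many)
# instead of a single status column.
Relationship = Struct.new(:label, :with, :current)

class Person
  attr_reader :name, :relationships

  def initialize(name)
    @name = name
    @relationships = []
  end

  # Adding a relationship never overwrites an earlier one.
  def relate(label, with: nil, current: true)
    @relationships << Relationship.new(label, with, current)
    self
  end

  # Labels of every relationship the person still counts as current.
  def current_labels
    @relationships.select(&:current).map(&:label)
  end
end

pat = Person.new("Pat")
pat.relate("Widowed", with: "Lee", current: true)
   .relate("In a relationship", with: "Sam")
puts pat.current_labels.inspect
puts Person.new("Kim").current_labels.inspect
```

Here the widow from the earlier example keeps "Widowed" while also being "In a relationship", and someone with no relationships simply has an empty list.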

Facebook relationship status by user age

srsly?

When we don't give people means to express what's real and important for them, they hack around us. Are a quarter of Facebook's 13-year-olds really married? Probably not. What they probably have is a BEST FRIEND, and "married" is the nearest equivalent available for them to express "THIS PERSON is my most important relationship."

The dogged pursuit of nice, clean, CRUNCHABLE data takes us to such wrong places. And makes our users want to work against a system that doesn't acknowledge their lives' reality.

We build or break community with each line of code

As developers, our lines of code build -- or break -- an app's community. So it's worth making conscious choices as we go.

Modeling the real world is complex

[READ] And that's okay.

Assuming we know who users are surrenders opportunity to learn who they are

But [read] Early constraints in schema NET crappy, misleading data. [beat] So keep constraints out of human and social schema, at least at first. Gather enough initial response to do some data mining. Watch that data for a while. Keep an eye out for emergent trends. What DISCOVERIES can you make? How can they shape the app's growth and evolution?
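Watching for emergent trends can start as a simple tally of unconstrained responses. The sketch below assumes answers are stored as raw strings; `emergent_values`, its `min_count` threshold, and the sample responses are invented for illustration.

```ruby
# Hypothetical sketch: tally free-text answers and surface values that keep
# recurring but are not among the options we anticipated.
def emergent_values(responses, anticipated:, min_count: 2)
  known = anticipated.map(&:downcase)
  counts = Hash.new(0)
  responses.each do |r|
    value = r.to_s.strip.downcase
    counts[value] += 1 unless value.empty?
  end
  # Drop anticipated values and one-offs, then list most frequent first.
  counts.reject { |value, n| known.include?(value) || n < min_count }
        .sort_by { |value, n| [-n, value] }
        .to_h
end

# Made-up sample responses.
responses = ["Married", "best friends", "Best Friends", "Single",
             "separated", "Separated", "separated", "", "poly"]
trends = emergent_values(responses, anticipated: ["Single", "Married"])
puts trends.inspect
```

Values that recur without having been anticipated are candidates for later suggestions or analysis. The threshold is a judgment call that depends on how much data has been gathered.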

Quality

Data quality improves when lies are merely optional, not required.

Specificity

Data becomes like the real world itself: rich and specific. So we can unearth the patterns that are undetectable when data is generic.

Adaptability

We get to make discoveries and re-envision possibilities.

We adapt quickly.

Loyalty

And their response to all this? Engagement. Passion. LOYALTY.

THOSE are foundations for great user experiences. [LOOK UP FINALITY]

@cczona / cczona@gmail.com / http://cczona.com

Questions? Conversation?

Oh Hell Yeah

Carina C. Zona

Resources

Data Science & Information Architecture

• Sociological normalization
• Database normalization
• Using Machine Learning On Social Networks To Figure Out What You Should Read On The Web
• NoSQL Data Modeling Techniques
• Data & Reality, by William Kent
• Bad Data Handbook, by Q. Ethan McCallum
• Data Science of the Facebook World

Relationships

• Does Facebook Hurt Relationships?
• Facebook Adds LGBT-Friendly Relationship Status Options
• Facebook Targeting by Relationship Status & Workplace
• Your Facebook Relationship Status: It's Complicated
• Gay Marriage: The Database Engineering Perspective

Sex & Genders

• "Disalienation: Why Gender is a Text Field on Diaspora"
• "Gender & Drop Down Menus"
• "Sex & Gender"
• "Bucket Gender"
• Recommendations for Inclusive Data Collection of Trans People

Names

• Falsehoods Programmers Believe About Names
• Your Last Name Contains Invalid Characters
• W3C Internationalization: Personal Names Around the World
• Spanish Names
• Chinese Names

More

• Redesigning the Country Selector
• American Religious Identification Survey Summary Report 2009
• Linguistic Potluck: Crowdsourcing Internationalization in Rails

Image Credits

Postsecret

Facebook

OKCupid

Google+

Metafilter

Diaspora

Flickr

FetLife

Kotangle

cutestpaw.com

hdwallpapers.in

xkcd

Many Thanks

Chiu-Ki Chan

Estelle Weyl

Heather Rivers

Heroku

Jeremy Dunck

Josh Susser

Michele Titolo

NIRD LLC

Renée De Voursney

Sarah Mei

San Francisco Sex Information

Yoz Grahame

Get in Touch

Carina C. Zona

@cczona

http://cczona.com

cczona@gmail.com

http://slideshare.net/cczona

http://linkedin.com/in/cczona


Schemas, plural

There are two kinds of schemas.

Mental Schema

• Pre-conceived ideas
• Framework for representing some aspect of the world
• System of organizing & perceiving new information

Mental schemas are [READ]

Database Schema

• Structure described in the database's language
• Blueprint for database construction
• Describes how the real world is being modeled

Database schemas are closely related to that. A database schema is a mental schema translated into blueprints for a database.

These are simply frontend manifestations of various individuals' MENTAL schemas. And [SLOW DOWN] when we look at them closely, we can see [BRIEF PAUSE] that schemas are foundation [BRIEF PAUSE] for expressing

deeply intimate

things that are [READ]

important relationships

Schemas are foundation for expressing things [READ]

self-image

Schemas are foundation for expressing things [READ]

Schemas define the user experience

Schemas define user experience.

Our schemas & UX are leaving people behind

[read] We can fix that.

What benefit will the user notice?

By asking ourselves, "What benefit will the USER NOTICE?" THIS is not equivalent to "How will the user benefit?" Because THAT is a question which grants us too much latitude to ASSUME that what we want is, of course, to their benefit, because it's gonna help us deliver a product that's Awesome.

[READ AGAIN] focuses us on EXPERIENCES.

Evaluating from user perspective gives us focus

There isn't ONE WAY. There isn't ONE RIGHT ANSWER. Because the NATURE of our dilemma is that users vary. And so do their social worlds. So we always need to understand varying approaches. How they fit different contexts. What tradeoffs they bring. AND the obvious, boring question: which options best serve THIS app's business requirements?

[Slide diagram: Checkbox / Radio button / Select menu / Ranges; Coerced; Discretionary; Guided; Corrective; Text Input / Textarea]

Checkboxes, radio buttons, select menus, ranges: these each imply that EVERY possible value can be -- and is -- included. If any value isn't included there, it's the real world being rejected because it didn't match up with our mental schema.

[Slide diagram: Checkbox / Radio / Select; Coerced; Discretionary; Guided; Corrective; Text Input / Textarea]

Checking a box is a one-step action. Entering a text string is not. So a freeform solution, such as a textarea or text input, isn't automatically exciting. BUT it can bring visible, meaningful benefit.

11 years

At MetaFilter, gender has been a text field for over a decade.

"...the early crowd at MeFi were often programmers, and they hated the idea of dirty data collection..."

-- Matt Haughey, founder

Initially, some SHUDDERED at the thought. Because [READ]

All girl

XY

Fella

Queer

Fembot

Alto

It depends

MYOB

Dangly bits

Chicklet

Innie not outie

Convex

Sideburns

Ambisextrous

Member of the patriarchy

For about 5 seconds. Then they jumped on board. Because they could be creative and silly.

I speak using the male gender when required by language

50% quintessential tomboy, 50% total girly-girl

AND because they could express this THING about themselves fully. With authentic voice.

Transgender

Genderqueer

[PAUSE]

"The freeform gender field is something I cherish about Metafilter."

-- MeFi user

That text field grew into a beloved institution. Whatever you, as user, choose to put into that field says something revealing about who YOU are.

"It was one of the earliest indications I'd landed in the right place."

-- MeFi user

That you're allowed to put in ANYTHING -- or put in NOTHING AT ALL -- says something revealing about how MetaFilter envisions community. That schema's trust in users was the foundation for MetaFilter users to ASK to share themselves even more.

In 2010, Diaspora likewise turned gender into a text field. Just like on MetaFilter, users felt set free by it.

There's an important distinction here, though. MetaFilter is closed source, it's one developer, and it's entirely in English. Diaspora needed to meet a different set of needs. It's open source, and it's international.

Not amused

So some of its developers are not amused. It's fair to object that this approach wreaks chaos on internationalization of pronouns. [BEAT] And here's what I've got to say to that:

Internationalization is hellish, period.

Internationalization of pronouns is hellish, period. You get no safety net in this. Constraining gender options doesn't solve translation of languages. Don't set yourself up for trying to deal with internationalization this way. It will fail you.

What's your legal gender?

• Indeterminate
• X
• Sex Not Specified

And if you really think that boiling this question down to "What's your legal gender?" is an easy out, let's talk about internationalization of that too. Because this is the year that "indeterminate" has become a legal gender on German birth certificates, and that "X" is a legal gender on Australian passports. In New South Wales, Australia, "Sex Not Specified" is yet another legally-recognized gender. Binary gender is, today, APP FAIL.

"the most complicated thing I've ever spent a lot of time learning about.

And I've spent a lot of time learning about quantum mechanics."

-- Randall Munroe, xkcd

Randall Munroe, best known as xkcd, has examined the issue of pronouns many times, in depth, for English language projects. He says it's [READ] So here's what he ultimately concluded:

Ask

Ask.

Which pronouns do you prefer?

he / him / himself / his / his

she / her / herself / hers / her

they / them / themself / theirs / their

it / it / itself / its / its

by name

Asking straight up, "Which pronouns do you prefer?" is truly the best he could come up with.

cczona

Which pronouns do you prefer

hehimhimselfhishis

sheherherselfhersher

theythemthemselftheirstheir

personal name

other ____________

Or maybe slightly refining like these

And I know you may be looking at that 3rd row and protesting

cczona

Which pronouns do you prefer

hehimhimselfhishis

sheherherselfhersher

theythemthemselftheirstheir

personal name

other ____________

Our english teachers told us they and their arent singular right Nope Theyre plural AND SINGULAR and have been for 400 years Take the word of Jane Austen Shakespeare Chaucer Lewis Carroll Modern english authorities agree on this one too Its excellent English -- and excellent SOCIAL -- to use it when people ask us to

cczona

Even Facebook has taken note that its okay Theyve been using this construction for at least 5 years and are doing fine -)

cczona

Home is the place where when you have to go there they have to take you in

mdashRobert Frost

We aspire to make our social apps feel like home We can do that by making development choices that say Hey user we GET you Be as individual and messy as you want to We can handle _you_ being _you_

Social Research

When collecting data on people were in a different realm Social sciences If you want to run useful analytics about personal attributes and behavior then data collection needs to meet at least the two minimum criteria for social science research

ExhaustiveEvery possible option

A fields values must include every possible option

Mutually ExclusiveNo overlap exists between them

And the fields options must be mutually exclusive

How many social apps have both of those bare minimum criteria covered

cczona

Weve seen this one enough At least that set of values seems covered fully

cczona

`

But its not

cczona

Still lacking

cczona

Trying hard But still inadequate to express the real worlds complexity

We do realize that the real world has great variety Of course we do But its hard to know how to handle it

cczona

What user experience does this schema drive us toward

Its easy to figure Eh we can refactor later But initial choices set differing user experiences in motion Thinking at the outset about the real worlds variety and complexity starts us asking early questions that set foundation for social app-building Ask What user experience does this schema drive us toward

cczona

Data doesnrsquot have to be for analysis

Its easy to get into the habit of structuring data for easy analysis But we can choose to look at human data from a different perspective Step back Wallow in the users perspective

cczona

Data can be sheerexpressiveness

Data can be sheer expressiveness Data that has character individualism distinctiveness

cczona

What we want What we get

bull Structured

bull Predictablebull Validations exceptions

bull Conditionals partials

bull Relational

bull Indexed bull Premature optimization

bull Exhaustivebull Cultural variability

bull Individual POV

bull Moving target

bull Easy analytics

bull Data-driven decisionsbull Decisions based on false

premises

As developers we have a vision of what a good codebase should be and not be

[READ]

In the most sensible of ways we often arrive at solutions that are factually TRUTHY while far removed from real life utility

cczona

What is your religion if any

ARIS is the largest ongoing survey of Americans religious identification It asks this simple OPEN-ENDED question [slide]

Which nets over 100 unique answers Which if youre making a form based on that is tricky Theres no form element that makes it easy for our users to pick themselves out of a list of so many possibilities

cczona

ARIS found though that these could be compressed into 13 major categories More manageable list right We could use that for a form But eh a lot of those are edge cases Wed rather want to focus on genuinely major groupings

cczona

What is your religion if any

Christian 76

Other 4

None 15

Dont Know or Refused 5

Which brings us down to this At least a quarter of Americans are Christian Done Every other religion would just be clutter Edge cases

And then theres this sort of crummy data with it Wed probably assign nil value for more than one of these categories right

cczona

What is your religion

Christian 76

Other 4

na 20

Fixed Which focuses attention on a problem here 1 in 5 are not useful answers from an advertisers perspective or for out own analytics The NILS -- gotta go

cczona

What is your religion

Christian 76

Other 24

So what were left with is a good clear list It covers ALL the big stuff When you get reductive enough for Americans religion it a binary Which from a storage standpoint is great Booleans Score

cczona

Religion

But we dont do THIS It covers the biggest categories But oh how it leaves people out

cczona

People arent edge cases

People arent edge cases

And theyre pushing back on apps that treat them that way

[LONG BEAT LONG] [BREATH] [CHANGE GEARS]

So Reductive has big problems So does scaling upward

cczona

Balancing between approaches

As engineers its instinctually uncomfortable to DELIBERATELY NOT STRUCTURE DATA for easy analysis I feel ya I really do This freaks me too But again the foundational question is What benefit will the user notice What identities are okay with us

cczona

And if necessary we CAN strike a middleground This is where guided response comes in Autosuggest

cczona

CheckboxRadioSelect

TextareaText

Required OptionalAutosuggest

Minimal Suggest

When theres a subset of values that youre most interested in do minimal suggest instead Autosuggest using just the handful of values you care about

Structured data from those who want to give it Free form to incite expressiveness in those who want THAT

Unguided TextOf those who use MetaFilters gender field

40 of responses are f m female male

[read] So structured data IS there This can be a balanced solution in many cases where youre willing to tolerate some ambiguity Of course there are tradeoffsData quantity is lower Freed to opt out of proving personal info many doOn the other hand data quality should improve

Optional Select60 of Facebook users select a relationship status

Its fine to mix and match here Find the right approach for your users and your apps business objectives Facebook for instance makes relationship status completely optional but coercive for those who do opt-in to setting a value Most users do opt-in 60 of them select a relationship status

cczona

We want everyone to feel excited what weve

built

The bottom line is that [read]

cczona

We want users to feel passionate about their

involvement

[read]

cczona

Analytics investments amp monetization are

based on a premise that data is accurate

[read]

BUT the data has been collected by coercive approaches the RISK is that--

cczona

this premise is garbage Some people are lying because lying has been made requirement for getting past the barriers So conclusions drawn from that bad data can misdirect decision-making about the next stage of development

cczona

CheckboxRadioSelect

TextareaText

Required

Corrective

DiscretionaryGuided

The restrictive options that stuff at bottom left those dont actually have to be marked required

cczona

ADD column gender NOT NULL VARCHAR(6) DEFAULT female

But the way we setup schema often embeds assumptions that we should and we will So we do An attribute that is ambiguously named is destined to become a form field whose data wanders away from its original intent and code that misunderstands how to use it A field thats not allowed to be null is destined to be mandatory A field that is assigned a maximum length is an ASSERTION that all possible values are knowable and fit within it

cczona

ADD column gender_identity NOT NULL VARCHAR

BOOM This is foundation for a whole different user experience And the cool ninja move was that we decided to DO LESS Make this stuff flexible upfront Optimize storage later Decide whats valid later

cczona

CheckboxRadioSelect

TextareaText

Required Optional

Alteration

Guided

So there you have it a discretionary field Whether to respond is left up to the user

cczona

tstring relationship_status null =gt true

As developers we may upon THIS expression as redundant completely unnecessary Duh null is true by default But making that explicit is a communication to the team and to your future self Its a statement of intent Its documenting a product decision

cczona

FacebookSingle

In a relationship

Engaged

Married

Its complicated

Open relationship

What would a canonical set of relationship statuses look like Three years ago Facebook figured this list was pretty good Arguably pretty progressive too right Users disagreed Strongly

cczona

FacebookSingle

In a relationship

Engaged

Married

Its complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

Under pressure Facebook nearly doubled the options in just two years time

cczona

Google+Single

In a relationship

Engaged

Married

Its complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I dont want to say

While Google+ has largely adopted that list it has not included Separated or Divorced Notice that they also added something choice Opt out of labeling

cczona

Allowing users to identify their relationships with labels of greater personal significanceThats being driven by people rejecting a user experience that isnt working for them

cczona

Its a scope

How did some status seem universal while others werent Naming a thing creates scope The assumed validity of a fields values get constrained as soon as the field is named

cczona

marital_statusSingle

In a relationship

Engaged

Married

Its complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I dont want to say

Marital status for instance might lead to a list similar to this one A person is assumed to be either unmarried preparing to be married currently married or formerly married

cczona

legal_marital_statusSingle

In a relationship

Engaged

Married

Its complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I dont want to say

Marital status for instance might lead to a list similar to this one A person is assumed to be either unmarried preparing to be married currently married or formerly married


Page 23: Schemas for the Real World [Madison RubyConf 2013]

cczona

Mental Schema

• Pre-conceived ideas

• Framework for representing some aspect of the world

• System of organizing & perceiving new information

Mental schemas are [READ]

cczona

Database Schema

• Structure described in the database's language

• Blueprint for database construction

• Describes how the real world is being modeled

Database schemas are closely related to that. It's a mental schema translated into blueprints for a database.

cczona

These are simply frontend manifestations of various individuals' MENTAL schemas. And [SLOW DOWN] when we look at them closely, we can see [BRIEF PAUSE] that schemas are foundation [BRIEF PAUSE] for expressing

cczona

deeply intimate

things that are [READ]

cczona

important relationships

Schemas are foundation for expressing things [READ]

cczona

self-image

Schemas are foundation for expressing things [READ]

cczona

Schemas define the

user experience

Schemas define user experience

cczona

Our schemas are leaving people behind

& UX

[read] We can fix that.

cczona

What benefit will the user notice?

By asking ourselves "What benefit will the USER NOTICE?" THIS is not equivalent to "How will the user benefit?" Because THAT is a question which grants us too much latitude to ASSUME that what we want is, of course, to their benefit, because it's gonna help us deliver a product that's Awesome.

[READ AGAIN] focuses us on EXPERIENCES.

cczona

Evaluating from user perspective gives us focus

There isn't ONE WAY. There isn't ONE RIGHT ANSWER. Because the NATURE of our dilemma is that users vary. And so do their social worlds. So we always need to understand varying approaches: how they fit different contexts, what tradeoffs they bring, AND the obvious, boring question: which options best serve THIS app's business requirements?

cczona

Checkbox, Radio button, Select menu, Ranges

Coerced, Discretionary, Guided

Corrective, Text Input, Textarea

Checkboxes, radio buttons, select menus, ranges: these each imply that EVERY possible value can be -- and is -- included. If any value isn't included there, it's the real world being rejected because it didn't match up with our mental schema.

cczona

Checkbox, Radio, Select

Coerced, Discretionary, Guided

Corrective, Text Input, Textarea

Checking a box is a one-step action. Entering a text string is not. So a freeform solution, such as a textarea or text input, isn't automatically exciting. BUT it can bring visible, meaningful benefit.

cczona

11 years

At MetaFilter, gender has been a text field for over a decade.

cczona

"...the early crowd at MeFi were often programmers and they hated the idea of dirty data collection..."

-- Matt Haughey, founder

Initially, some SHUDDERED at the thought. Because [READ]

cczona

All girl

XY

Fella

Queer

Fembot

Alto

It depends

MYOB

Dangly bits

Chicklet

Innie not outie

Convex

Sideburns

Ambisextrous

Member of the patriarchy

For about 5 seconds. Then they jumped on board. Because they could be creative and silly.

cczona

I speak using the male gender when required by language

50% quintessential tomboy, 50% total girly-girl

AND because they could express this THING about themselves fully. With authentic voice.

cczona

Transgender

Genderqueer

[PAUSE]

cczona

"The freeform gender field is something I cherish about Metafilter."

-- MeFi user

That text field grew into a beloved institution. Whatever you, as user, choose to put into that field says something revealing about who YOU are.

cczona

"It was one of the earliest indications I'd landed in the right place."

-- MeFi user

That you're allowed to put in ANYTHING -- or put in NOTHING AT ALL -- says something revealing about how MetaFilter envisions community. That schema's trust in users was the foundation for MetaFilter users to ASK to share themselves even more.

cczona

In 2010, Diaspora likewise turned gender into a text field. Just like on MetaFilter, users felt set free by it.

There's an important distinction here, though. MetaFilter is closed source, it's one developer, and it's entirely in English. Diaspora needed to meet a different set of needs. It's open source, and it's international.

cczona

Not amused

So some of its developers are not amused. It's fair to object that this approach wreaks chaos on internationalization of pronouns. [BEAT] And here's what I've got to say to that:

cczona

Internationalization is hellish. Period.

Internationalization of pronouns is hellish. Period. You get no safety net in this. Constraining gender options doesn't solve translation of languages. Don't set yourself up for trying to deal with internationalization this way. It will fail you.

cczona

What's your legal gender?

• Indeterminate

• X

• Sex Not Specified

And if you really think that boiling this question down to "What's your legal gender?" is an easy out, let's talk about internationalization of that too. Because this is the year that "indeterminate" has become a legal gender on German birth certificates, and that "X" is a legal gender on Australian passports. In New South Wales, Australia, "Sex Not Specified" is yet another legally-recognized gender. Binary gender is, today, APP FAIL.

cczona

"the most complicated thing I've ever spent a lot of time learning about.

And I've spent a lot of time learning about quantum mechanics."

-- Randall Munroe, xkcd

Randall Munroe, best known as xkcd, has examined the issue of pronouns many times, in depth, for English language projects. He says it's [READ]. So here's what he ultimately concluded:

cczona

Ask

Ask

cczona

Which pronouns do you prefer?

he / him / himself / his / his

she / her / herself / hers / her

they / them / themself / theirs / their

it / it / itself / its / its

by name

Asking straight up, "Which pronouns do you prefer?" is truly the best he could come up with.

cczona

Which pronouns do you prefer?

he / him / himself / his / his

she / her / herself / hers / her

they / them / themself / theirs / their

personal name

other ____________

Or maybe slightly refining, like these.

And I know you may be looking at that 3rd row and protesting.

cczona

Which pronouns do you prefer?

he / him / himself / his / his

she / her / herself / hers / her

they / them / themself / theirs / their

personal name

other ____________

Our English teachers told us "they" and "their" aren't singular, right? Nope. They're plural AND SINGULAR, and have been for 400 years. Take the word of Jane Austen, Shakespeare, Chaucer, Lewis Carroll. Modern English authorities agree on this one too. It's excellent English -- and excellent SOCIAL -- to use it when people ask us to.
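
A minimal Ruby sketch of how the pronoun question above might be stored and used. The hash layout, the `pronouns_for` helper, and the sample name are assumptions made for illustration, not something shown in the talk. The five forms follow the order on the slide.

```ruby
# Pronoun sets in the order shown on the slide:
# subject, object, reflexive, possessive pronoun, possessive determiner.
pronoun_sets = {
  "he"   => %w[he him himself his his],
  "she"  => %w[she her herself hers her],
  "they" => %w[they them themself theirs their]
}

# Hypothetical helper. choice is a key of pronoun_sets, "name" (use the
# personal name instead of pronouns), or "other" with a user-supplied set.
def pronouns_for(choice, name, pronoun_sets, custom = nil)
  forms = if choice == "name"
            [name, name, name, "#{name}'s", "#{name}'s"]
          elsif choice == "other" && custom
            custom
          else
            pronoun_sets.fetch(choice)
          end
  %i[subject object reflexive possessive determiner].zip(forms).to_h
end

they_forms   = pronouns_for("they", "Sam", pronoun_sets)
name_forms   = pronouns_for("name", "Sam", pronoun_sets)
custom_forms = pronouns_for("other", "Sam", pronoun_sets, %w[ze hir hirself hirs hir])

puts "#{they_forms[:subject].capitalize} updated #{they_forms[:determiner]} profile."
puts "#{name_forms[:subject]} updated #{name_forms[:determiner]} profile."
```

The "other" branch stores whatever the person supplies, so nothing in the code decides which pronouns are valid.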

cczona

Even Facebook has taken note that it's okay. They've been using this construction for at least 5 years and are doing fine :-)

cczona

"Home is the place where, when you have to go there, they have to take you in."

-- Robert Frost

We aspire to make our social apps feel like home. We can do that by making development choices that say, "Hey user, we GET you. Be as individual and messy as you want to. We can handle _you_ being _you_."

Social Research

When collecting data on people, we're in a different realm: social sciences. If you want to run useful analytics about personal attributes and behavior, then data collection needs to meet at least the two minimum criteria for social science research.

Exhaustive: Every possible option

A field's values must include every possible option.

Mutually Exclusive: No overlap exists between them

And the field's options must be mutually exclusive.

How many social apps have both of those bare minimum criteria covered?
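
The two criteria can be checked mechanically against real answers. This is a rough Ruby sketch under stated assumptions: the option list and the sample responses are invented for illustration, and each response is modeled as the set of labels a person says apply to them.

```ruby
# Hypothetical option list and sample answers (not real data).
options = ["Single", "In a relationship", "Married"]

responses = [
  ["Married"],
  ["Married", "In a relationship"], # two options apply at once
  ["Widowed"],                      # no option fits
  ["Single"]
]

# Exhaustive: every response should match at least one option.
def uncovered(responses, options)
  responses.select { |labels| (labels & options).empty? }
end

# Mutually exclusive: no response should match more than one option.
def overlapping(responses, options)
  responses.select { |labels| (labels & options).size > 1 }
end

gaps     = uncovered(responses, options)
overlaps = overlapping(responses, options)

puts "exhaustive?         #{gaps.empty?}"
puts "mutually exclusive? #{overlaps.empty?}"
```

With this sample, both checks fail: one person has no option that fits, and another fits two.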

cczona

We've seen this one enough. At least that set of values seems covered fully.

cczona

But it's not.

cczona

Still lacking

cczona

Trying hard. But still inadequate to express the real world's complexity.

We do realize that the real world has great variety. Of course we do. But it's hard to know how to handle it.

cczona

What user experience does this schema drive us toward?

It's easy to figure, "Eh, we can refactor later." But initial choices set differing user experiences in motion. Thinking at the outset about the real world's variety and complexity starts us asking early questions that set the foundation for social app-building. Ask: "What user experience does this schema drive us toward?"

cczona

Data doesn't have to be for analysis

It's easy to get into the habit of structuring data for easy analysis. But we can choose to look at human data from a different perspective. Step back. Wallow in the user's perspective.

cczona

Data can be sheer expressiveness

Data can be sheer expressiveness. Data that has character, individualism, distinctiveness.

cczona

What we want

• Structured

• Predictable

• Relational

• Indexed

• Exhaustive

• Easy analytics

• Data-driven decisions

What we get

• Validations, exceptions

• Conditionals, partials

• Premature optimization

• Cultural variability

• Individual POV

• Moving target

• Decisions based on false premises

As developers, we have a vision of what a good codebase should be, and not be.

[READ]

In the most sensible of ways, we often arrive at solutions that are factually TRUTHY while far removed from real-life utility.

cczona

What is your religion, if any?

ARIS is the largest ongoing survey of Americans' religious identification. It asks this simple, OPEN-ENDED question: [slide]

Which nets over 100 unique answers. Which, if you're making a form based on that, is tricky. There's no form element that makes it easy for our users to pick themselves out of a list of so many possibilities.

cczona

ARIS found, though, that these could be compressed into 13 major categories. More manageable list, right? We could use that for a form. But eh, a lot of those are edge cases. We'd rather focus on genuinely major groupings.

cczona

What is your religion, if any?

Christian 76%

Other 4%

None 15%

Don't Know or Refused 5%

Which brings us down to this. At least three quarters of Americans are Christian. Done. Every other religion would just be clutter. Edge cases.

And then there's this sort of crummy data with it. We'd probably assign a nil value for more than one of these categories, right?

cczona

What is your religion?

Christian 76%

Other 4%

n/a 20%

Fixed. Which focuses attention on a problem here: 1 in 5 are not useful answers from an advertiser's perspective, or for our own analytics. The NILS -- gotta go.

cczona

What is your religion?

Christian 76%

Other 24%

So what we're left with is a good, clear list. It covers ALL the big stuff. When you get reductive enough, for Americans, religion is a binary. Which, from a storage standpoint, is great. Booleans! Score.

cczona

Religion

But we don't do THIS. It covers the biggest categories. But oh, how it leaves people out.

cczona

People aren't edge cases

People aren't edge cases.

And they're pushing back on apps that treat them that way.

[LONG BEAT LONG] [BREATH] [CHANGE GEARS]

So: reductive has big problems. So does scaling upward.

cczona

Balancing between approaches

As engineers, it's instinctually uncomfortable to DELIBERATELY NOT STRUCTURE DATA for easy analysis. I feel ya. I really do. This freaks me out too. But again, the foundational question is: What benefit will the user notice? What identities are okay with us?

cczona

And if necessary, we CAN strike a middle ground. This is where guided response comes in: autosuggest.

cczona

Checkbox, Radio, Select

Textarea, Text

Required, Optional, Autosuggest

Minimal Suggest

When there's a subset of values that you're most interested in, do minimal suggest instead. Autosuggest using just the handful of values you care about.

Structured data from those who want to give it. Free form to incite expressiveness in those who want THAT.
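
Here is one way minimal suggest could look on the server side, sketched in plain Ruby. The `suggest` and `accept` helpers and the suggestion list are hypothetical. The idea is only that suggestions guide, while any text, or no text at all, is still accepted.

```ruby
# Return suggestions that start with what the user has typed so far.
def suggest(typed, suggestions)
  prefix = typed.strip.downcase
  return [] if prefix.empty?
  suggestions.select { |s| s.downcase.start_with?(prefix) }
end

# Accept whatever is finally submitted: a suggestion or the user's own words.
# Blank input becomes nil, because answering is optional.
def accept(input)
  value = input.to_s.strip
  value.empty? ? nil : value
end

# Hypothetical handful of values the app cares most about.
suggestions = ["female", "male", "genderqueer", "transgender"]

p suggest("fe", suggestions)
p suggest("zz", suggestions)
p accept("50% tomboy, 50% girly-girl")
p accept("   ")
```

Nothing here validates against the suggestion list, which is the point: the list shapes the easy path without closing off the others.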

Unguided Text

Of those who use MetaFilter's gender field, 40% of responses are f, m, female, male

[read] So structured data IS there. This can be a balanced solution in many cases where you're willing to tolerate some ambiguity. Of course, there are tradeoffs. Data quantity is lower: freed to opt out of providing personal info, many do. On the other hand, data quality should improve.
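
To see how structured data can still be pulled out of a freeform field, here is a small Ruby tally. The sample responses and the `canonical` mapping are made up for illustration (the values echo the slide examples), and the raw answers are left untouched.

```ruby
# Hypothetical sample of freeform answers, including blanks.
responses = ["f", "Female", "m", "male", "Fembot", "MYOB", nil, "It depends", "M", ""]

# Recognised spellings mapped to a bucket (an illustrative choice).
canonical = { "f" => "female", "female" => "female", "m" => "male", "male" => "male" }

# Classify one raw answer without changing what is stored.
def bucket(raw, canonical)
  return :no_answer if raw.nil? || raw.strip.empty?
  canonical.fetch(raw.strip.downcase, :freeform)
end

tally = Hash.new(0)
responses.each { |raw| tally[bucket(raw, canonical)] += 1 }

tally.each { |label, count| puts "#{label}: #{count}" }
```

The bucketing happens at analysis time, so the mapping can change later without asking anyone to answer again.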

Optional Select

60% of Facebook users select a relationship status

It's fine to mix and match here. Find the right approach for your users and your app's business objectives. Facebook, for instance, makes relationship status completely optional, but coercive for those who do opt in to setting a value. Most users do opt in: 60% of them select a relationship status.

cczona

We want everyone to feel excited about what we've built

The bottom line is that [read]

cczona

We want users to feel passionate about their

involvement

[read]

cczona

Analytics, investments & monetization are based on a premise that data is accurate

[read]

BUT when the data has been collected by coercive approaches, the RISK is that--

cczona

this premise is garbage. Some people are lying because lying has been made a requirement for getting past the barriers. So conclusions drawn from that bad data can misdirect decision-making about the next stage of development.

cczona

Checkbox, Radio, Select

Textarea, Text

Required

Corrective

Discretionary, Guided

The restrictive options, that stuff at bottom left: those don't actually have to be marked required.

cczona

ADD COLUMN gender VARCHAR(6) NOT NULL DEFAULT 'female'

But the way we set up a schema often embeds assumptions that we should and we will. So we do. An attribute that is ambiguously named is destined to become a form field whose data wanders away from its original intent, and code that misunderstands how to use it. A field that's not allowed to be null is destined to be mandatory. A field that is assigned a maximum length is an ASSERTION that all possible values are knowable and fit within it.
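
The assertions baked into that column definition can be made visible with a tiny Ruby simulation. `fits_column?` is a hypothetical stand-in for the database constraint, and the sample answers are invented. A real database might raise an error or truncate rather than quietly reject.

```ruby
# Mimic a column's NOT NULL and maximum-length constraints.
def fits_column?(value, max_length:, nullable:)
  return nullable if value.nil?
  value.length <= max_length
end

# Hypothetical answers, including one person who declines to answer.
answers = ["female", "male", "genderqueer", "transgender", nil]

# VARCHAR(6) NOT NULL, as on the slide.
rejected_strict = answers.reject { |a| fits_column?(a, max_length: 6, nullable: false) }

# No length limit, null allowed.
rejected_relaxed = answers.reject { |a| fits_column?(a, max_length: Float::INFINITY, nullable: true) }

p rejected_strict
p rejected_relaxed
```

Under the strict definition, three of the five answers cannot be stored at all; under the relaxed one, none are turned away.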

cczona

ADD column gender_identity NOT NULL VARCHAR

BOOM. This is the foundation for a whole different user experience. And the cool ninja move was that we decided to DO LESS. Make this stuff flexible upfront. Optimize storage later. Decide what's valid later.

cczona

Checkbox, Radio, Select

Textarea, Text

Required, Optional

Alteration

Guided

So there you have it: a discretionary field. Whether to respond is left up to the user.

cczona

t.string :relationship_status, :null => true

As developers, we may look upon THIS expression as redundant, completely unnecessary. Duh, null is true by default. But making that explicit is a communication to the team and to your future self. It's a statement of intent. It's documenting a product decision.

cczona

Facebook

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

What would a canonical set of relationship statuses look like? Three years ago, Facebook figured this list was pretty good. Arguably pretty progressive, too, right? Users disagreed. Strongly.

cczona

Facebook

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

Under pressure, Facebook nearly doubled the options in just two years' time.

cczona

Google+

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I don't want to say

While Google+ has largely adopted that list, it has not included Separated or Divorced. Notice that they also added something choice: opt out of labeling.

cczona

Allowing users to identify their relationships with labels of greater personal significance: that's being driven by people rejecting a user experience that isn't working for them.

cczona

It's a scope

How did some statuses seem universal while others weren't? Naming a thing creates scope. The assumed validity of a field's values gets constrained as soon as the field is named.

cczona

marital_status

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I don't want to say

Marital status, for instance, might lead to a list similar to this one. A person is assumed to be either unmarried, preparing to be married, currently married, or formerly married.

cczona

legal_marital_status

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I don't want to say

Marital status, for instance, might lead to a list similar to this one. A person is assumed to be either unmarried, preparing to be married, currently married, or formerly married.

cczona

relationship_status

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I don't want to say

Whereas relationship status might lead to a list more like this one. In which one either has a current relationship -- or is defined by the absence of any.

cczona

singleness_status

As soon as you name a field, you define its paradigm and possibilities. There are important differences in what each of these envisions and is able to measure. Naming fields -- with great specificity, upfront -- makes analyses more powerful later.

cczona

singleness_rating

No chance whatsoever Suuuuuper duper single

1000

If you change the name, you shift the paradigm and possibilities. There's an important difference in what these set out to measure. Naming fields -- with great specificity, upfront -- makes analyses more powerful later.

cczona

It's Complicated, In a Relationship

Married, Divorced

Widowed, Single

We go through life experiencing many relationships. They don't all have pre-determined labels. And new relationship identities don't necessarily leave old ones behind.

cczona

"I like to be truthful and 'It's Complicated' is really deceiving. It is not complicated. I am separated from my husband, who I am still legally married to."

-- Facebook user

[READ] Why should anyone have to feel like they're required to disavow a relationship that's meaningful for them?

It cuts deeply sometimes. Is a widow only "Widowed" until she resumes dating? Why does she have to be confronted with a decision like that just to use someone's app?

cczona

Okay, this one. Spot the fatal flaw. [WAIT]

cczona

buffer overflow

An open relationship is definitionally a 1:Many join -- WITH the usual engineering understanding, that is, that the actual number of relationships may be 0, 1, OR many.
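
One way to picture the 1:Many point is to give each person a list of relationships, each carrying its own label, instead of a single status value. This Ruby sketch uses plain hashes. The names, labels, and the `labels` helper are hypothetical.

```ruby
# A single status column holds exactly one label per person.
single_column = { name: "Sam", relationship_status: "Open relationship" }

# One-to-many: zero, one, or many relationships, each with its own label.
person = { name: "Sam", relationships: [] }
person[:relationships] << { label: "Married", with: "Alex" }
person[:relationships] << { label: "In a relationship", with: "Jo" }

# Someone with no relationships is simply an empty list, not a special status.
solo = { name: "Pat", relationships: [] }

def labels(person)
  person[:relationships].map { |r| r[:label] }
end

p single_column[:relationship_status]
p labels(person)
p labels(solo)
```

In this shape, adding a new relationship never forces an old label to be overwritten.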

cczona

Choose (one): evasive / inauthentic

But it's not funny from the personal perspective. This schema forces the person to either LOOK EVASIVE or else BE INAUTHENTIC. This is what happens when we try to throw more labels at the problem instead of re-examining the schema's assumptions.

cczona

Facebook relationship status by user age

srsly? When we don't give people the means to express what's real and important for them, they hack around us. Are a quarter of Facebook's 13-year-olds really married? Probably not. What they probably have is a BEST FRIEND, and "married" is the nearest equivalent available for them to express "THIS PERSON is my most important relationship."

The dogged pursuit of nice, clean, CRUNCHABLE data takes us to such wrong places. And makes our users want to work against a system that doesn't acknowledge their lives' reality.

cczona

We build or break community with each line of code

As developers, our lines of code build -- or break -- an app's community. So it's worth making conscious choices as we go.

cczona

Modeling the real world is complex

[READ] And that's okay.

cczona

Assuming we know who users are surrenders opportunity to learn who they are

But [read]. Early constraints in schema NET crappy, misleading data. [beat] So keep constraints out of human and social schema, at least at first. Gather enough initial response to do some data mining. Watch that data for a while. Keep an eye out for emergent trends. What DISCOVERIES can you make? How can they shape the app's growth and evolution?

cczona

Quality

Data quality improves when lies are merely optional, not required.

cczona

Specificity

Data becomes like the real world itself: rich and specific. So we can unearth the patterns that are undetectable when data is generic.

cczona

Adaptability

We get to make discoveries and re-envision possibilities.

We adapt quickly.

cczona

Loyalty

And their response to all this? Engagement. Passion. LOYALTY.

cczona

THOSE are foundations for great user experiences. [LOOK UP FINALITY]

@cczona

cczona@gmail.com

http://cczona.com

Questions? Conversation?

Oh Hell Yeah

Carina C. Zona

cczona

Resources

cczona

Data Science & Information Architecture

• Sociological normalization

• Database normalization

• Using Machine Learning On Social Networks To Figure Out What You Should Read On The Web

• NoSQL Data Modeling Techniques

• Data & Reality, by William Kent

• Bad Data Handbook, Q. Ethan McCallum

• Data Science of the Facebook World

cczona

Relationships

• Does Facebook Hurt Relationships?

• Facebook Adds LGBT-Friendly Relationship Status Options

• Facebook Targeting by Relationship Status & Workplace

• Your Facebook Relationship Status: It's Complicated

• Gay Marriage: The Database Engineering Perspective

cczona

Sex & Genders

• "Disalienation: Why Gender is a Text Field on Diaspora"

• "Gender & Drop Down Menus"

• "Sex & Gender"

• "Bucket Gender"

• Recommendations for Inclusive Data Collection of Trans People

cczona

Names

• Falsehoods Programmers Believe About Names

• Your Last Name Contains Invalid Characters

• W3C Internationalization: Personal Names Around the World

• Spanish Names

• Chinese Names

cczona

More

• Redesigning the Country Selector

• American Religious Identification Survey Summary Report, 2009

• Linguistic Potluck: Crowdsourcing Internationalization in Rails

cczona

Image Credits

Postsecret

Facebook

OKCupid

Google+

Metafilter

Diaspora

Flickr

FetLife

Kotangle

cutestpaw.com

hdwallpapers.in

xkcd

cczona

Many Thanks

Chiu-Ki Chan

Estelle Weyl

Heather Rivers

Heroku

Jeremy Dunck

Josh Susser

Michele Titolo

NIRD LLC

Renée De Voursney

Sarah Mei

San Francisco Sex Information

Yoz Grahame

cczona

Get in Touch

Carina C. Zona

@cczona

http://cczona.com

cczona@gmail.com

http://slideshare.net/cczona

http://linkedin.com/in/cczona

Page 24: Schemas for the Real World [Madison RubyConf 2013]

cczona

Database Schema

bull Structure described in the databases language

bull Blueprint for database construction

bull Describes how the real world is being modeled

Database schemas are closely related to that Its a mental schema translated into blueprints for a database

cczona

These are simply frontend manifestations of various individuals MENTAL schemas And [SLOW DOWN] when we look at them closely we can see [BRIEF PAUSE] that schemas are foundation [BRIEF PAUSE] for expressing

cczona

deeply intimate

things that are [READ]

cczona

important relationships

Schemas are foundation for expressingthings [READ]

cczona

self-image

Schemas are foundation for expressingthings [READ]

cczona

Schemas define the

user experience

Schemas define user experience

cczona

Our schemas are leaving people behind

amp UX

[read]We can fix that

cczona

What benefit will the user notice

By asking ourselves What benefit will the USER NOTICE THIS is not equivalent to How will the user benefit Because THAT is a question which grants us too much latitude To ASSUME that what we want is of course to their benefit because its gonna help us deliver a product thats Awesome

[READ AGAIN] focuses us on EXPERIENCES

cczona

Evaluating from user perspective gives us focus

There isnt ONE WAY There isnt ONE RIGHT ANSWER Because the NATURE of our dilemma is that users vary And so do their social worlds So we always need to understand varying approaches How they fit different contexts What tradeoffs they bring AND the obvious boring question which options best serve THIS apps business requirements

cczona

CheckboxRadio buttonSelect menuRanges

Coerced DiscretionaryGuided

Corrective Text InputTextarea

Checkboxes radio buttons select menus ranges these each imply that EVERY possible value can -- and is -- included If any value isnt included there its real world being rejected because it didnt match up with our mental schema

cczona

CheckboxRadioSelect

Coerced DiscretionaryGuided

Corrective Text InputTextarea

Checking a box is a one step action Entering a text string is not So a freeform solution such as a textarea or text input isnt automatically exciting BUT they can bring visible meaningful benefit

cczona

11 years

At MetaFilter gender has been a text field for over a decade

cczona

hellipthe early crowd at MeFi were often programmers and they hated the idea of dirty data collectionhellip

mdashMatt Haughey founder

Initially some SHUDDERED at the thought Because [READ]

cczona

All girl

XY

Fella

Queer

Fembot

Alto

It depends

MYOB

Dangly bits

Chicklet

Innie not outie

Convex

Sideburns

Ambisextrous

Member of the patriarchy

For about 5 seconds Then they jumped on board Because they could be creative and silly

cczona

I speak using the male gender when required by language 50 quintessential

tomboy 50 total girly-girl

AND because they could express this THING about themselves fully With authentic voice

cczona

TransgenderGenderqueer

[PAUSE]

cczona

The freeform gender field is something I cherish about Metafilter

mdashMeFi user

That text field grew into a beloved institution Whatever you as user choose to put into that field says something revealing about who YOU are

cczona

It was one of the earliest indications Id landed in the right place

mdashMeFi user

That youre allowed to put in ANYTHING -- or put in NOTHING AT ALL -- says something revealing about what how MetaFilter envisions community That schemas trust in users was the foundation for Metafilter users to ASK to share themselves even more

cczona

In 2010 Diaspora likewise turned gender into a text field Just like on MetaFilter users felt set free by it

Theres an important distinction here though Metafilter is closed source its one developer and its entirely in English Diaspora needed to meet a different set of needs Its open source and its international

cczona

Not amused

So some of its developers are not amused Its fair to object that this approach wreaks chaos on internationalization of pronouns [BEAT] And heres what Ive got to say to that

cczona

Internationalization is hellish period

Internationalization of pronouns is hellish period You get no safety net in this Constraining gender options doesnt solve translation of languages Dont set yourself up for trying to deal with internationalization this way It will fail you

cczona

Whats your legal gender

bull Indeterminate

bull X

bull Sex Not Specified

And if you really thing that boiling this question down to Whats your legal gender is an easy out lets talk about internationaliztion of that too Because this is the year that indeterminate has become a legal gender on German birth certificates and that X is a legal gender on Australian passports In New South Wales Australia Sex Not Specified is yet another legally-recognized gender Binary gender is today APP FAIL

cczona

the most complicated thing Irsquove ever spent a lot of time learning about

And Irsquove spent a lot of time learning about quantum mechanics

mdashRandall Munroe xkcd

Randall Munroe best known as xkcd has examined the issue of pronouns many times in depth for English language projects He says its [READ] So hes what he ultimately concluded

cczona

Ask

Ask

cczona

Which pronouns do you prefer

hehimhimselfhishis

sheherherselfhersher

theythemthemselftheirstheir

itititselfitsits

by name

Asking straight up Which Pronouns Do You Prefer is truly the best he could come up with

cczona

Which pronouns do you prefer

hehimhimselfhishis

sheherherselfhersher

theythemthemselftheirstheir

personal name

other ____________

Or maybe slightly refining like these

And I know you may be looking at that 3rd row and protesting

cczona

Which pronouns do you prefer

hehimhimselfhishis

sheherherselfhersher

theythemthemselftheirstheir

personal name

other ____________

Our english teachers told us they and their arent singular right Nope Theyre plural AND SINGULAR and have been for 400 years Take the word of Jane Austen Shakespeare Chaucer Lewis Carroll Modern english authorities agree on this one too Its excellent English -- and excellent SOCIAL -- to use it when people ask us to

cczona

Even Facebook has taken note that it's okay. They've been using this construction for at least 5 years and are doing fine :-)

cczona

"Home is the place where, when you have to go there, they have to take you in."

-- Robert Frost

We aspire to make our social apps feel like home. We can do that by making development choices that say, "Hey user, we GET you. Be as individual and messy as you want to. We can handle _you_ being _you_."

Social Research

When collecting data on people, we're in a different realm: social sciences. If you want to run useful analytics about personal attributes and behavior, then data collection needs to meet at least the two minimum criteria for social science research.

Exhaustive: Every possible option

A field's values must include every possible option.

Mutually Exclusive: No overlap exists between them

And the field's options must be mutually exclusive.

How many social apps have both of those bare-minimum criteria covered?
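The two criteria can be checked mechanically against sample respondents. The sketch below assumes each option can be written as a predicate over a respondent. `audit_options`, the option rules, and the respondent hashes are all made up for illustration.

```ruby
# Hypothetical helper: audit a field's options against sample respondents.
# Each option maps a label to a predicate that says whether it applies.
def audit_options(options, respondents)
  uncovered   = []  # no option fits: the list is not exhaustive
  overlapping = []  # several options fit: the list is not mutually exclusive
  respondents.each do |person|
    matches = options.select { |_label, applies| applies.call(person) }.keys
    uncovered   << person if matches.empty?
    overlapping << person if matches.size > 1
  end
  { exhaustive: uncovered.empty?, mutually_exclusive: overlapping.empty?,
    uncovered: uncovered, overlapping: overlapping }
end

# Illustrative rules only; real definitions are a product decision.
options = {
  "Married"   => ->(r) { r[:legally_married] },
  "Separated" => ->(r) { r[:separated] },
  "Single"    => ->(r) { !r[:legally_married] && !r[:partnered] }
}

respondents = [
  { name: "A", legally_married: true,  separated: true,  partnered: false },
  { name: "B", legally_married: false, separated: false, partnered: true },
  { name: "C", legally_married: false, separated: false, partnered: false }
]

result = audit_options(options, respondents)
puts "exhaustive? #{result[:exhaustive]}"
puts "mutually exclusive? #{result[:mutually_exclusive]}"
```

Here respondent A fits two options and respondent B fits none, so this list fails both criteria.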

cczona

We've seen this one enough. At least that set of values seems covered fully.

cczona

But it's not.

cczona

Still lacking

cczona

Trying hard. But still inadequate to express the real world's complexity.

We do realize that the real world has great variety. Of course we do. But it's hard to know how to handle it.

cczona

What user experience does this schema drive us toward?

It's easy to figure, "Eh, we can refactor later." But initial choices set differing user experiences in motion. Thinking at the outset about the real world's variety and complexity starts us asking early questions that set the foundation for social app-building. Ask: "What user experience does this schema drive us toward?"

cczona

Data doesn't have to be for analysis

It's easy to get into the habit of structuring data for easy analysis. But we can choose to look at human data from a different perspective. Step back. Wallow in the user's perspective.

cczona

Data can be sheer expressiveness

Data can be sheer expressiveness. Data that has character, individualism, distinctiveness.

cczona

What we want:

• Structured

• Predictable

• Relational

• Indexed

• Exhaustive

• Easy analytics

• Data-driven decisions

What we get:

• Validations, exceptions

• Conditionals, partials

• Premature optimization

• Cultural variability

• Individual POV

• Moving target

• Decisions based on false premises

As developers, we have a vision of what a good codebase should be, and not be.

[READ]

In the most sensible of ways, we often arrive at solutions that are factually TRUTHY while far removed from real-life utility.

cczona

What is your religion, if any?

ARIS is the largest ongoing survey of Americans' religious identification. It asks this simple, OPEN-ENDED question: [slide]

Which nets over 100 unique answers. Which, if you're making a form based on that, is tricky. There's no form element that makes it easy for our users to pick themselves out of a list of so many possibilities.

cczona

ARIS found, though, that these could be compressed into 13 major categories. More manageable list, right? We could use that for a form. But, eh, a lot of those are edge cases. We'd rather focus on genuinely major groupings.

cczona

What is your religion, if any?

Christian 76%

Other 4%

None 15%

Don't Know or Refused 5%

Which brings us down to this. At least three quarters of Americans are Christian. Done. Every other religion would just be clutter. Edge cases.

And then there's this sort of crummy data with it. We'd probably assign a nil value for more than one of these categories, right?

cczona

What is your religion?

Christian 76%

Other 4%

n/a 20%

Fixed. Which focuses attention on a problem here: 1 in 5 are not useful answers from an advertiser's perspective, or for our own analytics. The NILS -- gotta go.

cczona

What is your religion?

Christian 76%

Other 24%

So what we're left with is a good, clear list. It covers ALL the big stuff. When you get reductive enough, for Americans, religion is a binary. Which from a storage standpoint is great. Booleans! Score!

cczona

Religion

But we don't do THIS. It covers the biggest categories. But oh, how it leaves people out.

cczona

People aren't edge cases

People aren't edge cases.

And they're pushing back on apps that treat them that way.

[LONG BEAT LONG] [BREATH] [CHANGE GEARS]

So. Reductive has big problems. So does scaling upward.

cczona

Balancing between approaches

As engineers, it's instinctually uncomfortable to DELIBERATELY NOT STRUCTURE DATA for easy analysis. I feel ya. I really do. This freaks me, too. But again, the foundational question is, "What benefit will the user notice?" What identities are okay with us?

cczona

And if necessary, we CAN strike a middle ground. This is where guided response comes in. Autosuggest.

cczona

Checkbox, Radio, Select

Textarea, Text

Required, Optional, Autosuggest

Minimal Suggest

When there's a subset of values that you're most interested in, do minimal suggest instead. Autosuggest using just the handful of values you care about.

Structured data from those who want to give it. Free form to incite expressiveness in those who want THAT.
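A minimal-suggest field can be sketched as two small functions: one proposes matches from a short list, the other stores whatever the user actually typed. The suggestion list and the names `SUGGESTED`, `suggestions_for`, and `accept` below are assumptions made for illustration.

```ruby
# Hypothetical sketch of "minimal suggest": suggest a handful of values, accept anything.
SUGGESTED = ["female", "male", "genderqueer", "transgender"].freeze

# Values to offer while the user types; an empty list means "keep typing freely".
def suggestions_for(typed, suggested = SUGGESTED)
  prefix = typed.strip.downcase
  return [] if prefix.empty?
  suggested.select { |value| value.start_with?(prefix) }
end

# What gets stored: the user's own words, or nil when they chose not to answer.
def accept(typed)
  value = typed.strip
  value.empty? ? nil : value
end

p suggestions_for("fe")
p suggestions_for("Alto")
p accept("  Alto ")
p accept("")
```

Nothing is ever rejected here. The suggestions only make the values you care about the easiest ones to enter.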

Unguided Text: Of those who use MetaFilter's gender field, 40% of responses are "f", "m", "female", "male"

[read] So structured data IS there. This can be a balanced solution in many cases where you're willing to tolerate some ambiguity. Of course, there are tradeoffs. Data quantity is lower. Freed to opt out of providing personal info, many do. On the other hand, data quality should improve.
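Freeform answers can still feed analytics if the bucketing happens at analysis time, leaving the stored text untouched. This is a rough sketch: the alias table `BUCKETS` and the sample responses are assumptions, with a few sample values borrowed from the MetaFilter slide.

```ruby
# Hypothetical aliases; which spellings to fold together is an analyst's assumption.
BUCKETS = {
  "f" => "female", "female" => "female",
  "m" => "male",   "male"   => "male"
}.freeze

# Counts answers per bucket without modifying what users wrote.
def summarize(responses)
  answered = responses.map { |r| r.to_s.strip }.reject(&:empty?)
  counts = Hash.new(0)
  answered.each { |r| counts[BUCKETS.fetch(r.downcase, :freeform)] += 1 }
  { answered: answered.size, blank: responses.size - answered.size, counts: counts }
end

responses = ["F", "male", "Alto", "", nil, "m", "It depends", "female", "MYOB", "Fella"]
summary = summarize(responses)
puts summary.inspect
```

In this made-up sample, half of the non-blank answers fall into the two familiar buckets, and the rest stay exactly as their authors wrote them.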

Optional Select: 60% of Facebook users select a relationship status

It's fine to mix and match here. Find the right approach for your users and your app's business objectives. Facebook, for instance, makes relationship status completely optional, but coercive for those who do opt in to setting a value. Most users do opt in: 60% of them select a relationship status.

cczona

We want everyone to feel excited about what we've built

The bottom line is that [read]

cczona

We want users to feel passionate about their involvement

[read]

cczona

Analytics investments & monetization are based on a premise that data is accurate

[read]

BUT when the data has been collected by coercive approaches, the RISK is that--

cczona

this premise is garbage. Some people are lying because lying has been made a requirement for getting past the barriers. So conclusions drawn from that bad data can misdirect decision-making about the next stage of development.

cczona

Checkbox, Radio, Select

Textarea, Text

Required

Corrective

Discretionary, Guided

The restrictive options, that stuff at bottom left: those don't actually have to be marked required.

cczona

ADD COLUMN gender VARCHAR(6) NOT NULL DEFAULT 'female'

But the way we set up schema often embeds assumptions that we should, and we will. So we do. An attribute that is ambiguously named is destined to become a form field whose data wanders away from its original intent, and code that misunderstands how to use it. A field that's not allowed to be null is destined to be mandatory. A field that is assigned a maximum length is an ASSERTION that all possible values are knowable and fit within it.
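To see how each constraint on the slide turns into user-facing behavior, here is a plain-Ruby stand-in for a database column. `Column` and `store` are hypothetical and are not an ORM API. The "flexible" column, nullable with no length limit, is an assumption about what a less constrained schema could look like.

```ruby
# Hypothetical stand-in for a database column; not a real ORM or driver API.
Column = Struct.new(:name, :null, :limit, :default) do
  # Returns the value that would be stored, or raises as the database would.
  def store(value)
    value = default if value.nil?
    raise ArgumentError, "#{name} cannot be null" if value.nil? && !null
    raise ArgumentError, "#{name} too long (max #{limit})" if limit && value.length > limit
    value
  end
end

strict   = Column.new("gender", false, 6, "female")        # as on the slide
flexible = Column.new("gender_identity", true, nil, nil)   # assumed alternative

p strict.store(nil)  # the default silently answers on the user's behalf
begin
  strict.store("genderqueer")
rescue ArgumentError => e
  puts e.message     # the length limit rejects a real answer
end
p flexible.store(nil)
p flexible.store("genderqueer")
```

The strict column never stores "no answer", and it cannot store an answer longer than six characters. Both outcomes were decided before any user showed up.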

cczona

ADD column gender_identity NOT NULL VARCHAR

BOOM. This is the foundation for a whole different user experience. And the cool ninja move was that we decided to DO LESS. Make this stuff flexible upfront. Optimize storage later. Decide what's valid later.

cczona

Checkbox, Radio, Select

Textarea, Text

Required, Optional

Alteration

Guided

So there you have it: a discretionary field. Whether to respond is left up to the user.

cczona

t.string :relationship_status, :null => true

As developers, we may look upon THIS expression as redundant, completely unnecessary. Duh, null is true by default. But making that explicit is a communication to the team and to your future self. It's a statement of intent. It's documenting a product decision.

cczona

Facebook

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

What would a canonical set of relationship statuses look like? Three years ago, Facebook figured this list was pretty good. Arguably pretty progressive, too, right? Users disagreed. Strongly.

cczona

Facebook

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

Under pressure, Facebook nearly doubled the options in just two years' time.

cczona

Google+

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I don't want to say

While Google+ has largely adopted that list, it has not included Separated or Divorced. Notice that they also added something choice: opt out of labeling.

cczona

Allowing users to identify their relationships with labels of greater personal significance. That's being driven by people rejecting a user experience that isn't working for them.

cczona

It's a scope

How did some statuses seem universal while others weren't? Naming a thing creates scope. The assumed validity of a field's values gets constrained as soon as the field is named.

cczona

marital_status

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I don't want to say

Marital status, for instance, might lead to a list similar to this one. A person is assumed to be either unmarried, preparing to be married, currently married, or formerly married.

cczona

legal_marital_status

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I don't want to say

Marital status, for instance, might lead to a list similar to this one. A person is assumed to be either unmarried, preparing to be married, currently married, or formerly married.

cczona

relationship_status

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I don't want to say

Whereas relationship status might lead to a list more like this one. In which one either has a current relationship -- or is defined by the absence of any.

cczona

singleness_status

As soon as you name a field, you define its paradigm and possibilities. There are important differences in what each of these envisions and is able to measure. Naming fields -- with great specificity, upfront -- makes analyses more powerful later.

cczona

singleness_rating

No chance whatsoever Suuuuuper duper single

1000

If you change the name, you shift the paradigm and possibilities. There's an important difference in what these set out to measure. Naming fields -- with great specificity, upfront -- makes analyses more powerful later.

cczona

It's Complicated, In a Relationship

Married, Divorced

Widowed, Single

We go through life experiencing many relationships. They don't all have pre-determined labels. And new relationship identities don't necessarily leave old ones behind.

cczona

"I like to be truthful, and 'It's Complicated' is really deceiving. It is not complicated. I am separated from my husband, who I am still legally married to."

-- Facebook user

[READ] Why should anyone have to feel like they're required to disavow a relationship that's meaningful for them?

It cuts deeply sometimes. Is a widow only "Widowed" until she resumes dating? Why does she have to be confronted with a decision like that just to use someone's app?

cczona

Okay, this one. Spot the fatal flaw. [WAIT]

cczona

buffer overflow

An open relationship is definitionally a 1:Many join -- WITH the usual engineering understanding, that is, that the actual number of relationships may be 0, 1, OR many.

cczona

Choose (one): evasive / inauthentic

But it's not funny from the personal perspective. This schema forces the person to either LOOK EVASIVE or else BE INAUTHENTIC. This is what happens when we try to throw more labels at the problem instead of re-examining the schema's assumptions.
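Re-examining the assumption could mean dropping the single status column and letting a person hold zero, one, or many labelled relationships. The sketch below is one possible shape for that. `Person`, `Relationship`, and `add_relationship` are hypothetical names, not the talk's design.

```ruby
# Hypothetical model: a person has 0, 1, or many relationships, each with its own
# label, instead of one status column that forces a single choice.
Relationship = Struct.new(:label, :with)

class Person
  attr_reader :name, :relationships

  def initialize(name)
    @name = name
    @relationships = []   # empty is a perfectly valid state
  end

  # label is free text supplied by the user; with is optional.
  def add_relationship(label, with: nil)
    @relationships << Relationship.new(label, with)
    self
  end

  def labels
    relationships.map(&:label)
  end
end

user = Person.new("Pat")
user.add_relationship("Married", with: "Alex")
user.add_relationship("Separated", with: "Alex")
p user.labels
p Person.new("Lee").labels
```

With this shape, the user quoted earlier can be both "Married" and "Separated" without choosing between looking evasive and being inauthentic.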

cczona

Facebook relationship status by user age

srsly

When we don't give people means to express what's real and important for them, they hack around us. Are a quarter of Facebook's 13-year-olds really married? Probably not. What they probably have is a BEST FRIEND, and "married" is the nearest equivalent available for them to express "THIS PERSON is my most important relationship."

The dogged pursuit of nice, clean, CRUNCHABLE data takes us to such wrong places. And makes our users want to work against a system that doesn't acknowledge their lives' reality.

cczona

We build or break community with each line of code

As developers, our lines of code build -- or break -- an app's community. So it's worth making conscious choices as we go.

cczona

Modeling the real world is complex

[READ] And that's okay.

cczona

Assuming we know who users are surrenders opportunity to learn who they are

But [read]. Early constraints in schema NET crappy, misleading data. [beat] So keep constraints out of human and social schema, at least at first. Gather enough initial response to do some data mining. Watch that data for a while. Keep an eye out for emergent trends. What DISCOVERIES can you make? How can they shape the app's growth and evolution?
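Watching for emergent trends can start as a simple tally of the answers that none of your expected labels cover. This is a sketch only: `emergent_labels`, the `known` list, and the `min_count` threshold are assumptions, and real data would need more careful normalization.

```ruby
# Hypothetical sketch: surface freeform answers that recur but are not in the
# list of labels we expected, so the data can suggest what to support next.
def emergent_labels(responses, known:, min_count: 2)
  tally = Hash.new(0)
  responses.each do |raw|
    label = raw.to_s.strip.downcase
    next if label.empty? || known.include?(label)
    tally[label] += 1
  end
  # Most frequent first; ties broken alphabetically for stable output.
  tally.select { |_label, n| n >= min_count }.sort_by { |label, n| [-n, label] }
end

responses = ["best friend", "Best Friend", "married", "single",
             "best friend ", "roommates", "engaged-ish"]
found = emergent_labels(responses, known: ["single", "married"])
p found
```

In this made-up sample, "best friend" surfaces as a relationship people want to express, the same signal the 13-year-olds were sending by choosing "married".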

cczona

Quality

Data quality improves when lies are merely optional, not required.

cczona

Specificity

Data becomes like the real world itself: rich and specific. So we can unearth the patterns that are undetectable when data is generic.

cczona

Adaptability

We get to make discoveries and re-envision possibilities.

We adapt quickly.

cczona

Loyalty

And their response to all this? Engagement. Passion. LOYALTY.

cczona

THOSE are foundations for great user experiences. [LOOK UP. FINALITY.]

@cczona, cczona@gmail.com, http://cczona.com

Questions? Conversation?

Oh Hell Yeah

Carina C Zona

cczona

Resources

cczona

Data Science & Information Architecture

• Sociological normalization

• Database normalization

• Using Machine Learning On Social Networks To Figure Out What You Should Read On The Web

• NoSQL Data Modeling Techniques

• Data & Reality, by William Kent

• Bad Data Handbook, by Q. Ethan McCallum

• Data Science of the Facebook World

cczona

Relationships

• Does Facebook Hurt Relationships?

• Facebook Adds LGBT-Friendly Relationship Status Options

• Facebook Targeting by Relationship Status & Workplace

• Your Facebook Relationship Status: It's Complicated

• Gay Marriage: The Database Engineering Perspective

cczona

Sex & Genders

• "Disalienation: Why Gender is a Text Field on Diaspora"

• "Gender & Drop Down Menus"

• "Sex & Gender"

• "Bucket Gender"

• Recommendations for Inclusive Data Collection of Trans People

cczona

Names

• Falsehoods Programmers Believe About Names

• Your Last Name Contains Invalid Characters

• W3C Internationalization: Personal Names Around the World

• Spanish Names

• Chinese Names

cczona

More

• Redesigning the Country Selector

• American Religious Identification Survey: Summary Report 2009

• Linguistic Potluck: Crowdsourcing Internationalization in Rails

cczona

Image Credits

Postsecret

Facebook

OKCupid

Google+

Metafilter

Diaspora

Flickr

FetLife

Kotangle

cutestpaw.com

hdwallpapers.in

xkcd

cczona

Many Thanks

Chiu-Ki Chan, Estelle Weyl

Heather Rivers, Heroku

Jeremy Dunck, Josh Susser

Michele Titolo, NIRD LLC

Renée De Voursney, Sarah Mei

San Francisco Sex Information, Yoz Grahame

cczona

Get in Touch

Carina C. Zona, @cczona

http://cczona.com

cczona@gmail.com

http://slideshare.net/cczona

http://linkedin.com/in/cczona

Page 25: Schemas for the Real World [Madison RubyConf 2013]

cczona

These are simply frontend manifestations of various individuals' MENTAL schemas. And [SLOW DOWN] when we look at them closely, we can see [BRIEF PAUSE] that schemas are foundation [BRIEF PAUSE] for expressing

cczona

deeply intimate

things that are [READ]

cczona

important relationships

Schemas are foundation for expressing things [READ]

cczona

self-image

Schemas are foundation for expressing things [READ]

cczona

Schemas define the user experience

Schemas define user experience.

cczona

Our schemas are leaving people behind

& UX

[read] We can fix that.

cczona

What benefit will the user notice?

By asking ourselves, "What benefit will the USER NOTICE?" THIS is not equivalent to "How will the user benefit?" Because THAT is a question which grants us too much latitude to ASSUME that what we want is, of course, to their benefit, because it's gonna help us deliver a product that's Awesome.

[READ AGAIN] focuses us on EXPERIENCES.

cczona

Evaluating from user perspective gives us focus

There isn't ONE WAY. There isn't ONE RIGHT ANSWER. Because the NATURE of our dilemma is that users vary. And so do their social worlds. So we always need to understand varying approaches. How they fit different contexts. What tradeoffs they bring. AND the obvious, boring question: which options best serve THIS app's business requirements?

cczona

Checkbox, Radio button, Select menu, Ranges

Coerced, Discretionary, Guided

Corrective, Text Input, Textarea

Checkboxes, radio buttons, select menus, ranges: these each imply that EVERY possible value can be -- and is -- included. If any value isn't included there, it's the real world being rejected because it didn't match up with our mental schema.

cczona

Checkbox, Radio, Select

Coerced, Discretionary, Guided

Corrective, Text Input, Textarea

Checking a box is a one-step action. Entering a text string is not. So a freeform solution such as a textarea or text input isn't automatically exciting. BUT they can bring visible, meaningful benefit.

cczona

11 years

At MetaFilter, gender has been a text field for over a decade.

cczona

"...the early crowd at MeFi were often programmers and they hated the idea of dirty data collection..."

-- Matt Haughey, founder

Initially, some SHUDDERED at the thought. Because [READ]

cczona

All girl

XY

Fella

Queer

Fembot

Alto

It depends

MYOB

Dangly bits

Chicklet

Innie not outie

Convex

Sideburns

Ambisextrous

Member of the patriarchy

For about 5 seconds. Then they jumped on board. Because they could be creative and silly.

cczona

I speak using the male gender when required by language

50% quintessential tomboy, 50% total girly-girl

AND because they could express this THING about themselves fully. With authentic voice.

cczona

Transgender

Genderqueer

[PAUSE]

cczona

"The freeform gender field is something I cherish about Metafilter."

-- MeFi user

That text field grew into a beloved institution. Whatever you, as user, choose to put into that field says something revealing about who YOU are.

cczona

"It was one of the earliest indications I'd landed in the right place."

-- MeFi user

That you're allowed to put in ANYTHING -- or put in NOTHING AT ALL -- says something revealing about how MetaFilter envisions community. That schema's trust in users was the foundation for MetaFilter users to ASK to share themselves even more.

cczona

In 2010, Diaspora likewise turned gender into a text field. Just like on MetaFilter, users felt set free by it.

There's an important distinction here, though. MetaFilter is closed source, it's one developer, and it's entirely in English. Diaspora needed to meet a different set of needs. It's open source, and it's international.

cczona

Not amused

So some of its developers are not amused. It's fair to object that this approach wreaks chaos on internationalization of pronouns. [BEAT] And here's what I've got to say to that:

cczona

Internationalization is hellish period

Internationalization of pronouns is hellish period You get no safety net in this Constraining gender options doesnt solve translation of languages Dont set yourself up for trying to deal with internationalization this way It will fail you

cczona

Whats your legal gender

bull Indeterminate

bull X

bull Sex Not Specified

And if you really thing that boiling this question down to Whats your legal gender is an easy out lets talk about internationaliztion of that too Because this is the year that indeterminate has become a legal gender on German birth certificates and that X is a legal gender on Australian passports In New South Wales Australia Sex Not Specified is yet another legally-recognized gender Binary gender is today APP FAIL

cczona

the most complicated thing Irsquove ever spent a lot of time learning about

And Irsquove spent a lot of time learning about quantum mechanics

mdashRandall Munroe xkcd

Randall Munroe best known as xkcd has examined the issue of pronouns many times in depth for English language projects He says its [READ] So hes what he ultimately concluded

cczona

Ask

Ask

cczona

Which pronouns do you prefer

hehimhimselfhishis

sheherherselfhersher

theythemthemselftheirstheir

itititselfitsits

by name

Asking straight up Which Pronouns Do You Prefer is truly the best he could come up with

cczona

Which pronouns do you prefer

hehimhimselfhishis

sheherherselfhersher

theythemthemselftheirstheir

personal name

other ____________

Or maybe slightly refining like these

And I know you may be looking at that 3rd row and protesting

cczona

Which pronouns do you prefer

hehimhimselfhishis

sheherherselfhersher

theythemthemselftheirstheir

personal name

other ____________

Our english teachers told us they and their arent singular right Nope Theyre plural AND SINGULAR and have been for 400 years Take the word of Jane Austen Shakespeare Chaucer Lewis Carroll Modern english authorities agree on this one too Its excellent English -- and excellent SOCIAL -- to use it when people ask us to

cczona

Even Facebook has taken note that its okay Theyve been using this construction for at least 5 years and are doing fine -)

cczona

Home is the place where when you have to go there they have to take you in

mdashRobert Frost

We aspire to make our social apps feel like home We can do that by making development choices that say Hey user we GET you Be as individual and messy as you want to We can handle _you_ being _you_

Social Research

When collecting data on people were in a different realm Social sciences If you want to run useful analytics about personal attributes and behavior then data collection needs to meet at least the two minimum criteria for social science research

ExhaustiveEvery possible option

A fields values must include every possible option

Mutually ExclusiveNo overlap exists between them

And the fields options must be mutually exclusive

How many social apps have both of those bare minimum criteria covered

cczona

Weve seen this one enough At least that set of values seems covered fully

cczona

`

But its not

cczona

Still lacking

cczona

Trying hard But still inadequate to express the real worlds complexity

We do realize that the real world has great variety Of course we do But its hard to know how to handle it

cczona

What user experience does this schema drive us toward

Its easy to figure Eh we can refactor later But initial choices set differing user experiences in motion Thinking at the outset about the real worlds variety and complexity starts us asking early questions that set foundation for social app-building Ask What user experience does this schema drive us toward

cczona

Data doesnrsquot have to be for analysis

Its easy to get into the habit of structuring data for easy analysis But we can choose to look at human data from a different perspective Step back Wallow in the users perspective

cczona

Data can be sheerexpressiveness

Data can be sheer expressiveness Data that has character individualism distinctiveness

cczona

What we want What we get

bull Structured

bull Predictablebull Validations exceptions

bull Conditionals partials

bull Relational

bull Indexed bull Premature optimization

bull Exhaustivebull Cultural variability

bull Individual POV

bull Moving target

bull Easy analytics

bull Data-driven decisionsbull Decisions based on false

premises

As developers we have a vision of what a good codebase should be and not be

[READ]

In the most sensible of ways we often arrive at solutions that are factually TRUTHY while far removed from real life utility

cczona

What is your religion if any

ARIS is the largest ongoing survey of Americans religious identification It asks this simple OPEN-ENDED question [slide]

Which nets over 100 unique answers Which if youre making a form based on that is tricky Theres no form element that makes it easy for our users to pick themselves out of a list of so many possibilities

cczona

ARIS found though that these could be compressed into 13 major categories More manageable list right We could use that for a form But eh a lot of those are edge cases Wed rather want to focus on genuinely major groupings

cczona

What is your religion if any

Christian 76

Other 4

None 15

Dont Know or Refused 5

Which brings us down to this At least a quarter of Americans are Christian Done Every other religion would just be clutter Edge cases

And then theres this sort of crummy data with it Wed probably assign nil value for more than one of these categories right

cczona

What is your religion

Christian 76

Other 4

na 20

Fixed Which focuses attention on a problem here 1 in 5 are not useful answers from an advertisers perspective or for out own analytics The NILS -- gotta go

cczona

What is your religion

Christian 76

Other 24

So what were left with is a good clear list It covers ALL the big stuff When you get reductive enough for Americans religion it a binary Which from a storage standpoint is great Booleans Score

cczona

Religion

But we dont do THIS It covers the biggest categories But oh how it leaves people out

cczona

People arent edge cases

People arent edge cases

And theyre pushing back on apps that treat them that way

[LONG BEAT LONG] [BREATH] [CHANGE GEARS]

So Reductive has big problems So does scaling upward

cczona

Balancing between approaches

As engineers its instinctually uncomfortable to DELIBERATELY NOT STRUCTURE DATA for easy analysis I feel ya I really do This freaks me too But again the foundational question is What benefit will the user notice What identities are okay with us

cczona

And if necessary we CAN strike a middleground This is where guided response comes in Autosuggest

cczona

CheckboxRadioSelect

TextareaText

Required OptionalAutosuggest

Minimal Suggest

When theres a subset of values that youre most interested in do minimal suggest instead Autosuggest using just the handful of values you care about

Structured data from those who want to give it Free form to incite expressiveness in those who want THAT

Unguided TextOf those who use MetaFilters gender field

40 of responses are f m female male

[read] So structured data IS there This can be a balanced solution in many cases where youre willing to tolerate some ambiguity Of course there are tradeoffsData quantity is lower Freed to opt out of proving personal info many doOn the other hand data quality should improve

Optional Select60 of Facebook users select a relationship status

Its fine to mix and match here Find the right approach for your users and your apps business objectives Facebook for instance makes relationship status completely optional but coercive for those who do opt-in to setting a value Most users do opt-in 60 of them select a relationship status

cczona

We want everyone to feel excited what weve

built

The bottom line is that [read]

cczona

We want users to feel passionate about their

involvement

[read]

cczona

Analytics investments amp monetization are

based on a premise that data is accurate

[read]

BUT the data has been collected by coercive approaches the RISK is that--

cczona

this premise is garbage Some people are lying because lying has been made requirement for getting past the barriers So conclusions drawn from that bad data can misdirect decision-making about the next stage of development

cczona

CheckboxRadioSelect

TextareaText

Required

Corrective

DiscretionaryGuided

The restrictive options that stuff at bottom left those dont actually have to be marked required

cczona

ADD column gender NOT NULL VARCHAR(6) DEFAULT female

But the way we setup schema often embeds assumptions that we should and we will So we do An attribute that is ambiguously named is destined to become a form field whose data wanders away from its original intent and code that misunderstands how to use it A field thats not allowed to be null is destined to be mandatory A field that is assigned a maximum length is an ASSERTION that all possible values are knowable and fit within it

cczona

ADD column gender_identity NOT NULL VARCHAR

BOOM This is foundation for a whole different user experience And the cool ninja move was that we decided to DO LESS Make this stuff flexible upfront Optimize storage later Decide whats valid later

cczona

CheckboxRadioSelect

TextareaText

Required Optional

Alteration

Guided

So there you have it a discretionary field Whether to respond is left up to the user

cczona

tstring relationship_status null =gt true

As developers we may upon THIS expression as redundant completely unnecessary Duh null is true by default But making that explicit is a communication to the team and to your future self Its a statement of intent Its documenting a product decision

cczona

FacebookSingle

In a relationship

Engaged

Married

Its complicated

Open relationship

What would a canonical set of relationship statuses look like Three years ago Facebook figured this list was pretty good Arguably pretty progressive too right Users disagreed Strongly

cczona

FacebookSingle

In a relationship

Engaged

Married

Its complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

Under pressure Facebook nearly doubled the options in just two years time

cczona

Google+Single

In a relationship

Engaged

Married

Its complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I dont want to say

While Google+ has largely adopted that list it has not included Separated or Divorced Notice that they also added something choice Opt out of labeling

cczona

Allowing users to identify their relationships with labels of greater personal significanceThats being driven by people rejecting a user experience that isnt working for them

cczona

Its a scope

How did some status seem universal while others werent Naming a thing creates scope The assumed validity of a fields values get constrained as soon as the field is named

cczona

marital_statusSingle

In a relationship

Engaged

Married

Its complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I dont want to say

Marital status for instance might lead to a list similar to this one A person is assumed to be either unmarried preparing to be married currently married or formerly married

cczona

legal_marital_statusSingle

In a relationship

Engaged

Married

Its complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I dont want to say

Marital status for instance might lead to a list similar to this one A person is assumed to be either unmarried preparing to be married currently married or formerly married

cczona

relationship_statusSingle

In a relationship

Engaged

Married

Its complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I dont want to say

Whereas relationship status might lead to a list more like this one In which one either has a current relationship -- or is defined by the absence of any

cczona

singleness_status

As soon as you name a field you define its paradigm and possibilities There are important difference in what each of these envision and are able to measure Naming fields -- with great specificity upfront -- Makes analyses more powerful later

cczona

singleness_rating

No chance whatsoever Suuuuuper duper single

1000

If you change the name, you shift the paradigm and possibilities. There are important differences in what these set out to collect and measure. Naming fields with great specificity upfront makes analyses more powerful later.

It's Complicated, In a Relationship, Married, Divorced, Widowed, Single

We go through life experiencing many relationships. They don't all have pre-determined labels. And new relationship identities don't necessarily leave old ones behind.

"I like to be truthful, and 'It's Complicated' is really deceiving. It is not complicated. I am separated from my husband, who I am still legally married to."

-- Facebook user

[READ] Why should anyone have to feel like they're required to disavow a relationship that's meaningful for them?

It cuts deeply sometimes. Is a widow only "Widowed" until she resumes dating? Why does she have to be confronted with a decision like that just to use someone's app?

Okay, this one. Spot the fatal flaw. [WAIT]

buffer overflow

An open relationship is definitionally a 1:Many join, WITH the usual engineering understanding that the actual number of relationships may be 0, 1, OR many.

Choose (one): evasive, inauthentic

But it's not funny from the personal perspective. This schema forces the person to either LOOK EVASIVE or else BE INAUTHENTIC. This is what happens when we try to throw more labels at the problem instead of re-examining the schema's assumptions.
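
One way to re-examine the assumption, instead of adding labels, is to stop storing a single status at all. Here is a minimal sketch in plain Ruby. `Person`, `Relationship`, and `add_relationship` are hypothetical names invented for illustration, not taken from any app mentioned in this talk. Each person holds 0, 1, or many relationships, each with a label of the user's own choosing, so "Widowed" and "Dating" can sit side by side.

```ruby
# Hypothetical sketch: Person and Relationship are illustrative names,
# not taken from any real app's schema.
Relationship = Struct.new(:label, :with)

class Person
  attr_reader :name, :relationships

  def initialize(name)
    @name = name
    @relationships = [] # empty is valid: nobody is forced to pick a label
  end

  # label is whatever text the user chooses; with may stay nil
  def add_relationship(label, with: nil)
    @relationships << Relationship.new(label, with)
    self
  end

  def labels
    @relationships.map(&:label)
  end
end

pat = Person.new("Pat")
pat.add_relationship("Widowed")
pat.add_relationship("Dating", with: "Sam")
pat.add_relationship("Best friend", with: "Lee")

puts pat.labels.join(", ")
```

The tradeoff is the one the talk keeps returning to: this is harder to crunch than a single enum column, but nobody has to disavow one relationship in order to record another.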

Facebook relationship status by user age

srsly

When we don't give people means to express what's real and important for them, they hack around us. Are a quarter of Facebook's 13 year olds really married? Probably not. What they probably have is a BEST FRIEND, and "married" is the nearest equivalent available for them to express "THIS PERSON is my most important relationship."

The dogged pursuit of nice, clean, CRUNCHABLE data takes us to such wrong places. And makes our users want to work against a system that doesn't acknowledge their lives' reality.

We build or break community with each line of code

As developers, our lines of code build, or break, an app's community. So it's worth making conscious choices as we go.

Modeling the real world is complex

[READ] And that's okay.

Assuming we know who users are surrenders opportunity to learn who they are

But [read] Early constraints in schema NET crappy, misleading data. [beat] So keep constraints out of human and social schema, at least at first. Gather enough initial response to do some data mining. Watch that data for a while. Keep an eye out for emergent trends. What DISCOVERIES can you make? How can they shape the app's growth and evolution?

Quality

Data quality improves when lies are merely optional, not required.

Specificity

Data becomes like the real world itself: rich and specific. So we can unearth the patterns that are undetectable when data is generic.

Adaptability

We get to make discoveries and re-envision possibilities.

We adapt quickly.

Loyalty

And their response to all this? Engagement. Passion. LOYALTY.

THOSE are foundations for great user experiences. [LOOK UP FINALITY]

@cczona

cczona@gmail.com

http://cczona.com

Questions / Conversation

Oh Hell Yeah

Carina C. Zona

@cczona

Resources

Data Science & Information Architecture

• Sociological normalization

• Database normalization

• Using Machine Learning On Social Networks To Figure Out What You Should Read On The Web

• NoSQL Data Modeling Techniques

• Data & Reality, by William Kent

• Bad Data Handbook, by Q. Ethan McCallum

• Data Science of the Facebook World

Relationships

• Does Facebook Hurt Relationships?

• Facebook Adds LGBT-Friendly Relationship Status Options

• Facebook Targeting by Relationship Status & Workplace

• Your Facebook Relationship Status: It's Complicated

• Gay Marriage: The Database Engineering Perspective

Sex & Genders

• "Disalienation: Why Gender is a Text Field on Diaspora"

• "Gender & Drop Down Menus"

• "Sex & Gender"

• "Bucket Gender"

• Recommendations for Inclusive Data Collection of Trans People

Names

• Falsehoods Programmers Believe About Names

• Your Last Name Contains Invalid Characters

• W3C Internationalization: Personal Names Around the World

• Spanish Names

• Chinese Names

More

• Redesigning the Country Selector

• American Religious Identification Survey Summary Report 2009

• Linguistic Potluck: Crowdsourcing Internationalization in Rails

Image Credits

Postsecret

Facebook

OKCupid

Google+

Metafilter

Diaspora

Flickr

FetLife

Kotangle

cutestpaw.com

hdwallpapers.in

xkcd

Many Thanks

Chiu-Ki Chan

Estelle Weyl

Heather Rivers

Heroku

Jeremy Dunck

Josh Susser

Michele Titolo

NIRD LLC

Renée De Voursney

Sarah Mei

San Francisco Sex Information

Yoz Grahame

Get in Touch

Carina C. Zona

@cczona

http://cczona.com

cczona@gmail.com

http://slideshare.net/cczona

http://linkedin.com/in/cczona


deeply intimate

things that are [READ]

important relationships

Schemas are foundation for expressing things [READ]

self-image

Schemas are foundation for expressing things [READ]

Schemas define the user experience

Schemas define user experience.

Our schemas are leaving people behind

& UX

[read] We can fix that.

What benefit will the user notice?

By asking ourselves, "What benefit will the USER NOTICE?" THIS is not equivalent to "How will the user benefit?" Because THAT is a question which grants us too much latitude to ASSUME that what we want is, of course, to their benefit, because it's gonna help us deliver a product that's Awesome.

[READ AGAIN] focuses us on EXPERIENCES.

Evaluating from user perspective gives us focus

There isn't ONE WAY. There isn't ONE RIGHT ANSWER. Because the NATURE of our dilemma is that users vary. And so do their social worlds. So we always need to understand varying approaches: how they fit different contexts, what tradeoffs they bring, AND the obvious, boring question: which options best serve THIS app's business requirements?

Checkbox, Radio button, Select menu, Ranges

Coerced, Discretionary, Guided

Corrective, Text Input, Textarea

Checkboxes, radio buttons, select menus, ranges: these each imply that EVERY possible value can be, and is, included. If any value isn't included there, it's the real world being rejected because it didn't match up with our mental schema.

Checkbox, Radio, Select

Coerced, Discretionary, Guided

Corrective, Text Input, Textarea

Checking a box is a one-step action. Entering a text string is not. So a freeform solution, such as a textarea or text input, isn't automatically exciting. BUT they can bring visible, meaningful benefit.

11 years

At MetaFilter, gender has been a text field for over a decade.

"...the early crowd at MeFi were often programmers and they hated the idea of dirty data collection..."

-- Matt Haughey, founder

Initially some SHUDDERED at the thought. Because [READ]

All girl

XY

Fella

Queer

Fembot

Alto

It depends

MYOB

Dangly bits

Chicklet

Innie not outie

Convex

Sideburns

Ambisextrous

Member of the patriarchy

For about 5 seconds. Then they jumped on board. Because they could be creative and silly.

I speak using the male gender when required by language

50% quintessential tomboy, 50% total girly-girl

AND because they could express this THING about themselves fully. With authentic voice.

Transgender

Genderqueer

[PAUSE]

"The freeform gender field is something I cherish about Metafilter."

-- MeFi user

That text field grew into a beloved institution. Whatever you, as user, choose to put into that field says something revealing about who YOU are.

"It was one of the earliest indications I'd landed in the right place."

-- MeFi user

That you're allowed to put in ANYTHING, or put in NOTHING AT ALL, says something revealing about how MetaFilter envisions community. That schema's trust in users was the foundation for MetaFilter users to ASK to share themselves even more.

In 2010, Diaspora likewise turned gender into a text field. Just like on MetaFilter, users felt set free by it.

There's an important distinction here, though. MetaFilter is closed source, it's one developer, and it's entirely in English. Diaspora needed to meet a different set of needs. It's open source, and it's international.

Not amused

So some of its developers are not amused. It's fair to object that this approach wreaks chaos on internationalization of pronouns. [BEAT] And here's what I've got to say to that:

Internationalization is hellish. Period.

Internationalization of pronouns is hellish, period. You get no safety net in this. Constraining gender options doesn't solve translation of languages. Don't set yourself up for trying to deal with internationalization this way. It will fail you.

What's your legal gender?

• Indeterminate

• X

• Sex Not Specified

And if you really think that boiling this question down to "What's your legal gender?" is an easy out, let's talk about internationalization of that too. Because this is the year that "indeterminate" has become a legal gender on German birth certificates, and that X is a legal gender on Australian passports. In New South Wales, Australia, "Sex Not Specified" is yet another legally-recognized gender. Binary gender is, today, APP FAIL.

"...the most complicated thing I've ever spent a lot of time learning about. And I've spent a lot of time learning about quantum mechanics."

-- Randall Munroe, xkcd

Randall Munroe, best known as xkcd, has examined the issue of pronouns many times, in depth, for English language projects. He says it's [READ]. So here's what he ultimately concluded:

Ask

Ask.

Which pronouns do you prefer?

he / him / himself / his / his

she / her / herself / hers / her

they / them / themself / theirs / their

it / it / itself / its / its

by name

Asking straight up, "Which pronouns do you prefer?" is truly the best he could come up with.

Which pronouns do you prefer?

he / him / himself / his / his

she / her / herself / hers / her

they / them / themself / theirs / their

personal name

other ____________

Or maybe slightly refining, like these.

And I know you may be looking at that 3rd row and protesting.

Our English teachers told us "they" and "their" aren't singular, right? Nope. They're plural AND SINGULAR, and have been for 400 years. Take the word of Jane Austen, Shakespeare, Chaucer, Lewis Carroll. Modern English authorities agree on this one too. It's excellent English, and excellent SOCIAL, to use it when people ask us to.
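
For the code-minded, here is one rough way that form could be backed in plain Ruby. It is only a sketch under stated assumptions: `PronounSet`, `PRESETS`, and `pronouns_for` are hypothetical names, the five slots follow the column order on the slide (subject, object, reflexive, possessive, determiner), and falling back to singular "they" when nothing is chosen is my own assumption, not something the slide specifies.

```ruby
# Hypothetical sketch; PronounSet, PRESETS and pronouns_for are illustrative names.
PronounSet = Struct.new(:subject, :object, :reflexive, :possessive, :determiner)

PRESETS = {
  "he"   => PronounSet.new("he", "him", "himself", "his", "his"),
  "she"  => PronounSet.new("she", "her", "herself", "hers", "her"),
  "they" => PronounSet.new("they", "them", "themself", "theirs", "their")
}.freeze

# choice: a preset key, "name" (always use the personal name), or "other"
# custom: five user-supplied forms, used only when choice is "other"
def pronouns_for(choice, name:, custom: nil)
  case choice
  when "name"
    PronounSet.new(name, name, name, "#{name}'s", "#{name}'s")
  when "other"
    raise ArgumentError, "custom needs five forms" unless custom && custom.size == 5
    PronounSet.new(*custom)
  else
    # Assumption: no answer (or an unknown key) falls back to singular they.
    PRESETS.fetch(choice) { PRESETS["they"] }
  end
end

set = pronouns_for("they", name: "Robin")
puts "#{set.subject.capitalize} updated #{set.determiner} profile."
```

Note that "other" stores the user's own words untouched, which is the same trust the freeform gender field extends.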

Even Facebook has taken note that it's okay. They've been using this construction for at least 5 years and are doing fine :-)

"Home is the place where, when you have to go there, they have to take you in."

-- Robert Frost

We aspire to make our social apps feel like home. We can do that by making development choices that say, "Hey user, we GET you. Be as individual and messy as you want to. We can handle _you_ being _you_."

Social Research

When collecting data on people, we're in a different realm: social sciences. If you want to run useful analytics about personal attributes and behavior, then data collection needs to meet at least the two minimum criteria for social science research.

Exhaustive

Every possible option

A field's values must include every possible option.

Mutually Exclusive

No overlap exists between them

And the field's options must be mutually exclusive.

How many social apps have both of those bare minimum criteria covered?
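
Those two criteria can be checked mechanically against real answers. A minimal sketch in plain Ruby, assuming you have some sample where each person lists every description that is true of them. `audit_options` and the sample data are hypothetical, made up for illustration.

```ruby
# Hypothetical audit. Each response is the list of descriptions a person
# says are true of them, in their own words.
def audit_options(options, responses)
  # Not exhaustive: someone named a truth the list does not offer.
  unmatched = responses.count { |r| (r - options).any? }
  # Not mutually exclusive: more than one offered option applies at once.
  overlapping = responses.count { |r| (r & options).size > 1 }
  {
    exhaustive: unmatched.zero?,
    mutually_exclusive: overlapping.zero?,
    unmatched: unmatched,
    overlapping: overlapping
  }
end

options = ["Single", "In a relationship", "Married", "Widowed"]
responses = [
  ["Married"],
  ["Widowed", "In a relationship"], # two offered options apply at once
  ["Separated"],                    # a real answer the list leaves out
  ["Single"]
]

result = audit_options(options, responses)
puts result.inspect
```

With this made-up sample the list fails both criteria, which is roughly the situation the relationship-status slides describe.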

We've seen this one enough. At least that set of values seems covered fully.

But it's not.

Still lacking.

Trying hard. But still inadequate to express the real world's complexity.

We do realize that the real world has great variety. Of course we do. But it's hard to know how to handle it.

What user experience does this schema drive us toward?

It's easy to figure, "Eh, we can refactor later." But initial choices set differing user experiences in motion. Thinking at the outset about the real world's variety and complexity starts us asking early questions that set foundation for social app-building. Ask: What user experience does this schema drive us toward?

Data doesn't have to be for analysis

It's easy to get into the habit of structuring data for easy analysis. But we can choose to look at human data from a different perspective. Step back. Wallow in the user's perspective.

Data can be sheer expressiveness

Data can be sheer expressiveness. Data that has character, individualism, distinctiveness.

What we want

• Structured

• Predictable

• Relational

• Indexed

• Exhaustive

• Easy analytics

• Data-driven decisions

What we get

• Validations, exceptions

• Conditionals, partials

• Premature optimization

• Cultural variability

• Individual POV

• Moving target

• Decisions based on false premises

As developers, we have a vision of what a good codebase should be, and not be.

[READ]

In the most sensible of ways, we often arrive at solutions that are factually TRUTHY while far removed from real-life utility.

What is your religion, if any?

ARIS is the largest ongoing survey of Americans' religious identification. It asks this simple, OPEN-ENDED question: [slide]

Which nets over 100 unique answers. Which, if you're making a form based on that, is tricky. There's no form element that makes it easy for our users to pick themselves out of a list of so many possibilities.

ARIS found, though, that these could be compressed into 13 major categories. More manageable list, right? We could use that for a form. But, eh, a lot of those are edge cases. We'd rather focus on genuinely major groupings.

What is your religion, if any?

Christian 76%

Other 4%

None 15%

Don't Know or Refused 5%

Which brings us down to this. At least three quarters of Americans are Christian. Done. Every other religion would just be clutter. Edge cases.

And then there's this sort of crummy data with it. We'd probably assign nil value for more than one of these categories, right?

What is your religion?

Christian 76%

Other 4%

n/a 20%

Fixed. Which focuses attention on a problem here: 1 in 5 are not useful answers from an advertiser's perspective, or for our own analytics. The NILS: gotta go.

What is your religion?

Christian 76%

Other 24%

So what we're left with is a good, clear list. It covers ALL the big stuff. When you get reductive enough, for Americans, religion is a binary. Which, from a storage standpoint, is great. Booleans! Score!

Religion

But we don't do THIS. It covers the biggest categories. But oh, how it leaves people out.

People aren't edge cases

People aren't edge cases.

And they're pushing back on apps that treat them that way.

[LONG BEAT LONG] [BREATH] [CHANGE GEARS]

So reductive has big problems. So does scaling upward.

Balancing between approaches

As engineers, it's instinctually uncomfortable to DELIBERATELY NOT STRUCTURE DATA for easy analysis. I feel ya. I really do. This freaks me too. But again, the foundational question is "What benefit will the user notice?" What identities are okay with us?

And if necessary, we CAN strike a middle ground. This is where guided response comes in. Autosuggest.

Checkbox, Radio, Select

Textarea, Text

Required, Optional, Autosuggest

Minimal Suggest

When there's a subset of values that you're most interested in, do minimal suggest instead. Autosuggest using just the handful of values you care about.

Structured data from those who want to give it. Free form to incite expressiveness in those who want THAT.
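
Minimal suggest needs very little code. A rough sketch in plain Ruby, assuming a prefix match is good enough. `suggest`, `accept`, and the `CARED_ABOUT` list are hypothetical names and values chosen for illustration. The point is that suggestions only guide: whatever the user finally types is accepted, and a blank answer is stored as nil instead of being rejected.

```ruby
# Hypothetical minimal-suggest helpers. CARED_ABOUT is only the handful of
# values the app is most interested in; it is not a list of valid answers.
CARED_ABOUT = ["Single", "In a relationship", "Married"].freeze

# Offer suggestions whose beginning matches what has been typed so far.
def suggest(typed, suggestions = CARED_ABOUT)
  prefix = typed.to_s.strip.downcase
  return [] if prefix.empty?
  suggestions.select { |s| s.downcase.start_with?(prefix) }
end

# Accept anything the user submits; a blank answer means they opted out.
def accept(input)
  value = input.to_s.strip
  value.empty? ? nil : value
end

puts suggest("ma").inspect
puts accept("  Happily entangled ").inspect
puts accept("").inspect
```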

Unguided Text

Of those who use MetaFilter's gender field,

40% of responses are: f, m, female, male

[read] So structured data IS there. This can be a balanced solution in many cases where you're willing to tolerate some ambiguity. Of course there are tradeoffs. Data quantity is lower. Freed to opt out of providing personal info, many do. On the other hand, data quality should improve.
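
A figure like that 40% can be measured from free text without ever changing what people wrote. A minimal sketch in plain Ruby. `CONVENTIONAL` and `bucket_share` are hypothetical names, and the sample answers are invented (a few borrowed from the MetaFilter slide earlier), so the number it prints says nothing about MetaFilter's real data.

```ruby
# Hypothetical: report what share of non-blank free-text answers fall into
# conventional buckets. The stored answers themselves are never modified.
CONVENTIONAL = {
  "f" => :female, "female" => :female,
  "m" => :male,   "male"   => :male
}.freeze

def bucket_share(raw_answers)
  given = raw_answers.map { |a| a.to_s.strip.downcase }.reject(&:empty?)
  return 0.0 if given.empty?
  matched = given.count { |a| CONVENTIONAL.key?(a) }
  (100.0 * matched / given.size).round(1)
end

answers = ["F", "male", "Fembot", "", "It depends", " m ", nil, "Alto"]
puts bucket_share(answers)
```

Blank and nil answers are left out of the denominator here, which matches the slide's framing ("of those who use the field"); counting them instead would answer a different question.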

Optional Select

60% of Facebook users select a relationship status

It's fine to mix and match here. Find the right approach for your users and your app's business objectives. Facebook, for instance, makes relationship status completely optional, but coercive for those who do opt in to setting a value. Most users do opt in: 60% of them select a relationship status.

We want everyone to feel excited about what we've built

The bottom line is that [read]

We want users to feel passionate about their involvement

[read]

Analytics, investments, & monetization are based on a premise that data is accurate

[read]

BUT if the data has been collected by coercive approaches, the RISK is that...

...this premise is garbage. Some people are lying because lying has been made a requirement for getting past the barriers. So conclusions drawn from that bad data can misdirect decision-making about the next stage of development.

Checkbox, Radio, Select

Textarea, Text

Required

Corrective

Discretionary, Guided

The restrictive options, that stuff at bottom left: those don't actually have to be marked required.

ADD COLUMN gender VARCHAR(6) NOT NULL DEFAULT 'female'

But the way we set up schema often embeds assumptions that we should, and we will. So we do. An attribute that is ambiguously named is destined to become a form field whose data wanders away from its original intent, and code that misunderstands how to use it. A field that's not allowed to be null is destined to be mandatory. A field that is assigned a maximum length is an ASSERTION that all possible values are knowable and fit within it.
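
To see what that maximum length asserts, run a few real answers past it. A tiny sketch in plain Ruby: `rejected_by_limit` is a hypothetical helper, the sample values are borrowed from the MetaFilter slides, and whether a database refuses or silently truncates an over-long value depends on the database and its settings.

```ruby
# Hypothetical check: which real answers would not fit a VARCHAR(6) column?
def rejected_by_limit(values, max_length)
  values.select { |v| v.length > max_length }
end

answers = ["female", "male", "Queer", "Genderqueer", "It depends", "Transgender"]
puts rejected_by_limit(answers, 6).inspect
```

Every value that comes back is a person the schema decided in advance could not exist.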

ADD column gender_identity NOT NULL VARCHAR

BOOM. This is foundation for a whole different user experience. And the cool ninja move was that we decided to DO LESS. Make this stuff flexible upfront. Optimize storage later. Decide what's valid later.

Checkbox, Radio, Select

Textarea, Text

Required, Optional

Alteration

Guided

So there you have it: a discretionary field. Whether to respond is left up to the user.

t.string :relationship_status, :null => true

As developers, we may look upon THIS expression as redundant, completely unnecessary. Duh, null is true by default. But making that explicit is a communication to the team and to your future self. It's a statement of intent. It's documenting a product decision.

Facebook

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

What would a canonical set of relationship statuses look like? Three years ago, Facebook figured this list was pretty good. Arguably pretty progressive too, right? Users disagreed. Strongly.

Facebook

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

Under pressure, Facebook nearly doubled the options in just two years' time.

Google+

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I don't want to say

While Google+ has largely adopted that list, it has not included Separated or Divorced. Notice that they also added something choice: opt out of labeling.


Page 27: Schemas for the Real World [Madison RubyConf 2013]

cczona

important relationships

Schemas are foundation for expressingthings [READ]

cczona

self-image

Schemas are foundation for expressingthings [READ]

cczona

Schemas define the

user experience

Schemas define user experience

cczona

Our schemas are leaving people behind

amp UX

[read]We can fix that

cczona

What benefit will the user notice

By asking ourselves What benefit will the USER NOTICE THIS is not equivalent to How will the user benefit Because THAT is a question which grants us too much latitude To ASSUME that what we want is of course to their benefit because its gonna help us deliver a product thats Awesome

[READ AGAIN] focuses us on EXPERIENCES

cczona

Evaluating from user perspective gives us focus

There isnt ONE WAY There isnt ONE RIGHT ANSWER Because the NATURE of our dilemma is that users vary And so do their social worlds So we always need to understand varying approaches How they fit different contexts What tradeoffs they bring AND the obvious boring question which options best serve THIS apps business requirements

cczona

CheckboxRadio buttonSelect menuRanges

Coerced DiscretionaryGuided

Corrective Text InputTextarea

Checkboxes radio buttons select menus ranges these each imply that EVERY possible value can -- and is -- included If any value isnt included there its real world being rejected because it didnt match up with our mental schema

cczona

CheckboxRadioSelect

Coerced DiscretionaryGuided

Corrective Text InputTextarea

Checking a box is a one step action Entering a text string is not So a freeform solution such as a textarea or text input isnt automatically exciting BUT they can bring visible meaningful benefit

cczona

11 years

At MetaFilter gender has been a text field for over a decade

cczona

hellipthe early crowd at MeFi were often programmers and they hated the idea of dirty data collectionhellip

mdashMatt Haughey founder

Initially some SHUDDERED at the thought Because [READ]

cczona

All girl

XY

Fella

Queer

Fembot

Alto

It depends

MYOB

Dangly bits

Chicklet

Innie not outie

Convex

Sideburns

Ambisextrous

Member of the patriarchy

For about 5 seconds Then they jumped on board Because they could be creative and silly

cczona

I speak using the male gender when required by language 50 quintessential

tomboy 50 total girly-girl

AND because they could express this THING about themselves fully With authentic voice

cczona

TransgenderGenderqueer

[PAUSE]

cczona

The freeform gender field is something I cherish about Metafilter

mdashMeFi user

That text field grew into a beloved institution Whatever you as user choose to put into that field says something revealing about who YOU are

cczona

It was one of the earliest indications Id landed in the right place

mdashMeFi user

That youre allowed to put in ANYTHING -- or put in NOTHING AT ALL -- says something revealing about what how MetaFilter envisions community That schemas trust in users was the foundation for Metafilter users to ASK to share themselves even more

cczona

In 2010 Diaspora likewise turned gender into a text field Just like on MetaFilter users felt set free by it

Theres an important distinction here though Metafilter is closed source its one developer and its entirely in English Diaspora needed to meet a different set of needs Its open source and its international

cczona

Not amused

So some of its developers are not amused Its fair to object that this approach wreaks chaos on internationalization of pronouns [BEAT] And heres what Ive got to say to that

cczona

Internationalization is hellish period

Internationalization of pronouns is hellish period You get no safety net in this Constraining gender options doesnt solve translation of languages Dont set yourself up for trying to deal with internationalization this way It will fail you

cczona

Whats your legal gender

bull Indeterminate

bull X

bull Sex Not Specified

And if you really thing that boiling this question down to Whats your legal gender is an easy out lets talk about internationaliztion of that too Because this is the year that indeterminate has become a legal gender on German birth certificates and that X is a legal gender on Australian passports In New South Wales Australia Sex Not Specified is yet another legally-recognized gender Binary gender is today APP FAIL

cczona

the most complicated thing Irsquove ever spent a lot of time learning about

And Irsquove spent a lot of time learning about quantum mechanics

mdashRandall Munroe xkcd

Randall Munroe best known as xkcd has examined the issue of pronouns many times in depth for English language projects He says its [READ] So hes what he ultimately concluded

cczona

Ask

Ask

cczona

Which pronouns do you prefer

hehimhimselfhishis

sheherherselfhersher

theythemthemselftheirstheir

itititselfitsits

by name

Asking straight up Which Pronouns Do You Prefer is truly the best he could come up with

cczona

Which pronouns do you prefer

hehimhimselfhishis

sheherherselfhersher

theythemthemselftheirstheir

personal name

other ____________

Or maybe slightly refining like these

And I know you may be looking at that 3rd row and protesting

cczona

Which pronouns do you prefer

hehimhimselfhishis

sheherherselfhersher

theythemthemselftheirstheir

personal name

other ____________

Our english teachers told us they and their arent singular right Nope Theyre plural AND SINGULAR and have been for 400 years Take the word of Jane Austen Shakespeare Chaucer Lewis Carroll Modern english authorities agree on this one too Its excellent English -- and excellent SOCIAL -- to use it when people ask us to

cczona

Even Facebook has taken note that its okay Theyve been using this construction for at least 5 years and are doing fine -)

cczona

Home is the place where when you have to go there they have to take you in

mdashRobert Frost

We aspire to make our social apps feel like home We can do that by making development choices that say Hey user we GET you Be as individual and messy as you want to We can handle _you_ being _you_

Social Research

When collecting data on people, we're in a different realm: social sciences. If you want to run useful analytics about personal attributes and behavior, then data collection needs to meet at least the two minimum criteria for social science research.

Exhaustive: Every possible option

A field's values must include every possible option.

Mutually Exclusive: No overlap exists between them

And the field's options must be mutually exclusive.

How many social apps have both of those bare minimum criteria covered?
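The two criteria can be stated as small checks against real answers. This is a minimal sketch with hypothetical helper names (`exhaustive?`, `mutually_exclusive?`); it assumes each person is represented by the list of labels that are actually true of them.

```ruby
# Hypothetical helpers; each "person" is the list of labels true of them.

# Exhaustive: every person finds at least one option that fits.
def exhaustive?(options, people)
  people.all? { |truths| (truths & options).any? }
end

# Mutually exclusive: nobody fits more than one option at once.
def mutually_exclusive?(options, people)
  people.all? { |truths| (truths & options).size <= 1 }
end

options = ["Single", "In a relationship", "Married"]
people  = [
  ["Single"],
  ["Married", "In a relationship"], # two options apply at once
  ["Widowed"]                       # no option applies
]

puts exhaustive?(options, people)         # false
puts mutually_exclusive?(options, people) # false
```

A three-option list already fails both tests against three people, which is the kind of result the question on the slide is pointing at.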

We've seen this one enough. At least that set of values seems covered fully.

But it's not.

Still lacking.

Trying hard. But still inadequate to express the real world's complexity.

We do realize that the real world has great variety. Of course we do. But it's hard to know how to handle it.

What user experience does this schema drive us toward?

It's easy to figure, "Eh, we can refactor later." But initial choices set differing user experiences in motion. Thinking at the outset about the real world's variety and complexity starts us asking early questions that set the foundation for social app-building. Ask: "What user experience does this schema drive us toward?"

Data doesn't have to be for analysis

It's easy to get into the habit of structuring data for easy analysis. But we can choose to look at human data from a different perspective. Step back. Wallow in the user's perspective.

Data can be sheer expressiveness

Data can be sheer expressiveness. Data that has character, individualism, distinctiveness.

What we want:

• Structured
• Predictable
• Relational
• Indexed
• Exhaustive
• Easy analytics
• Data-driven decisions

What we get:

• Validations, exceptions
• Conditionals, partials
• Premature optimization
• Cultural variability
• Individual POV
• Moving target
• Decisions based on false premises

As developers, we have a vision of what a good codebase should be, and not be.

[READ]

In the most sensible of ways, we often arrive at solutions that are factually TRUTHY while far removed from real-life utility.

What is your religion, if any?

ARIS is the largest ongoing survey of Americans' religious identification. It asks this simple, OPEN-ENDED question. [slide]

Which nets over 100 unique answers. Which, if you're making a form based on that, is tricky. There's no form element that makes it easy for our users to pick themselves out of a list of so many possibilities.

ARIS found, though, that these could be compressed into 13 major categories. More manageable list, right? We could use that for a form. But, eh, a lot of those are edge cases. We'd rather focus on genuinely major groupings.

What is your religion, if any?

Christian 76%

Other 4%

None 15%

Don't Know or Refused 5%

Which brings us down to this. At least three quarters of Americans are Christian. Done. Every other religion would just be clutter. Edge cases.

And then there's this sort of crummy data with it. We'd probably assign nil value for more than one of these categories, right?

What is your religion?

Christian 76%

Other 4%

n/a 20%

Fixed. Which focuses attention on a problem here: 1 in 5 are not useful answers from an advertiser's perspective, or for our own analytics. The NILS -- gotta go.

What is your religion?

Christian 76%

Other 24%

So what we're left with is a good, clear list. It covers ALL the big stuff. When you get reductive enough for Americans' religion, it's a binary. Which, from a storage standpoint, is great. Booleans! Score!
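The arithmetic of that reduction can be replayed in a few lines. This sketch uses only the percentages from the slides; the `collapse` helper is a hypothetical name. Each step keeps the total at 100 while throwing away a distinction.

```ruby
# Percentages as shown on the slides; `collapse` is a hypothetical helper.
SURVEY = {
  "Christian" => 76,
  "Other" => 4,
  "None" => 15,
  "Don't Know or Refused" => 5
}.freeze

# Fold the categories listed in `from` into the single category `into`.
def collapse(shares, into:, from:)
  merged = shares.reject { |label, _| from.include?(label) }
  merged[into] = merged.fetch(into, 0) + from.sum { |label| shares.fetch(label) }
  merged
end

# Step 1: both kinds of non-answer become one nil-like bucket.
step1 = collapse(SURVEY, into: "n/a", from: ["None", "Don't Know or Refused"])
# Step 2: the nils "gotta go", so they are folded into Other.
step2 = collapse(step1, into: "Other", from: ["n/a"])

puts step1.inspect
puts step2.inspect
puts step2.values.sum # still 100, but only two answers remain
```

Nothing in the final hash records that a fifth of respondents gave no religion at all, which is the information loss the next slide objects to.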

Religion

But we don't do THIS. It covers the biggest categories. But oh, how it leaves people out.

People aren't edge cases

People aren't edge cases.

And they're pushing back on apps that treat them that way.

[LONG BEAT LONG] [BREATH] [CHANGE GEARS]

So: Reductive has big problems. So does scaling upward.

Balancing between approaches

As engineers, it's instinctually uncomfortable to DELIBERATELY NOT STRUCTURE DATA for easy analysis. I feel ya. I really do. This freaks me out too. But again, the foundational question is: "What benefit will the user notice?" What identities are okay with us?

And if necessary, we CAN strike a middle ground. This is where guided response comes in: Autosuggest.

Checkbox / Radio / Select

Textarea / Text

Required / Optional / Autosuggest

Minimal Suggest

When there's a subset of values that you're most interested in, do minimal suggest instead. Autosuggest using just the handful of values you care about.

Structured data from those who want to give it. Free form to incite expressiveness in those who want THAT.
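A minimal-suggest field might look something like the sketch below. It is an assumption-laden illustration: `SUGGESTED`, `suggest`, and `accept` are hypothetical names, and the four suggested values are placeholders for whatever handful an app actually cares about.

```ruby
# Hypothetical handful of values the app would like to analyze.
SUGGESTED = ["female", "male", "genderqueer", "transgender"].freeze

# Offer completions from the short list only; never block other input.
def suggest(typed, candidates = SUGGESTED)
  prefix = typed.to_s.strip.downcase
  return [] if prefix.empty?
  candidates.select { |candidate| candidate.start_with?(prefix) }
end

# Whatever the user finally submits is stored as given (trimmed);
# an empty answer is stored as nil, because responding is optional.
def accept(typed)
  value = typed.to_s.strip
  value.empty? ? nil : value
end

puts suggest("ge").inspect        # ["genderqueer"]
puts accept("  Fembot ").inspect  # "Fembot"
puts accept("").inspect           # nil
```

The suggestion list nudges toward values that are easy to count, while `accept` keeps the field free-form for anyone who wants to say something else.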

Unguided Text: Of those who use MetaFilter's gender field,

40% of responses are "f", "m", "female", "male"

[read] So structured data IS there. This can be a balanced solution in many cases where you're willing to tolerate some ambiguity. Of course, there are tradeoffs. Data quantity is lower. Freed to opt out of providing personal info, many do. On the other hand, data quality should improve.
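Recovering that structured slice from an unguided text field can be done after the fact, without constraining what anyone types. A minimal sketch, assuming Ruby 2.7 or later for `tally`; `CANONICAL`, `bucket`, and `summarize` are hypothetical names, and the sample responses are invented for illustration (several echo the MetaFilter examples).

```ruby
# Map the common spellings onto two analyzable values; everything else
# stays exactly what the user wrote and is merely counted as freeform.
CANONICAL = {
  "f" => "female", "female" => "female",
  "m" => "male",   "male"   => "male"
}.freeze

def bucket(raw)
  return :blank if raw.nil? || raw.strip.empty?
  CANONICAL.fetch(raw.strip.downcase, :freeform)
end

def summarize(responses)
  responses.map { |raw| bucket(raw) }.tally
end

# Invented sample data for illustration only.
responses = ["F", "male", " m ", "Fembot", nil, "It depends"]
puts summarize(responses).inspect
```

The stored values are untouched; only the report groups them, so the decision about what counts as "structured" can change later without a migration.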

Optional Select: 60% of Facebook users select a relationship status

It's fine to mix and match here. Find the right approach for your users and your app's business objectives. Facebook, for instance, makes relationship status completely optional, but coercive for those who do opt in to setting a value. Most users do opt in: 60% of them select a relationship status.

We want everyone to feel excited about what we've built

The bottom line is that [read]

We want users to feel passionate about their involvement

[read]

Analytics, investments & monetization are based on a premise that data is accurate

[read]

BUT if the data has been collected by coercive approaches, the RISK is that--

this premise is garbage. Some people are lying, because lying has been made a requirement for getting past the barriers. So conclusions drawn from that bad data can misdirect decision-making about the next stage of development.

Checkbox / Radio / Select

Textarea / Text

Required

Corrective

Discretionary / Guided

The restrictive options, that stuff at bottom left: those don't actually have to be marked required.

ADD COLUMN gender VARCHAR(6) NOT NULL DEFAULT 'female'

But the way we set up schema often embeds assumptions that we should, and we will. So we do. An attribute that is ambiguously named is destined to become a form field whose data wanders away from its original intent, and code that misunderstands how to use it. A field that's not allowed to be null is destined to be mandatory. A field that is assigned a maximum length is an ASSERTION that all possible values are knowable and fit within it.

cczona

ADD column gender_identity NOT NULL VARCHAR

BOOM. This is the foundation for a whole different user experience. And the cool ninja move was that we decided to DO LESS. Make this stuff flexible upfront. Optimize storage later. Decide what's valid later.

Checkbox / Radio / Select

Textarea / Text

Required / Optional

Alteration

Guided

So there you have it: a discretionary field. Whether to respond is left up to the user.

t.string :relationship_status, :null => true

As developers, we may look upon THIS expression as redundant, completely unnecessary. Duh, null is true by default. But making that explicit is a communication to the team, and to your future self. It's a statement of intent. It's documenting a product decision.
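What a column definition asserts can be made visible outside the database. The following is a plain-Ruby sketch, not Rails and not from the talk: `Field`, `coerced`, and `discretionary` are hypothetical names that mirror the two column definitions discussed above.

```ruby
# A tiny stand-in for a column definition: whether nil is allowed and
# whether a maximum length is asserted. Hypothetical, for illustration.
Field = Struct.new(:name, :null, :max_length, keyword_init: true) do
  def valid?(value)
    return null if value.nil?                      # NOT NULL makes answering mandatory
    max_length.nil? || value.length <= max_length  # a limit claims all values fit
  end
end

coerced       = Field.new(name: :gender, null: false, max_length: 6)
discretionary = Field.new(name: :gender_identity, null: true, max_length: nil)

puts coerced.valid?(nil)                  # false: silence is rejected
puts coerced.valid?("genderqueer")        # false: 11 characters do not fit in 6
puts discretionary.valid?(nil)            # true: not answering is fine
puts discretionary.valid?("genderqueer")  # true
```

Writing `null: true` out explicitly in the second definition does the same job as the migration line: it records that optionality was a decision, not an accident of defaults.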

Facebook

Single
In a relationship
Engaged
Married
It's complicated
Open relationship

What would a canonical set of relationship statuses look like? Three years ago, Facebook figured this list was pretty good. Arguably pretty progressive too, right? Users disagreed. Strongly.

Facebook

Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Separated
Divorced
Civil union
Domestic partnership

Under pressure, Facebook nearly doubled the options in just two years' time.

Google+

Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Separated
Divorced
Civil union
Domestic partnership
I don't want to say

While Google+ has largely adopted that list, it has not included Separated or Divorced. Notice that they also added something choice: opt out of labeling.

Allowing users to identify their relationships with labels of greater personal significance: that's being driven by people rejecting a user experience that isn't working for them.

It's a scope

How did some statuses seem universal, while others didn't? Naming a thing creates scope. The assumed validity of a field's values gets constrained as soon as the field is named.

marital_status

Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Separated
Divorced
Civil union
Domestic partnership
I don't want to say

Marital status, for instance, might lead to a list similar to this one. A person is assumed to be either unmarried, preparing to be married, currently married, or formerly married.

legal_marital_status

Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Separated
Divorced
Civil union
Domestic partnership
I don't want to say

Marital status, for instance, might lead to a list similar to this one. A person is assumed to be either unmarried, preparing to be married, currently married, or formerly married.

relationship_status

Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Separated
Divorced
Civil union
Domestic partnership
I don't want to say

Whereas relationship status might lead to a list more like this one. In which one either has a current relationship -- or is defined by the absence of any.

singleness_status

As soon as you name a field, you define its paradigm and possibilities. There are important differences in what each of these envisions and is able to measure. Naming fields -- with great specificity, upfront -- makes analyses more powerful later.

cczona

singleness_rating

No chance whatsoever Suuuuuper duper single

1000

If you change the name, you shift the paradigm and possibilities. There's an important difference in what these set out to measure. Naming fields -- with great specificity, upfront -- makes analyses more powerful later.

It's Complicated / In a Relationship

Married / Divorced

Widowed / Single

We go through life experiencing many relationships. They don't all have pre-determined labels. And new relationship identities don't necessarily leave old ones behind.

"I like to be truthful, and 'It's Complicated' is really deceiving. It is not complicated. I am separated from my husband, who I am still legally married to."

-- Facebook user

[READ] Why should anyone have to feel like they're required to disavow a relationship that's meaningful for them?

It cuts deeply sometimes. Is a widow only "Widowed" until she resumes dating? Why does she have to be confronted with a decision like that just to use someone's app?

Okay, this one. Spot the fatal flaw. [WAIT]

buffer overflow

An open relationship is definitionally a 1:Many join -- WITH the usual engineering understanding, that is, that the actual number of relationships may be 0, 1, OR many.

Choose (one): evasive / inauthentic

But it's not funny from the personal perspective. This schema forces the person to either LOOK EVASIVE or else BE INAUTHENTIC. This is what happens when we try to throw more labels at the problem, instead of re-examining the schema's assumptions.
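Re-examining the assumption could mean modeling relationships as a collection and not as a single status column. A minimal sketch, with hypothetical `Person` and `Relationship` classes that are not from the talk: each person has zero, one, or many relationships, and each relationship carries its own label.

```ruby
# Hypothetical model: relationships are a list, and each one keeps
# whatever label the user gives it.
Relationship = Struct.new(:with, :label, keyword_init: true)

class Person
  attr_reader :name, :relationships

  def initialize(name)
    @name = name
    @relationships = []
  end

  # Adding a relationship never replaces an existing one.
  def relate(with:, label:)
    @relationships << Relationship.new(with: with, label: label)
    self
  end

  def labels
    relationships.map(&:label).uniq
  end
end

alex = Person.new("Alex")
alex.relate(with: "Sam", label: "married").relate(with: "Jo", label: "partner")

puts alex.labels.inspect               # ["married", "partner"]
puts Person.new("Pat").labels.inspect  # []
```

With a list, "married" and "partner" can both be true for Alex, and Pat's empty list says nothing at all instead of asserting "Single".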

Facebook relationship status by user age

srsly?

When we don't give people the means to express what's real and important for them, they hack around us. Are a quarter of Facebook's 13-year-olds really married? Probably not. What they probably have is a BEST FRIEND, and "married" is the nearest equivalent available for them to express "THIS PERSON is my most important relationship."

The dogged pursuit of nice, clean, CRUNCHABLE data takes us to such wrong places. And makes our users want to work against a system that doesn't acknowledge their lives' reality.

We build or break community with each line of code

As developers, our lines of code build -- or break -- an app's community. So it's worth making conscious choices as we go.

Modeling the real world is complex

[READ] And that's okay.

Assuming we know who users are surrenders opportunity to learn who they are

But [read]. Early constraints in schema NET crappy, misleading data. [beat] So keep constraints out of human and social schema, at least at first. Gather enough initial response to do some data mining. Watch that data for a while. Keep an eye out for emergent trends. What DISCOVERIES can you make? How can they shape the app's growth and evolution?
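Watching for emergent trends can start very simply: count what people actually wrote and see which values recur. A minimal sketch, assuming Ruby 2.7 or later for `tally`; `emergent_values` and its `min_count` threshold are hypothetical, and the sample answers are invented for illustration.

```ruby
# Return the free-text values that recur at least `min_count` times,
# most frequent first (ties broken alphabetically).
def emergent_values(responses, min_count: 2)
  responses
    .compact
    .map { |raw| raw.strip.downcase }
    .reject(&:empty?)
    .tally
    .select { |_, count| count >= min_count }
    .sort_by { |value, count| [-count, value] }
    .map(&:first)
end

# Invented sample data for illustration only.
answers = ["Queer", "queer ", "Fembot", nil, "MYOB", "myob", "queer"]
puts emergent_values(answers).inspect # ["queer", "myob"]
```

Values that surface this way are candidates for a later suggestion list, chosen from what users said and not from what was assumed up front.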

Quality

Data quality improves when lies are merely optional, not required.

Specificity

Data becomes like the real world itself: rich and specific. So we can unearth the patterns that are undetectable when data is generic.

Adaptability

We get to make discoveries and re-envision possibilities.

We adapt quickly.

Loyalty

And their response to all this? Engagement. Passion. LOYALTY.

THOSE are foundations for great user experiences. [LOOK UP. FINALITY]

@cczona / cczona@gmail.com / http://cczona.com

Questions? Conversation?

Oh, Hell Yeah!

Carina C. Zona

Resources

Data Science & Information Architecture

• Sociological normalization
• Database normalization
• Using Machine Learning On Social Networks To Figure Out What You Should Read On The Web
• NoSQL Data Modeling Techniques
• Data & Reality, by William Kent
• Bad Data Handbook, Q. Ethan McCallum
• Data Science of the Facebook World

Relationships

• Does Facebook Hurt Relationships
• Facebook Adds LGBT-Friendly Relationship Status Options
• Facebook Targeting by Relationship Status & Workplace
• Your Facebook Relationship Status: It's Complicated
• Gay Marriage: The Database Engineering Perspective

Sex & Genders

• "Disalienation: Why Gender is a Text Field on Diaspora"
• "Gender & Drop Down Menus"
• "Sex & Gender"
• "Bucket Gender"
• Recommendations for Inclusive Data Collection of Trans People

Names

• Falsehoods Programmers Believe About Names
• Your Last Name Contains Invalid Characters
• W3C Internationalization: Personal Names Around the World
• Spanish Names
• Chinese Names

More

• Redesigning the Country Selector
• American Religious Identification Survey Summary Report 2009
• Linguistic Potluck: Crowdsourcing Internationalization in Rails

Image Credits

Postsecret
Facebook
OKCupid
Google+
Metafilter
Diaspora
Flickr
FetLife
Kotangle
cutestpaw.com
hdwallpapers.in
xkcd

Many Thanks

Chiu-Ki Chan
Estelle Weyl
Heather Rivers
Heroku
Jeremy Dunck
Josh Susser
Michele Titolo
NIRD LLC
Renée De Voursney
Sarah Mei
San Francisco Sex Information
Yoz Grahame

Get in Touch

Carina C. Zona
@cczona
http://cczona.com
cczona@gmail.com
http://slideshare.net/cczona
http://linkedin.com/in/cczona

self-image

Schemas are the foundation for expressing things. [READ]

Schemas define the user experience

Schemas define user experience.

Our schemas are leaving people behind

& UX

[read] We can fix that.

What benefit will the user notice?

By asking ourselves, "What benefit will the USER NOTICE?" THIS is not equivalent to "How will the user benefit?" Because THAT is a question which grants us too much latitude to ASSUME that what we want is, of course, to their benefit, because it's gonna help us deliver a product that's Awesome.

[READ AGAIN] focuses us on EXPERIENCES.

Evaluating from user perspective gives us focus

There isn't ONE WAY. There isn't ONE RIGHT ANSWER. Because the NATURE of our dilemma is that users vary. And so do their social worlds. So we always need to understand varying approaches: how they fit different contexts, what tradeoffs they bring, AND the obvious, boring question: which options best serve THIS app's business requirements?

Checkbox / Radio button / Select menu / Ranges

Coerced / Discretionary / Guided

Corrective / Text Input / Textarea

Checkboxes, radio buttons, select menus, ranges: these each imply that EVERY possible value can be -- and is -- included. If any value isn't included there, it's the real world being rejected because it didn't match up with our mental schema.

Checkbox / Radio / Select

Coerced / Discretionary / Guided

Corrective / Text Input / Textarea

Checking a box is a one-step action. Entering a text string is not. So a freeform solution, such as a textarea or text input, isn't automatically exciting. BUT they can bring visible, meaningful benefit.

11 years

At MetaFilter, gender has been a text field for over a decade.

"...the early crowd at MeFi were often programmers, and they hated the idea of dirty data collection..."

-- Matt Haughey, founder

Initially some SHUDDERED at the thought. Because [READ].

cczona

All girl

XY

Fella

Queer

Fembot

Alto

It depends

MYOB

Dangly bits

Chicklet

Innie not outie

Convex

Sideburns

Ambisextrous

Member of the patriarchy

For about 5 seconds. Then they jumped on board. Because they could be creative and silly.

I speak using the male gender when required by language

50% quintessential tomboy, 50% total girly-girl

AND because they could express this THING about themselves fully. With authentic voice.

Transgender

Genderqueer

[PAUSE]

"The freeform gender field is something I cherish about Metafilter."

-- MeFi user

That text field grew into a beloved institution. Whatever you, as a user, choose to put into that field says something revealing about who YOU are.

"It was one of the earliest indications I'd landed in the right place."

-- MeFi user

That you're allowed to put in ANYTHING -- or put in NOTHING AT ALL -- says something revealing about how MetaFilter envisions community. That schema's trust in users was the foundation for Metafilter users to ASK to share themselves even more.

In 2010, Diaspora likewise turned gender into a text field. Just like on MetaFilter, users felt set free by it.

There's an important distinction here, though. Metafilter is closed source, it's one developer, and it's entirely in English. Diaspora needed to meet a different set of needs. It's open source and it's international.

Not amused

So some of its developers are not amused. It's fair to object that this approach wreaks chaos on internationalization of pronouns. [BEAT] And here's what I've got to say to that:

Internationalization is hellish, period.

Internationalization of pronouns is hellish, period. You get no safety net in this. Constraining gender options doesn't solve translation of languages. Don't set yourself up for trying to deal with internationalization this way. It will fail you.

What's your legal gender?

• Indeterminate

• X

• Sex Not Specified

And if you really think that boiling this question down to "What's your legal gender?" is an easy out, let's talk about internationalization of that too. Because this is the year that "indeterminate" has become a legal gender on German birth certificates, and that "X" is a legal gender on Australian passports. In New South Wales, Australia, "Sex Not Specified" is yet another legally recognized gender. Binary gender is, today, APP FAIL.

cczona

the most complicated thing Irsquove ever spent a lot of time learning about

And Irsquove spent a lot of time learning about quantum mechanics

mdashRandall Munroe xkcd

Randall Munroe best known as xkcd has examined the issue of pronouns many times in depth for English language projects He says its [READ] So hes what he ultimately concluded

cczona

Ask

Ask

cczona

Which pronouns do you prefer

hehimhimselfhishis

sheherherselfhersher

theythemthemselftheirstheir

itititselfitsits

by name

Asking straight up Which Pronouns Do You Prefer is truly the best he could come up with

cczona

Which pronouns do you prefer

hehimhimselfhishis

sheherherselfhersher

theythemthemselftheirstheir

personal name

other ____________

Or maybe slightly refining like these

And I know you may be looking at that 3rd row and protesting

cczona

Which pronouns do you prefer

hehimhimselfhishis

sheherherselfhersher

theythemthemselftheirstheir

personal name

other ____________

Our english teachers told us they and their arent singular right Nope Theyre plural AND SINGULAR and have been for 400 years Take the word of Jane Austen Shakespeare Chaucer Lewis Carroll Modern english authorities agree on this one too Its excellent English -- and excellent SOCIAL -- to use it when people ask us to

cczona

Even Facebook has taken note that its okay Theyve been using this construction for at least 5 years and are doing fine -)

cczona

Home is the place where when you have to go there they have to take you in

mdashRobert Frost

We aspire to make our social apps feel like home We can do that by making development choices that say Hey user we GET you Be as individual and messy as you want to We can handle _you_ being _you_

Social Research

When collecting data on people were in a different realm Social sciences If you want to run useful analytics about personal attributes and behavior then data collection needs to meet at least the two minimum criteria for social science research

ExhaustiveEvery possible option

A fields values must include every possible option

Mutually ExclusiveNo overlap exists between them

And the fields options must be mutually exclusive

How many social apps have both of those bare minimum criteria covered

cczona

Weve seen this one enough At least that set of values seems covered fully

cczona

`

But its not

cczona

Still lacking

cczona

Trying hard But still inadequate to express the real worlds complexity

We do realize that the real world has great variety Of course we do But its hard to know how to handle it

cczona

What user experience does this schema drive us toward

Its easy to figure Eh we can refactor later But initial choices set differing user experiences in motion Thinking at the outset about the real worlds variety and complexity starts us asking early questions that set foundation for social app-building Ask What user experience does this schema drive us toward

cczona

Data doesnrsquot have to be for analysis

Its easy to get into the habit of structuring data for easy analysis But we can choose to look at human data from a different perspective Step back Wallow in the users perspective

cczona

Data can be sheerexpressiveness

Data can be sheer expressiveness Data that has character individualism distinctiveness

cczona

What we want What we get

bull Structured

bull Predictablebull Validations exceptions

bull Conditionals partials

bull Relational

bull Indexed bull Premature optimization

bull Exhaustivebull Cultural variability

bull Individual POV

bull Moving target

bull Easy analytics

bull Data-driven decisionsbull Decisions based on false

premises

As developers we have a vision of what a good codebase should be and not be

[READ]

In the most sensible of ways we often arrive at solutions that are factually TRUTHY while far removed from real life utility

cczona

What is your religion if any

ARIS is the largest ongoing survey of Americans religious identification It asks this simple OPEN-ENDED question [slide]

Which nets over 100 unique answers Which if youre making a form based on that is tricky Theres no form element that makes it easy for our users to pick themselves out of a list of so many possibilities

cczona

ARIS found though that these could be compressed into 13 major categories More manageable list right We could use that for a form But eh a lot of those are edge cases Wed rather want to focus on genuinely major groupings

cczona

What is your religion if any

Christian 76

Other 4

None 15

Dont Know or Refused 5

Which brings us down to this At least a quarter of Americans are Christian Done Every other religion would just be clutter Edge cases

And then theres this sort of crummy data with it Wed probably assign nil value for more than one of these categories right

cczona

What is your religion

Christian 76

Other 4

na 20

Fixed Which focuses attention on a problem here 1 in 5 are not useful answers from an advertisers perspective or for out own analytics The NILS -- gotta go

cczona

What is your religion

Christian 76

Other 24

So what were left with is a good clear list It covers ALL the big stuff When you get reductive enough for Americans religion it a binary Which from a storage standpoint is great Booleans Score

cczona

Religion

But we dont do THIS It covers the biggest categories But oh how it leaves people out

cczona

People arent edge cases

People arent edge cases

And theyre pushing back on apps that treat them that way

[LONG BEAT LONG] [BREATH] [CHANGE GEARS]

So Reductive has big problems So does scaling upward

cczona

Balancing between approaches

As engineers its instinctually uncomfortable to DELIBERATELY NOT STRUCTURE DATA for easy analysis I feel ya I really do This freaks me too But again the foundational question is What benefit will the user notice What identities are okay with us

cczona

And if necessary we CAN strike a middleground This is where guided response comes in Autosuggest

cczona

CheckboxRadioSelect

TextareaText

Required OptionalAutosuggest

Minimal Suggest

When theres a subset of values that youre most interested in do minimal suggest instead Autosuggest using just the handful of values you care about

Structured data from those who want to give it Free form to incite expressiveness in those who want THAT

Unguided TextOf those who use MetaFilters gender field

40 of responses are f m female male

[read] So structured data IS there This can be a balanced solution in many cases where youre willing to tolerate some ambiguity Of course there are tradeoffsData quantity is lower Freed to opt out of proving personal info many doOn the other hand data quality should improve

Optional Select60 of Facebook users select a relationship status

Its fine to mix and match here Find the right approach for your users and your apps business objectives Facebook for instance makes relationship status completely optional but coercive for those who do opt-in to setting a value Most users do opt-in 60 of them select a relationship status

cczona

We want everyone to feel excited what weve

built

The bottom line is that [read]

cczona

We want users to feel passionate about their

involvement

[read]

cczona

Analytics investments amp monetization are

based on a premise that data is accurate

[read]

BUT the data has been collected by coercive approaches the RISK is that--

cczona

this premise is garbage Some people are lying because lying has been made requirement for getting past the barriers So conclusions drawn from that bad data can misdirect decision-making about the next stage of development

cczona

CheckboxRadioSelect

TextareaText

Required

Corrective

DiscretionaryGuided

The restrictive options that stuff at bottom left those dont actually have to be marked required

cczona

ADD column gender NOT NULL VARCHAR(6) DEFAULT female

But the way we setup schema often embeds assumptions that we should and we will So we do An attribute that is ambiguously named is destined to become a form field whose data wanders away from its original intent and code that misunderstands how to use it A field thats not allowed to be null is destined to be mandatory A field that is assigned a maximum length is an ASSERTION that all possible values are knowable and fit within it

cczona

ADD column gender_identity NOT NULL VARCHAR

BOOM This is foundation for a whole different user experience And the cool ninja move was that we decided to DO LESS Make this stuff flexible upfront Optimize storage later Decide whats valid later

cczona

CheckboxRadioSelect

TextareaText

Required Optional

Alteration

Guided

So there you have it a discretionary field Whether to respond is left up to the user

cczona

tstring relationship_status null =gt true

As developers we may upon THIS expression as redundant completely unnecessary Duh null is true by default But making that explicit is a communication to the team and to your future self Its a statement of intent Its documenting a product decision

cczona

FacebookSingle

In a relationship

Engaged

Married

Its complicated

Open relationship

What would a canonical set of relationship statuses look like Three years ago Facebook figured this list was pretty good Arguably pretty progressive too right Users disagreed Strongly

cczona

FacebookSingle

In a relationship

Engaged

Married

Its complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

Under pressure Facebook nearly doubled the options in just two years time

cczona

Google+Single

In a relationship

Engaged

Married

Its complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I dont want to say

While Google+ has largely adopted that list it has not included Separated or Divorced Notice that they also added something choice Opt out of labeling

cczona

Allowing users to identify their relationships with labels of greater personal significanceThats being driven by people rejecting a user experience that isnt working for them

cczona

Its a scope

How did some status seem universal while others werent Naming a thing creates scope The assumed validity of a fields values get constrained as soon as the field is named

cczona

marital_statusSingle

In a relationship

Engaged

Married

Its complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I dont want to say

Marital status for instance might lead to a list similar to this one A person is assumed to be either unmarried preparing to be married currently married or formerly married

cczona

legal_marital_statusSingle

In a relationship

Engaged

Married

Its complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I dont want to say

Marital status for instance might lead to a list similar to this one A person is assumed to be either unmarried preparing to be married currently married or formerly married

cczona

relationship_statusSingle

In a relationship

Engaged

Married

Its complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I dont want to say

Whereas relationship status might lead to a list more like this one In which one either has a current relationship -- or is defined by the absence of any

cczona

singleness_status

As soon as you name a field you define its paradigm and possibilities There are important difference in what each of these envision and are able to measure Naming fields -- with great specificity upfront -- Makes analyses more powerful later

cczona

singleness_rating

No chance whatsoever Suuuuuper duper single

1000

If you change the name you shift the paradigm and possibilities Theres important difference in these what these collect set out to measure Naming fields -- With great specificity upfront --Makes analyses more powerful later

cczona

Its ComplicatedIn a Relationship

Married Divorced

WidowedSingle

We go through life experiencing many relationships They dont all have pre-determined labels And new relationship identities dont necessarily leave old ones behind

cczona

I like to be truthful and Its Complicated is really deceiving It is not complicated I am separated from my husband who I am still legally married to

mdashFacebook user

[READ] Why should anyone have to feel like theyre required to disavowing a relationship thats meaningful for them

It cuts deeply sometimes Is a widow only Widowed until she resumes dating Why does she have to be confronted with a decision like that just to use someones app

cczona

Okay this one Spot the fatal flaw [WAIT]

cczona

buffer overflow

An open relationship is definitionally a 1Many join mdash WITH the usual engineering understanding that is that actual number of relationships may be 0 1 OR many

Choose (one): evasive, inauthentic

But it's not funny from the personal perspective. This schema forces the person to either LOOK EVASIVE or else BE INAUTHENTIC. This is what happens when we try to throw more labels at the problem instead of re-examining the schema's assumptions.
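One way to re-examine the assumption, sketched in plain Ruby: let a person hold zero, one, or many relationships, each with its own label, instead of one value from a fixed list. The `Person` and `Relationship` names and their methods are hypothetical and exist only for this illustration; they come from no particular app or framework.

```ruby
# Hypothetical model: each relationship carries its own free-text label.
Relationship = Struct.new(:label, :with)

class Person
  attr_reader :name, :relationships

  def initialize(name)
    @name = name
    @relationships = [] # zero relationships is a valid state
  end

  # Adding a relationship never overwrites an earlier one.
  def add_relationship(label, with: nil)
    @relationships << Relationship.new(label, with)
    self
  end

  # Labels in the order they were added; empty if nothing was shared.
  def relationship_labels
    @relationships.map(&:label)
  end
end

person = Person.new("example user")
person.add_relationship("Married", with: "spouse")
person.add_relationship("Separated", with: "spouse")
puts person.relationship_labels.inspect
```

With this shape, "separated but still legally married" needs no workaround: both labels can be true at once, and an empty list stays possible for anyone who prefers to share nothing.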

Facebook relationship status by user age

srsly?

When we don't give people means to express what's real and important for them, they hack around us. Are a quarter of Facebook's 13-year-olds really married? Probably not. What they probably have is a BEST FRIEND, and "married" is the nearest equivalent available for them to express "THIS PERSON is my most important relationship."

The dogged pursuit of nice, clean, CRUNCHABLE data takes us to such wrong places. And makes our users want to work against a system that doesn't acknowledge their lives' reality.

We build or break community with each line of code

As developers, our lines of code build -- or break -- an app's community. So it's worth making conscious choices as we go.

Modeling the real world is complex

[READ] And that's okay.

Assuming we know who users are surrenders opportunity to learn who they are

But [read]. Early constraints in schema NET crappy, misleading data. [beat] So keep constraints out of human and social schema, at least at first. Gather enough initial response to do some data mining. Watch that data for a while. Keep an eye out for emergent trends. What DISCOVERIES can you make? How can they shape the app's growth and evolution?

Quality

Data quality improves when lies are merely optional, not required.

Specificity

Data becomes like the real world itself: rich and specific. So we can unearth the patterns that are undetectable when data is generic.

Adaptability

We get to make discoveries and re-envision possibilities.

We adapt quickly.

Loyalty

And their response to all this? Engagement. Passion. LOYALTY.

THOSE are foundations for great user experiences. [LOOK UP. FINALITY]

@cczona
cczona@gmail.com
http://cczona.com

Questions? Conversation?

Oh Hell Yeah

Carina C. Zona

Resources

Data Science & Information Architecture

• Sociological normalization
• Database normalization
• Using Machine Learning On Social Networks To Figure Out What You Should Read On The Web
• NoSQL Data Modeling Techniques
• Data & Reality, by William Kent
• Bad Data Handbook, by Q. Ethan McCallum
• Data Science of the Facebook World

Relationships

• Does Facebook Hurt Relationships?
• Facebook Adds LGBT-Friendly Relationship Status Options
• Facebook Targeting by Relationship Status & Workplace
• Your Facebook Relationship Status: It's Complicated
• Gay Marriage: The Database Engineering Perspective

Sex & Genders

• "Disalienation: Why Gender is a Text Field on Diaspora"
• "Gender & Drop Down Menus"
• "Sex & Gender"
• "Bucket Gender"
• Recommendations for Inclusive Data Collection of Trans People

Names

• Falsehoods Programmers Believe About Names
• Your Last Name Contains Invalid Characters
• W3C Internationalization: Personal Names Around the World
• Spanish Names
• Chinese Names

More

• Redesigning the Country Selector
• American Religious Identification Survey Summary Report 2009
• Linguistic Potluck: Crowdsourcing Internationalization in Rails

Image Credits

Postsecret
Facebook
OKCupid
Google+
Metafilter
Diaspora
Flickr
FetLife
Kotangle
cutestpaw.com
hdwallpapers.in
xkcd

Many Thanks

Chiu-Ki Chan
Estelle Weyl
Heather Rivers
Heroku
Jeremy Dunck
Josh Susser
Michele Titolo
NIRD LLC
Renée De Voursney
Sarah Mei
San Francisco Sex Information
Yoz Grahame

Get in Touch

Carina C. Zona
@cczona
http://cczona.com
cczona@gmail.com
http://slideshare.net/cczona
http://linkedin.com/in/cczona


Schemas define the user experience

Schemas define user experience.

Our schemas are leaving people behind

& UX

[read] We can fix that.

What benefit will the user notice?

By asking ourselves, "What benefit will the USER NOTICE?" THIS is not equivalent to "How will the user benefit?" Because THAT is a question which grants us too much latitude to ASSUME that what we want is of course to their benefit, because it's gonna help us deliver a product that's Awesome.

[READ AGAIN] focuses us on EXPERIENCES.

Evaluating from user perspective gives us focus

There isn't ONE WAY. There isn't ONE RIGHT ANSWER. Because the NATURE of our dilemma is that users vary. And so do their social worlds. So we always need to understand varying approaches. How they fit different contexts. What tradeoffs they bring. AND the obvious, boring question: which options best serve THIS app's business requirements?

Checkbox, Radio button, Select menu, Ranges
Coerced, Discretionary, Guided
Corrective
Text Input, Textarea

Checkboxes, radio buttons, select menus, ranges: these each imply that EVERY possible value can be -- and is -- included. If any value isn't included there, it's the real world being rejected because it didn't match up with our mental schema.

Checkbox, Radio, Select
Coerced, Discretionary, Guided
Corrective
Text Input, Textarea

Checking a box is a one-step action. Entering a text string is not. So a freeform solution, such as a textarea or text input, isn't automatically exciting. BUT they can bring visible, meaningful benefit.

11 years

At MetaFilter, gender has been a text field for over a decade.

"...the early crowd at MeFi were often programmers and they hated the idea of dirty data collection..."

-- Matt Haughey, founder

Initially some SHUDDERED at the thought. Because [READ]


All girl

XY

Fella

Queer

Fembot

Alto

It depends

MYOB

Dangly bits

Chicklet

Innie not outie

Convex

Sideburns

Ambisextrous

Member of the patriarchy

For about 5 seconds. Then they jumped on board. Because they could be creative and silly.

I speak using the male gender when required by language
50% quintessential tomboy, 50% total girly-girl

AND because they could express this THING about themselves fully. With authentic voice.

Transgender
Genderqueer

[PAUSE]

"The freeform gender field is something I cherish about Metafilter."

-- MeFi user

That text field grew into a beloved institution. Whatever you, as user, choose to put into that field says something revealing about who YOU are.

"It was one of the earliest indications I'd landed in the right place."

-- MeFi user

That you're allowed to put in ANYTHING -- or put in NOTHING AT ALL -- says something revealing about how MetaFilter envisions community. That schema's trust in users was the foundation for Metafilter users to ASK to share themselves even more.

In 2010, Diaspora likewise turned gender into a text field. Just like on MetaFilter, users felt set free by it.

There's an important distinction here, though. Metafilter is closed source, it's one developer, and it's entirely in English. Diaspora needed to meet a different set of needs. It's open source and it's international.

Not amused

So some of its developers are not amused. It's fair to object that this approach wreaks chaos on internationalization of pronouns. [BEAT] And here's what I've got to say to that:

Internationalization is hellish. Period.

Internationalization of pronouns is hellish. Period. You get no safety net in this. Constraining gender options doesn't solve translation of languages. Don't set yourself up for trying to deal with internationalization this way. It will fail you.

What's your legal gender?

• Indeterminate
• X
• Sex Not Specified

And if you really think that boiling this question down to "What's your legal gender?" is an easy out, let's talk about internationalization of that too. Because this is the year that "indeterminate" has become a legal gender on German birth certificates, and that "X" is a legal gender on Australian passports. In New South Wales, Australia, "Sex Not Specified" is yet another legally recognized gender. Binary gender is, today, APP FAIL.

"the most complicated thing I've ever spent a lot of time learning about. And I've spent a lot of time learning about quantum mechanics."

-- Randall Munroe, xkcd

Randall Munroe, best known as xkcd, has examined the issue of pronouns many times, in depth, for English language projects. He says it's [READ]. So here's what he ultimately concluded:


Ask

Ask

Which pronouns do you prefer?

he / him / himself / his / his
she / her / herself / hers / her
they / them / themself / theirs / their
it / it / itself / its / its
by name

Asking straight up, "Which Pronouns Do You Prefer?" is truly the best he could come up with.

Which pronouns do you prefer?

he / him / himself / his / his
she / her / herself / hers / her
they / them / themself / theirs / their
personal name
other ____________

Or maybe slightly refining, like these.

And I know you may be looking at that 3rd row and protesting.

Which pronouns do you prefer?

he / him / himself / his / his
she / her / herself / hers / her
they / them / themself / theirs / their
personal name
other ____________

Our English teachers told us "they" and "their" aren't singular, right? Nope. They're plural AND SINGULAR, and have been for 400 years. Take the word of Jane Austen, Shakespeare, Chaucer, Lewis Carroll. Modern English authorities agree on this one too. It's excellent English -- and excellent SOCIAL -- to use it when people ask us to.
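The form on the slide can be backed by a very small data structure. The following is a minimal sketch in plain Ruby, assuming the five columns on the slide are subject, object, reflexive, possessive pronoun, and possessive determiner. `PronounSet`, `pronouns_for`, and `notification` are hypothetical names made up for this illustration.

```ruby
# One row of the form: five grammatical forms, in the slide's column order.
PronounSet = Struct.new(:subject, :object, :reflexive, :possessive, :determiner)

PRONOUN_SETS = {
  "he"   => PronounSet.new("he", "him", "himself", "his", "his"),
  "she"  => PronounSet.new("she", "her", "herself", "hers", "her"),
  "they" => PronounSet.new("they", "them", "themself", "theirs", "their")
}.freeze

# `choice` is "he", "she", "they", "name" (always use the personal name),
# or a PronounSet the person typed in themselves (the "other" row).
def pronouns_for(name, choice)
  return choice if choice.is_a?(PronounSet)
  return PronounSet.new(name, name, name, "#{name}'s", "#{name}'s") if choice == "name"

  PRONOUN_SETS.fetch(choice) # raises KeyError for an unknown choice
end

def notification(name, choice)
  set = pronouns_for(name, choice)
  "#{name} updated #{set.determiner} profile."
end

puts notification("Sam", "they") # Sam updated their profile.
puts notification("Sam", "name") # Sam updated Sam's profile.
```

Because the "other" row is stored as a full set rather than a label, display code never needs to know how many options the form offers.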

Even Facebook has taken note that it's okay. They've been using this construction for at least 5 years and are doing fine :-)

"Home is the place where, when you have to go there, they have to take you in."

-- Robert Frost

We aspire to make our social apps feel like home. We can do that by making development choices that say, "Hey user, we GET you. Be as individual and messy as you want to. We can handle _you_ being _you_."

Social Research

When collecting data on people, we're in a different realm: social sciences. If you want to run useful analytics about personal attributes and behavior, then data collection needs to meet at least the two minimum criteria for social science research.

Exhaustive
Every possible option

A field's values must include every possible option.

Mutually Exclusive
No overlap exists between them

And the field's options must be mutually exclusive.

How many social apps have both of those bare minimum criteria covered?
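The two criteria above can be checked mechanically against what people actually report. This is a rough sketch in plain Ruby; `audit_options` is a hypothetical helper, and the sample responses are invented for illustration rather than taken from any real dataset.

```ruby
# Each response is the list of labels a person says truly apply to them.
# The option list is exhaustive if nobody is left with zero matches, and
# mutually exclusive if nobody matches more than one option.
def audit_options(options, responses)
  unmatched   = responses.select { |applies| (applies & options).empty? }
  overlapping = responses.select { |applies| (applies & options).size > 1 }
  {
    exhaustive: unmatched.empty?,
    mutually_exclusive: overlapping.empty?,
    unmatched: unmatched,
    overlapping: overlapping
  }
end

options = ["Single", "In a relationship", "Married", "Separated", "Widowed"]
responses = [
  ["Married"],
  ["Married", "Separated"],         # separated but still legally married
  ["Widowed", "In a relationship"], # widowed and dating again
  ["Best friend"]                   # no option fits at all
]

report = audit_options(options, responses)
puts report[:exhaustive]         # false
puts report[:mutually_exclusive] # false
```

A single-select menu built from `options` would force three of these four people to misreport, which is the gap the criteria are meant to expose.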

We've seen this one enough. At least that set of values seems covered fully.

But it's not.

Still lacking.

Trying hard. But still inadequate to express the real world's complexity.

We do realize that the real world has great variety. Of course we do. But it's hard to know how to handle it.

What user experience does this schema drive us toward?

It's easy to figure, "Eh, we can refactor later." But initial choices set differing user experiences in motion. Thinking at the outset about the real world's variety and complexity starts us asking early questions that set the foundation for social app-building. Ask: What user experience does this schema drive us toward?

Data doesn't have to be for analysis

It's easy to get into the habit of structuring data for easy analysis. But we can choose to look at human data from a different perspective. Step back. Wallow in the user's perspective.

Data can be sheer expressiveness

Data can be sheer expressiveness. Data that has character, individualism, distinctiveness.

What we want

• Structured
• Predictable
• Relational
• Indexed
• Exhaustive
• Easy analytics
• Data-driven decisions

What we get

• Validations, exceptions
• Conditionals, partials
• Premature optimization
• Cultural variability
• Individual POV
• Moving target
• Decisions based on false premises

As developers, we have a vision of what a good codebase should be, and not be.

[READ]

In the most sensible of ways, we often arrive at solutions that are factually TRUTHY while far removed from real-life utility.

What is your religion, if any?

ARIS is the largest ongoing survey of Americans' religious identification. It asks this simple, OPEN-ENDED question: [slide]

Which nets over 100 unique answers. Which, if you're making a form based on that, is tricky. There's no form element that makes it easy for our users to pick themselves out of a list of so many possibilities.

ARIS found, though, that these could be compressed into 13 major categories. More manageable list, right? We could use that for a form. But eh, a lot of those are edge cases. We'd rather focus on genuinely major groupings.

What is your religion, if any?

Christian 76%
Other 4%
None 15%
Don't Know or Refused 5%

Which brings us down to this. At least three quarters of Americans are Christian. Done. Every other religion would just be clutter. Edge cases.

And then there's this sort of crummy data with it. We'd probably assign a nil value for more than one of these categories, right?

What is your religion?

Christian 76%
Other 4%
n/a 20%

Fixed. Which focuses attention on a problem here: 1 in 5 are not useful answers from an advertiser's perspective, or for our own analytics. The NILS -- gotta go.

What is your religion?

Christian 76%
Other 24%

So what we're left with is a good, clear list. It covers ALL the big stuff. When you get reductive enough, for Americans' religion, it's a binary. Which, from a storage standpoint, is great. Booleans! Score!

Religion

But we don't do THIS. It covers the biggest categories. But oh, how it leaves people out.
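The reduction walked through on the preceding slides is simple arithmetic, and tracing it makes the loss visible. A small sketch in plain Ruby, using the percentages shown on the slides; `collapse` is a hypothetical helper written for this illustration.

```ruby
# Shares of respondents, in percent, as shown on the slides.
SURVEY = {
  "Christian" => 76,
  "Other" => 4,
  "None" => 15,
  "Don't Know or Refused" => 5
}.freeze

# Merge the `from` categories into the `into` bucket, summing their shares.
def collapse(shares, into, *from)
  merged = from.sum { |key| shares.fetch(key) }
  rest = shares.reject { |key, _| from.include?(key) }
  rest.merge(into => rest.fetch(into, 0) + merged)
end

step1 = collapse(SURVEY, "n/a", "None", "Don't Know or Refused")
step2 = collapse(step1, "Other", "n/a")

puts step1.inspect # Christian 76, Other 4, n/a 20
puts step2.inspect # Christian 76, Other 24
```

Each step still sums to 100, so nothing looks wrong in a report. What has changed is that 24 percent of respondents can no longer be told apart from one another.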

People aren't edge cases

People aren't edge cases.

And they're pushing back on apps that treat them that way.

[LONG BEAT. LONG] [BREATH] [CHANGE GEARS]

So: Reductive has big problems. So does scaling upward.

Balancing between approaches

As engineers, it's instinctually uncomfortable to DELIBERATELY NOT STRUCTURE DATA for easy analysis. I feel ya. I really do. This freaks me out too. But again, the foundational question is: What benefit will the user notice? What identities are okay with us?

And if necessary, we CAN strike a middle ground. This is where guided response comes in. Autosuggest.

Checkbox, Radio, Select
Textarea, Text
Required, Optional
Autosuggest
Minimal Suggest

When there's a subset of values that you're most interested in, do minimal suggest instead. Autosuggest using just the handful of values you care about.

Structured data from those who want to give it. Free form to incite expressiveness in those who want THAT.
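Minimal suggest, as described above, might look something like this on the server side. It is a sketch in plain Ruby under stated assumptions: the `SUGGESTED` list, `suggest`, and `record` are hypothetical names, and a real app would wire this to whatever autosuggest widget it already uses.

```ruby
# Hypothetical handful of values the app cares most about.
SUGGESTED = ["Single", "In a relationship", "Married"].freeze

# Suggestions whose text starts with what the user has typed so far.
def suggest(typed, candidates = SUGGESTED)
  prefix = typed.to_s.strip.downcase
  return [] if prefix.empty?

  candidates.select { |c| c.downcase.start_with?(prefix) }
end

# Whatever the user submits is stored as typed. If it matches a suggested
# value, that value is recorded alongside it as structured data.
def record(answer, candidates = SUGGESTED)
  text = answer.to_s.strip
  return { text: nil, structured: nil } if text.empty? # opting out is allowed

  match = candidates.find { |c| c.casecmp?(text) }
  { text: text, structured: match }
end

puts suggest("ma").inspect                 # ["Married"]
puts record("married")[:structured]        # Married
puts record("Best friends forever")[:text] # Best friends forever
```

The suggestions only nudge. Nothing in `record` rejects an answer, so the form stays free text for anyone who wants it.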

Unguided Text

Of those who use MetaFilter's gender field, 40% of responses are f, m, female, male

[read] So structured data IS there. This can be a balanced solution in many cases where you're willing to tolerate some ambiguity. Of course there are tradeoffs. Data quantity is lower. Freed to opt out of providing personal info, many do. On the other hand, data quality should improve.
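To see how much structured data is hiding in a free-text field, one rough approach is to normalize the answers and count how many land on a known value. The sketch below is plain Ruby; `CANONICAL` and `structured_share` are hypothetical, and the sample list is invented for illustration (it is not MetaFilter's data).

```ruby
# Known spellings mapped to the structured value they stand for.
CANONICAL = {
  "f" => "female", "female" => "female",
  "m" => "male",   "male"   => "male"
}.freeze

# Percentage of answered (non-blank) responses that map to a known value.
def structured_share(responses)
  answered = responses.map { |r| r.to_s.strip.downcase }.reject(&:empty?)
  return 0.0 if answered.empty?

  hits = answered.count { |r| CANONICAL.key?(r) }
  (100.0 * hits / answered.size).round(1)
end

# Invented sample: ten answered, two left blank.
sample = ["F", "male", "Fembot", "It depends", "m", "", nil,
          "Alto", "Queer", "MYOB", "female", "Convex"]

puts structured_share(sample) # 40.0
```

Blank answers are left out of the denominator here on purpose: people who opted out are a separate measurement from people who answered in their own words.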

Optional Select

60% of Facebook users select a relationship status

It's fine to mix and match here. Find the right approach for your users and your app's business objectives. Facebook, for instance, makes relationship status completely optional, but coercive for those who do opt in to setting a value. Most users do opt in: 60% of them select a relationship status.

We want everyone to feel excited about what we've built

The bottom line is that [read]

We want users to feel passionate about their involvement

[read]

Analytics, investments & monetization are based on a premise that data is accurate

[read]

BUT if the data has been collected by coercive approaches, the RISK is that this premise is garbage. Some people are lying because lying has been made a requirement for getting past the barriers. So conclusions drawn from that bad data can misdirect decision-making about the next stage of development.

Checkbox, Radio, Select
Textarea, Text
Required
Corrective
Discretionary, Guided

The restrictive options, that stuff at bottom left: those don't actually have to be marked required.

ADD column gender NOT NULL VARCHAR(6) DEFAULT 'female'

But the way we set up schema often embeds assumptions that we should, and we will. So we do. An attribute that is ambiguously named is destined to become a form field whose data wanders away from its original intent, and code that misunderstands how to use it. A field that's not allowed to be null is destined to be mandatory. A field that is assigned a maximum length is an ASSERTION that all possible values are knowable and fit within it.

ADD column gender_identity NOT NULL VARCHAR

BOOM. This is the foundation for a whole different user experience. And the cool ninja move was that we decided to DO LESS. Make this stuff flexible upfront. Optimize storage later. Decide what's valid later.
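The effect of those column options can be simulated without a database. This is a toy sketch in plain Ruby: `Column` is a made-up stand-in and not a real database or Rails API. It assumes the flexible version drops both the length limit and the NOT NULL requirement, which is what the narration about doing less and about discretionary fields suggests.

```ruby
# Toy stand-in for a column definition (hypothetical, for illustration only).
Column = Struct.new(:name, :null, :limit) do
  # Would this column accept the value a user wants to store?
  def accepts?(value)
    return null if value.nil? # nil is accepted only when null is allowed

    limit.nil? || value.length <= limit
  end
end

strict   = Column.new("gender", false, 6)         # NOT NULL, VARCHAR(6)
flexible = Column.new("gender_identity", true, nil) # nullable, no length limit

answers = ["female", "male", "Genderqueer", "It depends", nil]

rejected_by_strict   = answers.reject { |a| strict.accepts?(a) }
rejected_by_flexible = answers.reject { |a| flexible.accepts?(a) }

puts rejected_by_strict.inspect   # ["Genderqueer", "It depends", nil]
puts rejected_by_flexible.inspect # []
```

The strict column turns three of five real answers into errors before any product decision has been made about them. The flexible one defers that decision until there is data to base it on.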

Checkbox, Radio, Select
Textarea, Text
Required, Optional
Alteration
Guided

So there you have it: a discretionary field. Whether to respond is left up to the user.

t.string :relationship_status, :null => true

As developers, we may look upon THIS expression as redundant, completely unnecessary. Duh, null is true by default. But making that explicit is a communication to the team and to your future self. It's a statement of intent. It's documenting a product decision.

Facebook

Single
In a relationship
Engaged
Married
It's complicated
Open relationship

What would a canonical set of relationship statuses look like? Three years ago, Facebook figured this list was pretty good. Arguably pretty progressive too, right? Users disagreed. Strongly.

Facebook

Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Separated
Divorced
Civil union
Domestic partnership

Under pressure, Facebook nearly doubled the options in just two years' time.

Google+

Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Separated
Divorced
Civil union
Domestic partnership
I don't want to say

While Google+ has largely adopted that list, it has not included Separated or Divorced. Notice that they also added something choice: Opt out of labeling.

Allowing users to identify their relationships with labels of greater personal significance: that's being driven by people rejecting a user experience that isn't working for them.

It's a scope

How did some statuses seem universal while others weren't? Naming a thing creates scope. The assumed validity of a field's values gets constrained as soon as the field is named.

marital_status

Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Separated
Divorced
Civil union
Domestic partnership
I don't want to say

Marital status, for instance, might lead to a list similar to this one. A person is assumed to be either unmarried, preparing to be married, currently married, or formerly married.

legal_marital_status

Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Separated
Divorced
Civil union
Domestic partnership
I don't want to say

Marital status, for instance, might lead to a list similar to this one. A person is assumed to be either unmarried, preparing to be married, currently married, or formerly married.


Page 30: Schemas for the Real World [Madison RubyConf 2013]

cczona

Our schemas are leaving people behind

amp UX

[read]We can fix that

cczona

What benefit will the user notice

By asking ourselves What benefit will the USER NOTICE THIS is not equivalent to How will the user benefit Because THAT is a question which grants us too much latitude To ASSUME that what we want is of course to their benefit because its gonna help us deliver a product thats Awesome

[READ AGAIN] focuses us on EXPERIENCES

cczona

Evaluating from user perspective gives us focus

There isnt ONE WAY There isnt ONE RIGHT ANSWER Because the NATURE of our dilemma is that users vary And so do their social worlds So we always need to understand varying approaches How they fit different contexts What tradeoffs they bring AND the obvious boring question which options best serve THIS apps business requirements

cczona

CheckboxRadio buttonSelect menuRanges

Coerced DiscretionaryGuided

Corrective Text InputTextarea

Checkboxes radio buttons select menus ranges these each imply that EVERY possible value can -- and is -- included If any value isnt included there its real world being rejected because it didnt match up with our mental schema

cczona

CheckboxRadioSelect

Coerced DiscretionaryGuided

Corrective Text InputTextarea

Checking a box is a one step action Entering a text string is not So a freeform solution such as a textarea or text input isnt automatically exciting BUT they can bring visible meaningful benefit

cczona

11 years

At MetaFilter gender has been a text field for over a decade

cczona

hellipthe early crowd at MeFi were often programmers and they hated the idea of dirty data collectionhellip

mdashMatt Haughey founder

Initially some SHUDDERED at the thought Because [READ]

cczona

All girl

XY

Fella

Queer

Fembot

Alto

It depends

MYOB

Dangly bits

Chicklet

Innie not outie

Convex

Sideburns

Ambisextrous

Member of the patriarchy

For about 5 seconds Then they jumped on board Because they could be creative and silly

cczona

I speak using the male gender when required by language 50 quintessential

tomboy 50 total girly-girl

AND because they could express this THING about themselves fully With authentic voice

cczona

TransgenderGenderqueer

[PAUSE]

cczona

The freeform gender field is something I cherish about Metafilter

mdashMeFi user

That text field grew into a beloved institution Whatever you as user choose to put into that field says something revealing about who YOU are

cczona

It was one of the earliest indications Id landed in the right place

mdashMeFi user

That youre allowed to put in ANYTHING -- or put in NOTHING AT ALL -- says something revealing about what how MetaFilter envisions community That schemas trust in users was the foundation for Metafilter users to ASK to share themselves even more

cczona

In 2010 Diaspora likewise turned gender into a text field Just like on MetaFilter users felt set free by it

Theres an important distinction here though Metafilter is closed source its one developer and its entirely in English Diaspora needed to meet a different set of needs Its open source and its international

cczona

Not amused

So some of its developers are not amused Its fair to object that this approach wreaks chaos on internationalization of pronouns [BEAT] And heres what Ive got to say to that

cczona

Internationalization is hellish period

Internationalization of pronouns is hellish period You get no safety net in this Constraining gender options doesnt solve translation of languages Dont set yourself up for trying to deal with internationalization this way It will fail you

cczona

Whats your legal gender

bull Indeterminate

bull X

bull Sex Not Specified

And if you really thing that boiling this question down to Whats your legal gender is an easy out lets talk about internationaliztion of that too Because this is the year that indeterminate has become a legal gender on German birth certificates and that X is a legal gender on Australian passports In New South Wales Australia Sex Not Specified is yet another legally-recognized gender Binary gender is today APP FAIL

cczona

the most complicated thing Irsquove ever spent a lot of time learning about

And Irsquove spent a lot of time learning about quantum mechanics

mdashRandall Munroe xkcd

Randall Munroe best known as xkcd has examined the issue of pronouns many times in depth for English language projects He says its [READ] So hes what he ultimately concluded

cczona

Ask

Ask

cczona

Which pronouns do you prefer

hehimhimselfhishis

sheherherselfhersher

theythemthemselftheirstheir

itititselfitsits

by name

Asking straight up Which Pronouns Do You Prefer is truly the best he could come up with

cczona

Which pronouns do you prefer

hehimhimselfhishis

sheherherselfhersher

theythemthemselftheirstheir

personal name

other ____________

Or maybe slightly refining like these

And I know you may be looking at that 3rd row and protesting

cczona

Which pronouns do you prefer

hehimhimselfhishis

sheherherselfhersher

theythemthemselftheirstheir

personal name

other ____________

Our english teachers told us they and their arent singular right Nope Theyre plural AND SINGULAR and have been for 400 years Take the word of Jane Austen Shakespeare Chaucer Lewis Carroll Modern english authorities agree on this one too Its excellent English -- and excellent SOCIAL -- to use it when people ask us to

cczona

Even Facebook has taken note that its okay Theyve been using this construction for at least 5 years and are doing fine -)

cczona

Home is the place where when you have to go there they have to take you in

mdashRobert Frost

We aspire to make our social apps feel like home We can do that by making development choices that say Hey user we GET you Be as individual and messy as you want to We can handle _you_ being _you_

Social Research

When collecting data on people were in a different realm Social sciences If you want to run useful analytics about personal attributes and behavior then data collection needs to meet at least the two minimum criteria for social science research

ExhaustiveEvery possible option

A fields values must include every possible option

Mutually ExclusiveNo overlap exists between them

And the fields options must be mutually exclusive

How many social apps have both of those bare minimum criteria covered

cczona

Weve seen this one enough At least that set of values seems covered fully

cczona

`

But its not

cczona

Still lacking

cczona

Trying hard But still inadequate to express the real worlds complexity

We do realize that the real world has great variety Of course we do But its hard to know how to handle it

cczona

What user experience does this schema drive us toward

Its easy to figure Eh we can refactor later But initial choices set differing user experiences in motion Thinking at the outset about the real worlds variety and complexity starts us asking early questions that set foundation for social app-building Ask What user experience does this schema drive us toward

cczona

Data doesn't have to be for analysis

It's easy to get into the habit of structuring data for easy analysis. But we can choose to look at human data from a different perspective. Step back. Wallow in the user's perspective.

Data can be sheer expressiveness

Data can be sheer expressiveness. Data that has character, individualism, distinctiveness.

What we want:
• Structured
• Predictable
• Relational
• Indexed
• Exhaustive
• Easy analytics
• Data-driven decisions

What we get:
• Validations, exceptions
• Conditionals, partials
• Premature optimization
• Cultural variability
• Individual POV
• Moving target
• Decisions based on false premises

As developers, we have a vision of what a good codebase should be, and not be.

[READ]

In the most sensible of ways, we often arrive at solutions that are factually TRUTHY while far removed from real-life utility.

What is your religion, if any?

ARIS is the largest ongoing survey of Americans' religious identification. It asks this simple, OPEN-ENDED question. [slide]

Which nets over 100 unique answers. Which, if you're making a form based on that, is tricky. There's no form element that makes it easy for our users to pick themselves out of a list of so many possibilities.

ARIS found, though, that these could be compressed into 13 major categories. More manageable list, right? We could use that for a form. But, eh, a lot of those are edge cases. We'd rather focus on genuinely major groupings.

What is your religion, if any?

Christian 76%
Other 4%
None 15%
Don't Know or Refused 5%

Which brings us down to this. At least three quarters of Americans are Christian. Done. Every other religion would just be clutter. Edge cases.

And then there's this sort of crummy data with it. We'd probably assign a nil value for more than one of these categories, right?

What is your religion?

Christian 76%
Other 4%
n/a 20%

Fixed. Which focuses attention on a problem here: 1 in 5 are not useful answers from an advertiser's perspective, or for our own analytics. The NILS -- gotta go.

What is your religion?

Christian 76%
Other 24%

So what we're left with is a good, clear list. It covers ALL the big stuff. When you get reductive enough, for Americans, religion is a binary. Which from a storage standpoint is great. Booleans! Score!

Religion

But we don't do THIS. It covers the biggest categories. But oh, how it leaves people out.
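The arithmetic of that reduction can be replayed in a few lines. A rough Ruby sketch, using the percentages from the slides; the collapse helper and the constant names are made up for illustration:

```ruby
# Percentages as shown on the slides.
SURVEY = {
  "Christian" => 76,
  "Other" => 4,
  "None" => 15,
  "Don't Know or Refused" => 5
}.freeze

# Merge the given categories into a single bucket, summing their shares.
def collapse(tally, categories, into:)
  kept = tally.reject { |label, _share| categories.include?(label) }
  merged = tally.values_at(*categories).sum
  kept.merge(into => kept.fetch(into, 0) + merged)
end

# Step one: treat "None" and "Don't Know or Refused" as nil-like answers.
STEP_ONE = collapse(SURVEY, ["None", "Don't Know or Refused"], into: "n/a")
# Step two: fold the nils into "Other", leaving something a boolean can store.
STEP_TWO = collapse(STEP_ONE, ["n/a"], into: "Other")

puts STEP_ONE.inspect
puts STEP_TWO.inspect
```

Each step is lossy: once the answers are stored as the final two buckets, none of the original open-ended responses can be recovered from them.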

People aren't edge cases

People aren't edge cases.

And they're pushing back on apps that treat them that way.

[LONG BEAT LONG] [BREATH] [CHANGE GEARS]

So: reductive has big problems. So does scaling upward.

Balancing between approaches

As engineers, it's instinctually uncomfortable to DELIBERATELY NOT STRUCTURE DATA for easy analysis. I feel ya. I really do. This freaks me out, too. But again, the foundational question is: "What benefit will the user notice?" What identities are okay with us?

And if necessary, we CAN strike a middle ground. This is where guided response comes in. Autosuggest.

Checkbox, Radio, Select
Textarea, Text
Required, Optional
Autosuggest, Minimal Suggest

When there's a subset of values that you're most interested in, do minimal suggest instead. Autosuggest using just the handful of values you care about.

Structured data from those who want to give it. Free form to incite expressiveness in those who want THAT.
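Minimal suggest might look something like this on the server side. This is only a sketch in plain Ruby: the SUGGESTED list and both helper names are hypothetical, and a real app would wire this into whatever autosuggest widget it already uses.

```ruby
# Hypothetical: the handful of values this imaginary app cares most about.
SUGGESTED = ["female", "male", "genderqueer", "transgender"].freeze

# Suggest only from the short list, using a case-insensitive prefix match.
def suggestions_for(typed)
  prefix = typed.to_s.strip.downcase
  return [] if prefix.empty?

  SUGGESTED.select { |value| value.start_with?(prefix) }
end

# Accept whatever the user finally submits, suggested or not.
# A blank answer is stored as nil: the user chose not to respond.
def accept(input)
  text = input.to_s.strip
  text.empty? ? nil : text
end

puts suggestions_for("gen").inspect
puts accept("  Fembot ").inspect
```

The suggestions nudge toward structured values, but nothing is rejected: anything typed is kept as-is, and leaving the field empty is a valid answer.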

Unguided Text: Of those who use MetaFilter's gender field, 40% of responses are "f", "m", "female", "male"

[read] So structured data IS there. This can be a balanced solution in many cases where you're willing to tolerate some ambiguity. Of course there are tradeoffs. Data quantity is lower. Freed to opt out of providing personal info, many do. On the other hand, data quality should improve.
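Pulling the structured part back out of an unguided text field takes little code. A minimal Ruby sketch, assuming a hypothetical CANONICAL mapping that covers only the four values named on the slide; the sample responses are made up:

```ruby
# Hypothetical mapping: only the four values named on the slide.
CANONICAL = {
  "f" => "female", "female" => "female",
  "m" => "male",   "male" => "male"
}.freeze

# Split answered responses into a structured tally and untouched free text.
def summarize(responses)
  answered = responses.map { |r| r.to_s.strip }.reject(&:empty?)
  structured, freeform = answered.partition { |r| CANONICAL.key?(r.downcase) }
  counts = structured.each_with_object(Hash.new(0)) do |r, tally|
    tally[CANONICAL[r.downcase]] += 1
  end
  share = answered.empty? ? 0.0 : structured.size.to_f / answered.size
  { structured: counts, freeform: freeform, structured_share: share }
end

# Made-up responses, including a blank and a skipped answer.
SAMPLE = ["F", "male", "Fembot", "", "It depends", nil].freeze
puts summarize(SAMPLE).inspect
```

Blank and skipped answers are left out of the share rather than counted as a category, and the free-form answers are passed through unchanged.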

Optional Select: 60% of Facebook users select a relationship status

It's fine to mix and match here. Find the right approach for your users and your app's business objectives. Facebook, for instance, makes relationship status completely optional, but coercive for those who do opt in to setting a value. Most users do opt in: 60% of them select a relationship status.

We want everyone to feel excited about what we've built

The bottom line is that [read]

We want users to feel passionate about their involvement

[read]

Analytics investments & monetization are based on a premise that data is accurate

[read]

BUT if the data has been collected by coercive approaches, the RISK is that--

this premise is garbage. Some people are lying, because lying has been made a requirement for getting past the barriers. So conclusions drawn from that bad data can misdirect decision-making about the next stage of development.

Checkbox, Radio, Select
Textarea, Text
Required
Corrective
Discretionary, Guided

The restrictive options, that stuff at bottom left: those don't actually have to be marked required.

ADD column gender NOT NULL VARCHAR(6) DEFAULT 'female'

But the way we set up schema often embeds assumptions that we should, and we will. So we do. An attribute that is ambiguously named is destined to become a form field whose data wanders away from its original intent, and code that misunderstands how to use it. A field that's not allowed to be null is destined to be mandatory. A field that is assigned a maximum length is an ASSERTION that all possible values are knowable and fit within it.

ADD column gender_identity NOT NULL VARCHAR

BOOM. This is the foundation for a whole different user experience. And the cool ninja move was that we decided to DO LESS. Make this stuff flexible upfront. Optimize storage later. Decide what's valid later.
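To make the contrast concrete, here is a toy stand-in for the two column definitions. It is plain Ruby, not ActiveRecord or SQL; the Column struct and its store method are invented for illustration, and the flexible column is assumed to be nullable with no length limit and no default, which is one reading of "do less".

```ruby
# Hypothetical stand-in for a column definition; not a real database API.
Column = Struct.new(:name, :null, :limit, :default) do
  # Returns what would be stored when the user submits `input`.
  def store(input)
    value = input.nil? ? default : input
    raise ArgumentError, "#{name} is required" if value.nil? && !null
    raise ArgumentError, "#{name} is too long" if value && limit && value.length > limit

    value
  end
end

# The rigid column: required, at most 6 characters, defaulting to "female".
RIGID = Column.new("gender", false, 6, "female")
# The flexible column (assumed): optional, no length limit, no default.
FLEXIBLE = Column.new("gender_identity", true, nil, nil)

puts RIGID.store(nil).inspect              # an unanswered field becomes the default
puts FLEXIBLE.store(nil).inspect           # an unanswered field stays unanswered
puts FLEXIBLE.store("genderqueer").inspect # a longer answer is kept as given
```

The rigid column quietly records an answer for someone who never gave one and raises on a value longer than six characters; the flexible one stores exactly what the user said, including nothing.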

Checkbox, Radio, Select
Textarea, Text
Required, Optional
Alteration
Guided

So there you have it: a discretionary field. Whether to respond is left up to the user.

t.string :relationship_status, :null => true

As developers, we may look upon THIS expression as redundant, completely unnecessary. Duh, null is true by default. But making that explicit is a communication to the team, and to your future self. It's a statement of intent. It's documenting a product decision.

Facebook
• Single
• In a relationship
• Engaged
• Married
• It's complicated
• Open relationship

What would a canonical set of relationship statuses look like? Three years ago, Facebook figured this list was pretty good. Arguably pretty progressive too, right? Users disagreed. Strongly.

Facebook
• Single
• In a relationship
• Engaged
• Married
• It's complicated
• Open relationship
• Widowed
• Separated
• Divorced
• Civil union
• Domestic partnership

Under pressure, Facebook nearly doubled the options in just two years' time.

Google+
• Single
• In a relationship
• Engaged
• Married
• It's complicated
• Open relationship
• Widowed
• Separated
• Divorced
• Civil union
• Domestic partnership
• I don't want to say

While Google+ has largely adopted that list, it has not included Separated or Divorced. Notice that they also added something choice: opt out of labeling.

Allowing users to identify their relationships with labels of greater personal significance. That's being driven by people rejecting a user experience that isn't working for them.

It's a scope

How did some statuses seem universal while others weren't? Naming a thing creates scope. The assumed validity of a field's values gets constrained as soon as the field is named.

marital_status
• Single
• In a relationship
• Engaged
• Married
• It's complicated
• Open relationship
• Widowed
• Separated
• Divorced
• Civil union
• Domestic partnership
• I don't want to say

"Marital status", for instance, might lead to a list similar to this one. A person is assumed to be either unmarried, preparing to be married, currently married, or formerly married.

legal_marital_status
• Single
• In a relationship
• Engaged
• Married
• It's complicated
• Open relationship
• Widowed
• Separated
• Divorced
• Civil union
• Domestic partnership
• I don't want to say

"Marital status", for instance, might lead to a list similar to this one. A person is assumed to be either unmarried, preparing to be married, currently married, or formerly married.

relationship_status
• Single
• In a relationship
• Engaged
• Married
• It's complicated
• Open relationship
• Widowed
• Separated
• Divorced
• Civil union
• Domestic partnership
• I don't want to say

Whereas "relationship status" might lead to a list more like this one. In which one either has a current relationship -- or is defined by the absence of any.

singleness_status

As soon as you name a field, you define its paradigm and possibilities. There are important differences in what each of these envisions and is able to measure. Naming fields -- with great specificity upfront -- makes analyses more powerful later.

singleness_rating

No chance whatsoever ... Suuuuuper duper single

1000

If you change the name, you shift the paradigm and possibilities. There's an important difference in what these set out to measure. Naming fields -- with great specificity upfront -- makes analyses more powerful later.

It's Complicated, In a Relationship, Married, Divorced, Widowed, Single

We go through life experiencing many relationships. They don't all have pre-determined labels. And new relationship identities don't necessarily leave old ones behind.

"I like to be truthful, and 'It's Complicated' is really deceiving. It is not complicated. I am separated from my husband, who I am still legally married to."

-- Facebook user

[READ] Why should anyone have to feel like they're required to disavow a relationship that's meaningful for them?

It cuts deeply sometimes. Is a widow only "Widowed" until she resumes dating? Why does she have to be confronted with a decision like that just to use someone's app?

Okay, this one. Spot the fatal flaw. [WAIT]

buffer overflow

An open relationship is definitionally a 1:Many join -- WITH the usual engineering understanding, that is, that the actual number of relationships may be 0, 1, OR many.
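In schema terms, that means a person has a collection of relationships instead of one status column. A small Ruby sketch of the idea, with no database involved; the Person and Relationship classes and their fields are hypothetical:

```ruby
# Hypothetical model: each relationship carries its own free-form label.
Relationship = Struct.new(:label, :with, :current)

class Person
  attr_reader :name, :relationships

  def initialize(name)
    @name = name
    @relationships = [] # zero, one, or many
  end

  # A new relationship is added alongside existing ones; nothing is overwritten.
  def add_relationship(label, with: nil, current: true)
    @relationships << Relationship.new(label, with, current)
    self
  end

  # Labels the person currently identifies with, in the order they were added.
  def current_labels
    relationships.select(&:current).map(&:label)
  end
end

person = Person.new("Alex")
person.add_relationship("Widowed").add_relationship("Dating")
puts person.current_labels.inspect
```

Because labels accumulate, the widow from the earlier example never has to pick between "Widowed" and a new relationship; both can be true at once, and a person with no relationships simply has an empty list.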

Choose (one): evasive / inauthentic

But it's not funny from the personal perspective. This schema forces the person to either LOOK EVASIVE or else BE INAUTHENTIC. This is what happens when we try to throw more labels at the problem, instead of re-examining the schema's assumptions.

Facebook relationship status by user age

srsly

When we don't give people means to express what's real and important for them, they hack around us. Are a quarter of Facebook's 13-year-olds really married? Probably not. What they probably have is a BEST FRIEND, and "married" is the nearest equivalent available for them to express, "THIS PERSON is my most important relationship."

The dogged pursuit of nice, clean, CRUNCHABLE data takes us to such wrong places. And makes our users want to work against a system that doesn't acknowledge their lives' reality.

We build or break community with each line of code

As developers, our lines of code build -- or break -- an app's community. So it's worth making conscious choices as we go.

Modeling the real world is complex

[READ] And that's okay.

Assuming we know who users are surrenders opportunity to learn who they are

But [read] Early constraints in schema NET crappy, misleading data. [beat] So keep constraints out of human and social schema, at least at first. Gather enough initial response to do some data mining. Watch that data for a while. Keep an eye out for emergent trends. What DISCOVERIES can you make? How can they shape the app's growth and evolution?
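Watching for emergent trends can start as simply as counting what people type. A rough Ruby sketch, assuming a hypothetical emergent_values helper, an arbitrary min_count threshold, and made-up responses:

```ruby
# Count normalized free-text answers and keep those seen at least min_count times.
def emergent_values(responses, min_count: 2)
  counts = Hash.new(0)
  responses.each do |response|
    key = response.to_s.strip.downcase
    counts[key] += 1 unless key.empty?
  end
  counts.select { |_value, count| count >= min_count }
        .sort_by { |value, count| [-count, value] } # most frequent first
        .to_h
end

# Made-up responses for illustration.
RESPONSES = [
  "Best friends", "best friends ", "Married", "BEST FRIENDS",
  "married", "It's complicated", nil
].freeze

puts emergent_values(RESPONSES).inspect
```

Values that keep showing up are candidates for suggestions later on; values that appear once still stay in the data, untouched.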

Quality

Data quality improves when lies are merely optional, not required.

Specificity

Data becomes like the real world itself: rich and specific. So we can unearth the patterns that are undetectable when data is generic.

Adaptability

We get to make discoveries and re-envision possibilities.

We adapt quickly.

Loyalty

And their response to all this? Engagement. Passion. LOYALTY.

THOSE are foundations for great user experiences. [LOOK UP. FINALITY.]

@cczona
cczona@gmail.com
http://cczona.com

Questions? Conversation?

Oh Hell Yeah

Carina C. Zona

Resources

Data Science & Information Architecture
• Sociological normalization
• Database normalization
• Using Machine Learning On Social Networks To Figure Out What You Should Read On The Web
• NoSQL Data Modeling Techniques
• Data & Reality, by William Kent
• Bad Data Handbook, Q. Ethan McCallum
• Data Science of the Facebook World

Relationships
• Does Facebook Hurt Relationships
• Facebook Adds LGBT-Friendly Relationship Status Options
• Facebook Targeting by Relationship Status & Workplace
• Your Facebook Relationship Status: It's Complicated
• Gay Marriage: The Database Engineering Perspective

Sex & Genders
• "Disalienation: Why Gender is a Text Field on Diaspora"
• "Gender & Drop Down Menus"
• "Sex & Gender"
• "Bucket Gender"
• Recommendations for Inclusive Data Collection of Trans People

Names
• Falsehoods Programmers Believe About Names
• Your Last Name Contains Invalid Characters
• W3C Internationalization: Personal Names Around the World
• Spanish Names
• Chinese Names

More
• Redesigning the Country Selector
• American Religious Identification Survey Summary Report 2009
• Linguistic Potluck: Crowdsourcing Internationalization in Rails

Image Credits
Postsecret, Facebook, OKCupid, Google+, Metafilter, Diaspora, Flickr, FetLife, Kotangle, cutestpaw.com, hdwallpapers.in, xkcd

Many Thanks
Chiu-Ki Chan, Estelle Weyl, Heather Rivers, Heroku, Jeremy Dunck, Josh Susser, Michele Titolo, NIRD LLC, Renée De Voursney, Sarah Mei, San Francisco Sex Information, Yoz Grahame

Get in Touch

Carina C. Zona
@cczona
http://cczona.com
cczona@gmail.com
http://slideshare.net/cczona
http://linkedin.com/in/cczona


What benefit will the user notice?

By asking ourselves, "What benefit will the USER NOTICE?" THIS is not equivalent to "How will the user benefit?" Because THAT is a question which grants us too much latitude to ASSUME that what we want is of course to their benefit, because it's gonna help us deliver a product that's Awesome.

[READ AGAIN] focuses us on EXPERIENCES.

Evaluating from user perspective gives us focus

There isn't ONE WAY. There isn't ONE RIGHT ANSWER. Because the NATURE of our dilemma is that users vary. And so do their social worlds. So we always need to understand varying approaches. How they fit different contexts. What tradeoffs they bring. AND the obvious, boring question: which options best serve THIS app's business requirements?

Checkbox, Radio button, Select menu, Ranges
Coerced, Discretionary, Guided
Corrective, Text Input, Textarea

Checkboxes, radio buttons, select menus, ranges: these each imply that EVERY possible value can be -- and is -- included. If any value isn't included there, it's the real world being rejected because it didn't match up with our mental schema.

Checkbox, Radio, Select
Coerced, Discretionary, Guided
Corrective, Text Input, Textarea

Checking a box is a one-step action. Entering a text string is not. So a freeform solution, such as a textarea or text input, isn't automatically exciting. BUT they can bring visible, meaningful benefit.

11 years

At MetaFilter, gender has been a text field for over a decade.

"...the early crowd at MeFi were often programmers, and they hated the idea of dirty data collection..."

-- Matt Haughey, founder

Initially some SHUDDERED at the thought. Because [READ]

All girl

XY

Fella

Queer

Fembot

Alto

It depends

MYOB

Dangly bits

Chicklet

Innie not outie

Convex

Sideburns

Ambisextrous

Member of the patriarchy

For about 5 seconds. Then they jumped on board. Because they could be creative and silly.

I speak using the male gender when required by language

50% quintessential tomboy, 50% total girly-girl

AND because they could express this THING about themselves fully. With authentic voice.

Transgender
Genderqueer

[PAUSE]

"The freeform gender field is something I cherish about Metafilter."

-- MeFi user

That text field grew into a beloved institution. Whatever you, as user, choose to put into that field says something revealing about who YOU are.

"It was one of the earliest indications I'd landed in the right place."

-- MeFi user

That you're allowed to put in ANYTHING -- or put in NOTHING AT ALL -- says something revealing about how MetaFilter envisions community. That schema's trust in users was the foundation for Metafilter users to ASK to share themselves even more.

In 2010, Diaspora likewise turned gender into a text field. Just like on MetaFilter, users felt set free by it.

There's an important distinction here, though. Metafilter is closed source, it's one developer, and it's entirely in English. Diaspora needed to meet a different set of needs. It's open source, and it's international.

Not amused

So some of its developers are not amused. It's fair to object that this approach wreaks chaos on internationalization of pronouns. [BEAT] And here's what I've got to say to that:

Internationalization is hellish, period.

Internationalization of pronouns is hellish, period. You get no safety net in this. Constraining gender options doesn't solve translation of languages. Don't set yourself up for trying to deal with internationalization this way. It will fail you.

What's your legal gender?
• Indeterminate
• X
• Sex Not Specified

And if you really think that boiling this question down to "What's your legal gender?" is an easy out, let's talk about internationalization of that, too. Because this is the year that "indeterminate" has become a legal gender on German birth certificates, and that "X" is a legal gender on Australian passports. In New South Wales, Australia, "Sex Not Specified" is yet another legally recognized gender. Binary gender is, today, APP FAIL.

"the most complicated thing I've ever spent a lot of time learning about. And I've spent a lot of time learning about quantum mechanics."

-- Randall Munroe, xkcd

Randall Munroe, best known for xkcd, has examined the issue of pronouns many times, in depth, for English-language projects. He says it's [READ] So here's what he ultimately concluded:

Ask

Ask.

Which pronouns do you prefer?
• he / him / himself / his / his
• she / her / herself / hers / her
• they / them / themself / theirs / their
• it / it / itself / its / its
• by name

Asking straight up, "Which pronouns do you prefer?" is truly the best he could come up with.

Which pronouns do you prefer?
• he / him / himself / his / his
• she / her / herself / hers / her
• they / them / themself / theirs / their
• personal name
• other ____________

Or maybe slightly refining, like these.
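A form like that maps naturally onto a small data structure. A minimal Ruby sketch: the PronounSet struct, the PRESETS table, and the pronouns_for helper are all hypothetical, and how "personal name" should fill each slot is a guess made here for illustration.

```ruby
# Field order follows the slide rows:
# subject, object, reflexive, possessive pronoun, possessive determiner.
PronounSet = Struct.new(:subject, :object, :reflexive, :possessive, :determiner)

PRESETS = {
  "he"   => PronounSet.new("he", "him", "himself", "his", "his"),
  "she"  => PronounSet.new("she", "her", "herself", "hers", "her"),
  "they" => PronounSet.new("they", "them", "themself", "theirs", "their")
}.freeze

# choice: a preset key, "name" (use the personal name throughout),
# or "other" with a user-supplied five-part set in `custom`.
def pronouns_for(choice, name:, custom: nil)
  case choice
  when "name"  then PronounSet.new(name, name, name, "#{name}'s", "#{name}'s")
  when "other" then PronounSet.new(*custom)
  else PRESETS.fetch(choice) # raises KeyError for an unknown choice
  end
end

puts pronouns_for("they", name: "Sam").determiner
puts pronouns_for("other", name: "Sam", custom: %w[ze hir hirself hirs hir]).object
```

The "other" branch stores whatever five forms the user supplies, so the preset list only has to cover the common cases rather than every possible one.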

And I know you may be looking at that 3rd row and protesting:

Which pronouns do you prefer?
• he / him / himself / his / his
• she / her / herself / hers / her
• they / them / themself / theirs / their
• personal name
• other ____________

Our english teachers told us they and their arent singular right Nope Theyre plural AND SINGULAR and have been for 400 years Take the word of Jane Austen Shakespeare Chaucer Lewis Carroll Modern english authorities agree on this one too Its excellent English -- and excellent SOCIAL -- to use it when people ask us to

cczona

Even Facebook has taken note that its okay Theyve been using this construction for at least 5 years and are doing fine -)

cczona

Home is the place where when you have to go there they have to take you in

mdashRobert Frost

We aspire to make our social apps feel like home We can do that by making development choices that say Hey user we GET you Be as individual and messy as you want to We can handle _you_ being _you_

Social Research

When collecting data on people were in a different realm Social sciences If you want to run useful analytics about personal attributes and behavior then data collection needs to meet at least the two minimum criteria for social science research

ExhaustiveEvery possible option

A fields values must include every possible option

Mutually ExclusiveNo overlap exists between them

And the fields options must be mutually exclusive

How many social apps have both of those bare minimum criteria covered

cczona

Weve seen this one enough At least that set of values seems covered fully

cczona

`

But its not

cczona

Still lacking

cczona

Trying hard But still inadequate to express the real worlds complexity

We do realize that the real world has great variety Of course we do But its hard to know how to handle it

cczona

What user experience does this schema drive us toward

Its easy to figure Eh we can refactor later But initial choices set differing user experiences in motion Thinking at the outset about the real worlds variety and complexity starts us asking early questions that set foundation for social app-building Ask What user experience does this schema drive us toward

cczona

Data doesnrsquot have to be for analysis

Its easy to get into the habit of structuring data for easy analysis But we can choose to look at human data from a different perspective Step back Wallow in the users perspective

cczona

Data can be sheerexpressiveness

Data can be sheer expressiveness Data that has character individualism distinctiveness

cczona

What we want What we get

bull Structured

bull Predictablebull Validations exceptions

bull Conditionals partials

bull Relational

bull Indexed bull Premature optimization

bull Exhaustivebull Cultural variability

bull Individual POV

bull Moving target

bull Easy analytics

bull Data-driven decisionsbull Decisions based on false

premises

As developers we have a vision of what a good codebase should be and not be

[READ]

In the most sensible of ways we often arrive at solutions that are factually TRUTHY while far removed from real life utility

cczona

What is your religion if any

ARIS is the largest ongoing survey of Americans religious identification It asks this simple OPEN-ENDED question [slide]

Which nets over 100 unique answers Which if youre making a form based on that is tricky Theres no form element that makes it easy for our users to pick themselves out of a list of so many possibilities

cczona

ARIS found though that these could be compressed into 13 major categories More manageable list right We could use that for a form But eh a lot of those are edge cases Wed rather want to focus on genuinely major groupings

cczona

What is your religion if any

Christian 76

Other 4

None 15

Dont Know or Refused 5

Which brings us down to this At least a quarter of Americans are Christian Done Every other religion would just be clutter Edge cases

And then theres this sort of crummy data with it Wed probably assign nil value for more than one of these categories right

cczona

What is your religion

Christian 76

Other 4

na 20

Fixed Which focuses attention on a problem here 1 in 5 are not useful answers from an advertisers perspective or for out own analytics The NILS -- gotta go

cczona

What is your religion

Christian 76

Other 24

So what were left with is a good clear list It covers ALL the big stuff When you get reductive enough for Americans religion it a binary Which from a storage standpoint is great Booleans Score

cczona

Religion

But we dont do THIS It covers the biggest categories But oh how it leaves people out

cczona

People arent edge cases

People arent edge cases

And theyre pushing back on apps that treat them that way

[LONG BEAT LONG] [BREATH] [CHANGE GEARS]

So Reductive has big problems So does scaling upward

cczona

Balancing between approaches

As engineers its instinctually uncomfortable to DELIBERATELY NOT STRUCTURE DATA for easy analysis I feel ya I really do This freaks me too But again the foundational question is What benefit will the user notice What identities are okay with us

cczona

And if necessary we CAN strike a middleground This is where guided response comes in Autosuggest

cczona

CheckboxRadioSelect

TextareaText

Required OptionalAutosuggest

Minimal Suggest

When theres a subset of values that youre most interested in do minimal suggest instead Autosuggest using just the handful of values you care about

Structured data from those who want to give it Free form to incite expressiveness in those who want THAT

Unguided TextOf those who use MetaFilters gender field

40 of responses are f m female male

[read] So structured data IS there This can be a balanced solution in many cases where youre willing to tolerate some ambiguity Of course there are tradeoffsData quantity is lower Freed to opt out of proving personal info many doOn the other hand data quality should improve

Optional Select60 of Facebook users select a relationship status

Its fine to mix and match here Find the right approach for your users and your apps business objectives Facebook for instance makes relationship status completely optional but coercive for those who do opt-in to setting a value Most users do opt-in 60 of them select a relationship status

cczona

We want everyone to feel excited what weve

built

The bottom line is that [read]

cczona

We want users to feel passionate about their

involvement

[read]

cczona

Analytics investments amp monetization are

based on a premise that data is accurate

[read]

BUT the data has been collected by coercive approaches the RISK is that--

cczona

this premise is garbage Some people are lying because lying has been made requirement for getting past the barriers So conclusions drawn from that bad data can misdirect decision-making about the next stage of development

cczona

CheckboxRadioSelect

TextareaText

Required

Corrective

DiscretionaryGuided

The restrictive options that stuff at bottom left those dont actually have to be marked required

cczona

ADD column gender NOT NULL VARCHAR(6) DEFAULT female

But the way we setup schema often embeds assumptions that we should and we will So we do An attribute that is ambiguously named is destined to become a form field whose data wanders away from its original intent and code that misunderstands how to use it A field thats not allowed to be null is destined to be mandatory A field that is assigned a maximum length is an ASSERTION that all possible values are knowable and fit within it

cczona

ADD column gender_identity NOT NULL VARCHAR

BOOM This is foundation for a whole different user experience And the cool ninja move was that we decided to DO LESS Make this stuff flexible upfront Optimize storage later Decide whats valid later

cczona

CheckboxRadioSelect

TextareaText

Required Optional

Alteration

Guided

So there you have it a discretionary field Whether to respond is left up to the user

cczona

tstring relationship_status null =gt true

As developers we may upon THIS expression as redundant completely unnecessary Duh null is true by default But making that explicit is a communication to the team and to your future self Its a statement of intent Its documenting a product decision

cczona

FacebookSingle

In a relationship

Engaged

Married

Its complicated

Open relationship

What would a canonical set of relationship statuses look like Three years ago Facebook figured this list was pretty good Arguably pretty progressive too right Users disagreed Strongly

cczona

FacebookSingle

In a relationship

Engaged

Married

Its complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

Under pressure Facebook nearly doubled the options in just two years time

cczona

Google+Single

In a relationship

Engaged

Married

Its complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I dont want to say

While Google+ has largely adopted that list it has not included Separated or Divorced Notice that they also added something choice Opt out of labeling

cczona

Allowing users to identify their relationships with labels of greater personal significanceThats being driven by people rejecting a user experience that isnt working for them

cczona

Its a scope

How did some status seem universal while others werent Naming a thing creates scope The assumed validity of a fields values get constrained as soon as the field is named

cczona

marital_statusSingle

In a relationship

Engaged

Married

Its complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I dont want to say

Marital status for instance might lead to a list similar to this one A person is assumed to be either unmarried preparing to be married currently married or formerly married

cczona

legal_marital_statusSingle

In a relationship

Engaged

Married

Its complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I dont want to say

Marital status for instance might lead to a list similar to this one A person is assumed to be either unmarried preparing to be married currently married or formerly married

cczona

relationship_statusSingle

In a relationship

Engaged

Married

Its complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I dont want to say

Whereas relationship status might lead to a list more like this one In which one either has a current relationship -- or is defined by the absence of any

cczona

singleness_status

As soon as you name a field you define its paradigm and possibilities There are important difference in what each of these envision and are able to measure Naming fields -- with great specificity upfront -- Makes analyses more powerful later

cczona

singleness_rating

No chance whatsoever Suuuuuper duper single

1000

If you change the name you shift the paradigm and possibilities Theres important difference in these what these collect set out to measure Naming fields -- With great specificity upfront --Makes analyses more powerful later

cczona

Its ComplicatedIn a Relationship

Married Divorced

WidowedSingle

We go through life experiencing many relationships They dont all have pre-determined labels And new relationship identities dont necessarily leave old ones behind

cczona

I like to be truthful and Its Complicated is really deceiving It is not complicated I am separated from my husband who I am still legally married to

mdashFacebook user

[READ] Why should anyone have to feel like theyre required to disavowing a relationship thats meaningful for them

It cuts deeply sometimes Is a widow only Widowed until she resumes dating Why does she have to be confronted with a decision like that just to use someones app

cczona

Okay this one Spot the fatal flaw [WAIT]

cczona

buffer overflow

An open relationship is definitionally a 1Many join mdash WITH the usual engineering understanding that is that actual number of relationships may be 0 1 OR many

cczona

Choose (one) evasive inauthentic

But its not funny from the personal perspective This schema forces the person to either LOOK EVASIVE or else BE INAUTHENTIC This is what happens when we try to throw more labels at the problem instead of re-examining the schemas assumptions

cczona

Facebook relationship status by user age

srslyWhen we dont give people means to express whats real and important for them they hack around us Are a quarter of Facebooks 13 year olds really married Probably not What they probably have is a BEST FRIEND and married is nearest equivalent available for them to express THIS PERSON is my most important relationship

The dogged pursuit of nice clean CRUNCHABLE data takes us to such wrong places And makes our users want to work against a system that doesnt acknowledge their lives reality

cczona

We build or break community with each

line of code

As developers our lines of code build -- or break -- an apps community So its worth making conscious choices as we go

cczona

Modeling the real world is complex

[READ] And thats okay

cczona

Assuming we know who users are surrenders

opportunity to learn who they are

But [read] Early constraints in schema NET crappy misleading data[beat]So keep constraints out of human and social schema at least at first Gather enough initial response to do some data mining Watch that data for a while Keep an eye out for emergent trends What DISCOVERIES can you make How can they shape the apps growth and evolution

Quality

Data quality improves when lies are merely optional, not required.

Specificity

Data becomes like the real world itself: rich and specific. So we can unearth the patterns that are undetectable when data is generic.

Adaptability

We get to make discoveries and re-envision possibilities.

We adapt quickly.

Loyalty

And their response to all this? Engagement. Passion. LOYALTY.

THOSE are foundations for great user experiences. [LOOK UP. FINALITY]

@cczona
cczona@gmail.com
http://cczona.com

Questions? Conversation?

Oh Hell Yeah

Carina C. Zona

Resources

Data Science & Information Architecture
• Sociological normalization
• Database normalization
• Using Machine Learning On Social Networks To Figure Out What You Should Read On The Web
• NoSQL Data Modeling Techniques
• Data & Reality, by William Kent
• Bad Data Handbook, Q. Ethan McCallum
• Data Science of the Facebook World

Relationships
• Does Facebook Hurt Relationships?
• Facebook Adds LGBT-Friendly Relationship Status Options
• Facebook Targeting by Relationship Status & Workplace
• Your Facebook Relationship Status: It's Complicated
• Gay Marriage: The Database Engineering Perspective

Sex & Genders
• "Disalienation: Why Gender is a Text Field on Diaspora"
• "Gender & Drop Down Menus"
• "Sex & Gender"
• "Bucket Gender"
• Recommendations for Inclusive Data Collection of Trans People

Names
• Falsehoods Programmers Believe About Names
• Your Last Name Contains Invalid Characters
• W3C Internationalization: Personal Names Around the World
• Spanish Names
• Chinese Names

More
• Redesigning the Country Selector
• American Religious Identification Survey Summary Report 2009
• Linguistic Potluck: Crowdsourcing Internationalization in Rails

Image Credits
Postsecret
Facebook
OKCupid
Google+
Metafilter
Diaspora
Flickr
FetLife
Kotangle
cutestpaw.com
hdwallpapers.in
xkcd

Many Thanks
Chiu-Ki Chan
Estelle Weyl
Heather Rivers
Heroku
Jeremy Dunck
Josh Susser
Michele Titolo
NIRD LLC
Renée De Voursney
Sarah Mei
San Francisco Sex Information
Yoz Grahame

Get in Touch
Carina C. Zona
@cczona
http://cczona.com
cczona@gmail.com
http://slideshare.net/cczona
http://linkedin.com/in/cczona

Evaluating from user perspective gives us focus

There isn't ONE WAY. There isn't ONE RIGHT ANSWER. Because the NATURE of our dilemma is that users vary. And so do their social worlds. So we always need to understand varying approaches. How they fit different contexts. What tradeoffs they bring. AND the obvious, boring question: which options best serve THIS app's business requirements?

Checkbox / Radio button / Select menu / Ranges

Coerced / Discretionary / Guided

Corrective / Text Input / Textarea

Checkboxes, radio buttons, select menus, ranges: these each imply that EVERY possible value can be -- and is -- included. If any value isn't included there, it's the real world being rejected because it didn't match up with our mental schema.

Checkbox / Radio / Select

Coerced / Discretionary / Guided

Corrective / Text Input / Textarea

Checking a box is a one-step action. Entering a text string is not. So a freeform solution such as a textarea or text input isn't automatically exciting. BUT they can bring visible, meaningful benefit.

11 years

At MetaFilter, gender has been a text field for over a decade.

"…the early crowd at MeFi were often programmers and they hated the idea of dirty data collection…"

-- Matt Haughey, founder

Initially some SHUDDERED at the thought. Because [READ]

All girl

XY

Fella

Queer

Fembot

Alto

It depends

MYOB

Dangly bits

Chicklet

Innie not outie

Convex

Sideburns

Ambisextrous

Member of the patriarchy

For about 5 seconds. Then they jumped on board. Because they could be creative and silly.

"I speak using the male gender when required by language"

"50% quintessential tomboy, 50% total girly-girl"

AND because they could express this THING about themselves fully. With authentic voice.

Transgender

Genderqueer

[PAUSE]

"The freeform gender field is something I cherish about Metafilter."

-- MeFi user

That text field grew into a beloved institution. Whatever you, as user, choose to put into that field says something revealing about who YOU are.

"It was one of the earliest indications I'd landed in the right place."

-- MeFi user

That you're allowed to put in ANYTHING -- or put in NOTHING AT ALL -- says something revealing about how MetaFilter envisions community. That schema's trust in users was the foundation for Metafilter users to ASK to share themselves even more.

In 2010, Diaspora likewise turned gender into a text field. Just like on MetaFilter, users felt set free by it.

There's an important distinction here, though. Metafilter is closed source, it's one developer, and it's entirely in English. Diaspora needed to meet a different set of needs. It's open source and it's international.

Not amused

So some of its developers are not amused. It's fair to object that this approach wreaks chaos on internationalization of pronouns. [BEAT] And here's what I've got to say to that:

Internationalization is hellish, period.

Internationalization of pronouns is hellish, period. You get no safety net in this. Constraining gender options doesn't solve translation of languages. Don't set yourself up for trying to deal with internationalization this way. It will fail you.

What's your legal gender?
• Indeterminate
• X
• Sex Not Specified

And if you really think that boiling this question down to "What's your legal gender?" is an easy out, let's talk about internationalization of that too. Because this is the year that "indeterminate" has become a legal gender on German birth certificates, and that "X" is a legal gender on Australian passports. In New South Wales, Australia, "Sex Not Specified" is yet another legally-recognized gender. Binary gender is, today, APP FAIL.

"the most complicated thing I've ever spent a lot of time learning about. And I've spent a lot of time learning about quantum mechanics."

-- Randall Munroe, xkcd

Randall Munroe, best known as xkcd, has examined the issue of pronouns many times, in depth, for English language projects. He says it's [READ]. So here's what he ultimately concluded:

Ask

Ask.

Which pronouns do you prefer?
he/him/himself/his/his
she/her/herself/hers/her
they/them/themself/theirs/their
it/it/itself/its/its
by name

Asking straight up, "Which pronouns do you prefer?", is truly the best he could come up with.

Which pronouns do you prefer?
he/him/himself/his/his
she/her/herself/hers/her
they/them/themself/theirs/their
personal name
other ____________

Or maybe slightly refining, like these.

And I know you may be looking at that 3rd row and protesting.

Which pronouns do you prefer?
he/him/himself/his/his
she/her/herself/hers/her
they/them/themself/theirs/their
personal name
other ____________

Our English teachers told us "they" and "their" aren't singular, right? Nope. They're plural AND SINGULAR, and have been for 400 years. Take the word of Jane Austen, Shakespeare, Chaucer, Lewis Carroll. Modern English authorities agree on this one too. It's excellent English -- and excellent SOCIAL -- to use it when people ask us to.
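Here is a rough Ruby sketch of how an app might store the answer to that question and use it in generated text. The field order copies the slide (subject, object, reflexive, possessive pronoun, possessive determiner). Everything else is an assumption for illustration: the `PronounSet` struct, the `"name"` key, how a personal name is inflected, and the sample custom set.

```ruby
# Field order follows the slide: subject, object, reflexive,
# possessive pronoun, possessive determiner.
PronounSet = Struct.new(:subject, :object, :reflexive, :possessive, :determiner)

PRONOUN_SETS = {
  "he"   => PronounSet.new("he", "him", "himself", "his", "his"),
  "she"  => PronounSet.new("she", "her", "herself", "hers", "her"),
  "they" => PronounSet.new("they", "them", "themself", "theirs", "their")
}.freeze

# choice is one of the keys above, "name" for people who prefer their
# personal name, or a custom PronounSet (the "other ____" row).
def pronouns_for(choice, name)
  return choice if choice.is_a?(PronounSet)
  # Assumption: a personal name stands in for every form.
  return PronounSet.new(name, name, name, "#{name}'s", "#{name}'s") if choice == "name"

  PRONOUN_SETS.fetch(choice)
end

# Example of generated text that depends on the stored preference.
def profile_update_notice(name, choice)
  "#{name} updated #{pronouns_for(choice, name).determiner} profile."
end

puts profile_update_notice("Kim", "they")
custom = PronounSet.new("ze", "hir", "hirself", "hirs", "hir")
puts profile_update_notice("Kim", custom)
```

Note that the "other" row stays open-ended: a custom set is accepted as-is rather than checked against a list.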

Even Facebook has taken note that it's okay. They've been using this construction for at least 5 years and are doing fine :-)

"Home is the place where, when you have to go there, they have to take you in."

-- Robert Frost

We aspire to make our social apps feel like home. We can do that by making development choices that say: "Hey, user, we GET you. Be as individual and messy as you want to. We can handle _you_ being _you_."

Social Research

When collecting data on people, we're in a different realm: social sciences. If you want to run useful analytics about personal attributes and behavior, then data collection needs to meet at least the two minimum criteria for social science research.

Exhaustive: Every possible option

A field's values must include every possible option.

Mutually Exclusive: No overlap exists between them

And the field's options must be mutually exclusive.

How many social apps have both of those bare minimum criteria covered?
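The two criteria can be checked mechanically against real answers. The Ruby sketch below is one hedged way to do that: each respondent is represented by the list of labels they would apply to themselves, and the form's options are compared against it. The `check_options` helper and the sample data are hypothetical.

```ruby
# options:     the labels a form offers
# respondents: for each person, every label they would apply to themselves
def check_options(options, respondents)
  # Nobody's label is on the form: the list is not exhaustive.
  uncovered = respondents.select { |labels| (labels & options).empty? }
  # Several form options apply at once: not mutually exclusive.
  overlapping = respondents.select { |labels| (labels & options).size > 1 }

  {
    exhaustive: uncovered.empty?,
    mutually_exclusive: overlapping.empty?,
    uncovered: uncovered,
    overlapping: overlapping
  }
end

form_options = %w[single married separated]
people = [
  ["single"],
  ["married", "separated"], # separated, yet still legally married
  ["best friend"]           # no option on the form fits
]

report = check_options(form_options, people)
puts "exhaustive? #{report[:exhaustive]}"
puts "mutually exclusive? #{report[:mutually_exclusive]}"
```

Running a check like this over a sample of free-text answers is a cheap way to find out whether a proposed option list would fail either criterion before it ships.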

We've seen this one enough. At least that set of values seems covered fully.

But it's not.

Still lacking.

Trying hard. But still inadequate to express the real world's complexity.

We do realize that the real world has great variety. Of course we do. But it's hard to know how to handle it.

What user experience does this schema drive us toward?

It's easy to figure, "Eh, we can refactor later." But initial choices set differing user experiences in motion. Thinking at the outset about the real world's variety and complexity starts us asking early questions that set the foundation for social app-building. Ask: what user experience does this schema drive us toward?

Data doesn't have to be for analysis

It's easy to get into the habit of structuring data for easy analysis. But we can choose to look at human data from a different perspective. Step back. Wallow in the user's perspective.

Data can be sheer expressiveness

Data can be sheer expressiveness. Data that has character, individualism, distinctiveness.

What we want
• Structured
• Predictable
• Relational
• Indexed
• Exhaustive
• Easy analytics
• Data-driven decisions

What we get
• Validations, exceptions
• Conditionals, partials
• Premature optimization
• Cultural variability
• Individual POV
• Moving target
• Decisions based on false premises

As developers, we have a vision of what a good codebase should be and not be.

[READ]

In the most sensible of ways, we often arrive at solutions that are factually TRUTHY while far removed from real life utility.

What is your religion, if any?

ARIS is the largest ongoing survey of Americans' religious identification. It asks this simple, OPEN-ENDED question. [slide]

Which nets over 100 unique answers. Which, if you're making a form based on that, is tricky. There's no form element that makes it easy for our users to pick themselves out of a list of so many possibilities.

ARIS found, though, that these could be compressed into 13 major categories. More manageable list, right? We could use that for a form. But eh, a lot of those are edge cases. We'd rather focus on genuinely major groupings.

What is your religion, if any?
Christian 76%
Other 4%
None 15%
Don't Know or Refused 5%

Which brings us down to this. At least three quarters of Americans are Christian. Done. Every other religion would just be clutter. Edge cases.

And then there's this sort of crummy data with it. We'd probably assign nil value for more than one of these categories, right?

What is your religion?
Christian 76%
Other 4%
n/a 20%

Fixed. Which focuses attention on a problem here: 1 in 5 are not useful answers from an advertiser's perspective, or for our own analytics. The NILS -- gotta go.

What is your religion?
Christian 76%
Other 24%

So what we're left with is a good, clear list. It covers ALL the big stuff. When you get reductive enough for Americans' religion, it's a binary. Which, from a storage standpoint, is great. Booleans. Score.
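The arithmetic of that reduction is easy to replay. This Ruby sketch starts from the four percentages on the slide and merges categories the way the notes describe. The `collapse` helper is a hypothetical name; only the numbers come from the slides.

```ruby
# Percentages as shown on the slide.
ARIS_SUMMARY = {
  "Christian" => 76,
  "Other" => 4,
  "None" => 15,
  "Don't Know or Refused" => 5
}.freeze

# Merge the named categories into one bucket, adding their shares to
# whatever that bucket already holds. Returns a new hash.
def collapse(shares, into, *labels)
  merged = shares.reject { |label, _share| labels.include?(label) }
  merged[into] = labels.sum { |label| shares.fetch(label) } + merged.fetch(into, 0)
  merged
end

# Step 1: treat both non-answers as nil. Step 2: fold nil into "Other".
with_nils = collapse(ARIS_SUMMARY, "n/a", "None", "Don't Know or Refused")
binary    = collapse(with_nils, "Other", "n/a")

puts with_nils.inspect
puts binary.inspect
```

Each step still sums to 100, which is exactly why it feels tidy. What disappears is the distinction between "none", "won't say", and every non-Christian religion.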

Religion

But we don't do THIS. It covers the biggest categories. But oh, how it leaves people out.

People aren't edge cases

People aren't edge cases.

And they're pushing back on apps that treat them that way.

[LONG BEAT. LONG] [BREATH] [CHANGE GEARS]

So: reductive has big problems. So does scaling upward.

Balancing between approaches

As engineers, it's instinctually uncomfortable to DELIBERATELY NOT STRUCTURE DATA for easy analysis. I feel ya. I really do. This freaks me out too. But again, the foundational question is: what benefit will the user notice? What identities are okay with us?

And if necessary, we CAN strike a middle ground. This is where guided response comes in. Autosuggest.

Checkbox / Radio / Select

Textarea / Text

Required / Optional / Autosuggest

Minimal Suggest

When there's a subset of values that you're most interested in, do minimal suggest instead. Autosuggest using just the handful of values you care about.

Structured data from those who want to give it. Free form to incite expressiveness in those who want THAT.
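As a sketch of minimal suggest, the Ruby below offers completions from a short list but stores whatever the person actually typed. The `SUGGESTED` list and both helper names are hypothetical; a real app would pick its own handful of values and wire this to its own form.

```ruby
# The handful of values this hypothetical app cares most about.
SUGGESTED = %w[female male genderqueer transgender].freeze

# Suggestions that start with what has been typed so far.
def suggestions_for(typed, suggested = SUGGESTED)
  prefix = typed.to_s.strip.downcase
  return [] if prefix.empty?

  suggested.select { |value| value.start_with?(prefix) }
end

# Suggestions never restrict the answer: any text is kept as typed,
# and a blank answer is stored as nil because the field is optional.
def accept_answer(typed)
  answer = typed.to_s.strip
  answer.empty? ? nil : answer
end

puts suggestions_for("Fe").inspect
puts accept_answer("Alto").inspect
```

People who want the common values get them in a keystroke or two, and everyone else keeps a free-form field.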

Unguided Text

Of those who use MetaFilter's gender field, 40% of responses are "f", "m", "female", "male"

[read] So structured data IS there. This can be a balanced solution in many cases where you're willing to tolerate some ambiguity. Of course there are tradeoffs. Data quantity is lower. Freed to opt out of providing personal info, many do. On the other hand, data quality should improve.
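A figure like that 40% can be computed after the fact, without constraining the field. The Ruby sketch below measures what share of answered responses map onto conventional values. The `CONVENTIONAL` mapping and the sample answers are assumptions for illustration; they are not MetaFilter's data or method.

```ruby
# Free-text spellings that map onto a conventional value (assumed list).
CONVENTIONAL = {
  "f" => "female", "female" => "female",
  "m" => "male",   "male" => "male"
}.freeze

# Percentage of answered (non-blank) responses that are conventional,
# rounded to one decimal place.
def structured_share(responses)
  answered = responses.compact.map { |r| r.strip.downcase }.reject(&:empty?)
  return 0.0 if answered.empty?

  structured = answered.count { |r| CONVENTIONAL.key?(r) }
  (100.0 * structured / answered.size).round(1)
end

sample = ["F", "male", "Alto", "MYOB", "It depends", nil, ""]
puts structured_share(sample)
```

The structure is recovered at analysis time, so the people who typed something else never had to pick a box that didn't fit.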

Optional Select

60% of Facebook users select a relationship status

It's fine to mix and match here. Find the right approach for your users and your app's business objectives. Facebook, for instance, makes relationship status completely optional, but coercive for those who do opt in to setting a value. Most users do opt in: 60% of them select a relationship status.

We want everyone to feel excited about what we've built

The bottom line is that [read]

We want users to feel passionate about their involvement

[read]

Analytics, investments & monetization are based on a premise that data is accurate

[read]

BUT when the data has been collected by coercive approaches, the RISK is that--

this premise is garbage. Some people are lying because lying has been made a requirement for getting past the barriers. So conclusions drawn from that bad data can misdirect decision-making about the next stage of development.

Checkbox / Radio / Select

Textarea / Text

Required

Corrective

Discretionary / Guided

The restrictive options, that stuff at bottom left: those don't actually have to be marked required.

ADD COLUMN gender VARCHAR(6) NOT NULL DEFAULT 'female'

But the way we set up schema often embeds assumptions that we should, and we will. So we do. An attribute that is ambiguously named is destined to become a form field whose data wanders away from its original intent, and code that misunderstands how to use it. A field that's not allowed to be null is destined to be mandatory. A field that is assigned a maximum length is an ASSERTION that all possible values are knowable and fit within it.

ADD COLUMN gender_identity VARCHAR NOT NULL

BOOM. This is the foundation for a whole different user experience. And the cool ninja move was that we decided to DO LESS. Make this stuff flexible upfront. Optimize storage later. Decide what's valid later.

Checkbox / Radio / Select

Textarea / Text

Required / Optional

Alteration

Guided

So there you have it: a discretionary field. Whether to respond is left up to the user.

t.string :relationship_status, :null => true

As developers, we may look upon THIS expression as redundant, completely unnecessary. Duh, null is true by default. But making that explicit is a communication to the team, and to your future self. It's a statement of intent. It's documenting a product decision.

Facebook
Single
In a relationship
Engaged
Married
It's complicated
Open relationship

What would a canonical set of relationship statuses look like? Three years ago, Facebook figured this list was pretty good. Arguably pretty progressive too, right? Users disagreed. Strongly.

Facebook
Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Separated
Divorced
Civil union
Domestic partnership

Under pressure, Facebook nearly doubled the options in just two years' time.

Google+
Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Separated
Divorced
Civil union
Domestic partnership
I don't want to say

While Google+ has largely adopted that list, it has not included Separated or Divorced. Notice that they also added something choice: opt out of labeling.

Allowing users to identify their relationships with labels of greater personal significance

That's being driven by people rejecting a user experience that isn't working for them.

It's a scope

How did some statuses seem universal while others didn't? Naming a thing creates scope. The assumed validity of a field's values gets constrained as soon as the field is named.

marital_status
Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Separated
Divorced
Civil union
Domestic partnership
I don't want to say

Marital status, for instance, might lead to a list similar to this one. A person is assumed to be either unmarried, preparing to be married, currently married, or formerly married.

legal_marital_status
Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Separated
Divorced
Civil union
Domestic partnership
I don't want to say

Marital status, for instance, might lead to a list similar to this one. A person is assumed to be either unmarried, preparing to be married, currently married, or formerly married.

relationship_status
Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Separated
Divorced
Civil union
Domestic partnership
I don't want to say

Whereas relationship status might lead to a list more like this one. In which one either has a current relationship -- or is defined by the absence of any.

singleness_status

As soon as you name a field, you define its paradigm and possibilities. There are important differences in what each of these envisions and is able to measure. Naming fields -- with great specificity, upfront -- makes analyses more powerful later.

singleness_rating

No chance whatsoever / Suuuuuper duper single

1000

If you change the name, you shift the paradigm and possibilities. There's an important difference in what these set out to collect and measure. Naming fields -- with great specificity, upfront -- makes analyses more powerful later.

It's Complicated / In a Relationship

Married / Divorced

Widowed / Single

We go through life experiencing many relationships. They don't all have pre-determined labels. And new relationship identities don't necessarily leave old ones behind.

"I like to be truthful and 'It's Complicated' is really deceiving. It is not complicated. I am separated from my husband, who I am still legally married to."

-- Facebook user

Page 33: Schemas for the Real World [Madison RubyConf 2013]

cczona

CheckboxRadio buttonSelect menuRanges

Coerced DiscretionaryGuided

Corrective Text InputTextarea

Checkboxes radio buttons select menus ranges these each imply that EVERY possible value can -- and is -- included If any value isnt included there its real world being rejected because it didnt match up with our mental schema

cczona

CheckboxRadioSelect

Coerced DiscretionaryGuided

Corrective Text InputTextarea

Checking a box is a one step action Entering a text string is not So a freeform solution such as a textarea or text input isnt automatically exciting BUT they can bring visible meaningful benefit

cczona

11 years

At MetaFilter gender has been a text field for over a decade

cczona

hellipthe early crowd at MeFi were often programmers and they hated the idea of dirty data collectionhellip

mdashMatt Haughey founder

Initially some SHUDDERED at the thought Because [READ]

cczona

All girl

XY

Fella

Queer

Fembot

Alto

It depends

MYOB

Dangly bits

Chicklet

Innie not outie

Convex

Sideburns

Ambisextrous

Member of the patriarchy

For about 5 seconds Then they jumped on board Because they could be creative and silly

cczona

I speak using the male gender when required by language 50 quintessential

tomboy 50 total girly-girl

AND because they could express this THING about themselves fully With authentic voice

cczona

TransgenderGenderqueer

[PAUSE]

cczona

The freeform gender field is something I cherish about Metafilter

mdashMeFi user

That text field grew into a beloved institution Whatever you as user choose to put into that field says something revealing about who YOU are

cczona

It was one of the earliest indications Id landed in the right place

mdashMeFi user

That youre allowed to put in ANYTHING -- or put in NOTHING AT ALL -- says something revealing about what how MetaFilter envisions community That schemas trust in users was the foundation for Metafilter users to ASK to share themselves even more

cczona

In 2010 Diaspora likewise turned gender into a text field Just like on MetaFilter users felt set free by it

Theres an important distinction here though Metafilter is closed source its one developer and its entirely in English Diaspora needed to meet a different set of needs Its open source and its international

cczona

Not amused

So some of its developers are not amused Its fair to object that this approach wreaks chaos on internationalization of pronouns [BEAT] And heres what Ive got to say to that

cczona

Internationalization is hellish period

Internationalization of pronouns is hellish period You get no safety net in this Constraining gender options doesnt solve translation of languages Dont set yourself up for trying to deal with internationalization this way It will fail you

cczona

Whats your legal gender

bull Indeterminate

bull X

bull Sex Not Specified

And if you really thing that boiling this question down to Whats your legal gender is an easy out lets talk about internationaliztion of that too Because this is the year that indeterminate has become a legal gender on German birth certificates and that X is a legal gender on Australian passports In New South Wales Australia Sex Not Specified is yet another legally-recognized gender Binary gender is today APP FAIL

cczona

the most complicated thing Irsquove ever spent a lot of time learning about

And Irsquove spent a lot of time learning about quantum mechanics

mdashRandall Munroe xkcd

Randall Munroe best known as xkcd has examined the issue of pronouns many times in depth for English language projects He says its [READ] So hes what he ultimately concluded

cczona

Ask

Ask

cczona

Which pronouns do you prefer

hehimhimselfhishis

sheherherselfhersher

theythemthemselftheirstheir

itititselfitsits

by name

Asking straight up Which Pronouns Do You Prefer is truly the best he could come up with

cczona

Which pronouns do you prefer

hehimhimselfhishis

sheherherselfhersher

theythemthemselftheirstheir

personal name

other ____________

Or maybe slightly refining like these

And I know you may be looking at that 3rd row and protesting

cczona

Which pronouns do you prefer

hehimhimselfhishis

sheherherselfhersher

theythemthemselftheirstheir

personal name

other ____________

Our english teachers told us they and their arent singular right Nope Theyre plural AND SINGULAR and have been for 400 years Take the word of Jane Austen Shakespeare Chaucer Lewis Carroll Modern english authorities agree on this one too Its excellent English -- and excellent SOCIAL -- to use it when people ask us to

cczona

Even Facebook has taken note that its okay Theyve been using this construction for at least 5 years and are doing fine -)

cczona

Home is the place where when you have to go there they have to take you in

mdashRobert Frost

We aspire to make our social apps feel like home We can do that by making development choices that say Hey user we GET you Be as individual and messy as you want to We can handle _you_ being _you_

Social Research

When collecting data on people were in a different realm Social sciences If you want to run useful analytics about personal attributes and behavior then data collection needs to meet at least the two minimum criteria for social science research

ExhaustiveEvery possible option

A fields values must include every possible option

Mutually ExclusiveNo overlap exists between them

And the fields options must be mutually exclusive

How many social apps have both of those bare minimum criteria covered

cczona

Weve seen this one enough At least that set of values seems covered fully

cczona

`

But its not

cczona

Still lacking

cczona

Trying hard But still inadequate to express the real worlds complexity

We do realize that the real world has great variety Of course we do But its hard to know how to handle it

cczona

What user experience does this schema drive us toward

Its easy to figure Eh we can refactor later But initial choices set differing user experiences in motion Thinking at the outset about the real worlds variety and complexity starts us asking early questions that set foundation for social app-building Ask What user experience does this schema drive us toward

cczona

Data doesnrsquot have to be for analysis

Its easy to get into the habit of structuring data for easy analysis But we can choose to look at human data from a different perspective Step back Wallow in the users perspective

cczona

Data can be sheerexpressiveness

Data can be sheer expressiveness Data that has character individualism distinctiveness

cczona

What we want What we get

bull Structured

bull Predictablebull Validations exceptions

bull Conditionals partials

bull Relational

bull Indexed bull Premature optimization

bull Exhaustivebull Cultural variability

bull Individual POV

bull Moving target

bull Easy analytics

bull Data-driven decisionsbull Decisions based on false

premises

As developers we have a vision of what a good codebase should be and not be

[READ]

In the most sensible of ways we often arrive at solutions that are factually TRUTHY while far removed from real life utility

cczona

What is your religion if any

ARIS is the largest ongoing survey of Americans religious identification It asks this simple OPEN-ENDED question [slide]

Which nets over 100 unique answers Which if youre making a form based on that is tricky Theres no form element that makes it easy for our users to pick themselves out of a list of so many possibilities

cczona

ARIS found though that these could be compressed into 13 major categories More manageable list right We could use that for a form But eh a lot of those are edge cases Wed rather want to focus on genuinely major groupings

cczona

What is your religion if any

Christian 76

Other 4

None 15

Dont Know or Refused 5

Which brings us down to this At least a quarter of Americans are Christian Done Every other religion would just be clutter Edge cases

And then theres this sort of crummy data with it Wed probably assign nil value for more than one of these categories right

cczona

What is your religion

Christian 76

Other 4

na 20

Fixed Which focuses attention on a problem here 1 in 5 are not useful answers from an advertisers perspective or for out own analytics The NILS -- gotta go

cczona

What is your religion

Christian 76

Other 24

So what were left with is a good clear list It covers ALL the big stuff When you get reductive enough for Americans religion it a binary Which from a storage standpoint is great Booleans Score

cczona

Religion

But we dont do THIS It covers the biggest categories But oh how it leaves people out

cczona

People arent edge cases

People arent edge cases

And theyre pushing back on apps that treat them that way

[LONG BEAT LONG] [BREATH] [CHANGE GEARS]

So Reductive has big problems So does scaling upward

cczona

Balancing between approaches

As engineers its instinctually uncomfortable to DELIBERATELY NOT STRUCTURE DATA for easy analysis I feel ya I really do This freaks me too But again the foundational question is What benefit will the user notice What identities are okay with us

cczona

And if necessary we CAN strike a middleground This is where guided response comes in Autosuggest

Checkbox / Radio / Select

Textarea / Text

Required / Optional

Autosuggest

Minimal Suggest

When there's a subset of values that you're most interested in, do minimal suggest instead. Autosuggest using just the handful of values you care about.

Structured data from those who want to give it. Free form to incite expressiveness in those who want THAT.
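A minimal sketch of minimal suggest in Ruby, under stated assumptions: the suggestion list, `suggestions_for`, and `accept_response` are hypothetical names invented for illustration, not anything from a real app. The point is only that suggestions come from a short list while any typed value is still accepted.

```ruby
# Hypothetical short list: the handful of values this app most wants as structured data.
SUGGESTED_VALUES = ["Female", "Male", "Genderqueer", "Transgender"].freeze

# Suggest only from the short list, matching on what the user has typed so far.
def suggestions_for(typed, candidates = SUGGESTED_VALUES)
  prefix = typed.to_s.strip.downcase
  return [] if prefix.empty?

  candidates.select { |value| value.downcase.start_with?(prefix) }
end

# Accept whatever the user submits; a blank answer is stored as nil (no answer given).
def accept_response(raw)
  text = raw.to_s.strip
  text.empty? ? nil : text
end
```

Nothing here validates against the list: a suggestion is an offer, and the free-form answer always wins.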

Unguided Text: Of those who use MetaFilter's gender field

40% of responses are "f", "m", "female", "male"

[read] So structured data IS there. This can be a balanced solution in many cases where you're willing to tolerate some ambiguity. Of course there are tradeoffs. Data quantity is lower: freed to opt out of providing personal info, many do. On the other hand, data quality should improve.
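One way to check how much structured data hides inside a free-text field is to measure the share of answers that land in a known set. This is a rough sketch, assuming the four values from the slide are the recognized set; `RECOGNIZED` and `recognized_share` are hypothetical names.

```ruby
# Values treated as "structured" for analysis; an assumption based on the slide's four examples.
RECOGNIZED = %w[f m female male].freeze

# Fraction of non-blank free-text answers that fall in the recognized set.
def recognized_share(responses, recognized = RECOGNIZED)
  answered = responses.map { |r| r.to_s.strip.downcase }.reject(&:empty?)
  return 0.0 if answered.empty?

  answered.count { |r| recognized.include?(r) }.fdiv(answered.size)
end
```

Blank and nil answers are left out of the denominator, since opting out is not the same as giving an unrecognized answer.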

Optional Select: 60% of Facebook users select a relationship status

It's fine to mix and match here. Find the right approach for your users and your app's business objectives. Facebook, for instance, makes relationship status completely optional, but coercive for those who do opt in to setting a value. Most users do opt in: 60% of them select a relationship status.

We want everyone to feel excited about what we've built

The bottom line is that [read]

We want users to feel passionate about their involvement

[read]

Analytics investments & monetization are based on a premise that data is accurate

[read]

BUT if the data has been collected by coercive approaches, the RISK is that--

this premise is garbage. Some people are lying, because lying has been made a requirement for getting past the barriers. So conclusions drawn from that bad data can misdirect decision-making about the next stage of development.

Checkbox / Radio / Select

Textarea / Text

Required

Corrective

Discretionary / Guided

The restrictive options, that stuff at bottom left: those don't actually have to be marked required.

ADD COLUMN gender VARCHAR(6) NOT NULL DEFAULT 'female'

But the way we set up schema often embeds assumptions that we should, and we will. So we do. An attribute that is ambiguously named is destined to become a form field whose data wanders away from its original intent, and code that misunderstands how to use it. A field that's not allowed to be null is destined to be mandatory. A field that is assigned a maximum length is an ASSERTION that all possible values are knowable and fit within it.

ADD COLUMN gender_identity VARCHAR NULL

BOOM. This is the foundation for a whole different user experience. And the cool ninja move was that we decided to DO LESS. Make this stuff flexible upfront. Optimize storage later. Decide what's valid later.

Checkbox / Radio / Select

Textarea / Text

Required / Optional

Alteration

Guided

So there you have it: a discretionary field. Whether to respond is left up to the user.

t.string :relationship_status, :null => true

As developers, we may look upon THIS expression as redundant, completely unnecessary. Duh, null is true by default. But making that explicit is a communication to the team and to your future self. It's a statement of intent. It's documenting a product decision.

Facebook

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

What would a canonical set of relationship statuses look like? Three years ago, Facebook figured this list was pretty good. Arguably pretty progressive too, right? Users disagreed. Strongly.

Facebook

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

Under pressure, Facebook nearly doubled the options in just two years' time.

Google+

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I don't want to say

While Google+ has largely adopted that list, it has not included Separated or Divorced. Notice that they also added something choice: Opt out of labeling.

Allowing users to identify their relationships with labels of greater personal significance

That's being driven by people rejecting a user experience that isn't working for them.

It's a scope

How did some statuses seem universal while others didn't? Naming a thing creates scope. The assumed validity of a field's values gets constrained as soon as the field is named.

marital_status

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I don't want to say

Marital status, for instance, might lead to a list similar to this one. A person is assumed to be either unmarried, preparing to be married, currently married, or formerly married.

legal_marital_status

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I don't want to say

Marital status, for instance, might lead to a list similar to this one. A person is assumed to be either unmarried, preparing to be married, currently married, or formerly married.

relationship_status

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I don't want to say

Whereas relationship status might lead to a list more like this one. In which one either has a current relationship -- or is defined by the absence of any.

singleness_status

As soon as you name a field, you define its paradigm and possibilities. There are important differences in what each of these envisions and is able to measure. Naming fields -- with great specificity upfront -- makes analyses more powerful later.

singleness_rating

No chance whatsoever Suuuuuper duper single

1000

If you change the name, you shift the paradigm and possibilities. There's an important difference in what these set out to collect and measure. Naming fields -- with great specificity upfront -- makes analyses more powerful later.

It's Complicated

In a Relationship

Married

Divorced

Widowed

Single

We go through life experiencing many relationships. They don't all have pre-determined labels. And new relationship identities don't necessarily leave old ones behind.

"I like to be truthful, and 'It's Complicated' is really deceiving. It is not complicated. I am separated from my husband, who I am still legally married to."

-- Facebook user

[READ] Why should anyone have to feel like they're required to disavow a relationship that's meaningful for them?

It cuts deeply sometimes. Is a widow only "Widowed" until she resumes dating? Why does she have to be confronted with a decision like that just to use someone's app?

Okay, this one. Spot the fatal flaw. [WAIT]

buffer overflow

An open relationship is definitionally a 1:Many join -- WITH the usual engineering understanding, that is, that the actual number of relationships may be 0, 1, OR many.
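To make the 1:Many point concrete, here is a small plain-Ruby sketch. `Profile` and `Relationship` are hypothetical names for illustration; the only claim is that relationships live in a collection, so zero, one, or many all fit without forcing a single slot.

```ruby
# A relationship the user chose to list; the label is free-form and optional.
Relationship = Struct.new(:partner_name, :label)

class Profile
  attr_reader :relationships

  def initialize
    @relationships = [] # zero, one, or many; none is a perfectly valid state
  end

  # Returns self so calls can be chained.
  def add_relationship(partner_name, label = nil)
    @relationships << Relationship.new(partner_name, label)
    self
  end
end
```

A profile built this way never has to overflow a single partner field, and an unlabeled relationship is stored as it was given.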

Choose (one): evasive, inauthentic

But it's not funny from the personal perspective. This schema forces the person to either LOOK EVASIVE or else BE INAUTHENTIC. This is what happens when we try to throw more labels at the problem, instead of re-examining the schema's assumptions.

Facebook relationship status by user age

srsly? When we don't give people means to express what's real and important for them, they hack around us. Are a quarter of Facebook's 13-year-olds really married? Probably not. What they probably have is a BEST FRIEND, and "married" is the nearest equivalent available for them to express "THIS PERSON is my most important relationship."

The dogged pursuit of nice, clean, CRUNCHABLE data takes us to such wrong places. And makes our users want to work against a system that doesn't acknowledge their lives' reality.

We build or break community with each line of code

As developers, our lines of code build -- or break -- an app's community. So it's worth making conscious choices as we go.

Modeling the real world is complex

[READ] And that's okay.

Assuming we know who users are surrenders opportunity to learn who they are

But [read]. Early constraints in schema NET crappy, misleading data. [beat] So keep constraints out of human and social schema, at least at first. Gather enough initial response to do some data mining. Watch that data for a while. Keep an eye out for emergent trends. What DISCOVERIES can you make? How can they shape the app's growth and evolution?
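Watching unconstrained data for emergent trends can start very simply. The sketch below is an assumption-laden illustration, not a real analytics pipeline: `top_responses` is a hypothetical helper that normalizes case and whitespace and counts what users actually typed.

```ruby
# Count free-text answers after light normalization and return the most common ones.
# Blank and nil answers are skipped; ties are broken alphabetically.
def top_responses(responses, limit = 3)
  counts = Hash.new(0)
  responses.each do |raw|
    key = raw.to_s.strip.downcase
    counts[key] += 1 unless key.empty?
  end
  counts.sort_by { |value, count| [-count, value] }.first(limit)
end
```

Run periodically over a free-form field, a tally like this is one way to notice a label gaining ground before deciding whether it deserves a place in the suggestions.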

Quality

Data quality improves when lies are merely optional, not required.

Specificity

Data becomes like the real world itself: rich and specific. So we can unearth the patterns that are undetectable when data is generic.

Adaptability

We get to make discoveries and re-envision possibilities.

We adapt quickly.

Loyalty

And their response to all this? Engagement. Passion. LOYALTY.

THOSE are foundations for great user experiences. [LOOK UP FINALITY]

Questions? Conversation?

Oh Hell Yeah

Carina C. Zona

@cczona, cczona@gmail.com, http://cczona.com

Resources

Data Science & Information Architecture

• Sociological normalization

• Database normalization

• Using Machine Learning On Social Networks To Figure Out What You Should Read On The Web

• NoSQL Data Modeling Techniques

• Data & Reality, by William Kent

• Bad Data Handbook, Q. Ethan McCallum

• Data Science of the Facebook World

Relationships

• Does Facebook Hurt Relationships?

• Facebook Adds LGBT-Friendly Relationship Status Options

• Facebook Targeting by Relationship Status & Workplace

• Your Facebook Relationship Status: It's Complicated

• Gay Marriage: The Database Engineering Perspective

Sex & Genders

• "Disalienation: Why Gender is a Text Field on Diaspora"

• "Gender & Drop Down Menus"

• "Sex & Gender"

• "Bucket Gender"

• Recommendations for Inclusive Data Collection of Trans People

Names

• Falsehoods Programmers Believe About Names

• Your Last Name Contains Invalid Characters

• W3C Internationalization: Personal Names Around the World

• Spanish Names

• Chinese Names

More

• Redesigning the Country Selector

• American Religious Identification Survey Summary Report 2009

• Linguistic Potluck: Crowdsourcing Internationalization in Rails

Image Credits

Postsecret

Facebook

OKCupid

Google+

Metafilter

Diaspora

Flickr

FetLife

Kotangle

cutestpaw.com

hdwallpapers.in

xkcd

Many Thanks

Chiu-Ki Chan

Estelle Weyl

Heather Rivers

Heroku

Jeremy Dunck

Josh Susser

Michele Titolo

NIRD LLC

Renée De Voursney

Sarah Mei

San Francisco Sex Information

Yoz Grahame

Get in Touch

Carina C. Zona

@cczona

http://cczona.com

cczona@gmail.com

http://slideshare.net/cczona

http://linkedin.com/in/cczona

Page 34: Schemas for the Real World [Madison RubyConf 2013]

Checkbox / Radio / Select

Coerced / Discretionary / Guided

Corrective

Text Input / Textarea

Checking a box is a one-step action. Entering a text string is not. So a freeform solution, such as a textarea or text input, isn't automatically exciting. BUT they can bring visible, meaningful benefit.

11 years

At MetaFilter, gender has been a text field for over a decade.

"...the early crowd at MeFi were often programmers, and they hated the idea of dirty data collection..."

-- Matt Haughey, founder

Initially some SHUDDERED at the thought. Because [READ]

All girl

XY

Fella

Queer

Fembot

Alto

It depends

MYOB

Dangly bits

Chicklet

Innie not outie

Convex

Sideburns

Ambisextrous

Member of the patriarchy

For about 5 seconds. Then they jumped on board. Because they could be creative and silly.

I speak using the male gender when required by language

50% quintessential tomboy, 50% total girly-girl

AND because they could express this THING about themselves fully. With authentic voice.

Transgender

Genderqueer

[PAUSE]

"The freeform gender field is something I cherish about Metafilter"

-- MeFi user

That text field grew into a beloved institution. Whatever you, as user, choose to put into that field says something revealing about who YOU are.

"It was one of the earliest indications I'd landed in the right place"

-- MeFi user

That you're allowed to put in ANYTHING -- or put in NOTHING AT ALL -- says something revealing about how MetaFilter envisions community. That schema's trust in users was the foundation for MetaFilter users to ASK to share themselves even more.

In 2010, Diaspora likewise turned gender into a text field. Just like on MetaFilter, users felt set free by it.

There's an important distinction here though. MetaFilter is closed source, it's one developer, and it's entirely in English. Diaspora needed to meet a different set of needs. It's open source, and it's international.

Not amused

So some of its developers are not amused. It's fair to object that this approach wreaks chaos on internationalization of pronouns. [BEAT] And here's what I've got to say to that:

Internationalization is hellish, period

Internationalization of pronouns is hellish, period. You get no safety net in this. Constraining gender options doesn't solve translation of languages. Don't set yourself up for trying to deal with internationalization this way. It will fail you.

What's your legal gender?

• Indeterminate

• X

• Sex Not Specified

And if you really think that boiling this question down to "What's your legal gender?" is an easy out, let's talk about internationalization of that too. Because this is the year that "indeterminate" has become a legal gender on German birth certificates, and that "X" is a legal gender on Australian passports. In New South Wales, Australia, "Sex Not Specified" is yet another legally-recognized gender. Binary gender is, today, APP FAIL.

"...the most complicated thing I've ever spent a lot of time learning about. And I've spent a lot of time learning about quantum mechanics."

-- Randall Munroe, xkcd

Randall Munroe, best known as xkcd, has examined the issue of pronouns many times, in depth, for English language projects. He says it's [READ]. So here's what he ultimately concluded:

Ask

Ask

Which pronouns do you prefer?

he / him / himself / his / his

she / her / herself / hers / her

they / them / themself / theirs / their

it / it / itself / its / its

by name

Asking straight up, "Which pronouns do you prefer?", is truly the best he could come up with.

Which pronouns do you prefer?

he / him / himself / his / his

she / her / herself / hers / her

they / them / themself / theirs / their

personal name

other ____________

Or maybe slightly refining, like these.

And I know you may be looking at that 3rd row and protesting.

Which pronouns do you prefer?

he / him / himself / his / his

she / her / herself / hers / her

they / them / themself / theirs / their

personal name

other ____________

Our English teachers told us "they" and "their" aren't singular, right? Nope. They're plural AND SINGULAR, and have been for 400 years. Take the word of Jane Austen, Shakespeare, Chaucer, Lewis Carroll. Modern English authorities agree on this one too. It's excellent English -- and excellent SOCIAL -- to use it when people ask us to.

Even Facebook has taken note that it's okay. They've been using this construction for at least 5 years and are doing fine :-)
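Once the app asks, the answer has to reach the copy it generates. Here is a rough Ruby sketch of that step, assuming the five forms shown on the slide (subject, object, reflexive, possessive pronoun, possessive determiner). `PronounSet`, `PRONOUN_SETS`, `pronouns_for`, and `photo_notice` are hypothetical names, and the name-based fallback is one possible reading of the "personal name" option.

```ruby
# The five forms listed on the slide, in slide order.
PronounSet = Struct.new(:subject, :object, :reflexive, :possessive, :determiner)

PRONOUN_SETS = {
  "he"   => PronounSet.new("he", "him", "himself", "his", "his"),
  "she"  => PronounSet.new("she", "her", "herself", "hers", "her"),
  "they" => PronounSet.new("they", "them", "themself", "theirs", "their")
}.freeze

# Any choice outside the known sets (for example "personal name") falls back to the name itself.
def pronouns_for(choice, name)
  PRONOUN_SETS.fetch(choice) do
    PronounSet.new(name, name, name, "#{name}'s", "#{name}'s")
  end
end

# Example of generated copy that respects the user's answer.
def photo_notice(name, choice)
  set = pronouns_for(choice, name)
  "#{name} updated #{set.determiner} profile photo."
end
```

A free-form "other" answer would need its own five forms supplied by the user; this sketch leaves that case to the name fallback rather than guessing.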

"Home is the place where, when you have to go there, they have to take you in."

-- Robert Frost

We aspire to make our social apps feel like home. We can do that by making development choices that say: Hey user, we GET you. Be as individual and messy as you want to. We can handle _you_ being _you_.

Social Research

When collecting data on people, we're in a different realm: social sciences. If you want to run useful analytics about personal attributes and behavior, then data collection needs to meet at least the two minimum criteria for social science research.

Exhaustive: Every possible option

A field's values must include every possible option.

Mutually Exclusive: No overlap exists between them

And the field's options must be mutually exclusive.

How many social apps have both of those bare minimum criteria covered?
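Both criteria can be checked mechanically against real people's situations. This is only a sketch under assumptions: each option is paired with a hypothetical predicate saying when it applies, and `coverage_problems` is an invented helper that reports who matches no option (not exhaustive) and who matches more than one (not mutually exclusive).

```ruby
# options: Hash of option label => lambda that returns truthy when the option applies to a person.
# people:  Array of hashes describing each person's actual situation.
def coverage_problems(options, people)
  people.each_with_object({ unmatched: [], overlapping: [] }) do |person, report|
    matches = options.select { |_label, applies| applies.call(person) }.keys
    report[:unmatched] << person if matches.empty?      # fails "exhaustive"
    report[:overlapping] << person if matches.size > 1  # fails "mutually exclusive"
  end
end
```

For example, a status list of "Single", "Married", and "In a relationship" reports a widowed person as unmatched and a separated person who is dating as overlapping.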

We've seen this one enough. At least that set of values seems covered fully.

But it's not.

Still lacking.

Trying hard. But still inadequate to express the real world's complexity.

We do realize that the real world has great variety. Of course we do. But it's hard to know how to handle it.

What user experience does this schema drive us toward?

It's easy to figure, "Eh, we can refactor later." But initial choices set differing user experiences in motion. Thinking at the outset about the real world's variety and complexity starts us asking early questions that set the foundation for social app-building. Ask: What user experience does this schema drive us toward?

Data doesn't have to be for analysis

It's easy to get into the habit of structuring data for easy analysis. But we can choose to look at human data from a different perspective. Step back. Wallow in the user's perspective.

Data can be sheer expressiveness

Data can be sheer expressiveness. Data that has character, individualism, distinctiveness.

What we want

• Structured

• Predictable

• Relational

• Indexed

• Exhaustive

• Easy analytics

• Data-driven decisions

What we get

• Validations, exceptions

• Conditionals, partials

• Premature optimization

• Cultural variability

• Individual POV

• Moving target

• Decisions based on false premises

As developers, we have a vision of what a good codebase should be, and not be.

[READ]

In the most sensible of ways, we often arrive at solutions that are factually TRUTHY, while far removed from real-life utility.


Page 35: Schemas for the Real World [Madison RubyConf 2013]

cczona

11 years

At MetaFilter gender has been a text field for over a decade

cczona

hellipthe early crowd at MeFi were often programmers and they hated the idea of dirty data collectionhellip

mdashMatt Haughey founder

Initially some SHUDDERED at the thought Because [READ]

cczona

All girl

XY

Fella

Queer

Fembot

Alto

It depends

MYOB

Dangly bits

Chicklet

Innie not outie

Convex

Sideburns

Ambisextrous

Member of the patriarchy

For about 5 seconds Then they jumped on board Because they could be creative and silly

cczona

I speak using the male gender when required by language 50 quintessential

tomboy 50 total girly-girl

AND because they could express this THING about themselves fully With authentic voice

cczona

TransgenderGenderqueer

[PAUSE]

cczona

The freeform gender field is something I cherish about Metafilter

mdashMeFi user

That text field grew into a beloved institution Whatever you as user choose to put into that field says something revealing about who YOU are

cczona

It was one of the earliest indications Id landed in the right place

mdashMeFi user

That youre allowed to put in ANYTHING -- or put in NOTHING AT ALL -- says something revealing about what how MetaFilter envisions community That schemas trust in users was the foundation for Metafilter users to ASK to share themselves even more

In 2010, Diaspora likewise turned gender into a text field. Just like on MetaFilter, users felt set free by it.

There's an important distinction here, though. MetaFilter is closed source, it's one developer, and it's entirely in English. Diaspora needed to meet a different set of needs. It's open source, and it's international.

Not amused

So some of its developers are not amused. It's fair to object that this approach wreaks chaos on internationalization of pronouns. [BEAT] And here's what I've got to say to that:

Internationalization is hellish, period.

Internationalization of pronouns is hellish, period. You get no safety net in this. Constraining gender options doesn't solve translation of languages. Don't set yourself up for trying to deal with internationalization this way. It will fail you.

What's your legal gender?

• Indeterminate

• X

• Sex Not Specified

And if you really think that boiling this question down to "What's your legal gender?" is an easy out, let's talk about internationalization of that too. Because this is the year that "indeterminate" has become a legal gender on German birth certificates, and that "X" is a legal gender on Australian passports. In New South Wales, Australia, "Sex Not Specified" is yet another legally-recognized gender. Binary gender is, today, APP FAIL.

"the most complicated thing I've ever spent a lot of time learning about. And I've spent a lot of time learning about quantum mechanics."

-- Randall Munroe, xkcd

Randall Munroe, best known as xkcd, has examined the issue of pronouns many times, in depth, for English language projects. He says it's [READ]. So here's what he ultimately concluded:

Ask

Ask.

Which pronouns do you prefer?

he / him / himself / his / his

she / her / herself / hers / her

they / them / themself / theirs / their

it / it / itself / its / its

by name

Asking straight up, "Which pronouns do you prefer?" is truly the best he could come up with.

Which pronouns do you prefer?

he / him / himself / his / his

she / her / herself / hers / her

they / them / themself / theirs / their

personal name

other ____________

Or maybe slightly refining, like these.

And I know you may be looking at that 3rd row and protesting.

Which pronouns do you prefer?

he / him / himself / his / his

she / her / herself / hers / her

they / them / themself / theirs / their

personal name

other ____________

Our English teachers told us "they" and "their" aren't singular, right? Nope. They're plural AND SINGULAR, and have been for 400 years. Take the word of Jane Austen, Shakespeare, Chaucer, Lewis Carroll. Modern English authorities agree on this one too. It's excellent English -- and excellent SOCIAL -- to use it when people ask us to.

Even Facebook has taken note that it's okay. They've been using this construction for at least 5 years, and are doing fine :-)
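The pronoun question on the slides can be sketched in plain Ruby. This is only an illustration, not something from the talk: `PronounSet`, `PRONOUN_SETS`, and `pronouns_for` are hypothetical names, and the five slots assume the order shown on the slide (subject, object, reflexive, possessive pronoun, possessive determiner).

```ruby
# Hypothetical sketch: pronouns are stored as what the user chose,
# not derived from a gender column.
PronounSet = Struct.new(:subject, :object, :reflexive, :possessive, :determiner)

PRONOUN_SETS = {
  "he"   => PronounSet.new("he", "him", "himself", "his", "his"),
  "she"  => PronounSet.new("she", "her", "herself", "hers", "her"),
  "they" => PronounSet.new("they", "them", "themself", "theirs", "their")
}.freeze

# choice: "he", "she", "they", "name", or "other"
# name:   display name, used for the "personal name" option
# custom: five strings supplied by the user for the "other" option
# Returns nil when the user has not answered; callers then phrase
# things without pronouns.
def pronouns_for(choice, name: nil, custom: nil)
  case choice
  when "name"
    PronounSet.new(name, name, name, "#{name}'s", "#{name}'s")
  when "other"
    PronounSet.new(*custom)
  else
    PRONOUN_SETS[choice]
  end
end

THEY = pronouns_for("they")
SAM  = pronouns_for("name", name: "Sam")

puts "#{THEY.subject.capitalize} updated #{THEY.determiner} profile."
puts "#{SAM.subject} updated #{SAM.determiner} profile."
```

The point of the sketch is that "other" and "no answer" are ordinary code paths rather than errors.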

"Home is the place where, when you have to go there, they have to take you in."

-- Robert Frost

We aspire to make our social apps feel like home. We can do that by making development choices that say, "Hey, user, we GET you. Be as individual and messy as you want to. We can handle _you_ being _you_."

Social Research

When collecting data on people, we're in a different realm: social sciences. If you want to run useful analytics about personal attributes and behavior, then data collection needs to meet at least the two minimum criteria for social science research.

Exhaustive: Every possible option

A field's values must include every possible option.

Mutually Exclusive: No overlap exists between them

And the field's options must be mutually exclusive.

How many social apps have both of those bare minimum criteria covered?
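One way to make the two criteria concrete is to test a set of options against sample people. The sketch below is hypothetical throughout: `coverage_report`, the three options, and the sample data are invented for illustration and are not from the talk.

```ruby
# Hypothetical check of a field's options against the two criteria.
# `options` maps each label to a predicate; `people` is sample data.
def coverage_report(options, people)
  unmatched   = [] # people no option fits      -> not exhaustive
  overlapping = [] # people several options fit -> not mutually exclusive
  people.each do |person|
    fits = options.select { |_label, test| test.call(person) }.keys
    unmatched << person if fits.empty?
    overlapping << person if fits.size > 1
  end
  { exhaustive: unmatched.empty?,
    mutually_exclusive: overlapping.empty?,
    unmatched: unmatched,
    overlapping: overlapping }
end

STATUS_OPTIONS = {
  "Single"            => ->(p) { p[:partners].zero? },
  "In a relationship" => ->(p) { p[:partners] == 1 },
  "Married"           => ->(p) { p[:married] }
}.freeze

SAMPLE_PEOPLE = [
  { name: "A", partners: 0, married: false },
  { name: "B", partners: 1, married: true },  # fits two options
  { name: "C", partners: 2, married: false }  # fits none
].freeze

REPORT = coverage_report(STATUS_OPTIONS, SAMPLE_PEOPLE)
puts "exhaustive? #{REPORT[:exhaustive]}"
puts "mutually exclusive? #{REPORT[:mutually_exclusive]}"
```

Even this tiny option list fails both criteria as soon as the sample includes people B and C.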

We've seen this one enough. At least that set of values seems covered fully.

But it's not.

Still lacking.

Trying hard. But still inadequate to express the real world's complexity.

We do realize that the real world has great variety. Of course we do. But it's hard to know how to handle it.

What user experience does this schema drive us toward?

It's easy to figure, "Eh, we can refactor later." But initial choices set differing user experiences in motion. Thinking at the outset about the real world's variety and complexity starts us asking early questions that set foundation for social app-building. Ask: "What user experience does this schema drive us toward?"

Data doesn't have to be for analysis

It's easy to get into the habit of structuring data for easy analysis. But we can choose to look at human data from a different perspective. Step back. Wallow in the user's perspective.

Data can be sheer expressiveness

Data can be sheer expressiveness. Data that has character, individualism, distinctiveness.

What we want:

• Structured

• Predictable

• Relational

• Indexed

• Exhaustive

• Easy analytics

• Data-driven decisions

What we get:

• Validations, exceptions

• Conditionals, partials

• Premature optimization

• Cultural variability

• Individual POV

• Moving target

• Decisions based on false premises

As developers, we have a vision of what a good codebase should be and not be.

[READ]

In the most sensible of ways, we often arrive at solutions that are factually TRUTHY while far removed from real-life utility.

What is your religion, if any?

ARIS is the largest ongoing survey of Americans' religious identification. It asks this simple, OPEN-ENDED question: [slide]

Which nets over 100 unique answers. Which, if you're making a form based on that, is tricky. There's no form element that makes it easy for our users to pick themselves out of a list of so many possibilities.

ARIS found, though, that these could be compressed into 13 major categories. More manageable list, right? We could use that for a form. But, eh, a lot of those are edge cases. We'd rather focus on genuinely major groupings.

What is your religion, if any?

Christian 76%

Other 4%

None 15%

Don't Know or Refused 5%

Which brings us down to this. At least three quarters of Americans are Christian. Done. Every other religion would just be clutter. Edge cases.

And then there's this sort of crummy data with it. We'd probably assign a nil value for more than one of these categories, right?

What is your religion?

Christian 76%

Other 4%

n/a 20%

Fixed. Which focuses attention on a problem here: 1 in 5 are not useful answers, from an advertiser's perspective or for our own analytics. The NILS -- gotta go.

What is your religion?

Christian 76%

Other 24%

So what we're left with is a good, clear list. It covers ALL the big stuff. When you get reductive enough, for Americans, religion is a binary. Which, from a storage standpoint, is great. Booleans! Score!
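The arithmetic behind those three slides can be replayed in a few lines of Ruby. The percentages are the ones shown on the slides; the `collapse` helper and the constant names are hypothetical, added only to show what each "cleanup" step does to the data.

```ruby
# Shares as shown on the slide (percent of respondents).
ARIS_SHARES = {
  "Christian" => 76,
  "Other" => 4,
  "None" => 15,
  "Don't Know or Refused" => 5
}.freeze

# Merge several categories into one bucket, summing their shares.
def collapse(shares, into:, from:)
  merged = shares.reject { |label, _share| from.include?(label) }
  merged[into] = from.sum { |label| shares.fetch(label) }
  merged
end

# Step 1: treat "None" and "Don't Know or Refused" as nil-like answers.
STEP1 = collapse(ARIS_SHARES, into: "n/a", from: ["None", "Don't Know or Refused"])
# Step 2: get rid of the nils by folding them into "Other".
STEP2 = collapse(STEP1, into: "Other", from: ["Other", "n/a"])

puts STEP1.inspect
puts STEP2.inspect
```

Every step still sums to 100, which is why it looks harmless. What disappears is the difference between "another religion", "no religion", and "declined to say": 24% of respondents end up under a single label.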

Religion

But we don't do THIS. It covers the biggest categories. But oh, how it leaves people out.

People aren't edge cases

People aren't edge cases.

And they're pushing back on apps that treat them that way.

[LONG BEAT. LONG.] [BREATH] [CHANGE GEARS]

So: reductive has big problems. So does scaling upward.

Balancing between approaches

As engineers, it's instinctually uncomfortable to DELIBERATELY NOT STRUCTURE DATA for easy analysis. I feel ya. I really do. This freaks me too. But again, the foundational question is: What benefit will the user notice? What identities are okay with us?

And if necessary, we CAN strike a middle ground. This is where guided response comes in. Autosuggest.

Checkbox / Radio / Select

Textarea / Text

Required / Optional

Autosuggest

Minimal Suggest

When there's a subset of values that you're most interested in, do minimal suggest instead. Autosuggest using just the handful of values you care about.

Structured data from those who want to give it. Free form to incite expressiveness in those who want THAT.
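A minimal-suggest field can be sketched as two small functions: one that offers completions from a short list, and one that stores whatever the user actually typed. The names `SUGGESTED`, `suggestions`, and `stored_value` are hypothetical, and the four suggested values are placeholders for "the handful of values you care about".

```ruby
# Hypothetical "minimal suggest": complete from a short list,
# but accept any text the user enters.
SUGGESTED = ["female", "male", "genderqueer", "transgender"].freeze

# Completions for what has been typed so far (case-insensitive prefix match).
def suggestions(typed, values = SUGGESTED)
  prefix = typed.to_s.strip.downcase
  return [] if prefix.empty?
  values.select { |value| value.start_with?(prefix) }
end

# The stored value is the user's own text; blank input means
# "no answer" and is stored as nil.
def stored_value(typed)
  text = typed.to_s.strip
  text.empty? ? nil : text
end

puts suggestions("Fe").inspect
puts stored_value("  Fembot ").inspect
```

Nothing is validated against the list: the suggestions only make the structured answers cheap to give.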

Unguided Text: Of those who use MetaFilter's gender field,

40% of responses are "f", "m", "female", "male"

[read] So structured data IS there. This can be a balanced solution in many cases where you're willing to tolerate some ambiguity. Of course there are tradeoffs. Data quantity is lower. Freed to opt out of providing personal info, many do. On the other hand, data quality should improve.
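Pulling the structured part back out of an unguided text field takes very little code. In the sketch below the sample responses are made up (a few are borrowed from the MetaFilter slide), so the resulting percentage is illustrative and is not the 40% figure from the talk; `CONVENTIONAL` and `bucket` are hypothetical names.

```ruby
# Made-up sample of free-text answers, including skipped ones.
RESPONSES = ["f", "M", "Female", "male ", "Fembot", "MYOB",
             "It depends", "Alto", nil, ""].freeze

# The handful of spellings we are willing to fold into a bucket.
CONVENTIONAL = {
  "f" => "female", "female" => "female",
  "m" => "male",   "male"   => "male"
}.freeze

# Returns a bucket name, or nil when the answer is left as the user wrote it.
def bucket(response)
  CONVENTIONAL[response.to_s.strip.downcase]
end

ANSWERED = RESPONSES.reject { |r| r.to_s.strip.empty? }
BUCKETED = ANSWERED.map { |r| bucket(r) }.compact
SHARE    = (100.0 * BUCKETED.size / ANSWERED.size).round

puts "#{SHARE}% of answered responses fall into a conventional bucket"
```

The original text stays in the database untouched; bucketing is something the analysis does afterwards, so a wrong guess about buckets can be revised later.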

Optional Select: 60% of Facebook users select a relationship status

It's fine to mix and match here. Find the right approach for your users and your app's business objectives. Facebook, for instance, makes relationship status completely optional, but coercive for those who do opt in to setting a value. Most users do opt in: 60% of them select a relationship status.

We want everyone to feel excited about what we've built

The bottom line is that [read]

We want users to feel passionate about their involvement

[read]

Analytics, investments, & monetization are based on a premise that data is accurate

[read]

BUT if the data has been collected by coercive approaches, the RISK is that --

this premise is garbage. Some people are lying, because lying has been made a requirement for getting past the barriers. So conclusions drawn from that bad data can misdirect decision-making about the next stage of development.

Checkbox / Radio / Select

Textarea / Text

Required

Corrective

Discretionary / Guided

The restrictive options, that stuff at bottom left: those don't actually have to be marked required.

ADD COLUMN gender VARCHAR(6) NOT NULL DEFAULT 'female'

But the way we set up schema often embeds assumptions that we should, and we will. So we do. An attribute that is ambiguously named is destined to become a form field whose data wanders away from its original intent, and code that misunderstands how to use it. A field that's not allowed to be null is destined to be mandatory. A field that is assigned a maximum length is an ASSERTION that all possible values are knowable and fit within it.

ADD COLUMN gender_identity VARCHAR

BOOM. This is foundation for a whole different user experience. And the cool ninja move was that we decided to DO LESS. Make this stuff flexible upfront. Optimize storage later. Decide what's valid later.
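To see what the first column definition asserts, it helps to run real answers against it. This sketch only simulates the constraints in plain Ruby and does not touch a database; the answers are taken from the MetaFilter slide, and `MAX_LENGTH` and `stored_with_default` are hypothetical names. How a real database treats over-length values (rejecting them or truncating them) depends on the database and its settings.

```ruby
# Answers users actually gave, from the MetaFilter slide.
ANSWERS = ["XY", "Fella", "Queer", "Fembot", "It depends",
           "Ambisextrous", "Member of the patriarchy"].freeze

MAX_LENGTH = 6 # the limit asserted by VARCHAR(6)

# Which answers does a six-character column have room for?
FITS, REJECTED = ANSWERS.partition { |answer| answer.length <= MAX_LENGTH }

# NOT NULL plus DEFAULT 'female' turns a skipped question into an answer.
def stored_with_default(answer, default = "female")
  answer.nil? ? default : answer
end

puts "fits:     #{FITS.inspect}"
puts "rejected: #{REJECTED.inspect}"
puts "skipped question stored as: #{stored_with_default(nil).inspect}"
```

The length limit excludes three of the seven answers, and the default records a value for people who said nothing at all.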

Checkbox / Radio / Select

Textarea / Text

Required / Optional

Alteration

Guided

So there you have it: a discretionary field. Whether to respond is left up to the user.

t.string :relationship_status, :null => true

As developers, we may look upon THIS expression as redundant, completely unnecessary. "Duh, null is true by default." But making that explicit is a communication to the team, and to your future self. It's a statement of intent. It's documenting a product decision.

Facebook

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

What would a canonical set of relationship statuses look like? Three years ago, Facebook figured this list was pretty good. Arguably pretty progressive too, right? Users disagreed. Strongly.

Facebook

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

Under pressure, Facebook nearly doubled the options in just two years' time.

Google+

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I don't want to say

While Google+ has largely adopted that list, it has not included Separated or Divorced. Notice that they also added something choice: opt out of labeling.

Allowing users to identify their relationships with labels of greater personal significance

That's being driven by people rejecting a user experience that isn't working for them.

It's a scope

How did some statuses seem universal while others weren't? Naming a thing creates scope. The assumed validity of a field's values gets constrained as soon as the field is named.

marital_status

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I don't want to say

Marital status, for instance, might lead to a list similar to this one. A person is assumed to be either unmarried, preparing to be married, currently married, or formerly married.

legal_marital_status

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I don't want to say

Marital status, for instance, might lead to a list similar to this one. A person is assumed to be either unmarried, preparing to be married, currently married, or formerly married.

relationship_status

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I don't want to say

Whereas relationship status might lead to a list more like this one. In which one either has a current relationship -- or is defined by the absence of any.

singleness_status

As soon as you name a field, you define its paradigm and possibilities. There are important differences in what each of these envisions and is able to measure. Naming fields -- with great specificity, upfront -- makes analyses more powerful later.

singleness_rating

No chance whatsoever

Suuuuuper duper single

1000

If you change the name, you shift the paradigm and possibilities. There are important differences in what these collect and set out to measure. Naming fields -- with great specificity, upfront -- makes analyses more powerful later.

It's Complicated

In a Relationship

Married

Divorced

Widowed

Single

We go through life experiencing many relationships. They don't all have pre-determined labels. And new relationship identities don't necessarily leave old ones behind.

"I like to be truthful, and 'It's Complicated' is really deceiving. It is not complicated. I am separated from my husband, who I am still legally married to."

-- Facebook user

[READ] Why should anyone have to feel like they're required to disavow a relationship that's meaningful for them?

It cuts deeply sometimes. Is a widow only "Widowed" until she resumes dating? Why does she have to be confronted with a decision like that, just to use someone's app?

Okay, this one. Spot the fatal flaw. [WAIT]

buffer overflow

An open relationship is definitionally a 1:Many join -- WITH the usual engineering understanding, that is, that the actual number of relationships may be 0, 1, OR many.

Choose (one): evasive, inauthentic

But it's not funny from the personal perspective. This schema forces the person to either LOOK EVASIVE or else BE INAUTHENTIC. This is what happens when we try to throw more labels at the problem, instead of re-examining the schema's assumptions.
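Re-examining the assumption might mean modeling relationships as a collection of zero, one, or many entries, each with a label the user chose, instead of a single status column. The sketch below is one hypothetical shape for that in plain Ruby; `Relationship`, `Profile`, and `add_relationship` are invented names, not an API from the talk or from any of the apps it mentions.

```ruby
# Hypothetical model: a profile has zero, one, or many relationships,
# and each label is free text chosen by the user.
Relationship = Struct.new(:label, :with)

class Profile
  attr_reader :relationships

  def initialize
    @relationships = []
  end

  # `with` is optional: a label can stand on its own.
  def add_relationship(label, with: nil)
    @relationships << Relationship.new(label, with)
    self # allow chaining
  end

  def labels
    @relationships.map(&:label)
  end
end

PROFILE = Profile.new
PROFILE.add_relationship("Widowed")
PROFILE.add_relationship("In a relationship", with: "Pat")

puts PROFILE.labels.inspect
```

With this shape nobody has to drop "Widowed" in order to add a new relationship, and an empty list is a valid answer too.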

Facebook relationship status by user age

srsly?

When we don't give people means to express what's real and important for them, they hack around us. Are a quarter of Facebook's 13-year-olds really married? Probably not. What they probably have is a BEST FRIEND, and "married" is the nearest equivalent available for them to express "THIS PERSON is my most important relationship".

The dogged pursuit of nice, clean, CRUNCHABLE data takes us to such wrong places. And makes our users want to work against a system that doesn't acknowledge their lives' reality.

We build or break community with each line of code

As developers, our lines of code build -- or break -- an app's community. So it's worth making conscious choices as we go.

Modeling the real world is complex

[READ] And that's okay.

Assuming we know who users are surrenders opportunity to learn who they are

But [read]. Early constraints in schema NET crappy, misleading data. [beat] So keep constraints out of human and social schema, at least at first. Gather enough initial response to do some data mining. Watch that data for a while. Keep an eye out for emergent trends. What DISCOVERIES can you make? How can they shape the app's growth and evolution?
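Watching unconstrained data for emergent trends can start as a simple frequency count. In this sketch the sample answers are invented and `top_answers` is a hypothetical helper; the only normalization is trimming whitespace and ignoring case.

```ruby
# Count free-form answers, ignoring case, surrounding whitespace,
# and blank or missing responses; return the most common ones.
def top_answers(responses, limit: 3)
  counts = Hash.new(0)
  responses.each do |response|
    key = response.to_s.strip.downcase
    counts[key] += 1 unless key.empty?
  end
  # Most frequent first; ties broken alphabetically for stable output.
  counts.sort_by { |answer, count| [-count, answer] }.first(limit)
end

# Invented sample of answers to an unconstrained relationship field.
SAMPLE = ["best friend", "Married", "Best Friend", "separated",
          "married", "best friend ", nil, "it's complicated"].freeze

top_answers(SAMPLE).each do |answer, count|
  puts "#{count}  #{answer}"
end
```

A label such as "best friend" rising to the top is the kind of discovery a fixed list of options would never have surfaced.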

Quality

Data quality improves when lies are merely optional, not required.

Specificity

Data becomes like the real world itself: rich and specific. So we can unearth the patterns that are undetectable when data is generic.

Adaptability

We get to make discoveries, and re-envision possibilities.

We adapt quickly.

Loyalty

And their response to all this? Engagement. Passion. LOYALTY.

THOSE are foundations for great user experiences. [LOOK UP. FINALITY.]

@cczona, cczona@gmail.com, http://cczona.com

Questions? Conversation?

Oh Hell Yeah

Carina C. Zona

Resources

Data Science & Information Architecture

• Sociological normalization

• Database normalization

• Using Machine Learning On Social Networks To Figure Out What You Should Read On The Web

• NoSQL Data Modeling Techniques

• Data & Reality, by William Kent

• Bad Data Handbook, Q. Ethan McCallum

• Data Science of the Facebook World

Relationships

• Does Facebook Hurt Relationships?

• Facebook Adds LGBT-Friendly Relationship Status Options

• Facebook Targeting by Relationship Status & Workplace

• Your Facebook Relationship Status: It's Complicated

• Gay Marriage: The Database Engineering Perspective

Sex & Genders

• "Disalienation: Why Gender is a Text Field on Diaspora"

• "Gender & Drop Down Menus"

• "Sex & Gender"

• "Bucket Gender"

• Recommendations for Inclusive Data Collection of Trans People

Names

• Falsehoods Programmers Believe About Names

• Your Last Name Contains Invalid Characters

• W3C Internationalization: Personal Names Around the World

• Spanish Names

• Chinese Names

More

• Redesigning the Country Selector

• American Religious Identification Survey Summary Report 2009

• Linguistic Potluck: Crowdsourcing Internationalization in Rails

Image Credits

Postsecret

Facebook

OKCupid

Google+

Metafilter

Diaspora

Flickr

FetLife

Kotangle

cutestpaw.com

hdwallpapers.in

xkcd

Many Thanks

Chiu-Ki Chan, Estelle Weyl

Heather Rivers, Heroku

Jeremy Dunck, Josh Susser

Michele Titolo, NIRD LLC

Renée De Voursney, Sarah Mei

San Francisco Sex Information, Yoz Grahame

Get in Touch

Carina C. Zona, @cczona

http://cczona.com

cczona@gmail.com

http://slideshare.net/cczona

http://linkedin.com/in/cczona

We build or break community with each

line of code

As developers our lines of code build -- or break -- an apps community So its worth making conscious choices as we go

cczona

Modeling the real world is complex

[READ] And thats okay

cczona

Assuming we know who users are surrenders

opportunity to learn who they are

But [read] Early constraints in schema NET crappy misleading data[beat]So keep constraints out of human and social schema at least at first Gather enough initial response to do some data mining Watch that data for a while Keep an eye out for emergent trends What DISCOVERIES can you make How can they shape the apps growth and evolution

cczona

Quality

Data quality improves when lies are merely optional not required

cczona

Specificity

Data becomes like the real world itself rich and specific So we can unearth the patterns that are undetectable when data is generic

cczona

Adaptability

We get to make discoveries and re-envision possibilities

We adapt quickly

cczona

Loyalty

And their response to all this Engagement Passion LOYALTY

cczona

THOSE are foundations for great user experiences [LOOK UP FINALITY]

cczonacczonagmailcomhttpcczonacom

QuestionsConversation

Oh Hell Yeah

Carina C Zona

cczona

Resources

cczona

Data Science amp Information ArchitecturebullSociological normalization

bullDatabase normalization

bullUsing Machine Learning On Social Networks To Figure Out What You Should Read On The Web

bullNoSQL Data Modeling Techniques

bullData amp Reality by William Kent

bull Bad Data Handbook Q Ethan McCallum

bullData Science of the Facebook World

cczona

Relationships

bullDoes Facebook Hurt Relationships

bullFacebook Adds LGBT-Friendly Relationship Status Options

bullFacebook Targeting by Relationship Status amp Workplace

bullYour Facebook Relationship Status Its Complicated

bullGay Marriage The Database Engineering Perspective

cczona

Sex amp Genders

bullldquoDisalienation Why Gender is a Text Field on Diasporardquo

bull ldquoGender amp Drop Down Menusrdquo

bullldquoSex amp Genderrdquo

bullldquoBucket Genderrdquo

bullRecommendations for Inclusive Data Collection of Trans People

cczona

Names

bullFalsehoods Programmers Believe About Names

bullYour Last Name Contains Invalid Characters

bullW3C Internationalization Personal Names Around the World

bullSpanish Names

bullChinese Names

cczona

More

bullRedesigning the Country Selector

bullAmerican Religious Identification Survey Summary Report 2009

bullLinguistic Potluck Crowdsourcing Internationalization in Rails

cczona

Image CreditsPostsecret

Facebook

OKCupid

Google+

Metafilter

Diaspora

Flickr

FetLife

Kotangle

cutestpawcom

hdwallpapersin

xkcd

cczona

Many ThanksChiu-Ki ChanEstelle Weyl

Heather RiversHeroku

Jeremy DunckJosh Susser

Michele TitoloNIRD LLC

Reneacutee De VoursneySarah Mei

San Francisco Sex InformationYoz Grahame

cczona

Get in Touch

Carina C Zonacczona

httpcczonacom

cczonagmailcom

httpslidesharenetcczona

httplinkedincomincczona

Page 37: Schemas for the Real World [Madison RubyConf 2013]

cczona

All girl

XY

Fella

Queer

Fembot

Alto

It depends

MYOB

Dangly bits

Chicklet

Innie not outie

Convex

Sideburns

Ambisextrous

Member of the patriarchy

For about 5 seconds. Then they jumped on board. Because they could be creative and silly.

cczona

I speak using the male gender when required by language

50% quintessential tomboy, 50% total girly-girl

AND because they could express this THING about themselves fully. With authentic voice.

cczona

Transgender

Genderqueer

[PAUSE]

cczona

"The freeform gender field is something I cherish about MetaFilter."

-- MeFi user

That text field grew into a beloved institution. Whatever you, as user, choose to put into that field says something revealing about who YOU are.

cczona

"It was one of the earliest indications I'd landed in the right place."

-- MeFi user

That you're allowed to put in ANYTHING -- or put in NOTHING AT ALL -- says something revealing about how MetaFilter envisions community. That schema's trust in users was the foundation for MetaFilter users to ASK to share themselves even more.

cczona

In 2010, Diaspora likewise turned gender into a text field. Just like on MetaFilter, users felt set free by it.

There's an important distinction here, though. MetaFilter is closed source, it's one developer, and it's entirely in English. Diaspora needed to meet a different set of needs. It's open source and it's international.

cczona

Not amused

So some of its developers are not amused. It's fair to object that this approach wreaks chaos on internationalization of pronouns. [BEAT] And here's what I've got to say to that:

cczona

Internationalization is hellish. Period.

Internationalization of pronouns is hellish. Period. You get no safety net in this. Constraining gender options doesn't solve translation of languages. Don't set yourself up for trying to deal with internationalization this way. It will fail you.

cczona

What's your legal gender?

• Indeterminate

• X

• Sex Not Specified

And if you really think that boiling this question down to "What's your legal gender?" is an easy out, let's talk about internationalization of that too. Because this is the year that "indeterminate" has become a legal gender on German birth certificates, and that "X" is a legal gender on Australian passports. In New South Wales, Australia, "Sex Not Specified" is yet another legally-recognized gender. Binary gender is, today, APP FAIL.

cczona

"...the most complicated thing I've ever spent a lot of time learning about. And I've spent a lot of time learning about quantum mechanics."

-- Randall Munroe, xkcd

Randall Munroe, best known as xkcd, has examined the issue of pronouns many times, in depth, for English language projects. He says it's [READ]. So here's what he ultimately concluded:

cczona

Ask

Ask

cczona

Which pronouns do you prefer?

he / him / himself / his / his

she / her / herself / hers / her

they / them / themself / theirs / their

it / it / itself / its / its

by name

Asking straight up, "Which pronouns do you prefer?" is truly the best he could come up with.

cczona

Which pronouns do you prefer?

he / him / himself / his / his

she / her / herself / hers / her

they / them / themself / theirs / their

personal name

other ____________

Or maybe slightly refining, like these.

And I know you may be looking at that 3rd row and protesting.

cczona

Which pronouns do you prefer?

he / him / himself / his / his

she / her / herself / hers / her

they / them / themself / theirs / their

personal name

other ____________

Our English teachers told us "they" and "their" aren't singular, right? Nope. They're plural AND SINGULAR, and have been for 400 years. Take the word of Jane Austen, Shakespeare, Chaucer, Lewis Carroll. Modern English authorities agree on this one too. It's excellent English -- and excellent SOCIAL -- to use it when people ask us to.
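One way the pronoun question above could be carried into code is to store each row of the slide as data and let the user's answer pick the row. This is only a sketch under assumptions: `PronounSet`, `pronouns_for`, and `profile_update` are hypothetical names invented for illustration, and the "other" answer is modeled as a user-supplied set.

```ruby
# Hypothetical sketch: each row of the slide becomes one PronounSet.
# Field order follows the slide: he / him / himself / his / his.
PronounSet = Struct.new(:subject, :object, :reflexive, :possessive, :determiner)

PRONOUN_SETS = {
  "he"   => PronounSet.new("he", "him", "himself", "his", "his"),
  "she"  => PronounSet.new("she", "her", "herself", "hers", "her"),
  "they" => PronounSet.new("they", "them", "themself", "theirs", "their")
}.freeze

# choice: a key of PRONOUN_SETS, or anything else (e.g. "name" or nil),
# which falls back to using the personal name everywhere.
# custom: an optional PronounSet typed in by the user ("other ____").
def pronouns_for(name, choice, custom = nil)
  return custom if custom
  PRONOUN_SETS.fetch(choice) do
    PronounSet.new(name, name, name, "#{name}'s", "#{name}'s")
  end
end

def profile_update(name, choice, custom = nil)
  pronouns = pronouns_for(name, choice, custom)
  "#{name} updated #{pronouns.determiner} profile."
end

puts profile_update("Sam", "they")
puts profile_update("Sam", "name")
```

Falling back to the personal name when nothing is chosen means nobody has to pick a row just to use the app.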

cczona

Even Facebook has taken note that it's okay. They've been using this construction for at least 5 years, and are doing fine :-)

cczona

"Home is the place where, when you have to go there, they have to take you in."

-- Robert Frost

We aspire to make our social apps feel like home. We can do that by making development choices that say, "Hey user, we GET you. Be as individual and messy as you want to. We can handle _you_ being _you_."

Social Research

When collecting data on people, we're in a different realm: social sciences. If you want to run useful analytics about personal attributes and behavior, then data collection needs to meet at least the two minimum criteria for social science research.

Exhaustive: Every possible option

A field's values must include every possible option.

Mutually Exclusive: No overlap exists between them

And the field's options must be mutually exclusive.

How many social apps have both of those bare minimum criteria covered?
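The two criteria can be checked mechanically against sample data. The following is a rough sketch, not a research method: `audit_options` is a hypothetical helper, and the three people are invented sample data, each listed with every label that is true of them at once.

```ruby
# Hypothetical audit of an option list against the two criteria.
# people maps a name to ALL labels that are simultaneously true of that person.
def audit_options(options, people)
  # Not exhaustive: someone has no option that fits them.
  uncovered   = people.select { |_, truths| (truths & options).empty? }.keys
  # Not mutually exclusive: someone fits more than one option.
  overlapping = people.select { |_, truths| (truths & options).size > 1 }.keys
  {
    exhaustive: uncovered.empty?,
    mutually_exclusive: overlapping.empty?,
    uncovered: uncovered,
    overlapping: overlapping
  }
end

# Invented sample data, for illustration only.
PEOPLE = {
  "Ana" => ["Married"],
  "Ben" => ["Married", "Separated"],
  "Dee" => ["Widowed"]
}

SHORT_LIST = ["Single", "In a relationship", "Engaged", "Married"]
LONG_LIST  = SHORT_LIST + ["Widowed", "Separated", "Divorced"]

short_audit = audit_options(SHORT_LIST, PEOPLE)
long_audit  = audit_options(LONG_LIST, PEOPLE)

puts "short list: #{short_audit}"
puts "long list:  #{long_audit}"
```

In this sample the short list fails on exhaustiveness and the longer list fails on mutual exclusivity, so adding labels trades one failure for the other.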

cczona

We've seen this one enough. At least that set of values seems covered fully.

cczona

But it's not.

cczona

Still lacking.

cczona

Trying hard. But still inadequate to express the real world's complexity.

We do realize that the real world has great variety. Of course we do. But it's hard to know how to handle it.

cczona

What user experience does this schema drive us toward?

It's easy to figure, "Eh, we can refactor later." But initial choices set differing user experiences in motion. Thinking at the outset about the real world's variety and complexity starts us asking early questions that set foundation for social app-building. Ask: "What user experience does this schema drive us toward?"

cczona

Data doesn't have to be for analysis

It's easy to get into the habit of structuring data for easy analysis. But we can choose to look at human data from a different perspective. Step back. Wallow in the user's perspective.

cczona

Data can be sheer expressiveness

Data can be sheer expressiveness. Data that has character, individualism, distinctiveness.

cczona

What we want

• Structured

• Predictable

• Relational

• Indexed

• Exhaustive

• Easy analytics

• Data-driven decisions

What we get

• Validations, exceptions

• Conditionals, partials

• Premature optimization

• Cultural variability

• Individual POV

• Moving target

• Decisions based on false premises

As developers, we have a vision of what a good codebase should be, and not be.

[READ]

In the most sensible of ways, we often arrive at solutions that are factually TRUTHY while far removed from real life utility.

cczona

What is your religion, if any?

ARIS is the largest ongoing survey of Americans' religious identification. It asks this simple, OPEN-ENDED question: [slide]

Which nets over 100 unique answers. Which, if you're making a form based on that, is tricky. There's no form element that makes it easy for our users to pick themselves out of a list of so many possibilities.

cczona

ARIS found, though, that these could be compressed into 13 major categories. More manageable list, right? We could use that for a form. But eh, a lot of those are edge cases. We'd rather focus on genuinely major groupings.

cczona

What is your religion, if any?

Christian 76%

Other 4%

None 15%

Don't Know or Refused 5%

Which brings us down to this. At least three quarters of Americans are Christian. Done. Every other religion would just be clutter. Edge cases.

And then there's this sort of crummy data with it. We'd probably assign nil value for more than one of these categories, right?

cczona

What is your religion?

Christian 76%

Other 4%

n/a 20%

Fixed. Which focuses attention on a problem here: 1 in 5 are not useful answers from an advertiser's perspective, or for our own analytics. The NILS -- gotta go.

cczona

What is your religion?

Christian 76%

Other 24%

So what we're left with is a good, clear list. It covers ALL the big stuff. When you get reductive enough, for Americans' religion, it's a binary. Which, from a storage standpoint, is great. Booleans! Score!
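The arithmetic of that reduction is easy to replay. A small sketch, using the percentages from the slides and a hypothetical `collapse` helper, shows each step folding distinct answers into a bigger bucket until only two labels remain.

```ruby
# Percentages as shown on the slides.
SURVEY = {
  "Christian" => 76,
  "Other" => 4,
  "None" => 15,
  "Don't Know or Refused" => 5
}.freeze

# Hypothetical helper: relabel categories per mapping and sum the merged shares.
def collapse(counts, mapping)
  counts.each_with_object(Hash.new(0)) do |(label, pct), out|
    out[mapping.fetch(label, label)] += pct
  end
end

# Step 1: treat "None" and "Don't Know or Refused" as the same nil-ish value.
step1 = collapse(SURVEY, { "None" => "n/a", "Don't Know or Refused" => "n/a" })
# Step 2: the nils "gotta go", so fold them into "Other".
step2 = collapse(step1, { "n/a" => "Other" })

puts step1
puts step2
```

After the second step "Other" stands for 24% of respondents, and nothing in the stored value can say which of the original answers any of them gave.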

cczona

Religion

But we don't do THIS. It covers the biggest categories. But oh, how it leaves people out.

cczona

People aren't edge cases

People aren't edge cases.

And they're pushing back on apps that treat them that way.

[LONG BEAT LONG] [BREATH] [CHANGE GEARS]

So. Reductive has big problems. So does scaling upward.

cczona

Balancing between approaches

As engineers, it's instinctually uncomfortable to DELIBERATELY NOT STRUCTURE DATA for easy analysis. I feel ya. I really do. This freaks me out too. But again, the foundational question is: What benefit will the user notice? What identities are okay with us?

cczona

And if necessary, we CAN strike a middle ground. This is where guided response comes in: Autosuggest.

cczona

Checkbox / Radio / Select

Textarea / Text

Required / Optional

Autosuggest

Minimal Suggest

When there's a subset of values that you're most interested in, do minimal suggest instead. Autosuggest using just the handful of values you care about.

Structured data from those who want to give it. Free form to incite expressiveness in those who want THAT.
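A minimal-suggest field might look roughly like this on the server side. Everything named here is an assumption for illustration: the `SUGGESTED` values, and the `suggestions` and `accept` helpers. The point is that suggestions guide without constraining, so free text and blanks are both accepted.

```ruby
# Hypothetical handful of values the app cares most about.
SUGGESTED = ["Single", "In a relationship", "Married"].freeze

# Offer only suggested values that start with what the user has typed so far.
def suggestions(typed, values = SUGGESTED)
  prefix = typed.to_s.strip.downcase
  return [] if prefix.empty?
  values.select { |v| v.downcase.start_with?(prefix) }
end

# Accept whatever the user submits. Blank means "no answer" (the field is
# optional); a case-insensitive match is stored in its canonical spelling.
def accept(typed)
  text = typed.to_s.strip
  return nil if text.empty?
  match = SUGGESTED.find { |v| v.casecmp?(text) }
  { value: match || text, structured: !match.nil? }
end

p suggestions("ma")
p accept("married")
p accept("Happily entangled")
```

The `structured` flag keeps the easy-to-crunch answers easy to find without rejecting anyone else's.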

Unguided Text

Of those who use MetaFilter's gender field, 40% of responses are: f, m, female, male

[read] So structured data IS there. This can be a balanced solution in many cases where you're willing to tolerate some ambiguity. Of course there are tradeoffs. Data quantity is lower. Freed to opt out of providing personal info, many do. On the other hand, data quality should improve.
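Finding the structured share inside an unguided text field takes very little code. A sketch, assuming responses arrive as raw strings (possibly nil or blank) and using a hypothetical `summarize` helper; the sample responses are invented, borrowing a few entries from the earlier slide.

```ruby
# The conventional answers named on the slide, mapped to one spelling each.
STRUCTURED = {
  "f" => "female", "female" => "female",
  "m" => "male",   "male"   => "male"
}.freeze

# Hypothetical helper: how much of the free text is already structured?
def summarize(responses)
  answered = responses.map { |r| r.to_s.strip }.reject(&:empty?)
  freeform = answered.reject { |r| STRUCTURED.key?(r.downcase) }
  structured_count = answered.size - freeform.size
  share = answered.empty? ? 0.0 : structured_count.fdiv(answered.size)
  { answered: answered.size, structured_share: share, freeform: freeform }
end

# Invented sample responses, including people who opted out.
sample = ["F", "male", "Fembot", nil, "It depends", "", "MYOB"]
summary = summarize(sample)
puts summary
```

Blank and nil responses are dropped before the share is computed, so opting out lowers data quantity without distorting the proportion.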

Optional Select

60% of Facebook users select a relationship status

It's fine to mix and match here. Find the right approach for your users, and your app's business objectives. Facebook, for instance, makes relationship status completely optional, but coercive for those who do opt in to setting a value. Most users do opt in: 60% of them select a relationship status.

cczona

We want everyone to feel excited about what we've built

The bottom line is that [read]

cczona

We want users to feel passionate about their involvement

[read]

cczona

Analytics, investments & monetization are based on a premise that data is accurate

[read]

BUT if the data has been collected by coercive approaches, the RISK is that--

cczona

this premise is garbage. Some people are lying, because lying has been made a requirement for getting past the barriers. So conclusions drawn from that bad data can misdirect decision-making about the next stage of development.

cczona

Checkbox / Radio / Select

Textarea / Text

Required

Corrective

Discretionary

Guided

The restrictive options, that stuff at bottom left: those don't actually have to be marked required.

cczona

ADD COLUMN gender VARCHAR(6) NOT NULL DEFAULT 'female'

But the way we set up schema often embeds assumptions that we should, and we will. So we do. An attribute that is ambiguously named is destined to become a form field whose data wanders away from its original intent, and code that misunderstands how to use it. A field that's not allowed to be null is destined to be mandatory. A field that is assigned a maximum length is an ASSERTION that all possible values are knowable and fit within it.

cczona

ADD COLUMN gender_identity VARCHAR NOT NULL

BOOM. This is foundation for a whole different user experience. And the cool ninja move was that we decided to DO LESS. Make this stuff flexible upfront. Optimize storage later. Decide what's valid later.

cczona

Checkbox / Radio / Select

Textarea / Text

Required / Optional

Alteration

Guided

So there you have it: a discretionary field. Whether to respond is left up to the user.

cczona

t.string :relationship_status, :null => true

As developers, we may look upon THIS expression as redundant, completely unnecessary. Duh, null is true by default. But making that explicit is a communication to the team, and to your future self. It's a statement of intent. It's documenting a product decision.
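To make concrete what each column definition asserts, here is a plain-Ruby stand-in, not ActiveRecord and not any database's actual behavior. `Column` and its `store` method are hypothetical, and the sketch assumes an over-long value is rejected, whereas a real database might truncate it instead.

```ruby
# Hypothetical stand-in for a column definition and what it does to input.
Column = Struct.new(:name, :null, :limit, :default) do
  # Returns the value that would be stored, or raises if the schema rejects it.
  def store(input)
    value = input.nil? ? default : input
    raise ArgumentError, "#{name} cannot be null" if value.nil? && !null
    if limit && value && value.length > limit
      raise ArgumentError, "#{name} is longer than #{limit} characters"
    end
    value
  end
end

# gender VARCHAR(6) NOT NULL DEFAULT 'female'
restrictive = Column.new("gender", false, 6, "female")
# t.string :relationship_status, :null => true (no limit or default assumed)
discretionary = Column.new("relationship_status", true, nil, nil)

p restrictive.store(nil)               # silence is recorded as an answer
p discretionary.store(nil)             # silence stays silence
p discretionary.store("It depends")    # any answer fits
```

In the restrictive column, a user who says nothing is stored as "female", which is the kind of required lie the talk warns about.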

cczona

Facebook

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

What would a canonical set of relationship statuses look like? Three years ago, Facebook figured this list was pretty good. Arguably pretty progressive too, right? Users disagreed. Strongly.

cczona

Facebook

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

Under pressure, Facebook nearly doubled the options in just two years' time.

cczona

Google+

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I don't want to say

While Google+ has largely adopted that list, it has not included Separated or Divorced. Notice that they also added something choice: Opt out of labeling.

cczona

Allowing users to identify their relationships with labels of greater personal significance -- that's being driven by people rejecting a user experience that isn't working for them.

cczona

It's a scope

How did some statuses seem universal, while others weren't? Naming a thing creates scope. The assumed validity of a field's values gets constrained as soon as the field is named.

cczona

marital_status

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I don't want to say

Marital status, for instance, might lead to a list similar to this one. A person is assumed to be either unmarried, preparing to be married, currently married, or formerly married.

cczona

legal_marital_status

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I don't want to say

Marital status, for instance, might lead to a list similar to this one. A person is assumed to be either unmarried, preparing to be married, currently married, or formerly married.

cczona

relationship_status

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I don't want to say

Whereas relationship status might lead to a list more like this one. In which one either has a current relationship -- or is defined by the absence of any.

cczona

singleness_status

As soon as you name a field, you define its paradigm and possibilities. There are important differences in what each of these envisions and is able to measure. Naming fields -- with great specificity, upfront -- makes analyses more powerful later.

cczona

singleness_rating

No chance whatsoever Suuuuuper duper single

1000

If you change the name, you shift the paradigm and possibilities. There's an important difference in what these set out to measure. Naming fields -- with great specificity, upfront -- makes analyses more powerful later.

cczona

It's Complicated

In a Relationship

Married

Divorced

Widowed

Single

We go through life experiencing many relationships. They don't all have pre-determined labels. And new relationship identities don't necessarily leave old ones behind.

cczona

"I like to be truthful, and 'It's Complicated' is really deceiving. It is not complicated. I am separated from my husband, who I am still legally married to."

-- Facebook user

[READ] Why should anyone have to feel like they're required to disavow a relationship that's meaningful for them?

It cuts deeply sometimes. Is a widow only "Widowed" until she resumes dating? Why does she have to be confronted with a decision like that just to use someone's app?

cczona

Okay, this one. Spot the fatal flaw. [WAIT]

cczona

buffer overflow

An open relationship is definitionally a 1:Many join -- WITH the usual engineering understanding, that is, that the actual number of relationships may be 0, 1, OR many.

cczona

Choose (one): evasive, inauthentic

But it's not funny from the personal perspective. This schema forces the person to either LOOK EVASIVE or else BE INAUTHENTIC. This is what happens when we try to throw more labels at the problem instead of re-examining the schema's assumptions.
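Re-examining the assumption, rather than adding labels, could mean modeling relationships as a collection instead of a single status field. A sketch with hypothetical `Relationship` and `Profile` types and invented sample names; the labels are free text here, which is an assumption of the sketch rather than something the talk prescribes.

```ruby
# Hypothetical model: a profile has zero, one, or many relationships,
# each with its own label, instead of exactly one status value.
Relationship = Struct.new(:label, :with)

class Profile
  attr_reader :relationships

  def initialize
    @relationships = []
  end

  # Adding a relationship never replaces or disavows an earlier one.
  def add(label, with: nil)
    @relationships << Relationship.new(label, with)
    self
  end

  def labels
    @relationships.map(&:label)
  end
end

nobody_listed = Profile.new
widow = Profile.new.add("Widowed", with: "Alex").add("In a relationship", with: "Sam")
open  = Profile.new.add("Married", with: "Jo").add("Partner", with: "Kit")

p nobody_listed.labels
p widow.labels
p open.labels
```

With a collection, nobody has to choose between "Widowed" and "In a relationship", and an empty list is a valid answer too.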

cczona

Facebook relationship status by user age

srsly

When we don't give people means to express what's real and important for them, they hack around us. Are a quarter of Facebook's 13 year olds really married? Probably not. What they probably have is a BEST FRIEND, and "married" is the nearest equivalent available for them to express "THIS PERSON is my most important relationship".

The dogged pursuit of nice, clean, CRUNCHABLE data takes us to such wrong places. And makes our users want to work against a system that doesn't acknowledge their lives' reality.

cczona

We build or break community with each line of code

As developers, our lines of code build -- or break -- an app's community. So it's worth making conscious choices as we go.

cczona

Modeling the real world is complex

[READ] And that's okay.

cczona

Assuming we know who users are surrenders opportunity to learn who they are

But [read]. Early constraints in schema NET crappy, misleading data. [beat] So keep constraints out of human and social schema, at least at first. Gather enough initial response to do some data mining. Watch that data for a while. Keep an eye out for emergent trends. What DISCOVERIES can you make? How can they shape the app's growth and evolution?

cczona

Quality

Data quality improves when lies are merely optional, not required.

cczona

Specificity

Data becomes like the real world itself: rich and specific. So we can unearth the patterns that are undetectable when data is generic.

cczona

Adaptability

We get to make discoveries and re-envision possibilities.

We adapt quickly.

cczona

Loyalty

And their response to all this? Engagement. Passion. LOYALTY.

cczona

THOSE are foundations for great user experiences. [LOOK UP FINALITY]

@cczona

cczona@gmail.com

http://cczona.com

Questions? Conversation?

Oh Hell Yeah

Carina C. Zona

cczona

Resources

cczona

Data Science & Information Architecture

• Sociological normalization

• Database normalization

• Using Machine Learning On Social Networks To Figure Out What You Should Read On The Web

• NoSQL Data Modeling Techniques

• Data & Reality, by William Kent

• Bad Data Handbook, Q. Ethan McCallum

• Data Science of the Facebook World

cczona

Relationships

• Does Facebook Hurt Relationships?

• Facebook Adds LGBT-Friendly Relationship Status Options

• Facebook Targeting by Relationship Status & Workplace

• Your Facebook Relationship Status: It's Complicated

• Gay Marriage: The Database Engineering Perspective

cczona

Sex & Genders

• "Disalienation: Why Gender is a Text Field on Diaspora"

• "Gender & Drop Down Menus"

• "Sex & Gender"

• "Bucket Gender"

• Recommendations for Inclusive Data Collection of Trans People

cczona

Names

• Falsehoods Programmers Believe About Names

• Your Last Name Contains Invalid Characters

• W3C Internationalization: Personal Names Around the World

• Spanish Names

• Chinese Names

cczona

More

• Redesigning the Country Selector

• American Religious Identification Survey, Summary Report 2009

• Linguistic Potluck: Crowdsourcing Internationalization in Rails

cczona

Image Credits

Postsecret

Facebook

OKCupid

Google+

MetaFilter

Diaspora

Flickr

FetLife

Kotangle

cutestpaw.com

hdwallpapers.in

xkcd

cczona

Many Thanks

Chiu-Ki Chan

Estelle Weyl

Heather Rivers

Heroku

Jeremy Dunck

Josh Susser

Michele Titolo

NIRD LLC

Renée De Voursney

Sarah Mei

San Francisco Sex Information

Yoz Grahame

cczona

Get in Touch

Carina C. Zona

@cczona

http://cczona.com

cczona@gmail.com

http://slideshare.net/cczona

http://linkedin.com/in/cczona

Page 38: Schemas for the Real World [Madison RubyConf 2013]

cczona

I speak using the male gender when required by language 50 quintessential

tomboy 50 total girly-girl

AND because they could express this THING about themselves fully With authentic voice

cczona

TransgenderGenderqueer

[PAUSE]

cczona

The freeform gender field is something I cherish about Metafilter

mdashMeFi user

That text field grew into a beloved institution Whatever you as user choose to put into that field says something revealing about who YOU are

cczona

It was one of the earliest indications Id landed in the right place

mdashMeFi user

That youre allowed to put in ANYTHING -- or put in NOTHING AT ALL -- says something revealing about what how MetaFilter envisions community That schemas trust in users was the foundation for Metafilter users to ASK to share themselves even more

cczona

In 2010 Diaspora likewise turned gender into a text field Just like on MetaFilter users felt set free by it

Theres an important distinction here though Metafilter is closed source its one developer and its entirely in English Diaspora needed to meet a different set of needs Its open source and its international

cczona

Not amused

So some of its developers are not amused Its fair to object that this approach wreaks chaos on internationalization of pronouns [BEAT] And heres what Ive got to say to that

cczona

Internationalization is hellish period

Internationalization of pronouns is hellish period You get no safety net in this Constraining gender options doesnt solve translation of languages Dont set yourself up for trying to deal with internationalization this way It will fail you

cczona

Whats your legal gender

bull Indeterminate

bull X

bull Sex Not Specified

And if you really thing that boiling this question down to Whats your legal gender is an easy out lets talk about internationaliztion of that too Because this is the year that indeterminate has become a legal gender on German birth certificates and that X is a legal gender on Australian passports In New South Wales Australia Sex Not Specified is yet another legally-recognized gender Binary gender is today APP FAIL

cczona

the most complicated thing Irsquove ever spent a lot of time learning about

And Irsquove spent a lot of time learning about quantum mechanics

mdashRandall Munroe xkcd

Randall Munroe best known as xkcd has examined the issue of pronouns many times in depth for English language projects He says its [READ] So hes what he ultimately concluded

cczona

Ask

Ask

cczona

Which pronouns do you prefer

hehimhimselfhishis

sheherherselfhersher

theythemthemselftheirstheir

itititselfitsits

by name

Asking straight up Which Pronouns Do You Prefer is truly the best he could come up with

cczona

Which pronouns do you prefer

hehimhimselfhishis

sheherherselfhersher

theythemthemselftheirstheir

personal name

other ____________

Or maybe slightly refining like these

And I know you may be looking at that 3rd row and protesting

cczona

Which pronouns do you prefer

hehimhimselfhishis

sheherherselfhersher

theythemthemselftheirstheir

personal name

other ____________

Our english teachers told us they and their arent singular right Nope Theyre plural AND SINGULAR and have been for 400 years Take the word of Jane Austen Shakespeare Chaucer Lewis Carroll Modern english authorities agree on this one too Its excellent English -- and excellent SOCIAL -- to use it when people ask us to

cczona

Even Facebook has taken note that its okay Theyve been using this construction for at least 5 years and are doing fine -)

cczona

Home is the place where when you have to go there they have to take you in

mdashRobert Frost

We aspire to make our social apps feel like home We can do that by making development choices that say Hey user we GET you Be as individual and messy as you want to We can handle _you_ being _you_

Social Research

When collecting data on people were in a different realm Social sciences If you want to run useful analytics about personal attributes and behavior then data collection needs to meet at least the two minimum criteria for social science research

ExhaustiveEvery possible option

A fields values must include every possible option

Mutually ExclusiveNo overlap exists between them

And the fields options must be mutually exclusive

How many social apps have both of those bare minimum criteria covered

cczona

Weve seen this one enough At least that set of values seems covered fully

cczona

`

But its not

cczona

Still lacking

cczona

Trying hard But still inadequate to express the real worlds complexity

We do realize that the real world has great variety Of course we do But its hard to know how to handle it

cczona

What user experience does this schema drive us toward

Its easy to figure Eh we can refactor later But initial choices set differing user experiences in motion Thinking at the outset about the real worlds variety and complexity starts us asking early questions that set foundation for social app-building Ask What user experience does this schema drive us toward

cczona

Data doesnrsquot have to be for analysis

Its easy to get into the habit of structuring data for easy analysis But we can choose to look at human data from a different perspective Step back Wallow in the users perspective

cczona

Data can be sheerexpressiveness

Data can be sheer expressiveness Data that has character individualism distinctiveness

cczona

What we want What we get

bull Structured

bull Predictablebull Validations exceptions

bull Conditionals partials

bull Relational

bull Indexed bull Premature optimization

bull Exhaustivebull Cultural variability

bull Individual POV

bull Moving target

bull Easy analytics

bull Data-driven decisionsbull Decisions based on false

premises

As developers we have a vision of what a good codebase should be and not be

[READ]

In the most sensible of ways we often arrive at solutions that are factually TRUTHY while far removed from real life utility

cczona

What is your religion if any

ARIS is the largest ongoing survey of Americans religious identification It asks this simple OPEN-ENDED question [slide]

Which nets over 100 unique answers Which if youre making a form based on that is tricky Theres no form element that makes it easy for our users to pick themselves out of a list of so many possibilities

cczona

ARIS found though that these could be compressed into 13 major categories More manageable list right We could use that for a form But eh a lot of those are edge cases Wed rather want to focus on genuinely major groupings

cczona

What is your religion if any

Christian 76

Other 4

None 15

Dont Know or Refused 5

Which brings us down to this At least a quarter of Americans are Christian Done Every other religion would just be clutter Edge cases

And then theres this sort of crummy data with it Wed probably assign nil value for more than one of these categories right

cczona

What is your religion

Christian 76

Other 4

na 20

Fixed Which focuses attention on a problem here 1 in 5 are not useful answers from an advertisers perspective or for out own analytics The NILS -- gotta go

cczona

What is your religion

Christian 76

Other 24

So what were left with is a good clear list It covers ALL the big stuff When you get reductive enough for Americans religion it a binary Which from a storage standpoint is great Booleans Score

cczona

Religion

But we dont do THIS It covers the biggest categories But oh how it leaves people out

cczona

People arent edge cases

People arent edge cases

And theyre pushing back on apps that treat them that way

[LONG BEAT LONG] [BREATH] [CHANGE GEARS]

So Reductive has big problems So does scaling upward

cczona

Balancing between approaches

As engineers, it's instinctually uncomfortable to DELIBERATELY NOT STRUCTURE DATA for easy analysis. I feel ya. I really do. This freaks me out too. But again, the foundational question is: What benefit will the user notice? What identities are okay with us?

And if necessary, we CAN strike a middle ground. This is where guided response comes in. Autosuggest.

[Diagram: Checkbox / Radio / Select and Textarea / Text, against Required and Optional. Labels: Autosuggest, Minimal Suggest]

When there's a subset of values that you're most interested in, do minimal suggest instead. Autosuggest using just the handful of values you care about.

Structured data from those who want to give it. Free form to incite expressiveness in those who want THAT.
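A minimal sketch of what minimal suggest could look like on the server side, assuming a hypothetical handful of values (`SUGGESTED_VALUES`) and two illustrative helpers that are not from the talk. The suggestion is only a hint; whatever the user submits is kept as typed, and a blank answer is allowed.

```ruby
# Hypothetical handful of values the app most wants as structured data.
SUGGESTED_VALUES = ["female", "male", "genderqueer", "transgender"].freeze

# Return the suggestions that start with what the user has typed so far.
# Matching is case-insensitive; blank input suggests nothing.
def suggestions_for(typed, candidates = SUGGESTED_VALUES)
  prefix = typed.to_s.strip.downcase
  return [] if prefix.empty?
  candidates.select { |value| value.start_with?(prefix) }
end

# Accept whatever the user finally submits. Free text is stored as typed
# (whitespace trimmed), and a blank answer is stored as nil.
def accept_response(submitted)
  text = submitted.to_s.strip
  text.empty? ? nil : text
end
```

Nothing here rejects input: the list shapes what is easy to pick, not what is allowed.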

Unguided Text

Of those who use MetaFilter's gender field, 40% of responses are f, m, female, male

[read] So structured data IS there. This can be a balanced solution in many cases where you're willing to tolerate some ambiguity. Of course there are tradeoffs. Data quantity is lower. Freed to opt out of providing personal info, many do. On the other hand, data quality should improve.
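One way to pull that structured subset out of a free-text field at analysis time, sketched under assumptions: the `CANONICAL` mapping and the `summarize` helper are hypothetical, and the raw answers stay untouched in storage.

```ruby
# Hypothetical mapping from common spellings to a canonical bucket.
CANONICAL = {
  "f" => "female", "female" => "female",
  "m" => "male",   "male"   => "male"
}.freeze

# Count responses per canonical bucket. Anything unrecognized is kept
# verbatim under :other rather than rejected; blank answers are skipped.
def summarize(responses)
  summary = { buckets: Hash.new(0), other: [] }
  responses.each do |raw|
    text = raw.to_s.strip
    next if text.empty?
    bucket = CANONICAL[text.downcase]
    if bucket
      summary[:buckets][bucket] += 1
    else
      summary[:other] << text
    end
  end
  summary
end
```

The `:other` list is where the interesting reading is; the buckets are just the part that was already crunchable.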

Optional Select

60% of Facebook users select a relationship status

It's fine to mix and match here. Find the right approach for your users and your app's business objectives. Facebook, for instance, makes relationship status completely optional, but coercive for those who do opt in to setting a value. Most users do opt in: 60% of them select a relationship status.

We want everyone to feel excited about what we've built

The bottom line is that [read]

We want users to feel passionate about their involvement

[read]

Analytics investments & monetization are based on a premise that data is accurate

[read]

BUT if the data has been collected by coercive approaches, the RISK is that--

this premise is garbage. Some people are lying, because lying has been made a requirement for getting past the barriers. So conclusions drawn from that bad data can misdirect decision-making about the next stage of development.

[Diagram: Checkbox / Radio / Select and Textarea / Text, against Required. Labels: Corrective, Discretionary, Guided]

The restrictive options, that stuff at bottom left: those don't actually have to be marked required.

ADD COLUMN gender VARCHAR(6) NOT NULL DEFAULT 'female'

But the way we set up schema often embeds assumptions that we should, and we will. So we do. An attribute that is ambiguously named is destined to become a form field whose data wanders away from its original intent, and code that misunderstands how to use it. A field that's not allowed to be null is destined to be mandatory. A field that is assigned a maximum length is an ASSERTION that all possible values are knowable and fit within it.
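To make the maximum-length point concrete, here is a small sketch. The sample answers and the `too_long_for` helper are hypothetical; the limit of 6 is the one from the column definition above.

```ruby
# Hypothetical answers a user might want to give.
SAMPLE_ANSWERS = ["female", "male", "genderqueer", "transgender"].freeze

# Return the answers that a column limited to `max_length` characters
# could not store in full.
def too_long_for(answers, max_length)
  answers.select { |answer| answer.length > max_length }
end

too_long_for(SAMPLE_ANSWERS, 6)  # => ["genderqueer", "transgender"]
```

A six-character limit fits exactly the values the schema's author had in mind, and nothing else.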

ADD COLUMN gender_identity VARCHAR

BOOM. This is the foundation for a whole different user experience. And the cool ninja move was that we decided to DO LESS. Make this stuff flexible upfront. Optimize storage later. Decide what's valid later.

[Diagram: Checkbox / Radio / Select and Textarea / Text, against Required and Optional. Labels: Alteration, Guided]

So there you have it: a discretionary field. Whether to respond is left up to the user.

t.string :relationship_status, :null => true

As developers, we may look upon THIS expression as redundant, completely unnecessary. Duh, null is true by default. But making that explicit is a communication to the team and to your future self. It's a statement of intent. It's documenting a product decision.

Facebook

Single
In a relationship
Engaged
Married
It's complicated
Open relationship

What would a canonical set of relationship statuses look like? Three years ago, Facebook figured this list was pretty good. Arguably pretty progressive too, right? Users disagreed. Strongly.

Facebook

Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Separated
Divorced
Civil union
Domestic partnership

Under pressure, Facebook nearly doubled the options in just two years' time.

Google+

Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Civil union
Domestic partnership
I don't want to say

While Google+ has largely adopted that list, it has not included Separated or Divorced. Notice that they also added something choice: opting out of labeling.

Allowing users to identify their relationships with labels of greater personal significance: that's being driven by people rejecting a user experience that isn't working for them.

It's a scope

How did some statuses seem universal, while others weren't? Naming a thing creates scope. The assumed validity of a field's values gets constrained as soon as the field is named.

marital_status

Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Separated
Divorced
Civil union
Domestic partnership
I don't want to say

Marital status, for instance, might lead to a list similar to this one. A person is assumed to be either unmarried, preparing to be married, currently married, or formerly married.

legal_marital_status

Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Separated
Divorced
Civil union
Domestic partnership
I don't want to say

Marital status, for instance, might lead to a list similar to this one. A person is assumed to be either unmarried, preparing to be married, currently married, or formerly married.

relationship_status

Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Separated
Divorced
Civil union
Domestic partnership
I don't want to say

Whereas relationship status might lead to a list more like this one. In which one either has a current relationship -- or is defined by the absence of any.

singleness_status

As soon as you name a field, you define its paradigm and possibilities. There are important differences in what each of these envisions and is able to measure. Naming fields -- with great specificity upfront -- makes analyses more powerful later.

singleness_rating

No chance whatsoever / Suuuuuper duper single

1000

If you change the name, you shift the paradigm and possibilities. There's an important difference in what these set out to measure. Naming fields -- with great specificity upfront -- makes analyses more powerful later.

[Slide labels: It's Complicated, In a Relationship, Married, Divorced, Widowed, Single]

We go through life experiencing many relationships. They don't all have pre-determined labels. And new relationship identities don't necessarily leave old ones behind.

"I like to be truthful, and 'It's Complicated' is really deceiving. It is not complicated. I am separated from my husband, who I am still legally married to."

-- Facebook user

[READ] Why should anyone have to feel like they're required to disavow a relationship that's meaningful for them?

It cuts deeply sometimes. Is a widow only "Widowed" until she resumes dating? Why does she have to be confronted with a decision like that just to use someone's app?

Okay, this one. Spot the fatal flaw. [WAIT]

buffer overflow

An open relationship is definitionally a 1:Many join -- WITH the usual engineering understanding, that is, that the actual number of relationships may be 0, 1, OR many.

Choose (one): evasive, inauthentic

But it's not funny from the personal perspective. This schema forces the person to either LOOK EVASIVE or else BE INAUTHENTIC. This is what happens when we try to throw more labels at the problem instead of re-examining the schema's assumptions.
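Re-examining the assumption might start with dropping "exactly one status per person". A minimal sketch in plain Ruby, assuming hypothetical `Person` and `Relationship` classes (not from the talk): a person holds zero, one, or many relationships, each labeled in the user's own words.

```ruby
# A relationship the user describes in their own words. `label` is free text;
# `with` is optional because not every relationship names another person.
Relationship = Struct.new(:label, :with)

class Person
  attr_reader :name, :relationships

  def initialize(name)
    @name = name
    @relationships = []  # zero, one, or many
  end

  # Adding a relationship never replaces an earlier one.
  def add_relationship(label, with = nil)
    @relationships << Relationship.new(label, with)
    self
  end

  def relationship_labels
    @relationships.map(&:label)
  end
end

person = Person.new("Sam")
person.add_relationship("widowed").add_relationship("dating", "Alex")
person.relationship_labels  # => ["widowed", "dating"]
```

With this shape, starting to date again adds a row instead of forcing a choice about which relationship still counts.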

Facebook relationship status by user age

srsly?

When we don't give people means to express what's real and important for them, they hack around us. Are a quarter of Facebook's 13 year olds really married? Probably not. What they probably have is a BEST FRIEND, and "married" is the nearest equivalent available for them to express "THIS PERSON is my most important relationship."

The dogged pursuit of nice, clean, CRUNCHABLE data takes us to such wrong places. And makes our users want to work against a system that doesn't acknowledge their lives' reality.

We build or break community with each line of code

As developers, our lines of code build -- or break -- an app's community. So it's worth making conscious choices as we go.

Modeling the real world is complex

[READ] And that's okay.

Assuming we know who users are surrenders opportunity to learn who they are

But [read] Early constraints in schema NET crappy, misleading data. [beat] So keep constraints out of human and social schema, at least at first. Gather enough initial response to do some data mining. Watch that data for a while. Keep an eye out for emergent trends. What DISCOVERIES can you make? How can they shape the app's growth and evolution?

Quality

Data quality improves when lies are merely optional, not required.

Specificity

Data becomes like the real world itself: rich and specific. So we can unearth the patterns that are undetectable when data is generic.

Adaptability

We get to make discoveries, and re-envision possibilities.

We adapt quickly.

Loyalty

And their response to all this? Engagement. Passion. LOYALTY.

THOSE are foundations for great user experiences. [LOOK UP FINALITY]

@cczona
cczona@gmail.com
http://cczona.com

Questions? Conversation?

Oh Hell Yeah

Carina C. Zona

Resources

Data Science & Information Architecture

• Sociological normalization
• Database normalization
• Using Machine Learning On Social Networks To Figure Out What You Should Read On The Web
• NoSQL Data Modeling Techniques
• Data & Reality, by William Kent
• Bad Data Handbook, Q. Ethan McCallum
• Data Science of the Facebook World

Relationships

• Does Facebook Hurt Relationships?
• Facebook Adds LGBT-Friendly Relationship Status Options
• Facebook Targeting by Relationship Status & Workplace
• Your Facebook Relationship Status: It's Complicated
• Gay Marriage: The Database Engineering Perspective

Sex & Genders

• "Disalienation: Why Gender is a Text Field on Diaspora"
• "Gender & Drop Down Menus"
• "Sex & Gender"
• "Bucket Gender"
• Recommendations for Inclusive Data Collection of Trans People

Names

• Falsehoods Programmers Believe About Names
• Your Last Name Contains Invalid Characters
• W3C Internationalization: Personal Names Around the World
• Spanish Names
• Chinese Names

More

• Redesigning the Country Selector
• American Religious Identification Survey Summary Report 2009
• Linguistic Potluck: Crowdsourcing Internationalization in Rails

Image Credits

Postsecret
Facebook
OKCupid
Google+
Metafilter
Diaspora
Flickr
FetLife
Kotangle
cutestpaw.com
hdwallpapers.in
xkcd

Many Thanks

Chiu-Ki Chan
Estelle Weyl
Heather Rivers
Heroku
Jeremy Dunck
Josh Susser
Michele Titolo
NIRD LLC
Renée De Voursney
Sarah Mei
San Francisco Sex Information
Yoz Grahame

Get in Touch

Carina C. Zona
@cczona
http://cczona.com
cczona@gmail.com
http://slideshare.net/cczona
http://linkedin.com/in/cczona


Transgender

Genderqueer

[PAUSE]

"The freeform gender field is something I cherish about Metafilter."

-- MeFi user

That text field grew into a beloved institution. Whatever you, as user, choose to put into that field says something revealing about who YOU are.

"It was one of the earliest indications I'd landed in the right place."

-- MeFi user

That you're allowed to put in ANYTHING -- or put in NOTHING AT ALL -- says something revealing about how MetaFilter envisions community. That schema's trust in users was the foundation for MetaFilter users to ASK to share themselves even more.

In 2010, Diaspora likewise turned gender into a text field. Just like on MetaFilter, users felt set free by it.

There's an important distinction here, though. MetaFilter is closed source, it's one developer, and it's entirely in English. Diaspora needed to meet a different set of needs. It's open source, and it's international.

Not amused

So some of its developers are not amused. It's fair to object that this approach wreaks chaos on internationalization of pronouns. [BEAT] And here's what I've got to say to that:

Internationalization is hellish. Period.

Internationalization of pronouns is hellish. Period. You get no safety net in this. Constraining gender options doesn't solve translation of languages. Don't set yourself up for trying to deal with internationalization this way. It will fail you.

What's your legal gender?

• Indeterminate
• X
• Sex Not Specified

And if you really think that boiling this question down to "What's your legal gender?" is an easy out, let's talk about internationalization of that too. Because this is the year that "indeterminate" has become a legal gender on German birth certificates, and that "X" is a legal gender on Australian passports. In New South Wales, Australia, "Sex Not Specified" is yet another legally-recognized gender. Binary gender is, today, APP FAIL.

"the most complicated thing I've ever spent a lot of time learning about. And I've spent a lot of time learning about quantum mechanics."

-- Randall Munroe, xkcd

Randall Munroe, best known as xkcd, has examined the issue of pronouns many times, in depth, for English language projects. He says it's [READ]. So here's what he ultimately concluded:

Ask

Ask.

Which pronouns do you prefer?

he / him / himself / his / his
she / her / herself / hers / her
they / them / themself / theirs / their
it / it / itself / its / its
by name

Asking straight up, "Which pronouns do you prefer?", is truly the best he could come up with.

Which pronouns do you prefer?

he / him / himself / his / his
she / her / herself / hers / her
they / them / themself / theirs / their
personal name
other ____________

Or maybe slightly refining, like these.

And I know you may be looking at that 3rd row and protesting.

Which pronouns do you prefer?

he / him / himself / his / his
she / her / herself / hers / her
they / them / themself / theirs / their
personal name
other ____________

Our English teachers told us "they" and "their" aren't singular, right? Nope. They're plural AND SINGULAR, and have been for 400 years. Take the word of Jane Austen, Shakespeare, Chaucer, Lewis Carroll. Modern English authorities agree on this one too. It's excellent English -- and excellent SOCIAL -- to use it when people ask us to.
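Here is one way the answer to that question could be carried through to copy, as a sketch only. `PronounSet`, `PRESETS`, `pronouns_for`, and `photo_notice` are hypothetical names; the five forms follow the order on the slide, and the "other" entry is whatever the user typed.

```ruby
# The five forms shown on the slide, in order: subject, object, reflexive,
# possessive pronoun, possessive determiner.
PronounSet = Struct.new(:subject, :object, :reflexive, :possessive, :determiner)

PRESETS = {
  "he"   => PronounSet.new("he", "him", "himself", "his", "his"),
  "she"  => PronounSet.new("she", "her", "herself", "hers", "her"),
  "they" => PronounSet.new("they", "them", "themself", "theirs", "their")
}.freeze

# Look up a preset, or build a set from the five forms typed into "other".
# Returns nil when nothing was chosen, so callers fall back to the name.
def pronouns_for(choice, custom_forms = nil)
  return PronounSet.new(*custom_forms) if choice == "other" && custom_forms
  PRESETS[choice]
end

# Hypothetical notification text. With no pronouns, repeat the personal name.
def photo_notice(name, pronouns)
  owner = pronouns ? pronouns.determiner : "#{name}'s"
  "#{name} updated #{owner} profile photo."
end

photo_notice("Sam", pronouns_for("they"))  # => "Sam updated their profile photo."
photo_notice("Sam", nil)                   # => "Sam updated Sam's profile photo."
```

This only covers English. It does nothing for the internationalization problem described earlier.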

Even Facebook has taken note that it's okay. They've been using this construction for at least 5 years, and are doing fine :-)

"Home is the place where, when you have to go there, they have to take you in."

-- Robert Frost

We aspire to make our social apps feel like home. We can do that by making development choices that say, "Hey user, we GET you. Be as individual and messy as you want to. We can handle _you_ being _you_."

Social Research

When collecting data on people, we're in a different realm: social sciences. If you want to run useful analytics about personal attributes and behavior, then data collection needs to meet at least the two minimum criteria for social science research.

Exhaustive: Every possible option

A field's values must include every possible option.

Mutually Exclusive: No overlap exists between them

And the field's options must be mutually exclusive.

How many social apps have both of those bare minimum criteria covered?
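The two criteria can be checked against a sample of real people. A rough sketch, assuming each respondent is represented by the list of offered options that truthfully apply to them (the sample data and both helper names are hypothetical):

```ruby
# Exhaustive: every respondent finds at least one option that applies.
def exhaustive?(applicable_options_per_person)
  applicable_options_per_person.all? { |options| options.size >= 1 }
end

# Mutually exclusive: no respondent finds more than one option that applies.
def mutually_exclusive?(applicable_options_per_person)
  applicable_options_per_person.all? { |options| options.size <= 1 }
end

sample = [
  ["Married"],
  ["Married", "Separated"],  # still legally married, and separated
  []                         # nothing offered fits
]

exhaustive?(sample)          # => false
mutually_exclusive?(sample)  # => false
```

A select list passes only if every person in the sample lands on exactly one option, which is a much higher bar than "the list looks complete to us".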

We've seen this one enough. At least that set of values seems covered fully.

But it's not.

Still lacking.

Trying hard. But still inadequate to express the real world's complexity.

We do realize that the real world has great variety. Of course we do. But it's hard to know how to handle it.

What user experience does this schema drive us toward?

It's easy to figure, "Eh, we can refactor later." But initial choices set differing user experiences in motion. Thinking at the outset about the real world's variety and complexity starts us asking early questions that set the foundation for social app-building. Ask: What user experience does this schema drive us toward?

Data doesn't have to be for analysis

It's easy to get into the habit of structuring data for easy analysis. But we can choose to look at human data from a different perspective. Step back. Wallow in the user's perspective.

Data can be sheer expressiveness

Data can be sheer expressiveness. Data that has character, individualism, distinctiveness.

What we want

• Structured
• Predictable
• Relational
• Indexed
• Exhaustive
• Easy analytics
• Data-driven decisions

What we get

• Validations, exceptions
• Conditionals, partials
• Premature optimization
• Cultural variability
• Individual POV
• Moving target
• Decisions based on false premises

As developers, we have a vision of what a good codebase should be, and not be.

[READ]

In the most sensible of ways, we often arrive at solutions that are factually TRUTHY while far removed from real life utility.

Page 40: Schemas for the Real World [Madison RubyConf 2013]

cczona

The freeform gender field is something I cherish about Metafilter

mdashMeFi user

That text field grew into a beloved institution Whatever you as user choose to put into that field says something revealing about who YOU are

cczona

It was one of the earliest indications Id landed in the right place

mdashMeFi user

That youre allowed to put in ANYTHING -- or put in NOTHING AT ALL -- says something revealing about what how MetaFilter envisions community That schemas trust in users was the foundation for Metafilter users to ASK to share themselves even more

cczona

In 2010 Diaspora likewise turned gender into a text field Just like on MetaFilter users felt set free by it

Theres an important distinction here though Metafilter is closed source its one developer and its entirely in English Diaspora needed to meet a different set of needs Its open source and its international

cczona

Not amused

So some of its developers are not amused Its fair to object that this approach wreaks chaos on internationalization of pronouns [BEAT] And heres what Ive got to say to that

cczona

Internationalization is hellish period

Internationalization of pronouns is hellish period You get no safety net in this Constraining gender options doesnt solve translation of languages Dont set yourself up for trying to deal with internationalization this way It will fail you

cczona

Whats your legal gender

bull Indeterminate

bull X

bull Sex Not Specified

And if you really thing that boiling this question down to Whats your legal gender is an easy out lets talk about internationaliztion of that too Because this is the year that indeterminate has become a legal gender on German birth certificates and that X is a legal gender on Australian passports In New South Wales Australia Sex Not Specified is yet another legally-recognized gender Binary gender is today APP FAIL

cczona

the most complicated thing Irsquove ever spent a lot of time learning about

And Irsquove spent a lot of time learning about quantum mechanics

mdashRandall Munroe xkcd

Randall Munroe best known as xkcd has examined the issue of pronouns many times in depth for English language projects He says its [READ] So hes what he ultimately concluded

cczona

Ask

Ask

cczona

Which pronouns do you prefer

hehimhimselfhishis

sheherherselfhersher

theythemthemselftheirstheir

itititselfitsits

by name

Asking straight up Which Pronouns Do You Prefer is truly the best he could come up with

Which pronouns do you prefer?
• he / him / himself / his / his
• she / her / herself / hers / her
• they / them / themself / theirs / their
• personal name
• other ____________

Or maybe slightly refining, like these.

And I know you may be looking at that 3rd row and protesting.

Our English teachers told us "they" and "their" aren't singular, right? Nope. They're plural AND SINGULAR, and have been for 400 years. Take the word of Jane Austen, Shakespeare, Chaucer, Lewis Carroll. Modern English authorities agree on this one too. It's excellent English -- and excellent SOCIAL -- to use it when people ask us to.

Even Facebook has taken note that it's okay. They've been using this construction for at least 5 years and are doing fine :-)
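One way to hold the answer to that pronoun question as data, sketched in Ruby. This is only a minimal sketch under stated assumptions: PronounSet, PRESETS, and pronouns_for are hypothetical names, the five slots follow the order of the rows on the slide (he / him / himself / his / his), and treating a personal name as its own reflexive form is a simplification made here, not something the talk specifies.

```ruby
# Hypothetical structure: one slot per grammatical form, in the slide's order.
PronounSet = Struct.new(:subject, :object, :reflexive, :possessive, :determiner)

# Presets mirror rows from the slide; this is not an official or complete list.
PRESETS = {
  "he"   => PronounSet.new("he", "him", "himself", "his", "his"),
  "she"  => PronounSet.new("she", "her", "herself", "hers", "her"),
  "they" => PronounSet.new("they", "them", "themself", "theirs", "their")
}.freeze

# choice is a preset key, "name" (use the personal name everywhere),
# or "other" together with five user-supplied forms in `custom`.
def pronouns_for(choice, name: nil, custom: nil)
  case choice
  when "name"
    PronounSet.new(name, name, name, "#{name}'s", "#{name}'s")
  when "other"
    PronounSet.new(*custom)
  else
    PRESETS.fetch(choice) # raises KeyError for an unknown preset
  end
end
```

With this shape, a template can read set.subject or set.determiner without caring whether the user picked a preset, a name, or typed their own forms.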

"Home is the place where, when you have to go there, they have to take you in."
-- Robert Frost

We aspire to make our social apps feel like home. We can do that by making development choices that say, "Hey user, we GET you. Be as individual and messy as you want to. We can handle _you_ being _you_."

Social Research

When collecting data on people, we're in a different realm: social sciences. If you want to run useful analytics about personal attributes and behavior, then data collection needs to meet at least the two minimum criteria for social science research.

Exhaustive: Every possible option

A field's values must include every possible option.

Mutually Exclusive: No overlap exists between them

And the field's options must be mutually exclusive.

How many social apps have both of those bare minimum criteria covered?
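Those two criteria can be checked mechanically against a sample of people. The sketch below is an illustration only: audit_options is a hypothetical helper, and it assumes we somehow know, for each person, every label that truthfully applies to them (which is exactly the knowledge a constrained form never collects).

```ruby
# options: the labels a form offers.
# truths:  for each person, an array of every label that truthfully applies.
def audit_options(options, truths)
  uncovered   = truths.select { |labels| (labels & options).empty? }
  overlapping = truths.select { |labels| (labels & options).size > 1 }
  {
    exhaustive:         uncovered.empty?,   # nobody is left without an option
    mutually_exclusive: overlapping.empty?, # nobody fits two options at once
    uncovered:          uncovered,
    overlapping:        overlapping
  }
end

# Illustrative sample data, not survey results.
options = ["Single", "In a relationship", "Engaged", "Married"]
people  = [
  ["Married"],                       # fits exactly one option
  ["Widowed"],                       # no option applies
  ["Married", "In a relationship"]   # two options apply at once
]
report = audit_options(options, people)
```

For this sample the report flags the option list as neither exhaustive nor mutually exclusive, and names the people it fails.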

We've seen this one enough. At least that set of values seems covered fully.

But it's not.

Still lacking.

Trying hard. But still inadequate to express the real world's complexity.

We do realize that the real world has great variety. Of course we do. But it's hard to know how to handle it.

What user experience does this schema drive us toward?

It's easy to figure, "Eh, we can refactor later." But initial choices set differing user experiences in motion. Thinking at the outset about the real world's variety and complexity starts us asking early questions that set the foundation for social app-building. Ask: "What user experience does this schema drive us toward?"

Data doesn't have to be for analysis

It's easy to get into the habit of structuring data for easy analysis. But we can choose to look at human data from a different perspective. Step back. Wallow in the user's perspective.

Data can be sheer expressiveness

Data can be sheer expressiveness: data that has character, individualism, distinctiveness.

What we want:
• Structured
• Predictable
• Relational
• Indexed
• Exhaustive
• Easy analytics
• Data-driven decisions

What we get:
• Validations, exceptions
• Conditionals, partials
• Premature optimization
• Cultural variability
• Individual POV
• Moving target
• Decisions based on false premises

As developers, we have a vision of what a good codebase should be, and not be.

[READ]

In the most sensible of ways, we often arrive at solutions that are factually TRUTHY while far removed from real-life utility.

What is your religion, if any?

ARIS is the largest ongoing survey of Americans' religious identification. It asks this simple, OPEN-ENDED question. [slide]

Which nets over 100 unique answers. Which, if you're making a form based on that, is tricky. There's no form element that makes it easy for our users to pick themselves out of a list of so many possibilities.

ARIS found, though, that these could be compressed into 13 major categories. More manageable list, right? We could use that for a form. But, eh, a lot of those are edge cases. We'd rather focus on genuinely major groupings.

What is your religion, if any?
• Christian: 76%
• Other: 4%
• None: 15%
• Don't Know or Refused: 5%

Which brings us down to this. At least three quarters of Americans are Christian. Done. Every other religion would just be clutter. Edge cases.

And then there's this sort of crummy data with it. We'd probably assign a nil value for more than one of these categories, right?

What is your religion?
• Christian: 76%
• Other: 4%
• n/a: 20%

Fixed. Which focuses attention on a problem here: 1 in 5 are not useful answers from an advertiser's perspective, or for our own analytics. The NILS -- gotta go.

What is your religion?
• Christian: 76%
• Other: 24%

So what we're left with is a good, clear list. It covers ALL the big stuff. When you get reductive enough, for Americans, religion is a binary. Which, from a storage standpoint, is great. Booleans! Score!

Religion

But we don't do THIS. It covers the biggest categories. But oh, how it leaves people out.
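The reduction walked through on those slides is plain arithmetic, and writing it out makes the loss visible. The sketch below uses the percentages as quoted in the talk; the fold helper and the constant names are hypothetical, introduced only for illustration.

```ruby
# Shares as shown on the slides (percent of respondents).
ARIS_SHARES = {
  "Christian" => 76,
  "Other" => 4,
  "None" => 15,
  "Don't Know or Refused" => 5
}.freeze

# Merge every label listed in `from` into the single bucket `into`.
def fold(shares, from, into)
  shares.each_with_object(Hash.new(0)) do |(label, pct), out|
    out[from.include?(label) ? into : label] += pct
  end
end

# Step 1: both "no answer" style categories become n/a.
WITH_NA = fold(ARIS_SHARES, ["None", "Don't Know or Refused"], "n/a")
# Step 2: n/a "has gotta go", so it is folded into Other.
BINARY = fold(WITH_NA, ["n/a"], "Other")
# Share of respondents a christian? boolean can no longer tell apart.
INDISTINGUISHABLE = 100 - BINARY["Christian"]
```

Each fold looks harmless on its own; after two of them, roughly a quarter of respondents sit in one undifferentiated bucket.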

People aren't edge cases

People aren't edge cases.

And they're pushing back on apps that treat them that way.

[LONG BEAT LONG] [BREATH] [CHANGE GEARS]

So. Reductive has big problems. So does scaling upward.

Balancing between approaches

As engineers, it's instinctually uncomfortable to DELIBERATELY NOT STRUCTURE DATA for easy analysis. I feel ya. I really do. This freaks me too. But again, the foundational question is: What benefit will the user notice? What identities are okay with us?

And if necessary, we CAN strike a middle ground. This is where guided response comes in: autosuggest.

[Diagram: form elements (Checkbox, Radio, Select; Textarea, Text) against Required and Optional, with Autosuggest]

Minimal Suggest

When there's a subset of values that you're most interested in, do minimal suggest instead: autosuggest using just the handful of values you care about.

Structured data from those who want to give it. Free form to incite expressiveness in those who want THAT.
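A minimal-suggest field might behave roughly like this. It is a sketch under assumptions, not an existing widget's API: suggest and accept are hypothetical helpers, and the suggestion list passed in stands for whatever handful of values an app cares about.

```ruby
# Offer only the handful of values the app cares about, matched by prefix.
def suggest(typed, suggestions)
  prefix = typed.to_s.strip.downcase
  return [] if prefix.empty?
  suggestions.select { |s| s.downcase.start_with?(prefix) }
end

# Whatever the user ends up typing is kept verbatim; a blank answer stays nil.
def accept(typed)
  value = typed.to_s.strip
  value.empty? ? nil : value
end
```

The suggestions nudge toward structured values, while accept never rejects an answer for being off the list, so the field stays free form for anyone who wants that.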

Unguided Text: Of those who use MetaFilter's gender field, 40% of responses are f, m, female, male

[read] So structured data IS there. This can be a balanced solution in many cases where you're willing to tolerate some ambiguity. Of course, there are tradeoffs. Data quantity is lower: freed to opt out of providing personal info, many do. On the other hand, data quality should improve.
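A figure like that 40% can be computed from an unguided text field after the fact. The following is a rough sketch, assuming the responses are available as an array of strings; RECOGNIZED and structured_share are hypothetical names, and the mapping covers only the four spellings mentioned on the slide.

```ruby
# The four spellings from the slide, mapped to a canonical value.
RECOGNIZED = {
  "f" => "female", "female" => "female",
  "m" => "male",   "male"   => "male"
}.freeze

# Percent of non-blank responses that match a recognized spelling.
def structured_share(responses)
  filled = responses.map { |r| r.to_s.strip }.reject(&:empty?)
  return 0.0 if filled.empty?
  recognized = filled.count { |r| RECOGNIZED.key?(r.downcase) }
  (100.0 * recognized / filled.size).round(1)
end
```

Blank answers are left out of the denominator on purpose: opting out lowers data quantity, but it does not count against the share of structured answers among those who chose to respond.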

Optional Select: 60% of Facebook users select a relationship status

It's fine to mix and match here. Find the right approach for your users and your app's business objectives. Facebook, for instance, makes relationship status completely optional, but coercive for those who do opt in to setting a value. Most users do opt in: 60% of them select a relationship status.

We want everyone to feel excited about what we've built

The bottom line is that [read]

We want users to feel passionate about their involvement

[read]

Analytics, investments & monetization are based on a premise that data is accurate

[read]

BUT if the data has been collected by coercive approaches, the RISK is that--

this premise is garbage. Some people are lying, because lying has been made a requirement for getting past the barriers. So conclusions drawn from that bad data can misdirect decision-making about the next stage of development.

[Diagram: form elements (Checkbox, Radio, Select; Textarea, Text) against Required, with the labels Corrective, Discretionary, and Guided]

The restrictive options, that stuff at bottom left: those don't actually have to be marked required.

ADD COLUMN gender VARCHAR(6) NOT NULL DEFAULT 'female'

But the way we set up schema often embeds assumptions that we should, and we will. So we do. An attribute that is ambiguously named is destined to become a form field whose data wanders away from its original intent, and code that misunderstands how to use it. A field that's not allowed to be null is destined to be mandatory. A field that is assigned a maximum length is an ASSERTION that all possible values are knowable and fit within it.

ADD COLUMN gender_identity VARCHAR NULL

BOOM. This is the foundation for a whole different user experience. And the cool ninja move was that we decided to DO LESS. Make this stuff flexible upfront. Optimize storage later. Decide what's valid later.
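To see what each column definition asserts about users, here is a toy model in plain Ruby. It is not a database or ORM API: Column and its store method are hypothetical stand-ins that only imitate how a NOT NULL constraint, a length limit, and a default would treat incoming values.

```ruby
# Toy stand-in for a column definition; null is true when NULL is allowed.
Column = Struct.new(:name, :null, :limit, :default) do
  # Returns the value that would be stored, or raises if the column rejects it.
  def store(value)
    value = default if value.nil?
    raise ArgumentError, "#{name} cannot be null" if value.nil? && !null
    if limit && value && value.length > limit
      raise ArgumentError, "#{name} is longer than #{limit} characters"
    end
    value
  end
end

# The first definition: mandatory, six characters, silently "female".
CONSTRAINED = Column.new("gender", false, 6, "female")
# The second: nullable, no length limit, no default.
FLEXIBLE = Column.new("gender_identity", true, nil, nil)
```

Storing nil in CONSTRAINED quietly records "female", and storing "genderqueer" raises; FLEXIBLE keeps both exactly as the user gave them. The constraints are doing the talking for the user in the first case and staying out of the way in the second.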

[Diagram: form elements (Checkbox, Radio, Select; Textarea, Text) against Required and Optional, with the labels Alteration and Guided]

So there you have it: a discretionary field. Whether to respond is left up to the user.

t.string :relationship_status, :null => true

As developers, we may look upon THIS expression as redundant, completely unnecessary. Duh, null is true by default. But making that explicit is a communication to the team and to your future self. It's a statement of intent. It's documenting a product decision.

Facebook
• Single
• In a relationship
• Engaged
• Married
• It's complicated
• Open relationship

What would a canonical set of relationship statuses look like? Three years ago, Facebook figured this list was pretty good. Arguably pretty progressive too, right? Users disagreed. Strongly.

Facebook
• Single
• In a relationship
• Engaged
• Married
• It's complicated
• Open relationship
• Widowed
• Separated
• Divorced
• Civil union
• Domestic partnership

Under pressure, Facebook nearly doubled the options in just two years' time.

Google+
• Single
• In a relationship
• Engaged
• Married
• It's complicated
• Open relationship
• Widowed
• Civil union
• Domestic partnership
• I don't want to say

While Google+ has largely adopted that list, it has not included Separated or Divorced. Notice that they also added something choice: opt out of labeling.

Allowing users to identify their relationships with labels of greater personal significance: that's being driven by people rejecting a user experience that isn't working for them.

It's a scope

How did some statuses seem universal while others weren't? Naming a thing creates scope. The assumed validity of a field's values gets constrained as soon as the field is named.

marital_status
• Single
• In a relationship
• Engaged
• Married
• It's complicated
• Open relationship
• Widowed
• Separated
• Divorced
• Civil union
• Domestic partnership
• I don't want to say

Marital status, for instance, might lead to a list similar to this one. A person is assumed to be either unmarried, preparing to be married, currently married, or formerly married.

legal_marital_status
• Single
• In a relationship
• Engaged
• Married
• It's complicated
• Open relationship
• Widowed
• Separated
• Divorced
• Civil union
• Domestic partnership
• I don't want to say

relationship_status
• Single
• In a relationship
• Engaged
• Married
• It's complicated
• Open relationship
• Widowed
• Separated
• Divorced
• Civil union
• Domestic partnership
• I don't want to say

Whereas relationship status might lead to a list more like this one. In which one either has a current relationship -- or is defined by the absence of any.

singleness_status

As soon as you name a field, you define its paradigm and possibilities. There are important differences in what each of these envisions and is able to measure. Naming fields -- with great specificity, upfront -- makes analyses more powerful later.

singleness_rating

No chance whatsoever
Suuuuuper duper single

1000

If you change the name, you shift the paradigm and possibilities. There's an important difference in what these set out to measure. Naming fields -- with great specificity, upfront -- makes analyses more powerful later.

It's Complicated / In a Relationship / Married / Divorced / Widowed / Single

We go through life experiencing many relationships. They don't all have pre-determined labels. And new relationship identities don't necessarily leave old ones behind.

"I like to be truthful, and 'It's Complicated' is really deceiving. It is not complicated. I am separated from my husband, who I am still legally married to."
-- Facebook user

[READ] Why should anyone have to feel like they're required to disavow a relationship that's meaningful for them?

It cuts deeply sometimes. Is a widow only "Widowed" until she resumes dating? Why does she have to be confronted with a decision like that just to use someone's app?

Okay, this one. Spot the fatal flaw. [WAIT]

buffer overflow

An open relationship is definitionally a 1:Many join -- WITH the usual engineering understanding, that is, that the actual number of relationships may be 0, 1, OR many.

Choose (one): evasive, inauthentic

But it's not funny from the personal perspective. This schema forces the person to either LOOK EVASIVE or else BE INAUTHENTIC. This is what happens when we try to throw more labels at the problem instead of re-examining the schema's assumptions.
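Re-examining the assumption could mean modeling relationships as a collection rather than a single status column. The sketch below is one possible shape, not how Facebook or any real app stores this: Relationship, Profile, and the sample names are all hypothetical.

```ruby
# One entry per relationship the person chooses to share; label is free text.
Relationship = Struct.new(:label, :with)

class Profile
  attr_reader :name, :relationships

  def initialize(name)
    @name = name
    @relationships = [] # zero, one, or many
  end

  # Returns self so calls can be chained.
  def add_relationship(label, with = nil)
    @relationships << Relationship.new(label, with)
    self
  end

  def labels
    @relationships.map(&:label)
  end
end
```

A person who is separated and still legally married can add both entries and no longer has to pick between looking evasive and being inauthentic; a person who shares nothing simply has an empty list.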

Facebook relationship status by user age

srsly?

When we don't give people means to express what's real and important for them, they hack around us. Are a quarter of Facebook's 13-year-olds really married? Probably not. What they probably have is a BEST FRIEND, and "married" is the nearest equivalent available for them to express "THIS PERSON is my most important relationship."

The dogged pursuit of nice, clean, CRUNCHABLE data takes us to such wrong places. And makes our users want to work against a system that doesn't acknowledge their lives' reality.

We build or break community with each line of code

As developers, our lines of code build -- or break -- an app's community. So it's worth making conscious choices as we go.

Modeling the real world is complex

[READ] And that's okay.

Assuming we know who users are surrenders opportunity to learn who they are

But [read]. Early constraints in schema NET crappy, misleading data. [beat] So keep constraints out of human and social schema, at least at first. Gather enough initial response to do some data mining. Watch that data for a while. Keep an eye out for emergent trends. What DISCOVERIES can you make? How can they shape the app's growth and evolution?
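Watching for emergent trends can start very simply: count the free-text answers that fall outside the values you already expected. This is a minimal sketch, assuming responses arrive as an array of strings; emergent_values and its min_count threshold are hypothetical, and real data mining would need more careful normalization than trimming and downcasing.

```ruby
# Returns [value, count] pairs for answers not in `known`,
# most frequent first, ignoring blanks and anything seen fewer than min_count times.
def emergent_values(responses, known, min_count = 2)
  counts = Hash.new(0)
  responses.each do |raw|
    value = raw.to_s.strip.downcase
    next if value.empty? || known.include?(value)
    counts[value] += 1
  end
  counts.select { |_value, n| n >= min_count }
        .sort_by { |value, n| [-n, value] }
end
```

Values that keep showing up in this list are candidates for new suggestions, new features, or simply a better understanding of who the users are.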

Quality

Data quality improves when lies are merely optional, not required.

Specificity

Data becomes like the real world itself: rich and specific. So we can unearth the patterns that are undetectable when data is generic.

Adaptability

We get to make discoveries and re-envision possibilities.

We adapt quickly.

Loyalty

And their response to all this? Engagement. Passion. LOYALTY.

THOSE are foundations for great user experiences. [LOOK UP. FINALITY.]

Questions? Conversation?

Oh Hell Yeah

Carina C. Zona
@cczona
cczona@gmail.com
http://cczona.com

Resources

Data Science & Information Architecture
• Sociological normalization
• Database normalization
• Using Machine Learning On Social Networks To Figure Out What You Should Read On The Web
• NoSQL Data Modeling Techniques
• Data & Reality, by William Kent
• Bad Data Handbook, Q. Ethan McCallum
• Data Science of the Facebook World

Relationships
• Does Facebook Hurt Relationships
• Facebook Adds LGBT-Friendly Relationship Status Options
• Facebook Targeting by Relationship Status & Workplace
• Your Facebook Relationship Status: It's Complicated
• Gay Marriage: The Database Engineering Perspective

Sex & Genders
• "Disalienation: Why Gender is a Text Field on Diaspora"
• "Gender & Drop Down Menus"
• "Sex & Gender"
• "Bucket Gender"
• Recommendations for Inclusive Data Collection of Trans People

Names
• Falsehoods Programmers Believe About Names
• Your Last Name Contains Invalid Characters
• W3C Internationalization: Personal Names Around the World
• Spanish Names
• Chinese Names

More
• Redesigning the Country Selector
• American Religious Identification Survey Summary Report 2009
• Linguistic Potluck: Crowdsourcing Internationalization in Rails

Image Credits
• Postsecret
• Facebook
• OKCupid
• Google+
• Metafilter
• Diaspora
• Flickr
• FetLife
• Kotangle
• cutestpaw.com
• hdwallpapers.in
• xkcd

Many Thanks
• Chiu-Ki Chan
• Estelle Weyl
• Heather Rivers
• Heroku
• Jeremy Dunck
• Josh Susser
• Michele Titolo
• NIRD LLC
• Renée De Voursney
• Sarah Mei
• San Francisco Sex Information
• Yoz Grahame

Get in Touch

Carina C. Zona
@cczona
http://cczona.com
cczona@gmail.com
http://slideshare.net/cczona
http://linkedin.com/in/cczona


Kotangle

cutestpawcom

hdwallpapersin

xkcd

cczona

Many ThanksChiu-Ki ChanEstelle Weyl

Heather RiversHeroku

Jeremy DunckJosh Susser

Michele TitoloNIRD LLC

Reneacutee De VoursneySarah Mei

San Francisco Sex InformationYoz Grahame

cczona

Get in Touch

Carina C Zonacczona

httpcczonacom

cczonagmailcom

httpslidesharenetcczona

httplinkedincomincczona

Page 42: Schemas for the Real World [Madison RubyConf 2013]

In 2010, Diaspora likewise turned gender into a text field. Just like on MetaFilter, users felt set free by it.

There's an important distinction here, though. MetaFilter is closed source, it's one developer, and it's entirely in English. Diaspora needed to meet a different set of needs. It's open source, and it's international.

Not amused

So some of its developers are not amused. It's fair to object that this approach wreaks chaos on internationalization of pronouns. [BEAT] And here's what I've got to say to that:

Internationalization is hellish, period.

Internationalization of pronouns is hellish, period. You get no safety net in this. Constraining gender options doesn't solve translation of languages. Don't set yourself up for trying to deal with internationalization this way. It will fail you.

What's your legal gender?

• Indeterminate
• X
• Sex Not Specified

And if you really think that boiling this question down to "What's your legal gender?" is an easy out, let's talk about internationalization of that too. Because this is the year that "indeterminate" has become a legal gender on German birth certificates, and that "X" is a legal gender on Australian passports. In New South Wales, Australia, "Sex Not Specified" is yet another legally recognized gender. Binary gender is, today, APP FAIL.

"...the most complicated thing I've ever spent a lot of time learning about.

And I've spent a lot of time learning about quantum mechanics."

-- Randall Munroe, xkcd

Randall Munroe, best known for xkcd, has examined the issue of pronouns many times, in depth, for English-language projects. He says it's [READ]. So here's what he ultimately concluded:

Ask

Ask.

Which pronouns do you prefer?

he/him/himself/his/his
she/her/herself/hers/her
they/them/themself/theirs/their
it/it/itself/its/its
by name

Asking straight up, "Which pronouns do you prefer?", is truly the best he could come up with.

Which pronouns do you prefer?

he/him/himself/his/his
she/her/herself/hers/her
they/them/themself/theirs/their
personal name
other ____________

Or maybe slightly refining, like these.

And I know you may be looking at that 3rd row and protesting.

Our English teachers told us "they" and "their" aren't singular, right? Nope. They're plural AND SINGULAR, and have been for 400 years. Take the word of Jane Austen, Shakespeare, Chaucer, Lewis Carroll. Modern English authorities agree on this one, too. It's excellent English -- and excellent SOCIAL -- to use it when people ask us to.
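If it helps to see that form as data: here is a minimal sketch of storing the pronouns a person asked for, rather than deriving them from a gender column. Everything named here (`PronounSet`, `PRESETS`, `pronouns_for`) is hypothetical and made up for illustration; it is not from the slides or from any library.

```ruby
# Hypothetical sketch: keep the pronoun forms the user chose.
# Field order follows the slide rows: he / him / himself / his / his.
PronounSet = Struct.new(:subject, :object, :reflexive, :possessive, :determiner)

# Presets mirror the first three rows of the slide (the keys are assumptions).
PRESETS = {
  "he"   => PronounSet.new("he", "him", "himself", "his", "his"),
  "she"  => PronounSet.new("she", "her", "herself", "hers", "her"),
  "they" => PronounSet.new("they", "them", "themself", "theirs", "their")
}.freeze

# choice: a preset key, "personal name", or "other" with a user-supplied set.
def pronouns_for(choice, name:, other: nil)
  return PRESETS.fetch(choice) if PRESETS.key?(choice)
  return other if choice == "other" && other
  # "personal name" (and anything unrecognized) falls back to the name itself.
  PronounSet.new(name, name, name, "#{name}'s", "#{name}'s")
end

they = pronouns_for("they", name: "Sam")
puts "#{they.subject.capitalize} updated #{they.determiner} profile."
```

The "other" row is the part that matters: the user hands us a full set, and we never have to decide in advance which sets exist.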

Even Facebook has taken note that it's okay. They've been using this construction for at least 5 years, and are doing fine :-)

"Home is the place where, when you have to go there, they have to take you in."

-- Robert Frost

We aspire to make our social apps feel like home. We can do that by making development choices that say, "Hey user, we GET you. Be as individual and messy as you want to. We can handle _you_ being _you_."

Social Research

When collecting data on people, we're in a different realm: social sciences. If you want to run useful analytics about personal attributes and behavior, then data collection needs to meet at least the two minimum criteria for social science research.

Exhaustive: Every possible option

A field's values must include every possible option.

Mutually Exclusive: No overlap exists between them

And the field's options must be mutually exclusive.

How many social apps have both of those bare minimum criteria covered?
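Those two criteria can be checked mechanically against real answers. A rough sketch, assuming we have a sample of people along with every label that truthfully applies to each of them; the helper name `audit_options` and the sample people are invented for illustration.

```ruby
# Hypothetical audit of a field's option list against what is true for people.
# samples maps a person to ALL the labels that truthfully apply to them.
def audit_options(options, samples)
  # Not exhaustive: someone has a true label the field does not offer.
  uncovered   = samples.select { |_person, truths| (truths - options).any? }.keys
  # Not mutually exclusive: more than one offered option applies to someone.
  overlapping = samples.select { |_person, truths| (truths & options).size > 1 }.keys
  {
    exhaustive: uncovered.empty?,
    mutually_exclusive: overlapping.empty?,
    uncovered: uncovered,
    overlapping: overlapping
  }
end

OPTIONS = ["Single", "In a relationship", "Married"].freeze
SAMPLES = {
  "ana" => ["Married"],
  "ben" => ["Widowed", "In a relationship"],
  "cy"  => ["Married", "In a relationship"]
}.freeze

report = audit_options(OPTIONS, SAMPLES)
puts report.inspect
```

With this made-up sample the list fails both criteria: "ben" has a label the field cannot express, and "cy" fits two of the offered options at once.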

We've seen this one enough. At least that set of values seems covered fully.

But it's not.

Still lacking.

Trying hard. But still inadequate to express the real world's complexity.

We do realize that the real world has great variety. Of course we do. But it's hard to know how to handle it.

What user experience does this schema drive us toward?

It's easy to figure, "Eh, we can refactor later." But initial choices set differing user experiences in motion. Thinking at the outset about the real world's variety and complexity starts us asking early questions that set the foundation for social app-building. Ask: What user experience does this schema drive us toward?

Data doesn't have to be for analysis

It's easy to get into the habit of structuring data for easy analysis. But we can choose to look at human data from a different perspective. Step back. Wallow in the user's perspective.

Data can be sheer expressiveness

Data can be sheer expressiveness. Data that has character, individualism, distinctiveness.

What we want:

• Structured
• Predictable
• Relational
• Indexed
• Exhaustive
• Easy analytics
• Data-driven decisions

What we get:

• Validations, exceptions
• Conditionals, partials
• Premature optimization
• Cultural variability
• Individual POV
• Moving target
• Decisions based on false premises

As developers, we have a vision of what a good codebase should be, and not be.

[READ]

In the most sensible of ways, we often arrive at solutions that are factually TRUTHY while far removed from real-life utility.

What is your religion, if any?

ARIS is the largest ongoing survey of Americans' religious identification. It asks this simple, OPEN-ENDED question. [slide]

Which nets over 100 unique answers. Which, if you're making a form based on that, is tricky. There's no form element that makes it easy for our users to pick themselves out of a list of so many possibilities.

ARIS found, though, that these could be compressed into 13 major categories. More manageable list, right? We could use that for a form. But, eh, a lot of those are edge cases. We'd rather focus on genuinely major groupings.

What is your religion, if any?

Christian 76%
Other 4%
None 15%
Don't Know or Refused 5%

Which brings us down to this. At least three quarters of Americans are Christian. Done. Every other religion would just be clutter. Edge cases.

And then there's this sort of crummy data with it. We'd probably assign a nil value for more than one of these categories, right?

What is your religion?

Christian 76%
Other 4%
n/a 20%

Fixed. Which focuses attention on a problem here: 1 in 5 are not useful answers from an advertiser's perspective, or for our own analytics. The NILS -- gotta go.

What is your religion?

Christian 76%
Other 24%

So what we're left with is a good, clear list. It covers ALL the big stuff. When you get reductive enough, for Americans' religion, it's a binary. Which, from a storage standpoint, is great. Booleans! Score!
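The arithmetic of that slide-by-slide collapse is easy to replay. A small sketch using the percentages from the slides; the `merge_categories` helper is a made-up name for illustration.

```ruby
# Percentages as shown on the slides.
survey = {
  "Christian" => 76,
  "Other" => 4,
  "None" => 15,
  "Don't Know or Refused" => 5
}

# Hypothetical helper: fold the `from` categories into the `into` bucket.
def merge_categories(data, from, into)
  kept   = data.reject { |category, _share| from.include?(category) }
  folded = from.sum { |category| data.fetch(category, 0) }
  kept.merge(into => kept.fetch(into, 0) + folded)
end

# Step 1: the "nil-ish" answers become one n/a bucket.
step1 = merge_categories(survey, ["None", "Don't Know or Refused"], "n/a")
# Step 2: the NILS gotta go, so n/a is folded into Other.
step2 = merge_categories(step1, ["n/a"], "Other")

puts step1.inspect
puts step2.inspect
```

Each step still sums to 100, which is exactly why it feels safe. But "Other" went from 4% to 24% of people, and nothing in the stored value can tell those people apart anymore.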

Religion

But we don't do THIS. It covers the biggest categories. But oh, how it leaves people out.

People aren't edge cases

People aren't edge cases.

And they're pushing back on apps that treat them that way.

[LONG BEAT LONG] [BREATH] [CHANGE GEARS]

So: Reductive has big problems. So does scaling upward.

Balancing between approaches

As engineers, it's instinctually uncomfortable to DELIBERATELY NOT STRUCTURE DATA for easy analysis. I feel ya. I really do. This freaks me out, too. But again, the foundational question is: What benefit will the user notice? What identities are okay with us?

And if necessary, we CAN strike a middle ground. This is where guided response comes in. Autosuggest.

[Slide diagram: Checkbox, Radio, Select vs. Textarea, Text; Required vs. Optional; Autosuggest]

Minimal Suggest

When there's a subset of values that you're most interested in, do minimal suggest instead. Autosuggest using just the handful of values you care about.

Structured data from those who want to give it. Free form to incite expressiveness in those who want THAT.
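Roughly, minimal suggest could look like the sketch below: a short list of values we would love to get, offered as hints, with free text always accepted. The names `SUGGESTED`, `suggest`, and `record` are hypothetical, and the three values are simply borrowed from the relationship-status slides.

```ruby
# Hypothetical "minimal suggest": hint at a handful of values, never require them.
SUGGESTED = ["Single", "In a relationship", "Married"].freeze

# Suggestions for what the user has typed so far (case-insensitive prefix match).
def suggest(prefix, limit: 3)
  needle = prefix.to_s.strip.downcase
  return [] if needle.empty?
  SUGGESTED.select { |value| value.downcase.start_with?(needle) }.first(limit)
end

# What we store: the canonical spelling when the answer matches a suggestion,
# otherwise exactly what the user wrote. A blank answer stays nil (optional field).
def record(input)
  text = input.to_s.strip
  return nil if text.empty?
  match = SUGGESTED.find { |value| value.casecmp?(text) }
  { value: match || text, structured: !match.nil? }
end

puts suggest("ma").inspect
puts record("married").inspect
puts record("best friends forever").inspect
```

The `structured` flag is the middle ground: analytics can filter on it, and nobody had to lie to get past the form.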

Unguided Text: Of those who use MetaFilter's gender field, 40% of responses are "f", "m", "female", "male"

[read] So structured data IS there. This can be a balanced solution in many cases where you're willing to tolerate some ambiguity. Of course there are tradeoffs. Data quantity is lower. Freed to opt out of providing personal info, many do. On the other hand, data quality should improve.
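To find the structured slice hiding inside an unguided text field, something like this sketch would do. `CANONICAL` and `structured_share` are invented names, and the sample responses are made up (picked so the result echoes the slide's figure); they are not MetaFilter data.

```ruby
# Hypothetical mapping for the four answers named on the slide.
CANONICAL = {
  "f" => "female", "female" => "female",
  "m" => "male",   "male"   => "male"
}.freeze

# Percentage of non-blank responses that map onto a canonical value.
def structured_share(responses)
  answered = responses.map { |r| r.to_s.strip }.reject(&:empty?)
  return 0.0 if answered.empty?
  hits = answered.count { |r| CANONICAL.key?(r.downcase) }
  (100.0 * hits / answered.size).round(1)
end

# Made-up sample: 10 answers given, 2 left blank.
sample = ["F", "male", "", "genderqueer", "m", "yes please", nil,
          "Female", "robot", "two-spirit", "none of your business", "trans woman"]

puts structured_share(sample)
```

Blank answers are left out of the denominator on purpose: opting out is a valid choice, not a failed parse.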

Optional Select: 60% of Facebook users select a relationship status

It's fine to mix and match here. Find the right approach for your users and your app's business objectives. Facebook, for instance, makes relationship status completely optional, but coercive for those who do opt in to setting a value. Most users do opt in: 60% of them select a relationship status.

We want everyone to feel excited about what we've built

The bottom line is that [read]

We want users to feel passionate about their involvement

[read]

Analytics, investments & monetization are based on a premise that data is accurate

[read]

BUT if the data has been collected by coercive approaches, the RISK is that--

this premise is garbage. Some people are lying, because lying has been made a requirement for getting past the barriers. So conclusions drawn from that bad data can misdirect decision-making about the next stage of development.

[Slide diagram: Checkbox, Radio, Select vs. Textarea, Text; Required; Corrective; Discretionary; Guided]

The restrictive options, that stuff at bottom left: those don't actually have to be marked required.

ADD COLUMN gender VARCHAR(6) NOT NULL DEFAULT 'female'

But the way we set up schema often embeds assumptions that we should, and we will. So we do. An attribute that is ambiguously named is destined to become a form field whose data wanders away from its original intent, and code that misunderstands how to use it. A field that's not allowed to be null is destined to be mandatory. A field that is assigned a maximum length is an ASSERTION that all possible values are knowable, and fit within it.

ADD COLUMN gender_identity VARCHAR NOT NULL

BOOM. This is the foundation for a whole different user experience. And the cool ninja move was that we decided to DO LESS. Make this stuff flexible upfront. Optimize storage later. Decide what's valid later.

[Slide diagram: Checkbox, Radio, Select vs. Textarea, Text; Required vs. Optional; Alteration; Guided]

So there you have it: a discretionary field. Whether to respond is left up to the user.

t.string :relationship_status, :null => true

As developers, we may look upon THIS expression as redundant, completely unnecessary. "Duh, null is true by default." But making that explicit is a communication to the team, and to your future self. It's a statement of intent. It's documenting a product decision.

Facebook

Single
In a relationship
Engaged
Married
It's complicated
Open relationship

What would a canonical set of relationship statuses look like? Three years ago, Facebook figured this list was pretty good. Arguably pretty progressive too, right? Users disagreed. Strongly.

Facebook

Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Separated
Divorced
Civil union
Domestic partnership

Under pressure, Facebook nearly doubled the options in just two years' time.

Google+

Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Separated
Divorced
Civil union
Domestic partnership
I don't want to say

While Google+ has largely adopted that list, it has not included Separated or Divorced. Notice that they also added something choice: Opt out of labeling.

Allowing users to identify their relationships with labels of greater personal significance: That's being driven by people rejecting a user experience that isn't working for them.

It's a scope

How did some statuses seem universal, while others weren't? Naming a thing creates scope. The assumed validity of a field's values gets constrained as soon as the field is named.

marital_status

Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Separated
Divorced
Civil union
Domestic partnership
I don't want to say

Marital status, for instance, might lead to a list similar to this one. A person is assumed to be either unmarried, preparing to be married, currently married, or formerly married.

legal_marital_status

Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Separated
Divorced
Civil union
Domestic partnership
I don't want to say

relationship_status

Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Separated
Divorced
Civil union
Domestic partnership
I don't want to say

Whereas relationship status might lead to a list more like this one. In which one either has a current relationship -- or is defined by the absence of any.

singleness_status

As soon as you name a field, you define its paradigm and possibilities. There are important differences in what each of these envisions and is able to measure. Naming fields -- with great specificity upfront -- makes analyses more powerful later.

singleness_rating

No chance whatsoever ... Suuuuuper duper single

1000

If you change the name, you shift the paradigm and possibilities. There are important differences in what these set out to measure.

It's Complicated / In a Relationship / Married / Divorced / Widowed / Single

We go through life experiencing many relationships. They don't all have pre-determined labels. And new relationship identities don't necessarily leave old ones behind.

"I like to be truthful, and 'It's Complicated' is really deceiving. It is not complicated. I am separated from my husband, who I am still legally married to."

-- Facebook user

[READ] Why should anyone have to feel like they're required to disavow a relationship that's meaningful for them?

It cuts deeply sometimes. Is a widow only "Widowed" until she resumes dating? Why does she have to be confronted with a decision like that, just to use someone's app?

Okay, this one. Spot the fatal flaw. [WAIT]

buffer overflow

An open relationship is definitionally a 1:Many join -- WITH the usual engineering understanding, that is, that the actual number of relationships may be 0, 1, OR many.

Choose (one): evasive / inauthentic

But it's not funny from the personal perspective. This schema forces the person to either LOOK EVASIVE or else BE INAUTHENTIC. This is what happens when we try to throw more labels at the problem, instead of re-examining the schema's assumptions.
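One way to re-examine that assumption is to stop treating status as a single column and model relationships as a collection that may hold 0, 1, or many entries. A sketch only: the `Person` and `Relationship` classes and the people in it are hypothetical, not a recommendation of any particular ORM design.

```ruby
# Hypothetical model: relationships are a list, each with its own free-text label.
Relationship = Struct.new(:label, :with, :current)

class Person
  attr_reader :name, :relationships

  def initialize(name)
    @name = name
    @relationships = []   # zero relationships is a valid state, not missing data
  end

  # Adding a relationship never overwrites an earlier one.
  def add_relationship(label, with: nil, current: true)
    @relationships << Relationship.new(label, with, current)
    self
  end

  def labels
    @relationships.map(&:label)
  end
end

pat = Person.new("Pat")
pat.add_relationship("Widowed", with: "Lee", current: false)
   .add_relationship("In a relationship", with: "Ray")

puts pat.labels.inspect
```

Here the widow who resumes dating keeps both labels. Nobody is asked to choose between looking evasive and being inauthentic.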

Facebook relationship status by user age

srsly

When we don't give people means to express what's real and important for them, they hack around us. Are a quarter of Facebook's 13-year-olds really married? Probably not. What they probably have is a BEST FRIEND, and "married" is the nearest equivalent available for them to express "THIS PERSON is my most important relationship."

The dogged pursuit of nice, clean, CRUNCHABLE data takes us to such wrong places. And makes our users want to work against a system that doesn't acknowledge their lives' reality.

We build or break community with each line of code

As developers, our lines of code build -- or break -- an app's community. So it's worth making conscious choices as we go.

Modeling the real world is complex

[READ] And that's okay.

Assuming we know who users are surrenders opportunity to learn who they are

But [read] Early constraints in schema NET crappy, misleading data. [beat] So keep constraints out of human and social schema, at least at first. Gather enough initial response to do some data mining. Watch that data for a while. Keep an eye out for emergent trends. What DISCOVERIES can you make? How can they shape the app's growth and evolution?
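Watching for emergent trends can start very small. This sketch tallies free-text answers and surfaces the ones that recur but are not yet on our list; `emergent_values`, its threshold, and the sample answers are all made up for illustration.

```ruby
# Hypothetical: find answers that keep showing up but aren't in our known list.
def emergent_values(responses, known:, min_count: 2)
  tally = Hash.new(0)
  responses.each do |response|
    key = response.to_s.strip.downcase
    tally[key] += 1 unless key.empty?
  end

  known_keys = known.map(&:downcase)
  tally.reject { |value, count| count < min_count || known_keys.include?(value) }
       .sort_by { |value, count| [-count, value] }   # most frequent first
       .map(&:first)
end

# Made-up answers to an unconstrained relationship field.
answers = ["Married", "best friends", "Best Friends", "married", "partnered",
           "best friends ", "Partnered", "", "single"]

puts emergent_values(answers, known: ["Single", "Married"]).inspect
```

Values that surface this way are candidates for the suggest list, decided from what users actually told us rather than from what we assumed up front.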

Quality

Data quality improves when lies are merely optional, not required.

Specificity

Data becomes like the real world itself: rich and specific. So we can unearth the patterns that are undetectable when data is generic.

Adaptability

We get to make discoveries and re-envision possibilities.

We adapt quickly.

Loyalty

And their response to all this? Engagement. Passion. LOYALTY.

THOSE are foundations for great user experiences. [LOOK UP FINALITY]

@cczona
cczona@gmail.com
http://cczona.com

Questions? Conversation?

Oh Hell Yeah

Carina C. Zona
@cczona

Resources

Data Science & Information Architecture

• Sociological normalization
• Database normalization
• Using Machine Learning On Social Networks To Figure Out What You Should Read On The Web
• NoSQL Data Modeling Techniques
• Data & Reality, by William Kent
• Bad Data Handbook, Q. Ethan McCallum
• Data Science of the Facebook World

Relationships

• Does Facebook Hurt Relationships
• Facebook Adds LGBT-Friendly Relationship Status Options
• Facebook Targeting by Relationship Status & Workplace
• Your Facebook Relationship Status: It's Complicated
• Gay Marriage: The Database Engineering Perspective

Sex & Genders

• "Disalienation: Why Gender is a Text Field on Diaspora"
• "Gender & Drop Down Menus"
• "Sex & Gender"
• "Bucket Gender"
• Recommendations for Inclusive Data Collection of Trans People

Names

• Falsehoods Programmers Believe About Names
• Your Last Name Contains Invalid Characters
• W3C Internationalization: Personal Names Around the World
• Spanish Names
• Chinese Names

More

• Redesigning the Country Selector
• American Religious Identification Survey: Summary Report 2009
• Linguistic Potluck: Crowdsourcing Internationalization in Rails

Image Credits

Postsecret
Facebook
OKCupid
Google+
MetaFilter
Diaspora
Flickr
FetLife
Kotangle
cutestpaw.com
hdwallpapers.in
xkcd

Many Thanks

Chiu-Ki Chan
Estelle Weyl
Heather Rivers
Heroku
Jeremy Dunck
Josh Susser
Michele Titolo
NIRD LLC
Renée De Voursney
Sarah Mei
San Francisco Sex Information
Yoz Grahame

Get in Touch

Carina C. Zona
@cczona
http://cczona.com
cczona@gmail.com
http://slideshare.net/cczona
http://linkedin.com/in/cczona

Page 43: Schemas for the Real World [Madison RubyConf 2013]

cczona

Not amused

So some of its developers are not amused Its fair to object that this approach wreaks chaos on internationalization of pronouns [BEAT] And heres what Ive got to say to that

cczona

Internationalization is hellish period

Internationalization of pronouns is hellish period You get no safety net in this Constraining gender options doesnt solve translation of languages Dont set yourself up for trying to deal with internationalization this way It will fail you

cczona

Whats your legal gender

bull Indeterminate

bull X

bull Sex Not Specified

And if you really thing that boiling this question down to Whats your legal gender is an easy out lets talk about internationaliztion of that too Because this is the year that indeterminate has become a legal gender on German birth certificates and that X is a legal gender on Australian passports In New South Wales Australia Sex Not Specified is yet another legally-recognized gender Binary gender is today APP FAIL

cczona

the most complicated thing Irsquove ever spent a lot of time learning about

And Irsquove spent a lot of time learning about quantum mechanics

mdashRandall Munroe xkcd

Randall Munroe best known as xkcd has examined the issue of pronouns many times in depth for English language projects He says its [READ] So hes what he ultimately concluded

cczona

Ask

Ask

cczona

Which pronouns do you prefer

hehimhimselfhishis

sheherherselfhersher

theythemthemselftheirstheir

itititselfitsits

by name

Asking straight up Which Pronouns Do You Prefer is truly the best he could come up with

cczona

Which pronouns do you prefer

hehimhimselfhishis

sheherherselfhersher

theythemthemselftheirstheir

personal name

other ____________

Or maybe slightly refining like these

And I know you may be looking at that 3rd row and protesting

cczona

Which pronouns do you prefer

hehimhimselfhishis

sheherherselfhersher

theythemthemselftheirstheir

personal name

other ____________

Our english teachers told us they and their arent singular right Nope Theyre plural AND SINGULAR and have been for 400 years Take the word of Jane Austen Shakespeare Chaucer Lewis Carroll Modern english authorities agree on this one too Its excellent English -- and excellent SOCIAL -- to use it when people ask us to

cczona

Even Facebook has taken note that its okay Theyve been using this construction for at least 5 years and are doing fine -)

cczona

Home is the place where when you have to go there they have to take you in

mdashRobert Frost

We aspire to make our social apps feel like home We can do that by making development choices that say Hey user we GET you Be as individual and messy as you want to We can handle _you_ being _you_

Social Research

When collecting data on people were in a different realm Social sciences If you want to run useful analytics about personal attributes and behavior then data collection needs to meet at least the two minimum criteria for social science research

ExhaustiveEvery possible option

A fields values must include every possible option

Mutually ExclusiveNo overlap exists between them

And the fields options must be mutually exclusive

How many social apps have both of those bare minimum criteria covered

cczona

Weve seen this one enough At least that set of values seems covered fully

cczona

`

But its not

cczona

Still lacking

cczona

Trying hard But still inadequate to express the real worlds complexity

We do realize that the real world has great variety Of course we do But its hard to know how to handle it

cczona

What user experience does this schema drive us toward

Its easy to figure Eh we can refactor later But initial choices set differing user experiences in motion Thinking at the outset about the real worlds variety and complexity starts us asking early questions that set foundation for social app-building Ask What user experience does this schema drive us toward

cczona

Data doesnrsquot have to be for analysis

Its easy to get into the habit of structuring data for easy analysis But we can choose to look at human data from a different perspective Step back Wallow in the users perspective

cczona

Data can be sheerexpressiveness

Data can be sheer expressiveness Data that has character individualism distinctiveness

cczona

What we want What we get

bull Structured

bull Predictablebull Validations exceptions

bull Conditionals partials

bull Relational

bull Indexed bull Premature optimization

bull Exhaustivebull Cultural variability

bull Individual POV

bull Moving target

bull Easy analytics

bull Data-driven decisionsbull Decisions based on false

premises

As developers we have a vision of what a good codebase should be and not be

[READ]

In the most sensible of ways we often arrive at solutions that are factually TRUTHY while far removed from real life utility

cczona

What is your religion if any

ARIS is the largest ongoing survey of Americans religious identification It asks this simple OPEN-ENDED question [slide]

Which nets over 100 unique answers Which if youre making a form based on that is tricky Theres no form element that makes it easy for our users to pick themselves out of a list of so many possibilities

cczona

ARIS found though that these could be compressed into 13 major categories More manageable list right We could use that for a form But eh a lot of those are edge cases Wed rather want to focus on genuinely major groupings

cczona

What is your religion if any

Christian 76

Other 4

None 15

Dont Know or Refused 5

Which brings us down to this At least a quarter of Americans are Christian Done Every other religion would just be clutter Edge cases

And then theres this sort of crummy data with it Wed probably assign nil value for more than one of these categories right

cczona

What is your religion

Christian 76

Other 4

na 20

Fixed Which focuses attention on a problem here 1 in 5 are not useful answers from an advertisers perspective or for out own analytics The NILS -- gotta go

cczona

What is your religion

Christian 76

Other 24

So what were left with is a good clear list It covers ALL the big stuff When you get reductive enough for Americans religion it a binary Which from a storage standpoint is great Booleans Score

cczona

Religion

But we dont do THIS It covers the biggest categories But oh how it leaves people out

cczona

People arent edge cases

People arent edge cases

And theyre pushing back on apps that treat them that way

[LONG BEAT LONG] [BREATH] [CHANGE GEARS]

So Reductive has big problems So does scaling upward

cczona

Balancing between approaches

As engineers its instinctually uncomfortable to DELIBERATELY NOT STRUCTURE DATA for easy analysis I feel ya I really do This freaks me too But again the foundational question is What benefit will the user notice What identities are okay with us

cczona

And if necessary we CAN strike a middleground This is where guided response comes in Autosuggest

cczona

CheckboxRadioSelect

TextareaText

Required OptionalAutosuggest

Minimal Suggest

When theres a subset of values that youre most interested in do minimal suggest instead Autosuggest using just the handful of values you care about

Structured data from those who want to give it Free form to incite expressiveness in those who want THAT

Unguided TextOf those who use MetaFilters gender field

40 of responses are f m female male

[read] So structured data IS there This can be a balanced solution in many cases where youre willing to tolerate some ambiguity Of course there are tradeoffsData quantity is lower Freed to opt out of proving personal info many doOn the other hand data quality should improve

Optional Select60 of Facebook users select a relationship status

Its fine to mix and match here Find the right approach for your users and your apps business objectives Facebook for instance makes relationship status completely optional but coercive for those who do opt-in to setting a value Most users do opt-in 60 of them select a relationship status

cczona

We want everyone to feel excited what weve

built

The bottom line is that [read]

cczona

We want users to feel passionate about their

involvement

[read]

cczona

Analytics investments amp monetization are

based on a premise that data is accurate

[read]

BUT the data has been collected by coercive approaches the RISK is that--

cczona

this premise is garbage Some people are lying because lying has been made requirement for getting past the barriers So conclusions drawn from that bad data can misdirect decision-making about the next stage of development

cczona

CheckboxRadioSelect

TextareaText

Required

Corrective

DiscretionaryGuided

The restrictive options that stuff at bottom left those dont actually have to be marked required

cczona

ADD column gender NOT NULL VARCHAR(6) DEFAULT female

But the way we setup schema often embeds assumptions that we should and we will So we do An attribute that is ambiguously named is destined to become a form field whose data wanders away from its original intent and code that misunderstands how to use it A field thats not allowed to be null is destined to be mandatory A field that is assigned a maximum length is an ASSERTION that all possible values are knowable and fit within it

cczona

ADD column gender_identity NOT NULL VARCHAR

BOOM This is foundation for a whole different user experience And the cool ninja move was that we decided to DO LESS Make this stuff flexible upfront Optimize storage later Decide whats valid later

cczona

CheckboxRadioSelect

TextareaText

Required Optional

Alteration

Guided

So there you have it a discretionary field Whether to respond is left up to the user

cczona

tstring relationship_status null =gt true

As developers we may upon THIS expression as redundant completely unnecessary Duh null is true by default But making that explicit is a communication to the team and to your future self Its a statement of intent Its documenting a product decision

cczona

FacebookSingle

In a relationship

Engaged

Married

Its complicated

Open relationship

What would a canonical set of relationship statuses look like Three years ago Facebook figured this list was pretty good Arguably pretty progressive too right Users disagreed Strongly

cczona

FacebookSingle

In a relationship

Engaged

Married

Its complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

Under pressure Facebook nearly doubled the options in just two years time

cczona

Google+Single

In a relationship

Engaged

Married

Its complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I dont want to say

While Google+ has largely adopted that list it has not included Separated or Divorced Notice that they also added something choice Opt out of labeling

cczona

Allowing users to identify their relationships with labels of greater personal significanceThats being driven by people rejecting a user experience that isnt working for them

cczona

Its a scope

How did some status seem universal while others werent Naming a thing creates scope The assumed validity of a fields values get constrained as soon as the field is named

cczona

marital_statusSingle

In a relationship

Engaged

Married

Its complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I dont want to say

Marital status for instance might lead to a list similar to this one A person is assumed to be either unmarried preparing to be married currently married or formerly married

cczona

legal_marital_status

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I don't want to say

Marital status, for instance, might lead to a list similar to this one. A person is assumed to be either unmarried, preparing to be married, currently married, or formerly married.

cczona

relationship_status

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I don't want to say

Whereas relationship status might lead to a list more like this one. In which one either has a current relationship -- or is defined by the absence of any.

cczona

singleness_status

As soon as you name a field, you define its paradigm and possibilities. There are important differences in what each of these envisions and is able to measure. Naming fields -- with great specificity, upfront -- makes analyses more powerful later.

cczona

singleness_rating

No chance whatsoever Suuuuuper duper single

1000

If you change the name, you shift the paradigm and possibilities. There's an important difference in what these set out to measure. Naming fields -- with great specificity, upfront -- makes analyses more powerful later.

cczona

It's Complicated

In a Relationship

Married

Divorced

Widowed

Single

We go through life experiencing many relationships. They don't all have pre-determined labels. And new relationship identities don't necessarily leave old ones behind.

cczona

"I like to be truthful, and 'It's Complicated' is really deceiving. It is not complicated. I am separated from my husband, who I am still legally married to."

-- Facebook user

[READ] Why should anyone have to feel like they're required to disavow a relationship that's meaningful for them?

It cuts deeply sometimes. Is a widow only "Widowed" until she resumes dating? Why does she have to be confronted with a decision like that just to use someone's app?

cczona

Okay, this one. Spot the fatal flaw. [WAIT]

cczona

buffer overflow

An open relationship is definitionally a 1:Many join -- WITH the usual engineering understanding that the actual number of relationships may be 0, 1, OR many.

cczona

Choose (one): evasive / inauthentic

But it's not funny from the personal perspective. This schema forces the person to either LOOK EVASIVE or else BE INAUTHENTIC. This is what happens when we try to throw more labels at the problem instead of re-examining the schema's assumptions.
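One way to picture "re-examining the schema's assumptions" is to stop storing a single status and store a collection instead. The sketch below is only an illustration in plain Ruby; the `Person` and `Relationship` names, and the idea of a free-text label per relationship, are assumptions made for the example and not something shown in the talk.

```ruby
# Hypothetical model: a person holds zero, one, or many relationships,
# each described by a label in the person's own words.
Relationship = Struct.new(:partner, :label)

class Person
  attr_reader :name, :relationships

  def initialize(name)
    @name = name
    @relationships = [] # 0, 1, or many -- no single forced choice
  end

  def add_relationship(partner, label)
    @relationships << Relationship.new(partner, label)
    self
  end

  # Report every current label instead of picking just one.
  def relationship_labels
    @relationships.map(&:label)
  end
end

sam = Person.new("Sam")
sam.add_relationship("Alex", "married")
sam.add_relationship("Jo", "partner")
puts sam.relationship_labels.inspect # prints ["married", "partner"]
```

With a shape like this, nobody has to choose between looking evasive and being inauthentic: an empty list, one entry, and several entries are all equally valid.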

cczona

Facebook relationship status by user age

srsly

When we don't give people means to express what's real and important for them, they hack around us. Are a quarter of Facebook's 13-year-olds really married? Probably not. What they probably have is a BEST FRIEND, and "married" is the nearest equivalent available for them to express "THIS PERSON is my most important relationship."

The dogged pursuit of nice, clean, CRUNCHABLE data takes us to such wrong places. And makes our users want to work against a system that doesn't acknowledge their lives' reality.

cczona

We build or break community with each line of code

As developers, our lines of code build -- or break -- an app's community. So it's worth making conscious choices as we go.

cczona

Modeling the real world is complex

[READ] And that's okay.

cczona

Assuming we know who users are surrenders opportunity to learn who they are

But [read]. Early constraints in schema NET crappy, misleading data. [beat] So keep constraints out of human and social schema, at least at first. Gather enough initial response to do some data mining. Watch that data for a while. Keep an eye out for emergent trends. What DISCOVERIES can you make? How can they shape the app's growth and evolution?

cczona

Quality

Data quality improves when lies are merely optional, not required.

cczona

Specificity

Data becomes like the real world itself: rich and specific. So we can unearth the patterns that are undetectable when data is generic.

cczona

Adaptability

We get to make discoveries and re-envision possibilities.

We adapt quickly.

cczona

Loyalty

And their response to all this? Engagement. Passion. LOYALTY.

cczona

THOSE are foundations for great user experiences. [LOOK UP. FINALITY.]

@cczona   cczona@gmail.com   http://cczona.com

Questions / Conversation

Oh Hell Yeah

Carina C. Zona

cczona

Resources

cczona

Data Science & Information Architecture

• Sociological normalization

• Database normalization

• Using Machine Learning On Social Networks To Figure Out What You Should Read On The Web

• NoSQL Data Modeling Techniques

• Data & Reality, by William Kent

• Bad Data Handbook, Q. Ethan McCallum

• Data Science of the Facebook World

cczona

Relationships

• Does Facebook Hurt Relationships?

• Facebook Adds LGBT-Friendly Relationship Status Options

• Facebook Targeting by Relationship Status & Workplace

• Your Facebook Relationship Status: It's Complicated

• Gay Marriage: The Database Engineering Perspective

cczona

Sex & Genders

• "Disalienation: Why Gender is a Text Field on Diaspora"

• "Gender & Drop Down Menus"

• "Sex & Gender"

• "Bucket Gender"

• Recommendations for Inclusive Data Collection of Trans People

cczona

Names

• Falsehoods Programmers Believe About Names

• Your Last Name Contains Invalid Characters

• W3C Internationalization: Personal Names Around the World

• Spanish Names

• Chinese Names

cczona

More

• Redesigning the Country Selector

• American Religious Identification Survey: Summary Report 2009

• Linguistic Potluck: Crowdsourcing Internationalization in Rails

cczona

Image Credits

Postsecret

Facebook

OKCupid

Google+

Metafilter

Diaspora

Flickr

FetLife

Kotangle

cutestpaw.com

hdwallpapers.in

xkcd

cczona

Many Thanks

Chiu-Ki Chan

Estelle Weyl

Heather Rivers

Heroku

Jeremy Dunck

Josh Susser

Michele Titolo

NIRD LLC

Renée De Voursney

Sarah Mei

San Francisco Sex Information

Yoz Grahame

cczona

Get in Touch

Carina C. Zona

@cczona

http://cczona.com

cczona@gmail.com

http://slideshare.net/cczona

http://linkedin.com/in/cczona

Page 44: Schemas for the Real World [Madison RubyConf 2013]

cczona

Internationalization is hellish, period.

Internationalization of pronouns is hellish, period. You get no safety net in this. Constraining gender options doesn't solve translation of languages. Don't set yourself up for trying to deal with internationalization this way. It will fail you.

cczona

What's your legal gender?

• Indeterminate

• X

• Sex Not Specified

And if you really think that boiling this question down to "What's your legal gender?" is an easy out, let's talk about internationalization of that too. Because this is the year that "indeterminate" has become a legal gender on German birth certificates, and that "X" is a legal gender on Australian passports. In New South Wales, Australia, "Sex Not Specified" is yet another legally-recognized gender. Binary gender is, today, APP FAIL.

cczona

"the most complicated thing I've ever spent a lot of time learning about. And I've spent a lot of time learning about quantum mechanics."

-- Randall Munroe, xkcd

Randall Munroe, best known as xkcd, has examined the issue of pronouns many times, in depth, for English-language projects. He says it's [READ]. So here's what he ultimately concluded:

cczona

Ask

Ask

cczona

Which pronouns do you prefer?

he / him / himself / his / his

she / her / herself / hers / her

they / them / themself / theirs / their

it / it / itself / its / its

by name

Asking straight up, "Which pronouns do you prefer?" is truly the best he could come up with.

cczona

Which pronouns do you prefer?

he / him / himself / his / his

she / her / herself / hers / her

they / them / themself / theirs / their

personal name

other ____________

Or maybe slightly refining, like these.

And I know you may be looking at that 3rd row and protesting.

cczona

Which pronouns do you prefer?

he / him / himself / his / his

she / her / herself / hers / her

they / them / themself / theirs / their

personal name

other ____________

Our English teachers told us "they" and "their" aren't singular, right? Nope. They're plural AND SINGULAR, and have been for 400 years. Take the word of Jane Austen, Shakespeare, Chaucer, Lewis Carroll. Modern English authorities agree on this one too. It's excellent English -- and excellent SOCIAL -- to use it when people ask us to.
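To make the pronoun slide concrete, here is a rough sketch of how those rows might be held as data and used in generated text. The five forms per row come from the slide; the `PronounSet` struct, the `photo_notice` helper, and the "ze/hir" example for the "other" row are illustrative assumptions, not anything the talk prescribes.

```ruby
# One row of the slide: subject, object, reflexive,
# possessive pronoun, possessive determiner.
PronounSet = Struct.new(:subject, :object, :reflexive, :possessive, :determiner)

PRESETS = {
  "he"   => PronounSet.new("he", "him", "himself", "his", "his"),
  "she"  => PronounSet.new("she", "her", "herself", "hers", "her"),
  "they" => PronounSet.new("they", "them", "themself", "theirs", "their"),
}.freeze

# "personal name": use the name everywhere instead of a pronoun.
def by_name(name)
  PronounSet.new(name, name, name, "#{name}'s", "#{name}'s")
end

# Hypothetical notification text built from whichever set the user chose.
def photo_notice(name, pronouns)
  "#{name} updated #{pronouns.determiner} profile photo."
end

puts photo_notice("Sam", PRESETS["they"]) # prints Sam updated their profile photo.
puts photo_notice("Ren", by_name("Ren"))  # prints Ren updated Ren's profile photo.

# "other ____": any set the user types in themselves (illustrative values).
custom = PronounSet.new("ze", "hir", "hirself", "hirs", "hir")
puts photo_notice("Kit", custom)          # prints Kit updated hir profile photo.
```

Because the templates only ever ask the set for a form, adding a new row never requires touching the code that writes the sentences.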

cczona

Even Facebook has taken note that it's okay. They've been using this construction for at least 5 years and are doing fine :-)

cczona

"Home is the place where, when you have to go there, they have to take you in."

-- Robert Frost

We aspire to make our social apps feel like home. We can do that by making development choices that say: Hey user, we GET you. Be as individual and messy as you want to. We can handle _you_ being _you_.

Social Research

When collecting data on people, we're in a different realm: social sciences. If you want to run useful analytics about personal attributes and behavior, then data collection needs to meet at least the two minimum criteria for social science research.

Exhaustive: Every possible option

A field's values must include every possible option.

Mutually Exclusive: No overlap exists between them

And the field's options must be mutually exclusive.

How many social apps have both of those bare minimum criteria covered?
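The two criteria can be checked mechanically against what people say is actually true for them. The following is a minimal sketch with made-up respondents; the option list, the names, and the two helper methods are assumptions for illustration only.

```ruby
require "set"

# Options offered by a hypothetical single-select field.
OPTIONS = Set["Single", "In a relationship", "Married"]

# Made-up data: the statuses each person says truly apply to them.
RESPONDENTS = {
  "ana"  => ["Married"],
  "ben"  => ["Married", "In a relationship"], # two options fit at once
  "cleo" => ["Widowed"],                      # nothing offered fits
}

# Exhaustive: everyone finds at least one applicable option.
def exhaustive?(options, respondents)
  respondents.values.all? { |truths| truths.any? { |t| options.include?(t) } }
end

# Mutually exclusive: nobody fits more than one option.
def mutually_exclusive?(options, respondents)
  respondents.values.all? { |truths| truths.count { |t| options.include?(t) } <= 1 }
end

puts exhaustive?(OPTIONS, RESPONDENTS)         # prints false
puts mutually_exclusive?(OPTIONS, RESPONDENTS) # prints false
```

One respondent with nowhere to go breaks the first criterion, and one respondent who fits twice breaks the second. A dropdown can look complete and still fail both.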

cczona

We've seen this one enough. At least that set of values seems covered fully.

cczona

But it's not.

cczona

Still lacking.

cczona

Trying hard. But still inadequate to express the real world's complexity.

We do realize that the real world has great variety. Of course we do. But it's hard to know how to handle it.

cczona

What user experience does this schema drive us toward?

It's easy to figure, "Eh, we can refactor later." But initial choices set differing user experiences in motion. Thinking at the outset about the real world's variety and complexity starts us asking early questions that set the foundation for social app-building. Ask: What user experience does this schema drive us toward?

cczona

Data doesn't have to be for analysis

It's easy to get into the habit of structuring data for easy analysis. But we can choose to look at human data from a different perspective. Step back. Wallow in the user's perspective.

cczona

Data can be sheer expressiveness

Data can be sheer expressiveness. Data that has character, individualism, distinctiveness.

cczona

What we want

• Structured

• Predictable

• Relational

• Indexed

• Exhaustive

• Easy analytics

• Data-driven decisions

What we get

• Validations, exceptions

• Conditionals, partials

• Premature optimization

• Cultural variability

• Individual POV

• Moving target

• Decisions based on false premises

As developers, we have a vision of what a good codebase should be and not be.

[READ]

In the most sensible of ways, we often arrive at solutions that are factually TRUTHY while far removed from real-life utility.

cczona

What is your religion, if any?

ARIS is the largest ongoing survey of Americans' religious identification. It asks this simple, OPEN-ENDED question: [slide]

Which nets over 100 unique answers. Which, if you're making a form based on that, is tricky. There's no form element that makes it easy for our users to pick themselves out of a list of so many possibilities.

cczona

ARIS found, though, that these could be compressed into 13 major categories. More manageable list, right? We could use that for a form. But eh, a lot of those are edge cases. We'd rather focus on genuinely major groupings.

cczona

What is your religion, if any?

Christian 76%

Other 4%

None 15%

Don't Know or Refused 5%

Which brings us down to this. At least three quarters of Americans are Christian. Done. Every other religion would just be clutter. Edge cases.

And then there's this sort of crummy data with it. We'd probably assign a nil value for more than one of these categories, right?

cczona

What is your religion?

Christian 76%

Other 4%

n/a 20%

Fixed. Which focuses attention on a problem here: 1 in 5 are not useful answers from an advertiser's perspective, or for our own analytics. The NILS -- gotta go.

cczona

What is your religion?

Christian 76%

Other 24%

So what we're left with is a good, clear list. It covers ALL the big stuff. When you get reductive enough, for Americans, religion is a binary. Which from a storage standpoint is great. Booleans! Score!
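The arithmetic behind those three slides can be traced in a few lines. This sketch uses only the percentages quoted on the slides; the variable names and the two-step folding are just one way to write out what the talk describes.

```ruby
# Percentages as quoted on the slides above.
survey = { "Christian" => 76, "Other" => 4, "None" => 15, "Don't Know or Refused" => 5 }

# Step 1: fold the two "unhelpful" answers into a single n/a bucket.
with_na = {
  "Christian" => survey["Christian"],
  "Other"     => survey["Other"],
  "n/a"       => survey["None"] + survey["Don't Know or Refused"],
}

# Step 2: get rid of the nils by folding n/a into Other.
binary = {
  "Christian" => with_na["Christian"],
  "Other"     => with_na["Other"] + with_na["n/a"],
}

puts with_na["n/a"]    # prints 20
puts binary["Other"]   # prints 24
puts binary.values.sum # prints 100
```

Every step still sums to 100, which is what makes the reduction feel safe. But the final field can no longer tell a Buddhist from an atheist from someone who declined to answer.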

cczona

Religion

But we don't do THIS. It covers the biggest categories. But oh, how it leaves people out.

cczona

People aren't edge cases

People aren't edge cases.

And they're pushing back on apps that treat them that way.

[LONG BEAT LONG] [BREATH] [CHANGE GEARS]

So: reductive has big problems. So does scaling upward.

cczona

Balancing between approaches

As engineers, it's instinctually uncomfortable to DELIBERATELY NOT STRUCTURE DATA for easy analysis. I feel ya. I really do. This freaks me out too. But again, the foundational question is: What benefit will the user notice? What identities are okay with us?

cczona

And if necessary, we CAN strike a middle ground. This is where guided response comes in. Autosuggest.

cczona

Checkbox / Radio / Select

Textarea / Text

Required / Optional

Autosuggest

Minimal Suggest

When there's a subset of values that you're most interested in, do minimal suggest instead. Autosuggest using just the handful of values you care about.

Structured data from those who want to give it. Free form to incite expressiveness in those who want THAT.
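Minimal suggest can be sketched as two small functions: one that offers a short list while the user types, and one that accepts whatever they finally submit. The suggested values and the method names below are hypothetical, chosen only to show the shape of the idea.

```ruby
# The handful of values the app most cares about (hypothetical list).
SUGGESTED = ["female", "male", "non-binary"].freeze

# Offer only suggestions that start with what the user has typed so far.
def suggest(typed, candidates = SUGGESTED)
  prefix = typed.strip.downcase
  return [] if prefix.empty?
  candidates.select { |c| c.start_with?(prefix) }
end

# Accept whatever the user finally submits; blank means "chose not to answer".
def accept(value)
  cleaned = value.to_s.strip
  cleaned.empty? ? nil : cleaned
end

puts suggest("f").inspect  # prints ["female"]
puts accept("genderqueer") # prints genderqueer
puts accept("  ").inspect  # prints nil
```

The suggestion list nudges toward values that are easy to analyze, while `accept` never rejects an answer for being off the list.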

Unguided Text

Of those who use MetaFilter's gender field, 40% of responses are "f", "m", "female", "male"

[read] So structured data IS there. This can be a balanced solution in many cases where you're willing to tolerate some ambiguity. Of course, there are tradeoffs. Data quantity is lower. Freed to opt out of providing personal info, many do. On the other hand, data quality should improve.
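Finding the structured share inside free text takes very little code. The responses below are invented for illustration (they are not MetaFilter data), and the normalization is deliberately light: trim whitespace and lowercase.

```ruby
# Made-up sample of free-text answers; nil means the field was left blank.
responses = ["F", "male", " m ", "genderqueer", nil, "Female", "yes please", nil, "f", "trans man"]

# Values we would treat as "structured" after light normalization.
KNOWN = %w[f m female male].freeze

answered   = responses.compact.map { |r| r.strip.downcase }
structured = answered.select { |r| KNOWN.include?(r) }

# Share of answered responses that already fit the known values.
share = (100.0 * structured.size / answered.size).round
puts "#{structured.size} of #{answered.size} answers (#{share}%) are structured"

# Everything else is where the discoveries may be hiding.
puts answered.reject { |r| KNOWN.include?(r) }.tally.inspect
```

The tally of leftover answers is the part a dropdown would never have produced. It is the raw material for spotting emergent values worth supporting later.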

Optional Select

60% of Facebook users select a relationship status

It's fine to mix and match here. Find the right approach for your users and your app's business objectives. Facebook, for instance, makes relationship status completely optional, but coercive for those who do opt in to setting a value. Most users do opt in: 60% of them select a relationship status.

cczona

We want everyone to feel excited about what we've built

The bottom line is that [read].

cczona

We want users to feel passionate about their involvement

[read]

cczona

Analytics, investments, & monetization are based on a premise that data is accurate

[read]

BUT if the data has been collected by coercive approaches, the RISK is that--

cczona

this premise is garbage. Some people are lying because lying has been made a requirement for getting past the barriers. So conclusions drawn from that bad data can misdirect decision-making about the next stage of development.

cczona

Checkbox / Radio / Select

Textarea / Text

Required

Corrective

Discretionary

Guided

The restrictive options, that stuff at bottom left: those don't actually have to be marked required.

cczona

ADD COLUMN gender VARCHAR(6) NOT NULL DEFAULT 'female'

But the way we set up schema often embeds assumptions that we should, and we will. So we do. An attribute that is ambiguously named is destined to become a form field whose data wanders away from its original intent, and code that misunderstands how to use it. A field that's not allowed to be null is destined to be mandatory. A field that is assigned a maximum length is an ASSERTION that all possible values are knowable and fit within it.
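Those three assumptions can be acted out in plain Ruby. This is a rough imitation of what such a column implies, not how any particular database behaves: real engines differ on whether an over-long value is rejected or silently truncated, and a default normally applies when the column is omitted from an insert. The `store_gender` method is hypothetical.

```ruby
# Constraints taken from the slide's column definition.
MAX_LENGTH = 6
DEFAULT    = "female"

# Imitate a NOT NULL VARCHAR(6) DEFAULT 'female' column.
# Returns the value that would be stored, or raises if it cannot fit.
def store_gender(input)
  value = input.nil? ? DEFAULT : input # an omitted answer becomes "female"
  if value.length > MAX_LENGTH
    raise ArgumentError, "#{value.inspect} exceeds #{MAX_LENGTH} characters"
  end
  value
end

puts store_gender(nil)    # prints female
puts store_gender("male") # prints male

begin
  store_gender("genderqueer")
rescue ArgumentError => e
  puts "rejected: #{e.message}"
end
```

Silence gets recorded as an answer, and an honest answer gets rejected for being too long. Neither outcome was a deliberate product decision; both fall out of one line of schema.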

cczona

ADD COLUMN gender_identity VARCHAR NOT NULL

BOOM. This is the foundation for a whole different user experience. And the cool ninja move was that we decided to DO LESS. Make this stuff flexible upfront. Optimize storage later. Decide what's valid later.

cczona

Page 45: Schemas for the Real World [Madison RubyConf 2013]

cczona

Whats your legal gender

bull Indeterminate

bull X

bull Sex Not Specified

And if you really thing that boiling this question down to Whats your legal gender is an easy out lets talk about internationaliztion of that too Because this is the year that indeterminate has become a legal gender on German birth certificates and that X is a legal gender on Australian passports In New South Wales Australia Sex Not Specified is yet another legally-recognized gender Binary gender is today APP FAIL

cczona

the most complicated thing Irsquove ever spent a lot of time learning about

And Irsquove spent a lot of time learning about quantum mechanics

mdashRandall Munroe xkcd

Randall Munroe best known as xkcd has examined the issue of pronouns many times in depth for English language projects He says its [READ] So hes what he ultimately concluded

cczona

Ask

Ask

cczona

Which pronouns do you prefer

hehimhimselfhishis

sheherherselfhersher

theythemthemselftheirstheir

itititselfitsits

by name

Asking straight up Which Pronouns Do You Prefer is truly the best he could come up with

cczona

Which pronouns do you prefer

hehimhimselfhishis

sheherherselfhersher

theythemthemselftheirstheir

personal name

other ____________

Or maybe slightly refining like these

And I know you may be looking at that 3rd row and protesting

cczona

Which pronouns do you prefer

hehimhimselfhishis

sheherherselfhersher

theythemthemselftheirstheir

personal name

other ____________

Our english teachers told us they and their arent singular right Nope Theyre plural AND SINGULAR and have been for 400 years Take the word of Jane Austen Shakespeare Chaucer Lewis Carroll Modern english authorities agree on this one too Its excellent English -- and excellent SOCIAL -- to use it when people ask us to

cczona

Even Facebook has taken note that its okay Theyve been using this construction for at least 5 years and are doing fine -)

cczona

Home is the place where when you have to go there they have to take you in

mdashRobert Frost

We aspire to make our social apps feel like home We can do that by making development choices that say Hey user we GET you Be as individual and messy as you want to We can handle _you_ being _you_

Social Research

When collecting data on people were in a different realm Social sciences If you want to run useful analytics about personal attributes and behavior then data collection needs to meet at least the two minimum criteria for social science research

ExhaustiveEvery possible option

A fields values must include every possible option

Mutually ExclusiveNo overlap exists between them

And the fields options must be mutually exclusive

How many social apps have both of those bare minimum criteria covered

cczona

Weve seen this one enough At least that set of values seems covered fully

cczona

`

But its not

cczona

Still lacking

cczona

Trying hard But still inadequate to express the real worlds complexity

We do realize that the real world has great variety Of course we do But its hard to know how to handle it

cczona

What user experience does this schema drive us toward

Its easy to figure Eh we can refactor later But initial choices set differing user experiences in motion Thinking at the outset about the real worlds variety and complexity starts us asking early questions that set foundation for social app-building Ask What user experience does this schema drive us toward

cczona

Data doesnrsquot have to be for analysis

Its easy to get into the habit of structuring data for easy analysis But we can choose to look at human data from a different perspective Step back Wallow in the users perspective

cczona

Data can be sheerexpressiveness

Data can be sheer expressiveness Data that has character individualism distinctiveness

cczona

What we want What we get

bull Structured

bull Predictablebull Validations exceptions

bull Conditionals partials

bull Relational

bull Indexed bull Premature optimization

bull Exhaustivebull Cultural variability

bull Individual POV

bull Moving target

bull Easy analytics

bull Data-driven decisionsbull Decisions based on false

premises

As developers we have a vision of what a good codebase should be and not be

[READ]

In the most sensible of ways we often arrive at solutions that are factually TRUTHY while far removed from real life utility

cczona

What is your religion if any

ARIS is the largest ongoing survey of Americans religious identification It asks this simple OPEN-ENDED question [slide]

Which nets over 100 unique answers Which if youre making a form based on that is tricky Theres no form element that makes it easy for our users to pick themselves out of a list of so many possibilities

cczona

ARIS found though that these could be compressed into 13 major categories More manageable list right We could use that for a form But eh a lot of those are edge cases Wed rather want to focus on genuinely major groupings

cczona

What is your religion if any

Christian 76

Other 4

None 15

Dont Know or Refused 5

Which brings us down to this At least a quarter of Americans are Christian Done Every other religion would just be clutter Edge cases

And then theres this sort of crummy data with it Wed probably assign nil value for more than one of these categories right

cczona

What is your religion

Christian 76

Other 4

na 20

Fixed Which focuses attention on a problem here 1 in 5 are not useful answers from an advertisers perspective or for out own analytics The NILS -- gotta go

cczona

What is your religion

Christian 76

Other 24

So what were left with is a good clear list It covers ALL the big stuff When you get reductive enough for Americans religion it a binary Which from a storage standpoint is great Booleans Score

cczona

Religion

But we dont do THIS It covers the biggest categories But oh how it leaves people out

cczona

People arent edge cases

People arent edge cases

And theyre pushing back on apps that treat them that way

[LONG BEAT LONG] [BREATH] [CHANGE GEARS]

So Reductive has big problems So does scaling upward

cczona

Balancing between approaches

As engineers its instinctually uncomfortable to DELIBERATELY NOT STRUCTURE DATA for easy analysis I feel ya I really do This freaks me too But again the foundational question is What benefit will the user notice What identities are okay with us

cczona

And if necessary we CAN strike a middleground This is where guided response comes in Autosuggest

cczona

CheckboxRadioSelect

TextareaText

Required OptionalAutosuggest

Minimal Suggest

When theres a subset of values that youre most interested in do minimal suggest instead Autosuggest using just the handful of values you care about

Structured data from those who want to give it Free form to incite expressiveness in those who want THAT

Unguided TextOf those who use MetaFilters gender field

40 of responses are f m female male

[read] So structured data IS there This can be a balanced solution in many cases where youre willing to tolerate some ambiguity Of course there are tradeoffsData quantity is lower Freed to opt out of proving personal info many doOn the other hand data quality should improve

Optional Select60 of Facebook users select a relationship status

Its fine to mix and match here Find the right approach for your users and your apps business objectives Facebook for instance makes relationship status completely optional but coercive for those who do opt-in to setting a value Most users do opt-in 60 of them select a relationship status

cczona

We want everyone to feel excited what weve

built

The bottom line is that [read]

cczona

We want users to feel passionate about their

involvement

[read]

cczona

Analytics investments amp monetization are

based on a premise that data is accurate

[read]

BUT the data has been collected by coercive approaches the RISK is that--

cczona

this premise is garbage Some people are lying because lying has been made requirement for getting past the barriers So conclusions drawn from that bad data can misdirect decision-making about the next stage of development

[Slide diagram: Checkbox / Radio / Select and Textarea / Text, with the labels Required, Corrective, Discretionary, and Guided]

The restrictive options, that stuff at bottom left: those don't actually have to be marked required.

ADD COLUMN gender VARCHAR(6) NOT NULL DEFAULT 'female'

But the way we set up schema often embeds assumptions that we should, and we will. So we do. An attribute that is ambiguously named is destined to become a form field whose data wanders away from its original intent, and code that misunderstands how to use it. A field that's not allowed to be null is destined to be mandatory. A field that is assigned a maximum length is an ASSERTION that all possible values are knowable and fit within it.
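A tiny Ruby check makes the maximum-length assertion concrete. The limit of 6 comes from the slide's VARCHAR(6); the sample answers and the method name are hypothetical, picked only to show what such a column can and cannot hold.

```ruby
# Column width taken from the slide's VARCHAR(6).
MAX_LENGTH = 6

# Hypothetical answers a free-text gender field might receive.
SAMPLE_ANSWERS = ["female", "male", "genderqueer", "non-binary", "agender"].freeze

# Split answers into those the column can store intact and those it cannot.
def partition_by_fit(answers, max_length = MAX_LENGTH)
  answers.partition { |answer| answer.length <= max_length }
end

fits, rejected = partition_by_fit(SAMPLE_ANSWERS)
puts "fits: #{fits.inspect}"
puts "does not fit: #{rejected.inspect}"
```

Only the two values the schema author had in mind fit. Every other answer is either rejected or cut short, depending on how the database is configured.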

ADD COLUMN gender_identity VARCHAR NOT NULL

BOOM. This is foundation for a whole different user experience. And the cool ninja move was that we decided to DO LESS. Make this stuff flexible upfront. Optimize storage later. Decide what's valid later.

[Slide diagram: Checkbox / Radio / Select and Textarea / Text, with the labels Required, Optional, Alteration, and Guided]

So there you have it: a discretionary field. Whether to respond is left up to the user.

t.string :relationship_status, :null => true

As developers, we may look upon THIS expression as redundant, completely unnecessary. Duh, null is true by default. But making that explicit is a communication to the team and to your future self. It's a statement of intent. It's documenting a product decision.

Facebook
Single
In a relationship
Engaged
Married
It's complicated
Open relationship

What would a canonical set of relationship statuses look like? Three years ago, Facebook figured this list was pretty good. Arguably pretty progressive too, right? Users disagreed. Strongly.

Facebook
Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Separated
Divorced
Civil union
Domestic partnership

Under pressure, Facebook nearly doubled the options in just two years' time.

Google+
Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Separated
Divorced
Civil union
Domestic partnership
I don't want to say

While Google+ has largely adopted that list, it has not included Separated or Divorced. Notice that they also added something choice: opt out of labeling.

Allowing users to identify their relationships with labels of greater personal significance. That's being driven by people rejecting a user experience that isn't working for them.

It's a scope

How did some statuses seem universal while others weren't? Naming a thing creates scope. The assumed validity of a field's values gets constrained as soon as the field is named.

marital_status
Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Separated
Divorced
Civil union
Domestic partnership
I don't want to say

Marital status, for instance, might lead to a list similar to this one. A person is assumed to be either unmarried, preparing to be married, currently married, or formerly married.

legal_marital_status
Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Separated
Divorced
Civil union
Domestic partnership
I don't want to say

Marital status, for instance, might lead to a list similar to this one. A person is assumed to be either unmarried, preparing to be married, currently married, or formerly married.

relationship_status
Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Separated
Divorced
Civil union
Domestic partnership
I don't want to say

Whereas relationship status might lead to a list more like this one. In which one either has a current relationship -- or is defined by the absence of any.

singleness_status

As soon as you name a field, you define its paradigm and possibilities. There are important differences in what each of these envisions and is able to measure. Naming fields -- with great specificity upfront -- makes analyses more powerful later.

singleness_rating

No chance whatsoever
Suuuuuper duper single

1000

If you change the name, you shift the paradigm and possibilities. There's an important difference in what these set out to collect and measure. Naming fields -- with great specificity upfront -- makes analyses more powerful later.

It's Complicated
In a Relationship
Married
Divorced
Widowed
Single

We go through life experiencing many relationships. They don't all have pre-determined labels. And new relationship identities don't necessarily leave old ones behind.

"I like to be truthful, and It's Complicated is really deceiving. It is not complicated. I am separated from my husband, who I am still legally married to."

-- Facebook user

[READ] Why should anyone have to feel like they're required to disavow a relationship that's meaningful for them?

It cuts deeply sometimes. Is a widow only Widowed until she resumes dating? Why does she have to be confronted with a decision like that, just to use someone's app?

Okay, this one. Spot the fatal flaw. [WAIT]

buffer overflow

An open relationship is definitionally a 1:Many join -- WITH the usual engineering understanding, that is, that the actual number of relationships may be 0, 1, OR many.
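In code, that 1:Many idea could look something like the plain-Ruby sketch below. `Relationship` and `Profile` are hypothetical names, not from any real schema, and the labels are free text so people can use their own words. It only illustrates holding zero, one, or many relationships instead of a single status column.

```ruby
# A relationship the person names in their own words; `label` is free text.
Relationship = Struct.new(:label, :with)

# Hypothetical profile that holds zero, one, or many relationships instead of
# a single relationship_status value.
class Profile
  attr_reader :name, :relationships

  def initialize(name)
    @name = name
    @relationships = []
  end

  # Adding a relationship never replaces an earlier one.
  def add_relationship(label, with: nil)
    @relationships << Relationship.new(label, with)
    self
  end

  def labels
    relationships.map(&:label)
  end
end

profile = Profile.new("example user")
profile.add_relationship("married", with: "A")
       .add_relationship("open relationship")
       .add_relationship("widowed")
puts profile.labels.inspect
```

Nobody has to drop "widowed" in order to add something new, and an empty list is a perfectly valid answer.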

Choose (one):
evasive
inauthentic

But it's not funny from the personal perspective. This schema forces the person to either LOOK EVASIVE or else BE INAUTHENTIC. This is what happens when we try to throw more labels at the problem, instead of re-examining the schema's assumptions.

Facebook relationship status by user age

srsly?

When we don't give people means to express what's real and important for them, they hack around us. Are a quarter of Facebook's 13-year-olds really married? Probably not. What they probably have is a BEST FRIEND, and "married" is the nearest equivalent available for them to express: THIS PERSON is my most important relationship.

The dogged pursuit of nice, clean, CRUNCHABLE data takes us to such wrong places. And makes our users want to work against a system that doesn't acknowledge their lives' reality.

We build or break community with each line of code

As developers, our lines of code build -- or break -- an app's community. So it's worth making conscious choices as we go.

Modeling the real world is complex

[READ] And that's okay.

Assuming we know who users are surrenders opportunity to learn who they are

But [read]. Early constraints in schema NET crappy, misleading data. [beat] So keep constraints out of human and social schema, at least at first. Gather enough initial response to do some data mining. Watch that data for a while. Keep an eye out for emergent trends. What DISCOVERIES can you make? How can they shape the app's growth and evolution?
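Watching unconstrained data for emergent trends can start out very simply. The sketch below assumes Ruby 2.7 or later (for `Enumerable#tally`). The method names, the threshold, and the sample answers are all hypothetical.

```ruby
# Count how often each distinct answer appears, ignoring case and stray
# whitespace, most frequent first (ties broken alphabetically).
def tally_responses(responses)
  responses
    .map { |r| r.to_s.strip.downcase }
    .reject(&:empty?)
    .tally
    .sort_by { |value, count| [-count, value] }
end

# Answers that recur at least `threshold` times but are not among the values
# the app already expected.
def emergent_values(responses, expected, threshold: 2)
  tally_responses(responses)
    .reject { |value, _| expected.include?(value) }
    .select { |_, count| count >= threshold }
    .map(&:first)
end
```

Given answers like "Married", "best friends", "Best Friends ", "single", and "best friends", with "married" and "single" as the expected values, `emergent_values` returns "best friends". That is the kind of discovery a fixed list would have hidden.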

Quality

Data quality improves when lies are merely optional, not required.

Specificity

Data becomes like the real world itself: rich and specific. So we can unearth the patterns that are undetectable when data is generic.

Adaptability

We get to make discoveries and re-envision possibilities.

We adapt quickly.

Loyalty

And their response to all this? Engagement. Passion. LOYALTY.

THOSE are foundations for great user experiences. [LOOK UP FINALITY]

@cczona
cczona@gmail.com
http://cczona.com

Questions? Conversation?

Oh Hell Yeah

Carina C. Zona

Resources

Data Science & Information Architecture
• Sociological normalization
• Database normalization
• Using Machine Learning On Social Networks To Figure Out What You Should Read On The Web
• NoSQL Data Modeling Techniques
• Data & Reality, by William Kent
• Bad Data Handbook, Q. Ethan McCallum
• Data Science of the Facebook World

Relationships
• Does Facebook Hurt Relationships?
• Facebook Adds LGBT-Friendly Relationship Status Options
• Facebook Targeting by Relationship Status & Workplace
• Your Facebook Relationship Status: It's Complicated
• Gay Marriage: The Database Engineering Perspective

Sex & Genders
• "Disalienation: Why Gender is a Text Field on Diaspora"
• "Gender & Drop Down Menus"
• "Sex & Gender"
• "Bucket Gender"
• Recommendations for Inclusive Data Collection of Trans People

Names
• Falsehoods Programmers Believe About Names
• Your Last Name Contains Invalid Characters
• W3C Internationalization: Personal Names Around the World
• Spanish Names
• Chinese Names

More
• Redesigning the Country Selector
• American Religious Identification Survey Summary Report 2009
• Linguistic Potluck: Crowdsourcing Internationalization in Rails

Image Credits
Postsecret
Facebook
OKCupid
Google+
Metafilter
Diaspora
Flickr
FetLife
Kotangle
cutestpaw.com
hdwallpapers.in
xkcd

Many Thanks
Chiu-Ki Chan
Estelle Weyl
Heather Rivers
Heroku
Jeremy Dunck
Josh Susser
Michele Titolo
NIRD LLC
Renée De Voursney
Sarah Mei
San Francisco Sex Information
Yoz Grahame

Get in Touch
Carina C. Zona
@cczona
http://cczona.com
cczona@gmail.com
http://slideshare.net/cczona
http://linkedin.com/in/cczona


"...the most complicated thing I've ever spent a lot of time learning about. And I've spent a lot of time learning about quantum mechanics."

-- Randall Munroe, xkcd

Randall Munroe, best known as xkcd, has examined the issue of pronouns many times, in depth, for English language projects. He says it's [READ]. So here's what he ultimately concluded:

Ask

Ask.

Which pronouns do you prefer?
he/him/himself/his/his
she/her/herself/hers/her
they/them/themself/theirs/their
it/it/itself/its/its
by name

Asking straight up, "Which pronouns do you prefer?" is truly the best he could come up with.

Which pronouns do you prefer?
he/him/himself/his/his
she/her/herself/hers/her
they/them/themself/theirs/their
personal name
other: ____________

Or maybe slightly refining, like these.

And I know you may be looking at that 3rd row and protesting.

Which pronouns do you prefer?
he/him/himself/his/his
she/her/herself/hers/her
they/them/themself/theirs/their
personal name
other: ____________

Our English teachers told us "they" and "their" aren't singular, right? Nope. They're plural AND SINGULAR, and have been for 400 years. Take the word of Jane Austen, Shakespeare, Chaucer, Lewis Carroll. Modern English authorities agree on this one too. It's excellent English -- and excellent SOCIAL -- to use it when people ask us to.

Even Facebook has taken note that it's okay. They've been using this construction for at least 5 years, and are doing fine :-)
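The pronoun options above map onto a small data structure fairly naturally. What follows is a sketch with assumptions called out: `PronounSet`, `PRONOUN_SETS`, and `pronouns_for` are hypothetical names; storing "name" for the personal-name choice is an arbitrary convention; and treating an "other" answer as five slash-separated forms is just one possible way to read a free-text entry.

```ruby
# The five forms shown on the slide, in slide order:
# subject / object / reflexive / possessive pronoun / possessive determiner.
PronounSet = Struct.new(:subject, :object, :reflexive, :possessive, :determiner)

PRONOUN_SETS = {
  "he"   => PronounSet.new("he", "him", "himself", "his", "his"),
  "she"  => PronounSet.new("she", "her", "herself", "hers", "her"),
  "they" => PronounSet.new("they", "them", "themself", "theirs", "their")
}.freeze

# Build the forms for a stored preference. "name" means: use the person's name
# instead of a pronoun. Anything else is assumed to be a user-supplied "other"
# answer written as five slash-separated forms; if it is not, return nil and
# let the caller fall back to the person's name.
def pronouns_for(preference, name)
  return PRONOUN_SETS[preference] if PRONOUN_SETS.key?(preference)
  if preference == "name"
    return PronounSet.new(name, name, name, "#{name}'s", "#{name}'s")
  end
  forms = preference.to_s.split("/").map(&:strip)
  forms.size == 5 ? PronounSet.new(*forms) : nil
end
```

The built-in sets are a convenience, not a whitelist. A person who writes in their own forms gets the same treatment as someone who picked a row.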

"Home is the place where, when you have to go there, they have to take you in."

-- Robert Frost

We aspire to make our social apps feel like home. We can do that by making development choices that say, "Hey user, we GET you. Be as individual and messy as you want to. We can handle _you_ being _you_."

Social Research

When collecting data on people, we're in a different realm: social sciences. If you want to run useful analytics about personal attributes and behavior, then data collection needs to meet at least the two minimum criteria for social science research.

Exhaustive
Every possible option

A field's values must include every possible option.

Mutually Exclusive
No overlap exists between them

And the field's options must be mutually exclusive.

How many social apps have both of those bare minimum criteria covered?
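Those two criteria are easy to test against real answers once you have some. Here is a rough Ruby sketch, where `check_options` is a hypothetical helper and each answer is assumed to be the list of labels a person says applies to them:

```ruby
# Check a field's options against a sample of real answers. Each answer is the
# list of labels a person says applies to them.
# - exhaustive: nobody is left without an option that applies
# - mutually exclusive: nobody needs more than one of the options
def check_options(options, answers)
  unmatched   = answers.select { |applies| (applies & options).empty? }
  overlapping = answers.select { |applies| (applies & options).size > 1 }
  {
    exhaustive: unmatched.empty?,
    mutually_exclusive: overlapping.empty?,
    unmatched: unmatched,
    overlapping: overlapping
  }
end
```

Run it with options like Single, Married, and It's complicated against a few honest answers, and both checks tend to fail fast: someone is Widowed (not exhaustive), and someone else is both Married and It's complicated (not mutually exclusive).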

We've seen this one enough. At least that set of values seems covered fully.

But it's not.

Still lacking.

Trying hard. But still inadequate to express the real world's complexity.

We do realize that the real world has great variety. Of course we do. But it's hard to know how to handle it.

What user experience does this schema drive us toward?

It's easy to figure, "Eh, we can refactor later." But initial choices set differing user experiences in motion. Thinking at the outset about the real world's variety and complexity starts us asking early questions that set the foundation for social app-building. Ask: What user experience does this schema drive us toward?

Data doesn't have to be for analysis

It's easy to get into the habit of structuring data for easy analysis. But we can choose to look at human data from a different perspective. Step back. Wallow in the user's perspective.

Data can be sheer expressiveness

Data can be sheer expressiveness. Data that has character, individualism, distinctiveness.

What we want
• Structured
• Predictable
• Relational
• Indexed
• Exhaustive
• Easy analytics
• Data-driven decisions

What we get
• Validations, exceptions
• Conditionals, partials
• Premature optimization
• Cultural variability
• Individual POV
• Moving target
• Decisions based on false premises

As developers, we have a vision of what a good codebase should be, and not be.

[READ]

In the most sensible of ways, we often arrive at solutions that are factually TRUTHY, while far removed from real-life utility.

What is your religion, if any?

ARIS is the largest ongoing survey of Americans' religious identification. It asks this simple, OPEN-ENDED question: [slide]

Which nets over 100 unique answers. Which, if you're making a form based on that, is tricky. There's no form element that makes it easy for our users to pick themselves out of a list of so many possibilities.

ARIS found, though, that these could be compressed into 13 major categories. More manageable list, right? We could use that for a form. But, eh, a lot of those are edge cases. We'd rather focus on genuinely major groupings.

What is your religion, if any?
Christian: 76%
Other: 4%
None: 15%
Don't Know or Refused: 5%

Which brings us down to this. At least three quarters of Americans are Christian. Done. Every other religion would just be clutter. Edge cases.

And then there's this sort of crummy data with it. We'd probably assign nil value for more than one of these categories, right?

What is your religion?
Christian: 76%
Other: 4%
n/a: 20%

Fixed. Which focuses attention on a problem here: 1 in 5 are not useful answers from an advertiser's perspective, or for our own analytics. The NILS -- gotta go.

What is your religion?
Christian: 76%
Other: 24%

So what we're left with is a good, clear list. It covers ALL the big stuff. When you get reductive enough for Americans' religion, it's a binary. Which, from a storage standpoint, is great. Booleans! Score!
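The arithmetic of that reduction is worth spelling out, because each step looks harmless on its own. Here is a small Ruby sketch that uses the percentages from the slides; the `collapse` helper is hypothetical, written only to show the folding:

```ruby
# Percentages from the slides.
SURVEY = {
  "Christian" => 76,
  "Other" => 4,
  "None" => 15,
  "Don't Know or Refused" => 5
}.freeze

# Merge the named categories into one bucket, summing their percentages.
def collapse(shares, categories, into:)
  kept = shares.reject { |name, _| categories.include?(name) }
  merged = shares.values_at(*categories).compact.sum
  kept.merge(into => kept.fetch(into, 0) + merged)
end

step_one = collapse(SURVEY, ["None", "Don't Know or Refused"], into: "n/a")
step_two = collapse(step_one, ["n/a"], into: "Other")
puts step_one.inspect
puts step_two.inspect
```

Two merges take four categories down to two. The 24% labeled "Other" now lumps together people with another religion, people with none, and people who declined to answer, and nothing in the stored data can tell them apart again.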

Religion

But we don't do THIS. It covers the biggest categories. But oh, how it leaves people out.

People aren't edge cases

People aren't edge cases.

And they're pushing back on apps that treat them that way.

[LONG BEAT LONG] [BREATH] [CHANGE GEARS]

So: Reductive has big problems. So does scaling upward.

Balancing between approaches


Page 47: Schemas for the Real World [Madison RubyConf 2013]

cczona

Ask

Ask

cczona

Which pronouns do you prefer

hehimhimselfhishis

sheherherselfhersher

theythemthemselftheirstheir

itititselfitsits

by name

Asking straight up Which Pronouns Do You Prefer is truly the best he could come up with

cczona

Which pronouns do you prefer

hehimhimselfhishis

sheherherselfhersher

theythemthemselftheirstheir

personal name

other ____________

Or maybe slightly refining like these

And I know you may be looking at that 3rd row and protesting

cczona

Which pronouns do you prefer

hehimhimselfhishis

sheherherselfhersher

theythemthemselftheirstheir

personal name

other ____________

Our english teachers told us they and their arent singular right Nope Theyre plural AND SINGULAR and have been for 400 years Take the word of Jane Austen Shakespeare Chaucer Lewis Carroll Modern english authorities agree on this one too Its excellent English -- and excellent SOCIAL -- to use it when people ask us to

cczona

Even Facebook has taken note that its okay Theyve been using this construction for at least 5 years and are doing fine -)

cczona

Home is the place where when you have to go there they have to take you in

mdashRobert Frost

We aspire to make our social apps feel like home We can do that by making development choices that say Hey user we GET you Be as individual and messy as you want to We can handle _you_ being _you_

Social Research

When collecting data on people were in a different realm Social sciences If you want to run useful analytics about personal attributes and behavior then data collection needs to meet at least the two minimum criteria for social science research

ExhaustiveEvery possible option

A fields values must include every possible option

Mutually ExclusiveNo overlap exists between them

And the fields options must be mutually exclusive

How many social apps have both of those bare minimum criteria covered

cczona

Weve seen this one enough At least that set of values seems covered fully

cczona

`

But its not

cczona

Still lacking

cczona

Trying hard But still inadequate to express the real worlds complexity

We do realize that the real world has great variety Of course we do But its hard to know how to handle it

cczona

What user experience does this schema drive us toward

Its easy to figure Eh we can refactor later But initial choices set differing user experiences in motion Thinking at the outset about the real worlds variety and complexity starts us asking early questions that set foundation for social app-building Ask What user experience does this schema drive us toward

cczona

Data doesnrsquot have to be for analysis

Its easy to get into the habit of structuring data for easy analysis But we can choose to look at human data from a different perspective Step back Wallow in the users perspective

cczona

Data can be sheerexpressiveness

Data can be sheer expressiveness Data that has character individualism distinctiveness

cczona

What we want What we get

bull Structured

bull Predictablebull Validations exceptions

bull Conditionals partials

bull Relational

bull Indexed bull Premature optimization

bull Exhaustivebull Cultural variability

bull Individual POV

bull Moving target

bull Easy analytics

bull Data-driven decisionsbull Decisions based on false

premises

As developers we have a vision of what a good codebase should be and not be

[READ]

In the most sensible of ways we often arrive at solutions that are factually TRUTHY while far removed from real life utility

cczona

What is your religion if any

ARIS is the largest ongoing survey of Americans religious identification It asks this simple OPEN-ENDED question [slide]

Which nets over 100 unique answers Which if youre making a form based on that is tricky Theres no form element that makes it easy for our users to pick themselves out of a list of so many possibilities

cczona

ARIS found though that these could be compressed into 13 major categories More manageable list right We could use that for a form But eh a lot of those are edge cases Wed rather want to focus on genuinely major groupings

cczona

What is your religion if any

Christian 76

Other 4

None 15

Dont Know or Refused 5

Which brings us down to this At least a quarter of Americans are Christian Done Every other religion would just be clutter Edge cases

And then theres this sort of crummy data with it Wed probably assign nil value for more than one of these categories right

cczona

What is your religion

Christian 76

Other 4

na 20

Fixed Which focuses attention on a problem here 1 in 5 are not useful answers from an advertisers perspective or for out own analytics The NILS -- gotta go

cczona

What is your religion

Christian 76

Other 24

So what were left with is a good clear list It covers ALL the big stuff When you get reductive enough for Americans religion it a binary Which from a storage standpoint is great Booleans Score

cczona

Religion

But we dont do THIS It covers the biggest categories But oh how it leaves people out

cczona

People arent edge cases

People arent edge cases

And theyre pushing back on apps that treat them that way

[LONG BEAT LONG] [BREATH] [CHANGE GEARS]

So Reductive has big problems So does scaling upward

cczona

Balancing between approaches

As engineers its instinctually uncomfortable to DELIBERATELY NOT STRUCTURE DATA for easy analysis I feel ya I really do This freaks me too But again the foundational question is What benefit will the user notice What identities are okay with us

cczona

And if necessary we CAN strike a middleground This is where guided response comes in Autosuggest

cczona

CheckboxRadioSelect

TextareaText

Required OptionalAutosuggest

Minimal Suggest

When theres a subset of values that youre most interested in do minimal suggest instead Autosuggest using just the handful of values you care about

Structured data from those who want to give it Free form to incite expressiveness in those who want THAT

Unguided TextOf those who use MetaFilters gender field

40 of responses are f m female male

[read] So structured data IS there This can be a balanced solution in many cases where youre willing to tolerate some ambiguity Of course there are tradeoffsData quantity is lower Freed to opt out of proving personal info many doOn the other hand data quality should improve

Optional Select

60% of Facebook users select a relationship status

It's fine to mix and match here. Find the right approach for your users and your app's business objectives. Facebook, for instance, makes relationship status completely optional, but coercive for those who do opt in to setting a value. Most users do opt in: 60% of them select a relationship status.

cczona

We want everyone to feel excited about what we've built

The bottom line is that [read]

cczona

We want users to feel passionate about their involvement

[read]

cczona

Analytics, investments & monetization are based on a premise that data is accurate

[read]

BUT if the data has been collected by coercive approaches, the RISK is that--

cczona

this premise is garbage. Some people are lying because lying has been made a requirement for getting past the barriers. So conclusions drawn from that bad data can misdirect decision-making about the next stage of development.

cczona

Checkbox / Radio / Select

Textarea / Text

Required

Corrective

Discretionary / Guided

The restrictive options, that stuff at bottom left: those don't actually have to be marked required.

cczona

ADD COLUMN gender VARCHAR(6) NOT NULL DEFAULT 'female'

But the way we set up schema often embeds assumptions that we should and we will. So we do. An attribute that is ambiguously named is destined to become a form field whose data wanders away from its original intent, and code that misunderstands how to use it. A field that's not allowed to be null is destined to be mandatory. A field that is assigned a maximum length is an ASSERTION that all possible values are knowable and fit within it.

cczona

ADD column gender_identity NOT NULL VARCHAR

BOOM. This is the foundation for a whole different user experience. And the cool ninja move was that we decided to DO LESS. Make this stuff flexible upfront. Optimize storage later. Decide what's valid later.
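To make the contrast concrete without a database, here is a toy model of what each column definition does to a user's answer. It is a sketch only: `Column` is a hypothetical stand-in for real schema constraints, and the flexible column is assumed to be nullable with no length limit and no default, which is how the notes describe a discretionary field.

```ruby
# A toy stand-in for a database column: a name plus the constraints the schema declares.
Column = Struct.new(:name, :null, :limit, :default, keyword_init: true) do
  # Returns the value that would be stored, or raises if a constraint rejects it.
  def store(value)
    value = default if value.nil?   # a default silently answers for the user
    raise ArgumentError, "#{name} cannot be null" if value.nil? && !null
    if limit && value && value.length > limit
      raise ArgumentError, "#{name} is limited to #{limit} characters"
    end
    value
  end
end

# The restrictive definition: mandatory, six characters at most, pre-answered.
RESTRICTIVE = Column.new(name: "gender", null: false, limit: 6, default: "female")

# The flexible definition (assumed): optional, unbounded, no default.
FLEXIBLE = Column.new(name: "gender_identity", null: true, limit: nil, default: nil)
```

With the restrictive column, a user who skips the question is recorded as "female" and a user who types "genderqueer" is rejected outright. With the flexible column, both the skipped answer and the long one are stored exactly as given.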

cczona

Checkbox / Radio / Select

Textarea / Text

Required / Optional

Alteration

Guided

So there you have it: a discretionary field. Whether to respond is left up to the user.

cczona

t.string :relationship_status, :null => true

As developers we may look upon THIS expression as redundant, completely unnecessary. Duh, null is true by default. But making that explicit is a communication to the team and to your future self. It's a statement of intent. It's documenting a product decision.

cczona

Facebook

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

What would a canonical set of relationship statuses look like? Three years ago, Facebook figured this list was pretty good. Arguably pretty progressive too, right? Users disagreed. Strongly.

cczona

Facebook

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

Under pressure, Facebook nearly doubled the options in just two years' time.

cczona

Google+

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I don't want to say

While Google+ has largely adopted that list, it has not included Separated or Divorced. Notice that they also added something choice: opt out of labeling.

cczona

Allowing users to identify their relationships with labels of greater personal significance

That's being driven by people rejecting a user experience that isn't working for them.

cczona

It's a scope

How did some statuses seem universal while others didn't? Naming a thing creates scope. The assumed validity of a field's values gets constrained as soon as the field is named.

cczona

marital_status

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I don't want to say

"Marital status", for instance, might lead to a list similar to this one. A person is assumed to be either unmarried, preparing to be married, currently married, or formerly married.

cczona

legal_marital_status

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I don't want to say

"Marital status", for instance, might lead to a list similar to this one. A person is assumed to be either unmarried, preparing to be married, currently married, or formerly married.

cczona

relationship_status

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I don't want to say

Whereas "relationship status" might lead to a list more like this one. In which one either has a current relationship -- or is defined by the absence of any.

cczona

singleness_status

As soon as you name a field, you define its paradigm and possibilities. There are important differences in what each of these envisions and is able to measure. Naming fields -- with great specificity, upfront -- makes analyses more powerful later.

cczona

singleness_rating

No chance whatsoever Suuuuuper duper single

1000

If you change the name, you shift the paradigm and possibilities. There's an important difference in what each of these sets out to collect and measure. Naming fields -- with great specificity, upfront -- makes analyses more powerful later.

cczona

It's Complicated / In a Relationship

Married / Divorced

Widowed / Single

We go through life experiencing many relationships. They don't all have pre-determined labels. And new relationship identities don't necessarily leave old ones behind.

cczona

"I like to be truthful, and 'It's Complicated' is really deceiving. It is not complicated. I am separated from my husband, who I am still legally married to."

-- Facebook user

[READ] Why should anyone have to feel like they're required to disavow a relationship that's meaningful for them?

It cuts deeply sometimes. Is a widow only "Widowed" until she resumes dating? Why does she have to be confronted with a decision like that just to use someone's app?

cczona

Okay, this one. Spot the fatal flaw. [WAIT]

cczona

buffer overflow

An open relationship is definitionally a 1:Many join -- WITH the usual engineering understanding, that is, that the actual number of relationships may be 0, 1, OR many.

cczona

Choose (one): evasive / inauthentic

But it's not funny from the personal perspective. This schema forces the person to either LOOK EVASIVE or else BE INAUTHENTIC. This is what happens when we try to throw more labels at the problem instead of re-examining the schema's assumptions.
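Re-examining the assumption might mean dropping the single status column in favor of a 1:Many association. A minimal sketch in plain Ruby, with hypothetical `Person` and `Relationship` classes standing in for whatever models an app actually has:

```ruby
# Hypothetical model: each relationship carries whatever label the person
# chooses, and optionally who it is with.
Relationship = Struct.new(:label, :with)

class Person
  attr_reader :name, :relationships

  def initialize(name)
    @name = name
    @relationships = []   # 0, 1, or many; nothing is required
  end

  # Adding a relationship never replaces an earlier one.
  def add_relationship(label, with = nil)
    @relationships << Relationship.new(label, with)
    self
  end

  # All labels currently in force, in the order they were added.
  def labels
    @relationships.map(&:label)
  end
end
```

Under this shape, someone can be both widowed and dating, or in several relationships at once, or list none at all, without picking which truth to disavow.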

cczona

Facebook relationship status by user age

srsly

When we don't give people means to express what's real and important for them, they hack around us. Are a quarter of Facebook's 13 year olds really married? Probably not. What they probably have is a BEST FRIEND, and "married" is the nearest equivalent available for them to express "THIS PERSON is my most important relationship."

The dogged pursuit of nice, clean, CRUNCHABLE data takes us to such wrong places. And makes our users want to work against a system that doesn't acknowledge their lives' reality.

cczona

We build or break community with each line of code

As developers, our lines of code build -- or break -- an app's community. So it's worth making conscious choices as we go.

cczona

Modeling the real world is complex

[READ] And that's okay.

cczona

Assuming we know who users are surrenders opportunity to learn who they are

But [read]. Early constraints in schema NET crappy, misleading data. [beat] So keep constraints out of human and social schema, at least at first. Gather enough initial response to do some data mining. Watch that data for a while. Keep an eye out for emergent trends. What DISCOVERIES can you make? How can they shape the app's growth and evolution?

cczona

Quality

Data quality improves when lies are merely optional, not required.

cczona

Specificity

Data becomes like the real world itself: rich and specific. So we can unearth the patterns that are undetectable when data is generic.

cczona

Adaptability

We get to make discoveries and re-envision possibilities

We adapt quickly

cczona

Loyalty

And their response to all this? Engagement. Passion. LOYALTY.

cczona

THOSE are foundations for great user experiences [LOOK UP FINALITY]

@cczona / cczona@gmail.com / http://cczona.com

Questions / Conversation

Oh Hell Yeah

Carina C Zona

cczona

Resources

cczona

Data Science & Information Architecture

• Sociological normalization

• Database normalization

• Using Machine Learning On Social Networks To Figure Out What You Should Read On The Web

• NoSQL Data Modeling Techniques

• Data & Reality, by William Kent

• Bad Data Handbook, Q. Ethan McCallum

• Data Science of the Facebook World

cczona

Relationships

• Does Facebook Hurt Relationships?

• Facebook Adds LGBT-Friendly Relationship Status Options

• Facebook Targeting by Relationship Status & Workplace

• Your Facebook Relationship Status: It's Complicated

• Gay Marriage: The Database Engineering Perspective

cczona

Sex & Genders

• "Disalienation: Why Gender is a Text Field on Diaspora"

• "Gender & Drop Down Menus"

• "Sex & Gender"

• "Bucket Gender"

• Recommendations for Inclusive Data Collection of Trans People

cczona

Names

• Falsehoods Programmers Believe About Names

• Your Last Name Contains Invalid Characters

• W3C Internationalization: Personal Names Around the World

• Spanish Names

• Chinese Names

cczona

More

• Redesigning the Country Selector

• American Religious Identification Survey Summary Report 2009

• Linguistic Potluck: Crowdsourcing Internationalization in Rails

cczona

Image Credits

Postsecret

Facebook

OKCupid

Google+

Metafilter

Diaspora

Flickr

FetLife

Kotangle

cutestpaw.com

hdwallpapers.in

xkcd

cczona

Many Thanks

Chiu-Ki Chan

Estelle Weyl

Heather Rivers

Heroku

Jeremy Dunck

Josh Susser

Michele Titolo

NIRD LLC

Renée De Voursney

Sarah Mei

San Francisco Sex Information

Yoz Grahame

cczona

Get in Touch

Carina C. Zona / @cczona

http://cczona.com

cczona@gmail.com

http://slideshare.net/cczona

http://linkedin.com/in/cczona


cczona

Which pronouns do you prefer?

he / him / himself / his / his

she / her / herself / hers / her

they / them / themself / theirs / their

it / it / itself / its / its

by name

Asking straight up, "Which pronouns do you prefer?", is truly the best he could come up with.

cczona

Which pronouns do you prefer?

he / him / himself / his / his

she / her / herself / hers / her

they / them / themself / theirs / their

personal name

other ____________

Or maybe slightly refining, like these.

And I know you may be looking at that 3rd row and protesting.

cczona

Which pronouns do you prefer?

he / him / himself / his / his

she / her / herself / hers / her

they / them / themself / theirs / their

personal name

other ____________

Our English teachers told us "they" and "their" aren't singular, right? Nope. They're plural AND SINGULAR, and have been for 400 years. Take the word of Jane Austen, Shakespeare, Chaucer, Lewis Carroll. Modern English authorities agree on this one too. It's excellent English -- and excellent SOCIAL -- to use it when people ask us to.

cczona

Even Facebook has taken note that it's okay. They've been using this construction for at least 5 years and are doing fine :-)

cczona

"Home is the place where, when you have to go there, they have to take you in."

-- Robert Frost

We aspire to make our social apps feel like home. We can do that by making development choices that say, "Hey user, we GET you. Be as individual and messy as you want to. We can handle _you_ being _you_."

Social Research

When collecting data on people, we're in a different realm: social sciences. If you want to run useful analytics about personal attributes and behavior, then data collection needs to meet at least the two minimum criteria for social science research.

Exhaustive: Every possible option

A field's values must include every possible option.

Mutually Exclusive: No overlap exists between them

And the field's options must be mutually exclusive.

How many social apps have both of those bare minimum criteria covered?
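Both criteria can be checked mechanically against what people actually report. A small sketch, assuming each response is recorded as the list of labels a person says apply to them; `audit_options` is a hypothetical helper, not an established library call:

```ruby
# options:   the values a field offers.
# responses: one array per person, listing every label that person says applies.
#
# Exhaustive:         every person finds at least one offered option that fits.
# Mutually exclusive: no person fits more than one offered option.
def audit_options(options, responses)
  uncovered   = responses.select { |applies| (applies & options).empty? }
  overlapping = responses.select { |applies| (applies & options).size > 1 }
  {
    exhaustive: uncovered.empty?,
    mutually_exclusive: overlapping.empty?,
    uncovered: uncovered,
    overlapping: overlapping
  }
end
```

For example, auditing the options `["Single", "Married", "Divorced"]` against someone who is only "Separated" fails the exhaustive test, and against someone who is both "Divorced" and "Single" fails the mutually exclusive test.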

cczona

We've seen this one enough. At least that set of values seems covered fully.

cczona

But it's not.

cczona

Still lacking

cczona

Trying hard. But still inadequate to express the real world's complexity.

We do realize that the real world has great variety. Of course we do. But it's hard to know how to handle it.

cczona

What user experience does this schema drive us toward?

It's easy to figure, "Eh, we can refactor later." But initial choices set differing user experiences in motion. Thinking at the outset about the real world's variety and complexity starts us asking early questions that set the foundation for social app-building. Ask: "What user experience does this schema drive us toward?"

cczona

Data doesn't have to be for analysis

It's easy to get into the habit of structuring data for easy analysis. But we can choose to look at human data from a different perspective. Step back. Wallow in the user's perspective.

cczona

Data can be sheer expressiveness

Data can be sheer expressiveness. Data that has character, individualism, distinctiveness.

cczona

What we want

• Structured

• Predictable

• Relational

• Indexed

• Exhaustive

• Easy analytics

• Data-driven decisions

What we get

• Validations, exceptions

• Conditionals, partials

• Premature optimization

• Cultural variability

• Individual POV

• Moving target

• Decisions based on false premises

As developers, we have a vision of what a good codebase should be, and not be.

[READ]

In the most sensible of ways, we often arrive at solutions that are factually TRUTHY while far removed from real-life utility.

cczona

What is your religion, if any?

ARIS is the largest ongoing survey of Americans' religious identification. It asks this simple, OPEN-ENDED question: [slide]

Which nets over 100 unique answers. Which, if you're making a form based on that, is tricky. There's no form element that makes it easy for our users to pick themselves out of a list of so many possibilities.

cczona

ARIS found, though, that these could be compressed into 13 major categories. More manageable list, right? We could use that for a form. But, eh, a lot of those are edge cases. We'd rather focus on genuinely major groupings.

cczona

What is your religion, if any?

Christian: 76%

Other: 4%

None: 15%

Don't Know or Refused: 5%

Which brings us down to this. At least three quarters of Americans are Christian. Done. Every other religion would just be clutter. Edge cases.

Page 49: Schemas for the Real World [Madison RubyConf 2013]

cczona

Which pronouns do you prefer

hehimhimselfhishis

sheherherselfhersher

theythemthemselftheirstheir

personal name

other ____________

Or maybe slightly refining like these

And I know you may be looking at that 3rd row and protesting

cczona

Which pronouns do you prefer

hehimhimselfhishis

sheherherselfhersher

theythemthemselftheirstheir

personal name

other ____________

Our english teachers told us they and their arent singular right Nope Theyre plural AND SINGULAR and have been for 400 years Take the word of Jane Austen Shakespeare Chaucer Lewis Carroll Modern english authorities agree on this one too Its excellent English -- and excellent SOCIAL -- to use it when people ask us to

cczona

Even Facebook has taken note that its okay Theyve been using this construction for at least 5 years and are doing fine -)

cczona

Home is the place where when you have to go there they have to take you in

mdashRobert Frost

We aspire to make our social apps feel like home We can do that by making development choices that say Hey user we GET you Be as individual and messy as you want to We can handle _you_ being _you_

Social Research

When collecting data on people were in a different realm Social sciences If you want to run useful analytics about personal attributes and behavior then data collection needs to meet at least the two minimum criteria for social science research

ExhaustiveEvery possible option

A fields values must include every possible option

Mutually ExclusiveNo overlap exists between them

And the fields options must be mutually exclusive

How many social apps have both of those bare minimum criteria covered

cczona

Weve seen this one enough At least that set of values seems covered fully

cczona

`

But its not

cczona

Still lacking

cczona

Trying hard But still inadequate to express the real worlds complexity

We do realize that the real world has great variety Of course we do But its hard to know how to handle it

cczona

What user experience does this schema drive us toward

Its easy to figure Eh we can refactor later But initial choices set differing user experiences in motion Thinking at the outset about the real worlds variety and complexity starts us asking early questions that set foundation for social app-building Ask What user experience does this schema drive us toward

cczona

Data doesnrsquot have to be for analysis

Its easy to get into the habit of structuring data for easy analysis But we can choose to look at human data from a different perspective Step back Wallow in the users perspective

cczona

Data can be sheerexpressiveness

Data can be sheer expressiveness Data that has character individualism distinctiveness

cczona

What we want What we get

bull Structured

bull Predictablebull Validations exceptions

bull Conditionals partials

bull Relational

bull Indexed bull Premature optimization

bull Exhaustivebull Cultural variability

bull Individual POV

bull Moving target

bull Easy analytics

bull Data-driven decisionsbull Decisions based on false

premises

As developers we have a vision of what a good codebase should be and not be

[READ]

In the most sensible of ways we often arrive at solutions that are factually TRUTHY while far removed from real life utility

cczona

What is your religion if any

ARIS is the largest ongoing survey of Americans religious identification It asks this simple OPEN-ENDED question [slide]

Which nets over 100 unique answers Which if youre making a form based on that is tricky Theres no form element that makes it easy for our users to pick themselves out of a list of so many possibilities

cczona

ARIS found though that these could be compressed into 13 major categories More manageable list right We could use that for a form But eh a lot of those are edge cases Wed rather want to focus on genuinely major groupings

cczona

What is your religion if any

Christian 76

Other 4

None 15

Dont Know or Refused 5

Which brings us down to this At least a quarter of Americans are Christian Done Every other religion would just be clutter Edge cases

And then theres this sort of crummy data with it Wed probably assign nil value for more than one of these categories right

cczona

What is your religion

Christian 76

Other 4

na 20

Fixed Which focuses attention on a problem here 1 in 5 are not useful answers from an advertisers perspective or for out own analytics The NILS -- gotta go

cczona

What is your religion

Christian 76

Other 24

So what were left with is a good clear list It covers ALL the big stuff When you get reductive enough for Americans religion it a binary Which from a storage standpoint is great Booleans Score

cczona

Religion

But we dont do THIS It covers the biggest categories But oh how it leaves people out

cczona

People arent edge cases

People arent edge cases

And theyre pushing back on apps that treat them that way

[LONG BEAT LONG] [BREATH] [CHANGE GEARS]

So Reductive has big problems So does scaling upward

cczona

Balancing between approaches

As engineers its instinctually uncomfortable to DELIBERATELY NOT STRUCTURE DATA for easy analysis I feel ya I really do This freaks me too But again the foundational question is What benefit will the user notice What identities are okay with us

cczona

And if necessary we CAN strike a middleground This is where guided response comes in Autosuggest

Checkbox / Radio / Select

Textarea / Text

Required / Optional

Autosuggest

Minimal Suggest

When there's a subset of values that you're most interested in, do minimal suggest instead. Autosuggest using just the handful of values you care about.

Structured data from those who want to give it. Free form to incite expressiveness in those who want THAT.
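A minimal sketch of what "minimal suggest" could look like on the server side, assuming a plain text field whose suggestions come from a short list. The list contents, `minimal_suggest`, and `stored_value` are hypothetical names chosen for illustration; the talk does not prescribe an implementation.

```ruby
# Hypothetical handful of values the app cares most about.
SUGGESTED_VALUES = ["Single", "In a relationship", "Married"].freeze

# Offer only the suggestions that start with what the user has typed so far.
def minimal_suggest(typed, suggestions = SUGGESTED_VALUES)
  prefix = typed.to_s.strip.downcase
  return [] if prefix.empty?
  suggestions.select { |value| value.downcase.start_with?(prefix) }
end

# Whatever the user finally submits is stored as written.
# A blank answer becomes nil rather than an error.
def stored_value(submitted)
  text = submitted.to_s.strip
  text.empty? ? nil : text
end

puts minimal_suggest("mar").inspect
puts stored_value("Best friends forever").inspect
```

The suggestions nudge toward values that are easy to analyze, but nothing is rejected: a user who types something else still gets to say it.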

Unguided Text

Of those who use MetaFilter's gender field,

40% of responses are "f", "m", "female", "male"

[read] So structured data IS there. This can be a balanced solution in many cases where you're willing to tolerate some ambiguity. Of course there are tradeoffs. Data quantity is lower: freed to opt out of providing personal info, many do. On the other hand, data quality should improve.
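One way to recover the structured portion of an unguided text field is to normalize the spellings you recognize and leave everything else exactly as written. The sketch below assumes a mapping for the four spellings named on the slide; the `CANONICAL` table, `bucket_responses`, and the sample answers are illustrative only.

```ruby
# Assumed mapping from recognized free-text spellings to a canonical label.
CANONICAL = {
  "f" => "female", "female" => "female",
  "m" => "male",   "male"   => "male"
}.freeze

# Split raw responses into counts we can structure and answers kept as written.
def bucket_responses(responses)
  structured = Hash.new(0)
  free_form  = []

  responses.each do |raw|
    next if raw.nil? || raw.strip.empty? # opting out is allowed

    label = CANONICAL[raw.strip.downcase]
    if label
      structured[label] += 1
    else
      free_form << raw.strip
    end
  end

  { structured: structured, free_form: free_form }
end

result = bucket_responses(["F", "male", nil, "m", "genderqueer", " female "])
puts result.inspect
```

The free-form answers are not errors to be cleaned away. They are the part of the data that a select box would never have collected.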

Optional Select

60% of Facebook users select a relationship status

It's fine to mix and match here. Find the right approach for your users and your app's business objectives. Facebook, for instance, makes relationship status completely optional, but coercive for those who do opt in to setting a value. Most users do opt in: 60% of them select a relationship status.

We want everyone to feel excited about what we've built

The bottom line is that [read]

We want users to feel passionate about their involvement

[read]

Analytics, investments, & monetization are based on a premise that data is accurate

[read]

BUT if the data has been collected by coercive approaches, the RISK is that--

this premise is garbage. Some people are lying, because lying has been made a requirement for getting past the barriers. So conclusions drawn from that bad data can misdirect decision-making about the next stage of development.

Checkbox / Radio / Select

Textarea / Text

Required

Corrective

Discretionary

Guided

The restrictive options, that stuff at bottom left: those don't actually have to be marked required.

ADD column gender NOT NULL VARCHAR(6) DEFAULT 'female'

But the way we set up schema often embeds assumptions that we should, and we will. So we do. An attribute that is ambiguously named is destined to become a form field whose data wanders away from its original intent, and code that misunderstands how to use it. A field that's not allowed to be null is destined to be mandatory. A field that is assigned a maximum length is an ASSERTION that all possible values are knowable and fit within it.

ADD column gender_identity NOT NULL VARCHAR

BOOM. This is the foundation for a whole different user experience. And the cool ninja move was that we decided to DO LESS. Make this stuff flexible upfront. Optimize storage later. Decide what's valid later.

Checkbox / Radio / Select

Textarea / Text

Required / Optional

Alteration

Guided

So there you have it: a discretionary field. Whether to respond is left up to the user.

t.string :relationship_status, :null => true

As developers, we may look upon THIS expression as redundant, completely unnecessary. Duh, null is true by default. But making that explicit is a communication to the team and to your future self. It's a statement of intent. It's documenting a product decision.

Facebook

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

What would a canonical set of relationship statuses look like? Three years ago, Facebook figured this list was pretty good. Arguably pretty progressive, too, right? Users disagreed. Strongly.

Facebook

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

Under pressure, Facebook nearly doubled the options in just two years' time.

Google+

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I don't want to say

While Google+ has largely adopted that list, it has not included Separated or Divorced. Notice that they also added something choice: Opt out of labeling.

Allowing users to identify their relationships with labels of greater personal significance

That's being driven by people rejecting a user experience that isn't working for them.

It's a scope

How did some statuses seem universal while others weren't? Naming a thing creates scope. The assumed validity of a field's values gets constrained as soon as the field is named.

marital_status

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I don't want to say

Marital status, for instance, might lead to a list similar to this one. A person is assumed to be either unmarried, preparing to be married, currently married, or formerly married.

legal_marital_status

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I don't want to say

Marital status, for instance, might lead to a list similar to this one. A person is assumed to be either unmarried, preparing to be married, currently married, or formerly married.

relationship_status

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I don't want to say

Whereas relationship status might lead to a list more like this one. In which one either has a current relationship -- or is defined by the absence of any.

singleness_status

As soon as you name a field, you define its paradigm and possibilities. There are important differences in what each of these envisions and is able to measure. Naming fields -- with great specificity, upfront -- makes analyses more powerful later.

singleness_rating

No chance whatsoever

Suuuuuper duper single

1000

If you change the name, you shift the paradigm and possibilities. There's an important difference in what these set out to collect and measure. Naming fields -- with great specificity, upfront -- makes analyses more powerful later.

It's Complicated

In a Relationship

Married

Divorced

Widowed

Single

We go through life experiencing many relationships. They don't all have pre-determined labels. And new relationship identities don't necessarily leave old ones behind.

"I like to be truthful, and 'It's Complicated' is really deceiving. It is not complicated. I am separated from my husband, who I am still legally married to."

-- Facebook user

[READ] Why should anyone have to feel like they're required to disavow a relationship that's meaningful for them?

It cuts deeply sometimes. Is a widow only "Widowed" until she resumes dating? Why does she have to be confronted with a decision like that just to use someone's app?

Okay, this one. Spot the fatal flaw. [WAIT]

buffer overflow

An open relationship is definitionally a 1:Many join -- WITH the usual engineering understanding, that is, that the actual number of relationships may be 0, 1, OR many.
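To make the 1:Many point concrete, here is a minimal plain-Ruby sketch in which a person holds a list of relationships rather than a single status value. The `Person` and `Relationship` names, the free-text `label`, and the sample people are all assumptions made for illustration; in a database this would typically be a separate table joined to the users table.

```ruby
# One relationship a person chooses to list. The label is free text.
Relationship = Struct.new(:label, :with)

class Person
  attr_reader :name, :relationships

  def initialize(name)
    @name = name
    @relationships = [] # zero is a perfectly valid count
  end

  # Adding a relationship never displaces an existing one.
  def add_relationship(label, with: nil)
    @relationships << Relationship.new(label, with)
    self
  end

  def relationship_labels
    @relationships.map(&:label)
  end
end

sam = Person.new("Sam")
sam.add_relationship("Married", with: "Alex")
   .add_relationship("In a relationship", with: "Jo")

puts sam.relationship_labels.inspect
```

With a single status column, the second call would have to overwrite the first. With a list, nobody has to pick which relationship counts.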

Choose (one):

evasive

inauthentic

But it's not funny from the personal perspective. This schema forces the person to either LOOK EVASIVE or else BE INAUTHENTIC. This is what happens when we try to throw more labels at the problem, instead of re-examining the schema's assumptions.

Facebook relationship status by user age

srsly?

When we don't give people the means to express what's real and important for them, they hack around us. Are a quarter of Facebook's 13-year-olds really married? Probably not. What they probably have is a BEST FRIEND, and "married" is the nearest equivalent available for them to express "THIS PERSON is my most important relationship."

The dogged pursuit of nice, clean, CRUNCHABLE data takes us to such wrong places. And makes our users want to work against a system that doesn't acknowledge their lives' reality.

We build or break community with each line of code

As developers, our lines of code build -- or break -- an app's community. So it's worth making conscious choices as we go.

Modeling the real world is complex

[READ] And that's okay.

Assuming we know who users are surrenders opportunity to learn who they are

But [read]. Early constraints in schema NET crappy, misleading data. [beat] So keep constraints out of human and social schema, at least at first. Gather enough initial response to do some data mining. Watch that data for a while. Keep an eye out for emergent trends. What DISCOVERIES can you make? How can they shape the app's growth and evolution?
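Watching unconstrained data for emergent trends can start with something as simple as a frequency count. The following is a rough sketch, assuming answers arrive as an array of free-text strings; `emergent_values`, its `minimum` threshold, and the sample answers are hypothetical.

```ruby
# Count distinct free-text answers (ignoring case, padding, and blanks)
# and return those seen at least `minimum` times, most frequent first.
def emergent_values(responses, minimum: 2)
  counts = responses.each_with_object(Hash.new(0)) do |raw, tally|
    next if raw.nil?

    value = raw.strip.downcase
    tally[value] += 1 unless value.empty?
  end

  counts.select { |_, n| n >= minimum }
        .sort_by { |value, n| [-n, value] }
end

answers = ["Partnered", "partnered", "Single", "single ", "single", "Poly", nil, ""]
emergent_values(answers).each do |value, n|
  puts "#{value}: #{n}"
end
```

Values that keep showing up are candidates for suggestions or first-class options later. The point is that the list comes from users rather than from an early guess.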

Quality

Data quality improves when lies are merely optional, not required.

Specificity

Data becomes like the real world itself: rich and specific. So we can unearth the patterns that are undetectable when data is generic.

Adaptability

We get to make discoveries and re-envision possibilities.

We adapt quickly.

Loyalty

And their response to all this? Engagement. Passion. LOYALTY.

THOSE are foundations for great user experiences. [LOOK UP FINALITY]

@cczona

cczona@gmail.com

http://cczona.com

Questions? Conversation?

Oh Hell Yeah

Carina C. Zona

Resources

Data Science & Information Architecture

• Sociological normalization

• Database normalization

• Using Machine Learning On Social Networks To Figure Out What You Should Read On The Web

• NoSQL Data Modeling Techniques

• Data & Reality, by William Kent

• Bad Data Handbook, Q. Ethan McCallum

• Data Science of the Facebook World

Relationships

• Does Facebook Hurt Relationships?

• Facebook Adds LGBT-Friendly Relationship Status Options

• Facebook Targeting by Relationship Status & Workplace

• Your Facebook Relationship Status: It's Complicated

• Gay Marriage: The Database Engineering Perspective

Sex & Genders

• "Disalienation: Why Gender is a Text Field on Diaspora"

• "Gender & Drop Down Menus"

• "Sex & Gender"

• "Bucket Gender"

• Recommendations for Inclusive Data Collection of Trans People

Names

• Falsehoods Programmers Believe About Names

• Your Last Name Contains Invalid Characters

• W3C Internationalization: Personal Names Around the World

• Spanish Names

• Chinese Names

More

• Redesigning the Country Selector

• American Religious Identification Survey Summary Report 2009

• Linguistic Potluck: Crowdsourcing Internationalization in Rails

Image Credits

Postsecret

Facebook

OKCupid

Google+

Metafilter

Diaspora

Flickr

FetLife

Kotangle

cutestpaw.com

hdwallpapers.in

xkcd

Many Thanks

Chiu-Ki Chan

Estelle Weyl

Heather Rivers

Heroku

Jeremy Dunck

Josh Susser

Michele Titolo

NIRD LLC

Renée De Voursney

Sarah Mei

San Francisco Sex Information

Yoz Grahame

Get in Touch

Carina C. Zona

@cczona

http://cczona.com

cczona@gmail.com

http://slideshare.net/cczona

http://linkedin.com/in/cczona

Which pronouns do you prefer?

he/him/himself/his/his

she/her/herself/hers/her

they/them/themself/theirs/their

personal name

other ____________

Our English teachers told us "they" and "their" aren't singular, right? Nope. They're plural AND SINGULAR, and have been for 400 years. Take the word of Jane Austen, Shakespeare, Chaucer, Lewis Carroll. Modern English authorities agree on this one, too. It's excellent English -- and excellent SOCIAL -- to use it when people ask us to.
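The pronoun options on this slide map naturally onto a small data structure. Below is a sketch, assuming each set is stored in the slide's five-part order; `PronounSet`, `pronouns_for`, `name_as_pronouns`, the profile sentence, and the custom set in the usage lines are hypothetical illustrations rather than anything shown in the talk.

```ruby
# Five forms, in the order the slide lists them.
PronounSet = Struct.new(:subject, :object, :reflexive, :possessive, :determiner)

PRONOUN_SETS = {
  "he"   => PronounSet.new("he", "him", "himself", "his", "his"),
  "she"  => PronounSet.new("she", "her", "herself", "hers", "her"),
  "they" => PronounSet.new("they", "them", "themself", "theirs", "their")
}.freeze

# "personal name" option: use the person's name in every position (an assumed rendering).
def name_as_pronouns(name)
  PronounSet.new(name, name, name, "#{name}'s", "#{name}'s")
end

# Look up a listed set, or accept a user-supplied one for "other ____".
def pronouns_for(choice, custom: nil, name: nil)
  return custom if choice == "other" && custom
  return name_as_pronouns(name) if choice == "personal name" && name

  PRONOUN_SETS[choice]
end

def profile_update_sentence(display_name, pronouns)
  "#{display_name} updated #{pronouns.determiner} profile."
end

puts profile_update_sentence("Sam", pronouns_for("they"))
custom = PronounSet.new("ze", "zir", "zirself", "zirs", "zir")
puts profile_update_sentence("Kit", pronouns_for("other", custom: custom))
```

Because the "other" branch takes a whole set from the user, the list of built-in sets never has to be exhaustive.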

Even Facebook has taken note that it's okay. They've been using this construction for at least 5 years, and are doing fine :-)

"Home is the place where, when you have to go there, they have to take you in."

-- Robert Frost

We aspire to make our social apps feel like home. We can do that by making development choices that say, "Hey user, we GET you. Be as individual and messy as you want to. We can handle _you_ being _you_."

Social Research

When collecting data on people, we're in a different realm: social sciences. If you want to run useful analytics about personal attributes and behavior, then data collection needs to meet at least the two minimum criteria for social science research.

Exhaustive

Every possible option

A field's values must include every possible option.

Mutually Exclusive

No overlap exists between them

And the field's options must be mutually exclusive.

How many social apps have both of those bare minimum criteria covered?
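Both criteria can be checked mechanically once you write down which real-world situations each option is meant to cover. The sketch below does that for a deliberately tiny, made-up example; the options, the situations, and the three helper methods are assumptions for illustration, not data from the talk.

```ruby
# Every situation is covered by at least one option.
def exhaustive?(options, situations)
  situations.all? { |s| options.values.any? { |covered| covered.include?(s) } }
end

# No situation is covered by more than one option.
def mutually_exclusive?(options, situations)
  situations.none? { |s| options.values.count { |covered| covered.include?(s) } > 1 }
end

# The situations that no option covers at all.
def uncovered(options, situations)
  situations.reject { |s| options.values.any? { |covered| covered.include?(s) } }
end

# Hypothetical options, each with the situations it is assumed to cover.
OPTIONS = {
  "Married"          => ["married, living together", "married, separated"],
  "It's complicated" => ["married, separated", "undefined relationship"]
}.freeze

# Hypothetical situations real users might be in.
SITUATIONS = [
  "married, living together",
  "married, separated",
  "widowed, dating again"
].freeze

puts "exhaustive? #{exhaustive?(OPTIONS, SITUATIONS)}"
puts "mutually exclusive? #{mutually_exclusive?(OPTIONS, SITUATIONS)}"
puts "uncovered: #{uncovered(OPTIONS, SITUATIONS).inspect}"
```

In this example the option set fails both tests: one situation fits two labels, and another fits none. Adding more labels can fix the second problem while making the first one worse.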

We've seen this one enough. At least that set of values seems covered fully.

But it's not.

Still lacking.

Trying hard. But still inadequate to express the real world's complexity.

We do realize that the real world has great variety. Of course we do. But it's hard to know how to handle it.

What user experience does this schema drive us toward?

It's easy to figure, "Eh, we can refactor later." But initial choices set differing user experiences in motion. Thinking at the outset about the real world's variety and complexity starts us asking early questions that set the foundation for social app-building. Ask: What user experience does this schema drive us toward?

Data doesn't have to be for analysis

It's easy to get into the habit of structuring data for easy analysis. But we can choose to look at human data from a different perspective. Step back. Wallow in the user's perspective.

Data can be sheer expressiveness

Data can be sheer expressiveness. Data that has character, individualism, distinctiveness.

What we want

• Structured

• Predictable

• Relational

• Indexed

• Exhaustive

• Easy analytics

• Data-driven decisions

What we get

• Validations, exceptions

• Conditionals, partials

• Premature optimization

• Cultural variability

• Individual POV

• Moving target

• Decisions based on false premises

As developers, we have a vision of what a good codebase should be and not be.

[READ]

In the most sensible of ways, we often arrive at solutions that are factually TRUTHY while far removed from real-life utility.

What is your religion, if any?

ARIS is the largest ongoing survey of Americans' religious identification. It asks this simple, OPEN-ENDED question: [slide]


Page 51: Schemas for the Real World [Madison RubyConf 2013]

cczona

Even Facebook has taken note that its okay Theyve been using this construction for at least 5 years and are doing fine -)

cczona

Home is the place where when you have to go there they have to take you in

mdashRobert Frost

We aspire to make our social apps feel like home We can do that by making development choices that say Hey user we GET you Be as individual and messy as you want to We can handle _you_ being _you_

Social Research

When collecting data on people were in a different realm Social sciences If you want to run useful analytics about personal attributes and behavior then data collection needs to meet at least the two minimum criteria for social science research

ExhaustiveEvery possible option

A fields values must include every possible option

Mutually ExclusiveNo overlap exists between them

And the fields options must be mutually exclusive

How many social apps have both of those bare minimum criteria covered

cczona

Weve seen this one enough At least that set of values seems covered fully

cczona

`

But its not

cczona

Still lacking

cczona

Trying hard But still inadequate to express the real worlds complexity

We do realize that the real world has great variety Of course we do But its hard to know how to handle it

cczona

What user experience does this schema drive us toward

Its easy to figure Eh we can refactor later But initial choices set differing user experiences in motion Thinking at the outset about the real worlds variety and complexity starts us asking early questions that set foundation for social app-building Ask What user experience does this schema drive us toward

cczona

Data doesnrsquot have to be for analysis

Its easy to get into the habit of structuring data for easy analysis But we can choose to look at human data from a different perspective Step back Wallow in the users perspective

cczona

Data can be sheerexpressiveness

Data can be sheer expressiveness Data that has character individualism distinctiveness

cczona

What we want What we get

bull Structured

bull Predictablebull Validations exceptions

bull Conditionals partials

bull Relational

bull Indexed bull Premature optimization

bull Exhaustivebull Cultural variability

bull Individual POV

bull Moving target

bull Easy analytics

bull Data-driven decisionsbull Decisions based on false

premises

As developers we have a vision of what a good codebase should be and not be

[READ]

In the most sensible of ways we often arrive at solutions that are factually TRUTHY while far removed from real life utility

cczona

What is your religion if any

ARIS is the largest ongoing survey of Americans religious identification It asks this simple OPEN-ENDED question [slide]

Which nets over 100 unique answers Which if youre making a form based on that is tricky Theres no form element that makes it easy for our users to pick themselves out of a list of so many possibilities

cczona

ARIS found though that these could be compressed into 13 major categories More manageable list right We could use that for a form But eh a lot of those are edge cases Wed rather want to focus on genuinely major groupings

cczona

What is your religion if any

Christian 76

Other 4

None 15

Dont Know or Refused 5

Which brings us down to this At least a quarter of Americans are Christian Done Every other religion would just be clutter Edge cases

And then theres this sort of crummy data with it Wed probably assign nil value for more than one of these categories right

cczona

What is your religion

Christian 76

Other 4

na 20

Fixed Which focuses attention on a problem here 1 in 5 are not useful answers from an advertisers perspective or for out own analytics The NILS -- gotta go

cczona

What is your religion

Christian 76

Other 24

So what were left with is a good clear list It covers ALL the big stuff When you get reductive enough for Americans religion it a binary Which from a storage standpoint is great Booleans Score

cczona

Religion

But we dont do THIS It covers the biggest categories But oh how it leaves people out

cczona

People arent edge cases

People arent edge cases

And theyre pushing back on apps that treat them that way

[LONG BEAT LONG] [BREATH] [CHANGE GEARS]

So Reductive has big problems So does scaling upward

cczona

Balancing between approaches

As engineers its instinctually uncomfortable to DELIBERATELY NOT STRUCTURE DATA for easy analysis I feel ya I really do This freaks me too But again the foundational question is What benefit will the user notice What identities are okay with us

cczona

And if necessary we CAN strike a middleground This is where guided response comes in Autosuggest

cczona

CheckboxRadioSelect

TextareaText

Required OptionalAutosuggest

Minimal Suggest

When theres a subset of values that youre most interested in do minimal suggest instead Autosuggest using just the handful of values you care about

Structured data from those who want to give it Free form to incite expressiveness in those who want THAT

Unguided TextOf those who use MetaFilters gender field

40 of responses are f m female male

[read] So structured data IS there This can be a balanced solution in many cases where youre willing to tolerate some ambiguity Of course there are tradeoffsData quantity is lower Freed to opt out of proving personal info many doOn the other hand data quality should improve

Optional Select

60% of Facebook users select a relationship status

It's fine to mix and match here. Find the right approach for your users and your app's business objectives. Facebook, for instance, makes relationship status completely optional, but coercive for those who do opt in to setting a value. Most users do opt in: 60% of them select a relationship status.

We want everyone to feel excited about what we've built

The bottom line is that [read]

We want users to feel passionate about their involvement

[read]

Analytics investments & monetization are based on a premise that data is accurate

[read]

BUT when the data has been collected by coercive approaches, the RISK is that--

this premise is garbage. Some people are lying because lying has been made a requirement for getting past the barriers. So conclusions drawn from that bad data can misdirect decision-making about the next stage of development.

Checkbox / Radio / Select

Textarea / Text

Required

Corrective

Discretionary / Guided

The restrictive options, that stuff at bottom left: those don't actually have to be marked required.

ADD COLUMN gender VARCHAR(6) NOT NULL DEFAULT 'female'

But the way we set up schema often embeds assumptions that we should, and that we will. So we do. An attribute that is ambiguously named is destined to become a form field whose data wanders away from its original intent, and code that misunderstands how to use it. A field that's not allowed to be null is destined to be mandatory. A field that is assigned a maximum length is an ASSERTION that all possible values are knowable and fit within it.

ADD column gender_identity NOT NULL VARCHAR

BOOM. This is the foundation for a whole different user experience. And the cool ninja move was that we decided to DO LESS. Make this stuff flexible upfront. Optimize storage later. Decide what's valid later.
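To see what each column definition asserts, the two can be modeled as plain Ruby predicates. This is a sketch under assumptions: the strict rule mirrors a mandatory six-character column, the flexible rule assumes an unlimited text column that also allows NULL (as the "discretionary field" idea in this talk suggests), and the sample answers are invented.

```ruby
# Strict rules mirroring a VARCHAR(6) NOT NULL column: a value is mandatory
# and must fit in six characters.
def fits_strict_column?(value)
  !value.nil? && value.length <= 6
end

# Flexible rules: any text is fine, and so is no answer at all.
def fits_flexible_column?(value)
  value.nil? || value.is_a?(String)
end

# Invented sample answers, including a user who declines to answer (nil).
answers = ["female", "male", "genderqueer", "trans woman", nil]

rejected_by_strict = answers.reject { |a| fits_strict_column?(a) }
rejected_by_flexible = answers.reject { |a| fits_flexible_column?(a) }

puts "Strict schema turns away: #{rejected_by_strict.inspect}"
puts "Flexible schema turns away: #{rejected_by_flexible.inspect}"
```

Every value the strict schema turns away is a person who must either lie or leave.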

Checkbox / Radio / Select

Textarea / Text

Required / Optional

Alteration

Guided

So there you have it: a discretionary field. Whether to respond is left up to the user.

t.string :relationship_status, :null => true

As developers, we may look upon THIS expression as redundant, completely unnecessary. Duh, null is true by default. But making that explicit is a communication to the team, and to your future self. It's a statement of intent. It's documenting a product decision.

Facebook

Single
In a relationship
Engaged
Married
It's complicated
Open relationship

What would a canonical set of relationship statuses look like? Three years ago, Facebook figured this list was pretty good. Arguably pretty progressive too, right? Users disagreed. Strongly.

Facebook

Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Separated
Divorced
Civil union
Domestic partnership

Under pressure, Facebook nearly doubled the options in just two years' time.

Google+

Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Separated
Divorced
Civil union
Domestic partnership
I don't want to say

While Google+ has largely adopted that list, it has not included Separated or Divorced. Notice that they also added something: choice. Opt out of labeling.

Allowing users to identify their relationships with labels of greater personal significance: that's being driven by people rejecting a user experience that isn't working for them.

It's a scope

How did some statuses seem universal while others weren't? Naming a thing creates scope. The assumed validity of a field's values gets constrained as soon as the field is named.

marital_status

Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Separated
Divorced
Civil union
Domestic partnership
I don't want to say

Marital status, for instance, might lead to a list similar to this one. A person is assumed to be either unmarried, preparing to be married, currently married, or formerly married.

legal_marital_status

Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Separated
Divorced
Civil union
Domestic partnership
I don't want to say

Marital status, for instance, might lead to a list similar to this one. A person is assumed to be either unmarried, preparing to be married, currently married, or formerly married.

relationship_status

Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Separated
Divorced
Civil union
Domestic partnership
I don't want to say

Whereas relationship status might lead to a list more like this one. In which one either has a current relationship -- or is defined by the absence of any.

singleness_status

As soon as you name a field, you define its paradigm and possibilities. There are important differences in what each of these envisions and is able to measure. Naming fields -- with great specificity, upfront -- makes analyses more powerful later.

singleness_rating

No chance whatsoever Suuuuuper duper single

1000

If you change the name, you shift the paradigm and possibilities. There's an important difference in what these set out to collect and measure. Naming fields -- with great specificity, upfront -- makes analyses more powerful later.

It's Complicated / In a Relationship

Married / Divorced

Widowed / Single

We go through life experiencing many relationships. They don't all have pre-determined labels. And new relationship identities don't necessarily leave old ones behind.

"I like to be truthful, and 'It's Complicated' is really deceiving. It is not complicated. I am separated from my husband, who I am still legally married to."

-- Facebook user

[READ] Why should anyone have to feel like they're required to disavow a relationship that's meaningful for them?

It cuts deeply sometimes. Is a widow only "Widowed" until she resumes dating? Why does she have to be confronted with a decision like that just to use someone's app?

Okay, this one. Spot the fatal flaw. [WAIT]

buffer overflow

An open relationship is definitionally a 1:Many join -- WITH the usual engineering understanding, that is, that the actual number of relationships may be 0, 1, OR many.

Choose (one): evasive, inauthentic

But it's not funny from the personal perspective. This schema forces the person to either LOOK EVASIVE or else BE INAUTHENTIC. This is what happens when we try to throw more labels at the problem instead of re-examining the schema's assumptions.
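Re-examining the assumption might look something like this: instead of one status slot, a person holds a collection of relationships. This is a minimal sketch, not anyone's production model; `Relationship`, `Profile`, and the sample names are hypothetical.

```ruby
# Hypothetical model: a person holds zero, one, or many relationships,
# each carrying a label written in the user's own words.
Relationship = Struct.new(:label, :with)

class Profile
  attr_reader :relationships

  def initialize
    @relationships = [] # empty is a valid state, not missing data
  end

  # Record one relationship; naming the other person is optional.
  def add_relationship(label, with: nil)
    @relationships << Relationship.new(label, with)
    self
  end

  def labels
    @relationships.map(&:label)
  end
end

profile = Profile.new
profile.add_relationship("married", with: "Sam")
profile.add_relationship("best friend", with: "Alex")
profile.add_relationship("widowed")
```

With a collection, a new relationship does not overwrite an old one, and nobody has to pick which single label gets to be true.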

Facebook relationship status by user age

srsly?

When we don't give people means to express what's real and important for them, they hack around us. Are a quarter of Facebook's 13-year-olds really married? Probably not. What they probably have is a BEST FRIEND, and "married" is the nearest equivalent available for them to express: THIS PERSON is my most important relationship.

The dogged pursuit of nice, clean, CRUNCHABLE data takes us to such wrong places. And makes our users want to work against a system that doesn't acknowledge their lives' reality.

We build or break community with each line of code

As developers, our lines of code build -- or break -- an app's community. So it's worth making conscious choices as we go.

Modeling the real world is complex

[READ] And that's okay.

Assuming we know who users are surrenders opportunity to learn who they are

But [read]. Early constraints in schema NET crappy, misleading data. [beat] So keep constraints out of human and social schema, at least at first. Gather enough initial response to do some data mining. Watch that data for a while. Keep an eye out for emergent trends. What DISCOVERIES can you make? How can they shape the app's growth and evolution?

Quality

Data quality improves when lies are merely optional, not required.

Specificity

Data becomes like the real world itself: rich and specific. So we can unearth the patterns that are undetectable when data is generic.

Adaptability

We get to make discoveries and re-envision possibilities.

We adapt quickly.

Loyalty

And their response to all this? Engagement. Passion. LOYALTY.

THOSE are foundations for great user experiences. [LOOK UP FINALITY]

@cczona   cczona@gmail.com   http://cczona.com

Questions? Conversation?

Oh Hell Yeah

Carina C. Zona

@cczona

Resources

Data Science & Information Architecture

• Sociological normalization
• Database normalization
• Using Machine Learning On Social Networks To Figure Out What You Should Read On The Web
• NoSQL Data Modeling Techniques
• Data & Reality, by William Kent
• Bad Data Handbook, by Q. Ethan McCallum
• Data Science of the Facebook World

Relationships

• Does Facebook Hurt Relationships?
• Facebook Adds LGBT-Friendly Relationship Status Options
• Facebook Targeting by Relationship Status & Workplace
• Your Facebook Relationship Status: It's Complicated
• Gay Marriage: The Database Engineering Perspective

Sex & Genders

• "Disalienation: Why Gender is a Text Field on Diaspora"
• "Gender & Drop Down Menus"
• "Sex & Gender"
• "Bucket Gender"
• Recommendations for Inclusive Data Collection of Trans People

Names

• Falsehoods Programmers Believe About Names
• Your Last Name Contains Invalid Characters
• W3C Internationalization: Personal Names Around the World
• Spanish Names
• Chinese Names

More

• Redesigning the Country Selector
• American Religious Identification Survey Summary Report 2009
• Linguistic Potluck: Crowdsourcing Internationalization in Rails

Image Credits

Postsecret
Facebook
OKCupid
Google+
Metafilter
Diaspora
Flickr
FetLife
Kotangle
cutestpaw.com
hdwallpapers.in
xkcd

Many Thanks

Chiu-Ki Chan
Estelle Weyl
Heather Rivers
Heroku
Jeremy Dunck
Josh Susser
Michele Titolo
NIRD LLC
Renée De Voursney
Sarah Mei
San Francisco Sex Information
Yoz Grahame

Get in Touch

Carina C. Zona

@cczona

http://cczona.com

cczona@gmail.com

http://slideshare.net/cczona

http://linkedin.com/in/cczona


"Home is the place where, when you have to go there, they have to take you in."

-- Robert Frost

We aspire to make our social apps feel like home. We can do that by making development choices that say: "Hey user, we GET you. Be as individual and messy as you want to. We can handle _you_ being _you_."

Social Research

When collecting data on people, we're in a different realm: social sciences. If you want to run useful analytics about personal attributes and behavior, then data collection needs to meet at least the two minimum criteria for social science research.

Exhaustive: every possible option

A field's values must include every possible option.

Mutually Exclusive: no overlap exists between them

And the field's options must be mutually exclusive.

How many social apps have both of those bare minimum criteria covered?
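Both criteria can be checked mechanically against real answers. The sketch below models each option as a predicate over a person's situation; that design, the option names, and the sample people are all assumptions made up for illustration.

```ruby
# Exhaustive: every answer matches at least one option.
def exhaustive?(options, answers)
  answers.all? { |a| options.values.any? { |matches| matches.call(a) } }
end

# Mutually exclusive: no answer matches more than one option.
def mutually_exclusive?(options, answers)
  answers.all? { |a| options.values.count { |matches| matches.call(a) } <= 1 }
end

# Invented option set: each option is a predicate over a person's situation.
options = {
  "Single"            => ->(person) { person[:partners].zero? },
  "In a relationship" => ->(person) { person[:partners] >= 1 },
  "Married"           => ->(person) { person[:married] }
}

# Invented sample people.
answers = [
  { partners: 0, married: false },
  { partners: 1, married: true }, # matches two options at once
  { partners: 2, married: false }
]

puts "Exhaustive? #{exhaustive?(options, answers)}"
puts "Mutually exclusive? #{mutually_exclusive?(options, answers)}"
```

Even this tiny option set covers everyone in the sample yet fails mutual exclusivity, because a married person is also in a relationship.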

We've seen this one enough. At least that set of values seems covered fully.

But it's not.

Still lacking.

Trying hard. But still inadequate to express the real world's complexity.

We do realize that the real world has great variety. Of course we do. But it's hard to know how to handle it.

What user experience does this schema drive us toward?

It's easy to figure, "Eh, we can refactor later." But initial choices set differing user experiences in motion. Thinking at the outset about the real world's variety and complexity starts us asking early questions that set the foundation for social app-building. Ask: What user experience does this schema drive us toward?

Data doesn't have to be for analysis

It's easy to get into the habit of structuring data for easy analysis. But we can choose to look at human data from a different perspective. Step back. Wallow in the user's perspective.

Data can be sheer expressiveness

Data can be sheer expressiveness. Data that has character, individualism, distinctiveness.

What we want

• Structured
• Predictable
• Relational
• Indexed
• Exhaustive
• Easy analytics
• Data-driven decisions

What we get

• Validations, exceptions
• Conditionals, partials
• Premature optimization
• Cultural variability
• Individual POV
• Moving target
• Decisions based on false premises

As developers, we have a vision of what a good codebase should be, and not be.

[READ]

In the most sensible of ways, we often arrive at solutions that are factually TRUTHY while far removed from real-life utility.

What is your religion, if any?

ARIS is the largest ongoing survey of Americans' religious identification. It asks this simple, OPEN-ENDED question: [slide]

Which nets over 100 unique answers. Which, if you're making a form based on that, is tricky. There's no form element that makes it easy for our users to pick themselves out of a list of so many possibilities.

ARIS found, though, that these could be compressed into 13 major categories. More manageable list, right? We could use that for a form. But eh, a lot of those are edge cases. We'd rather focus on genuinely major groupings.

What is your religion, if any?

Christian 76%

Other 4%

None 15%

Don't Know or Refused 5%

Which brings us down to this. At least three quarters of Americans are Christian. Done. Every other religion would just be clutter. Edge cases.

And then there's this sort of crummy data with it. We'd probably assign a nil value for more than one of these categories, right?

What is your religion?

Christian 76%

Other 4%

n/a 20%

Fixed. Which focuses attention on a problem here: 1 in 5 are not useful answers from an advertiser's perspective, or for our own analytics. The NILS -- gotta go.


Page 53: Schemas for the Real World [Madison RubyConf 2013]

Social Research

When collecting data on people were in a different realm Social sciences If you want to run useful analytics about personal attributes and behavior then data collection needs to meet at least the two minimum criteria for social science research

ExhaustiveEvery possible option

A fields values must include every possible option

Mutually ExclusiveNo overlap exists between them

And the fields options must be mutually exclusive

How many social apps have both of those bare minimum criteria covered

cczona

Weve seen this one enough At least that set of values seems covered fully

cczona

`

But its not

cczona

Still lacking

cczona

Trying hard But still inadequate to express the real worlds complexity

We do realize that the real world has great variety Of course we do But its hard to know how to handle it

cczona

What user experience does this schema drive us toward

Its easy to figure Eh we can refactor later But initial choices set differing user experiences in motion Thinking at the outset about the real worlds variety and complexity starts us asking early questions that set foundation for social app-building Ask What user experience does this schema drive us toward

cczona

Data doesnrsquot have to be for analysis

Its easy to get into the habit of structuring data for easy analysis But we can choose to look at human data from a different perspective Step back Wallow in the users perspective

cczona

Data can be sheerexpressiveness

Data can be sheer expressiveness Data that has character individualism distinctiveness

cczona

What we want What we get

bull Structured

bull Predictablebull Validations exceptions

bull Conditionals partials

bull Relational

bull Indexed bull Premature optimization

bull Exhaustivebull Cultural variability

bull Individual POV

bull Moving target

bull Easy analytics

bull Data-driven decisionsbull Decisions based on false

premises

As developers we have a vision of what a good codebase should be and not be

[READ]

In the most sensible of ways we often arrive at solutions that are factually TRUTHY while far removed from real life utility

cczona

What is your religion if any

ARIS is the largest ongoing survey of Americans religious identification It asks this simple OPEN-ENDED question [slide]

Which nets over 100 unique answers Which if youre making a form based on that is tricky Theres no form element that makes it easy for our users to pick themselves out of a list of so many possibilities

cczona

ARIS found though that these could be compressed into 13 major categories More manageable list right We could use that for a form But eh a lot of those are edge cases Wed rather want to focus on genuinely major groupings

cczona

What is your religion if any

Christian 76

Other 4

None 15

Dont Know or Refused 5

Which brings us down to this At least a quarter of Americans are Christian Done Every other religion would just be clutter Edge cases

And then theres this sort of crummy data with it Wed probably assign nil value for more than one of these categories right

cczona

What is your religion

Christian 76

Other 4

na 20

Fixed Which focuses attention on a problem here 1 in 5 are not useful answers from an advertisers perspective or for out own analytics The NILS -- gotta go

cczona

What is your religion

Christian 76

Other 24

So what were left with is a good clear list It covers ALL the big stuff When you get reductive enough for Americans religion it a binary Which from a storage standpoint is great Booleans Score

cczona

Religion

But we dont do THIS It covers the biggest categories But oh how it leaves people out

cczona

People arent edge cases

People arent edge cases

And theyre pushing back on apps that treat them that way

[LONG BEAT LONG] [BREATH] [CHANGE GEARS]

So Reductive has big problems So does scaling upward

cczona

Balancing between approaches

As engineers its instinctually uncomfortable to DELIBERATELY NOT STRUCTURE DATA for easy analysis I feel ya I really do This freaks me too But again the foundational question is What benefit will the user notice What identities are okay with us

cczona

And if necessary we CAN strike a middleground This is where guided response comes in Autosuggest

cczona

CheckboxRadioSelect

TextareaText

Required OptionalAutosuggest

Minimal Suggest

When theres a subset of values that youre most interested in do minimal suggest instead Autosuggest using just the handful of values you care about

Structured data from those who want to give it Free form to incite expressiveness in those who want THAT

Unguided TextOf those who use MetaFilters gender field

40 of responses are f m female male

[read] So structured data IS there This can be a balanced solution in many cases where youre willing to tolerate some ambiguity Of course there are tradeoffsData quantity is lower Freed to opt out of proving personal info many doOn the other hand data quality should improve

Optional Select60 of Facebook users select a relationship status

Its fine to mix and match here Find the right approach for your users and your apps business objectives Facebook for instance makes relationship status completely optional but coercive for those who do opt-in to setting a value Most users do opt-in 60 of them select a relationship status

cczona

We want everyone to feel excited what weve

built

The bottom line is that [read]

cczona

We want users to feel passionate about their

involvement

[read]

cczona

Analytics investments amp monetization are

based on a premise that data is accurate

[read]

BUT the data has been collected by coercive approaches the RISK is that--

cczona

this premise is garbage Some people are lying because lying has been made requirement for getting past the barriers So conclusions drawn from that bad data can misdirect decision-making about the next stage of development

cczona

CheckboxRadioSelect

TextareaText

Required

Corrective

DiscretionaryGuided

The restrictive options that stuff at bottom left those dont actually have to be marked required

cczona

ADD column gender NOT NULL VARCHAR(6) DEFAULT female

But the way we setup schema often embeds assumptions that we should and we will So we do An attribute that is ambiguously named is destined to become a form field whose data wanders away from its original intent and code that misunderstands how to use it A field thats not allowed to be null is destined to be mandatory A field that is assigned a maximum length is an ASSERTION that all possible values are knowable and fit within it

cczona

ADD column gender_identity NOT NULL VARCHAR

BOOM This is foundation for a whole different user experience And the cool ninja move was that we decided to DO LESS Make this stuff flexible upfront Optimize storage later Decide whats valid later

cczona

CheckboxRadioSelect

TextareaText

Required Optional

Alteration

Guided

So there you have it a discretionary field Whether to respond is left up to the user

cczona

t.string :relationship_status, :null => true

As developers, we may look upon THIS expression as redundant, completely unnecessary. Duh, null is true by default. But making that explicit is a communication to the team and to your future self. It's a statement of intent. It's documenting a product decision.

cczona

Facebook

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

What would a canonical set of relationship statuses look like? Three years ago, Facebook figured this list was pretty good. Arguably pretty progressive too, right? Users disagreed. Strongly.

cczona

Facebook

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

Under pressure, Facebook nearly doubled the options in just two years' time.

cczona

Google+

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I don't want to say

While Google+ has largely adopted that list, it has not included Separated or Divorced. Notice that they also added something: a choice to opt out of labeling.

cczona

Allowing users to identify their relationships with labels of greater personal significance

That's being driven by people rejecting a user experience that isn't working for them.

cczona

It's a scope

How did some statuses seem universal while others weren't? Naming a thing creates scope. The assumed validity of a field's values gets constrained as soon as the field is named.

cczona

marital_status

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I don't want to say

Marital status, for instance, might lead to a list similar to this one. A person is assumed to be either unmarried, preparing to be married, currently married, or formerly married.

cczona

legal_marital_status

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I don't want to say

Marital status, for instance, might lead to a list similar to this one. A person is assumed to be either unmarried, preparing to be married, currently married, or formerly married.

cczona

relationship_status

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I don't want to say

Whereas relationship status might lead to a list more like this one, in which one either has a current relationship -- or is defined by the absence of any.

cczona

singleness_status

As soon as you name a field, you define its paradigm and possibilities. There are important differences in what each of these envisions and is able to measure. Naming fields -- with great specificity upfront -- makes analyses more powerful later.

cczona

singleness_rating

No chance whatsoever Suuuuuper duper single

1000

If you change the name, you shift the paradigm and possibilities. There's an important difference in what these set out to collect and measure. Naming fields -- with great specificity upfront -- makes analyses more powerful later.

cczona

It's Complicated, In a Relationship

Married, Divorced

Widowed, Single

We go through life experiencing many relationships. They don't all have pre-determined labels. And new relationship identities don't necessarily leave old ones behind.

cczona

"I like to be truthful, and 'It's Complicated' is really deceiving. It is not complicated. I am separated from my husband, who I am still legally married to."

-- Facebook user

[READ] Why should anyone have to feel like they're required to disavow a relationship that's meaningful for them?

It cuts deeply sometimes. Is a widow only "Widowed" until she resumes dating? Why does she have to be confronted with a decision like that just to use someone's app?

cczona

Okay, this one. Spot the fatal flaw. [WAIT]

cczona

buffer overflow

An open relationship is definitionally a 1:Many join -- WITH the usual engineering understanding, that is, that the actual number of relationships may be 0, 1, OR many.
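One way to read that in code is to store relationships as a collection rather than a single status value. The sketch below is an assumption about how that could look, not the schema from the talk; `Profile`, `Relationship`, and their fields are hypothetical names.

```ruby
# Hypothetical model; class and field names are illustrative only.
Relationship = Struct.new(:label, :with)

class Profile
  attr_reader :name, :relationships

  def initialize(name)
    @name = name
    @relationships = [] # zero, one, or many; never forced to exactly one
  end

  # Adds a relationship under a label the user chose; "with" is optional.
  def add_relationship(label, with = nil)
    @relationships << Relationship.new(label, with)
    self
  end

  # Labels in the order they were added.
  def labels
    @relationships.map(&:label)
  end
end

solo = Profile.new("A")
several = Profile.new("B")
several.add_relationship("married", "C").add_relationship("dating", "D")

puts solo.labels.inspect
puts several.labels.inspect
```

Because the collection can be empty, hold one entry, or hold several, nobody has to pick a single label that overwrites the others.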

cczona

Choose (one): evasive, inauthentic

But it's not funny from the personal perspective. This schema forces the person to either LOOK EVASIVE or else BE INAUTHENTIC. This is what happens when we try to throw more labels at the problem instead of re-examining the schema's assumptions.

cczona

Facebook relationship status by user age

srsly

When we don't give people means to express what's real and important for them, they hack around us. Are a quarter of Facebook's 13-year-olds really married? Probably not. What they probably have is a BEST FRIEND, and "married" is the nearest equivalent available for them to express: THIS PERSON is my most important relationship.

The dogged pursuit of nice, clean, CRUNCHABLE data takes us to such wrong places. And makes our users want to work against a system that doesn't acknowledge their lives' reality.

cczona

We build or break community with each line of code

As developers, our lines of code build -- or break -- an app's community. So it's worth making conscious choices as we go.

cczona

Modeling the real world is complex

[READ] And that's okay.

cczona

Assuming we know who users are surrenders opportunity to learn who they are

But [read]. Early constraints in schema NET crappy, misleading data. [beat] So keep constraints out of human and social schema, at least at first. Gather enough initial response to do some data mining. Watch that data for a while. Keep an eye out for emergent trends. What DISCOVERIES can you make? How can they shape the app's growth and evolution?
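As a rough illustration of "watch that data for a while", the sketch below counts unconstrained free-text answers so that recurring labels surface on their own. The sample answers and the helper names `normalize` and `emergent_labels` are invented for this example; `Enumerable#tally` needs Ruby 2.7 or newer.

```ruby
# Hypothetical sample of free-text answers; real data would come from the app.
responses = ["Married", "married ", "best friends", "Best Friends",
             nil, "it's complicated", "best friends", ""]

# Normalizes only whitespace and letter case, so near-identical answers
# count together while the person's own wording is otherwise kept.
def normalize(answer)
  answer.to_s.strip.downcase
end

# Counts non-blank answers, most frequent first (ties sorted by label).
def emergent_labels(responses)
  responses.map { |r| normalize(r) }
           .reject(&:empty?)
           .tally
           .sort_by { |label, count| [-count, label] }
end

emergent_labels(responses).each { |label, count| puts "#{count}  #{label}" }
```

In this made-up sample, "best friends" ends up the most common answer, which is the kind of discovery a fixed option list would never have recorded.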

cczona

Quality

Data quality improves when lies are merely optional, not required.

cczona

Specificity

Data becomes like the real world itself: rich and specific. So we can unearth the patterns that are undetectable when data is generic.

cczona

Adaptability

We get to make discoveries and re-envision possibilities

We adapt quickly

cczona

Loyalty

And their response to all this? Engagement. Passion. LOYALTY.

cczona

THOSE are foundations for great user experiences. [LOOK UP FINALITY]

@cczona, cczona@gmail.com, http://cczona.com

Questions? Conversation?

Oh Hell Yeah

Carina C. Zona

cczona

Resources

cczona

Data Science & Information Architecture

• Sociological normalization
• Database normalization
• Using Machine Learning On Social Networks To Figure Out What You Should Read On The Web
• NoSQL Data Modeling Techniques
• Data & Reality, by William Kent
• Bad Data Handbook, Q. Ethan McCallum
• Data Science of the Facebook World

cczona

Relationships

• Does Facebook Hurt Relationships
• Facebook Adds LGBT-Friendly Relationship Status Options
• Facebook Targeting by Relationship Status & Workplace
• Your Facebook Relationship Status: It's Complicated
• Gay Marriage: The Database Engineering Perspective

cczona

Sex & Genders

• "Disalienation: Why Gender is a Text Field on Diaspora"
• "Gender & Drop Down Menus"
• "Sex & Gender"
• "Bucket Gender"
• Recommendations for Inclusive Data Collection of Trans People

cczona

Names

• Falsehoods Programmers Believe About Names
• Your Last Name Contains Invalid Characters
• W3C Internationalization: Personal Names Around the World
• Spanish Names
• Chinese Names

cczona

More

• Redesigning the Country Selector
• American Religious Identification Survey: Summary Report 2009
• Linguistic Potluck: Crowdsourcing Internationalization in Rails

cczona

Image Credits

Postsecret, Facebook, OKCupid, Google+, Metafilter, Diaspora, Flickr, FetLife, Kotangle, cutestpaw.com, hdwallpapers.in, xkcd

cczona

Many Thanks

Chiu-Ki Chan, Estelle Weyl, Heather Rivers, Heroku, Jeremy Dunck, Josh Susser, Michele Titolo, NIRD LLC, Renée De Voursney, Sarah Mei, San Francisco Sex Information, Yoz Grahame

cczona

Get in Touch

Carina C. Zona, @cczona

http://cczona.com

cczona@gmail.com

http://slideshare.net/cczona

http://linkedin.com/in/cczona

Page 54: Schemas for the Real World [Madison RubyConf 2013]

Exhaustive: Every possible option

A field's values must include every possible option.

Mutually Exclusive: No overlap exists between them

And the field's options must be mutually exclusive.

How many social apps have both of those bare-minimum criteria covered?
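The two criteria can be checked mechanically against real people. The sketch below is only an illustration: it uses a six-option list like the Facebook one quoted in this talk, and the people, their labels, and the `check` helper are hypothetical.

```ruby
# A six-option list like the Facebook one quoted in this talk.
OPTIONS = ["Single", "In a relationship", "Engaged", "Married",
           "It's complicated", "Open relationship"].freeze

# Hypothetical people, each described by every label that truthfully applies.
people = {
  "newlywed"  => ["Married"],
  "separated" => ["Married", "Separated"],
  "widow"     => ["Widowed"],
  "poly"      => ["Married", "Open relationship"]
}

# For one person, reports whether the option list is
#   exhaustive: every label that applies is available as an option
#   exclusive:  at most one available option applies
def check(options, truths)
  {
    exhaustive: (truths - options).empty?,
    exclusive: (options & truths).size <= 1
  }
end

people.each do |who, truths|
  puts "#{who}: #{check(OPTIONS, truths)}"
end
```

Only the first hypothetical person passes both checks. The separated person and the widow fail on exhaustiveness, and the person who is both married and in an open relationship fails on mutual exclusivity, since a single-choice field can hold only one of the two.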

cczona

We've seen this one enough. At least that set of values seems covered fully.

cczona

But it's not.

cczona

Still lacking

cczona

Trying hard. But still inadequate to express the real world's complexity.

We do realize that the real world has great variety. Of course we do. But it's hard to know how to handle it.

cczona

What user experience does this schema drive us toward?

It's easy to figure, "Eh, we can refactor later." But initial choices set differing user experiences in motion. Thinking at the outset about the real world's variety and complexity starts us asking early questions that set the foundation for social app-building. Ask: what user experience does this schema drive us toward?

cczona

Data doesn't have to be for analysis

It's easy to get into the habit of structuring data for easy analysis. But we can choose to look at human data from a different perspective. Step back. Wallow in the user's perspective.

cczona

Data can be sheer expressiveness

Data can be sheer expressiveness. Data that has character, individualism, distinctiveness.

cczona

What we want:

• Structured
• Predictable
• Relational
• Indexed
• Exhaustive
• Easy analytics
• Data-driven decisions

What we get:

• Validations, exceptions
• Conditionals, partials
• Premature optimization
• Cultural variability
• Individual POV
• Moving target
• Decisions based on false premises

As developers, we have a vision of what a good codebase should be, and not be.

[READ]

In the most sensible of ways, we often arrive at solutions that are factually TRUTHY while far removed from real-life utility.

cczona

What is your religion, if any?

ARIS is the largest ongoing survey of Americans' religious identification. It asks this simple, OPEN-ENDED question: [slide]

Which nets over 100 unique answers. Which, if you're making a form based on that, is tricky. There's no form element that makes it easy for our users to pick themselves out of a list of so many possibilities.

cczona

ARIS found, though, that these could be compressed into 13 major categories. More manageable list, right? We could use that for a form. But eh, a lot of those are edge cases. We'd rather focus on genuinely major groupings.

cczona

What is your religion, if any?

Christian: 76%

Other: 4%

None: 15%

Don't Know or Refused: 5%

Which brings us down to this. At least three quarters of Americans are Christian. Done. Every other religion would just be clutter. Edge cases.

And then there's this sort of crummy data with it. We'd probably assign a nil value for more than one of these categories, right?

cczona

What is your religion?

Christian: 76%

Other: 4%

n/a: 20%

Fixed. Which focuses attention on a problem here: 1 in 5 are not useful answers from an advertiser's perspective, or for our own analytics. The NILS -- gotta go.

cczona

What is your religion?

Christian: 76%

Other: 24%

So what we're left with is a good, clear list. It covers ALL the big stuff. When you get reductive enough, for Americans, religion is a binary. Which from a storage standpoint is great. Booleans. Score.
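The arithmetic of that collapse is easy to replay. The sketch below uses the percentages from the slides; the `collapse` helper is a hypothetical name, and the point is only to show how each "cleanup" step folds distinct answers into a bigger bucket.

```ruby
# Percentages as shown on the slides.
survey = {
  "Christian" => 76,
  "Other" => 4,
  "None" => 15,
  "Don't Know or Refused" => 5
}

# Merges the named categories into one bucket and sums their shares.
# Every name in `categories` must be a key of `shares`.
def collapse(shares, categories, into)
  merged = shares.reject { |name, _| categories.include?(name) }
  merged[into] = shares.values_at(*categories).sum + shares.fetch(into, 0)
  merged
end

# Step 1: treat "None" and "Don't Know or Refused" as nil answers.
with_nils = collapse(survey, ["None", "Don't Know or Refused"], "n/a")
# Step 2: the nils "gotta go", so they are folded into "Other".
binary = collapse(with_nils, ["n/a"], "Other")

[survey, with_nils, binary].each do |step|
  puts "#{step.size} categories: #{step.map { |k, v| "#{k} #{v}%" }.join(', ')}"
end
```

Each step still sums to 100, which is what makes the result look tidy. What disappears is the difference between someone who answered "None", someone who declined to answer, and everyone grouped under "Other".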

cczona

Religion

But we don't do THIS. It covers the biggest categories. But oh, how it leaves people out.

cczona

People aren't edge cases

People aren't edge cases.

And they're pushing back on apps that treat them that way.

[LONG BEAT LONG] [BREATH] [CHANGE GEARS]

So: reductive has big problems. So does scaling upward.

cczona

Balancing between approaches

As engineers, it's instinctually uncomfortable to DELIBERATELY NOT STRUCTURE DATA for easy analysis. I feel ya. I really do. This freaks me out too. But again, the foundational question is: What benefit will the user notice? What identities are okay with us?

cczona

And if necessary, we CAN strike a middle ground. This is where guided response comes in. Autosuggest.

cczona

Checkbox, Radio, Select

Textarea, Text

Required, Optional, Autosuggest

Minimal Suggest

When there's a subset of values that you're most interested in, do minimal suggest instead. Autosuggest using just the handful of values you care about.

Structured data from those who want to give it. Free form to incite expressiveness in those who want THAT.
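A minimal-suggest field might behave roughly like the sketch below. The suggestion list, `suggest`, and `record` are hypothetical; the idea shown is that suggestions are offered from a short list while any typed text is still accepted.

```ruby
# Hypothetical handful of values the app cares most about.
SUGGESTIONS = ["Single", "In a relationship", "Married"].freeze

# Returns the suggestions that begin with what the user has typed so far.
def suggest(typed, suggestions = SUGGESTIONS)
  prefix = typed.to_s.strip.downcase
  return [] if prefix.empty?
  suggestions.select { |s| s.downcase.start_with?(prefix) }
end

# Accepts whatever the user finally enters. Flags whether it matches a
# suggested value, so analytics can use the structured subset.
def record(answer, suggestions = SUGGESTIONS)
  text = answer.to_s.strip
  return nil if text.empty? # optional field: blank stays blank
  match = suggestions.find { |s| s.casecmp?(text) }
  { value: match || text, suggested: !match.nil? }
end

puts suggest("ma").inspect
puts record("married").inspect
puts record("best friends forever").inspect
```

Answers that match a suggestion are stored in one canonical spelling. Everything else is kept exactly as typed rather than rejected.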

Unguided Text: Of those who use MetaFilter's gender field, 40% of responses are f, m, female, male

[read] So structured data IS there. This can be a balanced solution in many cases where you're willing to tolerate some ambiguity. Of course, there are tradeoffs. Data quantity is lower. Freed to opt out of providing personal info, many do. On the other hand, data quality should improve.
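Pulling the structured subset back out of an unguided text field can be sketched as follows. The answers are invented for illustration and are not MetaFilter data; `CANONICAL`, `bucket`, and `structured_share` are hypothetical names.

```ruby
# Maps the common short and long spellings onto one canonical value.
CANONICAL = {
  "f" => "female", "female" => "female",
  "m" => "male", "male" => "male"
}.freeze

# Invented free-text answers for illustration only (not MetaFilter data).
answers = ["F", "male", "yes please", "m", "robot",
           "Female", "mostly", "n/a", "M", "femme"]

# Returns the canonical value for an answer, or nil when it is free-form.
def bucket(answer)
  CANONICAL[answer.to_s.strip.downcase]
end

# Percentage of answers that fall into a canonical bucket.
def structured_share(answers)
  return 0.0 if answers.empty?
  hits = answers.count { |a| bucket(a) }
  (100.0 * hits / answers.size).round(1)
end

puts "structured: #{structured_share(answers)}%"
puts answers.map { |a| bucket(a) }.compact.tally.inspect
```

The free-form answers are left untouched in storage. The mapping is applied only at analysis time, so it can be revised later without losing what people actually wrote.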


Page 55: Schemas for the Real World [Madison RubyConf 2013]


The dogged pursuit of nice clean CRUNCHABLE data takes us to such wrong places And makes our users want to work against a system that doesnt acknowledge their lives reality

cczona

We build or break community with each

line of code

As developers our lines of code build -- or break -- an apps community So its worth making conscious choices as we go

cczona

Modeling the real world is complex

[READ] And thats okay

cczona

Assuming we know who users are surrenders

opportunity to learn who they are

But [read] Early constraints in schema NET crappy misleading data[beat]So keep constraints out of human and social schema at least at first Gather enough initial response to do some data mining Watch that data for a while Keep an eye out for emergent trends What DISCOVERIES can you make How can they shape the apps growth and evolution

cczona

Quality

Data quality improves when lies are merely optional not required

cczona

Specificity

Data becomes like the real world itself rich and specific So we can unearth the patterns that are undetectable when data is generic

cczona

Adaptability

We get to make discoveries and re-envision possibilities

We adapt quickly

cczona

Loyalty

And their response to all this Engagement Passion LOYALTY

cczona

THOSE are foundations for great user experiences [LOOK UP FINALITY]

cczonacczonagmailcomhttpcczonacom

QuestionsConversation

Oh Hell Yeah

Carina C Zona

cczona

Resources

cczona

Data Science amp Information ArchitecturebullSociological normalization

bullDatabase normalization

bullUsing Machine Learning On Social Networks To Figure Out What You Should Read On The Web

bullNoSQL Data Modeling Techniques

bullData amp Reality by William Kent

bull Bad Data Handbook Q Ethan McCallum

bullData Science of the Facebook World

cczona

Relationships

bullDoes Facebook Hurt Relationships

bullFacebook Adds LGBT-Friendly Relationship Status Options

bullFacebook Targeting by Relationship Status amp Workplace

bullYour Facebook Relationship Status Its Complicated

bullGay Marriage The Database Engineering Perspective

cczona

Sex amp Genders

bullldquoDisalienation Why Gender is a Text Field on Diasporardquo

bull ldquoGender amp Drop Down Menusrdquo

bullldquoSex amp Genderrdquo

bullldquoBucket Genderrdquo

bullRecommendations for Inclusive Data Collection of Trans People

cczona

Names

bullFalsehoods Programmers Believe About Names

bullYour Last Name Contains Invalid Characters

bullW3C Internationalization Personal Names Around the World

bullSpanish Names

bullChinese Names

cczona

More

bullRedesigning the Country Selector

bullAmerican Religious Identification Survey Summary Report 2009

bullLinguistic Potluck Crowdsourcing Internationalization in Rails

cczona

Image CreditsPostsecret

Facebook

OKCupid

Google+

Metafilter

Diaspora

Flickr

FetLife

Kotangle

cutestpawcom

hdwallpapersin

xkcd

cczona

Many ThanksChiu-Ki ChanEstelle Weyl

Heather RiversHeroku

Jeremy DunckJosh Susser

Michele TitoloNIRD LLC

Reneacutee De VoursneySarah Mei

San Francisco Sex InformationYoz Grahame

cczona

Get in Touch

Carina C Zonacczona

httpcczonacom

cczonagmailcom

httpslidesharenetcczona

httplinkedincomincczona

Page 56: Schemas for the Real World [Madison RubyConf 2013]

We've seen this one enough. At least that set of values seems covered fully.

But it's not.

Still lacking.

Trying hard. But still inadequate to express the real world's complexity.

We do realize that the real world has great variety. Of course we do. But it's hard to know how to handle it.

What user experience does this schema drive us toward?

It's easy to figure, "Eh, we can refactor later." But initial choices set differing user experiences in motion. Thinking at the outset about the real world's variety and complexity starts us asking early questions that set the foundation for social app-building. Ask: "What user experience does this schema drive us toward?"

Data doesn't have to be for analysis

It's easy to get into the habit of structuring data for easy analysis. But we can choose to look at human data from a different perspective. Step back. Wallow in the user's perspective.

Data can be sheer expressiveness

Data can be sheer expressiveness: data that has character, individualism, distinctiveness.

What we want

• Structured
• Predictable
• Relational
• Indexed
• Exhaustive
• Easy analytics
• Data-driven decisions

What we get

• Validations, exceptions
• Conditionals, partials
• Premature optimization
• Cultural variability
• Individual POV
• Moving target
• Decisions based on false premises

As developers, we have a vision of what a good codebase should be, and not be.

[READ]

In the most sensible of ways, we often arrive at solutions that are factually TRUTHY while far removed from real-life utility.

What is your religion, if any?

ARIS is the largest ongoing survey of Americans' religious identification. It asks this simple, OPEN-ENDED question. [slide]

Which nets over 100 unique answers. Which, if you're making a form based on that, is tricky. There's no form element that makes it easy for our users to pick themselves out of a list of so many possibilities.

ARIS found, though, that these could be compressed into 13 major categories. More manageable list, right? We could use that for a form. But, eh, a lot of those are edge cases. We'd rather focus on genuinely major groupings.

What is your religion, if any?

Christian: 76%
Other: 4%
None: 15%
Don't Know or Refused: 5%

Which brings us down to this. At least three quarters of Americans are Christian. Done. Every other religion would just be clutter. Edge cases.

And then there's this sort of crummy data with it. We'd probably assign a nil value for more than one of these categories, right?

What is your religion?

Christian: 76%
Other: 4%
n/a: 20%

Fixed. Which focuses attention on a problem here: 1 in 5 are not useful answers from an advertiser's perspective, or for our own analytics. The NILS -- gotta go.

What is your religion?

Christian: 76%
Other: 24%

So what we're left with is a good, clear list. It covers ALL the big stuff. When you get reductive enough, for Americans religion is a binary. Which, from a storage standpoint, is great. Booleans! Score!
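The reduction walked through on these slides can be sketched in a few lines of Ruby. This is only an illustration: it assumes the slide figures are percentages, and `collapse` is a hypothetical helper, not anything ARIS provides.

```ruby
# Shares in percent, as shown on the slides above.
SURVEY = {
  "Christian" => 76,
  "Other" => 4,
  "None" => 15,
  "Don't Know or Refused" => 5
}.freeze

# Hypothetical helper: fold the categories listed in `from` into the bucket `into`.
def collapse(shares, from:, into:)
  kept = shares.reject { |label, _| from.include?(label) }
  folded = from.sum { |label| shares.fetch(label) }
  kept.merge(into => kept.fetch(into, 0) + folded)
end

# Step 1: the two "not useful" answers become one nil-like bucket.
with_nils = collapse(SURVEY, from: ["None", "Don't Know or Refused"], into: "n/a")

# Step 2: the nil-like bucket is swept into "Other", leaving a binary.
binary = collapse(with_nils, from: ["n/a"], into: "Other")

puts with_nils.inspect
puts binary.inspect
```

Each step still adds up to 100, but the final hash can no longer tell someone who answered "None" apart from someone in the original 4% "Other".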

Religion

But we don't do THIS. It covers the biggest categories. But oh, how it leaves people out.

People aren't edge cases

People aren't edge cases.

And they're pushing back on apps that treat them that way.

[LONG BEAT LONG] [BREATH] [CHANGE GEARS]

So, reductive has big problems. So does scaling upward.

Balancing between approaches

As engineers, it's instinctually uncomfortable to DELIBERATELY NOT STRUCTURE DATA for easy analysis. I feel ya. I really do. This freaks me out too. But again, the foundational question is: What benefit will the user notice? What identities are okay with us?

And if necessary, we CAN strike a middle ground. This is where guided response comes in. Autosuggest.

[Slide diagram. Labels: Checkbox, Radio, Select; Textarea, Text; Required; Optional; Autosuggest]

Minimal Suggest

When there's a subset of values that you're most interested in, do minimal suggest instead. Autosuggest using just the handful of values you care about.

Structured data from those who want to give it. Free-form to incite expressiveness in those who want THAT.
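As a rough sketch of what "minimal suggest" could look like behind a text field, assuming a hypothetical short list of preferred values (`PREFERRED`) and two made-up helper names: suggestions come only from that short list, but whatever the user finally types is accepted as-is.

```ruby
# Hypothetical short list: only the values the app most cares about.
PREFERRED = %w[female male].freeze

# Offer preferred values that start with what the user has typed so far.
def suggestions(typed, preferred = PREFERRED)
  prefix = typed.to_s.strip.downcase
  return [] if prefix.empty?
  preferred.select { |value| value.start_with?(prefix) }
end

# Store whatever the user submits; a blank answer is simply no answer.
def accepted_value(submitted)
  text = submitted.to_s.strip
  text.empty? ? nil : text
end
```

Someone typing "f" gets nudged toward "female"; someone typing "genderqueer" gets no suggestion and no rejection either.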

Unguided Text

Of those who use MetaFilter's gender field, 40% of responses are "f", "m", "female", or "male".

[read] So structured data IS there. This can be a balanced solution in many cases where you're willing to tolerate some ambiguity. Of course, there are tradeoffs. Data quantity is lower: freed to opt out of providing personal info, many do. On the other hand, data quality should improve.
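One way to get at the structured part of free-text answers, sketched under assumptions: the `CANONICAL` mapping and the sample responses below are invented for illustration, and the stored answers themselves are never rewritten.

```ruby
# Hypothetical mapping of common spellings onto canonical buckets.
CANONICAL = {
  "f" => :female, "female" => :female,
  "m" => :male,   "male"   => :male
}.freeze

# Classify one raw answer for analysis purposes only.
def bucket(raw)
  key = raw.to_s.strip.downcase
  return :blank if key.empty?
  CANONICAL.fetch(key, :other)
end

# Count responses per bucket without touching the stored answers.
def summarize(responses)
  responses.each_with_object(Hash.new(0)) { |raw, counts| counts[bucket(raw)] += 1 }
end

# Made-up sample data, not MetaFilter's.
sample = ["F", "male", " m ", "genderqueer", "", nil, "Female", "it's complicated"]
summary = summarize(sample)
puts summary.inspect
```

The analysis gets its tidy buckets, while the `:other` answers stay available to read in the user's own words.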

Optional Select

60% of Facebook users select a relationship status

It's fine to mix and match here. Find the right approach for your users and your app's business objectives. Facebook, for instance, makes relationship status completely optional, but coercive for those who do opt in to setting a value. Most users do opt in: 60% of them select a relationship status.

We want everyone to feel excited about what we've built

The bottom line is that [read]

We want users to feel passionate about their involvement

[read]

Analytics, investments & monetization are based on a premise that data is accurate

[read]

BUT if the data has been collected by coercive approaches, the RISK is that--

this premise is garbage. Some people are lying, because lying has been made a requirement for getting past the barriers. So conclusions drawn from that bad data can misdirect decision-making about the next stage of development.

[Slide diagram. Labels: Checkbox, Radio, Select; Textarea, Text; Required; Corrective; Discretionary; Guided]

The restrictive options, that stuff at bottom left: those don't actually have to be marked required.

ADD column gender NOT NULL VARCHAR(6) DEFAULT 'female'

But the way we set up schema often embeds assumptions that we should and we will. So we do. An attribute that is ambiguously named is destined to become a form field whose data wanders away from its original intent, and code that misunderstands how to use it. A field that's not allowed to be null is destined to be mandatory. A field that is assigned a maximum length is an ASSERTION that all possible values are knowable and fit within it.

ADD column gender_identity NOT NULL VARCHAR

BOOM. This is the foundation for a whole different user experience. And the cool ninja move was that we decided to DO LESS. Make this stuff flexible upfront. Optimize storage later. Decide what's valid later.
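To make those embedded assertions concrete, here is a toy model in plain Ruby. `Column` is a hypothetical struct, not a database or Rails API; `strict` mirrors the first `gender` slide, while `flexible` is this sketch's own assumption about what "do less" could mean (nullable, no length cap, no default).

```ruby
# Toy model of a column definition; not a real database API.
Column = Struct.new(:name, :null, :limit, :default, keyword_init: true) do
  # Returns the value that would be stored, or raises if the schema rejects it.
  def store(value)
    value = default if value.nil?
    raise ArgumentError, "#{name} is required" if value.nil? && !null
    if limit && value && value.length > limit
      raise ArgumentError, "#{name} is longer than #{limit} characters"
    end
    value
  end
end

strict   = Column.new(name: "gender", null: false, limit: 6, default: "female")
flexible = Column.new(name: "gender_identity", null: true, limit: nil, default: nil)

puts strict.store(nil).inspect            # the default answers on the user's behalf
puts flexible.store(nil).inspect          # no answer stays no answer
puts flexible.store("genderqueer").inspect
```

With the strict column, a user who says nothing is silently recorded with the default, and an answer longer than six characters is refused. The flexible column records exactly what it was given.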

[Slide diagram. Labels: Checkbox, Radio, Select; Textarea, Text; Required; Optional; Alteration; Guided]

So there you have it: a discretionary field. Whether to respond is left up to the user.

t.string :relationship_status, :null => true

As developers, we may look upon THIS expression as redundant, completely unnecessary. Duh, null is true by default. But making that explicit is a communication to the team and to your future self. It's a statement of intent. It's documenting a product decision.

Facebook

Single
In a relationship
Engaged
Married
It's complicated
Open relationship

What would a canonical set of relationship statuses look like? Three years ago, Facebook figured this list was pretty good. Arguably pretty progressive too, right? Users disagreed. Strongly.

Facebook

Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Separated
Divorced
Civil union
Domestic partnership

Under pressure, Facebook nearly doubled the options in just two years' time.

Google+

Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Separated
Divorced
Civil union
Domestic partnership
I don't want to say

While Google+ has largely adopted that list, it has not included Separated or Divorced. Notice that they also added something: choice. Opt out of labeling.

Allowing users to identify their relationships with labels of greater personal significance

That's being driven by people rejecting a user experience that isn't working for them.

It's a scope

How did some statuses seem universal, while others weren't? Naming a thing creates scope. The assumed validity of a field's values gets constrained as soon as the field is named.

marital_status

Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Separated
Divorced
Civil union
Domestic partnership
I don't want to say

Marital status, for instance, might lead to a list similar to this one. A person is assumed to be either unmarried, preparing to be married, currently married, or formerly married.

legal_marital_status

Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Separated
Divorced
Civil union
Domestic partnership
I don't want to say

relationship_status

Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Separated
Divorced
Civil union
Domestic partnership
I don't want to say

Whereas relationship status might lead to a list more like this one, in which one either has a current relationship -- or is defined by the absence of any.

singleness_status

As soon as you name a field, you define its paradigm and possibilities. There are important differences in what each of these envisions and is able to measure. Naming fields -- with great specificity upfront -- makes analyses more powerful later.

singleness_rating

No chance whatsoever
Suuuuuper duper single

1000

If you change the name, you shift the paradigm and possibilities. There's an important difference in what these set out to collect and measure.

It's Complicated
In a Relationship
Married
Divorced
Widowed
Single

We go through life experiencing many relationships. They don't all have pre-determined labels. And new relationship identities don't necessarily leave old ones behind.

"I like to be truthful, and 'It's Complicated' is really deceiving. It is not complicated. I am separated from my husband, who I am still legally married to."

-- Facebook user

[READ] Why should anyone have to feel like they're required to disavow a relationship that's meaningful for them?

It cuts deeply sometimes. Is a widow only "Widowed" until she resumes dating? Why does she have to be confronted with a decision like that just to use someone's app?

Okay, this one. Spot the fatal flaw. [WAIT]

buffer overflow

An open relationship is definitionally a 1:Many join -- WITH the usual engineering understanding, which is that the actual number of relationships may be 0, 1, OR many.

Choose (one): evasive, inauthentic

But it's not funny from the personal perspective. This schema forces the person to either LOOK EVASIVE or else BE INAUTHENTIC. This is what happens when we try to throw more labels at the problem, instead of re-examining the schema's assumptions.
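Re-examining the assumption might start with the cardinality itself. A minimal sketch, assuming hypothetical `Profile` and `Relationship` types with free-text labels, where a person holds zero, one, or many relationships instead of a single status value:

```ruby
# Hypothetical types for illustration only.
Relationship = Struct.new(:label, :with, keyword_init: true)

class Profile
  attr_reader :relationships

  def initialize
    @relationships = [] # zero, one, or many
  end

  # `label` is whatever wording the user chooses; `with` is optional.
  def add_relationship(label:, with: nil)
    @relationships << Relationship.new(label: label, with: with)
    self
  end

  def labels
    @relationships.map(&:label)
  end
end

# The Facebook user quoted earlier would no longer have to pick one.
profile = Profile.new
profile.add_relationship(label: "Married").add_relationship(label: "Separated")
puts profile.labels.inspect
```

Nothing here forces a choice between looking evasive and being inauthentic: an empty list is a valid answer, and so is a list with several entries.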

Facebook relationship status, by user age

srsly

When we don't give people the means to express what's real and important for them, they hack around us. Are a quarter of Facebook's 13-year-olds really married? Probably not. What they probably have is a BEST FRIEND, and "married" is the nearest equivalent available for them to express "THIS PERSON is my most important relationship."

The dogged pursuit of nice, clean, CRUNCHABLE data takes us to such wrong places. And makes our users want to work against a system that doesn't acknowledge their lives' reality.

We build or break community with each line of code

As developers, our lines of code build -- or break -- an app's community. So it's worth making conscious choices as we go.

Modeling the real world is complex

[READ] And that's okay.

Assuming we know who users are surrenders opportunity to learn who they are

But [read]. Early constraints in schema NET crappy, misleading data. [beat] So keep constraints out of human and social schema, at least at first. Gather enough initial response to do some data mining. Watch that data for a while. Keep an eye out for emergent trends. What DISCOVERIES can you make? How can they shape the app's growth and evolution?
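Watching for emergent trends can start very small. The sketch below assumes unconstrained free-text responses and a hypothetical `emergent_values` helper with an arbitrary `min_count` threshold; it surfaces the answers that keep recurring, which could later feed a suggest list.

```ruby
# Hypothetical helper: find free-text answers that recur at least `min_count` times.
def emergent_values(responses, min_count: 2)
  counts = Hash.new(0)
  responses.each do |raw|
    value = raw.to_s.strip.downcase
    counts[value] += 1 unless value.empty?
  end
  counts.select { |_, n| n >= min_count }
        .sort_by { |value, n| [-n, value] } # most frequent first, then alphabetical
        .map(&:first)
end

# Made-up sample responses for illustration.
sample = ["Best friends", "best friends ", "Married", "best friends", "married", "co-parenting", nil]
puts emergent_values(sample).inspect
```

Answers below the threshold are not discarded from storage; they are just not yet common enough to count as a trend.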

Quality

Data quality improves when lies are merely optional, not required.

Specificity

Data becomes like the real world itself: rich and specific. So we can unearth the patterns that are undetectable when data is generic.

Adaptability

We get to make discoveries and re-envision possibilities.

We adapt quickly.

Loyalty

And their response to all this? Engagement. Passion. LOYALTY.

THOSE are foundations for great user experiences. [LOOK UP, FINALITY]

@cczona
cczona@gmail.com
http://cczona.com

Questions? Conversation?

Oh Hell Yeah

Carina C. Zona

Resources

Data Science & Information Architecture

• Sociological normalization
• Database normalization
• Using Machine Learning On Social Networks To Figure Out What You Should Read On The Web
• NoSQL Data Modeling Techniques
• Data & Reality, by William Kent
• Bad Data Handbook, Q. Ethan McCallum
• Data Science of the Facebook World

Relationships

• Does Facebook Hurt Relationships?
• Facebook Adds LGBT-Friendly Relationship Status Options
• Facebook Targeting by Relationship Status & Workplace
• Your Facebook Relationship Status: It's Complicated
• Gay Marriage: The Database Engineering Perspective

Sex & Genders

• "Disalienation: Why Gender is a Text Field on Diaspora"
• "Gender & Drop Down Menus"
• "Sex & Gender"
• "Bucket Gender"
• Recommendations for Inclusive Data Collection of Trans People

Names

• Falsehoods Programmers Believe About Names
• Your Last Name Contains Invalid Characters
• W3C Internationalization: Personal Names Around the World
• Spanish Names
• Chinese Names

More

• Redesigning the Country Selector
• American Religious Identification Survey Summary Report 2009
• Linguistic Potluck: Crowdsourcing Internationalization in Rails

Image Credits

Postsecret
Facebook
OKCupid
Google+
Metafilter
Diaspora
Flickr
FetLife
Kotangle
cutestpaw.com
hdwallpapers.in
xkcd

Many Thanks

Chiu-Ki Chan
Estelle Weyl
Heather Rivers
Heroku
Jeremy Dunck
Josh Susser
Michele Titolo
NIRD LLC
Renée De Voursney
Sarah Mei
San Francisco Sex Information
Yoz Grahame

Get in Touch

Carina C. Zona
@cczona
http://cczona.com
cczona@gmail.com
http://slideshare.net/cczona
http://linkedin.com/in/cczona

Page 57: Schemas for the Real World [Madison RubyConf 2013]

cczona

`

But its not

cczona

Still lacking

cczona

Trying hard But still inadequate to express the real worlds complexity

We do realize that the real world has great variety Of course we do But its hard to know how to handle it

cczona

What user experience does this schema drive us toward

Its easy to figure Eh we can refactor later But initial choices set differing user experiences in motion Thinking at the outset about the real worlds variety and complexity starts us asking early questions that set foundation for social app-building Ask What user experience does this schema drive us toward

cczona

Data doesnrsquot have to be for analysis

Its easy to get into the habit of structuring data for easy analysis But we can choose to look at human data from a different perspective Step back Wallow in the users perspective

cczona

Data can be sheerexpressiveness

Data can be sheer expressiveness Data that has character individualism distinctiveness

cczona

What we want What we get

bull Structured

bull Predictablebull Validations exceptions

bull Conditionals partials

bull Relational

bull Indexed bull Premature optimization

bull Exhaustivebull Cultural variability

bull Individual POV

bull Moving target

bull Easy analytics

bull Data-driven decisionsbull Decisions based on false

premises

As developers we have a vision of what a good codebase should be and not be

[READ]

In the most sensible of ways we often arrive at solutions that are factually TRUTHY while far removed from real life utility

cczona

What is your religion if any

ARIS is the largest ongoing survey of Americans religious identification It asks this simple OPEN-ENDED question [slide]

Which nets over 100 unique answers Which if youre making a form based on that is tricky Theres no form element that makes it easy for our users to pick themselves out of a list of so many possibilities

cczona

ARIS found though that these could be compressed into 13 major categories More manageable list right We could use that for a form But eh a lot of those are edge cases Wed rather want to focus on genuinely major groupings

cczona

What is your religion if any

Christian 76

Other 4

None 15

Dont Know or Refused 5

Which brings us down to this At least a quarter of Americans are Christian Done Every other religion would just be clutter Edge cases

And then theres this sort of crummy data with it Wed probably assign nil value for more than one of these categories right

cczona

What is your religion

Christian 76

Other 4

na 20

Fixed Which focuses attention on a problem here 1 in 5 are not useful answers from an advertisers perspective or for out own analytics The NILS -- gotta go

cczona

What is your religion

Christian 76

Other 24

So what were left with is a good clear list It covers ALL the big stuff When you get reductive enough for Americans religion it a binary Which from a storage standpoint is great Booleans Score

cczona

Religion

But we dont do THIS It covers the biggest categories But oh how it leaves people out

cczona

People arent edge cases

People arent edge cases

And theyre pushing back on apps that treat them that way

[LONG BEAT LONG] [BREATH] [CHANGE GEARS]

So Reductive has big problems So does scaling upward

cczona

Balancing between approaches

As engineers its instinctually uncomfortable to DELIBERATELY NOT STRUCTURE DATA for easy analysis I feel ya I really do This freaks me too But again the foundational question is What benefit will the user notice What identities are okay with us

cczona

And if necessary we CAN strike a middleground This is where guided response comes in Autosuggest

cczona

CheckboxRadioSelect

TextareaText

Required OptionalAutosuggest

Minimal Suggest

When theres a subset of values that youre most interested in do minimal suggest instead Autosuggest using just the handful of values you care about

Structured data from those who want to give it Free form to incite expressiveness in those who want THAT

Unguided TextOf those who use MetaFilters gender field

40 of responses are f m female male

[read] So structured data IS there This can be a balanced solution in many cases where youre willing to tolerate some ambiguity Of course there are tradeoffsData quantity is lower Freed to opt out of proving personal info many doOn the other hand data quality should improve

Optional Select60 of Facebook users select a relationship status

Its fine to mix and match here Find the right approach for your users and your apps business objectives Facebook for instance makes relationship status completely optional but coercive for those who do opt-in to setting a value Most users do opt-in 60 of them select a relationship status

cczona

We want everyone to feel excited what weve

built

The bottom line is that [read]

cczona

We want users to feel passionate about their

involvement

[read]

cczona

Analytics investments amp monetization are

based on a premise that data is accurate

[read]

BUT the data has been collected by coercive approaches the RISK is that--

cczona

this premise is garbage Some people are lying because lying has been made requirement for getting past the barriers So conclusions drawn from that bad data can misdirect decision-making about the next stage of development

cczona

CheckboxRadioSelect

TextareaText

Required

Corrective

DiscretionaryGuided

The restrictive options that stuff at bottom left those dont actually have to be marked required

cczona

ADD column gender NOT NULL VARCHAR(6) DEFAULT female

But the way we setup schema often embeds assumptions that we should and we will So we do An attribute that is ambiguously named is destined to become a form field whose data wanders away from its original intent and code that misunderstands how to use it A field thats not allowed to be null is destined to be mandatory A field that is assigned a maximum length is an ASSERTION that all possible values are knowable and fit within it

cczona

ADD column gender_identity NOT NULL VARCHAR

BOOM This is foundation for a whole different user experience And the cool ninja move was that we decided to DO LESS Make this stuff flexible upfront Optimize storage later Decide whats valid later

cczona

CheckboxRadioSelect

TextareaText

Required Optional

Alteration

Guided

So there you have it a discretionary field Whether to respond is left up to the user

cczona

tstring relationship_status null =gt true

As developers we may upon THIS expression as redundant completely unnecessary Duh null is true by default But making that explicit is a communication to the team and to your future self Its a statement of intent Its documenting a product decision

cczona

FacebookSingle

In a relationship

Engaged

Married

Its complicated

Open relationship

What would a canonical set of relationship statuses look like Three years ago Facebook figured this list was pretty good Arguably pretty progressive too right Users disagreed Strongly

cczona

FacebookSingle

In a relationship

Engaged

Married

Its complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

Under pressure Facebook nearly doubled the options in just two years time

cczona

Google+Single

In a relationship

Engaged

Married

Its complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I dont want to say

While Google+ has largely adopted that list it has not included Separated or Divorced Notice that they also added something choice Opt out of labeling

cczona

Allowing users to identify their relationships with labels of greater personal significanceThats being driven by people rejecting a user experience that isnt working for them

cczona

Its a scope

How did some status seem universal while others werent Naming a thing creates scope The assumed validity of a fields values get constrained as soon as the field is named

cczona

marital_statusSingle

In a relationship

Engaged

Married

Its complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I dont want to say

Marital status for instance might lead to a list similar to this one A person is assumed to be either unmarried preparing to be married currently married or formerly married

cczona

legal_marital_statusSingle

In a relationship

Engaged

Married

Its complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I dont want to say

Marital status for instance might lead to a list similar to this one A person is assumed to be either unmarried preparing to be married currently married or formerly married

cczona

relationship_statusSingle

In a relationship

Engaged

Married

Its complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I dont want to say

Whereas relationship status might lead to a list more like this one In which one either has a current relationship -- or is defined by the absence of any

cczona

singleness_status

As soon as you name a field you define its paradigm and possibilities There are important difference in what each of these envision and are able to measure Naming fields -- with great specificity upfront -- Makes analyses more powerful later

cczona

singleness_rating

No chance whatsoever Suuuuuper duper single

1000

If you change the name you shift the paradigm and possibilities Theres important difference in these what these collect set out to measure Naming fields -- With great specificity upfront --Makes analyses more powerful later

cczona

Its ComplicatedIn a Relationship

Married Divorced

WidowedSingle

We go through life experiencing many relationships They dont all have pre-determined labels And new relationship identities dont necessarily leave old ones behind

cczona

I like to be truthful and Its Complicated is really deceiving It is not complicated I am separated from my husband who I am still legally married to

mdashFacebook user

[READ] Why should anyone have to feel like theyre required to disavowing a relationship thats meaningful for them

It cuts deeply sometimes Is a widow only Widowed until she resumes dating Why does she have to be confronted with a decision like that just to use someones app

cczona

Okay this one Spot the fatal flaw [WAIT]

cczona

buffer overflow

An open relationship is definitionally a 1Many join mdash WITH the usual engineering understanding that is that actual number of relationships may be 0 1 OR many

cczona

Choose (one) evasive inauthentic

But its not funny from the personal perspective This schema forces the person to either LOOK EVASIVE or else BE INAUTHENTIC This is what happens when we try to throw more labels at the problem instead of re-examining the schemas assumptions

cczona

Facebook relationship status by user age

srslyWhen we dont give people means to express whats real and important for them they hack around us Are a quarter of Facebooks 13 year olds really married Probably not What they probably have is a BEST FRIEND and married is nearest equivalent available for them to express THIS PERSON is my most important relationship

The dogged pursuit of nice clean CRUNCHABLE data takes us to such wrong places And makes our users want to work against a system that doesnt acknowledge their lives reality

cczona

We build or break community with each

line of code

As developers our lines of code build -- or break -- an apps community So its worth making conscious choices as we go

cczona

Modeling the real world is complex

[READ] And thats okay

cczona

Assuming we know who users are surrenders

opportunity to learn who they are

But [read] Early constraints in schema NET crappy misleading data[beat]So keep constraints out of human and social schema at least at first Gather enough initial response to do some data mining Watch that data for a while Keep an eye out for emergent trends What DISCOVERIES can you make How can they shape the apps growth and evolution

cczona

Quality

Data quality improves when lies are merely optional not required

Specificity

Data becomes like the real world itself: rich and specific. So we can unearth the patterns that are undetectable when data is generic.

Adaptability

We get to make discoveries and re-envision possibilities.

We adapt quickly.

Loyalty

And their response to all this? Engagement. Passion. LOYALTY.

THOSE are foundations for great user experiences. [LOOK UP. FINALITY.]

@cczona
cczona@gmail.com
http://cczona.com

Questions? Conversation?

Oh Hell Yeah

Carina C. Zona

Resources

Data Science & Information Architecture

• Sociological normalization
• Database normalization
• Using Machine Learning On Social Networks To Figure Out What You Should Read On The Web
• NoSQL Data Modeling Techniques
• Data & Reality, by William Kent
• Bad Data Handbook, by Q. Ethan McCallum
• Data Science of the Facebook World

Relationships

• Does Facebook Hurt Relationships?
• Facebook Adds LGBT-Friendly Relationship Status Options
• Facebook Targeting by Relationship Status & Workplace
• Your Facebook Relationship Status: It's Complicated
• Gay Marriage: The Database Engineering Perspective

Sex & Genders

• "Disalienation: Why Gender is a Text Field on Diaspora"
• "Gender & Drop Down Menus"
• "Sex & Gender"
• "Bucket Gender"
• Recommendations for Inclusive Data Collection of Trans People

Names

• Falsehoods Programmers Believe About Names
• Your Last Name Contains Invalid Characters
• W3C Internationalization: Personal Names Around the World
• Spanish Names
• Chinese Names

More

• Redesigning the Country Selector
• American Religious Identification Survey: Summary Report 2009
• Linguistic Potluck: Crowdsourcing Internationalization in Rails

Image Credits

Postsecret

Facebook

OKCupid

Google+

Metafilter

Diaspora

Flickr

FetLife

Kotangle

cutestpaw.com

hdwallpapers.in

xkcd

Many Thanks

Chiu-Ki Chan
Estelle Weyl
Heather Rivers
Heroku
Jeremy Dunck
Josh Susser
Michele Titolo
NIRD LLC
Renée De Voursney
Sarah Mei
San Francisco Sex Information
Yoz Grahame

Get in Touch

Carina C. Zona
@cczona
http://cczona.com
cczona@gmail.com
http://slideshare.net/cczona
http://linkedin.com/in/cczona


Still lacking

Trying hard. But still inadequate to express the real world's complexity.

We do realize that the real world has great variety. Of course we do. But it's hard to know how to handle it.

What user experience does this schema drive us toward?

It's easy to figure, "Eh, we can refactor later." But initial choices set differing user experiences in motion. Thinking at the outset about the real world's variety and complexity starts us asking early questions that set the foundation for social app-building. Ask: "What user experience does this schema drive us toward?"

Data doesn't have to be for analysis

It's easy to get into the habit of structuring data for easy analysis. But we can choose to look at human data from a different perspective. Step back. Wallow in the user's perspective.

Data can be sheer expressiveness

Data can be sheer expressiveness. Data that has character, individualism, distinctiveness.

What we want / What we get

What we want:
• Structured
• Predictable
• Relational
• Indexed
• Exhaustive
• Easy analytics
• Data-driven decisions

What we get:
• Validations, exceptions
• Conditionals, partials
• Premature optimization
• Cultural variability
• Individual POV
• Moving target
• Decisions based on false premises

As developers, we have a vision of what a good codebase should be, and not be.

[READ]

In the most sensible of ways, we often arrive at solutions that are factually TRUTHY while far removed from real-life utility.

What is your religion, if any?

ARIS is the largest ongoing survey of Americans' religious identification. It asks this simple, OPEN-ENDED question: [slide]

Which nets over 100 unique answers. Which, if you're making a form based on that, is tricky. There's no form element that makes it easy for our users to pick themselves out of a list of so many possibilities.

ARIS found, though, that these could be compressed into 13 major categories. More manageable list, right? We could use that for a form. But, eh, a lot of those are edge cases. We'd rather focus on genuinely major groupings.

What is your religion, if any?

Christian 76%
Other 4%
None 15%
Don't Know or Refused 5%

Which brings us down to this. At least three-quarters of Americans are Christian. Done. Every other religion would just be clutter. Edge cases.

And then there's this sort of crummy data with it. We'd probably assign a nil value for more than one of these categories, right?

What is your religion?

Christian 76%
Other 4%
n/a 20%

Fixed. Which focuses attention on a problem here: 1 in 5 are not useful answers from an advertiser's perspective, or for our own analytics. The NILS -- gotta go.

What is your religion?

Christian 76%
Other 24%

So what we're left with is a good, clear list. It covers ALL the big stuff. When you get reductive enough for Americans' religion, it's a binary. Which, from a storage standpoint, is great. Booleans! Score!
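The arithmetic of that reduction can be replayed in a few lines of Ruby. The percentages are the ones on the slides; the `collapse` helper is a made-up name for this sketch. Each step folds categories into a bucket, and each step throws away a distinction that cannot be recovered from what remains.

```ruby
# Shares as shown on the slides (percent of respondents).
survey = {
  "Christian" => 76,
  "Other" => 4,
  "None" => 15,
  "Don't Know or Refused" => 5
}

# Fold the categories listed in `from` into the bucket `into`, summing shares.
def collapse(counts, into:, from:)
  kept = counts.reject { |category, _| from.include?(category) }
  kept[into] = kept.fetch(into, 0) + from.sum { |category| counts.fetch(category, 0) }
  kept
end

# Step 1: treat "None" and "Don't Know or Refused" as one n/a bucket.
step1 = collapse(survey, into: "n/a", from: ["None", "Don't Know or Refused"])

# Step 2: the nils "gotta go", so fold n/a into "Other".
step2 = collapse(step1, into: "Other", from: ["n/a"])

puts step1.length # three categories left
puts step2.length # two categories left: effectively a boolean
```

After the second step, someone who answered "None" is stored exactly like someone of a minority faith: both are just "Other".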

Religion

But we don't do THIS. It covers the biggest categories. But oh, how it leaves people out.

People aren't edge cases

People aren't edge cases.

And they're pushing back on apps that treat them that way.

[LONG BEAT. LONG.] [BREATH] [CHANGE GEARS]

So: Reductive has big problems. So does scaling upward.

Balancing between approaches

As engineers, it's instinctually uncomfortable to DELIBERATELY NOT STRUCTURE DATA for easy analysis. I feel ya. I really do. This freaks me out too. But again, the foundational question is: What benefit will the user notice? What identities are okay with us?

And if necessary, we CAN strike a middle ground. This is where guided response comes in. Autosuggest.

Minimal Suggest

[Diagram labels: Checkbox, Radio, Select; Textarea, Text; Required, Optional; Autosuggest]

When there's a subset of values that you're most interested in, do minimal suggest instead: autosuggest using just the handful of values you care about.

Structured data from those who want to give it. Free form to incite expressiveness in those who want THAT.
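A minimal sketch of that idea, in plain Ruby with no UI attached. The shortlist values and the `suggest` and `record` helpers are assumptions for illustration only. Suggestions come from a deliberately tiny list, while whatever the user actually types is stored as-is, and leaving the field blank stores nothing.

```ruby
# Offer completions only from a short list of values the app cares most about.
# The default shortlist here is hypothetical.
def suggest(input, shortlist = %w[female male])
  prefix = input.to_s.strip.downcase
  return [] if prefix.empty?
  shortlist.select { |candidate| candidate.start_with?(prefix) }
end

# Store exactly what the user typed; a blank answer becomes nil (no answer).
def record(input)
  value = input.to_s.strip
  value.empty? ? nil : value
end

puts suggest("f").inspect       # shortlist entries starting with "f"
puts suggest("gen").inspect     # nothing suggested, but typing is still allowed
puts record("genderqueer").inspect
puts record("   ").inspect
```

The suggestion list nudges; it never validates. That is the difference between guiding a response and coercing one.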

Unguided Text

Of those who use MetaFilter's gender field, 40% of responses are f, m, female, or male

[read] So structured data IS there. This can be a balanced solution in many cases where you're willing to tolerate some ambiguity. Of course there are tradeoffs. Data quantity is lower. Freed to opt out of providing personal info, many do. On the other hand, data quality should improve.

Optional Select

60% of Facebook users select a relationship status

It's fine to mix and match here. Find the right approach for your users and your app's business objectives. Facebook, for instance, makes relationship status completely optional, but coercive for those who do opt in to setting a value. Most users do opt in: 60% of them select a relationship status.

We want everyone to feel excited about what we've built

The bottom line is that [read].

We want users to feel passionate about their involvement

[read]

Analytics, investments & monetization are based on a premise that data is accurate

[read]

BUT when the data has been collected by coercive approaches, the RISK is that --

this premise is garbage. Some people are lying because lying has been made a requirement for getting past the barriers. So conclusions drawn from that bad data can misdirect decision-making about the next stage of development.

[Diagram labels: Checkbox, Radio, Select; Textarea, Text; Required; Corrective; Discretionary; Guided]

The restrictive options, that stuff at bottom left: those don't actually have to be marked required.

ADD COLUMN gender VARCHAR(6) NOT NULL DEFAULT 'female'

But the way we set up schema often embeds assumptions that we should, and we will. So we do. An attribute that is ambiguously named is destined to become a form field whose data wanders away from its original intent, and code that misunderstands how to use it. A field that's not allowed to be null is destined to be mandatory. A field that is assigned a maximum length is an ASSERTION that all possible values are knowable and fit within it.
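To see what those three constraints assert, here is a plain-Ruby imitation of the column rules on the slide. It is not a real database, and actual engines differ in how they react (some raise, some truncate). The `stored_gender` helper and its parameters are invented for this sketch.

```ruby
# Imitates a NOT NULL column with a maximum length and a default value.
# Returns what would be stored, or raises if the value breaks a rule.
def stored_gender(value = "female", max_length: 6)
  # NOT NULL: "no answer" is not an answer the schema accepts.
  raise ArgumentError, "gender may not be null" if value.nil?

  # Maximum length: asserts every valid answer fits in this many characters.
  if value.length > max_length
    raise ArgumentError, "#{value.inspect} is longer than #{max_length} characters"
  end

  value
end

puts stored_gender           # the user said nothing; the default answers for them
puts stored_gender("male")

begin
  stored_gender("genderqueer")
rescue ArgumentError => e
  puts "rejected: #{e.message}"
end
```

Each branch is a product decision hiding inside a column definition: silence gets recorded as the default, and an honest answer gets rejected for being too long.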

ADD column gender_identity NOT NULL VARCHAR

BOOM. This is the foundation for a whole different user experience. And the cool ninja move was that we decided to DO LESS. Make this stuff flexible upfront. Optimize storage later. Decide what's valid later.

[Diagram labels: Checkbox, Radio, Select; Textarea, Text; Required, Optional; Alteration; Guided]

So there you have it: a discretionary field. Whether to respond is left up to the user.

t.string :relationship_status, :null => true

As developers, we may look upon THIS expression as redundant, completely unnecessary. Duh, null is true by default. But making that explicit is a communication to the team and to your future self. It's a statement of intent. It's documenting a product decision.

Facebook

Single
In a relationship
Engaged
Married
It's complicated
Open relationship

What would a canonical set of relationship statuses look like? Three years ago, Facebook figured this list was pretty good. Arguably pretty progressive too, right? Users disagreed. Strongly.

Facebook

Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Separated
Divorced
Civil union
Domestic partnership

Under pressure, Facebook nearly doubled the options in just two years' time.

Google+

Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Separated
Divorced
Civil union
Domestic partnership
I don't want to say

While Google+ has largely adopted that list, it has not included Separated or Divorced. Notice that they also added something choice: Opt out of labeling.

Allowing users to identify their relationships with labels of greater personal significance. That's being driven by people rejecting a user experience that isn't working for them.

It's a scope

How did some statuses seem universal while others didn't? Naming a thing creates scope. The assumed validity of a field's values gets constrained as soon as the field is named.

marital_status

Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Separated
Divorced
Civil union
Domestic partnership
I don't want to say

Marital status, for instance, might lead to a list similar to this one. A person is assumed to be either unmarried, preparing to be married, currently married, or formerly married.

legal_marital_status

Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Separated
Divorced
Civil union
Domestic partnership
I don't want to say

Marital status, for instance, might lead to a list similar to this one. A person is assumed to be either unmarried, preparing to be married, currently married, or formerly married.

relationship_status

Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Separated
Divorced
Civil union
Domestic partnership
I don't want to say

Whereas relationship status might lead to a list more like this one. In which one either has a current relationship -- or is defined by the absence of any.

singleness_status

As soon as you name a field, you define its paradigm and possibilities. There are important differences in what each of these envisions and is able to measure. Naming fields -- with great specificity upfront -- makes analyses more powerful later.

singleness_rating

No chance whatsoever ... Suuuuuper duper single

1000

If you change the name, you shift the paradigm and possibilities. There's an important difference in what these set out to collect and measure. Naming fields -- with great specificity upfront -- makes analyses more powerful later.

It's Complicated / In a Relationship / Married / Divorced / Widowed / Single

We go through life experiencing many relationships. They don't all have pre-determined labels. And new relationship identities don't necessarily leave old ones behind.

"I like to be truthful, and 'It's Complicated' is really deceiving. It is not complicated. I am separated from my husband, who I am still legally married to."

-- Facebook user

Page 60: Schemas for the Real World [Madison RubyConf 2013]

cczona

What user experience does this schema drive us toward

Its easy to figure Eh we can refactor later But initial choices set differing user experiences in motion Thinking at the outset about the real worlds variety and complexity starts us asking early questions that set foundation for social app-building Ask What user experience does this schema drive us toward

cczona

Data doesnrsquot have to be for analysis

Its easy to get into the habit of structuring data for easy analysis But we can choose to look at human data from a different perspective Step back Wallow in the users perspective

cczona

Data can be sheerexpressiveness

Data can be sheer expressiveness Data that has character individualism distinctiveness

cczona

What we want:
• Structured
• Predictable
• Relational
• Indexed
• Exhaustive
• Easy analytics
• Data-driven decisions

What we get:
• Validations, exceptions
• Conditionals, partials
• Premature optimization
• Cultural variability
• Individual POV
• Moving target
• Decisions based on false premises

As developers, we have a vision of what a good codebase should be, and not be.

[READ]

In the most sensible of ways, we often arrive at solutions that are factually TRUTHY while far removed from real-life utility.

cczona

What is your religion, if any?

ARIS is the largest ongoing survey of Americans' religious identification. It asks this simple, OPEN-ENDED question: [slide]

Which nets over 100 unique answers. Which, if you're making a form based on that, is tricky. There's no form element that makes it easy for our users to pick themselves out of a list of so many possibilities.

cczona

ARIS found, though, that these could be compressed into 13 major categories. More manageable list, right? We could use that for a form. But eh, a lot of those are edge cases. We'd rather focus on genuinely major groupings.

cczona

What is your religion, if any?

Christian: 76%
Other: 4%
None: 15%
Don't Know or Refused: 5%

Which brings us down to this. At least three quarters of Americans are Christian. Done. Every other religion would just be clutter. Edge cases.

And then there's this sort of crummy data with it. We'd probably assign a nil value for more than one of these categories, right?

cczona

What is your religion?

Christian: 76%
Other: 4%
n/a: 20%

Fixed. Which focuses attention on a problem here: 1 in 5 are not useful answers from an advertiser's perspective, or for our own analytics. The NILS -- gotta go.

cczona

What is your religion?

Christian: 76%
Other: 24%

So what we're left with is a good, clear list. It covers ALL the big stuff. When you get reductive enough, for Americans religion is a binary. Which, from a storage standpoint, is great. Booleans! Score!
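The reduction walked through above can be sketched in a few lines of Ruby. This is only an illustration: the answers, the CHRISTIAN lookup, and the method names are all made up for the example and are not ARIS data or anyone's production code. The point is how many distinct answers survive each step.

```ruby
# Hypothetical open-ended answers (nil = no answer given). Not ARIS data.
ANSWERS = ["Catholic", "Baptist", "Jewish", "Buddhist",
           "Atheist", nil, "Quaker", "Muslim"].freeze

# Hypothetical lookup used to decide what counts as "Christian".
CHRISTIAN = %w[Catholic Baptist Quaker].freeze

# Step 1: collapse every answer into one of four broad buckets.
def four_buckets(answer)
  return "Don't know/refused" if answer.nil?
  return "None" if answer == "Atheist"
  CHRISTIAN.include?(answer) ? "Christian" : "Other"
end

# Step 2: the fully reductive schema, a single boolean column.
def christian?(answer)
  four_buckets(answer) == "Christian"
end

puts "distinct answers given:  #{ANSWERS.uniq.size}"
puts "distinct after buckets:  #{ANSWERS.map { |a| four_buckets(a) }.uniq.size}"
puts "distinct after boolean:  #{ANSWERS.map { |a| christian?(a) }.uniq.size}"
```

Everything that made "Jewish", "Buddhist", "Muslim", and a refusal to answer different from one another is gone by the last line; the boolean stores them all as `false`.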

cczona

Religion

But we don't do THIS. It covers the biggest categories. But oh, how it leaves people out.

cczona

People aren't edge cases

People aren't edge cases.

And they're pushing back on apps that treat them that way.

[LONG BEAT LONG] [BREATH] [CHANGE GEARS]

So. Reductive has big problems. So does scaling upward.

cczona

Balancing between approaches

As engineers, it's instinctually uncomfortable to DELIBERATELY NOT STRUCTURE DATA for easy analysis. I feel ya. I really do. This freaks me out too. But again, the foundational question is: What benefit will the user notice? What identities are okay with us?

cczona

And if necessary, we CAN strike a middle ground. This is where guided response comes in. Autosuggest.

cczona

Checkbox, Radio, Select

Textarea, Text

Required / Optional

Autosuggest

Minimal Suggest

When there's a subset of values that you're most interested in, do minimal suggest instead. Autosuggest using just the handful of values you care about.

Structured data from those who want to give it. Free form to incite expressiveness in those who want THAT.
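A minimal suggest can be sketched roughly like this in plain Ruby. The SUGGESTED values and the method names `suggest` and `accept` are assumptions made for the example, not part of any framework: the field offers a short list of completions, but stores whatever the person actually submits, including nothing at all.

```ruby
# Hypothetical handful of values the app is most interested in.
SUGGESTED = ["female", "male"].freeze

# Offer only the suggestions that start with what has been typed so far.
def suggest(typed)
  prefix = typed.to_s.strip.downcase
  return [] if prefix.empty?
  SUGGESTED.select { |value| value.start_with?(prefix) }
end

# Store whatever is submitted: a suggestion, free text, or nothing (nil).
def accept(submitted)
  text = submitted.to_s.strip
  text.empty? ? nil : text
end

p suggest("f")            # the one suggestion starting with "f"
p suggest("gen")          # no suggestion matches, and that is fine
p accept("genderqueer")   # free text is kept exactly as written
p accept("")              # leaving the field blank stores nil
```

The suggestions nudge toward values that are easy to count, while `accept` never rejects an answer for being off the list.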

Unguided Text

Of those who use MetaFilter's gender field, 40% of responses are: f, m, female, male

[read] So structured data IS there. This can be a balanced solution in many cases where you're willing to tolerate some ambiguity. Of course, there are tradeoffs. Data quantity is lower: freed to opt out of providing personal info, many do. On the other hand, data quality should improve.
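Pulling the structured part back out of an unguided text field might look something like the following. The responses and the CANONICAL mapping are invented for illustration (they are not MetaFilter data), and `tally_responses` is a hypothetical helper name.

```ruby
# Hypothetical free-text responses. Not MetaFilter data.
RESPONSES = ["F", "male", "genderqueer", "m", "Female ",
             "yes please", "trans woman", "M"].freeze

# Hypothetical mapping of common spellings onto values an analyst wants to count.
CANONICAL = {
  "f" => "female", "female" => "female",
  "m" => "male",   "male"   => "male"
}.freeze

# Count the responses that match a known value; keep the rest exactly as written.
def tally_responses(responses)
  structured, free_form = responses.partition do |response|
    CANONICAL.key?(response.strip.downcase)
  end
  counts = structured.each_with_object(Hash.new(0)) do |response, totals|
    totals[CANONICAL[response.strip.downcase]] += 1
  end
  { counts: counts, free_form: free_form }
end

result = tally_responses(RESPONSES)
p result[:counts]      # countable values for analytics
p result[:free_form]   # everything else, untouched
```

The normalization happens at analysis time, so nothing the user wrote is altered or thrown away in storage.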

Optional Select

60% of Facebook users select a relationship status

It's fine to mix and match here. Find the right approach for your users and your app's business objectives. Facebook, for instance, makes relationship status completely optional, but coercive for those who do opt in to setting a value. Most users do opt in: 60% of them select a relationship status.

cczona

We want everyone to feel excited about what we've built

The bottom line is that [read]

cczona

We want users to feel passionate about their involvement

[read]

cczona

Analytics, investments, & monetization are based on a premise that data is accurate

[read]

BUT if the data has been collected by coercive approaches, the RISK is that--

cczona

this premise is garbage. Some people are lying because lying has been made a requirement for getting past the barriers. So conclusions drawn from that bad data can misdirect decision-making about the next stage of development.

cczona

Checkbox, Radio, Select

Textarea, Text

Required

Corrective

Discretionary

Guided

The restrictive options, that stuff at bottom left: those don't actually have to be marked required.

cczona

ADD column gender NOT NULL VARCHAR(6) DEFAULT 'female'

But the way we set up schema often embeds assumptions that we should, and we will. So we do. An attribute that is ambiguously named is destined to become a form field whose data wanders away from its original intent, and code that misunderstands how to use it. A field that's not allowed to be null is destined to be mandatory. A field that is assigned a maximum length is an ASSERTION that all possible values are knowable and fit within it.

cczona

ADD column gender_identity NOT NULL VARCHAR

BOOM! This is the foundation for a whole different user experience. And the cool ninja move was that we decided to DO LESS. Make this stuff flexible upfront. Optimize storage later. Decide what's valid later.
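To make the contrast concrete, here is a toy Ruby stand-in for the two kinds of column. It is a sketch, not a real database or ORM API: the `Column` struct and its `store` method are invented for the example, and the flexible column is assumed to be nullable with no length limit and no default, in the spirit of "do less".

```ruby
# A toy stand-in for a database column. Not a real ORM or driver API.
Column = Struct.new(:name, :limit, :null, :default, keyword_init: true) do
  # Returns the value that would be stored, or raises if a constraint rejects it.
  def store(value)
    value = default if value.nil?   # a default silently answers for the user
    raise ArgumentError, "#{name} cannot be null" if value.nil? && !null
    if limit && value && value.length > limit
      raise ArgumentError, "#{name} is longer than #{limit} characters"
    end
    value
  end
end

# The restrictive definition: short, mandatory, and pre-answered.
RESTRICTIVE = Column.new(name: "gender", limit: 6, null: false, default: "female")

# An assumed flexible definition: no limit, no default, blank allowed.
FLEXIBLE = Column.new(name: "gender_identity", limit: nil, null: true, default: nil)

p RESTRICTIVE.store(nil)            # the user said nothing; the schema answered anyway
p FLEXIBLE.store(nil)               # the user said nothing; nothing is recorded
p FLEXIBLE.store("genderqueer")     # stored as written

begin
  RESTRICTIVE.store("genderqueer")  # 11 characters cannot fit in 6
rescue ArgumentError => e
  puts "rejected: #{e.message}"
end
```

Each constraint on the restrictive column is a decision made on the user's behalf before they ever see the form.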

cczona

Checkbox, Radio, Select

Textarea, Text

Required / Optional

Alteration

Guided

So there you have it: a discretionary field. Whether to respond is left up to the user.

cczona

t.string :relationship_status, :null => true

As developers, we may look upon THIS expression as redundant, completely unnecessary. Duh, null is true by default. But making that explicit is a communication to the team and to your future self. It's a statement of intent. It's documenting a product decision.

cczona

Facebook

Single
In a relationship
Engaged
Married
It's complicated
Open relationship

What would a canonical set of relationship statuses look like? Three years ago, Facebook figured this list was pretty good. Arguably pretty progressive too, right? Users disagreed. Strongly.

cczona

Facebook

Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Separated
Divorced
Civil union
Domestic partnership

Under pressure, Facebook nearly doubled the options in just two years' time.

cczona

Google+

Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Separated
Divorced
Civil union
Domestic partnership
I don't want to say

While Google+ has largely adopted that list, it has not included Separated or Divorced. Notice that they also added something choice: Opt out of labeling.

cczona

Allowing users to identify their relationships with labels of greater personal significance? That's being driven by people rejecting a user experience that isn't working for them.

cczona

It's a scope

How did some statuses seem universal, while others weren't? Naming a thing creates scope. The assumed validity of a field's values gets constrained as soon as the field is named.

cczona

marital_status

Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Separated
Divorced
Civil union
Domestic partnership
I don't want to say

Marital status, for instance, might lead to a list similar to this one. A person is assumed to be either unmarried, preparing to be married, currently married, or formerly married.

cczona

legal_marital_status

Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Separated
Divorced
Civil union
Domestic partnership
I don't want to say

Marital status, for instance, might lead to a list similar to this one. A person is assumed to be either unmarried, preparing to be married, currently married, or formerly married.

cczona

relationship_status

Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Separated
Divorced
Civil union
Domestic partnership
I don't want to say

Whereas relationship status might lead to a list more like this one. In which one either has a current relationship -- or is defined by the absence of any.

cczona

singleness_status

As soon as you name a field, you define its paradigm and possibilities. There are important differences in what each of these envisions and is able to measure. Naming fields -- with great specificity, upfront -- makes analyses more powerful later.

cczona

singleness_rating

No chance whatsoever Suuuuuper duper single

1000

If you change the name, you shift the paradigm and possibilities. There's an important difference in what these set out to measure. Naming fields -- with great specificity, upfront -- makes analyses more powerful later.

cczona

It's Complicated
In a Relationship
Married
Divorced
Widowed
Single

We go through life experiencing many relationships. They don't all have pre-determined labels. And new relationship identities don't necessarily leave old ones behind.

cczona

"I like to be truthful, and 'It's Complicated' is really deceiving. It is not complicated. I am separated from my husband, who I am still legally married to."

-- Facebook user

[READ] Why should anyone have to feel like they're required to disavow a relationship that's meaningful for them?

It cuts deeply sometimes. Is a widow only "Widowed" until she resumes dating? Why does she have to be confronted with a decision like that just to use someone's app?

cczona

Okay, this one. Spot the fatal flaw. [WAIT]

cczona

buffer overflow

An open relationship is definitionally a 1:Many join -- WITH the usual engineering understanding, that is, that the actual number of relationships may be 0, 1, OR many.
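One way to picture that 1:Many shape is a small in-memory model in plain Ruby. This is a sketch under stated assumptions: the `Person` and `Relationship` names, the `relate_to` method, and the free-text labels are all hypothetical, and a real app would back this with a join table rather than an array.

```ruby
# Toy in-memory model. All names here are illustrative only.
Relationship = Struct.new(:partner, :label)

class Person
  attr_reader :name, :relationships

  def initialize(name)
    @name = name
    @relationships = []   # zero, one, or many: no slot to overflow
  end

  # Record a relationship under whatever label the person chooses.
  def relate_to(partner, label)
    @relationships << Relationship.new(partner, label)
    self
  end

  # Every label this person has chosen, without duplicates.
  def labels
    @relationships.map(&:label).uniq
  end
end

alex = Person.new("Alex")
alex.relate_to("Sam", "married").relate_to("Jo", "partner")

p Person.new("Kim").labels   # no relationships is a valid state
p alex.labels                # both relationships keep their own label
```

A single status column forces one of those two relationships to be dropped; a collection does not have to choose.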

cczona

Choose (one): evasive, inauthentic

But it's not funny from the personal perspective. This schema forces the person to either LOOK EVASIVE or else BE INAUTHENTIC. This is what happens when we try to throw more labels at the problem instead of re-examining the schema's assumptions.

cczona

Facebook relationship status by user age

srsly?

When we don't give people means to express what's real and important for them, they hack around us. Are a quarter of Facebook's 13-year-olds really married? Probably not. What they probably have is a BEST FRIEND, and "married" is the nearest equivalent available for them to express "THIS PERSON is my most important relationship."

The dogged pursuit of nice, clean, CRUNCHABLE data takes us to such wrong places. And makes our users want to work against a system that doesn't acknowledge their lives' reality.

cczona

We build or break community with each line of code

As developers, our lines of code build -- or break -- an app's community. So it's worth making conscious choices as we go.

cczona

Modeling the real world is complex

[READ] And that's okay.

cczona

Assuming we know who users are surrenders opportunity to learn who they are

But [read] Early constraints in schema NET crappy, misleading data. [beat] So keep constraints out of human and social schema, at least at first. Gather enough initial response to do some data mining. Watch that data for a while. Keep an eye out for emergent trends. What DISCOVERIES can you make? How can they shape the app's growth and evolution?

cczona

Quality

Data quality improves when lies are merely optional, not required.

cczona

Specificity

Data becomes like the real world itself: rich and specific. So we can unearth the patterns that are undetectable when data is generic.

cczona

Adaptability

We get to make discoveries and re-envision possibilities

We adapt quickly

cczona

Loyalty

And their response to all this? Engagement. Passion. LOYALTY.

cczona

THOSE are foundations for great user experiences. [LOOK UP. FINALITY.]

@cczona
cczona@gmail.com
http://cczona.com

Questions? Conversation?

Oh Hell Yeah

Carina C. Zona

cczona

Resources

cczona

Data Science & Information Architecture

• Sociological normalization
• Database normalization
• Using Machine Learning On Social Networks To Figure Out What You Should Read On The Web
• NoSQL Data Modeling Techniques
• Data & Reality, by William Kent
• Bad Data Handbook, Q. Ethan McCallum
• Data Science of the Facebook World

cczona

Relationships

• Does Facebook Hurt Relationships?
• Facebook Adds LGBT-Friendly Relationship Status Options
• Facebook Targeting by Relationship Status & Workplace
• Your Facebook Relationship Status: It's Complicated
• Gay Marriage: The Database Engineering Perspective

cczona

Sex & Genders

• "Disalienation: Why Gender is a Text Field on Diaspora"
• "Gender & Drop Down Menus"
• "Sex & Gender"
• "Bucket Gender"
• Recommendations for Inclusive Data Collection of Trans People

cczona

Names

• Falsehoods Programmers Believe About Names
• Your Last Name Contains Invalid Characters
• W3C Internationalization: Personal Names Around the World
• Spanish Names
• Chinese Names

cczona

More

• Redesigning the Country Selector
• American Religious Identification Survey: Summary Report 2009
• Linguistic Potluck: Crowdsourcing Internationalization in Rails

cczona

Image Credits

Postsecret
Facebook
OKCupid
Google+
Metafilter
Diaspora
Flickr
FetLife
Kotangle
cutestpaw.com
hdwallpapers.in
xkcd

cczona

Many Thanks

Chiu-Ki Chan
Estelle Weyl
Heather Rivers
Heroku
Jeremy Dunck
Josh Susser
Michele Titolo
NIRD LLC
Renée De Voursney
Sarah Mei
San Francisco Sex Information
Yoz Grahame

cczona

Get in Touch

Carina C. Zona
@cczona
http://cczona.com
cczona@gmail.com
http://slideshare.net/cczona
http://linkedin.com/in/cczona

Page 62: Schemas for the Real World [Madison RubyConf 2013]

cczona

Data can be sheerexpressiveness

Data can be sheer expressiveness Data that has character individualism distinctiveness

cczona

What we want What we get

bull Structured

bull Predictablebull Validations exceptions

bull Conditionals partials

bull Relational

bull Indexed bull Premature optimization

bull Exhaustivebull Cultural variability

bull Individual POV

bull Moving target

bull Easy analytics

bull Data-driven decisionsbull Decisions based on false

premises

As developers we have a vision of what a good codebase should be and not be

[READ]

In the most sensible of ways we often arrive at solutions that are factually TRUTHY while far removed from real life utility

cczona

What is your religion if any

ARIS is the largest ongoing survey of Americans religious identification It asks this simple OPEN-ENDED question [slide]

Which nets over 100 unique answers Which if youre making a form based on that is tricky Theres no form element that makes it easy for our users to pick themselves out of a list of so many possibilities

cczona

ARIS found though that these could be compressed into 13 major categories More manageable list right We could use that for a form But eh a lot of those are edge cases Wed rather want to focus on genuinely major groupings

cczona

What is your religion if any

Christian 76

Other 4

None 15

Dont Know or Refused 5

Which brings us down to this At least a quarter of Americans are Christian Done Every other religion would just be clutter Edge cases

And then theres this sort of crummy data with it Wed probably assign nil value for more than one of these categories right

cczona

What is your religion

Christian 76

Other 4

na 20

Fixed Which focuses attention on a problem here 1 in 5 are not useful answers from an advertisers perspective or for out own analytics The NILS -- gotta go

cczona

What is your religion

Christian 76

Other 24

So what were left with is a good clear list It covers ALL the big stuff When you get reductive enough for Americans religion it a binary Which from a storage standpoint is great Booleans Score

cczona

Religion

But we dont do THIS It covers the biggest categories But oh how it leaves people out

cczona

People arent edge cases

People arent edge cases

And theyre pushing back on apps that treat them that way

[LONG BEAT LONG] [BREATH] [CHANGE GEARS]

So Reductive has big problems So does scaling upward

cczona

Balancing between approaches

As engineers its instinctually uncomfortable to DELIBERATELY NOT STRUCTURE DATA for easy analysis I feel ya I really do This freaks me too But again the foundational question is What benefit will the user notice What identities are okay with us

cczona

And if necessary we CAN strike a middleground This is where guided response comes in Autosuggest

cczona

CheckboxRadioSelect

TextareaText

Required OptionalAutosuggest

Minimal Suggest

When theres a subset of values that youre most interested in do minimal suggest instead Autosuggest using just the handful of values you care about

Structured data from those who want to give it Free form to incite expressiveness in those who want THAT

Unguided TextOf those who use MetaFilters gender field

40 of responses are f m female male

[read] So structured data IS there This can be a balanced solution in many cases where youre willing to tolerate some ambiguity Of course there are tradeoffsData quantity is lower Freed to opt out of proving personal info many doOn the other hand data quality should improve

Optional Select60 of Facebook users select a relationship status

Its fine to mix and match here Find the right approach for your users and your apps business objectives Facebook for instance makes relationship status completely optional but coercive for those who do opt-in to setting a value Most users do opt-in 60 of them select a relationship status

cczona

We want everyone to feel excited what weve

built

The bottom line is that [read]

cczona

We want users to feel passionate about their

involvement

[read]

cczona

Analytics investments amp monetization are

based on a premise that data is accurate

[read]

BUT the data has been collected by coercive approaches the RISK is that--

cczona

this premise is garbage Some people are lying because lying has been made requirement for getting past the barriers So conclusions drawn from that bad data can misdirect decision-making about the next stage of development

cczona

CheckboxRadioSelect

TextareaText

Required

Corrective

DiscretionaryGuided

The restrictive options that stuff at bottom left those dont actually have to be marked required

cczona

ADD column gender NOT NULL VARCHAR(6) DEFAULT female

But the way we setup schema often embeds assumptions that we should and we will So we do An attribute that is ambiguously named is destined to become a form field whose data wanders away from its original intent and code that misunderstands how to use it A field thats not allowed to be null is destined to be mandatory A field that is assigned a maximum length is an ASSERTION that all possible values are knowable and fit within it

ADD COLUMN gender_identity VARCHAR NOT NULL

BOOM. This is the foundation for a whole different user experience. And the cool ninja move was that we decided to DO LESS. Make this stuff flexible upfront. Optimize storage later. Decide what's valid later.
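To make the contrast between the two column definitions concrete, here is a toy simulation of what each one does to a submitted value. It is a sketch only: `Column` and `store` are hypothetical stand-ins rather than a real database API, and treating the flexible column as nullable is an assumption taken from the talk's description of a discretionary field, not from the slide text.

```ruby
# A toy stand-in for a column definition; not a real database API.
Column = Struct.new(:name, :max_length, :default, :nullable, keyword_init: true)

# Mimics what a strict database would do with one submitted value.
def store(column, input)
  # A missing answer silently becomes the default, if there is one.
  value = input.nil? ? column.default : input
  raise ArgumentError, "#{column.name} may not be null" if value.nil? && !column.nullable
  if column.max_length && !value.nil? && value.length > column.max_length
    raise ArgumentError, "#{value.inspect} exceeds #{column.max_length} characters"
  end
  value
end

# The first slide's column: length limit, default, and no nulls.
RESTRICTIVE = Column.new(name: "gender", max_length: 6, default: "female", nullable: false)

# A relaxed column (assumed): no limit, no default, answering is optional.
FLEXIBLE = Column.new(name: "gender_identity", max_length: nil, default: nil, nullable: true)
```

With the restrictive column, a user who says nothing is recorded as "female", and a user who types "genderqueer" is turned away. With the flexible one, silence stays silence and the answer is stored as given.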

Checkbox, Radio, Select

Textarea, Text

Required, Optional

Alteration

Guided

So there you have it: a discretionary field. Whether to respond is left up to the user.

t.string :relationship_status, :null => true

As developers, we may look upon THIS expression as redundant, completely unnecessary. Duh, null is true by default. But making that explicit is a communication to the team and to your future self. It's a statement of intent. It's documenting a product decision.

Facebook

Single
In a relationship
Engaged
Married
It's complicated
Open relationship

What would a canonical set of relationship statuses look like? Three years ago, Facebook figured this list was pretty good. Arguably pretty progressive too, right? Users disagreed. Strongly.

Facebook

Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Separated
Divorced
Civil union
Domestic partnership

Under pressure, Facebook nearly doubled the options in just two years' time.

Google+

Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Separated
Divorced
Civil union
Domestic partnership
I don't want to say

While Google+ has largely adopted that list, it has not included Separated or Divorced. Notice that they also added something: choice. Opt out of labeling.

Allowing users to identify their relationships with labels of greater personal significance: that's being driven by people rejecting a user experience that isn't working for them.

It's a scope

How did some statuses seem universal while others weren't? Naming a thing creates scope. The assumed validity of a field's values gets constrained as soon as the field is named.

marital_status

Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Separated
Divorced
Civil union
Domestic partnership
I don't want to say

Marital status, for instance, might lead to a list similar to this one. A person is assumed to be either unmarried, preparing to be married, currently married, or formerly married.

legal_marital_status

Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Separated
Divorced
Civil union
Domestic partnership
I don't want to say

Marital status, for instance, might lead to a list similar to this one. A person is assumed to be either unmarried, preparing to be married, currently married, or formerly married.

relationship_status

Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Separated
Divorced
Civil union
Domestic partnership
I don't want to say

Whereas relationship status might lead to a list more like this one, in which one either has a current relationship or is defined by the absence of any.

singleness_status

As soon as you name a field, you define its paradigm and possibilities. There are important differences in what each of these envisions and is able to measure. Naming fields with great specificity upfront makes analyses more powerful later.

singleness_rating

No chance whatsoever ... Suuuuuper duper single

1000

If you change the name, you shift the paradigm and possibilities. There's an important difference in what these set out to collect and measure. Naming fields with great specificity upfront makes analyses more powerful later.

It's Complicated, In a Relationship

Married, Divorced

Widowed, Single

We go through life experiencing many relationships. They don't all have pre-determined labels. And new relationship identities don't necessarily leave old ones behind.

"I like to be truthful, and 'It's Complicated' is really deceiving. It is not complicated. I am separated from my husband, who I am still legally married to."

-- Facebook user

[READ] Why should anyone have to feel like they're required to disavow a relationship that's meaningful for them?

It cuts deeply sometimes. Is a widow only Widowed until she resumes dating? Why does she have to be confronted with a decision like that just to use someone's app?

Okay, this one. Spot the fatal flaw. [WAIT]

buffer overflow

An open relationship is definitionally a 1:Many join, WITH the usual engineering understanding: that is, that the actual number of relationships may be 0, 1, OR many.
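The 1:Many point can be sketched in plain Ruby by giving a person a collection of relationships instead of a single status slot. This is an illustrative model only; the `Person` and `Relationship` names, and the idea of storing a free-text label per relationship, are assumptions rather than anything shown in the talk.

```ruby
# One relationship: a label the user chose, plus an optional partner name.
Relationship = Struct.new(:label, :partner)

class Person
  attr_reader :name, :relationships

  def initialize(name)
    @name = name
    @relationships = [] # zero, one, or many; no single "status" slot
  end

  # Adding a relationship never overwrites an earlier one.
  def add_relationship(label, partner = nil)
    @relationships << Relationship.new(label, partner)
    self
  end

  def labels
    @relationships.map(&:label)
  end
end

# A new relationship identity does not have to leave the old one behind.
PERSON = Person.new("Example user")
PERSON.add_relationship("Widowed", "late spouse").add_relationship("Dating")
```

Because the collection can also be empty, declining to label anything needs no special "I don't want to say" value.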

Choose (one): evasive / inauthentic

But it's not funny from the personal perspective. This schema forces the person to either LOOK EVASIVE or else BE INAUTHENTIC. This is what happens when we try to throw more labels at the problem, instead of re-examining the schema's assumptions.

Facebook relationship status by user age

srsly

When we don't give people the means to express what's real and important for them, they hack around us. Are a quarter of Facebook's 13-year-olds really married? Probably not. What they probably have is a BEST FRIEND, and "married" is the nearest equivalent available for them to express "THIS PERSON is my most important relationship."

The dogged pursuit of nice, clean, CRUNCHABLE data takes us to such wrong places. And makes our users want to work against a system that doesn't acknowledge their lives' reality.

We build or break community with each line of code

As developers, our lines of code build, or break, an app's community. So it's worth making conscious choices as we go.

Modeling the real world is complex

[READ] And that's okay.

Assuming we know who users are surrenders opportunity to learn who they are

But [read] Early constraints in schema NET crappy, misleading data. [beat] So keep constraints out of human and social schema, at least at first. Gather enough initial response to do some data mining. Watch that data for a while. Keep an eye out for emergent trends. What DISCOVERIES can you make? How can they shape the app's growth and evolution?
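Watching unconstrained data for emergent trends can start as simply as counting what people actually typed. The sketch below is one possible approach under stated assumptions: the sample answers are invented, and the `emergent_values` method and its `min_count` threshold are hypothetical names, not part of any library.

```ruby
# Invented free-text answers to a relationship field, for illustration only.
ANSWERS = ["Best friend", "married", "best friend", "Partnered",
           "BEST FRIEND ", "partnered", "", nil, "it's complicated"].freeze

# Returns [value, count] pairs seen at least min_count times, most common first.
def emergent_values(responses, min_count: 2)
  # Normalize case and whitespace, drop blanks, then count each value.
  counts = responses.map { |r| r.to_s.strip.downcase }.reject(&:empty?).tally
  counts.select { |_, n| n >= min_count }.sort_by { |value, n| [-n, value] }
end

TRENDS = emergent_values(ANSWERS)
```

Values that keep recurring are candidates for later suggestions or analysis, and the rare ones remain in the data as typed.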

Quality

Data quality improves when lies are merely optional, not required.

Specificity

Data becomes like the real world itself: rich and specific. So we can unearth the patterns that are undetectable when data is generic.

Adaptability

We get to make discoveries and re-envision possibilities.

We adapt quickly.

Loyalty

And their response to all this? Engagement. Passion. LOYALTY.

THOSE are foundations for great user experiences. [LOOK UP FINALITY]

@cczona, cczona@gmail.com, http://cczona.com

Questions / Conversation

Oh Hell Yeah

Carina C. Zona

Resources

Data Science & Information Architecture

• Sociological normalization
• Database normalization
• Using Machine Learning On Social Networks To Figure Out What You Should Read On The Web
• NoSQL Data Modeling Techniques
• Data & Reality, by William Kent
• Bad Data Handbook, Q. Ethan McCallum
• Data Science of the Facebook World

Relationships

• Does Facebook Hurt Relationships
• Facebook Adds LGBT-Friendly Relationship Status Options
• Facebook Targeting by Relationship Status & Workplace
• Your Facebook Relationship Status: It's Complicated
• Gay Marriage: The Database Engineering Perspective

Sex & Genders

• "Disalienation: Why Gender is a Text Field on Diaspora"
• "Gender & Drop Down Menus"
• "Sex & Gender"
• "Bucket Gender"
• Recommendations for Inclusive Data Collection of Trans People

Names

• Falsehoods Programmers Believe About Names
• Your Last Name Contains Invalid Characters
• W3C Internationalization: Personal Names Around the World
• Spanish Names
• Chinese Names

More

• Redesigning the Country Selector
• American Religious Identification Survey Summary Report 2009
• Linguistic Potluck: Crowdsourcing Internationalization in Rails

Image Credits

Postsecret, Facebook, OKCupid, Google+, Metafilter, Diaspora, Flickr, FetLife, Kotangle, cutestpaw.com, hdwallpapers.in, xkcd

Many Thanks

Chiu-Ki Chan, Estelle Weyl, Heather Rivers, Heroku, Jeremy Dunck, Josh Susser, Michele Titolo, NIRD LLC, Renée De Voursney, Sarah Mei, San Francisco Sex Information, Yoz Grahame

Get in Touch

Carina C. Zona, @cczona

http://cczona.com

cczona@gmail.com

http://slideshare.net/cczona

http://linkedin.com/in/cczona

What we want

• Structured
• Predictable
• Relational
• Indexed
• Exhaustive
• Easy analytics
• Data-driven decisions

What we get

• Validations, exceptions
• Conditionals, partials
• Premature optimization
• Cultural variability
• Individual POV
• Moving target
• Decisions based on false premises

As developers, we have a vision of what a good codebase should be, and not be.

[READ]

In the most sensible of ways, we often arrive at solutions that are factually TRUTHY while far removed from real-life utility.

What is your religion, if any?

ARIS is the largest ongoing survey of Americans' religious identification. It asks this simple, OPEN-ENDED question: [slide]

Which nets over 100 unique answers. Which, if you're making a form based on that, is tricky. There's no form element that makes it easy for our users to pick themselves out of a list of so many possibilities.

ARIS found, though, that these could be compressed into 13 major categories. More manageable list, right? We could use that for a form. But, eh, a lot of those are edge cases. We'd rather focus on genuinely major groupings.

What is your religion, if any?

Christian: 76%
Other: 4%
None: 15%
Don't Know or Refused: 5%

Which brings us down to this. At least three quarters of Americans are Christian. Done. Every other religion would just be clutter. Edge cases.

And then there's this sort of crummy data with it. We'd probably assign a nil value for more than one of these categories, right?

What is your religion?

Christian: 76%
Other: 4%
n/a: 20%

Fixed. Which focuses attention on a problem here: 1 in 5 are not useful answers from an advertiser's perspective, or for our own analytics. The NILS -- gotta go.

What is your religion?

Christian: 76%
Other: 24%

So what we're left with is a good, clear list. It covers ALL the big stuff. When you get reductive enough, for Americans, religion is a binary. Which from a storage standpoint is great. Booleans! Score!
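The reduction walked through on these slides is plain arithmetic, and writing it out shows how much is discarded at each step. This sketch uses the percentages from the slides; the `collapse` method and the constant names are hypothetical, introduced only for illustration.

```ruby
# Shares from the four-category slide, in percent.
SURVEY = { "Christian" => 76, "Other" => 4, "None" => 15, "Don't Know or Refused" => 5 }.freeze

# Merges categories according to mapping (old label => new label), summing shares.
def collapse(shares, mapping)
  shares.each_with_object(Hash.new(0)) do |(label, pct), out|
    out[mapping.fetch(label, label)] += pct
  end
end

# Step 1: treat "None" and "Don't Know or Refused" as nil-like answers.
STEP_1 = collapse(SURVEY, { "None" => "n/a", "Don't Know or Refused" => "n/a" })

# Step 2: the nils "gotta go", so they are folded into "Other".
STEP_2 = collapse(STEP_1, { "n/a" => "Other" })
```

Each step still sums to 100, which is what makes the result look clean. What it no longer records is the difference between having no religion, refusing to answer, and every answer that was not "Christian".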

Religion

But we don't do THIS. It covers the biggest categories. But oh, how it leaves people out.

People aren't edge cases

People aren't edge cases. And they're pushing back on apps that treat them that way.

[LONG BEAT LONG] [BREATH] [CHANGE GEARS]

So: Reductive has big problems. So does scaling upward.

Balancing between approaches

As engineers, it's instinctually uncomfortable to DELIBERATELY NOT STRUCTURE DATA for easy analysis. I feel ya. I really do. This freaks me out too. But again, the foundational question is: What benefit will the user notice? What identities are okay with us?

And if necessary, we CAN strike a middle ground. This is where guided response comes in. Autosuggest.

Checkbox, Radio, Select

Textarea, Text

Required, Optional, Autosuggest

Minimal Suggest

When there's a subset of values that you're most interested in, do minimal suggest instead. Autosuggest using just the handful of values you care about.

Structured data from those who want to give it. Free form to incite expressiveness in those who want THAT.
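A minimal suggest field can be sketched as two small functions: one that offers matches from a short list while the user types, and one that accepts whatever they finally submit. This is an assumption-laden illustration, not a description of any particular autosuggest widget; the `SUGGESTED` values and the `suggest` and `accept` names are hypothetical.

```ruby
# The handful of values we care most about (a hypothetical choice).
SUGGESTED = ["Single", "In a relationship", "Married"].freeze

# Suggestions whose text starts with what has been typed so far.
def suggest(typed, suggestions = SUGGESTED)
  prefix = typed.to_s.strip.downcase
  return [] if prefix.empty?

  suggestions.select { |s| s.downcase.start_with?(prefix) }
end

# What gets stored on submit: nil for blank, the canonical spelling for a
# suggested value, and otherwise the user's own words, unchanged.
def accept(typed, suggestions = SUGGESTED)
  text = typed.to_s.strip
  return nil if text.empty?

  suggestions.find { |s| s.casecmp?(text) } || text
end
```

The suggestions nudge common answers toward one spelling, which keeps them easy to count, while an answer outside the list is never rejected.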


bullSpanish Names

bullChinese Names

cczona

More

bullRedesigning the Country Selector

bullAmerican Religious Identification Survey Summary Report 2009

bullLinguistic Potluck Crowdsourcing Internationalization in Rails

cczona

Image CreditsPostsecret

Facebook

OKCupid

Google+

Metafilter

Diaspora

Flickr

FetLife

Kotangle

cutestpawcom

hdwallpapersin

xkcd

cczona

Many ThanksChiu-Ki ChanEstelle Weyl

Heather RiversHeroku

Jeremy DunckJosh Susser

Michele TitoloNIRD LLC

Reneacutee De VoursneySarah Mei

San Francisco Sex InformationYoz Grahame

cczona

Get in Touch

Carina C Zonacczona

httpcczonacom

cczonagmailcom

httpslidesharenetcczona

httplinkedincomincczona

Page 65: Schemas for the Real World [Madison RubyConf 2013]

ARIS found, though, that these could be compressed into 13 major categories. More manageable list, right? We could use that for a form. But, eh, a lot of those are edge cases. We'd rather focus on genuinely major groupings.

What is your religion, if any?

Christian: 76%
Other: 4%
None: 15%
Don't Know or Refused: 5%

Which brings us down to this. At least three quarters of Americans are Christian. Done. Every other religion would just be clutter. Edge cases.

And then there's this sort of crummy data with it. We'd probably assign a nil value for more than one of these categories, right?

What is your religion?

Christian: 76%
Other: 4%
n/a: 20%

Fixed. Which focuses attention on a problem here: 1 in 5 are not useful answers from an advertiser's perspective, or for our own analytics. The NILs: gotta go.

What is your religion?

Christian: 76%
Other: 24%

So what we're left with is a good, clear list. It covers ALL the big stuff. When you get reductive enough, for Americans religion is a binary. Which, from a storage standpoint, is great. Booleans! Score!

Religion

But we don't do THIS. It covers the biggest categories. But oh, how it leaves people out.
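What the reduction throws away can be sketched in a few lines of Ruby. This is only an illustration: the percentages are the ones from the slides, while the `collapse` helper and the bucket name are hypothetical and not part of ARIS or of any app.

```ruby
# Survey shares as shown on the slides (percent of respondents).
SURVEY = {
  "Christian" => 76,
  "Other" => 4,
  "None" => 15,
  "Don't Know or Refused" => 5
}.freeze

# Hypothetical helper: fold every answer except `keep` into a single bucket.
def collapse(shares, keep, bucket = "Other")
  shares.each_with_object(Hash.new(0)) do |(answer, percent), out|
    out[answer == keep ? keep : bucket] += percent
  end
end

binary = collapse(SURVEY, "Christian")

# Every answer that was folded into the bucket is now indistinguishable.
folded = SURVEY.keys.reject { |answer| answer == "Christian" }

puts binary.inspect
puts "Folded together: #{folded.join(', ')}"
```

The storage is smaller, but three different answers have become one value that says nothing about any of them.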

People aren't edge cases

People aren't edge cases.

And they're pushing back on apps that treat them that way.

[LONG BEAT LONG] [BREATH] [CHANGE GEARS]

So. Reductive has big problems. So does scaling upward.

Balancing between approaches

As engineers, it's instinctually uncomfortable to DELIBERATELY NOT STRUCTURE DATA for easy analysis. I feel ya. I really do. This freaks me out too. But again, the foundational question is: What benefit will the user notice? What identities are okay with us?

And if necessary, we CAN strike a middle ground. This is where guided response comes in. Autosuggest.

Checkbox / Radio / Select
Textarea / Text
Required | Optional
Autosuggest

Minimal Suggest

When there's a subset of values that you're most interested in, do minimal suggest instead. Autosuggest using just the handful of values you care about.

Structured data from those who want to give it. Free form to incite expressiveness in those who want THAT.
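As a rough sketch of minimal suggest, assuming a plain text field on the form: offer a short list of values while the user types, but store whatever they actually submit. The `SUGGESTIONS` list and both helper names are made up for illustration.

```ruby
# The handful of values we most want as structured data (hypothetical list).
SUGGESTIONS = ["Single", "In a relationship", "Married"].freeze

# Return the suggestions that start with what the user has typed so far.
def suggest(typed, suggestions = SUGGESTIONS)
  prefix = typed.to_s.strip.downcase
  return [] if prefix.empty?
  suggestions.select { |s| s.downcase.start_with?(prefix) }
end

# Accept whatever the user finally submits; a blank answer is stored as nil.
def accept(answer)
  text = answer.to_s.strip
  text.empty? ? nil : text
end

puts suggest("ma").inspect         # suggestions offered while typing
puts accept("Handfasted").inspect  # free text is kept as given
puts accept("").inspect            # skipping the field is allowed
```

The suggestion list only guides; it never validates. That is the difference between this and a select menu.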

Unguided Text

Of those who use MetaFilter's gender field, 40% of responses are "f", "m", "female", "male".

[read] So structured data IS there. This can be a balanced solution in many cases where you're willing to tolerate some ambiguity. Of course, there are tradeoffs. Data quantity is lower: freed to opt out of providing personal info, many do. On the other hand, data quality should improve.
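Here is one way the structured portion of an unguided text field might be pulled out afterwards. This is a sketch under assumptions: the `CANONICAL` mapping and the sample answers are invented for illustration and are not MetaFilter data.

```ruby
# Map common spellings onto a canonical value (hypothetical mapping).
CANONICAL = {
  "f" => "female", "female" => "female",
  "m" => "male",   "male" => "male"
}.freeze

# Split answers into those that normalize cleanly and those left as written.
# Blank and missing answers are dropped: nobody was required to respond.
def partition_responses(responses)
  given = responses.map { |r| r.to_s.strip }.reject(&:empty?)
  structured, free_form = given.partition { |r| CANONICAL.key?(r.downcase) }
  [structured.map { |r| CANONICAL[r.downcase] }, free_form]
end

# Made-up sample answers, for illustration only.
sample = ["F", "male", "genderqueer", "m", "yes please", "", nil,
          "Female ", "robot", "trans woman"]

structured, free_form = partition_responses(sample)
share = (100.0 * structured.size / (structured.size + free_form.size)).round

puts "#{share}% structured: #{structured.tally.inspect}"
puts "free form: #{free_form.inspect}"
```

Nothing is rejected at entry time. The crunchable subset is derived later, and the free-form answers stay exactly as people wrote them.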

Optional Select

60% of Facebook users select a relationship status

It's fine to mix and match here. Find the right approach for your users and your app's business objectives. Facebook, for instance, makes relationship status completely optional, but coercive for those who do opt in to setting a value. Most users do opt in: 60% of them select a relationship status.

We want everyone to feel excited about what we've built

The bottom line is that [read]

We want users to feel passionate about their involvement

[read]

Analytics investments & monetization are based on a premise that data is accurate

[read]

BUT if the data has been collected by coercive approaches, the RISK is that--

this premise is garbage. Some people are lying, because lying has been made a requirement for getting past the barriers. So conclusions drawn from that bad data can misdirect decision-making about the next stage of development.

Checkbox / Radio / Select
Textarea / Text
Required
Corrective
Discretionary | Guided

The restrictive options, that stuff at bottom left: those don't actually have to be marked required.

ADD COLUMN gender VARCHAR(6) NOT NULL DEFAULT 'female'

But the way we set up schema often embeds assumptions that we should, and we will. So we do. An attribute that is ambiguously named is destined to become a form field whose data wanders away from its original intent, and code that misunderstands how to use it. A field that's not allowed to be null is destined to be mandatory. A field that is assigned a maximum length is an ASSERTION that all possible values are knowable and fit within it.

ADD COLUMN gender_identity VARCHAR

BOOM. This is the foundation for a whole different user experience. And the cool ninja move was that we decided to DO LESS. Make this stuff flexible upfront. Optimize storage later. Decide what's valid later.
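To make the contrast concrete, here is a toy Ruby model of how column constraints shape what gets stored. It is not a database and not any ORM's API. The `Column` struct and its `store` method are hypothetical. One column is constrained like the `gender` definition; the other is nullable and unbounded, in the spirit of "make this stuff flexible upfront".

```ruby
# Toy stand-in for a column definition; not a real database or ORM API.
Column = Struct.new(:name, :null, :limit, :default) do
  # Return the value that would be stored, or raise if a constraint rejects it.
  def store(value)
    value = default if value.nil?
    raise ArgumentError, "#{name} is required" if value.nil? && !null
    if limit && value && value.length > limit
      raise ArgumentError, "#{name} is longer than #{limit} characters"
    end
    value
  end
end

constrained = Column.new("gender", false, 6, "female")
flexible    = Column.new("gender_identity", true, nil, nil)

puts constrained.store(nil).inspect   # a skipped answer silently becomes the default
begin
  constrained.store("genderqueer")
rescue ArgumentError => e
  puts "rejected: #{e.message}"       # an honest answer does not fit
end
puts flexible.store(nil).inspect      # a skipped answer stays unknown
puts flexible.store("genderqueer")    # an honest answer is kept
```

With the constrained column, the schema invents one answer and refuses another before any form or validation code has been written.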

Checkbox / Radio / Select
Textarea / Text
Required | Optional
Alteration
Guided

So there you have it: a discretionary field. Whether to respond is left up to the user.

t.string :relationship_status, :null => true

As developers, we may look upon THIS expression as redundant, completely unnecessary. Duh, null is true by default. But making that explicit is a communication to the team and to your future self. It's a statement of intent. It's documenting a product decision.

Facebook

Single
In a relationship
Engaged
Married
It's complicated
Open relationship

What would a canonical set of relationship statuses look like? Three years ago, Facebook figured this list was pretty good. Arguably pretty progressive too, right? Users disagreed. Strongly.

Facebook

Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Separated
Divorced
Civil union
Domestic partnership

Under pressure, Facebook nearly doubled the options in just two years' time.

Google+

Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Civil union
Domestic partnership
I don't want to say

While Google+ has largely adopted that list, it has not included Separated or Divorced. Notice that they also added something: choice. Opt out of labeling.

Allowing users to identify their relationships with labels of greater personal significance? That's being driven by people rejecting a user experience that isn't working for them.

It's a scope

How did some statuses seem universal while others weren't? Naming a thing creates scope. The assumed validity of a field's values gets constrained as soon as the field is named.

marital_status

Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Separated
Divorced
Civil union
Domestic partnership
I don't want to say

Marital status, for instance, might lead to a list similar to this one. A person is assumed to be either unmarried, preparing to be married, currently married, or formerly married.

legal_marital_status

Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Separated
Divorced
Civil union
Domestic partnership
I don't want to say

Marital status, for instance, might lead to a list similar to this one. A person is assumed to be either unmarried, preparing to be married, currently married, or formerly married.

relationship_status

Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Separated
Divorced
Civil union
Domestic partnership
I don't want to say

Whereas relationship status might lead to a list more like this one. In which one either has a current relationship -- or is defined by the absence of any.

singleness_status

As soon as you name a field, you define its paradigm and possibilities. There are important differences in what each of these envisions and is able to measure. Naming fields -- with great specificity upfront -- makes analyses more powerful later.

singleness_rating

No chance whatsoever | Suuuuuper duper single
1000

If you change the name, you shift the paradigm and possibilities. There are important differences in what these set out to collect and measure. Naming fields -- with great specificity upfront -- makes analyses more powerful later.

It's Complicated / In a Relationship / Married / Divorced / Widowed / Single

We go through life experiencing many relationships. They don't all have pre-determined labels. And new relationship identities don't necessarily leave old ones behind.

"I like to be truthful, and 'It's Complicated' is really deceiving. It is not complicated. I am separated from my husband, who I am still legally married to."
-- Facebook user

[READ] Why should anyone have to feel like they're required to disavow a relationship that's meaningful for them?

It cuts deeply sometimes. Is a widow only "Widowed" until she resumes dating? Why does she have to be confronted with a decision like that, just to use someone's app?

Okay, this one. Spot the fatal flaw. [WAIT]

buffer overflow

An open relationship is definitionally a 1:Many join -- WITH the usual engineering understanding that the actual number of relationships may be 0, 1, OR many.

Choose (one): evasive | inauthentic

But it's not funny from the personal perspective. This schema forces the person to either LOOK EVASIVE or else BE INAUTHENTIC. This is what happens when we try to throw more labels at the problem, instead of re-examining the schema's assumptions.
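One way to re-examine the assumption, sketched in plain Ruby: treat relationships as a collection a person holds (zero, one, or many) instead of a single status column. The `Person` and `Relationship` names, their fields, and the sample people are all hypothetical.

```ruby
# One relationship a person chooses to describe; `label` is free text.
Relationship = Struct.new(:label, :with, :current)

# A person holds zero, one, or many relationships instead of a single status.
class Person
  attr_reader :name, :relationships

  def initialize(name)
    @name = name
    @relationships = []
  end

  # Add a relationship and return self so calls can be chained.
  def add_relationship(label, with: nil, current: true)
    @relationships << Relationship.new(label, with, current)
    self
  end

  # Labels the person currently identifies with; may be empty.
  def current_labels
    relationships.select(&:current).map(&:label)
  end
end

alex = Person.new("Alex") # hypothetical user
alex.add_relationship("Widowed", with: "Sam")
    .add_relationship("Dating", with: "Jo")

puts alex.current_labels.inspect
puts Person.new("Kim").current_labels.inspect
```

Here a widow who resumes dating adds a relationship without having to disavow one, and someone with nothing to declare is simply an empty list.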

Facebook relationship status by user age

srsly

When we don't give people means to express what's real and important for them, they hack around us. Are a quarter of Facebook's 13-year-olds really married? Probably not. What they probably have is a BEST FRIEND, and "married" is the nearest equivalent available for them to express "THIS PERSON is my most important relationship."

The dogged pursuit of nice, clean, CRUNCHABLE data takes us to such wrong places. And makes our users want to work against a system that doesn't acknowledge their lives' reality.

We build or break community with each line of code

As developers, our lines of code build -- or break -- an app's community. So it's worth making conscious choices as we go.

Modeling the real world is complex

[READ] And that's okay.

Assuming we know who users are surrenders opportunity to learn who they are

But [read]. Early constraints in schema NET crappy, misleading data. [beat] So keep constraints out of human and social schema, at least at first. Gather enough initial response to do some data mining. Watch that data for a while. Keep an eye out for emergent trends. What DISCOVERIES can you make? How can they shape the app's growth and evolution?
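Watching for emergent trends can start very small. The sketch below counts free-text answers and surfaces the frequent ones that are not already on a suggestion list. The `emergent_values` helper, its threshold, and the sample responses are assumptions made for illustration.

```ruby
# Count free-text answers and surface frequent ones not yet in our suggestions.
def emergent_values(responses, known:, min_count: 2)
  known_set = known.map(&:downcase)
  counts = responses.map { |r| r.to_s.strip.downcase }
                    .reject(&:empty?)
                    .tally
  counts.select { |value, n| n >= min_count && !known_set.include?(value) }
        .sort_by { |value, n| [-n, value] }
        .map(&:first)
end

# Made-up responses, for illustration only.
responses = ["Married", "best friends", "Best Friends", "single", "best friends",
             "poly", "Poly", "it's a long story", nil]

puts emergent_values(responses, known: ["Single", "Married"]).inspect
```

Values that keep showing up are candidates for the suggestion list, or prompts to rethink what the field is really asking.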

Quality

Data quality improves when lies are merely optional, not required.

Specificity

Data becomes like the real world itself: rich and specific. So we can unearth the patterns that are undetectable when data is generic.

Adaptability

We get to make discoveries and re-envision possibilities.

We adapt quickly.

Loyalty

And their response to all this? Engagement. Passion. LOYALTY.

THOSE are foundations for great user experiences. [LOOK UP. FINALITY]

Questions? Conversation? Oh Hell Yeah!

Carina C. Zona
@cczona
cczona@gmail.com
http://cczona.com

Resources

Data Science & Information Architecture
• Sociological normalization
• Database normalization
• Using Machine Learning On Social Networks To Figure Out What You Should Read On The Web
• NoSQL Data Modeling Techniques
• Data & Reality, by William Kent
• Bad Data Handbook, Q. Ethan McCallum
• Data Science of the Facebook World

Relationships
• Does Facebook Hurt Relationships?
• Facebook Adds LGBT-Friendly Relationship Status Options
• Facebook Targeting by Relationship Status & Workplace
• Your Facebook Relationship Status: It's Complicated
• Gay Marriage: The Database Engineering Perspective

Sex & Genders
• "Disalienation: Why Gender is a Text Field on Diaspora"
• "Gender & Drop Down Menus"
• "Sex & Gender"
• "Bucket Gender"
• Recommendations for Inclusive Data Collection of Trans People

Names
• Falsehoods Programmers Believe About Names
• Your Last Name Contains Invalid Characters
• W3C Internationalization: Personal Names Around the World
• Spanish Names
• Chinese Names

More
• Redesigning the Country Selector
• American Religious Identification Survey Summary Report 2009
• Linguistic Potluck: Crowdsourcing Internationalization in Rails

Image Credits

Postsecret
Facebook
OKCupid
Google+
Metafilter
Diaspora
Flickr
FetLife
Kotangle
cutestpaw.com
hdwallpapers.in
xkcd

Many Thanks

Chiu-Ki Chan
Estelle Weyl
Heather Rivers
Heroku
Jeremy Dunck
Josh Susser
Michele Titolo
NIRD LLC
Renée De Voursney
Sarah Mei
San Francisco Sex Information
Yoz Grahame

Get in Touch

Carina C. Zona
@cczona
http://cczona.com
cczona@gmail.com
http://slideshare.net/cczona
http://linkedin.com/in/cczona


Page 67: Schemas for the Real World [Madison RubyConf 2013]

cczona

What is your religion

Christian 76

Other 4

na 20

Fixed Which focuses attention on a problem here 1 in 5 are not useful answers from an advertisers perspective or for out own analytics The NILS -- gotta go

cczona

What is your religion

Christian 76

Other 24

So what were left with is a good clear list It covers ALL the big stuff When you get reductive enough for Americans religion it a binary Which from a storage standpoint is great Booleans Score

cczona

Religion

But we dont do THIS It covers the biggest categories But oh how it leaves people out

cczona

People arent edge cases

People arent edge cases

And theyre pushing back on apps that treat them that way

[LONG BEAT LONG] [BREATH] [CHANGE GEARS]

So Reductive has big problems So does scaling upward

cczona

Balancing between approaches

As engineers its instinctually uncomfortable to DELIBERATELY NOT STRUCTURE DATA for easy analysis I feel ya I really do This freaks me too But again the foundational question is What benefit will the user notice What identities are okay with us

cczona

And if necessary we CAN strike a middleground This is where guided response comes in Autosuggest

cczona

CheckboxRadioSelect

TextareaText

Required OptionalAutosuggest

Minimal Suggest

When theres a subset of values that youre most interested in do minimal suggest instead Autosuggest using just the handful of values you care about

Structured data from those who want to give it Free form to incite expressiveness in those who want THAT

Unguided TextOf those who use MetaFilters gender field

40 of responses are f m female male

[read] So structured data IS there This can be a balanced solution in many cases where youre willing to tolerate some ambiguity Of course there are tradeoffsData quantity is lower Freed to opt out of proving personal info many doOn the other hand data quality should improve

Optional Select60 of Facebook users select a relationship status

Its fine to mix and match here Find the right approach for your users and your apps business objectives Facebook for instance makes relationship status completely optional but coercive for those who do opt-in to setting a value Most users do opt-in 60 of them select a relationship status

cczona

We want everyone to feel excited what weve

built

The bottom line is that [read]

cczona

We want users to feel passionate about their

involvement

[read]

cczona

Analytics investments amp monetization are

based on a premise that data is accurate

[read]

BUT the data has been collected by coercive approaches the RISK is that--

cczona

this premise is garbage Some people are lying because lying has been made requirement for getting past the barriers So conclusions drawn from that bad data can misdirect decision-making about the next stage of development

cczona

CheckboxRadioSelect

TextareaText

Required

Corrective

DiscretionaryGuided

The restrictive options that stuff at bottom left those dont actually have to be marked required

cczona

ADD column gender NOT NULL VARCHAR(6) DEFAULT female

But the way we setup schema often embeds assumptions that we should and we will So we do An attribute that is ambiguously named is destined to become a form field whose data wanders away from its original intent and code that misunderstands how to use it A field thats not allowed to be null is destined to be mandatory A field that is assigned a maximum length is an ASSERTION that all possible values are knowable and fit within it

cczona

ADD column gender_identity NOT NULL VARCHAR

BOOM This is foundation for a whole different user experience And the cool ninja move was that we decided to DO LESS Make this stuff flexible upfront Optimize storage later Decide whats valid later

cczona

CheckboxRadioSelect

TextareaText

Required Optional

Alteration

Guided

So there you have it a discretionary field Whether to respond is left up to the user

cczona

tstring relationship_status null =gt true

As developers we may upon THIS expression as redundant completely unnecessary Duh null is true by default But making that explicit is a communication to the team and to your future self Its a statement of intent Its documenting a product decision

cczona

FacebookSingle

In a relationship

Engaged

Married

Its complicated

Open relationship

What would a canonical set of relationship statuses look like Three years ago Facebook figured this list was pretty good Arguably pretty progressive too right Users disagreed Strongly

Facebook

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

Under pressure, Facebook nearly doubled the options in just two years' time.

Google+

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I don't want to say

While Google+ has largely adopted that list, it has not included Separated or Divorced. Notice that they also added something: choice. Opt out of labeling.

Allowing users to identify their relationships with labels of greater personal significance

That's being driven by people rejecting a user experience that isn't working for them.

It's a scope

How did some statuses seem universal, while others weren't? Naming a thing creates scope. The assumed validity of a field's values gets constrained as soon as the field is named.

marital_status

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I don't want to say

Marital status, for instance, might lead to a list similar to this one. A person is assumed to be either unmarried, preparing to be married, currently married, or formerly married.

legal_marital_status

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I don't want to say

Marital status, for instance, might lead to a list similar to this one. A person is assumed to be either unmarried, preparing to be married, currently married, or formerly married.

relationship_status

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I don't want to say

Whereas relationship status might lead to a list more like this one. In which one either has a current relationship -- or is defined by the absence of any.

singleness_status

As soon as you name a field, you define its paradigm and possibilities. There are important differences in what each of these envisions and is able to measure. Naming fields -- with great specificity up front -- makes analyses more powerful later.

singleness_rating

No chance whatsoever Suuuuuper duper single

1000

If you change the name, you shift the paradigm and possibilities. There's an important difference in what these set out to measure. Naming fields -- with great specificity up front -- makes analyses more powerful later.

It's Complicated

In a Relationship

Married

Divorced

Widowed

Single

We go through life experiencing many relationships. They don't all have pre-determined labels. And new relationship identities don't necessarily leave old ones behind.

"I like to be truthful, and 'It's Complicated' is really deceiving. It is not complicated. I am separated from my husband, who I am still legally married to."

-- Facebook user

[READ] Why should anyone have to feel like they're required to disavow a relationship that's meaningful for them?

It cuts deeply sometimes. Is a widow only "Widowed" until she resumes dating? Why does she have to be confronted with a decision like that, just to use someone's app?

Okay, this one. Spot the fatal flaw. [WAIT]

buffer overflow

An open relationship is definitionally a 1:Many join -- WITH the usual engineering understanding, that is, that the actual number of relationships may be 0, 1, OR many.

Choose (one): evasive / inauthentic

But it's not funny from the personal perspective. This schema forces the person to either LOOK EVASIVE, or else BE INAUTHENTIC. This is what happens when we try to throw more labels at the problem instead of re-examining the schema's assumptions.
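
One way to re-examine that assumption is to stop modeling "the" relationship as a single value. The sketch below, in plain Ruby, treats relationships as a collection that may hold zero, one, or many entries, each with a label the person chooses. The `Person` and `Relationship` names, the `add_relationship` method, and the sample people are all hypothetical, invented only for illustration; this is not how any particular site stores its data.

```ruby
# Hypothetical model: one person, zero-to-many relationships.
Relationship = Struct.new(:label, :partner, keyword_init: true)

class Person
  attr_reader :name, :relationships

  def initialize(name)
    @name = name
    @relationships = []   # empty is a valid state, not an error
  end

  # label is free text chosen by the person; naming a partner is optional.
  def add_relationship(label:, partner: nil)
    @relationships << Relationship.new(label: label, partner: partner)
    self   # return self so calls can be chained
  end

  # Distinct labels, in the order they were added.
  def labels
    @relationships.map(&:label).uniq
  end
end

person = Person.new("Alex")
person.add_relationship(label: "married", partner: "Sam")
      .add_relationship(label: "partner", partner: "Jo")
      .add_relationship(label: "widowed")

puts person.labels.inspect
puts Person.new("Kai").labels.inspect
```

Nothing here forces a choice between labels: adding a new relationship leaves the old ones in place, and a person with no entries at all is simply a person with an empty list.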

Facebook relationship status by user age

srsly

When we don't give people means to express what's real and important for them, they hack around us. Are a quarter of Facebook's 13-year-olds really married? Probably not. What they probably have is a BEST FRIEND, and "married" is the nearest equivalent available for them to express "THIS PERSON is my most important relationship."

The dogged pursuit of nice, clean, CRUNCHABLE data takes us to such wrong places. And makes our users want to work against a system that doesn't acknowledge their lives' reality.

We build or break community with each line of code

As developers, our lines of code build -- or break -- an app's community. So it's worth making conscious choices as we go.

Modeling the real world is complex

[READ] And that's okay.

Assuming we know who users are surrenders opportunity to learn who they are

But [read]. Early constraints in schema NET crappy, misleading data. [beat] So keep constraints out of human and social schema, at least at first. Gather enough initial response to do some data mining. Watch that data for a while. Keep an eye out for emergent trends. What DISCOVERIES can you make? How can they shape the app's growth and evolution?
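
As a rough illustration of watching free-form data for emergent trends, the sketch below tallies unconstrained text responses in plain Ruby. The `responses` array is made up for this example and is not data from any real site; the `normalize` helper and its rules (trim whitespace, ignore case) are assumptions about what a first pass might look like.

```ruby
# Made-up free-text answers, for illustration only.
responses = ["F", "female", " Male", "m", "genderqueer", nil, "", "Female", "it varies"]

# First-pass cleanup: treat nil as empty, trim whitespace, ignore case.
def normalize(text)
  text.to_s.strip.downcase
end

normalized = responses.map { |r| normalize(r) }
answered   = normalized.reject(&:empty?)
skipped    = normalized.size - answered.size

# Count each distinct answer, most common first, ties broken alphabetically.
counts = answered
  .each_with_object(Hash.new(0)) { |value, tally| tally[value] += 1 }
  .sort_by { |value, count| [-count, value] }

counts.each { |value, count| puts "#{count}  #{value}" }
puts "#{skipped} left blank"
```

Even this crude tally shows which answers cluster and which are one of a kind, which is the information needed to decide later what, if anything, is worth structuring.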

Quality

Data quality improves when lies are merely optional, not required.

Specificity

Data becomes like the real world itself: rich and specific. So we can unearth the patterns that are undetectable when data is generic.

Adaptability

We get to make discoveries and re-envision possibilities.

We adapt quickly.

Loyalty

And their response to all this? Engagement. Passion. LOYALTY.

THOSE are foundations for great user experiences. [LOOK UP FINALITY]

@cczona | cczona@gmail.com | http://cczona.com

Questions? Conversation?

Oh Hell Yeah

Carina C. Zona

Resources

Data Science & Information Architecture

• Sociological normalization

• Database normalization

• Using Machine Learning On Social Networks To Figure Out What You Should Read On The Web

• NoSQL Data Modeling Techniques

• Data & Reality, by William Kent

• Bad Data Handbook, Q. Ethan McCallum

• Data Science of the Facebook World

Relationships

• Does Facebook Hurt Relationships?

• Facebook Adds LGBT-Friendly Relationship Status Options

• Facebook Targeting by Relationship Status & Workplace

• Your Facebook Relationship Status: It's Complicated

• Gay Marriage: The Database Engineering Perspective

Sex & Genders

• "Disalienation: Why Gender is a Text Field on Diaspora"

• "Gender & Drop Down Menus"

• "Sex & Gender"

• "Bucket Gender"

• Recommendations for Inclusive Data Collection of Trans People

Names

• Falsehoods Programmers Believe About Names

• Your Last Name Contains Invalid Characters

• W3C Internationalization: Personal Names Around the World

• Spanish Names

• Chinese Names

More

• Redesigning the Country Selector

• American Religious Identification Survey Summary Report 2009

• Linguistic Potluck: Crowdsourcing Internationalization in Rails

Image Credits

Postsecret

Facebook

OKCupid

Google+

Metafilter

Diaspora

Flickr

FetLife

Kotangle

cutestpaw.com

hdwallpapers.in

xkcd

Many Thanks

Chiu-Ki Chan

Estelle Weyl

Heather Rivers

Heroku

Jeremy Dunck

Josh Susser

Michele Titolo

NIRD LLC

Renée De Voursney

Sarah Mei

San Francisco Sex Information

Yoz Grahame

Get in Touch

Carina C. Zona

@cczona

http://cczona.com

cczona@gmail.com

http://slideshare.net/cczona

http://linkedin.com/in/cczona



Page 70: Schemas for the Real World [Madison RubyConf 2013]

cczona

People arent edge cases

People arent edge cases

And theyre pushing back on apps that treat them that way

[LONG BEAT LONG] [BREATH] [CHANGE GEARS]

So Reductive has big problems So does scaling upward

cczona

Balancing between approaches

As engineers its instinctually uncomfortable to DELIBERATELY NOT STRUCTURE DATA for easy analysis I feel ya I really do This freaks me too But again the foundational question is What benefit will the user notice What identities are okay with us

cczona

And if necessary we CAN strike a middleground This is where guided response comes in Autosuggest

cczona

CheckboxRadioSelect

TextareaText

Required OptionalAutosuggest

Minimal Suggest

When theres a subset of values that youre most interested in do minimal suggest instead Autosuggest using just the handful of values you care about

Structured data from those who want to give it Free form to incite expressiveness in those who want THAT

Unguided TextOf those who use MetaFilters gender field

40 of responses are f m female male

[read] So structured data IS there This can be a balanced solution in many cases where youre willing to tolerate some ambiguity Of course there are tradeoffsData quantity is lower Freed to opt out of proving personal info many doOn the other hand data quality should improve

Optional Select

60% of Facebook users select a relationship status

It's fine to mix and match here. Find the right approach for your users and your app's business objectives. Facebook, for instance, makes relationship status completely optional, but coercive for those who do opt in to setting a value. Most users do opt in: 60% of them select a relationship status.

cczona

We want everyone to feel excited about what we've built

The bottom line is that [read]

cczona

We want users to feel passionate about their involvement

[read]

cczona

Analytics investments & monetization are based on a premise that data is accurate

[read]

BUT if the data has been collected by coercive approaches, the RISK is that--

cczona

this premise is garbage. Some people are lying because lying has been made a requirement for getting past the barriers. So conclusions drawn from that bad data can misdirect decision-making about the next stage of development.

cczona

Checkbox / Radio / Select

Textarea / Text

Required

Corrective

Discretionary

Guided

The restrictive options, that stuff at bottom left: those don't actually have to be marked required.

cczona

ADD COLUMN gender VARCHAR(6) NOT NULL DEFAULT 'female'

But the way we set up schema often embeds assumptions that we should, and we will. So we do. An attribute that is ambiguously named is destined to become a form field whose data wanders away from its original intent, and code that misunderstands how to use it. A field that's not allowed to be null is destined to be mandatory. A field that is assigned a maximum length is an ASSERTION that all possible values are knowable and fit within it.
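
To make those embedded assertions concrete, here is a toy checker in plain Ruby. It is a sketch, not how any database enforces constraints: the Column struct, the violations method, and the STRICT and FLEXIBLE definitions are all hypothetical. STRICT mirrors the NOT NULL and six-character limit from the slide; FLEXIBLE is an assumed alternative with neither.

```ruby
# A toy column definition: null says whether blank is allowed,
# limit is the maximum length (nil means no limit).
Column = Struct.new(:name, :null, :limit)

# Returns the list of reasons this column would refuse the value.
def violations(column, value)
  problems = []
  if !column.null && (value.nil? || value.empty?)
    problems << "#{column.name} cannot be blank"
  end
  if column.limit && value && value.length > column.limit
    problems << "#{column.name} is longer than #{column.limit} characters"
  end
  problems
end

STRICT   = Column.new("gender", false, 6)   # mandatory, at most 6 characters
FLEXIBLE = Column.new("gender", true, nil)  # optional, any length
```

Under STRICT, declining to answer is refused, and so is any answer longer than "female". Under FLEXIBLE, both pass.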

cczona

ADD COLUMN gender_identity VARCHAR NOT NULL

BOOM. This is the foundation for a whole different user experience. And the cool ninja move was that we decided to DO LESS. Make this stuff flexible upfront. Optimize storage later. Decide what's valid later.

cczona

Checkbox / Radio / Select

Textarea / Text

Required / Optional

Alteration

Guided

So there you have it: a discretionary field. Whether to respond is left up to the user.

cczona

t.string :relationship_status, :null => true

As developers, we may look upon THIS expression as redundant, completely unnecessary. Duh, null is true by default. But making that explicit is a communication to the team and to your future self. It's a statement of intent. It's documenting a product decision.

cczona

Facebook

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

What would a canonical set of relationship statuses look like? Three years ago, Facebook figured this list was pretty good. Arguably pretty progressive too, right? Users disagreed. Strongly.

cczona

Facebook

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

Under pressure, Facebook nearly doubled the options in just two years' time.

cczona

Google+

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I don't want to say

While Google+ has largely adopted that list, it has not included Separated or Divorced. Notice that they also added something choice: opt out of labeling.

cczona

Allowing users to identify their relationships with labels of greater personal significance

That's being driven by people rejecting a user experience that isn't working for them.

cczona

It's a scope

How did some statuses seem universal while others weren't? Naming a thing creates scope. The assumed validity of a field's values gets constrained as soon as the field is named.

cczona

marital_status

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I don't want to say

Marital status, for instance, might lead to a list similar to this one. A person is assumed to be either unmarried, preparing to be married, currently married, or formerly married.

cczona

legal_marital_status

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I don't want to say

Marital status, for instance, might lead to a list similar to this one. A person is assumed to be either unmarried, preparing to be married, currently married, or formerly married.

cczona

relationship_status

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I don't want to say

Whereas relationship status might lead to a list more like this one, in which one either has a current relationship -- or is defined by the absence of any.

cczona

singleness_status

As soon as you name a field, you define its paradigm and possibilities. There are important differences in what each of these envisions and is able to measure. Naming fields -- with great specificity upfront -- makes analyses more powerful later.

cczona

singleness_rating

No chance whatsoever Suuuuuper duper single

1000

If you change the name, you shift the paradigm and possibilities. There are important differences in what these collect and set out to measure. Naming fields -- with great specificity upfront -- makes analyses more powerful later.

cczona

It's Complicated

In a Relationship

Married

Divorced

Widowed

Single

We go through life experiencing many relationships. They don't all have pre-determined labels. And new relationship identities don't necessarily leave old ones behind.

cczona

I like to be truthful, and "It's Complicated" is really deceiving. It is not complicated. I am separated from my husband, who I am still legally married to.

-- Facebook user

[READ] Why should anyone have to feel like they're required to disavow a relationship that's meaningful for them?

It cuts deeply sometimes. Is a widow only "Widowed" until she resumes dating? Why does she have to be confronted with a decision like that just to use someone's app?

cczona

Okay, this one. Spot the fatal flaw. [WAIT]

cczona

buffer overflow

An open relationship is definitionally a 1:Many join -- WITH the usual engineering understanding, that is, that the actual number of relationships may be 0, 1, OR many.
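
A rough sketch of what that 1:Many shape could look like in plain Ruby objects, instead of a single status column. Person, Relationship, and every method name here are hypothetical, invented for illustration. The point is only that a person holds a collection, so zero, one, or many relationships are all representable.

```ruby
# One relationship: a label the user chose, and optionally who it is with.
Relationship = Struct.new(:label, :partner)

class Person
  attr_reader :name, :relationships

  def initialize(name)
    @name = name
    @relationships = []   # zero relationships is a valid state
  end

  # Adds a relationship without replacing any existing one.
  # Returns self so calls can be chained.
  def add_relationship(label, partner = nil)
    @relationships << Relationship.new(label, partner)
    self
  end

  def relationship_labels
    @relationships.map(&:label)
  end
end
```

With this shape, adding "In a relationship" does not force anyone to give up "Married" or "Widowed".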

cczona

Choose (one): evasive, inauthentic

But it's not funny from the personal perspective. This schema forces the person to either LOOK EVASIVE or else BE INAUTHENTIC. This is what happens when we try to throw more labels at the problem instead of re-examining the schema's assumptions.

cczona

Facebook relationship status by user age

srsly

When we don't give people means to express what's real and important for them, they hack around us. Are a quarter of Facebook's 13-year-olds really married? Probably not. What they probably have is a BEST FRIEND, and "married" is the nearest equivalent available for them to express "THIS PERSON is my most important relationship".

The dogged pursuit of nice, clean, CRUNCHABLE data takes us to such wrong places. And makes our users want to work against a system that doesn't acknowledge their lives' reality.

cczona

We build or break community with each line of code

As developers, our lines of code build -- or break -- an app's community. So it's worth making conscious choices as we go.

cczona

Modeling the real world is complex

[READ] And that's okay.

cczona

Assuming we know who users are surrenders opportunity to learn who they are

But [read]. Early constraints in schema NET crappy, misleading data. [beat] So keep constraints out of human and social schema, at least at first. Gather enough initial response to do some data mining. Watch that data for a while. Keep an eye out for emergent trends. What DISCOVERIES can you make? How can they shape the app's growth and evolution?
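
Watching for emergent trends could start as simply as this plain Ruby sketch. The method name, the min_count threshold, and the idea of comparing against a list of already-known values are all assumptions for illustration. It uses Enumerable#tally, which needs Ruby 2.7 or later.

```ruby
# Returns free-text values seen at least min_count times that are not
# already among the known values, most frequent first.
def emergent_values(responses, known, min_count: 2)
  known_set = known.map(&:downcase)
  counts = responses.compact                       # skip opt-outs (nil)
                    .map { |r| r.strip.downcase }
                    .reject(&:empty?)
                    .tally
  counts.select { |value, n| n >= min_count && !known_set.include?(value) }
        .sort_by { |value, n| [-n, value] }
        .map(&:first)
end
```

Values that keep showing up are candidates for the next round of suggestions, or for a rethink of what the field is actually measuring.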

cczona

Quality

Data quality improves when lies are merely optional, not required.

cczona

Specificity

Data becomes like the real world itself: rich and specific. So we can unearth the patterns that are undetectable when data is generic.

cczona

Adaptability

We get to make discoveries and re-envision possibilities.

We adapt quickly.

cczona

Loyalty

And their response to all this? Engagement. Passion. LOYALTY.

cczona

THOSE are foundations for great user experiences. [LOOK UP FINALITY]

@cczona

cczona@gmail.com

http://cczona.com

Questions / Conversation

Oh Hell Yeah

Carina C. Zona

cczona

Resources

cczona

Data Science & Information Architecture

• Sociological normalization

• Database normalization

• Using Machine Learning On Social Networks To Figure Out What You Should Read On The Web

• NoSQL Data Modeling Techniques

• Data & Reality by William Kent

• Bad Data Handbook, Q. Ethan McCallum

• Data Science of the Facebook World

cczona

Relationships

• Does Facebook Hurt Relationships?

• Facebook Adds LGBT-Friendly Relationship Status Options

• Facebook Targeting by Relationship Status & Workplace

• Your Facebook Relationship Status: It's Complicated

• Gay Marriage: The Database Engineering Perspective

cczona

Sex & Genders

• "Disalienation: Why Gender is a Text Field on Diaspora"

• "Gender & Drop Down Menus"

• "Sex & Gender"

• "Bucket Gender"

• Recommendations for Inclusive Data Collection of Trans People

cczona

Names

• Falsehoods Programmers Believe About Names

• Your Last Name Contains Invalid Characters

• W3C Internationalization: Personal Names Around the World

• Spanish Names

• Chinese Names

cczona

More

• Redesigning the Country Selector

• American Religious Identification Survey Summary Report 2009

• Linguistic Potluck: Crowdsourcing Internationalization in Rails

cczona

Image Credits

Postsecret

Facebook

OKCupid

Google+

Metafilter

Diaspora

Flickr

FetLife

Kotangle

cutestpaw.com

hdwallpapers.in

xkcd

cczona

Many Thanks

Chiu-Ki Chan

Estelle Weyl

Heather Rivers

Heroku

Jeremy Dunck

Josh Susser

Michele Titolo

NIRD LLC

Renée De Voursney

Sarah Mei

San Francisco Sex Information

Yoz Grahame

cczona

Get in Touch

Carina C. Zona

@cczona

http://cczona.com

cczona@gmail.com

http://slideshare.net/cczona

http://linkedin.com/in/cczona

Page 73: Schemas for the Real World [Madison RubyConf 2013]

cczona

[Slide diagram labels: Checkbox, Radio, Select; Textarea, Text; Required; Optional; Autosuggest; Minimal Suggest]

When there's a subset of values that you're most interested in, do minimal suggest instead: autosuggest using just the handful of values you care about.

Structured data from those who want to give it. Free form to incite expressiveness in those who want THAT.
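A minimal sketch of that idea in plain Ruby, assuming a hypothetical two-value suggestion list and hypothetical helper names (`minimal_suggest`, `accept_response`); none of these come from the slides. Suggestions guide the user, but whatever they type is still accepted.

```ruby
# Hypothetical handful of values the app cares most about (illustration only).
SUGGESTED_VALUES = ["female", "male"].freeze

# Return the suggestions that start with what the user has typed so far.
# Matching is case-insensitive; blank input yields no suggestions.
def minimal_suggest(typed, suggestions = SUGGESTED_VALUES)
  prefix = typed.to_s.strip.downcase
  return [] if prefix.empty?
  suggestions.select { |value| value.start_with?(prefix) }
end

# Whatever the user finally enters is accepted as given: suggestions guide,
# they do not constrain. Blank input is stored as nil (no answer).
def accept_response(text)
  cleaned = text.to_s.strip
  cleaned.empty? ? nil : cleaned
end

puts minimal_suggest("Fe").inspect
puts accept_response("genderqueer").inspect
```

The design choice here is that the suggestion list and the set of valid values are deliberately different things: the list is short, the set is open.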

Unguided Text

Of those who use MetaFilter's gender field, 40% of responses are "f", "m", "female", "male"

[read] So structured data IS there. This can be a balanced solution in many cases where you're willing to tolerate some ambiguity. Of course there are tradeoffs. Data quantity is lower. Freed to opt out of providing personal info, many do. On the other hand, data quality should improve.
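One way to check how much structured data is hiding in a free-text field is to fold common spellings into buckets and measure their share. This is only a sketch: the mapping, the helper names, and the sample responses are invented for illustration and are not MetaFilter's data.

```ruby
# Assumed mapping from common free-text answers to canonical buckets.
CANONICAL = {
  "f" => "female", "female" => "female",
  "m" => "male",   "male"   => "male"
}.freeze

# Return the canonical bucket for a response, or nil if it is free-form.
def bucket(response)
  CANONICAL[response.to_s.strip.downcase]
end

# Fraction of answered (non-blank) responses that fall into a known bucket.
def structured_share(responses)
  answered = responses.map { |r| r.to_s.strip }.reject(&:empty?)
  return 0.0 if answered.empty?
  answered.count { |r| bucket(r) }.to_f / answered.size
end

# Invented sample: three of the five answered responses are bucketable.
sample = ["F", "male", "genderqueer", "m", "yes please", nil, ""]
puts structured_share(sample)
```

Blank answers are left out of the denominator on purpose, since opting out is a different signal from answering in free form.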

Optional Select

60% of Facebook users select a relationship status

It's fine to mix and match here. Find the right approach for your users and your app's business objectives. Facebook, for instance, makes relationship status completely optional, but coercive for those who do opt in to setting a value. Most users do opt in: 60% of them select a relationship status.

cczona

We want everyone to feel excited about what we've built

The bottom line is that [read]

cczona

We want users to feel passionate about their involvement

[read]

cczona

Analytics, investments, & monetization are based on a premise that data is accurate

[read]

BUT if the data has been collected by coercive approaches, the RISK is that--

cczona

this premise is garbage. Some people are lying because lying has been made a requirement for getting past the barriers. So conclusions drawn from that bad data can misdirect decision-making about the next stage of development.

cczona

[Slide diagram labels: Checkbox, Radio, Select; Textarea, Text; Required; Corrective; Discretionary; Guided]

The restrictive options, that stuff at bottom left: those don't actually have to be marked required.

cczona

ADD COLUMN gender VARCHAR(6) NOT NULL DEFAULT 'female'

But the way we set up schema often embeds assumptions that we should, and we will. So we do. An attribute that is ambiguously named is destined to become a form field whose data wanders away from its original intent, and code that misunderstands how to use it. A field that's not allowed to be null is destined to be mandatory. A field that is assigned a maximum length is an ASSERTION that all possible values are knowable and fit within it.
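To make those embedded assertions concrete, here is a small plain-Ruby simulation of what such a column definition does to incoming values. `Column` and `store` are hypothetical stand-ins, not a database API. Real databases differ on overlong values (some truncate, some raise), so raising here is an assumption, and the nullable, unbounded column is an assumed contrast case rather than the definition shown on the slides.

```ruby
# Hypothetical stand-in for a column definition; not a real database API.
Column = Struct.new(:name, :max_length, :nullable, :default)

# Return the value that would be stored, or raise if the column's
# constraints reject it.
def store(column, value)
  value = column.default if value.nil?
  if value.nil? && !column.nullable
    raise ArgumentError, "#{column.name} may not be null"
  end
  if column.max_length && !value.nil? && value.length > column.max_length
    raise ArgumentError, "#{column.name} exceeds #{column.max_length} characters"
  end
  value
end

strict   = Column.new("gender", 6, false, "female")
flexible = Column.new("gender_identity", nil, true, nil)

puts store(strict, nil).inspect             # the default answers on the user's behalf
puts store(flexible, "genderqueer").inspect # stored as given
puts store(flexible, nil).inspect           # no answer stays no answer
```

With the strict column, a user who says nothing is recorded as having said something, and an 11-character answer is rejected outright.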

cczona

ADD COLUMN gender_identity VARCHAR NOT NULL

BOOM. This is the foundation for a whole different user experience. And the cool ninja move was that we decided to DO LESS. Make this stuff flexible upfront. Optimize storage later. Decide what's valid later.

cczona

[Slide diagram labels: Checkbox, Radio, Select; Textarea, Text; Required; Optional; Alteration; Guided]

So there you have it: a discretionary field. Whether to respond is left up to the user.

cczona

t.string :relationship_status, :null => true

As developers, we may look upon THIS expression as redundant, completely unnecessary. Duh, null is true by default. But making that explicit is a communication to the team and to your future self. It's a statement of intent. It's documenting a product decision.

cczona

Facebook

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

What would a canonical set of relationship statuses look like? Three years ago, Facebook figured this list was pretty good. Arguably pretty progressive too, right? Users disagreed. Strongly.

cczona

Facebook

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

Under pressure, Facebook nearly doubled the options in just two years' time.

cczona

Google+

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I don't want to say

While Google+ has largely adopted that list, it has not included Separated or Divorced. Notice that they also added something: choice. Opt out of labeling.

cczona

Allowing users to identify their relationships with labels of greater personal significance

That's being driven by people rejecting a user experience that isn't working for them.

cczona

It's a scope

How did some statuses seem universal while others weren't? Naming a thing creates scope. The assumed validity of a field's values gets constrained as soon as the field is named.

cczona

marital_status

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I don't want to say

Marital status, for instance, might lead to a list similar to this one. A person is assumed to be either unmarried, preparing to be married, currently married, or formerly married.

cczona

legal_marital_status

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I don't want to say

Marital status, for instance, might lead to a list similar to this one. A person is assumed to be either unmarried, preparing to be married, currently married, or formerly married.

cczona

relationship_status

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I don't want to say

Whereas relationship status might lead to a list more like this one, in which one either has a current relationship -- or is defined by the absence of any.

cczona

singleness_status

As soon as you name a field, you define its paradigm and possibilities. There are important differences in what each of these envisions and is able to measure. Naming fields -- with great specificity upfront -- makes analyses more powerful later.

cczona

singleness_rating

No chance whatsoever Suuuuuper duper single

1000

If you change the name, you shift the paradigm and possibilities. There's an important difference in what these set out to collect and measure. Naming fields -- with great specificity upfront -- makes analyses more powerful later.

cczona

It's Complicated

In a Relationship

Married

Divorced

Widowed

Single

We go through life experiencing many relationships. They don't all have pre-determined labels. And new relationship identities don't necessarily leave old ones behind.

cczona

"I like to be truthful, and 'It's Complicated' is really deceiving. It is not complicated. I am separated from my husband, who I am still legally married to."

-- Facebook user

[READ] Why should anyone have to feel like they're required to disavow a relationship that's meaningful for them?

It cuts deeply sometimes. Is a widow only "Widowed" until she resumes dating? Why does she have to be confronted with a decision like that just to use someone's app?

cczona

Okay, this one. Spot the fatal flaw. [WAIT]

cczona

buffer overflow

An open relationship is definitionally a 1:Many join -- WITH the usual engineering understanding, that is, that the actual number of relationships may be 0, 1, OR many.
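A minimal sketch of that 1:Many shape in plain Ruby, assuming hypothetical `Person` and `Relationship` types that are not from the talk. Instead of a single status column, each person holds a list that may contain zero, one, or many entries, so a new label never has to overwrite an old one.

```ruby
# Hypothetical types for illustration; one person has many relationships.
Relationship = Struct.new(:label, :partner)

class Person
  attr_reader :name, :relationships

  def initialize(name)
    @name = name
    @relationships = [] # zero relationships is a valid state
  end

  # Adding a relationship appends; it never replaces an existing one.
  def add_relationship(label, partner = nil)
    @relationships << Relationship.new(label, partner)
    self
  end

  def labels
    @relationships.map(&:label)
  end
end

person = Person.new("Alex")
person.add_relationship("Widowed").add_relationship("In a relationship", "Sam")
puts person.labels.inspect
```

In a relational schema this would typically become a separate table keyed by person, rather than a column on the person's row.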

cczona

Choose (one): evasive, inauthentic

But it's not funny from the personal perspective. This schema forces the person to either LOOK EVASIVE or else BE INAUTHENTIC. This is what happens when we try to throw more labels at the problem instead of re-examining the schema's assumptions.

cczona

Facebook relationship status by user age

srsly

When we don't give people means to express what's real and important for them, they hack around us. Are a quarter of Facebook's 13-year-olds really married? Probably not. What they probably have is a BEST FRIEND, and "married" is the nearest equivalent available for them to express "THIS PERSON is my most important relationship".

The dogged pursuit of nice, clean, CRUNCHABLE data takes us to such wrong places. And makes our users want to work against a system that doesn't acknowledge their lives' reality.

cczona

We build or break community with each line of code

As developers, our lines of code build -- or break -- an app's community. So it's worth making conscious choices as we go.

cczona

Modeling the real world is complex

[READ] And that's okay.

cczona

Assuming we know who users are surrenders opportunity to learn who they are

But [read]. Early constraints in schema NET crappy, misleading data. [beat] So keep constraints out of human and social schema, at least at first. Gather enough initial response to do some data mining. Watch that data for a while. Keep an eye out for emergent trends. What DISCOVERIES can you make? How can they shape the app's growth and evolution?
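Watching unconstrained data for emergent trends can start as simply as counting. The sketch below assumes a hypothetical helper, `emergent_values`, plus an invented threshold and invented sample responses: it tallies free-form answers and surfaces recurring values that the app does not yet recognize.

```ruby
# Return free-form values that recur at least min_count times and are not
# already in the known list, most frequent first.
def emergent_values(responses, known:, min_count: 2)
  counts = Hash.new(0)
  responses.each do |response|
    value = response.to_s.strip.downcase
    counts[value] += 1 unless value.empty?
  end
  counts
    .reject { |value, _count| known.include?(value) }
    .select { |_value, count| count >= min_count }
    .sort_by { |value, count| [-count, value] }
    .map(&:first)
end

# Invented sample responses to an unconstrained relationship field.
sample = ["Single", "poly", "Poly", "married", "best friends", "poly", "best friends", nil]
puts emergent_values(sample, known: ["single", "married"]).inspect
```

Values that keep showing up are candidates for suggestions later; nothing here rejects the rare ones.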

cczona

Quality

Data quality improves when lies are merely optional, not required.

cczona

Specificity

Data becomes like the real world itself: rich and specific. So we can unearth the patterns that are undetectable when data is generic.

cczona

Adaptability

We get to make discoveries and re-envision possibilities

We adapt quickly

cczona

Loyalty

And their response to all this? Engagement. Passion. LOYALTY.

cczona

THOSE are foundations for great user experiences. [LOOK UP. FINALITY.]

@cczona

cczona@gmail.com

http://cczona.com

Questions? Conversation?

Oh Hell Yeah

Carina C Zona

cczona

Resources

cczona

Data Science & Information Architecture

• Sociological normalization

• Database normalization

• Using Machine Learning On Social Networks To Figure Out What You Should Read On The Web

• NoSQL Data Modeling Techniques

• Data & Reality, by William Kent

• Bad Data Handbook, Q. Ethan McCallum

• Data Science of the Facebook World

cczona

Relationships

• Does Facebook Hurt Relationships?

• Facebook Adds LGBT-Friendly Relationship Status Options

• Facebook Targeting by Relationship Status & Workplace

• Your Facebook Relationship Status: It's Complicated

• Gay Marriage: The Database Engineering Perspective

cczona

Sex & Genders

• "Disalienation: Why Gender is a Text Field on Diaspora"

• "Gender & Drop Down Menus"

• "Sex & Gender"

• "Bucket Gender"

• Recommendations for Inclusive Data Collection of Trans People

cczona

Names

• Falsehoods Programmers Believe About Names

• Your Last Name Contains Invalid Characters

• W3C Internationalization: Personal Names Around the World

• Spanish Names

• Chinese Names

cczona

More

• Redesigning the Country Selector

• American Religious Identification Survey Summary Report 2009

• Linguistic Potluck: Crowdsourcing Internationalization in Rails

cczona

Image Credits

Postsecret

Facebook

OKCupid

Google+

Metafilter

Diaspora

Flickr

FetLife

Kotangle

cutestpaw.com

hdwallpapers.in

xkcd

cczona

Many Thanks

Chiu-Ki Chan

Estelle Weyl

Heather Rivers

Heroku

Jeremy Dunck

Josh Susser

Michele Titolo

NIRD LLC

Renée De Voursney

Sarah Mei

San Francisco Sex Information

Yoz Grahame

cczona

Get in Touch

Carina C. Zona

@cczona

http://cczona.com

cczona@gmail.com

http://slideshare.net/cczona

http://linkedin.com/in/cczona

Page 76: Schemas for the Real World [Madison RubyConf 2013]

cczona

We want everyone to feel excited what weve

built

The bottom line is that [read]

cczona

We want users to feel passionate about their

involvement

[read]

cczona

Analytics investments amp monetization are

based on a premise that data is accurate

[read]

BUT the data has been collected by coercive approaches the RISK is that--

cczona

this premise is garbage Some people are lying because lying has been made requirement for getting past the barriers So conclusions drawn from that bad data can misdirect decision-making about the next stage of development

[Diagram: a grid of form-field types. Input types: Checkbox / Radio / Select; Textarea / Text. Labels: Required; Corrective; Discretionary; Guided.]

The restrictive options, that stuff at bottom left, those don't actually have to be marked required.

ADD column gender NOT NULL VARCHAR(6) DEFAULT 'female'

But the way we set up schema often embeds assumptions that we should and we will. So we do. An attribute that is ambiguously named is destined to become a form field whose data wanders away from its original intent, and code that misunderstands how to use it. A field that's not allowed to be null is destined to be mandatory. A field that is assigned a maximum length is an ASSERTION that all possible values are knowable and fit within it.
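To make those three assumptions concrete, here is a minimal plain-Ruby sketch of what a column declared with a length limit, NOT NULL, and a default does to incoming answers. `ConstrainedColumn` and its `store` method are hypothetical names invented for this illustration. Real databases differ in the details (some truncate over-long values, others reject them), so treat it as a model of the assumptions rather than of any particular engine.

```ruby
# Hypothetical stand-in for a column declared VARCHAR(6) NOT NULL DEFAULT 'female'.
# It only mimics the constraints in plain Ruby; a real database may behave differently.
ConstrainedColumn = Struct.new(:max_length, :default) do
  # Returns the value that would end up stored for a given form input.
  def store(input)
    value = input.to_s.strip
    # NOT NULL + DEFAULT: a skipped question is silently recorded as the default.
    return default if value.empty?
    # The length limit asserts that every valid answer fits in max_length characters.
    if value.length > max_length
      raise ArgumentError, "#{value.inspect} exceeds #{max_length} characters"
    end

    value
  end
end

gender = ConstrainedColumn.new(6, "female")
gender.store(nil)    # the user said nothing, yet "female" is recorded
gender.store("male") # fits the assumptions, stored as given
# gender.store("genderqueer") would raise ArgumentError: the answer does not fit
```

In this sketch, a skipped question and a deliberate answer of "female" become indistinguishable once stored, which is the kind of data-quality loss the talk describes.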

ADD column gender_identity NOT NULL VARCHAR

BOOM. This is the foundation for a whole different user experience. And the cool ninja move was that we decided to DO LESS. Make this stuff flexible upfront. Optimize storage later. Decide what's valid later.

[Diagram: a grid of form-field types. Input types: Checkbox / Radio / Select; Textarea / Text. Labels: Required; Optional; Alteration; Guided.]

So there you have it: a discretionary field. Whether to respond is left up to the user.

t.string :relationship_status, :null => true

As developers, we may look upon THIS expression as redundant, completely unnecessary. Duh, null is true by default. But making that explicit is a communication to the team and to your future self. It's a statement of intent. It's documenting a product decision.
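The same intent can be carried through to application code. The sketch below is plain Ruby rather than Rails, and the `Profile` class and its `answered?` method are hypothetical names used only for illustration. The idea it shows is that "no answer" is stored as nil and kept distinct from any label, instead of being coerced into a default.

```ruby
# Hypothetical profile object; relationship_status is optional by design.
class Profile
  attr_reader :relationship_status

  def initialize(relationship_status: nil)
    # Blank input is stored as nil ("no answer"), never replaced by a default label.
    text = relationship_status.to_s.strip
    @relationship_status = text.empty? ? nil : text
  end

  # True only when the user chose to say something.
  def answered?
    !relationship_status.nil?
  end
end

Profile.new.answered?                                   # false: nothing was asked of the user
Profile.new(relationship_status: "Partnered").answered? # true: the user's own words are kept
```

Keeping nil as a first-class value means later analysis can count how many people declined to answer, rather than mistaking silence for a label.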

Facebook
Single
In a relationship
Engaged
Married
It's complicated
Open relationship

What would a canonical set of relationship statuses look like? Three years ago, Facebook figured this list was pretty good. Arguably pretty progressive, too, right? Users disagreed. Strongly.

Facebook
Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Separated
Divorced
Civil union
Domestic partnership

Under pressure, Facebook nearly doubled the options in just two years' time.

Google+
Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Separated
Divorced
Civil union
Domestic partnership
I don't want to say

While Google+ has largely adopted that list, it has not included Separated or Divorced. Notice that they also added something: choice. Opt out of labeling.

Allowing users to identify their relationships with labels of greater personal significance

That's being driven by people rejecting a user experience that isn't working for them.

It's a scope

How did some statuses seem universal while others weren't? Naming a thing creates scope. The assumed validity of a field's values gets constrained as soon as the field is named.

marital_status
Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Separated
Divorced
Civil union
Domestic partnership
I don't want to say

Marital status, for instance, might lead to a list similar to this one. A person is assumed to be either unmarried, preparing to be married, currently married, or formerly married.

legal_marital_status
Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Separated
Divorced
Civil union
Domestic partnership
I don't want to say

Marital status, for instance, might lead to a list similar to this one. A person is assumed to be either unmarried, preparing to be married, currently married, or formerly married.

relationship_status
Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Separated
Divorced
Civil union
Domestic partnership
I don't want to say

Whereas relationship status might lead to a list more like this one. In which one either has a current relationship -- or is defined by the absence of any.

singleness_status

As soon as you name a field, you define its paradigm and possibilities. There are important differences in what each of these envisions and is able to measure. Naming fields -- with great specificity upfront -- makes analyses more powerful later.

singleness_rating
No chance whatsoever
Suuuuuper duper single
1000

If you change the name, you shift the paradigm and possibilities. There's an important difference in what these collect and set out to measure. Naming fields -- with great specificity upfront -- makes analyses more powerful later.

It's Complicated
In a Relationship
Married
Divorced
Widowed
Single

We go through life experiencing many relationships. They don't all have pre-determined labels. And new relationship identities don't necessarily leave old ones behind.

I like to be truthful, and It's Complicated is really deceiving. It is not complicated. I am separated from my husband, who I am still legally married to.

-- Facebook user

[READ] Why should anyone have to feel like they're required to disavow a relationship that's meaningful for them?

It cuts deeply sometimes. Is a widow only Widowed until she resumes dating? Why does she have to be confronted with a decision like that just to use someone's app?

Okay, this one. Spot the fatal flaw. [WAIT]

buffer overflow

An open relationship is definitionally a 1:Many join -- WITH the usual engineering understanding, that is, that the actual number of relationships may be 0, 1, OR many.

Choose (one): evasive, inauthentic

But it's not funny from the personal perspective. This schema forces the person to either LOOK EVASIVE or else BE INAUTHENTIC. This is what happens when we try to throw more labels at the problem instead of re-examining the schema's assumptions.
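One way to re-examine that assumption is to stop treating status as a single value on the person. The plain-Ruby sketch below models relationships as a collection, so that zero, one, or many can coexist, each with its own label. `Person`, `Relationship`, and `add_relationship` are hypothetical names for illustration, not the schema of any site mentioned in the talk.

```ruby
# Hypothetical names throughout; a sketch of "zero, one, or many",
# not any real site's schema.
Relationship = Struct.new(:partner, :label)

class Person
  attr_reader :name, :relationships

  def initialize(name)
    @name = name
    @relationships = [] # zero relationships is a valid state
  end

  # Adds a relationship without replacing any existing one.
  def add_relationship(partner, label)
    @relationships << Relationship.new(partner, label)
    self
  end

  # All labels currently in effect, in the order they were added.
  def labels
    relationships.map(&:label)
  end
end

sam = Person.new("Sam")
sam.add_relationship("Alex", "married").add_relationship("Jo", "dating")
sam.labels # both labels are kept; neither has to be disavowed
```

Because each label lives on its own record, a person who is separated yet still legally married, or widowed and dating again, could hold both facts at once.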

Facebook relationship status by user age

srsly

When we don't give people the means to express what's real and important for them, they hack around us. Are a quarter of Facebook's 13-year-olds really married? Probably not. What they probably have is a BEST FRIEND, and married is the nearest equivalent available for them to express: THIS PERSON is my most important relationship.

The dogged pursuit of nice, clean, CRUNCHABLE data takes us to such wrong places. And makes our users want to work against a system that doesn't acknowledge their lives' reality.

We build or break community with each line of code

As developers, our lines of code build -- or break -- an app's community. So it's worth making conscious choices as we go.

Modeling the real world is complex

[READ] And that's okay.

Assuming we know who users are surrenders opportunity to learn who they are

But [read]. Early constraints in schema NET crappy, misleading data. [beat] So keep constraints out of human and social schema, at least at first. Gather enough initial response to do some data mining. Watch that data for a while. Keep an eye out for emergent trends. What DISCOVERIES can you make? How can they shape the app's growth and evolution?
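As a rough illustration of watching unconstrained data for emergent trends, the sketch below tallies free-text answers after light normalization. The `tally_responses` helper and the sample answers are made up for this example. A real analysis would probably need more careful normalization than trimming whitespace and lowercasing.

```ruby
# Hypothetical helper: count free-text answers, ignoring case and
# surrounding whitespace, and list the most common ones first.
def tally_responses(responses)
  counts = responses
    .compact                                 # skipped questions are not answers
    .map { |text| text.strip.downcase }
    .reject(&:empty?)
    .each_with_object(Hash.new(0)) { |label, acc| acc[label] += 1 }

  # Sort by descending count, then alphabetically for stable output.
  counts.sort_by { |label, count| [-count, label] }
end

answers = ["Married", " married ", nil, "", "Partnered", "best friends", "Partnered", "married"]
tally_responses(answers)
# => [["married", 3], ["partnered", 2], ["best friends", 1]]
```

Labels that keep turning up in a tally like this are candidates for guided options later, which fits the advice to decide what's valid only after seeing what people actually say.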

Quality

Data quality improves when lies are merely optional, not required.

Specificity

Data becomes like the real world itself: rich and specific. So we can unearth the patterns that are undetectable when data is generic.

Adaptability

We get to make discoveries and re-envision possibilities.

We adapt quickly.

Loyalty

And their response to all this? Engagement. Passion. LOYALTY.

THOSE are foundations for great user experiences. [LOOK UP FINALITY]

@cczona
cczona@gmail.com
http://cczona.com

Questions? Conversation?

Oh Hell Yeah

Carina C. Zona

Resources

Data Science & Information Architecture
• Sociological normalization
• Database normalization
• Using Machine Learning On Social Networks To Figure Out What You Should Read On The Web
• NoSQL Data Modeling Techniques
• Data & Reality, by William Kent
• Bad Data Handbook, Q. Ethan McCallum
• Data Science of the Facebook World

Relationships
• Does Facebook Hurt Relationships?
• Facebook Adds LGBT-Friendly Relationship Status Options
• Facebook Targeting by Relationship Status & Workplace
• Your Facebook Relationship Status: It's Complicated
• Gay Marriage: The Database Engineering Perspective

Sex & Genders
• “Disalienation: Why Gender is a Text Field on Diaspora”
• “Gender & Drop Down Menus”
• “Sex & Gender”
• “Bucket Gender”
• Recommendations for Inclusive Data Collection of Trans People

Names
• Falsehoods Programmers Believe About Names
• Your Last Name Contains Invalid Characters
• W3C Internationalization: Personal Names Around the World
• Spanish Names
• Chinese Names

More
• Redesigning the Country Selector
• American Religious Identification Survey Summary Report 2009
• Linguistic Potluck: Crowdsourcing Internationalization in Rails

Image Credits
Postsecret
Facebook
OKCupid
Google+
Metafilter
Diaspora
Flickr
FetLife
Kotangle
cutestpaw.com
hdwallpapers.in
xkcd

Many Thanks
Chiu-Ki Chan
Estelle Weyl
Heather Rivers
Heroku
Jeremy Dunck
Josh Susser
Michele Titolo
NIRD LLC
Renée De Voursney
Sarah Mei
San Francisco Sex Information
Yoz Grahame

Get in Touch

Carina C. Zona
@cczona
http://cczona.com
cczona@gmail.com
http://slideshare.net/cczona
http://linkedin.com/in/cczona


The dogged pursuit of nice clean CRUNCHABLE data takes us to such wrong places And makes our users want to work against a system that doesnt acknowledge their lives reality

cczona

We build or break community with each

line of code

As developers our lines of code build -- or break -- an apps community So its worth making conscious choices as we go

cczona

Modeling the real world is complex

[READ] And thats okay

cczona

Assuming we know who users are surrenders

opportunity to learn who they are

But [read] Early constraints in schema NET crappy misleading data[beat]So keep constraints out of human and social schema at least at first Gather enough initial response to do some data mining Watch that data for a while Keep an eye out for emergent trends What DISCOVERIES can you make How can they shape the apps growth and evolution

cczona

Quality

Data quality improves when lies are merely optional not required

cczona

Specificity

Data becomes like the real world itself rich and specific So we can unearth the patterns that are undetectable when data is generic

cczona

Adaptability

We get to make discoveries and re-envision possibilities

We adapt quickly

cczona

Loyalty

And their response to all this Engagement Passion LOYALTY

cczona

THOSE are foundations for great user experiences [LOOK UP FINALITY]

cczonacczonagmailcomhttpcczonacom

QuestionsConversation

Oh Hell Yeah

Carina C Zona

cczona

Resources

cczona

Data Science amp Information ArchitecturebullSociological normalization

bullDatabase normalization

bullUsing Machine Learning On Social Networks To Figure Out What You Should Read On The Web

bullNoSQL Data Modeling Techniques

bullData amp Reality by William Kent

bull Bad Data Handbook Q Ethan McCallum

bullData Science of the Facebook World

cczona

Relationships

bullDoes Facebook Hurt Relationships

bullFacebook Adds LGBT-Friendly Relationship Status Options

bullFacebook Targeting by Relationship Status amp Workplace

bullYour Facebook Relationship Status Its Complicated

bullGay Marriage The Database Engineering Perspective

cczona

Sex amp Genders

bullldquoDisalienation Why Gender is a Text Field on Diasporardquo

bull ldquoGender amp Drop Down Menusrdquo

bullldquoSex amp Genderrdquo

bullldquoBucket Genderrdquo

bullRecommendations for Inclusive Data Collection of Trans People

cczona

Names

bullFalsehoods Programmers Believe About Names

bullYour Last Name Contains Invalid Characters

bullW3C Internationalization Personal Names Around the World

bullSpanish Names

bullChinese Names

cczona

More

bullRedesigning the Country Selector

bullAmerican Religious Identification Survey Summary Report 2009

bullLinguistic Potluck Crowdsourcing Internationalization in Rails

cczona

Image CreditsPostsecret

Facebook

OKCupid

Google+

Metafilter

Diaspora

Flickr

FetLife

Kotangle

cutestpawcom

hdwallpapersin

xkcd

cczona

Many ThanksChiu-Ki ChanEstelle Weyl

Heather RiversHeroku

Jeremy DunckJosh Susser

Michele TitoloNIRD LLC

Reneacutee De VoursneySarah Mei

San Francisco Sex InformationYoz Grahame

cczona

Get in Touch

Carina C Zonacczona

httpcczonacom

cczonagmailcom

httpslidesharenetcczona

httplinkedincomincczona

Checkbox, Radio, Select

Textarea, Text

Required

Corrective

Discretionary

Guided

The restrictive options, that stuff at bottom left: those don't actually have to be marked required.

ADD COLUMN gender VARCHAR(6) NOT NULL DEFAULT 'female'

But the way we set up schema often embeds assumptions that we should, and we will. So we do. An attribute that is ambiguously named is destined to become a form field whose data wanders away from its original intent, and code that misunderstands how to use it. A field that's not allowed to be null is destined to be mandatory. A field that is assigned a maximum length is an ASSERTION that all possible values are knowable and fit within it.
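
To make those embedded assertions concrete, here is a minimal Ruby sketch of what a column declared as VARCHAR(6) NOT NULL DEFAULT 'female' ends up doing to real answers. The GENDER_COLUMN hash and the store helper are hypothetical stand-ins written for this illustration, not any real database API; depending on the database and its settings, an over-length value may be rejected (as here) or truncated.

```ruby
# Hypothetical stand-in for a column declared as
# VARCHAR(6) NOT NULL DEFAULT 'female'. Not a real database API.
GENDER_COLUMN = { max_length: 6, null: false, default: "female" }.freeze

# Mimics what such a column does with a submitted value:
# nil falls back to the default, and anything longer than
# max_length is refused (some databases would truncate instead).
def store(value, column)
  value = column[:default] if value.nil?
  raise ArgumentError, "value required" if value.nil? && !column[:null]
  if value.length > column[:max_length]
    raise ArgumentError, "#{value.inspect} does not fit in #{column[:max_length]} characters"
  end
  value
end

# A user who skips the question is silently recorded as "female".
skipped = store(nil, GENDER_COLUMN)

# A user who answers truthfully may not fit at all.
rejected = begin
  store("genderqueer", GENDER_COLUMN)
  false
rescue ArgumentError
  true
end
```

In this sketch the skipped answer comes back as the default and the longer answer is refused, which is the sense in which the three constraints act as product decisions nobody consciously made.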

ADD COLUMN gender_identity VARCHAR NOT NULL

BOOM. This is the foundation for a whole different user experience. And the cool ninja move was that we decided to DO LESS. Make this stuff flexible upfront. Optimize storage later. Decide what's valid later.

Checkbox, Radio, Select

Textarea, Text

Required, Optional

Alteration

Guided

So there you have it: a discretionary field. Whether to respond is left up to the user.

t.string :relationship_status, :null => true

As developers, we may look upon THIS expression as redundant, completely unnecessary. Duh, null is true by default. But making that explicit is a communication to the team and to your future self. It's a statement of intent. It's documenting a product decision.

Facebook

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

What would a canonical set of relationship statuses look like? Three years ago, Facebook figured this list was pretty good. Arguably pretty progressive too, right? Users disagreed. Strongly.

Facebook

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

Under pressure, Facebook nearly doubled the options in just two years' time.

Google+

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I don't want to say

While Google+ has largely adopted that list, it has not included Separated or Divorced. Notice that they also added something: choice. Opt out of labeling.

Allowing users to identify their relationships with labels of greater personal significance

That's being driven by people rejecting a user experience that isn't working for them.

It's a scope

How did some statuses seem universal while others weren't? Naming a thing creates scope. The assumed validity of a field's values gets constrained as soon as the field is named.

marital_status

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I don't want to say

Marital status, for instance, might lead to a list similar to this one. A person is assumed to be either unmarried, preparing to be married, currently married, or formerly married.

legal_marital_status

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I don't want to say

Marital status, for instance, might lead to a list similar to this one. A person is assumed to be either unmarried, preparing to be married, currently married, or formerly married.

relationship_status

Single

In a relationship

Engaged

Married

It's complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I don't want to say

Whereas relationship status might lead to a list more like this one, in which one either has a current relationship or is defined by the absence of any.

singleness_status

As soon as you name a field, you define its paradigm and possibilities. There are important differences in what each of these envisions and is able to measure. Naming fields with great specificity upfront makes analyses more powerful later.

singleness_rating

No chance whatsoever Suuuuuper duper single

1000

If you change the name, you shift the paradigm and possibilities. There's an important difference in what these set out to measure. Naming fields with great specificity upfront makes analyses more powerful later.

It's Complicated

In a Relationship

Married

Divorced

Widowed

Single

We go through life experiencing many relationships. They don't all have pre-determined labels. And new relationship identities don't necessarily leave old ones behind.

"I like to be truthful, and 'It's Complicated' is really deceiving. It is not complicated. I am separated from my husband, who I am still legally married to."

-- Facebook user

[READ] Why should anyone have to feel like they're required to disavow a relationship that's meaningful for them?

It cuts deeply sometimes. Is a widow only Widowed until she resumes dating? Why does she have to be confronted with a decision like that just to use someone's app?

Okay, this one. Spot the fatal flaw. [WAIT]

buffer overflow

An open relationship is definitionally a 1:Many join, WITH the usual engineering understanding that the actual number of relationships may be 0, 1, OR many.

Choose (one): evasive, inauthentic

But it's not funny from the personal perspective. This schema forces the person to either LOOK EVASIVE or else BE INAUTHENTIC. This is what happens when we try to throw more labels at the problem instead of re-examining the schema's assumptions.
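
One way to re-examine the assumption, rather than add labels, is to stop storing a single status at all. What follows is only a sketch under that assumption: the Person and Relationship structures and the sample names are hypothetical, invented for illustration, and are not from the talk. Each person has zero, one, or many relationships, and each relationship carries the user's own label.

```ruby
# Hypothetical structures for illustration; not taken from any real schema.
Relationship = Struct.new(:label, :with)

class Person
  attr_reader :name, :relationships

  def initialize(name)
    @name = name
    @relationships = [] # zero, one, or many
  end

  # The label is free text chosen by the user, not picked from a fixed list.
  def add_relationship(label, with)
    @relationships << Relationship.new(label, with)
    self
  end

  def labels
    @relationships.map(&:label)
  end
end

# Made-up example: separated but still legally married, and also dating.
sam = Person.new("Sam")
sam.add_relationship("Married (separated)", "Alex")
   .add_relationship("Dating", "Jo")

# Someone with nothing to report is simply an empty list,
# rather than a forced "Single" or "I don't want to say".
pat = Person.new("Pat")
```

With a collection instead of a single-select, a new relationship does not overwrite an old one, and declining to answer needs no special option.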

Facebook relationship status by user age

srsly?

When we don't give people means to express what's real and important for them, they hack around us. Are a quarter of Facebook's 13-year-olds really married? Probably not. What they probably have is a BEST FRIEND, and "married" is the nearest equivalent available for them to express: THIS PERSON is my most important relationship.

The dogged pursuit of nice, clean, CRUNCHABLE data takes us to such wrong places. And makes our users want to work against a system that doesn't acknowledge their lives' reality.

We build or break community with each line of code

As developers, our lines of code build, or break, an app's community. So it's worth making conscious choices as we go.

Modeling the real world is complex

[READ] And that's okay.

Assuming we know who users are surrenders opportunity to learn who they are

But [read]. Early constraints in schema NET crappy, misleading data. [beat] So keep constraints out of human and social schema, at least at first. Gather enough initial response to do some data mining. Watch that data for a while. Keep an eye out for emergent trends. What DISCOVERIES can you make? How can they shape the app's growth and evolution?
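
A minimal sketch of that kind of early data mining, assuming free-text answers have already been collected from an unconstrained, optional field. The responses array is made-up sample data and tally_labels is a hypothetical helper written for this illustration.

```ruby
# Made-up sample of free-text answers to an unconstrained,
# optional relationship_status field. nil means the user skipped it.
responses = [
  "Married", "married ", "Best friends", "best friends",
  nil, "Separated", "BEST FRIENDS", "It's complicated", nil
]

# Hypothetical helper: normalize lightly (trim, downcase), drop
# skipped answers, and count how often each label appears,
# most frequent first (ties broken alphabetically).
def tally_labels(answers)
  counts = Hash.new(0)
  answers.compact.each do |answer|
    label = answer.strip.downcase
    counts[label] += 1 unless label.empty?
  end
  counts.sort_by { |label, count| [-count, label] }
end

emergent = tally_labels(responses)

# The top entries are candidates for options the app did not anticipate.
top_label, top_count = emergent.first
```

Only light normalization is applied here on purpose: the point of watching the data first is to see which labels people actually reach for before deciding what counts as valid.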

Quality

Data quality improves when lies are merely optional, not required.

Specificity

Data becomes like the real world itself: rich and specific. So we can unearth the patterns that are undetectable when data is generic.

Adaptability

We get to make discoveries and re-envision possibilities.

We adapt quickly.

Loyalty

And their response to all this? Engagement. Passion. LOYALTY.

THOSE are foundations for great user experiences. [LOOK UP FINALITY]

@cczona

cczona@gmail.com

http://cczona.com

Questions? Conversation?

Oh Hell Yeah

Carina C. Zona

Image Credits

Postsecret

Facebook

OKCupid

Google+

Metafilter

Diaspora

Flickr

FetLife

Kotangle

cutestpaw.com

hdwallpapers.in

xkcd

Many Thanks

Chiu-Ki Chan

Estelle Weyl

Heather Rivers

Heroku

Jeremy Dunck

Josh Susser

Michele Titolo

NIRD LLC

Renée De Voursney

Sarah Mei

San Francisco Sex Information

Yoz Grahame

Get in Touch

Carina C. Zona

@cczona

http://cczona.com

cczona@gmail.com

http://slideshare.net/cczona

http://linkedin.com/in/cczona

Page 83: Schemas for the Real World [Madison RubyConf 2013]

cczona

Checkbox  Radio  Select

Textarea  Text

Required  Optional

Alteration

Guided

So there you have it: a discretionary field. Whether to respond is left up to the user.

t.string :relationship_status, :null => true

As developers, we may look upon THIS expression as redundant, completely unnecessary. Duh, null is true by default. But making that explicit is a communication to the team and to your future self. It's a statement of intent. It's documenting a product decision.
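To make that intent concrete outside of a migration, here is a minimal plain-Ruby sketch. It is not ActiveRecord and not from the talk: the `Profile` struct and its two helper methods are hypothetical names, chosen only to show "no answer" being treated as a valid, deliberate state.

```ruby
# A plain-Ruby stand-in for a profile record (hypothetical; not ActiveRecord).
Profile = Struct.new(:name, :relationship_status, keyword_init: true) do
  # Product decision: answering is up to the user, so nil is a valid value.
  def relationship_status_given?
    !relationship_status.nil?
  end

  # Display text never invents a label for someone who chose not to answer.
  def relationship_status_label
    relationship_status_given? ? relationship_status : "(not shared)"
  end
end

anonymous = Profile.new(name: "Sam")
labeled   = Profile.new(name: "Alex", relationship_status: "Separated")

puts anonymous.relationship_status_label
puts labeled.relationship_status_label
```

The point of the sketch is the same as the explicit `:null => true`: the code says out loud that a blank answer is expected, rather than leaving it as an accident of defaults.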

Facebook

Single
In a relationship
Engaged
Married
It's complicated
Open relationship

What would a canonical set of relationship statuses look like? Three years ago, Facebook figured this list was pretty good. Arguably pretty progressive too, right? Users disagreed. Strongly.

Facebook

Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Separated
Divorced
Civil union
Domestic partnership

Under pressure, Facebook nearly doubled the options in just two years' time.

Google+

Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Separated
Divorced
Civil union
Domestic partnership
I don't want to say

While Google+ has largely adopted that list, it has not included Separated or Divorced. Notice that they also added something choice: Opt out of labeling.

Allowing users to identify their relationships with labels of greater personal significance

That's being driven by people rejecting a user experience that isn't working for them.

It's a scope

How did some statuses seem universal while others weren't? Naming a thing creates scope. The assumed validity of a field's values gets constrained as soon as the field is named.

marital_status

Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Separated
Divorced
Civil union
Domestic partnership
I don't want to say

Marital status, for instance, might lead to a list similar to this one. A person is assumed to be either unmarried, preparing to be married, currently married, or formerly married.

legal_marital_status

Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Separated
Divorced
Civil union
Domestic partnership
I don't want to say

Marital status, for instance, might lead to a list similar to this one. A person is assumed to be either unmarried, preparing to be married, currently married, or formerly married.

relationship_status

Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Separated
Divorced
Civil union
Domestic partnership
I don't want to say

Whereas relationship status might lead to a list more like this one, in which one either has a current relationship -- or is defined by the absence of any.

singleness_status

As soon as you name a field, you define its paradigm and possibilities. There are important differences in what each of these envisions and is able to measure. Naming fields -- with great specificity, upfront -- makes analyses more powerful later.

singleness_rating

No chance whatsoever  Suuuuuper duper single

1000

If you change the name, you shift the paradigm and possibilities. There's an important difference in what these collect and set out to measure. Naming fields -- with great specificity, upfront -- makes analyses more powerful later.

It's Complicated  In a Relationship

Married  Divorced

Widowed  Single

We go through life experiencing many relationships. They don't all have pre-determined labels. And new relationship identities don't necessarily leave old ones behind.

"I like to be truthful, and 'It's Complicated' is really deceiving. It is not complicated. I am separated from my husband, who I am still legally married to."

-- Facebook user

[READ] Why should anyone have to feel like they're required to disavow a relationship that's meaningful for them?

It cuts deeply sometimes. Is a widow only Widowed until she resumes dating? Why does she have to be confronted with a decision like that just to use someone's app?

Okay, this one. Spot the fatal flaw. [WAIT]

buffer overflow

An open relationship is definitionally a 1:Many join -- WITH the usual engineering understanding, that is, that the actual number of relationships may be 0, 1, OR many.
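Here is one way the 1:Many idea could look, as a minimal plain-Ruby sketch rather than anything shown in the talk. `Person`, `Relationship`, and `add_relationship` are hypothetical names; the only claim is that a collection can hold 0, 1, or many relationships, where a single status column can hold exactly one.

```ruby
# Hypothetical plain-Ruby model: a person holds zero, one, or many relationships.
Relationship = Struct.new(:label, :partner_name, keyword_init: true)

class Person
  attr_reader :name, :relationships

  def initialize(name)
    @name = name
    @relationships = [] # an empty list is a valid state: 0, 1, or many
  end

  # Add a relationship without replacing any existing one.
  def add_relationship(label:, partner_name: nil)
    @relationships << Relationship.new(label: label, partner_name: partner_name)
    self
  end

  # Labels in the order the person added them.
  def relationship_labels
    @relationships.map(&:label)
  end
end

person = Person.new("Jordan")
person.add_relationship(label: "Married", partner_name: "Casey")
person.add_relationship(label: "Open relationship")

puts person.relationship_labels.join(", ")
```

Nothing overflows here: adding "Open relationship" does not force the person to give up "Married", and someone with no relationships simply has an empty list.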

Choose (one): evasive, inauthentic

But it's not funny from the personal perspective. This schema forces the person to either LOOK EVASIVE or else BE INAUTHENTIC. This is what happens when we try to throw more labels at the problem instead of re-examining the schema's assumptions.

Facebook relationship status by user age

srsly?

When we don't give people means to express what's real and important for them, they hack around us. Are a quarter of Facebook's 13-year-olds really married? Probably not. What they probably have is a BEST FRIEND, and "married" is the nearest equivalent available for them to express "THIS PERSON is my most important relationship."

The dogged pursuit of nice, clean, CRUNCHABLE data takes us to such wrong places. And makes our users want to work against a system that doesn't acknowledge their lives' reality.

We build or break community with each line of code

As developers, our lines of code build -- or break -- an app's community. So it's worth making conscious choices as we go.

Modeling the real world is complex

[READ] And that's okay.

Assuming we know who users are surrenders opportunity to learn who they are

But [read]. Early constraints in schema NET crappy, misleading data. [beat] So keep constraints out of human and social schema, at least at first. Gather enough initial response to do some data mining. Watch that data for a while. Keep an eye out for emergent trends. What DISCOVERIES can you make? How can they shape the app's growth and evolution?
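As a rough illustration of "watch that data for a while", the sketch below counts what people actually typed into an unconstrained text field. The sample answers and the `tally_labels` helper are invented for this example; a real app would read the values from its own storage and would likely want more careful normalization.

```ruby
# Hypothetical free-text answers collected from an unconstrained field.
responses = [
  "married", "Married ", "best friends", "it's complicated",
  "Best Friends", nil, "separated", "best friends", ""
]

# Normalize lightly (trim, downcase), skip blanks, and count each label.
# nil means the user chose not to answer, so it is left out of the tally.
def tally_labels(responses)
  counts = responses
    .compact
    .map { |answer| answer.strip.downcase }
    .reject(&:empty?)
    .each_with_object(Hash.new(0)) { |label, acc| acc[label] += 1 }

  # Most common first; ties broken alphabetically for stable output.
  counts.sort_by { |label, count| [-count, label] }
end

tally_labels(responses).each { |label, count| puts "#{label}: #{count}" }
```

In this made-up sample, "best friends" surfaces as the most common answer, which is exactly the kind of emergent label a fixed dropdown would never have revealed.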

Quality

Data quality improves when lies are merely optional, not required.

Specificity

Data becomes like the real world itself: rich and specific. So we can unearth the patterns that are undetectable when data is generic.

Adaptability

We get to make discoveries and re-envision possibilities.

We adapt quickly.

Loyalty

And their response to all this? Engagement. Passion. LOYALTY.

THOSE are foundations for great user experiences. [LOOK UP FINALITY]

@cczona  cczona@gmail.com  http://cczona.com

Questions? Conversation?

Oh Hell Yeah

Carina C. Zona

@cczona

Resources

Data Science & Information Architecture

• Sociological normalization
• Database normalization
• Using Machine Learning On Social Networks To Figure Out What You Should Read On The Web
• NoSQL Data Modeling Techniques
• Data & Reality, by William Kent
• Bad Data Handbook, Q. Ethan McCallum
• Data Science of the Facebook World

Relationships

• Does Facebook Hurt Relationships?
• Facebook Adds LGBT-Friendly Relationship Status Options
• Facebook Targeting by Relationship Status & Workplace
• Your Facebook Relationship Status: It's Complicated
• Gay Marriage: The Database Engineering Perspective

Sex & Genders

• "Disalienation: Why Gender is a Text Field on Diaspora"
• "Gender & Drop Down Menus"
• "Sex & Gender"
• "Bucket Gender"
• Recommendations for Inclusive Data Collection of Trans People

Names

• Falsehoods Programmers Believe About Names
• Your Last Name Contains Invalid Characters
• W3C Internationalization: Personal Names Around the World
• Spanish Names
• Chinese Names

More

• Redesigning the Country Selector
• American Religious Identification Survey Summary Report 2009
• Linguistic Potluck: Crowdsourcing Internationalization in Rails

Image Credits

Postsecret
Facebook
OKCupid
Google+
Metafilter
Diaspora
Flickr
FetLife
Kotangle
cutestpaw.com
hdwallpapers.in
xkcd

Many Thanks

Chiu-Ki Chan
Estelle Weyl
Heather Rivers
Heroku
Jeremy Dunck
Josh Susser
Michele Titolo
NIRD LLC
Renée De Voursney
Sarah Mei
San Francisco Sex Information
Yoz Grahame

Get in Touch

Carina C. Zona  @cczona

http://cczona.com

cczona@gmail.com

http://slideshare.net/cczona

http://linkedin.com/in/cczona

Page 87: Schemas for the Real World [Madison RubyConf 2013]

cczona

Google+Single

In a relationship

Engaged

Married

Its complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I dont want to say

While Google+ has largely adopted that list it has not included Separated or Divorced Notice that they also added something choice Opt out of labeling

cczona

Allowing users to identify their relationships with labels of greater personal significanceThats being driven by people rejecting a user experience that isnt working for them

cczona

Its a scope

How did some status seem universal while others werent Naming a thing creates scope The assumed validity of a fields values get constrained as soon as the field is named

marital_status

Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Separated
Divorced
Civil union
Domestic partnership
I don't want to say

Marital status, for instance, might lead to a list similar to this one. A person is assumed to be either unmarried, preparing to be married, currently married, or formerly married.

legal_marital_status

Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Separated
Divorced
Civil union
Domestic partnership
I don't want to say

Marital status, for instance, might lead to a list similar to this one. A person is assumed to be either unmarried, preparing to be married, currently married, or formerly married.

relationship_status

Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Separated
Divorced
Civil union
Domestic partnership
I don't want to say

Whereas relationship status might lead to a list more like this one, in which one either has a current relationship or is defined by the absence of any.

singleness_status

As soon as you name a field, you define its paradigm and possibilities. There are important differences in what each of these names envisions and is able to measure. Naming fields with great specificity upfront makes analyses more powerful later.
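The scoping effect of a field name can be sketched in a few lines of Python. This is only an illustration: the two enum classes, the way the slide's labels are divided between them, and the `valid_values` helper are assumptions made for the example, not any real site's schema.

```python
from enum import Enum


class LegalMaritalStatus(Enum):
    """Hypothetical field: scoped to legal standing only."""
    SINGLE = "single"
    MARRIED = "married"
    SEPARATED = "separated"
    DIVORCED = "divorced"
    WIDOWED = "widowed"
    CIVIL_UNION = "civil union"
    DOMESTIC_PARTNERSHIP = "domestic partnership"


class RelationshipStatus(Enum):
    """Hypothetical field: scoped to how a person describes current relationships."""
    SINGLE = "single"
    IN_A_RELATIONSHIP = "in a relationship"
    ENGAGED = "engaged"
    MARRIED = "married"
    OPEN_RELATIONSHIP = "open relationship"
    ITS_COMPLICATED = "it's complicated"


# Assumed mapping from field name to the set of labels it admits.
FIELDS = {
    "legal_marital_status": LegalMaritalStatus,
    "relationship_status": RelationshipStatus,
}


def valid_values(field_name):
    """Return the labels that the named field considers valid."""
    return {member.value for member in FIELDS[field_name]}


# The same label is valid under one name and out of scope under another.
print("separated" in valid_values("legal_marital_status"))
print("separated" in valid_values("relationship_status"))
```

The point of the sketch is that nothing about the person changed between the two lookups. Only the field name did.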

singleness_rating

No chance whatsoever
Suuuuuper duper single

1000

If you change the name, you shift the paradigm and possibilities. There's an important difference in what these set out to measure. Naming fields with great specificity upfront makes analyses more powerful later.

It's Complicated
In a Relationship
Married
Divorced
Widowed
Single

We go through life experiencing many relationships. They don't all have predetermined labels. And new relationship identities don't necessarily leave old ones behind.

"I like to be truthful, and 'It's Complicated' is really deceiving. It is not complicated. I am separated from my husband, who I am still legally married to."

-- Facebook user

[READ] Why should anyone have to feel like they're required to disavow a relationship that's meaningful for them?

It cuts deeply sometimes. Is a widow only "Widowed" until she resumes dating? Why does she have to be confronted with a decision like that just to use someone's app?
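A minimal sketch, in Python, of a profile where a new relationship label does not overwrite an old one. The `Profile` class, its `relationship_labels` field, and the `add_label` method are hypothetical names invented for this illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Profile:
    """Hypothetical profile whose relationship labels accumulate."""
    name: str
    relationship_labels: list = field(default_factory=list)

    def add_label(self, label):
        # Append the new label; existing labels are kept, never replaced.
        if label not in self.relationship_labels:
            self.relationship_labels.append(label)


profile = Profile("example user")
profile.add_label("Widowed")
profile.add_label("In a relationship")
print(profile.relationship_labels)  # ['Widowed', 'In a relationship']
```

Storing a list instead of a single value means the widow in the example above never has to choose which relationship to disavow.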

Okay, this one. Spot the fatal flaw. [WAIT]

buffer overflow

An open relationship is definitionally a 1:Many join, WITH the usual engineering understanding: the actual number of relationships may be 0, 1, OR many.
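The 1:Many idea can be sketched with Python's built-in `sqlite3` module. The table names, column names, and sample rows are assumptions made up for this example. The only point is that one person row can join to zero, one, or many relationship rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    -- One row per relationship, so a person may have 0, 1, or many.
    CREATE TABLE relationships (
        id INTEGER PRIMARY KEY,
        person_id INTEGER NOT NULL REFERENCES people(id),
        partner_name TEXT NOT NULL,
        label TEXT  -- free text chosen by the user; may be NULL
    );
""")
conn.executemany(
    "INSERT INTO people (id, name) VALUES (?, ?)",
    [(1, "A"), (2, "B"), (3, "C")],
)
conn.executemany(
    "INSERT INTO relationships (person_id, partner_name, label) VALUES (?, ?, ?)",
    [(2, "X", "partner"), (3, "Y", "spouse"), (3, "Z", "partner")],
)


def relationship_counts(connection):
    """Count relationships per person; LEFT JOIN keeps people who have none."""
    rows = connection.execute("""
        SELECT p.name, COUNT(r.id)
        FROM people p
        LEFT JOIN relationships r ON r.person_id = p.id
        GROUP BY p.id
        ORDER BY p.id
    """).fetchall()
    return dict(rows)


print(relationship_counts(conn))  # {'A': 0, 'B': 1, 'C': 2}
```

A single-valued status column cannot represent person C at all, which is the overflow the slide jokes about.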

Choose (one)
evasive
inauthentic

But it's not funny from the personal perspective. This schema forces the person to either LOOK EVASIVE or else BE INAUTHENTIC. This is what happens when we try to throw more labels at the problem instead of re-examining the schema's assumptions.

Facebook relationship status by user age

srsly

When we don't give people means to express what's real and important for them, they hack around us. Are a quarter of Facebook's 13-year-olds really married? Probably not. What they probably have is a BEST FRIEND, and "married" is the nearest equivalent available for them to express "THIS PERSON is my most important relationship."

The dogged pursuit of nice, clean, CRUNCHABLE data takes us to such wrong places. And it makes our users want to work against a system that doesn't acknowledge their lives' reality.

We build or break community with each line of code

As developers, our lines of code build, or break, an app's community. So it's worth making conscious choices as we go.

Modeling the real world is complex

[READ] And that's okay.

Assuming we know who users are surrenders opportunity to learn who they are

But [read]. Early constraints in schema NET crappy, misleading data. [beat] So keep constraints out of human and social schema, at least at first. Gather enough initial response to do some data mining. Watch that data for a while. Keep an eye out for emergent trends. What DISCOVERIES can you make? How can they shape the app's growth and evolution?
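One way to watch an unconstrained field for emergent trends, sketched in Python. The sample responses, the `normalize` and `emergent_labels` helpers, and the `min_count` threshold are all assumptions for illustration. Real data mining would go well beyond counting.

```python
from collections import Counter


def normalize(answer):
    """Light normalization only: collapse whitespace and lowercase."""
    return " ".join(answer.split()).lower()


def emergent_labels(answers, min_count=2):
    """Count free-text answers and return those used at least min_count times."""
    counts = Counter(normalize(a) for a in answers if a and a.strip())
    return [(label, n) for label, n in counts.most_common() if n >= min_count]


# Hypothetical responses typed into an unconstrained "relationship" text field.
responses = [
    "Best friend", "best friend ", "Married", "separated",
    "Separated", "BEST FRIEND", "", "it's a long story",
]
print(emergent_labels(responses))  # [('best friend', 3), ('separated', 2)]
```

Labels that keep recurring are candidates for suggestions or later structure. Rare answers stay stored exactly as the user wrote them.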

Quality

Data quality improves when lies are merely optional, not required.

Specificity

Data becomes like the real world itself: rich and specific. So we can unearth the patterns that are undetectable when data is generic.

Adaptability

We get to make discoveries and re-envision possibilities.

We adapt quickly.

Loyalty

And their response to all this? Engagement. Passion. LOYALTY.

THOSE are foundations for great user experiences. [LOOK UP FINALITY]

@cczona
cczona@gmail.com
http://cczona.com

Questions? Conversation?

Oh Hell Yeah

Carina C. Zona

Resources

Data Science & Information Architecture
• Sociological normalization
• Database normalization
• Using Machine Learning On Social Networks To Figure Out What You Should Read On The Web
• NoSQL Data Modeling Techniques
• Data & Reality by William Kent
• Bad Data Handbook, Q. Ethan McCallum
• Data Science of the Facebook World

Relationships
• Does Facebook Hurt Relationships?
• Facebook Adds LGBT-Friendly Relationship Status Options
• Facebook Targeting by Relationship Status & Workplace
• Your Facebook Relationship Status: It's Complicated
• Gay Marriage: The Database Engineering Perspective

Sex & Genders
• “Disalienation: Why Gender is a Text Field on Diaspora”
• “Gender & Drop Down Menus”
• “Sex & Gender”
• “Bucket Gender”
• Recommendations for Inclusive Data Collection of Trans People

Names
• Falsehoods Programmers Believe About Names
• Your Last Name Contains Invalid Characters
• W3C Internationalization: Personal Names Around the World
• Spanish Names
• Chinese Names

More
• Redesigning the Country Selector
• American Religious Identification Survey: Summary Report 2009
• Linguistic Potluck: Crowdsourcing Internationalization in Rails

Image Credits

Postsecret
Facebook
OKCupid
Google+
Metafilter
Diaspora
Flickr
FetLife
Kotangle
cutestpaw.com
hdwallpapers.in
xkcd

Many Thanks

Chiu-Ki Chan
Estelle Weyl
Heather Rivers
Heroku
Jeremy Dunck
Josh Susser
Michele Titolo
NIRD LLC
Renée De Voursney
Sarah Mei
San Francisco Sex Information
Yoz Grahame

Get in Touch

Carina C. Zona
@cczona
http://cczona.com
cczona@gmail.com
http://slideshare.net/cczona
http://linkedin.com/in/cczona




Page 91: Schemas for the Real World [Madison RubyConf 2013]

cczona

legal_marital_statusSingle

In a relationship

Engaged

Married

Its complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I dont want to say

Marital status for instance might lead to a list similar to this one A person is assumed to be either unmarried preparing to be married currently married or formerly married

cczona

relationship_statusSingle

In a relationship

Engaged

Married

Its complicated

Open relationship

Widowed

Separated

Divorced

Civil union

Domestic partnership

I dont want to say

Whereas relationship status might lead to a list more like this one In which one either has a current relationship -- or is defined by the absence of any

cczona

singleness_status

As soon as you name a field you define its paradigm and possibilities There are important difference in what each of these envision and are able to measure Naming fields -- with great specificity upfront -- Makes analyses more powerful later

cczona

singleness_rating

No chance whatsoever Suuuuuper duper single

1000

If you change the name you shift the paradigm and possibilities Theres important difference in these what these collect set out to measure Naming fields -- With great specificity upfront --Makes analyses more powerful later

cczona

Its ComplicatedIn a Relationship

Married Divorced

WidowedSingle

We go through life experiencing many relationships They dont all have pre-determined labels And new relationship identities dont necessarily leave old ones behind

cczona

I like to be truthful and Its Complicated is really deceiving It is not complicated I am separated from my husband who I am still legally married to

mdashFacebook user

[READ] Why should anyone have to feel like theyre required to disavowing a relationship thats meaningful for them

It cuts deeply sometimes Is a widow only Widowed until she resumes dating Why does she have to be confronted with a decision like that just to use someones app

cczona

Okay this one Spot the fatal flaw [WAIT]

cczona

buffer overflow

An open relationship is definitionally a 1Many join mdash WITH the usual engineering understanding that is that actual number of relationships may be 0 1 OR many

cczona

Choose (one) evasive inauthentic

But its not funny from the personal perspective This schema forces the person to either LOOK EVASIVE or else BE INAUTHENTIC This is what happens when we try to throw more labels at the problem instead of re-examining the schemas assumptions

cczona

Facebook relationship status by user age

srslyWhen we dont give people means to express whats real and important for them they hack around us Are a quarter of Facebooks 13 year olds really married Probably not What they probably have is a BEST FRIEND and married is nearest equivalent available for them to express THIS PERSON is my most important relationship

The dogged pursuit of nice clean CRUNCHABLE data takes us to such wrong places And makes our users want to work against a system that doesnt acknowledge their lives reality

cczona

We build or break community with each

line of code

As developers our lines of code build -- or break -- an apps community So its worth making conscious choices as we go

cczona

Modeling the real world is complex

[READ] And thats okay

cczona

Assuming we know who users are surrenders

opportunity to learn who they are

But [read] Early constraints in schema NET crappy misleading data[beat]So keep constraints out of human and social schema at least at first Gather enough initial response to do some data mining Watch that data for a while Keep an eye out for emergent trends What DISCOVERIES can you make How can they shape the apps growth and evolution

cczona

Quality

Data quality improves when lies are merely optional not required

cczona

Specificity

Data becomes like the real world itself rich and specific So we can unearth the patterns that are undetectable when data is generic

cczona

Adaptability

We get to make discoveries and re-envision possibilities

We adapt quickly

cczona

Loyalty

And their response to all this Engagement Passion LOYALTY

cczona

THOSE are foundations for great user experiences [LOOK UP FINALITY]

cczonacczonagmailcomhttpcczonacom

QuestionsConversation

Oh Hell Yeah

Carina C Zona

cczona

Resources

Data Science & Information Architecture

• Sociological normalization
• Database normalization
• Using Machine Learning On Social Networks To Figure Out What You Should Read On The Web
• NoSQL Data Modeling Techniques
• Data & Reality, by William Kent
• Bad Data Handbook, by Q. Ethan McCallum
• Data Science of the Facebook World

Relationships

• Does Facebook Hurt Relationships?
• Facebook Adds LGBT-Friendly Relationship Status Options
• Facebook Targeting by Relationship Status & Workplace
• Your Facebook Relationship Status: It's Complicated
• Gay Marriage: The Database Engineering Perspective

Sex & Genders

• "Disalienation: Why Gender is a Text Field on Diaspora"
• "Gender & Drop Down Menus"
• "Sex & Gender"
• "Bucket Gender"
• Recommendations for Inclusive Data Collection of Trans People

Names

• Falsehoods Programmers Believe About Names
• Your Last Name Contains Invalid Characters
• W3C Internationalization: Personal Names Around the World
• Spanish Names
• Chinese Names

More

• Redesigning the Country Selector
• American Religious Identification Survey Summary Report 2009
• Linguistic Potluck: Crowdsourcing Internationalization in Rails

Image Credits

• Postsecret
• Facebook
• OKCupid
• Google+
• Metafilter
• Diaspora
• Flickr
• FetLife
• Kotangle
• cutestpaw.com
• hdwallpapers.in
• xkcd

Many Thanks

• Chiu-Ki Chan
• Estelle Weyl
• Heather Rivers
• Heroku
• Jeremy Dunck
• Josh Susser
• Michele Titolo
• NIRD LLC
• Renée De Voursney
• Sarah Mei
• San Francisco Sex Information
• Yoz Grahame

Get in Touch

Carina C Zona
@cczona
http://cczona.com
cczona@gmail.com
http://slideshare.net/cczona
http://linkedin.com/in/cczona


relationship_status

Single
In a relationship
Engaged
Married
It's complicated
Open relationship
Widowed
Separated
Divorced
Civil union
Domestic partnership
I don't want to say

Whereas "relationship status" might lead to a list more like this one. In which one either has a current relationship -- or is defined by the absence of any.

singleness_status

As soon as you name a field, you define its paradigm and possibilities. There are important differences in what each of these envisions and is able to measure. Naming fields -- with great specificity, upfront -- makes analyses more powerful later.

singleness_rating

No chance whatsoever Suuuuuper duper single

1000

If you change the name, you shift the paradigm and possibilities. There are important differences in what these set out to collect and measure. Naming fields -- with great specificity, upfront -- makes analyses more powerful later.
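To make the naming point concrete, here is a small Python sketch, offered as an illustration only. The enum values, the `Profile` class, and the assumption that the rating is an integer from 0 to 1000 are hypothetical, since the slides give only the field names. It shows how the name chosen for a field tends to fix its type, and the type fixes what can be measured later.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum
from statistics import median


class SinglenessStatus(Enum):
    # Hypothetical labels; the slides do not list values for singleness_status.
    SINGLE = "single"
    NOT_SINGLE = "not single"


@dataclass
class Profile:
    singleness_status: SinglenessStatus  # a "status" reads as a category
    singleness_rating: int               # a "rating" reads as a number (assumed 0..1000)


profiles = [
    Profile(SinglenessStatus.SINGLE, 1000),
    Profile(SinglenessStatus.SINGLE, 400),
    Profile(SinglenessStatus.NOT_SINGLE, 0),
]

# A categorical field supports counting people per label...
status_counts = Counter(p.singleness_status for p in profiles)

# ...while a numeric field supports ordering and summary statistics.
rating_median = median(p.singleness_rating for p in profiles)

print(status_counts[SinglenessStatus.SINGLE], rating_median)
```

The two SINGLE profiles are indistinguishable by status, but the rating records a difference between them that the category cannot.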

It's Complicated
In a Relationship
Married
Divorced
Widowed
Single

We go through life experiencing many relationships. They don't all have pre-determined labels. And new relationship identities don't necessarily leave old ones behind.

"I like to be truthful, and 'It's Complicated' is really deceiving. It is not complicated. I am separated from my husband, who I am still legally married to."

-- Facebook user

[READ] Why should anyone have to feel like they're required to disavow a relationship that's meaningful for them?

It cuts deeply sometimes. Is a widow only "Widowed" until she resumes dating? Why does she have to be confronted with a decision like that just to use someone's app?

Okay, this one. Spot the fatal flaw. [WAIT]

buffer overflow

An open relationship is definitionally a 1:Many join -- WITH the usual engineering understanding, that is, that the actual number of relationships may be 0, 1, OR many.
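The 1:Many idea can be sketched in a few lines of Python. This is an assumption-laden illustration rather than any real app's schema: the `Person` and `Relationship` classes and the sample labels are hypothetical. Each person holds a list of relationships, so zero, one, or many can coexist, and a new entry does not overwrite an older one.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Relationship:
    label: str           # free text chosen by the user, not a fixed menu
    with_whom: str = ""  # optional; left empty when the user prefers not to say


@dataclass
class Person:
    name: str
    # One person to many relationships: the list may hold 0, 1, or many entries.
    relationships: List[Relationship] = field(default_factory=list)


nobody_listed = Person("A")  # zero relationships is a valid state

widow = Person("B", [Relationship("widowed")])
widow.relationships.append(Relationship("dating"))  # the new identity joins the old one

separated = Person("C", [
    Relationship("legally married", "husband"),
    Relationship("separated", "husband"),
])

for person in (nobody_listed, widow, separated):
    print(person.name, [r.label for r in person.relationships])
```

In a relational database the same shape would usually be a separate relationships table keyed by person, instead of a single status column on the user record.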

Choose (one): evasive, inauthentic

But it's not funny from the personal perspective. This schema forces the person to either LOOK EVASIVE or else BE INAUTHENTIC. This is what happens when we try to throw more labels at the problem instead of re-examining the schema's assumptions.

Facebook relationship status by user age

srsly

When we don't give people means to express what's real and important for them, they hack around us. Are a quarter of Facebook's 13-year-olds really married? Probably not. What they probably have is a BEST FRIEND, and "married" is the nearest equivalent available for them to express "THIS PERSON is my most important relationship."

The dogged pursuit of nice, clean, CRUNCHABLE data takes us to such wrong places. And makes our users want to work against a system that doesn't acknowledge their lives' reality.

We build or break community with each line of code

As developers, our lines of code build -- or break -- an app's community. So it's worth making conscious choices as we go.

Modeling the real world is complex

[READ] And that's okay.

Assuming we know who users are surrenders opportunity to learn who they are







Page 98: Schemas for the Real World [Madison RubyConf 2013]

cczona

buffer overflow

An open relationship is definitionally a 1Many join mdash WITH the usual engineering understanding that is that actual number of relationships may be 0 1 OR many

cczona

Choose (one) evasive inauthentic

But its not funny from the personal perspective This schema forces the person to either LOOK EVASIVE or else BE INAUTHENTIC This is what happens when we try to throw more labels at the problem instead of re-examining the schemas assumptions

cczona

Facebook relationship status by user age

srslyWhen we dont give people means to express whats real and important for them they hack around us Are a quarter of Facebooks 13 year olds really married Probably not What they probably have is a BEST FRIEND and married is nearest equivalent available for them to express THIS PERSON is my most important relationship

The dogged pursuit of nice clean CRUNCHABLE data takes us to such wrong places And makes our users want to work against a system that doesnt acknowledge their lives reality

cczona

We build or break community with each

line of code

As developers our lines of code build -- or break -- an apps community So its worth making conscious choices as we go

cczona

Modeling the real world is complex

[READ] And thats okay

Assuming we know who users are surrenders the opportunity to learn who they are

But [read]. Early constraints in schema NET crappy, misleading data. [beat] So keep constraints out of human and social schema, at least at first. Gather enough initial response to do some data mining. Watch that data for a while. Keep an eye out for emergent trends. What DISCOVERIES can you make? How can they shape the app's growth and evolution?
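One way to picture "gather first, constrain later" is a small Python sketch, under the assumption that the field starts life as unconstrained free text. The function names, the `min_count` threshold, and the sample answers are all hypothetical. The idea is to normalize answers only lightly, count them, and see which values actually recur before deciding whether any of them deserve a first-class option.

```python
from collections import Counter


def normalize(answer: str) -> str:
    """Collapse case and whitespace so near-identical answers group together."""
    return " ".join(answer.lower().split())


def emergent_trends(answers, min_count=2):
    """Count normalized free-text answers; keep those seen at least min_count times."""
    counts = Counter(normalize(a) for a in answers if a and a.strip())
    return [(value, n) for value, n in counts.most_common() if n >= min_count]


# Hypothetical free-text responses to an unconstrained "relationship" field.
responses = [
    "Married", "best friend", "Best Friend ", "it's complicated",
    "BEST FRIEND", "married", "", "open relationship",
]

print(emergent_trends(responses))
```

Nothing is rejected at input time; blank answers are simply skipped when counting. The rare answers stay in the stored data, so they remain available if a trend emerges later.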

Quality

Data quality improves when lies are merely optional, not required.

Specificity

Data becomes like the real world itself: rich and specific. So we can unearth the patterns that are undetectable when data is generic.

Adaptability

We get to make discoveries and re-envision possibilities. We adapt quickly.

Loyalty

And their response to all this? Engagement. Passion. LOYALTY.

THOSE are foundations for great user experiences. [LOOK UP FINALITY]

@cczona
cczona@gmail.com
http://cczona.com

Questions? Conversation?

Oh Hell Yeah

Carina C. Zona

Resources

Data Science & Information Architecture

• Sociological normalization
• Database normalization
• Using Machine Learning On Social Networks To Figure Out What You Should Read On The Web
• NoSQL Data Modeling Techniques
• Data & Reality, by William Kent
• Bad Data Handbook, by Q. Ethan McCallum
• Data Science of the Facebook World

Relationships

• Does Facebook Hurt Relationships?
• Facebook Adds LGBT-Friendly Relationship Status Options
• Facebook Targeting by Relationship Status & Workplace
• Your Facebook Relationship Status: It's Complicated
• Gay Marriage: The Database Engineering Perspective

Sex & Genders

• “Disalienation: Why Gender is a Text Field on Diaspora”
• “Gender & Drop Down Menus”
• “Sex & Gender”
• “Bucket Gender”
• Recommendations for Inclusive Data Collection of Trans People

Names

• Falsehoods Programmers Believe About Names
• Your Last Name Contains Invalid Characters
• W3C Internationalization: Personal Names Around the World
• Spanish Names
• Chinese Names

More

• Redesigning the Country Selector
• American Religious Identification Survey Summary Report 2009
• Linguistic Potluck: Crowdsourcing Internationalization in Rails

Image Credits

Postsecret
Facebook
OKCupid
Google+
Metafilter
Diaspora
Flickr
FetLife
Kotangle
cutestpaw.com
hdwallpapers.in
xkcd

Many Thanks

Chiu-Ki Chan
Estelle Weyl
Heather Rivers
Heroku
Jeremy Dunck
Josh Susser
Michele Titolo
NIRD LLC
Renée De Voursney
Sarah Mei
San Francisco Sex Information
Yoz Grahame

Get in Touch

Carina C. Zona
@cczona
http://cczona.com
cczona@gmail.com
http://slideshare.net/cczona
http://linkedin.com/in/cczona

Page 109: Schemas for the Real World [Madison RubyConf 2013]

cczonacczonagmailcomhttpcczonacom

QuestionsConversation

Oh Hell Yeah

Carina C Zona

cczona

Resources

cczona

Data Science amp Information ArchitecturebullSociological normalization

bullDatabase normalization

bullUsing Machine Learning On Social Networks To Figure Out What You Should Read On The Web

bullNoSQL Data Modeling Techniques

bullData amp Reality by William Kent

bull Bad Data Handbook Q Ethan McCallum

bullData Science of the Facebook World

cczona

Relationships

bullDoes Facebook Hurt Relationships

bullFacebook Adds LGBT-Friendly Relationship Status Options

bullFacebook Targeting by Relationship Status amp Workplace

bullYour Facebook Relationship Status Its Complicated

bullGay Marriage The Database Engineering Perspective

cczona

Sex amp Genders

bullldquoDisalienation Why Gender is a Text Field on Diasporardquo

bull ldquoGender amp Drop Down Menusrdquo

bullldquoSex amp Genderrdquo

bullldquoBucket Genderrdquo

bullRecommendations for Inclusive Data Collection of Trans People

cczona

Names

bullFalsehoods Programmers Believe About Names

bullYour Last Name Contains Invalid Characters

bullW3C Internationalization Personal Names Around the World

bullSpanish Names

bullChinese Names

cczona

More

bullRedesigning the Country Selector

bullAmerican Religious Identification Survey Summary Report 2009

bullLinguistic Potluck Crowdsourcing Internationalization in Rails

cczona

Image CreditsPostsecret

Facebook

OKCupid

Google+

Metafilter

Diaspora

Flickr

FetLife

Kotangle

cutestpawcom

hdwallpapersin

xkcd

cczona

Many Thanks

• Chiu-Ki Chan
• Estelle Weyl
• Heather Rivers
• Heroku
• Jeremy Dunck
• Josh Susser
• Michele Titolo
• NIRD LLC
• Renée De Voursney
• Sarah Mei
• San Francisco Sex Information
• Yoz Grahame

Get in Touch

Carina C. Zona
@cczona
http://cczona.com
cczona@gmail.com
http://slideshare.net/cczona
http://linkedin.com/in/cczona
