Orchestrating Collective Intelligence
TRANSCRIPT
![Page 1: Orchestrating Collective Intelligence](https://reader036.vdocument.in/reader036/viewer/2022062904/587d21401a28ab1c2f8b563b/html5/thumbnails/1.jpg)
@josephreisinger @premisedata
![Page 2: Orchestrating Collective Intelligence](https://reader036.vdocument.in/reader036/viewer/2022062904/587d21401a28ab1c2f8b563b/html5/thumbnails/2.jpg)
![Page 3: Orchestrating Collective Intelligence](https://reader036.vdocument.in/reader036/viewer/2022062904/587d21401a28ab1c2f8b563b/html5/thumbnails/3.jpg)
WHAT PREMISE MEASURES
![Page 4: Orchestrating Collective Intelligence](https://reader036.vdocument.in/reader036/viewer/2022062904/587d21401a28ab1c2f8b563b/html5/thumbnails/4.jpg)
Bringing visibility to the world’s hardest-to-see places. 130 cities, 30 countries.
![Page 5: Orchestrating Collective Intelligence](https://reader036.vdocument.in/reader036/viewer/2022062904/587d21401a28ab1c2f8b563b/html5/thumbnails/5.jpg)
Modernizing Economic Measurement
![Page 6: Orchestrating Collective Intelligence](https://reader036.vdocument.in/reader036/viewer/2022062904/587d21401a28ab1c2f8b563b/html5/thumbnails/6.jpg)
“I have been constantly surprised at how little quantitative information can be brought to bear on fundamental policy questions [...] This experience illustrates the need for flexibility in data collection, especially when policymakers consider extending new policies or need to evaluate them in real time for other reasons. Ideally, some sort of ‘rapid response’ data gathering capacity.”
— Alan Krueger, “Stress Testing Economic Data”
![Page 7: Orchestrating Collective Intelligence](https://reader036.vdocument.in/reader036/viewer/2022062904/587d21401a28ab1c2f8b563b/html5/thumbnails/7.jpg)
“The collection of statistics needs to be modernized; it is time to use the new technologies to start collecting data.
…particularly important in developing countries where the prevalence of mobile phones now offers an unprecedented opportunity to measure the economy.”
— Diane Coyle, “GDP”
![Page 8: Orchestrating Collective Intelligence](https://reader036.vdocument.in/reader036/viewer/2022062904/587d21401a28ab1c2f8b563b/html5/thumbnails/8.jpg)
![Page 9: Orchestrating Collective Intelligence](https://reader036.vdocument.in/reader036/viewer/2022062904/587d21401a28ab1c2f8b563b/html5/thumbnails/9.jpg)
OMGWTFGDP
![Page 10: Orchestrating Collective Intelligence](https://reader036.vdocument.in/reader036/viewer/2022062904/587d21401a28ab1c2f8b563b/html5/thumbnails/10.jpg)
“However, at this moment in survey research, uncertainty reigns. Participation rates in household surveys are declining throughout the developed world. Surveys seeking high response rates are experiencing crippling cost inflation. Traditional sampling frames that have been serviceable for decades are fraying at the edges.”
— Robert Groves, “Three Eras of Survey Research”
![Page 11: Orchestrating Collective Intelligence](https://reader036.vdocument.in/reader036/viewer/2022062904/587d21401a28ab1c2f8b563b/html5/thumbnails/11.jpg)
![Page 12: Orchestrating Collective Intelligence](https://reader036.vdocument.in/reader036/viewer/2022062904/587d21401a28ab1c2f8b563b/html5/thumbnails/12.jpg)
Orchestrating Collective Intelligence
![Page 13: Orchestrating Collective Intelligence](https://reader036.vdocument.in/reader036/viewer/2022062904/587d21401a28ab1c2f8b563b/html5/thumbnails/13.jpg)
![Page 14: Orchestrating Collective Intelligence](https://reader036.vdocument.in/reader036/viewer/2022062904/587d21401a28ab1c2f8b563b/html5/thumbnails/14.jpg)
PREMISE APP
Directed on-the-ground data acquisition
![Page 15: Orchestrating Collective Intelligence](https://reader036.vdocument.in/reader036/viewer/2022062904/587d21401a28ab1c2f8b563b/html5/thumbnails/15.jpg)
Crowdsourcing vs Orchestration
![Page 16: Orchestrating Collective Intelligence](https://reader036.vdocument.in/reader036/viewer/2022062904/587d21401a28ab1c2f8b563b/html5/thumbnails/16.jpg)
Crowdsourcing
survey
![Page 17: Orchestrating Collective Intelligence](https://reader036.vdocument.in/reader036/viewer/2022062904/587d21401a28ab1c2f8b563b/html5/thumbnails/17.jpg)
Crowdsourcing
survey survey tasks
![Page 18: Orchestrating Collective Intelligence](https://reader036.vdocument.in/reader036/viewer/2022062904/587d21401a28ab1c2f8b563b/html5/thumbnails/18.jpg)
Crowdsourcing
survey survey tasks workers
![Page 19: Orchestrating Collective Intelligence](https://reader036.vdocument.in/reader036/viewer/2022062904/587d21401a28ab1c2f8b563b/html5/thumbnails/19.jpg)
Orchestration
survey
![Page 20: Orchestrating Collective Intelligence](https://reader036.vdocument.in/reader036/viewer/2022062904/587d21401a28ab1c2f8b563b/html5/thumbnails/20.jpg)
Orchestration
survey survey tasks
![Page 21: Orchestrating Collective Intelligence](https://reader036.vdocument.in/reader036/viewer/2022062904/587d21401a28ab1c2f8b563b/html5/thumbnails/21.jpg)
Orchestration
survey survey tasks workers
![Page 22: Orchestrating Collective Intelligence](https://reader036.vdocument.in/reader036/viewer/2022062904/587d21401a28ab1c2f8b563b/html5/thumbnails/22.jpg)
![Page 23: Orchestrating Collective Intelligence](https://reader036.vdocument.in/reader036/viewer/2022062904/587d21401a28ab1c2f8b563b/html5/thumbnails/23.jpg)
![Page 24: Orchestrating Collective Intelligence](https://reader036.vdocument.in/reader036/viewer/2022062904/587d21401a28ab1c2f8b563b/html5/thumbnails/24.jpg)
![Page 25: Orchestrating Collective Intelligence](https://reader036.vdocument.in/reader036/viewer/2022062904/587d21401a28ab1c2f8b563b/html5/thumbnails/25.jpg)
![Page 26: Orchestrating Collective Intelligence](https://reader036.vdocument.in/reader036/viewer/2022062904/587d21401a28ab1c2f8b563b/html5/thumbnails/26.jpg)
PLATFORM
Diagram labels: survey campaign, allocation, quality control, analytics; end user, data contributor
- User poses a question that is best answered via actual, on-the-ground observation at scale.
- The question is translated into an internal “specification” of the data points needed to answer it: type, location, frequency, coverage, etc.
- The inventory of data points is automatically allocated to the data contributor pool, taking into account budget, agent profiles and geography. Data points are dynamically priced.
- Contributors collect data in the field using Android phones, and the data are sent back to the Premise network.
- QC is a mix of automated (outlier detection; machine learning; computer vision) and manual (directed sampling using oDesk) checks.
- Automated capabilities to explore data and expose trends or patterns; hypothesize new features to explain variation; suggest specification refinement; improve automated verification.
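The step from a question to a “specification” and then to an inventory of data points can be pictured with a small sketch. The slide names only type, location, frequency and coverage; the class names, field names and example values below are hypothetical and are not Premise's actual schema.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Specification:
    """Hypothetical spec: what to observe, where, how often, how many times."""
    measurables: tuple  # type of observation
    locations: tuple    # where to observe
    periods: int        # frequency, as a number of survey periods
    coverage: int       # observations wanted per (location, measurable, period)


@dataclass(frozen=True)
class DataPoint:
    measurable: str
    location: str
    period: int
    replicate: int


def expand_inventory(spec):
    """Enumerate every data point the specification asks for."""
    return [
        DataPoint(m, loc, p, r)
        for loc, m, p, r in product(
            spec.locations,
            spec.measurables,
            range(1, spec.periods + 1),
            range(spec.coverage),
        )
    ]


# Illustrative values only.
spec = Specification(
    measurables=("wait_time", "police_present"),
    locations=("Caracas", "Maracaibo"),
    periods=2,
    coverage=3,
)
inventory = expand_inventory(spec)
print(len(inventory))  # 2 locations * 2 measurables * 2 periods * 3 replicates
```

Keeping the specification separate from the expanded inventory means budget, pricing and allocation can operate on individual data points without touching the original question.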
![Page 27: Orchestrating Collective Intelligence](https://reader036.vdocument.in/reader036/viewer/2022062904/587d21401a28ab1c2f8b563b/html5/thumbnails/27.jpg)
Resource Scarcity and Access Risk
![Page 28: Orchestrating Collective Intelligence](https://reader036.vdocument.in/reader036/viewer/2022062904/587d21401a28ab1c2f8b563b/html5/thumbnails/28.jpg)
![Page 29: Orchestrating Collective Intelligence](https://reader036.vdocument.in/reader036/viewer/2022062904/587d21401a28ab1c2f8b563b/html5/thumbnails/29.jpg)
![Page 30: Orchestrating Collective Intelligence](https://reader036.vdocument.in/reader036/viewer/2022062904/587d21401a28ab1c2f8b563b/html5/thumbnails/30.jpg)
Average wait times are about 10 minutes longer in Maracaibo than in Caracas.
Police are present ~80% of the time in Maracaibo, but only 30-40% in Caracas.
![Page 31: Orchestrating Collective Intelligence](https://reader036.vdocument.in/reader036/viewer/2022062904/587d21401a28ab1c2f8b563b/html5/thumbnails/31.jpg)
Machine Learning
![Page 32: Orchestrating Collective Intelligence](https://reader036.vdocument.in/reader036/viewer/2022062904/587d21401a28ab1c2f8b563b/html5/thumbnails/32.jpg)
PLATFORM
![Page 33: Orchestrating Collective Intelligence](https://reader036.vdocument.in/reader036/viewer/2022062904/587d21401a28ab1c2f8b563b/html5/thumbnails/33.jpg)
allocation
PLATFORM
![Page 34: Orchestrating Collective Intelligence](https://reader036.vdocument.in/reader036/viewer/2022062904/587d21401a28ab1c2f8b563b/html5/thumbnails/34.jpg)
analytics
PLATFORM
![Page 35: Orchestrating Collective Intelligence](https://reader036.vdocument.in/reader036/viewer/2022062904/587d21401a28ab1c2f8b563b/html5/thumbnails/35.jpg)
quality control
PLATFORM
![Page 36: Orchestrating Collective Intelligence](https://reader036.vdocument.in/reader036/viewer/2022062904/587d21401a28ab1c2f8b563b/html5/thumbnails/36.jpg)
Optimizing Task Allocation
![Page 37: Orchestrating Collective Intelligence](https://reader036.vdocument.in/reader036/viewer/2022062904/587d21401a28ab1c2f8b563b/html5/thumbnails/37.jpg)
TASKS
![Page 38: Orchestrating Collective Intelligence](https://reader036.vdocument.in/reader036/viewer/2022062904/587d21401a28ab1c2f8b563b/html5/thumbnails/38.jpg)
![Page 39: Orchestrating Collective Intelligence](https://reader036.vdocument.in/reader036/viewer/2022062904/587d21401a28ab1c2f8b563b/html5/thumbnails/39.jpg)
locations
measurables
CAMPAIGN DEFINITION
![Page 40: Orchestrating Collective Intelligence](https://reader036.vdocument.in/reader036/viewer/2022062904/587d21401a28ab1c2f8b563b/html5/thumbnails/40.jpg)
![Page 41: Orchestrating Collective Intelligence](https://reader036.vdocument.in/reader036/viewer/2022062904/587d21401a28ab1c2f8b563b/html5/thumbnails/41.jpg)
![Page 42: Orchestrating Collective Intelligence](https://reader036.vdocument.in/reader036/viewer/2022062904/587d21401a28ab1c2f8b563b/html5/thumbnails/42.jpg)
![Page 43: Orchestrating Collective Intelligence](https://reader036.vdocument.in/reader036/viewer/2022062904/587d21401a28ab1c2f8b563b/html5/thumbnails/43.jpg)
![Page 44: Orchestrating Collective Intelligence](https://reader036.vdocument.in/reader036/viewer/2022062904/587d21401a28ab1c2f8b563b/html5/thumbnails/44.jpg)
locations
measurables
survey period 1
CAMPAIGN DEFINITION
![Page 45: Orchestrating Collective Intelligence](https://reader036.vdocument.in/reader036/viewer/2022062904/587d21401a28ab1c2f8b563b/html5/thumbnails/45.jpg)
locations
measurables
CAMPAIGN DEFINITION
survey period 1 survey period 2
![Page 46: Orchestrating Collective Intelligence](https://reader036.vdocument.in/reader036/viewer/2022062904/587d21401a28ab1c2f8b563b/html5/thumbnails/46.jpg)
locations
measurables
CAMPAIGN DEFINITION
survey period 1 survey period 2
![Page 47: Orchestrating Collective Intelligence](https://reader036.vdocument.in/reader036/viewer/2022062904/587d21401a28ab1c2f8b563b/html5/thumbnails/47.jpg)
locations
measurables
survey period 1
TASK ALLOCATION
user 1
user 2
user 3
allocation period: 1
![Page 48: Orchestrating Collective Intelligence](https://reader036.vdocument.in/reader036/viewer/2022062904/587d21401a28ab1c2f8b563b/html5/thumbnails/48.jpg)
locations
measurables
survey period 1
TASK ALLOCATION
user 1
user 2
user 3
allocation period: 1 2
![Page 49: Orchestrating Collective Intelligence](https://reader036.vdocument.in/reader036/viewer/2022062904/587d21401a28ab1c2f8b563b/html5/thumbnails/49.jpg)
locations
measurables
survey period 1
TASK ALLOCATION
user 1
user 2
user 3
allocation period: 1 2 3
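One way to read the task allocation slides: the locations × measurables grid for a survey period is the set of open tasks, and each allocation period hands the still-open tasks to users again. A minimal sketch under that reading follows; the round-robin rule, the per-user `capacity` and the `completes` callback are assumptions made for illustration, not the scheduler described in the talk.

```python
from itertools import product


def allocate(open_tasks, users, capacity):
    """Round-robin open tasks over users, at most `capacity` tasks per user."""
    assignment = {u: [] for u in users}
    # Slots cycle through the users: u1, u2, u3, u1, u2, u3, ...
    slots = [u for _ in range(capacity) for u in users]
    for task, user in zip(open_tasks, slots):  # zip stops when slots run out
        assignment[user].append(task)
    return assignment


def run_survey_period(tasks, users, capacity, completes, n_allocation_periods):
    """Re-allocate whatever is still open at the start of each allocation period."""
    open_tasks = list(tasks)
    history = []
    for period in range(1, n_allocation_periods + 1):
        assignment = allocate(open_tasks, users, capacity)
        done = {t for u, ts in assignment.items() for t in ts if completes(u, t)}
        open_tasks = [t for t in open_tasks if t not in done]
        history.append((period, assignment, len(open_tasks)))
        if not open_tasks:
            break
    return history


# Hypothetical example: a 2 x 2 grid, three users, "user 3" never completes.
tasks = list(product(["location A", "location B"], ["price", "stock"]))
users = ["user 1", "user 2", "user 3"]
history = run_survey_period(
    tasks, users, capacity=1,
    completes=lambda user, task: user != "user 3",
    n_allocation_periods=3,
)
for period, assignment, remaining in history:
    print(period, remaining)
```

Tasks that a user does not complete simply return to the pool, which is why a survey period can need several allocation periods.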
![Page 50: Orchestrating Collective Intelligence](https://reader036.vdocument.in/reader036/viewer/2022062904/587d21401a28ab1c2f8b563b/html5/thumbnails/50.jpg)
TASK COMPLETION RATE MODEL
Figure labels: payout, pTCR, “uptake risk”
Model features: user-history, task-history / location-history, task-user, location-user
Issues: data sparsity in marginal vs conditional, uptake counterfactuals (non-iid sampling), path-dependence / lock-in
Linear functional model
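The relationship between payout and predicted task completion rate (pTCR) can be sketched as follows. The slide says only “linear functional model”; the logistic link, the feature names and all the numbers here are assumptions chosen for illustration, not the model from the talk.

```python
import math


def predicted_tcr(payout, features, weights, payout_weight, bias):
    """Predicted completion rate: logistic link over a linear score (assumed form)."""
    score = bias + payout_weight * payout + sum(
        weights[name] * value for name, value in features.items()
    )
    return 1.0 / (1.0 + math.exp(-score))


def payout_for_target(target_tcr, features, weights, payout_weight, bias):
    """Smallest non-negative payout whose predicted rate reaches `target_tcr`."""
    if payout_weight <= 0:
        raise ValueError("inversion assumes completion increases with payout")
    logit = math.log(target_tcr / (1.0 - target_tcr))
    base = bias + sum(weights[n] * v for n, v in features.items())
    return max(0.0, (logit - base) / payout_weight)


# Hypothetical weights and features for one (user, task, location) triple.
weights = {"user_history": 1.0, "location_history": 0.5}
features = {"user_history": 0.2, "location_history": -1.0}
payout = payout_for_target(0.8, features, weights, payout_weight=0.8, bias=-1.0)
print(round(payout, 2), predicted_tcr(payout, features, weights, 0.8, -1.0))
```

Inverting the curve this way is one reading of dynamic pricing: a task with a poor location history needs a higher payout to reach the same completion rate.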
![Page 51: Orchestrating Collective Intelligence](https://reader036.vdocument.in/reader036/viewer/2022062904/587d21401a28ab1c2f8b563b/html5/thumbnails/51.jpg)
Exploration vs Survey Consistency
![Page 52: Orchestrating Collective Intelligence](https://reader036.vdocument.in/reader036/viewer/2022062904/587d21401a28ab1c2f8b563b/html5/thumbnails/52.jpg)
![Page 53: Orchestrating Collective Intelligence](https://reader036.vdocument.in/reader036/viewer/2022062904/587d21401a28ab1c2f8b563b/html5/thumbnails/53.jpg)
![Page 54: Orchestrating Collective Intelligence](https://reader036.vdocument.in/reader036/viewer/2022062904/587d21401a28ab1c2f8b563b/html5/thumbnails/54.jpg)
![Page 55: Orchestrating Collective Intelligence](https://reader036.vdocument.in/reader036/viewer/2022062904/587d21401a28ab1c2f8b563b/html5/thumbnails/55.jpg)
locations
measurables
period 1 period 2
![Page 56: Orchestrating Collective Intelligence](https://reader036.vdocument.in/reader036/viewer/2022062904/587d21401a28ab1c2f8b563b/html5/thumbnails/56.jpg)
![Page 57: Orchestrating Collective Intelligence](https://reader036.vdocument.in/reader036/viewer/2022062904/587d21401a28ab1c2f8b563b/html5/thumbnails/57.jpg)
TASK REFINEMENT
![Page 58: Orchestrating Collective Intelligence](https://reader036.vdocument.in/reader036/viewer/2022062904/587d21401a28ab1c2f8b563b/html5/thumbnails/58.jpg)
ITERATIVE LOCATION DISCOVERY
![Page 59: Orchestrating Collective Intelligence](https://reader036.vdocument.in/reader036/viewer/2022062904/587d21401a28ab1c2f8b563b/html5/thumbnails/59.jpg)
Exploration vs Survey Consistency
- Campaign layers: separate discovery and survey
- Iteratively refine attribute and geospatial targeting
- Monitor correlation in item responses and appearance of new attributes
- Monitor residual endogeneity
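Two of the monitors listed above, correlation in item responses and the appearance of new attributes, can be sketched in a few lines. The field names (`brand`, `price`, `size`) and the sample responses are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def new_attributes(previous, current, key):
    """Attribute values seen in `current` responses but not in `previous`."""
    seen = {r[key] for r in previous}
    return sorted({r[key] for r in current} - seen)


period_1 = [
    {"brand": "A", "price": 10, "size": 1},
    {"brand": "B", "price": 20, "size": 2},
    {"brand": "A", "price": 30, "size": 3},
]
period_2 = [
    {"brand": "A", "price": 12, "size": 1},
    {"brand": "C", "price": 25, "size": 2},
]

r = pearson([x["price"] for x in period_1], [x["size"] for x in period_1])
print(r, new_attributes(period_1, period_2, "brand"))
```

A new attribute value may signal that the discovery layer has found something the survey layer should start tracking, and a shift in item correlations between periods may signal that the campaign is no longer measuring the same thing.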
![Page 60: Orchestrating Collective Intelligence](https://reader036.vdocument.in/reader036/viewer/2022062904/587d21401a28ab1c2f8b563b/html5/thumbnails/60.jpg)
Fraud and Coalition Formation
![Page 61: Orchestrating Collective Intelligence](https://reader036.vdocument.in/reader036/viewer/2022062904/587d21401a28ab1c2f8b563b/html5/thumbnails/61.jpg)
![Page 62: Orchestrating Collective Intelligence](https://reader036.vdocument.in/reader036/viewer/2022062904/587d21401a28ab1c2f8b563b/html5/thumbnails/62.jpg)
Coalitions vs Referrals
- Referrals are necessary to reach most remote areas
- However, we need to be able to partition the Premise graph into independent subnetworks, e.g. for re-evaluation, experimentation and sample stratification.
![Page 63: Orchestrating Collective Intelligence](https://reader036.vdocument.in/reader036/viewer/2022062904/587d21401a28ab1c2f8b563b/html5/thumbnails/63.jpg)
CONTRIBUTOR AFFINITY MODEL
Model features: direct referral, account features, upload location, visit histories, geographic area, response correlation
Issues: bootstrapping affinity scores for new users; the optimal scheduler is antagonistic to coalition discovery
Sampling from Large Graphs [Leskovec & Faloutsos; 2006]
weight
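A rough sketch of how pairwise features could become an edge weight, and how those weights could split contributors into independent subnetworks. It uses a weighted sum with a threshold followed by connected components (union-find). That is an assumption made for illustration: it is neither the affinity model from the talk nor the graph-sampling methods of the cited paper, and the weights, threshold and names are hypothetical.

```python
def affinity(features, weights):
    """Weighted sum of pairwise features; the weights are illustrative only."""
    return sum(weights[name] * value for name, value in features.items())


def partition(contributors, pair_features, weights, threshold):
    """Group contributors linked by an affinity above `threshold`."""
    parent = {c: c for c in contributors}

    def find(c):
        while parent[c] != c:
            parent[c] = parent[parent[c]]  # path halving
            c = parent[c]
        return c

    for (a, b), feats in pair_features.items():
        if affinity(feats, weights) > threshold:
            parent[find(a)] = find(b)  # merge the two groups

    groups = {}
    for c in contributors:
        groups.setdefault(find(c), []).append(c)
    return sorted(sorted(g) for g in groups.values())


contributors = ["ana", "ben", "cruz", "dee"]
weights = {"direct_referral": 1.0, "response_correlation": 0.5}
pair_features = {
    ("ana", "ben"): {"direct_referral": 1, "response_correlation": 0.9},
    ("ben", "cruz"): {"direct_referral": 0, "response_correlation": 0.2},
    ("cruz", "dee"): {"direct_referral": 1, "response_correlation": 0.1},
}
groups = partition(contributors, pair_features, weights, threshold=0.8)
print(groups)
```

Groups produced this way could serve as strata for re-evaluation or experiments, since contributors in different groups share no high-affinity link.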
![Page 64: Orchestrating Collective Intelligence](https://reader036.vdocument.in/reader036/viewer/2022062904/587d21401a28ab1c2f8b563b/html5/thumbnails/64.jpg)
RECAP
- Orchestrating collective intelligence
- Optimizing task allocation via dynamic scheduling and incentives
- Exploration and discovery while maintaining survey consistency
- Fraud and coalition formation in networks
![Page 66: Orchestrating Collective Intelligence](https://reader036.vdocument.in/reader036/viewer/2022062904/587d21401a28ab1c2f8b563b/html5/thumbnails/66.jpg)
Figure labels: PROOF, AUTO QC, MANUAL QC, REVALIDATION
![Page 67: Orchestrating Collective Intelligence](https://reader036.vdocument.in/reader036/viewer/2022062904/587d21401a28ab1c2f8b563b/html5/thumbnails/67.jpg)
“The problem of changing statistics is that you lose the ability to compare across time. The longer the time-series, the harder it is to change it, but you want to be able to compare. How do you replace GDP? And if you do, you lose the past sixty years of relevance. This has been a problem for centuries—take the Spanish silver trade. Anything you measure will become increasingly irrelevant over time.”
— Hans Rosling
[Zachary Karabell, The Leading Indicators]
![Page 68: Orchestrating Collective Intelligence](https://reader036.vdocument.in/reader036/viewer/2022062904/587d21401a28ab1c2f8b563b/html5/thumbnails/68.jpg)
![Page 69: Orchestrating Collective Intelligence](https://reader036.vdocument.in/reader036/viewer/2022062904/587d21401a28ab1c2f8b563b/html5/thumbnails/69.jpg)
![Page 70: Orchestrating Collective Intelligence](https://reader036.vdocument.in/reader036/viewer/2022062904/587d21401a28ab1c2f8b563b/html5/thumbnails/70.jpg)
“You need to focus on quality. You’ll be better off with a small but carefully structured sample rather than a large sloppy sample.”
— Hal Varian, Google
![Page 71: Orchestrating Collective Intelligence](https://reader036.vdocument.in/reader036/viewer/2022062904/587d21401a28ab1c2f8b563b/html5/thumbnails/71.jpg)
“Big Data is bullshit”
— Harper Reed
![Page 72: Orchestrating Collective Intelligence](https://reader036.vdocument.in/reader036/viewer/2022062904/587d21401a28ab1c2f8b563b/html5/thumbnails/72.jpg)
Big Data, n.: the belief that any sufficiently large pile of shit contains a pony with probability approaching one
—@grimmelm
![Page 73: Orchestrating Collective Intelligence](https://reader036.vdocument.in/reader036/viewer/2022062904/587d21401a28ab1c2f8b563b/html5/thumbnails/73.jpg)
![Page 74: Orchestrating Collective Intelligence](https://reader036.vdocument.in/reader036/viewer/2022062904/587d21401a28ab1c2f8b563b/html5/thumbnails/74.jpg)
“dividing by bieber”
![Page 75: Orchestrating Collective Intelligence](https://reader036.vdocument.in/reader036/viewer/2022062904/587d21401a28ab1c2f8b563b/html5/thumbnails/75.jpg)
![Page 76: Orchestrating Collective Intelligence](https://reader036.vdocument.in/reader036/viewer/2022062904/587d21401a28ab1c2f8b563b/html5/thumbnails/76.jpg)