Game Driven Software Development for NPOs: the Scientific Community Game (SCG)

Upload: mira

Post on 31-Jan-2016



TRANSCRIPT

Page 1: Game Driven Software Development for NPOs

Game Driven Software Development for NPOs

the Scientific Community Game (SCG)

Page 2: Game Driven Software Development for NPOs

NP Optimization Problems (NPOs)

• NPOs are approximated using (ensembles of) heuristics.

• Foster development and innovation of heuristics.

Page 3: Game Driven Software Development for NPOs

Fostering Heuristics Development

• Feedback!

• Analyze the performance of heuristics in a niche to form better ensembles.

• Parameter tuning.

• Bug fixes.

Page 4: Game Driven Software Development for NPOs

Fostering Heuristics Innovation

• Analyzing the niches within the problem domain.

• Constructing hard problems.

• Hints!

Page 5: Game Driven Software Development for NPOs

Game Driven Software Development

• A number of autonomous teams.

• Each team develops an agent that embodies their own heuristics.

• Agents participate in a contest.

• Contest winners get an egoistic boost.

• Teams develop their agents for the next contest.

Page 6: Game Driven Software Development for NPOs

Game Driven Development Has Worked Before!

• Renaissance mathematicians.

• SAT competitions.

Page 7: Game Driven Software Development for NPOs

The SCG(X) Game

Page 8: Game Driven Software Development for NPOs

The SCG(X) Game

• X is a specific, predefined NPO domain, e.g. Boolean MAX-CSP.

• In every round, agents must propose new hypotheses and oppose other agents’ hypotheses.

• Agents oppose hypotheses by either strengthening or challenging them.

Page 9: Game Driven Software Development for NPOs

The SCG(X) Game [cont.]

• Agents gain reputation when they strengthen hypotheses.

• Agents challenge hypotheses by engaging in a discounting protocol.

• Agents gain reputation when they discount hypotheses and lose reputation when they fail to do so.

Page 10: Game Driven Software Development for NPOs

The SCG(X) Game [cont.]

• All agents start with the same reputation. The sum of all reputations is preserved.

• Agent(s) with the highest reputation win(s).

Page 11: Game Driven Software Development for NPOs

We Didn’t Tell You ...

• What do hypotheses look like?

• What is the discounting protocol?

• How much reputation do agents gain/lose when they strengthen/discount hypotheses?

Page 12: Game Driven Software Development for NPOs

Hypothesis

• Alice’s hypothesis: there exists a problem P in niche N of X such that, for all solutions S_Bob searched by the opponent Bob in T seconds, Quality(P, S_Bob) < AR * Quality(P, S_Alice).

• Hypotheses have an associated confidence in [0,1].

• Hypothesis: <N, AR, Confidence>.

SQ = Quality(P, S_Alice)
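As a rough sketch, the triple <N, AR, Confidence> could be modeled as a small record. The class name `Hypothesis`, its field names, and the helper `is_discounted` are illustrative assumptions, not part of SCG(X) itself; the sample values are those of the 1in3 hypothesis used in this deck.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hypothesis:
    """Illustrative record for <N, AR, Confidence>; all names are assumptions."""
    niche: str          # N, e.g. "1in3"
    ar: float           # AR, the claimed bound on the opponent's relative quality
    confidence: float   # proposer's confidence, must lie in [0, 1]

    def __post_init__(self):
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must lie in [0, 1]")

    def is_discounted(self, quality_bob: float, sq: float) -> bool:
        """True if Bob's solution violates Quality(P, S_Bob) < AR * SQ."""
        return quality_bob >= self.ar * sq


h = Hypothesis(niche="1in3", ar=0.4, confidence=0.8)
print(h.is_discounted(quality_bob=0.5, sq=0.75))  # True: 0.5 is not below 0.4 * 0.75
```

Keeping the record immutable reflects the idea that a proposed hypothesis is a fixed claim that opponents then strengthen or challenge.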

Page 13: Game Driven Software Development for NPOs

Hypothesis [Example]

• 1in3 example.

Page 14: Game Driven Software Development for NPOs

X = Boolean MAXCSP

• Given a sequence of Boolean constraints formulated using a set R of Boolean relations, find an assignment that maximizes the fraction of satisfied constraints.

• Is an NPO for most R. Decision version is NP-complete for most R.

• Niche defined by R.


Page 15: Game Driven Software Development for NPOs

1in3 niche

• Only relation 1in3 is used.

• 1in3 problem P:

variables:   v1 v2 v3 v4 v5
constraints: 1in3(v1 v2 v3)
             1in3(v2 v4 v5)
             1in3(v1 v3 v4)
             1in3(v3 v4 v5)
secret:      1  0  0  1  0

Truth Table 1in3

000 0
001 1
010 1
011 0
100 1
101 0
110 0
111 0

Secret quality SQ = 3/4
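The secret quality can be checked mechanically. The following minimal sketch encodes problem P and the secret assignment from the slide; the function and constant names are chosen here for illustration only.

```python
from fractions import Fraction

# The 1in3 problem P from the slide: each constraint names three variables.
CONSTRAINTS = [("v1", "v2", "v3"), ("v2", "v4", "v5"),
               ("v1", "v3", "v4"), ("v3", "v4", "v5")]
SECRET = {"v1": 1, "v2": 0, "v3": 0, "v4": 1, "v5": 0}


def one_in_three(a, b, c):
    """Truth table of 1in3: satisfied iff exactly one argument is 1."""
    return a + b + c == 1


def quality(constraints, assignment):
    """Fraction of constraints satisfied by the assignment."""
    satisfied = sum(one_in_three(*(assignment[v] for v in c)) for c in constraints)
    return Fraction(satisfied, len(constraints))


print(quality(CONSTRAINTS, SECRET))  # 3/4
```

Only the third constraint, 1in3(v1 v3 v4), fails under the secret, because both v1 and v4 are 1.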


Page 16: Game Driven Software Development for NPOs

1in3 Hypothesis

• 1in3 hypothesis H proposed by Alice: there exists P in the 1in3 niche such that, for all S_Bob that the opponent Bob searches in t seconds (t a small constant), Quality(P, S_Bob) < 0.4 * Quality(P, S_Alice).

• H = (niche = (1in3), AR = 0.4, confidence = 0.8)

• Bob has clever knowledge that Alice does not have. He opposes the hypothesis H by challenging it using his randomized algorithm.

Page 17: Game Driven Software Development for NPOs

Bob’s clever knowledge: 4/9 for 1in3

• 4/9 for 1in3: For all P in 1in3 niche, exists S so that Quality(P,S) >= 0.444 * SQ.

• Proof: la(p) = 3*p*(1-p)^2 has the maximum 4/9.

• argmax p in [0,1] la(p) = 1/3.

• Without search, in PTIME.

• Derandomize

• Bob successfully discounts

• Alice gets a hint

• Was Bob just lucky?

Truth Table 1in3

000 0
001 1
010 1
011 0
100 1
101 0
110 0
111 0
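The slide does not say which derandomization Bob uses. One standard option is the method of conditional expectations, sketched below under that assumption: variables are fixed one at a time so that the expected number of satisfied constraints, with all unfixed variables set to 1 independently with probability 1/3, never decreases. All function names are hypothetical, and the sketch assumes the three variables of each constraint are distinct.

```python
from fractions import Fraction

P_ONE = Fraction(1, 3)  # argmax of la(p) on [0, 1]


def la(p):
    """Probability that 1in3 holds for three independent variables, each 1 with probability p."""
    return 3 * p * (1 - p) ** 2


def expected_satisfied(constraints, fixed, p=P_ONE):
    """Expected number of satisfied constraints: variables in `fixed` keep
    their value, every other variable is 1 independently with probability p."""
    total = Fraction(0)
    for c in constraints:
        prob_one = [Fraction(fixed[v]) if v in fixed else p for v in c]
        for i in range(3):  # probability that position i holds the single 1
            term = prob_one[i]
            for j in range(3):
                if j != i:
                    term *= 1 - prob_one[j]
            total += term
    return total


def derandomize(variables, constraints):
    """Fix variables one at a time, never lowering the conditional expectation."""
    fixed = {}
    for v in variables:
        fixed[v] = max((0, 1), key=lambda b: expected_satisfied(constraints, {**fixed, v: b}))
    return fixed


constraints = [("v1", "v2", "v3"), ("v2", "v4", "v5"),
               ("v1", "v3", "v4"), ("v3", "v4", "v5")]
assignment = derandomize(["v1", "v2", "v3", "v4", "v5"], constraints)
satisfied = sum(sum(assignment[v] for v in c) == 1 for c in constraints)
print(la(P_ONE), Fraction(satisfied, len(constraints)))
```

Because the initial expectation is 4/9 of the constraints and no step lowers it, the final assignment satisfies at least a 4/9 fraction, in polynomial time and without search.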

Page 18: Game Driven Software Development for NPOs

1in3 Hypothesis

• Bob does not know whether 4/9 is best possible. Should check Semidefinite Programming.

• Bob only knows that the set of 1in3 problems having a solution that satisfies a fraction 4/9 + eps of the constraints, eps > 0, is NP-complete.

Page 19: Game Driven Software Development for NPOs

Opposing Hypotheses

Hypothesis: exists P for all S that opponent searches: Quality(P,S) < AR * SQ

• AR is too low: Challenge. Hypothesis proposer provides a problem. Opponent solves it.

• AR is too high: Strengthen. Opponent proposes a hypothesis with a lower AR.

Page 20: Game Driven Software Development for NPOs

Reputation Gain

• Hypotheses have a credibility in [0,∞). The credibility of a hypothesis is proportional to the agent’s confidence in the hypothesis and to the agent’s reputation.

• Reputation gain is proportional to the discounting factor and the hypothesis credibility.

• The discounting factor lies in [-1,1]. 1 means the hypothesis is completely discounted.
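The two proportionality statements can be combined into one formula. The sketch below assumes both proportionality constants are 1, which matches the arithmetic used in this deck's worked examples (factor * confidence * proposer reputation); the function names are illustrative.

```python
def credibility(confidence: float, reputation: float) -> float:
    """Credibility of a hypothesis: confidence in [0, 1] times the proposer's reputation."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must lie in [0, 1]")
    return confidence * reputation


def reputation_gain(discounting_factor: float, confidence: float, reputation: float) -> float:
    """Opponent's gain: discounting factor in [-1, 1] times hypothesis credibility."""
    if not -1.0 <= discounting_factor <= 1.0:
        raise ValueError("discounting factor must lie in [-1, 1]")
    return discounting_factor * credibility(confidence, reputation)


print(reputation_gain(1.0, 1.0, 120.0))   # fully discounted: 120.0
print(reputation_gain(-0.5, 0.8, 100.0))  # a negative gain is a loss: -40.0
```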

Page 21: Game Driven Software Development for NPOs

Discounting Factor

Hypothesis: exists P for all S that opponent searches: Quality(P,S) < AR * SQ

• AR is too low: the opponent challenges with a solution S’: Quality(P,S’) - AR * SQ

• AR is too high: the opponent strengthens: AR - AR’.

Page 22: Game Driven Software Development for NPOs

Discounting Factor

• H1 = ((1in3), AR = 1.0, confidence = 1.0)

• H1 proposed by Alice: exists P in 1in3 niche so that for all S that opponent Bob searches: Quality(P,S) < 1.0 * SQ.

• This is a reasonable hypothesis if Alice is sure that her secret assignment is the maximum assignment when she provides a sufficiently big problem to Bob.


Page 23: Game Driven Software Development for NPOs

What we did not tell you so far

• A game defines some configuration constants:

• A maximum problem size. For example, all problems in the niche can have at most 1 million constraints.

• A maximum time bound for all tasks (propose, oppose, provide, solve), e.g. 60 seconds.

• An initial reputation, e.g., 100. When reputation becomes negative, agent has lost.
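These constants could be grouped in a single configuration object. The sketch below is hypothetical: the class `GameConfig`, its field names, and `has_lost` are invented for illustration, and the default values are simply the examples given on the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GameConfig:
    """Hypothetical container for the configuration constants of one game."""
    max_problem_size: int = 1_000_000   # at most 1 million constraints per problem
    max_task_seconds: float = 60.0      # bound for propose, oppose, provide, solve
    initial_reputation: float = 100.0   # same starting reputation for every agent


def has_lost(reputation: float) -> bool:
    """An agent has lost once its reputation becomes negative."""
    return reputation < 0


config = GameConfig()
print(config.initial_reputation, has_lost(-0.5))  # 100.0 True
```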


Page 24: Game Driven Software Development for NPOs

Discounting Factor: ReputationGain for Strengthening

• H1 = ((1in3), AR = 1.0, confidence = 1.0)

• H1 proposed by Alice: exists P in 1in3 niche so that for all S that opponent Bob searches: Quality(P,S) < 1.0 * SQ.

• Bob thinks he can strengthen H1 to H2 = (MAXCSP, niche = secret ExistsForAll (1in3), AR = 0.9, confidence = 1.0).

• DiscountingFactor = 1.0 - 0.9 = 0.1.

• ReputationGain for Bob = 0.1 * 1.0 * AliceReputation.

• Alice gets her reputation back if she discounts H2.

Page 25: Game Driven Software Development for NPOs

Discounting Factor: ReputationGain for Discounting

• H = ((1in3), AR = 0.4, confidence = 1.0)

• H proposed by Alice: exists P in 1in3 niche so that for all S that opponent Bob searches: Quality(P,S) < 0.4 * SQ.

• Bob knows he can discount H based on this knowledge: 4/9 for 1in3.

• Let’s assume he achieves 0.45 on Alice’s problem.

• DiscountingFactor = 0.45 - 0.4 = 0.05.

• ReputationGain for Bob = 0.05 * 1.0 * AliceReputation.

Page 26: Game Driven Software Development for NPOs

Discounting Factor: ReputationGain for Supporting

• H = ((1in3), AR = 0.4, confidence = 1.0)

• H proposed by Alice: exists P in 1in3 niche so that for all S that opponent Bob searches: Quality(P,S) < 0.4 * SQ.

• Bob knows he can discount H based on this knowledge: 4/9 for 1in3.

• Let’s assume he achieves 0.3 on Alice’s problem. Bob has a bug somewhere!

• DiscountingFactor = 0.3 - 0.4 = -0.1.

• ReputationLoss for Bob = -0.1 * 1.0 * AliceReputation.
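The three worked cases (strengthening, discounting, supporting) can be replayed as reputation transfers. This is a sketch under two assumptions: Alice and Bob both hold the example initial reputation of 100, and whatever Bob gains is taken from Alice, which is one way to keep the sum of reputations preserved. The function name `transfer` is hypothetical.

```python
def transfer(reputations, proposer, opposer, discounting_factor, confidence):
    """Move discounting_factor * confidence * proposer's reputation from
    proposer to opposer (negative amounts flow the other way)."""
    amount = discounting_factor * confidence * reputations[proposer]
    updated = dict(reputations)
    updated[proposer] -= amount
    updated[opposer] += amount
    return updated


start = {"Alice": 100.0, "Bob": 100.0}  # assumed initial reputation of 100

cases = {
    "strengthening": 1.0 - 0.9,   # AR - AR'
    "discounting": 0.45 - 0.4,    # Bob achieves 0.45 against AR = 0.4
    "supporting": 0.3 - 0.4,      # Bob achieves only 0.3
}
for name, factor in cases.items():
    after = transfer(start, "Alice", "Bob", factor, confidence=1.0)
    print(name, round(after["Bob"] - start["Bob"], 6), round(sum(after.values()), 6))
```

In each case the total stays at 200; only the supporting case moves reputation from Bob to Alice.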

Page 27: Game Driven Software Development for NPOs

Mechanism Design

• The exact SCG(X) mechanism is still a work in progress.

• SCG(X) mechanism must be sound:

• Encourage productive behavior and discourage unproductive behavior of scientists.

• The agent with best heuristics wins.

Page 28: Game Driven Software Development for NPOs

Tools to facilitate use of SCG(X)

• Definition of X.

• Generate a client-server infrastructure for playing SCG(X) on the web.

• Administrator enforces SCG(X) rules: client.

• Baby agents: servers. They can communicate and play an uninteresting game.

• Baby agents get improved by their caregivers, register with Administrator and the game begins at midnight.

Page 29: Game Driven Software Development for NPOs

SCIENTIFIC COMMUNITY

Page 30: Game Driven Software Development for NPOs

Productive Scientific Behavior (1)

• The agents propose hypotheses that are difficult to strengthen or challenge (i.e. non-trivial yet correct). Otherwise, they lose reputation to their opponents.

• Offer results that cannot be easily improved.

• Offer results that they can successfully support.

Page 31: Game Driven Software Development for NPOs

Productive Scientific Behavior (2)

• Agents are encouraged to propose hypotheses they are not sure about. But they need to fairly express their confidence in their hypotheses.

• If the confidence is inappropriately high, they lose too much reputation if the hypothesis is successfully discounted.

• If the confidence is inappropriately low, they don’t win enough reputation if the hypothesis is successfully supported.

• publish results of an experimental nature with an appropriate confidence level.

Page 32: Game Driven Software Development for NPOs

Productive Scientific Behavior (3)

• Agents stay active. In each “round”, they must propose new hypotheses and oppose other agents’ hypotheses.

• stay active and publish new hypotheses or oppose current hypotheses.

• Agents maximize their reputation.

• become famous!

Page 33: Game Driven Software Development for NPOs

Productive Scientific Behavior (4)

• When Alice loses reputation to Bob, Alice can learn from Bob:

• Alice has a bug in her software.

• Bob has skills superior to hers. Alice should try to acquire Bob’s skills.

• Learn from mistakes.

• Be careful how you oppose a Nobel Laureate. The risks are high.

Page 34: Game Driven Software Development for NPOs

Unproductive Scientific Behavior

• Cheating is forbidden: you can only succeed through good scientific behavior (by adding useful hypotheses or by successfully opposing hypotheses in the knowledge base).

Page 35: Game Driven Software Development for NPOs

Fair Scientific Community

• All agents start with the same initial reputation.

• The winner has the best skills in domain X within the set of participating agents.

Page 36: Game Driven Software Development for NPOs

Applications

Page 37: Game Driven Software Development for NPOs

Improving the research approach

• Problem to be solved: Develop the best practical algorithms for solving NPO X.

• Standard solution: Write hundreds of papers on the topic with isolated implementations. What are the best practical algorithms?

• Our solution: Use the virtual scientific agent community SCG(X) with a suitably designed hypotheses language to compare the algorithms. The winning agent has the best practical algorithms.

Page 38: Game Driven Software Development for NPOs

Teaching: Survival Skills in SCG(X)

• Needed when the agent caregiver is human.

• Knowledge about domain X must be developed by students, or taught to them, then understood and put into the algorithms (propose-oppose(strengthen-challenge)-provide-solve) that go into the agent.

• This tests both understanding of X and programming skills.

Page 39: Game Driven Software Development for NPOs

Teaching: Survival Skills in SCG(X) [cont.]

• [Scientific Innovation in X] Agents get skills programmed into them by clever scientists in domain X. Scientists use data mining to learn from competitions and manually improve the agents.

• [Machine Learning Innovation in X] Agents get skills programmed into them by an agent caregiver programmed with learning and data mining skills for domain X. Agents are updated automatically between competitions, so they improve without manual intervention.

Page 40: Game Driven Software Development for NPOs

Possible Application Domain For DM/ML/AI

Page 41: Game Driven Software Development for NPOs

SCG(X) produces history

• Proposer’s reputation: 120

• Hypothesis10 proposer1 opposer2 confidence 1

• Problem delivered

• Solution found: discountFactor = 1

• Opposer: increase in reputation: 1 * 1 * 120 = 120

Page 42: Game Driven Software Development for NPOs

Blame assignment

• Where is the proposer to blame?

• Bad hypothesis that is discountable.

• Bug in problem finding algorithm.

• Bug in problem solving algorithm used to check proposed hypothesis.

Page 43: Game Driven Software Development for NPOs

Creating Agents

• An agent is composed of 6 components: Agent = <Prop, Opp, Str, Cha, Prov, Sol>.

• Components can refer to each other.

• Given a set of agents: Agent1 ... Agentn

• Composed agent is a 12-tuple: <PropI, PropO, OppI, OppO, StrI, StrO, ChaI, ChaO, ProvI, ProvO, SolI, SolO>.

• <Prop3, (01101),Opp4, (00000), …>

Components: Propose, Oppose, Strengthen, Challenge, Provide, Solve

Bits: 1 = own, 0 = other

Page 44: Game Driven Software Development for NPOs

Creating Agents [cont.]

• PropI, OppI, StrI, ChaI, ProvI, SolI ∈ [1..n].

• PropO consists of 5 bits, each denoting one of the other components. The first bit describes whether to use the opposition component of agent PropI or agent OppI.
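One possible reading of this encoding is sketched below. It assumes that the five bits follow the order Propose, Oppose, Strengthen, Challenge, Provide, Solve with the component itself skipped, and that bit 1 ("own") selects the component from the same source agent while 0 ("other") selects the agent chosen for that component. The class `ComposedAgent`, the method `resolve`, and every entry beyond Prop3, (01101), Opp4, (00000) are made up for illustration.

```python
from dataclasses import dataclass

ROLES = ("Prop", "Opp", "Str", "Cha", "Prov", "Sol")


@dataclass
class ComposedAgent:
    index: dict  # role -> agent number in 1..n (PropI, OppI, ...)
    own: dict    # role -> 5-bit string (PropO, OppO, ...); 1 = own, 0 = other

    def resolve(self, caller: str, callee: str) -> int:
        """Agent whose `callee` component runs when `caller` refers to it."""
        others = [r for r in ROLES if r != caller]  # assumed bit order
        bit = self.own[caller][others.index(callee)]
        return self.index[caller] if bit == "1" else self.index[callee]


# <Prop3, (01101), Opp4, (00000), ...>; the remaining entries are made up here.
agent = ComposedAgent(
    index={"Prop": 3, "Opp": 4, "Str": 1, "Cha": 1, "Prov": 2, "Sol": 2},
    own={"Prop": "01101", "Opp": "00000", "Str": "11111",
         "Cha": "11111", "Prov": "11111", "Sol": "11111"},
)
print(agent.resolve("Prop", "Opp"), agent.resolve("Prop", "Str"))  # 4 3
```

With PropO = 01101, the first bit is 0, so the Propose component taken from agent 3 calls the Oppose component of agent OppI = 4 rather than agent 3's own.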

Page 45: Game Driven Software Development for NPOs

Conclusions

• We have shown how a virtual scientific community of agents can foster the development and innovation of heuristics for approximating NPOs.

• We need your input on how DM and ML could help with evolving the agents.

Page 46: Game Driven Software Development for NPOs

Questions?

Page 47: Game Driven Software Development for NPOs

10/16/09 Can DM and ML help?

Discounting

• If Alice offers the belief (FourColorConjecture, confidence = 1.0), she must be ready to support it.

– The opponent Bob gives Alice a planar graph.

– Alice must deliver a 4-coloring.

• If she does not, Bob has successfully discounted Alice’s belief: Alice loses reputation and Bob gains.

• If she does, Alice has successfully defended her belief: Alice wins reputation and the opponent Bob loses.

– Note that discounting is different from finding a counterexample. If Alice loses she has a “fault” in her coloring algorithm.

Page 48: Game Driven Software Development for NPOs


Beliefs: Four color conjecture

• FourColorConjecture: For all graphs g satisfying the predicate planar(g) there exists a 4-coloring of the nodes of g such that no two adjacent nodes have the same color.

• ForAllExists belief: For all problems p satisfying predicate pred(p) there exists a solution s satisfying a property(p,s).
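For the four color belief, property(p,s) is cheap to verify once Alice delivers a coloring. A minimal sketch of such a check follows; the function name and the sample graph are illustrative, and the sketch does not test the predicate planar(g), which is assumed to be checked separately.

```python
def is_proper_coloring(edges, coloring, colors=4):
    """Check property(p, s): every edge endpoint has one of `colors` colors
    and no edge joins two nodes of the same color."""
    if any(c not in range(colors) for c in coloring.values()):
        return False
    return all(
        u in coloring and v in coloring and coloring[u] != coloring[v]
        for u, v in edges
    )


# A small planar graph (a 4-cycle with one chord), chosen only for illustration.
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a"), ("a", "c")]
alice = {"a": 0, "b": 1, "c": 2, "d": 1}
print(is_proper_coloring(edges, alice))  # True
```

If the check fails, Bob has discounted the belief in the sense of the game, even though no counterexample to the conjecture has been found.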

Page 49: Game Driven Software Development for NPOs

• Undiscounted beliefs represent the accumulated shared knowledge gained from the game. (Requires negation and reoffer of discounted beliefs?)
