grigory yaroslavtsev joint work with piotr berman and sofya raskhodnikova

20
-Testing Grigory Yaroslavtsev Joint work with Piotr Berman and Sofya Raskhodnikova

Upload: marie-bond

Post on 01-Apr-2015

218 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Grigory Yaroslavtsev Joint work with Piotr Berman and Sofya Raskhodnikova

-Testing

Grigory Yaroslavtsev

Joint work with Piotr Berman and Sofya Raskhodnikova

⇒ ⇒

Page 2: Grigory Yaroslavtsev Joint work with Piotr Berman and Sofya Raskhodnikova

Testing Big Data

• Q: How to understand properties of large data looking only at a small sample?

• Q: How to ignore noise and outliers?• Q: How to minimize assumptions about the

sample generation process?• Q: How to optimize running time?

Page 3: Grigory Yaroslavtsev Joint work with Piotr Berman and Sofya Raskhodnikova

Which stocks were growing steadily?

Data from http://finance.google.com

Page 4: Grigory Yaroslavtsev Joint work with Piotr Berman and Sofya Raskhodnikova

Property Testing [Goldreich, Goldwasser, Ron; Rubinfeld, Sudan]

NO

YES

Randomized Algorithm

Accept with probability

Reject with probability

⇒⇒

YES

NO

Property Tester

-close

Accept with probability

Reject with probability

⇒Don’t care

-close : fraction has to be changed to become YES

Page 5: Grigory Yaroslavtsev Joint work with Piotr Berman and Sofya Raskhodnikova

Tolerant Property Testing [Parnas, Ron, Rubinfeld]

-close : fraction has to be changed to become YES

YES

NO

Property Tester

-close

Accept with probability

Reject with probability

⇒Don’t care

Tolerant Property Tester

Accept with probability

Reject with probability

⇒Don’t care

NO

-close-close

YES

Page 6: Grigory Yaroslavtsev Joint work with Piotr Berman and Sofya Raskhodnikova

Which stocks were growing steadily?

Data from http://finance.google.com

Page 7: Grigory Yaroslavtsev Joint work with Piotr Berman and Sofya Raskhodnikova

• = class of monotone functions

• -close:

Tolerant “Property Testing”

Tolerant “ Property Tester”

Accept with probability

Reject with probability

⇒Don’t care

NO

-close-close

YES

Page 8: Grigory Yaroslavtsev Joint work with Piotr Berman and Sofya Raskhodnikova

8

New -Testing Model for Real-Valued Data

• Generalizes standard Hamming testing

• For still have a probabilistic interpretation:

• Compatible with existing PAC-style learning models (preprocessing for model selection)

• For Boolean functions, .

Page 9: Grigory Yaroslavtsev Joint work with Piotr Berman and Sofya Raskhodnikova

9

Our Contributions

1. Relationships between -testing models2. Algorithms

– -testers for • monotonicity, Lipschitz, convexity

– Tolerant -tester for • monotonicity in 1D (sublinear algorithm for isotonic regression)

Our -testers beat lower bounds for Hamming testersSimple algorithms backed up by involved analysisUniformly sampled (or easy to sample) data suffices

3. Nearly tight lower bounds

Page 10: Grigory Yaroslavtsev Joint work with Piotr Berman and Sofya Raskhodnikova

10

Implications for Hamming Testing

Some techniques/results carry over to Hamming testing

– Improvement on Levin’s work investment strategy• Connectivity of bounded-degree graphs [Goldreich, Ron ‘02]

• Properties of images [Raskhodnikova ‘03]

• Multiple-input problems [Goldreich ‘13]

– First example of monotonicity testing problem where adaptivity helps

– Improvements to Hamming testers for Boolean functions

Page 11: Grigory Yaroslavtsev Joint work with Piotr Berman and Sofya Raskhodnikova

Definitions

• (D = finite domain/poset)• , for • Hamming weight (# of non-zero values)• Property = class of functions (monotone,

convex, linear, Lipschitz, …)

Page 12: Grigory Yaroslavtsev Joint work with Piotr Berman and Sofya Raskhodnikova

Relationships: -Testing

(,) = query complexity of -testing property at distance

• (Cauchy-Shwarz)

Boolean functions

Page 13: Grigory Yaroslavtsev Joint work with Piotr Berman and Sofya Raskhodnikova

Relationships: Tolerant -Testing(,) = query complexity of tolerant -testing property with distance parameters

• No general relationship between tolerant -testing and tolerant Hamming testing

• -testing for is close in complexity to -testing

For Boolean functions =

Page 14: Grigory Yaroslavtsev Joint work with Piotr Berman and Sofya Raskhodnikova

Our Results: Testing Monotonicity• Hypergrid ()

• adaptive tester for Boolean functions

Upper bound

[Dodis et al. ’99,…, Chakrabarti, Seshadhri ’13]

Lower bound

[Dodis et al.’99…, Chakrabarti, Seshadhri ’13]

Non-adaptive 1-sided error

Page 15: Grigory Yaroslavtsev Joint work with Piotr Berman and Sofya Raskhodnikova

Monotonicity: Key Lemma

• M = class of monotone functions• Boolean slicing operator

if otherwise.

• Theorem:

Page 16: Grigory Yaroslavtsev Joint work with Piotr Berman and Sofya Raskhodnikova

Proof sketch: slice and conquer

1) Closest monotone function with minimal -norm is unique (can be denoted as an operator ).

2) and commute:

=

2) 3)1)

Page 17: Grigory Yaroslavtsev Joint work with Piotr Berman and Sofya Raskhodnikova

-Testers from Boolean Testers Thm: A nonadaptive, 1-sided error -test for monotonicity of is also an -test for monotonicity of .Proof:• A violation :• A nonadaptive, 1-sided error test queries a random set and

rejects iff contains a violation.• If is monotone, will not contain a violation.• If then • W.p., set contains a violation for

12

𝒇 (𝒙) 𝒇 (𝒚 )¿

Page 18: Grigory Yaroslavtsev Joint work with Piotr Berman and Sofya Raskhodnikova

Distance Approximation and Tolerant Testing

[Saks Seshadhri 10]

Approximating -distance to monotonicity

• Time complexity of tolerant -testing for monotonicity is

– Better dependence than what follows from distance appoximation for

– Improves adaptive distance approximation of [Fattal,Ron’10] for Boolean functions

Page 19: Grigory Yaroslavtsev Joint work with Piotr Berman and Sofya Raskhodnikova

-Testers for Other PropertiesVia combinatorial characterization of -distance to the property• Lipschitz property :

Via (implicit) proper learning: approximate in up to error , test approximation on a random -sample• Convexity :

(tight for ) • Submodularity

[Feldman, Vondrak 13]

Page 20: Grigory Yaroslavtsev Joint work with Piotr Berman and Sofya Raskhodnikova

Open Problems• All our algorithms for for were obtained directly from -testers.Can one design better algorithms by working directly with -distances?• Our complexity for -testing convexity grows exponentially with d

Is there an -testing algorithm for convexity with subexponential dependence on the dimension?

• Our -tester for monotonicity is nonadaptive, but we show that adaptivity helps for Boolean range.

Is there a better adaptive tester?• We designed tolerant tester only for monotonicity (d=1,2).

Tolerant testers for higher dimensions? Other properties?