a 4 ‘ 1 1 0 1 1 0 1 1 1 0 1 1 0 0 1 1 0


Upload: elaina

Post on 13-Jan-2016


DESCRIPTION

UF NNC Ptree Ex. 1 using 0-D Ptrees (sequences), a = (a5, a6, a1', a2', a3', a4') = (000000). PowerPoint PPT Presentation

TRANSCRIPT

Page 1: a 4 ‘   1 1 0 1 1 0 1 1 1 0 1 1 0 0 1 1 0

Complemented sequences (a leading _ denotes the complement):

_a4' 11011011101100110
_a3' 01100000110110001
_a2' 00011110001001100
_a1' 11100101110111011

UF NNC Ptree Ex. 1 using 0-D Ptrees (sequences)

a = (a5, a6, a1', a2', a3', a4') = (000000)

C'11001011101100110

d1 d2
t1 t2
t1 t3
t1 t5
t1 t6
t2 t1
t2 t7
t3 t1
t3 t2
t3 t3
t3 t5
t5 t1
t5 t3
t5 t5
t5 t7
t6 t1
t7 t2
t7 t5

a1   11110000000000100
a2   00001111111111000
a3   11111100000000111
a4   00000000001111011
a5   00001111110000100
a6   00001100000000000
a7   11110000001111011
a8   11110000001111011
a9   00000011110000100
C    11111111110000000
a1'  00011010001000100
a2'  11100001110110011
a3'  10011111001001110
a4'  00100100010011001
a5'  11010001100100010
a6'  10000001000000010
a7'  00101110011011101
a8'  00101110011011101
a9'  01010000100100000

_a6  11110011111111111
_a5  11110000001111011

Identify all training tuples in the distance=0 ring (0-ring) centered at a, i.e. the exact matches, as the 1-bits of the Ptree

P = _a5 ^ _a6 ^ _a1' ^ _a2' ^ _a3' ^ _a4' (we use _ for complement)

There are no training points in a's 0-ring! We must look further out, i.e., at a's 1-ring.

P 00000000000000000

Vote histogram (so far), over classes 0 and 1
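The 0-ring computation on this slide can be reproduced with plain integers used as bit sequences. This is only a minimal sketch: the names `ATTRS`, `MASK`, and `complement` are illustrative and not from the slides, and a Python `int` stands in for a 0-D Ptree, with the leftmost bit taken as the first training tuple.

```python
N = 17                      # number of training tuples
MASK = (1 << N) - 1         # N one-bits

def complement(p: int) -> int:
    """Bitwise complement restricted to the N tuple positions."""
    return ~p & MASK

# Attribute bit sequences as listed on the slide (leftmost bit = first tuple).
ATTRS = {
    "a5":  "00001111110000100",
    "a6":  "00001100000000000",
    "a1'": "00011010001000100",
    "a2'": "11100001110110011",
    "a3'": "10011111001001110",
    "a4'": "00100100010011001",
}

# a = (000000): every attribute must be 0, so AND the six complements.
P = MASK
for bits in ATTRS.values():
    P &= complement(int(bits, 2))

zero_ring = format(P, f"0{N}b")   # bit sequence of the 0-ring
root_count = bin(P).count("1")    # number of training points in the 0-ring
print(zero_ring, root_count)
```

The AND comes out as all zeros, which agrees with the slide's statement that a has no exact match among the training tuples.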

Page 2:

C  11111111110000000

_C 00000000001111111

UF NNC Ptree Ex. 1 (cont.): a's 1-ring? a = (a5, a6, a1', a2', a3', a4') = (000000)

C'11001011101100110

d1 d2

t1 t2

t1 t3

t1 t5

t1 t6

t2 t1

t2 t7

t3 t1

t3 t2

t3 t3

t3 t5

t5 t1

t5 t3

t5 t5

t5 t7

t6 t1

t7 t2

t7 t5

a1

11110000000000100

a2 00001111111111000

a3

11111100000000111

a4

000000000 01111011

a5

00001111110000100

a6

00001100000000000

a7

11110000001111011

a8

11110000001111011

a9

00000011110000100

C11111111110000000

a1‘00011010001000100

a2‘ 11100001110110011

a3‘ 10011111001001110

a4‘ 00100100010011001

a5‘ 110100011001000 10

a6‘10000001000000010

a7‘ 00101110011011101

a8‘00101110011011101

a9‘01010000100100000

Training pts in the 1-ring centered at a are given by the 1-bits in the Ptree P, constructed as follows:

P 01000000000100000

Vote histogram, over classes 0 and 1

The C=1 vote count = root count of P ^ C. The C=0 vote count = root count of P ^ _C, where _C is the complement of C. (We never need to know which tuples voted.)
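A short sketch of the vote count, assuming bit sequences are held as Python integers and that "root count" means the number of 1-bits. The helper name `root_count` is illustrative; P and C are the sequences shown on this slide.

```python
N = 17
MASK = (1 << N) - 1

def root_count(p: int) -> int:
    """Number of 1-bits in a bit sequence (the Ptree root count)."""
    return bin(p).count("1")

P = int("01000000000100000", 2)   # 1-ring of a, from the slide
C = int("11111111110000000", 2)   # class label bit sequence

votes_1 = root_count(P & C)            # voters with C = 1
votes_0 = root_count(P & ~C & MASK)    # voters with C = 0 (complement of C)
print(votes_0, votes_1)
```

Only the two counts are produced; the positions of the voting tuples are never inspected.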

a4‘

11011011101100110

a3‘

01100000110110001

a2‘

00011110001001100

a1‘

11100101110111011

a6

11110011111111111

a5

00001111110000100

a4‘

11011011101100110

a3‘

01100000110110001

a2‘

00011110001001100

a1‘

11100101110111011

a6

0 0001100000000000

a5

1111000000111 1011

a4‘

11011011101100110

a3‘

01100000110110001

a2‘

00011110001001100

a1‘

00011010001000100

a6

11110011111111111

a5

1111000000111 1011

a4‘

11011011101100110

a3‘

01100000110110001

a2‘

1 1100001110110011

a1‘

11100101110111011

a6

11110011111111111

a5

1111000000111 1011

a4‘

11011011101100110

a3‘

1 0011111001001110

a2‘

00011110001001100

a1‘

11100101110111011

a6

11110011111111111

a5

1111000000111 1011

a4‘

0 0100100010011001

a3‘

01100000110110001

a2‘

00011110001001100

a1‘

11100101110111011

a6

11110011111111111

a5

1111000000111 1011

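P is the OR of the following six Ptrees, one for each point at distance 1 from a. In each AND, exactly one of (a5 a6 a1' a2' a3' a4') is left uncomplemented (a leading _ denotes the complement):

(100000): a5 ^ _a6 ^ _a1' ^ _a2' ^ _a3' ^ _a4'

OR

(010000): _a5 ^ a6 ^ _a1' ^ _a2' ^ _a3' ^ _a4'

OR

(001000): _a5 ^ _a6 ^ a1' ^ _a2' ^ _a3' ^ _a4'

OR

(000100): _a5 ^ _a6 ^ _a1' ^ a2' ^ _a3' ^ _a4'

OR

(000010): _a5 ^ _a6 ^ _a1' ^ _a2' ^ a3' ^ _a4'

OR

(000001): _a5 ^ _a6 ^ _a1' ^ _a2' ^ _a3' ^ a4'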
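The OR-of-ANDs construction of the 1-ring can be sketched as follows. The dictionary `ATTRS` and the loop structure are illustrative assumptions; the bit sequences are the ones listed on the slides, held as Python integers.

```python
N = 17
MASK = (1 << N) - 1

ATTRS = {
    "a5":  "00001111110000100",
    "a6":  "00001100000000000",
    "a1'": "00011010001000100",
    "a2'": "11100001110110011",
    "a3'": "10011111001001110",
    "a4'": "00100100010011001",
}

ring = 0
for keep in ATTRS:                       # the one attribute left uncomplemented
    term = MASK
    for name, bits in ATTRS.items():
        p = int(bits, 2)
        term &= p if name == keep else (~p & MASK)
    ring |= term                         # OR the six AND results together

one_ring = format(ring, f"0{N}b")
print(one_ring)
```

With these sequences the result has 1-bits at exactly two tuple positions, matching the P shown for the 1-ring.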

Page 3:

a’s 2-ring? a=a5 a6 a1’a2’a3’a4’ = (000000)

d1 d2

t1 t2

t1 t3

t1 t5

t1 t6

t2 t1

t2 t7

t3 t1

t3 t2

t3 t3

t3 t5

t5 t1

t5 t3

t5 t5

t5 t7

t6 t1

t7 t2

t7 t5

a5

0000111111000 0100

a6

00001100000000000

C11111111110000000

a1‘00011010001000100

a2‘ 11100001110110011

a3‘ 10011111001001110

a4‘ 00100100010011001

For each of the following 15 Ptrees, a 1-bit corresponds to a training point in a's 2-ring. Each Ptree is the AND of the six Ptrees a5, a6, a1', a2', a3', a4' with exactly two of them left uncomplemented and the other four complemented. Listed by uncomplemented pair:

1st line: (a5,a6) (a5,a1') (a5,a2') (a5,a3') (a5,a4')
2nd line: (a6,a1') (a6,a2') (a6,a3') (a6,a4')
3rd line: (a1',a2') (a1',a3') (a1',a4')
4th line: (a2',a3') (a2',a4')
5th line: (a3',a4')

1st line first:

a4‘

11011011101100110

a3‘

01100000110110001

a2‘

00011110001001100

a1‘

11100101110111011

a6

0 0001100000000000

a4‘

11011011101100110

a3‘

01100000110110001

a2‘

00011110001001100

a1‘

00011010001000100

a6

11110011111111111

a4‘

11011011101100110

a3‘

01100000110110001

a2‘

1 1100001110110011

a1‘

11100101110111011

a6

11110011111111111

a4‘

11011011101100110

a3‘

1 0011111001001110

a2‘

00011110001001100

a1‘

11100101110111011

a6

11110011111111111

a4‘

0 0100100010011001

a3‘

01100000110110001

a2‘

00011110001001100

a1‘

11100101110111011

a6

11110011111111111

a5

00001111110000100

a5

00001111110000100

a5

00001111110000100

a5

00001111110000100

a5

00001111110000100

0 1

(110000)(101000)(100100)(100010)(100001)

Stop here? But the other 10 Ptrees should also be considered. The fact that the 2-ring includes so many new training points is “the curse of dimensionality”.
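To see how quickly the rings grow, the construction can be generalized to any radius k: OR together one AND per choice of k uncomplemented attributes. The function `k_ring` below is a hypothetical helper, not something named in the slides, and the sketch assumes unweighted Hamming distance over the six relevant attributes.

```python
from itertools import combinations
from math import comb

N = 17
MASK = (1 << N) - 1

ATTRS = {
    "a5":  "00001111110000100",
    "a6":  "00001100000000000",
    "a1'": "00011010001000100",
    "a2'": "11100001110110011",
    "a3'": "10011111001001110",
    "a4'": "00100100010011001",
}
C = int("11111111110000000", 2)

def k_ring(k: int) -> int:
    """Training points at Hamming distance exactly k from a = (000000)."""
    ring = 0
    for flipped in combinations(ATTRS, k):   # attributes left uncomplemented
        term = MASK
        for name, bits in ATTRS.items():
            p = int(bits, 2)
            term &= p if name in flipped else (~p & MASK)
        ring |= term
    return ring

ptree_counts = [comb(len(ATTRS), k) for k in range(7)]      # ANDs needed per ring
ring_sizes = [bin(k_ring(k)).count("1") for k in range(7)]  # points per ring
two_ring = k_ring(2)
votes_1 = bin(two_ring & C).count("1")
votes_0 = bin(two_ring & ~C & MASK).count("1")
print(ptree_counts, ring_sizes, votes_0, votes_1)
```

The 2-ring needs 15 ANDs and already takes in more than half of the 17 training tuples, which is the effect the slide attributes to the curse of dimensionality.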

Page 4:

Enfranchising the rest of a’s 2-ring? a=a5 a6 a1’a2’a3’a4’ = (000000)

d1 d2

t1 t2

t1 t3

t1 t5

t1 t6

t2 t1

t2 t7

t3 t1

t3 t2

t3 t3

t3 t5

t5 t1

t5 t3

t5 t5

t5 t7

t6 t1

t7 t2

t7 t5

a5

0000111111000 0100

a6

00001100000000000

C11111111110000000

a1‘00011010001000100

a2‘ 11100001110110011

a3‘ 10011111001001110

a4‘ 00100100010011001

0 1

a4‘

11011011101100110

a3‘

01100000110110001

a2‘

00011110001001100

a1‘

00011010001000100

a6

0 0001100000000000

a5

1111000000111 1011

a4‘

11011011101100110

a3‘

01100000110110001

a2‘

11100001110110011

a1‘

11100101110111011

a6

0 0001100000000000

a5

1111000000111 1011

a4‘

11011011101100110

a3‘

10011111001001110

a2‘

00011110001001100

a1‘

11100101110111011

a6

0 0001100000000000

a5

1111000000111 1011

a4‘

00100100010011001

a3‘

01100000110110001

a2‘

00011110001001100

a1‘

11100101110111011

a6

0 0001100000000000

a5

1111000000111 1011

For each of the following 15 Ptrees, a 1-bit corresponds to a training point in a's 2-ring. Each Ptree is the AND of the six Ptrees a5, a6, a1', a2', a3', a4' with exactly two of them left uncomplemented and the other four complemented. Listed by uncomplemented pair:

1st line: (a5,a6) (a5,a1') (a5,a2') (a5,a3') (a5,a4')
2nd line: (a6,a1') (a6,a2') (a6,a3') (a6,a4')
3rd line: (a1',a2') (a1',a3') (a1',a4')
4th line: (a2',a3') (a2',a4')
5th line: (a3',a4')

2nd line:

Page 5:

Enfranchising the rest of a’s 2-ring (cont.) a=a5 a6 a1’a2’a3’a4’ = (000000)

d1 d2

t1 t2

t1 t3

t1 t5

t1 t6

t2 t1

t2 t7

t3 t1

t3 t2

t3 t3

t3 t5

t5 t1

t5 t3

t5 t5

t5 t7

t6 t1

t7 t2

t7 t5

a5

0000111111000 0100

a6

00001100000000000

C11111111110000000

a1‘00011010001000100

a2‘ 11100001110110011

a3‘ 10011111001001110

a4‘ 00100100010011001

0 1

a4‘

11011011101100110

a3‘

01100000110110001

a2‘

11100001110110011

a1‘

00011010001000100

a6

11110011111111111

a5

1111000000111 1011

a4‘

11011011101100110

a3‘

10011111001001110

a2‘

00011110001001100

a1‘

00011010001000100

a6

11110011111111111

a5

1111000000111 1011

a4‘

00100100010011001

a3‘

01100000110110001

a2‘

00011110001001100

a1‘

00011010001000100

a6

11110011111111111

a5

1111000000111 1011

For each of the following 15 Ptrees, a 1-bit corresponds to a training point in a's 2-ring. Each Ptree is the AND of the six Ptrees a5, a6, a1', a2', a3', a4' with exactly two of them left uncomplemented and the other four complemented. Listed by uncomplemented pair:

1st line: (a5,a6) (a5,a1') (a5,a2') (a5,a3') (a5,a4')
2nd line: (a6,a1') (a6,a2') (a6,a3') (a6,a4')
3rd line: (a1',a2') (a1',a3') (a1',a4')
4th line: (a2',a3') (a2',a4')
5th line: (a3',a4')

3rd line:

Page 6:

PNNC vote = 1/(1+d)

C'11001011101100110

d1 d2

t1 t2
t1 t3
t1 t5
t1 t6
t2 t1
t2 t7
t3 t1
t3 t2
t3 t3
t3 t5
t5 t1
t5 t3
t5 t5
t5 t7
t6 t1
t7 t2
t7 t5

a1

11110000000000100

a2 00001111111111000

a3

11111100000000111

a4

000000000 01111011

a5

0000111111000 0100

a6

00001100000000000

a7

11110000001111011

a8

11110000001111011

a9

00000011110000100

C11111111110000000

a1‘00011010001000100

a2‘ 11100001110110011

a3‘ 10011111001001110

a4‘ 00100100010011001

a5‘ 110100011001000 10

a6‘10000001000000010

a7‘ 00101110011011101

a8‘00101110011011101

a9‘01010000100100000

$d(p,q) = \sum \{\, w_i : p \text{ and } q \text{ differ at } i,\ i \in \text{relevant\_attribute\_set} \,\}$

0-ring at 0 1 1 0 1 1 0 0 0 0 0 1 1 0 0 1 1 0 1

Weights:

0 1 1 0 1 1 0 0 0 0 0 1 1 0 0 1 1 0 1

One way to address the curse of dimensionality is to require that all relevant attribute weights be different (except for small groups (3?) of equally weighted attributes), and that each next weight always be larger than the sum of the previous weights.

Weights: 1 2 2 6 12 24

a7' 00101110011011101

Pts in the 47-disk each get at least a vote of 1/48
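The weight rule can be checked mechanically. The sketch below takes the weights on this slide to be 1, 2, 2, 6, 12, 24, which is an assumption about how the weight row reads (they sum to 47, consistent with the 47-disk). The function name `satisfies_weight_rule` and the group limit of 3 are illustrative, and the vote formula 1/(1+d) is the one used elsewhere in the deck.

```python
from fractions import Fraction
from itertools import groupby

def satisfies_weight_rule(weights, max_group=3):
    """True if equal weights occur in groups of at most max_group and each
    new weight value exceeds the sum of all smaller weights."""
    total = 0
    for value, group in groupby(sorted(weights)):
        size = len(list(group))
        if size > max_group or value <= total:
            return False
        total += value * size
    return True

weights = [1, 2, 2, 6, 12, 24]          # assumed reading of the slide
ok = satisfies_weight_rule(weights)
max_distance = sum(weights)             # largest possible weighted distance
min_vote = Fraction(1, 1 + max_distance)
print(ok, max_distance, min_vote)
```

Dropping the heaviest weight leaves a maximum distance of 23 and a minimum vote of 1/24, and dropping the next leaves 11 and 1/12, which lines up with the vote bounds quoted on this slide.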

Vote Weight

.02 .02.02.02 .02.02 .02.02.02 .02

a5

0000111111000 0100

Vote Weight

.02 .06.06.06 .06.02 .02.02.06 .02

Pts in the 23-disk each get at least a vote of 1/24

a3‘

10011111001001110

Pts in the 11-disk each get at least a vote of 1/12

Vote Weight

.02 .12.12.12 .06.02 .02.02.12 .02

Page 7:

a4‘

11011011101100110

a3‘

10011111001001110

a2‘

00011110001001100

a1‘

11100101110111011

PNNC using weights (about the only way to address the curse of dimensionality)

“Gaussian” type of vote weighting = $e^{-\mathrm{dis}^2}$

C'11001011101100110

d1 d2

t1 t2
t1 t3
t1 t5
t1 t6
t2 t1
t2 t7
t3 t1
t3 t2
t3 t3
t3 t5
t5 t1
t5 t3
t5 t5
t5 t7
t6 t1
t7 t2
t7 t5

a1

11110000000000100

a2 00001111111111000

a3

11111100000000111

a4

000000000 01111011

a5

0000111111000 0100

a6

00001100000000000

a7

11110000001111011

a8

11110000001111011

a9

00000011110000100

C11111111110000000

a1‘00011010001000100

a2‘ 11100001110110011

a3‘ 10011111001001110

a4‘ 00100100010011001

a5‘ 110100011001000 10

a6‘10000001000000010

a7‘ 00101110011011101

a8‘00101110011011101

a9‘01010000100100000

a6

00001100000000000

a5

11110000001111011

$d(p,q) = \sum \{\, \mathrm{weight}_i : p \text{ and } q \text{ differ at } i \,\}$

P00000000000000000

1 5 2 2 9 4 4 9 9 2 1 6 6 9 9 9 9 9

Page 8:

a’s 1ring? a=a5 a6 a1’a2’a3’a4’ = (010010)

C'11001011101100110

d1 d2

t1 t2

t1 t3

t1 t5

t1 t6

t2 t1

t2 t7

t3 t1

t3 t2

t3 t3

t3 t5

t5 t1

t5 t3

t5 t5

t5 t7

t6 t1

t7 t2

t7 t5

a1

11110000000000100

a2 00001111111111000

a3

11111100000000111

a4

000000000 01111011

a5

00001111110000100

a6

00001100000000000

a7

11110000001111011

a8

11110000001111011

a9

00000011110000100

C11111111110000000

a1‘00011010001000100

a2‘ 11100001110110011

a3‘ 10011111001001110

a4‘ 00100100010011001

a5‘ 110100011001000 10

a6‘10000001000000010

a7‘ 00101110011011101

a8‘00101110011011101

a9‘01010000100100000

P00000000000000000

a4‘

11011011101100110

a3‘

10011111001001110

a2‘

00011110001001100

a1‘

11100101110111011

a6

00001100000000000

a5

00001111110000100

a4‘

11011011101100110

a3‘

10011111001001110

a2‘

00011110001001100

a1‘

11100101110111011

a6

11110011111111111

a5

1111000000111 1011

attribute weights: (1, 1, 3, 3, 3, 3)

vote weight = 1/(1+distance)

$d(p,q) = \sum \{\, \mathrm{weight}_i : p \text{ and } q \text{ differ at } i \,\}$

(110010)(000010)
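The weighted distance and vote on this slide can be worked through per training tuple. This is a sketch under stated assumptions: the weights (1, 1, 3, 3, 3, 3) are taken to apply in the order (a5, a6, a1', a2', a3', a4'), and the names `WEIGHTS`, `SAMPLE`, and `distance` are illustrative rather than from the slides.

```python
N = 17
ATTRS = {
    "a5":  "00001111110000100",
    "a6":  "00001100000000000",
    "a1'": "00011010001000100",
    "a2'": "11100001110110011",
    "a3'": "10011111001001110",
    "a4'": "00100100010011001",
}
C = "11111111110000000"                        # class label per tuple

# Assumed attribute order for the slide's weights and for a = (010010).
WEIGHTS = {"a5": 1, "a6": 1, "a1'": 3, "a2'": 3, "a3'": 3, "a4'": 3}
SAMPLE = {"a5": 0, "a6": 1, "a1'": 0, "a2'": 0, "a3'": 1, "a4'": 0}

def distance(i: int) -> int:
    """Sum of the weights of the attributes where tuple i differs from a."""
    return sum(w for name, w in WEIGHTS.items()
               if int(ATTRS[name][i]) != SAMPLE[name])

dist = [distance(i) for i in range(N)]
ring_1 = [i for i, d in enumerate(dist) if d == 1]
ring_2 = [i for i, d in enumerate(dist) if d == 2]

votes = {"0": 0.0, "1": 0.0}
for i, d in enumerate(dist):
    votes[C[i]] += 1 / (1 + d)                 # vote weight = 1/(1+distance)
print(dist, ring_1, ring_2, votes)
```

With these sequences both the 1-ring and the 2-ring of a = (010010) are empty, and the closest training tuples sit at weighted distance 4.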

Page 9:

a’s 2ring? a=a5 a6 a1’a2’a3’a4’ = (010010)

C'11001011101100110

d1 d2

t1 t2

t1 t3

t1 t5

t1 t6

t2 t1

t2 t7

t3 t1

t3 t2

t3 t3

t3 t5

t5 t1

t5 t3

t5 t5

t5 t7

t6 t1

t7 t2

t7 t5

a1

11110000000000100

a2 00001111111111000

a3

11111100000000111

a4

000000000 01111011

a5

00001111110000100

a6

00001100000000000

a7

11110000001111011

a8

11110000001111011

a9

00000011110000100

C11111111110000000

a1‘00011010001000100

a2‘ 11100001110110011

a3‘ 10011111001001110

a4‘ 00100100010011001

a5‘ 110100011001000 10

a6‘10000001000000010

a7‘ 00101110011011101

a8‘00101110011011101

a9‘01010000100100000

P00000000000000000

a4‘

11011011101100110

a3‘

10011111001001110

a2‘

00011110001001100

a1‘

11100101110111011

a6

11110011111111111

a5

00001111110000100

Distance fctn: $d(p,q) = \sum \{\, \mathrm{weight}_i : p \text{ and } q \text{ differ at } i \,\}$

Vote function: vote = 1/(1+distance)

(100010)

Page 10:

a4‘

00100100010011001

a3‘

10011111001001110

a2‘

11100001110110011

a1‘

00011010001000100

C'11001011101100110

d1 d2

t1 t2
t1 t3
t1 t5
t1 t6
t2 t1
t2 t7
t3 t1
t3 t2
t3 t3
t3 t5
t5 t1
t5 t3
t5 t5
t5 t7
t6 t1
t7 t2
t7 t5

a1

11110000000000100

a2 00001111111111000

a3

11111100000000111

a4

000000000 01111011

a5

0000111111000 0100

a6

00001100000000000

a7

11110000001111011

a8

11110000001111011

a9

00000011110000100

C11111111110000000

a1‘00011010001000100

a2‘ 11100001110110011

a3‘ 10011111001001110

a4‘ 00100100010011001

a5‘ 110100011001000 10

a6‘10000001000000010

a7‘ 00101110011011101

a8‘00101110011011101

a9‘01010000100100000

a6

00001100000000000

a5

0000111111000 0100

Appendix: scratch slides

Page 11:

a4‘

00100100010011001

a3‘

10011111001001110

a2‘

11100001110110011

a1‘

00011010001000100

a6

00001100000000000

a5

0000111111000 0100

a4‘

00100100010011001

a3‘

10011111001001110

a2‘

11100001110110011

a1‘

00011010001000100

a6

00001100000000000

a5

0000111111000 0100

a4‘

00100100010011001

a3‘

10011111001001110

a2‘

11100001110110011

a1‘

00011010001000100

a6

00001100000000000

a5

0000111111000 0100

a4‘

00100100010011001

a3‘

10011111001001110

a2‘

11100001110110011

a1‘

00011010001000100

a6

00001100000000000

a5

0000111111000 0100

a4‘

00100100010011001

a3‘

10011111001001110

a2‘

11100001110110011

a1‘

00011010001000100

a6

00001100000000000

a5

0000111111000 0100

Appendix: scratch slides

a4‘

00100100010011001

a3‘

10011111001001110

a2‘

11100001110110011

a1‘

00011010001000100

a6

00001100000000000

a5

0000111111000 0100

Page 12:

a4‘

11011011101100110

a3‘

01100000110110001

a2‘

00011110001001100

a1‘

11100101110111011

a6

11110011111111111

a5

1111000000111 1011

a4‘

11011011101100110

a3‘

01100000110110001

a2‘

00011110001001100

a1‘

11100101110111011

a6

11110011111111111

a5

1111000000111 1011

a4‘

11011011101100110

a3‘

01100000110110001

a2‘

00011110001001100

a1‘

11100101110111011

a6

11110011111111111

a5

1111000000111 1011

a4‘

11011011101100110

a3‘

01100000110110001

a2‘

00011110001001100

a1‘

11100101110111011

a6

11110011111111111

a5

1111000000111 1011

a4‘

11011011101100110

a3‘

01100000110110001

a2‘

00011110001001100

a1‘

11100101110111011

a6

11110011111111111

a5

1111000000111 1011

Appendix: scratch slides

a4‘

11011011101100110

a3‘

01100000110110001

a2‘

00011110001001100

a1‘

11100101110111011

a6

11110011111111111

a5

1111000000111 1011

Page 13:

RRN: 00 01 02 03 04 05 06 07 08 09 10 11 12 13 14 15 16

a0

11110000000000100

a1 00001111111111000

a2

11111100000000111

a3

000000000 01111011

a4

0000111111000 0100

a5

00001100000000000

B61

11110000001111011

B62

11110000001111011

B63

00000011110000100

C11111111110000000

B71

00011010001000100

B72

11100001110110011

C81 10011111001001110

C82 00100100010011001

C83 110100011001000 10

C91

10000001000000010

C921 00101110011011101

C922

00101110011011101

C93

01010000100100000

Appendix: Comprehensive attribute types example

R(a0, a1, a2, a3, a4, a5, B6, B7, C8, C9, C)

Bit-maps for categorical values from, possibly several, flat categorical attributes.

Numeric attributes: domains {0..7},{0..3}

Hierarchical categorical (e.g., leaf_weights; inode_weights=sums) C8 C9

       dairy                  sundries
      /  |  \                /   |    \
  milk  egg  butter     crafts  knits  toys
                                /   \
                          needles   pins

1

1 1 1 2

1

2

53

Assume bfr=3, so RID = (3quotient,3remainder)

1

Class Label

a0

00001111111111011

a1 11110000000000111

a2

00000011111111000

a3

11111111110000100

a4

11110000001111011

a5

11110011111111111

B61

00001111110000100

B62

00001111110000100

B63

11111100001111011

B71

11100101110111011

B72

00011110001001100

C81 01100000111110001

C82 11011011101100110

C83 001011100110111 01

C91

01111110111111101

C921 11010001100100010

C922

11010001100100010

C93

10101111011011111

Page 14:

RRN a0 a1 a2 a3 a4 a5 B6 B7 C8     C9     C
00   1  0  1  0  0  0  6  1 {m,b}  {c}    1
01   1  0  1  0  0  0  6  1 {b}    {t}    1
02   1  0  1  0  0  0  6  1 {e}    {n,p}  1
03   1  0  1  0  0  0  6  2 {m,b}  {t}    1
04   0  1  1  0  1  1  0  2 {m}    {n,p}  1
05   0  1  1  0  1  1  0  0 {m,e}  {n,p}  1
06   0  1  0  0  1  0  1  2 {m}    {n,p}  1
07   0  1  0  0  1  0  1  1 {m,b}  {c}    1
08   0  1  0  0  1  0  1  1 {b}    {t}    1
09   0  1  0  0  1  0  1  1 {e}    {n,p}  1
10   0  1  0  1  0  0  6  2 {m}    {n,p}  0
11   0  1  0  1  0  0  6  1 {b}    {t}    0
12   0  1  0  1  0  0  6  1 {e}    {n,p}  0
13   0  1  0  1  0  0  6  0 {m,e}  {n,p}  0
14   1  0  1  0  1  0  1  2 {m}    {n,p}  0
15   0  0  1  1  0  0  6  1 {m,b}  {c}    0
16   0  0  1  1  0  0  6  1 {e}    {n,p}  0

Distance fctn: $d(p,q) = \sum \{\, w_i : p \text{ and } q \text{ differ at } i \,\}$ must have:

w61 = 2*w62 = 4*w63

w71 = 2*w72

w93 = 2*w921 = 2*w922 = 2*w91

w83 = w82 = w81

Vote function: vote= 1/(1+distance)

RRN: 00 01 02 03 04 05 06 07 08 09 10 11 12 13 14 15 16

a0

11110000000000100

a1 00001111111111000

a2

11111100000000111

a3

000000000 01111011

a4

0000111111000 0100

a5

00001100000000000

B61

11110000001111011

B62

11110000001111011

B63

00000011110000100

C11111111110000000

B71

00011010001000100

B72

11100001110110011

C81 10011111001001110

C82 00100100010011001

C83 11010001100100010

C91

10000001000000010

C921 00101110011011101

C922

00101110011011101

C93

01010000100100000

a0

00001111111111011

a1 11110000000000111

a2

00000011111111000

a3

11111111110000100

a4

11110000001111011

a5

11110011111111111

B61

00001111110000100

B62

00001111110000100

B63

11111100001111011

B71

11100101110111011

B72

00011110001001100

C81 01100000111110001

C82 11011011101100110

C83 001011100110111 01

C91

01111110111111101

C921 11010001100100010

C922

11010001100100010

C93

10101111011011111

wi : 1 0 0 2 9 0 8 4 2 6 3 3 3 3 2 2 2 4
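The weight constraints and the distance function can be tried out on two rows of the table. This sketch assumes the 18 weights apply in the column order a0..a5, B61..B63, B71, B72, C81..C83, C91, C921, C922, C93, that B61 and B71 are the high-order bits, and that C81/C82/C83 stand for milk/egg/butter and C91/C921/C922/C93 for crafts/needles/pins/toys (these assignments agree with the listed bit sequences). The `encode` helper is hypothetical.

```python
# Weights in column order a0..a5, B61..B63, B71, B72, C81..C83, C91, C921, C922, C93
W = [1, 0, 0, 2, 9, 0, 8, 4, 2, 6, 3, 3, 3, 3, 2, 2, 2, 4]

def encode(a, b6, b7, c8, c9):
    """Turn one record into its 18 bit columns."""
    bits = list(a)                                         # a0..a5
    bits += [(b6 >> s) & 1 for s in (2, 1, 0)]             # B61, B62, B63
    bits += [(b7 >> s) & 1 for s in (1, 0)]                # B71, B72
    bits += [int(x in c8) for x in ("m", "e", "b")]        # C81, C82, C83
    bits += [int(x in c9) for x in ("c", "n", "p", "t")]   # C91, C921, C922, C93
    return bits

def distance(p, q):
    """Sum of the weights of the bit columns where p and q differ."""
    return sum(w for w, x, y in zip(W, p, q) if x != y)

row00 = encode((1, 0, 1, 0, 0, 0), 6, 1, {"m", "b"}, {"c"})
row01 = encode((1, 0, 1, 0, 0, 0), 6, 1, {"b"}, {"t"})

d = distance(row00, row01)
vote = 1 / (1 + d)

# The "must have" constraints from the slide, checked against W.
constraints_hold = (
    W[6] == 2 * W[7] == 4 * W[8]                       # w61 = 2*w62 = 4*w63
    and W[9] == 2 * W[10]                              # w71 = 2*w72
    and W[17] == 2 * W[15] == 2 * W[16] == 2 * W[14]   # w93 = 2*w921 = 2*w922 = 2*w91
    and W[13] == W[12] == W[11]                        # w83 = w82 = w81
)
print(d, vote, constraints_hold)
```

Rows 00 and 01 differ in the milk, crafts, and toys columns, so their distance is 3 + 2 + 4 = 9 and the vote is 1/10.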

Page 15:

Distance fctn: $d(p,q) = \sum \{\, w_i : p \text{ and } q \text{ differ at } i \,\}$ must have:

w61 = 2*w62 = 4*w63

w71 = 2*w72

w93 = 2*w921 = 2*w922 = 2*w91

w83 = w82 = w81

Vote function: vote= 1/(1+distance)

a0

11110000000000100

a1 00001111111111000

a2

11111100000000111

a3

000000000 01111011

a4

0000111111000 0100

a5

00001100000000000

B61

11110000001111011

B62

11110000001111011

B63

00000011110000100

C11111111110000000

B71

00011010001000100

B72

11100001110110011

C81 10011111001001110

C82 00100100010011001

C83 11010001100100010

C91

10000001000000010

C921 00101110011011101

C922

00101110011011101

C93

01010000100100000

a0

00001111111111011

a1 11110000000000111

a2

00000011111111000

a3

11111111110000100

a4

11110000001111011

a5

11110011111111111

B61

00001111110000100

B62

00001111110000100

B63

11111100001111011

B71

11100101110111011

B72

00011110001001100

C81 01100000111110001

C82 11011011101100110

C83 001011100110111 01

C91

01111110111111101

C921 11010001100100010

C922

11010001100100010

C93

10101111011011111

unclassified sample: 0 1 0 1 0 0 1 1 0 0 1 0 0 1 0 1 1 0

wi: 1 0 0 3 9 0 8 4 2 6 3 3 3 3 2 2 2 4

0-ring:

a0

00001111111111011

a3

000000000 01111011

a4

11110000001111011

B61

11110000001111011

B62

11110000001111011

B63

00000011110000100

B71

11100101110111011

B72

11100001110110011

C81

01100000111110001

C82

11011011101100110

C83

11010001100100010

C91

01111110111111101

C921

00101110011011101

C922

00101110011011101

C93

10101111011011111