Page 1:

Active documents and Active XML

Serge Abiteboul, INRIA Saclay, Collège de France, ENS Cachan

4/4/12

Page 2:

Organization

Introduction
Modeling data intensive distributed systems
Query optimization in distributed systems
Monitoring in distributed systems
Task sequencing in distributed systems
Conclusion

Page 3:

Introduction

Page 4:

Context: Web data management

Scale: lots of servers, large volume of data

Servers are autonomous (and heterogeneous)

Data may be very dynamic, with heavy update rates

Peers are possibly moving

Relation → Tree
Centralized → Distributed
Precise data → Incomplete, probabilistic
Precise schemas → Ontologies

The focus in this class

Page 5:

The lesson from the past

The success of the relational model with 2D-tables on local servers
– A logic for defining tables
– An algebra for describing query plans over tables

We should do similarly for trees in a distributed environment
– A logic for defining distributed trees and data services
– An algebra for optimizing queries over trees/services

Page 6:

Roadmap

1. Modeling: the AXML model of active documents
– Views: to capture intensional data (key concept for data management)
– Streams: to capture exchanges of data and evolution (key concept for distribution and evolution)
2. Optimization: an algebra for AXML
3. Monitoring: based on AXML documents
4. Task sequencing: a workflow based on AXML documents
– In the spirit of business artifacts

Page 7:

Modeling data intensive distributed systems

Active XML

Page 8:

Active XML

Based on Web standards: XML + Web services + XPath/XQuery

Idea: exchange XML documents with embedded function calls

Active XML: unordered, unranked, labeled, evolving trees
– Internal nodes are labeled by tags
– Leaves are labeled by tags, data, or function symbols
– Set semantics: no isomorphic sibling sub-trees

The functions are interpreted as calls to external services
– Embedding calls in data is an old idea in databases

[Figure: example tree with nodes labeled a, b, c, d]
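The tree model above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the data structure of the AXML system: the names `Node`, `Call`, `materialize` and the `services` registry are hypothetical, and results returned by a service are not materialized recursively.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Call:
    """A leaf labeled by a function symbol, e.g. !songs@p2 (hypothetical encoding)."""
    service: str
    peer: str


@dataclass(frozen=True)
class Node:
    """A labeled node. Children are a frozenset, so siblings are unordered and
    equal (isomorphic) sibling subtrees collapse into one: set semantics."""
    label: str
    children: frozenset = frozenset()


def materialize(tree, services):
    """Replace each call that has a registered service by the trees it returns."""
    if isinstance(tree, Call):
        return tree
    new_children = set()
    for child in tree.children:
        if isinstance(child, Call) and (child.service, child.peer) in services:
            # The returned trees take the place of the call among the siblings.
            new_children |= services[(child.service, child.peer)]()
        else:
            new_children.add(materialize(child, services))
    return Node(tree.label, frozenset(new_children))


song1 = Node("t", frozenset({Node("t1"), Node("m1")}))
song2 = Node("t", frozenset({Node("t2"), Node("m2")}))
# song1 is listed twice on purpose: the duplicate sibling disappears.
doc = Node("songs", frozenset({song1, song1, Call("songs", "p2")}))
# Stub standing in for the remote service !songs@p2.
services = {("songs", "p2"): lambda: {song1, song2}}
result = materialize(doc, services)
```

Using `frozenset` for children makes structural equality ignore sibling order, which is what lets "no isomorphic sibling sub-trees" fall out of ordinary set behavior in this sketch.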

Page 9:

Example

[Figure: tree rooted at root@p1 with a songs node; the nodes shown include the calls !songs@p2, !songs@p3 and !songs@p1, and t entries (t1, m1), (t2, m2), (t3, m3, !f3), (t4, m4), (t5, m5, !f5)]

Leads to evolving trees
– Intensional data: get the data only when desired
– Dynamic data: if data sources change, the document changes
– Flexible data: adapt to the needs
– Functions in push & pull mode

Page 10:

Query root/songs/t

[Figure: the same tree rooted at root@p1, with the calls !songs@p2, !songs@p3 and !songs@p1 and the t entries matched by the query]

Recursive calls

Page 11:

root//t[//singer/“Brel”]

[Figure: the same tree rooted at root@p1, with the calls !songs@p2, !songs@p3 and !songs@p1]

Push queries to data sources
– !songs@p3: root//t[//singer/“Brel”]
– !songs@p2: root//t[//singer/“Brel”]
– !songs@p1: root//t[//singer/“Brel”]
– Distributed query/subquery (or Magic Set)

Page 12:

This is distributed datalog over trees

songs@p1(x,y) :- t@p1(x,y)
songs@p1(x,y) :- songs@p2(x,y)
songs@p1(x,y) :- songs@p3(x,y)

songs@p2(x,y) :- t@p1(x,y)
songs@p2(x,y) :- songs@p1(x,y)
songs@p2(x,y) :- songs@p3(x,y)

songs@p3(x,y) :- t@p1(x,y)
songs@p3(x,y) :- songs@p1(x,y)
songs@p3(x,y) :- songs@p2(x,y)

Overlay: the bodies of the six recursive rules extended with a filter P(x), for example
:- songs@p1(x, y), P(x)
:- songs@p2(x, y), P(x)
:- songs@p3(x, y), P(x)
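The rules can be read as a fixpoint computation. The following is a minimal sketch with a naive bottom-up evaluation; it assumes that each peer p contributes its own local relation t@p (the slide writes t@p1 in all three base rules), and the facts and the filter `P` are made up for illustration. It also checks the point of the overlay: filtering at the sources gives the same answers as filtering after the fixpoint.

```python
def evaluate(local_t, peers):
    """Naive fixpoint: songs@p holds t@p plus songs@q for every other peer q."""
    songs = {p: set(local_t[p]) for p in peers}
    changed = True
    while changed:
        changed = False
        for p in peers:
            for q in peers:
                if q != p and not songs[q] <= songs[p]:
                    songs[p] |= songs[q]  # rule songs@p(x,y) :- songs@q(x,y)
                    changed = True
    return songs


peers = ["p1", "p2", "p3"]
# Made-up facts (x, y) for the local relations t@p.
local_t = {"p1": {("t1", "m1")}, "p2": {("t2", "m2")}, "p3": {("t3", "m3")}}
P = lambda x: x != "t2"  # hypothetical selection predicate on x

songs = evaluate(local_t, peers)
filtered_after = {p: {f for f in songs[p] if P(f[0])} for p in peers}

# Pushing P to the sources: filter each t@p before any exchange.
pushed = evaluate({p: {f for f in local_t[p] if P(f[0])} for p in peers}, peers)
```

In the pushed version the fact about t2 never leaves p2, which is the saving that query/subquery and Magic Set style rewritings aim for.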

Page 13:

Fun issues: The semantics of calls

When to activate the call?
– Explicit pull mode: active databases
– Implicit pull mode: deductive databases
– Push mode: query subscription

What to do with its result?
How long is the returned data valid?
Sending an AXML document: evaluate the service calls before sending or not?

Page 14:

Exchanging AXML data

Web services exchange intensional documents

Materialization can be performed
– by the sender, before sending a document, or
– by the receiver, after receiving it.

[Figure: transfer of a newspaper document with title “Le Monde” and date “06/10/2003”, containing the calls GetEvents(“Exhibits”) and GetTemp on city “Paris”; results of the form “Matisse...” appear where calls are materialized]
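The two options can be sketched as follows. This is a toy encoding under stated assumptions, not the AXML wire format: documents are nested dictionaries, a call is a dictionary with the hypothetical keys `"call"` and `"arg"`, and the two services are local stubs whose return values are placeholders.

```python
import copy


def materialize(doc, services):
    """Return a copy of doc in which every call node is replaced by its result."""
    if isinstance(doc, dict) and "call" in doc:
        return services[doc["call"]](doc["arg"])
    if isinstance(doc, dict):
        return {key: materialize(value, services) for key, value in doc.items()}
    if isinstance(doc, list):
        return [materialize(value, services) for value in doc]
    return doc


def send(doc, services, materialize_first):
    """Sender side: either evaluate the calls now or ship them unevaluated."""
    return materialize(doc, services) if materialize_first else copy.deepcopy(doc)


# Stub services; the returned values are placeholders, not real data.
services = {
    "GetEvents": lambda kind: ["Matisse..."],
    "GetTemp": lambda city: "temperature(" + city + ")",
}
newspaper = {
    "title": "Le Monde",
    "date": "06/10/2003",
    "temp": {"call": "GetTemp", "arg": "Paris"},
    "events": {"call": "GetEvents", "arg": "Exhibits"},
}

eager = send(newspaper, services, materialize_first=True)   # sender materializes
lazy = send(newspaper, services, materialize_first=False)   # calls travel as data
received = materialize(lazy, services)                      # receiver materializes
```

With stable stub services both routes end in the same document; with live services they may differ, which is exactly the freshness question.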


Page 16:

Some reasons for not materializing data before sending the document

Freshness
– The receiver will get up-to-date information when needed

Security
– Only the receiver has the credentials to call the service
– One needs to record who is actually using the data

Performance
– To save on the bandwidth of the sender

To delegate work to someone else

How to specify it: casting based on types ☞ jewel section

Page 17:

Complex issues

Brings together in a single setting
– distributed db
– deductive db
– active db
– stream data
– warehousing & mediation

This seems to us necessary for capturing all the facets of data management in distributed systems

Is this unreasonable? Yes!

Page 18:

Query optimization in distributed systems

Active XML algebra optimization

Page 19:

AXML system

A system = a set of peers
– Each peer provides storage and query processing
– Each peer hosts active documents
  Extensional data
  Intensional data (query calls in the document)

Problem: given a query q at some peer, evaluate the answer to q with optimal response time

[Figure: architecture of a peer, with query processor, optimizer, communication, AXML docs, stats and workspace]

Page 20:

Local and global query processing

Local processing ☛ Input/output streams
Local query optimization

Global processing ☛ Streams for communications

Global query optimization ☛ Delegate work to other peers

[Figure: operators π, π and σ connected by input and output streams across peers p1, p2, p3 and p4]

Page 21:

Example 1: Local and global optimization

p3 asks for σ ( R@p1 ∪ S@p2 )

[Figure, initial plan: Peer 1 sends R on snd1 and Peer 2 sends S on snd2; Peer 3 receives on rcv1 and rcv2, takes the union, then applies σ]

[Figure, global rewriting (push selections to sources): Peer 1 applies σ to R before snd1 and Peer 2 applies σ to S before snd2; Peer 3 takes the union of rcv1 and rcv2]

[Figure, local rewriting (selection and union commute): Peer 3 applies σ to each of rcv1 and rcv2 before the union; Peers 1 and 2 still send R and S unfiltered]
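The rewriting rests on the identity $\sigma(R \cup S) = \sigma(R) \cup \sigma(S)$. A small sketch, with made-up relations and a made-up predicate, shows that the plans agree on the answer and differ only in how many tuples cross the network:

```python
def select(pred, relation):
    """Relational selection over a set of tuples."""
    return {row for row in relation if pred(row)}


# Made-up data: R lives at p1, S lives at p2.
R = {("r1", 5), ("r2", 50)}
S = {("s1", 7), ("s2", 70)}
pred = lambda row: row[1] > 10  # hypothetical selection condition

# Initial plan: ship R and S whole, union and select at p3.
initial_answer = select(pred, R | S)
shipped_initial = len(R) + len(S)

# Local rewriting at p3: select on each received stream, then union.
local_answer = select(pred, R) | select(pred, S)

# Global rewriting: select at the sources, ship only the matches.
sent_by_p1 = select(pred, R)
sent_by_p2 = select(pred, S)
global_answer = sent_by_p1 | sent_by_p2
shipped_global = len(sent_by_p1) + len(sent_by_p2)
```

Here the global plan ships 2 tuples instead of 4; the local rewriting alone saves no traffic, but it is the step that makes the global one possible.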

Page 22:

Example 2: MapReduce

[Figure: a plan in which Peers 1 and 2 send R and S to Peer 3 on channels snd1/rcv1 and snd2/rcv2, and a MapReduce version in which map operators over R send their output through Middleware 1 and Middleware 2 on channels snd1 to snd5 and rcv1 to rcv5]

Page 23:

The Active XML algebra

Passive nodes
– Annotated with labels (root, a, b)

Query nodes
– Annotated with queries (q)
– For instance Tree-Pattern-Queries

Send/Receive nodes
– Annotated with channel ids (snd2, rcv2, rcv1)
– Internal channel: snd2 paired with rcv2
– Input channel (no snd): rcv1

[Figure: example tree with passive nodes root, a, b, a query node q, and the nodes snd2, rcv2 and rcv1]

Page 24:

Evolution of a system

A system evolves by activating:
– a query node
– a send/receive node on an internal channel
– a receive node on an input channel

Page 25:

Equivalence problem for AXML systems

           No query   TPQ     TPQ with XPath joins   TPQ with joins   TPQ with constructor
No input   PTIME      PTIME   PTIME                  Hard             Undecidable
Input      PTIME      Hard    Hard                   ?                Undecidable

Complexity increases with:
– richer query language
– the presence of input

Axiomatization of equivalence in the absence of queries

Page 26:

Optimization

As usual
– Use algebraic rewriting rules
– Use simplistic estimators for query plans
– Use heuristics to prune the search space

Page 27:

Examples of performance optimization techniques

Externalize data in devices with limited capabilities
– Cell phones, tablets, home appliances…
– Limited storage space, computational power, network bandwidth

Replicate documents and services
– To allow for “local” computation
– To increase parallelism

Page 28:

Externalization and replication

Page 29:

Monitoring in distributed systems

The Axlog system

Page 30:

Monitoring distributed systems

Distributed applications are often very dynamic
– Content changes rapidly
– Intense communications
– Peers sometimes come and go

Such systems are complex and hard to control
– Many peers
– Peers are distributed & autonomous
– Peers are sometimes unreliable and selfish

Goal: monitor such systems

Page 31:

Architecture

[Figure: publishers, alerters, streams, stream processors, the Axlog processor, actions, RSS]

Page 32:

Axlog principle = active document & query

Incoming streams of updates

The outgoing stream is defined by a query Q (e.g. TPQ)

Each time an incoming message arrives, it modifies the document and so possibly the query result

The output stream specifies how the view is modified

Incremental view maintenance

[Figure: updates flow into an AXML document, over which the query is evaluated]
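The principle can be illustrated with a deliberately small sketch. It assumes a document that is just a set of items and a query that is a simple selection; Axlog itself works on AXML documents with tree-pattern queries, so the class `MonitoredView` and its message format are hypothetical simplifications.

```python
class MonitoredView:
    """A document, a query over it, and an output stream of view changes."""

    def __init__(self, query):
        self.query = query      # predicate deciding whether an item is in the view
        self.document = set()
        self.view = set()

    def apply(self, op, item):
        """Apply one incoming update and return the induced view changes."""
        if op == "insert":
            self.document.add(item)
            if self.query(item) and item not in self.view:
                self.view.add(item)
                return [("+", item)]
        elif op == "delete":
            self.document.discard(item)
            if item in self.view:
                self.view.remove(item)
                return [("-", item)]
        return []  # the update did not change the query result


monitor = MonitoredView(lambda item: item[0] == "Brel")
out = []
for update in [("insert", ("Brel", "t1")),
               ("insert", ("Ferré", "t2")),
               ("delete", ("Brel", "t1"))]:
    out += monitor.apply(*update)
```

Each update is handled by looking only at the changed item, never by recomputing the query over the whole document, which is the idea behind incremental view maintenance.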

Page 33:

Axlog engine

Datalog is used to evaluate queries, with benefits from
– Incremental view maintenance in datalog: Δ technique
– Query optimization in datalog: MagicSet
– Constraint query languages: CQL

Specific techniques
– Push queries to the sources to avoid loading irrelevant data
– Use of FSA on XML inputs: YFilter

Page 34:

Task sequencing in distributed systems

Page 35:

Task sequencing and verification

• Task sequencing is a major difficulty for distributed systems
– It is difficult to integrate workflow and database systems

• Verification of temporal properties is hard
– Typically verification is harder than evaluation
  • Evaluating an FO query is in PTIME data complexity
  • Verifying that Q ⊆ Q’ is undecidable
– Verification will be the topic of the seminar by Victor Vianu

[Figure labels: DBMSs exchanging data; workflow systems sequencing tasks]

Page 36:

Example: Dell Supply Chain

[Figure: Customer, Web Store, Bank, Plant, Warehouse, Shipping, Supplier]

Page 37:

AXML as business artifacts

Concept introduced by IBM [Nigam & Caswell 03, Hull & Su 07]

Data-centric workflows
− A process is described by a document (possibly moving in the enterprise)
− The behavior of an artifact is specified by some constraints on its evolution

Vs. state-transition-based workflows
• Based on some form of state transition diagrams (BPEL, Petri, …)
• Mostly ignore data

Example artifact:
webOrder id=7787780
Customer
Name: John Doe
Address: Sèvres
Product: committed
Ref: PC 456
Factory: Milano
Parts: waiting
orderDate: 2009/07/24
Site: http://d555.com
Payment: done
Bank-account …
Delivery: not-active

Page 38:

AXML artifacts move between peers

In webStore:
webOrder id=7787780
Customer
Name: John Doe
Address: Sèvres
Order selection: on-going
Ref: PC 456
Factory: undecided
Parts: not-active
orderDate: 2009/07/24
Site: http://d555.com
Payment: pending
Delivery: not-active

In plant:
webOrder id=7787780
Customer
Name: John Doe
Address: Sèvres
Order selection: committed
Ref: PC 456
Factory: Milano
Parts: on-going
orderDate: 2009/07/24
Site: http://d555.com
Payment: done
Bank-account …
Delivery: not-active

In delivery:
webOrder id=7787780
Customer
Name: John Doe
Address: Sèvres
Order selection: committed
Ref: PC 456
Factory: Milano
Parts: done
orderDate: 2009/07/24
Site: http://d555.com
Payment: done
Bank-account: CEIF-4457889
Delivery: on-going
Address: Orsay

Page 39:

[Figure: peers WEBSTORE, PLANT, DELIVERY, CREDIT APPROVAL, WAREHOUSE and ARCHIVE, with a catalogue]

Page 40:

Sequencing of operations

Different ways of expressing sequencing of tasks
– Guards: preconditions for function calls
– Transition-based diagrams
– Formulas in temporal logic

Study how they can simulate each other using some “scratch paper”
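As an illustration of the first option, a guard can be written as a precondition checked before a function call is allowed to fire. The sketch below is hypothetical: the decorator `guarded`, the function `start_delivery` and the particular condition (payment and parts both done) are assumptions chosen to match the webOrder fields shown earlier, not rules taken from the slides.

```python
def guarded(guard):
    """Wrap a function so that it only runs when guard(artifact) holds."""
    def wrap(fn):
        def call(artifact):
            if not guard(artifact):
                raise RuntimeError("guard of " + fn.__name__ + " does not hold")
            return fn(artifact)
        return call
    return wrap


# Hypothetical guard: delivery may start only once payment and parts are done.
@guarded(lambda a: a["Payment"] == "done" and a["Parts"] == "done")
def start_delivery(artifact):
    # Return an updated copy; the input artifact is left untouched.
    return {**artifact, "Delivery": "on-going"}


in_plant = {"Payment": "done", "Parts": "on-going", "Delivery": "not-active"}
ready = {**in_plant, "Parts": "done"}
delivered = start_delivery(ready)
```

Calling `start_delivery(in_plant)` raises, since the guard fails while parts are still on-going; the sequencing is thus expressed purely through conditions on the data.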

Page 41:

A jewel of active documents

Casting a document to a target type

Page 42:

The casting problem

Given
– An active document I
– The signature of the functions
– And a target type T

Which functions to call to be sure to reach T?

2-player game
– Juliet chooses which function to call
– Romeo chooses a value within the domain of the function

Juliet wins if she can reach a document in T

Page 43:

An abstraction: active context-free games

On words instead of trees
– Game (Σ, R, T)
  • Σ is a finite alphabet
  • R is a set of CF rules
  • T is a regular target language
– w is the start word

Output: true if Juliet has a winning strategy

Alternation of
– ∃ states (Juliet picks the next function to call) and
– ∀ states (the adversary Romeo picks the answer)

Page 44:

Examples

Rules: a→abc*; b→(ba)*b; c→ab
Target: abab(ab)*

• Winning
  • Start word aba
  • Strategy
    – Call the second a
    – Call all the c’s
    – Obtain a word in Target

• Losing
  • Start word ab
  • No strategy
    – Initially #(a) – #(b) = 0
    – If I call a or b, #(a) – #(b) < 0
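The winning example can be replayed mechanically. The sketch below simulates Juliet's strategy on the start word aba for several of Romeo's possible answers and checks the outcome against the target with a regular expression. It only samples Romeo's choices (the number n of c's he returns); it is not a decision procedure for the game, and the function name `juliet_play` is hypothetical.

```python
import re

TARGET = re.compile(r"abab(ab)*")


def juliet_play(n):
    """Replay Juliet's strategy on 'aba' when Romeo answers a -> a b c^n."""
    word = "aba"
    # Call the second a: Romeo answers with some word of a b c*, here a b c^n.
    word = word[:2] + "ab" + "c" * n
    # Call every c: the rule c -> ab leaves Romeo no choice.
    word = word.replace("c", "ab")
    return word


outcomes = [juliet_play(n) for n in range(10)]
all_in_target = all(TARGET.fullmatch(w) is not None for w in outcomes)
```

Whatever n Romeo picks, the result is abab followed by n copies of ab, which is why the strategy wins.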

Page 45:

Fun rewriting game

The problem is undecidable in general

Interesting decidable subcases [Muscholl, Schwentick, Segoufin]
– Juliet has to traverse the string from left to right
– No recursion among function calls
– Function calls are “linear”

Also, in practice, very efficient casting based on unambiguous grammars

Page 46:

Conclusion

Page 47:

Some works around AXML

The AXML system
– Open-source (on server, on smartphone)

The useful: Replication and query optimization
– How to evaluate a query efficiently by taking advantage of replication

The useful: Lazy query evaluation
– How to evaluate a query without calling all embedded services

The fun: Casting problem
– Which functions to call to “match” a target type
– Active context-free games

The exotic
– Diagnosis of communication systems based on datalog optimization
– Access control
– Distributed design
– Probabilistic generation of documents

Page 48:

We will come back to distribution

Lesson 6: datalog - recursion is essential
Lesson 7: distributed data management in general
Lesson 8: distributed knowledge bases

Page 49:

Acknowledgements

With many colleagues, in particular:
– Tova Milo (Tel Aviv), Victor Vianu (UCSD)
– Luc Segoufin (INRIA), Ioana Manolescu (INRIA)
– Georg Gottlob (Oxford), Alkis Polyzotis (UCSC)
– Angela Bonifati (Lille), Marie-Christine Rousset (Grenoble)
– Balder ten Cate (UCSC), Yannis Katsis (UCSD)

And PhD students
– Omar Benjelloun (Google), Bogdan Marinoiu (SAP)
– Pierre Bourhis (INRIA), Alban Galland (INRIA)
– Marco Manna (Calabria), Nicoleta Preda (Versailles)
– Zoe Abrams (Google), Emmanuel Taropa (Google)
– Bogdan Cautis (Telecom), Spyros Zoupanos (Max-Planck-Institut)

And others

Page 50:

Thank you!

Page 51:

Static Analysis and Verification
Victor Vianu, U.C. San Diego

PhD from USC, 1983
Sabbaticals: INRIA, ENS Cachan, Ulm, Telecom
Interests: database theory, computational logic, Web data
Co-author of Foundations of Databases
– Aka the Alice book
Vianu has served as
– General Chair of SIGMOD, PODS
– Program Chair of PODS, ICDT
Editor-in-Chief of the J. ACM
ACM Fellow