Haimonti Dutta¹ and Hillol Kargupta²
¹ Center for Computational Learning Systems (CCLS), Columbia University, NY, USA.
² University of Maryland, Baltimore County, Baltimore, MD; also affiliated with Agnik, LLC, Columbia, MD.
Distributed Linear Programming and Resource Management for Data Mining
in Distributed Environments
Motivation
Support Vector (Kernel) Regression: An Illustration
Find a function f(x) = y that fits a set of example data points
The problem can be phrased as a constrained optimization task
It is solved using a standard LP solver
Motivation (contd.): Knowledge-Based Kernel Regression
In addition to sample points, give advice:
If (x ≥ 3) and (x ≤ 5) then (y ≥ 5)
Rules add constraints about regions
The constraints are added to the LP, and a new solution (with the advice constraints) can be constructed
Fung, Mangasarian and Shavlik, "Knowledge-Based Support Vector Machine Classifiers", NIPS, 2002.
Mangasarian, Shavlik and Wild, "Knowledge-Based Kernel Approximation", JMLR, 5, 1127-1141, 2005.
Figure adapted from Maclin, Shavlik, Walker and Torrey, "Knowledge-based Support Vector Regression for Reinforcement Learning", IJCAI, 2005.
Distributed Data Mining Applications: An Example of Scientific Data Mining in Astronomy
Distributed data and computing resources on the National Virtual Observatory
P2P data mining on a homogeneously partitioned sky survey
H. Dutta, "Empowering Scientific Discovery by Distributed Data Mining on the Grid Infrastructure", Ph.D. Thesis, UMBC, Maryland, 2007.
The need for distributed optimization strategies
Road Map
Motivation
Related Work
Framing a Linear Programming Problem
The Simplex Algorithm
The Distributed Simplex Algorithm
Experimental Results
Conclusions and Directions for Future Work
Related Work
Resource Discovery in Distributed Environments
Iamnitchi, "Resource Discovery in Large Resource-Sharing Environments", Ph.D. Thesis, University of Chicago, 2003.
Livny and Solomon, "Matchmaking: Distributed Resource Management for High Throughput Computing", HPDC, 1998.
Optimization Techniques
Yarmish, "Distributed Implementation of the Simplex Method", Ph.D. Thesis, CIS, Polytechnic University, 2001.
Hall and McKinnon, "Update Procedures for Parallel Revised Simplex Methods", Tech Report, University of Edinburgh, UK, 1992.
Craig and Reed, "Hypercube Implementation of the Simplex Algorithm", ACM, pages 1473-1482, 1998.
The Optimization Problem
Assumptions:
n nodes in the network
The network is static
Dataset Di resides at node i
Processing cost at the i-th node: νi per record
Transportation cost between nodes i and j: μij
Amount of data transferred between nodes i and j: xij
Cost function: Z = Σij (μij + νj) xij = Σij cij xij, where cij = μij + νj
Framing the Linear Programming Problem: An Illustration
Objective function: z = 6.03x12 + 9.04x23 + 6.52x15 + 8.28x14 + 14.42x25 + 9.58x34 + 12.32x45
Cost function: C(X) = Σij (µij + νj) xij = Σij cij xij, where cij = µij + νj
Constraints:
x12 + x14 + x15 ≤ 300; x12 + x25 + x23 ≤ 600; x15 + x25 + x45 ≤ 300; x23 + x34 ≤ 300
0 ≤ x12 ≤ D1; 0 ≤ x23 ≤ D2; 0 ≤ x15 ≤ D1; 0 ≤ x14 ≤ D1; 0 ≤ x25 ≤ D2; 0 ≤ x34 ≤ D3; 0 ≤ x45 ≤ D4
[Figure: a five-node network. Node 2 holds 600 GB; nodes 1, 3, 4 and 5 hold 300 GB each. Edge labels are transportation costs µij (ranging from 2.5 to 10.4). Per-record processing costs νi: node 1: 1.23, node 2: 2.23, node 3: 2.94, node 4: 1.78, node 5: 4.02.]
The Simplex Algorithm
Find x1 ≥ 0, x2 ≥ 0, …, xn ≥ 0 minimizing
z = c1 x1 + c2 x2 + … + cn xn
satisfying the constraints
A1 x1 + A2 x2 + … + An xn = B
The simplex tableau:
a11  a12  …  a1n | b1
a21  a22  …  a2n | b2
…
am1  am2  …  amn | bm
c1   c2   …  cn  | z
The Simplex Algorithm (contd.)
The problem:
Maximize z = x1 + 2x2 - x3
Subject to:
2x1 + x2 + x3 ≤ 14
4x1 + 2x2 + 3x3 ≤ 28
2x1 + 5x2 + 5x3 ≤ 30
The steps of the simplex algorithm (Dantzig):
Obtain a canonical representation (introduce slack variables)
Find a column pivot
Find a row pivot
Perform Gauss-Jordan elimination
The Simplex Tableau and Iterations
Canonical representation:
x1  x2  x3  s1  s2  s3 |  B
 2   1   1   1   0   0 | 14    (ratio: 14/1 = 14)
 4   2   3   0   1   0 | 28    (ratio: 28/2 = 14)
 2   5   5   0   0   1 | 30    (ratio: 30/5 = 6  ← pivot row)
-1  -2   1   0   0   0 |  0
Pivot column: x2 (the most negative entry in the objective row); pivot row: the row with the smallest ratio.
Simplex Iterations (contd.): Perform Gauss-Jordan Elimination
After the first pivot (x2 enters the basis):
 8/5  0  0  1  0  -1/5 |  8
16/5  0  1  0  1  -2/5 | 16
 2/5  1  1  0  0   1/5 |  6
-1/5  0  3  0  0   2/5 | 12
The final tableau (x1 enters the basis):
0  0  -1/2   1  -1/2    0   |  0
1  0   5/16  0   5/16  -1/8 |  5
0  1   7/8   0  -1/8    1/4 |  4
0  0  49/16  0   1/16   3/8 | 13
Optimal solution: x1 = 5, x2 = 4, x3 = 0, with z = 13.
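The four steps above can be sketched as a small tableau simplex in Python. This is a minimal illustrative implementation (not the paper's code); it uses Dantzig's rule for the column pivot and the ratio test for the row pivot, and reproduces the slide's example.

```python
import numpy as np

def simplex_max(c, A, b):
    """Tableau simplex for: maximize c.x subject to A x <= b, x >= 0.
    A sketch for small dense problems; assumes b >= 0 so the slack basis
    is feasible (as in the slide's example)."""
    m, n = A.shape
    # Canonical representation: append slack variables and the objective row.
    T = np.zeros((m + 1, n + m + 1))
    T[:m, :n] = A
    T[:m, n:n + m] = np.eye(m)
    T[:m, -1] = b
    T[-1, :n] = -c                       # objective row stores z - c.x
    basis = list(range(n, n + m))
    while True:
        col = int(np.argmin(T[-1, :-1]))  # column pivot: most negative cost
        if T[-1, col] >= -1e-9:
            break                         # no negative entry left: optimal
        rows = [i for i in range(m) if T[i, col] > 1e-9]
        if not rows:
            raise ValueError("problem is unbounded")
        row = min(rows, key=lambda i: T[i, -1] / T[i, col])  # ratio test
        T[row] /= T[row, col]             # Gauss-Jordan elimination
        for i in range(m + 1):
            if i != row:
                T[i] -= T[i, col] * T[row]
        basis[row] = col
    x = np.zeros(n + m)
    for i, bv in enumerate(basis):
        x[bv] = T[i, -1]
    return x[:n], T[-1, -1]

# The slide's example: maximize x1 + 2*x2 - x3
c = np.array([1.0, 2.0, -1.0])
A = np.array([[2.0, 1.0, 1.0], [4.0, 2.0, 3.0], [2.0, 5.0, 5.0]])
b = np.array([14.0, 28.0, 30.0])
x, z = simplex_max(c, A, b)
print(x, z)   # x = [5, 4, 0], z = 13, matching the final tableau
```

The first iteration of this sketch produces exactly the tableau shown after the first pivot above; a tie in the ratio test at the second pivot means the intermediate tableau may differ from the slide's, but the optimum is the same.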
Road Map
Motivation
Related Work
Framing a Linear Programming Problem
The Simplex Algorithm
The Distributed Simplex Algorithm
Experimental Results
Conclusions and Future Work
The Distributed Problem: An Example
Each node holds its own constraints:
Node 1 (300 GB): x12 + x15 + x14 + 2x25 ≤ 300;  x12 + 2x15 - x25 = 2
Node 2 (600 GB): x12 + x23 + x25 ≤ 600;  2x25 - x12 - x23 = 4
Node 5 (300 GB): x15 + x25 + x45 ≤ 300;  x25 - 2x15 - x45 = 5
Node 4 (300 GB): x34 + 8x25 ≤ 300
Node 3 (300 GB): x23 + x34 ≤ 300
Each site observes different constraints but wants to optimize the same objective function:
z = 6.03x12 + 9.04x23 + 6.52x15 + 8.28x14 + 14.42x25 + 9.58x34 + 12.32x45
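As a centralized sanity check, the combined LP that the distributed algorithm solves can be fed to an off-the-shelf solver. The sketch below uses SciPy's `linprog` (not the paper's distributed simplex); the variable ordering and constraint rows are taken directly from the slide.

```python
from scipy.optimize import linprog

# Variables ordered as: x12, x23, x15, x14, x25, x34, x45
c = [6.03, 9.04, 6.52, 8.28, 14.42, 9.58, 12.32]

# One inequality block per node, as on the slide
A_ub = [
    [1, 0, 1, 1, 2, 0, 0],   # node 1: x12 + x15 + x14 + 2*x25 <= 300
    [1, 1, 0, 0, 1, 0, 0],   # node 2: x12 + x23 + x25 <= 600
    [0, 0, 1, 0, 1, 0, 1],   # node 5: x15 + x25 + x45 <= 300
    [0, 0, 0, 0, 8, 1, 0],   # node 4: x34 + 8*x25 <= 300
    [0, 1, 0, 0, 0, 1, 0],   # node 3: x23 + x34 <= 300
]
b_ub = [300, 600, 300, 300, 300]

# Local equality constraints from the slide
A_eq = [
    [1, 0, 2, 0, -1, 0, 0],   # x12 + 2*x15 - x25 = 2
    [-1, -1, 0, 0, 2, 0, 0],  # 2*x25 - x12 - x23 = 4
    [0, 0, -2, 0, 1, 0, -1],  # x25 - 2*x15 - x45 = 5
]
b_eq = [2, 4, 5]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
              bounds=(0, None))
print(res.status, res.fun, res.x)
```

Any solution found by the distributed simplex should agree with this centralized optimum, since the canonical representation it builds is identical to the centralized one.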
Distributed Canonical Representation
An initialization step:
Number of basic variables to add = total number of constraints in the system
Build a spanning tree in the network
Perform a distributed sum-estimation algorithm
This builds a canonical representation exactly identical to the one obtained if the data were centralized
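The sum estimation over a spanning tree can be sketched as a simple convergecast: each node reports its local constraint count plus its children's partial sums toward the root. The tree and counts below are made-up illustrations, not the paper's data.

```python
def tree_sum(tree, values, root=0):
    """Convergecast sum over a spanning tree (sketch): each node returns
    its own value plus the partial sums of its subtrees."""
    def subtree_sum(node, parent):
        total = values[node]
        for child in tree[node]:
            if child != parent:
                total += subtree_sum(child, node)
        return total
    return subtree_sum(root, None)

# Hypothetical 5-node spanning tree; each node holds some local constraints.
tree = {0: [1, 2], 1: [0, 3, 4], 2: [0], 3: [1], 4: [1]}
local_constraints = {0: 2, 1: 2, 2: 1, 3: 1, 4: 2}
total = tree_sum(tree, local_constraints)
print(total)  # prints 8: the number of slack/basic variables to add
```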
The Distributed Algorithm for Solving the LP Problem
Steps involved:
Estimate the column pivot
Estimate the row pivot (requires communication with neighbors)
Perform Gauss-Jordan elimination
Illustration of the Distributed Algorithm
Local tableau at node 1 (its own constraint rows plus the shared objective row):
x12  x23  x15  x14  x25  x34  x45  s1 s2 s3 s4 s5 s6 s7 s8 |  B
  1    0    1    1    2    0    0   1  0  0  0  0  0  0  0 | 300
  1    0    2    0   -1    0    0   0  1  0  0  0  0  0  0 |   2
-6.03 -9.04 -6.52 -8.28 -14.42 -9.58 -12.32  0  0  0  0  0  0  0  0 | 0
Local tableau at node 4:
x12  x23  x15  x14  x25  x34  x45  s1 s2 s3 s4 s5 s6 s7 s8 |  B
  0    0    0    0    8    1    1   0  0  0  0  0  0  1  0 | 300
-6.03 -9.04 -6.52 -8.28 -14.42 -9.58 -12.32  0  0  0  0  0  0  0  0 | 0
Column pivot selection is done at each node.
Distributed Row Pivot Selection
Protocol Push-Min (gossip-based): a minimum-estimation problem
At iteration t-1, the values {mr} are sent to node i
mi^t = min({mr}, node i's current row pivot)
Termination: all nodes have exactly the same minimum value
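A toy simulation of a Push-Min style gossip round: each node repeatedly pushes its current minimum to a random neighbor and keeps the smallest value it has seen. This is an illustrative sketch, not the paper's exact protocol; the ring network, pivot values, and random seed are made up.

```python
import random

random.seed(7)  # fixed seed so the run is reproducible

def push_min(graph, values, max_rounds=50):
    """Gossip-based minimum estimation (Push-Min sketch).
    graph: node -> list of neighbors; values: each node's local
    row-pivot candidate. Returns the estimates and rounds used."""
    est = dict(values)
    for round_no in range(1, max_rounds + 1):
        inbox = {v: [] for v in graph}
        for v in graph:                      # each node pushes its current min
            target = random.choice(graph[v])
            inbox[target].append(est[v])
        for v in graph:                      # each node keeps the smallest seen
            est[v] = min([est[v]] + inbox[v])
        if len(set(est.values())) == 1:      # all nodes agree: terminate
            return est, round_no
    return est, max_rounds

# Hypothetical 5-node ring, echoing the slide's example network
ring = {i: [(i - 1) % 5, (i + 1) % 5] for i in range(5)}
pivots = {0: 14.0, 1: 14.0, 2: 6.0, 3: 9.0, 4: 11.0}
est, rounds = push_min(ring, pivots)
print(est, rounds)  # every node converges to the global minimum, 6.0
```

Because a node never discards a smaller value once received, the set of nodes holding the global minimum only grows, which is why the "epidemic" spreads so quickly.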
Analysis of Protocol Push-Min
Based on the spread of an epidemic in a large population
Susceptible, infected, and dead nodes
The "epidemic" spreads exponentially fast
Comments and Discussion
Assume η nodes in the network
Communication complexity: O(number of simplex iterations × η)
In the worst case, simplex may require an exponential number of iterations
For most practical purposes it is λm (λ < 4), where m is the number of constraints
Road Map
Motivation
Related Work
Framing a Linear Programming Problem
The Simplex Algorithm
The Distributed Simplex Algorithm
Experimental Results
Conclusions and Directions for Future Work
Experimental Results
Artificial data set: simulated constraint matrices at each node
Used the Distributed Data Mining Toolkit (DDMT), developed at the University of Maryland, Baltimore County (UMBC), to simulate the network structure
Two evaluation metrics:
Total Communication Cost in the network (TCC)
Average Communication Cost per Node (ACCN)
Communication Cost
[Figure: Average Communication Cost per Node versus number of nodes in the network]
More Experimental Results
[Figures: TCC versus number of variables at each node; TCC versus number of constraints at each node]
Conclusions and Future Work
Resource management and pattern recognition present formidable challenges on distributed systems
We present a distributed algorithm for resource management based on the simplex algorithm
We test our algorithm on simulated data
Future work:
Incorporate the dynamics of the network
Test the algorithm on a real distributed network
Study the effect of the size and structure of the network on the mining results
Examine the trade-off between accuracy and communication cost incurred before and after using distributed simplex on a mining task such as classification or clustering
Selected Bibliography
G. B. Dantzig, "Linear Programming and Extensions", Princeton University Press, Princeton, NJ, 1963.
Kargupta and Chan, "Advances in Distributed and Parallel Knowledge Discovery", AAAI Press, Menlo Park, CA, 2000.
A. L. Turinsky, "Balancing Cost and Accuracy in Distributed Data Mining", Ph.D. Thesis, University of Illinois at Chicago, 2002.
Haimonti Dutta, "Empowering Scientific Discovery by Distributed Data Mining on the Grid Infrastructure", Ph.D. Thesis, UMBC, 2007.
Mangasarian, "Mathematical Programming in Data Mining", DMKD, Vol. 42, pp. 183-201, 1997.
Questions?