Apache Hadoop for System Administrators - USENIX
![Page 1: Apache Hadoop for System Administrators - USENIX Lisa: Leonardo Da Vinci White Elephant: Ecce Homo: …](https://reader030.vdocument.in/reader030/viewer/2022020315/5ac0489c7f8b9ac6688bf643/html5/thumbnails/1.jpg)
Apache Hadoop for System Administrators Allen Wittenauer
![Page 3: Apache Hadoop for System Administrators - USENIX Lisa: Leonardo Da Vinci White Elephant: Ecce Homo: …](https://reader030.vdocument.in/reader030/viewer/2022020315/5ac0489c7f8b9ac6688bf643/html5/thumbnails/3.jpg)
Hadoop Deployed Now?
![Page 4: Apache Hadoop for System Administrators - USENIX Lisa: Leonardo Da Vinci White Elephant: Ecce Homo: …](https://reader030.vdocument.in/reader030/viewer/2022020315/5ac0489c7f8b9ac6688bf643/html5/thumbnails/4.jpg)
Planning Hadoop Deployment?
![Page 5: Apache Hadoop for System Administrators - USENIX Lisa: Leonardo Da Vinci White Elephant: Ecce Homo: …](https://reader030.vdocument.in/reader030/viewer/2022020315/5ac0489c7f8b9ac6688bf643/html5/thumbnails/5.jpg)
Needed some place to sit before lunch?
![Page 6: Apache Hadoop for System Administrators - USENIX Lisa: Leonardo Da Vinci White Elephant: Ecce Homo: …](https://reader030.vdocument.in/reader030/viewer/2022020315/5ac0489c7f8b9ac6688bf643/html5/thumbnails/6.jpg)
An Extremely Quick & Incomplete Intro to Hadoop
![Page 7: Apache Hadoop for System Administrators - USENIX Lisa: Leonardo Da Vinci White Elephant: Ecce Homo: …](https://reader030.vdocument.in/reader030/viewer/2022020315/5ac0489c7f8b9ac6688bf643/html5/thumbnails/7.jpg)
![Page 8: Apache Hadoop for System Administrators - USENIX Lisa: Leonardo Da Vinci White Elephant: Ecce Homo: …](https://reader030.vdocument.in/reader030/viewer/2022020315/5ac0489c7f8b9ac6688bf643/html5/thumbnails/8.jpg)
Map (“transform”)
– Perl:

    @items = (1,2,3,4,5);
    sub sqr { return $_[0]**2 }   # square the argument
    print join(',', map { sqr($_) } @items);
    # prints 1,4,9,16,25

– Python:

    items = [1,2,3,4,5]
    def sqr(x):
        return x**2
    print(list(map(sqr, items)))
    # prints [1, 4, 9, 16, 25]
![Page 9: Apache Hadoop for System Administrators - USENIX Lisa: Leonardo Da Vinci White Elephant: Ecce Homo: …](https://reader030.vdocument.in/reader030/viewer/2022020315/5ac0489c7f8b9ac6688bf643/html5/thumbnails/9.jpg)
Reduce (“compress” or “fold”)
– Perl:

    use List::Util qw/reduce/;
    @items = (1,4,9,16,25);
    print reduce { $a > $b ? $a : $b } @items;   # keep the larger of each pair
    # prints 25

– Python:

    from functools import reduce
    items = [1,4,9,16,25]
    print(reduce(lambda x, y: x if x > y else y, items))
    # prints 25
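The two slides above chain naturally: map transforms every item, then reduce folds the results down to one value. A minimal Python sketch, reusing the slides' `sqr` and "keep the larger" lambda; the helper name `map_then_reduce` is made up for illustration and is not part of Hadoop's API:

```python
from functools import reduce


def sqr(x):
    """Map step from the slide: square one item."""
    return x ** 2


def map_then_reduce(items):
    """Apply the map step to every item, then fold to the largest result."""
    mapped = list(map(sqr, items))
    largest = reduce(lambda x, y: x if x > y else y, mapped)
    return mapped, largest


if __name__ == "__main__":
    print(map_then_reduce([1, 2, 3, 4, 5]))
```

Hadoop runs the same two phases, but spreads the map and reduce calls across many machines.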
![Page 10: Apache Hadoop for System Administrators - USENIX Lisa: Leonardo Da Vinci White Elephant: Ecce Homo: …](https://reader030.vdocument.in/reader030/viewer/2022020315/5ac0489c7f8b9ac6688bf643/html5/thumbnails/10.jpg)
NEVER GONNA
GIVE YOU UP
![Page 11: Apache Hadoop for System Administrators - USENIX Lisa: Leonardo Da Vinci White Elephant: Ecce Homo: …](https://reader030.vdocument.in/reader030/viewer/2022020315/5ac0489c7f8b9ac6688bf643/html5/thumbnails/11.jpg)
Hadoop(‘common’ or ‘core’)
MapReduce HDFS
![Page 12: Apache Hadoop for System Administrators - USENIX Lisa: Leonardo Da Vinci White Elephant: Ecce Homo: …](https://reader030.vdocument.in/reader030/viewer/2022020315/5ac0489c7f8b9ac6688bf643/html5/thumbnails/12.jpg)
Hadoop(‘common’ or ‘core’)
MapReduce S3
![Page 13: Apache Hadoop for System Administrators - USENIX Lisa: Leonardo Da Vinci White Elephant: Ecce Homo: …](https://reader030.vdocument.in/reader030/viewer/2022020315/5ac0489c7f8b9ac6688bf643/html5/thumbnails/13.jpg)
Hadoop(‘common’ or ‘core’)
MapReduce Gluster
![Page 14: Apache Hadoop for System Administrators - USENIX Lisa: Leonardo Da Vinci White Elephant: Ecce Homo: …](https://reader030.vdocument.in/reader030/viewer/2022020315/5ac0489c7f8b9ac6688bf643/html5/thumbnails/14.jpg)
Hadoop(‘common’ or ‘core’)
HBase HDFS
![Page 15: Apache Hadoop for System Administrators - USENIX Lisa: Leonardo Da Vinci White Elephant: Ecce Homo: …](https://reader030.vdocument.in/reader030/viewer/2022020315/5ac0489c7f8b9ac6688bf643/html5/thumbnails/15.jpg)
[Diagram: one NameNode above three DataNodes; each DataNode keeps its data (D) on local ext4 file systems]
![Page 16: Apache Hadoop for System Administrators - USENIX Lisa: Leonardo Da Vinci White Elephant: Ecce Homo: …](https://reader030.vdocument.in/reader030/viewer/2022020315/5ac0489c7f8b9ac6688bf643/html5/thumbnails/16.jpg)
[Diagram: one JobTracker above three TaskTrackers; each TaskTracker runs a mix of map (M) and reduce (R) tasks]
![Page 17: Apache Hadoop for System Administrators - USENIX Lisa: Leonardo Da Vinci White Elephant: Ecce Homo: …](https://reader030.vdocument.in/reader030/viewer/2022020315/5ac0489c7f8b9ac6688bf643/html5/thumbnails/17.jpg)
![Page 18: Apache Hadoop for System Administrators - USENIX Lisa: Leonardo Da Vinci White Elephant: Ecce Homo: …](https://reader030.vdocument.in/reader030/viewer/2022020315/5ac0489c7f8b9ac6688bf643/html5/thumbnails/18.jpg)
[Diagram: NameNode and JobTracker as masters over five compute nodes; each node runs a DataNode (DN) and a TaskTracker (TT), drawn with four D (data) boxes, two M (map) boxes, and two R (reduce) boxes]
![Page 19: Apache Hadoop for System Administrators - USENIX Lisa: Leonardo Da Vinci White Elephant: Ecce Homo: …](https://reader030.vdocument.in/reader030/viewer/2022020315/5ac0489c7f8b9ac6688bf643/html5/thumbnails/19.jpg)
Hadoop isn’t designed for system administrators and/or support staff.
![Page 20: Apache Hadoop for System Administrators - USENIX Lisa: Leonardo Da Vinci White Elephant: Ecce Homo: …](https://reader030.vdocument.in/reader030/viewer/2022020315/5ac0489c7f8b9ac6688bf643/html5/thumbnails/20.jpg)
“Hadoop is not a developer problem; it’s an operations problem.”
-- Hadoop vendor ex-employee
![Page 21: Apache Hadoop for System Administrators - USENIX Lisa: Leonardo Da Vinci White Elephant: Ecce Homo: …](https://reader030.vdocument.in/reader030/viewer/2022020315/5ac0489c7f8b9ac6688bf643/html5/thumbnails/21.jpg)
![Page 22: Apache Hadoop for System Administrators - USENIX Lisa: Leonardo Da Vinci White Elephant: Ecce Homo: …](https://reader030.vdocument.in/reader030/viewer/2022020315/5ac0489c7f8b9ac6688bf643/html5/thumbnails/22.jpg)
Don’t Make Assumptions
![Page 23: Apache Hadoop for System Administrators - USENIX Lisa: Leonardo Da Vinci White Elephant: Ecce Homo: …](https://reader030.vdocument.in/reader030/viewer/2022020315/5ac0489c7f8b9ac6688bf643/html5/thumbnails/23.jpg)
tail’ing the logs won’t tell you the whole story.
![Page 24: Apache Hadoop for System Administrators - USENIX Lisa: Leonardo Da Vinci White Elephant: Ecce Homo: …](https://reader030.vdocument.in/reader030/viewer/2022020315/5ac0489c7f8b9ac6688bf643/html5/thumbnails/24.jpg)
![Page 25: Apache Hadoop for System Administrators - USENIX Lisa: Leonardo Da Vinci White Elephant: Ecce Homo: …](https://reader030.vdocument.in/reader030/viewer/2022020315/5ac0489c7f8b9ac6688bf643/html5/thumbnails/25.jpg)
Monitor the masters!
![Page 26: Apache Hadoop for System Administrators - USENIX Lisa: Leonardo Da Vinci White Elephant: Ecce Homo: …](https://reader030.vdocument.in/reader030/viewer/2022020315/5ac0489c7f8b9ac6688bf643/html5/thumbnails/26.jpg)
![Page 27: Apache Hadoop for System Administrators - USENIX Lisa: Leonardo Da Vinci White Elephant: Ecce Homo: …](https://reader030.vdocument.in/reader030/viewer/2022020315/5ac0489c7f8b9ac6688bf643/html5/thumbnails/27.jpg)
![Page 28: Apache Hadoop for System Administrators - USENIX Lisa: Leonardo Da Vinci White Elephant: Ecce Homo: …](https://reader030.vdocument.in/reader030/viewer/2022020315/5ac0489c7f8b9ac6688bf643/html5/thumbnails/28.jpg)
![Page 29: Apache Hadoop for System Administrators - USENIX Lisa: Leonardo Da Vinci White Elephant: Ecce Homo: …](https://reader030.vdocument.in/reader030/viewer/2022020315/5ac0489c7f8b9ac6688bf643/html5/thumbnails/29.jpg)
![Page 30: Apache Hadoop for System Administrators - USENIX Lisa: Leonardo Da Vinci White Elephant: Ecce Homo: …](https://reader030.vdocument.in/reader030/viewer/2022020315/5ac0489c7f8b9ac6688bf643/html5/thumbnails/30.jpg)
LinkedIn’s Configuration
– 30+ health checks per grid: masters, canary report, daily fsck, etc.
– 10+ health checks per DC: LDAP, Kerberos, etc.
– Cross-DC Nagios server checks
– Warn: 5% down nodes
– Panic: 30% down
– HDFS: 20% free space
– Gateway home dir: 10% free space
– ...
[Diagram: Nagios watching the compute nodes and NN, JT, AZ, GW, ZK, VD]
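The warn and panic thresholds above boil down to a small classification step. A sketch in Python, assuming the 5% and 30% figures from the slide; the function name, and treating the thresholds as inclusive, are assumptions for illustration rather than LinkedIn's actual Nagios check:

```python
def classify_down_nodes(down, total, warn_pct=5.0, panic_pct=30.0):
    """Return OK, WARN, or PANIC based on the percentage of down nodes."""
    if total <= 0:
        raise ValueError("total must be positive")
    pct = 100.0 * down / total
    if pct >= panic_pct:  # assumed inclusive
        return "PANIC"
    if pct >= warn_pct:   # assumed inclusive
        return "WARN"
    return "OK"


if __name__ == "__main__":
    # Hypothetical cluster sizes, for illustration only.
    for down in (10, 60, 350):
        print(down, classify_down_nodes(down, 1000))
```

Keeping the thresholds as parameters makes it easy to use different values per grid.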
![Page 31: Apache Hadoop for System Administrators - USENIX Lisa: Leonardo Da Vinci White Elephant: Ecce Homo: …](https://reader030.vdocument.in/reader030/viewer/2022020315/5ac0489c7f8b9ac6688bf643/html5/thumbnails/31.jpg)
Health Check Script
– “OK” - good status
– “ERROR (message)” - bad status
– Configured via mapred.healthChecker.script.path
Consider checking ...
– critical software
– ownership & permissions
– network connection speed
– drive count
– file system space
– RO file systems
– IO errors
– missing memory
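A health check along these lines could look like the sketch below. It covers only two of the suggested checks (file system space and read-only file systems). The function names, the 10% default, and reading the slide's "(message)" as a free-form message after `ERROR` are assumptions for illustration, not Hadoop's own code:

```python
import shutil
import tempfile


def check_free_space(path, min_free_pct=10.0):
    """Return a problem description if path is low on space, else None."""
    try:
        usage = shutil.disk_usage(path)
    except OSError as exc:
        return "cannot stat %s: %s" % (path, exc)
    if usage.total == 0:
        return "%s reports zero capacity" % path
    free_pct = 100.0 * usage.free / usage.total
    if free_pct < min_free_pct:
        return "%s has only %.1f%% free" % (path, free_pct)
    return None


def check_writable(path):
    """Return a problem description if a temp file cannot be created in path."""
    try:
        with tempfile.TemporaryFile(dir=path):
            pass
    except OSError as exc:
        return "%s is not writable: %s" % (path, exc)
    return None


def health_status(paths, min_free_pct=10.0):
    """Return 'OK', or 'ERROR <message>' for the first problem found."""
    for path in paths:
        problem = check_free_space(path, min_free_pct) or check_writable(path)
        if problem:
            return "ERROR " + problem
    return "OK"


if __name__ == "__main__":
    # The temp directory stands in for the real task and data directories.
    print(health_status([tempfile.gettempdir()]))
```

Reporting only the first problem keeps the output to a single status line.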
![Page 32: Apache Hadoop for System Administrators - USENIX Lisa: Leonardo Da Vinci White Elephant: Ecce Homo: …](https://reader030.vdocument.in/reader030/viewer/2022020315/5ac0489c7f8b9ac6688bf643/html5/thumbnails/32.jpg)
![Page 33: Apache Hadoop for System Administrators - USENIX Lisa: Leonardo Da Vinci White Elephant: Ecce Homo: …](https://reader030.vdocument.in/reader030/viewer/2022020315/5ac0489c7f8b9ac6688bf643/html5/thumbnails/33.jpg)
Use the tools most of your users’ code is written in!
Pig
– testfile:

    100

– Code:

    A = load 'testfile' using PigStorage(',') as (i: int);
    B = foreach A generate i;
    C = distinct B;
    dump C;

– Output:

    (100)
![Page 34: Apache Hadoop for System Administrators - USENIX Lisa: Leonardo Da Vinci White Elephant: Ecce Homo: …](https://reader030.vdocument.in/reader030/viewer/2022020315/5ac0489c7f8b9ac6688bf643/html5/thumbnails/34.jpg)
Reactive
![Page 35: Apache Hadoop for System Administrators - USENIX Lisa: Leonardo Da Vinci White Elephant: Ecce Homo: …](https://reader030.vdocument.in/reader030/viewer/2022020315/5ac0489c7f8b9ac6688bf643/html5/thumbnails/35.jpg)
Proactive
![Page 36: Apache Hadoop for System Administrators - USENIX Lisa: Leonardo Da Vinci White Elephant: Ecce Homo: …](https://reader030.vdocument.in/reader030/viewer/2022020315/5ac0489c7f8b9ac6688bf643/html5/thumbnails/36.jpg)
Resource Controls
![Page 37: Apache Hadoop for System Administrators - USENIX Lisa: Leonardo Da Vinci White Elephant: Ecce Homo: …](https://reader030.vdocument.in/reader030/viewer/2022020315/5ac0489c7f8b9ac6688bf643/html5/thumbnails/37.jpg)
JobTracker Memory Resource Controls
– Limit jobs stored in JT heap: mapred.jobtracker.completeuserjobs.maximum
– Limit total # of job tasks: mapred.jobtracker.maxtasks.per.job
Job Memory Resource Controls
– Scheduler-level: mapred.cluster.*.memory.mb
– TT-level: auto-calculated based upon MR slot counts & scheduler-level settings
– MR Job-level: mapred.job.*.memory.mb
– Linux only: /proc memory calculator and task killer
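The "auto-calculated" TT-level figure can be illustrated with a little arithmetic. The sketch below assumes the TaskTracker total is slots times per-slot memory summed over map and reduce slots, and that a task asking for more than one slot's worth of memory takes up several slots. Both rules, and the function names, are assumptions for illustration; check them against your Hadoop version:

```python
import math


def tasktracker_memory_mb(map_slots, reduce_slots, map_slot_mb, reduce_slot_mb):
    """Assumed TT-level total: slot counts times scheduler-level slot sizes."""
    return map_slots * map_slot_mb + reduce_slots * reduce_slot_mb


def slots_needed(job_task_mb, slot_mb):
    """Assumed rule: a task occupies enough whole slots to cover its request."""
    return max(1, math.ceil(job_task_mb / slot_mb))


if __name__ == "__main__":
    # Hypothetical node: 4 map slots at 1024 MB, 2 reduce slots at 2048 MB.
    print(tasktracker_memory_mb(4, 2, 1024, 2048))
    # A job asking for 3000 MB per map task on 1024 MB slots.
    print(slots_needed(3000, 1024))
```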
![Page 38: Apache Hadoop for System Administrators - USENIX Lisa: Leonardo Da Vinci White Elephant: Ecce Homo: …](https://reader030.vdocument.in/reader030/viewer/2022020315/5ac0489c7f8b9ac6688bf643/html5/thumbnails/38.jpg)
“I set the heap to 1G but my process ran out of memory?”
![Page 39: Apache Hadoop for System Administrators - USENIX Lisa: Leonardo Da Vinci White Elephant: Ecce Homo: …](https://reader030.vdocument.in/reader030/viewer/2022020315/5ac0489c7f8b9ac6688bf643/html5/thumbnails/39.jpg)
Treat HDFS like any other multi-tenant FS
![Page 40: Apache Hadoop for System Administrators - USENIX Lisa: Leonardo Da Vinci White Elephant: Ecce Homo: …](https://reader030.vdocument.in/reader030/viewer/2022020315/5ac0489c7f8b9ac6688bf643/html5/thumbnails/40.jpg)
Quota everything
– Yes, including /tmp
– No “show me all quotas” functionality
– Commands: dfsadmin -setQuota, dfsadmin -setSpaceQuota
Be consistent:
– /user/* all get same quota
Be flexible:
– Make another dir for users to store big projects (e.g., /project)
Be smart:
– Have a policy that content in /tmp gets deleted after X days. Automate this!
– Build reporting that shows files that are replicated less than 3 times
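Giving every /user/* directory the same quota is easy to script. The sketch below only builds the `dfsadmin -setQuota` and `dfsadmin -setSpaceQuota` command lines and does not run them. The helper name, the `hadoop` launcher prefix, the directory names, and the quota values are all hypothetical:

```python
def quota_commands(user_dirs, name_quota, space_quota):
    """Build one -setQuota and one -setSpaceQuota command per directory."""
    commands = []
    for directory in sorted(user_dirs):
        commands.append(
            ["hadoop", "dfsadmin", "-setQuota", str(name_quota), directory])
        commands.append(
            ["hadoop", "dfsadmin", "-setSpaceQuota", str(space_quota), directory])
    return commands


if __name__ == "__main__":
    # Hypothetical users; 100000 names and 1 TiB of raw space each.
    for cmd in quota_commands(["/user/bob", "/user/alice"], 100000, 1024 ** 4):
        print(" ".join(cmd))
```

Because there is no "show me all quotas" command, keeping the intended values in a script like this also documents what was set.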
![Page 41: Apache Hadoop for System Administrators - USENIX Lisa: Leonardo Da Vinci White Elephant: Ecce Homo: …](https://reader030.vdocument.in/reader030/viewer/2022020315/5ac0489c7f8b9ac6688bf643/html5/thumbnails/41.jpg)
Compute Node Disk Partitioning as Protective Measure
![Page 42: Apache Hadoop for System Administrators - USENIX Lisa: Leonardo Da Vinci White Elephant: Ecce Homo: …](https://reader030.vdocument.in/reader030/viewer/2022020315/5ac0489c7f8b9ac6688bf643/html5/thumbnails/42.jpg)
root partitioning:
– 20 GB /, ...
– 200 GB task space
– (rest) HDFS
non-root partitioning:
– 5 GB swap
– 200 GB task space
– (rest) HDFS
![Page 43: Apache Hadoop for System Administrators - USENIX Lisa: Leonardo Da Vinci White Elephant: Ecce Homo: …](https://reader030.vdocument.in/reader030/viewer/2022020315/5ac0489c7f8b9ac6688bf643/html5/thumbnails/43.jpg)
Security!
![Page 44: Apache Hadoop for System Administrators - USENIX Lisa: Leonardo Da Vinci White Elephant: Ecce Homo: …](https://reader030.vdocument.in/reader030/viewer/2022020315/5ac0489c7f8b9ac6688bf643/html5/thumbnails/44.jpg)
Queue Level ACLs (mapred-queue-acls.xml)
– users
– groups
– netgroups
Service Level ACLs (hadoop-policy.xml)
– hosts
– users
– groups
– netgroups
– Limitation: Web services are all or nothing! :(
– Be aware: Hadoop uses ephemeral ports all over the place! :(
![Page 45: Apache Hadoop for System Administrators - USENIX Lisa: Leonardo Da Vinci White Elephant: Ecce Homo: …](https://reader030.vdocument.in/reader030/viewer/2022020315/5ac0489c7f8b9ac6688bf643/html5/thumbnails/45.jpg)
Kerberos!
![Page 46: Apache Hadoop for System Administrators - USENIX Lisa: Leonardo Da Vinci White Elephant: Ecce Homo: …](https://reader030.vdocument.in/reader030/viewer/2022020315/5ac0489c7f8b9ac6688bf643/html5/thumbnails/46.jpg)
![Page 47: Apache Hadoop for System Administrators - USENIX Lisa: Leonardo Da Vinci White Elephant: Ecce Homo: …](https://reader030.vdocument.in/reader030/viewer/2022020315/5ac0489c7f8b9ac6688bf643/html5/thumbnails/47.jpg)
[Diagram: cross-realm Kerberos setup]
Corp IT Active Directory
@CORP
Client Node
Grid Realm @GRID
krbtgt/GRID@CORP
Hadoop Services
krbtgt/user@CORP, krbtgt/GRID@CORP
krbtgt/host@GRID, krbtgt/service@GRID
Password
![Page 48: Apache Hadoop for System Administrators - USENIX Lisa: Leonardo Da Vinci White Elephant: Ecce Homo: …](https://reader030.vdocument.in/reader030/viewer/2022020315/5ac0489c7f8b9ac6688bf643/html5/thumbnails/48.jpg)
![Page 49: Apache Hadoop for System Administrators - USENIX Lisa: Leonardo Da Vinci White Elephant: Ecce Homo: …](https://reader030.vdocument.in/reader030/viewer/2022020315/5ac0489c7f8b9ac6688bf643/html5/thumbnails/49.jpg)
![Page 50: Apache Hadoop for System Administrators - USENIX Lisa: Leonardo Da Vinci White Elephant: Ecce Homo: …](https://reader030.vdocument.in/reader030/viewer/2022020315/5ac0489c7f8b9ac6688bf643/html5/thumbnails/50.jpg)
![Page 51: Apache Hadoop for System Administrators - USENIX Lisa: Leonardo Da Vinci White Elephant: Ecce Homo: …](https://reader030.vdocument.in/reader030/viewer/2022020315/5ac0489c7f8b9ac6688bf643/html5/thumbnails/51.jpg)
http://data.linkedin.com/opensource/white-elephant
![Page 52: Apache Hadoop for System Administrators - USENIX Lisa: Leonardo Da Vinci White Elephant: Ecce Homo: …](https://reader030.vdocument.in/reader030/viewer/2022020315/5ac0489c7f8b9ac6688bf643/html5/thumbnails/52.jpg)
Fonzi: http://www.flickr.com/photos/elzey/7224689810
Captain Obvious by artist Stuart McGhee: http://stuartmcghee.com/
Ant on flower: http://www.flickr.com/photos/bolonski/6116358907
Ant Colony: http://www.flickr.com/photos/klearchos/2821230516
Ant Queen: http://commons.wikimedia.org/wiki/File:Camponotus_crispulus_queen_ant.jpg
Canary: http://www.flickr.com/photos/nathan_and_jenny/2454127424
Mona Lisa: Leonardo Da Vinci
White Elephant: http://data.linkedin.com/opensource/white-elephant
Ecce Homo:
– Elías García Martínez (original)
– Cecilia Giménez (restored)
![Page 53: Apache Hadoop for System Administrators - USENIX Lisa: Leonardo Da Vinci White Elephant: Ecce Homo: …](https://reader030.vdocument.in/reader030/viewer/2022020315/5ac0489c7f8b9ac6688bf643/html5/thumbnails/53.jpg)
Contact:
– Twitter: @_a__w_
– Email: [email protected]
More info:
– Quora: www.quora.com/user/allenwittenauer
– SlideShare: www.slideshare.net/allenwittenauer
Thanks!