how we prepared etu hadoop competition 2014
TRANSCRIPT
How We Prepared Etu Hadoop Competition 2014
Study Hsueh
2014/06/26
That Year, the Hadoop We Chased Together
That Year, How We Got Lucky and Won EHC
Background
• qrtt1
  • Java & AWS Expert
• Study
  • Java Fan
• Lu
  • Machine Learning Beauty
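Hadoop Experience
• qrtt1
  • Had been saying he would try Hadoop since the 1.x days, but never got around to it
• Study
  • Had installed CDH and knew a little about Hadoop 1.x
  • Had integrated with Hive and used Sqoop to move RDBMS data
• Lu
  • Had heard other people talk about Hadoop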
Preliminary Round
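Division of Labor Before the Preliminary Round
• qrtt1
  • Set up a Hadoop environment by hand
• Study
  • Prepare Bigtop RPMs (hosted on S3)
  • Modify the Vagrantfile
  • Testing
• Lu
  • Focus on learning Linux and setting up Hadoop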
Division of Labor on the Day of the Preliminary Round
• qrtt1
  • Analyze the scoring program
• Study
  • Run the Vagrant script
Preliminary Round Result
• We forgot to set the hostname, which caused HBase to misbehave; fortunately, we still made it to the finals :)
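The slide does not say exactly how the missing hostname broke HBase. A minimal Python sketch of the kind of pre-flight check that could catch it is shown here, assuming (this is our assumption, not the speakers') that the typical symptom is a node name that does not resolve, or resolves to a loopback address. The function name `check_hostname` and its `resolve` parameter are hypothetical.

```python
import socket


def check_hostname(name=None, resolve=socket.gethostbyname):
    """Return (name, ip, problems) for a basic hostname sanity check.

    `resolve` is injectable so the check can be exercised without DNS.
    """
    name = name or socket.gethostname()
    try:
        ip = resolve(name)
    except OSError as exc:  # socket.gaierror and socket.herror are OSError subclasses
        return name, None, [f"{name} does not resolve: {exc}"]

    problems = []
    if ip.startswith("127."):
        # Assumption: other nodes cannot reach a service advertised on loopback.
        problems.append(f"{name} resolves to loopback address {ip}")
    return name, ip, problems
```

Calling `check_hostname()` with no arguments on each node before starting the services would report the local hostname, the address it resolves to, and any problems found.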
Finals
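Division of Labor Before the Finals Briefing
• qrtt1
  • Set up a Hadoop cluster by hand
  • Set up a KDC
  • HA, Kerberos Setup & Usage
• Study
  • Prepare a test machine similar to the competition environment
  • Prepare CDH & CentOS repository mirrors
  • Try out various Hadoop distributions (CDH, HDP, and Bigtop)
  • Performance Tuning & Testing
  • HA & Kerberos Usage
• Lu
  • Set up a Hadoop cluster by hand
  • Test Hadoop parameters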
Test Machine v1
• Type 1 Hypervisor: VMware ESXi 5.5
• CPU: Intel i5 760
• RAM: 16 GB
• HDD: 2 TB * 2
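The Hadoop Distribution We Chose
• We went with CDH
• Pros
  • Hadoop parameters are easy to modify & deploy
  • Log locations are fixed
• Cons
  • Cloudera Management Service is very resource-hungry (it can be turned off)
  • Installation is time-consuming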
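Division of Labor After the Finals Briefing
• qrtt1
  • Performance Testing
• Study
  • Adjust the test machine to match the competition environment as closely as possible
  • Prepare the VMs to use on competition day
  • Performance Testing
• Lu
  • Test Hadoop parameters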
Test Machine v2
• Host: CentOS 6.5 x86_64 Desktop
• Type 2 Hypervisor: Oracle VirtualBox 4.3.12
• CPU: Intel i5 760
• RAM: 32 GB
• HDD: 2 TB * 4
The Day Before the Finals...
• The more we prepared, the more we found there still was to prepare
• We were tired
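Division of Labor on the Day of the Finals
• qrtt1
  • KDC Setup
  • Watch Log
  • Run the scoring program
• Study
  • Prepare the software and hardware environment
  • Help with troubleshooting
• Lu
  • Adjust Hadoop parameters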
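What We Knew Before the Finals
• A single large VM is several times faster than four small VMs
• By default, CDH does not allow the system user hdfs to perform certain operations
• VirtualBox
  • JBOD made no significant difference
  • Much slower than ESXi VMs, and unresponsive from time to time
  • Changing Shared Folder permissions has no effect
  • VM-to-VM transfer speed is about 30 MB/s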
Strategy
• First, make sure we score on every item
• Start tuning only if another team pulls ahead on score
  • VM tuning
  • Hadoop parameter tuning
  • ramfs
  • Make Hadoop cluster run like a single-node Hadoop
  • JBOD
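Problems We Ran Into During the Finals
• The VMs were abnormally slow
• HDFS had to take 30 * 3 GB of data, but the VM disks we had prepared were only 80 GB
• During HA failover the scoring program waited only 10 seconds, too short for the NameNode to switch over
• HBase was run as the system user hdfs, which caused permission errors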
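The disk-space problem in the list above comes down to simple arithmetic. The sketch below reads "30 * 3 GB" as 30 items of 3 GB each, which already exceeds an 80 GB disk before any replication. The replication factor and VM count are hypothetical parameters (the slides do not state them), and space reserved for non-HDFS use is ignored.

```python
def required_gb(items, gb_per_item, replication=1):
    """Raw HDFS space needed, in GB, for `items` files of `gb_per_item` each."""
    return items * gb_per_item * replication


def shortfall_gb(required, disk_gb_per_vm, vm_count=1):
    """How many GB are missing (0 if the data fits)."""
    return max(0, required - disk_gb_per_vm * vm_count)


# 30 items of 3 GB on one 80 GB disk, with no replication (assumed values).
need = required_gb(30, 3)
print(need, shortfall_gb(need, 80))
```

Running this kind of estimate against the announced workload before building the VM images would have flagged the undersized disks early.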
Troubleshooting
• The VMs were abnormally slow
  • Cause: each VM was configured with too many cores (12 cores)
  • Fix: change each VM to 4 cores
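The slide does not explain why 12 cores per VM was too many. One common explanation, offered here as an assumption, is that the VMs together asked for more virtual CPUs than the host had physical cores. A small sketch of that check follows; the VM count and host core count used below are hypothetical, since the slides give neither.

```python
def vcpu_ratio(vcpus_per_vm, vm_count, host_cores):
    """Ratio of virtual CPUs requested by all VMs to physical host cores."""
    if host_cores <= 0:
        raise ValueError("host_cores must be positive")
    return vcpus_per_vm * vm_count / host_cores


# Hypothetical host: 16 physical cores running 4 VMs.
print(vcpu_ratio(12, 4, 16))  # before the fix: 12 cores per VM
print(vcpu_ratio(4, 4, 16))   # after the fix: 4 cores per VM
```

A ratio well above 1 means the hypervisor has to time-slice virtual CPUs onto physical cores, which is one plausible source of the slowdown described above.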
Troubleshooting
• HDFS had to take 30 * 3 GB of data, but the VM disks we had prepared were only 80 GB
  • Mount new virtual disks
  • Stop Kerberos
  • Reformat HDFS
  • Start Kerberos
  • In the end this broke HBase
  • Restored the VMs from a snapshot
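Troubleshooting
• During HA failover the scoring program waited only 10 seconds, too short for the NameNode to switch over
  • Suspend the scoring program with Ctrl+Z
  • Confirm that failover has completed, then resume the scoring program with fg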
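The Ctrl+Z and fg trick works because a suspended process stops running until it receives a continue signal. The sketch below shows the same idea done programmatically on a POSIX system. It is not what the team ran: it suspends the child right after launch rather than partway through, and it uses SIGSTOP where Ctrl+Z sends SIGTSTP. The helper `run_paused_until` and its `ready` callback are hypothetical; in practice `ready` would check that failover has finished.

```python
import signal
import subprocess
import sys
import time


def run_paused_until(cmd, ready, poll_seconds=1.0):
    """Start cmd, suspend it at once, and resume it when ready() is true.

    Returns the child's exit code. POSIX only.
    """
    proc = subprocess.Popen(cmd)
    proc.send_signal(signal.SIGSTOP)      # suspend, similar to Ctrl+Z
    try:
        while not ready():
            time.sleep(poll_seconds)
    finally:
        proc.send_signal(signal.SIGCONT)  # resume, similar to fg
    return proc.wait()


if __name__ == "__main__":
    # Stand-in child process; a real run would launch the scoring program.
    child = [sys.executable, "-c", "print('scoring program runs now')"]
    print("exit code:", run_paused_until(child, ready=lambda: True))
```

The `finally` clause makes sure the child is resumed even if the readiness check raises, so a suspended process is never left behind.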
Troubleshooting
• HBase was run as the system user hdfs, which caused permission errors
  • Add a new Kerberos user
  • Grant that user permission to run MapReduce, HBase, and HDFS
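Conclusion
• The competition involved many trade-offs, and much of what we prepared was never used
• The competition ended before we had played our trump card
• Perhaps we won by a small margin only because we knew Linux better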
Even a layman who works hard will one day become a real man!!
References
• Etu Hadoop Competition 2014
  • http://ehc.etusolution.com/index.php/tw/
• A Layman's Hadoop Deployment Contest (Part 1)
  • http://www.codedata.com.tw/social-coding/contest-of-hadoop-layman-1/
• A Layman's Hadoop Deployment Contest (Part 2)
  • http://www.codedata.com.tw/social-coding/contest-of-hadoop-layman-2/