![Page 1: HUG slides on NFS and ODBC](https://reader033.vdocument.in/reader033/viewer/2022052906/558c1d94d8b42ac7528b469a/html5/thumbnails/1.jpg)
1 ©MapR Technologies
Using Standard File-Based Applications and SQL-Based Tools with Hadoop
![Page 2: HUG slides on NFS and ODBC](https://reader033.vdocument.in/reader033/viewer/2022052906/558c1d94d8b42ac7528b469a/html5/thumbnails/2.jpg)
Who am I?
§ Keys Botzum
§ [email protected]
§ Senior Principal Technologist, MapR Technologies
http://www.mapr.com/company/events/speaking/dc-hug-9-18-12
![Page 3: HUG slides on NFS and ODBC](https://reader033.vdocument.in/reader033/viewer/2022052906/558c1d94d8b42ac7528b469a/html5/thumbnails/3.jpg)
The MapR Distribution for Apache Hadoop
§ The open, enterprise-grade distribution for Apache Hadoop
  – Open source components
    • Hive, Pig, Cascading, HBase, ZooKeeper, Oozie, Flume, Sqoop, Whirr, …
  – Enhancements to make Hadoop more open and enterprise-grade
§ Growing fast and a recognized leader
![Page 4: HUG slides on NFS and ODBC](https://reader033.vdocument.in/reader033/viewer/2022052906/558c1d94d8b42ac7528b469a/html5/thumbnails/4.jpg)
MapR in the Cloud
§ Available as a service with Amazon Elastic MapReduce (EMR)
  – http://aws.amazon.com/elasticmapreduce/mapr
§ Available as a service with Google Compute Engine
![Page 5: HUG slides on NFS and ODBC](https://reader033.vdocument.in/reader033/viewer/2022052906/558c1d94d8b42ac7528b469a/html5/thumbnails/5.jpg)
MapR
Make Hadoop more open
Make Hadoop enterprise-grade
This presentation
• High Availability
• Scalability
• Management tools – Web, CLI, REST
• Data Protection – snapshots & mirroring
• Performance
![Page 6: HUG slides on NFS and ODBC](https://reader033.vdocument.in/reader033/viewer/2022052906/558c1d94d8b42ac7528b469a/html5/thumbnails/6.jpg)
Not All Applications Use the Hadoop APIs
Applications and libraries that use files and/or SQL
• These are not legacy applications, they are valuable applications
• 30 years, 100,000s of applications, 10,000s of libraries, 10s of programming languages
Applications and libraries that use the Hadoop APIs
![Page 7: HUG slides on NFS and ODBC](https://reader033.vdocument.in/reader033/viewer/2022052906/558c1d94d8b42ac7528b469a/html5/thumbnails/7.jpg)
Hadoop Needs Industry-Standard Interfaces
Hadoop API
• MapReduce and HBase applications
• Mostly custom-built
NFS
• File-based applications
• Supported by most operating systems
ODBC
• SQL-based tools
• Supported by most BI applications and query builders
![Page 8: HUG slides on NFS and ODBC](https://reader033.vdocument.in/reader033/viewer/2022052906/558c1d94d8b42ac7528b469a/html5/thumbnails/8.jpg)
NFS
![Page 9: HUG slides on NFS and ODBC](https://reader033.vdocument.in/reader033/viewer/2022052906/558c1d94d8b42ac7528b469a/html5/thumbnails/9.jpg)
Your Data is Important
§ HDFS-based Hadoop distributions do not (cannot) properly support NFS
§ Your data is important, it drives your business – make sure you can access it
  – Why store your data in a system which cannot be accessed by 95% of the world’s applications and libraries?
§ Access to HDFS source code != access to your data
![Page 10: HUG slides on NFS and ODBC](https://reader033.vdocument.in/reader033/viewer/2022052906/558c1d94d8b42ac7528b469a/html5/thumbnails/10.jpg)
The NFS Protocol
§ RFC 1813
§ Very simple protocol
§ Random reads/writes
  – Read `count` bytes from offset `offset` of file `file`
  – Write buffer `data` to offset `offset` of file `file`
§ HDFS does not support random writes so it cannot support NFS
    WRITE3res NFSPROC3_WRITE(WRITE3args) = 7;
    struct WRITE3args {
        nfs_fh3 file;
        offset3 offset;
        count3 count;
        stable_how stable;
        opaque data<>;
    };

    READ3res NFSPROC3_READ(READ3args) = 6;
    struct READ3args {
        nfs_fh3 file;
        offset3 offset;
        count3 count;
    };
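To make the random read/write point concrete, here is a minimal Python sketch of what the READ and WRITE arguments imply for the underlying store: a client may overwrite bytes at any offset in an existing file. It uses a local temporary file as a stand-in for a file on an NFS mount. The helper names `write_at`, `read_at`, and `demo` are illustrative and not part of any MapR or NFS API, and `os.pread`/`os.pwrite` are available on Unix-like systems only.

```python
import os
import tempfile


def write_at(path, offset, data):
    """Write bytes at an absolute offset, mirroring WRITE3args (file, offset, data)."""
    fd = os.open(path, os.O_WRONLY)
    try:
        return os.pwrite(fd, data, offset)
    finally:
        os.close(fd)


def read_at(path, offset, count):
    """Read up to count bytes from an absolute offset, mirroring READ3args."""
    fd = os.open(path, os.O_RDONLY)
    try:
        return os.pread(fd, count, offset)
    finally:
        os.close(fd)


def demo():
    # A local temp file stands in for a file on an NFS mount (assumption).
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(b"0123456789")
        path = f.name
    try:
        write_at(path, 3, b"abc")  # overwrite in the middle of the file
        return read_at(path, 0, 10), read_at(path, 2, 5)
    finally:
        os.remove(path)


if __name__ == "__main__":
    print(demo())
```

A storage layer that only allows appends cannot honor the `write_at` call in the middle of the file, which is the limitation the slide attributes to HDFS.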
![Page 11: HUG slides on NFS and ODBC](https://reader033.vdocument.in/reader033/viewer/2022052906/558c1d94d8b42ac7528b469a/html5/thumbnails/11.jpg)
Hadoop Was Designed to Support Multiple Storage Layers
[Diagram: MapReduce runs on the Hadoop FileSystem API (o.a.h.fs.FileSystem interface), which is implemented by several storage layers]
• HDFS: o.a.h.hdfs.DistributedFileSystem
• S3: o.a.h.fs.s3native.NativeS3FileSystem
• Local File System: o.a.h.fs.LocalFileSystem
• FTP: o.a.h.fs.ftp.FTPFileSystem
• MapR storage layer: com.mapr.fs.MapRFileSystem
• NFS interface
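The pluggable design can be pictured as a lookup from URI scheme to implementation class. The Python sketch below is only an illustration of that idea, not Hadoop code. The class names are the ones on the slide; the `fs.<scheme>.impl` property pattern and the scheme names (`hdfs`, `s3n`, `file`, `ftp`, `maprfs`) are assumptions based on Hadoop's configuration convention, and `resolve_filesystem` is a hypothetical helper.

```python
from urllib.parse import urlparse

# Scheme-to-class table. Class names come from the slide; the property
# names follow the assumed fs.<scheme>.impl convention.
FS_IMPL = {
    "fs.hdfs.impl": "org.apache.hadoop.hdfs.DistributedFileSystem",
    "fs.s3n.impl": "org.apache.hadoop.fs.s3native.NativeS3FileSystem",
    "fs.file.impl": "org.apache.hadoop.fs.LocalFileSystem",
    "fs.ftp.impl": "org.apache.hadoop.fs.ftp.FTPFileSystem",
    "fs.maprfs.impl": "com.mapr.fs.MapRFileSystem",
}


def resolve_filesystem(uri, conf=FS_IMPL):
    """Return the implementation class name registered for the URI's scheme."""
    scheme = urlparse(uri).scheme or "file"  # bare paths default to local files
    key = f"fs.{scheme}.impl"
    if key not in conf:
        raise ValueError(f"no FileSystem registered for scheme {scheme!r}")
    return conf[key]


if __name__ == "__main__":
    for uri in ("hdfs://namenode/data", "maprfs:///user/data", "/tmp/local"):
        print(uri, "->", resolve_filesystem(uri))
```

Because MapReduce talks only to the FileSystem interface, swapping the storage layer is a matter of registering a different implementation for a scheme.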
![Page 12: HUG slides on NFS and ODBC](https://reader033.vdocument.in/reader033/viewer/2022052906/558c1d94d8b42ac7528b469a/html5/thumbnails/12.jpg)
One NFS Gateway
What about scalability and high availability?
![Page 13: HUG slides on NFS and ODBC](https://reader033.vdocument.in/reader033/viewer/2022052906/558c1d94d8b42ac7528b469a/html5/thumbnails/13.jpg)
Multiple NFS Gateways
![Page 14: HUG slides on NFS and ODBC](https://reader033.vdocument.in/reader033/viewer/2022052906/558c1d94d8b42ac7528b469a/html5/thumbnails/14.jpg)
Multiple NFS Gateways with Load Balancing
![Page 15: HUG slides on NFS and ODBC](https://reader033.vdocument.in/reader033/viewer/2022052906/558c1d94d8b42ac7528b469a/html5/thumbnails/15.jpg)
Multiple NFS Gateways with NFS HA (VIPs)
![Page 16: HUG slides on NFS and ODBC](https://reader033.vdocument.in/reader033/viewer/2022052906/558c1d94d8b42ac7528b469a/html5/thumbnails/16.jpg)
Customer Examples: Import/Export Data
§ Network security vendor
  – Network packet captures from switches are streamed into the cluster
  – New pattern definitions are loaded into online IPS via NFS
§ Online measurement company
  – Clickstreams from application servers are streamed into the cluster
§ SaaS company
  – Exporting a database to Hadoop over NFS
§ Ad exchange
  – Bids and transactions are streamed into the cluster
![Page 17: HUG slides on NFS and ODBC](https://reader033.vdocument.in/reader033/viewer/2022052906/558c1d94d8b42ac7528b469a/html5/thumbnails/17.jpg)
Customer Examples: Productivity and Operations
§ Retailer
  – Operational scripts are easier with NFS than HDFS + MapReduce
    • chmod/chown, file system searches/greps, perl, awk, tab-complete
  – Consolidate object store with analytics
§ Credit card company
  – User and project home directories on Linux gateways
    • Local files, scripts, source code, …
    • Administrators manage quotas, snapshots/backups, …
§ Large Internet company recommendation system
  – Web servers serve MapReduce results (item relationships) directly from cluster
§ Email marketing company
  – Object store with HBase and NFS
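As a rough illustration of the "operational scripts" point, the Python sketch below performs a grep-style search and a chmod using nothing but ordinary file calls. It assumes the cluster is NFS-mounted somewhere on the local machine; the mount path in the closing comment is hypothetical, and `grep_tree` and `make_read_only` are illustrative helper names.

```python
import os
import stat


def grep_tree(root, needle):
    """Return (path, line_number, line) for each line under root containing needle."""
    hits = []
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # walk subdirectories in a deterministic order
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            with open(path, "r", errors="replace") as fh:
                for number, line in enumerate(fh, start=1):
                    if needle in line:
                        hits.append((path, number, line.rstrip("\n")))
    return hits


def make_read_only(path):
    """chmod 440: readable by owner and group, writable by nobody."""
    os.chmod(path, stat.S_IRUSR | stat.S_IRGRP)


# Hypothetical usage against an NFS-mounted cluster path:
#   for path, number, line in grep_tree("/mapr/my.cluster.com/logs", "ERROR"):
#       print(f"{path}:{number}: {line}")
```

No Hadoop client library or MapReduce job is involved; the same script works on any mounted file system.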
![Page 18: HUG slides on NFS and ODBC](https://reader033.vdocument.in/reader033/viewer/2022052906/558c1d94d8b42ac7528b469a/html5/thumbnails/18.jpg)
ODBC
![Page 19: HUG slides on NFS and ODBC](https://reader033.vdocument.in/reader033/viewer/2022052906/558c1d94d8b42ac7528b469a/html5/thumbnails/19.jpg)
ODBC
§ ODBC – Open DataBase Connectivity
  – Open standard API for accessing a SQL-based backend
  – Developed by Microsoft and Simba Technologies in 1992
§ Flagship API for SQL-based BI and reporting
  – Excel, Tableau, MicroStrategy, Crystal Reports, …
§ Advanced ODBC drivers use the latest 3.52 specification
![Page 20: HUG slides on NFS and ODBC](https://reader033.vdocument.in/reader033/viewer/2022052906/558c1d94d8b42ac7528b469a/html5/thumbnails/20.jpg)
MapR ODBC Driver
§ MapR provides a Hive ODBC 3.52 driver
  – Developed in partnership with ODBC inventor Simba Technologies
  – Compliant with latest ODBC 3.52 specification
    • 32- and 64-bit platform support
    • Windows and Linux
§ Enables direct SQL access to MapR-stored data by translating SQL to HiveQL
§ SQLizer enables seamless connectivity
  – Provides ANSI SQL-92 front-end
  – Targeted for existing apps that generate standard SQL queries
  – Transforms SQL query into HiveQL query
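From an application's point of view, an ODBC driver means a standard SQL query can be issued through a generic database connection. The Python sketch below shows one way that might look. It assumes the third-party `pyodbc` package and an ODBC data source configured for the Hive driver; the DSN name `MapR Hive`, the table `clicks`, and the helper names are hypothetical. The query function accepts any DB-API connection, so an in-memory SQLite database serves as a stand-in to keep the sketch runnable without a cluster.

```python
import sqlite3


def connect_hive(dsn="MapR Hive"):
    """Open an ODBC connection (needs pyodbc and a configured DSN; name is hypothetical)."""
    import pyodbc  # imported lazily so the rest of the sketch runs without it

    return pyodbc.connect(f"DSN={dsn}", autocommit=True)


def count_by(conn, table, column):
    """Run a plain SQL aggregate over any DB-API connection.

    Identifiers are interpolated directly, so pass trusted names only.
    """
    sql = (
        f"SELECT {column}, COUNT(*) AS n FROM {table} "
        f"GROUP BY {column} ORDER BY {column}"
    )
    cursor = conn.cursor()
    cursor.execute(sql)
    return [tuple(row) for row in cursor.fetchall()]


def demo_with_sqlite():
    # SQLite stands in for the ODBC data source (assumption for the demo only).
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE clicks (page TEXT)")
    conn.executemany("INSERT INTO clicks VALUES (?)", [("a",), ("b",), ("a",)])
    return count_by(conn, "clicks", "page")


if __name__ == "__main__":
    print(demo_with_sqlite())
```

With a real data source, `count_by(connect_hive(), "clicks", "page")` would send the same SQL text through the driver, which per the slide translates it to HiveQL.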
![Page 21: HUG slides on NFS and ODBC](https://reader033.vdocument.in/reader033/viewer/2022052906/558c1d94d8b42ac7528b469a/html5/thumbnails/21.jpg)
Example: Tableau
![Page 22: HUG slides on NFS and ODBC](https://reader033.vdocument.in/reader033/viewer/2022052906/558c1d94d8b42ac7528b469a/html5/thumbnails/22.jpg)
Example: Open source query builder (Kaimon)
![Page 23: HUG slides on NFS and ODBC](https://reader033.vdocument.in/reader033/viewer/2022052906/558c1d94d8b42ac7528b469a/html5/thumbnails/23.jpg)
Example: Microsoft Excel
![Page 24: HUG slides on NFS and ODBC](https://reader033.vdocument.in/reader033/viewer/2022052906/558c1d94d8b42ac7528b469a/html5/thumbnails/24.jpg)
In Summary
§ Open standards are important
§ Supporting existing applications and tools that support those standards is valuable
  – Preserves investment in tools
  – Preserves investment in custom applications that preceded Hadoop
  – Leverages skills you already have
![Page 25: HUG slides on NFS and ODBC](https://reader033.vdocument.in/reader033/viewer/2022052906/558c1d94d8b42ac7528b469a/html5/thumbnails/25.jpg)
Join MapR
§ Join the fastest growing Hadoop company
§ Open positions in every discipline
  – Engineers
  – Solution Architects
  – Product Management
§ Email [email protected]
![Page 26: HUG slides on NFS and ODBC](https://reader033.vdocument.in/reader033/viewer/2022052906/558c1d94d8b42ac7528b469a/html5/thumbnails/26.jpg)
Time for Questions
§ Download slides or send me an email
  – http://www.mapr.com/company/events/speaking/dc-hug-9-18-12
§ Download MapR to learn more – www.mapr.com/download