THE DATACENTER NEEDS AN OPERATING SYSTEM
MATEI ZAHARIA, BENJAMIN HINDMAN, ANDY KONWINSKI, ALI GHODSI, ANTHONY JOSEPH, RANDY KATZ, SCOTT SHENKER, ION STOICA
UC BERKELEY
THE DATACENTER IS THE NEW COMPUTER
Running today’s most popular consumer apps
• Facebook, Google, iCloud, etc.
Needed for big data in business & science
Widely accessible through cloud computing
Our claim: this new computer needs an operating system
WHY DATACENTERS NEED AN OS
Growing diversity of applications
• Computing frameworks: MapReduce, Dryad, Pregel, Percolator, Dremel
• Storage systems: GFS, BigTable, Dynamo, etc.
Growing diversity of users
• 200+ Hive users at Facebook
Same reasons computers needed one!
WHAT OPERATING SYSTEMS PROVIDE
Resource Sharing: time-sharing, virtual memory, …
Data Sharing: files, pipes, IPC, …
Programming Abstractions: libraries, languages
Debugging & Monitoring: ptrace, DTrace, top, …
Most importantly: an ecosystem
…enabling independently developed software to interoperate seamlessly
TODAY’S DATACENTER OPERATING SYSTEM
Platforms like Hadoop are well aware of these issues
• Inter-user resource sharing, but at the level of MapReduce jobs (though this is changing)
• InputFormat API for storage systems (but what happens with the next hot platform after Hadoop?)
• InputFormat describes the input specification of a job
Other examples: Amazon services, Google stack
The problems motivating a datacenter OS are well recognized, but solutions are narrowly targeted
Can researchers take a longer-term view?
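The InputFormat bullet above refers to a plug-in interface between a compute framework and a storage system. As a rough illustration of that idea, here is a minimal Python sketch. It is not Hadoop's actual Java API: the class names, `get_splits`, `read_records`, and `count_records` are all hypothetical, and the in-memory "storage system" stands in for a real one.

```python
from abc import ABC, abstractmethod
from typing import Iterator, List, Tuple

Split = Tuple[int, int]  # hypothetical: a half-open [start, end) range of records


class InputFormat(ABC):
    """Hypothetical storage plug-in interface (not Hadoop's real API)."""

    @abstractmethod
    def get_splits(self, num_splits: int) -> List[Split]:
        """Divide the input into chunks that can be processed in parallel."""

    @abstractmethod
    def read_records(self, split: Split) -> Iterator[str]:
        """Yield the records that belong to one split."""


class InMemoryLines(InputFormat):
    """Toy storage backend: a list of lines held in memory."""

    def __init__(self, lines):
        self.lines = list(lines)

    def get_splits(self, num_splits: int) -> List[Split]:
        n = len(self.lines)
        size = max(1, -(-n // num_splits))  # ceiling division
        return [(start, min(start + size, n)) for start in range(0, n, size)]

    def read_records(self, split: Split) -> Iterator[str]:
        start, end = split
        return iter(self.lines[start:end])


def count_records(source: InputFormat, num_splits: int) -> int:
    """A 'framework' that only sees the interface, never the storage details."""
    return sum(
        1
        for split in source.get_splits(num_splits)
        for _ in source.read_records(split)
    )


source = InMemoryLines(["a", "b", "c", "d", "e"])
print(source.get_splits(2))      # [(0, 3), (3, 5)]
print(count_records(source, 2))  # 5
```

The framework code depends only on the abstract interface, so a new storage system can be added without changing it. The slide's concern is that such an interface is tied to one platform rather than shared across the datacenter.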
TOMORROW’S DATACENTER OS
Resource Sharing: time-sharing, virtual memory, …
Data Sharing: files, pipes, IPC, …
Programming Abstractions: libraries, languages
Debugging & Monitoring: ptrace, DTrace, top, …
RESOURCE SHARING
“To solve these interaction problems we would like to have a computer made simultaneously available to many users in a manner somewhat like a telephone exchange. Each user would be able to use a console at his own pace and without concern for the activity of others using the system.”
– Fernando J. Corbató, 1962
RESOURCE SHARING
Today, cluster apps are built to run independently and assume they own a fixed set of nodes
Result: inefficient static partitioning
What’s the right interface for dynamic sharing?
[Figure labels: App 1, App 2, App 3]
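To make the cost of static partitioning concrete, here is a small Python sketch comparing the two approaches on one snapshot of demand. The demand numbers, the cluster size, and both function names are made up for illustration. The "dynamic" case is an idealized single pool, not a specific scheduler from the talk.

```python
def static_served(demands, total_nodes):
    """Each app owns an equal, fixed share of nodes; its unused nodes sit idle."""
    share = total_nodes // len(demands)
    return sum(min(demand, share) for demand in demands)


def dynamic_served(demands, total_nodes):
    """All apps draw from one shared pool, so idle capacity flows to busy apps."""
    return min(sum(demands), total_nodes)


# Hypothetical snapshot: nodes wanted right now by App 1, App 2, App 3.
demands = [10, 70, 5]
total_nodes = 90

print(static_served(demands, total_nodes))   # 45 (App 2 is capped at 30 nodes)
print(dynamic_served(demands, total_nodes))  # 85 (all demand fits in the pool)
```

With static partitioning, App 2 is starved while the partitions of App 1 and App 3 are mostly idle. That gap is the inefficiency the slide points to.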
MEMORY MANAGEMENT
Memory is an increasingly important resource
• In-memory iterative processing (Pregel, Spark, etc.)
• DFS cache for MapReduce cluster could serve 90% of jobs at Facebook (HotOS ’11)
What are the right memory management algorithms for a parallel analytics cluster?
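The slide leaves the choice of algorithm open. As a baseline for comparison only, here is a minimal Python sketch of a least-recently-used (LRU) cache for file-system blocks. LRU is an assumption chosen because it is a familiar single-machine policy, not something the talk proposes. The class name, the block IDs, and the access trace are hypothetical.

```python
from collections import OrderedDict


class LRUBlockCache:
    """Baseline LRU cache over block IDs; tracks hits and misses."""

    def __init__(self, capacity_blocks):
        self.capacity = capacity_blocks
        self.blocks = OrderedDict()  # oldest entry first, newest last
        self.hits = 0
        self.misses = 0

    def read(self, block_id):
        """Return True on a cache hit, False on a miss (block is then cached)."""
        if block_id in self.blocks:
            self.blocks.move_to_end(block_id)  # mark as most recently used
            self.hits += 1
            return True
        self.misses += 1
        self.blocks[block_id] = True
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)  # evict the least recently used block
        return False


cache = LRUBlockCache(capacity_blocks=2)
for block in ["a", "b", "a", "c", "b"]:  # hypothetical access trace
    cache.read(block)
print(cache.hits, cache.misses)  # 1 4
```

A cluster-wide policy would also have to account for which job each block belongs to and where it is cached, which is part of why the question on the slide is open.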
PROGRAMMING AND DEBUGGING
Although there are new programming models for applications, system programming remains hard
• Can we identify useful common abstractions? (Chubby, Sinfonia, Mesos are some examples)
• How much can languages (e.g. Go, Erlang) help?
Debugging is very hard
• Magpie, X-Trace, Dapper are some steps here
Can a clean-slate design of the stack help?
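The tracing systems named above follow a request as it crosses components. The Python sketch below shows only the general idea of propagating a trace identifier and recording parent/child relationships. It does not reproduce the API or data model of Magpie, X-Trace, or Dapper, and `Span`, `start_span`, and `LOG` are hypothetical names.

```python
import itertools
from dataclasses import dataclass
from typing import List, Optional

_ids = itertools.count(1)  # deterministic IDs for the example
LOG: List["Span"] = []     # stand-in for a trace collection service


@dataclass
class Span:
    trace_id: int             # shared by all work done for one request
    span_id: int              # unique per unit of work
    parent_id: Optional[int]  # the span that caused this one, if any
    name: str


def start_span(name: str, parent: Optional[Span] = None) -> Span:
    """Create a span, inheriting the trace ID from the parent if there is one."""
    trace_id = parent.trace_id if parent else next(_ids)
    span = Span(trace_id, next(_ids), parent.span_id if parent else None, name)
    LOG.append(span)
    return span


root = start_span("frontend")
rpc = start_span("storage-read", parent=root)  # context passed along the call

for span in LOG:
    print(span)
```

Because every span carries the same `trace_id`, the recorded spans can later be stitched into a tree for one request, even when they come from different machines.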
HOW RESEARCHERS CAN HELP
Focus on paradigms, not only performance
• Industry is spending a lot of time on performance
Explore clean-slate approaches
• Much datacenter software is written from scratch
• People using Erlang, Scala, functional models (MR)
Bring cluster computing to non-experts
• Most impactful (datacenter as the new workstation)
• Hard to make a Google-scale stack usable without a Google-scale ops team
CONCLUSION
Datacenters are becoming a major platform
To support a thriving software ecosystem like computers do, they need the equivalent of an OS
Researchers can take a long-term systems view to problems arising today to enable this