Bringing AFS into the 21st Century
Jeffrey Altman, President, Your File System Inc.
5 May 2010
Jeffrey Altman: Present and Former Roles
  Kermit Developer, OS/2, Win95, NT, …
  Kerberized Telnet, SSH, secure data transfer
  MIT Kerberos Core Team member
  Kerberos for Windows Architect
  OpenSSL contributor
  IETF Security Directorate participant
  Internet Access Methods, Chief Technology Officer
  Project JXTA, Founding Board Member
  OpenAFS, Gatekeeper and Elder
  Secure Endpoints, Inc., President and Founder
  Network Identity Manager, KCA, Heimdal HSM KDC
  Your File System, Inc., President and Founder
OpenAFS 10 Years and Counting
  OpenAFS was formed on 1 Nov 2000
  The original Elders represented IBM, Intel, Morgan Stanley, Carnegie Mellon, MIT, and UMichigan
  Since then ohloh.net says that OpenAFS has become one of the largest open source projects
  233 developers since inception (51 active in the last year)
  Nearly 1 million lines of source code and 100,000 lines of user and developer documentation
  All major operating systems (except mobile) are supported
  Untold millions of end users (no way to measure)
The Beginning of the Story …
  1983-88 Andrew Project at Carnegie Mellon University creates AFS to support a distributed, heterogeneous, workstation-based computing environment
  Funded by IBM to compete with the DEC-funded Athena at MIT
  Unique properties of AFS include
    Federated authentication model
    Sophisticated access control model
    Location-independent data access permits zero-downtime maintenance
    Heterogeneous support including PCs
Commercialization
  1988 Transarc is formed
    Founding member of the Open Group
    Must beat SunOS in order to beat NFS (v3)
    Creates AFS4 which becomes the DCE File System (DFS)
  1994 Transarc acquired by IBM
    AFS3 becomes a legacy product
    DFS and the Encina Transaction Processing Monitor become the focus of IBM Pittsburgh Labs
    AFS3 is bundled with WebSphere
Significant Deployments: Morgan Stanley
  1994 Morgan Stanley deploys Kerberos and AFS as part of the Aurora project
    The challenge: To come up with a distributed systems environment that would allow Morgan Stanley to centrally manage tens of thousands of systems spread out over more than 30 offices on virtually every continent on the globe in a fully production fashion.
    The solution: The Aurora System. (Unix only)
  2002 WinAurora is developed and deployed
    Today, 80,000 Windows workstations and 20,000 application servers. 30,000 applications hosted in AFS.
    Makes use of registry virtualization and environment block injection from AFS name space. No local configuration.
Significant Deployments: U.S. Government
  Dept of Energy Labs
  U.S. High Performance Computing centers
  NASA
  Internal Revenue Service
  U.S. Geological Survey
  Naval Research Lab
  “Star Wars” Missile Defense Shield
Significant Deployments: Other commercial
  Pictage Photography Services
  General Electric Aircraft Engine Division
  Goldman Sachs
  Hitachi
  Qualcomm
  United Airlines
  Many that are not publicly known
Significant Deployments: Education
  Carnegie Mellon University
  Cornell University
  Duke
  Harvard
  MIT
  PSU
  UC-Berkeley and UC-Santa Barbara
  Stanford
  Univ of Michigan
  U of Wisconsin-Madison
  Iowa State University
  California Inst of Technology
  Univ of North Carolina
    Chapel Hill
    Charlotte (Mosaic)
  Univ of Stockholm
  KTH
  Univ of Edinburgh
  Many many others
The Road to Open Source
  1996 to 2000, major educational institutions with source code licenses continue to develop enhancements to AFS but are not permitted to share them
  Lack of on-going development and increasing prices from IBM generate significant customer backlash
  1998, Derrick Brashear begins to ask IBM for source code access
    First rx code is released as public domain
    Later a commitment to release much of the AFS sources
  Aug 2000, IBM announces a plan to open source AFS giving responsibility back to the educational community
The OpenAFS Era
  Oct 31 2000, IBM publishes OpenAFS 1.0 to the IBM DeveloperWorks web site
  Nov 1 2000, OpenAFS.org is born
  Nov 5 2000, OpenAFS 1.0 ships
  Nov 2001, OpenAFS 1.2 ships
  Nov 2005, OpenAFS 1.4 ships
OpenAFS 1.4
  The last major update of OpenAFS (1.4) was announced on its 5th birthday, 1 November 2005, four years after 1.2.
  The release took almost four years to develop and included:
    Significant performance and stability improvements
    Server support for mobile clients and NATs
    Audit logging
    vos copy, vos convertROtoRW, parallel attach on restart
    Windows clients that worked
    AIX5, HPUX11.23, Solaris 10, Linux 2.6, MacOS 10.4
Where does UNCC Mosaic fit in?
  Originally built on Transarc Windows AFS Client 3.4a (1996)
  First AFS cache for Windows in AFS 3.5 (1999)
  AFS 3.6 (March 2000)/OpenAFS 1.0 (November 2000)
  By 2003, 900 workstations, 25 servers, 150+ Sun Solaris apps, 80+ Windows XP apps, and 4700 user accounts.
  2006, Introduction of the Windows Remote Desktop Web System Utilizing AFS
  Today, more of everything but pretty much the same architecture
    200+ applications on Windows
My Introduction to OpenAFS
  I started working on OpenAFS in late 2003
  A five day project to add Kerberos v5 authentication to afscreds.exe became 5 weeks, then 5 months and now 7 years
  Rodney’s presentation at the 2004 AFS Workshop at SLAC was followed by my first presentation to the community on the state of the Windows client
A new Sheriff is in town
  Prior to November 2003 the OpenAFS Windows client had received no love.
  Frustration in the user community was boiling over.
“Why don’t you promote OpenAFS more?” - Rodney Dyer, 2006 Workshop
  OpenAFS’s 5th Anniversary saw the release of 1.4
  I refused to promote OpenAFS because I didn’t want to make promises I couldn’t keep about code quality or performance
  In the first five years, the OpenAFS community did very little but put out fires
    Distributed systems are hard
    Multi-threaded systems are hard
    Heterogeneous systems are hard
    Kernel development is hard
    Doing all is nearly impossible
  OpenAFS sources included just about every mistake imaginable
Windows Client Read Performance: Cached Data
[Chart: oafw 1.2.10, ibm 3.6.2.53, oafw 1.3.80, oafw 1.5.14, oafw 1.5.51, oafw 1.5.57, oafw 1.5.59, and oafw redir, each with crypt on and crypt off; axis 0 to 400, units not shown]
Windows Client Read Performance: Uncached Data
[Chart: oafw 1.2.10, ibm 3.6.2.53, oafw 1.3.80, oafw 1.5.14, oafw 1.5.51, oafw 1.5.57, oafw 1.5.59, and oafw redir, each with crypt on and crypt off; axis 0 to 40, units not shown]
Windows Client Write Performance
[Chart: oafw 1.2.10, ibm 3.6.2.53, oafw 1.3.80, oafw 1.5.14, oafw 1.5.51, oafw 1.5.57, oafw 1.5.59, and oafw redir, each with crypt on and crypt off; axis 0 to 35, units not shown]
OpenAFS Roadmap? Or Wish List?
  Every Workshop a roadmap is presented but it's not a roadmap
    No commitments
    No delivery dates
    How are you supposed to plan your rollout schedule?
  The problem is lack of resources
  Gatekeepers/Elders compile lists of requests but have little influence on what people work on
YFS Inc. Founded to Drive Demand for Globally Accessible File Systems
  Open source projects are funded by organizations that are dependent upon the technologies
  The benefits of AFS are lost on the vast majority of the world
  MobileMe, BigVault, Dropbox, and similar sync and access cloud storage services are far behind the capabilities of AFS
  YFS will provide services directly to home, small business, and enterprise users and indirectly through telecommunication companies
  With hundreds of millions of users, there is a business case for enhancing the software on a regular basis
The Mission
  Develop, deploy, and operate “Write once, Access anywhere” global storage solutions
  Support the on-going development of utilized open source technologies
  Attempt to correct the HTTP mistake
    The world wide web is wonderful but HTTP is a horrible protocol implemented at the wrong layer in the OS stack
    URLs equate to global file system paths
    Static web pages equate to files
    Web service APIs equate to distributed named pipe RPCs
    AFS Access Control and Federated Authentication are decades ahead of the web
U.S. Department of Energy Small Business Innovation Research Grant
  The DoE labs are large users of AFS to support their HEPiX research
  YFS Inc. applied for a grant in 2007
  In 2008, received $99,000 to fund Rx improvements and a feasibility study
  In August 2009, was awarded $650,000 to standardize, design, and implement core protocol enhancements
  All grant funded work will be open sourced
Your File System Requirements
  Server scalability (~60,000 clients per server vs ~1000)
  Networking Improvements
    10Gbit networks
    IPv6
    TCP and/or SCTP in addition to UDP communications
  Optimized file change notification protocol
  Read/write replication in addition to read-only replication
  Server based virtual query volumes
  Directory improvements
    Internationalization, Extended Attributes, Multiple Data Streams per Object
  Mandatory locking
  End-to-end Security
    AES-256 encryption
    Both Kerberos and X.509 certificates for authentication
    Per Service Keys
    Anonymous Client Access is Protected
YFS Phase I Success
  See openafs-info archive 10/2/2008 e-mail
  Rx Packet Management Issues addressed in 1.4.8 and 1.5.53
  1.4.8 Rx stack is capable of 124MB/sec over a 10Gbit link
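To put the 124MB/sec figure in context, the arithmetic below compares it with the raw capacity of a 10Gbit link. This is only a back-of-the-envelope sketch: it assumes 1 MB = 10^6 bytes and ignores all protocol overhead, and the helper name `link_utilization` is made up for illustration.

```python
def link_utilization(throughput_mb_per_s: float, link_gbit_per_s: float) -> float:
    """Return the fraction of raw link capacity used (assumes 1 MB = 10**6 bytes)."""
    link_mb_per_s = link_gbit_per_s * 1000 / 8  # Gbit/s -> MB/s
    return throughput_mb_per_s / link_mb_per_s


# 1.4.8 Rx stack figure from the slide above.
print(f"{link_utilization(124, 10):.1%}")  # 9.9%
# The "> 260MB/second per Rx connection" figure quoted later for 1.6.
print(f"{link_utilization(260, 10):.1%}")  # 20.8%
```

Under these assumptions a single Rx connection leaves most of a 10Gbit link unused in both cases.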
YFS Phase II First Year Road Map
  Rx Improvements
    Path MTU Discovery
    Large Data Buffers
    New Jumbograms
    Window Size Negotiation
    Dynamic Retransmit Calculation
    Max Call Negotiation
    Async API
    TCP transport
  Protection Service
    Anonymous Machine Accounts
  Ubik enhancements
  RxGK
  Client Improvements
    Byte Range Locking
    Direct and Synchronous I/O
    Demand Prefetching
YFS Phase II Second Year Road Map
  Server Improvements
    Event driven workflow
    POSIX EA backend
    Service Port Independence
    Split Horizon Support
    Volume Release Optimizations
    Read Write Replication
  Extended Attributes
  Partition UUIDs
  Long Volume Names
  Per File ACLs
  Modern Directory Format
  RxTCP
  IPv6 Support
OpenAFS Roadmap! Not a Wish List
  At Fall HEPiX OpenAFS committed to a road map of deliverables over the next two years.
    1.6 Spring/Summer 2010
    1.8 Fall/Winter 2010
    2.0 Spring/Summer 2011
    2.x Fall/Winter 2011
  An aggressive schedule to say the least, especially given the commitments, but it can be done.
OpenAFS 1.6
  It's been more than four years. 1.4.x releases have received many bug fixes and even some new features and performance improvements but major change has all been held back for 1.6.
    Other than Windows which is always using the 1.5.x series for production.
  What has taken so long?
    Source Code Quality and Demand Attach File Service
1.6: Source Code Quality
  When 1.5 was branched there were close to 20,000 warnings produced as part of the x86 MacOS X build
  Today it is possible to build the entire source tree excluding 21 files without warnings
  In the process hundreds of real bugs were fixed
  As was evident from 1.2 instability, there were many lock safety issues resulting in race conditions. Today there are many fewer.
  Prior to the release of 1.6, YFS Inc. will complete a regression test harness that will permit the testing of failure cases in addition to those that are expected to succeed.
1.6: Rx Performance Improvements
  Packet leaks, free packet queue management
  MTU size negotiation failures
  RTT calculation errors
  Unnecessary lock contention
    Rx statistics
    NewCall vs EndCall
    All Write and Read paths
  Races due to improper locking
  Window size errors
  Transmit queues dumped packets on the floor
  NAT Keep-alive support
  > 260MB/second per Rx connection
    File Server performance restricted by global locks above the rx layer
1.6: Linux Cache Manager
  Performance improvements
  Dynamic allocation of AFS kernel cache entries to support inotify()-pinned entries
  Path MTU detection
Linux Cache read performance: AFS should match ext3 below 1GB
1.6: MacOS X Cache Manager
  Many Finder Improvements
    Authentication events now refresh
    Insert only dropboxes
  Improved installation experience
    GUI queries for local cell information
  AFS Command Preferences Pane
    Kerberos v5 ticket renewal
  Growl notification service integration
  Significant Rx event handling improvements
  Bulk-stat RPC support for faster directory enumeration
1.6: Demand Attach File Service
  an enhanced volume management library that supports:
    lock-less I/O
    on-demand attachment of volumes
    parallel shutdown of the file server
    on-line salvaging of volumes
    automatic detachment of inactive volumes
  a new salvageserver daemon which can salvage volumes on-demand
  a modified bos and bosserver
  fileserver state saving and restoration
    host and callback state
1.6: Other
  Major Documentation Improvements
  NFS -> AFS translator for Linux
  DNS SRV record support (replaces AFSDB records)
  /afs/.:mount/cell:volume[:vnode:uniq] direct object access
  Larger than 2TB partitions (1.4 backport)
  Tivoli X/Open Backup API
  Libuafs (userland afs cache manager library)
  AIX6, FreeBSD7.x,8.x, Solaris11, …
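The direct object access path syntax listed above, /afs/.:mount/cell:volume[:vnode:uniq], can be pulled apart with a small parser. This is a sketch for illustration only and is not taken from the OpenAFS sources: the names `MountPath` and `parse_mount_path` are hypothetical, and it assumes that vnode and uniq are decimal integers, that cell and volume contain no `:` or `/`, and that nothing follows the object reference.

```python
import re
from typing import NamedTuple, Optional


class MountPath(NamedTuple):
    cell: str
    volume: str           # volume name or ID, kept as written
    vnode: Optional[int]  # None when the optional suffix is absent
    uniq: Optional[int]


# cell:volume is required; :vnode:uniq is optional but must appear as a pair.
_PATTERN = re.compile(
    r"^/afs/\.:mount/(?P<cell>[^:/]+):(?P<volume>[^:/]+)"
    r"(?::(?P<vnode>\d+):(?P<uniq>\d+))?$"
)


def parse_mount_path(path: str) -> MountPath:
    """Split a /afs/.:mount/... path into its components."""
    match = _PATTERN.match(path)
    if match is None:
        raise ValueError(f"not a .:mount path: {path!r}")
    vnode = match.group("vnode")
    uniq = match.group("uniq")
    return MountPath(
        cell=match.group("cell"),
        volume=match.group("volume"),
        vnode=int(vnode) if vnode is not None else None,
        uniq=int(uniq) if uniq is not None else None,
    )


# example.org and root.cell are placeholder names.
print(parse_mount_path("/afs/.:mount/example.org:root.cell"))
print(parse_mount_path("/afs/.:mount/example.org:root.cell:1:1"))
```

The point of the syntax is that the path itself names the cell and volume, so an object can be reached without a mount point existing anywhere in the /afs tree.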
1.6: Microsoft Windows
  Nothing new for 1.6. Everything is already in 1.5.74
    Support for all existing operating systems from Windows 2000 to Win7/2008-R2
    Fine grained locking everywhere
    Performance is bound by the SMB implementation
    Unicode character set support
  Native client running on my Win7 laptop to be integrated into 1.7.
What happens Post 1.6?
  When 1.6 branch is cut for release candidates, the master branch becomes 1.7
  All major submissions ready for 1.8 will begin to merge onto the master
  In order for this to happen in an orderly fashion, projects must be able to break their code into small patch sets for submission to http://gerrit.openafs.org/
    One change per patchset
    Each patchset reviewable in less than an hour
    No patchset may break the build or reduce stability
  Documentation to reviewers describing the protocol changes, architecture, and patch submission plan is strongly advised.
1.8 Feature Targets
  Heimdal crypto replaces OpenAFS crypto
  rxk5 security class
  Object storage
  Native AFS redirector client for Microsoft Windows (no support for Windows 2000)
  Rx UDP performance improvements
    Window Size Negotiation*
    Dynamic Retransmit Calculation*
    Path MTU Discovery
    Large Data Buffers
    Improved Jumbograms
    Max Call Negotiation
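The slides do not say how Rx will compute a dynamic retransmit timer. For readers unfamiliar with the idea, the sketch below shows the standard TCP estimator from RFC 6298 instead: a smoothed round-trip time plus four times its smoothed deviation. It is an illustration of the general technique, not the Rx algorithm; the class name `RtoEstimator` and the 60 second ceiling are assumptions, and the RFC's clock granularity term is left out.

```python
from typing import Optional


class RtoEstimator:
    """Retransmission timeout from RTT samples, following RFC 6298."""

    ALPHA = 1 / 8  # gain for the smoothed RTT
    BETA = 1 / 4   # gain for the RTT variance
    K = 4          # variance multiplier

    def __init__(self, min_rto: float = 1.0, max_rto: float = 60.0) -> None:
        self.srtt: Optional[float] = None
        self.rttvar: Optional[float] = None
        self.min_rto = min_rto  # RFC 6298 recommends a 1 second floor
        self.max_rto = max_rto  # assumed ceiling for this sketch

    def update(self, rtt: float) -> float:
        """Feed one RTT sample (seconds) and return the new timeout."""
        if self.srtt is None or self.rttvar is None:
            # First sample seeds both estimators.
            self.srtt = rtt
            self.rttvar = rtt / 2
        else:
            # The variance is updated before the smoothed RTT.
            self.rttvar = (1 - self.BETA) * self.rttvar + self.BETA * abs(self.srtt - rtt)
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * rtt
        rto = self.srtt + self.K * self.rttvar
        return min(max(rto, self.min_rto), self.max_rto)


est = RtoEstimator(min_rto=0.2)
for sample in (0.100, 0.120, 0.090):
    print(round(est.update(sample), 3))
```

A fixed timer is either too short on slow paths, causing needless retransmits, or too long on fast ones, stalling recovery after loss. Tracking the measured round-trip time avoids both.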
1.8: More Feature Targets
  PTS authentication name extensions
    Kerberos v5 and extensible to other name forms (GSS, X.509, SCRAM, …)
  Extended callbacks
    Significant reductions in network traffic
  More Linux Cache Manager enhancements
  Byte Range Locking
  Direct and Synchronous I/O
  Demand Prefetching
  Pthreaded Ubik servers
2.0: Feature Targets
  rxgk security class
    Kerberos v5, X.509 and SCRAM authentication
    Protection of anonymous connections
    Protection of the server to client callback connection
    Permitting full use of Extended Callbacks
      Metadata changes can be sent from server to clients as part of the notification, avoiding even more network traffic and reducing cross-client change contention
  File server coordinated byte range locking
  Whatever else is ready based on work from YFS, Inc. and others
Unfunded Wish List
  Many things are not funded and not on the roadmap
    Direct vicep access for Lustre or dCache
    dCache as an OSD backend
    Faster metadata performance in the file server backend
    Improved Fetch/Store Data RPCs
      Scatter / gather variants
      Fetch Data with Hash
        Avoid retransmitting data that is already valid in the cache
        Multiple writers use-case
    More File Servers per cell
    Unix CM Profiling and use of Fine Grained Locking to improve concurrency
    Direct to object mount points
    On-the-fly volume splitting and / or striping
    LDAP backend for Protection Server
    Native Windows client
      Initial version in 1.8 but there are many improvements that can be implemented
    AFS Explorer Shell integration
    AFS PAGs for MacOS X
    ZFS specific backend for AFS File Server
    Disconnected AFS Usability Improvements
    Performance Monitoring Instrumentation
    Extended Attributes and Multiple Data Streams
How to Move from Wish List to Road Map Targets?
  There is not enough money, nor are there enough developers, to implement all of the functionality in the next two years
  Implementation designs and Cost/Time estimates for each of the proposals must be developed
  Priorities need to be determined not only by the funders' desires but should include what the OpenAFS leadership believes is necessary to further adoption
    This must include client side usability improvements
    User Shell integration (Explorer, Finder, Gnome, …)
    Porting Network Identity Manager to Linux and MacOS
OpenAFS Governance is Key
  Incorporation or joining an umbrella organization is blocked by the IBM trademarks of “AFS” and “OpenAFS”
  Once the necessary permissions for use are obtained, the not-for-profit corporation must be formed so that funds can be raised and pooled efficiently
  Priorities would be set via a Technical Advisory Board (TAB) consisting of all large contributors, representatives of medium sized contributors, and representatives of individual users and developers
  Gatekeepers would be advisors to the TAB providing expert review of proposals and producing architecture design documents
  The corporation would issue RFQs to find developers to implement the approved designs, communicate with the standards communities, and manage the contractors
  The Gatekeepers would be compensated for their time and an Executive Director would be hired to handle administrative functions
Network Identity Manager
Heimdal Kerberos
  MIT has for all practical purposes abandoned the Windows platform
  Secure Endpoints is porting Heimdal to Windows
    Including the KDC, KCA and Administration Services
    Intends to support a Hardware Secure KDC option
WinAurora Technologies
  Your File System, Inc. will be migrating WinAurora to the Windows 7 platform
  Morgan Stanley has agreed in principle to open source the underlying technologies
  YFS hopes to build a public database of application configurations that will permit organizations to quickly deploy applications to thousands of desktops from AFS
Registry Virtualization
  A kernel driver virtualizes the registry on a per-application basis using hives that are stored within AFS
  At process startup, an environment block configuration is injected
  Custom Access Control Entry strings are stored in the process default access control list (DACL) to identify the virtual registry associated with the process
  Local Procedure Calls communicate the DACL to the executing service thread, permitting the registry configuration to be used during all stages of process execution
OpenAFS End User Experience
  The biggest bang for the buck comes from upgrading end user experience
  Improving the end user experience will increase the demand for the service
  Users do not ask for particular technologies
    No user ever said they wanted WebDAV storage
  Here are some ways that the AFS experience can be improved for end users on Microsoft Windows
  Similar improvements can be made on other operating system environments
Explorer Shell Integration: AFS Column Selection
Explorer Shell Integration: AFS MetaData within Columns
Explorer Shell Integration: File and Directory Properties
Explorer Shell Integration: Volume and ACL Properties
Explorer Shell Integration: Mountpoint and Symlink Properties
Explorer Shell Integration: Meta Data Display
You have questions? I have answers!
  Take your best shot
Contact Info
  Jeffrey Altman
  President
  Your File System
  [email protected]
  +1 212 769-9018