TRANSCRIPT
November 19, 2009 COMS W4156 1
COMS W4156: Advanced Software Engineering
Prof. Gail Kaiser
http://bank.cs.columbia.edu/classes/cs4156/
Topics covered in this lecture
• Open Source and Free Software
Open Source Software
What is Open Source Software?
• Open source usually refers to a program whose source code is available to the general public for use and/or modification free of charge, with no restrictions.
• Open source code is typically created as a collaborative effort in which programmers improve upon the code and share the changes with the general public.
• The rationale for this “movement” is that a larger group of programmers not concerned with proprietary ownership or financial gain will produce a more useful and bug-free product for everyone to use.
• The concept relies on peer review to find and eliminate bugs in the code, in contrast to proprietary programs.
Technical Case
• Central part of engineering tradition, part of working method almost by instinct, for Internet and Unix hackers
– Who were primarily students and faculty in an era in which the federal government decreed that everything sponsored by government grants was “public domain” (this changed in 1980)
• The running gears of the Internet are astonishingly reliable relative to their nearest commercial equivalents.
– TCP/IP, DNS, sendmail, …
Economic Case
• The use value of a program is its economic value as a tool.
• The market value of a program is its value as a saleable commodity.
• The monopoly value is the value you gain not just from having the use of a program but from having it be unavailable to your competitors.
“Open-Source Doomsday”
• The market value and monopoly value of software go to zero because of all the free sources out there.
• Use value alone doesn't attract enough consumers to support software development.
• The commercial software industry collapses.
• Software engineers starve or leave the field.
• Doomsday arrives when the open-source culture itself (dependent on the spare time of pros and amateurs) collapses, leaving nobody who can program competently.
Shaky Assumption #1: Software engineering will collapse if software has no market value
• Proportion of all code written in-house at companies other than software vendors is large (estimates vary)
• Includes most MIS: financial- and database-software customizations
• Also includes OEM software like device drivers and embedded code
• Most vertical code is integrated with its environment in ways that make reusing or copying it very difficult.
• This is true whether the “environment” is a business office's set of procedures or the fuel-injection system of a combine harvester.
Combine Harvester
Shaky Assumption #1 (cont): Software engineering will collapse if software has no market value
• Thus, as the environment changes (e.g., new business processes, new product models), there is a lot of work continually needed to keep the software in step.
• “Maintenance” makes up the vast majority of what software engineers get paid to do.
• And it will still need to be done, even if/when most software is open-source.
• Between originating, customizing and maintaining vertical code (and related tasks like system administration and troubleshooting), the use value of software would still support the millions of good jobs even if all “horizontal” or stand-alone software were free.
Shaky Assumption #2: Open-source software has no market value
• Some of the world’s most successful software-based businesses “give away” software source code (e.g., Google, Sun, IBM, even Microsoft)
• Many smaller companies sell handholding and support for “free” software - a place to go when you have problems.
Shaky Assumption #3: Open-source software has no monopoly value
• Adopting or even just studying someone else's software is not a costless, frictionless process; you need to dedicate skilled time to it.
• As product cycle times drop, coattail-riding gets less attractive, because the payoff period shrinks relative to the time you had to dedicate.
• And time your skilled people spend studying someone else's “monopoly” code is time you're spending getting to where the competition used to be (rather than where they are now).
Business Case
• High reliability
• Open-source software is (in principle) peer-reviewed software; it is thus more reliable than closed, proprietary software.
• Mature open-source code is as bulletproof as software ever gets.
• Development Speed
• Lower Overhead
• Closeness to the Customer
• Broader Market
• Grab “Mind Share” (e.g., for startups)
• …
Investor Case
• Support Sellers: give away the software product, but sell distribution and after-sale service.
• Loss Leaders: give away open-source as a loss-leader and market positioner for proprietary (closed) software.
• Widget Frosting: a hardware company goes open-source in order to get better drivers and interface tools cheaper.
• Accessorizing: selling accessories such as books, compatible hardware, and complete systems with open-source software pre-installed (plus t-shirts and stuffed animals)
Customer Case
• Open-source model applies even to internally developed software.
– You are your developer’s customer!
• Freedom from legal entanglements such as tracking copies and usage.
– Very hard to do accurately.
• Higher Security: security through obscurity just does not work.
• Proprietary sources create a false sense of security.
• The bad guys will always find the holes.
• It is harder to distribute trustworthy fixes when a hole is revealed.
Marketing Case
• Why not call it, as we traditionally have, free software?
• The term “free software” has a load of fatal baggage; to a businessperson, it's too redolent of fanaticism, flakiness and strident anti-commercialism.
• In marketing, appearance is reality. The appearance that we're willing to climb down off the barricades and work with the corporate world counts for as much as the reality of our behavior, our convictions, and our software.
Free Software
Free Software Foundation
• The Free Software Foundation (FSF) is dedicated to eliminating restrictions on copying, redistribution, understanding, and modification of computer programs.
• “Free software” is a matter of liberty, not price.
• Think “free speech”, not “free beer”.
Free Software Tenets
1. The freedom to run the program, for any purpose.
2. The freedom to study how the program works, and adapt it to your needs. Access to the source code is a precondition for this.
3. The freedom to redistribute copies so you can help your neighbor.
4. The freedom to improve the program, and release your improvements to the public, so that the whole community benefits. Access to the source code is a precondition for this.
Example Open Source License
• GNU General Public License (GPL), developed by Richard Stallman and the Free Software Foundation starting in 1985
• Certified by OSI
Open Source Initiative
Open Source Initiative (OSI)
http://www.opensource.org/
• Open source doesn't just mean access to the source code.
• The distribution terms of open-source software must comply with the following criteria: [upcoming slides]
Open Source Definition
1. Free Redistribution
The license shall not restrict any party from selling or giving away the software as a component of an aggregate software distribution containing programs from several different sources. The license shall not require a royalty or other fee for such sale.
Open Source Definition
2. Source Code
The program must include source code, and must allow distribution in source code as well as compiled form. Where some form of a product is not distributed with source code, there must be a well-publicized means of obtaining the source code for no more than a reasonable reproduction cost, preferably downloading via the Internet without charge. …
The source code must be the preferred form in which a programmer would modify the program. Deliberately obfuscated source code is not allowed. Intermediate forms such as the output of a preprocessor or translator are not allowed.
Open Source Definition
3. Derived Works
The license must allow modifications and derived works, and must allow them to be distributed under the same terms as the license of the original software.
Open Source Definition
4. Integrity of The Author's Source Code
The license may restrict source-code from being distributed in modified form only if the license allows the distribution of "patch files" with the source code for the purpose of modifying the program at build time. …
The license must explicitly permit distribution of software built from modified source code. The license may require derived works to carry a different name or version number from the original software.
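As an illustration of what a "patch file" is, the sketch below builds a unified diff between an original source file and a modified one using Python's standard `difflib` module. The file names and source lines are hypothetical, chosen only for this example; the Open Source Definition does not prescribe any tool or patch format.

```python
import difflib

# Hypothetical original source and a locally modified version (illustration only).
original = ["def greet():\n", "    print('hello')\n"]
modified = ["def greet():\n", "    print('hello, world')\n"]

# unified_diff yields the lines of a patch in unified-diff format:
# two header lines naming the files, then hunks of removed (-) and added (+) lines.
patch_lines = list(
    difflib.unified_diff(
        original,
        modified,
        fromfile="greet.py.orig",  # hypothetical name of the pristine source
        tofile="greet.py",         # hypothetical name of the modified source
    )
)

patch_text = "".join(patch_lines)
print(patch_text)
```

A distributor could ship the pristine source together with such a patch, and a separate tool (for example the Unix `patch` utility) would apply it at build time.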
Open Source Definition
5. No Discrimination Against Persons or Groups
The license must not discriminate against any person or group of persons.
Open Source Definition
6. No Discrimination Against Fields of Endeavor
The license must not restrict anyone from making use of the program in a specific field of endeavor. For example, it may not restrict the program from being used in a business, or from being used for genetic research.
Open Source Definition
7. Distribution of License
The rights attached to the program must apply to all to whom the program is redistributed without the need for execution of an additional license by those parties.
Open Source Definition
8. License Must Not Be Specific to a Product
The rights attached to the program must not depend on the program's being part of a particular software distribution. …
Open Source Definition
8. License Must Not Be Specific to a Product
… If the program is extracted from that distribution and used or distributed within the terms of the program's license, all parties to whom the program is redistributed should have the same rights as those that are granted in conjunction with the original software distribution.
Open Source Definition
9. License Must Not Restrict Other Software
The license must not place restrictions on other software that is distributed along with the licensed software. For example, the license must not insist that all other programs distributed on the same medium must be open-source software.
Open Source Definition
10. License Must Be Technology-Neutral
No provision of the license may be predicated on any individual technology or style of interface.
Open Source Licenses
Open Source Licenses (by name or by category) comply with the Open Source Definition and are listed by OSI after going through their approval process.
Eric Raymond
The Cathedral and the Bazaar
• Eric S. Raymond (esr)
• first presented May 1997, ongoing revision through September 2000
http://www.catb.org/~esr/writings/cathedral-bazaar/cathedral-bazaar/
The Cathedral
• Draws an analogy between traditional closed source development and a “cathedral”, in which there is a rigid hierarchy among developers, managers, testers, etc.
• esr originally believed that the most important software (operating systems and really large tools like the Emacs programming editor) needed to be built like cathedrals, carefully crafted by individual wizards or small bands of mages working in splendid isolation, with no beta to be released before its time.
The Bazaar
• Likens open source projects to Middle Eastern bazaars, where numerous merchants hawk their wares loudly to passersby.
• Little hierarchy among contributors.
• Contributors compete to have their modifications inserted into the next release, bringing recognition and reputation.
Based Primarily on Linux
• Describes Linus Torvalds' style of development as: release early and often, delegate everything you can, be open to the point of promiscuity.
Open Source Lessons (1)
1. Every good work of software starts by scratching a developer's personal itch.
2. Good programmers know what to write. Great ones know what to rewrite (and reuse).
3. “Plan to throw one away; you will, anyhow.” (Fred Brooks, The Mythical Man-Month, Chapter 11)
Open Source Lessons (2)
4. If you have the right attitude, interesting problems will find you.
5. When you lose interest in a program, your last duty to it is to hand it off to a competent successor.
6. Treating your users as co-developers is your least-hassle route to rapid code improvement and effective debugging.
Open Source Lessons (3)
7. Release early. Release often. And listen to your customers.
8. Given a large enough beta-tester and co-developer base, almost every problem will be characterized quickly and the fix obvious to someone.
9. Smart data structures and dumb code works a lot better than the other way around.
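Lesson 9 can be illustrated with a small sketch. The shipping-cost rules, region names, and function names below are hypothetical and not from the lecture; the point is only to contrast rules buried in control flow with the same rules held in a table.

```python
# Hypothetical shipping-cost rules, used only to illustrate lesson 9.

# "Smart code, dumb data": every rule is a branch, so adding a region means editing logic.
def shipping_cost_branchy(region: str, weight_kg: float) -> float:
    if region == "domestic":
        return 5.0 + 1.0 * weight_kg
    elif region == "europe":
        return 12.0 + 2.5 * weight_kg
    elif region == "asia":
        return 15.0 + 3.0 * weight_kg
    raise ValueError(f"unknown region: {region}")


# "Smart data, dumb code": the rules live in a table of (base fee, fee per kg).
RATES = {
    "domestic": (5.0, 1.0),
    "europe": (12.0, 2.5),
    "asia": (15.0, 3.0),
}


def shipping_cost_table(region: str, weight_kg: float) -> float:
    # The code is a single lookup plus one formula; new regions only touch RATES.
    try:
        base, per_kg = RATES[region]
    except KeyError:
        raise ValueError(f"unknown region: {region}") from None
    return base + per_kg * weight_kg
```

Both functions compute the same costs; in the table-driven version a reviewer can check the rules by reading `RATES` alone, and the code path stays the same as rules are added.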
Open Source Lessons (4)
10. If you treat your beta-testers as if they're your most valuable resource, they will respond by becoming your most valuable resource.
11. The next best thing to having good ideas is recognizing good ideas from your users. Sometimes the latter is better.
12. Often, the most striking and innovative solutions come from realizing that your concept of the problem was wrong.
Open Source Lessons (5)
13. Perfection (in design) is achieved not when there is nothing more to add, but rather when there is nothing more to take away.
14. Any tool should be useful in the expected way, but a truly great tool lends itself to uses you never expected.
Necessary Preconditions for the Bazaar Style
• One cannot code from the ground up in bazaar style.
• One can test, debug and improve in bazaar style, but it would be very hard to originate a project in bazaar mode.
• Your nascent developer community needs to have something runnable and testable to play with.
Necessary Preconditions for the Bazaar Style (cont)
• To start community-building, what you need to be able to present is a plausible promise.
• Your program doesn't have to work particularly well. It can be crude, buggy, incomplete and poorly documented.
• What it must not fail to do is (a) run, and (b) convince potential co-developers that it can be evolved into something really neat in the foreseeable future – e.g., through strong, attractive basic design.
Where did “open source” come from?
Who Invented Open Source?
• No one knows
• Some say Linus Torvalds, initial developer of Linux (c. 1992)
• Some say Richard Stallman, founder of GNU Project (c. 1984)
• But lots of earlier software is public domain, e.g., original implementations of TCP/IP, DNS, sendmail, various other networking software
Why Didn’t Open Source Happen Earlier?
• Well, it did…
• That’s where Unix and the Internet came from…
• But in a relatively small academic-oriented circle
• And the pace was very slow
• Legal constraints of various licenses, trade secrets, and commercial interests (e.g., wrt Unix)
• The Internet wasn't (yet) good enough – egoless programming could only work in geographically compact communities.
Web and ISP Industry
• Linux was the first project to make a conscious and successful effort to use the entire world as its talent pool.
• The gestation period of Linux coincided with the birth of the World Wide Web, and Linux left its infancy during the same period in 1993-1994 that saw the takeoff of the ISP industry and the explosion of mainstream interest in the Internet.
Open Source Today
• Individual companies provide some of their software as open source (IBM, Microsoft, Sun)
• Numerous managed projects (GNU, Linux, Apache, Mozilla, OpenOffice, MySQL, …)
• Unmanaged hosting available
– Sourceforge
– Google Code
– Launchpad
– Codeplex
– others
Final Notes
Next Assignment
• Midterm Individual Assessment due Friday, November 20th
Upcoming Deadlines
• Second Iteration Plan due November 24th
• Code Inspection “week” November 23rd – December 2nd, including during class time (schedule with your TA)
• Second Iteration Progress Report due December 3rd