CMPSC 311: Introduction to Systems Programming Page 1
Institute for Networking and Security Research
Department of Computer Science and Engineering
Pennsylvania State University, University Park, PA
Systems and Internet Infrastructure Security
Version Control Systems (Part 1)
Devin J. Pohly
Version control systems
A version control system is a system for keeping track of the changes made to a document (or collection of documents) over time.
• Any kind of document...
  ‣ Resumes
  ‣ Source code
  ‣ TPS reports
Why?
• It’s a time machine!
  ‣ Look at old versions
  ‣ Never lose anything
  ‣ Revert your mistakes
  ‣ Code fearlessly!
• Collaboration
  ‣ Work in parallel
  ‣ Merge changes
  ‣ Social coding
A word about words
• These terms all refer to the same thing:
  ‣ Version control system
  ‣ Revision control system
  ‣ Source code/control management
  ‣ VCS/RCS/SCM
• For this lecture: “version control” and VCS
Basic concepts
• Revision: one meaningful change or set of changes
• Repository: where all of the revisions are stored
• Working copy: copy of one revision, where the user makes changes
• Check out: get a working copy from repository
• Check in/commit: add a new revision to repository
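The relationship between these terms can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the `Repository` class, its method names, and the file contents are hypothetical and do not correspond to any real VCS.

```python
from __future__ import annotations


class Repository:
    """Toy repository: an ordered list of revisions (a hypothetical model)."""

    def __init__(self) -> None:
        self.revisions: list[dict[str, str]] = []  # revisions[0] is revision 1

    def check_out(self, number: int | None = None) -> dict[str, str]:
        """Return a working copy: an independent copy of one revision."""
        if not self.revisions:
            return {}  # nothing committed yet, so start from an empty tree
        if number is None:
            number = len(self.revisions)  # default to the latest revision
        if not 1 <= number <= len(self.revisions):
            raise ValueError(f"no such revision: {number}")
        return dict(self.revisions[number - 1])

    def commit(self, working_copy: dict[str, str]) -> int:
        """Store the working copy as a new revision and return its number."""
        self.revisions.append(dict(working_copy))
        return len(self.revisions)


repo = Repository()
work = repo.check_out()            # empty working copy
work["foo.c"] = "int x;"
first = repo.commit(work)          # revision 1
work["foo.c"] = "int x = 1;"
second = repo.commit(work)         # revision 2
old = repo.check_out(first)        # look at an old version
```

Because every commit stores a copy, editing the working copy afterwards never alters a revision that is already in the repository.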
Basic concepts
• Branches: parallel lines of development
• Trunk/master: main development branch
• Tip/head: latest revision on a branch
• Tag: special name given to an important revision
  ‣ Often used for numbered releases like “v4.0”
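One way to picture branches, tips, and tags is as names that point at revisions: a branch name moves forward with each commit, while a tag stays put. The sketch below assumes a hypothetical `History` class and made-up branch names; it is an illustration, not how any particular VCS stores this data.

```python
from __future__ import annotations


class History:
    """Toy revision history with named branches and tags (hypothetical model)."""

    def __init__(self) -> None:
        self.parent: dict[int, int | None] = {}  # revision -> parent revision
        self.branches: dict[str, int] = {}       # branch name -> tip (moves)
        self.tags: dict[str, int] = {}           # tag name -> revision (fixed)

    def commit(self, branch: str, start_from: str | None = None) -> int:
        """Add a revision to `branch`, creating the branch if needed."""
        if branch in self.branches:
            parent = self.branches[branch]       # continue from the current tip
        elif start_from is not None:
            parent = self.branches[start_from]   # new branch forks from a tip
        else:
            parent = None                        # very first revision
        revision = len(self.parent) + 1
        self.parent[revision] = parent
        self.branches[branch] = revision         # the tip advances
        return revision

    def tag(self, name: str, revision: int) -> None:
        """Give a revision a permanent name."""
        self.tags[name] = revision


history = History()
history.commit("trunk")                          # revision 1
release = history.commit("trunk")                # revision 2
history.tag("v4.0", release)
history.commit("feature", start_from="trunk")    # revision 3, parallel line
history.commit("trunk")                          # revision 4
```

After these calls the trunk tip and the feature tip both descend from revision 2, and the tag still names revision 2.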
The first generation
• Local VCSes
• 1970s and 80s
• SCCS, RCS
• Repository stored in a shared local directory
• User must lock a file before making changes
• Lock-edit-unlock model
RCS usage
• Extremely simple
• Check out (read-only)
  ‣ rcs co foo.h
• Check out and lock
  ‣ rcs co -l foo.c
• Check in and commit changes
  ‣ rcs ci foo.c
How RCS works
• Each file has its own repository in the RCS directory, in which all of that file’s revisions are stored
[Diagram: the repository holds main.c (revisions 1.1–1.50), foo.c (1.1–1.12), and foo.h (1.1–1.5); Alice and Bob are its users.]
How RCS works
• Anyone can check out a read-only working copy of a file.
[Diagram: Alice and Bob each run rcs co foo.h and each get a working copy of foo.h at revision 1.5. Repository: main.c 1.1–1.50, foo.c 1.1–1.12, foo.h 1.1–1.5.]
How RCS works
• If Alice wants to make changes to foo.c, she must lock the file for writing when she checks it out.
[Diagram: Alice runs rcs co -l foo.c and now has foo.c at revision 1.12 alongside foo.h 1.5; Bob has foo.h 1.5. Repository: main.c 1.1–1.50, foo.c 1.1–1.12, foo.h 1.1–1.5.]
How RCS works
• Once Alice has locked foo.c, nobody else may lock it.
• Alice can now safely edit her local copy of the file.
[Diagram: Alice runs rcs co -l foo.c, then vim foo.c. Alice has foo.c 1.12 and foo.h 1.5; Bob has foo.h 1.5. Repository: main.c 1.1–1.50, foo.c 1.1–1.12, foo.h 1.1–1.5.]
How RCS works
• You can only commit changes to a file if you hold the lock.
• Committing foo.c checks in Alice’s changes as a new revision, then unlocks the file so others can lock it.
[Diagram: commands shown are rcs ci foo.c (Alice) and vim foo.h, rcs ci foo.h (Bob). Alice has foo.c 1.12 and foo.h 1.5; Bob has foo.h 1.5. Repository: main.c 1.1–1.50, foo.c 1.1–1.12, foo.h 1.1–1.5.]
How RCS works
• When Alice commits her modified foo.c, the repository creates the new revision number 1.13.
[Diagram: the repository now holds foo.c revisions 1.1–1.13; main.c (1.1–1.50) and foo.h (1.1–1.5) are unchanged. Alice and Bob each have foo.h 1.5.]
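The lock-edit-unlock walkthrough above can be sketched as a small Python model. Everything here is an assumption made for illustration: the `RcsFile` class, the `LockError` exception, and the `attempt` helper are hypothetical names, and the model ignores how RCS actually stores revisions on disk.

```python
from __future__ import annotations


class LockError(Exception):
    """Raised when a user acts on a file without being allowed to."""


class RcsFile:
    """Toy model of one file's revision history under lock-edit-unlock."""

    def __init__(self, name: str, text: str) -> None:
        self.name = name
        self.revisions = [text]            # revisions[0] is revision 1.1
        self.locked_by: str | None = None

    def check_out(self, user: str, lock: bool = False) -> str:
        """Return the latest text; with lock=True, also take the lock."""
        if lock:
            if self.locked_by not in (None, user):
                raise LockError(f"{self.name} is locked by {self.locked_by}")
            self.locked_by = user
        return self.revisions[-1]

    def check_in(self, user: str, text: str) -> str:
        """Store a new revision and release the lock; return its number."""
        if self.locked_by != user:
            raise LockError(f"{user} does not hold the lock on {self.name}")
        self.revisions.append(text)
        self.locked_by = None
        return f"1.{len(self.revisions)}"


def attempt(action) -> bool:
    """Run action() and report whether it finished without a LockError."""
    try:
        action()
        return True
    except LockError:
        return False


foo = RcsFile("foo.c", "original text")
foo.check_out("alice", lock=True)                                   # Alice locks
bob_locked = attempt(lambda: foo.check_out("bob", lock=True))       # refused
bob_committed = attempt(lambda: foo.check_in("bob", "bob's text"))  # refused
new_revision = foo.check_in("alice", "alice's text")                # unlocks
bob_locked_later = attempt(lambda: foo.check_out("bob", lock=True))
```

The cost of this model is visible in the sketch: while Alice holds the lock, Bob can read the file but cannot contribute changes to it.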
The second generation
• Centralized VCSes
• 1980s to 2000s
• CVS, Subversion (SVN)
• Still widely used
• Repository on a server with many clients
• Copy-modify-merge model
How Subversion works
• Spot the differences!
[Diagram: one repository holding revisions 1–56 of foo.c, foo.h, and main.c together; Alice and Bob are its users.]
How Subversion works
• Files are stored in one repository rather than individual ones.
• Repository and users can all be on different hosts.
[Diagram: one repository holding revisions 1–56 of foo.c, foo.h, and main.c together; Alice and Bob are its users.]
How Subversion works
• Checkout does the same thing: gets a working copy of the latest revision (which now contains all the files) from the repository.
• You don’t have to lock files to change them.
[Diagram: Alice and Bob each run svn checkout URL and each get a working copy of foo.c, foo.h, and main.c. Repository: revisions 1–56.]
How Subversion works
• If Alice modifies some files and commits her changes...
[Diagram: Alice runs vim foo.c, vim foo.h, then svn commit. Repository: revisions 1–56.]
How Subversion works
• ... a new revision of the repository (r57) is created.
[Diagram: the repository now holds revisions 1–57; Alice and Bob still have their working copies of foo.c, foo.h, and main.c.]
How Subversion works
• If Bob makes changes that don’t overlap with Alice’s, Subversion can merge them automatically.
• This is what happens most of the time.
[Diagram: Bob runs vim main.c, then svn commit. Repository: revisions 1–57.]
How Subversion works
• If Subversion can’t merge the changes automatically, it notifies Bob that there is a merge conflict.
[Diagram: Bob runs vim foo.c, then svn commit. Repository: revisions 1–57.]
How Subversion works
• So Bob updates his working copy with Alice’s changes, merges them with his, and tries to commit again.
[Diagram: Bob runs svn update, vim foo.c, then svn commit. Repository: revisions 1–57.]
How Subversion works
• Bob’s commit succeeds this time, creating revision 58, which is then stored in the repository.
[Diagram: the repository now holds revisions 1–58.]
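The copy-modify-merge sequence from the preceding slides can be approximated with a toy model. This is a deliberately simplified sketch: it compares whole files rather than lines (real Subversion can merge non-overlapping edits within one file), and the `Server`, `WorkingCopy`, and `OutOfDate` names and the file contents are all hypothetical.

```python
class OutOfDate(Exception):
    """Raised when a commit touches a file that changed in the repository."""


class WorkingCopy:
    def __init__(self, server, files):
        self.server = server
        self.pristine = dict(files)   # text as last received from the server
        self.files = dict(files)      # text the user edits

    def changed(self):
        """Names of files edited locally since they were last received."""
        return {n for n, text in self.files.items() if self.pristine.get(n) != text}

    def update(self):
        """Bring in newer repository text; return names needing a manual merge."""
        edited = self.changed()
        conflicts = []
        for name, latest in self.server.files.items():
            if latest == self.pristine.get(name):
                continue                      # nothing new for this file
            if name in edited:
                conflicts.append(name)        # changed on both sides
            else:
                self.files[name] = latest     # take the newer text as-is
            self.pristine[name] = latest
        return sorted(conflicts)


class Server:
    def __init__(self, files, revision):
        self.files = dict(files)
        self.revision = revision

    def checkout(self):
        return WorkingCopy(self, self.files)

    def commit(self, wc):
        """Accept wc's edits as a new revision unless they are out of date."""
        edited = wc.changed()
        stale = sorted(n for n in edited if self.files.get(n) != wc.pristine.get(n))
        if stale:
            raise OutOfDate(f"run update first: {stale}")
        for name in edited:
            self.files[name] = wc.files[name]
            wc.pristine[name] = wc.files[name]
        if edited:
            self.revision += 1
        return self.revision


server = Server({"foo.c": "v56", "foo.h": "v56", "main.c": "v56"}, revision=56)
alice, bob = server.checkout(), server.checkout()

alice.files["foo.c"] = "alice's foo.c"
alice.files["foo.h"] = "alice's foo.h"
r57 = server.commit(alice)

bob.files["foo.c"] = "bob's foo.c"
try:
    server.commit(bob)
    rejected = False
except OutOfDate:
    rejected = True

conflicts = bob.update()                         # foo.h arrives; foo.c conflicts
bob.files["foo.c"] = "alice's and bob's foo.c"   # Bob merges by hand
r58 = server.commit(bob)
```

Nobody had to take a lock: the server only checks, at commit time, whether the committer has seen the latest text of the files being changed.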
Commit graphs
• By convention, the arrow points from the child revision to the parent revision.
• Every branch in Subversion or RCS has an entirely linear commit graph.
• (Branches are linearized when you merge them.)
[Commit graph, SVN repository: 56 ← 57 ← 58]
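A linear commit graph is easy to represent: each revision records its single parent, and following those child-to-parent arrows from the tip reproduces the history. The `history` function and the `parent` dictionary below are hypothetical names chosen for this sketch, and revision 56 is treated as the oldest one simply because it is the oldest one drawn.

```python
def history(parent, tip):
    """Follow child-to-parent arrows from tip back to the oldest known revision."""
    chain = []
    current = tip
    while current is not None:
        chain.append(current)
        current = parent.get(current)  # None once no parent is recorded
    return chain


parent = {57: 56, 58: 57}   # each child records its parent
log = history(parent, 58)
```

Storing the arrow on the child means that adding a revision never requires modifying an older one.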
The third generation
• Distributed VCSes
• 2000s to today
• Bazaar, Git, Mercurial
• Seeing widespread use
• Everyone has a full repository
• Highly collaborative
  ‣ Linux development
  ‣ GitHub and other “social coding” sites
Sharing your revisions
• When you commit your changes, the revision is stored in your local repository
• All communication is between repositories
  ‣ You push local revisions to a remote repository...
  ‣ and you pull revisions from a remote repository into the local one.
How Git works
• Spot the differences
[Diagram: an “official” repository with history ..A containing foo.c, foo.h, and main.c; users Alice and Bob.]
How Git works
• Alice and Bob will both have their own repositories, no different from the “official” one.
[Diagram: an “official” repository with history ..A containing foo.c, foo.h, and main.c; users Alice and Bob.]
How Git works
• First step is to clone the repository, not check it out.
• This gives you a local clone of the entire repo!
[Diagram: Alice and Bob each run git clone URL. The “official” repository, Alice’s repository, and Bob’s repository each hold history ..A with foo.c, foo.h, and main.c.]
How Git works
• Alice can now work locally (and offline), without worrying about other repositories.
[Diagram: Alice runs git checkout to get a working copy of revision A (foo.c, foo.h, main.c) from her own repository, which holds ..A.]
How Git works
• Alice edits foo.c as usual, adds her changes to the new revision, and commits it.
[Diagram: Alice runs vim foo.c, git add foo.c, then git commit. Her working copy is at A; her repository holds ..A.]
How Git works
• Revision B, based on A, is now in Alice’s repository.
[Diagram: Alice’s working copy is now at B; her repository holds ..B.]
How Git works
• In our Git example, Alice has committed revision B “onto” revision A.
[Commit graph, Alice’s repository: A ← B]
How Git works
• She can then commit another revision C onto that.
[Commit graph, Alice’s repository: A ← B ← C]
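The clone-then-commit flow can be sketched by treating a repository as a table of revisions plus a pointer to the latest one. The `Repo` class below is a hypothetical model for illustration only; it uses the slide's letters as revision names and does not reflect how Git actually identifies or stores commits.

```python
from __future__ import annotations


class Repo:
    """Toy repository: revision name -> parent names, plus the latest revision."""

    def __init__(self) -> None:
        self.parents: dict[str, tuple[str, ...]] = {}
        self.head: str | None = None

    def clone(self) -> "Repo":
        """Return a full, independent copy of this repository."""
        copy = Repo()
        copy.parents = dict(self.parents)
        copy.head = self.head
        return copy

    def commit(self, name: str) -> None:
        """Record a new revision whose parent is the current head."""
        self.parents[name] = () if self.head is None else (self.head,)
        self.head = name


official = Repo()
official.commit("A")

alice = official.clone()   # Alice gets the whole history, not just a snapshot
alice.commit("B")          # committed "onto" A
alice.commit("C")          # committed "onto" B
```

Both commits happen entirely inside Alice's clone; the official repository is untouched until she pushes.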
Push and pull in Git
• Alice can push her new revisions to another repository, such as the “official” one.
• The revision being pushed must be a descendant of the remote repository’s latest revision.
[Diagram: Alice runs git push, sending A..C to the “official” repository. Official: ..A; Alice: ..C; Bob: ..B'.]
Push and pull in Git
• The official repository now contains Alice’s changes.
• Notice Bob has also committed B' but not yet pushed!
[Diagram: Official: ..C; Alice: ..C; Bob: ..B'.]
Push and pull in Git
• Bob’s commit graph has his new revision, but none of Alice’s.
[Commit graph, Bob’s repository: A ← B']
Push and pull in Git
• Bob cannot push his changes yet, because B' is not a descendant of C.
[Diagram: Bob attempts git push. Official: ..C; Alice: ..C; Bob: ..B'.]
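The descendant rule for pushing can be sketched as a reachability check over parent links. This is a simplified model, not Git's implementation: repositories are plain dictionaries, and the `ancestors` and `push` functions are hypothetical names introduced for this example.

```python
def ancestors(parents, revision):
    """Return revision plus everything reachable by following parent links."""
    seen = set()
    stack = [revision]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen


def push(local, remote):
    """Copy local history to remote if remote's head is already part of it."""
    reachable = ancestors(local["parents"], local["head"])
    if remote["head"] is not None and remote["head"] not in reachable:
        return False                      # pushed revision is not a descendant
    for revision in reachable:
        remote["parents"][revision] = local["parents"][revision]
    remote["head"] = local["head"]
    return True


official = {"parents": {"A": ()}, "head": "A"}
alice = {"parents": {"A": (), "B": ("A",), "C": ("B",)}, "head": "C"}
bob = {"parents": {"A": (), "B'": ("A",)}, "head": "B'"}

alice_pushed = push(alice, official)   # C descends from A, so this is accepted
bob_pushed = push(bob, official)       # B' does not descend from C: refused
```

The check guarantees that accepting a push never discards revisions the remote already has.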
Push and pull in Git
• Bob needs to get Alice’s new changes and merge them.
• First he pulls the revisions from the official repo...
[Diagram: Bob runs git pull, receiving A..C from the “official” repository. Official: ..C; Alice: ..C; Bob: ..B'.]
Git merges
• ... and they are added to his repository.
• He can now merge Alice’s changes with his.
[Commit graph, Bob’s repository: A ← B ← C, and A ← B']
Git merges
• This creates a new merge revision D which has two parents: B' and C.
[Commit graph, Bob’s repository: A ← B ← C ← D, and A ← B' ← D]
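A merge revision differs from an ordinary one only in having two parents. The sketch below models the two steps from the slides separately, receiving the missing revisions and then merging; note that the real git pull command normally performs both in one step. The dictionaries and the `receive`, `merge`, and `ancestors` functions are hypothetical names for illustration.

```python
def ancestors(parents, revision):
    """Return revision plus everything reachable by following parent links."""
    seen = set()
    stack = [revision]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen


def receive(local, remote):
    """Copy into local every revision that remote has and local lacks."""
    for revision, its_parents in remote["parents"].items():
        local["parents"].setdefault(revision, its_parents)


def merge(repo, name, other):
    """Create revision `name` with two parents: the current head and `other`."""
    repo["parents"][name] = (repo["head"], other)
    repo["head"] = name


official = {"parents": {"A": (), "B": ("A",), "C": ("B",)}, "head": "C"}
bob = {"parents": {"A": (), "B'": ("A",)}, "head": "B'"}

receive(bob, official)                  # Bob now also has B and C
merge(bob, "D", official["head"])       # D has parents B' and C
can_push = official["head"] in ancestors(bob["parents"], bob["head"])
```

Because D reaches C through one of its parents, it satisfies the descendant rule and Bob's next push can be accepted.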
Coming full circle
• Since D is a descendant of C, Bob can now push!
[Diagram: Bob runs git push. Official: ..C; Alice: ..C; Bob: ..D.]
Coming full circle
• The official repository now has revision D, which contains both Alice’s and Bob’s changes.
[Diagram: Official: ..D; Alice: ..C; Bob: ..D.]
Coming full circle
• Developers can also collaborate directly.
• Here Alice gets Bob’s latest revisions directly from Bob himself rather than from the “official” repository.
[Diagram: Alice runs git pull BOB_URL, receiving C..D from Bob’s repository. Official: ..D; Alice: ..C; Bob: ..D.]
Coming full circle
• This could be used, for instance, to collaborate on experimental features that aren’t ready for prime-time.
[Diagram: Official: ..D; Alice: ..D; Bob: ..D.]
Trends in version control
• Isolated to collaborative
• Serial to concurrent
• Linear to branching
• Centralized to distributed
• Limited workflows to many possibilities