Based on a Wikipedia article
Last edited on October 11th, 2006 by Garrett Rooney and Guido Haarmans
Summary: a short overview of Version Control, based on a Wikipedia article but edited for Subversion users.
Version Control (also known as Revision Control) is the management of multiple versions of the same unit of information. It is most commonly used in engineering and software development to manage ongoing evolution of digital documents like source code, blueprints or electronic models and other critical information that may be worked on by a team of people. Changes to these documents are identified by incrementing an associated number or letter code, termed the "version number", "version level", or simply "version" and associated historically with the person making the change. A simple form of version control, for example, has the initial issue of a drawing assigned the version number "1". When the first change is made, the version number is incremented to "2" and so on.
Software tools for version control are increasingly recognized as being necessary for most software development projects.
Overview
Engineering version control developed from formalized
processes based on tracking versions of early blueprints. Implicit in
this control was the option to be able to return to any earlier state
of the design, for cases in which an engineering dead-end was reached
in iterating any particular engineering design. Likewise, in computer
software engineering, version control is any practice which tracks and
provides controls over changes to source code. Software developers
sometimes use version control software to maintain documentation and
configuration files as well as source code. In theory, version control
can be applied to any type of information record. However, in practice,
the more sophisticated techniques and tools for version control have
rarely been used outside software development circles (though they
could actually be of benefit in many other areas).
As software is developed and deployed, it is extremely
common for multiple versions of the same software to be deployed in
different sites, and for the software's developers to be working
privately on updates. Bugs and other issues with software are often
only present in certain versions (because of the fixing of some
problems and the introduction of others as the program evolves).
Therefore, for the purposes of locating and fixing bugs, it is vitally
important for the debugger to be able to retrieve and run different
versions of the software to determine in which version(s) the problem
occurs. It may also be necessary to develop two versions of the
software concurrently (for instance, where one version has bugs fixed,
but no new features, while the other version is where new features are
worked on).
At the simplest level, developers can simply retain
multiple copies of the different versions of the program, and number
them appropriately. This simple approach has been used on many large
software projects. Whilst this method can work, it is inefficient (as
many near-identical copies of the program will be kept around),
requires a lot of self-discipline on the part of developers, and often
leads to mistakes. Consequently, systems to automate some or all of the
version control process have been developed.
Traditionally, version control systems have used a
centralized model, where all the version control functions are
performed on a shared server. A few years ago, systems began using a
model where developers work directly with their own local working
copies and check in code only when needed. There are two mechanisms
that ensure that developers do not overwrite each other's work when
checking in code.
The Lock-Modify-Unlock Solution
In most software development projects, multiple
developers work on the program at the same time. If two developers try
to change the same file at the same time without some method of
managing access, they may well end up overwriting each other's
work. Most version control systems solve this in one of two ways.
Many version control systems use a lock-modify-unlock
model to address the problem of many authors clobbering each other's
work. In this model, the repository allows only one person to change a
file at a time. This exclusivity policy is managed using locks. Harry
must "lock" a file before he can begin making changes to it. If Harry
has locked a file, then Sally cannot also lock it, and therefore cannot
make any changes to that file. All she can do is read the file, and
wait for Harry to finish his changes and release his lock. After Harry
unlocks the file, Sally can take her turn by locking and editing the
file.
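The Harry and Sally scenario above can be sketched in a few lines of Python. This is only an illustrative toy, not how Subversion or any other tool implements locking; the names LockingRepository and LockError are invented for the example.

```python
class LockError(Exception):
    """Raised when a user tries to lock or change a file they do not hold."""


class LockingRepository:
    """Toy repository (hypothetical) that allows one editor per file at a time."""

    def __init__(self):
        self.files = {}  # path -> current contents
        self.locks = {}  # path -> user currently holding the lock

    def lock(self, path, user):
        holder = self.locks.get(path)
        if holder is not None and holder != user:
            raise LockError(f"{path} is locked by {holder}")
        self.locks[path] = user

    def read(self, path):
        # Reading never requires a lock.
        return self.files.get(path, "")

    def write(self, path, user, contents):
        if self.locks.get(path) != user:
            raise LockError(f"{user} must lock {path} before changing it")
        self.files[path] = contents

    def unlock(self, path, user):
        if self.locks.get(path) != user:
            raise LockError(f"{user} does not hold the lock on {path}")
        del self.locks[path]


repo = LockingRepository()
repo.lock("A", "harry")

# Sally cannot lock the file while Harry holds it; she can only read.
try:
    repo.lock("A", "sally")
    sally_blocked = False
except LockError:
    sally_blocked = True

repo.write("A", "harry", "Harry's edit\n")
repo.unlock("A", "harry")

# Once Harry releases the lock, Sally takes her turn.
repo.lock("A", "sally")
repo.write("A", "sally", repo.read("A") + "Sally's edit\n")
repo.unlock("A", "sally")
```

The sketch makes the cost of the model visible: Sally's work is serialized behind Harry's even if the two of them intended to change unrelated parts of the file.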
The Copy-Modify-Merge Solution
Subversion and other version control systems
can also use a copy-modify-merge model as
an alternative to locking. In this model, each user's client contacts
the project repository and creates a personal working copy—a
local reflection of the repository's files and directories. Users then
work in parallel, modifying their private copies. Finally, the private
copies are merged together into a new, final version. The version
control system often assists with the merging, but ultimately a human
being is responsible for making it happen correctly.
Here's an example. Say that Harry and Sally each create
working copies of the same project, copied from the repository. They
work concurrently, and make changes to the same file A within their
copies. Sally saves her changes to the repository first. When Harry
attempts to save his changes later, the repository informs him that his
file A is out-of-date. In other words, file A in the repository has
changed since he last copied it. So
Harry asks his client to merge any new changes
from the repository into his working copy of file A. Chances are that
Sally's changes don't overlap with his own; so once he has both sets of
changes integrated, he saves his working copy back to the repository.
But what if Sally's changes do
overlap with Harry's changes? What then? This situation is called a conflict,
and it's usually not much of a problem. When Harry asks his client to
merge the latest repository changes into his working copy, his copy of
file A is somehow flagged as being in a state of conflict: he'll be
able to see both sets of conflicting changes, and manually choose
between them. The copy-modify-merge model may sound a bit chaotic, but
in practice, it runs extremely smoothly. Users can work in parallel,
never waiting for one another. When they work on the same files, it
turns out that most of their concurrent changes don't overlap at all;
conflicts are infrequent. And the amount of time it takes to resolve
conflicts is far less than the time lost to waiting in a locking system.
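As a rough illustration of the workflow just described, the following Python sketch models an out-of-date check on commit and a line-by-line three-way merge on update. It is a simplification under stated assumptions: files are lists of lines, edits change lines in place (so every version has the same number of lines), and the classes Repository, WorkingCopy and OutOfDate are hypothetical names, not part of Subversion.

```python
class OutOfDate(Exception):
    """Raised when a working copy is based on an older revision than HEAD."""


class Repository:
    def __init__(self, lines):
        self.revisions = [list(lines)]  # list index = revision number

    @property
    def head(self):
        return len(self.revisions) - 1

    def checkout(self):
        return WorkingCopy(self, self.head, self.revisions[self.head])

    def commit(self, wc):
        if wc.base_rev != self.head:
            raise OutOfDate("working copy is out-of-date; update first")
        self.revisions.append(list(wc.lines))
        wc.base_rev = self.head
        wc.base = list(wc.lines)


class WorkingCopy:
    def __init__(self, repo, base_rev, lines):
        self.repo = repo
        self.base_rev = base_rev
        self.base = list(lines)   # pristine copy of the revision checked out
        self.lines = list(lines)  # the user's editable copy

    def update(self):
        """Merge HEAD into this copy; return indices of conflicting lines."""
        theirs = self.repo.revisions[self.repo.head]
        merged, conflicts = [], []
        for i, (base, mine, their) in enumerate(zip(self.base, self.lines, theirs)):
            if mine == their or their == base:
                merged.append(mine)    # only we changed it, or both agree
            elif mine == base:
                merged.append(their)   # only the repository changed it
            else:
                conflicts.append(i)    # both changed it differently
                merged.append(mine)    # keep local text; a human must resolve
        self.lines = merged
        self.base = list(theirs)
        self.base_rev = self.repo.head
        return conflicts


repo = Repository(["alpha", "beta", "gamma"])
harry, sally = repo.checkout(), repo.checkout()

# Non-overlapping edits: Sally commits first, Harry must update, then commits.
sally.lines[0] = "ALPHA"
harry.lines[2] = "GAMMA"
repo.commit(sally)
try:
    repo.commit(harry)
    rejected = False
except OutOfDate:
    rejected = True
clean_merge = harry.update()
repo.commit(harry)

# Overlapping edits: both change line 1, so Harry's update reports a conflict.
sally.update()
sally.lines[1] = "beta (Sally)"
harry.lines[1] = "beta (Harry)"
repo.commit(sally)
overlap = harry.update()
```

In the first round Harry's update merges cleanly because the two edits touch different lines. In the second round the update reports line 1 as conflicting and leaves Harry's text in place, which mirrors the point that a human ultimately has to choose between the two changes.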
Reviewers
Some systems attempt to manage who
is allowed to make changes to different aspects of the program, for
instance, allowing changes to a file to be checked by a designated
reviewer before being added.
Delta Compression
Most version control software uses
delta compression, which retains only the differences between
successive versions of files. This allows more efficient storage of
many different versions of files. Subversion has this capability.
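To make the idea concrete, here is a minimal Python sketch that stores the first version of a file in full and every later version only as a delta against its predecessor. It uses the standard difflib module to find the unchanged regions. The DeltaStore class and the delta format are assumptions made for this example; they do not describe Subversion's actual storage format.

```python
import difflib


def make_delta(old, new):
    """Describe `new` as copies from `old` plus literal inserted lines."""
    delta = []
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            delta.append(("copy", i1, i2))        # reuse old[i1:i2]
        elif tag in ("replace", "insert"):
            delta.append(("insert", new[j1:j2]))  # store only the new lines
        # "delete" needs no entry: the old lines are simply not copied
    return delta


def apply_delta(old, delta):
    """Rebuild the newer version from the older one and a delta."""
    new = []
    for op in delta:
        if op[0] == "copy":
            new.extend(old[op[1]:op[2]])
        else:
            new.extend(op[1])
    return new


class DeltaStore:
    """Hypothetical store: full first version, then one delta per version."""

    def __init__(self, first_version):
        self.base = list(first_version)
        self.deltas = []

    def add(self, lines):
        latest = self.get(len(self.deltas))
        self.deltas.append(make_delta(latest, list(lines)))

    def get(self, n):
        """Return version n (0 is the first version) by replaying deltas."""
        lines = list(self.base)
        for delta in self.deltas[:n]:
            lines = apply_delta(lines, delta)
        return lines


v0 = ["line one", "line two", "line three"]
v1 = ["line one", "line 2", "line three", "line four"]
v2 = ["line one", "line three", "line four"]

store = DeltaStore(v0)
store.add(v1)
store.add(v2)
```

The trade-off in this sketch is that retrieving a late version means replaying every delta before it. Real systems typically use more elaborate schemes to bound that cost.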
Integration with other tools
Some of the more advanced version
control tools offer many other facilities, allowing deeper integration
with other tools and software engineering processes. Plugins are often
available for IDEs such as Eclipse, the NetBeans IDE and Visual
Studio. Version Control Systems are also often at the heart of
Application Lifecycle Management Solutions such as CollabNet Enterprise
Edition.
Vocabulary
Atomic Commit:
A collection of modifications either goes into the repository
completely, or not at all. This allows developers to construct and
commit changes as logical chunks, and prevents problems that can occur
when only a portion of a set of changes is successfully sent to the
repository.
Baseline: An approved version of a document
or source file from which subsequent changes can be made.
Change: A
change (or diff, or delta) represents a specific modification to a
document under version control. The granularity of the modification
considered a change varies between version control systems.
Change List: On
many version control systems with atomic multi-change commits, a
changelist (or change set) identifies the set of changes made in a
single commit. This can also represent a sequential view on the source
code, allowing source to be examined as of any particular changelist ID.
Check-Out: A
check-out (or checkout or co) creates a local working copy from the
repository. Either a version is specified, or the latest is used.
Commit: A
commit occurs when the changes made to the working copy are written
to the repository.
Conflict: A
conflict occurs when two changes are made by different parties to the
same document or place within a document. Since the software may not be
intelligent enough to decide which change is 'correct', a user is
required to resolve the conflict.
Directory Versioning:
The ability of a modern version control system to version not only
individual files but also whole directory trees, tracking changes to
both over time.
Export: An
export is similar to a check-out except that it creates a clean
directory tree without the version control metadata used in a working
copy. Often used prior to publishing the contents.
Import: An
import is the action of copying a local directory tree (not a working
copy) into the repository.
Merge / Integration:
A merge or integration brings together (merges) concurrent changes into
a unified version.
Resolve: The
act of user intervention to address a conflict between different
changes to the same document.
Repository: The
repository is where the file data is stored, often on a server.
Version: A
version (or revision) is one state in a chain of changes.
Versioned metadata:
The ability to attach arbitrary key/value pairs to files and directories,
and to track changes to these values over time.
Update: An
update (or sync) copies the changes that were made to the repository
(e.g. by other people) into the local working directory.
Working copy:
The working copy is the local copy of files from a repository, at a
specific time or version. All work done to the files in a repository is
done on a working copy, hence the name.
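The Atomic Commit entry above can be illustrated with a small Python sketch. The changes are applied to a staged copy of the repository state, which replaces the real state only if every change succeeds. The AtomicRepository class is a hypothetical in-memory model, not a description of how any particular tool achieves atomicity.

```python
class AtomicRepository:
    """Toy in-memory repository whose commits are all-or-nothing."""

    def __init__(self):
        self.files = {}    # path -> contents
        self.revision = 0

    def commit(self, changes):
        """Apply every change or none.

        `changes` maps a path to its new contents, or to None to delete it.
        """
        staged = dict(self.files)  # work on a copy, never on the live state
        for path, contents in changes.items():
            if contents is None:
                if path not in staged:
                    raise KeyError(f"cannot delete missing file: {path}")
                del staged[path]
            else:
                staged[path] = contents
        # Reached only if every change applied cleanly.
        self.files = staged
        self.revision += 1
        return self.revision


repo = AtomicRepository()
first = repo.commit({"a.txt": "one", "b.txt": "two"})

# The second commit fails part-way through, so none of it is recorded.
try:
    repo.commit({"a.txt": "changed", "missing.txt": None})
    failed = False
except KeyError:
    failed = True
```

After the failed commit, a.txt still holds its original contents and the revision number has not advanced, which is the property the definition describes.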
Most content from this article was
derived from the Wikipedia article "Version Control",
licensed under the GNU Free Documentation License.
Additional content was derived from "Version Control with Subversion",
licensed under the Creative Commons Attribution License.