Version Control

Michael L. Collard, Ph.D.

Department of Computer Science, The University of Akron

SCM

  • Software Configuration Management
  • Tracking and controlling changes to files used in software development
  • Based on revision control (version control)
  • Used for build management
  • Also used for accounting and auditing of process and products

diff and patch

  • Distribute changes efficiently
  • Simplistic form of handling versions
  • diff utility creates patch file
  • patch utility can be used to create new file

Ex: Create Patch

Create Patch

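# download original source
mkdir -p original modified
curl https://gist.githubusercontent.com/mlcollard/b96cd1c481ed61c56205df3a8446e2d3/raw/659298d446fc4437c9db6c75cdfe676261425914/hello.cpp -o original/hello.cpp
# copy the original, then make edits to modified/hello.cpp
cp original/hello.cpp modified/hello.cpp
...
# create patch (diff exits with status 1 when the files differ)
diff original/hello.cpp modified/hello.cpp > hello.patch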

Apply Patch

# download original source
mkdir -p others
curl https://gist.githubusercontent.com/mlcollard/b96cd1c481ed61c56205df3a8446e2d3/raw/659298d446fc4437c9db6c75cdfe676261425914/hello.cpp -o others/hello.cpp
# apply patch (others/hello.cpp is updated in place)
patch others/hello.cpp -i hello.patch
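
The same round trip can be tried without a network connection. This is only a sketch: the scratch directory, the file name hello.txt, and its contents are made up for illustration, and it uses the unified format (diff -u) rather than the default format used above.

```shell
# Work in a scratch directory so nothing existing is touched
work=$(mktemp -d)
cd "$work"
mkdir original modified others

# Stand-in for the downloaded source file (contents are illustrative)
printf 'line one\nline two\nline three\n' > original/hello.txt

# "Make edits" on a copy of the original
sed 's/two/2/' original/hello.txt > modified/hello.txt

# Create the patch; diff exits with status 1 because the files differ
diff -u original/hello.txt modified/hello.txt > hello.patch

# Someone else starts from the original and applies the patch
cp original/hello.txt others/hello.txt
patch others/hello.txt -i hello.patch

# The patched copy should now be identical to the modified file
cmp others/hello.txt modified/hello.txt && echo "patched copy matches"
```

Only hello.patch has to be distributed; anyone who already holds the original can rebuild the modified file from it.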

Revision Control (Version Control)

  • Essential to coordination of changes among multiple developers
  • Maintains history of changes
  • Management of branches and product families
  • Defines workflow
  • Most important tool (after editor and compiler) for software development
  • More time/energy/effort may be spent in version control than in any other aspect of the system

Common Features

  • Versioning down to file level
  • Text and binary files
    • Text Files: Only lexical knowledge (file of characters)
    • Binary Files: Stays at file level
  • No understanding of syntactic structure of code

Management Model: File Locking

  • Only one developer at a time has access to a file/resource
  • Lock-Modify-Unlock
  • One developer at a time has the “token”, other developers have to wait
  • Library model
  • Advantage: No merging problems
  • Disadvantage: Prevents other developers from working
  • Disadvantage: Impractical in distributed development due to time/space differences
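
Lock-Modify-Unlock can be imitated with a lock directory. This is a toy sketch, not how any particular version control tool implements locking; the names report.txt, alice, and bob are hypothetical. It relies on mkdir failing when the directory already exists, so only one party can hold the "token".

```shell
work=$(mktemp -d)
cd "$work"
echo 'shared content' > report.txt

# Lock: the first developer creates the lock directory and holds the token
if mkdir report.txt.lock 2>/dev/null; then
    echo "alice: acquired lock"
fi

# A second developer tries to take the same lock and is refused
if mkdir report.txt.lock 2>/dev/null; then
    bob_status="acquired"
else
    bob_status="must wait"
fi
echo "bob: $bob_status"

# Modify: only the lock holder edits the file
echo 'change by alice' >> report.txt

# Unlock: release the token so the next developer can proceed
rmdir report.txt.lock
```

No merge is ever needed, but the second developer is blocked for as long as the first holds the lock.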

Management Model: Version Merging

  • No restrictions on access
  • Developers can work simultaneously
  • Copy-Modify-Merge
  • Advantage: No restrictions on working
  • Disadvantage: Merge issues
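
Copy-Modify-Merge can be sketched with the diff3 utility, which merges two edited copies relative to their common ancestor. The file names and contents below are made up for illustration, and the two edits are deliberately placed on different lines so that they do not conflict.

```shell
work=$(mktemp -d)
cd "$work"

# Common ancestor that both developers copied
printf 'alpha\nbeta\ngamma\ndelta\nepsilon\n' > base.txt

# Copy-Modify: each developer edits a private copy
sed 's/alpha/ALPHA/' base.txt > alice.txt       # Alice changes the first line
sed 's/epsilon/EPSILON/' base.txt > bob.txt     # Bob changes the last line

# Merge: combine both sets of changes relative to the ancestor
diff3 -m alice.txt base.txt bob.txt > merged.txt
cat merged.txt
```

If both developers had changed the same line, diff3 would instead write conflict markers into the output and exit with a nonzero status, which is the "merge issues" disadvantage in practice.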

Centralized Version Control

  • e.g., Subversion, ClearCase, Vault
  • Single central repository, local working copies
  • Access controlled by server
  • One sequence of version numbers
  • Traditional approach

Distributed Version Control

  • e.g., Git, Bazaar, Darcs, Mercurial, Monotone, SVK
  • Peer-to-peer, no central repository, all are repository copies
  • No one sequence of version “numbers” (Why?)
  • Access controlled by server
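
One way to explore the "Why?" above: two peers can each commit on top of the same version without contacting anyone, so no shared counter could number their commits consistently. The sketch below assumes Git is installed; the repository names and the demo identity are hypothetical.

```shell
# Throwaway identity so commits work on an unconfigured machine
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work"

# A starting repository with one commit
git init -q origin
git -C origin symbolic-ref HEAD refs/heads/master   # use the branch name from these notes
echo 'v1' > origin/file.txt
git -C origin add file.txt
git -C origin commit -q -m 'initial'

# Two peers, each with a full copy of the repository
git clone -q origin alice
git clone -q origin bob

# Each commits independently; neither knows about the other
echo 'alice' >> alice/file.txt && git -C alice commit -q -am 'alice change'
echo 'bob'   >> bob/file.txt   && git -C bob   commit -q -am 'bob change'

# Both new commits share a parent, yet each has its own id
alice_id=$(git -C alice rev-parse HEAD)
bob_id=$(git -C bob rev-parse HEAD)
echo "alice: $alice_id"
echo "bob:   $bob_id"
```

Both commits are legitimately "the next version" after the initial one, so they are named by content-derived ids rather than by a position in a single sequence.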

SVN

  • Versions identified by monotonically increasing numbers
  • URLs identify both location of central repository and directories/files in the repository
  • Each commit has an author
  • Supports per-directory permissions; however, commit messages are visible to anyone with any level of access to the repository

SVN Workflow

  • Update the working copy
  • Make changes in the local working copy
  • Commit changes

Common SVN Issues

  • Need access to server to create shared repository
  • No distinction between private and public changes
  • Merging is difficult
  • Branching creates problems

SVN View

Git

  • Distributed revision control and SCM (Source Code Management) system
  • Created in 2005 by Linus Torvalds for Linux kernel development
  • Versions are identified by SHA-1 ids (160-bit numbers written in hexadecimal)
  • The URL identifies only the location of the repository. A repository always has branches and tags; the default branch is “master”
  • Each commit has an author and a committer
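
The ids and the author/committer distinction can be inspected directly. This sketch assumes Git is installed and uses its default SHA-1 object format; the names Alice Author and Carol Committer are hypothetical.

```shell
work=$(mktemp -d)
cd "$work"
git init -q demo
cd demo

echo 'hello' > hello.txt
git add hello.txt

# The author wrote the change; the committer recorded it in the repository
GIT_AUTHOR_NAME='Alice Author' GIT_AUTHOR_EMAIL=alice@example.com \
GIT_COMMITTER_NAME='Carol Committer' GIT_COMMITTER_EMAIL=carol@example.com \
git commit -q -m 'add hello'

# The commit id: each hexadecimal digit carries 4 bits
id=$(git rev-parse HEAD)
echo "$id"
echo "${#id} hex digits"

# %an is the author name, %cn the committer name
git log -1 --format='author: %an, committer: %cn'
```

The two roles usually coincide; they differ when, for example, one person applies a patch that another person wrote.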

Git View

Git Characteristics

  • peer-to-peer
  • Each copy is a full-fledged repository, and can be worked on without network access
  • Each user clones the repository, makes changes and pushes the changes
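
The clone, change, push cycle can be tried entirely on one machine. In this sketch a bare repository in a scratch directory stands in for the copy that users push to; the names shared.git, alice, and bob and the demo identity are hypothetical.

```shell
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work"

# A bare repository (no working files) that both users push to
git init -q --bare shared.git
git -C shared.git symbolic-ref HEAD refs/heads/master

# First user: clone, make a change, push
# (Git warns that the cloned repository is empty; that is expected here)
git clone -q shared.git alice
git -C alice symbolic-ref HEAD refs/heads/master
echo 'first line' > alice/notes.txt
git -C alice add notes.txt
git -C alice commit -q -m 'start notes'
git -C alice push -q origin master

# Second user: clone (receives the full history), change, push
git clone -q shared.git bob
echo 'second line' >> bob/notes.txt
git -C bob commit -q -am 'extend notes'
git -C bob push -q origin master

# First user brings in the second user's work
git -C alice pull -q --ff-only origin master
```

The commits themselves need no network access; only clone, push, and pull talk to another repository.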

Git Benefits

  • Records changes, not revisions
  • Local and remote repository
  • Tracks merged data
  • Staging changes into several patches
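
Staging is what lets one batch of edits become several separate commits. A minimal sketch, assuming Git is installed; the file names and messages are made up for illustration.

```shell
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work"
git init -q demo
cd demo

printf 'parser v1\n' > parser.txt
printf 'docs v1\n' > docs.txt
git add parser.txt docs.txt
git commit -q -m 'initial'

# Two unrelated edits sit in the working tree at the same time
echo 'parser fix' >> parser.txt
echo 'typo fix' >> docs.txt

# Stage and commit only the first edit
git add parser.txt
git commit -q -m 'Fix parser'

# Then stage and commit the second
git add docs.txt
git commit -q -m 'Fix typo in docs'

git log --oneline
```

Each commit now contains one logical change. For unrelated edits inside a single file, git add -p offers the same split interactively, hunk by hunk.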

Git Comparison

  • Advantages: fast, flexible, powerful, multiuser
  • Disadvantages: complex, can be difficult to learn, GUI tools not as developed
  • Despite disadvantages, Git is becoming a standard tool for software engineering
  • Also used as data format for applications

GitHub

  • One issue with Git is that for others to see your code, you must make your repository accessible to them
  • GitHub is a hosting service for software development projects using Git
  • Provides a GUI application

Git Tools

SVN/Git Command Comparison

SVN                                           Git
svn checkout url                              git clone url
svn update                                    git pull
svnadmin create repo; svn import file://repo  git init; git add .; git commit
svn status                                    git status
svn revert path                               git checkout -- path
svn add file; svn rm file; svn mv file        git add file; git rm file; git mv file
svn commit                                    git commit -a

Git Workflow Overview

  • Create issue, e.g., issue1
  • Create branch for working on issue
  • Work in branch on that issue (can also work on other issues in other branches)
  • Finish issue in branch
  • Pull request from issue branch to master
  • Close issue
  • Once pulled, can even delete branch
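
The Git side of this workflow can be sketched locally. Issues and pull requests are hosting-service features (for example, on GitHub), so in this sketch a local merge stands in for accepting the pull request; the file name and commit messages are hypothetical.

```shell
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work"
git init -q project
cd project
git symbolic-ref HEAD refs/heads/master
echo 'v1' > app.txt
git add app.txt
git commit -q -m 'Initial commit'

# Create a branch for working on the issue and switch to it
git checkout -q -b issue1

# Work in the branch on that issue
echo 'fix for issue1' >> app.txt
git commit -q -am 'Fix issue1'

# Stand-in for the pull request: merge the issue branch into master
git checkout -q master
git merge -q --no-ff -m 'Merge issue1' issue1

# Once merged, the branch can be deleted
git branch -d issue1
git log --graph --oneline
```

The --no-ff option forces a merge commit, so the history keeps a visible record that the work happened on a branch.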

Git Structure

  • View what we have done overall:
    • git log --graph
  • To see what commits belong to master:
    • git log master
  • And what commits belong to issue1:
    • git log issue1
  • Now could delete the branch at GitHub, and also locally:
    • git branch -d issue1

Branch Lifetime

  • hotfix or issue branch
    • Lifetime is only while changes are being made, and until pull request
  • Long-running branches
    • Lifetime continues; changes are periodically pulled into master

Not Always master

  • master is default
  • Other common long-running branches:
    • develop
    • build
  • Changes are pulled as they reach a state of acceptance, need, etc.
  • Branches become a way to organize development, and organize stages of acceptance/verification
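
A long-running branch can be sketched the same way. Here develop collects work until it is accepted, and the range master..develop lists what master has not received yet. The file name and commit messages are made up, and a local merge again stands in for a pull request.

```shell
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work"
git init -q project
cd project
git symbolic-ref HEAD refs/heads/master
echo 'v1' > app.txt
git add app.txt
git commit -q -m 'Initial commit'

# develop is a long-running branch that collects work in progress
git checkout -q -b develop
echo 'feature A' >> app.txt && git commit -q -am 'Add feature A'
echo 'feature B' >> app.txt && git commit -q -am 'Add feature B'

# Commits on develop that master does not have yet
pending=$(git rev-list --count master..develop)
git log --oneline master..develop

# When develop reaches an accepted state, bring it into master
git checkout -q master
git merge -q --no-ff -m 'Merge develop into master' develop

# develop is kept for further work; nothing is pending now
git rev-list --count master..develop
```

Unlike an issue branch, develop is not deleted after the merge.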

Git Complexity

  • Mixture of plumbing and porcelain commands
  • Over 160 commands
  • To fully understand, need to understand how git is implemented
  • However, extremely powerful tool, in widespread use
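
The plumbing/porcelain split can be seen by storing a file's content both ways. A minimal sketch, assuming Git is installed: git hash-object, git cat-file, and git ls-files are plumbing commands, while git add is porcelain. The file name hello.txt is made up.

```shell
work=$(mktemp -d)
cd "$work"
git init -q demo
cd demo

# Plumbing: write content straight into the object database as a blob
blob=$(echo 'hello' | git hash-object -w --stdin)
echo "stored blob $blob"

# Plumbing: inspect the stored object by its id
git cat-file -t "$blob"     # object type
git cat-file -p "$blob"     # object content

# Porcelain: git add performs the same storage step behind the scenes
echo 'hello' > hello.txt
git add hello.txt

# Plumbing again: the index entry refers to a blob id (second field)
staged=$(git ls-files -s hello.txt | cut -d' ' -f2)
echo "staged blob $staged"
```

Both routes produce the same id because the id is derived from the content alone, which hints at why understanding the implementation helps in understanding the commands.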