Tag: git

How To Use Git-SVN as the Only Subversion Client You’ll Need

I’ve been using git as my favorite version control tool for quite a while now. One of its numerous distinguishing features is an optional component called git-svn, which serves as a bi-directional “bridge” that enables native git repositories to interact with a Subversion repository, performing all the normal operations you would need to use svn for. In other words, since you can checkout, commit to, and query the logs of Subversion repositories (among other things) using git-svn, git can serve as your all-in-one Subversion client.

One reason you might use git-svn is that your project actually resides in a Subversion repository and other people need to access it using Subversion-only tools. Another is that you have multiple projects, some that use git and others that use Subversion, and, like me, you’re tired of switching between svn and git commands. For us, it’s far easier to simply use git as a Subversion client and never have to call svn directly.

As an important aside, please note that I would strongly discourage people who are new to git from learning about it by using git-svn. Although you may think that moving to git from Subversion would be eased by using the git-svn bridge, I really don’t think that’s the case. You’re much, much better off simply using git by itself right off the bat, and you can do this even if your fellow committers are using Subversion.

Also, I’m going to assume you’ve already got a Subversion repository set up somewhere.

First, check out the Subversion repository. In Subversion you would do this:

svn checkout http://example.com/path/to/svn/repo

With git-svn, you do this:

git svn clone http://example.com/path/to/svn/repo

This will cause git-svn to create a new directory called repo, switch to it, initialize a new git repository, configure the Subversion repository at http://example.com/path/to/svn/repo as a remote git branch (confusingly called git-svn by default, although you can specify your own name by passing a -Rremote_name or --svn-remote=remote_name option), and then do a checkout.

The output of this command will be a little awkward. Here’s a sample from one of my repositories:

r14 = dbd7266f328ef2ad061ea4532f39ce7cebaba0c5 (git-svn)
	M	trunk/Chapter 6/Chapter 6.doc
	M	trunk/Chapter 6/code examples/6.1.html
	A	trunk/Chapter 6/code examples/6.2.html
r15 = 4cca08341ab0600069cece77ce67afc449caca68 (git-svn)
	M	trunk/Chapter 6/Chapter 6.doc
	A	trunk/Chapter 6/code examples/print.css
	A	trunk/Chapter 6/code examples/screen.css
	M	trunk/Chapter 6/code examples/6.1.html
	M	trunk/Chapter 6/code examples/6.2.html
r16 = 7b2f3e0ccfd79be61b527b6ba325f8689475dc01 (git-svn)
	M	trunk/Chapter 5/Chapter 5.doc
r17 = a319764855361d92bb6e006cfd18a51319046cae (git-svn)
	M	trunk/Chapter 5/Chapter 5.doc
r18 = 4cd5cb43d33b2dd45bd39b9a2b7ea9416f9e3d8f (git-svn)
	M	trunk/Chapter 6/Chapter 6.doc
	M	trunk/Chapter 6/code examples/screen.css
	M	trunk/Chapter 6/code examples/6.1.html

As you can see, git-svn is associating specific Subversion revisions with particular git commit objects. Due to this required mapping, the initial cloning process of a Subversion repository may take some time. This is a good opportunity for your morning coffee break.

When this process is done, you’ll have a typical git repository with a local master branch and one remote branch for the Subversion repository:

Perseus:repo maymay$ git branch
* master
Perseus:repo maymay$ git branch -r

You can now treat the Subversion repository as though it were a remote branch of sorts. Say you’ve done a bunch of work and, as you typically do with git, you commit this work to your topic branch.

Perseus:repo maymay$ git checkout -b awesome-feature
Switched to a new branch "awesome-feature"
Perseus:repo maymay$ vim awesome-feature-stylesheet.css
Perseus:repo maymay$ git add awesome-feature-stylesheet.css 
Perseus:repo maymay$ git commit -m "Now I'm perty."
Created commit 07ee832: Now I'm perty.
 1 files changed, 1 insertions(+), 0 deletions(-)
 create mode 100644 awesome-feature-stylesheet.css

Right now your changes are still in the topic branch (called awesome-feature in the above example). To get them to Subversion, you merely need to say git svn dcommit:

Perseus:repo maymay$ git svn dcommit
Committing to http://example.com/path/to/svn/repo ...

Note that pesky extra “d” in the command. This is the equivalent of Subversion’s svn commit, but the commit message used is the one from the previous command, which in this case was git commit -m "Now I'm perty.". Also interesting to note here is that because Subversion doesn’t understand git branches, any change on any branch can be “pushed” to Subversion at any time using git svn dcommit—the git commits don’t have to be on any specific branch, since all git-svn does is map a git commit object to a Subversion revision and vice versa.

Similarly, you can at any time run the equivalent of svn update to get the latest changes from the Subversion repository into your Subversion branch.

  • To do this, without affecting your working tree—that is, to only fetch the latest changes but not write them to the filesystem, just to the git-svn metadata area and the remote git branch—use git svn fetch. To apply these changes to your local branch, you simply merge: git checkout master; git merge git-svn.
  • If you do want to write out the changes to the filesystem (as svn update would do), use git svn rebase, which automatically linearizes your local git commit history after the commit history of the incoming Subversion changesets. Very slick.

If your fetching/rebasing causes a conflict, you’ll be notified and will have to resolve it as per usual. If your “pushes” to the svn repo cause a Subversion conflict, you’ll be notified and you should again edit the appropriate files to resolve it, but this time make sure you run a git svn rebase before you try dcommit-ing again (since, remember, Subversion can only handle linear commit history).
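The update-then-publish cycle described above can be sketched as a small shell function. This is only an illustration, not something git-svn ships: the function name svn_sync_and_publish and the GIT override variable are made up for this example. The only real commands it runs are git svn rebase and git svn dcommit.

```shell
#!/bin/sh
# Hypothetical helper: bring in the latest Subversion revisions, then
# publish local commits. GIT is an assumption of this sketch; set it to
# "echo git" to preview the commands without running them.
GIT="${GIT:-git}"

svn_sync_and_publish() {
    # Replay local commits on top of the newest Subversion revisions.
    if ! $GIT svn rebase; then
        echo "git svn rebase stopped; resolve the conflict before retrying" >&2
        return 1
    fi
    # Send each local commit to Subversion as its own revision.
    $GIT svn dcommit
}
```

Setting GIT="echo git" before calling svn_sync_and_publish prints the two commands instead of running them, which is a cheap way to check the order of operations.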

As always, saying man git-svn or git help svn to your shell will give you all the other details. Among these, the one you’ll most likely want to learn about is how to track multiple Subversion branches as normal git branches.

Git Fundamentals in 30 Minutes or Less

I did a brief talk on Git at SyPy recently. I had a great time learning about the differences between Git, Bazaar, Mercurial, and even some other tools like BitKeeper that got mentioned and were before my time. Both my co-presenter, Alistair (who I sadly have no personal web address for!), and Martin Pool had some really interesting things to say about DSCM tools and Bazaar in particular.

So anyway, my talk was pretty dense and unfortunately I had major issues with Keynote (WTF happened to presenter notes?!) while giving the presentation. I don’t feel like I did as well as I could have. That, and I’ve learned the lesson of practice, practice, practice before doing a live demo. Ugh.

That said, I did actually prepare a bunch of slides so I figured I’d share them with everyone here. The slides are available as a downloadable PDF with my presenter notes, or a ZIP archive of the Keynote file, if you’ve got that installed on your Mac.

I got some fantastic feedback from the great folks at the SyPy meeting. One piece of advice I thought was exceptionally pointed was that in the context of a “which tool to use” presentation, my presentation is very technical, probably too technical. Instead, I should have said more about the different ways I use Git and what I use it for.

I could have talked about how I use git as a core tool in the change management process for server configurations. Since git’s big selling point is scalability, this process also turns out to be really useful for larger server deployments. When (appropriately privileged) coworkers need configuration changes to a particular machine, they can actually send me a pull request and I can review their configuration change. I also could have talked about the various different binary file types I often store in git repositories, such as image, Flash, and other video assets for web development purposes. Git handles binaries exceptionally well!

None of these things made it into the presentation slides, so I tweaked the title to reflect the more fundamental technical thrust of its contents. Perhaps this means another git talk is in the works. Or maybe a sequel to this one called Git Fundamentals in 45 Minutes or Less. In any event, if you have any feedback or suggestions, constructive comments are always appreciated. :)

Download presentation

Are you missing the point of using a version control tool?

The other day I gave a brief (and overly-hyper) talk about git, the (very) dumb, (very) fast version control system. It was part of SyPy’s Git vs. Hg vs. Bzr night. Rather than be flamingly competitive, however, I had a lot of fun that night learning about the differences between the DSCM tools, which was especially interesting since I’ve only ever used Git in real life scenarios.

Since I’m a Subversion refugee, my experience with different version control systems is mostly with the distinctions between the centralized and the distributed models, not between the various tools you can use in either paradigm. What struck me when I first began using git was how conceptually similar it felt to using Subversion when I was using it by myself (as a lone developer) but how radically different it suddenly felt the moment I was sharing my code with someone else.

Now, I’m a die-hard individualist. I want things to happen my way as much as possible, and I don’t really care what happens for anyone else as long as when I interact with other people those interactions are as mutually beneficial as they can possibly be. That’s why I love DSCM tools so much.

Distributed source code management systems feel much more like translator tools between the ways in which people work as opposed to feeling like a dogma of workflow management processes, like centralized systems do. This paradigm appeals both to my preferred way to work and, as it turns out, helps more people stay more productive all at the same time.

This is also why I’m a firm believer that most of the people I’ve worked with in the past completely missed the point of using version control systems. It seems to me that most developers I’ve worked with have thought of SCM tools as “the ‘Save As…’ button on steroids.” While these developers are technically correct, their narrow view of what a VCS does means they aren’t taking advantage of the full potential of the concept.

The power of a version control system isn’t just in that it gives you the ability to easily hit the proverbial “Save As…” button as much as you want, but rather in that you get to retrieve those other versions when you’re ready for them, regardless of what your fellow developers are doing to the code on their machines. This means that a version control system’s real purpose is to insulate you from changes of any sort until you’re ready to deal with them. A good tool also does this reciprocally; it will insulate your fellow developers from the changes you’re making until they’re ready for them.

Admittedly, that’s not a very concrete “feature.” It’s more like a fundamental philosophical principle, which is probably why it’s so hard to encode into the physical manifestation of a tool. Then on top of all of that complicatedness you have to add things like usability and interoperability and resource efficiency. That’s where I learned about the majority of the distinctions between the various DSCM tools discussed in SyPy’s presentation.

However, for me, all of those things ultimately get evaluated against the following question: Does Feature X help insulate me from change (does it help in persisting my view of the state of the world until I’m ready for it to change), or not?

For example, Bazaar’s interesting notion of “nested commits” with dotted revision numbers is really intriguing because it’s much (much) more user-friendly than git’s notion of exposing SHA-1 hashes to (mere mortal) end users’ eyes. Yet, while it’s certainly less painful than copying-and-pasting hashes all over the place, there’s little fundamental difference in the way these mechanisms actually portray the state of the world to me. Any given SHA-1 will always be the exact same commit object. Any given dotted revision number will also always be the same commit (within one’s own unchanged repository).

In contrast, I learned from Martin Pool that Bazaar has a “push over SFTP” feature to let you “export” or “archive” a version of code by transmitting it over an SFTP connection. Now that really caught my attention because it’s an example of the version control tool acting like that translator I was mentioning earlier; the interoperability helps people not need to change until they want to. In this case, it means you never have to install Bazaar on a remote server to get your content there via the tool. That’s very cool—much cooler than the mundane technical fact that bzr supports the SFTP protocol out of the box.

Of course, it’s technically pretty trivial to write an expect or shell script wrapper that lets git (or whatever other tool you want to use) mimic this behavior. And that’s exactly the point: technology is always the easy part. It’s doing it right at a fundamental level that’s actually really difficult.

How to use HTTP Basic Authentication with git

Coming right on the heels of my need to set up a git repository on shared hosts, I next wanted to see if I could use HTTP authentication for such a repository. Of course, HTTP Basic authentication is insecure on its own, since the credentials are merely encoded rather than encrypted unless the connection uses HTTPS, but it typically is enough to dissuade the casual user (such as Googlebot) from peeking at things you don’t want available on the public Internet, so it has its uses.

Note that with the set up described in the above-linked previous post, you can only pull over HTTP. This is usually what you want. If you want to be able to push over HTTP as well, git must be compiled with the USE_CURL_MULTI flag.

This is, as it turns out, because git uses curl for its HTTP operations. That means you must have curl installed on your workstation, and it also implies that it’s curl, not git, that you need to configure. In other words, as far as git is concerned, accessing a git repository that is behind HTTP authentication is exactly the same as accessing one without it, and so is publishing a git repository to an HTTP server. The rest of this short tutorial assumes you have published your repository at http://example.com/git/public-repo.git and are using the Apache web server.

Step 1: Create an HTTP Basic Authentication username and password file

First, you’ll need to create a file that lists the usernames who are permitted to access your repository over HTTP Basic authentication. This is easily accomplished with the htpasswd utility (or your host’s custom web UI, if one is provided). Let’s create a file called .git-htpasswd to store these usernames and passwords.

From your shell, run the following command:

htpasswd -c /path/to/DOCUMENT_ROOT/.git-htpasswd username

where /path/to/DOCUMENT_ROOT is the full path to the root directory of your web site and username is the username you want to add. If you want to add subsequent users to this file, run the same command again without the -c, like this:

htpasswd /path/to/DOCUMENT_ROOT/.git-htpasswd another_username

You’ll then be prompted to enter a password, and then prompted again to verify that you’ve typed it correctly.

Step 2: Configure HTTP Basic Authentication on Apache

Next, configure standard HTTP Basic Authentication on Apache. In most shared hosting environments, you’ll be allowed to configure per-directory passwords using .htaccess files. Some hosts provide web interfaces for creating “protected folders,” which is basically the same thing. Make certain that the kind of protection you select is “Basic,” because curl will require that.

To do that, create a new file named .htaccess in your DOCUMENT_ROOT/git directory (if one does not already exist) with the following contents:

AuthType Basic
AuthName "Git"
AuthUserFile /path/to/DOCUMENT_ROOT/.git-htpasswd
Require valid-user

This tells Apache to look for usernames and passwords in the file named .git-htpasswd we created in step 1.

If everything is set up correctly, you should now be able to access http://example.com/git/public-repo.git in your Web browser and you should be presented with a login dialogue box.

Step 3: Configure curl on your (client) workstation computer

Next, configure your local curl client. git-pull will call curl with its --netrc-optional switch for HTTP operations. This means curl will look for a file named .netrc in your home directory and will read authentication configurations from that file. The format of this file is incredibly simple. The machine entry must match the host name in your repository’s URL, and the user name goes after the login keyword:

machine example.com
login your_username
password your_password

To check if this is working correctly, run curl yourself to access the current HEAD of the public repository and see if you get the expected result:

curl --netrc --location -v http://example.com/git/public-repo.git/HEAD | grep 'ref: refs/heads'

If you see a line of output then you know this is working, otherwise you should double check your work.
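If you’d rather script this step, here is a minimal sketch of writing such a file from the shell. The function name write_netrc and its argument order are my own invention for illustration. The only things taken from the netrc format itself are the machine, login, and password keywords. Restricting the file to its owner is my assumption about good practice, not something this setup strictly requires.

```shell
#!/bin/sh
# Hypothetical helper: write a one-entry netrc file.
# Usage: write_netrc FILE HOST USER PASSWORD
write_netrc() {
    (
        # New files are created readable by their owner only, since the
        # file holds a plain-text password.
        umask 077
        printf 'machine %s\nlogin %s\npassword %s\n' "$2" "$3" "$4" > "$1"
    )
}
```

For the repository used in this tutorial, that would be something like write_netrc "$HOME/.netrc" example.com your_username your_password. Note that this overwrites any existing file at that path.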

Step 4: There is no step four

You’re done. With this configuration, you can git-pull as you normally would, and git will automatically use your .netrc file to enable curl’s HTTP authentication schemes.

How to install git on a shared web host’s server

Tonight I found myself with the need to host my own git repository on one of my own servers. This time, for the first time, it was a server I don’t actually have administrative access to and it was one where git wasn’t pre-installed. Thankfully, with a bit of help from Blue Static, I built and installed git from scratch in literally ten minutes. Here’s the short version of how I did it, which may even be generic enough that you can copy and paste this into a bash shell prompt on your server to do the same thing:

cd ~/                          # change to home directory
test -d ~/src || mkdir ~/src   # if there isn't already a ~/src directory, create it
cd ~/src                       # then change to that directory
curl -O http://www.kernel.org/pub/software/scm/git/git- # download
tar -xvzf git-   # and extract the git source code
cd git-                 # change to the source code directory
./configure --prefix=$HOME     # configure build to install into $HOME
make                           # do the build
make install                   # move the built binaries to the right places
echo "export PATH=\$PATH:$HOME/bin" >> ~/.bashrc # make sure non-interactive shells can find git

Of special note is the last line, which sets up the necessary $PATH specifically for non-interactive bash shells for use with git-push or git-pull. Without that, you’ll run into the infamous “bash: git-receive-pack: command not found” error.

Also, of course, lines 4 through 6 refer to a specific version of the git tarball, so you may want to change these to point to whatever is now the most recent version.
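One way to keep the version in a single place is to build the three version-dependent names from one argument. This is only a sketch: the helper names are hypothetical, and the git-VERSION.tar.gz naming is my assumption based on the tar -xvzf step. The version itself is whatever you pass in.

```shell
#!/bin/sh
# Assumption: tarballs are named git-VERSION.tar.gz under the directory
# used in the commands above. The caller supplies VERSION.
git_tarball_url() {
    printf 'http://www.kernel.org/pub/software/scm/git/git-%s.tar.gz\n' "$1"
}

# Name of the directory the tarball is assumed to unpack into.
git_source_dir() {
    printf 'git-%s\n' "$1"
}
```

With these, the download line becomes curl -O "$(git_tarball_url "$version")" and the directory change becomes cd "$(git_source_dir "$version")", where version is a variable you set yourself.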

How to import CVS code repositories into Git using `git cvsimport`

This should be straightforward, but it’s not. To import (not track, but just import) code from a remote CVS repository to a local git repository, you need to do the following:

  1. Be certain you have the git-core package installed on your system and that this package includes the git-cvsimport command. You can run git help -a | grep cvsimport to verify this.
  2. Be certain you have the cvsps command-line tool installed. This does not come with the git suite of tools, so you’ll need to get it separately. If you’re a lazy Mac OS X user, like me, you can use MacPorts: sudo port install cvsps. Otherwise, get it from the source.
  3. Prepare your CVS login information for the remote server before you run git cvsimport. You need to do this so that the git tool will be able to log you in to the CVS server automatically. The command for this looks like:
    CVSROOT=:cvs-login-method:cvs-user-name@cvs.server.name:/path/to/CVS/root cvs login

    For example, if you’re pulling code from the anonymous CVS server that runs on Drupal.org, you might use this: CVSROOT=:pserver:anonymous@cvs.drupal.org:/cvs/drupal-contrib cvs login. This command will prompt you for the password for the user you specified at the server you specified (for anonymous access, the password is almost always anonymous) and will store a scrambled copy of it in the ~/.cvspass file for future use by CVS.

  4. Finally, run the git cvsimport tool, and specify the proper options. Using the Drupal example above, your command might look like this:
    git cvsimport -v -d :pserver:anonymous@cvs.drupal.org:/cvs/drupal-contrib contributions/modules/module-name

    This would log in to cvs.drupal.org using CVS’s pserver login method, provide the username anonymous and the password you specified in the previous step (the one stored in ~/.cvspass), set the CVS document root to /cvs/drupal-contrib, and pull the code located at contributions/modules/module-name into the current working directory as a git repository.

This works pretty nicely, and creates a git repository just as though you’d created it with git init in the current working directory.

If you get an error that looks like this:

AuthReply: cvs [pserver aborted]: descramble: unknown scrambling method

then you’ve most likely specified the CVS document root incorrectly. Most notably, git cvsimport does not understand a CVS document root wherein the password is specified in the document root URL itself. So, for example, git cvsimport -d :pserver:username:password@cvs.server.name:/path/to/CVS/root code/to/checkout will not work. Omitting the password and the separating colon from the URL should fix it.
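If you have such a root saved somewhere, stripping the password can be sketched with sed. The function name strip_cvsroot_password is hypothetical, and the sketch assumes the conventional :method:user:password@host:/path layout. Roots without a password pass through unchanged.

```shell
#!/bin/sh
# Hypothetical helper: drop an embedded password from a CVSROOT string.
# Assumes the layout :method:user:password@host:/path.
strip_cvsroot_password() {
    # Keep ":method:user", discard ":password", keep "@host:/path".
    printf '%s\n' "$1" | sed 's/^\(:[^:]*:[^:@]*\):[^@]*@/\1@/'
}
```

The cleaned value is what you would then hand to git cvsimport -d, after running cvs login so the password ends up in ~/.cvspass instead.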

HowTo: Use git for personal development when everyone else is using Subversion (part 2)

When we left off, you had just finished transforming a remote Subversion repository into a git repository and optimizing it to save you some space. Now that you have a git repository, what do you do?

First things first. Once you have an idea of what work you want to do, you should give yourself a space to do this work without disturbing anybody else’s work. Do this by making a new, personal branch. Unlike Subversion and some other centralized version control systems, git makes it possible to make all kinds of changes to your repository, including making branches, and even save those changes without having to republish everything back to the central repository server at each step. In other words, if you thought Subversion branches were “cheap,” you’ll love git’s branches.

Also unlike Subversion, which stores its branches in completely separate pathnames, git keeps all branches in the same filesystem tree separated only with metadata (in .git/refs/remotes for remote branches and in .git/refs/heads for local branches, to be a bit more precise), so you don’t have to create lots of different directories for all your branches (unless you want to). With no branches defined, you’re working in the “master branch,” or “the trunk” by default.

git branch
* master

By saying git branch you ask git to print a list of all (local) branches. The one with the asterisk marks the current branch, the one you’re using at the moment you run the command. Since there are no other branches, you’re currently in the “master” branch. But we don’t want to make changes here, we want to make changes in our own private branch, so we’ll make a new one.

git checkout -b new_branch_name

This will create a new branch with the name new_branch_name and immediately switch to it. Notice that you’ve made absolutely no changes to the filesystem itself; only the git metadata has been altered. Saying git branch again will show you the change:

git branch
* new_branch_name

Also note that since we haven’t committed any changes, we don’t need a commit message (or “log message”) for creating this branch. We’ll add one later, when we need it. Now go ahead and write some code in your new branch. At any time, you can create a new branch in the same way you added the first. Each new branch is created at the HEAD (“latest”) revision of whatever branch you’re currently working in.

git checkout -b another_branch
Switched to a new branch "another_branch"
git branch
* another_branch

To switch back to any other branch, simply git checkout that branch again:

git checkout master
Switched to branch "master"

If you want to delete a branch you don’t like, that’s easy too:

git branch -d another_branch
Deleted branch another_branch.

Keep in mind that throughout all of these branch creation and deletion actions, the only thing that’s being altered is the git metadata. That’s why it’s so cheap to create new branches. If you ever have a new idea you’re working on, it’s recommended that you create a branch for it, even if that branch is so short-lived it never gets published.
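To see for yourself that a branch is nothing more than a name for a commit, you can try the following sketch in a throwaway repository. The temporary directory, the identity, and the variable names are placeholders I made up. Everything else is ordinary git.

```shell
#!/bin/sh
# Throwaway repository; the directory and identity below are placeholders.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/master   # name the first branch "master"
git -c user.name="Demo" -c user.email="demo@example.com" \
    commit -q --allow-empty -m "Initial commit"

# Creating a branch only records a new name for the current commit.
git checkout -q -b new_branch_name
branch_commit=$(git rev-parse new_branch_name)
master_commit=$(git rev-parse master)

# Switch back and delete the branch; the commit itself is untouched.
git checkout -q master
git branch -d new_branch_name >/dev/null
remaining=$(git branch)
```

Both branch_commit and master_commit end up holding the same commit ID, which is why creating and deleting branches costs next to nothing.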

So you have a new local branch, and you’ve been working as you normally do for a few minutes, creating files, editing them, and so on. Running git status now will ask git to show you the changes you’ve made to your filesystem. If you’ve created any files in new directories that git doesn’t know about, it will simply report that directory. If you’ve made new files in directories git does know about, it will list all those files explicitly.

git status
# On branch cartoon_contests
# Untracked files:
#   (use "git add <file>..." to include in what will be committed)
#	sites/default/modules/cartoon_contests/
#	sites/default/modules/factiva/.factiva.module.swp
#	sites/default/modules/testfile
#	sites/default/settings.php
nothing added to commit but untracked files present (use "git add" to track)

In the above sample output, I’m working on a new Drupal module for a web site. I’ve created a new directory, sites/default/modules/cartoon_contests/ (note the trailing slash), and I have several untracked files. One is my vim swap file for a different module, .factiva.module.swp, one’s an unimportant testfile, and the last is the Drupal configuration file, settings.php.

The only thing I want to commit is the new cartoon_contests directory, and all the files within it. As with Subversion, I have to tell git that I want to track this directory, which is done simply by saying git add cartoon_contests. Unlike with Subversion, future invocations of git status let me see everything that is going to go into the next commit.

git status
# On branch cartoon_contests
# Changes to be committed:
#   (use "git reset HEAD <file>..." to unstage)
#	new file:   sites/default/modules/cartoon_contests/cartoon_contests.info
#	new file:   sites/default/modules/cartoon_contests/cartoon_contests.module
# Untracked files:
#   (use "git add <file>..." to include in what will be committed)
#	sites/default/modules/factiva/.factiva.module.swp
#	sites/default/modules/testfile
#	sites/default/settings.php

The “Changes to be committed” section of the output is called the staging area, or the index. In this way, I can prepare all the changes I want to commit before I do so, making sure they’re perfect before I actually commit them to the git repository. At any time before I commit, I can make additional modifications, such as git adding more files or directories, git resetting to unstage all (or some) of my changes, etc. git help reset also has a number of handy explanations with examples for different things you might need to do at this point.

If you were using svn:ignore, equivalent functionality exists in git. Simply append file glob patterns, one per line, to the $GIT_DIR/info/exclude file in your git repository. Like so:

echo -e "*.swp\nsites/default/settings.php" >> .git/info/exclude ; git status
# On branch cartoon_contests
# Changes to be committed:
#   (use "git reset HEAD <file>..." to unstage)
#	new file:   sites/default/modules/cartoon_contests/cartoon_contests.info
#	new file:   sites/default/modules/cartoon_contests/cartoon_contests.module

If you’ve got many of these, you can use git svn show-ignore >> .git/info/exclude to search through your old Subversion repository and look for any and all ignores, automatically adding them to git’s exclude list. (Check out Tsuna’s blog entry on learning git for more tips like this.)
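One caveat with appending using >> is that running the same command twice leaves duplicate lines in the exclude file. A small sketch of an append-once helper follows. The name add_exclude is hypothetical, and it takes the file path as an argument so you can try it on a scratch file first.

```shell
#!/bin/sh
# Hypothetical helper: append a pattern to an exclude file only if that
# exact line is not already there.
# Usage: add_exclude FILE PATTERN
add_exclude() {
    touch "$1"
    # -x matches whole lines, -F treats the pattern as a literal string.
    grep -qxF -- "$2" "$1" || printf '%s\n' "$2" >> "$1"
}
```

In the example above that would be add_exclude .git/info/exclude '*.swp' followed by add_exclude .git/info/exclude 'sites/default/settings.php'. Calling either one again changes nothing.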

Finally, after you’ve done some of your work and you’ve finished staging your changes, you’re ready to commit them to the repository. On the other hand, if you hate what you’ve done and want to undo it all, you can say git reset --hard HEAD to throw away all your local changes. To throw away changes to just a single file, just check out that file again by saying git checkout filename. This is the equivalent of Subversion’s svn revert filename command.

Before you actually make your first commit, however, you should properly introduce yourself to git. You don’t have to do this because git will try to figure out who you are by itself (the details are explained here), but you probably should at least create global defaults for your user (which git will store in ~/.gitconfig). If you want to, you can also create per-repository defaults (which git will store in .git/config), or even system-wide defaults for all users of this computer (which git will store in /etc/gitconfig). To do so, say this:

git config --global user.name "Your Full Name"
git config --global user.email "you@example.com"

This will create the file ~/.gitconfig if it doesn’t already exist and will write your name and email address into it. You can also just edit the file directly yourself instead of using git config commands.
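If you’d like to see what git config writes before touching your real settings, you can point it at a scratch file with its --file option. This is just a sketch. The scratch file and the variable names are placeholders of mine, and the values are the same dummy ones used above.

```shell
#!/bin/sh
# Write to a scratch file instead of ~/.gitconfig so nothing real changes.
scratch_config=$(mktemp)
git config --file "$scratch_config" user.name "Your Full Name"
git config --file "$scratch_config" user.email "you@example.com"

# Read the values back from the same file.
configured_name=$(git config --file "$scratch_config" --get user.name)
configured_email=$(git config --file "$scratch_config" --get user.email)
```

Opening the scratch file in an editor afterwards shows the same plain-text layout you would find in ~/.gitconfig.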

Like Subversion, committing will create a saved, fixed point-in-time that marks the changes you have made to your files. Like branches, commits are also very cheap in git, so go ahead and commit at any time you like. Remember to stage your files (by git adding them), and then just git commit.

git add filepattern
git commit -m "My very frst git commit!"
Created commit ef483c1: My very frst git commit!
 5 files changed, 58 insertions(+), 12 deletions(-)

If you forget to specify a commit log message (-m "log message") on the command line, or if you want to enter a multi-line commit log, git will prompt you for it in your favorite $EDITOR. You can view a history, including all the log messages, for your project with git log. You can even view the logs in a number of pretty formats. Check git help log for more information.

If you want to change anything about the commit you just made, such as the author, you can just run git commit again with the --amend flag added to the command. Notice the typo in the commit log? Fixing it is really easy:

git commit --amend -m "My very first git commit!"
Created commit 88602f6: My very first git commit!
 5 files changed, 58 insertions(+), 12 deletions(-)
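Notice in the output above that the commit ID changed. An amend replaces the last commit rather than adding a second one on top. Here is a sketch you can run in a throwaway repository to confirm that. The directory, identity, file name, and variable names are all placeholders.

```shell
#!/bin/sh
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
git init -q
# Placeholder identity so the sketch does not depend on ~/.gitconfig.
git config user.name "Demo"
git config user.email "demo@example.com"

echo "hello" > file.txt
git add file.txt
git commit -q -m "My very frst git commit!"
before=$(git rev-parse HEAD)

# Rewrite the commit with a corrected message.
git commit -q --amend -m "My very first git commit!"
after=$(git rev-parse HEAD)

commit_count=$(git rev-list --count HEAD)   # still a single commit
last_message=$(git log -1 --format=%s)
```

Because the commit ID changes, it’s best to amend only commits you haven’t shared with anyone yet.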

Finally, with your new code committed to your local git branch, it’s time to share that code with your colleagues who are (for some inexplicable reason) still using Subversion. This is also extremely simple. Just say git svn dcommit. (That’s right, dcommit, not just commit. I don’t know why….)

Not sharing your changes via Subversion, but with a patch instead? git diff -p will generate patches for you. See git help diff-files and look for the “GENERATING PATCHES WITH -P” section.

If you found this helpful, you may also enjoy an alternative tour for beginners from Carl Worth.

HowTo: Use git for personal development when everyone else is using Subversion (part 1)

Let’s just jump into it.

Using Mac OS X, first make sure you have the git core installed as well as the git-svn components. Unlike other version control systems, git is not one monolithic program but a collection of smaller utilities, and the Subversion conduits are a subset of these utilities. The git-Subversion utilities themselves depend on having the Subversion Perl bindings (the SVN::Core module) installed for Perl 5.

For the lazy MacPorts user, simply run:

sudo port install git-core +svn

This will install git-core and the git-svn package, as well as all the documentation for git and any required dependencies you don’t already have. Note that the documentation (man pages) is installed in /opt/local/man, which may not be in your default $MANPATH, so be sure to add that directory if man git returns a “No manual entry for git” error.

Alternatively, if you don’t want to use MacPorts, you can download a pre-compiled Mac OS X binary that includes git, the git docs, and the git-svn package from the git web site that comes complete with a standard GUI installation procedure.

Next, initialize a new git repository that tracks a remote Subversion one. This allows you to work privately using git, but to collaborate with other people who are using Subversion. Running

git svn init svn://my.svn.server/path/to/svn/repo workingcopy

will give you an empty git working copy named workingcopy configured to use the remote Subversion repository at svn://my.svn.server/path/to/svn/repo. This step is analogous to the need to run svnadmin create repo_db, to initialize a new repository database (where all the file versioning information will be stored). Unlike Subversion, git’s distributed database means that the working copy itself is also the location of the repository database, so there’s no need to deal with two filesystem paths anymore.

Next, change directory to your working copy, and run

cd workingcopy
git svn fetch

to populate your new, empty git working copy (and repository) with all the files from the remote Subversion repository.

Now that you have filled your git repository with a lot of data, you can, if you want to, save a significant chunk of disk space by repacking that data into a single, more efficient, native packed git archive format (a .pack file). The git-repack -a command is used to do this, and its manual page says:

Instead of incrementally packing the unpacked objects, pack everything referenced into a single pack. Especially useful when packing a repository that is used for private development and there is no need to worry about people fetching via dumb protocols from it.

According to some sources, this can turn a 353MB Subversion repository into 31MB of git pack. Say:

git repack -a -d -f
Generating pack...
Counting objects: 284
Done counting 4475 objects.
Deltifying 4475 objects...
 100% (4475/4475) done
Writing 4475 objects...
 100% (4475/4475) done
Total 4475 (delta 1876), reused 0 (delta 0)
Pack pack-a115c320ff9c9968248bd250bdfea3110d0f0c1a created.
Removing unused objects 100%...

to perform the compression. For even greater savings, increase the compression by stipulating a high --window option, such as git repack -a -d -f --window=100. (--window defaults to 10; the higher the window, the harder git-repack will try to compress your stuff.) This turned a 36MB Subversion repository in one of my projects into a 20MB git pack. As always, your mileage may vary.
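If you’re curious what repacking actually does to the object store, git count-objects -v reports the number of loose objects and packs. The sketch below checks both before and after a repack in a throwaway repository. The directory, identity, file name, and variable names are placeholders.

```shell
#!/bin/sh
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"
echo "hello" > file.txt
git add file.txt
git commit -q -m "Add file"

# Before repacking, every object is a separate "loose" file.
loose_before=$(git count-objects -v | sed -n 's/^count: //p')

git repack -a -d -f -q

# Afterwards the objects live in one pack and the loose copies are gone.
loose_after=$(git count-objects -v | sed -n 's/^count: //p')
pack_count=$(git count-objects -v | sed -n 's/^packs: //p')
```

On a real repository, the size-pack line of the same output is the one to compare against the size of your old Subversion checkout.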

Congratulations, you have transformed your old Subversion repository to a git repository. All that’s left to do now is to get to the coding. Perhaps start by making a local (cheaper than cheap!) git branch….

That was easy, right? Most things in git are, in fact, that easy. If you’ve never used other version control systems before, be grateful. If you have, you can breathe a sigh of relief. Still not convinced? Don’t take my word for it…ask Linus Torvalds (or Randal Schwartz).