Category: Unix/Linux

How to work around “sorry, you must have a tty to run sudo” without sacrificing security

While working on $client’s Linux server last week, I found myself installing a cron job that ran as root. The cron job called a custom bash script that, in turn, called out to various custom maintenance tasks $client had already written. One task in particular had to run as a different user.

During testing, I discovered that the odd-ball task failed to run, and found the following error in the system log:

sudo: sorry, you must have a tty to run sudo

I traced this error to a line trying to invoke a perl command as a user called dynamic:

sudo -u dynamic /usr/bin/perl run-periodic-tasks --load 5 --randomly

A simple Google search turned up an obvious solution to the error: use visudo to disable sudo’s tty requirement, allowing sudo to be invoked from any shell lacking a tty (including cron). This would have solved my problem, but it just felt wrong, dirty, and, most troublingly, insecure.

Some distributions ship sudo with the requiretty option enabled by default, in part so that a password is not echoed in clear text when sudo is run over SSH without a terminal. Disabling this security precaution for a simple maintenance task already running as root seemed totally unnecessary, not to mention irresponsible. Moreover, $client’s script didn’t even need a tty.

Thankfully, there’s a better way: use su --session-command and send the whole job to the background.

su --session-command="/usr/bin/perl run-periodic-tasks --load 5 --randomly" dynamic &

This line launches a new, non-login shell (typically bash) as the other user in a separate, background process and runs the command you passed using the shell’s -c option. Sending the command to the background (using &) continues execution of the rest of the cron job.

A process listing would look like this:

root     28109     1  0 17:10 ?        00:00:00 su --session-command=/usr/bin/perl run-periodic-tasks --load 5 --randomly dynamic
dynamic  28110 28109  0 17:10 ?        00:00:00 bash -c /usr/bin/perl run-periodic-tasks --load 5 --randomly

Note the parent process (PID 28109) is owned by root but the actual perl process (PID 28110) is being run as dynamic.

To me, this in-script solution, which replaces sudo -u user cmd with su --session-command=cmd user, seems much better than relying on a change to sudo’s default (and more secure) configuration.
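As a rough sketch of how this can sit inside the root-owned cron script, the su call can be wrapped in a small helper. The function name run_as_user and the DRY_RUN switch are names I made up for illustration. Also note that --session-command is not available in every su implementation, so check man su on your system first.

```shell
# run_as_user USER COMMAND...: run COMMAND as USER in the background.
# run_as_user and DRY_RUN are hypothetical names, not part of su.
run_as_user() {
    target_user="$1"
    shift
    task="$*"
    if [ -n "${DRY_RUN:-}" ]; then
        # Print the command instead of running it, handy while testing the cron script.
        printf 'su --session-command="%s" %s &\n' "$task" "$target_user"
        return 0
    fi
    su --session-command="$task" "$target_user" &
}

# Dry run of the task from this post:
DRY_RUN=1
run_as_user dynamic /usr/bin/perl run-periodic-tasks --load 5 --randomly
```

Unset DRY_RUN (and run as root) to have the helper actually launch the task.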

How to use HTTP Basic Authentication with git

Coming right on the heels of my need to set up a git repository on shared hosts, I next wanted to see if I could use HTTP authentication for such a repository. Of course, HTTP Basic authentication is weak protection on its own (over plain HTTP the credentials are merely base64-encoded, not encrypted), but it typically is enough to dissuade the casual user (such as Googlebot) from peeking at things you don’t want available on the public Internet, so it has its uses.

Note that with the set up described in the above-linked previous post, you can only pull over HTTP. This is usually what you want. If you want to be able to push over HTTP as well, git must be compiled with the USE_CURL_MULTI flag.

This is, as it turns out, because git uses curl for its HTTP operations. That means you must have curl installed on your workstation if you don’t already, and it also means that it’s curl, not git, which you need to configure. In other words, apart from that curl configuration, accessing a git repository that is behind HTTP authentication is exactly the same as accessing one without it, and so is publishing a git repository to an HTTP server. The rest of this short tutorial assumes you have published your repository at http://example.com/git/public-repo.git and are using the Apache web server.

Step 1: Create an HTTP Basic Authentication username and password file

First, you’ll need to create a file that lists the usernames who are permitted to access your repository over HTTP Basic authentication. This is easily accomplished with the htpasswd utility (or your host’s custom web UI, if one is provided). Let’s create a file called .git-htpasswd to store these usernames and passwords.

From your shell, run the following command:

htpasswd -c /path/to/DOCUMENT_ROOT/.git-htpasswd username

where /path/to/DOCUMENT_ROOT is the full path to the root directory of your web site and username is the username you want to add. If you want to add subsequent users to this file, run the same command again without the -c, like this:

htpasswd /path/to/DOCUMENT_ROOT/.git-htpasswd another_username

You’ll then be prompted to enter a password, and then prompted again to verify that you’ve typed it correctly.

Step 2: Configure HTTP Basic Authentication on Apache

Next, configure standard HTTP Basic Authentication on Apache. In most shared hosting environments, you’ll be allowed to configure per-directory passwords using .htaccess files. Some hosts provide web interfaces for creating “protected folders,” which is basically the same thing. Make certain that the kind of protection you select is “Basic,” because curl will require that.

To do that, create a new file named .htaccess in your DOCUMENT_ROOT/git directory if one does not already exist with the following contents:

AuthType Basic
AuthName "Git"
AuthUserFile /path/to/DOCUMENT_ROOT/.git-htpasswd
Require valid-user

This tells Apache to look for usernames and passwords in the file named .git-htpasswd we created in step 1.

If everything is set up correctly, you should now be able to access http://example.com/git/public-repo.git in your Web browser and you should be presented with a login dialogue box.

Step 3: Configure curl on your (client) workstation computer

Next, configure your local curl client. git-pull will call curl with its --netrc-optional switch for HTTP operations. This means curl will look for a file named .netrc in your home directory and will read authentication configurations from that file. The format of this file is incredibly simple:

machine example.com
login your_username
password your_password

To check if this is working correctly, run curl yourself to access the current HEAD of the public repository and see if you get the expected result:

curl --netrc --location -v http://example.com/git/public-repo.git/HEAD | grep 'ref: refs/heads'

If you see a line of output then you know this is working, otherwise you should double check your work.
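Since this file holds a password in plain text, it’s worth making sure only you can read it. Here is a minimal sketch of a helper that writes an entry and locks down the permissions. The name write_netrc is hypothetical, the credentials are placeholders, and the entry uses the login keyword that the .netrc format defines for the username.

```shell
# write_netrc FILE HOST USER PASSWORD (hypothetical helper name)
write_netrc() {
    netrc_file="$1"
    # Make sure the file exists and is private before any secret goes into it.
    touch "$netrc_file"
    chmod 600 "$netrc_file"
    printf 'machine %s\nlogin %s\npassword %s\n' "$2" "$3" "$4" >> "$netrc_file"
}

# Example with placeholder credentials, written to a scratch file rather than ~/.netrc:
scratch_netrc="${TMPDIR:-/tmp}/netrc-example.$$"
write_netrc "$scratch_netrc" example.com your_username your_password
cat "$scratch_netrc"
rm -f "$scratch_netrc"
```

Point the first argument at "$HOME/.netrc" once the output looks right.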

Step 4: There is no step four

You’re done. With this configuration, you can git-pull as you normally would, and git will automatically use your .netrc file to enable curl’s HTTP authentication schemes.

How to install git on a shared web host’s server

Tonight I found myself with the need to host my own git repository on one of my own servers. This time, for the first time, it was a server I don’t actually have administrative access to and it was one where git wasn’t pre-installed. Thankfully, with a bit of help from Blue Static, I built and installed git from scratch in literally ten minutes. Here’s the short version of how I did it, which may even be generic enough that you can copy and paste this into a bash shell prompt on your server to do the same thing:

cd ~/                          # change to home directory
test -d ~/src || mkdir ~/src   # if there isn't already a ~/src directory, create it
cd ~/src                       # then change to that directory
curl -O http://www.kernel.org/pub/software/scm/git/git-1.5.6.4.tar.gz # download
tar -xvzf git-1.5.6.4.tar.gz   # and extract the git source code
cd git-1.5.6.4                 # change to the source code directory
./configure --prefix=$HOME     # configure build to install into $HOME
make                           # do the build
make install                   # move the built binaries to the right places
echo "export PATH=\$PATH:$HOME/bin" >> ~/.bashrc # make sure non-interactive shells can find git

Of special note is the last line, which sets up the necessary $PATH specifically for non-interactive bash shells for use with git-push or git-pull. Without that, you’ll run into the infamous “bash: git-receive-pack: command not found” error.

Also, of course, lines 4 through 6 are referring to version 1.5.6.4 of the git tarballs, so you may want to change these to point to whatever is now the most recent version.
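To make that version change a one-line edit, the same steps can be sketched as a pair of functions. The names git_tarball_url and build_git are made up for this example, and the URL pattern is simply the one used above, so it is an assumption that newer releases follow it.

```shell
GIT_VERSION="1.5.6.4"   # the release used in this post; change as needed

# git_tarball_url VERSION: print the download URL for that release.
git_tarball_url() {
    printf 'http://www.kernel.org/pub/software/scm/git/git-%s.tar.gz\n' "$1"
}

# build_git VERSION: the same steps as above, stopping at the first failure.
build_git() {
    mkdir -p "$HOME/src" && cd "$HOME/src" || return 1
    curl -O "$(git_tarball_url "$1")" &&
    tar -xzf "git-$1.tar.gz" &&
    cd "git-$1" &&
    ./configure --prefix="$HOME" &&
    make &&
    make install
}

# Only print the URL here; call build_git "$GIT_VERSION" to actually build.
git_tarball_url "$GIT_VERSION"
```

Chaining the steps with && means a failed download or compile stops the run instead of ploughing on.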

One minute Mac tip: Schedule off-hours downloads by enabling `at`, `batch` UNIX job scheduling commands

In a lot of places in the world, many people still have to pay for bandwidth costs. I’m one of those people who just can’t afford to download lots of stuff during peak hours when my bandwidth might quickly get shaped or, worse, I’ll get charged. Nevertheless, there are often plenty of legit reasons to initiate huge downloads.

In these cases, it makes sense to be smart about when I initiate these downloads. Being something of a UNIX-head myself, I wanted to use the age-old at command to download a Linux ISO during off-peak hours, which my ISP says starts at 2 AM. Much to my chagrin, I found that at doesn’t work by default on Mac OS X and, worse, the Leopard man page leads to a dead end (though it didn’t back in Tiger…).

Turns out that the system daemon that is responsible for checking up on at jobs has been wrapped with a launchd job. This makes enabling at on your system really easy:

sudo launchctl load -w /System/Library/LaunchDaemons/com.apple.atrun.plist

Once you’ve done this, you can now use at as you normally have done. For instance, I could now schedule my downloads to happen during the off-peak hours:

Perseus:Fedora maymay$ at 2:15am tomorrow # now press return 
curl -LO http://download.fedoraproject.org/pub/fedora/linux/releases/9/Fedora/x86_64/iso/Fedora-9-x86_64-DVD.iso
# now press CTRL-D.
job 1 at Tue Jul 15 02:15:00 2008
Perseus:Fedora maymay$ atq
1	Tue Jul 15 02:15:00 2008

This is also incredibly handy for scheduling just about any resource-intensive task that you don’t have to do right now. To take it one step further, you can even let the computer itself choose when to run these resource-heavy tasks by using the batch command, which will execute commands much like at but will check the system load average instead of the system clock to determine if it should start the job.
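Both at and batch read the job from standard input, so the interactive session above can also be scripted. This is only a sketch: the helper names download_job and schedule_download are mine, and the download directory is a placeholder.

```shell
# download_job DIR URL: print the shell commands the scheduled job should run.
download_job() {
    printf 'cd "%s" && curl -LO "%s"\n' "$1" "$2"
}

# schedule_download WHEN DIR URL: queue the job with at(1).
# Pass "batch" as WHEN to let the load average decide instead of the clock.
schedule_download() {
    if [ "$1" = "batch" ]; then
        download_job "$2" "$3" | batch
    else
        # $1 is deliberately unquoted so "2:15am tomorrow" splits into words.
        download_job "$2" "$3" | at $1
    fi
}

# Show the job body without queueing anything:
download_job "$HOME/Downloads" "http://download.fedoraproject.org/pub/fedora/linux/releases/9/Fedora/x86_64/iso/Fedora-9-x86_64-DVD.iso"
```

For example, schedule_download "2:15am tomorrow" "$HOME/Downloads" "$url" queues the same job as the transcript above, and schedule_download batch "$HOME/Downloads" "$url" hands the timing over to the load average.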

Note that with the com.apple.atrun job loaded, /usr/libexec/atrun is started every 30 seconds (unless you change the StartInterval key in the plist file). Since the atrun command checks a file on disk (that it places in the /usr/lib/cron/jobs directory) to see if there is any work to do, this will probably prevent your disks from ever sleeping, which could be a major concern for battery life on portables. Also, obviously, your computer needs to be turned on and awake for the job to actually launch.

For more information, check out the result of typing man at and man launchctl at a Terminal prompt. There’s also a really good Google Tech Talk about Launchd that will teach you a lot more about job scheduling on Mac OS X.

One minute Mac tip: Create the illusion that Bonjour works over a VPN

If you’re a Mac user who often uses VPN connections, you’ll notice one very disappointing thing about connecting to your corporate or personal network over such tunneled connections: typically, Bonjour-style addresses (such as “computer-name.local”) don’t work. This is because multicast DNS (or mDNS) doesn’t work over a tunnel. Though there are ways to get it functional, they are pretty complicated and require that you have a lot of esoteric networking knowledge.

However, if the services you typically access via Bonjour use static IP addresses, then there is one age-old networking technique you can use to simulate Bonjour-style naming conventions without actually using Bonjour. This, of course, is the /etc/hosts file.

The /etc/hosts file is a simple, static, text-based mapping of computer names to IP addresses. It does what Bonjour does for name lookups except it doesn’t keep itself up to date when things change. Of course, if you’re using static IPs for the services you want access to, you can pretty safely assume that things aren’t going to be changing frequently anyway. Long-time sysadmins will laugh at this, but I say let them laugh. This is remarkably useful and very easy to implement.

Let’s assume I’m running a personal web server on my home network, and I can access my home network via a VPN. On my home network, my web server’s IP address is, say, 192.168.2.100, and I usually access it as http://server.local/. All I need to do is open a Terminal prompt and run the following command as an administrative user. (The pipe through sudo tee -a is needed because a plain >> redirection would be performed by your own unprivileged shell rather than by sudo.)

echo "192.168.2.100	server.local" | sudo tee -a /etc/hosts

That’s it. What this does is hard-wire the name server.local so that it always resolves to the IP address 192.168.2.100. Now, anytime anything on my computer tries to access server.local, it’ll always access 192.168.2.100 directly instead of ever needing to make an mDNS query on the network. The net effect is that we can trick our computer into thinking that Bonjour is working, even when it’s not—such as over a VPN connection.

Note that by default, hard-wiring an IP address like this completely prevents your computer from ever asking other computers (such as DNS servers) what the current IP address for this name is. That means if the IP address of the remote server changes, you won’t be notified, and things will just not work. So be mindful that you’ve made this change, and revert it as a first step in troubleshooting procedures.
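If you add entries like this often, it helps to avoid appending the same name twice. Here is a small sketch that checks before appending. The helper names host_is_mapped and add_host_entry are hypothetical, and the example runs against a scratch file rather than the real /etc/hosts.

```shell
# host_is_mapped FILE NAME: succeed if NAME is a hostname on a non-comment line.
host_is_mapped() {
    awk -v name="$2" '
        /^[ \t]*#/ { next }
        { for (i = 2; i <= NF; i++) if ($i == name) found = 1 }
        END { exit (found ? 0 : 1) }
    ' "$1"
}

# add_host_entry FILE IP NAME: append the mapping unless NAME is already there.
add_host_entry() {
    if host_is_mapped "$1" "$3"; then
        echo "$3 is already mapped in $1"
    else
        printf '%s\t%s\n' "$2" "$3" >> "$1"
    fi
}

# Try it on a scratch file first; the second call should change nothing.
scratch_hosts="$(mktemp "${TMPDIR:-/tmp}/hosts.XXXXXX")"
add_host_entry "$scratch_hosts" 192.168.2.100 server.local
add_host_entry "$scratch_hosts" 192.168.2.100 server.local
cat "$scratch_hosts"
```

To use it for real, run the script as root and pass /etc/hosts as the file.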

By the way, Windows users can do the very same thing simply by editing their etc/hosts. They can find this file at C:\WINDOWS\system32\drivers\etc\hosts and can edit it with Notepad. They will also need to install Bonjour for Windows to get Bonjour working in the first place, of course.

Quick ‘N’ Dirty Drupal Module SVN Tagging Script

In a (rather beastly) project at work today, I found myself needing to import a significant number of contributed Drupal modules into Subversion vendor branches to prepare for custom development. To do so manually would have been quite the hassle, so after downloading the appropriate tarballs and creating a module_name/current directory under my vendor/drupal/modules vendor branch directory, I concocted this little (relatively untested) script to handle the mass tagging operations I needed to perform.

for i in *; do
    v=`grep 'version = "' "$i/current/$i/"*.info |
      cut -d ':' -f 2 |
        sed -e 's/^version = "/v/' -e 's/"$//'`
    svn cp "$i/current" "$i/$v"
done;

It’s a bit buggy for some modules that have multiple .info files, but I’m sure a few more pipeline stages can fix that. (Which, because I’m done with this at the moment, I will leave as an exercise to the reader.)
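One possible way around the multiple .info problem, as an untested-in-anger sketch: read only the .info file that is named after the module itself, and take only the first version line from it. That naming is an assumption about how the tarballs are laid out, and info_version and tag_modules are names I made up.

```shell
# info_version FILE: print "v<version>" from the first version line in FILE.
info_version() {
    sed -n 's/^version = "\(.*\)"$/v\1/p' "$1" | head -n 1
}

# tag_modules: like the loop above, but reads only
# <module>/current/<module>/<module>.info (an assumed layout).
tag_modules() {
    for i in *; do
        info="$i/current/$i/$i.info"
        [ -f "$info" ] || { echo "skipping $i: no $info" >&2; continue; }
        v="$(info_version "$info")"
        [ -n "$v" ] && svn cp "$i/current" "$i/$v"
    done
}
```

Run tag_modules from the vendor/drupal/modules directory, the same place the original loop runs.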

Chalk this one up as another testament to the power of shell scripting and how it can help every developer get their job done faster.

How To: Move all pages in an Apple WikiServer Group to a new Group

As I’ve been blogging about, I’ve been playing a lot with Apple’s new WikiServer (or “Teams Server”) at work. We’re still evaluating what we’d like to use it for, but as part of the experiments, I’ve been finding myself having to do some pretty crazy things with the WikiServer. This one is pretty bizarre, and is probably not only very dangerous for the content of your group’s wiki and blog, but also almost certainly not a best-practice.

With that caveat out of the way, here’s how I managed to move all the pages in an Apple WikiServer group wiki and blog from one group to another.

Dear God, please have a backup!

Okay, step 1 is mundane, but seriously, please oh please have a backup of your data before you do any of these things. To make a backup of your entire WikiServer’s data store, simply:

sudo tar -cvzf backup-file-name.tgz /Library/Collaboration

If anything goes wrong, you can restore from your backup just as simply:

sudo tar -C / -xvzf backup-file-name.tgz
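A backup you can’t read back is not much of a backup, so it’s worth listing the archive right after creating it. A minimal sketch, with backup_tree as a made-up helper name:

```shell
# backup_tree SRC_DIR ARCHIVE: archive SRC_DIR, then confirm the archive reads back.
backup_tree() {
    tar -czf "$2" "$1" || return 1
    # Listing the archive is a cheap check that it is not truncated or corrupt.
    tar -tzf "$2" > /dev/null
}
```

Run as root, backup_tree /Library/Collaboration backup-file-name.tgz does the same job as the tar command above and fails loudly if the archive cannot be listed.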

Okay, with that out of the way, next make sure absolutely nobody is using any of the Group services for the group you are going to perform the move from, or to. People can work on other groups’ wikis; it’s just the ones you’ll be touching that you want people to avoid. You can enforce this with some Apache redirects, which is left as an exercise to the reader.

Step 1: Rename or create a new Group in Workgroup Manager

First, you need to either rename or create a new group in Workgroup Manager. If you make a new group, be sure to enable all the services that the old group used.

If you’re simply renaming a group, then you might not even have to go through this trouble. You merely need to change the group’s “Name” (as opposed to its “Short Name”) and then stop and start the Web Service from Server Admin to see the group’s name change. However, if you also want to change the group’s “Short Name” (i.e., the group’s POSIX group account symbolic name), then you will need to perform these steps.

Once you have the new group ready to go in Workgroup Manager and you have stopped and started your Web Service in Server Admin, continue to the next step.

Step 2: rsync all your files from the old group to the new group

This is simple. Just run:

sudo rsync -avzE --progress /Library/Collaboration/Groups/old-group/ /Library/Collaboration/Groups/new-group/

Note that the trailing slashes on the directory names in this command are quite important, as they tell rsync to take the contents of the first directory and place those items as the contents of the second directory. Without the trailing slashes, rsync will make extraneous directories for you, which Apple WikiServer won’t understand. See man rsync for more information.

Step 3: Update the plists for all your WikiServer pages in the new group

Now, WikiServer has all the content of your old group but all of the internal references are wrong, since they still point to the old group. What you need to do is rewrite all those references so that they point to the new group address. The group references are stored in property list files. See my older blog entries for details about the filesystem structure of Apple’s WikiServer data storage layout.

To do this is a relatively simple procedure. As root, do the following, and do it carefully because as root you can easily mess up (and there is no undo button on the command line!):

sudo su -
cd /Library/Collaboration/Groups/new-group
grep -ri "old-group" * | grep '^[^B]' | cut -d ':' -f 1 | grep 'plist$' | sort | uniq > /tmp/list_of_plist_files_to_edit
for i in `cat /tmp/list_of_plist_files_to_edit`; do sed -e 's/<string>groups\/old-group/<string>groups\/new-group/' "$i" > "$i.new"; mv "$i.new" "$i"; chown teamsserver:teamsserver "$i"; chmod o-rwx "$i"; done

In English, this means:

  1. Become root.
  2. Change to the /Library/Collaboration/Groups/new-group directory
  3. Find all plist files that have the string old-group in them, sort them, and write this list of files to the /tmp/list_of_plist_files_to_edit file.
  4. For each of the files listed in the /tmp/list_of_plist_files_to_edit, find the text string <string>groups/old-group and replace that text with <string>groups/new-group, and save this change as the name of the file with .new appended. Finally, replace the original file with the file we modified, and give them the appropriate permissions.
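The steps above can also be sketched as a single function that takes the directory and the two group names as arguments, which makes it easy to rehearse on a copy first. This is a variation rather than the exact commands above: retarget_plists is a hypothetical name, the match is case-sensitive, every occurrence on a line is replaced, and it assumes the group names contain no | or regular expression characters.

```shell
# retarget_plists DIR OLD NEW: rewrite <string>groups/OLD to <string>groups/NEW
# in every .plist file under DIR that mentions it.
retarget_plists() {
    find "$1" -type f -name '*.plist' -exec grep -l "<string>groups/$2" {} + |
    while IFS= read -r plist; do
        sed "s|<string>groups/$2|<string>groups/$3|g" "$plist" > "$plist.new" &&
            mv "$plist.new" "$plist"
    done
}
```

Because mv replaces each file with a new one, the ownership and permissions still need to be restored afterwards, exactly as the chown and chmod in the loop above do.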

Step 4: Create an Apache redirect to make sure no HTML links are broken

Okay, at this point your new group is up and running and it should be working. However, if you had any links at all in any of your group’s pages, they are now all broken because they still point to the old group. Rather than going through the HTML itself and cleaning this up right now (because that’s very error-prone indeed, even with automated tools), it’s much easier to just tell Apache to redirect all requests for the old group to the new group.

To do this, edit the Apache configuration file of whatever Virtual Host you have been serving the WikiServer from. Most of the time, this will be at /etc/httpd/sites/0000_any_80_.conf.

The end of that file probably looks something like this:

#       Include /etc/httpd/httpd_users.conf
#       Include /etc/httpd/httpd_directory.conf
        Include /etc/httpd/httpd_groups.conf
        Include /etc/httpd/httpd_teams_required.conf
        LogLevel warn
        ServerAlias *
</VirtualHost>

Right above these Include directives, simply add the following:

        <Location /groups>
            <IfModule mod_alias.c>
                Redirect 301 /groups/old-group http://your.server.address/groups/new-group
            </IfModule>
        </Location>
#       Include /etc/httpd/httpd_users.conf
#       Include /etc/httpd/httpd_directory.conf
        Include /etc/httpd/httpd_groups.conf
        Include /etc/httpd/httpd_teams_required.conf
        LogLevel warn
        ServerAlias *
</VirtualHost>

There you have it. New Teams Server group, old group’s data.

Caveats

Note that this does not update any of the SQLite databases used to store things like revision history and so forth. These things are, for the most part, not really necessary to update but it would be ideal if the old plist revisions could be changed in there, too. That wouldn’t be much extra work, really, but I’ve found it typically unnecessary except in fringe cases, so I leave that as an exercise to the reader.

If you do this to your WikiServer and do not update the SQLite databases as well, just be mindful of that fact so that you’re not surprised if something goes wonky down the line.

Using Calendars from the Command Line

If you’re anything like me, you always have a terminal window open. One of the reasons I do this, of course, is because it’s fast. If I want to know anything at all about my computer, all I need do is type the question. The answer, because it’s always text-based, comes back immediately. I don’t have to wait for a window to open or for a pane to scroll. Everything comes at me from a single visual direction, the bottom of my terminal window.

However, there are some occasions when a text-based response to a complicated question isn’t very helpful because it requires so much extra work to understand. For me, the most common example of this sort of issue has always been in looking at time-based information, and more specifically, calendars. Whenever I’m on my machine, I almost always need to look at a calendar.

In the past, I used to go all the way over to iCal. Sure, I can do this using keyboard shortcuts only, but sometimes all I want is a quick answer to “what date is this upcoming Friday?” In situations like that, I’ve lately begun using the cal command, and my oh my, what a timesaver.

cal is kind of like man for dates. Of course, you can get more info by saying man cal to your prompt. The cal program, installed by default on almost all UNIX-based systems (including Mac OS X), has a ton of useful options. However, most of the time, I don’t need more than a few.

For instance, let’s say I just want a calendar of the current month. I can get a compact, simple month view instead of going to iCal by saying just cal at the command line:

Perseus:~ maymay$ cal
     April 2008
Su Mo Tu We Th Fr Sa
       1  2  3  4  5
 6  7  8  9 10 11 12
13 14 15 16 17 18 19
20 21 22 23 24 25 26
27 28 29 30

Other options let me ask other questions of cal. Easy, simple, fast. I like it.
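A couple of those other questions, as a small sketch. Both forms are standard cal usage. Note that the year has to be written in full, since cal 08 means the year 8, not 2008.

```shell
cal 7 2008    # one specific month: July 2008
cal 2008      # all twelve months of 2008
```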

How to import CVS code repositories into Git using `git cvsimport`

This should be straightforward, but it’s not. To import (not track, but just import) code from a remote CVS repository to a local git repository, you need to do the following:

  1. Be certain you have the git-core package installed on your system and that this package includes the git-cvsimport command. You can run git help -a | grep cvsimport to verify this.
  2. Be certain you have the cvsps command-line tool installed. This does not come with the git suite of tools, so you’ll need to get it separately. If you’re a lazy Mac OS X user, like me, you can use MacPorts: sudo port install cvsps. Otherwise, get it from the source.
  3. Prepare your CVS login information for the remote server before you run git cvsimport. You need to do this so that the git tool will be able to log you in to the CVS server automatically. The command for this looks like:
    CVSROOT=:cvs-login-method:cvs-user-name@cvs.server.name:/path/to/CVS/root cvs login

    For example, if you’re pulling code from the anonymous CVS server that runs on Drupal.org, you might use this: CVSROOT=:pserver:anonymous@cvs.drupal.org:/cvs/drupal-contrib cvs login. This command will prompt you for the password for the user you specified at the server you specified (for anonymous access, the password is almost always anonymous) and will store it, lightly scrambled rather than securely hashed, in the ~/.cvspass file for future use by CVS.

  4. Finally, run the git cvsimport tool, and specify the proper options. Using the Drupal example above, your command might look like this:
    git cvsimport -v -d :pserver:anonymous@cvs.drupal.org:/cvs/drupal-contrib contributions/modules/module-name

    This would log in to cvs.drupal.org using CVS’s pserver login method, provide the username anonymous and the password you specified in the previous step (the one stored in ~/.cvspass), set the CVS document root to /cvs/drupal-contrib, and pull the code located at contributions/modules/module-name into the current working directory as a git repository.

This works pretty nicely, and creates a git repository just as though you’d created it with git init in the current working directory.

If you get an error that looks like this:

AuthReply: cvs [pserver aborted]: descramble: unknown scrambling method

then you’ve most likely specified the CVS document root incorrectly. Most notably, git cvsimport does not understand a CVS document root wherein the password is specified in the document root URL itself. So, for example, git cvsimport -d :pserver:password:username@cvs.server.name:/path/to/CVS/root code/to/checkout will not work. Omitting the password and the separating colon from the URL should fix it.
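To keep the CVS document root in the form git cvsimport accepts, it can help to assemble it from its parts and reuse the same string for both the login and the import. A minimal sketch, where cvsroot and import_from_cvs are made-up helper names:

```shell
# cvsroot METHOD USER HOST PATH: assemble a CVS root string (no password in it).
cvsroot() {
    printf ':%s:%s@%s:%s\n' "$1" "$2" "$3" "$4"
}

# import_from_cvs CVSROOT MODULE: log in once, then import into the current directory.
import_from_cvs() {
    CVSROOT="$1" cvs login &&
    git cvsimport -v -d "$1" "$2"
}

# Only print the assembled root here:
cvsroot pserver anonymous cvs.drupal.org /cvs/drupal-contrib
```

With that, import_from_cvs "$(cvsroot pserver anonymous cvs.drupal.org /cvs/drupal-contrib)" contributions/modules/module-name runs steps 3 and 4 in one go.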

HowTo: Use git for personal development when everyone else is using Subversion (part 2)

When we left off, you had just finished transforming a remote Subversion repository into a git repository and optimizing it to save you some space. Now that you have a git repository, what do you do?

First things first. Once you have an idea of what work you want to do, you should give yourself a space to do this work without disturbing anybody else’s work. Do this by making a new, personal branch. Unlike Subversion and some other centralized version control systems, git makes it possible to make all kinds of changes to your repository, including making branches, and even save those changes without having to republish everything back to the central repository server at each step. In other words, if you thought Subversion branches were “cheap,” you’ll love git’s branches.

Also unlike Subversion, which stores its branches in completely separate pathnames, git keeps all branches in the same filesystem tree separated only with metadata (in .git/refs/remotes for remote branches and in .git/refs/heads for local branches, to be a bit more precise), so you don’t have to create lots of different directories for all your branches (unless you want to). With no branches defined, you’re working in the “master branch,” or “the trunk” by default.

git branch
* master

By saying git branch you ask git to print a list of all (local) branches. The one with the asterisk marks the current branch, the one you’re using at the moment you run the command. Since there are no other branches, you’re currently in the “master” branch. But we don’t want to make changes here; we want to make changes in our own private branch, so we’ll make a new one.

git checkout -b new_branch_name

This will create a new branch with the name new_branch_name and immediately switch to it. Notice that you’ve made absolutely no changes to the filesystem itself; only the git metadata has been altered. Saying git branch again will show you the change:

git branch
  master
* new_branch_name

Also note that since we haven’t committed any changes, we don’t need a commit message (or “log message”) for creating this branch. We’ll add one later, when we need it. Now go ahead and write some code in your new branch. At any time, you can create a new branch in the same way you added the first. Each new branch is created at the HEAD (“latest”) revision of whatever branch you’re currently working in.

git checkout -b another_branch
Switched to a new branch "another_branch"
git branch
  master
  new_branch_name
* another_branch

To switch back to any other branch, simply git checkout that branch again:

git checkout master
Switched to branch "master"

If you want to delete a branch you don’t like, that’s easy too:

git branch -d another_branch
Deleted branch another_branch.

Keep in mind that throughout all of these branch creation and deletion actions, the only thing that’s being altered is the git metadata. That’s why it’s so cheap to create new branches. If you ever have a new idea you’re working on, it’s recommended that you create a branch for it, even if that branch is so short-lived it never gets published.
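If you’d like to rehearse these branch commands without touching a real project, here is a sketch that replays them in a throwaway repository. The scratch directory and the empty initial commit are additions of mine for the demo. Everything else is the commands shown above.

```shell
# Throwaway repository so nothing real is touched.
demo_repo="$(mktemp -d "${TMPDIR:-/tmp}/git-demo.XXXXXX")"
cd "$demo_repo" || exit 1
git init -q .
git config user.name "Your Full Name"
git config user.email "you@example.com"
git commit -q --allow-empty -m "Initial commit"   # gives the branches something to point at

git checkout -q -b new_branch_name     # create a branch and switch to it
git checkout -q -b another_branch      # branch off new_branch_name's HEAD
git checkout -q new_branch_name        # switch back
git branch -d another_branch           # -d only deletes branches that are fully merged
git branch                             # list what is left
```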

So you have a new local branch, and you’ve been working as you normally do for a few minutes, creating files, editing them, and so on. Running git status now will ask git to show you the changes you’ve made to your filesystem. If you’ve created any files in new directories that git doesn’t know about, it will simply report that directory. If you’ve made new files in directories git does know about, it will list all those files explicitly.

git status
# On branch cartoon_contests
# Untracked files:
#   (use "git add <file>..." to include in what will be committed)
#
#	sites/default/modules/cartoon_contests/
#	sites/default/modules/factiva/.factiva.module.swp
#	sites/default/modules/testfile
#	sites/default/settings.php
nothing added to commit but untracked files present (use "git add" to track)

In the above sample output, I’m working on a new Drupal module for a web site. I’ve created a new directory, sites/default/modules/cartoon_contests/ (note the trailing slash), and I have several untracked files. One is my vim swap file for a different module, .factiva.module.swp, one’s an unimportant testfile, and the last is the Drupal configuration file, settings.php.

The only thing I want to commit is the new cartoon_contests directory, and all the files within it. As with Subversion, I have to tell git that I want to track this directory, which is done simply by saying git add cartoon_contests. Unlike Subversion, future invocations of git status let me see everything that is going to go into the next commit.

git status
# On branch cartoon_contests
# Changes to be committed:
#   (use "git reset HEAD <file>..." to unstage)
#
#	new file:   sites/default/modules/cartoon_contests/cartoon_contests.info
#	new file:   sites/default/modules/cartoon_contests/cartoon_contests.module
#
# Untracked files:
#   (use "git add <file>..." to include in what will be committed)
#
#	sites/default/modules/factiva/.factiva.module.swp
#	sites/default/modules/testfile
#	sites/default/settings.php

The “Changes to be committed” section of the output is called the staging area, or the index. In this way, I can prepare all the changes I want to commit before I do so, making sure they’re perfect before I actually commit them to the git repository. At any time before I commit, I can make additional modifications, such as git adding more files or directories, git resetting to unstage all (or some) of my changes, etc. git help reset also has a number of handy explanations with examples for different things you might need to do at this point.

If you were using svn:ignore, equivalent functionality exists in git. Simply append file glob patterns, one per line, to the $GIT_DIR/info/exclude file in your git repository. Like so:

echo -e "*.swp\nsites/default/settings.php" >> .git/info/exclude ; git status
# On branch cartoon_contests
# Changes to be committed:
#   (use "git reset HEAD <file>..." to unstage)
#
#	new file:   sites/default/modules/cartoon_contests/cartoon_contests.info
#	new file:   sites/default/modules/cartoon_contests/cartoon_contests.module
#

If you’ve got many of these, you can use git svn show-ignore >> .git/info/exclude to search through your old Subversion repository and look for any and all ignores, automatically adding them to git’s exclude list. (Check out Tsuna’s blog entry on learning git for more tips like this.)
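To see the exclude file at work in isolation, here is a sketch in a throwaway repository. The file names are placeholders, and the scratch directory is an addition for the demo.

```shell
demo_repo="$(mktemp -d "${TMPDIR:-/tmp}/git-demo.XXXXXX")"
cd "$demo_repo" || exit 1
git init -q .

touch notes.swp keep.txt              # one file to ignore, one to keep visible
mkdir -p .git/info                    # normally created by git init already
printf '%s\n' '*.swp' >> .git/info/exclude

# Only keep.txt should still be reported as untracked.
git status --porcelain
```

Unlike a committed ignore file, .git/info/exclude lives inside the .git directory, so these patterns stay private to your own copy of the repository.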

Finally, after you’ve done some of your work and you’ve finished staging your changes, you’re ready to commit them to the repository. On the other hand, if you hate what you’ve done and want to undo it all, you can say git reset --hard HEAD to throw away all your local changes. To throw away changes to just a single file, just check out that file again by saying git checkout filename. This is the equivalent of Subversion’s svn revert filename command.

Before you actually make your first commit, however, you should properly introduce yourself to git. You don’t have to do this because git will try to figure out who you are by itself (details explained here), but you probably should at least create global defaults for your user (which git will store in ~/.gitconfig). If you want to, you can also create per-repository defaults (which git will store in .git/config), or even system-wide defaults for all users of this computer (which git will store in /etc/gitconfig). To set the global defaults, say this:

git config --global user.name "Your Full Name"
git config --global user.email "you@example.com"

This will create the file ~/.gitconfig if it doesn’t already exist and will write your name and email address into it. You can also just edit the file directly yourself instead of using git config commands.

Like Subversion, committing will create a saved, fixed point-in-time that marks the changes you have made to your files. Like branches, commits are also very cheap in git, so go ahead and commit at any time you like. Remember to stage your files (by git adding them), and then just git commit.

git add filepattern
git commit -m "My very frst git commit!"
Created commit ef483c1: My very frst git commit!
 5 files changed, 58 insertions(+), 12 deletions(-)

If you forget to specify a commit log message (-m "log message") on the command line, or if you want to enter a multi-line commit log, git will prompt you for it in your favorite $EDITOR. You can view a history, including all the log messages, for your project with git log. You can even view the logs in a number of pretty formats. Check git help log for more information.

If you want to change anything about the commit you just made, such as the author, you can just run git commit again with the --amend flag added to the command. Notice the typo in the commit log? Fixing it is really easy:

git commit --amend -m "My very first git commit!"
Created commit 88602f6: My very first git commit!
 5 files changed, 58 insertions(+), 12 deletions(-)
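Here is the whole stage, commit, and amend cycle replayed as a sketch in a throwaway repository, so you can try it without risking real work. The scratch directory and the testfile contents are placeholders of mine, and the identity is set per-repository just to keep the demo self-contained.

```shell
# Throwaway repository so nothing real is touched.
demo_repo="$(mktemp -d "${TMPDIR:-/tmp}/git-demo.XXXXXX")"
cd "$demo_repo" || exit 1
git init -q .

# Per-repository identity (stored in .git/config).
git config user.name "Your Full Name"
git config user.email "you@example.com"

# Work on a private branch rather than on master.
git checkout -q -b new_branch_name

# Stage a new file, then commit it (typo left in on purpose).
echo "hello" > testfile
git add testfile
git commit -q -m "My very frst git commit!"

# Fix the log message of the commit just made.
git commit -q --amend -m "My very first git commit!"

# One line per commit; only the amended commit should be listed.
git log --pretty=oneline
```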

Finally, with your new code committed to your local git branch, it’s time to share that code with your colleagues who are (for some inexplicable reason) still using Subversion. This is also extremely simple. Just say git svn dcommit. (That’s right, dcommit, not just commit. I don’t know why….)

Not sharing your changes via Subversion, but with a patch instead? git diff -p will generate patches for you. See git help diff-files and look for the “GENERATING PATCHES WITH -P” section.

If you found this helpful, you may also enjoy an alternative tour for beginners from Carl Worth.