Introduction to Git

This is the first of a short series of articles introducing git to new users. Git is a source code management (SCM) utility that has come into widespread use within the open source community and is now expanding rapidly into the corporate realm. It offers some unique advantages over other SCM systems, particularly Subversion, which it is quickly supplanting.

The first thing that surprises most people is that git is primarily a command-line tool. It has no fancy user interface at its core (although graphical front ends such as gitk and git gui are available to augment it). Git provides an amazingly complete, and often eclectic, set of commands to accomplish all sorts of actions on a source code repository.

The good thing is that these commands ensure that all aspects of your source code repository are accessible and can be modified as needed. The downside is that it can often be difficult for a newcomer to figure out where to start.

Setting Up a Repository

Most people I know use git in conjunction with GitHub, an online business that hosts git repositories and has built all sorts of web-accessible tools to support the software development process. You don't have to use GitHub; you can set up your own git repository wherever you want, but then you have to worry about hosting it, providing secure access to it, administering access for new users, etc.

GitHub's business model provides free public repositories for users, which is great for the open source community. They also offer paid services for users who want private repositories, i.e., repositories whose owner restricts access to specified users. This is the option typically chosen by corporate users.

To use GitHub, you must first sign up for an account. Once you've created an account, GitHub has plenty of information available online to guide you in creating your first source code repository, so I'm not going to cover that.

Using Your Git Repository

You've just created your first git source code repository at GitHub. Now you want to use it. To make things simple, you're not sharing it with any other developers, so you don't have to worry about things like branching, merging, etc.

First, let's clone the repository so you'll have a local copy to work on.

      $ git clone git@github.com:your_account/repository.git

Now that you've got a copy of the repository, add some new source code files. Once you've done this, you'll naturally want to check your changes in. Let's determine what's changed:

      $ git status
      # On branch master
      #
      # Untracked files:
      #   (use "git add <file>..." to include in what will be committed)
      #
      #      yourfile1.rb
      #      yourfile2.rb
      nothing added to commit but untracked files present (use "git add" to track)

Your brand new repository started out empty, but you've created some new files. Git shows the changes to the repository.

Unlike Subversion, git will only check in files that you have marked for check-in. You can do this by "adding" each file individually:

      $ git add yourfile1.rb
      $ git add yourfile2.rb

Check the status again to see what will be checked in:

      $ git status
      # On branch master
      # Changes to be committed:
      #   (use "git reset HEAD <file>..." to unstage)
      #
      #      new file:   yourfile1.rb
      #      new file:   yourfile2.rb
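
If you only want some of your new files in the next commit, add just those. Here is a minimal sketch, assuming a throwaway repository created with git init in a temporary directory (the directory name and file contents are made up for illustration). It uses the short status format, where "A" marks a staged new file and "??" marks an untracked one.

```shell
set -e
# Throwaway repository in a temporary directory (illustrative names).
work=$(mktemp -d)
git init -q "$work/demo"
cd "$work/demo"

# Two brand-new, untracked files.
echo 'puts "one"' > yourfile1.rb
echo 'puts "two"' > yourfile2.rb

# Stage only the first file; the second stays untracked.
git add yourfile1.rb

# Short status: one line per file, with a two-character state code.
git status --short
```

The last command should list yourfile1.rb with an "A" and yourfile2.rb with "??", so only the first file would go into the next commit.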

To check in the code:

      $ git commit
      [master e732f5a]    Minor change.
      1 files changed, 9 insertions(+)

Before the commit is recorded, the commit action automatically brings up an editor (generally vi, unless you've configured another) so that a message can be associated with the commit (similar to Subversion, CVS and other command-line-based source code management tools).

      Minor change.
      # Please enter the commit message for your changes.
      # Lines starting with '#' will be ignored, and an empty 
      # message aborts the commit.
      # On branch master
      # Changes to be committed:
      #   (use "git reset HEAD <file>..." to unstage)
      #
      #      new file:   yourfile1.rb
      #      new file:   yourfile2.rb

The commit message in the above example is "Minor change." (I would generally enter something with a bit more useful detail). Within the editor, any line beginning with "#" is a comment and will be ignored.
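
If you'd rather not go through the editor at all, the -m option to git commit takes the message on the command line. A small sketch, again in a throwaway repository; the user name, email address and commit message are placeholders I made up, set locally so the example doesn't depend on your global git configuration.

```shell
set -e
# Throwaway repository in a temporary directory (illustrative names).
work=$(mktemp -d)
git init -q "$work/demo"
cd "$work/demo"

# Placeholder identity, stored only in this repository's config.
git config user.name "Example User"
git config user.email "user@example.com"

echo 'puts "one"' > yourfile1.rb
git add yourfile1.rb

# -m supplies the message inline, so no editor is opened.
git commit -q -m "Add yourfile1.rb with a greeting"

# Print the subject line of the most recent commit.
git log -1 --format=%s
```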

As an alternative, you could also commit all changes automatically:

      $ git commit -a
      [master e732f5a]    Minor change.
      1 files changed, 9 insertions(+)

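The -a option stages and commits every modified or deleted file that git already tracks, without the necessity of explicitly adding each one. It does not pick up brand-new, untracked files; those must still be added with git add. The downside, of course, is that this might also check in changes to tracked files that you didn't intend to commit yet, e.g., a half-finished edit or a debugging tweak.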
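
To see exactly what git commit -a does and doesn't include, here is a short experiment in a throwaway repository. The file names, identity and messages are all made up for illustration. One file is already tracked and then modified; the other is new and never added.

```shell
set -e
# Throwaway repository in a temporary directory (illustrative names).
work=$(mktemp -d)
git init -q "$work/demo"
cd "$work/demo"
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

# First commit: yourfile1.rb becomes a tracked file.
echo 'puts "one"' > yourfile1.rb
git add yourfile1.rb
git commit -q -m "Add yourfile1.rb"

# Modify the tracked file and create a new, untracked one.
echo 'puts "one, revised"' > yourfile1.rb
echo 'scratch notes' > scratch.txt

# -a stages modifications and deletions of tracked files only.
git commit -q -a -m "Revise yourfile1.rb"

# Files touched by the latest commit, then what is still uncommitted.
git diff-tree --no-commit-id --name-only -r HEAD
git status --short
```

The commit should contain only yourfile1.rb, and the status should still show scratch.txt as untracked.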

This commit action didn't actually do what you think it might have done. You see, you have a complete copy of the source code repository locally. You've just checked the changes into your local copy of the repository. This is nice, because you can be working offline, not attached to the Internet, and check code into your repository.

Clearly, though, there must be some way to sync your changes with the master repository on GitHub. Here's how you do it:

      $ git push origin master
      Counting objects: 35, done.
      Delta compression using up to 2 threads.
      Compressing objects: 100% (22/22), done.
      Writing objects: 100% (22/22), 2.12 KiB, done.
      Total 22 (delta 18), reused 0 (delta 0)
      To git@github.com:your_account/repository.git
         731358f..e732f5a  master -> master

This pushes the changes from the master branch of your local repository (the branch you're on by default) to the origin, i.e., the repository from which this local one was cloned.
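
You can try the commit-locally-then-push cycle without touching GitHub at all. In this sketch a bare repository on the local disk stands in for the GitHub-hosted origin; that substitution, along with the directory names and identity, is my own assumption for illustration. The branch name is read from the repository rather than hard-coded, since the default branch depends on your git version and configuration.

```shell
set -e
work=$(mktemp -d)
cd "$work"

# A bare repository plays the role of the GitHub-hosted origin.
git init -q --bare origin.git

# Clone it (git warns that the repository is empty, which is expected).
git clone -q origin.git local
cd local
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

# Commit locally; nothing has reached the origin yet.
echo 'puts "one"' > yourfile1.rb
git add yourfile1.rb
git commit -q -m "Add yourfile1.rb"

# Push the current branch to the origin.
branch=$(git symbolic-ref --short HEAD)
git push -q origin "$branch"

# The local commit id and the origin's branch tip should now match.
git rev-parse HEAD
git ls-remote origin "refs/heads/$branch"
```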

Here's where some of git's power is exposed. If GitHub disappeared tomorrow, you'd still have a full copy of the repository, which can function as the origin for other developers if necessary. You could let other developers clone your local repository, push changes to it, etc.
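
As a rough illustration of a local repository acting as the origin, the sketch below clones one working copy from another on the same machine. The directory names "mine" and "theirs" and the identity are hypothetical.

```shell
set -e
work=$(mktemp -d)
cd "$work"

# Your local repository, with one commit in it.
git init -q mine
cd mine
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"
echo 'puts "one"' > yourfile1.rb
git add yourfile1.rb
git commit -q -m "Add yourfile1.rb"

# Another developer clones directly from your copy.
cd "$work"
git clone -q mine theirs
cd theirs

# The clone's origin is your repository, and the history came along.
git remote get-url origin
git log -1 --format=%s
```

One caveat: by default git refuses a push to the branch that is currently checked out in a non-bare repository, so a repository that other people push to is usually created with git init --bare.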

Git decentralizes the source code repository and reduces the chance of a catastrophic failure, such as a repository that gets corrupted or a hardware crash impacting the computer that a repository is on. This is why git is referred to as a distributed source code management system.

You might want to tag a release. This is particularly useful if you're using hosting services such as EngineYard or Heroku. Depending on how they are configured, these services can pull code from a git repository and deploy it once a release has been tagged.

To view the current tags that have been defined:

      $ git tag

You should examine the list of existing tags to make sure that the tag you're planning on creating doesn't already exist. Since this is a new repository, there are no tags yet.
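
If you'd rather not eyeball the list, the check can be scripted. In this sketch tag_exists is a hypothetical helper of my own, not a git command; it asks git rev-parse whether a ref with that tag name exists. The repository, identity and tag name are illustrative.

```shell
set -e
# Throwaway repository with a single commit (illustrative names).
work=$(mktemp -d)
git init -q "$work/demo"
cd "$work/demo"
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"
echo 'puts "one"' > yourfile1.rb
git add yourfile1.rb
git commit -q -m "Add yourfile1.rb"

# Hypothetical helper: succeeds only if the named tag already exists.
tag_exists() {
  git rev-parse -q --verify "refs/tags/$1" >/dev/null
}

if tag_exists RC_1.15; then
  echo "RC_1.15 is already taken"
else
  echo "RC_1.15 is free"
fi
```

In a new repository this should report that the name is free.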

To tag a release:

      $ git tag RC_1.15

As with checking in code changes, this tags the current revision of all the files in your local copy of the repository. To get the tag pushed to your GitHub repository:

      $ git push --tags

This pushes your local tags to the origin repository by default.
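
Here is the tagging round trip end to end, as a sketch you can run offline. A local bare repository stands in for GitHub (my assumption for illustration, as are the directory names and identity), and the remote is named explicitly in the push so the example doesn't rely on any default.

```shell
set -e
work=$(mktemp -d)
cd "$work"

# A bare repository plays the role of the GitHub-hosted origin.
git init -q --bare origin.git
git clone -q origin.git local
cd local
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

# One commit, pushed to the origin.
echo 'puts "one"' > yourfile1.rb
git add yourfile1.rb
git commit -q -m "Add yourfile1.rb"
git push -q origin "$(git symbolic-ref --short HEAD)"

# Tag the current commit locally, then push all tags.
git tag RC_1.15
git push -q origin --tags

# List the tags the origin now knows about.
git ls-remote --tags origin
```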

Note: This type of tag is what is referred to as a "lightweight tag." Git also supports annotated tags, which carry an associated message identifying what the tag represents and can optionally be signed. I've generally found lightweight tags sufficient for most needs, so that's what I've covered.
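
For comparison, here is a small sketch that creates one tag of each kind and asks git what sort of object each name refers to. The tag name RC_1.16, its message and the identity are made up for illustration. A lightweight tag is just a name for a commit, while an annotated tag is a separate tag object that records the message and who created it.

```shell
set -e
# Throwaway repository with a single commit (illustrative names).
work=$(mktemp -d)
git init -q "$work/demo"
cd "$work/demo"
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"
echo 'puts "one"' > yourfile1.rb
git add yourfile1.rb
git commit -q -m "Add yourfile1.rb"

# Lightweight tag: a plain name pointing at the commit.
git tag RC_1.15

# Annotated tag: -a creates a tag object, -m gives it a message.
git tag -a RC_1.16 -m "Release candidate 1.16"

# Ask git for the object type behind each tag name.
git cat-file -t RC_1.15
git cat-file -t RC_1.16
```

The first name should resolve to an object of type commit and the second to an object of type tag.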

Conclusion

This has been a whirlwind introduction to git, with just barely enough information to get you started using git with your GitHub repository. There's a lot more power available with git, and we'll be covering some more advanced features in later articles.


