Saturday, September 25, 2010

My Distributed Version Control System comparison

If you want to cut to the chase: I've been using Bazaar for 6 months now and I really like it. Read on for background and details on why I chose Bazaar over other VCSes.

A while back I decided that I wanted to switch the version control system that we use for the main project I'm on at my job. The company has used a proprietary, licensed version control system called Accurev for years now (as a side note, most of the work at the company is not Ruby on Rails, but is C, C++, and C#). Accurev is pretty easy to pick up and start using, but it has lots of shortcomings and can be pretty inflexible, not to mention the yearly per-developer license cost.

So 7-8 months ago I finally decided to evaluate other version control systems for our project, and possibly for the rest of the company. Here are my requirements for a new VCS:
  1. Free and preferably open source
  2. Easy branching so each task people are working on can be in its own branch
  3. Easy merging
  4. Can use a central server for the main repository with secure, encrypted transport to server
  5. Developers still need to be able to use VCS if server is down or if they have no Internet
  6. Command line support for all main operations
  7. GUI available for performing basic commands, viewing commit log, doing diffs, and performing merges. Preferably the GUI is built in to the VCS (I don't want to have to do a whole other study on what 3rd party GUIs to use)
  8. Works the same in Windows, Mac, and Linux (I know what you're saying, why Windows - this isn't for our project but for other non-Rails projects at the company, the majority of which are Windows)
  9. Has a large community of users (not some new VCS that could go away in a year)

Because of #5, traditional free VCSes like Subversion fall short. So, I evaluated the 3 main distributed version control systems. A problem with all 3 that will hopefully get better with time (although Git has had plenty of time, so I'm not sure if it will) is a lack of documentation. Most of the examples for all 3 seem to be geared either towards a single developer, or a large, widely distributed open source project. The work on my project is neither of those - it's a group of 3 people working on a company proprietary project.

Git
Git is by far the most widely used distributed version control system. I started using git for personal projects before this evaluation. If you are working on an open source project, this is definitely the way to go: git gives you a ton of functionality, and using github.com is really easy. It's also extremely fast and efficient, and it definitely has the others beat in that department. Setting up a central server repository is very easy. It works over SSH, so any machine you can log in to that has git installed can host the repository. But I eliminated git pretty quickly. Here's why:
  1. Doesn't work too well in Windows. You can either go through the hassle of using it in Cygwin, which is not fun, or use msysgit (a project hosted on Google Code). During the Pragmatic Studios Advanced Ruby on Rails class that I took earlier in the year, I saw about half of the class who had Windows laptops struggling the entire class to get git to install and work correctly. Since then, msysGit has come a long way, but I still believe that the other VCSes work better in Windows.
  2. Terrible built-in GUI. The gitk program that comes with Git looks like a UNIX GUI from 1995. Even forgetting how bad it looks, it isn't very functional and won't do much. I did a brief investigation to find a 3rd party GUI but couldn't find much. While I use the command line for most of my work, it's no substitute for a GUI for doing merges, diffs, and viewing branches, forks, and merges in the log.
  3. Revisions are identified by a long hash. For convenience you can refer to a revision by just the first several characters of the hash. Not everyone writing source code is so hardcore that they remember all of their work with a 6 hex digit identifier. Which would you rather remember, revision 121.5, or revision 6f88ca?
  4. User unfriendliness/complicated - Using git just seems to require having to learn too much. This is more of an intangible thing. While geeks who eat, breathe, and sleep coding (still not sure if I'd put myself in that category) can easily pick up git, seeing how a large group of regular developers struggled with Git at the Advanced Ruby on Rails class did not give me a good feeling for using Git with other developers.
Mercurial
Next up is Mercurial. It's very similar to Git. A main difference is that it's built with Python and runs well on any OS that Python runs on. The user experience is pretty much the same on Windows, Mac, and Linux. But I decided not to use Mercurial for some of the same reasons as Git:
  1. Bad built in GUI. Not quite as ugly as the git GUI, but pretty much just as useless.
  2. User unfriendliness/complication. Same as git, it just seemed a bit too complicated for regular developers to be comfortable with. I'll admit I didn't look in to Mercurial too extensively but it just seemed too much like git for my project.
  3. Slow and inefficient - Mercurial definitely seemed to run a little slower than git, and according to the benchmarks on Bazaar's site, the repository takes up much more space than a Git or Bazaar repository.
Bazaar
Bazaar isn't quite as widely used as Mercurial, but it does have the support of the Ubuntu group and MySQL, and a code hosting site, launchpad.net, which is like github (but also has bug tracking, mailing lists, etc.). I decided to go with Bazaar. My team has been using it for about 5 months now, and things have gone pretty well. Everyone seems to like it (although it did take a little getting used to, coming from Accurev). For a good "why should I use Bazaar" page look at http://doc.bazaar.canonical.com/migration/en/why-switch-to-bazaar.html. Let me explain my reasons why we're using it, and why I think it's better than the other VCSes:
  1. User friendliness is a core goal and not an afterthought. Somewhere on their web page they have a quote that a main goal of Bazaar is "version control for human beings". I don't think you'd ever see this quote anywhere about git, the attitude there seems to be more "it works perfectly for the Linux kernel development team so of course it will work for everything else". Most commands are very easy to use, and the work flow seems to be a little easier. Example - you don't have to add modified files. A commit will automatically commit modified files. Having to "add" an already tracked file in the same way that you actually add an untracked file to the repository, like you have to do in git, doesn't seem intuitive.
  2. Cross platform support - like Mercurial, it's built on Python and runs great in Windows, Mac, and Linux.
  3. Nice built-in GUI - The Bazaar Explorer GUI that comes with Bazaar is actually pretty nice. It looks good, allows you to do almost all operations from the GUI, and the log viewer is excellent. About the only downside is that there is no built-in merge tool, so you have to use a 3rd party tool (I've found diffmerge to work the best across all OSes). But the hooks are in Explorer to launch any 3rd party merge tool.
  4. Revision numbers are not hashes - Each revision has a full unique identifier that never changes (called the revision ID), but there is also a "revision number", which is sequential. It's much easier to refer to revisions by a single number than by a hash. A complication with this is that after a merge, the revisions from the branch that you're merging in get renumbered. Say that revision 6 is the last common revision. New revisions 7 and 8 are created on branch A, and 7 is created on branch B. If a merge with B is done on branch A, what was 7 on branch B now shows up on branch A as 6.1.1, and a new revision 9 is created on branch A when you commit after performing the merge. At first this seems a little unsettling but you get used to it quickly. It also makes viewing the revision log after branches a little clearer: you immediately see the last common revision from the revision number rather than by tracing back. I believe Mercurial works this way too but I didn't look into it far enough.
  5. Both checkouts and branches. Bazaar has a checkout command which is the same as a Subversion checkout - a working directory is created from the server with all current files, and any operations (commit, log view, etc.) performed happen on the server, there is no local repository. Having the ability to do both checkouts and fully distributed branches gives you a ton of flexibility for your project management. Say that you want work done from your office desktops to always be preserved on the server for security in case of a disk failure, but you also want people to do distributed work - contractors at other facilities, developers on laptops, etc. Bazaar is the only VCS that I know of that can do both. This is also handy for test servers - simply do a checkout so that a full copy of the repository isn't on the test machine.
  6. Support for various transport mechanisms - the other two do this as well. You can do all operations over a variety of transport protocols - http, https, sftp (for servers where you don't have Bazaar installed), ssh (where Bazaar is installed on the server), plus I think a few others.
  7. Recovery from user error - we've made just about every mistake that there is to make (merging with the wrong branch, pushing code that isn't ready up to the main branch, specifying wrong files to commit, reverting everything not just one change, etc.). While recovery from these problems hasn't always been straightforward, and it's sometimes hard to figure out how to do, in the end we've always been able to recover. I don't have much experience with the other two VCSes for this, but I imagine they would work pretty much the same way.
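To make the revision numbering in point 4 a bit more concrete, here is a toy Ruby model of the example (last common revision 6, two new revisions on branch A, one on branch B, then B merged into A). This is only a sketch of the numbering as I understand it, not Bazaar's actual algorithm. The method name `revnos_after_merge` and the revision labels are made up for illustration, and it assumes B is the first branch merged in from revision 6.

```ruby
# Toy model of the revision numbers you see on branch A after merging branch B.
# Assumption: B is the first branch merged from the common revision, so its
# revisions get the "<common>.1.<n>" form. Not Bazaar's real implementation.
def revnos_after_merge(common, mainline_revs, merged_revs)
  numbers = {}
  # Revisions committed directly on A keep counting up from the common revision.
  mainline_revs.each_with_index do |rev, i|
    numbers[rev] = (common + i + 1).to_s
  end
  # Revisions brought in from B are numbered relative to the common revision.
  merged_revs.each_with_index do |rev, i|
    numbers[rev] = "#{common}.1.#{i + 1}"
  end
  numbers
end

numbers = revnos_after_merge(6, ['A-first', 'A-second', 'merge commit'], ['B-first'])
numbers.each { |rev, revno| puts "#{rev}: #{revno}" }
```

In this model the two commits on A stay 7 and 8, the commit that records the merge becomes 9, and the revision that was 7 on B is listed as 6.1.1.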
Bazaar is certainly not perfect. I've written a subsequent blog posting, "Bazaar problems and lessons learned", explaining some of the problems we've run into and how to fix them. But, while git may work best for large open source projects, in my opinion Bazaar is by far the best VCS to use for the vast majority of software development work being done - businesses doing company proprietary development work with a team of developers of varying skill levels.

Sunday, September 12, 2010

YouTube videos

Now that I have a phone with a camera that can take decent videos (Droid X), I've started taking videos when I go to concerts and posting them on YouTube. You can see all of them at my YouTube channel at http://www.youtube.com/user/valenshek?feature=mhum.

Friday, September 10, 2010

Can't run rake test in Rubymine with Ruby 1.9

This one has happened to me several times on new installs, so I figured I should finally write something up about it. If you're using Rubymine (which I would highly recommend) with Ruby 1.9, and you attempt to run rake test using the built-in rake task tools in Rubymine, you'll get an error that says "File 'test/unit/autorunner.rb' not found in $LOAD_PATH of Ruby SDK ...". There is an easy fix for this. Just install and attach the test-unit gem.
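If you want to check from plain Ruby whether a given install can see that file at all, something like the following sketch should do it. The helper name `loadable?` is my own invention (it isn't part of Rubymine or test-unit). It just tries the require and reports whether Ruby raised LoadError.

```ruby
# Returns true if `feature` can be required from the current $LOAD_PATH,
# false if Ruby raises LoadError. (`loadable?` is a hypothetical helper name.)
def loadable?(feature)
  require feature
  true
rescue LoadError
  false
end

if loadable?('test/unit/autorunner')
  puts 'test/unit/autorunner found'
else
  puts 'test/unit/autorunner missing; try: gem install test-unit'
end
```

Run it with the same Ruby that Rubymine is configured to use as its SDK, otherwise the answer may not match what Rubymine sees.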

Tuesday, September 7, 2010

Upgrade to Rails 2.3.9 session no longer works

I just upgraded to the newly released Rails 2.3.9, and session data stopped getting saved. I could set session data and it was accessible within the same request, but on the next request, the session data was gone.

After digging a little deeper, I found that I was specifying the options for the session in the wrong place. Previously, the session options were specified in environment.rb. Now, Rails has moved this into a different file, config/initializers/session_store.rb. Simply create this file with the following code:
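For reference, the file that a new Rails 2.3 app generates looks roughly like the sketch below. The key and secret shown here are made-up placeholders, not values from any real app, and this only runs inside a Rails 2.3 application:

```ruby
# config/initializers/session_store.rb
# Placeholder values: swap in your own key and secret.
ActionController::Base.session = {
  :key    => '_myapp_session',
  :secret => 'replace-with-your-existing-secret-of-at-least-30-characters'
}

# Use the database for sessions instead of the cookie-based default
# (create the session table with "rake db:sessions:create").
# ActionController::Base.session_store = :active_record_store
```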



Change the key to what you previously set as :session_key, and set secret to your previous :secret value. Be sure to uncomment the last line if you're using the database as the session store. Also, be sure to delete all session code from environment.rb after you do this.

However, this still didn't do the trick for the main Rails app that I work on. I'm using ActiveRecord to store session data for this app - data is too sensitive to store in a cookie. After adding the file session_store.rb to config/initializers, data still wasn't getting stored in the session. This appears to be a bug in Rails 2.3.9, as evidenced by ticket #5581. I tried the patch that Mislav posted in the comments of the ticket, but the session still didn't work for me. So, it's back to Rails 2.3.5 for my main app. The ticket is closed, so it appears as if this has been fixed in the Rails code, but I'm not sure if/when a 2.3.10 version of Rails will be released.