2011-08-03 TOMMY TYNJÄ
Distributed version control systems (DVCS) has been around for many years already, and is increasing in popularity all the time. There are however many projects that are still using a traditional version control system (VCS), such as Subversion. I have until recently, only been working with Subversion as a VCS. Subversion sure has its flaws and problems but mostly got the job done over the years I’ve been working with it. I started contributing to the JBoss ShrinkWrap project early this spring, where they use a DVCS in form of Git. The more I’ve been working with Git, the more I have been aware of the problems which are imposed by Subversion. The biggest transition for me has been to adopt the new mindset that DVCS brings. Suddenly I realized that my daily work has many many times been influenced on the way the VCS worked, rather than doing things the way that feels natural for me as a developer. I think this is one of the key benefits with DVCS, and I think you start being aware of this as soon as you start using a DVCS.
While a traditional VCS can be sufficent in many projects, DVCSs brings new interesting dimensions and possibilites to version control.
What is a distributed version control system?
The fundamental of a DVCS is that each user keeps an own self-contained repository on his/her computer. There is no need to have a central master repository, even if most projects have one, e.g. to allow continuous integration. This allows for the following characteristics:
One of the biggest differences between Git and Subversion which I’ve noticed is not listed above and is the speed of the version control system. The speed of Git has really been blowing me away and in terms of speed, it feels like comparing a Bugatti Veyron (Git) with an old Beetle (Subversion). A project which would take minutes to download from a central Subversion repository is literally taking seconds with Git. Once, I actually had to investigate that my file system acutally contained all the files Git told me it downloaded, as it went so incredibly fast! I want to emphasize that Git is not only faster when downloading/checking out source code the first time, it also applies to commiting, retrieving history etc.
Squashing commits with Git
To be able to change history is something I’ve longed for in all these years working with Subversion. With a DVCS, it is possible! When I’ve been working on a new feature for instance, previously I’ve ususally wanted to commit my progress (as checkpoints, mentioned above) but in a Subversion environment this would screw things up for other team members. When I work with Git, it allows me the freedom to do what I’ve wanted to do during all these years, committing small incremental changes to the code base, but without disturbing other team members in their work. For example, I could add a new method to an interface, commit it, start working on the implementation, commit often, work some more on the implementation, commit some more stuff, then realize that I need to rethink some of the implementation, revert a couple of commits, redo the implementation, commit etc. All this without disturbing my colleagues working on the same code base. When I feel like commiting my work, I don’t necessarily want to bring in all small commits I’ve made at development time, e.g. just adding javadoc to a method in a commit. With Git I can do something called squash, which means that I can bunch commits together, e.g. bunch my latest 5 commits together to a single one, which I then can share with other users. I can even modify the commit message, which I think is a very neat feature.
Example: Squash the latest 5 commits on the current working tree
$ git rebase -i HEAD~5
This will launch a VI editor (here I assume you are familiar with it). Leave the first commit as pick, change the rest of the signatures to
squash, such as:
pick 79f4edb Work done on new feature pick 032aab2 Refactored pick 7508090 More work on implementation pick 368b3c0 Began stubbing out interface implementation pick c528b95 Added new interface method
pick 79f4edb Work done on new feature squash 032aab2 Refactored squash 7508090 More work on implementation squash 368b3c0 Began stubbing out interface implementation squash c528b95 Added new interface method
On the next screen, delete or comment all lines you don’t want and add a more proper commit message:
# This is a combination of 5 commits. # The first commit's message is: Added new interface method # This is the 2nd commit message: Began stubbing out interface implementation
# This is a combination of 5 commits. # The first commit's message is: Finished work on new feature #Added new interface method # This is the 2nd commit message: #Began stubbing out interface implementation ...
Save to execute the squash. This will leave you with a single commit with the message you provided. Now you can just share this single commit with other users, e.g. via push to the master repository (if used).
Another interesting aspect of DVCSs is that if you use master repository, it won’t get hit that often since you execute your commits locally before squashing things together and send them upstream. This makes DVCSs more attractive from a scalability point of view.
A DVCS does not enforce you to have a central repository and every user has its own local repository with full history. Users can work and commit locally before sharing code with other users. If you haven’t tried out DVCS yet, do it! It is actually as easy as stated earlier: Download, install and create your first repository! The concepts of DVCS may be confusing for a non-DVCS user at first, but there are a lot of tutorials out there and “cheat sheets” which covers the most basic (and more advanced) tasks. You will soon discover many nice features with the DVCS of your choice, making it harder and harder to go back to a traditional VCS. If you have experience from DVCSs, please share your experiences!