With Great Power Comes Great Responsibility

Git is a very powerful tool. Its many commands for branching, merging, rebasing and so on give great flexibility in dealing with many kinds of issues.

As usual, power needs control. For example, the more branches we have, the more difficult it is to handle them correctly. That's not a limitation of Git, it's a limitation of the user. To take full advantage of a tool, it's necessary to set some discipline, some rules, for using it properly.

When we started to use Git, everyone was able to use it, but everyone used it in their own way. This is fine if you're the only developer on the project and your repository remains local, but when you need to share your work with others, it's necessary to set some guidelines to keep the repository ordered and clean.

Basic Repository

When we create a repository, after the first commit we create these branches:

  • develop
  • master

The develop branch is the one in which developers usually work: new features are implemented there, along with all the trouble that comes with them. The master branch is the one from which software releases are delivered. Only production-ready commits belong in this branch: commits (often tagged) for code that is ready to be distributed. They are the stable releases. Both branches are pushed to the remote repository, and all developers can access them. Developers work with the develop branch, moving to the master branch when everything is ready for a release.

So if we want to create a new repository from scratch:

git init
touch README.md
git add .
git commit -m "First commit."
git checkout -b develop

 

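At this point, we have a repository with both branches (with develop checked out), so we can start to work with it. If your Git is configured to name the initial branch something other than master (the init.defaultBranch setting), adapt the names accordingly.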

Issue branches

One important thing to remember is that the develop branch is shared across all developers. An important consequence is that the develop branch must always be ready to be built, tested and so on. A developer must not commit anything to the develop branch that can break the code in any way. If I'm working on an issue, pull the code from the remote repository and get a compilation error, I will lose time solving it: finding the developer that made the mistake, or going back in the repository to a previous commit.

The develop branch should also be the one used by continuous integration software that runs unit tests and so on; obviously, if there's an error on it, the regression tests fail. The develop branch should always be tested. A developer should never work on the develop branch directly, but in other branches, merging the work into the develop branch only when it's ready.

Git allows us to easily create local branches, so if I must work on an issue, I create a new local branch from the develop one, work on it, and when it works, merge my local branch into the develop one. This way the develop branch will never have spurious commits that can break the project.
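As a minimal sketch of this step, the commands below create an issue branch from develop inside a throwaway repository. The scratch directory, the user name and email, and the file feature.txt are assumptions made only so that the example runs on its own; w-016 is the issue id used later in this article.

```shell
# Scratch repository so the sketch can run on its own (hypothetical setup).
repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # use the branch name from this article
git config user.name "Example Dev"
git config user.email "dev@example.com"
touch README.md
git add README.md
git commit -q -m "First commit."
git checkout -q -b develop

# Start the issue branch from develop; the branch is named after the issue id.
git checkout -q -b w-016 develop

# Work in progress is committed on the issue branch only.
echo "work in progress" > feature.txt
git add feature.txt
git commit -q -m "Added missing classes."

# develop still contains only the first commit.
git log --oneline develop
```

Whatever happens on w-016, other developers pulling develop are not affected until the issue branch is integrated.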

Merge vs. Rebasing

I really like Vincent Driessen's post about Git workflow. The workflow that we use is similar to his, but with some differences meant to keep the repository cleaner.

In our projects we use a project management application. In particular, we like Redmine a lot. But regardless of the tool that you use, to develop a project you will usually have a place in which you track your bugs, tasks and so on. Usually every application that tracks project issues assigns a unique id to each of them. When I work on a project, I take an issue, work on it, and when it's completed I commit it to the develop branch.

This is what Vincent says, but we bring our local branch into the develop one in a different way. Instead of merging our local branch, we rebase it. It takes a bit longer, but it has many advantages in my opinion.

One Issue, One Commit

When we work on an issue locally, we can create a large number of commits, and some of them can even break the code. That's not important at this stage because it's a work in progress. Once the issue is completed, all these intermediate commits are useless, so we can squash the local branch into a single commit. In the develop branch we will have a single commit containing all the changes for the issue. During the squash we can rewrite the commit message to include the identifier of the issue, so we know exactly what every commit does.

Merging workflow

We are working on an issue on a local branch, and we finish it. At this point we can have this situation (for the lg alias you can see this StackOverflow post):

$ git lg
* 144e8c1 (HEAD -> w-016) - Daniele Lupo : Regression test ok.
* 7dd99e5 - Daniele Lupo : Fixed compilation.
* a563502 - Daniele Lupo : Added missing classes.
* d0ccddf (develop)(origin/develop) - Daniele Lupo : w-014
* 4fcc96e (master) - Daniele Lupo : First commit.
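The lg alias is not a built-in Git command. The definition below is only an assumption that approximates the output format shown here (hash, refs, author, subject); it is not necessarily the alias from the referenced post. The scratch repository and the user identity are hypothetical and serve only to make the sketch runnable.

```shell
# Scratch repository so the sketch can run on its own (hypothetical setup).
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"
touch README.md
git add README.md
git commit -q -m "First commit."

# Assumed alias: graph view printing "<hash> (<refs>) - <author> : <subject>".
# It is stored in this repository only; add --global to have it everywhere.
git config alias.lg "log --graph --pretty=format:'%h%d - %an : %s'"

git lg
```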

We can see that we have finished working on our issue branch (named w-016, after the id of the issue), and now we want to merge it into the develop branch. To do this we can simply switch to the develop branch and perform the merge:

$ git checkout develop
$ git pull origin develop
$ git merge --no-ff w-016

If everything goes well, we obtain something like this:

$ git lg -n20
*   ddfa90e (HEAD -> develop)(origin/develop) - Daniele Lupo : Merge branch 'w-016' into develop
|\
| * 144e8c1 (w-016) - Daniele Lupo : Regression test ok.
| * 7dd99e5 - Daniele Lupo : Fixed compilation.
| * a563502 - Daniele Lupo : Added missing classes.
* | f53f216 - Daniele Lupo : w-012 fixed.
* | 48a4440 - Daniele Lupo : w-044 fixed.
|/
* d0ccddf - Daniele Lupo : w-014.
* 4fcc96e (master) - Daniele Lupo : First commit.

At this point everything should be done. But we have at least two things that can create problems:

  1. In the issue branch we have a commit that does not compile. This commit ends up in the develop branch.
  2. If there's a conflict during the merge, we have to solve it, and we cannot be sure that it is resolved correctly. We can make a mistake, so the merge commit can have problems, and this commit is in the develop branch.

In addition to this, the local branch can have many more commits (like one hundred), and so the repository history is simply bloated.

Rebasing workflow

In our project we change these steps, using a workflow based on rebasing. We start from the same situation:

$ git lg
* 144e8c1 (HEAD -> w-016) - Daniele Lupo : Regression test ok.
* 7dd99e5 - Daniele Lupo : Fixed compilation.
* a563502 - Daniele Lupo : Added missing classes.
* d0ccddf (develop)(origin/develop) - Daniele Lupo : w-014
* 4fcc96e (master) - Daniele Lupo : First commit.

To put our changes into the develop branch, we perform the following steps.

1. Interactive rebasing

We perform an interactive rebase of our local branch onto the develop branch:

$ git rebase -i develop

During the interactive rebasing we squash all our commits into a single one:

pick a563502 Added missing classes.
s 7dd99e5 Fixed compilation.
s 144e8c1 Regression test ok.
# Rebase d0ccddf..144e8c1 onto d0ccddf (3 commands)
#
# Commands:
# p, pick = use commit
# r, reword = use commit, but edit the commit message
# e, edit = use commit, but stop for amending
# s, squash = use commit, but meld into previous commit
# f, fixup = like "squash", but discard this commit's log message
# x, exec = run command (the rest of the line) using shell
# d, drop = remove commit
#
# These lines can be re-ordered; they are executed from top to bottom.
#
# If you remove a line here THAT COMMIT WILL BE LOST.
#
# However, if you remove everything, the rebase will be aborted.
#
# Note that empty commits are commented out

Then we change the commit message using the issue code:

# This is a combination of 3 commits.
# The first commit's message is:
w-016: Solved issue.
# Please enter the commit message for your changes. Lines starting
# with '#' will be ignored, and an empty message aborts the commit.
#
# Date:      Sat Nov 12 21:36:47 2016 +0100
#
# interactive rebase in progress; onto d0ccddf
# Last commands done (3 commands done):
#    s 7dd99e5 Fixed compilation.
#    s 144e8c1 Regression test ok.
# No commands remaining.
# You are currently rebasing branch 'w-016' on 'd0ccddf'.
#
# Changes to be committed:

After this, we obtain a branch with a single commit, identifying the issue that we solved:

$ git lg
* 34e226a (HEAD -> w-016) - Daniele Lupo : w-016: Solved issue.
* d0ccddf (develop)(origin/develop) - Daniele Lupo : w-014
* 4fcc96e (master) - Daniele Lupo : First commit.

2. Pull develop branch

At this point, we check out the develop branch and pull its changes.

$ git lg
* f6366e3 (HEAD -> develop)(origin/develop) - Daniele Lupo : w-012 fixed.
* 9509d88 - Daniele Lupo : w-044 fixed.
| * 34e226a (w-016) - Daniele Lupo : w-016: Solved issue.
|/
* d0ccddf - Daniele Lupo : w-014
* 4fcc96e (master) - Daniele Lupo : First commit.

3. Putting all together

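At this point, we can rebase our issue branch onto the develop branch, and merge it into develop. Since the rebased issue branch sits directly on top of develop, the merge is a simple fast-forward and creates no merge commit: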

$ git checkout w-016
$ git rebase develop
$ git checkout develop
$ git merge w-016

At the end, we obtain the following tree:

$ git lg
* 18958c4 (HEAD -> develop, w-016) - Daniele Lupo : w-016: Solved issue.
* f6366e3 - Daniele Lupo : w-012 fixed.
* 9509d88 - Daniele Lupo : w-044 fixed.
* d0ccddf - Daniele Lupo : w-014
* 4fcc96e (master) - Daniele Lupo : First commit.

As we can see, we obtained a tree that's cleaner, that does not contain broken commits, and that allows us to identify clearly the changes for a specific issue.
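The three steps can be replayed in a throwaway repository with the sketch below. It rests on several assumptions that are not part of our workflow: a script has no editor, so `git reset --soft` stands in for the interactive squash; there is no remote, so a local commit on develop stands in for the pull; and `--ff-only` is an extra safety check that refuses anything but a fast-forward. File names, the user identity and the "WIP" messages are hypothetical.

```shell
# Scratch repository so the sketch can run on its own (hypothetical setup).
repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # use the branch name from this article
git config user.name "Example Dev"
git config user.email "dev@example.com"
echo "base" > README.md
git add README.md
git commit -q -m "First commit."
git checkout -q -b develop

# Issue branch with three work-in-progress commits.
git checkout -q -b w-016
for step in classes compile tests; do
  echo "$step" >> issue.txt
  git add issue.txt
  git commit -q -m "WIP: $step"
done

# Meanwhile another issue lands on develop (stands in for the pull).
git checkout -q develop
echo "other" > other.txt
git add other.txt
git commit -q -m "w-012 fixed."

# 1. Squash the issue branch into one commit (stand-in for `git rebase -i develop`).
git checkout -q w-016
git reset -q --soft "$(git merge-base develop w-016)"
git commit -q -m "w-016: Solved issue."

# 2-3. Replay the single commit on top of develop, then fast-forward develop.
git rebase -q develop
git checkout -q develop
git merge -q --ff-only w-016

git log --oneline
```

The resulting history is linear: one commit per issue and no merge commits.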

Pros and Cons

If we compare this workflow with Vincent's, we can see that adding an issue to the develop branch takes more steps, and involves rebasing, which is considered a more dangerous command. Considering that many developers use a GUI like GitKraken or GitExtensions to perform these operations, these steps can also be more tedious.

We focused on learning to use Git from the command line, because there we can perform many advanced tasks with less effort than in most GUIs. I suggest learning to use Git this way because, in my opinion, it's the best way to use this tool properly. Once you know the command line, the greater number of steps is not a big issue.

On the other hand, this approach gives us many advantages.

The first one is that, as you can see, the history remains clean, without a bloat of commits, and it's easy to identify where we made a change (or where we introduced a bug). If the history is easier to read, it's easier to work with when using blame, bisect and so on.

We also reduce the risk of conflict problems. When we rebase our local branch onto the develop one, any conflicts are solved in our local branch and stay there, so it's possible to test our code. If we made a mistake during conflict resolution, we can fix it in our local branch and perform another interactive rebase, so that we always end up with a single commit with no errors. The develop branch remains untouched until the very end.
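A minimal sketch of such a conflict, again in a throwaway repository: the file notes.txt, its contents and the user identity are hypothetical, and `GIT_EDITOR=true` is used only so that the script does not open an editor when the rebase continues.

```shell
# Scratch repository so the sketch can run on its own (hypothetical setup).
repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # use the branch name from this article
git config user.name "Example Dev"
git config user.email "dev@example.com"
echo "base" > notes.txt
git add notes.txt
git commit -q -m "First commit."
git checkout -q -b develop

# The issue branch and develop both change the same line.
git checkout -q -b w-016
echo "issue change" > notes.txt
git commit -q -am "w-016: Solved issue."
git checkout -q develop
echo "other change" > notes.txt
git commit -q -am "w-012 fixed."

# The rebase stops on the conflict; develop itself is not modified.
git checkout -q w-016
git rebase develop || echo "conflict: resolve it on the issue branch"

# Resolve by hand (here: keep both lines), then continue the rebase.
printf 'other change\nissue change\n' > notes.txt
git add notes.txt
GIT_EDITOR=true git rebase --continue

git log --oneline --graph
```

At this point the resolved commit exists only on w-016, so it can be built and tested before develop is fast-forwarded to it.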

It's also easier to identify commits. Each commit in the develop branch has a descriptive message, instead of the default merge commit message, which is much more obscure (really, who changes the commit message when merging branches?).

Release

When we are ready to make a release, we take the commit from the develop branch that must be released, and we create a new branch from it, named release-vX.Y.Z after the version number. At this point, we have two main branches: in the develop one, we continue to add features for the next version, while in the release one, we test the program. If testers discover a bug, we solve it in the release branch, creating a new commit that fixes that specific bug. In this way we keep the code for new features separate from the bugfixes.

When the release is tested and ready for production, we perform two operations:

  1. We merge the release branch into the master one, tagging the commit with the vX.Y.Z tag, so we can easily identify the commit corresponding to the production code for that version.
  2. We merge the release branch into the develop one. This merge brings all the bugfixes into the develop branch, so the code is fixed for new issues and future releases.
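The release steps can be sketched as follows in a throwaway repository. The version number v1.0.0, the issue id w-020, the file names and the user identity are hypothetical. The use of an annotated tag (`git tag -a`) and of plain merges with `--no-edit` are assumptions too: the steps above do not say which tag type or merge options to use.

```shell
# Scratch repository so the sketch can run on its own (hypothetical setup).
repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # use the branch name from this article
git config user.name "Example Dev"
git config user.email "dev@example.com"
echo "v0" > app.txt
git add app.txt
git commit -q -m "First commit."
git checkout -q -b develop
echo "feature" >> app.txt
git commit -q -am "w-016: Solved issue."

# Cut the release branch from the develop commit that must be released.
git checkout -q -b release-v1.0.0 develop

# A bug found by testers is fixed on the release branch.
echo "fix" > fix.txt
git add fix.txt
git commit -q -m "Fixed bug found during release testing."

# Meanwhile, work for the next version continues on develop.
git checkout -q develop
echo "next" > next.txt
git add next.txt
git commit -q -m "w-020: Next feature."

# 1. Merge the release branch into master and tag the result.
git checkout -q master
git merge -q --no-edit release-v1.0.0
git tag -a v1.0.0 -m "Release v1.0.0"

# 2. Merge the release branch into develop to bring the bugfix back.
git checkout -q develop
git merge -q --no-edit release-v1.0.0

git log --oneline --graph --decorate --all
```

After these steps master holds the released code without the next feature, while develop holds both the next feature and the bugfix.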

This release workflow keeps new-feature code, untested code and so on out of a bugfix. Different people can work on these branches, so it's also easier for a project manager to assign developers to the project.

Hotfix

We all know that bugs are always present in production code, so sometimes a new bug is discovered in released code, and we need to fix it.

In this case, instead of fixing the bug in the develop branch and creating a new release with a bunch of new and untested code, these are the steps that we follow:

  1. We create a new branch from the commit corresponding to the production code in use. This is easy because the commit is tagged with the version of the project, so we can check out the tag and create a new hotfix branch from it.
  2. We solve the bug in the hotfix branch.
  3. Once it is tested, we merge the hotfix branch into the master branch, tagging the result with a new version, and we also merge the hotfix branch into the develop branch.
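These steps can be sketched as follows in a throwaway repository. The branch name hotfix-v1.0.1, the version numbers, the file names and the user identity are hypothetical (the steps above do not prescribe a naming scheme for hotfix branches), and the annotated tags and `--no-edit` merges are assumptions as well.

```shell
# Scratch repository so the sketch can run on its own (hypothetical setup).
repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # use the branch name from this article
git config user.name "Example Dev"
git config user.email "dev@example.com"
echo "v1" > app.txt
git add app.txt
git commit -q -m "First commit."
git tag -a v1.0.0 -m "Release v1.0.0"     # the production code in use

# develop has moved on with new, untested work.
git checkout -q -b develop
echo "next" > next.txt
git add next.txt
git commit -q -m "w-020: Next feature."

# 1. Create the hotfix branch from the tag of the production version.
git checkout -q -b hotfix-v1.0.1 v1.0.0

# 2. Solve the bug in the hotfix branch.
echo "fix" > hotfix.txt
git add hotfix.txt
git commit -q -m "Fixed production bug."

# 3. Merge into master and tag the new version, then merge into develop.
git checkout -q master
git merge -q --no-edit hotfix-v1.0.1
git tag -a v1.0.1 -m "Hotfix v1.0.1"
git checkout -q develop
git merge -q --no-edit hotfix-v1.0.1

git log --oneline --graph --decorate --all
```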

Using this method we can insert the bugfix in the release that needs it, without mixing production code and development code.

Conclusions

In this article we described the Git workflow we use to develop our projects, inspired by a widely used one, but with some changes that allow us to manage the project itself more easily.


Articles of the Git Series:

  1. Why GIT?
  2. My Git workflow
