Keep your git history clean using rebase
Jérémie Marniquet Fabre · 7 min read
Thanks to a new colleague of mine, I have learned how to make my git history cleaner and more understandable.
The principle is simple: rebase your branch before you merge it. But this technique also has weaknesses. In this article, I will explain what a rebase and a merge really do and what the implications of this technique are.
Basically, here is an example of my git history before and after I used this technique.
Stay focused: rebase and merge are no joke! :)
What is the goal of a rebase or a merge?
Rebase and merge both aim at integrating changes that happened on another branch into your branch.
What happens during a merge?
First of all, there are two types of merge:
- Fast-forward merge
- 3-way merge
Fast-forward merge
A fast-forward merge happens when the most recent shared commit between the two branches is also the tip of the branch into which you are merging.
The following drawing shows what happens during a fast-forward merge and how it appears in a graphical git client.
A: the branch into which you are merging
B: the branch from which you get the modifications
- git checkout A
- git merge B
As you can see, git simply brings the new commits on top of branch A. After a fast-forward merge, branches A and B are exactly the same.
Notes:
- With git checkout A followed by git rebase B, you would have had the exact same result!
- git checkout B, git merge A would have left the branches in the “before” situation since branch A has no new commits for branch B.
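To try this yourself, here is a minimal sketch that reproduces the fast-forward case in a throwaway repository. The file names, the commit messages and the `commit` helper are made up for the illustration; only the branch names A and B come from the example above.

```shell
#!/bin/sh
set -e

# Work in a throwaway repository so nothing real is touched.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

# Hypothetical helper: create one file and commit it.
commit() {
    echo "$1" > "$1.txt"
    git add "$1.txt"
    git commit -q -m "$1"
}

commit base
git branch -M A          # name the starting branch A

git checkout -q -b B     # B starts from the tip of A...
commit b1                # ...and receives two new commits
commit b2

git checkout -q A
git merge -q B           # A has no new commits, so git fast-forwards

tip_a=$(git rev-parse A)
tip_b=$(git rev-parse B)
echo "A: $tip_a"
echo "B: $tip_b"
```

Both lines should print the same hash, since after the fast-forward A and B point at the same commit.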
3-way merge
A 3-way merge happens when both branches have had new commits since the last shared commit.
The following drawing shows what happens during a 3-way merge and how it appears in a graphical git client.
A: the branch into which you are merging
B: the branch from which you get the modifications
- git checkout A
- git merge B
During a 3-way merge, git creates a new commit called a “merge commit” (in orange) that contains:
- All the modifications brought by the three commits from B (in purple)
- The possible conflict resolutions
Git will keep all information about the commits of the merged branch B even if you delete it. In a graphical git client, you will also see a small loop representing the merge.
The default behaviour of git is to try a fast-forward merge first. If that is not possible, that is to say if both branches have had changes since the last shared commit, git performs a 3-way merge.
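The 3-way case can be reproduced the same way. This is only a sketch in a throwaway repository: the file names, the commit messages and the `commit` helper are invented, and `--no-edit` is there just to accept the default merge message without opening an editor.

```shell
#!/bin/sh
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

# Hypothetical helper: create one file and commit it.
commit() {
    echo "$1" > "$1.txt"
    git add "$1.txt"
    git commit -q -m "$1"
}

commit base
git branch -M A

git checkout -q -b B     # three new commits on B
commit b1
commit b2
commit b3

git checkout -q A        # two new commits on A: the branches have diverged
commit a1
commit a2

tip_a_before=$(git rev-parse A)
tip_b=$(git rev-parse B)

git merge -q --no-edit B # fast-forward is impossible, so git creates a merge commit

# A merge commit has two parents: the old tip of A and the tip of B.
first_parent=$(git rev-parse "A^1")
second_parent=$(git rev-parse "A^2")
echo "first parent:  $first_parent"
echo "second parent: $second_parent"
```

The two parents are what a graphical client draws as the small loop.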
What happens during a rebase?
A rebase differs from a merge in the way it integrates the modifications.
The following drawings show what happens during a rebase and how it appears in a graphical git client.
A: the branch that you are rebasing
B: the branch from which you get the new commits
- git checkout A
- git rebase B
When you rebase A on B, git creates a temporary branch that is a copy of branch B and tries to apply the new commits of A on it one by one.
For each commit to apply, if there are conflicts, they will be resolved inside of the commit.
After a rebase, the new commits from A (in blue) are not exactly the same as they were:
- If there were conflicts, those conflicts are integrated into each commit
- They have a new hash
But they keep their original author date, which might be confusing: in the final branch, the commits in blue sit on top of the last two commits in purple even though they were created before them.
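The hash and date behaviour can be checked with a small sketch like the one below. Everything in it is an assumption made for the demonstration: the throwaway repository, the file names, the `commit` helper, and the fixed 2020 date given to the commit on A so that the difference is easy to see.

```shell
#!/bin/sh
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

# Hypothetical helper: create one file and commit it.
commit() {
    echo "$1" > "$1.txt"
    git add "$1.txt"
    git commit -q -m "$1"
}

commit base
git branch -M B

# One commit on A, with an arbitrary fixed date in the past.
git checkout -q -b A
echo "a1" > a1.txt
git add a1.txt
GIT_AUTHOR_DATE="2020-01-01 12:00:00 +0000" \
GIT_COMMITTER_DATE="2020-01-01 12:00:00 +0000" \
    git commit -q -m "a1"

old_hash=$(git rev-parse A)
old_author_date=$(git log -1 --format=%ad A)
old_committer_date=$(git log -1 --format=%cd A)

# Meanwhile, B receives two new commits.
git checkout -q B
commit b1
commit b2

git checkout -q A
git rebase -q B          # replay the commit of A on top of B

new_hash=$(git rev-parse A)
new_author_date=$(git log -1 --format=%ad A)
new_committer_date=$(git log -1 --format=%cd A)

echo "hash:           $old_hash -> $new_hash"
echo "author date:    $old_author_date -> $new_author_date"
echo "committer date: $old_committer_date -> $new_committer_date"
```

The hash and the committer date change, while the author date stays the same. Most tools display the author date by default, which is where the confusing ordering comes from.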
What is the best solution to integrate a new feature into a shared branch and keep your git tree clean?
Let's say that you have a new feature made of three new commits on a branch named `feature`. You want to merge this branch into a shared branch, for example `master`, which has received two new commits since you branched off from it.
You have two main solutions:
First solution:
- git checkout feature
- git rebase master
- git checkout master
- git merge feature
Note: Be careful, git merge feature should do a fast-forward merge, but some hosting services for version control do a 3-way merge anyway. To prevent this, you can use git merge feature --ff-only
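Here is the first solution as a runnable sketch, using the same numbers as the scenario above (three commits on `feature`, two new commits on `master`). The throwaway repository, the file names and the `commit` helper are assumptions for the demo.

```shell
#!/bin/sh
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

# Hypothetical helper: create one file and commit it.
commit() {
    echo "$1" > "$1.txt"
    git add "$1.txt"
    git commit -q -m "$1"
}

commit base
git branch -M master

git checkout -q -b feature   # three commits for the new feature
commit f1
commit f2
commit f3

git checkout -q master       # master moves on in the meantime
commit m1
commit m2

# First solution: rebase, then merge.
git checkout -q feature
git rebase -q master
git checkout -q master
git merge -q --ff-only feature   # refuses to do anything but a fast-forward

merge_commits=$(git rev-list --merges master)
total=$(git rev-list --count master)
echo "merge commits: ${merge_commits:-none}"
echo "commits on master: $total"
git log --oneline --graph master
```

The graph printed at the end should be a single straight line of six commits, with no merge commit.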
Second solution:
- git checkout master
- git merge feature
As you can see, the final tree is simpler with the first solution: you simply have a linear git history. In contrast, the second solution creates a new “merge commit” and a loop to show that a merge happened.
In this situation, the git tree is still readable, so the advantage of the first solution is not so obvious. The complexity emerges when you have several developers in your team, and several feature branches developed at the same time. If everyone uses the second solution, your git tree ends up complex with several loops, and it can even be difficult to see which commits belong to which branch!
Unfortunately, the first solution has a few drawbacks:
History rewriting
When you use a rebase, like in the first solution, you “rewrite history” because you replace the existing commits on your branch with new ones. This can be problematic if several developers work on the same branch: when you rewrite history, you have to use git push --force in order to erase the old branch on the remote repository and put your new branch (with the new history) in its place.
This can potentially erase changes another developer made, or force them to resolve conflicts.
To avoid this problem, you should only rebase branches on which you are the only one working. For example in our case, if you are the only one working on the feature branch.
However, you might sometimes have to rewrite the history of shared branches. In this case, make sure that the other developers working on the branch are aware of it, and are available to help you if you have conflicts to resolve.
The obvious advantage of the 3-way merge here is that you don’t rewrite history at all.
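To see why the force push is needed after a rebase, here is a sketch with a local bare repository standing in for the remote. The paths, the file names and the `commit` helper are invented for the demo, and no real hosting service is involved.

```shell
#!/bin/sh
set -e

root=$(mktemp -d)
git init -q --bare "$root/remote.git"   # stand-in for the remote repository
git init -q "$root/work"
cd "$root/work"
git config user.name "Example"
git config user.email "example@example.com"

# Hypothetical helper: create one file and commit it.
commit() {
    echo "$1" > "$1.txt"
    git add "$1.txt"
    git commit -q -m "$1"
}

commit base
git branch -M master
git remote add origin "$root/remote.git"
git push -q origin master

git checkout -q -b feature
commit f1
git push -q origin feature              # the remote now has the original f1

git checkout -q master
commit m1
git checkout -q feature
git rebase -q master                    # f1 is replaced by a new commit

# A plain push is refused: the remote branch is not an ancestor of ours.
if git push -q origin feature 2>/dev/null; then
    plain_push=accepted
else
    plain_push=rejected
fi
echo "plain push: $plain_push"

# The force push replaces the remote branch with the rewritten one.
git push -q --force origin feature
remote_tip=$(git --git-dir="$root/remote.git" rev-parse refs/heads/feature)
echo "remote feature: $remote_tip"
```

This article only uses `--force`. Git also has `git push --force-with-lease`, which refuses to overwrite the remote branch if it has moved since you last fetched it. It may be worth considering on branches that other people might touch.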
Conflict resolution
When you merge or rebase, you might have to resolve conflicts.
What I like about the rebase is that the conflicts introduced by one commit are resolved in that same commit. In contrast, the 3-way merge resolves all the conflicts in the new “merge commit”, mixing together the conflicts introduced by the different commits of your feature branch.
The only problem with the rebase is that you may have to resolve more conflicts, due to the fact that the rebase applies the commits of your branch one by one.
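The following sketch shows a conflict being resolved inside the rebased commit itself. It is a contrived case: a throwaway repository where `master` and `feature` both edit the same line of a made-up `file.txt`. `GIT_EDITOR=true` is only there to accept the commit message without opening an editor.

```shell
#!/bin/sh
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

echo "original" > file.txt
git add file.txt
git commit -q -m "base"
git branch -M master

# Both branches change the same line, which guarantees a conflict.
git checkout -q -b feature
echo "feature version" > file.txt
git commit -q -am "feature: edit file"

git checkout -q master
echo "master version" > file.txt
git commit -q -am "master: edit file"

git checkout -q feature
if git rebase master >/dev/null 2>&1; then
    conflict=no
else
    conflict=yes        # the rebase stops on the conflicting commit
fi
echo "conflict: $conflict"

# Resolve the conflict by hand, then let the rebase carry on.
echo "merged version" > file.txt
git add file.txt
GIT_EDITOR=true git rebase --continue >/dev/null 2>&1

resolved_subject=$(git log -1 --format=%s feature)
resolved_content=$(git show feature:file.txt)
merge_commits=$(git rev-list --merges feature)
echo "tip commit: $resolved_subject"
echo "file content: $resolved_content"
```

The resolution ends up in the rebased "feature: edit file" commit, and no merge commit is created. With several commits on `feature`, the rebase could stop once per conflicting commit, which is the extra work mentioned above.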
Conclusion
To conclude, I hope I have convinced you that rebasing your branch before merging it can clean up your git history a lot! Here is a recap of the advantages and disadvantages of the rebase and merge method versus the 3-way merge method: