Have you ever wanted to change history? - The story of Git - part five

History. When we look at it we see a story. A story of how something happened, how it evolved, how it ended up being the thing it is today... It is the same with code - when we look at the code history we would like to see a story of how the code evolved, from the first commit to the first deploy to production, and in the end the release to your users. I say "would like", because we oftentimes end up seeing a tangled spider web instead, and trying to determine what happened when can become hazardous to our health. That is, if you didn't use rebase.

Joining the code

Let's start from the beginning - joining your code with the main branch (the trunk) is, broadly speaking, called merging. In Git, you have two ways to do that - merge and rebase.

First up in line is merge. It is also called a 3-way merge, and you'll see why in this paragraph. This type of joining usually happens when you branched out from trunk, added changes to your branch, and, when you want to merge your changes back, you see that the trunk has moved forward in the meantime (somebody merged something into it before you did). Since it's not a race of who will be the first to merge, Git looks at three commits: first, the common parent of the trunk and your branch (in other words, the last commit they shared before your branch diverged from trunk); second, the last commit on the trunk; and third, the last commit on your branch. It then combines the changes from both branches. That is why it is called a 3-way merge - Git looks at three different commits. As the result, Git creates a merge commit, a commit that has two parent commits. A bit complex, but that is what Git does by default whenever the trunk has moved on. The following snippet shows how a merge looks in practice.

# before merge
main A------B------C------D------E
                    \
your-branch          C1------C2------C3     

$ git checkout main
$ git merge your-branch

# after merge
main A------B------C------D------E------F
                    \                  /
your-branch          C1------C2------C3     

In the example above, the commit 'F' is a merge commit - it will have two parents - 'E' and 'C3'.
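
If you'd like to try this yourself, here is a minimal sketch that rebuilds the diagram in a throwaway repository and checks that 'F' really has two parents. The commit helper, the file names and the example identity are made up for this illustration; only the git commands themselves are the real thing.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main        # call the first branch "main" on any Git version
git config user.name "Example" && git config user.email "example@example.com"
git config commit.gpgsign false

# Hypothetical helper: one commit that adds one file named after the commit.
commit() { echo "$1" > "$1.txt" && git add "$1.txt" && git commit -q -m "$1"; }

commit A; commit B; commit C
git checkout -q -b your-branch
commit C1; commit C2; commit C3
git checkout -q main
commit D; commit E

# The three commits Git looks at: the common parent and the two branch tips.
base=$(git merge-base main your-branch)
echo "common parent: $(git log -1 --format=%s "$base")"

git merge -q --no-edit your-branch           # creates the merge commit 'F'

# %P lists the parent hashes of a commit, so counting words counts parents.
parent_count=$(git log -1 --format=%P | wc -w | tr -d ' ')
echo "the merge commit has $parent_count parents"
```

Running git log --oneline --graph afterwards should draw the same fork-and-join shape as the diagram above.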

The simpler merge is the merge with fast-forward. It happens when you create a branch from the trunk, add some changes, and, when you finish and join the branch back to trunk, you see that in the meantime the trunk didn't move forward (didn't receive any commits). Git doesn't need to create any additional commit here - it simply moves the trunk forward to the last commit of your branch. That is why it's called fast-forward. The example below shows the fast-forward merge, where there is no merge commit.

# before merge 
main A------B------C
                    \
your-branch          C1------C2------C3     

$ git checkout main
$ git merge your-branch

# after merge
main A------B------C------C1------C2------C3
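
The same kind of throwaway repository can be used to check the fast-forward case. This is only a sketch: the commit helper and the example identity are invented for the illustration, and I've added the --ff-only flag, which makes Git refuse the merge if a fast-forward is not possible. A plain git merge your-branch does the same thing here by default.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main        # call the first branch "main" on any Git version
git config user.name "Example" && git config user.email "example@example.com"
git config commit.gpgsign false

# Hypothetical helper: one commit that adds one file named after the commit.
commit() { echo "$1" > "$1.txt" && git add "$1.txt" && git commit -q -m "$1"; }

commit A; commit B; commit C
git checkout -q -b your-branch
commit C1; commit C2; commit C3

git checkout -q main                         # main is still at C
git merge -q --ff-only your-branch           # only the main pointer moves

git log --oneline --graph main
merges=$(git rev-list --merges --count main) # number of merge commits in main's history
echo "merge commits on main: $merges"
```

After the merge, main and your-branch point at the very same commit, which is why no extra commit was needed.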

Last, but definitely not least, the notorious one - rebase. Why notorious? In short - it changes the Git history. This is nothing to worry about; changing Git history is neither that dangerous nor that hard. There are groups that are for and against it, but we'll see about that later in the text.

A bit about Rebase

A rebase means, in one sentence - changing the base of your commits. A bit longer explanation - when you rebase, imagine a hand that takes all of your commits and puts them on top of the last commit of the branch you've decided to rebase onto. Let's see it in the example below:

# before rebase 
main A------B------C------D------E
                    \
your-branch          C1------C2------C3     

$ git checkout your-branch
$ git rebase main

# after rebase
main A------B------C------D------E
                                  \
your-branch                        C1'------C2'------C3'

# perform fast-forward merge into main
$ git checkout main
$ git merge your-branch

main A------B------C------D------E------C1'------C2'------C3'

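Why do the commits from your-branch have a single quote on them (the ' sign) after the rebase? Were they changed? In a way, they were. No, you didn't lose any of your work - the changes in each commit are the same. What changed is the parent of the C1 commit: it used to be C, and now it is E. Since a commit's hash is calculated from its contents, parent included, C1 gets a new hash and is therefore a new object, C1'. The same then goes for C2 and C3, because their parents changed too. And we now know that you shouldn't be afraid of the hash.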
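
To see both halves of that claim (new hashes, same work), here is a small sketch in a throwaway repository. The commit helper and the example identity are assumptions made for this illustration. It records the tip hash and the combined diff of the three branch commits before the rebase, and compares them with the values after it.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main        # call the first branch "main" on any Git version
git config user.name "Example" && git config user.email "example@example.com"
git config commit.gpgsign false

# Hypothetical helper: one commit that adds one file named after the commit.
commit() { echo "$1" > "$1.txt" && git add "$1.txt" && git commit -q -m "$1"; }

commit A; commit B; commit C
git checkout -q -b your-branch
commit C1; commit C2; commit C3
git checkout -q main
commit D; commit E

git checkout -q your-branch
old_tip=$(git rev-parse HEAD)                # hash of C3
old_patch=$(git diff HEAD~3 HEAD)            # everything the branch changed

git rebase -q main

new_tip=$(git rev-parse HEAD)                # hash of C3'
new_patch=$(git diff HEAD~3 HEAD)            # the same three commits, new base

echo "C3  = $old_tip"
echo "C3' = $new_tip"
git log --oneline --graph
```

The two hashes differ, while the two diffs are identical: the work is the same, only the base it sits on has moved from C to E.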

Now if you go on and merge your-branch into main, it will be a fast-forward merge, with all the commits in linear order, easy to follow. Or maybe not so easy?

To change or not to change the (Git) history?

There are two views on this. Should you or shouldn't you change the Git history? In other words - should you use rebase or merge? The first view is that the Git history should reflect how your project developed, how it all actually happened, never mind if it is a messy thing. From this point of view it makes no sense to change the Git history; it would be like "lying".

The second point of view - the Git history should be a story of how a project was made. It is sort of like publishing a book - you wouldn't publish your first draft of the book, but the end result.

Which is better - there is no easy answer. Whatever suits you most is the best. The only rule to follow is to not rebase public branches: rebasing replaces commits with new ones, and that will mess up the work of anybody who has already built on the old ones. Only rebase branches you haven't published, or that you know nobody other than you uses.
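
Here is a rough sketch of what "messing up" looks like in practice. A bare repository in a temporary directory stands in for the shared server, and the commit helper and example identity are made up for the illustration. The branch is pushed, then rebased, and after that the local branch and the published one no longer agree.

```shell
set -e
work=$(mktemp -d)
git init -q --bare "$work/remote.git"        # stand-in for the shared server
git init -q "$work/repo" && cd "$work/repo"
git symbolic-ref HEAD refs/heads/main        # call the first branch "main" on any Git version
git config user.name "Example" && git config user.email "example@example.com"
git config commit.gpgsign false
git remote add origin "$work/remote.git"

# Hypothetical helper: one commit that adds one file named after the commit.
commit() { echo "$1" > "$1.txt" && git add "$1.txt" && git commit -q -m "$1"; }

commit A
git checkout -q -b your-branch
commit C1
git push -q origin your-branch               # the branch is now public
git checkout -q main
commit B

git checkout -q your-branch
git rebase -q main                           # rewrites C1 into C1'

# Commits that exist only on the published branch, and only on the local one.
only_remote=$(git rev-list --count your-branch..origin/your-branch)
only_local=$(git rev-list --count origin/your-branch..your-branch)
echo "only on origin: $only_remote, only local: $only_local"

# A plain push is now rejected, because it would not be a fast-forward.
if git push -q origin your-branch 2>/dev/null; then pushed=yes; else pushed=no; fi
echo "plain push accepted: $pushed"
```

The old C1 still lives on the published branch, and anybody who pulled it has it too, while your local branch now carries C1' instead.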

To wrap things up

You can always use rebase locally before merging. That way you get the best of both worlds.

My personal preference is to rebase my commits before merging into the main branch, occasionally squashing them as well. And what is squashing? Well, it's combining all or some of your commits into one. And because I tend to get a bit chatty and commit often, I use squashing. The thing that I like about rebase, besides keeping the history clean and easy to track, is that it allows you to do an interactive rebase, the mode where you can choose which commits to pick, squash or even remove when rebasing. Pretty nice, isn't it?

To do it, go to your branch that has diverged from main and type in git rebase -i main. That will start the rebase in interactive mode and open your editor with a list similar to the one below.

pick 6c0746b Test commit
pick d7e3d94 Adding third file

# Rebase 4d3da14..d7e3d94 onto 4d3da14 (2 commands)
#
# Commands:
# p, pick <commit> = use commit
# r, reword <commit> = use commit, but edit the commit message
# e, edit <commit> = use commit, but stop for amending
# s, squash <commit> = use commit, but meld into previous commit
# f, fixup <commit> = like "squash", but discard this commit's log message
# x, exec <command> = run command (the rest of the line) using shell
# b, break = stop here (continue rebase later with 'git rebase --continue')
# d, drop <commit> = remove commit
# l, label <label> = label current HEAD with a name
# t, reset <label> = reset HEAD to a label
# m, merge [-C <commit> | -c <commit>] <label> [# <oneline>]
# .       create a merge commit using the original merge commit's
# .       message (or the oneline, if no original merge commit was
# .       specified). Use -c <commit> to reword the commit message.
#
# These lines can be re-ordered; they are executed from top to bottom.
#
# If you remove a line here THAT COMMIT WILL BE LOST.
#
# However, if you remove everything, the rebase will be aborted.
#
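
The list above is just a text file, so the squashing can also be tried without typing anything into an editor. The sketch below is one possible way to do that, not the only one: it points the GIT_SEQUENCE_EDITOR environment variable at a tiny script that keeps the first pick and turns every later pick into fixup. The script name, the commit helper and the example identity are invented for this illustration. I've used fixup rather than squash because squash would additionally open an editor for the combined commit message.

```shell
set -e
work=$(mktemp -d)
git init -q "$work/repo" && cd "$work/repo"
git symbolic-ref HEAD refs/heads/main        # call the first branch "main" on any Git version
git config user.name "Example" && git config user.email "example@example.com"
git config commit.gpgsign false

# Hypothetical helper: one commit that adds one file named after the commit.
commit() { echo "$1" > "$1.txt" && git add "$1.txt" && git commit -q -m "$1"; }

commit A
git checkout -q -b your-branch
commit C1; commit C2; commit C3              # three chatty commits

# Hypothetical "editor": Git passes it the path of the todo list as $1.
cat > "$work/squash-all.sh" <<'EOF'
#!/bin/sh
# Leave the first "pick" alone and change every later "pick" to "fixup".
awk '/^pick / && seen++ { sub(/^pick/, "fixup") } { print }' "$1" > "$1.new" &&
mv "$1.new" "$1"
EOF
chmod +x "$work/squash-all.sh"

GIT_SEQUENCE_EDITOR="$work/squash-all.sh" git rebase -q -i main

commits_left=$(git rev-list --count main..your-branch)
echo "commits on your-branch after squashing: $commits_left"
git log --oneline main..your-branch
```

The three commits end up as a single one that keeps the message of the first commit and contains the changes of all three.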

As you can see, you've got plenty of options. Choose them wisely. Because, similar to when you first get those long-awaited root permissions...

With great power comes great responsibility.