Git – under the hood again
This is a continuation of my earlier article about git. This time I will talk about branches and HEAD (my personal horror until I understood its true nature).
Last time I wrote about how git handles files, changes to them, and commits. In this article I will continue the topic by focusing on other aspects.
Branches
I will start with a simple example: a new branch and a new file committed on it. Before those preparations, our .git/objects dir looked like this:
$ ls .git/objects/
24/ 3b/ 51/ 5a/ 6c/ 94/ a5/ info/ pack/
afterwards:
$ ls .git/objects/
24/ 3b/ 51/ 5a/ 6c/ 94/ a5/ b8/ cd/ e0/ info/ pack/
As we know, the 'b8', 'cd' and 'e0' dirs were created for the tree, the new file's blob, and the commit, respectively; each dir is named after the first two characters of the object's hash (all other dirs are from my previous git article).
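A side note on where those directory names come from: git names an object by the SHA-1 of a short header plus the content, and stores it in a directory named after the first two hex characters of that hash. The sketch below reproduces this for an empty blob without calling git at all. It assumes a `sha1sum` binary is available (GNU coreutils), and the variable names are mine.

```shell
# A blob's ID is SHA-1("blob <size in bytes>\0<content>").
# Assumption: sha1sum (GNU coreutils) is on PATH.
content=''                                   # an empty file, size 0
size=$(printf '%s' "$content" | wc -c | tr -d ' ')
oid=$(printf 'blob %s\0%s' "$size" "$content" | sha1sum | cut -d' ' -f1)

dir=$(printf '%s' "$oid" | cut -c1-2)        # first two hex chars -> directory name
file=$(printf '%s' "$oid" | cut -c3-)        # remaining 38 chars  -> file name

echo "object id: $oid"
echo "stored as: .git/objects/$dir/$file"
```

For the empty blob this gives e69de29bb2d1d6434b8b29ae775ad8c2e48c5391, the same ID that `git hash-object` reports for an empty file. The file3.txt blob cdd24eb6... lands in the 'cd' dir for exactly the same reason.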
$ git cat-file -p cdd24e
New file from branch
$ git cat-file -p e0e1cd
tree b8ffbaa7f9e5362315d65a2ee2ed82c755a0bd98
parent 3b9a54b298140d75494cc1e598fa83f3362f0d18
author Kamil.Kurzyna <kamilkurzyna@gmail.com> 1613080661 +0100
committer Kamil.Kurzyna <kamilkurzyna@gmail.com> 1613080661 +0100
branch commit
$ git cat-file -p b8ffba
100644 blob 5a3194d2db0000c18464869764e169276313f4bd file.txt
100644 blob 24e7dfa22219c967b5e1724d86404489cf93f7f0 file2.txt
100644 blob cdd24eb6ba81bd5922d2af60aeff0515be3e951e file3.txt
If we look closely, we will see two things: the commit has a parent, which is simply the previous commit (even though that one was made on master), and the tree object links every file name to its blob. Now if we merge the new_branch changes into master, master will contain all the files from new_branch, yet .git/objects does not change at all, because the merge was a fast-forward and created no new objects:
$ ls .git/objects/
24/ 3b/ 51/ 5a/ 6c/ 94/ a5/ b8/ cd/ e0/ info/ pack/
Great, but how does master know that it should use the newest tree as its current state? And how did the branch know which commit (if any) should become the parent of its first commit?
HEAD
HEAD is a magical word that often shows up in Stack Overflow answers and that we all use to solve our git problems. Except it's not magical. HEAD is basically a pointer, like '*' in C++. But what is it pointing to? To the branch we are currently on, and through that branch to its last commit. So if we take a look at our current situation (new_branch merged into master)
$ git log --oneline --decorate
e0e1cdf (HEAD -> new_branch, master) branch commit
3b9a54b second commit
a511dd7 my commit
we will see that new_branch and master both point to the last commit, e0e1cdf, and that HEAD points to new_branch, the branch we are on. Now if we make another commit on new_branch, we will get
$ git log --oneline --decorate
a1dabb1 (HEAD -> new_branch) head commit on branch
e0e1cdf (master) branch commit
3b9a54b second commit
a511dd7 my commit
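Now we see that the two branches no longer point to the same commit. Why? Every branch has its own head, a pointer to its newest commit, stored under .git/refs/heads/. HEAD (in capitals) in turn points to the branch we are currently on. Since we made the commit on new_branch, only the head of new_branch moved; master stayed where it was. When did new_branch get its own head? At the moment we created it. To get a better grasp of it (master_HEAD and new_branch_HEAD below are my shorthand for the heads of the two branches):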
1. Starting point
master -> master_HEAD = 3b9a54b
2. Creation of new_branch
master -> master_HEAD = 3b9a54b
new_branch -> new_branch_HEAD = master_HEAD = 3b9a54b
3. New commit in new_branch
master -> master_HEAD = 3b9a54b
new_branch -> new_branch_HEAD = e0e1cdf
4. Merge to master
master -> master_HEAD = e0e1cdf
new_branch -> new_branch_HEAD = master_HEAD = e0e1cdf
5. Adding a new commit in new_branch
master -> master_HEAD = e0e1cdf
new_branch -> new_branch_HEAD = a1dabb1
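The five steps above can be replayed in a throwaway repository. This is only a sketch: the temp directory, the file contents, the `commit` and `tips` helpers and the demo identity are my own, and the hashes you get will differ from the ones in this article, because a commit hash depends on the author and the timestamp.

```shell
set -eu
repo=$(mktemp -d)                            # throwaway repo, safe to delete afterwards
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master      # make sure the first branch is called master

commit() {                                   # helper: commit with a fixed demo identity
  git -c user.name=Demo -c user.email=demo@example.com commit -q -m "$1"
}
tips() {                                     # helper: print the commit each branch points to
  echo "$1 master=$(git rev-parse --short master) new_branch=$(git rev-parse --short new_branch)"
}

echo one > file.txt
git add file.txt
commit "second commit"                       # 1. starting point
git branch new_branch                        # 2. creation of new_branch
tips "step 2:"                               # both branches point to the same commit
git checkout -q new_branch
echo two > file3.txt
git add file3.txt
commit "branch commit"                       # 3. new commit in new_branch
tips "step 3:"                               # only new_branch moved
git checkout -q master
git merge -q --ff-only new_branch            # 4. merge to master (fast-forward)
tips "step 4:"                               # master caught up with new_branch
git checkout -q new_branch
echo three >> file3.txt
git add file3.txt
commit "head commit on branch"               # 5. adding new commit in new_branch
tips "step 5:"                               # new_branch is one commit ahead again
```

Each `tips` line should mirror the matching step in the list: equal hashes after steps 2 and 4, different ones after steps 3 and 5.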
Summary
After unraveling the true nature of branches and HEAD, we can see that all the work between branches is basically work on commits. So each time you see HEAD in a git command, you know it is just a pointer that resolves to a plain commit.
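To check that claim by hand, one can follow the pointer without asking git to do it. The sketch below assumes a fresh throwaway repository that uses git's default file-based ref storage, with refs that have not been packed yet (packed refs live in .git/packed-refs instead). The variable names are mine.

```shell
set -eu
repo=$(mktemp -d)                            # throwaway repo, safe to delete afterwards
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master      # make sure the first branch is called master
echo hello > file.txt
git add file.txt
git -c user.name=Demo -c user.email=demo@example.com commit -q -m "my commit"

head_content=$(cat .git/HEAD)                # HEAD is a small text file naming a branch
ref_path=${head_content#ref: }               # strip the "ref: " prefix
by_hand=$(cat ".git/$ref_path")              # the branch file holds one commit hash
by_git=$(git rev-parse HEAD)                 # what git itself resolves HEAD to

echo "HEAD      -> $head_content"
echo "by hand   -> $by_hand"
echo "rev-parse -> $by_git"
```

The last two lines should print the same hash: HEAD names a branch, and the branch names a commit.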
Peace!