Dear reader,
This blog post is for people who want to understand what it means to rebase a Pull Request submitted on GitHub. It does not explain how to perform the rebase (multiple posts already cover that part, and you'll find links to them at the end) but rather what happens behind the curtain and why it is needed.
For the rest of this post, I'll assume that
- you want to contribute to an open source project hosted on GitHub
- you need to rebase a Pull Request you have submitted and you do not know how or why
Please note that, to keep this post accessible to people who are not yet comfortable with git concepts, I will simplify and slightly twist how git rebases work. The main idea, however, should be accurately presented.
How rebases come into your life
If you are reading this, congratulations! It means you have submitted a Pull Request to an open source project hosted on GitHub, and that is already remarkable.
However it might happen that your PR (Pull Request) becomes out-of-date. This can happen for several reasons, the most frequent one being that files modified in your PR are in conflict with changes that have already been merged.
GitHub indicates this on the Pull Request page, typically with a message saying that the branch has conflicts that must be resolved.
And even if your PR meets all the other criteria to be merged, the merge cannot happen. GitHub will block your PR.
How git conflicts happen
How did we end up here? Here is probably what happened:
1) In order to create the PR, you forked the main repository and created a branch of your own. This branch started from the HEAD of the repository's main branch, that is, the latest commit available on the main branch. We'll say the main branch is called develop and your branch is my-branch.
Here is a small graph presenting the situation:
The blue commits are the develop commits, the red commits are the commits you added on your branch. Your branch my-branch starts from the last blue commit, the HEAD of develop.
The goal of your Pull Request is to merge the 2 red commits into the develop branch.
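The starting situation can be recreated in a throwaway repository. This is only a sketch: the file names (app.conf, lang.conf, theme.conf), the commit messages and the "Demo" identity are made up for illustration, and the two commits on develop stand in for the blue commits while the two on my-branch stand in for the red ones.

```shell
#!/bin/sh
set -eu

# Throwaway identity (hypothetical) so commits work without any git configuration.
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/develop   # name the initial branch develop

# Two "blue" commits on develop.
echo "greeting=hello" > app.conf
git add app.conf
git commit -q -m "blue 1"
echo "lang=en" > lang.conf
git add lang.conf
git commit -q -m "blue 2"

# my-branch starts from the HEAD of develop and receives two "red" commits.
git checkout -q -b my-branch
echo "greeting=hi" > app.conf
git commit -q -am "red 1"
echo "color=red" > theme.conf
git add theme.conf
git commit -q -m "red 2"

# Print both branches as a text graph.
git log --oneline --graph develop my-branch
```

At this point the history is a straight line: my-branch is simply develop plus two commits, so merging it would raise no conflict.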
2) At the time you created your branch, everything was fine. Your PR could have been merged without issues. However your PR was not merged immediately.
Maybe it needed some adjustments requested by the maintainers. Maybe it needed further exploration and testing to make sure it was free of bugs. Maybe you got feedback and could not handle it sooner because you were busy...
Whatever the circumstances, while your PR was idle, the work on the main repository continued! Other Pull Requests were merged, which means new blue commits were added on the develop branch.
3) Unfortunately, because a Pull Request always targets the latest commit on the targeted branch, it might not be mergeable anymore. If, for example, the 2 new blue commits on develop modify a file also modified by your PR, this might introduce a git conflict that prevents the merge of your PR.
And this is it: your PR cannot be merged.
At this point, a project maintainer is likely to ask you to rebase your PR on top of the targeted branch. Rebasing the branch used to create your PR will resolve the issue preventing the merge.
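The conflict described in this section can be reproduced locally. The sketch below is only an approximation of the check GitHub performs, not its actual implementation: it attempts the merge without committing, records whether git reported a conflict, then undoes the attempt. The file name and commit messages are hypothetical.

```shell
#!/bin/sh
set -eu

# Throwaway identity (hypothetical) so commits work without any git configuration.
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/develop

echo "greeting=hello" > app.conf
git add app.conf
git commit -q -m "blue 1"

# Your branch changes the greeting line.
git checkout -q -b my-branch
echo "greeting=hi" > app.conf
git commit -q -am "red 1"

# Meanwhile another Pull Request is merged: develop changes the same line.
git checkout -q develop
echo "greeting=good morning" > app.conf
git commit -q -am "blue 2"

# Try the merge without committing, record the outcome, then undo the attempt.
if git merge --no-commit --no-ff my-branch >/dev/null 2>&1; then
  mergeable=yes
else
  mergeable=no
fi
git merge --abort

echo "my-branch mergeable into develop: $mergeable"
```

Because both branches changed the same line starting from the same commit, git cannot pick a winner on its own, which is the situation that blocks the PR.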
Time to rebase
The solution is consequently to "rebase your PR on develop". But what does that mean?
Here is what is going to happen:
4) When you start the rebase, git is going to modify your branch. First, it will take all your commits and put them aside, like this:
Your red commits are removed from the branch. Gone. The branch looks like it went back into the past.
5) Git is going to bring the new blue commits of the develop branch inside your branch. This is the first main concept behind the rebase. Git is going to bring the latest changes of develop into your branch.
6) This situation is now equivalent to having started your branch from the latest commits. From git's point of view, since your branch contains all the blue commits from develop, it is just as if your branch had been started from the current HEAD, and not from the old one, two commits before.
7) It is now time for git to bring back the red commits it put aside. Git will apply them back on the branch. However, because the branch has changed, the red commits will be added onto the new HEAD of your branch.
8) Git is going to apply the commits one by one. For each commit, git will check whether there is a git conflict and, if there is one, will ask you to resolve it.
9) Finally, once any conflicts have been resolved, all your red commits will be put back into your branch. This is it. Your branch is now rebased, up-to-date with develop.
10) Now you can synchronize your fork with your local repository by pushing your changes and GitHub is going to update the Pull Request status, acknowledging its new content. The PR should now be mergeable.
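Steps 4 to 9 can be tried out in a throwaway repository. This is a minimal sketch, assuming a hypothetical app.conf file that both branches modify: the first red commit conflicts with develop and is resolved by hand, the second one applies cleanly. `GIT_EDITOR=true` is only there so that continuing the rebase does not open an editor.

```shell
#!/bin/sh
set -eu

# Throwaway identity (hypothetical) so commits work without any git configuration.
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/develop

echo "greeting=hello" > app.conf
git add app.conf
git commit -q -m "blue 1"

# Two red commits on my-branch.
git checkout -q -b my-branch
echo "greeting=hi" > app.conf
git commit -q -am "red 1"
echo "color=red" > theme.conf
git add theme.conf
git commit -q -m "red 2"

# Two new blue commits land on develop; the first one touches the same line as "red 1".
git checkout -q develop
echo "greeting=good morning" > app.conf
git commit -q -am "blue 2"
echo "lang=en" > lang.conf
git add lang.conf
git commit -q -m "blue 3"

# Rebase my-branch on develop: git replays the red commits one by one.
git checkout -q my-branch
conflict_seen=no
if ! git rebase develop >/dev/null 2>&1; then
  conflict_seen=yes
  # Resolve the conflict by keeping the value from the red commit.
  echo "greeting=hi" > app.conf
  git add app.conf
  GIT_EDITOR=true git rebase --continue >/dev/null
fi

# The red commits now sit on top of the latest blue commit.
git log --oneline --graph my-branch
```

The replayed red commits are new commits with new identifiers, even though their messages and changes are the same. That is what "rewriting history" refers to.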
Conclusion
The git rebase operation is actually a rewrite of your branch history, whose goal is to bring into your branch the changes that happened on the main repository branch. The strategy behind removing your own commits to re-apply them onto the updated HEAD aims to create a git history as clean as possible.
Some considerations
- There is another way to update your PR to fix the conflicts: merging develop into my-branch. A rebase is however usually preferred because merging develop into my-branch creates merge commits that make the git history a lot harder to read.
- Git rebase will re-apply your commits one after the other on the updated branch. This might be a very long operation if each commit conflicts with the main branch. If you have modified the same conflicting line 15 times, you will have 15 conflict resolutions to do! In this case, merging develop into my-branch is an interesting solution because you solve the conflicts only once.
- Since you have rewritten your git history, when you want to synchronize your fork with your local repository, you will not be able to use a plain git push. You will need to use the --force option.
- When submitting a PR, you can grant maintainers the right to modify the branch that you used to submit the PR. Maintainers granted this right have the ability to perform the rebase themselves, but this is a lot more complex for them because they are not the PR author. When resolving git conflicts, it is very useful to know how the PR was built to understand how the conflict should be resolved.
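The need for a forced push can be observed with a local bare repository standing in for your fork. This is a sketch under assumptions: the fork.git directory, file names and commit messages are hypothetical, and it uses `--force-with-lease`, a more cautious variant of `--force` that refuses to overwrite the remote branch if someone else has pushed to it in the meantime.

```shell
#!/bin/sh
set -eu

# Throwaway identity (hypothetical) so commits work without any git configuration.
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

work=$(mktemp -d)
git init -q --bare "$work/fork.git"        # stands in for your fork on GitHub
git init -q "$work/local"
cd "$work/local"
git symbolic-ref HEAD refs/heads/develop
git remote add origin "$work/fork.git"

echo "one" > file.txt
git add file.txt
git commit -q -m "blue 1"

# Push the branch once, as when the Pull Request was opened.
git checkout -q -b my-branch
echo "red" > mine.txt
git add mine.txt
git commit -q -m "red 1"
git push -q origin my-branch

# develop moves on, then my-branch is rebased on top of it.
git checkout -q develop
echo "two" >> file.txt
git commit -q -am "blue 2"
git checkout -q my-branch
git rebase -q develop

# A plain push is refused because the rebased history no longer extends the pushed one.
if git push -q origin my-branch 2>/dev/null; then
  plain_push=accepted
else
  plain_push=rejected
fi
echo "plain push: $plain_push"

# A forced push replaces the remote branch with the rebased one.
git push -q --force-with-lease origin my-branch
```

On a real fork, the forced push is what makes GitHub pick up the rebased commits and refresh the Pull Request.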
After theory comes practice
Here are some great tutorials explaining the "how" to rebase a PR submitted on GitHub:
- How to Rebase a Pull Request on edx-platform
- How To Rebase and Update a Pull Request by Lisa Tagliaferri on digitalocean.com
- How to rebase a GitHub pull request by Aurelien Navarre