This repository was archived by the owner on Nov 18, 2021. It is now read-only.

amending PR branches via force-push causes unreachable commits to be garbage-collected #997

@aspiers

When you want to amend a pull request, you have a choice of two ways to do it:

  1. Push new commits on top of the head of the existing PR branch
  2. Force-push (git push -f) a new head to the PR branch, thereby rewriting history.

Unfortunately, both currently have significant problems, as explained in blog posts by @jd, myself, and probably others. @jd subsequently blogged that GitHub's more recent improvements are good enough for him, but the problems with rewriting PR history that he originally pointed out remain.

This issue is for tracking just one of those problems: when you rewrite PR history, there is no guarantee that commits from older versions of the PR will be preserved, unless they are still reachable from the PR branch head. (See #998 and #999 for other, separate problems with force-pushing to PR branches.)
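
The reachability problem can be illustrated with a toy model. The following is only a sketch in Python, not how GitHub or git store anything: the commit names (`base`, `v1`, `v2`, `fixup`), the ref names, and the helper functions are all hypothetical, and commits are reduced to a mapping from name to parent names.

```python
from typing import Dict, Iterable, List, Set


def reachable(parents: Dict[str, List[str]], heads: Iterable[str]) -> Set[str]:
    """Return every commit reachable from `heads` by following parent links."""
    seen: Set[str] = set()
    stack = list(heads)
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents[commit])
    return seen


def orphans(parents: Dict[str, List[str]], refs: Dict[str, str]) -> Set[str]:
    """Commits stored in the repository that no ref can reach."""
    return set(parents) - reachable(parents, refs.values())


# Hypothetical commit names: "base" is on main, "v1" is the first PR version.
# Option 1: push a new commit "fixup" on top of the existing PR head.
appended = {"base": [], "v1": ["base"], "fixup": ["v1"]}
appended_orphans = orphans(appended, {"main": "base", "pr-branch": "fixup"})

# Option 2: force-push a rewritten commit "v2" that replaces "v1".
rewritten = {"base": [], "v1": ["base"], "v2": ["base"]}
rewritten_orphans = orphans(rewritten, {"main": "base", "pr-branch": "v2"})

print(appended_orphans)   # empty: v1 is still an ancestor of the branch head
print(rewritten_orphans)  # contains only "v1", now a garbage-collection candidate
```

In this model, option 1 leaves no orphans, while option 2 leaves `v1` with no ref pointing at it or at any of its descendants, which is exactly the state in which a garbage collector is free to delete it.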

If these older commits are no longer reachable, for some period of time it will still be possible to view them by constructing the correct https://github.com/USER/REPO/commit/SHA1 URL, or to recover them from GitHub's "reflog" equivalent, but eventually they will be garbage-collected by GitHub's backend.
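
As a small illustration of the URL pattern above, here is a hedged Python sketch. The helper name `commit_url` and the example SHA-1 are made up for illustration, and insisting on a full 40-character hash is an assumption made for safety, not a documented GitHub requirement. The old SHA-1 itself has to come from somewhere that still remembers it, for example a local `git reflog`.

```python
import re

# Assumption: only accept a full 40-character hexadecimal SHA-1.
_FULL_SHA1 = re.compile(r"[0-9a-f]{40}")


def commit_url(user: str, repo: str, sha1: str) -> str:
    """Build the web URL for a single commit, following the pattern
    https://github.com/USER/REPO/commit/SHA1."""
    sha1 = sha1.strip().lower()
    if not _FULL_SHA1.fullmatch(sha1):
        raise ValueError(f"not a full SHA-1: {sha1!r}")
    return f"https://github.com/{user}/{repo}/commit/{sha1}"


# Hypothetical SHA-1 of a PR head that was later replaced by a force-push.
old_head = "0123456789abcdef0123456789abcdef01234567"
print(commit_url("USER", "REPO", old_head))
```

Whether the resulting page still loads depends entirely on whether the backend has garbage-collected the commit yet, which is the uncertainty this issue is about.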

This is in stark contrast to Gerrit, which preserves every version of a patch set for every review.

Since we can't easily trigger GitHub's garbage collection, it's hard to definitively prove this is a genuine issue. However, there is some strong evidence:

I also attempted to prove it via an experiment in aspiers/test#3. I previously thought that if the test commit was still viewable a few months after writing this, then maybe this was not a genuine issue. But at the time of writing (editing) this text, the commit is still visible, so bearing in mind the above evidence, it's more likely that something else is going on. For example, maybe my attempt to construct an obfuscated URL, so that referencing the commit would not accidentally prevent it from being garbage-collected, didn't work. Or maybe garbage collection is triggered not by the age of orphans but by the number of orphans, or when the number of objects exceeds a threshold, in which case my test repository is useless for this experiment since it has almost zero traffic. GitHub might even be using git-gc(1) behind the scenes.
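
To make the two hypotheses concrete, here is a rough Python sketch comparing an age-only policy with a threshold-gated policy modelled loosely on stock `git gc --auto`. Nothing here is known about GitHub's backend. The two constants are the stock git defaults for `gc.auto` and `gc.pruneExpire`, the `RepoState` type and both functions are hypothetical, and the model ignores other triggers such as the pack-count limit.

```python
from dataclasses import dataclass

# Stock git defaults, used purely for illustration.
# Assumption: GitHub's real policy is unknown and may differ completely.
GC_AUTO_LOOSE_OBJECTS = 6700  # default value of gc.auto
PRUNE_EXPIRE_DAYS = 14        # default value of gc.pruneExpire ("2.weeks.ago")


@dataclass
class RepoState:
    loose_objects: int    # loose objects currently in the repository
    orphan_age_days: int  # age of the unreachable commit in days


def pruned_by_age_only(state: RepoState) -> bool:
    """Hypothesis A: orphans are pruned once they are old enough."""
    return state.orphan_age_days > PRUNE_EXPIRE_DAYS


def pruned_by_auto_gc(state: RepoState) -> bool:
    """Hypothesis B: pruning only happens when a collection is triggered by
    the loose-object count, and then only removes old-enough orphans."""
    gc_runs = state.loose_objects > GC_AUTO_LOOSE_OBJECTS
    return gc_runs and state.orphan_age_days > PRUNE_EXPIRE_DAYS


# Hypothetical numbers: a near-idle test repository vs. a busy one.
quiet_test_repo = RepoState(loose_objects=20, orphan_age_days=120)
busy_repo = RepoState(loose_objects=10_000, orphan_age_days=120)

for name, state in [("quiet", quiet_test_repo), ("busy", busy_repo)]:
    print(name, pruned_by_age_only(state), pruned_by_auto_gc(state))
```

Under the threshold-gated model, the months-old orphan in a near-idle repository is never pruned because no collection is ever triggered, which would be consistent with the test commit still being visible.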

Metadata

    Labels

    code review, parity (Features that GitHub is missing, but competitors implement; also, see [Migration]), pull-requests
