And the forking Microsoft-owned code warehouse doesn't see this as much of a problem
Researchers at Truffle Security have found, or arguably rediscovered, that data from deleted GitHub repositories (public or private) and from deleted copies (forks) of repositories isn't necessarily deleted.
Joe Leon, a security researcher with the outfit, said in an advisory on Wednesday that being able to access deleted repo data – such as APIs keys – represents a security risk. And he proposed a new term to describe the alleged vulnerability: Cross Fork Object Reference (CFOR).
"A CFOR vulnerability occurs when one repository fork can access sensitive data from another fork (including data from private and deleted forks)," Leon explained.
For example, the firm showed how one can fork a repository, commit data to it, delete the fork, and then access the supposedly deleted commit data via the original repository.
The researchers also created a repo, forked it, and showed how data not synced with the fork continues to be accessible through the fork after the original repo is deleted. You can watch that particular demo.
Well, sort of. GitHub certainly could refuse to render orphan commits. They pop up a banner saying so but I don't see why they should show the commit at all. They could still keep the data until it's garbage collected since a user might re-upload the commit in a new branch.
This seems like a non-issue though since someone who hasn't already seen the disclosed information would need to somehow determine the hash of the deleted commit.
Ah - Actually reading the article reveals why this is actually an issue:
What's more, Ayrey explained, you don't even need the full identifying hash to access the commit. "If you know the first four characters of the identifier, GitHub will almost auto-complete the rest of the identifier for you," he said, noting that with just sixty-five thousand possible combinations for those characters, that's a small enough number to test all the possibilities.
So enumerating all the orphan commits wouldn't be that hard.
In any case if a secret has been publicly disclosed, you should always assume it's still out there. For sure, rotate your keys.
The article is specifically about how GitHub forks are not the same as a git clone. A clone isn’t accessible from the upstream without the upstream pulling the changes, but this vulnerability points out that a fork on GitHub is accessible from the upstream without a pull, even if the fork is private.
It’s because GitHub under the hood doesn’t actually do a real clone so that they can save on disk usage.
How can such a wrong answer get so many points? Clones and forge forks are unrelated. First, GitHub or GitLab cannot and could not link clones together without analyzing the remotes of each clone.