
Complete Guide to Permanently Removing Files from Git History

This guide explains how to permanently remove files from Git history using two popular methods: BFG Repo-Cleaner and git filter-repo. Both methods help you rewrite Git history to remove sensitive or unnecessary files while taking safety considerations into account.


Prerequisites

  1. Backup Your Repository: Rewriting Git history is destructive and irreversible. Always create a backup:

    git clone --mirror <repository-url> backup-repo
  2. Understand the Impact: History rewriting changes commit hashes, which will affect collaborators. You'll need to force-push the rewritten history, and others will need to re-clone.

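  3. Install Required Tools: BFG Repo-Cleaner requires Java, and git filter-repo requires Python 3. Installation steps are given under each method below.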
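
A backup is only useful if it is complete. The sketch below is one possible way to sanity-check the mirror from step 1 before rewriting anything: it runs git fsck on the backup and compares the ref lists of both repositories. The helper names refs_digest and backup_matches, and the throwaway demo paths, are assumptions made for illustration and are not part of Git or either tool.

```shell
#!/bin/sh
set -eu

# refs_digest REPO
# Print every ref and the object it points to, in a stable order.
refs_digest() {
    git -C "$1" for-each-ref --format='%(objectname) %(refname)' | sort
}

# backup_matches SOURCE BACKUP
# Succeed only if BACKUP passes git fsck and has exactly the same refs as SOURCE.
backup_matches() {
    git -C "$2" fsck --no-dangling >/dev/null 2>&1 || return 1
    [ "$(refs_digest "$1")" = "$(refs_digest "$2")" ]
}

# Demo on a throwaway repository (hypothetical paths under a temp directory).
work=$(mktemp -d)
git init -q "$work/src"
git -C "$work/src" -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "initial commit"
git clone -q --mirror "$work/src" "$work/backup-repo"

if backup_matches "$work/src" "$work/backup-repo"; then
    echo "backup OK"
else
    echo "backup differs from source"
fi
```

When comparing against a hosted remote rather than a local path, the ref list can differ slightly (for example, hosting-specific refs), so treat a mismatch as a prompt to investigate rather than proof of a bad backup.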


Method 1: Using BFG Repo-Cleaner

Step 1: Install BFG

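Download the BFG jar file from the official BFG Repo-Cleaner project site. The released file name includes a version number, so either rename it to bfg.jar or adjust the commands below to match.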

Step 2: Clone the Repository

Clone your repository as a bare mirror, which includes every branch and tag:

git clone --mirror <repository-url>
cd <repository-name>.git

Step 3: Run BFG

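Run BFG with the appropriate flags to remove unwanted files. By default BFG protects the contents of your latest commit (HEAD), so first delete the unwanted file in an ordinary commit and push it. Examples: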

  • Remove a specific file:
    java -jar bfg.jar --delete-files <file-name>
  • Remove files larger than a certain size:
    java -jar bfg.jar --strip-blobs-bigger-than 100M
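
Before choosing a size threshold, it can help to see which blobs in history actually exceed it. The following is a minimal sketch that uses only standard Git plumbing (git rev-list and git cat-file); the function name list_large_blobs and the demo file names are hypothetical.

```shell
#!/bin/sh
set -eu

# list_large_blobs REPO MIN_BYTES
# Print "<size-in-bytes> <path>" for every blob reachable from any ref
# whose size is at least MIN_BYTES, including files deleted long ago.
list_large_blobs() {
    git -C "$1" rev-list --objects --all |
        git -C "$1" cat-file --batch-check='%(objecttype) %(objectsize) %(rest)' |
        awk -v min="$2" '$1 == "blob" && $2 >= min { sub(/^blob /, ""); print }'
}

# Demo on a throwaway repository (all names below are hypothetical).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
dd if=/dev/zero of=big.bin bs=1024 count=2 2>/dev/null   # 2048-byte file
echo "hello" > small.txt
git add big.bin small.txt
git commit -q -m "add files"
git rm -q big.bin
git commit -q -m "delete big.bin from the working tree only"

# big.bin is gone from the latest commit but is still stored in history.
list_large_blobs "$repo" 1024
```

The sizes are uncompressed blob sizes in bytes, so convert a threshold such as 100M to bytes (104857600) before passing it in.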

Step 4: Clean and Push

After running BFG, clean the repository and push the changes:

git reflog expire --expire=now --all && git gc --prune=now --aggressive
git push --force
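
Before force-pushing, it is worth confirming that the file really is gone from every branch and tag. A minimal sketch of such a check follows, assuming the file is identified by its path; the helper name path_in_history and the demo file names are illustrative only. After a successful cleanup, the removed path should report "not in history".

```shell
#!/bin/sh
set -eu

# path_in_history REPO PATH
# Succeed if any commit reachable from any ref adds, changes, or deletes PATH.
path_in_history() {
    [ -n "$(git -C "$1" log --all --format=%H -- "$2")" ]
}

# Demo on a throwaway repository (hypothetical file names).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
echo "token" > secret.txt
echo "docs" > README
git add secret.txt README
git commit -q -m "initial"
git rm -q secret.txt
git commit -q -m "remove secret from the working tree"

# Deleting a file in a new commit does not remove it from history.
for path in secret.txt README never-added.txt; do
    if path_in_history "$repo" "$path"; then
        echo "$path: still in history"
    else
        echo "$path: not in history"
    fi
done
```

The check only covers commits reachable from refs, which is what a push transfers; unreachable objects are handled by the reflog expiry and garbage collection shown above.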

Method 2: Using git filter-repo

Step 1: Install git filter-repo

Install git filter-repo using your package manager:

  • For macOS:
    brew install git-filter-repo
  • For Linux (with pip):
    pip install git-filter-repo

Step 2: Clone the Repository

Clone the repository locally (not as a bare repo):

git clone <repository-url>
cd <repository-name>

Step 3: Run git filter-repo

Run one of the following commands based on your needs:

  • Remove a specific file:
    git filter-repo --path <file-name> --invert-paths
  • Remove files matching a pattern:
    git filter-repo --path-glob '*.log' --invert-paths
  • Remove blobs larger than a certain size:
    git filter-repo --strip-blobs-bigger-than 100M

Step 4: Clean and Push

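git filter-repo removes the origin remote after rewriting, as a safeguard against pushing by accident. Re-add it, then force-push the cleaned history (repeat the push with --tags in place of --all if the repository has tags):

git remote add origin <repository-url>
git push origin --force --all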

Safety Considerations

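  1. Communicate with Collaborators: Inform collaborators about the history rewrite before you push. They should re-clone the repository rather than pull, because merging an old clone into the rewritten history brings the removed files back.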

  2. Double-Check Files to Remove: Verify which files are being removed to avoid accidentally deleting essential data.

  3. Test the Modified Repository: Clone the rewritten repository into a separate directory and verify its integrity:

    git clone <modified-repo-url> test-repo
  4. Protect Repository Access: If sensitive data (e.g., passwords, API keys) was exposed, rotate the credentials immediately.
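
When the concern is a leaked value rather than a whole file, the history can be searched by content. The sketch below uses Git's pickaxe option (git log -S) to list commits whose changes add or remove a given string; the function name commits_touching_string and the sample key are hypothetical. It is a quick check, not a replacement for rotating the credential.

```shell
#!/bin/sh
set -eu

# commits_touching_string REPO STRING
# Print the hashes of commits (on any ref) whose diff changes the number
# of occurrences of STRING, i.e. commits that add or remove it.
commits_touching_string() {
    git -C "$1" log --all --format=%H -S"$2"
}

# Demo on a throwaway repository (the key value is made up).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
echo "API_KEY=abc123" > config.env
git add config.env
git commit -q -m "add config"
echo "API_KEY=changeme" > config.env
git commit -q -am "scrub key from the latest version"

# The key no longer appears in the latest commit, but two commits still
# reference it: the one that added it and the one that removed it.
commits_touching_string "$repo" "abc123"
```

An empty result after the rewrite suggests the string is no longer reachable from any branch or tag in that clone; copies held by other clones, forks, or the hosting service are outside what this check can see.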


Key Differences Between BFG and git filter-repo

Feature       | BFG Repo-Cleaner             | git filter-repo
--------------|------------------------------|--------------------------------
Ease of Use   | High (simpler commands)      | Medium (more customizable)
Performance   | Faster for simple operations | Optimized for complex use cases
Customization | Limited                      | Extensive
Installation  | Requires Java                | Python-based (requires Python 3)

Troubleshooting

  • Error: Repository too large: Use --strip-blobs-bigger-than (available in both tools) to remove oversized files from history.

  • Force-Pushing Issues: Ensure you have the necessary permissions to force-push to the remote repository.

  • Collaborator Issues: Share this guide with collaborators to help them re-clone and reset their local repositories.
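
For collaborators who prefer to reset an existing clone instead of re-cloning, the usual pattern is a fetch followed by a hard reset to the rewritten branch. The sketch below simulates the whole sequence locally; the bare repository standing in for the remote, the branch name main, and the file names are all assumptions for the demo.

```shell
#!/bin/sh
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)

# A bare repository stands in for the shared remote.
git init -q --bare "$work/remote.git"

# The maintainer publishes a commit containing a file that should not exist.
git clone -q "$work/remote.git" "$work/maintainer" 2>/dev/null
cd "$work/maintainer"
git symbolic-ref HEAD refs/heads/main
echo "token" > secret.txt
git add secret.txt
git commit -q -m "add secret"
git push -q origin main

# A collaborator clones while the bad commit is still published.
git clone -q --branch main "$work/remote.git" "$work/collaborator"

# The maintainer rewrites history (simulated here with --amend) and force-pushes.
git rm -q secret.txt
echo "docs" > README
git add README
git commit -q --amend -m "add README"
git push -q --force origin main

# The collaborator discards the old history and adopts the rewritten branch.
cd "$work/collaborator"
git fetch -q origin
git reset -q --hard origin/main
echo "collaborator is now at $(git rev-parse --short HEAD)"
```

A hard reset throws away uncommitted work and local commits on that branch, so collaborators should save anything they need first. The old commits also linger in the collaborator's local object store until the reflog expires, which is one reason a fresh clone is the safer default.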


Conclusion

Using BFG Repo-Cleaner or git filter-repo allows you to efficiently and permanently remove files from Git history. Always prioritize safety by backing up your repository and communicating with your team before making irreversible changes.

