How to un-submodule a Git submodule?

What are the best practices for un-submoduling a Git submodule, bringing all the code back into the core repository?


Solution 1:

If all you want is to put your submodule code into the main repository, you just need to remove the submodule and re-add the files into the main repo:

git rm --cached submodule_path # delete reference to submodule HEAD (no trailing slash)
git rm .gitmodules             # if you have more than one submodules,
                               # you need to edit this file instead of deleting!
rm -rf submodule_path/.git     # make sure you have backup!!
git add submodule_path         # will add files instead of commit reference
git commit -m "remove submodule"

If you also want to preserve the history of the submodule, you can do a small trick: “merge” the submodule into the main repository, so that the result will be the same as it was before, except that the submodule files are now in the main repository.

In the main module you will need to do the following:

# Fetch the submodule commits into the main repository
git remote add submodule_origin git://url/to/submodule/origin
git fetch submodule_origin

# Start a fake merge (won't change any files, won't commit anything)
git merge -s ours --no-commit submodule_origin/master

# Do the same as in the first solution
git rm --cached submodule_path # delete reference to submodule HEAD
git rm .gitmodules             # if you have more than one submodules,
                               # you need to edit this file instead of deleting!
rm -rf submodule_path/.git     # make sure you have backup!!
git add submodule_path         # will add files instead of commit reference

# Commit and cleanup
git commit -m "removed submodule"
git remote rm submodule_origin

The resulting repository will look a bit weird: there will be more than one initial commit. But it won’t cause any problems for Git.

A big advantage of this second solution is that you can still run git blame or git log on the files which were originally in submodules. In fact, what happens here is just a renaming of many files inside one repository, and Git should automatically detect this. If you still have problems with git log, try some options (e.g., --follow, -M, -C) which do better rename and copy detection.

Solution 2:

I've created a script that will translate a submodule to a simple directory, while retaining all file history. It doesn't suffer from the git log --follow <file> issues that the other solutions suffer from. It's also a very easy one-line invocation that does all of the work for you. G'luck.

It builds on the excellent work by Lucas Jenß, described in his blog post "Integrating a submodule into the parent repository", but automates the entire process and cleans up a few other corner cases.

The latest code will be maintained with bugfixes on github at https://github.com/jeremysears/scripts/blob/master/bin/git-submodule-rewrite, but for the sake of proper stackoverflow answer protocol, I've included the solution in its entirety below.

Usage:

$ git-submodule-rewrite <submodule-name>

git-submodule-rewrite:

#!/usr/bin/env bash

# This script builds on the excellent work by Lucas Jenß, described in his blog
# post "Integrating a submodule into the parent repository", but automates the
# entire process and cleans up a few other corner cases.
# https://x3ro.de/2013/09/01/Integrating-a-submodule-into-the-parent-repository.html

function usage() {
  echo "Merge a submodule into a repo, retaining file history."
  echo "Usage: $0 <submodule-name>"
  echo ""
  echo "options:"
  echo "  -h, --help                Print this message"
  echo "  -v, --verbose             Display verbose output"
}

function abort {
    echo "$(tput setaf 1)$1$(tput sgr0)"
    exit 1
}

function request_confirmation {
    read -p "$(tput setaf 4)$1 (y/n) $(tput sgr0)"
    [ "$REPLY" == "y" ] || abort "Aborted!"
}

function warn() {
  cat << EOF
    This script will convert your "${sub}" git submodule into
    a simple subdirectory in the parent repository while retaining all
    contents and file history.

    The script will:
      * delete the ${sub} submodule configuration from .gitmodules and
        .git/config and commit it.
      * rewrite the entire history of the ${sub} submodule so that all
        paths are prefixed by ${path}.
        This ensures that git log will correctly follow the original file
        history.
      * merge the submodule into its parent repository and commit it.

    NOTE: This script might completely garble your repository, so PLEASE apply
    this only to a fresh clone of the repository where it does not matter if
    the repo is destroyed.  It would be wise to keep a backup clone of your
    repository, so that you can reconstitute it if need be.  You have been
    warned.  Use at your own risk.

EOF

  request_confirmation "Do you want to proceed?"
}

function git_version_lte() {
  OP_VERSION=$(printf "%03d%03d%03d%03d" $(echo "$1" | tr '.' '\n' | head -n 4))
  GIT_VERSION=$(git version)
  GIT_VERSION=$(printf "%03d%03d%03d%03d" $(echo "${GIT_VERSION#git version}" | tr '.' '\n' | head -n 4))
  echo -e "${GIT_VERSION}\n${OP_VERSION}" | sort | head -n1
  [ ${OP_VERSION} -le ${GIT_VERSION} ]
}

function main() {

  warn

  if [ "${verbose}" == "true" ]; then
    set -x
  fi

  # Remove submodule and commit
  git config -f .gitmodules --remove-section "submodule.${sub}"
  if git config -f .git/config --get "submodule.${sub}.url"; then
    git config -f .git/config --remove-section "submodule.${sub}"
  fi
  rm -rf "${path}"
  git add -A .
  git commit -m "Remove submodule ${sub}"
  rm -rf ".git/modules/${sub}"

  # Rewrite submodule history
  local tmpdir="$(mktemp -d -t submodule-rewrite-XXXXXX)"
  git clone "${url}" "${tmpdir}"
  pushd "${tmpdir}"
  local tab="$(printf '\t')"
  local filter="git ls-files -s | sed \"s/${tab}/${tab}${path}\//\" | GIT_INDEX_FILE=\${GIT_INDEX_FILE}.new git update-index --index-info && mv \${GIT_INDEX_FILE}.new \${GIT_INDEX_FILE}"
  git filter-branch --index-filter "${filter}" HEAD
  popd

  # Merge in rewritten submodule history
  git remote add "${sub}" "${tmpdir}"
  git fetch "${sub}"

  if git_version_lte 2.8.4
  then
    # Previous to git 2.9.0 the parameter would yield an error
    ALLOW_UNRELATED_HISTORIES=""
  else
    # From git 2.9.0 this parameter is required
    ALLOW_UNRELATED_HISTORIES="--allow-unrelated-histories"
  fi

  git merge -s ours --no-commit ${ALLOW_UNRELATED_HISTORIES} "${sub}/master"
  rm -rf tmpdir

  # Add submodule content
  git clone "${url}" "${path}"
  rm -rf "${path}/.git"
  git add "${path}"
  git commit -m "Merge submodule contents for ${sub}"
  git config -f .git/config --remove-section "remote.${sub}"

  set +x
  echo "$(tput setaf 2)Submodule merge complete. Push changes after review.$(tput sgr0)"
}

set -euo pipefail

declare verbose=false
while [ $# -gt 0 ]; do
    case "$1" in
        (-h|--help)
            usage
            exit 0
            ;;
        (-v|--verbose)
            verbose=true
            ;;
        (*)
            break
            ;;
    esac
    shift
done

declare sub="${1:-}"

if [ -z "${sub}" ]; then
  >&2 echo "Error: No submodule specified"
  usage
  exit 1
fi

shift

if [ -n "${1:-}" ]; then
  >&2 echo "Error: Unknown option: ${1:-}"
  usage
  exit 1
fi

if ! [ -d ".git" ]; then
  >&2 echo "Error: No git repository found.  Must be run from the root of a git repository"
  usage
  exit 1
fi

declare path="$(git config -f .gitmodules --get "submodule.${sub}.path")"
declare url="$(git config -f .gitmodules --get "submodule.${sub}.url")"

if [ -z "${path}" ]; then
  >&2 echo "Error: Submodule not found: ${sub}"
  usage
  exit 1
fi

if ! [ -d "${path}" ]; then
  >&2 echo "Error: Submodule path not found: ${path}"
  usage
  exit 1
fi

main

Solution 3:

Since git 1.8.5 (Nov 2013) (without keeping the history of the submodule):

mv yoursubmodule yoursubmodule_tmp
git submodule deinit yourSubmodule
git rm yourSubmodule
mv yoursubmodule_tmp yoursubmodule
git add yoursubmodule

That will:

  • unregister and unload (ie delete the content of) the submodule (deinit, hence the mv first),
  • clean up the .gitmodules for you (rm),
  • and remove the special entry representing that submodule SHA1 in the index of the parent repo (rm).

Once the removal of the submodule is complete (deinit and git rm), you can rename the folder back to its original name and add it to the git repo as a regular folder.

Note: if the submodule was created by an old Git (< 1.8), you might need to remove the nested .git folder within the submodule itself, as commented by Simon East


If you need to keep the history of the submodule, see jsears's answer, which uses git filter-branch.

Solution 4:

  1. git rm --cached the_submodule_path
  2. remove the submodule section from the .gitmodules file, or if it's the only submodule, remove the file.
  3. do a commit "removed submodule xyz"
  4. git add the_submodule_path
  5. another commit "added codebase of xyz"

I didn't find any easier way yet. You can compress 3-5 into one step via git commit -a - matter of taste.