Git: How to make outer repository and embedded repository work as common/standalone repository?

I have a big project (call it repo A) with one child folder that comes from repo B. I get the warning below when I commit from repo A:

warning: adding embedded git repository: extractor/annotator-server
hint: You've added another git repository inside your current repository.
hint: Clones of the outer repository will not contain the contents of
hint: the embedded repository and will not know how to obtain it.
hint: If you meant to add a submodule, use:
hint:
hint:   git submodule add <url> extractor/annotator-server
hint:
hint: If you added this path by mistake, you can remove it from the
hint: index with:
hint:
hint:   git rm --cached extractor/annotator-server
hint:
hint: See "git help submodule" for more information.

I have seen git-submodule and git-subtree:

Maintaining Git repo inside another git repo

https://www.atlassian.com/blog/git/alternatives-to-git-submodule-git-subtree

But I don't like them, because they need extra configuration.


What I want is, for example, a structure like:

A/
--- a.py

--- B/
--- B/b.py

When I change B/b.py:

  1. If I am in A/, git add detects that B/b.py changed, and git push pushes that commit only to repo A.

    git add .        (would add changes under A/)
    git push         (would push changes under A/)
    git pull         (would pull changes under A/)
    git clone XXX:A  (would clone all files under A/; A/B/ just looks like a plain folder with all its files, not a repo)
    
  2. If I am in A/B/, git add only adds the B/b.py changes to repo B, and git push pushes that commit only to repo B.

    git add .        (would add changes under B/, but not add changes to repo A)
    git push         (would push changes under B/, but not push changes to repo A)
    git pull         (would pull changes under B/)
    git clone XXX:B  (would clone all files under B/)
    
  3. When I want to sync A and B on another machine, I just do:

    git clone A
    cd A
    rm -rf B/
    git clone B ./B
    git add . && git commit -m 'sync with B'
    

In other words, A and B each act as a standalone repo.

But in reality, repo A treats repo B as a submodule:

A repo https://github.com/eromoe/test

B repo https://github.com/eromoe/test2


How do I force repo A to track all files under A/, and repo B to track all files under A/B/? I want A and B to each act as a self-contained repo, without any other configuration.


You can use the commands below to add the files from the test2 repo to the test repo:

# In local test repo
rm -rf test2
git clone https://github.com/eromoe/test2
git add test2/
git commit -am 'add files from test2 repo to test repo'
git push

Note:

You should use git add test2/ (with a trailing slash), not git add test2.

git add test2/ treats the test2 folder and its files as an ordinary folder and ordinary files in the test repo (create mode 100644).

git add test2 treats the test2 folder as a submodule of the test repo (create mode 160000).
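
To see which of the two modes a path was actually staged with, the index can be inspected with git ls-files --stage. The following is only a minimal sketch in a throwaway sandbox; the directory names outer and inner and the demo identity are assumptions made for illustration and are not the test/test2 repos above. It demonstrates only the no-slash case.

```shell
# Throwaway sandbox; "outer" and "inner" are hypothetical names.
sandbox=$(mktemp -d)
git init -q "$sandbox/outer"
git init -q "$sandbox/outer/inner"

# The inner repo needs at least one commit before it can be added as a gitlink.
echo 'print("b")' > "$sandbox/outer/inner/b.py"
git -C "$sandbox/outer/inner" add b.py
git -C "$sandbox/outer/inner" -c user.name=demo -c user.email=demo@example.com \
    commit -q -m "inner commit"

# Add the embedded repo WITHOUT a trailing slash (the warning goes to stderr).
git -C "$sandbox/outer" add inner 2>/dev/null

# The first field of each "ls-files --stage" line is the file mode.
mode=$(git -C "$sandbox/outer" ls-files --stage inner | cut -d' ' -f1)
echo "$mode"

rm -rf "$sandbox"
```

A mode of 160000 means the path is recorded as a single gitlink (submodule-style) entry; ordinary files show up as separate entries, typically with mode 100644.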


Git has probably remembered (cached) the folder as a repository. This is what helped me:

    git rm --cached your_folder_with_repo
    git commit -m "remove cached repo"
    git add your_folder_with_repo/
    git commit -m "Add folder"
    git push

Manual, brute-force method:

For anyone landing on this page whose goal is just to archive a bunch of git repos inside a bigger parent repo, the simplest brute-force solution is to rename all nested .git folders to anything else, for example to ..git. Now, git add -A will add them all just like any other normal folder inside the parent git project, and you can git commit everything inside the parent repo easily. Done.
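
The manual rename can be sketched as a small shell function. This is an illustration only: disable_nested_repos is a hypothetical helper name, not part of git and not the script described below, and it assumes directory names without newlines or glob characters.

```shell
# Hypothetical helper: rename every nested ".git" directory under "$1"
# to "..git", leaving the top-level "$1/.git" untouched.
disable_nested_repos() {
    root=${1:-.}
    root=${root%/}   # drop one trailing slash so the -path test below matches
    # Skip the parent repo's own .git; for every other .git directory,
    # do not descend into it (-prune) and print its path.
    find "$root" -path "$root/.git" -prune -o -type d -name .git -prune -print |
    while IFS= read -r gitdir; do
        # "./B/.git" -> "./B/..git"
        mv -- "$gitdir" "${gitdir%/.git}/..git"
    done
}
```

Usage, from the parent repo's root: run disable_nested_repos . and then git add -A. Nested .git entries that are files rather than directories (as used by submodules and worktrees) are deliberately left alone by the -type d test.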

Automatic, brute-force method:

Use git-disable-repos.sh

(Part of https://github.com/ElectricRCAircraftGuy/eRCaGuy_dotfiles).

I just wrote this script over the weekend and have already used it on a number of projects. It works very well! See the comments in the top of the file for details and installation, and run git disable-repos -h for the help menu.

Installation:

git clone https://github.com/ElectricRCAircraftGuy/eRCaGuy_dotfiles.git
cd eRCaGuy_dotfiles/useful_scripts
mkdir -p ~/bin
ln -si "${PWD}/git-disable-repos.sh" ~/bin/git-disable-repos
# If this is the first time using your ~/bin dir, log out and
# log back in now. Otherwise, just re-source your .bashrc file:
. ~/.bashrc

Here is the standard usage pattern:

cd path/to/parent/repo
# Do a dry-run to see which repos will be temporarily disabled
git disable-repos --true_dryrun
# Now actually disable them: disable all git repos in this dir and below
git disable-repos --true
# re-enable just the parent repo
mv ..git .git
# quit tracking the subrepo as a single file (required
# if you previously tried to add it to your main repo before
# disabling it as a git repo)
git rm --cached path/to/subrepo
# add all files, including the now-disabled sub-repos, to the parent repo
git add -A
# commit all files
git commit

That will commit all sub-repos, including their .git folders (now named ..git) and all git artifacts, as regular files, to the parent git repo. You have 100% of the control! Want to update just one subrepo? cd into it and manually rename its ..git folder back to .git, use that sub-repo like normal, and when done run git disable-repos --true on it again (or manually rename .git back to ..git), then commit it into the parent repo. The beauty of my git disable-repos script is that it can quickly and seamlessly disable or enable hundreds of subrepos at once if necessary, which would be impractical to do manually.
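
The "update just one subrepo" round trip described above could be wrapped roughly like this. It is a sketch under assumptions: with_subrepo_enabled is a hypothetical name, not a feature of git or of git-disable-repos, and it expects the sub-repo to be currently disabled (that is, to contain a ..git directory).

```shell
# Hypothetical helper: temporarily re-enable one disabled sub-repo,
# run a command inside it, then disable it again.
with_subrepo_enabled() {
    subrepo=$1
    shift
    # Re-enable: "..git" -> ".git"
    mv -- "$subrepo/..git" "$subrepo/.git" || return 1
    # Run the command in a subshell so the caller's working directory is unchanged.
    ( cd "$subrepo" && "$@" )
    status=$?
    # Disable again: ".git" -> "..git"
    mv -- "$subrepo/.git" "$subrepo/..git" || return 1
    return "$status"
}
```

For example, with_subrepo_enabled path/to/subrepo git pull would pull inside the sub-repo, after which git add -A and git commit in the parent repo record the updated files.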

Perhaps my use case is strange: I just need to commit a ton of stuff into one repo until I can clean up and separate out each subrepo individually at a later date, but it does what I need it to do.

And here is the full help menu output of git disable-repos -h:

$ git disable-repos -h

'git disable-repos' version 0.3.0
  - Rename all ".git" subdirectories in the current directory to "..git" to temporarily
    "disable" them so that they can be easily added to a parent git repo as if they weren't 
    git repos themselves (".git" <--> "..git").
  - Why? See my StackOverflow answer here: https://stackoverflow.com/a/62368415/4561887
  - See also the "Long Description" below.
  - NB: if your sub-repo's dir is already being tracked in your git repo, accidentally, stop 
    tracking it with this cmd: 'git rm --cached path/to/subrepo' in order to be able to 
    start tracking it again fully, as a normal directory, after disabling it as a sub-repo 
    with this script. To view all tracked files in your repo, use 'git ls-files'. 
      - References: 
        1. https://stackoverflow.com/questions/1274057/how-to-make-git-forget-about-a-file-that-was-tracked-but-is-now-in-gitignore/1274447#1274447
        2. https://stackoverflow.com/questions/27403278/add-subproject-as-usual-folder-to-repository/27416839#27416839
        3. https://stackoverflow.com/questions/8533202/list-files-in-local-git-repo/14406253#14406253

Usage: 'git disable-repos [positional_parameters]'
  Positional Parameters:
    '-h' OR '-?'         = print this help menu, piped to the 'less' page viewer
    '-v' OR '--version'  = print the author and version
    '--true'             = Disable all repos by renaming all ".git" subdirectories --> "..git"
        So, once you do 'git disable-repos --true' **from within the parent repo's root directory,** 
        you can then do 'mv ..git .git && git add -A' to re-enable the parent repo ONLY and 
        stage all files and folders to be added to it. Then, run 'git commit' to commit them. 
        Prior to running 'git disable-repos --true', git would not have allowed adding all 
        subdirectories since it won't normally let you add sub-repos to a repo, and it recognizes 
        sub-repos by the existence of their ".git" directories.  
    '--true_dryrun'      = dry run of the above
    '--false'            = Re-enable all repos by renaming all "..git" subdirectories --> ".git"
    '--false_dryrun'     = dry run of the above
    '--list'             = list all ".git" and "..git" subdirectories

Common Usage Examples:
 1. To rename all '.git' subdirectories to '..git' **except for** the one immediately in the current 
    directory, so as to not disable the parent repo's .git dir (assuming you are in the parent 
    repo's root dir when running this command), run this:

        git disable-repos --true  # disable all git repos in this dir and below
        mv ..git .git             # re-enable just the parent repo

    Be sure to do a dry run first for safety, to ensure it will do what you expect:

        git disable-repos --true_dryrun

 2. To recursively list all git repos within a given folder, run this command from within the 
    folder of interest:

        git disable-repos --list

 3. Assuming you tried to add a sub-repo to your main git repo previously, BEFORE you deleted or 
    renamed the sub-repo's .git dir to disable the sub-repo, this is the process to disable 
    the sub-repo, remove it from your main repo's tracking index, and now re-add it to your 
    main repo as a regular directory, including all of its sub-files and things:

    Description: remove sub-repo as a sub-repo, add it as a normal directory, and commit
    all of its files to your main repo:

    Minimum Set of Commands (just gets the job done without printing extra info.):

        git disable-repos --true  # disable all repos in this dir and below 
        mv ..git .git             # re-enable just the main repo
        # quit tracking the subrepo as a single file
        git rm --cached path/to/subrepo
        # start tracking the subrepo as a normal folder
        git add -A
        git commit

    Full Set of Commands (let's you see more info. during the process):
    
        git disable-repos --true  # disable all repos in this dir and below 
        mv ..git .git             # re-enable just the main repo
        git ls-files path/to/subrepo  # see what is currently tracked in the subrepo dir 
        # quit tracking the subrepo as a single file
        git rm --cached path/to/subrepo
        git status
        # start tracking the subrepo as a normal folder
        git add -A
        git status
        git commit


Long Description: 
I want to archive a bunch of small git repos inside a single, larger repo, which I will back up on 
GitHub until I have time to manually pull out each small, nested repo into its own stand-alone
GitHub repo. To do this, however, 'git' in the outer, parent repo must NOT KNOW that the inner
git repos are git repos! The easiest way to do this is to just rename all inner, nested '.git' 
folders to anything else, such as to '..git', so that git won't recognize them as stand-alone
repositories, and so that it will just treat their contents like any other normal directory
and allow you to back it all up! Thus, this project is born. It will allow you to quickly
toggle the naming of any folder from '.git' to '..git', or vice versa. Hence the name of this
project: git-disable-repos. 
See my answer here: 
https://stackoverflow.com/questions/47008290/how-to-make-outer-repository-and-embedded-repository-work-as-common-standalone-r/62368415#62368415

This program is part of: https://github.com/ElectricRCAircraftGuy/eRCaGuy_dotfiles

Other, More-Sophisticated Tools:

For anyone looking for a more "professional" solution, these seem to be the most popular solutions, in order with the most-popular (and seemingly, therefore, most-supported?) first:

  1. git submodule - https://git-scm.com/docs/git-submodule - the canonical, officially-supported tool built into git.
  2. git subtree - https://www.atlassian.com/git/tutorials/git-subtree
  3. git subrepo - https://github.com/ingydotnet/git-subrepo

Which of those is the best? I cannot say, but they all look confusing to me so I'm choosing the manual, brute-force option I described above, as that meets my intended purposes best in this case until I can find the time to break out each of the sub-repos into their own individually-maintained repos on GitHub someday.

More on git submodule:

Update 21 Sept. 2020: this article by Martin Owen in May 2016 ("Git Submodules vs Git Subtrees") contains a good comparison of git submodule vs git subtree, and generally favors git submodule. However, the author was not even aware of git subrepo at the time, and made no mention of it except when it was brought up in the comments.

git submodule seems to be the canonical, officially-supported tool built into git. Although it looks like it has a learning curve for sure, I plan on using it in my next project, now that I'm ready to open that project up and begin working on it again, and it depends on sub-git repos. I plan on beginning by learning about it here:

  1. A brief intro by Atlassian's Bitbucket: https://www.atlassian.com/git/tutorials/git-submodule
  2. The official git submodule documentation here: https://git-scm.com/book/en/v2/Git-Tools-Submodules

Additional References:

  1. https://medium.com/@porteneuve/mastering-git-subtrees-943d29a798ec
  2. When to use git subtree?
  3. https://webmasters.stackexchange.com/questions/84378/how-can-i-create-a-git-repo-that-contains-several-other-git-repos
  4. Git treat nested git repos as regular file/folders
  5. Git: How to make outer repository and embedded repository work as common/standalone repository?
  6. https://www.atlassian.com/git/tutorials/git-subtree

Keywords: git add subrepo; git add sub repository; git add nested repository; git add .git folder and files