Split VCS repository#

Test#

A new version control repository exists, with code/notes split out from another repository. Starting a new project implicitly means deciding whether to create a new repository, or to work in a new directory or on a branch of the most closely related repository.

Examples of the most closely related (default) repository:

  • Your personal-notes

  • Your shared-notes

  • A book’s associated repository (VGT, SR2, LYHGG)

  • Your coworker’s repository

Any of your personal or public repositories may qualify.

If you feel uncertain about what your non-split options are, consider these DuckDuckGo bangs:

  • !glab

  • !gh

Value#

In general, splitting a repository encourages independence and isolation.

Dependency Management#

See the reasons for splitting Docker images in Containerize Application. Splitting images does not strictly require splitting repositories, however. Conversely, it makes little sense to split repositories for code that has almost no special dependencies. For example:

  • Plain text English notes

  • bash

  • git

  • python (without packages)

Speed#

Search. You can expect git grep and git log -G to slow down as the size of a repository increases.

Pulls. How long does it take to clone the repository onto a new developer’s machine? If your .git directory is large, it will take a long time. The more repos you merge (the bigger your monorepo), the slower this gets, so you have to spend more time thinking about keeping history small.
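One rough way to keep an eye on this is to measure the size of the .git directory. The following is a minimal Python sketch, not an established tool: the function names directory_size_bytes and git_dir_size_mib are hypothetical, and it assumes a standard non-bare clone where history lives in a .git directory at the repository root.

```python
import os
from pathlib import Path


def directory_size_bytes(root: Path) -> int:
    """Sum the sizes of all regular files under root (symlinks are skipped)."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            if not path.is_symlink():
                total += path.stat().st_size
    return total


def git_dir_size_mib(repo: Path) -> float:
    """Size of repo/.git in MiB. Assumes a non-bare clone (hypothetical helper)."""
    return directory_size_bytes(repo / ".git") / (1024 * 1024)
```

Calling git_dir_size_mib(Path(".")) from a repository root gives a number you can track over time. It only approximates what a fresh clone transfers, since the local .git directory may hold unpacked or unreachable objects.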

See also “VCS Scalability” in Monorepos: Please don’t!, “Tooling” in Multirepo vs Monorepo, and “Scalability Challenges” in !w Monorepo.

Cost#

In general, staying in one repository encourages shared responsibility and centralization, which a split gives up.

Modularization Premium#

Said another way, it’s difficult to iterate quickly across a set of many repos (a cost to feedback speed). This point is made in many ways in Monorepo Explained.

In general, modularizing code has a cost. Even refactoring code takes time, and you shouldn’t do it earlier than you need to. Slightly larger costs include defining even a simple API (such as an evaluation docker’s API) and maintaining version numbers that you increment manually.
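To give a sense of the bookkeeping behind a manually incremented version number, here is a minimal Python sketch. It assumes a MAJOR.MINOR.PATCH numbering scheme, which the text above does not specify, and bump_version is a hypothetical helper rather than part of any tool mentioned here.

```python
def bump_version(version: str, part: str) -> str:
    """Increment one part of a MAJOR.MINOR.PATCH string and reset the lower parts.

    The three-part scheme is an assumption made for illustration.
    """
    major, minor, patch = (int(piece) for piece in version.split("."))
    if part == "major":
        return f"{major + 1}.0.0"
    if part == "minor":
        return f"{major}.{minor + 1}.0"
    if part == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown version part: {part!r}")
```

For example, bump_version("1.4.2", "minor") returns "1.5.0". The function itself is trivial. The cost is that every split repository needs someone to decide which part to bump on each release, and every consumer has to be updated to the new number.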

The modularization premium may include setting up some kind of “metarepo” (e.g. an orchestration repo).

See also Strong Module Boundaries and its discussions of MonolithFirst. A split of repositories often means a split into separate microservices. Consider the MicroservicePremium specifically as well.

Wikipedia is a great example of how you can avoid splitting your thoughts (it’s similar to a monorepo of notes) and still work on the public side of the line. The downside is that it can be hard to contribute to, because it has so many “conceptual” dependencies.

Cross-Project CI/CD#

See “Tooling” in Monorepo, Manyrepo, Metarepo and Advantages of monorepos. An unmentioned advantage of the monorepo approach is that it potentially lets you use only git rather than the cross-project options offered by GitLab and GitHub (which tie you to their platforms).

For example, say you want to enforce a code formatting standard. It’s quite easy to set up a CI/CD pipeline that does this, because Docker images with the necessary tools are readily available. In practice it doesn’t happen, because no one wants to set this up 10 times for many small repositories, or figure out how to deduplicate .gitlab-ci.yml content (and still call the same content in 10 places).

Worse, one team (thinking of a person, actually) wants to use yapf. Another wants to use black. Because they have separate repositories, each can keep its own code formatting standard.

Simpler Reorganization#

See “Simplified organization” in Advantages of monorepos. When you want to reorganize Docker boundaries or “projects” in general, you can do so without going to GitLab or GitHub. Consider the value in organizing notes in general; see Organize notes.

Forced Collaboration#

See Monorepo: please do!.

Simpler Retrospective#

If you only have one or a few repos, it’s much easier to review all the commits that one person or the team made in a sprint.
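As a rough illustration, the Python sketch below gathers one author’s commits for a date range from a list of local checkouts. The helper names sprint_log_command and sprint_commits are hypothetical, and the sketch assumes git is on the PATH and that each path is an existing clone. It relies only on standard git log options (--author, --since, --until, --oneline).

```python
import subprocess
from pathlib import Path
from typing import Dict, Iterable, List


def sprint_log_command(repo: Path, author: str, since: str, until: str) -> List[str]:
    """Build a `git log` command listing one author's commits in a date range."""
    return [
        "git", "-C", str(repo), "log",
        f"--author={author}",
        f"--since={since}",
        f"--until={until}",
        "--oneline",
    ]


def sprint_commits(
    repos: Iterable[Path], author: str, since: str, until: str
) -> Dict[str, List[str]]:
    """Run the command in each checkout and collect the one-line summaries."""
    commits = {}
    for repo in repos:
        completed = subprocess.run(
            sprint_log_command(repo, author, since, until),
            capture_output=True, text=True, check=True,
        )
        commits[str(repo)] = completed.stdout.splitlines()
    return commits
```

With a single repository the loop collapses to one git log call. With many repos, every checkout has to be present and up to date before the review can even start.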