I am writing this article because, thanks to Angular and the Nrwl workspace, I have experienced a mono repo for the first time, and I really like it. I am currently attempting to bring this architecture over to my React app. When I thought about it, I decided: why not have a mono repo for all of my personal projects? That would include Python projects, GraphQL repos, re-usable scripts, micro-services, etc. For instance, whenever I learn a new UI technology, I generally tie it into my pixel illustrator app. So having common logic in a libs folder that can be re-used might be useful, even from a UI perspective. At the very least for reference purposes, right?
Around this time I realized that, while I see the value of a mono repo, I don’t have the handle on it that I want. I intuitively understand that the value of a mono repo isn’t simply having all of my projects under a single repository; there are other benefits that it can bring. This article is the result: I really wanted to understand those benefits. I will be honest, perhaps it was for selfish reasons, so that I have a public place to reference. However, maybe you will benefit equally. In which case, good, I am happy to help.
After quite a bit of research, I found that the authoritative account of the benefits of a mono repo is the article “Why Google stores billions of lines of code in a single repository” by Rachel Potvin and Josh Levenberg. The title is self-explanatory.
It brings up the following eight points as to why Google uses a mono repo and why they promote it.
Unified versioning, one source of truth
There is no confusion about which repository hosts the authoritative version of a file.
Extensive code sharing and reuse.
I can speak to this one personally. With a libs folder of common components, fragmentation is less common.
Simplified dependency management
There are no separate versions of internal libraries to keep track of. Once an edit is made, it propagates to everything that depends on it.
Atomic changes
The ability to change one specific file and have the change take effect in thousands of places at once.
Large scale refactoring
Refactoring code in one place takes effect everywhere that code is used. In addition, a tool like Rosie (we will get to Rosie) can patch all affected files at the same time.
Collaboration across teams
This is somewhat tied to extensive code sharing and reuse. If all code is in a visible libs folder, anyone can look at the logs, see who is responsible for the codebase, check the code owners, and ask for a certain edit through a company communication channel such as Slack. Once again, I have personally experienced this one and can say it is much more effective in a mono repo than in a multi-repo.
Flexible code ownership
Because every folder has a code ownership file that determines who the code owners of that part of the codebase are, a commit always prompts those code owners to take a look. The system can handle code ownership for us.
Code visibility and clear tree structure
Teams can easily browse other teams’ code and see how things are being done.
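The code ownership point above can be sketched roughly as follows. This is a minimal illustration and not Google's actual tooling: the OWNERS file name, the one-owner-per-line format, and the find_owners helper are all assumptions made for the example. It walks up from a changed file toward the repository root and returns the owners listed in the nearest ownership file.

```python
from pathlib import Path
from typing import List

OWNERS_FILE = "OWNERS"  # hypothetical file name: one owner per line, '#' for comments


def find_owners(repo_root: Path, changed_file: Path) -> List[str]:
    """Return the owners from the nearest OWNERS file at or above changed_file."""
    repo_root = repo_root.resolve()
    directory = (repo_root / changed_file).resolve().parent
    # Walk upward, stopping once we would leave the repository.
    while directory == repo_root or repo_root in directory.parents:
        owners_path = directory / OWNERS_FILE
        if owners_path.is_file():
            lines = owners_path.read_text().splitlines()
            # Skip blank lines and comment lines.
            return [
                line.strip()
                for line in lines
                if line.strip() and not line.strip().startswith("#")
            ]
        if directory == repo_root:
            break
        directory = directory.parent
    return []
```

A review system could call something like find_owners for every file in a commit and notify the union of the results, which is the sense in which the system handles ownership for us.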
Great, so the above has me convinced. I’ve seen a similar article suggest that many of the above can be achieved with a multi-repo. I respectfully disagree, though perhaps I misunderstood the author’s intent. The point of a mono repo isn’t so much to do things that a multi-repo cannot, but rather to make them easier. To that, I can personally attest.
Ok, great, so at this point I pretty much understood, at a high level, that my mono repo should allow for the above to happen. However, out of curiosity, I wanted to know which tools Google and Facebook use for the above, to make sure that my architecture is similar.
For a mono repo to work properly at a very large scale, in a language-agnostic way, the following tools are needed:
- A custom version control system (Google’s Piper, Facebook’s use of Mercurial)
- Client in the cloud (the ability to browse files without actually having them on your device)
- Code review tool (GitHub will suffice, for starters)
- Fast, indexed regex search over large file trees
- Program analysis ecosystem
- Pre-submits and post-submits
- Patching large numbers of files. That is, a large backward-compatible change is made first. Once it is complete, a second, smaller change can be made to remove the original pattern that is no longer referenced. (For instance, Google uses Rosie.)
- Trunk-based development
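The two-step patching approach from the list above can be sketched as a small script. This is only an illustration under assumptions, not how Rosie works internally: the function names old_fetch and new_fetch, the in-memory repo dictionary, and the helpers migrate_call_sites and remaining_callers are all hypothetical. Step one points every caller at the new name while the old definition still exists; step two (deleting the old definition) is only safe once no callers remain.

```python
import re
from typing import Dict, List


def migrate_call_sites(
    files: Dict[str, str], old: str, new: str, defined_in: str
) -> Dict[str, str]:
    """Step 1: point every caller at `new`, leaving the defining file untouched
    so that `old` keeps working while the migration is in flight."""
    pattern = re.compile(rf"\b{re.escape(old)}\b")
    return {
        path: source if path == defined_in else pattern.sub(lambda _: new, source)
        for path, source in files.items()
    }


def remaining_callers(files: Dict[str, str], old: str, defined_in: str) -> List[str]:
    """Files other than the defining one that still mention `old`."""
    pattern = re.compile(rf"\b{re.escape(old)}\b")
    return sorted(
        path
        for path, source in files.items()
        if path != defined_in and pattern.search(source)
    )


# Hypothetical repository contents, keyed by path.
repo = {
    "libs/http.py": "def old_fetch(url): ...\ndef new_fetch(url): ...\n",
    "apps/a.py": "from libs.http import old_fetch\nold_fetch('x')\n",
    "apps/b.py": "from libs.http import old_fetch\nold_fetch('y')\n",
}

print(remaining_callers(repo, "old_fetch", "libs/http.py"))  # ['apps/a.py', 'apps/b.py']
repo = migrate_call_sites(repo, "old_fetch", "new_fetch", "libs/http.py")
print(remaining_callers(repo, "old_fetch", "libs/http.py"))  # []
```

Once remaining_callers comes back empty, the second, smaller change that removes old_fetch from libs/http.py cannot break anyone.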
A Quick Pause
It’s important to realize that there are two different types of mono repos:
- Huge repositories holding all of a company’s code.
- Project-specific mono repos (Angular, React, Babel)
Open Source Tooling
Lerna offers lerna bootstrap, which symlinks all Lerna packages that are dependencies of each other. In addition, it offers lerna publish, which by default updates all packages simultaneously.
Yarn workspaces handle linking all dependencies together, so that all dependencies are installed together. In addition, Yarn uses a single lock file for the whole workspace rather than one per package.
Allows the use of import namespaces that map to a specific directory path.
In addition, as my mono repo becomes more difficult to manage and move beyond numerous packages.
For when the mono repo gets large enough that you only want to compile the code affected by the changes you have made.
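The idea of only compiling what your change affects can be sketched as a walk over a project dependency graph. This is a minimal sketch and not the algorithm of any particular build tool; the project names, the DEPENDS_ON map, and the affected_projects helper are made up for the example.

```python
from collections import deque
from typing import Dict, List, Set

# Hypothetical graph: project -> projects it depends on.
DEPENDS_ON: Dict[str, List[str]] = {
    "apps/web": ["libs/ui", "libs/api"],
    "apps/admin": ["libs/ui"],
    "libs/ui": ["libs/utils"],
    "libs/api": [],
    "libs/utils": [],
}


def affected_projects(
    changed_files: List[str], depends_on: Dict[str, List[str]]
) -> Set[str]:
    """Projects containing a changed file, plus everything that depends on them."""
    # Invert the edges: project -> projects that depend on it.
    dependents: Dict[str, List[str]] = {project: [] for project in depends_on}
    for project, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, []).append(project)

    # Seed with the projects that directly contain a changed file.
    queue = deque(
        project
        for project in depends_on
        if any(path.startswith(project + "/") for path in changed_files)
    )
    affected = set(queue)

    # Breadth-first walk up the dependents graph.
    while queue:
        current = queue.popleft()
        for parent in dependents.get(current, []):
            if parent not in affected:
                affected.add(parent)
                queue.append(parent)
    return affected


print(sorted(affected_projects(["libs/utils/format.ts"], DEPENDS_ON)))
# ['apps/admin', 'apps/web', 'libs/ui', 'libs/utils']
print(sorted(affected_projects(["libs/api/client.ts"], DEPENDS_ON)))
# ['apps/web', 'libs/api']
```

Everything outside the returned set can be skipped, which is where the time savings come from as the repository grows.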
That would be it. There is much more to discover, but I am definitely happy with this for now.