Monorepo

What is a Repo?

A repository, or repo, is a centralized digital storage that developers use to make and manage changes to an application’s source code. Developers have to store and share folders, text files, and other types of documents when developing software. A repo has features that allow developers to easily track code changes, simultaneously edit files, and efficiently collaborate on the same project from any location. --AWS

Explain It Like I'm 5

A repo is a folder shared between developers. It holds the code files and other stuff for a software project, and it tracks additions and modifications over time.

Simple Project Setups ...

Usually, if you scan open source projects on GitHub or create a project on your computer, you put the files related to the project in one spot. A non-software example would be tracking inventory for a flower store. You could imagine having a folder with an Excel spreadsheet for tracking inventory and separate subfolders for purchase and sale receipts:

Flower Shop Inventory/
|- Inventory.xlsx
|- Shop Purchase Receipts/
   |- Purchase 2025-01-01 ACE Hardware.jpeg
   |- Purchase 2025-01-01 Flower Supplier Online.jpeg
   |- ...
|- Sale Receipts/
   |- ...

It wouldn't make too much sense to start intermingling staff reviews in one of the two receipts folders. You'd want to segregate information appropriately in order to manage it over time in an organized fashion.

Similarly, when creating a software project, it typically makes sense to package all of the code for that project into a single repository, to the exclusion of other projects. Other projects make it into your repository through dependency management, as your project's needs dictate. Mixing unrelated code almost immediately creates organization problems.

... Lead to Polyrepo Setups

The "poly" in polyrepo means many repositories. If you have many software projects, whether personally or in your organization, a polyrepo setup gives each project its own repo in 1-to-1 correspondence.

The pros of a polyrepo setup boil down to the simplicity of managing a single project in a single place, which is a huge pro when dealing with projects one at a time. If the team needs to maintain project A, they open the repository for A, which contains all of the context for A and only A. The same goes for projects B, C, and so forth.

Polyrepo setups make sense. You've got multiple pieces of code and a place for each piece.

Dilemmas with Polyrepos

Hard to See the Full Picture

If you were teaching 3-year-olds to glue macaroni and glitter to a paper plate, you could imagine having a table or area organized for each step. Area 1 would have the paper plates, which you would distribute. Area 2 would have the glue and macaroni. Area 3 would have the glitter. There's a linear process to taking these base items and combining them into art that parents will be obligated to own forever. At each station you would focus on one thing at a time to simplify the herding of cats, er... coders, er... 3-year-olds I mean.

When developing a software project, however, it is really important to grasp the full scope of work at each step. Complexity tends to compound, and complexity introduced in the base dependencies tends to flow into the downstream software. Keeping each repo separately on GitHub (where finding the right one takes 5 to 6 clicks through dreadfully slow page loads) makes it hard to keep the full picture in view as you move from project to project.

Which Tends to Silo

Once switching context between projects carries that price, it becomes really easy for Steve to become the overlord of one project or Bob to become the gatekeeper of repo X. Now, if Steve and Bob are capable of managing the complexity of their projects, then it shouldn't matter whether the broader team can gain visibility into the code.

However, software development is better played as a team sport. Context switching can lead to inefficient siloing of people, because coworkers become subject-matter experts in particular projects instead of in more important areas of expertise, like how to write a proper nginx config. Preferably, anyone should be able to contribute any piece of code to any project within the boundaries of their capabilities. However, if Steve is the only person who has set up a local environment for that project properly, and replicating his environment is an ordeal, Steve will eventually be the only one who touches that project.

You and I both know Steve isn't always going to be around to maintain that project either.

Other Issues

Sharing bespoke dependencies
Dependency pinning
Keeping track of different build/deployment pipelines
Code DIES when it is unobserved
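
To make the dependency pinning item concrete, here is a minimal sketch of the kind of check a team ends up wanting once pins live in many separate repos. It assumes each project pins dependencies in pip's `name==version` form; the project names and version numbers are invented for illustration.

```python
from collections import defaultdict


def find_version_drift(pins_by_project: dict[str, list[str]]) -> dict[str, dict[str, str]]:
    """Map each package pinned at more than one version to {project: version}."""
    seen: dict[str, dict[str, str]] = defaultdict(dict)
    for project, pins in pins_by_project.items():
        for line in pins:
            line = line.strip()
            # Skip blanks, comments, and anything that is not an exact pin.
            if not line or line.startswith("#") or "==" not in line:
                continue
            name, version = line.split("==", 1)
            seen[name.strip().lower()][project] = version.strip()
    # Keep only the packages whose pinned versions disagree across projects.
    return {
        name: versions
        for name, versions in seen.items()
        if len(set(versions.values())) > 1
    }


# Hypothetical pins, as if read from each repo's requirements file.
pins = {
    "project-a": ["requests==2.31.0", "flask==3.0.0"],
    "project-b": ["requests==2.25.1", "flask==3.0.0"],
}
print(find_version_drift(pins))
```

In a polyrepo world, collecting those pins means cloning or querying every repo first, which is a large part of why the drift tends to go unnoticed.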

Enter the Monorepo: All the Code in One Spot

In a monorepo setup for an organization, you would typically organize all of the code for that organization into one spot. Every project would live in the same place. "But didn't you say moments ago that intermingling unrelated digital information introduces problems?" Yes, I did, astute reader. So allow me to make the distinction.

My personal projects are grouped on my computer in a structure similar to this:

projects/
|- blog/ <- this project has its own separate repo
   |- monorepo.md
   |- ...
|- I'll-finish-this-later/ <- and this one
   |- ...
|- crud-app-number-1023003/ <- and this one too
   |- ...

When I open each project with my code editor, I only see the files that pertain to each separate project.

The only transformation we'll make is to place the repository (and visibility) one step higher in the file tree:

projects/ <- this is now the `repo`
|- blog/
   |- monorepo.md
   |- ...
|- I'll-finish-this-later/
   |- ...
|- crud-app-number-1023003/
   |- ...

So now, when I open my code editor, I open the projects/ folder and have the benefit of seeing every project, each still separated from the others, in the same spot.
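
The difference between the two trees comes down to where the repository root sits. As a rough sketch, assuming Git is the version control system (so a repo root is any folder containing a `.git` entry), the two layouts can be told apart like this. The function names are made up for this post.

```python
from pathlib import Path


def find_repo_roots(top: Path) -> list[Path]:
    """Return `top` and any of its immediate subfolders that contain a `.git` entry."""
    candidates = [top] + sorted(p for p in top.iterdir() if p.is_dir())
    return [p for p in candidates if (p / ".git").exists()]


def describe_layout(top: Path) -> str:
    """Classify a folder of projects by where its repo roots live."""
    roots = find_repo_roots(top)
    if roots == [top]:
        return "monorepo"   # one repo, one step higher in the file tree
    if roots and top not in roots:
        return "polyrepo"   # each project carries its own repo
    return "unclear"        # no repos found, or nested repos


# Example call, with a path assumed for illustration:
# print(describe_layout(Path.home() / "projects"))
```

Nothing inside blog/ or crud-app-number-1023003/ has to move for the answer to flip from "polyrepo" to "monorepo"; only the location of the repo root changes.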

While I don't actually use a monorepo to manage personal projects...yet, I LOVE this setup for work because:

I no longer pay a huge penalty from navigating from project to project.
I can get visibility into any part of the organization's code.
Collaborating with teammates is vastly simplified.
The easy path is to maintain essentially the same environment with the same tooling for every project in the repo.
- The hard path would be maintaining separate environments for each project.
It is significantly easier to build, introduce, or share custom targeted tooling for maintaining software that can support every project in the organization in a transparent way in one spot.
If Steve or Bob needs a PR reviewed, it's easy for the entire team to see it.

Every project's files stay separate, and should. However, everyone can quickly navigate from project to project. All we've done is eliminate the context switching, which brings significant benefits.
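
The point about shared tooling is easier to see with a small sketch. Assuming every non-hidden top-level folder in the repo is one project (as in the projects/ tree above), a single script can apply the same check everywhere. The `has_readme` check is a hypothetical stand-in for whatever your team actually cares about.

```python
from pathlib import Path
from typing import Callable


def list_projects(repo_root: Path) -> list[Path]:
    """Treat every non-hidden top-level directory as one project."""
    return sorted(
        p for p in repo_root.iterdir()
        if p.is_dir() and not p.name.startswith(".")
    )


def run_check(repo_root: Path, check: Callable[[Path], bool]) -> dict[str, bool]:
    """Apply the same check to every project and collect the results by name."""
    return {project.name: check(project) for project in list_projects(repo_root)}


def has_readme(project: Path) -> bool:
    """Hypothetical check: does the project carry a README.md?"""
    return (project / "README.md").is_file()


# Example call, with a path assumed for illustration:
# print(run_check(Path("projects"), has_readme))
```

With one repo there is one place to put this script and one place to run it. The polyrepo equivalent needs a list of repos and a clone of each before it can check anything.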

Summary

I maintain that choosing between a polyrepo and a monorepo setup is a case-by-case decision (as every technology decision should be), but my personal preference is the monorepo.

Sources

AWS: What is a Repo?