Documentation Is All You Need

If you are a dev and have never worked in a big corporation, congratulations. Also, here is some bad news if you are planning on joining one: coding will not be the hardest part of your job. The hardest part is figuring out who you need to talk to whenever you need something, and begging for access to environments, applications, and code that will make you rethink your life decisions. Spending three weeks trying to understand how that codebase (the one you battled for two weeks to get access to) works is what will make you go bald. And only then do you finally get to use it on your project. Which is now three weeks late.

I have worked at one of the biggest investment banks in the world, on what was one of its newest AI teams at the time. When I joined, I thought I was about to build the future. Instead, I found myself wasting most of my time hunting down a tiny group of people who held the tribal knowledge of the systems I needed to understand, like some kind of corporate anthropologist, except less respected and with worse coffee.

Tribal knowledge. Sounds warm and communal, right? It is not. In large companies, tribal knowledge is the thing standing between your project and oblivion. Also, the specific bit of tribal knowledge you so desperately need seems to always be in the hands of someone who is perpetually in meetings, about to go on vacation or who has just resigned.

My first project at the bank was building an AI-powered contract generator for a very specific use case. It involved generating more than 50 pages with almost 300 different variables that could change completely from one contract to another. Needless to say, it was an extremely complex challenge, not only because it meant speed-learning the equivalent of a master’s degree in financial law, but also because the generator needed a highly dependable validation system for the generated contracts.

The validation process involved interacting with up to 30 endpoints spread across multiple internal applications in order to: extract company identifiers, validate whether those companies were clients, retrieve internal IDs, check regulatory ties, determine which funds they were operating with, validate those funds, and only then assemble the final document. Standard enterprise story. Nothing exotic. Just pure, distilled, institutional chaos with a REST API slapped on top.

The next natural step was to find out how to call those APIs. Should be easy, right? Once again, I found myself completely hindered by tribal knowledge. The Swagger was technically there. Sometimes it did have the endpoint name. Sometimes it didn’t. Occasionally it had parameters. What was never there: any indication of what each of the endpoint’s parameters meant or what the endpoint returned. There were even endpoints that made the user send some mysterious IDs in the request. No one knew what they were about, not even the devs who built the endpoint. After some extensive digging, I found out those IDs were hardcoded values in the frontend, placed there so that the application could tell different types of processes apart. For some reason, something supposedly straightforward ended up taking a huge amount of time. What could that reason be? How could I deal better with situations like this going forward?
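To make the gap concrete, here is a minimal sketch of what a documented endpoint could look like if every parameter carried its meaning and the mysterious IDs had names. Everything in it is an assumption made for illustration: the path, the parameter names, the `ProcessType` values, and the `EndpointContract` structure are hypothetical and do not come from any real system.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Tuple


class ProcessType(Enum):
    """Hypothetical names for the hardcoded frontend IDs (values are made up)."""
    ONBOARDING = 101
    RENEWAL = 102


@dataclass(frozen=True)
class Param:
    name: str
    type: str
    meaning: str  # the piece that was never written down


@dataclass(frozen=True)
class EndpointContract:
    method: str
    path: str
    summary: str
    params: Tuple[Param, ...]
    returns: str

    def undocumented(self) -> List[str]:
        """Return the names of parameters whose meaning was left blank."""
        return [p.name for p in self.params if not p.meaning.strip()]


# Hypothetical endpoint: one parameter is explained, the other is not.
contract = EndpointContract(
    method="GET",
    path="/clients/{company_id}/funds",
    summary="List the funds a client company operates with.",
    params=(
        Param("company_id", "string", "Internal ID of the client company."),
        Param("process_type", "integer", ""),  # the kind of gap described above
    ),
    returns="JSON array of fund objects.",
)

print(contract.undocumented())
print(ProcessType(101).name)
```

A check like `undocumented()` could run in CI so a parameter without a stated meaning never ships. Whether that fits a given team is a judgment call; the point is only that the missing information is small and cheap to write down.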

I decided to reach out to people. You probably know how that went. Let’s just say people are very good at helping in the most unhelpful ways.

Eventually I accepted that my salvation was not coming from my coworkers. So I started using AI to break through the barrier that tribal knowledge had built, and to understand what needed to be understood.

And that was when it clicked for me:

Documentation is all you need.

The source code contains all the business rules, but it can be hard to read, navigate, and understand. Now imagine you have the perfect documentation, the perfect Swagger. And all of the code and your perfect documentation are indexed in a vector database, linked to each other.

That’s exactly what I started doing for any codebase I was working with. And guess what, my life got easier. It made any explicit or implicit bit of information within a codebase way more accessible to my agents. I’m going to get technical in a future post, but I just want to state the case: writing the code is the easy part. The hard part is understanding what needs to be built, how the existing systems actually work, which APIs to call, which business rules exist, and how to avoid accidentally breaking something that’s been quietly running since 2011 (no one even knew it was running until it broke).
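Until that future post, here is a toy sketch of the idea of indexing code and documentation together and linking them. It is not the setup described above: the "embedding" is just a word count standing in for a real embedding model, the index lives in memory instead of a vector database, and every name in it (`Chunk`, `LinkedIndex`, `get_funds`, the chunk IDs) is hypothetical.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Chunk:
    chunk_id: str
    kind: str                          # "doc" or "code"
    text: str
    linked_ids: Tuple[str, ...] = ()   # IDs of related chunks


def embed(text: str) -> Counter:
    """Toy embedding: lowercase token counts. A real setup would use a model."""
    return Counter(re.findall(r"[a-z0-9_]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class LinkedIndex:
    def __init__(self) -> None:
        self._chunks: Dict[str, Chunk] = {}
        self._vectors: Dict[str, Counter] = {}

    def add(self, chunk: Chunk) -> None:
        self._chunks[chunk.chunk_id] = chunk
        self._vectors[chunk.chunk_id] = embed(chunk.text)

    def search(self, query: str, top_k: int = 1) -> List[Chunk]:
        """Return the top_k most similar chunks, followed by the chunks they link to."""
        q = embed(query)
        ranked = sorted(
            self._chunks.values(),
            key=lambda c: cosine(q, self._vectors[c.chunk_id]),
            reverse=True,
        )
        hits = ranked[:top_k]
        seen = {c.chunk_id for c in hits}
        for hit in list(hits):
            for linked_id in hit.linked_ids:
                if linked_id in self._chunks and linked_id not in seen:
                    hits.append(self._chunks[linked_id])
                    seen.add(linked_id)
        return hits


index = LinkedIndex()
index.add(Chunk("doc:funds", "doc",
                "Lists the funds a client company operates with.",
                ("code:get_funds",)))
index.add(Chunk("code:get_funds", "code",
                "def get_funds(company_id): return repo.query(company_id)",
                ("doc:funds",)))
index.add(Chunk("doc:regulatory", "doc",
                "Checks regulatory ties for a company identifier."))

results = index.search("which funds does this client operate with")
print([c.chunk_id for c in results])
```

The question is phrased in business terms, so it matches the documentation chunk, and the link then pulls in the code that implements it. The code chunk on its own shares no words with the question, which is the whole argument for keeping the two linked.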

The bottleneck is context. It has always been context. AI just made that obvious.

A quick taxonomy nobody asked for but everyone needs

Documentation is the artifact: endpoint contracts, architecture maps, process descriptions, example payloads, business logic explanations. Something your team never has time to write, but always has time to suffer without.

Context is what documentation gives you: understanding of how the system works, how pieces connect, and why that one function has a comment that just says “do not touch.”

Specs are how developers turn that understanding into action: clear instructions for what exists, what should be built and how it should behave, so the AI stops hallucinating endpoints that don’t exist or building something with a different behaviour than what you specified.

Documentation produces context. Context enables good specs. Good specs make AI-generated code actually useful and dependable instead of faulty and misleading.
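One small way documentation can keep a spec honest is a check that every endpoint the spec mentions actually exists in the documented set. The sketch below assumes a hand-written set of documented endpoints and a spec written as plain text; the endpoint paths, the `{id}` placeholder convention, and the function names are all hypothetical.

```python
import re
from typing import List, Set

# Hypothetical set of endpoints that the documentation actually describes.
DOCUMENTED_ENDPOINTS: Set[str] = {
    "GET /clients/{id}",
    "GET /clients/{id}/funds",
}

# Matches an HTTP method followed by a path, e.g. "GET /clients/42/funds".
CALL_PATTERN = re.compile(r"\b(GET|POST|PUT|PATCH|DELETE)\s+(/[\w/{}-]+)")


def normalize(method: str, path: str) -> str:
    """Replace numeric or {braced} path segments with the placeholder {id}."""
    segments = []
    for segment in path.strip("/").split("/"):
        is_variable = segment.isdigit() or (
            segment.startswith("{") and segment.endswith("}")
        )
        segments.append("{id}" if is_variable else segment)
    return f"{method} /" + "/".join(segments)


def find_undocumented_calls(spec_text: str) -> List[str]:
    """List endpoint references in the spec that the documentation does not cover."""
    missing = []
    for method, path in CALL_PATTERN.findall(spec_text):
        call = normalize(method, path)
        if call not in DOCUMENTED_ENDPOINTS and call not in missing:
            missing.append(call)
    return missing


spec = (
    "First call GET /clients/42 to confirm the company is a client, "
    "then GET /clients/42/funds, and finally POST /contracts/assemble "
    "to build the document."
)

print(find_undocumented_calls(spec))
```

Anything the check flags is either a hallucinated endpoint or a real one that nobody documented. Both are worth knowing about before an agent starts writing code against it.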

The Swagger exists, but it describes a system that no longer works that way. The README was last touched before the COVID pandemic. The business processes live partly in code, partly in someone’s head, and partly in a Confluence page that has been viewed eleven times, eight of which were the person who wrote it checking back on their own work.

People think legacy code means COBOL. It doesn’t. Legacy code is any code that nobody understands anymore. That can be a 40-year-old mainframe procedure. It can be a microservice you deployed six months ago that two people have already left the company over. It can also be the AI-generated code you vibecoded last Tuesday while watching a YouTube video and a Netflix show at the same time. It might even work, but you haven’t got the slightest idea of what’s in there, or why or how it works.

Every codebase becomes legacy the moment its context walks out the door. Which, in most companies, happens constantly.

AI is producing code faster than any organization can develop its understanding of it, while also enabling people to generate code they’ve never truly grasped. Humanity has built an extremely efficient machine for generating future confusion.

Before AI coding tools, bad documentation mostly slowed humans down. Now it also cripples the machine. A code agent with no context is an intern who just started and, instead of asking the right questions, assumes it knows everything. On the other hand, a code agent with proper access to good documentation, process maps, and business logic references can be an extremely powerful tool.

The difference between “AI cannot work in our environment” and “AI transformed how we build software” is almost always context. Not the model. Not the tooling. The context. What produces context?

Documentation is all you need.

Everything else is just very expensive confusion.