Documentation Is All You Need

If you have never worked in a big corporation, congratulations. Also: here is the bad news. Most of the job is not coding. Most of the job is figuring out who you need to talk to, begging for access to something, spending three weeks understanding how that something works, and only then, finally, actually using it in your project. Which is now three weeks late.

I was working at one of the biggest banks in the world, on one of its newest AI teams. I thought I would be building the future. Instead, I spent most of my time hunting down a tiny group of people who held the tribal knowledge of the systems I needed, like some kind of corporate anthropologist, except less respected and with worse coffee.

Tribal knowledge. Sounds warm and communal, right? It is not. In large companies, tribal knowledge is the thing standing between your project and oblivion, controlled by someone who is always in a meeting, about to go on vacation, or has just resigned.

I was building an AI-based contract creator for a specific use case: contracts of more than 50 pages, with almost 300 different variables that could completely change the result. The project was hard, not only because it meant speedrunning a master’s degree in financial law, but because I had to call a huge number of endpoints to validate the generated document.

To make it work, I had to validate extracted data across 20 to 30 endpoints spread across different internal applications: extract company identifiers, validate whether those companies were clients, retrieve internal IDs, check regulatory ties, determine which funds they were operating with, validate those funds, and only then assemble the final document. Normal enterprise story. Nothing exotic. Just pure, distilled, institutional chaos with a REST API slapped on top.
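To make the shape of that chain concrete, here is a minimal sketch of a subset of it in Python. Everything in it is hypothetical: the step names, the identifiers, and the in-memory dictionaries that stand in for the internal endpoints. The regulatory check and the final document assembly are left out. The only point is that each step depends on what the previous one returned, so the order is itself a business rule.

```python
from dataclasses import dataclass, field


@dataclass
class ContractContext:
    """Data accumulated while validating one generated contract."""
    company_identifier: str
    facts: dict = field(default_factory=dict)
    errors: list = field(default_factory=list)


# In-memory stubs with made-up data. In a real system each lookup
# would be an HTTP call to a different internal application.
CLIENTS = {"ACME-001": "int-42"}               # identifier -> internal ID
FUNDS = {"int-42": ["fund-a", "fund-b"]}       # internal ID -> funds in use
VALID_FUNDS = {"fund-a"}


def check_is_client(ctx):
    if ctx.company_identifier not in CLIENTS:
        ctx.errors.append("company is not a client")


def fetch_internal_id(ctx):
    ctx.facts["internal_id"] = CLIENTS[ctx.company_identifier]


def fetch_funds(ctx):
    ctx.facts["funds"] = FUNDS.get(ctx.facts["internal_id"], [])


def validate_funds(ctx):
    invalid = [f for f in ctx.facts["funds"] if f not in VALID_FUNDS]
    if invalid:
        ctx.errors.append(f"invalid funds: {invalid}")


PIPELINE = [check_is_client, fetch_internal_id, fetch_funds, validate_funds]


def run_pipeline(ctx, steps=PIPELINE):
    """Run the steps in order, stopping at the first one that records an error."""
    for step in steps:
        step(ctx)
        if ctx.errors:
            break
    return ctx
```

Calling run_pipeline(ContractContext("ACME-001")) walks all four steps and ends with one error about fund-b. An unknown identifier stops at the first step, before anything tries to look up an internal ID that does not exist.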

The real problem was, again, the tribal knowledge. The Swagger was there, technically. Sometimes it had the endpoint name. Occasionally it had parameters. What it never had: any indication of what those parameters meant or what the endpoint returned. In some cases I had to send IDs that even the people who built the endpoint could not explain. I later found out they were hardcoded in the frontend to identify different types of processes. How would I know that?

So I talked to people. You know how that goes. Everyone is helpful in the way that produces no actual help.

Eventually I gave up on human beings and started using AI to read the codebase instead.

And that was the part that broke my brain:

Documentation is all you need.

The code has all the business rules, but it’s hard to read and navigate. Now imagine you have the perfect documentation and the perfect Swagger, with all the code and documentation indexed in a vector database and linked to each other.
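As a rough illustration of what "indexed and linked" could mean, here is a toy sketch in Python. It is not the setup I actually used. The Chunk and LinkedIndex names are made up, the "embedding" is a plain word count, and the "vector database" is a dictionary. A real version would swap in an embedding model and a proper vector store. The part worth noticing is the links field: a hit on a documentation chunk also returns the code it describes.

```python
import math
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Chunk:
    chunk_id: str
    kind: str                                   # "doc" or "code"
    text: str
    links: list = field(default_factory=list)   # ids of related chunks


def embed(text):
    """Toy embedding: lowercase word counts (stand-in for a real model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class LinkedIndex:
    """In-memory stand-in for a vector database with doc/code links."""

    def __init__(self):
        self.chunks = {}
        self.vectors = {}

    def add(self, chunk):
        self.chunks[chunk.chunk_id] = chunk
        self.vectors[chunk.chunk_id] = embed(chunk.text)

    def search(self, query, top_k=1):
        """Return the best-matching chunks, each with its linked chunks."""
        q = embed(query)
        ranked = sorted(
            self.chunks,
            key=lambda cid: cosine(q, self.vectors[cid]),
            reverse=True,
        )
        results = []
        for cid in ranked[:top_k]:
            hit = self.chunks[cid]
            linked = [self.chunks[l] for l in hit.links if l in self.chunks]
            results.append((hit, linked))
        return results
```

With a documentation chunk that explains a process type ID and a code chunk that hardcodes it, linked in both directions, a question about the ID lands on the explanation and brings the constant along with it. That is the answer to "how would I know that?"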

That’s what I started doing, and guess what, it worked. I’m going to get technical in a future post, but I just want to state the case: writing the code is the easy part. The hard part is understanding what needs to be built, how the existing systems actually work, which APIs to call, what the business rules are, and what sequence of steps won’t accidentally break something that’s been running silently since 2011.

The bottleneck is context. It has always been context. AI just made that embarrassingly obvious.

A quick taxonomy nobody asked for but everyone needs

Documentation is the artifact: endpoint contracts, architecture maps, process descriptions, example payloads, business logic explanations. The thing your team never has time to write and always has time to suffer without.

Context is what documentation gives you: understanding of how the system works, how pieces connect, and why that one function has a comment that just says “do not touch.”

Specs are how developers turn that understanding into action: clear instructions for what should be built and how it should behave, so the AI stops hallucinating endpoints that don’t exist.

Documentation produces context. Context enables good specs. Good specs make AI-generated code actually useful instead of confidently wrong.
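One small, hedged example of documentation doing that job: if the documented endpoints exist as a list, you can check a spec or an AI-written plan against it before any code is generated. The catalogue, paths, and function name below are invented for illustration, and the regular expression only catches references written as a method followed by a path.

```python
import re

# Hypothetical catalogue of documented endpoints (method, path).
DOCUMENTED_ENDPOINTS = {
    ("GET", "/clients/{id}"),
    ("GET", "/clients/{id}/funds"),
}

# Matches references such as "GET /clients/{id}".
ENDPOINT_PATTERN = re.compile(r"\b(GET|POST|PUT|PATCH|DELETE)\s+(/\S*)")


def undocumented_endpoints(text, documented=DOCUMENTED_ENDPOINTS):
    """Return endpoint references in `text` that are not in the catalogue."""
    missing = []
    for method, path in ENDPOINT_PATTERN.findall(text):
        path = path.rstrip(".,;:)")   # drop trailing sentence punctuation
        if (method, path) not in documented:
            missing.append(f"{method} {path}")
    return missing
```

Anything this returns is either a hallucinated endpoint or a real one that nobody documented. Both are worth knowing about before the code ships.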

The Swagger exists, but it describes a system that no longer works that way. The README was last touched before the pandemic. The business process lives partly in code, partly in someone’s head, and partly in a Confluence page that has been viewed eleven times, eight of which were the person who wrote it checking their own work.

People think legacy code means COBOL. It doesn’t. Legacy code is just code nobody understands anymore. That can be a 40-year-old mainframe procedure. It can also be a microservice you deployed six months ago that two people have already left the company over. It can be AI-generated code you wrote last Tuesday and already cannot explain.

Every codebase becomes legacy the moment its context walks out the door. Which, in most companies, happens constantly.

AI is producing code faster than any organization can develop understanding of it. We have built an extremely efficient machine for generating future confusion.

Before AI coding tools, bad documentation mostly slowed humans down. Now it also cripples the machine. A code agent with no context is an intern who just started and is too afraid to ask questions. A code agent with real documentation, process maps, and business logic references is something actually useful.

The difference between “AI cannot work in our environment” and “AI transformed how we build software” is almost always context. Not the model. Not the tooling. The context that documentation provides.

Documentation is all you need.

Everything else is just very expensive confusion.