Field report

Your Data Lives in Brian's Head

June 16, 2026 /

data-discipline
tribal-knowledge
archetypes
sequencing
smb-ai

In a 5-50 person business, the load-bearing data layer is usually a person, not a system. That person has a name. The team knows it. The CRM doesn’t.

The CRM looks empty for a reason

When the dashboard reads thin, most operators conclude the team isn’t disciplined about notes. The intake form has gaps. The status field is stale. Nobody updates the source channel.

That read isn’t wrong. It misses the deeper one.

The team isn’t writing things down because nothing forces them to. Someone already knows. The shop manager who can tell from a customer’s name which jobs are going to come back as callbacks. The senior tech who reads a job description and routes it to the right crew in five seconds. The office manager who hears “the website thing” and knows it means a referral from one specific partner from 2021.

That person is the data layer. The CRM sits downstream.

The system looks empty because the actual database isn’t a system. It’s a chair, occupied by one person, who answers operational questions faster and more reliably than any tool the business has bought in the last decade. The team built a routing pattern around that chair years ago. Nobody named it. Nobody documented it. The pattern just is.

Empty CRMs at this size aren’t usually a discipline problem. They’re a routing problem. The data exists. It got routed around the system before the system was even chosen.

He’s payroll until he’s a project

The cost stays invisible while he’s there. He fields routing questions. He spots exceptions before they hit production. He sends the right tech to the right job because he reads the customer’s name and knows what last winter looked like. The team’s day runs. The dashboard nobody trusts is irrelevant because his answers arrive faster than the dashboard could.

What’s being paid for, quietly, is operational infrastructure nobody named. A decade of institutional memory subsidized as a side effect of one salary. Invisible because nobody requested documentation. Fragile because the knowledge sits in one head, and there is exactly one of him.

Three events make the cost legible.

He takes two weeks off. The team makes plausible-looking calls all month. The customer who has always been on a 1.5x rate on Saturday work gets quoted standard and accepts. The relationship is twenty-two-hundred dollars lighter the next month, and the team didn’t know to flag it.

He gets recruited. The operator realizes the side benefit they treated as a payroll line is a single point of failure with a market price, and that market price is being negotiated.

The business tries to scale. A new hire arrives. The owner budgets two months of ramp. It runs eight. Every week of that overrun is full salary against half output, while the new person learns from the same person who was supposed to be back to building.

By the time the documentation project starts, it starts under deadline pressure with the institutional knowledge halfway out the door. The bill arrives all at once.

Three kinds of knowledge that hide in him

When I run the data-discipline pillar of the assessment, I’m not looking for empty fields. I’m looking for where the knowledge actually lives that the empty fields would otherwise hold. Three categories show up in nearly every 5-50 person business. Each one is a different shape of what extraction has to surface.

Customer-specific exceptions.

Twelve, fifteen, twenty customers who deviate from the standard contract in a way the team has memorized. One pays net-30 instead of net-15 because of a deal made in 2019. One gets the senior tech on every job, no exceptions, because the last junior tech who went got chased off the property. One only takes calls Tuesday afternoon. None of this lives in a field. It lives in the part of the brain that recognizes a name and routes the call accordingly. When he takes a day off, half of those customers get the wrong handling. When he leaves, all of them do.

Process exceptions.

The two dozen “when X happens, you actually do Y” rules that the documented process doesn’t cover. A job comes in for a service the company sells, but the access at the customer’s house requires a different crew composition. An order arrives for a product variant the standard pricing sheet underbills by 15%. A quote request mentions a competitor name that signals the prospect is shopping and won’t close, so the team triages it. The documented playbook covers the regular cases. The senior person carries the irregular ones. The irregular ones are where most of the margin and most of the risk live.

Vendor and partner relationships.

When a part goes out of stock on Wednesday afternoon, he texts a specific contact at a specific supplier and gets it by Friday morning. When a subcontractor flakes, he knows which of three backups will pick up on short notice. When a referral partner is going through a slow season, he checks in with them so they keep sending leads. None of this lives in a CRM record. None of it is in a vendor list. It’s a network built over years, and it ships with him when he leaves.

Extraction is the data work

Most operators in this size range hear “data discipline” and picture better software. A cleaner CRM, a stricter intake form, a dashboard that actually reflects reality. A consultant has probably pitched one of the three in the last quarter.

Software isn’t the discipline. Software is the destination.

The discipline is the project of pulling what experienced people already know out of their heads and into a structured format the team and any future tool can use. In a 5-50 person business, that is what data work actually looks like, and it shows up in shapes that don’t impress: hours sitting next to him asking what he’d tell a replacement on day one, a notebook that finally exists outside a text history, a customer-exception document populated one cell at a time, a vendor list reconciled against what’s actually in someone’s phone.

This is slower than buying a tool. It can’t be demoed in a conference room. It produces nothing impressive enough for a vendor case study. It also happens to be the only data work at this size that changes outcomes, because every automation the business will ever buy is built on what this project surfaces, or fails to.

Software builds on extraction. Extraction doesn’t build on software.

A conversation worth replacing

Most discovery calls I take open with a tool. The operator has been pitched a CRM upgrade, a chatbot, a lead scorer, or an AI quoting agent. The pitch was specific and the demo was convincing.

The exchange goes like this.

Operator: We’re looking at moving CRMs. The current one is a mess. Me: How much of what your team knows is in the current one? Operator: … not most of it. Me: Where is most of it? Operator: Brian, mostly. Some in his email. Me: Moving CRMs won’t help. You’re moving the empty one.

A new CRM doesn’t fix the problem if the old CRM was empty because nothing routed through it. The new one will also be empty, with a nicer interface and a higher monthly bill. Whatever made the team route around the system the first time still routes them around it. The dashboard reflects nothing because nothing is captured. The reports lie because they’re rolling up the absence of data.

The conversation worth having opens differently. What does the senior person know that nobody else has access to. Where do the team’s operational questions actually go. What does a new hire spend six months learning that isn’t in any document.

Those questions produce a different project than the CRM migration. They produce extraction. Notebooks. Interview hours. A document of customer exceptions. A pricing reconciliation. A vendor list that finally exists outside one phone.

Then, after that work, the CRM choice is meaningful. The system has something to hold. The dashboard reflects something the team would actually trust. The new hire ramps against what the company knows, not against the same one person.

Empty the head before you fill the system.

Notes (optional, stripped before publishing)

Anchor map

“Brian” → anchored: content/archetypes/named-archetypes.md (Brian = single-point-of-failure knowledge holder; documented archetype, not a real client)
“5-50 person business” → anchored: master-context/business-context.md (ICP)
“shop manager / senior tech / office manager” → anchored: content/archetypes/named-archetypes.md (Brian-family operational roles)
“the website thing” referral pattern → anchored: same archetype file (representative Brian-knowledge cross-surface pattern)
“deal made in 2019” → illustrative; defensible as a typical legacy-pricing carve-out
“1.5x rate on Saturday work” / “twenty-two-hundred dollars” → illustrative; defensible as representative pricing exception and dollar figure for a small customer relationship
“net-30 vs net-15” → illustrative invoicing nuance
“Tuesday afternoon” call preference → illustrative customer pattern
“two weeks off” / “two months / eight months ramp” → illustrative time spans; defensible as representative SMB scenarios
“twelve, fifteen, twenty customers” / “two dozen rules” / “three backups” → illustrative counts; defensible as representative figures
“underbills by 15%” → illustrative percentage on a pricing exception
“data-discipline pillar of the assessment” → anchored: master-context/business-context.md (six-pillar diagnostic framework)
“CRM upgrade / chatbot / lead scorer / AI quoting agent” → anchored: master-context/business-context.md (typical AI pitches a prospect arrives with)
“Wednesday afternoon / Friday morning” stock-out scenario → illustrative supply-chain pattern

Credibility-probe defense

If asked “tell me about a specific Brian”: redirect to the archetype. Brian is the named archetype across Operator’s Bench content; cross-surface character (also referenced in pull-up-your-last-50-leads LinkedIn post and thats-not-an-ai-problem blog draft). Not a single real client.
If asked about the $2,200 example: defensible as the typical magnitude of a single-customer pricing miss in a service business; illustrative, not a specific engagement.
If asked about the eight-month ramp: redirect to methodology framing. Eight months is the representative carrying-cost pattern when knowledge is undocumented, not lifted from a specific engagement.
If asked “how do you actually run an extraction project”: describe the cadence. One-hour interviews with the senior person, structured by the six-pillar diagnostic, captured into notebooks and reconciled into the CRM. Scope depends on team size and exception count.
If asked about “the two dozen ‘when X happens, you actually do Y’ rules”: describe the kinds (customer-specific access requirements, pricing variants, competitor-name triage patterns). Acknowledge specific count is representative.

Structural choices

Pre-H2 hook present (4 sentences, ~35 words). Sets up the person-as-database dichotomy before structure starts. Differentiates from iteration-2 of this same slug, which used no pre-H2 hook.
Shape B closer with dialogue exchange. Iteration-2 used Shape B without dialogue. Returning the dialogue device for iteration 4 differentiates structurally; the device is established for the blog channel by sibling thats-not-an-ai-problem.md.
Closer line: “Empty the head before you fill the system.” Fresh formulation in the data-layer-as-person family. Distinct from iteration-2’s “Brian is the database. Migrate first.”, from your-ai-isnt-failing-your-data-is.md’s “Fix the data. Then automate. In that order.”, and from thats-not-an-ai-problem.md’s “AI joins the conversation when it earns the seat.”
Mini-framework is three categories of knowledge (customer exceptions, process exceptions, vendor relationships) rather than three diagnostic questions (iteration-2) or three CRM-side patterns (systems-angle sibling your-ai-isnt-failing-your-data-is.md). Categorical framing pulls the post toward “name what’s in his head” rather than “spot your own Brian.”
Mini-framework parallel structure verified. Each item: bold sub-heading naming the knowledge category, then a paragraph mapping the category to representative operational specifics (counts, scenarios, pricing patterns), then a closing line that names where the knowledge lives or what happens when he’s gone.
Mini-framework concreteness balance (#18): Customer exceptions carries 5 specifics (12-15-20 customers, net-30 vs net-15, deal made in 2019, senior-tech rule, Tuesday afternoon). Process exceptions carries 4 specifics (two dozen rules, crew-composition scenario, 15% underbill, competitor-name triage). Vendor relationships carries 4 specifics (Wednesday-to-Friday turnaround, three backups, referral-partner check-in, network-built-over-years). Densest 5 vs lightest 4 = within 2x. Balanced.
H2 title length (#22): “The CRM looks empty for a reason” (7w), “He’s payroll until he’s a project” (6w), “Three kinds of knowledge that hide in him” (8w), “Extraction is the data work” (5w), “A conversation worth replacing” (4w). All within 3-8 word range. No long thesis-statement title required.
Position adds a layer the Twist did not (#17). Twist names what the cost looks like when he’s gone (carrying cost surfaces all at once under pressure). Position redefines what “data discipline” means in a 5-50 person business: extraction, not software. New methodological frame, not a recap of the cost.
Internal phrase repetition (#21): “the senior person” appears 2x (mini-framework process section, closer). “data discipline” appears 2x (mini-framework intro, Position opening). “Brian” appears 2x (H1, closer dialogue). “the dashboard” appears 3x but each in a structurally different role (Pain: “the dashboard reads thin”; Twist: “answers arrive faster than the dashboard”; closer: “the dashboard reflects something the team would trust”). No distinctive 3-5 word phrasing reaches 3+ uses with the same role.
Distinctive-phrasing reuse check across siblings (#20): avoided iteration-2’s “the data layer you can’t see is a person,” “He runs the business until he stops,” “Brian is the database. Migrate first.”, “Slow work. The kind that doesn’t demo well.”, “list of 30 exceptions,” and the “It looks like X. It looks like Y. It looks like Z.” anaphora. Avoided systems-angle sibling’s “confidently wrong,” “the output looks correct. That’s the trap.”, “Boring infrastructure work.” Avoided thats-not-an-ai-problem.md’s “AI sequences in fine. It just sequences second.”, “AI joins the conversation when it earns the seat.”
Closer dialogue authenticity (#19): Operator lines drop pronouns and hedge (”… not most of it.”, “Brian, mostly. Some in his email.”). Lines pass the consultant-substitution test: another consultant would not say ”… not most of it.” in conversation; it’s operator-shaped fragment language. The Me lines stay sharp.
Worldview claim: data-is-the-load-bearing-layer, people/tribal-knowledge sub-angle. Sibling systems-angle blog draft is your-ai-isnt-failing-your-data-is.md. Brian archetype provides cross-surface continuity.