
Never Improve a Machine You Can't Read the Output Of

The one rule that would fix most blown software roadmaps - and almost no one follows it.


The 34-Hour Mistake Nobody Caught

I was on a coaching call recently with a founder building a chatbot SaaS. Smart guy. He had a dev, a Trello board, four tasks queued up, and a very real concern about burning cash on the wrong thing. He was doing everything right on the surface.

Then we opened the Trello board and looked at the prioritization.

At the top of the queue: scraping improvements. The task was to take the existing web scraper - the thing that pulls content from a user's site and feeds it into the chatbot - and make it faster, more accurate, and more reliable. The dev's estimate? 34 hours. Call it a full work week.

And here's where it got interesting. Because also on the board - further down, not yet started, not yet approved - was the feature that would let users see what the chatbot actually says with that data. The training preview. The interface where a user types a question, gets the chatbot's answer in real time, and then decides whether that answer is right or wrong.

So the plan, as it stood, was: spend 34 hours improving the quality of the food going into the machine, before building the screen that tells you whether the machine is producing good output.

I stopped the conversation right there.

The Rule

Here's the principle, stated plainly: you should never improve a system's inputs before you can read its outputs.

That's it. That's the whole thing. But let me show you why it matters, because most technical founders violate this constantly and wonder why their roadmap feels like running in quicksand.

If you improve the scraper - better link validation, deeper crawl, smarter deduplication - and you haven't yet built the interface that shows you what the chatbot says in response to a question, you have no idea whether any of that improvement mattered. You produced more, faster. And you have zero signal on whether any of it was good.

It's like tuning a race car engine when your speedometer is broken. You might be going 200 miles an hour. You might be going 40. You genuinely cannot tell. And every hour you spend on the engine, without fixing the speedometer, is an hour spent optimizing in the dark.

On this call, the founder had already been building this product for seven or eight months. Multiple developers. Multiple rounds of rework. And they'd already had the training preview feature working at one point - it was demonstrated, it even sold - and then somehow it stopped being the priority and fell off the radar. The very feature that told you whether the product was working.

How This Happens

It's not stupidity. It's a pattern I see over and over, and it has a specific shape.

You have a working system. Something breaks, or something feels slow, or a user complains about accuracy. So you go fix accuracy. That feels productive. It generates tasks with clear technical definitions. The dev knows what to do. You ship it. You feel good.

Meanwhile, the quality control layer - the thing that would tell you objectively whether accuracy actually improved, and by how much, and in response to what kinds of inputs - that's always the next ticket. Always "we'll get to that." Always blocked on something else.

What you end up with is a system that has been improved repeatedly, on the input side, with no measurement on the output side. Your scraper is thorough. Your database is clean. Your pipeline is fast. And you don't know if the chatbot it feeds tells customers the right price or hallucinates one.

That's not a hypothetical. On this specific call, the founder mentioned that without the training preview built, users had to install the chatbot on an external website just to test what it said. Think about what that means operationally. Every time they wanted to check whether the chatbot was trained correctly, they had to go through a full installation flow on a third-party site. Which means in practice, they weren't checking. Which means the scraper improvements they were about to spend 34 hours on? No one would know if they actually helped.


What We Did Instead

The fix was simple once we named the problem. We moved the AI training preview feature to the top of the board. First task, starting immediately. 24-hour estimate, not 34, because most of the UI already existed - modals, input fields, output rendering - it just needed to be wired correctly and moved to the right screen.

The logic was clean: once users can type a question and see exactly how the chatbot responds, you now have a feedback loop. You can see: is it answering correctly? Is the training data good? Is the initial message right? Every subsequent improvement - to the scraper, to the crawl depth, to the validation logic - can now be evaluated against real output. You have a measurement layer. Now you can actually improve things.
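To make that concrete, here's a rough sketch of what a measurement loop can look like in code. None of this is the founder's actual product - the question set, the expected phrases, and the ask_chatbot call are all placeholders you'd swap for your own - but the shape is the point: a fixed set of test questions, a pass/fail check, and a number you can compare before and after any change to the input side.

```python
# A rough sketch of an output-measurement loop for a chatbot build.
# Everything here is a placeholder: ask_chatbot() stands in for whatever
# endpoint or SDK your product exposes, and the test questions and
# expected phrases are things you'd write for your own domain.

TEST_QUESTIONS = [
    {"question": "What does the Pro plan cost?", "expected": "$49"},
    {"question": "Do you offer a free trial?", "expected": "14-day"},
    {"question": "How do I cancel my account?", "expected": "billing settings"},
]


def ask_chatbot(question: str) -> str:
    """Swap this out for a real call to your chatbot.

    It returns a canned string here just so the script runs end to end.
    """
    return "The Pro plan is $49/month and includes a 14-day free trial."


def score_run(label: str) -> float:
    """Ask every test question, check each answer, and print the accuracy."""
    correct = 0
    for case in TEST_QUESTIONS:
        answer = ask_chatbot(case["question"])
        passed = case["expected"].lower() in answer.lower()
        correct += passed
        print(f"[{label}] {'PASS' if passed else 'FAIL'}: {case['question']}")
    accuracy = correct / len(TEST_QUESTIONS)
    print(f"[{label}] accuracy: {accuracy:.0%}")
    return accuracy


if __name__ == "__main__":
    # Run once before the input-side work ships, once after, and compare.
    score_run("before-scraper-change")
```

Run it before the scraper work and again after. If the number doesn't move, you just learned something that would otherwise have cost a full sprint to discover.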

The scraping task didn't disappear. It moved to second. It made sense to do it second, because by the time it shipped, there would be a screen to confirm whether it had any effect.

The sequencing is the whole point. Same tasks. Different order. The second order is the one where your dev hours compound instead of vanish.

This Goes Way Beyond SaaS Roadmaps

I want to be direct about something: this rule isn't just a software development principle. It's a general operating rule for any system you're trying to improve.

I've watched agency owners spend weeks improving their cold email copy - better subject lines, better first lines, sharper CTAs - when they didn't have clean reply tracking set up. They didn't know which sequences were getting opens. They didn't know if replies were coming from the follow-up or the original send. They were improving the input (the email) without being able to read the output (which message, which step, caused the reply).

If you're building a lead generation system right now, the same logic applies. Before you optimize your scraping workflow, your Apollo filters, or your targeting - you need to be able to measure what's happening downstream. Are those leads converting to replies? Are those replies converting to calls? If you can't read the output, you're flying blind. Tools like ScraperCity's B2B database or an email finder give you better inputs - but they only compound your results if you're measuring what happens after the send. Check out my best lead strategy guide for how to set that measurement layer up before you scale volume.

Same thing in a sales team. I see people hire more SDRs, buy more data, invest in better sequences - all input improvements - when the CRM isn't tracking which stage deals are dying at. You can't fix a sales process you can't observe. Put the measurement in first. Then improve.
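If "which stage deals are dying at" sounds abstract, the report you're missing is genuinely small. Here's a rough sketch - the stage names and deal records are made up, and in practice this data comes from a CRM export rather than a hardcoded list - but this is the whole report:

```python
# A rough sketch of the "which stage are deals dying at" report.
# The stage names and the deals list are made up; in practice this comes
# out of a CRM export, not a hardcoded list.

from collections import Counter

STAGES = ["new", "contacted", "demo_booked", "proposal_sent", "closed_won"]

# One entry per closed-out deal: the last stage it reached.
last_stage = [
    "contacted", "contacted", "demo_booked", "proposal_sent",
    "closed_won", "contacted", "new", "demo_booked", "closed_won",
]

died_at = Counter(stage for stage in last_stage if stage != "closed_won")

for stage in STAGES[:-1]:
    print(f"{stage:15s} deals lost here: {died_at.get(stage, 0)}")
print(f"won: {last_stage.count('closed_won')} of {len(last_stage)}")
```

That's the measurement layer. Until it exists, another SDR or another data subscription is just a more expensive version of the same blind spot.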

The Real Cost of Getting This Backwards

On this call, when we added up the scraping estimate, the conversation landed on roughly 34 hours of dev time that would have been spent with no feedback loop. At the rates involved, that's not a trivial number. But the real cost isn't the money. It's what happens after.

You ship the scraping improvements. Customers still complain the chatbot gives wrong answers. You don't know why, because you still can't see what the chatbot says without installing it externally. So you schedule another round of improvements. More dev time. More estimates. Another sprint with no measurement attached. And now you're eight months in with a second developer wondering why the product doesn't feel solid yet.

That's the spiral I've described before in my own businesses - have an idea, spend money on it, can't measure it, spend more money, repeat. The only way to break it is to refuse to add inputs until you can read outputs. Every single time, without exception.

The founder on this call understood it immediately once we framed it that way. His instinct had been right - 34 hours felt wrong - he just didn't have the language for why. Once you name the rule, it becomes a forcing function. Before every new task, you ask one question: can I measure whether this change made things better or worse? If the answer is no, that task waits. You build the measurement first.


How to Apply This on Your Roadmap Right Now

Go look at your current task list. Whatever you're building - a SaaS, an agency delivery system, a cold email workflow - find the highest-effort item on your to-do list right now. Then ask: what is the observable output of this system, and can I currently read it?

If you can't read the output - if there's no screen, no dashboard, no report, no tracking pixel, no reply rate number - that item goes to the bottom. You build the output layer first.

For software: it's the preview screen, the test environment, the logging dashboard, the user-facing feedback interface. Whatever tells you whether the thing is working.

For cold email: it's proper tracking in your sending tool - Smartlead or Instantly both do this well - linked to a CRM like Close so you can see exactly where in the sequence deals are progressing or dying. If you don't have that wired up, you don't know which copy change moved the needle. Grab my top 5 cold email scripts if you need a starting point, but track the output of every one of them before you start tweaking.
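If you're not sure what "wired up" looks like in practice, the output you need is a tiny table: replies broken out by campaign and sequence step. Here's a rough sketch that assumes you can export one row per reply to a CSV - the filename and column names are placeholders for whatever your sending tool actually exports:

```python
# A minimal sketch of the cold email output layer: replies broken out by
# campaign and sequence step. Assumes you can export one row per reply to
# a CSV; the filename and column names ("campaign", "sequence_step") are
# placeholders for whatever your sending tool actually calls them.

import csv
from collections import Counter

replies_by_step = Counter()

with open("replies_export.csv", newline="") as f:
    for row in csv.DictReader(f):
        replies_by_step[(row["campaign"], row["sequence_step"])] += 1

for (campaign, step), count in sorted(replies_by_step.items()):
    print(f"{campaign} / step {step}: {count} replies")
```

If you can't produce that table from your current setup, fix that before you spend another week rewriting subject lines.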

For a team or an agency: it's the reporting layer. Weekly numbers. Deliverable tracking. Client satisfaction metrics. You don't manage what you can't measure. If a client complains and you can't pull up exactly what was delivered in the last 30 days, you're operating without an output layer.

The question to ask before every sprint, every project kickoff, every "let's improve X" conversation is this: what does success look like in a number I can read, and do I have the infrastructure to read it?

If the answer is no, build the infrastructure first. Always. No exceptions.

Sequencing Is Strategy

The founder on this call didn't have a bad dev or a bad product. He had a sequencing problem. And sequencing problems are invisible until you name them, because each individual task looks reasonable in isolation. Of course you'd want better scraping. Of course you'd want more accurate data ingestion. Those are obviously good things.

The issue is never the task. The issue is the order.

Get the output layer live first. Then pour resources into the input side. That's the compounding version of this. That's how you build a software roadmap - or a sales system, or an agency operation - where every sprint produces measurable improvement instead of motion that disappears into the void.

Measure before you optimize. Ship the speedometer before you tune the engine. It sounds obvious in retrospect. It almost never happens in practice.

If you want to think through sequencing like this for your own build - whether it's a SaaS product, an agency, or an outbound system - that's exactly what we do inside Galadon Gold. We look at what you're actually building, find the spots where you're improving inputs with no output measurement, and fix the order. Sometimes a 30-minute conversation saves you a full sprint of wasted dev hours. Come find out.
