Your AI Strategy Is a Data Strategy Wearing a Costume
AI·Jan 14, 2026·5 min read


Most companies think they have an AI problem. They don't. They have a data problem - messy pipelines, no context, garbage semantics - and they're throwing models at it hoping nobody notices.

Michael Zigelboim·Co-Founder & CTO

Every board meeting in 2025 had the same slide. You know the one. "AI Strategy" in bold, a diagram with arrows pointing to a brain icon, and something about "leveraging large language models to drive operational efficiency."

Cool slide. Means nothing.

Here's what actually happened at most of those companies: someone fine-tuned a model on garbage data, got garbage results, and then concluded that "AI isn't ready for our use case."

AI was ready. Your data wasn't.

The Costume

I talk to engineering leaders all the time. The conversation usually goes like this:

"We're building an AI-powered system for X."

Great. What does your data look like?

"...we have logs."

You have logs. Unstructured, half missing context, the other half duplicated across systems that can't agree what a "service" is. You have logs the same way I have a gym membership - technically yes, functionally no.

The AI strategy was never the hard part. The hard part was always the boring stuff underneath it. Schema consistency. Data lineage. Semantic meaning. The unsexy plumbing that nobody wants to build because it doesn't demo well.

But here we are. Buying GPUs and hiring ML engineers before we even know what our own data means.

Semantics Are Important

I say this a lot and people think it's a throwaway line. It's not.

When your monitoring system says "error" - what does that mean? Is it a 500? A timeout? A failed health check? A misconfigured alert that fires every Tuesday at 3am because someone hardcoded a threshold in 2021 and left the company?

These are not the same thing. But to your AI model, they're all just the word "error" in a log line.

This is the core problem. AI models are pattern matchers. Really, really good pattern matchers. But patterns without meaning are just noise with structure.

If you feed a model logs where "deployment failed" means five different things depending on which team wrote the log - you're not training AI. You're training a very expensive random number generator.

The fix isn't a better model. It's knowing what your data actually means before you throw it at one.
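Concretely, "knowing what your data means" can start as embarrassingly simple as explicit rules that split one overloaded word into distinct event types before anything reaches a model. A minimal sketch, where the categories and matching rules are illustrative assumptions, not a real taxonomy:

```python
# Hypothetical sketch: disambiguating the word "error" before it reaches a model.
# Field names and rules are assumptions for illustration.

SEMANTIC_RULES = [
    ("http_5xx",     lambda e: e.get("status", 0) >= 500),
    ("timeout",      lambda e: e.get("timed_out", False)),
    ("health_check", lambda e: e.get("source") == "healthcheck"),
]

def classify(event: dict) -> str:
    """Map a raw 'error' event to a semantically distinct category."""
    for label, rule in SEMANTIC_RULES:
        if rule(event):
            return label
    return "unknown_error"  # still better than lumping everything under "error"
```

Three lines of rules won't win any papers, but a model trained on `http_5xx` vs. `timeout` vs. `health_check` is learning patterns that mean something.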

What I've Actually Seen Work

I've been building in this space for a while now. Here's the unsexy truth about what separates companies that get value from AI from companies that get a demo:

The teams that actually get value from AI aren't doing 18-month data governance initiatives. They're asking basic questions first: what's a service? What's an incident? What does "healthy" mean? And then they write it down. Sounds dumb. You'd be shocked how many teams can't answer these consistently.
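"Write it down" can literally mean a schema the whole org agrees on. A sketch of what that might look like; the field names and the health thresholds are assumptions, not a standard:

```python
from dataclasses import dataclass
from enum import Enum

class Health(Enum):
    HEALTHY = "healthy"
    DEGRADED = "degraded"
    DOWN = "down"

@dataclass(frozen=True)
class Service:
    name: str
    owning_team: str  # forces the "who owns it?" question to have an answer

def health(error_rate: float) -> Health:
    """One shared definition of 'healthy', instead of five per-team ones."""
    # Illustrative thresholds; the point is they're written down once.
    if error_rate < 0.01:
        return Health.HEALTHY
    if error_rate < 0.05:
        return Health.DEGRADED
    return Health.DOWN
```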

They also invest in context, not just collection. Collecting data is easy. Every tool does it. But when an anomaly fires, do you know why it matters? Which service is affected? Who owns it? What changed in the last deploy? That context layer is where AI goes from "interesting demo" to "actually useful." It's also the part most teams skip entirely.
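A context layer doesn't have to be exotic. A sketch of the idea, assuming hypothetical lookup tables for ownership and recent deploys:

```python
# Hypothetical sketch of a context layer: when an anomaly fires, attach who
# owns the service and what changed last. The lookup tables are assumptions.

OWNERS = {"checkout": "payments-team", "search": "discovery-team"}
LAST_DEPLOY = {"checkout": {"sha": "a1b2c3", "minutes_ago": 12}}

def enrich(anomaly: dict) -> dict:
    service = anomaly["service"]
    return {
        **anomaly,
        "owner": OWNERS.get(service, "unknown"),
        "last_deploy": LAST_DEPLOY.get(service),  # None if nothing recent
    }

alert = enrich({"service": "checkout", "metric": "error_rate", "value": 0.07})
```

The enriched alert answers "who owns it?" and "what changed?", not just "what fired?" That's the difference between a page someone can act on and a page someone snoozes.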

And the teams that ship? They pick one problem, not a platform. "Build an AI platform for operations" dies in committee. "Use AI to reduce alert noise by correlating related signals" ships in a quarter. One sounds impressive. The other works.
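To make the one-problem version concrete: correlating related signals can start as grouping alerts that hit the same service within a short window. A minimal sketch, where the field names and the 120-second window are assumptions:

```python
from collections import defaultdict

def correlate(alerts: list[dict], window_s: int = 120) -> list[list[dict]]:
    """Bucket alerts by service, then merge ones that fire close together."""
    by_service = defaultdict(list)
    for a in sorted(alerts, key=lambda a: a["ts"]):
        by_service[a["service"]].append(a)

    groups = []
    for service_alerts in by_service.values():
        current = [service_alerts[0]]
        for a in service_alerts[1:]:
            if a["ts"] - current[-1]["ts"] <= window_s:
                current.append(a)       # same burst: one incident, one page
            else:
                groups.append(current)  # gap: start a new group
                current = [a]
        groups.append(current)
    return groups
```

Naive? Completely. But it's shippable in a quarter, measurable ("pages per incident went down"), and it forces you to clean up the `service` field first, which was the real work anyway.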

The Model Is the Easy Part

I know this sounds backwards. Models are complex. Training is expensive. Inference at scale is hard.

But compared to getting your data right? The model is a weekend project.

I've seen a team spend four months tuning a model to detect deployment anomalies. Weeks of hyperparameter tweaking bought them a 2% accuracy improvement. Then someone added three fields to their deployment events - the git diff size, the deployer, and whether it was a rollback - and accuracy jumped 15% overnight. No model changes. Just better data.
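The "three fields" fix is just feature enrichment upstream of the model. A sketch of what that change might look like; the field names are assumptions based on the anecdote, not a real schema:

```python
# Hypothetical sketch: attach the three extra signals to each deployment
# event before it reaches the anomaly model. Names are illustrative.

def enrich_deploy(event: dict, diff_stats: dict) -> dict:
    return {
        **event,
        "diff_size": diff_stats["lines_changed"],  # big diffs, bigger blast radius
        "deployer": diff_stats["author"],
        "is_rollback": event.get("kind") == "rollback",
    }
```

A one-function pipeline change, versus months of hyperparameter search. That asymmetry is the whole argument of this section.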

Getting your data right means knowing what you actually have (not what you think you have), making it consistent across systems that were never designed to talk to each other, and adding enough semantic context that a machine can tell the difference between "this is fine" and "this is on fire." And doing it continuously, not once.

That's not an AI problem. That's a data engineering problem. And it's the one nobody wants to put on a slide.

The Honest Version of That Board Slide

If companies were honest, the slide would say:

"Data Strategy: Phase 1 - Figure out what we actually have. Phase 2 - Make it make sense. Phase 3 - Now we can use AI."

Phase 3 is the shortest phase. Phases 1 and 2 are where you live for a year.

Not as catchy. But at least it's real.

So What

If you're a CTO staring at an "AI strategy" doc right now, here's what I'd actually do:

Pick one problem. One. Not "transform operations with AI." Something specific like "reduce time to root cause during incidents."

Then ask: do we have the data to solve this? Not "do we have data" - you have data, everyone has data. Do you have the right data, with the right context, in a format that a model can actually learn from?

If the answer is no - congratulations, you just found your real project. It's not as exciting as telling the board you're "doing AI." But it's the project that makes everything after it actually work.

The costume is fun. But eventually someone's going to ask what's underneath it.

And you better have an answer that isn't just "logs."