Our AI Answers to Questions Clients Ask

When businesses begin exploring AI, the conversations are rarely about updated algorithms or the latest research papers. More often, they start with confusion, a little hype, and a lot of questions.

This blog is a summary of a recent podcast conversation between Julie, Byron, and Ryan from Rōnin Consulting, where they provide AI answers to the most common questions they hear from clients—everything from deployment timelines and infrastructure to token costs and model selection.

If you’d prefer to watch or listen to the whole discussion, you can check out the episode here: Watch on YouTube

“How fast can we deploy and see value from AI?”

“‘Deploy AI’ is a little bit of a loaded question,” Ryan said right off the bat. And he’s right.

For some clients, “deploy” means spinning up a chatbot using ChatGPT. For others, it means integrating generative AI into an internal workflow tool or running a fine-tuned model in a secure, on-prem environment. Each use case has a different timeline.

“You can use a commercially supported model really quickly,” he explained. “Within a day, you can be talking to that employee,” meaning an AI-powered assistant or chatbot. However, more complex integrations, especially those that involve internal systems or sensitive data, may take weeks to months, depending on the level of customization required.
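
To make that fast path concrete, here is a minimal sketch of the one-day scenario: wiring a chat call to a vendor-hosted model. It assumes the openai Python SDK and an API key in your environment; the model name and prompts are illustrative placeholders, not a Rōnin recommendation.

```python
# Minimal sketch: an "AI employee" backed by a vendor-hosted model.
# Assumes `pip install openai` and OPENAI_API_KEY set in the environment.
from openai import OpenAI

client = OpenAI()  # picks up OPENAI_API_KEY automatically

response = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative model choice
    messages=[
        {"role": "system", "content": "You are an internal helpdesk assistant."},
        {"role": "user", "content": "How do I request a new laptop?"},
    ],
)
print(response.choices[0].message.content)
```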

“What’s the difference between public, private, and open-source models?”

If this question confuses you, you’re not alone.

Julie joked, “Nobody knows the difference between public and commercial AI. It’s like the same thing. Nobody calls it commercial AI while looking for it in Google, except you guys.”

She’s not wrong, but regardless of how it’s “named,” the real difference comes down to where the model is hosted and who controls it.

Ryan clarified: “Are you hosting it yourself, on-prem with an open-source model, or using a vendor’s model running in their infrastructure?”

Open-source models (like Llama or Mistral) can be downloaded and hosted privately in a cloud or local environment, making them technically “private” even though the underlying code is public. Meanwhile, vendor-hosted services like ChatGPT or Claude run in someone else’s infrastructure and may have more stringent data control policies, but also less flexibility.

“Yes, an open-source model can be private,” Byron confirmed. “If you take that model and host it in your cloud, it’s considered ‘private,’ even though it’s open-source.”
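
One way to picture that distinction: the client code can look identical, and only the endpoint changes. The sketch below assumes an open-source model served on your own infrastructure behind an OpenAI-compatible API (the kind of interface tools like vLLM or Ollama expose); the URL and model name are placeholders.

```python
# Same client code as before, but pointed at a privately hosted
# open-source model instead of a vendor's infrastructure.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",  # your server, your network
    api_key="unused-locally",             # many self-hosted servers ignore this
)

response = client.chat.completions.create(
    model="meta-llama/Llama-3.1-8B-Instruct",  # whichever model you host
    messages=[{"role": "user", "content": "Summarize our Q3 incident report."}],
)
print(response.choices[0].message.content)
```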

“Will our data be seen or used to train a model?”

This concern almost always comes up, and it’s valid.

Julie noted that most clients who request “private” AI aren’t seeking ownership of the model architecture. They’re worried about the questions their employees are sending, or the IP contained in those prompts. They want to know: Can someone else see this?

“Enterprise-licensed models from vendors like OpenAI, contractually, they’re not training on your data,” Byron explained. “They’re in legal jail. They can’t.”

Still, Byron pointed out a larger issue: “It’s not that you think OpenAI will break the rules. It’s that governments or courts might force them to.” Citing recent legal cases, he raised the concern that even legally protected data could become vulnerable to future policy changes, and it’s something we always consider.

“How much does AI actually cost?”

Cost is always variable and depends on your task, model, and data volume. For simpler use cases, such as summarizing documents, the costs are minimal. “We estimated about $2,000 a year using a mini OpenAI model,” Byron said. “And, all things considered, it’s super cheap.”

However, for other cases, such as processing thousands of medical records or performing advanced inference, the costs can rise rapidly. “We’ve had models that cost $60 an hour to run,” he added, “and it’s a factor that we can’t overlook because our clients come in and ask about token costs all the time.”

When these clients come in with specific cost questions, “there’s always a discovery phase,” said Ryan, “where we figure out how much data they’re sending, how complex the task is, and which model is ‘good enough.’”
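
That discovery math is usually simple arithmetic once the volumes are known. Here is a back-of-envelope sketch, where every figure is a made-up assumption rather than a real rate card:

```python
# Back-of-envelope token cost estimate for a summarization workload.
# Every number below is an illustrative assumption; substitute real
# volumes and your vendor's current per-token pricing.
requests_per_day = 500
input_tokens = 1_500           # average prompt size per request
output_tokens = 300            # average completion size per request
price_in = 0.15 / 1_000_000    # $ per input token (example "mini"-tier rate)
price_out = 0.60 / 1_000_000   # $ per output token

daily = requests_per_day * (input_tokens * price_in + output_tokens * price_out)
print(f"~${daily:.2f}/day, ~${daily * 365:,.2f}/year")
```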

“Should we host on-prem or in the cloud?”

In most cases, the answer is the cloud.

“The majority of our clients are in the cloud,” Ryan said. “Some of them have both, on-prem and cloud, but cloud is still the default.”

Why? Because it’s cheaper, faster, and doesn’t require companies to manage hardware or maintenance. “We typically don’t steer them toward on-prem solutions,” Ryan explained. “Commercially hosted is almost always faster and more cost-effective, unless there’s a unique reason to go elsewhere.”

But for clients with sensitive data or extremely high-volume usage, on-prem can make sense.

“There’s an inflection point where it becomes cheaper to buy and run your own hardware,” Byron noted, “especially if you’re already sending an absurd amount of data to your model.”
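
That inflection point is also just arithmetic: fixed hardware cost versus per-token API pricing. A sketch with purely hypothetical numbers:

```python
# Where does owning hardware start to beat per-token API pricing?
# Both figures are hypothetical placeholders for a real capacity plan.
api_price_per_m_tokens = 2.50    # blended $ per million tokens via the API
hardware_monthly_cost = 4_000.0  # amortized GPU server, power, and ops

breakeven_m_tokens = hardware_monthly_cost / api_price_per_m_tokens
print(f"Break-even: ~{breakeven_m_tokens:,.0f}M tokens per month")
# Below that volume the hosted API wins; above it, on-prem starts to pay off.
```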

“How do I choose the right AI model?”

Choosing a model isn’t about finding the biggest or most powerful; it’s about finding the one that’s good enough for your task.

“Sometimes you just have to do a big POC,” Byron said, referring to a proof of concept. “We’ve tried every OpenAI model, Claude, Mistral, you name it. Then we use defined test data to compare the results.”

Rōnin typically takes an agile approach: “We start with a problem statement, identify the desired output, and back into the data sources,” Ryan explained. “Then we start testing models with a hundred examples to see what works. The best one that fits what the client needs is what we use.”
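
In code, that POC loop can be as plain as the sketch below. The ask_model and score functions here are hypothetical stand-ins for real API calls and whatever evaluation criteria the problem statement defines:

```python
# Skeleton of a model-selection POC: one defined test set, several
# candidate models, one score per model.
test_cases = [
    {"prompt": "Extract the invoice total from: ...", "expected": "$1,240.00"},
    # ...roughly a hundred examples in a real POC
]
candidates = ["gpt-4o-mini", "claude-3-5-haiku", "mistral-small"]

def ask_model(model: str, prompt: str) -> str:
    # Stub: replace with the real API call for each candidate model.
    return f"[placeholder answer from {model}]"

def score(answer: str, expected: str) -> float:
    return 1.0 if expected in answer else 0.0  # simplest possible check

for model in candidates:
    hits = [score(ask_model(model, c["prompt"]), c["expected"]) for c in test_cases]
    print(f"{model}: {sum(hits) / len(hits):.0%} of test cases passed")
```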

“Do I need a reasoning model?”

Reasoning models, such as GPT-4 or Claude Opus, are designed to handle complex tasks that involve logic, structured output, and multi-step instructions. They’re ideal when you’re transforming data, applying business rules, or working within strict formatting constraints.

“Summarization doesn’t need a reasoning model,” Byron said. “But if you’re doing data transformation, with lots of instructions or formatting, then reasoning helps.”

These models are also more auditable. “Newer models can show their thought process,” he added. “Which helps with auditing and trust.” That’s especially important for use cases in regulated industries, such as finance or healthcare, where understanding how a model arrived at its answer is just as crucial as the answer itself.

But that extra capability often comes with a higher price tag.

“Costs can be tougher to track,” Ryan noted. “When they go into the reasoning step, they’re using more tokens, so it’s harder to estimate, and it can get expensive fast.”

So, do you need a reasoning model? That depends on the complexity of the task, your cost tolerance, and whether explainability is a requirement. For many use cases, a smaller, cheaper model may be more than enough.

“What’s the deal with AI agents?”

There’s a lot of buzz around agents and just as much confusion. AI agents are often marketed as futuristic co-workers: autonomous, intelligent systems that can complete complex tasks, make decisions, and even learn over time. But does every business need one?

“An agent doesn’t have to be a reasoning model,” Ryan explained. “It can be a composite of many models with tools and memory and the ability to interact with its environment.” That means it could use a combination of large language models, plug-ins, APIs, and internal tools to complete tasks, often without human input.
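
A stripped-down sketch of that composite idea: a loop, a tool registry, and something that decides which tool to call next. Here the deciding function is a hardcoded stub standing in for an LLM call, and the tools are toy examples; the loop structure is the part being illustrated.

```python
# Minimal agent skeleton: tools + memory + a decide-act loop.
from typing import Callable

def search_docs(query: str) -> str:
    return f"Top internal doc for '{query}'"  # stand-in for a real search tool

def file_ticket(summary: str) -> str:
    return f"Ticket filed: {summary}"         # stand-in for a real side effect

TOOLS: dict[str, Callable[[str], str]] = {
    "search_docs": search_docs,
    "file_ticket": file_ticket,
}

def decide(memory: list[str]) -> tuple[str, str]:
    # Stub: a real agent would ask a model which tool to use next,
    # given its memory of prior steps. Hardcoded so the sketch runs.
    if not memory:
        return "search_docs", "laptop request process"
    if len(memory) == 1:
        return "file_ticket", "New laptop request for onboarding hire"
    return "stop", ""

memory: list[str] = []
while True:
    tool, arg = decide(memory)
    if tool == "stop":
        break
    memory.append(f"{tool}({arg!r}) -> {TOOLS[tool](arg)}")

print("\n".join(memory))
```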

What clients often picture is an all-knowing, fully autonomous AI that can run an entire business unit. However, in reality, the most successful AI agents typically focus on something smaller and more strategic, such as completing a specific task within a business workflow.

Julie put it plainly: “So people come in thinking they need a complete overhaul… but really, we can just tweak a couple things?”

“Exactly,” Byron said. “You don’t need a giant workflow engine. Just identify the high-touch areas in your process. Small wins add up fast.”

Rōnin has built internal proof-of-concept agents that can launch browsers, scrape data, download files, and even install software on their own desktops to complete tasks, like compiling cat images into PDFs, just for fun. That kind of environment interaction is powerful, but not every business needs it.

As Byron put it: “Most companies don’t need a big, cracked-out, autonomous workflow engine. They just need smart automation in the right places.” In those cases, a simpler, task-based agent, or even just a well-designed prompt, can offer a much higher return on investment with less complexity.

You Don’t Need Everything, You Just Need What Works

The common thread in all of these questions?

Clients often think they need more than they do.

As Byron commented, “It reminds me of when the cloud first came out. People thought they needed Kubernetes and super complex environments, but in the end, it was something simpler that delivered the most ROI.”

The same is true with AI.

You don’t need to start with agents, on-prem models, or reasoning trees. You need the right tools for the job and a team that can help you distinguish between the hype and what truly matters. If you’re exploring AI, need more AI answers, or want help figuring out where to start, reach out to us today. We’re happy to discuss it further.

Author:
Julie Simpson is the Marketing Manager at Rōnin Consulting. Before joining the team, her software development knowledge was practically non-existent. However, after countless internal meetings, soaking up information, and engaging in endless Teams chats with the Rōnin crew, Julie has transformed into a bona fide technology geek. Nowadays, she dreams about AI, laughs at dev jokes, and frequently messages the team with warnings about the eventual rise of Skynet.