AI Feels Easier Than Ever, But Is It Really?

Author:Murphy | View: 24361 | Time: 2025-03-22 20:04:25

A few days ago, I was speaking at an event about how to move from using ChatGPT at a personal level to implementing AI-powered technical solutions for teams and companies. We covered everything from prompt engineering and fine-tuning to agents and function calling. One of the questions from the audience stood up to me, even though it was one I should have expected: "How long does it take to get an AI-powered feature into production?"

The 4 new trendy AI concepts, and their potential in digital products

In many ways, integrating AI into features can be incredibly easy. With recent progress, leveraging a state-of-the-art LLM can be as simple as making an API call. The entry barriers to use and integrate AI are now really low. There is a big but though. Getting an AI feature into production while accounting for all risks linked with this new technology can be a real challenge.

And that's the paradox: AI feels easier and more accessible than ever, but its open-ended (free input / free output setup) and probabilistic nature also makes it very hard to maintain under control.

In this post, we'll explore four major challenges that make building AI products complicated. By considering these factors, you'll be better equipped to launch valuable, while safe and reliable AI-driven products.

Prediction Feasibility

Let's start with the basics: AI is about making predictions. So, before jumping into AI integration, ask yourself:

Do I actually need to predict something to solve the problem?
If yes, what exactly needs to be predicted?
Does it make sense to predict it?
Do I have the right data to support this prediction?
Could an intern perform the task I'm asking AI to do, given the right data and context?

I've seen countless AI-powered products that sound impressive at first, but when you challenge their predictive assumptions, it's unclear why AI is necessary. One wild example is a dating app claiming to use AI to find your perfect match based only on a picture of your face. So what exactly is being predicted here? And why would a face be sufficient for such a complex task?

While this dating app example might seem extreme, you'll likely encounter relatable situations where AI isn't the right solution. Often, what people are asking for sounds more like magic than like AI. In other cases, simpler, rule-based systems could solve the problem more efficiently, or the required data simply isn't available to make accurate predictions. Before deciding to use AI, challenge whether AI and predictions is truly what you need.

Cost of Being Wrong

With AI, you can be wrong. Picture by Mikka Luotio on Unsplash

When you incorporate AI into your solutions, you move from a deterministic world (where the same input produces the same output) to a probabilistic one. AI models, by nature, introduce uncertainty and probabilities. Hallucinations in LLMs are a known issue, but it's more than just that. Small variations in inputs, parameters, or vendor's upgrades can lead to inconsistent outputs. A striking paper on this topic found that positive-thinking prompts improves LLMs performance, and that the most proficient mathematical reasoning can be achieved by referencing Star Trek in the prompt.

This is yet another example of how unpredictable and inconsistent AI responses can be, emphasizing the importance of accounting for variability when integrating AI into products. To be certain you can cope with this variability and uncertainty, ask yourself:

Can my predictions be wrong?
How wrong can they be before it becomes a problem?
Do I need consistent, repeatable results?

Different use cases have varying tolerance for error and uncertainty. For instance, generating a draft document with AI doesn't require perfection – even if it has hallucination risks, the user will be able to edit and correct it. Fraud detection models, on the other hand, need to be mostly right but can handle a small margin of error. However, in high-stakes applications like facial recognition used by law enforcement, the cost of being wrong can be simply unacceptable.

If there's some tolerance for error and uncertainty, several strategies can help mitigate these risks. One essential step is testing the system's outputs across multiple inputs, and assessing its robustness against small changes such as typos or slight variations. Advanced prompting techniques, like chain-of-thought, have proven to reduce erratic responses, while retrieval-augmented generation (RAG) can help force the model to consider the right context when producing a response. Additionally, implementing guardrails can be useful to manage edge cases and decrease the likelihood of undesirable outputs.

3. Societal Impact

Technical solutions inherently impact society, and the addition of AI amplifies the ethical risks involved. We are talking about risks like bias and discrimination, privacy violations, erosion of human autonomy, misinformation… While existing EU regulations such as GDPR or non-discrimination directives have a weak relationship with AI's potential harms, the novel AI Act aims to regulate AI based on the societal risk each use case presents.

To ensure a responsible use of AI and a future compliance with regulations, these are some inital questions that can be useful:

Is it ethically correct to solve this problem using AI?
Is the data I need appropriate and ethically obtained?
Are there risks of bias or discrimination?
How transparent, explainable and actionable does the system need to be?
To what degree can I automate this process or need human in the loop interventions?
What other societal impacts could arise (e.g., environmental footprint)?

Some controversial examples on this are models that predict user demographics from the name alone (why would you need to predict this data if the user hasn't shared it?), emotion AI capabilities in personal devices (inferring emotions is now a prohibited practice by AI Act in certain contexts, as it is considered an invasion to private mental health and freedom of thought, and has the risk to be used to manipulate people), or image generation models that reinforce harmful stereotypes (defaulting to generating doctors as white males, criminals as black men, and beautiful people as young naked women).

Fortunately, the field of trustworthy AI and AI governance is well-researched, providing frameworks and techniques to help mitigate these risks. Comprehensive documentation – covering datasets, models, and systems – is essential to ensure a responsible and governed AI life cycle. Frameworks that advocate testing from diverse perspectives and mapping stakeholders to uncover potential misuse and harm are also really valuable. Incorporating transparency by design, explainability, and human-in-the-loop mechanisms are critical design choices to consider as well.

There are also actions that can be taken at the team or company level. Diverse teams have proven to implement tech solutions more responsibly, as they incorporate a wider range of perspectives and consider the needs of different population segments. While AI solutions are primarily designed and built by technical teams, multidisciplinary collaboration and working closely with other departments- especially legal and privacy- can also help go in the direction of responsible and compliant implementations.

4. Production Reality

Your solution will need to operate well in production. Picture by Ant Rozetsky on Unsplash

Even when using AI and predictions make sense and are feasible for your usecase, you control output variability and uncertainty at least to a certain degree, and have managed to minimize ethical risks, there is still one more challenge to deal with: production reality. Taking AI solutions into production is often harder than expected, and it can be full of surprises.

Consider how your solution will operate in production, accounting for the technical requirements as well as the potential expected and unexpected usage and misuse:

Is the infrastructure in place to support a scalable deployment?
What technical requirements will I face (latency, throughput, etc.)
What will be the cost per call and total operational cost, and will it be possible to achieve a positive ROI?
How will I tackle security concerns?
How will I address privacy protection?

Integrating external API calls to huge models for text generation inevitably introduces latency into the process (responses can take seconds to complete, compared to fractions of a second for back-end driven tasks). It also adds a unit cost per call. For instance, if you plan to make an LLM call every time a user logs in, searches or views a product on your platform, each interaction now carries a cost based on the number of input and outputs tokens. Estimating this unit costs and projecting the total expected cost considering the traffic and number of calls you'll be making is crucial to avoid expensive surprises once the system is in production.

The free input / free output setup of LLMs also introduces significant security and privacy risks. Users can input anything from personal data to carefully thought attacks. Guardrails again are crucial to minimize the risk of accepting or storing certain types of user inputs. As mentioned before, multidisciplinary collaboration and being close to security and privacy teams can help establish strategies and frameworks to address these challenges.

Wrapping It Up

So, is production-ready AI as easy as it seems? The answer is both yes and no. While it's easier than ever to get started with AI, the challenges around prediction feasibility, cost of being wrong, societal impact, and production realities can quickly turn even the simplest solution into a complex one.

Don't let these challenges hold you back from building amazing AI products, but make sure you fully understand these challenges to tackle and manage them from the beginning. With the right considerations, you can deliver delightful while responsible AI products to the world!