Teaching AI When to Say ‘I Don’t Know’: Appier’s Risk-Aware Breakthrough

Appier’s new research tackles one of AI’s most frustrating problems: systems that confidently give wrong answers. Their risk-aware framework teaches AI when to refuse instead of guess—and it could be what finally unlocks enterprise adoption.

When Appier’s research team in Singapore published their latest paper this week, they weren’t just adding another technical report to the AI research pile. They were tackling one of the most frustrating problems facing businesses trying to adopt artificial intelligence: how do you trust an AI system that can’t tell you when it’s guessing?

Think about that for a moment. We’ve all experienced it – asking an AI assistant a question and getting a confident, detailed answer that turns out to be completely wrong. In casual conversation, it’s annoying. In a business context – where decisions about finances, healthcare, or critical operations are on the line – it’s a dealbreaker.

That’s exactly the problem Appier’s new research addresses. Their paper, published on March 10th, introduces what they’re calling a “risk-aware decision framework” for AI systems. In plain English? They’re teaching AI when to say “I don’t know” instead of making something up.

The “Guess Problem” That’s Holding Back Enterprise AI

Here’s the reality check that Appier’s research highlights. According to a McKinsey survey from last year, 62% of organizations have started experimenting with AI agents. That’s the good news. The bad news? Inaccuracy remains the single biggest concern stopping wider adoption.

It’s not that businesses don’t see the potential of AI. They absolutely do. The promise of AI agents that can handle customer service, analyze data, or manage workflows autonomously is incredibly compelling. But there’s a fundamental trust issue: how do you deploy systems that might confidently give wrong answers about important matters?

Appier’s CEO, Chih-Han Yu, put it bluntly: “For Agentic AI to operate in critical enterprise workflows, the key is not only making AI smarter, but making its autonomous decisions more reliable.”

That last word – “reliable” – is the key. We’re moving beyond whether AI can do something to whether we can trust it to do the right thing.

Teaching AI the Art of Strategic Refusal

What makes Appier’s approach interesting isn’t just that they’re trying to make AI more accurate. It’s how they’re doing it. Traditional AI evaluation focuses on a simple question: was the answer correct?

Appier’s framework adds two crucial considerations: what’s the cost of being wrong, and what’s the value of refusing to answer?

Think about it like this. If you ask an AI system about tomorrow’s weather for planning a picnic, a wrong guess might mean you get wet. Annoying, but not catastrophic. If you ask the same system about medication interactions for a patient, a wrong guess could be life-threatening.

The smart response in these two scenarios should be different. For the picnic, taking an educated guess based on probability might be reasonable. For the medication question, saying “I’m not confident enough to answer; please consult a doctor” is the responsible choice.

Appier’s research found that most current AI systems don’t make this distinction well. In high-risk situations, they tend to over-guess. In low-risk scenarios, they can become overly conservative. It’s like having an assistant who either takes wild risks with important decisions or refuses to make even simple calls.

The Three-Step Process: How It Actually Works

So how does Appier’s framework actually teach AI to make better decisions? They break it down into three logical steps that mirror how humans think through uncertain situations:

Step 1: Task Execution – First, the AI tries to solve the problem and generate an answer. This is what current systems already do.

Step 2: Confidence Estimation – Here’s where things get interesting. The AI evaluates how confident it is in that answer. Not just a vague feeling, but a quantifiable assessment of its own certainty.

Step 3: Expected-Value Reasoning – This is the strategic part. The AI considers the potential outcomes: what happens if it’s right, what happens if it’s wrong, and what happens if it refuses to answer. Then it makes the decision that maximizes the expected positive outcome.

It’s a structured approach to decision-making that feels remarkably human. When we face uncertain situations, we don’t just blurt out answers. We consider our knowledge, assess our confidence, weigh the risks, and sometimes decide the smartest move is to say “I’m not sure.”
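
To make that concrete, here is a minimal sketch of what expected-value reasoning can look like in code. It illustrates the general idea rather than Appier’s published method, and the confidence level, rewards, and penalties are made-up values chosen purely for the example:

```python
# Minimal sketch of expected-value reasoning for answer-vs-refuse decisions.
# This illustrates the general idea, not Appier's published method; the
# confidence, reward, and penalty values below are made-up assumptions.

def decide(confidence: float, reward_correct: float,
           cost_wrong: float, value_refuse: float) -> str:
    """Choose 'answer' or 'refuse' by comparing expected values."""
    ev_answer = confidence * reward_correct - (1 - confidence) * cost_wrong
    return "answer" if ev_answer >= value_refuse else "refuse"

# Same model confidence, very different stakes:
confidence = 0.70

# Low-stakes picnic forecast: a wrong guess is only mildly annoying.
print(decide(confidence, reward_correct=1.0, cost_wrong=1.0, value_refuse=0.0))
# -> "answer"  (EV of answering = 0.7*1.0 - 0.3*1.0 = 0.4, which beats 0.0)

# High-stakes medication question: a wrong answer is very costly, and
# deferring to a doctor still has some positive value.
print(decide(confidence, reward_correct=1.0, cost_wrong=10.0, value_refuse=0.2))
# -> "refuse"  (EV of answering = 0.7*1.0 - 0.3*10.0 = -2.3, well below 0.2)
```

The same 70% confidence produces opposite decisions once the cost of being wrong changes, which is exactly the distinction Appier’s researchers say most current systems fail to make.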

Why This Matters Beyond the Technical Details

You might be thinking this sounds like academic research that won’t affect real businesses for years. But here’s what’s different about Appier’s approach: they’re already integrating these findings into their commercial platforms.

Appier’s Ad Cloud, Personalization Cloud, and Data Cloud (platforms used by businesses for marketing, customer engagement, and data analysis) are being updated with these risk-aware capabilities. This isn’t theoretical research sitting in a lab; it’s practical methodology being deployed where it matters.

And the timing couldn’t be more relevant. As businesses move from using AI as “copilots” (assistants that suggest but don’t decide) to “agents” (systems that can act autonomously), the reliability question becomes critical. You can tolerate occasional errors from a suggestion tool. You can’t afford them from a system making autonomous decisions about customer interactions, financial transactions, or operational workflows.

The Bigger Picture: AI Growing Up

What Appier’s research represents is something bigger than just another technical improvement. It’s part of AI’s maturation from an impressive but unreliable novelty to a trustworthy tool for serious business applications.

We’ve spent years focused on making AI more capable: bigger models, more training data, better algorithms. Now we’re entering a phase where the focus is shifting to making AI more responsible. It’s not enough that AI can do something; we need to trust that it will do the right thing.

This shift mirrors how other technologies have matured. Early automobiles were exciting but dangerous novelties. It was only when we added safety features, regulations, and reliability standards that they became the transportation backbone of modern society. AI is going through a similar transition.

What This Means for Businesses Considering AI

For organizations looking to adopt AI more seriously, Appier’s research offers both reassurance and a framework for evaluation. The reassurance comes from knowing that serious work is being done on the reliability problem. The framework comes from the specific metrics and approaches they’ve developed.

When evaluating AI systems, businesses can now ask more sophisticated questions:

• How does this system handle uncertainty? Does it always guess, or does it know when to say “I don’t know”?

• Can it assess risk appropriately? Does it understand that some mistakes are more costly than others?

• Is there transparency in decision-making? Can we understand why it chose to answer, refuse, or guess?

These aren’t just technical questions anymore. They’re becoming essential criteria for responsible AI adoption.
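
One way to turn those questions into measurable criteria, borrowed from the broader selective-prediction literature rather than from Appier’s paper specifically, is to score a system on coverage (how often it answers at all), selective accuracy (how often it is right when it does answer), and a risk-weighted score that penalizes confident wrong answers more heavily than honest refusals. A toy sketch, with made-up records and weights:

```python
# Toy evaluation sketch: coverage, selective accuracy, and a risk-weighted
# score that penalizes wrong answers more than refusals. The records and
# penalty weights are illustrative assumptions, not real benchmark data.

records = [
    {"answered": True,  "correct": True},   # confident and right
    {"answered": True,  "correct": False},  # confident and wrong
    {"answered": False, "correct": None},   # refused ("I don't know")
    {"answered": True,  "correct": True},
]

WRONG_PENALTY = 3.0   # a wrong answer hurts more than...
REFUSE_PENALTY = 1.0  # ...an honest refusal (assumed weights)

answered = [r for r in records if r["answered"]]
coverage = len(answered) / len(records)
selective_accuracy = sum(r["correct"] for r in answered) / len(answered)

risk_weighted = sum(
    1.0 if r["answered"] and r["correct"]
    else -WRONG_PENALTY if r["answered"]
    else -REFUSE_PENALTY
    for r in records
) / len(records)

print(f"coverage={coverage:.2f}, "
      f"selective_accuracy={selective_accuracy:.2f}, "
      f"risk_weighted={risk_weighted:.2f}")
# -> coverage=0.75, selective_accuracy=0.67, risk_weighted=-0.50
```

Numbers like these make the “does it know when to say ‘I don’t know’” question something a procurement team can actually benchmark rather than take on faith.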

Looking Ahead: The Path to Trustworthy AI

Appier’s research doesn’t solve all the challenges of trustworthy AI, but it represents significant progress on one of the most critical ones. By giving AI systems the ability to assess their own confidence and weigh risks appropriately, we’re moving closer to AI that businesses can actually rely on.

The implications extend beyond Appier’s specific platforms. The methodologies and frameworks they’ve developed provide a blueprint that other AI developers can follow. The concept of risk-aware decision-making could become a standard feature in enterprise AI systems, much like safety features became standard in automobiles.

As Chih-Han Yu noted, this research helps “accelerate the real-world adoption of Agentic AI and translate it into scalable business value and ROI.” That translation, from impressive technology to reliable business tool, is exactly what’s needed for AI to fulfill its potential.

What’s clear from Appier’s work is that the AI industry is recognizing that capability alone isn’t enough. Reliability, trustworthiness, and responsible decision-making are becoming just as important. And that recognition might be the most significant development of all.

After all, the most capable AI system in the world isn’t much use if you can’t trust it with important decisions. Appier’s research represents a meaningful step toward building AI that businesses can actually depend on, not just admire from a distance.

Britain’s £40 Million Bet: Can a New AI Lab Keep the UK Competitive?

The UK government has announced a new £40 million AI research lab aiming to solve fundamental problems like hallucinations and unreliable memory. But in a global AI race dominated by US and Chinese spending, can Britain’s focus on quality over quantity keep it competitive?

When the UK government announced a new £40 million AI research lab this week, it wasn’t just another funding announcement. It was a bold statement of intent in the global race for artificial intelligence supremacy: a declaration that Britain intends to stay in the “fast lane” of one of the most transformative technologies of our time.

Think about the scale of ambition here. While Silicon Valley giants are spending billions scaling up existing models, the UK is taking a different path: investing in fundamental research to solve AI’s core problems. “We are still only scratching the surface of this technology’s potential,” the announcement declares, aiming to tackle the “hallucinations, unreliable memory and unpredictable reasoning” that still plague even the most advanced AI systems.

A Strategic Pivot: Competing on Quality, Not Scale

At first glance, £40 million over six years might seem like a modest investment in a field where companies like OpenAI and Google are spending billions. But this isn’t about competing on scale; it’s about competing on quality, on fundamental breakthroughs, on solving the problems that still hold AI back from its full potential.

The newly announced “Fundamental AI Research Lab” represents a strategic pivot for UK science policy. Rather than trying to outspend Silicon Valley or match China’s massive state investments, Britain is playing to its traditional strengths: world-class academic institutions, deep mathematical and computer science expertise, and a culture of blue-sky research that has produced Nobel prizes for decades.

AI Minister Kanishka Narayan put it bluntly: “If we want this technology to be a force for good, we need to make sure the next big AI breakthroughs are made in Britain.” This isn’t just about national pride; it’s about ensuring that when AI systems make decisions that affect people’s lives, from healthcare diagnoses to infrastructure management, those systems reflect British values and ethical frameworks.

Solving AI’s Core Problems: Beyond Just Scaling Models

What makes this initiative particularly interesting is its focus on fundamental research rather than incremental improvements. While most AI development today follows a predictable pattern (take existing architecture, add more data, train bigger models), the UK lab is targeting the underlying flaws that no amount of scaling can fix.

Think about the problems they’re aiming to solve:

• Hallucinations – When AI confidently states false information as fact

• Unreliable memory – The inability to maintain consistent context over long conversations

• Unpredictable reasoning – The “black box” problem where even developers don’t understand why AI makes certain decisions

These aren’t minor bugs to be patched in the next software update. They’re fundamental limitations of current AI architectures that require rethinking how these systems are built from the ground up.

As Dr Kedar Pandya, Executive Director of EPSRC’s Strategy Directorate, explained: “Fundamental research enables long-term breakthroughs in AI. The UK’s capability rests on exceptional talent and world-leading university excellence, which underpin today’s systems and will power the next generation of technologies.”

The Strategic Context: Part of a £1.6 Billion AI Push

This £40 million lab isn’t operating in isolation. It’s the first concrete step in delivering UK Research and Innovation’s (UKRI) new AI Strategy, a £1.6 billion, four-year plan unveiled just two weeks ago. That broader strategy signals a major shift in how Britain approaches AI research and development.

The numbers tell an interesting story. While the government is committing £1.6 billion over four years, the UK’s private AI sector has already raised over £100 billion in investment since the current government took office. This suggests a complementary approach: government funding the high-risk, long-term fundamental research that private investors often avoid, while private capital focuses on commercial applications and scaling proven technologies.

Raia Hadsell, Google DeepMind’s Vice President of Research and the UK government’s AI Ambassador who will chair the lab’s peer review panel, highlighted this synergy: “AI has the ability to solve humanity’s most complex problems, and fundamental research that helps this technology achieve its full potential is key. The UK has the world-class talent and academic ecosystem to drive transformational research.”

Real-World Impact: From Railway Safety to Alzheimer’s Research

This isn’t just theoretical research for its own sake. The announcement points to concrete examples of how UK AI research is already making a difference:

• RADAR AI System – A world-leading system that detects faults on railway networks in real time, preventing accidents before they happen and keeping Britain’s transport infrastructure running smoothly.

• IXICO Neuroimaging Technology – An Imperial College London spinout using machine learning to accelerate clinical trial imaging for neurological diseases like Alzheimer’s, Parkinson’s, and Huntington’s disease. This technology helps pharmaceutical companies develop new treatments faster, potentially bringing life-changing medicines to patients years earlier.

These success stories demonstrate the practical benefits of investing in AI research. It’s not just about creating clever algorithms; it’s about solving real-world problems that affect people’s daily lives, from their commute to work to their grandparents’ healthcare.

The Global Context: UK vs US vs China

To understand why this announcement matters, we need to look at the global AI landscape. The United States dominates through massive private investment from tech giants and venture capital. China leads in state-directed research and deployment at scale. Europe, including the UK, has traditionally excelled at fundamental research and ethical frameworks.

Britain’s strategy appears to be carving out a distinctive niche: focusing on the quality of AI rather than just the quantity, on solving fundamental problems rather than just scaling existing solutions, and on ensuring AI development aligns with democratic values and ethical principles.

This approach plays to traditional British strengths in mathematics, computer science, and engineering, fields where UK universities consistently rank among the world’s best. It also leverages Britain’s unique position as a bridge between American technological innovation and European regulatory frameworks.

The Funding Challenge: Is £40 Million Enough?

The obvious question is whether £40 million over six years represents sufficient investment to make a meaningful difference. To put this in perspective:

• OpenAI reportedly spends hundreds of millions training each major model iteration

• Google and Meta invest billions annually in AI research and infrastructure

• China’s AI investments are measured in the tens of billions across state and private sectors

However, this comparison misses the point. The UK lab isn’t trying to compete on training compute or model scale. It’s focusing on a different kind of research: the kind that requires deep expertise, creative thinking, and theoretical breakthroughs rather than massive computing budgets.

The additional access to “AI Research Resource compute capacity worth tens of millions of pounds” suggests the government understands that some problems do require significant computing power. But the emphasis remains on smart research rather than brute force scaling.

What Success Would Look Like

So what would constitute success for this £40 million investment? Based on the government’s own announcement, several outcomes would signal the lab is delivering on its promise:

• Breakthroughs in AI reliability – Significant reductions in hallucinations and unpredictable behavior

• New architectural approaches – Moving beyond the transformer architecture that dominates today’s AI

• Practical applications – Real-world deployments in healthcare, transport, and public services

• Talent retention and attraction – Keeping Britain’s best AI researchers in the UK and attracting global talent

• Private sector follow-on investment – Companies building on the lab’s research to create commercial products

The funding call is “open for applications now,” with the government specifically inviting “the country’s AI experts to bring their boldest and most ambitious proposals forward.” This suggests they’re looking for transformative ideas rather than incremental improvements.

Lessons for the Global AI Community

Britain’s approach offers several lessons for other countries navigating their own AI strategies:

• Play to your strengths – Don’t try to compete directly with Silicon Valley or China on their terms

• Focus on fundamentals – Solving core problems creates lasting competitive advantage

• Bridge public and private – Government funding for high-risk research complements private sector scaling

• Prioritize real-world impact – Connect research to practical applications that benefit society

• Maintain ethical leadership – Use research to shape how AI develops, not just accelerate its development

As AI Minister Narayan emphasized: “This is a long-term investment in the brilliant minds who will keep the UK in the AI fast lane. If we are the ones breaking new ground on what AI can do, we can make sure our values are baked in from the outset.”

Looking Ahead: The UK’s AI Future

The announcement of this new AI research lab represents more than just another government funding program. It’s a statement about how Britain sees its role in the AI revolution: not as a passive consumer of technology developed elsewhere, but as an active shaper of how this transformative technology evolves.

By focusing on fundamental research, the UK is investing in the foundations of future AI systems. By prioritizing reliability and transparency, they’re addressing the concerns that threaten public trust in AI. And by connecting research to real-world applications in healthcare, transport, and public services, they’re ensuring that AI development delivers tangible benefits to society.

The £40 million question (literally) is whether this targeted investment in quality over quantity, in fundamentals over scale, can keep Britain competitive in a global race where other players are spending orders of magnitude more. If successful, it could provide a model for how medium-sized economies can punch above their weight in the AI era: not by trying to outspend the giants, but by outthinking them.

As the funding applications open and Britain’s AI researchers begin pitching their “boldest and most ambitious proposals,” we’ll be watching to see whether this strategic bet on fundamental research pays off. In a field where most attention focuses on who has the biggest models or the most computing power, Britain is making a different wager: that solving AI’s core problems matters more than simply scaling existing solutions.

Only time will tell if this approach keeps the UK in the AI fast lane. But one thing is clear: in the global race for artificial intelligence leadership, Britain has just signaled it intends to be a driver, not just a passenger.

The AI Ethics Battle: When Military Contracts Trump Moral Boundaries

Sam Altman’s admission that OpenAI can’t control Pentagon AI use reveals the ethical divide splitting Silicon Valley. Here’s what it means for the future of artificial intelligence.

When Sam Altman stood before his OpenAI employees last Tuesday and admitted the company couldn’t control how the Pentagon uses their AI, it wasn’t just another corporate announcement. It was a moment that laid bare the fundamental tension between technological innovation and ethical responsibility in the age of artificial intelligence.

Think about that for a second. The CEO of one of the world’s most influential AI companies is telling his team they have zero say in how their creations get used in military operations. “You do not get to make operational decisions,” Altman reportedly said. “So maybe you think the Iran strike was good and the Venezuela invasion was bad. You don’t get to weigh in on that.”

The Ethical Divide That’s Splitting Silicon Valley

What makes this story particularly fascinating isn’t just Altman’s admission, but the stark contrast with how his competitors are handling the same dilemma. While OpenAI was signing that Pentagon deal, Anthropic (OpenAI’s main rival and creator of the Claude chatbot) was taking a completely different path.

Anthropic refused the Pentagon’s offer outright, citing concerns their technology could be used for domestic mass surveillance or fully autonomous weapons. The response from Defense Secretary Pete Hegseth was immediate and unprecedented: he declared Anthropic a “supply-chain risk,” a designation never before used against a U.S. company.

Here’s where it gets really interesting. On the exact same day Hegseth was threatening punitive measures against Anthropic, the Pentagon announced its deal with OpenAI. The timing couldn’t have been more obvious: OpenAI was stepping in to replace Claude in military applications, crossing ethical lines that Anthropic refused to cross.

When “Move Fast and Break Things” Meets Military Operations

This isn’t just a theoretical ethics debate anymore. AI-enabled systems have reportedly already been used in real military operations, from the U.S. military’s operation to seize Venezuelan leader Nicolás Maduro to targeting decisions in the war against Iran. The Pentagon isn’t asking for theoretical AI capabilities; they’re demanding companies remove safety guardrails to allow broader military applications.

Altman’s damage-control admission that the deal was “rushed out” and made OpenAI look “opportunistic and sloppy” feels like an understatement. When you’re dealing with technology that could literally mean life-or-death decisions on the battlefield, “sloppy” takes on a whole new meaning.

The Pragmatic Case: Could AI Actually Save Lives?

Here’s where the conversation gets more nuanced. While we’re rightly concerned about AI ethics in military applications, there’s a pragmatic argument worth considering: could advanced AI actually prevent unnecessary casualties?

Think about it from a military perspective. Traditional warfare often involves what military strategists call “collateral damage”: civilian casualties that occur because human operators have limited information, reaction times, and decision-making capacity under extreme stress. AI systems, in theory, could:

• Improve target identification accuracy – Reducing the risk of hitting civilian infrastructure or non-combatants

• Process more data in real-time – Analyzing satellite imagery, drone feeds, and intelligence reports simultaneously to make more informed decisions

• Enable precision strikes – Minimizing the need for broader, more destructive military campaigns

• Reduce human error – Eliminating fatigue-induced mistakes or emotional reactions in high-pressure situations

This isn’t just theoretical. Early reports from the Iran conflict suggest AI-assisted targeting systems have shown promising results in distinguishing between military and civilian targets with higher accuracy than human operators alone.

The uncomfortable truth is that warfare isn’t going away anytime soon. If nations are going to engage in military conflicts (and history suggests they will), then shouldn’t we want those conflicts to be as precise, controlled, and minimally destructive as possible?

This is the pragmatic argument that OpenAI and other companies might be making behind closed doors. It’s not about creating killer robots; it’s about creating systems that could potentially make warfare less terrible than it has to be.

The Political Money Trail Behind AI Decisions

What’s even more revealing is the political dimension that’s emerged. Anthropic’s CEO, Dario Amodei, didn’t hold back in a memo to employees, calling Altman “mendacious” and accusing him of giving “dictator-style praise to Trump.”

But here’s the kicker: Amodei claimed the real reason the Pentagon and Trump administration don’t like Anthropic is that “we haven’t donated to Trump (while OpenAI/Greg have donated a lot).” He was referring to Greg Brockman, OpenAI’s president, who reportedly gave $25 million to a PAC supporting Trump.

Think about that implication for a moment. Are we entering an era where military AI contracts get decided not by which technology is safest or most ethical, but by which company’s executives make the biggest political donations?

The Expertise Gap: When Silicon Valley Meets the Pentagon

There’s an interesting dynamic at play here that often gets overlooked in these discussions. The world of Silicon Valley and the world of national security operate on very different timelines, with very different expertise.

Sam Altman and Dario Amodei are undoubtedly brilliant in their respective domains: building AI systems and advancing machine learning research. But the skills that make someone successful in Silicon Valley don’t necessarily translate to understanding the complex realities of national security and military strategy.

Consider the different worlds these leaders come from. In tech, success often comes from moving quickly, iterating rapidly, and “disrupting” established systems. In national security, success often comes from careful deliberation, understanding historical context, and maintaining stability in incredibly complex geopolitical landscapes.

This isn’t to say tech leaders can’t contribute valuable insights to military applications; their technical expertise is precisely what the Pentagon needs. But it does suggest there might be a learning curve when it comes to understanding:

• The nuances of military decision-making – Where split-second choices have consequences that echo for generations

• Geopolitical relationships – Built over decades of delicate diplomacy

• The ethical frameworks – That have evolved through centuries of warfare and international law

• The human dimension – That no algorithm can fully capture or comprehend

What’s interesting about Altman’s admission that OpenAI can’t control how the Pentagon uses their AI is that it hints at this gap in understanding. It’s not just about contractual limitations; it’s about the reality that building a tool and understanding all its potential applications in complex military contexts are two very different things.

This isn’t unique to AI or to these particular leaders. Throughout history, technological innovators have often struggled to anticipate how their creations will be used in military contexts. The inventors of dynamite, the airplane, and even the internet all faced similar realizations that once technology leaves the lab, its uses multiply in unpredictable ways.

Perhaps what we’re seeing here is less about individual failings and more about the natural tension that occurs when fast-moving technology meets the deliberate, cautious world of national security. Both domains have valuable expertise to offer, but they speak different languages, operate on different timelines, and prioritize different values.

The challenge, and the opportunity, is finding ways to bridge this gap. How can we ensure that technological innovation benefits from military expertise about real-world applications, while military strategy benefits from Silicon Valley’s technical brilliance, without either side losing what makes them valuable in the first place?

It’s a delicate balance, and one that requires humility from both sides. Tech leaders recognizing that building the tool is just the beginning of understanding its implications. And military leaders recognizing that new technologies require new ways of thinking about old problems.

What This Means for the Future of AI Ethics

This OpenAI-Pentagon saga represents a critical inflection point for the entire AI industry. We’re seeing three distinct approaches emerging:

1. The Pragmatic Path (OpenAI) – Work with the military while trying to maintain some ethical boundaries, even if you admit you can’t control how your technology gets used.

2. The Principled Stand (Anthropic) – Refuse military contracts that cross ethical red lines, even if it means being designated a national security risk.

3. The Employee Backlash – Tech workers increasingly questioning whether they want their code used in military applications, creating internal pressure on companies.

The reality is that AI ethics can’t just be theoretical discussions in conference rooms anymore. When your technology is being used to make targeting decisions in actual wars, the ethical considerations become immediate and concrete.

Where Do We Go From Here? Lessons for a Changing Industry

So what does this mean for where we go from here? A few key lessons are emerging from this OpenAI-Anthropic divide:

• Transparency matters more than ever – Companies need to be upfront about their military partnerships before they’re forced into damage control mode.

• Employee concerns can’t be ignored – The internal backlash at OpenAI shows that tech workers are increasingly willing to speak out against ethical compromises.

• Political neutrality is becoming impossible – As AI becomes more integrated with national security, companies will inevitably get drawn into political battles.

• “We can’t control it” isn’t good enough – Altman’s admission highlights the need for stronger governance frameworks before technology gets deployed, not after.

What’s clear is that we’re moving beyond the era where AI ethics was just about bias in hiring algorithms or content moderation. We’re now dealing with questions about life-and-death military applications, and the industry’s response to these challenges will define its relationship with society for decades to come.

The real test won’t be which company builds the most powerful AI, but which one manages to balance innovation with responsibility when the stakes are this high. And right now, that balance looks more precarious than ever.