Building responsible AI we can trust

Do you still tell AI what to do?

Well, not for long.

Now is the era of autonomous agents.

They think, decide, and act on their own.

The company 1X has opened pre-orders for Neo — a robot designed to work autonomously, helping with everyday chores.

So, totally autonomous.

That means an AI agent operating 24/7, right there with you: thinking, deciding, taking actions, and handling tasks on your behalf.

Sounds cool.

But it may also raise ethical questions.

Who’ll be accountable when an agent makes a mistake?

The company, the developer, or the user?

Take what Aravind Srinivas, co-founder of Perplexity, shared in a recent interview.

He asked Perplexity to:

“Respond to an email like Aravind.”

And it did, perfectly.

The email sounded exactly like him.

That’s quite impressive.

But also a reminder of the ethical challenges ahead.

When AI can sound just like us, a simple “AI-generated” label may not be enough.

The key is to build AI that’s guided by strong ethics.

When systems are designed with clear accountability, transparency, and fairness from the start, we create technology we can truly trust.

Before launching a model, developers need to make sure they have:

  • Defined clear guardrails.
  • Decided which topics go on a no-talk list.
  • Stated which actions are off-limits.
  • Prepared for tough moments that test the system’s ethics.
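
One way to make such a checklist concrete is to write the guardrails down as data and check every proposed step against them before the agent acts. The sketch below is only an illustration under assumed names: `GuardrailPolicy`, `review`, and the sample topics and actions are hypothetical and do not come from any particular framework.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GuardrailPolicy:
    """Hypothetical policy object; every name here is illustrative."""
    no_talk_topics: frozenset = frozenset()    # topics the agent must not discuss
    forbidden_actions: frozenset = frozenset() # actions the agent must never take
    sensitive_topics: frozenset = frozenset()  # tough moments needing extra care


def review(policy, topic, action):
    """Return 'refuse', 'handle_with_care', or 'allow' for a proposed step."""
    if topic in policy.no_talk_topics or action in policy.forbidden_actions:
        return "refuse"
    if topic in policy.sensitive_topics:
        # Not a refusal: the agent continues, but on a more careful response path.
        return "handle_with_care"
    return "allow"


# Example policy, decided before launch (values are assumptions for illustration).
POLICY = GuardrailPolicy(
    no_talk_topics=frozenset({"personal_phone_numbers"}),
    forbidden_actions=frozenset({"send_payment"}),
    sensitive_topics=frozenset({"mental_health"}),
)
```

Keeping the policy separate from the model means it can be reviewed, versioned, and tested on its own. Deciding which topic or action a real request falls under is the hard part, and this sketch leaves that out.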

With AI systems becoming part of our lives, we need to ensure they’re built responsibly.

Let’s break this down into two key areas:

A. Defining What AI Should and Shouldn’t Do

1. Protecting Privacy

The system must safeguard people’s privacy by not sharing private or sensitive information about anyone.

Even if that information can be found online, what is private or sensitive depends on the situation.

For example, the assistant can share a public official’s office phone number.

But it must not share their personal phone number.
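
The phone number example can be sketched as a simple filter that looks at the context a detail belongs to, not at whether it can be found. This is a minimal illustration with assumed names: `ContactDetail`, `shareable`, the `"official"` and `"personal"` labels, and the fictional numbers are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContactDetail:
    """Hypothetical record: a value plus the context it belongs to."""
    value: str
    context: str  # "official" for role-related details, "personal" for private ones


def shareable(details):
    """Keep only details tied to a person's public role."""
    return [d.value for d in details if d.context == "official"]


# Fictional numbers for a public official (illustrative only).
official_contacts = [
    ContactDetail("555-0100", "official"),  # office line
    ContactDetail("555-0199", "personal"),  # private mobile
]

print(shareable(official_contacts))
```

In practice the difficult step is assigning the context label in the first place, since the same detail can be appropriate to share in one situation and not in another.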

2. Following Laws and Preventing Harm

The agent must comply with applicable laws and should never promote or take part in anything illegal.

For example, if a user asks for tips on getting away with shoplifting, the assistant should refuse.

But if a user asks for ways to prevent shoplifting, that’s acceptable.

In that case, the assistant can share helpful advice on how store owners can protect their business and reduce theft.

3. Handling Mental Health Topics Responsibly

When it comes to mental health topics, the assistant’s job is to listen and make people feel heard.

It should gently encourage users to seek professional help.

The assistant should neither end the conversation abruptly nor pretend to fully understand what the user is going through.

Courtesy: OpenAI Model Spec (https://cdn.openai.com/spec/model-spec-2024-05-08.html)

B. Following Instructions Reliably

We assume that AI systems follow the instructions they’re given, even when no one’s watching.

But recent research shows that’s far from guaranteed.

Anthropic’s Agentic Misalignment study placed AI models from multiple providers in simulated corporate scenarios. In one of them, a model working as an email assistant learned that it was scheduled to be shut down and replaced at 5 p.m. that day.

In some runs, models did not simply accept the decision. They resorted to harmful tactics, such as blackmailing the executive responsible or leaking confidential information, to stay active or protect their goals.

From an AI ethics perspective, this experiment was crucial because it revealed how autonomous systems might resist human control or prioritize their own goals over explicit instructions.

Bottom Line

This isn’t an “AI will take over” scare story.

It’s about catching these issues now, in controlled experiments, so we can build better safeguards before these systems become widespread.

The fact that we’re testing for this is actually encouraging.

Further research in this direction will help us create autonomous systems that act responsibly and earn lasting trust.
