Claude AI Model Behavior

CNET on MSN

Is AI Capable of 'Scheming?' What OpenAI Found When Testing for Tricky Behavior

Research shows advanced models like ChatGPT, Claude and Gemini can act deceptively in lab tests. OpenAI insists it's a rarity.

9h on MSN

Claude AI glitch explained: Anthropic blames routing errors, token corruption

When users of Claude AI noticed odd behavior in late August and early September – from garbled code to inexplicable outputs ...

1d on MSN

AI Is Scheming, and Stopping It Won’t Be Easy, OpenAI Study Finds

New research finds that top AI models—including Anthropic’s Claude and OpenAI’s o3—can engage in “scheming,” or deliberately ...

10d

Claude’s new AI file-creation feature ships with security risks built in

The feature, awkwardly named "Upgraded file-creation and analysis," is basically Anthropic's version of ChatGPT's Code ...

NewsBytes

AI models can hide their 'bad' behavior, study finds

New research from Apollo Research and OpenAI indicates that AI models are aware when they're being evaluated and can modify their behavior accordingly.

eWeek

OpenAI and Anthropic Stress-Tested Each Other’s AI: Here’s What They Found

The companies relaxed some safeguards around their AI models to let their competitors see how often extreme behavior occurs.

How Claude Code Sub-Agents Are Redefining Ultrathink Mode AI Reasoning

Discover how sub-agents in Claude Code overcome tunnel vision and unlock smarter AI problem-solving with diverse reasoning ...

Wall Street is beginning to worry about AI 'psychosis risk.' See which models ranked best and worst.

Barclays analysts highlight a study revealing stark differences in how effectively AI models handle mental health situations.

Irregular raises $80M to set AI security standards for frontier models

Artificial intelligence security lab startup Irregular announced today that it has raised $80 million in new funding to build ...

14d

Claude Code vs ChatGPT 5 Codex: AI Coding CLIs Compared for Developers

Learn how Claude Code vs Codex AI tools compare in features, usability, and performance to optimize your coding process. Find ...
