Allen Institute for AI challenges DeepSeek on key benchmarks with big new open-source AI model
Amid the industry fervor over DeepSeek, the Seattle-based Allen Institute for AI (Ai2) released a significantly larger version of its Tülu 3 AI model, aiming to further advance the field of open-source artificial intelligence and demonstrate its own techniques for enhancing the capabilities of AI models.
Leave Deepseek, China’s new AI model Kimi k1.5 also surpasses ChatGPT in key benchmarks
Moonshot AI's Kimi k1.5 outperforms OpenAI's GPT-4o and Claude 3.5 Sonnet in key areas, showcasing superior multimodal abilities.
DeepSeek claims its ‘reasoning’ model beats OpenAI’s o1 on certain benchmarks
Chinese AI lab DeepSeek has released an open version of DeepSeek-R1, its so-called reasoning model, that it claims performs as well as OpenAI’s o1 on certain AI benchmarks. R1 is available from the AI dev platform Hugging Face under an MIT license,
12h
on MSN
Ai2 says its new AI model beats one of DeepSeek’s best
Move over, DeepSeek. Seattle-based nonprofit AI lab Ai2 has released a benchmark-topping model called Tulu3-405B.
3d
'Humanity's Last Exam' benchmark is stumping top AI models - can you do any better?
A new academic benchmark aims to 'test the limits of AI knowledge at the frontiers of human expertise.' So far, these LLMs ...
3d
on MSN
How DeepSeek achieved its AI breakthrough, Benchmark partner Chetan Puttagunta explains
Chinese AI startup DeepSeek is sending tech stocks plunging as the market digests what its cheaper and more efficient model ...
1d
Alibaba releases AI model it says surpasses DeepSeek
Max's release points to the pressure DeepSeek's meteoric rise in the past three weeks has placed on overseas rivals and ...
2d
Alibaba’s Qwen2.5-Max challenges U.S. tech giants, reshapes enterprise AI
Alibaba's Qwen2.5-Max AI model sets new performance benchmarks in enterprise-ready artificial intelligence, promising reduced ...
7d
on MSN
Even some of the best AI can’t beat this new benchmark
The nonprofit Center for AI Safety and Scale AI have released a challenging new benchmark for frontier AI systems.
1d
on MSN
AI battle heats up: Alibaba claims Qwen 2.5 Max outperforms Meta and DeepSeek in benchmarks
Alibaba claims that its new AI model, Qwen 2.5 Max, demonstrates superior performance over competitors like Meta's Llama and ...
1d
on MSN
DeepSeek: everything you need to know about the AI that dethroned ChatGPT
Chinese startup DeepSeek has been taking the AI industry by storm with a new chatbot rivaling ChatGPT and Gemini that uses a ...
Hosted on MSN
6d
OpenAI Accused of Manipulating Benchmark Results as Chinese Models Close AI Performance Gap
It was recently revealed that OpenAI secretly funded and accessed data related to the FrontierMath AI benchmark. The ...