Elon Musk has just released an AI that is smarter than ChatGPT: this is why that matters


Elon Musk’s artificial intelligence startup xAI has unveiled Grok 3, its newest AI model, which the company claims outperforms leading competitors on important technical benchmarks. The announcement marks a significant escalation in the race to develop more powerful AI systems.

The launch comes only a few days after Musk’s failed $97.4 billion bid to acquire OpenAI, the company that he co-founded with Sam Altman in 2015. During a livestreamed demonstration on X, Musk characterized Grok 3 as “an order of magnitude more capable than Grok 2” and emphasized its ability to reason through complex problems.

Early tests seem to support some of xAI’s claims. The model topped the influential Chatbot Arena leaderboard, scoring higher than OpenAI’s GPT-4o, Google’s Gemini and DeepSeek’s V3 model in blind user tests. Published benchmarks show that Grok 3 achieved superior scores in mathematics (AIME ’24), scientific reasoning (GPQA) and coding tasks.

Grok 3 leads the Chatbot Arena leaderboard with a score of approximately 1400, performing considerably better than other large AI models in blind user tests. (Source: xAI)

Inside Grok 3’s Massive Computing Infrastructure: 200,000 GPUs and a new data center

“Grok 3 clearly has around state-of-the-art thinking capabilities,” wrote former OpenAI researcher Andrej Karpathy in an X post after testing early access. “Few models get this right. The top OpenAI thinking models also get it, but DeepSeek-R1, Gemini 2.0 Flash Thinking and Claude do not.”

The development of the model required huge computing resources. xAI doubled its GPU cluster to 200,000 Nvidia chips for training, housed in a new Memphis data center. This infrastructure investment underscores the growing compute requirements of advanced AI development, as companies race to build more capable systems.

DeepSearch and Advanced Reasoning: How Grok 3 aims to outsmart ChatGPT and Google Gemini

A key innovation is Grok 3’s “DeepSearch” feature, which combines web search with reasoning capabilities to analyze information from multiple sources. The system also includes specialized modes for complex problem solving, including a “Think” function that shows its reasoning process and a “Big Brain” mode that allocates extra computing power to difficult tasks.

“The thing to really pay attention to in AI is the learning speed. And @xai is learning much faster than any other,” posted tech industry veteran Robert Scoble, referring to a conversation with Apple Siri co-founder Tom Gruber.

However, some limitations have emerged during testing. Karpathy noted that the model sometimes fabricates citations and struggles with certain types of humor and ethical reasoning tasks. These challenges are common in current AI systems and highlight the ongoing difficulty of developing truly human-like artificial intelligence.

Scale AI CEO Alexandr Wang praised the release, tweeting: “Grok 3 is a new best model in the world from the @xai team!” He noted its superior performance across various benchmarks and expressed enthusiasm for future collaboration.

AI Industry Competition Heats Up: What Grok 3’s launch means for OpenAI, DeepSeek and the future of artificial intelligence

The model will be available via X’s Premium+ subscription ($40/month) and a new standalone “SuperGrok” service ($30/month). Enterprise API access is planned for the coming weeks.

This launch intensifies competition in the AI industry, especially as Chinese startup DeepSeek recently demonstrated similar performance with reportedly lower compute requirements. The development also raises questions about the sustainability of the computational arms race in AI, as companies invest billions in increasingly powerful hardware infrastructure.

In key performance benchmarks, Grok 3 and its mini variant achieve superior scores in mathematics, science and coding tests compared with competing models from Google, OpenAI, Anthropic and DeepSeek. The full-size Grok 3 model (dark blue) achieved particularly strong results in scientific reasoning. (Source: xAI)

Musk emphasized that Grok 3 remains in beta, with improvements expected “almost every day.” The company plans to add voice interaction within a few weeks and will open-source its previous model, Grok 2, once the new version stabilizes.

But perhaps the most meaningful aspect of Grok 3’s debut is not the technical specifications or benchmark scores, but what it represents: the escalating tension between Musk and his former colleagues at OpenAI. Only a few days after his failed $97.4 billion bid to acquire OpenAI, Musk has unveiled a model that challenges its supremacy, which suggests that in the high-stakes race for AI dominance, even a rejected suitor can become a formidable rival.
