Technology

Did xAI lie about Grok 3's benchmarks?

The xAI Grok AI logo

Debates over AI benchmarks, and how AI labs report them, are spilling out into public view.

This week, an OpenAI employee accused Elon Musk's AI company, xAI, of publishing misleading benchmark results for its latest AI model, Grok 3. One of xAI's co-founders, Igor Babushkin, insisted that the company was in the right.

The truth lies somewhere in between.

In a post on xAI's blog, the company published a chart showing Grok 3's results on AIME 2025, a set of challenging math questions from a recent invitational mathematics exam. Some experts have questioned AIME's validity as an AI benchmark. Nevertheless, AIME 2025 and older versions of the test are widely used to probe a model's mathematical ability.

xAI's chart showed two variants of Grok 3, Grok 3 Reasoning Beta and Grok 3 mini Reasoning, beating OpenAI's best-performing available model, o3-mini-high, on AIME 2025. But OpenAI employees on X quickly pointed out that xAI's chart did not include o3-mini-high's AIME 2025 score at "cons@64."

What is cons@64, you might ask? Well, it is short for "consensus@64," and it basically gives a model 64 tries to answer each problem in a benchmark and takes the most frequently generated answers as the final answers. As you can imagine, cons@64 tends to boost models' benchmark scores, and omitting it from a chart can make one model appear to surpass another when in reality it does not.
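The difference between the two scoring modes can be sketched in a few lines of Python. This is only an illustration of majority voting as described above, not xAI's or OpenAI's actual evaluation code; the function names, the sample answers, and the tie-breaking rule (the earliest-seen answer wins a tie) are all assumptions made for the example.

```python
from collections import Counter
from typing import Sequence


def score_at_1(samples: Sequence[str], reference: str) -> bool:
    """Score only the first sampled answer (the "@1" setting)."""
    return samples[0] == reference


def score_cons_at_k(samples: Sequence[str], reference: str, k: int = 64) -> bool:
    """Score the most frequent answer among the first k samples (consensus@k).

    Assumption: ties go to whichever answer appeared first, which is how
    Counter.most_common orders equal counts.
    """
    votes = Counter(samples[:k])
    majority_answer, _count = votes.most_common(1)[0]
    return majority_answer == reference


# Hypothetical data: 64 sampled answers to one problem whose correct answer is "42".
# The first sample happens to be wrong, but the correct answer wins the vote.
samples = ["17"] + ["42"] * 40 + ["17"] * 23

print(score_at_1(samples, "42"))       # the first attempt is wrong
print(score_cons_at_k(samples, "42"))  # the majority answer is right
```

On this made-up problem the model fails at "@1" but passes at cons@64, which is why comparing one model's cons@64 score against another model's "@1" score is not like for like.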

The AIME 2025 scores of Grok 3 Reasoning Beta and Grok 3 mini Reasoning at "@1," meaning the first score the models achieved on the benchmark, fall below o3-mini-high's score. Grok 3 Reasoning Beta also trails slightly behind OpenAI's o1 model set to "medium" computing. Yet xAI is advertising Grok 3 as "the smartest AI in the world."

Babushkin argued on X that OpenAI has published similarly misleading benchmark charts in the past, albeit charts comparing the performance of its own models. A more neutral party in the debate put together a more "accurate" chart showing nearly every model's performance at cons@64.

But as AI researcher Nathan Lambert pointed out in a post, perhaps the most important metric remains a mystery: the computational (and monetary) cost it took for each model to achieve its best score. That just goes to show how little most AI benchmarks communicate about models' limitations, and their strengths.

This article was originally published on : techcrunch.com

Technology

Microsoft's Satya Nadella is choosing chatbots over podcasts

Satya Nadella at Microsoft Ignite 2023

While Microsoft CEO Satya Nadella says he likes podcasts, he may not actually be listening to them anymore.

That tidbit comes toward the end of a longer Bloomberg profile of Nadella, focusing on Microsoft's AI strategy and its complicated relationship with OpenAI. To illustrate how much he uses the Copilot AI assistant in his daily life, Nadella said that instead of listening to podcasts, he now sends transcripts to Copilot, then talks with Copilot about the content while driving to the office.

In addition, Nadella, who jokingly described his job as an "email driver," said he relies on at least 10 custom agents developed in Copilot Studio to summarize emails and news, prepare for meetings, and perform other tasks in the office.

It seems that AI is already transforming Microsoft in more significant ways, with programmers reportedly hit hardest in the company's latest layoffs, which came shortly after Nadella stated that 30% of the company's code was written by AI.

This article was originally published on : techcrunch.com

Technology

OpenAI's planned data center in Abu Dhabi would be bigger than Monaco

Sam Altman, CEO of OpenAI

OpenAI is poised to help develop a staggering 5-gigawatt data center campus in Abu Dhabi, positioning the company as a primary anchor tenant in what could become one of the largest AI infrastructure projects in the world, according to a new Bloomberg report.

The facility would reportedly span an astonishing 10 square miles and consume power equivalent to five nuclear reactors, dwarfing any existing AI infrastructure announced by OpenAI or its competitors. (OpenAI has not yet responded to TechCrunch's request for comment, but for perspective, the facility would be larger than Monaco.)

The UAE project, developed in partnership with G42, a conglomerate headquartered in Abu Dhabi, is part of OpenAI's ambitious Stargate project, a joint venture announced in January that could see massive data centers built around the globe to support the development of AI.

While the first Stargate campus in the United States, already underway in Abilene, Texas, is expected to reach 1.2 gigawatts, this Middle Eastern counterpart would be more than four times that size.

The project comes amid broader AI ties between the U.S. and the UAE that have been years in the making, and that have rankled some lawmakers.

OpenAI's ties to the UAE date to a 2023 partnership with G42 aimed at driving AI adoption in the Middle East. During an earlier talk in Abu Dhabi, OpenAI CEO Sam Altman himself praised the UAE, saying it "has been talking about AI since before it was cool."

As with much of the AI world, these relationships are ... complicated. Established in 2018, G42 is chaired by Sheikh Tahnoon bin Zayed Al Nahyan, the UAE's national security adviser and the younger brother of the country's ruler. Its embrace by OpenAI raised concerns in late 2023 among U.S. officials, who feared that G42 could enable the Chinese government to access advanced American technology.

These fears focused on G42's "active relationships" with blacklisted entities, including Huawei and Beijing Genomics Institute, as well as those tied to people connected with Chinese intelligence efforts.

After pressure from U.S. lawmakers, G42's CEO told Bloomberg in early 2024 that the company had changed its strategy and that its previous Chinese investments had been divested, adding that "of course, we no longer need any physical presence in China."

Shortly afterwards, Microsoft, a major shareholder in OpenAI with its own broader interests in the region, announced a $1.5 billion investment in G42, and its president, Brad Smith, joined G42's board.

This article was originally published on : techcrunch.com

Technology

Redpoint raises $650 million 3 years after its last large early-stage fund

Redpoint Ventures, a San Francisco-based firm that is about a quarter of a century old, has raised a $650 million early-stage fund, according to a regulatory filing.

Redpoint's latest fund matches the size of its previous fund, which was raised just under three years ago. In a market where many firms are scaling back their fundraising, this consistency may indicate that limited partners are relatively satisfied with its results.

The firm's early-stage strategy is led by four managing partners: Alex Bard (pictured above), Satish Dharmaraj, Annie Kadavy, and Erica Brescia, who joined the firm in 2021 after serving as GitHub's chief operating officer for nearly three years.

The Redpoint early-stage team's recent notable investments include AI coding startup Poolside, which was founded by former Redpoint partner and GitHub CTO Jason Warner; distributed SQL database developer Cockroach Labs; and procurement management platform Levelpath.

The multi-stage firm also runs a growth strategy led by partners Logan Bartlett, Jacob Effron, Elliot Geidt, and Scott Raney. Last year, Redpoint raised its fifth growth fund at $740 million, a slight increase over the $725 million fund it closed three years earlier.

Redpoint's recent exits include Next Insurance, which was sold for $2.6 billion in March; travel media startup Tastemade, which was acquired by Wonder for $90 million; and IBM's $6.4 billion acquisition of HashiCorp.

Redpoint did not respond to a request for comment.

This article was originally published on : techcrunch.com