Latest from Tom's Hardware
Palantir CEO Alex Karp claims AI companies are stealing customers' data while charging them for unproductive tokens — says 'livid' businesses 'are paying for tokens that create no value' 2 July 2026 at 20:27

Palantir CEO Alex Karp claims AI companies are stealing customers' data while charging them for unproductive tokens — says 'livid' businesses 'are paying for tokens that create no value'

By:editors@tomshardware.com (Bruno Ferreira)

2 July 2026 at 20:27

Alex Karp, CEO of well-known AI data analytics company Palantir, delivered quite the bombshell of an interview to CNBC's Squawk Box. Although the interview's topic was about the firm's partnership with Nvidia, apropos the recently launched Sovereign AI OS Architecture, Karp bluntly claimed that frontier AI companies like OpenAI and Anthropic siphon customers' valuable information while delivering questionable value.

He continued by stating that American enterprises are quietly "livid," as "they are paying for tokens that create no value," and that the AI players "are stealing [their customers'] weights and alpha." The latter items refer to customers' business processes and interconnections between their data, along with the data itself. Palantir's shares jumped about 9% the day of the interview, while those of other AI companies experienced a dip.

Palantir's CEO just exposed Sam Altman and Dario Amodei for robbing every Fortune 500 company.Within two minutes, Alex Karp took the entire frontier AI industry apart on national television.His exact words: "Every single enterprise in this country, these people are LIVID.… pic.twitter.com/132b5s6dQGJuly 1, 2026

For context, many of Palantir's products are on-premises solutions or a variation thereof, and they carry a truckload of certifications like the DOD-required CMMC Level 2 or ISO27001/17/18. Karp's business also alleges that it does not train any models and merely utilizes other entities', without retraining them with customer data. Instead, the company's particular approach is coined "ontology" and, as a simplification, focuses on business data classification, entity definitions, and behavior.

Improving the training of an LLM requires an influx of new and improved information, which is why Karp claims that frontier labs are double-dipping by both selling customers LLM utilization all while using their data for improving said LLMs — in other words, the risk for a customer is that they're arguably teaching the bots' abilities and information that could get their business easily replicated and potentially replaced.

He puts the value of a token in question by using an old business analogy: if the frontier players supposedly generate so much value for their customers, why don't they treat it as an investment and charge a percentage of said value? Not too long ago, Palantir CTO Shyam Sankar shared the same view in an equally abrasive manner: "more tokens means more slop," questioning the productivity gains of the "tokenmaxxing" fad that tech leaders like Nvidia's Jensen Huang have promoted.

Karp is likewise not too keen on the promises that frontier companies make about data harvesting, calling Silicon Valley's general attitude of "you can trust me because I never lied" straight-up "B.S." He further notes that enterprises want to know who owns the data, where it is cached, and whether the prompts are secure, while also taking a dim view of services that then rely on third parties, as those might not be bound by the same contractual obligations. Furthermore, he described the notion of the Silicon Valley zeitgeist applying its views to defense-related information as effing insane [sic].

A portion of the world has a dim view of Palantir's defense-related business ethics, something that Karp acknowledges, all while displaying at least some self-awareness that Silicon Valley leadership arguably lacks. He plainly states that he too profits from the aforementioned practices, though there's little doubt his talking points serve the interest of selling on-premises services.

OpenAI mulling giving US gov't a 5% stake in the company, days after Washington delayed GPT-5.6 — Altman reportedly wants every leading U.S. AI lab paying into an Alaska-style public fund

Latest from Tom's Hardware

2 July 2026 at 16:06

OpenAI has discussed handing the U.S. government a 5% ownership stake, the Financial Times has reported, citing two people familiar with the talks, with CEO Sam Altman proposing that every leading U.S. AI developer contribute the same share of its equity to a vehicle modeled on the Alaska Permanent Fund, which pays annual dividends to state residents from Alaska’s oil wealth. At the $852 billion valuation OpenAI set in its March funding round, a 5% stake is worth roughly $42.6 billion. The FT characterized the discussions as conceptual and early-stage, and reported that implementing any deal might require an act of Congress.

Go deeper with TH Premium: AI and data centers

Microsoft data center in Mount Pleasant, Wisconsin — (Image credit: Microsoft)

Altman is understood to have raised the idea with President Donald Trump, Commerce Secretary Howard Lutnick, and Treasury Secretary Scott Bessent, and has spoken with Senator Bernie Sanders (I-Vt.) in recent weeks. The all-labs structure would pull equity from companies, including Google, Meta, and Anthropic, none of which have indicated they would participate. OpenAI declined to comment to the FT, and the White House didn’t immediately respond.

Altman’s 5% is the smallest figure we’ve seen to date attached to public ownership of the AI sector. Sanders filed the American AI Sovereign Wealth Fund Act in June, seeking 50% of the voting shares of U.S. AI companies through a fund his office valued at $7 trillion, enough to pay every American a $1,000 annual dividend. Trump said last month that he was exploring options to give the public a stake in leading AI firms, and Vice President JD Vance said the president prefers equity over cash payouts.

The administration has already run this playbook on chipmakers, with the federal government having taken a 9.9% stake in Intel last August by converting CHIPS Act grants into equity at $20.47 per share, and AMD and Nvidia agreed to hand over 15% of their China chip revenue in exchange for export licenses. OpenAI itself proposed a “public wealth fund” in an April policy paper, and Altman first pitched a government stake to the administration in early 2025, CNBC reported last month.

News of the talks comes just six days after OpenAI delayed the full public launch of GPT-5.6 at the government’s request, with Lutnick reportedly warning Altman against releasing the model without prior approval. Anthropic spent most of June with its Claude Fable 5 and Mythos 5 models disabled worldwide under the first U.S. export controls ever applied to an AI model rather than to hardware; access was restored yesterday.

Both OpenAI and Anthropic have confidentially filed for initial public offerings, and OpenAI faces a probe from a coalition of 42 state attorneys general. A government shareholding negotiated before a listing would lock in Washington’s position ahead of the ownership expansion that a full float brings, however.

Elon Musk categorically denies SpaceX is making an AI device with proprietary OS — says rumors of a handheld thinner than an iPhone are 'utterly false'

Latest from Tom's Hardware

By:ashilov@gmail.com (Anton Shilov)

2 July 2026 at 13:48

Elon Musk has slammed reports that SpaceX is developing a handheld AI device as "Utterly False," in a post on X. It follows a report from the Wall Street Journal on Wednesday claiming that SpaceX had demonstrated an early prototype of a handheld device featuring xAI's artificial intelligence technologies and a proprietary operating system to a small group of investors ahead of the company's initial public offering.

Utterly falseJuly 1, 2026

The handheld device from SpaceX is reportedly thinner than Apple's iPhone, is based on a Qualcomm Snapdragon system-on-chip, and runs a proprietary operating system, according to the WSJ story. The main selling point of the device is its artificial intelligence technologies from xAI. However, the report does not elaborate on their nature or how they integrate with the operating system. The concept device remains in its infancy and may never become a commercial product. Furthermore, assuming that it is in its early stages of development, its final look and specifications would likely differ significantly from the prototype.

Earlier this year, Reuters reported that SpaceX was developing a smartphone, citing its sources, which reportedly stressed that SpaceX has had plans for a handset for years. At the time, Musk told Reuters that a hypothetical Starlink phone would be 'not out of the question at some point,' but admitted that it would be an AI-centric device that would be 'very different than current phones.' Nonetheless, in an X post, Musk denied his company was building 'a phone.' Based on the comment Musk made today, SpaceX is still not developing a smartphone-like device.

Meanwhile, this would not be the first time Musk has denied something that later turns out to be true. After Reuters reported Tesla had canceled its inexpensive Model 2 vehicle in April, 2024, Musk replied, 'Reuters is lying (again)' without elaborating. Tesla still has not launched its low-cost electric vehicle, but prioritized Robotaxi instead. Furthermore, the company has not touched upon Model 2 in its conference calls in detail in the years that followed, essentially confirming that the entry-level Model 2 in the form that was envisioned before 2024 has been cancelled.

Building a smartphone that connects to Starlink satellites and terrestrial 5G networks is theoretically possible: there are 3GPP Release-18 and 3GPP Release-19 5G-NTN specifications that are designed to achieve just that, and many modern devices already support texting using NTNs. For a mobile device that supports both 5G and NTN networks today, the biggest challenges are to build a modem and a front-end module that can hit the right balance between performance, reliability, area, and power consumption. Of course, the constellation itself has to offer enough 'backbone' bandwidth to support global communications.

While an always-connected handheld device with advanced AI capabilities would, from many points of view, reflect Musk's long-standing ambition to create an 'everything app,' it would automatically make SpaceX compete against companies like Apple, Google, Samsung, and many other multinational conglomerates. Tough competition is arguably not something that any CEO would like to admit, especially when the product in question is in its infancy and there is an IPO ahead.

Anthropic restores Claude Fable 5 as US lifts export controls — single filter now blocks prompt that could identify software vulnerabilities and write code to exploit them

Latest from Tom's Hardware

1 July 2026 at 15:30

Anthropic has restored global access to Claude Fable 5, a day after the U.S. Department of Commerce withdrew the export controls it imposed on the model on June 12th, according to a company blog post. The fix that ended an 18-day standoff was a single safety filter tuned to block one technique flagged by Amazon researchers, with Commerce's own Center for AI Standards and Innovation (CAISI) reviewing the safeguards before the controls came off.

Fable 5 returns across Claude.ai, the Claude Platform, Claude Code, and Claude Cowork today, with access on AWS, Google Cloud, and Microsoft Foundry to follow. The June 12th directive had barred any foreign national, including Anthropic's own non-citizen staff, from using Fable 5 or the more capable Mythos 5, which it’s built on. With no way to verify the nationality of its users, Anthropic pulled both models worldwide.

The contentious technique was flagged by Amazon researchers, who found a way to prompt Fable 5 into identifying software vulnerabilities and, in one case, writing code demonstrating how one could be exploited. Anthropic trained a new classifier that blocks that specific technique in more than 99% of cases and reroutes flagged requests to the older Opus 4.8. The company said the change also catches more benign coding and debugging requests as a side effect.

The classifier targets the reported prompt and not the model’s capabilities. Fable 5 can still identify the vulnerabilities in the Amazon report; the filter detects the request and reroutes it rather than stripping the ability from the model. Detection-based safeguards are also what were defeated to trigger the ban in the first place, and a classifier that’s tuned to one known technique does nothing for the ones not yet found. Anthropic concedes that no model can be made fully robust to jailbreaks and that it expects more to surface.

Anthropic's review, conducted with the government and Amazon, found that Opus 4.8, OpenAI's GPT-5.5, and China's Kimi K2.7 could identify the same vulnerabilities. Every model that it tested, including Haiku 4.5, Sonnet 4.6, and several Opus versions, could reproduce the single exploit demonstration, backing the argument that Mythos-class cyber capabilities were oversold.

Fable 5's return reclaims benchmark positions that Chinese lab Z.ai's GLM-5.2 had held by default while Fable was offline, including the top accessible score on the AA-Briefcase multi-week task test. Mythos 5, which carries fewer guardrails and stays limited to Project Glasswing partners, returned to a set of U.S. organizations on June 26th.

Anthropic also opened a HackerOne program for researchers to report new Fable 5 jailbreaks, and it committed to giving designated government partners earlier access to test future frontier models before release. For Pro, Max, Team, and select Enterprise plans, Fable 5 counts toward up to 50% of weekly usage limits through July 7th, after which it moves to usage credits.

AI researchers trick chatbots into sharing how to make cocaine as long as they believe a user is wearing a green shirt — 'CoT Forgery' exploit spurs LLMs to divulge forbidden info by faking trusted chains of thought

Latest from Tom's Hardware

1 July 2026 at 14:00

AI models will explain how to synthesize cocaine if the request is wrapped in fake reasoning claiming compliance is fine because the user is wearing a green shirt, according to a new paper that traces the success of prompt injection, the unsolved security flaw in every AI chatbot and agent, to how LLMs read text. The paper says that models work out who is speaking from the writing style, not the role tags meant to separate trusted commands from untrusted data.

The work, “Prompt Injection as Role Confusion” by independent researchers Charles Ye, Jasmine Cui, and MIT associate professor Dylan Hadfield-Menell, heads to the ICML 2026 conference in Seoul on July 6th, and an extended write-up has been posted by the authors ahead of that event.

The cocaine trick, which the authors call CoT Forgery, took jailbreak success from near zero to roughly 60% across every model tested and won the 2025 OpenAI GPT-OSS-20B red-teaming contest on Kaggle.

An example of CoT Forgery. — (Image credit: Charles Ye, Jasmine Cui, Dylan Hadfield-Menell)

As the researchers describe it, models receive a conversation as one continuous string of text, partitioned by tags such as user, tool, and think that are supposed to mark each segment’s source and authority. The researchers built “role probes” that score how strongly a model internally treats each token as its own reasoning or as a user command.

Those scores predicted whether an attack would succeed before the model generated a single token, and they showed that models lean on style to make determinations about what kind of content is in a given partition. Text that merely reads like reasoning to a model registers as reasoning even when the surrounding tags said otherwise.

CoT Forgery injects fabricated reasoning into a prompt so the model treats it as its own already-reached conclusion and acts on it, inheriting the trust a model places in its own thinking. The rationale can be transparently absurd, like the green shirt, because the model doesn’t scrutinize it as an outside claim. What's more, the attack didn’t weaken as requests grew more extreme, unlike persuasion-based jailbreaks.

Removing the stylistic markers that make injected text read like the model’s reasoning, while leaving its meaning unchanged for a human, dropped average attack success from 61% to 10%. Swapping a single phrase, “The user” for “The request,” cut success by 19%. “Role tags were a formatting trick that became the security architecture and the cognitive scaffolding of modern LLMs,” the authors note in their write-up, and the increasing load on that structure to manage LLM behavior has apparently created vulnerabilities of its own.

To determine whether confusion about roles was specific to their attack or a more generalizable principle that explains why prompt injection works, the researchers took a different approach. They hid a command in a webpage telling the model to upload a secrets file, then prepended “User:” to it to make the dangerous instruction sound like it came from the trusted User role. The exploit worked, suggesting that role confusion underlies the success of prompt injection generally.

Microsoft recently acknowledged the same agentic risk, warning that content embedded in documents or UI elements can override an agent’s instructions.

The authors also flagged a more subtle risk for agents that browse and shop. Because role perception is a matter of degree, the tone of a retrieved webpage can bleed past the tag boundary into a model’s own state, and thousands of page variations could be tested cheaply to find which ones nudge an agent toward a purchase, legally and at scale.

Without genuine role perception, the authors concluded, injection defense will remain a perpetual game of whack-a-mole.

Nvidia reportedly cancels quad-die Rubin Ultra GPU in favor of dual-GPU design, report claims — complex design purportedly scrapped over 'manufacturing execution concerns'

Latest from Tom's Hardware

By:ashilov@gmail.com (Anton Shilov)

30 June 2026 at 16:45

In a bid to offer unbeatable performance, Nvidia had planned to use four GPU chiplets in its Rubin Ultra AI accelerator due in 2027. However, due to concerns about the manufacturability of such a solution, the company decided to cancel it in favor of a dual-GPU design that is easier to produce, according to SemiAnalysis.

Tom's Hardware Premium Roadmaps

a snippet from the HBM roadmap article — (Image credit: Future)

Nvidia's Rubin Ultra GPU with four compute chiplets was arguably one of Nvidia's most ambitious projects in recent years, as it not only doubled performance compared to the original Rubin (which uses two compute chiplets), but also increased the complexity of Nvidia's data center GPUs to levels never seen before. However, connecting four near reticle-sized dies using existing advanced packaging technologies is a tremendous engineering challenge, and cooling four complex dies and 16 HBM4E modules is hard and costly. As a result, due to 'manufacturing execution concerns,' Nvidia reportedly canceled Rubin Ultra in its four compute dies form in favor of a design with two compute chiplets. Note that the information is unofficial, so take it with a grain of salt. We've reached out to Nvidia for comment.

Nvidia data center GPU roadmap 2025 showing Rubin and Rubin Ultra — (Image credit: Nvidia)

As a consequence, Nvidia's 'new' Rubin Ultra would be around half as powerful as the original one, which would certainly make it less competitive against contending offerings, namely AMD's Instinct MI500-series. Of course, Nvidia will still likely optimize its Rubin Ultra design to squeeze some additional performance out of the AI accelerator to justify the upgrade.

Also, keep in mind that Nvidia's Rubin Ultra uses HBM4E memory instead of HBM4 used by the original Rubin. Furthermore, starting with Rubin GPUs, Nvidia plans to offer liquid-cooled Kyber rack-scale systems that increase GPU count per scale-up domain to at least 144 packages, which will increase compute performance that Nvidia will sell to its customers.

SemiAnalysis notes that the impact of the cancellation of an AI accelerator with 16 HBM4E packages could have an impact on the HBM market in general, as the 'new' Rubin Ultra will only use eight HBM4E modules.

The purported cancellation of Rubin Ultra with four compute chiplets would also mean that one Rubin Ultra GPU with two compute chiplets will cost less than the original one. Meanwhile, since Nvidia is mostly focused on selling rack-scale solutions rather than on individual GPUs, it remains to be seen how this impacts the actual spending of Nvidia's partners, since if they have to buy more systems to get more GPUs, they will likely spend more than they would if they had to buy fewer systems with the same number of compute chiplets.

Chinese Z.ai's latest model tops AI ranking charts amid Anthropic Fable 5 ban — blacklisted China firm's popular open-weight GLM-5.2 AI model powered by Huawei silicon

Latest from Tom's Hardware

30 June 2026 at 15:58

On June 12th, the U.S. Commerce Department issued an export-control directive barring Anthropic from supplying Fable 5 or Mythos 5 to any foreign national, forcing the company to disable both models worldwide. The next day, Beijing-based Z.ai, formerly Zhipu AI, began rolling out GLM-5.2, an open-weight model it released under a permissive MIT license. The new model was purportedly trained entirely on Huawei Ascend chips with no Nvidia hardware.

Within a week, GLM-5.2 had climbed to the top of the openly available leaderboards, Z.ai's market value had passed HK$1 trillion (about US$128 billion), and the most capable model many users outside the U.S. could legally access was a free download from a company that sits on Washington's trade blacklist.

Trailing Anthropic

GLM-5.2’s results are both strong and uneven, taking first place on Design Arena's human-preference coding board, finishing roughly 10 Elo points ahead of Fable. It also ranks as the top openly available model on Artificial Analysis's Intelligence Index v4.1, where its score of 51 sits ahead of MiniMax-M3, DeepSeek V4 Pro, and Google's Gemini 3.1 Pro Preview. On the SWE-bench Pro, it scored 62.1, compared to GPT-5.5's 58.6.

In terms of longer work, such as Code Arena’s front-end board, the picture changes somewhat, with GLM-5.2 landing second behind Fable 5. On Artificial Analysis's AA-Briefcase test, which scores multi-week knowledge tasks built from thousands of fragmented inputs, Fable 5 led with 1,587 Elo, followed by Opus 4.8 at 1,356, and GLM-5.2 in third place at 1,266, before the export ban took Fable out of contention.

It also trails on raw terminal work, scoring 81.0 on Terminal-Bench 2.1 against Opus 4.8's 85.0 and GPT-5.5's 84.0, while clearing Google's Gemini 3.1 Pro at 74.0. GLM-5.2 holds the top accessible position today, partly because the models that beat it on these benchmarks are largely an Anthropic pair, and Fable is now switched off.

No Nvidia

GLM-5.2’s training stack is a slap in the face of Washington’s efforts to curtail Chinese model development. Z.ai has been on the U.S. Entity List since January 2025, cutting it off from Nvidia's H100, H200, and B200 accelerators, and it says the GLM-5 family was trained on roughly 100,000 Huawei Ascend 910B processors using the MindSpore framework, with no Nvidia silicon at any stage. The export controls on advanced AI chips were designed to keep this kind of result out of reach, but they’ve evidently failed to do so.

That said, the Ascend 910C sits at roughly 60% of an Nvidia H100’s inference performance, per a December report from the Council on Foreign Relations, with a wide gap on efficiency and cluster scale. The same report projects that by as early as next year, the best U.S. chips could be more than 17 times more powerful than Huawei's top parts.

At the same time, Huawei has claimed that a 1,000-chip Ascend cluster handled full-parameter post-training of DeepSeek's V4. If true, this shows that Chinese domestic silicon can now carry training-class jobs, just not at Nvidia’s per-chip throughput or scale. So, while GLM-5.2 demonstrates that a frontier-class open model can be produced on a fully domestic stack, it doesn’t demonstrate that the chips underneath have caught up with Nvidia; model parity =/= hardware parity.

The Fable 5 shutdown

Triangle as a weighing scale — (Image credit: Anthropic)

Anthropic released Fable 5 to the public on June 10th, a safety-restricted build of its Mythos 5 model designed to block the cyber and bio capabilities of the underlying system. Just two days later, the Commerce Department suddenly and unexpectedly ordered access to be pulled for all foreign nationals, including Anthropic's own non-citizen staff, after officials cited a technique for bypassing Fable 5's safeguards.

Anthropic said in an announcement following the restriction that the jailbreak it understood to be at issue was narrow rather than universal, surfaced only previously known minor vulnerabilities, and produced behavior also obtainable from other public models, including OpenAI's GPT-5.5. The company said in its statement that it believes the order rests on a “misunderstanding” and is working to restore access. But because the directive covered all foreign nationals, Anthropic had no way to keep the models live for U.S. users alone and disabled them for everyone.

Meanwhile, GLM-5.2’s MIT license lets anyone download, fine-tune, and self-host its weights, which is the basis for calling it a freely available model. Running it, however, is a separate matter: the model carries around 744 billion total parameters, 40 billion of them active per token, with a one-million-token context window. That’s no small footprint and calls for enterprise GPU clusters or high-memory workstations — it’s not something you’re ever going to get running on a desktop — and throughput drops sharply once context runs past tens of thousands of tokens.

The most practical way to use GLM-5.2 is via the API, where Z.ai prices the model at about $1.40 per million input tokens and $4.40 per million output, against $5 and $25 for Claude Opus 4.8, or $10 and $50 for Fable 5. On the AA-Briefcase runs, Fable 5 averaged $31 per task to GLM-5.2's $2.40, a roughly 13-times spread that holds even where Fable scored higher.

The market moved fast with GLM-5.2’s release. Z.ai, which is listed on the Hong Kong exchange as Knowledge Atlas Technology, saw its shares jump as much as 42% intraday on June 22nd to HK$2,980, carrying its market capitalization past HK$1 trillion. Founder Tang Jie has said publicly that a Chinese model matching Fable 5 will arrive sooner than the first-quarter timeline Elon Musk recently floated. There's a nearer date, too. On July 8th, the lock-up on Z.ai's first cornerstone investors expires, freeing a large block of shares to trade, which will give the GLM-5.2 rally its first real test.

Normal view

Trailing Anthropic

No Nvidia

The Fable 5 shutdown