
Analysts Say Bitcoin’s Future Lies in Infrastructure, Not Gold Comparison

For a long time, mainstream investors compared Bitcoin to a precious-metal hedge and called it digital gold. That line helped people grasp scarcity and self-custody, but it did not tell the whole story. A better description is starting to win out in boardrooms and dev chats: Bitcoin is infrastructure. The phrase explains how the network actually works in the world. It settles value with finality, anchors transparent collateral, and supports services that keep time with global markets.

Why the infrastructure lens is more useful than the vault metaphor

Infrastructure is the quiet backbone behind daily life. People rarely talk about power lines or fiber cables until something breaks. Value networks should be judged the same way. When analysts say Bitcoin is infrastructure, they are arguing that the asset and the ledger behave like productive capital.

Coins can be pledged as collateral, routed across borders, or priced into working capital cycles. The older label, digital gold, still matters as a store narrative. It simply covers one slice of the picture, not the whole panorama.

From passive store to productive capital

The base layer confirms transactions on a schedule that does not beg for headlines. Each block reduces settlement risk. On top of that, secure custody and policy controls create the operating environment enterprises expect. In that environment, lenders can mark risks, venues can clear trades, and treasurers can plan cash cycles. The practical outcome is simple.

Saying Bitcoin is infrastructure captures how teams deploy the asset, not only how they hold it. The phrase digital gold explains scarcity and durability, but it does not teach a finance lead about margin rules, event playbooks, or audits.

Indicators that separate hope from use

Good decisions follow good data. On-chain velocity shows whether value sits or moves. A rising share of long-term holders supports the store thesis. A rising share of active addresses and settlement value signals use at scale. Liquidity depth and realized volatility explain how much size a market can absorb before price slides. Basis and funding rates show ease or stress across derivatives.

In healthy tapes, basis is narrow and funding is stable. In stressed moments, gaps widen and risk managers step in. When the numbers point to increasing throughput and predictable operations, the label Bitcoin is infrastructure sounds less like marketing and more like measurement. The label digital gold points to patience and caution, which are valuable, but incomplete.
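
As a rough illustration of how a desk might read these numbers, the sketch below computes an annualized basis from hypothetical spot and futures prices; every figure in it is made up for the example.

```python
# Hypothetical illustration of how a risk desk might read basis.
# All prices below are made up for the example, not market data.

def annualized_basis(spot: float, futures: float, days_to_expiry: int) -> float:
    """Annualized futures premium over spot, expressed as a fraction."""
    raw_basis = (futures - spot) / spot           # e.g. a 1% premium
    return raw_basis * (365 / days_to_expiry)     # scale to a yearly rate

spot_price = 100_000.0       # hypothetical spot price
futures_price = 101_000.0    # hypothetical 90-day futures price
basis = annualized_basis(spot_price, futures_price, days_to_expiry=90)

print(f"Annualized basis: {basis:.1%}")   # ~4.1%: narrow, a "healthy tape"

# A wide or fast-moving basis, or sharply positive or negative funding, is the kind of
# gap that prompts risk managers to step in.
```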

Payments, remittances, and the parts most users never see

Most users will never read a settlement explorer; they care about two taps and a clear receipt. Cross-border remittances, e-commerce checkout, and machine payments all benefit from neutral finality. Service providers can move on-chain when it reduces friction, or off-chain when that lowers cost without hiding risk.

With the right controls, a payment company can quote steady prices and keep promises during busy hours. These are classic infrastructure wins. To call the network digital gold only is to ignore the daily work of routing value. When that work becomes normal, people begin to understand why Bitcoin is infrastructure for real world flows.

Collateral, yield, and the honest sources of return

In the infrastructure frame, returns come from visible services. Lending desks earn a spread for placing capital with risk controls. Market makers earn fees for quoting two sided markets. Settlement providers earn for reliable processing.

None of these flows require smoke and mirrors. They require process discipline, asset liability matching, and clean reporting. The message hidden in Bitcoin is infrastructure is that sound operations beat clever slogans. The message hidden in digital gold is that saving alone cannot explain an economy built on programmable value.

Governance, compliance, and the adult questions

Serious capital does not move without answers. Who holds the keys? What policies define access? Which audits confirm that a process works as described? Mature custody uses role-based permissions, hardware-backed isolation, and recovery plans that are tested, not imagined. Clear licensing and disclosure rules channel innovation into products that enterprises can actually use.

With those pillars in place, the claim that Bitcoin is infrastructure reads as common sense. The older comfort phrase, digital gold, keeps its place for savers but does not guide an operations team through policy and oversight. Institutions lean toward systems that run every hour of the week.

Education that respects both stories

Education is changing. Instead of telling newcomers to buy and wait, educators now explain how settlement finality reduces counterparty risk, how collateral makes lending safer, and how risk is priced across time.

This does not toss the store story in the trash. It folds it into a broader playbook. The network can serve the saver and the operator at once. That dual use is the strongest proof that the phrase Bitcoin is infrastructure is a practical description, not just a slogan. The phrase digital gold helps a person start, then the infrastructure frame shows how teams keep going.

Reading the road ahead without drama

Cycles will continue. Prices will swing. Critics will point to every drop and call the model broken. The better habit is to grade the network on uptime, cost to settle, and resilience during stress. Those metrics tell the truth over time.

As more services plug into neutral rails, interoperability improves and the end user sees faster confirmations and fewer surprises. Step by step, enterprises begin to treat the system like any other critical dependency. At that point, repeating that Bitcoin is infrastructure sounds less like a slogan and more like a weather report.

Conclusion

The early metaphor served the market well. It gave people a simple way to think about scarcity and self-custody. The market is larger now and the demands are tougher. Savings still matter and the phrase digital gold will not disappear. It will sit beside a broader reality. The asset stores value, and it moves value with intent at a global scale. In that full picture, Bitcoin is infrastructure is the plain description that fits.

Frequently Asked Questions (FAQ)

What does it mean to say Bitcoin is infrastructure?
It means the asset and the ledger function together as productive capital. The system stores value and also settles payments, anchors collateral, and supports yield with transparent rules.

How is this different from calling Bitcoin digital gold?
The store frame points to scarcity and patience. The infrastructure frame points to throughput, reliability, and cash cycle planning. Both can be true at once, but the second gives operators a clearer map.

Which indicators help validate the infrastructure view?
Settlement value, active addresses, liquidity depth, basis, and funding paint a picture of actual use. Collateralized lending and custody adoption add further proof.

Is the infrastructure view only for institutions?
No. Faster confirmations, clear fees, and neutral rails help retail users, merchants, and developers. Better plumbing improves the whole street.

What risks remain?
Market volatility, custody mistakes, and unclear rules remain the big three. The answer is risk limits, tested recovery plans, and steady education.

Glossary of Key Terms

Collateralization
The practice of pledging an asset to secure a loan. If the borrower fails to repay, the lender can claim the collateral according to agreed rules.
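
A tiny worked example, with illustrative numbers only, shows how loan-to-value and a liquidation threshold follow from this definition.

```python
# Worked example with illustrative numbers only.

collateral_btc = 1.0          # BTC pledged as collateral
btc_price = 100_000.0         # assumed spot price
loan_amount = 40_000.0        # cash borrowed against the collateral

ltv = loan_amount / (collateral_btc * btc_price)   # loan-to-value ratio
print(f"LTV at origination: {ltv:.0%}")            # 40%

# If the agreed liquidation threshold is, say, 80% LTV, the price at which the lender
# may claim or sell the collateral follows directly:
liquidation_price = loan_amount / (collateral_btc * 0.80)
print(f"Collateral can be claimed below ~${liquidation_price:,.0f}")   # ~$50,000
```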

Basis
The difference between spot and futures prices. It signals stress or ease in funding markets and helps traders read risk.

Funding rate
A periodic payment between long and short positions in perpetual futures that keeps the contract near the spot price.

Custody
The service of holding and safeguarding assets with security controls, permissions, and audits that reduce operational risk.

Liquidity depth
A measure of how much size the market can handle before price moves. Deeper books improve execution and reduce slippage.
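
The sketch below walks a made-up order book to show how slippage grows with order size; the price levels are purely illustrative.

```python
# Illustrative order book only; the levels are made up to show how slippage grows with size.

asks = [            # (price, size) pairs on the sell side of a hypothetical book
    (100_000.0, 2.0),
    (100_050.0, 3.0),
    (100_150.0, 5.0),
    (100_400.0, 10.0),
]

def average_fill(order_size: float) -> float:
    """Average execution price for a market buy that walks the book."""
    remaining, cost = order_size, 0.0
    for price, size in asks:
        take = min(remaining, size)
        cost += take * price
        remaining -= take
        if remaining <= 0:
            break
    if remaining > 0:
        raise ValueError("Order exceeds visible book depth")
    return cost / order_size

best_ask = asks[0][0]
fill = average_fill(8.0)
print(f"Slippage on an 8-unit buy: {(fill - best_ask) / best_ask:.3%}")   # ~0.075%
```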

OpenAI Seeks CHIPS Act Expansion for AI Infrastructure, Sparking Industry Debate

The cost of frontier AI development is escalating to unprecedented levels, prompting leading innovators like OpenAI to actively seek government intervention. CNBC’s MacKenzie Sigalos, reporting live on ‘Closing Bell Overtime,’ detailed a significant development: OpenAI is reportedly urging the U.S. administration to expand the CHIPS and Science Act’s tax credits to encompass AI data centers […]

Ship fast, optimize later: top AI engineers don't care about cost — they're prioritizing deployment

Across industries, rising compute expenses are often cited as a barrier to AI adoption — but leading companies are finding that cost is no longer the real constraint. The tougher challenges (and the ones top of mind for many tech leaders)? Latency, flexibility and capacity. At Wonder, for instance, AI adds a mere few cents per order; the food delivery and takeout company is far more concerned about cloud capacity as demand skyrockets. Recursion, for its part, has been focused on balancing small and larger-scale training and deployment via on-premises clusters and the cloud; this has afforded the biotech company flexibility for rapid experimentation. The companies’ in-the-wild experiences highlight a broader industry trend: For enterprises operating AI at scale, economics aren’t the deciding factor — the conversation has shifted from how to pay for AI to how fast it can be deployed and sustained. AI leaders from the two companies recently sat down with VentureBeat CEO and editor-in-chief Matt Marshall as part of VB’s traveling AI Impact Series. Here’s what they shared.

Wonder: Rethink what you assume about capacity

Wonder uses AI to power everything from recommendations to logistics — yet, as of now, reported CTO James Chen, AI adds just a few cents per order.

Chen explained that the technology component of a meal order costs 14 cents and the AI adds 2 to 3 cents, although that’s “going up really rapidly” to 5 to 8 cents. Still, that seems almost immaterial compared to total operating costs. Instead, the 100% cloud-native AI company’s main concern has been capacity amid growing demand. Wonder was built with “the assumption” (which proved to be incorrect) that there would be “unlimited capacity” so they could move “super fast” and wouldn’t have to worry about managing infrastructure, Chen noted. But the company has grown quite a bit over the last few years, he said; as a result, about six months ago, “we started getting little signals from the cloud providers, ‘Hey, you might need to consider going to region two,’” because they were running out of capacity for CPU or data storage at their facilities as demand grew. It was “very shocking” that they had to move to plan B earlier than they anticipated. “Obviously it's good practice to be multi-region, but we were thinking maybe two more years down the road,” said Chen.

What's not economically feasible (yet)

Wonder built its own model to maximize its conversion rate, Chen noted; the goal is to surface new restaurants to relevant customers as much as possible. These are “isolated scenarios” where models are trained over time to be “very, very efficient and very fast.” Currently, the best bet for Wonder’s use case is large models, Chen noted. But in the long term, they’d like to move to small models that are hyper-customized to individuals (via AI agents or concierges) based on their purchase history and even their clickstream. “Having these micro models is definitely the best, but right now the cost is very expensive,” Chen noted. “If you try to create one for each person, it's just not economically feasible.”

Budgeting is an art, not a science

Wonder gives its devs and data scientists as much playroom as possible to experiment, and internal teams review the costs of use to make sure nobody turned on a model and “jacked up massive compute around a huge bill,” said Chen. The company is trying different things to offload to AI and operate within margins. “But then it's very hard to budget because you have no idea,” he said. One of the challenging things is the pace of development; when a new model comes out, “we can’t just sit there, right? We have to use it.” Budgeting for the unknown economics of a token-based system is “definitely art versus science.” A critical component in the software development lifecycle is preserving context when using large native models, he explained. When you find something that works, you can add it to your company’s “corpus of context” that can be sent with every request. That’s big and it costs money each time. “Over 50%, up to 80% of your costs is just resending the same information back into the same engine again on every request,” said Chen.
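
A back-of-the-envelope sketch of the dynamic Chen describes follows; the per-token price, token counts, and request volume below are assumptions chosen only to illustrate how repeated context can dominate spend.

```python
# Back-of-the-envelope sketch of the "resending context" cost Chen describes.
# The per-token price, token counts, and request volume are assumptions for illustration only.

price_per_1k_input_tokens = 0.003   # hypothetical $ per 1,000 input tokens
context_tokens = 2_000              # shared "corpus of context" sent with every request
question_tokens = 500               # the part that is actually new each time
requests_per_day = 100_000

def daily_cost(tokens_per_request: int) -> float:
    return tokens_per_request / 1_000 * price_per_1k_input_tokens * requests_per_day

with_context = daily_cost(context_tokens + question_tokens)
new_tokens_only = daily_cost(question_tokens)

print(f"Daily input spend, resending context every time: ${with_context:,.0f}")
print(f"Share of that spend that is repeated context: {(with_context - new_tokens_only) / with_context:.0%}")
```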

In theory, the more they do, the less each unit should cost. “I know when a transaction happens, I'll pay the X cent tax for each one, but I don't want to be limited to use the technology for all these other creative ideas."

The 'vindication moment' for Recursion

Recursion, for its part, has focused on meeting broad-ranging compute needs via a hybrid infrastructure of on-premise clusters and cloud inference. When initially looking to build out its AI infrastructure, the company had to go with its own setup, as “the cloud providers didn't have very many good offerings,” explained CTO Ben Mabey. “The vindication moment was that we needed more compute and we looked to the cloud providers and they were like, ‘Maybe in a year or so.’” The company’s first cluster in 2017 incorporated Nvidia gaming GPUs (1080s, launched in 2016); they have since added Nvidia H100s and A100s, and use a Kubernetes cluster that they run in the cloud or on-prem. Addressing the longevity question, Mabey noted: “These gaming GPUs are actually still being used today, which is crazy, right? The myth that a GPU's life span is only three years, that's definitely not the case. A100s are still top of the list, they're the workhorse of the industry.”

Best use cases on-prem vs cloud; cost differences

More recently, Mabey’s team has been training a foundation model on Recursion’s image repository (which consists of petabytes of data and more than 200 pictures). This and other types of big training jobs have required a “massive cluster” and connected, multi-node setups. “When we need that fully-connected network and access to a lot of our data in a high parallel file system, we go on-prem,” he explained. On the other hand, shorter workloads run in the cloud. Recursion’s method is to “pre-empt” GPUs and Google tensor processing units (TPUs), which is the process of interrupting running GPU tasks to work on higher-priority ones. “Because we don't care about the speed in some of these inference workloads where we're uploading biological data, whether that's an image or sequencing data, DNA data,” Mabey explained. “We can say, ‘Give this to us in an hour,’ and we're fine if it kills the job.” From a cost perspective, moving large workloads on-prem is “conservatively” 10 times cheaper, Mabey noted; for a five year TCO, it's half the cost. On the other hand, for smaller storage needs, the cloud can be “pretty competitive” cost-wise. Ultimately, Mabey urged tech leaders to step back and determine whether they’re truly willing to commit to AI; cost-effective solutions typically require multi-year buy-ins. “From a psychological perspective, I've seen peers of ours who will not invest in compute, and as a result they're always paying on demand," said Mabey. "Their teams use far less compute because they don't want to run up the cloud bill. Innovation really gets hampered by people not wanting to burn money.”
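
To make the trade-off concrete, here is a hedged, purely hypothetical five-year TCO comparison in the spirit of Mabey's point; none of the figures are Recursion's actual costs.

```python
# Rough, hypothetical five-year TCO comparison in the spirit of Mabey's point.
# Every number below is an assumption for illustration, not Recursion's actual costs.

YEARS = 5
num_gpus = 64
cloud_gpu_hourly = 4.00            # assumed on-demand price per GPU-hour
gpu_hours_per_year = 6_000         # assumed utilization per GPU per year

on_prem_capex = num_gpus * 30_000  # assumed all-in purchase price per GPU
on_prem_opex_yearly = 150_000      # assumed power, cooling, colocation, and staff

cloud_tco = cloud_gpu_hourly * gpu_hours_per_year * num_gpus * YEARS
on_prem_tco = on_prem_capex + on_prem_opex_yearly * YEARS

print(f"Five-year cloud on-demand: ${cloud_tco:,.0f}")
print(f"Five-year on-prem:         ${on_prem_tco:,.0f}")
print(f"On-prem as a share of cloud: {on_prem_tco / cloud_tco:.0%}")
```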

OpenAI’s Trillion-Dollar Ambition Meets Wall Street Skepticism

OpenAI’s audacious financial projections, targeting $20 billion in annualized revenue this year and “hundreds of billions by 2030,” have ignited a fierce debate on Wall Street, challenging the very math underpinning its staggering $1.4 trillion compute bill. This ambitious roadmap, articulated by CEO Sam Altman, suggests a future where the AI giant’s revenue alone will […]

U.S. Must Build Sovereign AI, Not Backstop Private Giants

The United States faces a pivotal choice in the burgeoning AI landscape: whether to directly fund private sector pioneers or strategically invest in its own national AI infrastructure. Alex Kantrowitz, Founder of Big Technology and a CNBC contributor, recently articulated a compelling argument for the latter, suggesting that while government intervention in the private AI […]

UAE’s Investment in Africa’s Tourism to Create Thousands of Jobs and Expand Opportunities, Here’s All You Need to Know

The UAE has revealed a $6 billion investment aimed at revolutionizing Africa’s travel and hospitality market. The investment is expected to produce approximately 70,000 new jobs, solidifying Africa’s place as one of the fastest-growing regions in the global tourism economy.

The announcement was made in Dubai during the UAE Africa Tourism Investment Summit 2025, held under the patronage of His Highness Sheikh Mohammed bin Rashid Al Maktoum, Vice President and Prime Minister of the UAE and Ruler of Dubai. The summit aimed to unveil new investment opportunities and align stakeholders on priorities for Africa’s tourism, infrastructure, and aviation, and was attended by senior government officials, investors, and executives.

A Strategic Investment in Africa’s Booming Tourism Sector

Africa’s tourism industry has become one of the fastest-growing globally, driven by the continent’s diverse landscapes, cultural richness, and unique heritage. The UAE’s $6 billion investment is set to accelerate this growth by focusing on key areas such as aviation, tourism infrastructure, logistics, and digital innovation. These investments will not only create thousands of new jobs but also improve the travel experience for visitors, enhance connectivity between African countries, and promote sustainable tourism initiatives that benefit local communities and the environment.

Abdulla bin Touq Al Marri, UAE Minister of Economy and Tourism, highlighted the importance of leveraging Africa’s unique tourism assets, which include coastal resorts, pristine beaches, historical landmarks, and wildlife reserves. With the continent’s growing appeal to international travelers, the UAE sees this as a prime opportunity for investment and expansion. The move will also strengthen ties between the UAE and Africa, creating a gateway for further collaboration and sustainable development across various tourism sectors.

Key Focus Areas for Investment: Aviation, Infrastructure, and Digital Innovation

The $6 billion investment plan will focus on several key areas that are essential to the long-term growth of Africa’s tourism industry:

  1. Aviation: One of the key pillars of the investment is to enhance air connectivity across Africa. Improving regional and international flights, expanding airport infrastructure, and increasing the availability of direct routes will make it easier for tourists to visit African destinations. This will also foster greater economic integration within the continent, facilitating the movement of people, goods, and services.
  2. Tourism Infrastructure: The UAE’s investment will be used to build and upgrade critical tourism infrastructure such as hotels, resorts, transportation systems, and visitor centers. By improving these facilities, Africa will be able to cater to an increasing number of visitors while offering them a more comfortable and efficient travel experience.
  3. Logistics and Supply Chains: The development of logistics networks and supply chain infrastructure is vital for the efficient movement of tourists and goods across the continent. The UAE’s investment will support the growth of these systems, which will be key in ensuring that Africa can handle a growing influx of international travelers while minimizing logistical bottlenecks.
  4. Digital Innovation: The UAE is also focused on enhancing the digital transformation of Africa’s tourism sector. This includes the development of smart tourism platforms, online booking systems, and digital marketing campaigns that can help promote African destinations to a wider audience. By integrating technology into the tourism experience, Africa can offer modern and seamless travel solutions to both international and local visitors.

Collaboration Between the UAE and African Governments

A central element of the summit was a ministerial roundtable, which gathered ministers from over 20 African nations to discuss the future of tourism in Africa. During the roundtable, a joint statement was issued outlining plans for greater collaboration between the UAE and African governments in advancing tourism infrastructure, aviation, and sustainable growth.

The statement emphasized shared goals such as improving tourism services, increasing air connectivity, and supporting small and medium-sized enterprises (SMEs) within the tourism sector. The UAE’s commitment to green growth and inclusive development was also highlighted as an essential part of the investment strategy, aiming to ensure that tourism growth is environmentally responsible and beneficial to local communities.

The UAE’s expertise in sustainable tourism and digital innovation will help African nations develop tourism strategies that align with global best practices while maintaining cultural and environmental integrity. The partnership between the UAE and African countries is expected to create a more resilient and sustainable tourism industry that benefits both tourists and host communities.

FHS Africa 2026: Shaping the Future of Africa’s Hospitality Industry

The UAE’s investment is just the beginning of a broader strategy to develop Africa’s tourism sector. The Future Hospitality Summit Africa (FHS Africa) 2026 will be a key event in the coming years, bringing together industry leaders, policymakers, and investors to discuss the future of hospitality investment in Africa. The summit will provide a platform for further deal-making, the sharing of ideas, and the development of new partnerships that will shape the continent’s tourism landscape for years to come.

Roy Bannister, Head of Strategic Partnerships for Africa at The Bench, which organized the summit, emphasized that FHS Africa will be instrumental in turning the $6 billion investment plan into actionable projects. “We will focus on transforming the framework into tangible projects and opportunities that will deliver lasting impact for Africa’s tourism economy,” Bannister said.

The FHS Africa event will continue to foster partnerships between the UAE and African governments, helping to accelerate the development of hospitality infrastructure, tourism products, and destination marketing. With the $6 billion investment as a foundation, FHS Africa aims to position Africa as a leading destination for both regional and international travelers.

Impact on Tourists: Enhanced Travel Experience Across Africa

For tourists, the UAE’s $6 billion investment will significantly enhance the travel experience across Africa. The development of tourism infrastructure will make it easier for travelers to visit popular destinations, including Africa’s beach resorts, national parks, and historical landmarks. Improved air connectivity will allow tourists to seamlessly travel between African countries, increasing the number of multi-destination trips and making it easier to explore the continent’s diverse offerings.

Moreover, the focus on digital innovation will improve the online booking process, create smart travel apps, and enhance tourist information services. As a result, visitors will experience greater convenience, better access to information, and a smoother overall trip. With the sustainable tourism initiatives underway, travelers can also enjoy their vacations with peace of mind, knowing that their visits are contributing to environmental conservation and local community development.

A Transformative Investment for Africa’s Tourism Future

The UAE’s estimated $6 billion commitment, combined with investments in transportation and in sustainability and digital initiatives, is designed to deliver value for travelers and lasting socio-economic benefits for African communities. With FHS Africa 2026 positioned to turn the framework into concrete projects, the continent is set to grow its tourism economy in a way that is both economically and environmentally responsible, strengthening the returns on responsible tourism for African community members.

AI’s Menacing Blob: Market Uncertainty Amidst Data Center Expansion and Government Inaction

“We’ve been witnessing the increasingly menacing blob, with the expansion of OpenAI,” declared Jim Cramer, host of CNBC’s *Mad Money*, capturing the palpable anxiety permeating Wall Street. His commentary underscored a market grappling with unprecedented challenges, from a protracted government shutdown to the dizzying, and potentially precarious, build-out of artificial intelligence infrastructure. This confluence of […]

AI Hyperscalers Fuel Debt Surge, Offering Unique Investor Opportunity

The burgeoning artificial intelligence arms race is not just reshaping technological paradigms; it’s fundamentally recalibrating the global debt markets. As CNBC’s Fast Money hosts Melissa Lee, Guy Adami, and Karen Finerman recently discussed with Chris White, Founder & CEO of BondCliq, the sheer volume of debt being issued by AI hyperscalers marks an unprecedented financial […]

David Sacks Rejects AI Bailouts, Emphasizing Market Competition Over Government Backstops

“There will be no federal bailout for AI.” This unequivocal statement from David Sacks, identified as a White House AI and Crypto Czar, cuts directly to the heart of a burgeoning debate within the tech industry: the role of government support in the capital-intensive world of artificial intelligence. His remarks, disseminated via social media platform […]

Google debuts AI chips with 4X performance boost, secures Anthropic megadeal worth billions

Google Cloud is introducing what it calls its most powerful artificial intelligence infrastructure to date, unveiling a seventh-generation Tensor Processing Unit and expanded Arm-based computing options designed to meet surging demand for AI model deployment — what the company characterizes as a fundamental industry shift from training models to serving them to billions of users.

The announcement, made Thursday, centers on Ironwood, Google's latest custom AI accelerator chip, which will become generally available in the coming weeks. In a striking validation of the technology, Anthropic, the AI safety company behind the Claude family of models, disclosed plans to access up to one million of these TPU chips — a commitment worth tens of billions of dollars and among the largest known AI infrastructure deals to date.

The move underscores an intensifying competition among cloud providers to control the infrastructure layer powering artificial intelligence, even as questions mount about whether the industry can sustain its current pace of capital expenditure. Google's approach — building custom silicon rather than relying solely on Nvidia's dominant GPU chips — amounts to a long-term bet that vertical integration from chip design through software will deliver superior economics and performance.

Why companies are racing to serve AI models, not just train them

Google executives framed the announcements around what they call "the age of inference" — a transition point where companies shift resources from training frontier AI models to deploying them in production applications serving millions or billions of requests daily.

"Today's frontier models, including Google's Gemini, Veo, and Imagen and Anthropic's Claude train and serve on Tensor Processing Units," said Amin Vahdat, vice president and general manager of AI and Infrastructure at Google Cloud. "For many organizations, the focus is shifting from training these models to powering useful, responsive interactions with them."

This transition has profound implications for infrastructure requirements. Where training workloads can often tolerate batch processing and longer completion times, inference — the process of actually running a trained model to generate responses — demands consistently low latency, high throughput, and unwavering reliability. A chatbot that takes 30 seconds to respond, or a coding assistant that frequently times out, becomes unusable regardless of the underlying model's capabilities.

Agentic workflows — where AI systems take autonomous actions rather than simply responding to prompts — create particularly complex infrastructure challenges, requiring tight coordination between specialized AI accelerators and general-purpose computing.

Inside Ironwood's architecture: 9,216 chips working as one supercomputer

Ironwood is more than an incremental improvement over Google's sixth-generation TPUs. According to technical specifications shared by the company, it delivers more than four times better performance for both training and inference workloads compared to its predecessor — gains that Google attributes to a system-level co-design approach rather than simply increasing transistor counts.

The architecture's most striking feature is its scale. A single Ironwood "pod" — a tightly integrated unit of TPU chips functioning as one supercomputer — can connect up to 9,216 individual chips through Google's proprietary Inter-Chip Interconnect network operating at 9.6 terabits per second. To put that bandwidth in perspective, it's roughly equivalent to downloading the entire Library of Congress in under two seconds.

This massive interconnect fabric allows the 9,216 chips to share access to 1.77 petabytes of High Bandwidth Memory — memory fast enough to keep pace with the chips' processing speeds. That's approximately 40,000 high-definition Blu-ray movies' worth of working memory, instantly accessible by thousands of processors simultaneously. "For context, that means Ironwood Pods can deliver 118x more FP8 ExaFLOPS versus the next closest competitor," Google stated in technical documentation.

The system employs Optical Circuit Switching technology that acts as a "dynamic, reconfigurable fabric." When individual components fail or require maintenance — inevitable at this scale — the OCS technology automatically reroutes data traffic around the interruption within milliseconds, allowing workloads to continue running without user-visible disruption.

This reliability focus reflects lessons learned from deploying five previous TPU generations. Google reported that its fleet-wide uptime for liquid-cooled systems has maintained approximately 99.999% availability since 2020 — equivalent to less than six minutes of downtime per year.

Anthropic's billion-dollar bet validates Google's custom silicon strategy

Perhaps the most significant external validation of Ironwood's capabilities comes from Anthropic's commitment to access up to one million TPU chips — a staggering figure in an industry where even clusters of 10,000 to 50,000 accelerators are considered massive.

"Anthropic and Google have a longstanding partnership and this latest expansion will help us continue to grow the compute we need to define the frontier of AI," said Krishna Rao, Anthropic's chief financial officer, in the official partnership agreement. "Our customers — from Fortune 500 companies to AI-native startups — depend on Claude for their most important work, and this expanded capacity ensures we can meet our exponentially growing demand."

According to a separate statement, Anthropic will have access to "well over a gigawatt of capacity coming online in 2026" — enough electricity to power a small city. The company specifically cited TPUs' "price-performance and efficiency" as key factors in the decision, along with "existing experience in training and serving its models with TPUs."

Industry analysts estimate that a commitment to access one million TPU chips, with associated infrastructure, networking, power, and cooling, likely represents a multi-year contract worth tens of billions of dollars — among the largest known cloud infrastructure commitments in history.

James Bradbury, Anthropic's head of compute, elaborated on the inference focus: "Ironwood's improvements in both inference performance and training scalability will help us scale efficiently while maintaining the speed and reliability our customers expect."

Google's Axion processors target the computing workloads that make AI possible

Alongside Ironwood, Google introduced expanded options for its Axion processor family — custom Arm-based CPUs designed for general-purpose workloads that support AI applications but don't require specialized accelerators.

The N4A instance type, now entering preview, targets what Google describes as "microservices, containerized applications, open-source databases, batch, data analytics, development environments, experimentation, data preparation and web serving jobs that make AI applications possible." The company claims N4A delivers up to 2X better price-performance than comparable current-generation x86-based virtual machines.

Google is also previewing C4A metal, its first bare-metal Arm instance, which provides dedicated physical servers for specialized workloads such as Android development, automotive systems, and software with strict licensing requirements.

The Axion strategy reflects a growing conviction that the future of computing infrastructure requires both specialized AI accelerators and highly efficient general-purpose processors. While a TPU handles the computationally intensive task of running an AI model, Axion-class processors manage data ingestion, preprocessing, application logic, API serving, and countless other tasks in a modern AI application stack.

Early customer results suggest the approach delivers measurable economic benefits. Vimeo reported observing "a 30% improvement in performance for our core transcoding workload compared to comparable x86 VMs" in initial N4A tests. ZoomInfo measured "a 60% improvement in price-performance" for data processing pipelines running on Java services, according to Sergei Koren, the company's chief infrastructure architect.

Software tools turn raw silicon performance into developer productivity

Hardware performance means little if developers cannot easily harness it. Google emphasized that Ironwood and Axion are integrated into what it calls AI Hypercomputer — "an integrated supercomputing system that brings together compute, networking, storage, and software to improve system-level performance and efficiency."

According to an October 2025 IDC Business Value Snapshot study, AI Hypercomputer customers achieved on average 353% three-year return on investment, 28% lower IT costs, and 55% more efficient IT teams.

Google disclosed several software enhancements designed to maximize Ironwood utilization. Google Kubernetes Engine now offers advanced maintenance and topology awareness for TPU clusters, enabling intelligent scheduling and highly resilient deployments. The company's open-source MaxText framework now supports advanced training techniques including Supervised Fine-Tuning and Generative Reinforcement Policy Optimization.

Perhaps most significant for production deployments, Google's Inference Gateway intelligently load-balances requests across model servers to optimize critical metrics. According to Google, it can reduce time-to-first-token latency by 96% and serving costs by up to 30% through techniques like prefix-cache-aware routing.

The Inference Gateway monitors key metrics including KV cache hits, GPU or TPU utilization, and request queue length, then routes incoming requests to the optimal replica. For conversational AI applications where multiple requests might share context, routing requests with shared prefixes to the same server instance can dramatically reduce redundant computation.
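
The snippet below is an illustrative sketch of that routing idea, not Google's implementation: prefer replicas that likely hold the request's prefix in cache, and break ties by queue length. The replica names and metrics are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Replica:
    name: str
    queue_length: int
    cached_prefixes: set = field(default_factory=set)

def route(request_prefix: str, replicas: list["Replica"]) -> "Replica":
    """Pick a replica: prefix-cache hits first, shortest queue as the tie-breaker."""
    hits = [r for r in replicas if request_prefix in r.cached_prefixes]
    candidates = hits or replicas          # fall back to all replicas on a cache miss
    return min(candidates, key=lambda r: r.queue_length)

replicas = [
    Replica("replica-a", queue_length=12, cached_prefixes={"support-bot-v3"}),
    Replica("replica-b", queue_length=3),
    Replica("replica-c", queue_length=7, cached_prefixes={"support-bot-v3"}),
]

# A request sharing the "support-bot-v3" system prompt lands on replica-c:
# it holds the prefix in cache and has a shorter queue than replica-a.
print(route("support-bot-v3", replicas).name)
```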

The hidden challenge: powering and cooling one-megawatt server racks

Behind these announcements lies a massive physical infrastructure challenge that Google addressed at the recent Open Compute Project EMEA Summit. The company disclosed that it's implementing +/-400 volt direct current power delivery capable of supporting up to one megawatt per rack — a tenfold increase from typical deployments.

"The AI era requires even greater power delivery capabilities," explained Madhusudan Iyengar and Amber Huffman, Google principal engineers, in an April 2025 blog post. "ML will require more than 500 kW per IT rack before 2030."

Google is collaborating with Meta and Microsoft to standardize electrical and mechanical interfaces for high-voltage DC distribution. The company selected 400 VDC specifically to leverage the supply chain established by electric vehicles, "for greater economies of scale, more efficient manufacturing, and improved quality and scale."

On cooling, Google revealed it will contribute its fifth-generation cooling distribution unit design to the Open Compute Project. The company has deployed liquid cooling "at GigaWatt scale across more than 2,000 TPU Pods in the past seven years" with fleet-wide availability of approximately 99.999%.

Water can transport approximately 4,000 times more heat per unit volume than air for a given temperature change — critical as individual AI accelerator chips increasingly dissipate 1,000 watts or more.
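
A quick back-of-the-envelope calculation, using an assumed coolant temperature rise, shows the scale of water flow a one-megawatt rack implies.

```python
# Quick physics sketch of why liquid cooling matters at these power densities.
# The rack power is from the article; the coolant temperature rise is an assumed figure.

rack_power_watts = 1_000_000       # one-megawatt rack
delta_t_celsius = 10.0             # assumed water temperature rise through the rack
c_p_water = 4186.0                 # specific heat of water, J/(kg*K)

# Q = m_dot * c_p * delta_T  =>  m_dot = Q / (c_p * delta_T)
flow_kg_per_s = rack_power_watts / (c_p_water * delta_t_celsius)
print(f"Water flow needed: ~{flow_kg_per_s:.0f} kg/s (roughly {flow_kg_per_s:.0f} liters per second)")

# Moving the same heat with air at the same temperature rise would take thousands of
# times more volume, per the ~4,000x figure cited above.
```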

Custom silicon gambit challenges Nvidia's AI accelerator dominance

Google's announcements come as the AI infrastructure market reaches an inflection point. While Nvidia maintains overwhelming dominance in AI accelerators — holding an estimated 80-95% market share — cloud providers are increasingly investing in custom silicon to differentiate their offerings and improve unit economics.

Amazon Web Services pioneered this approach with Graviton Arm-based CPUs and Inferentia / Trainium AI chips. Microsoft has developed Cobalt processors and is reportedly working on AI accelerators. Google now offers the most comprehensive custom silicon portfolio among major cloud providers.

The strategy faces inherent challenges. Custom chip development requires enormous upfront investment — often billions of dollars. The software ecosystem for specialized accelerators lags behind Nvidia's CUDA platform, which benefits from 15+ years of developer tools. And rapid AI model architecture evolution creates risk that custom silicon optimized for today's models becomes less relevant as new techniques emerge.

Yet Google argues its approach delivers unique advantages. "This is how we built the first TPU ten years ago, which in turn unlocked the invention of the Transformer eight years ago — the very architecture that powers most of modern AI," the company noted, referring to the seminal "Attention Is All You Need" paper from Google researchers in 2017.

The argument is that tight integration — "model research, software, and hardware development under one roof" — enables optimizations impossible with off-the-shelf components.

Beyond Anthropic, several other customers provided early feedback. Lightricks, which develops creative AI tools, reported that early Ironwood testing "makes us highly enthusiastic" about creating "more nuanced, precise, and higher-fidelity image and video generation for our millions of global customers," said Yoav HaCohen, the company's research director.

Google's announcements raise questions that will play out over coming quarters. Can the industry sustain current infrastructure spending, with major AI companies collectively committing hundreds of billions of dollars? Will custom silicon prove economically superior to Nvidia GPUs? How will model architectures evolve?

For now, Google appears committed to a strategy that has defined the company for decades: building custom infrastructure to enable applications impossible on commodity hardware, then making that infrastructure available to customers who want similar capabilities without the capital investment.

As the AI industry transitions from research labs to production deployments serving billions of users, that infrastructure layer — the silicon, software, networking, power, and cooling that make it all run — may prove as important as the models themselves.

And if Anthropic's willingness to commit to accessing up to one million chips is any indication, Google's bet on custom silicon designed specifically for the age of inference may be paying off just as demand reaches its inflection point.

AI Infrastructure and Connectivity Drive Shifting Investor Focus

In a recent CNBC interview, Kevin Mahn, President and CIO at Hennion & Walsh Asset Management, articulated a compelling shift in investor priorities, moving away from conventional macroeconomic indicators to the burgeoning landscape of artificial intelligence. Mahn highlighted that market participants are currently fixated on “big tech earnings, the AI infrastructure spend, and exactly where […]

Leil raises €1.5M to advance SMR management software

Data storage software company Leil raised €1.5 million to help enterprises adopt high-density SMR hard drives, increasing capacity and reducing energy costs.

DualBird Funding Aims to Supercharge AI Data Processing in the Cloud

DualBird's $25 million funding round positions its cloud-native engine to dramatically accelerate data processing for AI workloads, promising significant cost reductions.

Tsuga raises $10M to advance its AI-native observability platform

Tsuga raised $10 million to launch its AI-native observability platform, which gives companies control over their data and costs using a bring-your-own-cloud model.

Perplexity Cracks Code for Trillion Parameter Models on AWS

Perplexity's new open-source kernels make it possible to run massive trillion-parameter AI models on standard AWS cloud infrastructure for the first time.

Snowflake builds new intelligence that goes beyond RAG to query and aggregate thousands of documents at once

Enterprise AI has a data problem. Despite billions in investment and increasingly capable language models, most organizations still can't answer basic analytical questions about their document repositories. The culprit isn't model quality but architecture: Traditional retrieval augmented generation (RAG) systems were designed to retrieve and summarize, not analyze and aggregate across large document sets.

Snowflake is tackling this limitation head-on with a comprehensive platform strategy announced at its BUILD 2025 conference. The company unveiled Snowflake Intelligence, an enterprise intelligence agent platform designed to unify structured and unstructured data analysis, along with infrastructure improvements spanning data integration with Openflow, database consolidation with Snowflake Postgres and real-time analytics with interactive tables. The goal: Eliminate the data silos and architectural bottlenecks that prevent enterprises from operationalizing AI at scale.

A key innovation is Agentic Document Analytics, a new capability within Snowflake Intelligence that can analyze thousands of documents simultaneously. This moves enterprises from basic lookups like "What is our password reset policy?" to complex analytical queries like "Show me a count of weekly mentions by product area in my customer support tickets for the last six months."

The RAG bottleneck: Why sampling fails for analytics

Traditional RAG systems work by embedding documents into vector representations, storing them in a vector database and retrieving the most semantically similar documents when a user asks a question.
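
For readers unfamiliar with the pattern, the toy sketch below shows the retrieval step with hand-written vectors standing in for a real embedding model; it is illustrative only.

```python
import math

# Toy retrieval step of a traditional RAG system. The vectors below are hand-written
# stand-ins for what an embedding model would produce; this is illustrative only.

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

doc_vectors = {
    "password_reset_policy.md": [0.9, 0.1, 0.0],
    "q3_support_summary.pdf":   [0.2, 0.8, 0.1],
}
query_vector = [0.85, 0.15, 0.0]   # "What is our password reset policy?"

best_doc = max(doc_vectors, key=lambda name: cosine(query_vector, doc_vectors[name]))
print(best_doc)   # retrieval surfaces the single most relevant document...

# ...which is why aggregate questions ("count weekly mentions by product area across
# six months of tickets") do not fit this retrieve-and-summarize pattern.
```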

"For RAG to work, it requires that all of the answers that you are searching for already exist in some published way today," Jeff Hollan, head of Cortex AI Agents at Snowflake explained to VentureBeat during a press briefing. "The pattern I think about with RAG is it's like a librarian, you get a question and it tells you, 'This book has the answer on this specific page.'"

However, this architecture fundamentally breaks when organizations need to perform aggregate analysis. If, for example, an enterprise has 100,000 reports and wants to identify all of the reports that talk about a specific business entity and sum up all the revenue discussed in those reports, that's a non-trivial task.

"That's a much more complex thing than just traditional RAG," Hollan said.

This limitation has typically forced enterprises to maintain separate analytics pipelines for structured data in data warehouses and unstructured data in vector databases or document stores. The result is data silos and governance challenges for enterprises.

How Agentic Document Analytics works differently

Snowflake's approach unifies structured and unstructured data analysis within its platform by treating documents as queryable data sources rather than retrieval targets. The system uses AI to extract, structure and index document content in ways that enable SQL-like analytical operations across thousands of documents.

The capability leverages Snowflake's existing architecture. Cortex AISQL handles document parsing and extraction. Interactive Tables and Warehouses deliver sub-second query performance on large datasets. By processing documents within the same governed data platform that houses structured data, enterprises can join document insights with transactional data, customer records and other business information.
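
The following is a generic Python sketch of the "query and analyze" idea rather than Snowflake's actual Cortex AISQL syntax; the extraction function is a hypothetical stand-in for an LLM call.

```python
from collections import Counter

# Generic sketch of "query and analyze" over documents, not Snowflake's Cortex AISQL
# syntax: extract a structured field from every document, then aggregate like a table.

def extract_product_area(ticket_text: str) -> str:
    """Stand-in for an LLM/AISQL extraction step; a real system would call a model here."""
    for area in ("billing", "login", "mobile app"):
        if area in ticket_text.lower():
            return area
    return "other"

tickets = [
    "Customer login fails after password change",
    "Billing statement shows a duplicate charge",
    "Mobile app crashes on checkout",
    "Login loop on SSO",
]

counts = Counter(extract_product_area(t) for t in tickets)
print(counts.most_common())   # e.g. [('login', 2), ('billing', 1), ('mobile app', 1)]
```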

"The value of AI, the power of AI, the productivity and disruptive potential of AI, is created and enabled by connecting with enterprise data," said Christian Kleinerman, EVP of product at Snowflake. 

The company's architecture keeps all data processing within its security boundary, addressing governance concerns that have slowed enterprise AI adoption. The system works with documents across multiple sources. These include PDFs in SharePoint, Slack conversations, Microsoft Teams data and Salesforce records through Snowflake's zero-copy integration capabilities. This eliminates the need to extract and move data into separate AI processing systems.

Comparison with current market approaches

The announcement positions Snowflake differently from both traditional data warehouse vendors and AI-native startups. 

Companies like Databricks have focused on bringing AI capabilities to lakehouses, but typically still rely on vector databases and traditional RAG patterns for unstructured data. OpenAI's Assistants API and Anthropic's Claude both offer document analysis, but are limited by context window sizes.

Vector database providers like Pinecone and Weaviate have built businesses around RAG use cases but sometimes face challenges when customers need analytical queries rather than retrieval-based ones. These systems excel at finding relevant documents but cannot easily aggregate information across large document sets.

Among the key high-value use cases Snowflake highlights for its approach, ones that were previously difficult with RAG-only architectures, is customer support analysis. Instead of manually reviewing support tickets, organizations can query patterns across thousands of interactions. Questions like "What are the top 10 product issues mentioned in support tickets this quarter, broken down by customer segment?" become answerable in seconds.

What this means for enterprise AI strategy

For enterprises building AI strategies, Agentic Document Analytics represents a shift from the "search and retrieve" paradigm of RAG to a "query and analyze" paradigm more familiar from business intelligence tools. 

Rather than deploying separate vector databases and RAG systems for each use case, enterprises can consolidate document analytics into their existing data platform. This reduces infrastructure complexity while extending business intelligence practices to unstructured data.

The capability also democratizes access. Making document analysis queryable through natural language means insights that previously required data science teams become available to business users.

For enterprises looking to lead in AI, the competitive advantage comes not from having better language models, but from analyzing proprietary unstructured data at scale alongside structured business data. Organizations that can query their entire document corpus as easily as they query their data warehouse will gain insights competitors cannot easily replicate.

"AI is a reality today," Kleinerman said. "We have lots of organizations already getting value out of AI, and if anyone is still waiting it out or sitting on the sidelines, our call to action is to start building now."

AI Nature Solutions: Google & WRI Chart a New Course

Google and WRI's new roadmap details how AI nature solutions can overcome systemic barriers, making conservation more accessible, affordable, and effective globally.

AI coding transforms data engineering: How dltHub's open-source Python library helps developers create data pipelines for AI in minutes

A quiet revolution is reshaping enterprise data engineering. Python developers are building production data pipelines in minutes using tools that would have required entire specialized teams just months ago.

The catalyst is dlt, an open-source Python library that automates complex data engineering tasks. The tool has reached 3 million monthly downloads and powers data workflows for more than 5,000 companies across regulated industries, including finance, healthcare and manufacturing. That technology is getting another solid vote of confidence today as dltHub, the Berlin-based company behind the open-source dlt library, is raising $8 million in seed funding led by Bessemer Venture Partners. 

What makes this significant isn't just adoption numbers; it's how developers are using the tool in combination with AI coding assistants to accomplish tasks that previously required infrastructure engineers, DevOps specialists and on-call personnel.

The company is building a cloud-hosted platform that extends its open-source library into a complete end-to-end solution. The platform will allow developers to deploy pipelines, transformations and notebooks with a single command without worrying about infrastructure. This represents a fundamental shift from data engineering requiring specialized teams to becoming accessible to any Python developer.

"Any Python developer should be able to bring their business users closer to fresh, reliable data," Matthaus Krzykowski, dltHub's co-founder and CEO, told VentureBeat in an exclusive interview. "Our mission is to make data engineering as accessible, collaborative and as frictionless as writing Python itself."

From SQL to Python-native data engineering

The problem the company set out to solve emerged from real-world frustrations.

One comes from a fundamental clash between how different generations of developers work with data. Krzykowski pointed to the generation of developers grounded in SQL and relational database technology. On the other hand, a generation of developers is building AI agents with Python.

This divide reflects deeper technical challenges. SQL-based data engineering locks teams into specific platforms and requires extensive infrastructure knowledge. Python developers working on AI need lightweight, platform-agnostic tools that work in notebooks and integrate with large language model (LLM) coding assistants.

The dlt library changes this equation by automating complex data engineering tasks in simple Python code. 

"If you know what a function in Python is, what a list is, a source and resource, then you can write this very declarative, very simple code," Krzykowski explained.

The key technical breakthrough addresses schema evolution automatically. When data sources change their output format, traditional pipelines break.

 "DLT has mechanisms to automatically resolve these issues," Thierry Jean, founding engineer at dltHub, told VentureBeat. "So it will push data, and you can say, 'Alert me if things change upstream,' or just make it flexible enough and change the data and the destination in a way to accommodate."

Real-world developer experience

Hoyt Emerson, data consultant and content creator at The Full Data Stack, recently adopted the tool to move data from Google Cloud Storage to multiple destinations, including Amazon S3 and a data warehouse. Traditional approaches would require platform-specific knowledge for each destination. Emerson told VentureBeat that what he really wanted was a much more lightweight, platform-agnostic way to send data from one spot to another. 

"That's when DLT gave me the aha moment," Emerson said.

He completed the entire pipeline in five minutes using the library's documentation, which made it easy to get up and running quickly and without issue.

The process gets even more powerful when combined with AI coding assistants. Emerson noted that he's using agentic AI coding principles and realized that the dlt documentation could be sent as context to an LLM to accelerate and automate his data work. With documentation as context, Emerson was able to create reusable templates for future projects and used AI assistants to generate deployment configurations.

"It's extremely LLM-friendly because it's very well documented," he said.

The LLM-native development pattern

This combination of well-documented tools and AI assistance represents a new development pattern. The company has optimized specifically for what it calls "YOLO mode" development, where developers copy error messages and paste them into AI coding assistants.

"A lot of these people are literally just copying and pasting error messages and are trying the code editors to figure it out," Krzykowski said. The company takes this behavior seriously enough that they fix issues specifically for AI-assisted workflows.

The results speak to the approach's effectiveness. In September alone, users created more than 50,000 custom connectors using the library. That represents a 20X increase since January, driven largely by LLM-assisted development.

Technical architecture for enterprise scale

The dlt design philosophy prioritizes interoperability over platform lock-in. The tool can deploy anywhere from AWS Lambda to existing enterprise data stacks. It integrates with platforms like Snowflake, while maintaining the flexibility to work with any destination.

"We always believe that DLT needs to be interoperable and modular," Krzykowski explained. "It can be deployed anywhere. It can be on Lambda. It often becomes part of other people's data infrastructures."

Key technical capabilities include:

  • Automatic schema evolution: Handles upstream data changes without breaking pipelines or requiring manual intervention.

  • Incremental loading: Processes only new or changed records, reducing computational overhead and costs.

  • Platform agnostic deployment: Works across cloud providers and on-premises infrastructure without modification.

  • LLM-optimized documentation: Structured specifically for AI assistant consumption, enabling rapid problem-solving and template generation.

The platform currently supports more than 4,600 REST API data sources with continuous expansion driven by user-generated connectors.
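
As one hedged example of how a couple of the capabilities above combine in practice, the sketch below defines a REST API resource with incremental, merge-style loading. The endpoint and field names are hypothetical; `dlt.sources.incremental` and the merge write disposition follow dlt's documented pattern.

```python
import dlt
import requests

# Incremental load from a (hypothetical) REST endpoint, merged on primary key.
@dlt.resource(write_disposition="merge", primary_key="id")
def tickets(
    updated_at=dlt.sources.incremental("updated_at", initial_value="1970-01-01T00:00:00Z")
):
    # dlt persists updated_at.last_value in pipeline state, so each run
    # fetches only records changed since the previous successful run.
    resp = requests.get(
        "https://api.example.com/tickets",               # hypothetical API
        params={"updated_since": updated_at.last_value},
    )
    resp.raise_for_status()
    yield resp.json()

pipeline = dlt.pipeline(pipeline_name="tickets", destination="duckdb", dataset_name="support")
pipeline.run(tickets())
```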

Competing against ETL giants with a code-first approach

The data engineering landscape splits into distinct camps, each serving different enterprise needs and developer preferences. 

Traditional ETL platforms like Informatica and Talend dominate enterprise environments with GUI-based tools that require specialized training but offer comprehensive governance features.

Newer SaaS platforms like Fivetran have gained traction by emphasizing pre-built connectors and managed infrastructure, reducing operational overhead but creating vendor dependency.

The open-source dlt library occupies a fundamentally different position as code-first, LLM-native infrastructure that developers can extend and customize. 

This positioning reflects the broader shift toward what the industry calls the composable data stack, where enterprises build infrastructure from interoperable components rather than monolithic platforms.

More importantly, the intersection with AI creates new market dynamics. "LLMs aren't replacing data engineers," Krzykowski said. "But they radically expand their reach and productivity."

What this means for enterprise data leaders

For enterprises looking to lead in AI-driven operations, this development represents an opportunity to fundamentally rethink data engineering strategies.

The immediate tactical advantages are clear: Organizations can leverage existing Python developers instead of hiring specialized data engineering teams. Those that adapt their tooling and hiring approaches accordingly may find significant cost and agility advantages over competitors still dependent on traditional, team-intensive data engineering.

The question isn't whether this shift toward democratized data engineering will occur. It's how quickly enterprises will adapt to capitalize on it.

Moving past speculation: How deterministic CPUs deliver predictable AI performance

For more than three decades, modern CPUs have relied on speculative execution to keep pipelines full. When it emerged in the 1990s, speculation was hailed as a breakthrough — just as pipelining and superscalar execution had been in earlier decades. Each marked a generational leap in microarchitecture. By predicting the outcomes of branches and memory loads, processors could avoid stalls and keep execution units busy.

But this architectural shift came at a cost: Wasted energy when predictions failed, increased complexity and vulnerabilities such as Spectre and Meltdown. These challenges set the stage for an alternative. As David Patterson observed in 1980, “A RISC potentially gains in speed merely from a simpler design.” Patterson’s principle of simplicity underpins a new alternative to speculation: A deterministic, time-based execution model.

For the first time since speculative execution became the dominant paradigm, a fundamentally new approach has been invented. This breakthrough is embodied in a series of six U.S. patents recently issued by the U.S. Patent and Trademark Office (USPTO). Together, they introduce a radically different instruction execution model. Departing sharply from conventional speculative techniques, this deterministic framework replaces guesswork with a time-based, latency-tolerant mechanism. Each instruction is assigned a precise execution slot within the pipeline, resulting in a rigorously ordered and predictable flow of execution. This model redefines how processors can handle latency and concurrency with greater efficiency and reliability.

A simple time counter deterministically sets the exact future cycle at which each instruction will execute. Each instruction is dispatched to an execution queue with a preset execution time based on resolving its data dependencies and the availability of resources — read buses, execution units and the write bus to the register file. Each instruction remains queued until its scheduled execution slot arrives. This deterministic approach may represent the first major architectural challenge to speculation since it became the standard.
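
The patents describe hardware, but the scheduling idea is easy to sketch in a few lines of toy Python. This is only an illustration of the mechanism as described above, not code from the patents: a scoreboard records the cycle at which each register's value becomes valid, and every instruction gets a preset slot computed from operand readiness and execution-unit availability.

```python
# Toy model of time-counter dispatch -- an illustration of the idea, not the patented design.
# Each instruction: (destination_register, source_registers, result_latency_in_cycles)
program = [
    ("r1", [],           3),   # load-like op with a 3-cycle latency window
    ("r2", ["r1"],       1),   # RAW-dependent on r1
    ("r3", [],           1),   # independent work that can fill r1's latency window
    ("r4", ["r2", "r3"], 1),
]

ready_at = {}   # scoreboard: cycle at which each register's value becomes valid
busy = set()    # cycles already claimed on a single execution unit

def earliest_free(cycle):
    while cycle in busy:
        cycle += 1
    return cycle

schedule = []
for dest, srcs, latency in program:
    operands_ready = max((ready_at.get(r, 0) for r in srcs), default=0)
    slot = earliest_free(operands_ready)   # preset execution slot, fixed before execution
    busy.add(slot)
    ready_at[dest] = slot + latency
    schedule.append((dest, slot))

# r3 lands in cycle 1, inside r1's load latency -- independent work fills the gap
# deterministically, with nothing predicted and nothing to roll back.
print(schedule)   # [('r1', 0), ('r2', 3), ('r3', 1), ('r4', 4)]
```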

The architecture extends naturally into matrix computation, with a RISC-V instruction set proposal under community review. Configurable general matrix multiply (GEMM) units, ranging from 8×8 to 64×64, can operate using either register-based or direct memory access (DMA)-fed operands. This flexibility supports a wide range of AI and high-performance computing (HPC) workloads. Early analysis suggests scalability that rivals Google’s TPU cores, while maintaining significantly lower cost and power requirements.

Rather than a direct comparison with general-purpose CPUs, the more accurate reference point is vector and matrix engines: Traditional CPUs still depend on speculation and branch prediction, whereas this design applies deterministic scheduling directly to GEMM and vector units. This efficiency stems not only from the configurable GEMM blocks but also from the time-based execution model, where instructions are decoded and assigned precise execution slots based on operand readiness and resource availability. 

Execution is never a random or heuristic choice among many candidates, but a predictable, pre-planned flow that keeps compute resources continuously busy. Planned matrix benchmarks will provide direct comparisons with TPU GEMM implementations, highlighting the ability to deliver datacenter-class performance without datacenter-class overhead.

Critics may argue that static scheduling introduces latency into instruction execution. In reality, the latency already exists — waiting on data dependencies or memory fetches. Conventional CPUs attempt to hide it with speculation, but when predictions fail, the resulting pipeline flush introduces delay and wastes power.

The time-counter approach acknowledges this latency and fills it deterministically with useful work, avoiding rollbacks. As the first patent notes, instructions retain out-of-order efficiency: “A microprocessor with a time counter for statically dispatching instructions enables execution based on predicted timing rather than speculative issue and recovery," with preset execution times but without the overhead of register renaming or speculative comparators.

Why speculation stalled

Speculative execution boosts performance by predicting outcomes before they’re known — executing instructions ahead of time and discarding them if the guess was wrong. While this approach can accelerate workloads, it also introduces unpredictability and power inefficiency. Mispredictions inject “No Ops” into the pipeline, stalling progress and wasting energy on work that never completes.

These issues are magnified in modern AI and machine learning (ML) workloads, where vector and matrix operations dominate and memory access patterns are irregular. Long fetches, non-cacheable loads and misaligned vectors frequently trigger pipeline flushes in speculative architectures.

The result is performance cliffs that vary wildly across datasets and problem sizes, making consistent tuning nearly impossible. Worse still, speculative side effects have exposed vulnerabilities that led to high-profile security exploits. As data intensity grows and memory systems strain, speculation struggles to keep pace — undermining its original promise of seamless acceleration.

Time-based execution and deterministic scheduling

At the core of this invention is a vector coprocessor with a time counter for statically dispatching instructions. Rather than relying on speculation, instructions are issued only when data dependencies and latency windows are fully known. This eliminates guesswork and costly pipeline flushes while preserving the throughput advantages of out-of-order execution. Architectures built on this patented framework feature deep pipelines — typically spanning 12 stages — combined with wide front ends supporting up to 8-way decode and large reorder buffers exceeding 250 entries.

As illustrated in Figure 1, the architecture mirrors a conventional RISC-V processor at the top level, with instruction fetch and decode stages feeding into execution units. The innovation emerges in the integration of a time counter and register scoreboard, strategically positioned between fetch/decode and the vector execution units. Instead of relying on speculative comparators or register renaming, the design uses a register scoreboard and a time-resource matrix (TRM) to deterministically schedule instructions based on operand readiness and resource availability.

Figure 1: High-level block diagram of deterministic processor. A time counter and scoreboard sit between fetch/decode and vector execution units, ensuring instructions issue only when operands are ready.

A typical program running on the deterministic processor begins much like it does on any conventional RISC-V system: Instructions are fetched from memory and decoded to determine whether they are scalar, vector, matrix or custom extensions. The difference emerges at the point of dispatch. Instead of issuing instructions speculatively, the processor employs a cycle-accurate time counter, working with a register scoreboard, to decide exactly when each instruction can be executed. This mechanism provides a deterministic execution contract, ensuring instructions complete at predictable cycles and reducing wasted issue slots.

In conjunction with a register scoreboard, the time-resource matrix associates instructions with execution cycles, allowing the processor to plan dispatch deterministically across available resources. The scoreboard tracks operand readiness and hazard information, enabling scheduling without register renaming or speculative comparators. By monitoring dependencies such as read-after-write (RAW) and write-after-read, it ensures hazards are resolved without costly pipeline flushes. As noted in the patent, “in a multi-threaded microprocessor, the time counter and scoreboard permit rescheduling around cache misses, branch flushes, and RAW hazards without speculative rollback.”

Once operands are ready, the instruction is dispatched to the appropriate execution unit. Scalar operations use standard arithmetic logic units (ALUs), while vector and matrix instructions execute in wide execution units connected to a large vector register file. Because instructions launch only when conditions are safe, these units stay highly utilized without the wasted work or recovery cycles caused by mis-predicted speculation.

The key enabler of this approach is a simple time counter that orchestrates execution according to data readiness and resource availability. The same principle applies to memory operations: The memory interface predicts latency windows for loads and stores, allowing the processor to fill those slots with independent instructions and keep execution flowing.

Programming model differences

From the programmer’s perspective, the flow remains familiar — RISC-V code compiles and executes in the usual way. The crucial difference lies in the execution contract: Rather than relying on dynamic speculation to hide latency, the processor guarantees predictable dispatch and completion times. This eliminates the performance cliffs and wasted energy of speculation while still providing the throughput benefits of out-of-order execution.

This perspective underscores how deterministic execution preserves the familiar RISC-V programming model while eliminating the unpredictability and wasted effort of speculation. As John Hennessy put it: “It’s stupid to do work in run time that you can do in compile time” — a remark reflecting the foundations of RISC and its forward-looking design philosophy.

The RISC-V ISA provides opcodes for custom and extension instructions, including floating-point, DSP, and vector operations. The result is a processor that executes instructions deterministically while retaining the benefits of out-of-order performance. By eliminating speculation, the design simplifies hardware, reduces power consumption and avoids pipeline flushes.

These efficiency gains grow even more significant in vector and matrix operations, where wide execution units require consistent utilization to reach peak performance. Vector extensions require wide register files and large execution units, which in speculative processors necessitate expensive register renaming to recover from branch mispredictions. In the deterministic design, vector instructions are executed only after commit, eliminating the need for renaming.

Each instruction is scheduled against a cycle-accurate time counter: “The time counter provides a deterministic execution contract, ensuring instructions complete at predictable cycles and reducing wasted issue slots.” The vector register scoreboard resolves data dependencies before issuing instructions to the execution pipeline. Instructions are dispatched in a known order at the correct cycle, making execution both predictable and efficient.

Vector execution units (integer and floating point) connect directly to a large vector register file. Because instructions are never flushed, there is no renaming overhead. The scoreboard ensures safe access, while the time counter aligns execution with memory readiness. A dedicated memory block predicts the return cycle of loads. Instead of stalling or speculating, the processor schedules independent instructions into latency slots, keeping execution units busy. “A vector coprocessor with a time counter for statically dispatching instructions ensures high utilization of wide execution units while avoiding misprediction penalties.”

In today’s CPUs, compilers and programmers write code assuming the hardware will dynamically reorder instructions and speculatively execute branches. The hardware handles hazards with register renaming, branch prediction and recovery mechanisms. Programmers benefit from performance, but at the cost of unpredictability and power consumption.

In the deterministic time-based architecture, instructions are dispatched only when the time counter indicates their operands will be ready. This means the compiler (or runtime system) doesn’t need to insert guard code for misprediction recovery. Instead, compiler scheduling becomes simpler, as instructions are guaranteed to issue at the correct cycle without rollbacks. For programmers, the ISA remains RISC-V compatible, but deterministic extensions reduce reliance on speculative safety nets.

Application in AI and ML

In AI/ML kernels, vector loads and matrix operations often dominate runtime. On a speculative CPU, misaligned or non-cacheable loads can trigger stalls or flushes, starving wide vector and matrix units and wasting energy on discarded work. A deterministic design instead issues these operations with cycle-accurate timing, ensuring high utilization and steady throughput. For programmers, this means fewer performance cliffs and more predictable scaling across problem sizes. And because the patents extend the RISC-V ISA rather than replace it, deterministic processors remain fully compatible with the RVA23 profile, mainstream toolchains such as GCC and LLVM, and real-time operating systems such as FreeRTOS and Zephyr.

In practice, the deterministic model doesn’t change how code is written — it remains RISC-V assembly or high-level languages compiled to RISC-V instructions. What changes is the execution contract: Rather than relying on speculative guesswork, programmers can expect predictable latency behavior and higher efficiency without tuning code around microarchitectural quirks.

The industry is at an inflection point. AI/ML workloads are dominated by vector and matrix math, where GPUs and TPUs excel — but only by consuming massive power and adding architectural complexity. In contrast, general-purpose CPUs, still tied to speculative execution models, lag behind.

A deterministic processor delivers predictable performance across a wide range of workloads, ensuring consistent behavior regardless of task complexity. Eliminating speculative execution enhances energy efficiency and avoids unnecessary computational overhead. Furthermore, deterministic design scales naturally to vector and matrix operations, making it especially well-suited for AI workloads that rely on high-throughput parallelism. This new deterministic approach may represent the next such leap: The first major architectural challenge to speculation since speculation itself became the standard.

Will deterministic CPUs replace speculation in mainstream computing? That remains to be seen. But with issued patents, proven novelty and growing pressure from AI workloads, the timing is right for a paradigm shift. Taken together, these advances signal deterministic execution as the next architectural leap — redefining performance and efficiency just as speculation once did.

Speculation marked the last revolution in CPU design; determinism may well represent the next.

Thang Tran is the founder and CTO of Simplex Micro.


The missing data link in enterprise AI: Why agents need streaming context, not just better prompts

Enterprise AI agents today face a fundamental timing problem: They can't easily act on critical business events because they aren't always aware of them in real-time.

The challenge is infrastructure. Most enterprise data lives in databases fed by extract-transform-load (ETL) jobs that run hourly or daily — ultimately too slow for agents that must respond in real time.

One potential way to tackle that challenge is to have agents interface directly with streaming data systems. Among the primary approaches in use today are the open-source Apache Kafka and Apache Flink technologies. Multiple commercial implementations build on those technologies as well; Confluent, led by the original creators of Kafka, is one of them.

Today, Confluent is introducing a real-time context engine designed to solve this latency problem. The technology builds on Apache Kafka, the distributed event streaming platform that captures data as events occur, and open-source Apache Flink, the stream processing engine that transforms those events in real time.

The company is also releasing an open-source framework, Flink Agents, developed in collaboration with Alibaba Cloud, LinkedIn and Ververica. The framework brings event-driven AI agent capabilities directly to Apache Flink, allowing organizations to build agents that monitor data streams and trigger automatically based on conditions without committing to Confluent's managed platform.

"Today, most enterprise AI systems can't respond automatically to important events in a business without someone prompting them first," Sean Falconer, Confluent's head of AI, told VentureBeat. "This leads to lost revenue, unhappy customers or added risk when a payment fails or a network malfunctions."

The significance extends beyond Confluent's specific products. The industry is recognizing that AI agents require different data infrastructure than traditional applications. Agents don't just retrieve information when asked. They need to observe continuous streams of business events and act automatically when conditions warrant. This requires streaming architecture, not batch pipelines.

Batch versus streaming: Why RAG alone isn't enough

To understand the problem, it's important to distinguish between the different approaches to moving data through enterprise systems and how they can connect to agentic AI.

In batch processing, data accumulates in source systems until a scheduled job runs. That job extracts the data, transforms it and loads it into a target database or data warehouse. This might occur hourly, daily or even weekly. The approach works well for analytical workloads, but it creates latency between when something happens in the business and when systems can act on it.

Data streaming inverts this model. Instead of waiting for scheduled jobs, streaming platforms like Apache Kafka capture events as they occur. Each database update, user action, transaction or sensor reading becomes an event published to a stream. Apache Flink then processes these streams to join, filter and aggregate data in real time. The result is processed data that reflects the current state of the business, updating continuously as new events arrive.
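
For a concrete sense of the streaming side, here is a minimal consumer sketch using the confluent-kafka Python client. The broker address, topic and event fields are illustrative; in a production design the joining and aggregation described above would run in Flink rather than in a consumer loop.

```python
import json
from confluent_kafka import Consumer

# Minimal sketch: consume business events as they happen instead of waiting for a nightly job.
consumer = Consumer({
    "bootstrap.servers": "localhost:9092",   # illustrative broker address
    "group.id": "agent-context",
    "auto.offset.reset": "earliest",
})
consumer.subscribe(["orders"])               # hypothetical topic of order events

try:
    while True:
        msg = consumer.poll(1.0)             # events arrive continuously, not on a schedule
        if msg is None or msg.error():
            continue
        event = json.loads(msg.value())
        # In Confluent's architecture, updates like this would feed a Flink job that
        # maintains a derived, always-current view for agents to query.
        print("new event:", event.get("order_id"), event.get("status"))
finally:
    consumer.close()
```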

This distinction becomes critical when you consider what kinds of context AI agents actually need. Much of the current enterprise AI discussion focuses on retrieval-augmented generation (RAG), which handles semantic search over knowledge bases to find relevant documentation, policies or historical information. RAG works well for questions like "What's our refund policy?" where the answer exists in static documents.

But many enterprise use cases require what Falconer calls "structural context" — precise, up-to-date information from multiple operational systems stitched together in real time. Consider a job recommendation agent that requires user profile data from the HR database, browsing behavior from the last hour, search queries from minutes ago and current open positions across multiple systems.

"The part that we're unlocking for businesses is the ability to essentially serve that structural context needed to deliver the freshest version," Falconer said.

The MCP connection problem: Stale data and fragmented context

The challenge isn't simply connecting AI to enterprise data. Model Context Protocol (MCP), introduced by Anthropic earlier this year, already standardized how agents access data sources. The problem is what happens after the connection is made.

In most enterprise architectures today, AI agents connect via MCP to data lakes or warehouses fed by batch ETL pipelines. This creates two critical failures: The data is stale, reflecting yesterday's reality rather than current events, and it's fragmented across multiple systems, requiring significant preprocessing before an agent can reason about it effectively.

The alternative — putting MCP servers directly in front of operational databases and APIs — creates different problems. Those endpoints weren't designed for agent consumption, which can lead to high token costs as agents process excessive raw data and multiple inference loops as they try to make sense of unstructured responses.

"Enterprises have the data, but it's often stale, fragmented or locked in formats that AI can't use effectively," Falconer explained. "The real-time context engine solves this by unifying data processing, reprocessing and serving, turning continuous data streams into live context for smarter, faster and more reliable AI decisions."

The technical architecture: Three layers for real-time agent context

Confluent's platform encompasses three elements that can work together or be adopted separately.

The real-time context engine is the managed data infrastructure layer on Confluent Cloud. Connectors pull data into Kafka topics as events occur. Flink jobs process these streams into "derived datasets" — materialized views joining historical and real-time signals. For customer support, this might combine account history, current session behavior and inventory status into one unified context object. The Engine exposes this through a managed MCP server.

Streaming agents is Confluent's proprietary framework for building AI agents that run natively on Flink. These agents monitor data streams and trigger automatically based on conditions — they don't wait for prompts. The framework includes simplified agent definitions, built-in observability and native Claude integration from Anthropic. It's available in open preview on Confluent's platform.

Flink Agents is the open-source framework developed with Alibaba Cloud, LinkedIn and Ververica. It brings event-driven agent capabilities directly to Apache Flink, allowing organizations to build streaming agents without committing to Confluent's managed platform. They handle operational complexity themselves but avoid vendor lock-in.

Competition heats up for agent-ready data infrastructure

Confluent isn't alone in recognizing that AI agents need different data infrastructure. 

The day before Confluent's announcement, rival Redpanda introduced its own Agentic Data Plane — combining streaming, SQL and governance specifically for AI agents. Redpanda acquired Oxla's distributed SQL engine to give agents standard SQL endpoints for querying data in motion or at rest. The platform emphasizes MCP-aware connectivity, full observability of agent interactions and what it calls "agentic access control" with fine-grained, short-lived tokens.

The architectural approaches differ. Confluent emphasizes stream processing with Flink to create derived datasets optimized for agents. Redpanda emphasizes federated SQL querying across disparate sources. Both recognize agents need real-time context with governance and observability.

Beyond direct streaming competitors, Databricks and Snowflake are fundamentally analytical platforms adding streaming capabilities. Their strength is complex queries over large datasets, with streaming as an enhancement. Confluent and Redpanda invert this: Streaming is the foundation, with analytical and AI workloads built on top of data in motion.

How streaming context works in practice

Among the users of Confluent's system is transportation vendor Busie. The company is building a modern operating system for charter bus companies that helps them manage quotes, trips, payments and drivers in real time. 

"Data streaming is what makes that possible," Louis Bookoff, Busie co-founder and CEO told VentureBeat. "Using Confluent, we move data instantly between different parts of our system instead of waiting for overnight updates or batch reports. That keeps everything in sync and helps us ship new features faster.

Bookoff noted that the same foundation is what will make gen AI valuable for his customers.

"In our case, every action like a quote sent or a driver assigned becomes an event that streams through the system immediately," Bookoff said. "That live feed of information is what will let our AI tools respond in real time with low latency rather than just summarize what already happened."

The challenge, however, is how to understand context. When thousands of live events flow through the system every minute, AI models need relevant, accurate data without getting overwhelmed.

 "If the data isn't grounded in what is happening in the real world, AI can easily make wrong assumptions and in turn take wrong actions," Bookoff said. "Stream processing solves that by continuously validating and reconciling live data against activity in Busie."

What this means for enterprise AI strategy

Streaming context architecture signals a fundamental shift in how AI agents consume enterprise data. 

AI agents require continuous context that blends historical understanding with real-time awareness — they need to know what happened, what's happening and what might happen next, all at once.

For enterprises evaluating this approach, start by identifying use cases where data staleness breaks the agent. Fraud detection, anomaly investigation and real-time customer intervention fail with batch pipelines that refresh hourly or daily. If your agents need to act on events within seconds or minutes of them occurring, streaming context becomes necessary rather than optional.

"When you're building applications on top of foundation models, because they're inherently probabilistic, you use data and context to steer the model in a direction where you want to get some kind of outcome," Falconer said. "The better you can do that, the more reliable and better the outcome."

Intuit learned to build AI agents for finance the hard way: Trust lost in buckets, earned back in spoonfuls

Building AI for financial software requires a different playbook than consumer AI, and Intuit's latest QuickBooks release provides an example.

The company has announced Intuit Intelligence, a system that orchestrates specialized AI agents across its QuickBooks platform to handle tasks including sales tax compliance and payroll processing. These new agents join the existing accounting and project management agents (which have also been updated), along with a unified interface that lets users query data across QuickBooks, third-party systems and uploaded files using natural language.

The new developments follow years of investment and improvement in Intuit's GenOS, allowing the company to build AI capabilities that reduce latency and improve accuracy.

But the real news isn't what Intuit built — it's how they built it and why their design decisions will make AI more usable. The company's latest AI rollout represents an evolution built on hard-won lessons about what works and what doesn't when deploying AI in financial contexts.

What the company learned is sobering: Even when its accounting agent improved transaction categorization accuracy by 20 percentage points on average, it still received complaints about errors.

"The use cases that we're trying to solve for customers include tax and finance; if you make a mistake in this world, you lose trust with customers in buckets and we only get it back in spoonfuls," Joe Preston, Intuit's VP of product and design, told VentureBeat.

The architecture of trust: Real data queries over generative responses

Intuit's technical strategy centers on a fundamental design decision. For financial queries and business intelligence, the system queries actual data, rather than generating responses through large language models (LLMs).

Also critically important: That data isn't all in one place. Intuit's technical implementation allows QuickBooks to ingest data from multiple distinct sources: native Intuit data, OAuth-connected third-party systems like Square for payments and user-uploaded files such as spreadsheets containing vendor pricing lists or marketing campaign data. This creates a unified data layer that AI agents can query reliably.

"We're actually querying your real data," Preston explained. "That's very different than if you were to just copy, paste out a spreadsheet or a PDF and paste into ChatGPT."

This architectural choice means that the Intuit Intelligence system functions more as an orchestration layer. It's a natural language interface to structured data operations. When a user asks about projected profitability or wants to run payroll, the system translates the natural language query into database operations against verified financial data.
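
Intuit has not published its implementation, but the pattern described here — the model translates a question into vetted data operations and never generates the numbers itself — can be sketched. Everything below, from the query catalog to the `call_llm` stand-in, is hypothetical and purely illustrative.

```python
import json
import sqlite3

# Vetted query templates -- the only SQL the system will ever run.
QUERY_CATALOG = {
    "profit_by_quarter": (
        "SELECT quarter, SUM(revenue) - SUM(expenses) AS profit "
        "FROM ledger WHERE year = :year GROUP BY quarter"
    ),
}

def call_llm(question: str, allowed_templates: list) -> str:
    """Hypothetical stand-in for an LLM call that picks a template and parameters."""
    return json.dumps({"template": "profit_by_quarter", "params": {"year": 2024}})

def answer(question: str, conn: sqlite3.Connection):
    choice = json.loads(call_llm(question, list(QUERY_CATALOG)))
    sql = QUERY_CATALOG[choice["template"]]          # unknown templates raise KeyError by design
    rows = conn.execute(sql, choice["params"]).fetchall()
    # The numbers come from the ledger itself; the SQL can be surfaced for explainability.
    return rows, sql
```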

This matters because Intuit's internal research has uncovered widespread shadow AI usage. When surveyed, 25% of accountants using QuickBooks admitted they were already copying and pasting data into ChatGPT or Google Gemini for analysis.

Intuit's approach treats AI as a query translation and orchestration mechanism, not a content generator. This reduces the hallucination risk that has plagued AI deployments in financial contexts.

Explainability as a design requirement, not an afterthought

Beyond the technical architecture, Intuit has made explainability a core user experience across its AI agents. This goes beyond simply providing correct answers: It means showing users the reasoning behind automated decisions.

When Intuit's accounting agent categorizes a transaction, it doesn't just display the result; it shows the reasoning. This isn't marketing copy about explainable AI; it's actual UI displaying the data points and logic behind each decision.

"It's about closing that trust loop and making sure customers understand the why," Alastair Simpson, Intuit's VP of design, told VentureBeat.

This becomes particularly critical when you consider Intuit's user research: While half of small businesses describe AI as helpful, nearly a quarter haven't used AI at all. The explanation layer serves both populations: Building confidence for newcomers, while giving experienced users the context to verify accuracy.

The design also enforces human control at critical decision points. This approach extends beyond the interface. Intuit connects users directly with human experts, embedded in the same workflows, when automation reaches its limits or when users want validation.

Navigating the transition from forms to conversations

One of Intuit's more interesting challenges involves managing a fundamental shift in user interfaces. Preston described it as having one foot in the past and one foot in the future.

"This isn't just Intuit, this is the market as a whole," said Preston. "Today we still have a lot of customers filling out forms and going through tables full of data. We're investing a lot into leaning in and questioning the ways that we do it across our products today, where you're basically just filling out, form after form, or table after table, because we see where the world is headed, which is really a different form of interacting with these products."

This creates a product design challenge: How do you serve users who are comfortable with traditional interfaces while gradually introducing conversational and agentic capabilities?

Intuit's approach has been to embed AI agents directly into existing workflows. This means not forcing users to adopt entirely new interaction patterns. The payments agent appears alongside invoicing workflows; the accounting agent enhances the existing reconciliation process rather than replacing it. This incremental approach lets users experience AI benefits without abandoning familiar processes.

What enterprise AI builders can learn from Intuit's approach

Intuit's experience deploying AI in financial contexts surfaces several principles that apply broadly to enterprise AI initiatives.

Architecture matters for trust: In domains where accuracy is critical, consider whether you need content generation or data query translation. Intuit's decision to treat AI as an orchestration and natural language interface layer dramatically reduces hallucination risk and avoids using AI as a generative system.

Explainability must be designed in, not bolted on: Showing users why the AI made a decision isn't optional when trust is at stake. This requires deliberate UX design. It may constrain model choices.

User control preserves trust during accuracy improvements: Intuit's accounting agent improved categorization accuracy by 20 percentage points. Yet, maintaining user override capabilities was essential for adoption.

Transition gradually from familiar interfaces: Don't force users to abandon forms for conversations. Embed AI capabilities into existing workflows first. Let users experience benefits before asking them to change behavior.

Be honest about what's reactive versus proactive: Current AI agents primarily respond to prompts and automate defined tasks. True proactive intelligence that makes unprompted strategic recommendations remains an evolving capability.

Address workforce concerns with tooling, not just messaging: If AI is meant to augment rather than replace workers, provide workers with AI tools. Show them how to leverage the technology.

For enterprises navigating AI adoption, Intuit's journey offers a clear directive. The winning approach prioritizes trustworthiness over capability demonstrations. In domains where mistakes have real consequences, that means investing in accuracy, transparency and human oversight before pursuing conversational sophistication or autonomous action.

Simpson frames the challenge succinctly: "We didn't want it to be a bolted-on layer. We wanted customers to be in their natural workflow, and have agents doing work for customers, embedded in the workflow."

How AI-powered cameras are redefining business intelligence

Presented by Axis Communications


Many businesses are equipped with a network of intelligent eyes that spans their operations. These IP cameras and intelligent edge devices were once focused solely on ensuring the safety of employees, customers, and inventory. They have long proved to be essential tools for businesses, and while that still rings true, they’re now emerging as powerful sources of business intelligence.

These cameras and edge devices have rapidly evolved into real-time data producers. IP cameras can now see and understand, and the accompanying artificial intelligence helps companies and decision-makers generate business intelligence, improve operational efficiency, and gain a competitive advantage.

By treating cameras as vision sensors and sources of operational insight, businesses can transform everyday visibility into measurable business value.

Intelligence on the edge

Network cameras have come a long way since Axis Communications first introduced this technology in 1996. Over time, innovations like the ARTPEC chip, the first chip purpose-built for IP video, helped enhance image quality, analytics, and encoding performance.

Today, these intelligent devices are powering a new generation of business intelligence and operational efficiency solutions via embedded AI. Actionable insights are now fed directly into intelligence platforms, ERP systems, and real-time dashboards, and the results are significant and far-reaching.

In manufacturing, intelligent cameras are detecting defects on the production line early, before an entire production run is compromised. In retail, these cameras can run software that maps customer journeys and optimizes product placement. In healthcare, these solutions help facilities enhance patient care while improving operational efficiency and reducing costs.

The combination of video and artificial intelligence has significantly expanded what cameras can do — transforming them into vital tools for improving business performance.

Proof in practice

Companies are creatively taking advantage of edge devices like AI-enabled cameras to improve business intelligence and operational efficiencies.

BMW has relied on intelligent IP cameras to optimize efficiency and product quality, with AI-driven video systems catching defects that are often invisible to the human eye. Or take Google Cloud’s shelf-checking AI technology, an innovative software that allows retailers to make instant restocking decisions using real-time data.

These technologies appeal to far more than retailers and vendors. The A.C. Camargo Cancer Center in Brazil uses network cameras to reduce theft, assure visitor and employee safety, and optimize patient flow. By relying on newfound business intelligence, the facility has saved more than $2 million in operational costs over two years, with those savings reinvested directly into patient care.

Urban projects can also benefit from edge devices and artificial intelligence. For example, Vanderbilt University turned to video analytics to study traffic flow, relying on AI to uncover the causes of phantom congestion and enabling smarter traffic management. These studies will also benefit the local environment and the public, as the findings can be used to improve safety, air quality, and fuel efficiency.

Each case illustrates the same point: AI-powered cameras can fuel a tangible return on investment and crucial business intelligence, regardless of the industry.

Preparing for the next phase

The role of AI in video intelligence is still expanding, with several emerging trends driving greater advancements and impact in the years ahead:

  • Predictive operations: cameras that are capable of forecasting needs or risks through predictive analytics

  • Versatile analytics: systems that incorporate audio, thermal, and environmental sensors for more comprehensive and accurate insights

  • Technological collaboration: cameras that integrate with other intelligent edge devices to autonomously manage tasks

  • Sustainability initiatives: intelligent technologies that reduce energy use and support resource efficiency

Axis Communications helps advance these possibilities with open-source, scalable systems engineered to address both today’s challenges and tomorrow’s opportunities. By staying ahead of this ever-changing environment, Axis helps ensure that organizations continue to benefit from actionable business intelligence while maintaining the highest standards of security and safety.

Cameras have evolved beyond simple surveillance tools. They are strategic assets that inform operations, foster innovation, and enable future readiness. Business leaders who cling to traditional views of IP cameras and edge devices risk missing opportunities for efficiency and innovation. Those who embrace an AI-driven approach can expect not only stronger security but also better business outcomes.

Ultimately, the value of IP cameras and edge devices lies not in categories but in capabilities. In an era of rapidly evolving artificial intelligence, these unique technologies will become indispensable to overall business success.


About Axis Communications

Axis enables a smarter and safer world by improving security, safety, operational efficiency, and business intelligence. As a network technology company and industry leader, Axis offers video surveillance, access control, intercoms, and audio solutions. These are enhanced by intelligent analytics applications and supported by high-quality training.

Axis has around 5,000 dedicated employees in over 50 countries and collaborates with technology and system integration partners worldwide to deliver customer solutions. Axis was founded in 1984, and the headquarters are in Lund, Sweden.


Sponsored articles are content produced by a company that is either paying for the post or has a business relationship with VentureBeat, and they’re always clearly marked. For more information, contact sales@venturebeat.com.

Thinking Machines challenges OpenAI's AI scaling strategy: 'First superintelligence will be a superhuman learner'

While the world's leading artificial intelligence companies race to build ever-larger models, betting billions that scale alone will unlock artificial general intelligence, a researcher at one of the industry's most secretive and valuable startups delivered a pointed challenge to that orthodoxy this week: The path forward isn't about training bigger — it's about learning better.

"I believe that the first superintelligence will be a superhuman learner," Rafael Rafailov, a reinforcement learning researcher at Thinking Machines Lab, told an audience at TED AI San Francisco on Tuesday. "It will be able to very efficiently figure out and adapt, propose its own theories, propose experiments, use the environment to verify that, get information, and iterate that process."

This breaks sharply with the approach pursued by OpenAI, Anthropic, Google DeepMind, and other leading laboratories, which have bet billions on scaling up model size, data, and compute to achieve increasingly sophisticated reasoning capabilities. Rafailov argues these companies have the strategy backwards: what's missing from today's most advanced AI systems isn't more scale — it's the ability to actually learn from experience.

"Learning is something an intelligent being does," Rafailov said, citing a quote he described as recently compelling. "Training is something that's being done to it."

The distinction cuts to the core of how AI systems improve — and whether the industry's current trajectory can deliver on its most ambitious promises. Rafailov's comments offer a rare window into the thinking at Thinking Machines Lab, the startup co-founded in February by former OpenAI chief technology officer Mira Murati that raised a record-breaking $2 billion in seed funding at a $12 billion valuation.

Why today's AI coding assistants forget everything they learned yesterday

To illustrate the problem with current AI systems, Rafailov offered a scenario familiar to anyone who has worked with today's most advanced coding assistants.

"If you use a coding agent, ask it to do something really difficult — to implement a feature, go read your code, try to understand your code, reason about your code, implement something, iterate — it might be successful," he explained. "And then come back the next day and ask it to implement the next feature, and it will do the same thing."

The issue, he argued, is that these systems don't internalize what they learn. "In a sense, for the models we have today, every day is their first day of the job," Rafailov said. "But an intelligent being should be able to internalize information. It should be able to adapt. It should be able to modify its behavior so every day it becomes better, every day it knows more, every day it works faster — the way a human you hire gets better at the job."

The duct tape problem: How current training methods teach AI to take shortcuts instead of solving problems

Rafailov pointed to a specific behavior in coding agents that reveals the deeper problem: their tendency to wrap uncertain code in try/except blocks — a programming construct that catches errors and allows a program to continue running.

"If you use coding agents, you might have observed a very annoying tendency of them to use try/except pass," he said. "And in general, that is basically just like duct tape to save the entire program from a single error."

Why do agents do this? "They do this because they understand that part of the code might not be right," Rafailov explained. "They understand there might be something wrong, that it might be risky. But under the limited constraint—they have a limited amount of time solving the problem, limited amount of interaction—they must only focus on their objective, which is implement this feature and solve this bug."

The result: "They're kicking the can down the road."

This behavior stems from training systems that optimize for immediate task completion. "The only thing that matters to our current generation is solving the task," he said. "And anything that's general, anything that's not related to just that one objective, is a waste of computation."

Why throwing more compute at AI won't create superintelligence, according to Thinking Machines researcher

Rafailov's most direct challenge to the industry came in his assertion that continued scaling won't be sufficient to reach AGI.

"I don't believe we're hitting any sort of saturation points," he clarified. "I think we're just at the beginning of the next paradigm—the scale of reinforcement learning, in which we move from teaching our models how to think, how to explore thinking space, into endowing them with the capability of general agents."

In other words, current approaches will produce increasingly capable systems that can interact with the world, browse the web, write code. "I believe a year or two from now, we'll look at our coding agents today, research agents or browsing agents, the way we look at summarization models or translation models from several years ago," he said.

But general agency, he argued, is not the same as general intelligence. "The much more interesting question is: Is that going to be AGI? And are we done — do we just need one more round of scaling, one more round of environments, one more round of RL, one more round of compute, and we're kind of done?"

His answer was unequivocal: "I don't believe this is the case. I believe that under our current paradigms, under any scale, we are not enough to deal with artificial general intelligence and artificial superintelligence. And I believe that under our current paradigms, our current models will lack one core capability, and that is learning."

Teaching AI like students, not calculators: The textbook approach to machine learning

To explain the alternative approach, Rafailov turned to an analogy from mathematics education.

"Think about how we train our current generation of reasoning models," he said. "We take a particular math problem, make it very hard, and try to solve it, rewarding the model for solving it. And that's it. Once that experience is done, the model submits a solution. Anything it discovers—any abstractions it learned, any theorems—we discard, and then we ask it to solve a new problem, and it has to come up with the same abstractions all over again."

That approach misunderstands how knowledge accumulates. "This is not how science or mathematics works," he said. "We build abstractions not necessarily because they solve our current problems, but because they're important. For example, we developed the field of topology to extend Euclidean geometry — not to solve a particular problem that Euclidean geometry couldn't handle, but because mathematicians and physicists understood these concepts were fundamentally important."

The solution: "Instead of giving our models a single problem, we might give them a textbook. Imagine a very advanced graduate-level textbook, and we ask our models to work through the first chapter, then the first exercise, the second exercise, the third, the fourth, then move to the second chapter, and so on—the way a real student might teach themselves a topic."

The objective would fundamentally change: "Instead of rewarding their success — how many problems they solved — we need to reward their progress, their ability to learn, and their ability to improve."
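
Thinking Machines has not published training code, so the following is only a schematic Python sketch of the objective shift Rafailov describes: reward the improvement a model shows while working through a curriculum, rather than per-problem success. All of the names (`model.attempt`, `chapter.exercises`, and so on) are hypothetical stand-ins.

```python
# Schematic only: reward learning progress across a "textbook", not per-problem success.

def held_out_score(model, exercises):
    """Fraction of held-out exercises the model currently solves."""
    return sum(model.attempt(ex).is_correct for ex in exercises) / len(exercises)

def progress_reward(model, chapters):
    total = 0.0
    for chapter in chapters:                       # work through the textbook in order
        before = held_out_score(model, chapter.held_out)
        for exercise in chapter.exercises:
            trace = model.attempt(exercise)        # read, reason, try the exercise
            model.update_from_experience(trace)    # internalize what was learned
        after = held_out_score(model, chapter.held_out)
        total += after - before                    # reward improvement, not raw success
    return total
```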

This approach, known as "meta-learning" or "learning to learn," has precedents in earlier AI systems. "Just like the ideas of scaling test-time compute and search and test-time exploration played out in the domain of games first" — in systems like DeepMind's AlphaGo — "the same is true for meta learning. We know that these ideas do work at a small scale, but we need to adapt them to the scale and the capability of foundation models."

The missing ingredients for AI that truly learns aren't new architectures—they're better data and smarter objectives

When Rafailov addressed why current models lack this learning capability, he offered a surprisingly straightforward answer.

"Unfortunately, I think the answer is quite prosaic," he said. "I think we just don't have the right data, and we don't have the right objectives. I fundamentally believe a lot of the core architectural engineering design is in place."

Rather than arguing for entirely new model architectures, Rafailov suggested the path forward lies in redesigning the data distributions and reward structures used to train models.

"Learning, in of itself, is an algorithm," he explained. "It has inputs — the current state of the model. It has data and compute. You process it through some sort of structure, choose your favorite optimization algorithm, and you produce, hopefully, a stronger model."

The question: "If reasoning models are able to learn general reasoning algorithms, general search algorithms, and agent models are able to learn general agency, can the next generation of AI learn a learning algorithm itself?"

His answer: "I strongly believe that the answer to this question is yes."

The technical approach would involve creating training environments where "learning, adaptation, exploration, and self-improvement, as well as generalization, are necessary for success."

"I believe that under enough computational resources and with broad enough coverage, general purpose learning algorithms can emerge from large scale training," Rafailov said. "The way we train our models to reason in general over just math and code, and potentially act in general domains, we might be able to teach them how to learn efficiently across many different applications."

Forget god-like reasoners: The first superintelligence will be a master student

This vision leads to a fundamentally different conception of what artificial superintelligence might look like.

"I believe that if this is possible, that's the final missing piece to achieve truly efficient general intelligence," Rafailov said. "Now imagine such an intelligence with the core objective of exploring, learning, acquiring information, self-improving, equipped with general agency capability—the ability to understand and explore the external world, the ability to use computers, ability to do research, ability to manage and control robots."

Such a system would constitute artificial superintelligence. But not the kind often imagined in science fiction.

"I believe that intelligence is not going to be a single god model that's a god-level reasoner or a god-level mathematical problem solver," Rafailov said. "I believe that the first superintelligence will be a superhuman learner, and it will be able to very efficiently figure out and adapt, propose its own theories, propose experiments, use the environment to verify that, get information, and iterate that process."

This vision stands in contrast to OpenAI's emphasis on building increasingly powerful reasoning systems, or Anthropic's focus on "constitutional AI." Instead, Thinking Machines Lab appears to be betting that the path to superintelligence runs through systems that can continuously improve themselves through interaction with their environment.

The $12 billion bet on learning over scaling faces formidable challenges

Rafailov's appearance comes at a complex moment for Thinking Machines Lab. The company has assembled an impressive team of approximately 30 researchers from OpenAI, Google, Meta, and other leading labs. But it suffered a setback in early October when Andrew Tulloch, a co-founder and machine learning expert, departed to return to Meta after the company launched what The Wall Street Journal called a "full-scale raid" on the startup, approaching more than a dozen employees with compensation packages ranging from $200 million to $1.5 billion over multiple years.

Despite these pressures, Rafailov's comments suggest the company remains committed to its differentiated technical approach. The company launched its first product, Tinker, an API for fine-tuning open-source language models, in October. But Rafailov's talk suggests Tinker is just the foundation for a much more ambitious research agenda focused on meta-learning and self-improving systems.

"This is not easy. This is going to be very difficult," Rafailov acknowledged. "We'll need a lot of breakthroughs in memory and engineering and data and optimization, but I think it's fundamentally possible."

He concluded with a play on words: "The world is not enough, but we need the right experiences, and we need the right type of rewards for learning."

The question for Thinking Machines Lab — and the broader AI industry — is whether this vision can be realized, and on what timeline. Rafailov notably did not offer specific predictions about when such systems might emerge.

In an industry where executives routinely make bold predictions about AGI arriving within years or even months, that restraint is notable. It suggests either unusual scientific humility — or an acknowledgment that Thinking Machines Lab is pursuing a much longer, harder path than its competitors.

For now, the most revealing detail may be what Rafailov didn't say during his TED AI presentation. No timeline for when superhuman learners might emerge. No prediction about when the technical breakthroughs would arrive. Just a conviction that the capability was "fundamentally possible" — and that without it, all the scaling in the world won't be enough.

Research finds that 77% of tech executives say data engineering workloads are getting heavier despite AI tools: Here's why and what to do about it

Data engineers should be working faster than ever. AI-powered tools promise to automate pipeline optimization, accelerate data integration and handle the repetitive grunt work that has defined the profession for decades.

Yet, according to a new survey of 400 senior technology executives by MIT Technology Review Insights in partnership with Snowflake, 77% say their data engineering teams' workloads are getting heavier, not lighter.

The culprit? The very AI tools meant to help are creating a new set of problems.

While 83% of organizations have already deployed AI-based data engineering tools, 45% cite integration complexity as a top challenge. Another 38% are struggling with tool sprawl and fragmentation.

"Many data engineers are using one tool to collect data, one tool to process data and another to run analytics on that data," Chris Child, VP of product for data engineering at Snowflake, told VentureBeat. "Using several tools along this data lifecycle introduces complexity, risk and increased infrastructure management, which data engineers can't afford to take on."

The result is a productivity paradox. AI tools are making individual tasks faster, but the proliferation of disconnected tools is making the overall system more complex to manage. For enterprises racing to deploy AI at scale, this fragmentation represents a critical bottleneck.

From SQL queries to LLM pipelines: The daily workflow shift

The survey found that data engineers spent an average of 19% of their time on AI projects two years ago. Today, that figure has jumped to 37%. Respondents expect it to hit 61% within two years.

But what does that shift actually look like in practice?

Child offered a concrete example. Previously, if the CFO of a company needed to make forecast predictions, they would tap the data engineering team to build a system that correlated unstructured data, such as vendor contracts, with structured data, such as revenue numbers, and surfaced the result in a static dashboard. Connecting these two worlds of data was extremely time-consuming and expensive, requiring lawyers to manually read through each document for key contract terms and upload that information into a database.

Today, that same workflow looks radically different.

"Data engineers can use a tool like Snowflake Openflow to seamlessly bring the unstructured PDF contracts living in a source like Box, together with the structured financial figures into a single platform like Snowflake, making the data accessible to LLMs," Child said. "What used to take hours of manual work is now near instantaneous."

The shift isn't just about speed. It's about the nature of the work itself.

Two years ago, a typical data engineer's day consisted of tuning clusters, writing SQL transformations and ensuring data readiness for human analysts. Today, that same engineer is more likely to be debugging LLM-powered transformation pipelines and setting up governance rules for AI model workflows.

"Data engineers' core skill isn't just coding," Child said. "It's orchestrating the data foundation and ensuring trust, context and governance so AI outputs are reliable."

The tool stack problem: When help becomes hindrance

Here's where enterprises are getting stuck.

The promise of AI-powered data tools is compelling: automate pipeline optimization, accelerate debugging, streamline integration. But in practice, many organizations are discovering that each new AI tool they add creates its own integration headaches.

The survey data bears this out. While AI has led to improvements in output quantity (74% report increases) and quality (77% report improvements), those gains are being offset by the operational overhead of managing disconnected tools.

"The other problem we're seeing is that AI tools often make it easy to build a prototype by stitching together several data sources with an out-of-the-box LLM," Child said. "But then when you want to take that into production, you realize that you don't have the data accessible and you don't know what governance you need, so it becomes difficult to roll the tool out to your users."

For technical decision-makers evaluating their data engineering stack right now, Child offered a clear framework. 

"Teams should prioritize AI tools that accelerate productivity, while at the same time eliminate infrastructure and operational complexity," he said. "This allows engineers to move their focus away from managing the 'glue work' of data engineering and closer to business outcomes."

The agentic AI deployment window: 12 months to get it right

The survey revealed that 54% of organizations plan to deploy agentic AI, meaning autonomous agents that can make decisions and take actions without human intervention, within the next 12 months. Another 20% have already begun doing so.

For data engineering teams, agentic AI represents both an enormous opportunity and a significant risk. Done right, autonomous agents can handle repetitive tasks like detecting schema drift or debugging transformation errors. Done wrong, they can corrupt datasets or expose sensitive information.

"Data engineers must prioritize pipeline optimization and monitoring in order to truly deploy agentic AI at scale," Child said. "It's a low-risk, high-return starting point that allows agentic AI to safely automate repetitive tasks like detecting schema drift or debugging transformation errors when done correctly."

But Child was emphatic about the guardrails that must be in place first.

"Before organizations let agents near production data, two safeguards must be in place: strong governance and lineage tracking, and active human oversight," he said. "Agents must inherit fine-grained permissions and operate within an established governance framework."

The risks of skipping those steps are real. "Without proper lineage or access governance, an agent could unintentionally corrupt datasets or expose sensitive information," Child warned.
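
A minimal sketch of what those two safeguards can look like in code appears below: every agent action is checked against scoped permissions, risky operations require a named human approver, and each attempt is written to a lineage log. All names here are hypothetical and illustrative, not a Snowflake or agent-framework API.

```python
# Hypothetical guardrail pattern: scoped permissions, lineage logging,
# and a human-approval gate for risky agent actions.
from __future__ import annotations

from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class AgentPolicy:
    allowed_actions: set[str]           # e.g. {"read", "write_staging"}
    allowed_tables: set[str]            # tables this agent may touch
    requires_approval: set[str] = field(default_factory=lambda: {"write_prod"})


@dataclass
class LineageEvent:
    timestamp: str
    agent: str
    action: str
    table: str
    approved_by: str | None


def run_agent_action(agent: str, action: str, table: str,
                     policy: AgentPolicy, lineage: list[LineageEvent],
                     approver: str | None = None) -> bool:
    """Permit the action only if it fits the policy; record the attempt either way."""
    permitted = action in policy.allowed_actions and table in policy.allowed_tables
    if permitted and action in policy.requires_approval and approver is None:
        permitted = False               # human oversight required before this runs
    lineage.append(LineageEvent(
        timestamp=datetime.now(timezone.utc).isoformat(),
        agent=agent, action=action, table=table, approved_by=approver,
    ))
    return permitted


# Usage: the agent may fix schema drift in staging, but not write to prod unreviewed.
lineage: list[LineageEvent] = []
policy = AgentPolicy(allowed_actions={"read", "write_staging", "write_prod"},
                     allowed_tables={"orders_staging", "orders"})
assert run_agent_action("drift-fixer", "write_staging", "orders_staging", policy, lineage)
assert not run_agent_action("drift-fixer", "write_prod", "orders", policy, lineage)
```

The detail worth noting in this pattern is that denied attempts are still logged, so auditors can see what an agent tried to do, not only what it was allowed to do.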

The perception gap that's costing enterprises AI success

Perhaps the most striking finding in the survey is a disconnect at the C-suite level.

While 80% of chief data officers and 82% of chief AI officers consider data engineers integral to business success, only 55% of CIOs share that view.

"This shows that the data-forward leaders are seeing data engineering's strategic value, but we need to do more work to help the rest of the C-suite recognize that investing in a unified, scalable data foundation and the people helping drive this is an investment in AI success, not just IT operations," Child said.

That perception gap has real consequences.

Data engineers in the surveyed organizations are already influential in decisions about AI use-case feasibility (53% of respondents) and business units' use of AI models (56%). But if CIOs don't recognize data engineers as strategic partners, they're unlikely to give those teams the resources, authority or seat at the table they need to prevent the kinds of tool sprawl and integration problems the survey identified.

The gap appears to correlate with visibility. Chief data officers and chief AI officers work directly with data engineering teams daily and understand the complexity of what they're managing. CIOs, focused more broadly on infrastructure and operations, may not see the strategic architecture work that data engineers are increasingly doing.

This disconnect also shows up in how different executives rate the challenges facing data engineering teams. Chief AI officers are significantly more likely than CIOs to agree that data engineers' workloads are becoming increasingly heavy (93% vs. 75%). They're also more likely to recognize data engineers' influence on overall AI strategy.

What data engineers need to learn now

The survey identified three critical skills data engineers need to develop: AI expertise, business acumen and communication abilities.

For an enterprise with a 20-person data engineering team, that presents a practical challenge. Do you hire for these skills, train existing engineers or restructure the team? Child's answer suggested the priority should be business understanding.

"The most important skill right now is for data engineers to understand what is critical to their end business users and prioritize how they can make those questions easier and faster to answer," he said.

The lesson for enterprises: business context matters more than adding technical certifications. Child stressed that when data engineers understand why they are performing certain tasks, and what the business impact is, they can better anticipate customer needs and deliver value to the business more quickly.

"The organizations with data engineering teams that prioritize this business understanding will set themselves apart from competition," he said.

For enterprises looking to lead in AI, the solution to the data engineering productivity crisis isn't more AI tools. The organizations that will move fastest are consolidating their tool stacks now, deploying governance infrastructure before agents go into production and elevating data engineers from support staff to strategic architects.

The window is narrow. With 54% planning agentic AI deployment within 12 months and data engineers expected to spend 61% of their time on AI projects within two years, teams that haven't addressed tool sprawl and governance gaps will find their AI initiatives stuck in permanent pilot mode.

Kai-Fu Lee's brutal assessment: America is already losing the AI hardware war to China

China is on track to dominate consumer artificial intelligence applications and robotics manufacturing within years, but the United States will maintain its substantial lead in enterprise AI adoption and cutting-edge research, according to Kai-Fu Lee, one of the world's most prominent AI scientists and investors.

In a rare, unvarnished assessment delivered via video link from Beijing to the TED AI conference in San Francisco Tuesday, Lee — a former executive at Apple, Microsoft, and Google who now runs both a major venture capital firm and his own AI company — laid out a technology landscape splitting along geographic and economic lines, with profound implications for both commercial competition and national security.

"China's robotics has the advantage of having integrated AI into much lower costs, better supply chain and fast turnaround, so companies like Unitree are actually the farthest ahead in the world in terms of building affordable, embodied humanoid AI," Lee said, referring to a Chinese robotics manufacturer that has undercut Western competitors on price while advancing capabilities.

The comments, made to a room filled with Silicon Valley executives, investors, and researchers, represented one of the most detailed public assessments from Lee about the comparative strengths and weaknesses of the world's two AI superpowers — and suggested that the race for artificial intelligence leadership is becoming less a single contest than a series of parallel competitions with different winners.

Why venture capital is flowing in opposite directions in the U.S. and China

At the heart of Lee's analysis lies a fundamental difference in how capital flows in the two countries' innovation ecosystems. American venture capitalists, Lee said, are pouring money into generative AI companies building large language models and enterprise software, while Chinese investors are betting heavily on robotics and hardware.

"The VCs in the US don't fund robotics the way the VCs do in China," Lee said. "Just like the VCs in China don't fund generative AI the way the VCs do in the US."

This investment divergence reflects different economic incentives and market structures. In the United States, where companies have grown accustomed to paying for software subscriptions and where labor costs are high, enterprise AI tools that boost white-collar productivity command premium prices. In China, where software subscription models have historically struggled to gain traction but manufacturing dominates the economy, robotics offers a clearer path to commercialization.

The result, Lee suggested, is that each country is pulling ahead in different domains — and may continue to do so.

"China's got some challenges to overcome in getting a company funded as well as OpenAI or Anthropic," Lee acknowledged, referring to the leading American AI labs. "But I think U.S., on the flip side, will have trouble developing the investment interest and value creation in the robotics" sector.

Why American companies dominate enterprise AI while Chinese firms struggle with subscriptions

Lee was explicit about one area where the United States maintains what appears to be a durable advantage: getting businesses to actually adopt and pay for AI software.

"The enterprise adoption will clearly be led by the United States," Lee said. "The Chinese companies have not yet developed a habit of paying for software on a subscription."

This seemingly mundane difference in business culture — whether companies will pay monthly fees for software — has become a critical factor in the AI race. The explosion of spending on tools like GitHub Copilot, ChatGPT Enterprise, and other AI-powered productivity software has fueled American companies' ability to invest billions in further research and development.

Lee noted that China has historically overcome similar challenges in consumer technology by developing alternative business models. "In the early days of internet software, China was also well behind because people weren't willing to pay for software," he said. "But then advertising models, e-commerce models really propelled China forward."

Still, he suggested, someone will need to "find a new business model that isn't just pay per software per use or per month basis. That's going to not happen in China anytime soon."

The implication: American companies building enterprise AI tools have a window — perhaps a substantial one — where they can generate revenue and reinvest in R&D without facing serious Chinese competition in their core market.

How ByteDance, Alibaba and Tencent will outpace Meta and Google in consumer AI

Where Lee sees China pulling ahead decisively is in consumer-facing AI applications — the kind embedded in social media, e-commerce, and entertainment platforms that billions of people use daily.

"In terms of consumer usage, that's likely to happen," Lee said, referring to China matching or surpassing the United States in AI deployment. "The Chinese giants, like ByteDance and Alibaba and Tencent, will definitely move a lot faster than their equivalent in the United States, companies like Meta, YouTube and so on."

Lee pointed to a cultural advantage: Chinese technology companies have spent the past decade obsessively optimizing for user engagement and product-market fit in brutally competitive markets. "The Chinese giants really work tenaciously, and they have mastered the art of figuring out product market fit," he said. "Now they have to add technology to it. So that is inevitably going to happen."

This assessment aligns with recent industry observations. ByteDance's TikTok became the world's most downloaded app through sophisticated AI-driven content recommendation, and Chinese companies have pioneered AI-powered features in areas like live-streaming commerce and short-form video that Western companies later copied.

Lee also noted that China has already deployed AI more widely in certain domains. "There are a lot of areas where China has also done a great job, such as using computer vision, speech recognition, and translation more widely," he said.

The surprising open-source shift that has Chinese models beating Meta's Llama

Perhaps Lee's most striking data point concerned open-source AI development — an area where China appears to have seized leadership from American companies in a remarkably short time.

"The 10 highest rated open source [models] are from China," Lee said. "These companies have now eclipsed Meta's Llama, which used to be number one."

This represents a significant shift. Meta's Llama models were widely viewed as the gold standard for open-source large language models as recently as early 2024. But Chinese companies — including Lee's own firm, 01.AI, along with Alibaba, Baidu, and others — have released a flood of open-source models that, according to various benchmarks, now outperform their American counterparts.

The open-source question has become a flashpoint in AI development. Lee made an extensive case for why open-source models will prove essential to the technology's future, even as closed models from companies like OpenAI command higher prices and, often, superior performance.

"I think open source has a number of major advantages," Lee argued. With open-source models, "you can examine it, tune it, improve it. It's yours, and it's free, and it's important for building if you want to build an application or tune the model to do something specific."

He drew an analogy to operating systems: "People who work in operating systems loved Linux, and that's why its adoption went through the roof. And I think in the future, open source will also allow people to tune a sovereign model for a country, make it work better for a particular language."

Still, Lee predicted both approaches will coexist. "I don't think open source models will win," he said. "I think just like we have Apple, which is closed, but provides a somewhat better experience than Android... I think we're going to see more apps using open-source models, more engineers wanting to build open-source models, but I think more money will remain in the closed model."

Why China's manufacturing advantage makes the robotics race 'not over, but' nearly decided

On robotics, Lee's message was blunt: the combination of China's manufacturing prowess, lower costs, and aggressive investment has created an advantage that will be difficult for American companies to overcome.

When asked directly whether the robotics race was already over with China victorious, Lee hedged only slightly. "It's not over, but I think the U.S. is still capable of coming up with the best robotic research ideas," he said. "But the VCs in the U.S. don't fund robotics the way the VCs do in China."

The challenge is structural. Building robots requires not just software and AI, but hardware manufacturing at scale — precisely the kind of integrated supply chain and low-cost production that China has spent decades perfecting. While American labs at universities and companies like Boston Dynamics continue to produce impressive research prototypes, turning those prototypes into affordable commercial products requires the manufacturing ecosystem that China possesses.

Companies like Unitree have demonstrated this advantage concretely. The company's humanoid robots and quadrupedal robots cost a fraction of their American-made equivalents while offering comparable or superior capabilities — a price-to-performance ratio that could prove decisive in commercial markets.

What worries Lee most: not AGI, but the race itself

Despite his generally measured tone about China's AI development, Lee expressed concern about one area where he believes the global AI community faces real danger — not the far-future risk of superintelligent AI, but the near-term consequences of moving too fast.

When asked about AGI risks, Lee reframed the question. "I'm less afraid of AI becoming self-aware and causing danger for humans in the short term," he said, "but more worried about it being used by bad people to do terrible things, or by the AI race pushing people to work so hard, so fast and furious and move fast and break things that they build products that have problems and holes to be exploited."

He continued: "I'm very worried about that. In fact, I think some terrible event will happen that will be a wake up call from this sort of problem."

Lee's perspective carries unusual weight because of his unique vantage point across both Chinese and American AI development. Over a career spanning more than three decades, he has held senior positions at Apple, Microsoft, and Google, while also founding Sinovation Ventures, which has invested in more than 400 companies across both countries. His AI company, 01.AI, founded in 2023, has released several open-source models that rank among the most capable in the world.

For American companies and policymakers, Lee's analysis presents a complex strategic picture. The United States appears to have clear advantages in enterprise AI software, fundamental research, and computing infrastructure. But China is moving faster in consumer applications, manufacturing robotics at lower costs, and potentially pulling ahead in open-source model development.

The bifurcation suggests that rather than a single "winner" in AI, the world may be heading toward a technology landscape where different countries excel in different domains — with all the economic and geopolitical complications that implies.

As the TED AI conference continued Wednesday, Lee's assessment hung over subsequent discussions. His message seemed clear: the AI race is not one contest, but many — and the United States and China are each winning different races.

Standing in the conference hall afterward, one venture capitalist, who asked not to be named, summed up the mood in the room: "We're not competing with China anymore. We're competing on parallel tracks." Whether those tracks eventually converge — or diverge into entirely separate technology ecosystems — may be the defining question of the next decade.
