Researchers and companies are developing decentralised AI training systems that spread model training across independent nodes worldwide, aiming to reduce the industry's growing energy footprint without waiting for nuclear-powered data centres or next-generation hardware.
AI's energy consumption has become one of the industry's most pressing problems. Data centres supporting large model training already carry a substantial carbon footprint, and that footprint grows with each new generation of frontier models. While major tech companies are exploring long-term fixes — including nuclear energy partnerships — a parallel effort is targeting the training process itself, which represents one of the most energy-intensive phases of any model's life cycle.
Spreading the Compute Load Across the Globe
Decentralised training distributes model training across a network of independent nodes rather than concentrating it in a single data centre or provider. The core appeal is geographic flexibility: compute can run wherever energy already exists, whether in a dormant research-lab server or a solar-powered home, rather than requiring new grid infrastructure to be built around a centralised facility.
The hardware industry is already moving in this direction. Nvidia launched Spectrum-XGS Ethernet, a scale-across networking technology that, according to the company, can deliver the performance needed for large-scale AI training across geographically separated data centres. Cisco introduced its 8223 router, designed to connect dispersed AI clusters. Both products reflect a shift away from the assumption that world-class training demands a single, tightly coupled supercluster.
"We want to convert your home into a fully functional data center." — Greg Osuri, Akash Network
Companies are also harvesting idle compute in existing servers. Akash Network, a peer-to-peer cloud computing marketplace, operates what cofounder and CEO Greg Osuri describes as the "Airbnb for data centres": owners of unused or underused GPUs register as providers, and those needing compute rent capacity from them. Osuri notes that the industry is "transitioning from only relying on large, high-density GPUs to now considering smaller GPUs," which broadens the pool of usable hardware considerably.
The Software Problem: Synchronising Across Slow, Unreliable Links
Distributed hardware alone is not sufficient. Training AI models requires constant synchronisation of model parameters, a process built around the ultra-fast interconnects inside a single data centre rather than the consumer-grade internet links connecting dispersed nodes. Two challenges dominate: high communication costs from constantly exchanging model weights, and poor fault tolerance, where a single failed node can force an entire training batch to restart.
Federated learning addresses part of this problem. As explained by Lalana Kagal, a principal research scientist at MIT's Computer Science and Artificial Intelligence Laboratory (CSAIL), a central server distributes a global model to participating organisations, which train locally on their own data and return only model weights — not raw data — to be aggregated. The cycle repeats until training is complete.
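The cycle Kagal describes can be sketched in a few lines of Python. This is a toy illustration, not any library's API: the "model" is a single scalar weight fitted by gradient descent, and names such as `client_update` and `aggregate` are hypothetical.

```python
# Toy sketch of one federated-learning loop (FedAvg-style averaging).
# Model: a single weight w fitting y = w * x on each client's private data.

def client_update(global_w, data, lr=0.02, steps=5):
    """Train locally on the client's own data; only the weight is returned."""
    w = global_w
    for _ in range(steps):
        grad = sum(2 * x * (w * x - y) for x, y in data) / len(data)
        w -= lr * grad
    return w  # model weight leaves the client -- the raw data never does

def aggregate(client_weights):
    """Server step: average the returned weights into a new global model."""
    return sum(client_weights) / len(client_weights)

# Two clients whose private datasets both follow y = 3x.
clients = [[(1.0, 3.0), (2.0, 6.0)], [(3.0, 9.0), (4.0, 12.0)]]

w = 0.0  # initial global model
for _round in range(20):  # repeat until training is complete
    local_weights = [client_update(w, d) for d in clients]
    w = aggregate(local_weights)

print(round(w, 2))  # converges to 3.0
```

Each round, the server ships the global weight out, the clients train on data that never leaves their machines, and only the updated weights travel back to be averaged.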
But communication overhead remains significant. Researchers at Google DeepMind developed DiLoCo — a distributed low-communication optimisation algorithm — specifically to reduce how often nodes need to synchronise. DiLoCo organises training into what Google DeepMind research scientist Arthur Douillard calls "islands of compute": groups of chips that train independently and synchronise only occasionally. A chip failure within one island does not interrupt the others. In testing, however, the team observed diminishing performance beyond eight islands.
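The islands-of-compute idea reduces, in miniature, to running many local steps between rare synchronisations. The sketch below is a deliberate simplification: DiLoCo's actual outer optimiser applies Nesterov momentum to the averaged pseudo-gradients, which is omitted here, and the toy scalar model and all names are illustrative.

```python
# Simplified sketch of DiLoCo-style training: "islands" train independently
# for many inner steps and synchronise only at occasional outer steps.

def inner_steps(w, data, lr=0.02, h=50):
    """One island trains alone for h steps before any communication."""
    for _ in range(h):
        grad = sum(2 * x * (w * x - y) for x, y in data) / len(data)
        w -= lr * grad
    return w

# Two islands holding different slices of data that follow y = 3x.
islands = [[(1.0, 3.0), (2.0, 6.0)], [(3.0, 9.0), (4.0, 12.0)]]

w = 0.0  # shared global model
for _outer in range(5):  # only 5 synchronisations across the whole run
    local = [inner_steps(w, d) for d in islands]
    # Outer step: average the pseudo-gradients (global minus local weights).
    # Real DiLoCo feeds these into a Nesterov-momentum outer optimiser.
    pseudo_grad = sum(w - lw for lw in local) / len(local)
    w -= pseudo_grad  # with outer lr = 1 this reduces to plain averaging

print(round(w, 2))  # converges to 3.0
```

Communication happens five times instead of at every step, and if one island crashes mid-round, the others' inner loops proceed untouched.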
From Research Paper to Real-World Deployments
An improved variant, Streaming DiLoCo, reduces bandwidth requirements further by synchronising model knowledge gradually in the background during computation — analogous, Douillard says, to streaming a video before it has fully downloaded. The approach means training can proceed without stopping for synchronisation steps.
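The staggered-synchronisation idea can be illustrated schematically: rather than averaging the whole model at each sync event, only one parameter fragment is exchanged at a time, so each event costs a fraction of the bandwidth. The sketch below is a toy of the schedule only; local training between syncs is elided, and every name and number is hypothetical.

```python
# Toy sketch of Streaming-DiLoCo-style staggered sync: the model is split
# into fragments, and each sync event averages just one fragment across
# replicas while the rest of the model keeps its local values.

FRAGMENTS = 2

def sync_fragment(replicas, frag):
    """Average a single fragment across replicas; others are untouched."""
    avg = sum(r[frag] for r in replicas) / len(replicas)
    for r in replicas:
        r[frag] = avg
    return 1  # floats communicated per replica this event: one fragment

# Two replicas, each holding a two-fragment parameter vector.
replicas = [[0.0, 10.0], [4.0, 2.0]]

sent = 0
for step in range(1, 5):
    # ... local training of all fragments would proceed here ...
    frag = step % FRAGMENTS      # staggered schedule: fragment 1, 0, 1, 0
    sent += sync_fragment(replicas, frag)

print(replicas, sent)
```

After the alternating events, every fragment has been averaged, but no single event ever moved the full model; a whole-model sync would have cost `FRAGMENTS` floats per replica per event instead of one.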
The research has moved quickly from paper to deployment. AI development platform Prime Intellect used a DiLoCo variant as a core component of its 10-billion-parameter INTELLECT-1 model, trained across five countries on three continents. 0G Labs, which makes a decentralised AI operating system, adapted DiLoCo to train a 107-billion-parameter foundation model across segregated clusters with limited bandwidth. The popular open-source deep learning framework PyTorch has added DiLoCo to its repository of fault-tolerance techniques.
"A lot of engineering has been done by the community to take our DiLoCo paper and integrate it in a system learning over consumer-grade internet," Douillard says.
Tapping Solar-Powered Homes as Training Nodes
Akash Network is pushing the concept furthest with its Starcluster programme, which aims to enrol solar-powered homes — using the desktops and laptops inside them — as active training nodes. Osuri acknowledges the requirements are not trivial: participants would need solar panels, consumer-grade GPUs, backup batteries, and redundant internet connections. The programme is working with industry partners to subsidise battery costs and simplify the setup for homeowners.
Backend integration to allow homes to participate as providers in the Akash Network is already underway, with the team targeting 2027 for broader rollout. The programme also envisions expanding to schools and community sites.
Douillard acknowledges that decentralised training methods "are arguably more complex" than conventional approaches but argues they offer a trade-off: the ability to use data centres in distant locations without needing to build ultra-fast interconnects between them, with fault tolerance built into the architecture rather than bolted on.
What This Means
Decentralised training represents a credible near-term path to reducing AI's energy demands by utilising existing, often renewable-powered hardware — and early deployments at the billion-parameter scale suggest the approach can work in practice, not just in research settings.
