The Environmental Cost of AI
The Energy Cost of Training
Training a large language model requires running thousands of GPUs continuously for weeks or months, consuming enormous amounts of electricity. A 2019 paper by Strubell, Ganesh, and McCallum estimated that training a large Transformer-based NLP model with neural architecture search emitted approximately 284 metric tons of CO2 (about 626,000 pounds), roughly five times the lifetime emissions of an average American car. That estimate was for a model with a fraction of the parameters of current systems. GPT-4, with an estimated 1.8 trillion parameters, reportedly required approximately 25,000 NVIDIA A100 GPUs running for 90 to 100 days, consuming an estimated 50 to 100 gigawatt-hours of electricity.
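The order of magnitude of such estimates can be checked with back-of-the-envelope arithmetic. The sketch below is not a published methodology: the GPU count and duration are the figures quoted above, while the 400 W per-GPU draw and the `overhead_factor` (standing in for host servers, networking, and cooling) are illustrative assumptions.

```python
def training_energy_gwh(num_gpus, gpu_power_w, days, overhead_factor=1.0):
    """Estimate the electricity used by a training run, in gigawatt-hours."""
    hours = days * 24
    kwh = num_gpus * gpu_power_w * hours / 1000 * overhead_factor
    return kwh / 1_000_000  # 1 GWh = 1,000,000 kWh


# GPU count and duration are the figures quoted above. The 400 W per-GPU
# draw and the 2.0 overhead factor are illustrative assumptions.
gpu_only = training_energy_gwh(25_000, 400, 90)
with_overhead = training_energy_gwh(25_000, 400, 100, overhead_factor=2.0)

print(f"GPUs only, 90 days: {gpu_only:.1f} GWh")
print(f"With 2x overhead, 100 days: {with_overhead:.1f} GWh")
```

Under these assumptions the GPU-only figure is about 21.6 GWh and the overhead-inclusive figure about 48 GWh, so reaching the 50 to 100 GWh range quoted above implies substantial non-GPU power draw or less favorable assumptions than the ones used here.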
The energy cost of training scales with model size and training duration. Scaling laws, discovered empirically by researchers at OpenAI and DeepMind, show that model performance improves predictably with more parameters, more training data, and more compute. This creates pressure to train ever-larger models, because each generation of larger models achieves meaningfully better performance. The compute used for the largest training runs has doubled approximately every 6 months since 2010, far outpacing improvements in hardware energy efficiency. The result is that total energy consumption for frontier model training has increased by roughly 1,000x over the past decade.
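The relationship between the doubling rate and the energy growth figure can be made explicit. The sketch below assumes a simple identity (energy growth equals compute growth divided by the gain in compute per unit of energy); the function name and that identity are illustrative assumptions, not something taken from the scaling-law literature.

```python
def growth_factor(years, doubling_months):
    """Multiplicative growth after `years` given a fixed doubling period."""
    return 2 ** (years * 12 / doubling_months)


compute_growth = growth_factor(years=10, doubling_months=6)  # 2**20
energy_growth = 1_000  # the roughly 1,000x figure quoted above

# Assumed identity: energy growth = compute growth / efficiency gain.
implied_efficiency_gain = compute_growth / energy_growth

print(f"Compute growth over a decade: {compute_growth:,.0f}x")
print(f"Implied compute-per-joule gain: {implied_efficiency_gain:,.0f}x")
```

Doubling every 6 months for ten years is twenty doublings, or roughly a millionfold increase in compute. Read this way, the 1,000x energy figure implies that about three of those six orders of magnitude were absorbed by efficiency gains and the rest by additional electricity.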
The carbon footprint of training depends heavily on the electricity source. A model trained in Quebec, where over 99% of electricity comes from hydropower, produces a fraction of the carbon emissions of the same model trained in a region powered by coal or natural gas. Google reports a carbon-free energy percentage of approximately 64% across its global operations. Microsoft has committed to 100% renewable energy for its data centers. These commitments are meaningful but have not kept pace with the growth in AI energy demand: both companies reported significant increases in total carbon emissions in 2023 and 2024, driven substantially by data center expansion for AI workloads.
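The dependence on the electricity source is a single multiplication: energy used times the carbon intensity of the grid. In the sketch below, the 50 GWh run is the low end of the range quoted earlier, and the three intensity values are rough illustrative assumptions chosen to show the spread, not measurements for any specific grid.

```python
def emissions_tonnes(energy_gwh, grams_co2_per_kwh):
    """Convert an energy total and a grid carbon intensity to metric tons of CO2."""
    kwh = energy_gwh * 1_000_000
    grams = kwh * grams_co2_per_kwh
    return grams / 1_000_000  # grams -> metric tons


# Illustrative carbon intensities in grams of CO2 per kWh (assumptions).
GRID_INTENSITY = {"hydro-dominated": 20, "natural gas": 450, "coal": 900}
RUN_ENERGY_GWH = 50  # low end of the training estimate quoted earlier

results = {
    grid: emissions_tonnes(RUN_ENERGY_GWH, intensity)
    for grid, intensity in GRID_INTENSITY.items()
}

for grid, tonnes in results.items():
    print(f"{grid}: {tonnes:,.0f} t CO2")
```

With these assumed intensities the same training run differs by a factor of 45 in emissions between the cleanest and dirtiest grid, which is why siting decisions matter as much as model size.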
Failed training runs, hyperparameter searches, and experimentation multiply the energy cost beyond the final model. For every published model, dozens of experimental runs were conducted during development, each consuming significant energy but producing no deployed system. The total energy cost of developing a frontier model, including all experimentation, is estimated at 2 to 5 times the energy cost of the final training run alone. This experimentation overhead is inherent to the research process but is rarely accounted for in published energy estimates.
The Energy Cost of Inference
Once trained, an AI model must be deployed and run on servers that consume energy for every query they process. A single ChatGPT query consumes an estimated 10 times the electricity of a Google search, approximately 0.003 kilowatt-hours compared to 0.0003 kilowatt-hours. This per-query cost is small, but ChatGPT alone handles hundreds of millions of queries daily. At that volume, the aggregate inference load for a major AI service amounts to a continuous power draw on the order of tens of megawatts, sustained around the clock.
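Per-query figures become meaningful only when multiplied by volume. The sketch below takes the 0.0003 kWh search figure and the tenfold ratio quoted above; the 300 million queries per day is an assumed value within "hundreds of millions", and all names are hypothetical.

```python
SEARCH_KWH = 0.0003            # per-search figure quoted above
CHAT_TO_SEARCH_RATIO = 10      # tenfold ratio quoted above
QUERIES_PER_DAY = 300_000_000  # assumption: within "hundreds of millions"


def daily_energy_mwh(queries_per_day, kwh_per_query):
    """Total daily energy in megawatt-hours for a given query volume."""
    return queries_per_day * kwh_per_query / 1000


chat_kwh = SEARCH_KWH * CHAT_TO_SEARCH_RATIO
daily_mwh = daily_energy_mwh(QUERIES_PER_DAY, chat_kwh)
average_mw = daily_mwh / 24  # average continuous power draw

print(f"Per query: {chat_kwh * 1000:.1f} Wh")
print(f"Per day: {daily_mwh:,.0f} MWh")
print(f"Average draw: {average_mw:.1f} MW")
```

Under these assumptions the service uses roughly 900 MWh per day, equivalent to an average continuous draw of about 37.5 MW.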
The International Energy Agency projected that global electricity consumption by data centers, including AI and cryptocurrency workloads, could exceed 1,000 terawatt-hours annually by 2026, roughly equivalent to Japan's total electricity consumption. Goldman Sachs estimated that data centers could account for approximately 8% of total U.S. electricity consumption by 2030, up from about 3% in 2022. This growing demand is already affecting energy infrastructure: utilities in Virginia, Texas, and Georgia have reported that data center power requests are straining grid capacity, and some have delayed connecting new data centers due to insufficient power supply.
Efficiency improvements partially offset growing demand. Model distillation creates smaller, faster models that approximate the performance of larger ones at a fraction of the energy cost. Quantization reduces the precision of model weights, decreasing memory and compute requirements. Sparse attention mechanisms reduce the computational cost per token. Specialized AI accelerator chips, like Google's TPUs and custom silicon from other providers, achieve higher performance per watt than general-purpose GPUs. These improvements are significant, often reducing inference costs by a factor of 2 to 10, but the growth in model size and deployment scale has outpaced efficiency gains, resulting in net increases in total energy consumption.
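Of the techniques above, quantization is the easiest to illustrate in a few lines. The following is a minimal sketch of symmetric per-tensor 8-bit quantization in plain Python, assuming a single shared scale for all weights; production systems typically use per-channel or per-group scales and specialized kernels, and the weight values here are hypothetical.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization of floats to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [round(w / scale) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Map quantized integers back to approximate float values."""
    return [q * scale for q in quantized]


weights = [0.12, -0.5, 0.33, 0.9, -0.07]  # hypothetical weight values
quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))

print("quantized:", quantized)
print(f"scale: {scale:.6f}, max round-trip error: {max_error:.6f}")
```

Each quantized weight fits in one byte instead of the four bytes of a 32-bit float, a fourfold reduction in weight storage, at the cost of a rounding error of at most half the scale per weight.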
Water Consumption and Cooling
Data centers generate enormous amounts of heat that must be removed to keep servers operating within safe temperature ranges. Evaporative cooling, the most cost-effective cooling method for large data centers, consumes substantial amounts of water: a large data center can consume 3 to 5 million gallons per day. AI workloads, which drive GPUs at near-maximum utilization, generate more heat per rack than traditional computing workloads, further increasing cooling requirements.
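Water use is often expressed as Water Usage Effectiveness (WUE), the liters of water consumed per kilowatt-hour of IT energy. The sketch below shows how a daily water figure follows from a facility's load and its WUE; the 100 MW load and the WUE of 1.8 L/kWh are illustrative assumptions, not data for any particular site.

```python
LITERS_PER_US_GALLON = 3.785411784


def daily_water_gallons(it_load_mw, wue_liters_per_kwh):
    """Estimate daily on-site water use from IT load and WUE."""
    kwh_per_day = it_load_mw * 1000 * 24
    liters = kwh_per_day * wue_liters_per_kwh
    return liters / LITERS_PER_US_GALLON


# Assumed facility: 100 MW of IT load at a WUE of 1.8 liters per kWh.
gallons = daily_water_gallons(it_load_mw=100, wue_liters_per_kwh=1.8)

print(f"Estimated water use: {gallons / 1_000_000:.2f} million gallons/day")
```

With these inputs the estimate is a little over 1.1 million gallons per day. The result scales linearly with both load and WUE, so multi-million-gallon figures correspond to larger campuses or less water-efficient cooling than assumed here.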
Microsoft reported a 34% increase in global water consumption from 2021 to 2022, an increase that outside researchers have attributed largely to AI-related data center expansion. Google's water consumption increased by 20% over the same period. These increases occur in the context of growing water scarcity in many regions where data centers are concentrated. Data centers in water-stressed areas of the American Southwest, the Middle East, and parts of India draw from the same water supplies that serve agriculture, industry, and residential needs.
Alternative cooling technologies reduce but do not eliminate water consumption. Liquid cooling systems circulate coolant directly through server racks, achieving higher efficiency than air-based evaporative cooling. Immersion cooling submerges servers in non-conductive liquid, removing heat even more efficiently. Free-air cooling, used in cold climates, draws outside air through the data center when temperatures permit. Some data centers, like those operated by Google in Finland and Microsoft in Sweden, use seawater or other non-potable water sources. These innovations reduce the fresh water impact per unit of compute, but the rapid growth in total AI compute means that aggregate water consumption continues to rise.
Hardware Manufacturing and E-Waste
The environmental impact of AI extends beyond operational energy and water to the manufacturing of the hardware that runs AI workloads. Semiconductor fabrication is among the most resource-intensive manufacturing processes in existence. Producing advanced chips such as the NVIDIA H100 GPU requires ultrapure water (approximately 8,000 to 10,000 gallons per wafer processed), hazardous chemicals (hydrofluoric acid, sulfuric acid, and hundreds of specialty chemicals), rare earth elements and conflict minerals, and enormous amounts of energy (a leading-edge fabrication facility draws roughly 100 megawatts of continuous power). The supply chain spans global mining, refining, fabrication, assembly, and testing, with environmental impacts at every stage.
AI hardware has a shorter useful lifespan than traditional computing hardware because the rapid pace of AI capability improvement makes older hardware economically obsolete. A GPU purchased for AI training in 2022 may be outperformed by a factor of 5 to 10 by hardware available in 2025, creating strong economic incentives to replace equipment frequently. This accelerated replacement cycle increases the volume of e-waste, which contains toxic materials including lead, mercury, cadmium, and flame retardants that can contaminate soil and groundwater if improperly disposed of. The global e-waste problem is already severe: the UN's Global E-waste Monitor estimated that only about 17% of the e-waste generated worldwide in 2019 was formally collected and recycled.
AI as an Environmental Solution
The environmental ledger of AI includes significant positive entries alongside the costs. AI is being applied to some of the most important environmental challenges, and these applications could produce environmental benefits that exceed the technology's direct costs. Climate modeling uses AI to improve the accuracy and resolution of climate projections, helping societies prepare for and mitigate climate change. DeepMind's machine learning models have improved medium-range weather forecasting accuracy while reducing computational cost by orders of magnitude compared to traditional physics-based models.
Energy grid optimization uses AI to balance supply and demand in electrical grids with high penetrations of intermittent renewable sources like wind and solar. DeepMind's application of machine learning to Google's own data center cooling systems reduced cooling energy by up to 40%. Applied to electrical grids more broadly, similar optimization could reduce the energy wasted through inefficient generation, transmission, and distribution. AI-optimized grid management is considered essential for integrating the large amounts of renewable energy needed to decarbonize electricity production.
Materials discovery uses AI to accelerate the identification of new materials for clean energy technology. Google DeepMind's GNoME system predicted 2.2 million new crystal structures, roughly 380,000 of which are predicted to be stable, a massive expansion of known stable materials that could include candidates for better batteries, solar cells, catalysts, and energy storage systems. Traditional materials discovery requires synthesizing and testing candidates in the laboratory, a process that takes months per candidate. AI screening can evaluate thousands of candidates computationally in days, dramatically accelerating the search for materials that could enable the energy transition.
Environmental monitoring uses AI to process satellite imagery for deforestation tracking, biodiversity monitoring, ocean health assessment, and pollution detection at scales impossible for human analysts. Conservation organizations use AI-powered camera traps to monitor endangered species across vast wilderness areas. Agricultural AI optimizes irrigation, fertilization, and pesticide application, reducing resource consumption and environmental runoff while maintaining crop yields. These applications demonstrate that AI can be a powerful tool for environmental protection, not just an environmental burden.
Accounting for the Full Picture
Assessing AI's net environmental impact requires accounting for both direct costs (energy, water, hardware, e-waste) and indirect benefits (efficiency gains, environmental applications, research acceleration). This accounting is difficult because the costs are concentrated and measurable while the benefits are diffuse and counterfactual: how much energy would have been wasted without AI grid optimization, and how much faster did materials discovery proceed with AI than it would have with traditional methods? Honest assessment requires acknowledging both sides without letting either dominate the narrative.
The trajectory matters as much as the current state. AI's environmental footprint is growing rapidly, and the path it follows depends on choices being made now: where data centers are located (renewable vs. fossil fuel electricity), what hardware is used (efficiency-optimized vs. maximum performance), what models are deployed (efficient architectures vs. brute-force scaling), and whether environmental costs are internalized by AI companies or externalized onto communities and ecosystems. Transparency about AI's environmental impact has improved at major companies but still falls short of comprehensive lifecycle accounting, and it is a prerequisite for making these choices well.
AI's environmental cost is real and growing, with training energy, inference energy, water consumption, and hardware manufacturing all scaling rapidly. AI also enables significant environmental benefits through climate modeling, grid optimization, materials discovery, and environmental monitoring. The net impact depends on deployment choices that society is making right now.