Updated Jun 2026

AI in Physics

A New Instrument, Not a New Theory

A Method, Not a Topic

Most pages on this site describe pieces of nature. This one is different. Artificial intelligence is not a piece of nature. It is a method, a way of doing science, the way the telescope and the microscope were methods. The interesting question is not whether AI is part of physics but where in physics it actually moves the needle, and where the noise around it is louder than the signal. The honest answer in 2026 is: AI is now indispensable for a few specific jobs, useful but oversold for several more, and so far entirely silent on the deepest questions. That balance is what this page tries to draw.

The framing is recent enough that the Nobel Committee felt the need to formalize it. The 2024 Nobel Prize in Physics was awarded to John Hopfield and Geoffrey Hinton for foundational discoveries that made modern neural networks possible. The same year, the Chemistry Nobel went to Demis Hassabis and John Jumper at Google DeepMind for AlphaFold and to David Baker for computational protein design. Two Nobel Prizes in the same year for what is essentially the same idea: that a sufficiently large neural network, trained on enough data, can learn the structure of physical systems well enough to predict them. The committees are not awarding new physics. They are awarding a new instrument.

Solving the Equations We Could Not Solve

The first place AI has earned its keep is on equations that humans wrote down decades ago but cannot solve. The classic example is the Schrödinger equation for a system of many electrons. We have known the equation since 1926. We have never been able to solve it exactly for anything more complex than the hydrogen atom. Every method physicists have invented since then is some kind of approximation that breaks down somewhere.

In 2020 David Pfau and colleagues at DeepMind introduced FermiNet, a neural network whose architecture is constructed to respect the antisymmetry required of electronic wave functions. Train the network with variational Monte Carlo and you get ground-state energies for atoms and small molecules that beat coupled-cluster methods, the previous gold standard. A series of follow-ups extended this to molecules with strong correlations, dissociation curves, and excited states. A 2024 Science paper from the same group used neural networks to compute quantum excited states from first principles. A 2025 framework called QiankunNet replaced the original architecture with a transformer, the same machinery that powers large language models, and pushed accuracy further.
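To make "variational Monte Carlo" concrete, here is a minimal sketch on the one atom that can be checked by hand, hydrogen, in atomic units. It is not FermiNet: the trial wave function exp(-alpha * r), the parameter grid, and the function names are illustrative assumptions, and the positions are drawn directly from a known distribution rather than by the Metropolis sampling a real calculation would need. The idea is the same, though: guess a wave function, estimate its energy by sampling, and keep the parameters that give the lowest energy.

```python
import random

def local_energy(r, alpha):
    """Local energy (H psi) / psi for psi = exp(-alpha * r), hydrogen atom."""
    return -0.5 * alpha**2 + (alpha - 1.0) / r

def vmc_energy(alpha, samples=20000, seed=0):
    """Monte Carlo estimate of the energy for one value of alpha."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(samples):
        # The radial density r^2 * |psi|^2 is a Gamma(shape=3, scale=1/(2*alpha))
        # distribution, so it can be sampled directly.
        r = rng.gammavariate(3.0, 1.0 / (2.0 * alpha))
        total += local_energy(r, alpha)
    return total / samples

# Hypothetical grid of trial parameters; a network would have millions.
alphas = [0.6, 0.8, 1.0, 1.2, 1.4]
energies = {alpha: vmc_energy(alpha) for alpha in alphas}
best_alpha = min(energies, key=energies.get)
```

The lowest estimate lands at alpha = 1.0 with energy -0.5 hartree, the exact hydrogen ground state. With many electrons there is no such closed-form family to scan, which is the gap a neural-network wave function is meant to fill.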

The same idea applies elsewhere. In lattice quantum chromodynamics, neural networks now serve as proposal distributions in Monte Carlo sampling, speeding up calculations of proton mass and meson form factors that used to take months. In fluid dynamics, neural-network closures for turbulence are starting to replace hand-tuned eddy-viscosity models in large eddy simulations. In condensed matter, neural quantum states learn ground-state wave functions of frustrated spin systems where no traditional method works. The pattern is consistent: when the equation is known but intractable, a neural network trained on simulation data or on the variational principle can outperform every method that came before. This is real progress, not hype.

Searching the Materials Combinatorial Space

The number of possible crystalline arrangements of the chemical elements is roughly the number of stars in the observable universe. A lab can synthesize and test one a day. A century of materials science has explored a tiny fraction of the space. AI has changed the search rate by orders of magnitude.

In November 2023, Google DeepMind announced GNoME, a graph neural network that predicted 2.2 million new candidate crystal structures, of which 381,000 were classified as stable enough to be worth synthesizing. The number was breathtaking: equivalent, the announcement claimed, to nearly 800 years of conventional crystallography. A companion paper from Berkeley Lab demonstrated that an autonomous robotic lab could synthesize 41 of 58 GNoME-suggested compounds over 17 days. According to the announcement, external researchers had already independently synthesized 736 of the structures in concurrent work. The Materials Project database, the standard reference, expanded by several multiples overnight.

The honest follow-up came in April 2024 when Anthony Cheetham and Ram Seshadri published a critical reanalysis in Chemistry of Materials. They concluded that the GNoME database contained "scant evidence for compounds that fulfill the trifecta of novelty, credibility, and utility." A large fraction of the predictions were trivial variations on known structures, others were radioactive compounds with no plausible application, and many entries posited orderings of metal ions that are unlikely to occur in reality. The 381,000 stable candidates remained, but the implicit claim that AI had increased the catalog of useful materials tenfold had to be retracted in spirit if not formally. The early excitement around GNoME settled into a more modest narrative: AI is good at proposing candidates, but the human chemist still picks the few worth making.

Microsoft published MatterGen in Nature in January 2025, a diffusion model that generates inorganic crystal structures conditioned on desired properties. Rather than scoring random candidates the way GNoME did, MatterGen generates structures directly toward a target band gap or bulk modulus. The team validated the approach by synthesizing one suggested compound and measuring a property within 20% of the target. The model is open-source and now widely used. On the protein side, AlphaFold 3 arrived in May 2024 and extended the AlphaFold pipeline from single proteins to complexes with DNA, RNA, ions, and small-molecule ligands. The structural-biology field has effectively been re-tooled around it. Two million researchers across 190 countries have used AlphaFold 2 or 3 to date, a footprint that few other physical-science tools have ever achieved.

A vast lattice of candidate crystal structures scanned by a search network
381,000 stable crystal candidates proposed in weeks. A human chemist still picks the few worth making.

Steering a Plasma in Real Time

A magnetic confinement fusion reactor is a control problem before it is a physics problem. The plasma inside a tokamak sits at a hundred million degrees, and if it touches the wall the experiment ends. Holding it in shape requires adjusting nineteen or so magnetic coils at kilohertz rates based on dozens of measurements. The control laws used to be hand-tuned by physicists, adjusted slowly from one plasma shot to the next.

In February 2022, a team from DeepMind and the Swiss Plasma Center at EPFL trained a deep reinforcement-learning agent to control the Variable Configuration Tokamak in Lausanne. The agent took 90 measurements in, computed 19 voltage commands out, ran at 10 kilohertz, and was deployed on the real device without further tuning. It maintained plasma shapes that would have required weeks of hand-tuning by experts, and even discovered new configurations that the team had not anticipated. The Nature paper was a landmark: it was the first time a learned controller had successfully run a real fusion plasma.

Follow-up work between 2023 and 2025 closed several open gaps. Newer versions reduced steady-state error by up to 65% and shortened training time from weeks to days. Other groups built RL controllers for DIII-D, KSTAR, and EAST. The most consequential application is not on existing experimental devices but on ITER and the next generation of demonstration reactors, where the control problem becomes high-stakes: a disruption in a reactor-scale plasma can damage the machine. Whether AI-trained controllers can be certified safe enough for that role is the active engineering question.

Glowing torus of fusion plasma held in shape by magnetic field lines inside a tokamak
A learned controller running a real fusion plasma at ten thousand decisions per second.

Weather Forecasting Beats Its Own Equations

Weather forecasting is the most public success story of AI in physics. Numerical weather prediction is the largest operational physics calculation on Earth: petabytes of observations, exaflops of compute, twice-daily runs that government agencies bet billions of dollars on. The European Centre for Medium-Range Weather Forecasts has dominated the global rankings for decades. In late 2022, DeepMind's GraphCast outperformed the ECMWF's flagship HRES model on 90 percent of more than 1,300 verification targets, using a model trained on decades of archived reanalysis data that produces a ten-day forecast in under a minute on a single accelerator chip. Huawei's Pangu-Weather, Nvidia's FourCastNet, and Microsoft's Aurora hit similar benchmarks shortly afterward.

The ECMWF itself responded by building its own neural-network operational model, the Artificial Intelligence Forecasting System, or AIFS. AIFS Single went operational on 25 February 2025, and the ensemble version on 1 July 2025, with further upgrades rolling out since. ECMWF stopped publishing graphical products from the external research-AI weather models because their own performed at least as well. NOAA's National Hurricane Center now references GraphCast and Pangu output in its operational forecast discussions. For routine medium-range forecasting, AI has effectively replaced the equations that physicists wrote down for a century.

The caveat is severe events. A 2026 Science Advances study showed that the major AI weather models systematically underpredict record-breaking heat, cold, and wind events, and that the gap widens with how extreme the event is. AIFS specifically underpredicts peak storm winds because its mean-squared-error training pressure smooths the atmospheric gradients that produce sharp peaks. Physics-based models still hold an edge on the tail of the distribution where extreme events live. The frontier in 2026 is "physics-augmented" AI: hybrid systems that keep the speed and accuracy of neural networks for typical days while constraining the network with conservation laws so that storms still look like storms. ECMWF's published roadmap targets a fully hybrid ML-augmented model by the end of the decade.
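The smoothing effect of a mean-squared-error objective can be seen in a toy calculation. The sketch below is an illustration under invented assumptions (a one-dimensional "wind profile", a Gaussian storm of fixed strength, uncertainty only in where the storm sits); it is not a model of AIFS or any real forecasting system. When several outcomes are equally plausible, the single forecast with the lowest expected squared error is their average, and the average of sharp peaks in different places is a low, broad hump.

```python
import numpy as np

grid = np.linspace(-10.0, 10.0, 201)

def storm(center, peak=40.0, width=1.0):
    """Hypothetical wind-speed profile: one narrow Gaussian peak."""
    return peak * np.exp(-0.5 * ((grid - center) / width) ** 2)

# Equally plausible outcomes: the same storm at uncertain positions.
centers = np.linspace(-3.0, 3.0, 13)
outcomes = np.stack([storm(c) for c in centers])

# The forecast that minimizes expected squared error is the pointwise mean.
mse_forecast = outcomes.mean(axis=0)
sharp_forecast = storm(0.0)   # a forecast that still looks like a storm

def expected_mse(forecast):
    """Average squared error of one forecast over all plausible outcomes."""
    return float(((outcomes - forecast) ** 2).mean())

peak_in_every_outcome = float(outcomes.max(axis=1).min())
peak_in_mse_forecast = float(mse_forecast.max())
```

Here `expected_mse(mse_forecast)` is lower than `expected_mse(sharp_forecast)`, yet `peak_in_mse_forecast` is less than half of `peak_in_every_outcome`. The averaged forecast wins on the score it was trained on while predicting a peak wind that never occurs in any individual outcome.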

Global weather pattern forecast overlaid with a neural network's prediction grid
Trained on archived reanalysis, runs in under a minute, beats a century of equation-based forecasting on the typical day.

Hunting Anomalies in Mountains of Data

The Large Hadron Collider produces about a billion proton-proton collisions per second. Only a fraction can be saved to disk, so a multi-tier trigger system has to decide in microseconds which events to keep. The selection used to rest on a hand-written rule set drawn up by physicists who knew, more or less, what they were looking for. Anomaly-detection neural networks now help by flagging events that look unlike anything in the rule book. The ATLAS and CMS experiments both run trained autoencoders directly in their data-selection pipelines, alongside traditional jet-tagging networks that classify the particles produced.
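The autoencoder idea can be sketched in a few lines. The example below is a toy under stated assumptions: the five "features", the two hidden directions, and the 99% threshold are invented for illustration, and the autoencoder is a linear one fitted in closed form with PCA, not the nonlinear networks the experiments run in their triggers. The principle carries over: compress each event, reconstruct it, and flag the events the model reconstructs badly.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical "ordinary" events: 5 measured features that really vary
# along only 2 hidden directions, plus a little detector noise.
basis = np.array([[1.0, 0.0, 1.0, 0.0, 1.0],
                  [0.0, 1.0, 0.0, 1.0, 0.0]])
ordinary = rng.normal(size=(500, 2)) @ basis + 0.05 * rng.normal(size=(500, 5))

# Linear autoencoder fitted with PCA: keep the top 2 directions.
mean = ordinary.mean(axis=0)
_, _, vt = np.linalg.svd(ordinary - mean, full_matrices=False)
components = vt[:2]

def reconstruction_error(events):
    centered = np.atleast_2d(events) - mean
    code = centered @ components.T       # encode: 5 features -> 2 numbers
    decoded = code @ components          # decode: 2 numbers -> 5 features
    return ((centered - decoded) ** 2).sum(axis=1)

# Flag anything reconstructed worse than 99% of ordinary events.
threshold = np.quantile(reconstruction_error(ordinary), 0.99)

typical_event = np.array([0.5, -1.0]) @ basis
odd_event = typical_event + np.array([1.0, 0.0, -1.0, 0.0, 0.0])  # off-pattern

typical_flagged = bool(reconstruction_error(typical_event)[0] > threshold)
odd_flagged = bool(reconstruction_error(odd_event)[0] > threshold)
```

The off-pattern event is flagged and the typical one is not. Note what the model was fitted on: only ordinary events. It has no notion of what the anomaly should look like, only of what normal looks like.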

The promise is that an anomaly the physicist did not expect might be saved instead of discarded. The pitfall is that the network is trained on Standard Model simulations: if the new physics it should detect looks nothing like what was simulated, the network might still throw it away. So far no anomaly-detection method has produced a genuinely new particle. But the technique is now standard, and it lets the experiments search wider than the original cuts permitted.

The same pattern applies in gravitational-wave astronomy: convolutional networks now denoise LIGO and Virgo strain data faster than matched filtering, allowing real-time alerts to electromagnetic-follow-up telescopes. In astronomy more broadly, neural networks classify galaxy morphologies, identify strong-lens candidates in Euclid and Rubin Observatory imaging, and detect transients in the firehose of new data the Vera Rubin Observatory has been producing since 2025. None of these tasks would be intractable for a human, but they would require armies of humans, and the human gets bored faster than the network.

Cascade of LHC collision tracks with one event highlighted by an anomaly-detection network
A billion collisions per second. The network keeps the ones that do not fit the rule book.

Can a Machine Discover a Law?

The most ambitious claim made for AI in physics is that a network, given experimental data, will write down a new physical law. The technique of searching for a formula that fits a data set is called symbolic regression. The headline implementation was AI Feynman, an algorithm released in 2019 by Silviu-Marian Udrescu and Max Tegmark at MIT. Tested on numerical data for a hundred equations from the Feynman Lectures on Physics, AI Feynman recovered all hundred. Previous public software could only solve 71. The benchmark has been expanded several times since.

But the honest assessment is that no AI system has yet discovered a fundamental physical law that was not already known. AI Feynman rediscovers Newton, Coulomb, Lorentz. It does not invent new ones. Modern symbolic-regression engines are useful for fitting effective models in soft-matter physics, geophysics, biology, and chemistry, where the underlying equations may not yet be known. They have produced useful new compact expressions for, say, planetary climate sensitivity or galaxy halo properties. They have not produced anything anyone calls a new fundamental law.
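What "rediscovering" a law from data means can be shown with a deliberately crude sketch. This is plain brute-force search, not the AI Feynman algorithm, and the synthetic data, the restriction to monomials, and the exponent range are all assumptions chosen to keep the example tiny. The search is handed numbers generated from a Coulomb-like law and asked which formula in its menu fits them best.

```python
import itertools

# Synthetic "measurements" generated from F = q1 * q2 / r**2.
data = [
    (q1, q2, r, q1 * q2 / r**2)
    for q1, q2, r in itertools.product([1.0, 2.0, 3.0], [0.5, 1.5], [1.0, 2.0, 4.0])
]

# Hypothetical search space: q1**a * q2**b * r**c with small integer exponents.
exponents = range(-2, 3)

def loss(a, b, c):
    """Sum of squared errors of one candidate formula over the data."""
    return sum((q1**a * q2**b * r**c - f) ** 2 for q1, q2, r, f in data)

best = min(itertools.product(exponents, repeat=3), key=lambda abc: loss(*abc))
```

The winner is `(1, 1, -2)`, the inverse-square law. The limitation is equally visible: the search can only return a formula that was already on the menu someone wrote for it.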

Several theorists have argued this is the wrong way to ask the question. A genuine new law of physics, in the sense of relativity or quantum mechanics, is not the next-best curve through a data set. It is a conceptual reorganization that explains why several apparently unrelated curves match. Whether any neural network is capable of that kind of conceptual leap is a question about intelligence itself, and physicists who write about it tend to land on either "not yet, probably someday" or "not ever, in principle." The honest summary in 2026 is that we do not know.

What Makes a Physics AI Different

If you are reading this page on a phone or a laptop, there is a fair chance the page itself was edited with help from a model of the same general kind being discussed here. The same broad technology underlies the tools described above. So what is the difference between an AI that helps a physicist do physics and a chat assistant that helps anyone write an email? It is not what the network is made of. It is what the physicist decided to build into the network before any training began.

A general-purpose language model has one job during training: given a stretch of text, predict the next word. That objective is purely statistical. Nothing in the model forbids it from describing a process that violates the conservation of energy, or assigning the wrong sign to a magnetic force, or confidently inventing a citation that does not exist. The model can sound credible about physics because a lot of physics writing went into its training, but no physical law is built into its structure. The laws live in the prose it absorbed, not in its wiring. That is why a model of this kind is a useful writing assistant for a working physicist and a poor calculator. It is trained to sound right, not to be right.

FermiNet, the network mentioned earlier for solving the electron equations, is the cleanest contrast. Its layers are mathematically constructed so that swapping any two electrons in the input automatically flips the sign of the output. This antisymmetry is the mathematical root of the Pauli exclusion principle, the rule that no two electrons can share the same state, and it is hard-coded directly into the architecture. The network cannot violate it. Not because it was trained to obey, but because the structure of the layers makes obedience the only behavior the network is allowed to have. The physics is not in the data the network learns from. It is in the geometry of the network itself.
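The classic way to build in that sign flip is a determinant, and a small sketch shows why it cannot fail. The code below is not FermiNet's architecture, which is considerably richer; the three one-dimensional "orbitals" and the electron positions are invented for illustration. Each row of the matrix belongs to one electron, so exchanging two electrons exchanges two rows, and a determinant changes sign when two rows are swapped.

```python
import numpy as np

def orbitals(x):
    """Hypothetical one-electron functions evaluated at position x."""
    envelope = np.exp(-x**2)
    return np.array([envelope, x * envelope, x**2 * envelope])

def psi(positions):
    """Antisymmetric wave function: determinant of the orbital matrix.

    Row i holds every orbital evaluated at electron i.
    """
    matrix = np.stack([orbitals(x) for x in positions])
    return float(np.linalg.det(matrix))

value = psi([0.3, -0.7, 1.1])
value_swapped = psi([-0.7, 0.3, 1.1])      # electrons 0 and 1 exchanged
value_coincident = psi([0.3, 0.3, 1.1])    # two electrons at the same point
```

`value_swapped` equals `-value`, and `value_coincident` is zero: two electrons in the same state have zero amplitude. No training was involved in either result; they follow from the structure alone.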

The pattern repeats across every example on this page. GNoME treats a crystal as a graph, with atoms as nodes and bonds as connections, because a graph-based network gives the same answer no matter how the atoms are numbered, and atoms in a crystal really are interchangeable. MatterGen uses the same engine that generates AI images, except instead of moving through pixel space it moves through the space of crystal structures. The reinforcement-learning controller that ran the fusion plasma at the Swiss Plasma Center was built on a formal model of "make a decision, watch what happens, make the next decision," which is what plasma control actually is. In every case a physicist looked at the problem first, identified the symmetries and the structure of the situation, and chose an architecture that respects them automatically. The network never has to learn that atoms can be renumbered or that electrons obey the exclusion principle. It is structurally incapable of doing otherwise.
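The renumbering argument for graphs can be checked directly. The sketch below is a toy with assumed ingredients (four atoms in a ring, three made-up features per atom, one round of message passing, random untrained weights); it is not GNoME. Every operation treats all atoms and all bonds alike and ends in a sum, so relabelling the atoms cannot change the answer.

```python
import numpy as np

def graph_energy(features, edges, weight):
    """One round of message passing followed by a sum over atoms."""
    messages = np.zeros_like(features)
    for i, j in edges:                   # undirected bond: send both ways
        messages[i] += features[j]
        messages[j] += features[i]
    hidden = np.tanh((features + messages) @ weight)
    return float(hidden.sum())

rng = np.random.default_rng(0)
features = rng.normal(size=(4, 3))       # 4 atoms, 3 made-up features each
weight = rng.normal(size=(3, 2))         # untrained, random weights
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]

# Renumber the atoms: old atom k receives the new label perm[k].
perm = [2, 0, 3, 1]
relabelled_features = np.empty_like(features)
for old, new in enumerate(perm):
    relabelled_features[new] = features[old]
relabelled_edges = [(perm[i], perm[j]) for i, j in edges]

original = graph_energy(features, edges, weight)
relabelled = graph_energy(relabelled_features, relabelled_edges, weight)
```

`original` and `relabelled` agree to floating-point precision even though the weights are random. The invariance is a property of the architecture, present before any learning happens.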

This is the line between a general AI and a physics AI. The general one is a statistical compressor of human-written text. The physics one is general machine learning with the relevant physical law engineered into the structure as a hard constraint. The general AI has to guess. The physics AI gets the guess for free and uses its capacity to learn the part that is genuinely uncertain. That is why an ordinary language model is useful for writing emails and weak for predicting the energy of a molecule, and why FermiNet can outperform long-established quantum chemistry methods but cannot help you compose a sonnet. They are tools for different jobs, and the architectural choice of which symmetries and structures to wire in by hand before training is most of what separates a physics instrument from a writing assistant.

What none of this engineering reaches is the part of the network that decides which specific answer to give once all the hard constraints have been satisfied. The constraints govern what the network is allowed to output. They do not govern what the network thinks the right answer is. That is the part physicists still worry about, and it is what the next section is about.

Why Physicists Are Still Suspicious

The skeptical case against AI in physics comes down to one word: interpretability. A textbook derivation produces an equation that says what it depends on and why. A neural network produces a prediction with no associated explanation. The 381,000 stable crystals from GNoME exist as a list with no narrative. The fusion-control policy in the EPFL tokamak makes ten thousand correct decisions per second with no physical principle anyone can read off from the network weights. If the network is wrong, finding out why is its own research project.

This bites hardest at the edge of the training distribution. A neural network trained on the historical weather record will not have learned what a record-breaking heat wave looks like, because by definition nothing that extreme was in its training set. The 2026 Science Advances result on extreme-weather underprediction is the textbook example. The same problem haunts every domain. A force-field model trained on equilibrium molecular dynamics will not extrapolate to far-from-equilibrium chemistry. A galaxy emulator trained on the standard cosmological model will return whatever the network learned to imitate even if the true cosmology is something else.
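The failure mode does not need a neural network to show up; any flexible fitted model has it. In the sketch below a fifth-degree polynomial stands in for the learned model, and the target function, training range, and test points are arbitrary choices made for illustration. The fit is excellent where it saw data and meaningless where it did not.

```python
import numpy as np

# "Training distribution": the model only ever sees x between 0 and 3.
x_train = np.linspace(0.0, 3.0, 50)
y_train = np.sin(x_train)

# Stand-in for a learned model: a degree-5 polynomial fit.
model = np.poly1d(np.polyfit(x_train, y_train, deg=5))

x_inside = 1.7     # within the training range
x_outside = 9.0    # far outside it

error_inside = abs(model(x_inside) - np.sin(x_inside))
error_outside = abs(model(x_outside) - np.sin(x_outside))
```

`error_inside` is tiny, while `error_outside` is larger than the entire range of the function being modelled. Nothing in the fitted model signals the difference, which is why the cross-check has to come from outside it.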

A related and quieter problem is reproducibility. A 2025 review found that a large fraction of published "AI for X" papers in physics and chemistry could not be reproduced from the released code and data, sometimes because the code was not really released, sometimes because the test data overlapped with the training data, sometimes because tiny details in preprocessing made a large difference and were not documented. The fix is mechanical: better datasets, better benchmarks, mandatory open code. The fix has not happened yet, and several specific influential results have been quietly retracted.

None of this means AI does not work. It means AI works in the same domain it was trained in, and is dangerous to extrapolate outside that domain without a separate check. Physicists who use AI well treat it the way an experimenter treats an instrument: useful, fast, but always cross-checked against another method before publication. The ones who get it wrong forget that step.

The Negative Space

It is worth listing what AI has not done, because the surrounding coverage often blurs the line. AI has not produced a theory of quantum gravity. AI has not discovered a new fundamental particle. AI has not solved the hierarchy problem, the cosmological constant problem, or the measurement problem. AI has not derived dark matter, dark energy, or the value of the fine-structure constant from anything more basic. AI has not even fully solved the much smaller engineering problem of bringing a fusion reactor to net energy gain. The frontier of fundamental physics looks essentially the same as it did before the deep-learning revolution started.

That is not a surprise. Neural networks are powerful interpolators within the distribution they were shown. The deep open questions of physics are exactly the ones for which no relevant training distribution exists. There are no historical examples of quantum-gravity experiments, only conceptual arguments. There is no data set of new particles waiting to be classified. The places where AI helps physics most are the places where the equations were already written down and the data already collected. The places where AI helps least are where neither is true. That is consistent, and it may be permanent.

A Telescope, Not a Theory

The clearest way to think about AI in physics is by analogy with the telescope. When Galileo turned a refracting tube toward Jupiter in 1610 he did not propose a new theory. He extended the reach of the instrument we had. The new theory came later, when humans worked out what the new view implied. AI is the same shape of contribution. It does not replace the equations. It extends the size of the system the equations can solve, the volume of data the equations can be tested against, the rate at which candidates can be generated. The equations themselves still wait for humans.

The deeper question is whether that distinction holds. There is a respectable view, held by some of the field's loudest voices, that any sufficient acceleration of search becomes a new kind of discovery in its own right, and that within a decade or two a neural network will write down a fundamental law its trainer did not know. There is a respectable counter-view that intelligence and understanding are deeply different things, and that a network compressing data will never have an insight. Both sides are doing physics. Both sides will be reading the journals carefully for the next decade. The honest part of the story is that nobody knows which side will be right.

For now, AI has changed which problems physicists can attempt, not which problems they want answered. The wish list at the top of the field has not moved. The toolbox underneath it has. That is exactly what a new instrument should do.

Best discoveries usually start with someone saying, 'that is weird'
