Hybrid models are almost certainly the future, i.e., developing structural equations and filling in data or modeling gaps with relatively small NNs for approximation.
Hybrid models have the conceptual edge, but it's not yet obvious that they'll become the dominant AI forecasting paradigm.
The hybridization acts as a strong regularizer. This is a good thing, but it's not yet obvious that it's a necessary thing for short- to medium-term forecasts. There seems to be enough extant data that purely learned models figure out the dynamics relatively easily.
Hybrid models are more obviously appropriate if we think about extending forecasts to poorly constrained environments, like non-modern climates or exoweather. You can run an atmospheric model like WRF with parameters set for Mars (no, not for colonization, but for understanding dust storms and the like), whereas we definitely don't have enough data to train a "Mars weather predictor."
The difficulty in training NeuralGCM is that one has to backpropagate through many dynamics steps (essentially all the time between data snapshots) to train the NN parameterizations. That's very memory-intensive, because the intermediate states have to be kept around for the backward pass, and for now NeuralGCM-like models run at coarser resolutions than fully learned peers.
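
To make the memory point concrete, here is a toy sketch of backpropagating through an unrolled hybrid rollout. It is not NeuralGCM's code: the state is a single scalar, the "physics" is an assumed linear damping term, and one tanh unit with a single weight `theta` stands in for the NN parameterization. All names (`physics_tendency`, `learned_tendency`, `rollout_loss_and_grad`) and constants are hypothetical, chosen only for illustration.

```python
import math

DT = 0.1       # time step (illustrative value)
K_PHYS = 0.5   # assumed damping rate for the "known physics"

def physics_tendency(x):
    # Structural part of the model: dx/dt = -K_PHYS * x
    return -K_PHYS * x

def learned_tendency(x, theta):
    # Stand-in for a small NN: a single tanh unit with one weight
    return theta * math.tanh(x)

def step(x, theta):
    # One hybrid dynamics step: physics plus learned correction
    return x + DT * (physics_tendency(x) + learned_tendency(x, theta))

def rollout_loss_and_grad(x0, theta, target, n_steps):
    # Forward pass: every intermediate state is stored, because the
    # backward pass needs it to evaluate the local derivatives.
    states = [x0]
    for _ in range(n_steps):
        states.append(step(states[-1], theta))
    loss = 0.5 * (states[-1] - target) ** 2

    # Backward pass: walk the stored states in reverse order.
    grad_x = states[-1] - target  # dL/dx at the final step
    grad_theta = 0.0
    for x in reversed(states[:-1]):
        t = math.tanh(x)
        grad_theta += grad_x * DT * t  # d(step)/d(theta) = DT * tanh(x)
        grad_x *= 1.0 + DT * (-K_PHYS + theta * (1.0 - t * t))  # d(step)/dx
    return loss, grad_theta, len(states)

loss, grad_theta, n_stored = rollout_loss_and_grad(1.0, 0.2, 0.3, n_steps=50)
print(loss, grad_theta, n_stored)
```

The number of stored states grows linearly with the number of steps between data snapshots. Here each state is one float; in an atmospheric model each one is a full set of 3D fields, which is where the memory cost comes from.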