Data Driven PDE Solvers for Power Systems
\( \newcommand{\states}{\mathcal{S}} \newcommand{\actions}{\mathcal{A}} \newcommand{\observations}{\mathcal{O}} \newcommand{\rewards}{\mathcal{R}} \newcommand{\traces}{\mathbf{e}} \newcommand{\transition}{P} \newcommand{\reals}{\mathbb{R}} \newcommand{\naturals}{\mathbb{N}} \newcommand{\complexs}{\mathbb{C}} \newcommand{\field}{\mathbb{F}} \newcommand{\numfield}{\mathbb{F}} \newcommand{\expected}{\mathbb{E}} \newcommand{\var}{\mathbb{V}} \newcommand{\by}{\times} \newcommand{\partialderiv}[2]{\frac{\partial #1}{\partial #2}} \newcommand{\defineq}{\stackrel{{\tiny\mbox{def}}}{=}} \newcommand{\defeq}{\stackrel{{\tiny\mbox{def}}}{=}} \newcommand{\eye}{\Imat} \newcommand{\hadamard}{\odot} \newcommand{\trans}{\top} \newcommand{\inv}{{-1}} \newcommand{\argmax}{\operatorname{argmax}} \newcommand{\Prob}{\mathbb{P}} \newcommand{\avec}{\mathbf{a}} \newcommand{\bvec}{\mathbf{b}} \newcommand{\cvec}{\mathbf{c}} \newcommand{\dvec}{\mathbf{d}} \newcommand{\evec}{\mathbf{e}} \newcommand{\fvec}{\mathbf{f}} \newcommand{\gvec}{\mathbf{g}} \newcommand{\hvec}{\mathbf{h}} \newcommand{\ivec}{\mathbf{i}} \newcommand{\jvec}{\mathbf{j}} \newcommand{\kvec}{\mathbf{k}} \newcommand{\lvec}{\mathbf{l}} \newcommand{\mvec}{\mathbf{m}} \newcommand{\nvec}{\mathbf{n}} \newcommand{\ovec}{\mathbf{o}} \newcommand{\pvec}{\mathbf{p}} \newcommand{\qvec}{\mathbf{q}} \newcommand{\rvec}{\mathbf{r}} \newcommand{\svec}{\mathbf{s}} \newcommand{\tvec}{\mathbf{t}} \newcommand{\uvec}{\mathbf{u}} \newcommand{\vvec}{\mathbf{v}} \newcommand{\wvec}{\mathbf{w}} \newcommand{\xvec}{\mathbf{x}} \newcommand{\yvec}{\mathbf{y}} \newcommand{\zvec}{\mathbf{z}} \newcommand{\Amat}{\mathbf{A}} \newcommand{\Bmat}{\mathbf{B}} \newcommand{\Cmat}{\mathbf{C}} \newcommand{\Dmat}{\mathbf{D}} \newcommand{\Emat}{\mathbf{E}} \newcommand{\Fmat}{\mathbf{F}} \newcommand{\Gmat}{\mathbf{G}} \newcommand{\Hmat}{\mathbf{H}} \newcommand{\Imat}{\mathbf{I}} \newcommand{\Jmat}{\mathbf{J}} \newcommand{\Kmat}{\mathbf{K}} \newcommand{\Lmat}{\mathbf{L}} 
\newcommand{\Mmat}{\mathbf{M}} \newcommand{\Nmat}{\mathbf{N}} \newcommand{\Omat}{\mathbf{O}} \newcommand{\Pmat}{\mathbf{P}} \newcommand{\Qmat}{\mathbf{Q}} \newcommand{\Rmat}{\mathbf{R}} \newcommand{\Smat}{\mathbf{S}} \newcommand{\Tmat}{\mathbf{T}} \newcommand{\Umat}{\mathbf{U}} \newcommand{\Vmat}{\mathbf{V}} \newcommand{\Wmat}{\mathbf{W}} \newcommand{\Xmat}{\mathbf{X}} \newcommand{\Ymat}{\mathbf{Y}} \newcommand{\Zmat}{\mathbf{Z}} \newcommand{\Sigmamat}{\boldsymbol{\Sigma}} \newcommand{\identity}{\Imat} \newcommand{\epsilonvec}{\boldsymbol{\epsilon}} \newcommand{\thetavec}{\boldsymbol{\theta}} \newcommand{\phivec}{\boldsymbol{\phi}} \newcommand{\muvec}{\boldsymbol{\mu}} \newcommand{\sigmavec}{\boldsymbol{\sigma}} \newcommand{\jacobian}{\mathbf{J}} \newcommand{\ind}{\perp!!!!\perp} \newcommand{\bigoh}{\text{O}} \)
citation | estimate PDE dynamics from data | estimate PDE parameters with model | estimate quantity described by PDEs |
---|---|---|---|
(Guo, Li, and Iorio 2016) | Y | N | N |
(Khoo, Lu, and Ying 2021) | N | N | Y |
(Raissi, Perdikaris, and Karniadakis 2019) | Y | Y | N |
(Kovachki et al. 2024) | Y | N | N |
(Stiasny et al. 2022) | Y | Y | N |
(Misyris, Venzke, and Chatzivasileiadis 2020) | Y | Y | N |
(Stiasny, Misyris, and Chatzivasileiadis 2021) | Y | Y | N |
(Stiasny, Chevalier, and Chatzivasileiadis 2021) | Y (Simulation) | N | N |
(Stiasny, Misyris, and Chatzivasileiadis 2023) | Y | N | N |
(Pagnier and Chertkov 2021) | Y | Y | N |
(Nellikkath and Chatzivasileiadis 2021) | N | Y | N |
(Nellikkath and Chatzivasileiadis 2022) | N | Y | N |
citation | example pdes/their field | Relevant to power systems | Novel NN approaches | Data Simulated |
---|---|---|---|---|
(Guo, Li, and Iorio 2016) | laminar flow of fluids around geometries | N | N (Conv Nets) | Y |
(Khoo, Lu, and Ying 2021) | NLSE, Effective Conductance | N | N | Y |
(Raissi, Perdikaris, and Karniadakis 2019) | Several | Y | N | Y |
(Kovachki et al. 2024) | Several | Y | Y | Y |
(Stiasny et al. 2022) | The North Sea Wind Power Hub | Y | N | Y |
(Misyris, Venzke, and Chatzivasileiadis 2020) | Single Machine Infinite Bus system | Y | N | Y |
(Stiasny, Misyris, and Chatzivasileiadis 2021) | 4-bus 2-generator power system | Y | N | Y |
(Stiasny, Chevalier, and Chatzivasileiadis 2021) | Single Machine Infinite Bus system | Y | Y | N (No data) |
(Stiasny, Misyris, and Chatzivasileiadis 2023) | Kundur Two-area system | Y | N | Y |
(Pagnier and Chertkov 2021) | IEEE Bus Systems (14, 118), PanTaGruEl | Y | N | Y |
(Nellikkath and Chatzivasileiadis 2021) | (Babaeinejadsarookolaee et al. 2021) | Y | N | Y |
(Nellikkath and Chatzivasileiadis 2022) | | Y | N | Y |
citation | generative | discriminative | Type of estimate | Category | Discretization/Sampling Method |
---|---|---|---|---|---|
(Guo, Li, and Iorio 2016) | Y | N | Field | Grid | |
(Khoo, Lu, and Ying 2021) | N | Y | Scalar | N/A | |
(Raissi, Perdikaris, and Karniadakis 2019) | Y | Y | Function/Field | Latin Hypercube Sampling, fixed grid | |
(Kovachki et al. 2024) | Y | N | Function/Field | Avoids discretization through NN architecture | |
(Stiasny et al. 2022) | N | Y | Function/Field | Discretized with a fixed time-step \(\Delta t\) | |
(Misyris, Venzke, and Chatzivasileiadis 2020) | Y | Y | System ID, Sim | NA | |
(Stiasny, Misyris, and Chatzivasileiadis 2021) | Y | Y | System ID | NA | |
(Stiasny, Chevalier, and Chatzivasileiadis 2021) | Y | N | Simulation | Variable time discretization | |
(Stiasny, Misyris, and Chatzivasileiadis 2023) | Y | N | Simulation | Time and input space discretization | |
(Pagnier and Chertkov 2021) | Y | N | Simulation | | |
(Nellikkath and Chatzivasileiadis 2021) | N | Y | Parameter Est. | Latin Hypercube Sampling | |
(Nellikkath and Chatzivasileiadis 2022) | N | Y | Parameter Est. | Latin Hypercube Sampling | |
citation | problem domain |
---|---|
(Guo, Li, and Iorio 2016) | Fluid dynamics around a geometry |
(Khoo, Lu, and Ying 2021) | property estimation of PDEs |
(Raissi, Perdikaris, and Karniadakis 2019) | PINNs introduction |
(Kovachki et al. 2024) | Neural Operators introduction |
(Stiasny et al. 2022) | Security constrained stability margins |
(Misyris, Venzke, and Chatzivasileiadis 2020) | System identification (parameter estimation) in Power Flow, some EMT simulation |
(Stiasny, Misyris, and Chatzivasileiadis 2021) | system identification (parameter estimation) in Power Flow |
(Stiasny, Chevalier, and Chatzivasileiadis 2021) | EMT simulation |
(Stiasny, Misyris, and Chatzivasileiadis 2023) | (Transient) Stability Analysis |
(Pagnier and Chertkov 2021) | Parameter estimation |
(Nellikkath and Chatzivasileiadis 2021) | DC-OPF, parameter estimation (i.e. optimal contingency) |
(Nellikkath and Chatzivasileiadis 2022) | AC-OPF |
Questions/Thoughts
- Are there methods for learning from data driven by controllers? How well does this work on real-world data? Can we get better explanations of how the controller performs? Can we learn the implicit PDE of a controller we can’t analyze? How do we validate we have learned it correctly?
- Can Data Driven PDEs be used for explainability in RL? What would that look like? Can we ask counterfactuals or specific stability-related questions using the learned model?
- Could we learn a mapping from rewards (a function over states) to policies (a function over states and actions) with samples from an off-policy dataset? What does the learning objective look like when we include stochasticity into the input and outputs (or must it only be deterministic?)
- Could we use a neural operator that learns over the PDE induced by the current agent’s policy? This could then be used in the explainability space to understand what we expect the agent to do over time, and to simulate the PDE into the future to give confidence to the human operator.
- Could we use this as input to the agent? What policy would the PDE depend on, maybe a set of experts? How much data would this take to train, and how much data would we need for the subsequent RL learning?
- How can Neural Operators be used to understand or predict how well an agent will do in the future? Can this be done? How reliable would it be? Would it give a human operator some confidence? Maybe as a comparison with experts.
- Can the parameters of an expert be a parameter that we can learn over in the set \(\mathcal{A}\)? Would we need to smoothly transition between experts? Would we need a new set of parameters per expert?
Problem Settings
In these papers, there are several problem settings that each have their own flavour of solution. Not all of these are relevant to our work, but it is good to have an idea of what all of them are so we can know what kinds of papers are worth incorporating into this literature review.
Estimate PDE dynamics from data
This is the most difficult problem setting. The goal is to be able to estimate the dynamics of a PDE from a set of conditions or measurements.
One way this plays out is through knowing something about the physical characteristics of the underlying PDE. For instance, one can estimate the steady state flow of fluid around a geometry given a representation of that geometry (Guo, Li, and Iorio 2016). The output of the network should be a generative prediction of what the fluid flow output of a simulator would look like.
Another way this can be done is by taking a set of measurements of a system and approximating its dynamics in a discretization-agnostic manner, as in (Kovachki et al. 2024). The interesting part of neural operators is that they might not need knowledge of the underlying physical mechanics the way (Raissi, Perdikaris, and Karniadakis 2019) or (Guo, Li, and Iorio 2016) do. But a problem may arise in needing more data to estimate the actual PDE if it is complicated (my guess).
Estimate PDE model parameters from data
The unknown parameters of a PDE are difficult to ascertain, but by letting gradients be informed by the underlying PDEs we can use data to estimate the unknown parameters, or use this information to better estimate other quantities/forecasts (Raissi, Perdikaris, and Karniadakis 2019). This is also known as system identification.
Estimate emergent quantity derived from PDE dynamics
This is not really estimating PDEs, but is related in some ways. In (Khoo, Lu, and Ying 2021), they estimate characteristic qualities of a system without needing to model the entire set of dynamics or learn the unknown variables of the underlying PDEs. This could be interesting for a number of applications in power systems related to stability analysis.
Power Systems domain
What we should do is:
- Organize the larger set of PDE solution methods into subsets based on the above table and general ideas from the literature (i.e. the
- Create
Power Flow analysis for steady state grid networks
Optimal Power Flow
Stability Analysis
EMT simulation
Non ML based Power Systems methods
Model Order Reduction
Model Analysis Techniques
Power Systems Specific Applications
(Stiasny et al. 2022): Closing the Loop: A Framework for Trustworthy Machine Learning in Power Systems
Proposes a method that uses (Raissi, Perdikaris, and Karniadakis 2019) as its core regularization method. It follows very closely from their previous works (Stiasny, Misyris, and Chatzivasileiadis 2021; Misyris, Venzke, and Chatzivasileiadis 2020; Stiasny, Chevalier, and Chatzivasileiadis 2021). This requires data to be generated, and they don’t seem to use realistic data, but the North Sea Wind Power Hub example is very interesting. The paper gives a nice overview of all the problems related to using machine learning in power systems and how the community should be transparent in their research.
This paper’s main value is its trove of cited papers applying PINNs to power systems, along with other applications of NNs in power systems (including work on generating formal/informal guarantees on a NN’s outputs).
(Bertozzi et al. 2024): Application of data-driven methods in power systems analysis and control
This is an ok review paper for a broad look at power systems research using data driven techniques. It does have some nice papers cited that should be looked through.
Generally, they don’t provide their own contribution apart from the review (the organization is surface level).
(Misyris, Venzke, and Chatzivasileiadis 2020): Physics-Informed Neural Networks for Power Systems
This is a direct application of physics informed neural networks (Raissi, Perdikaris, and Karniadakis 2019) onto the Single Machine Infinite Bus system. The swing equation for this toy example is
\[ m_1 \ddot{\delta} + d_1 \dot{\delta} + B_{12} V_1 V_2 \text{sin}(\delta) - P_1 = 0 \]
with the dynamics incorporated through the pair of equations
\begin{align} u_\theta(t, P_1) &= \delta(t, P_1) \\ f_\delta(t, P_1) &= m_1 \ddot{\delta} + d_1 \dot{\delta} + B_{12}V_1V_2 \text{sin}(\delta) - P_1 \end{align}
where \(u_\theta\) is parameterized by weights \(\thetavec \in \reals^n\), and the derivatives \(\ddot{\delta}\) and \(\dot{\delta}\) are taken with respect to time through an autodiff package.
I highly recommend looking at https://github.com/gmisy/Physics-Informed-Neural-Networks-for-Power-Systems for more details on how this is implemented. Specifically in the `net_f` functions.
They do experiments in inference (when \(m_1\) and \(d_1\) are known), and in identification where \(m_1\) and \(d_1\) are unknown and need to be identified.
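To make the `net_f` idea concrete, here is a minimal numpy sketch of the residual \(f_\delta\). The one-hidden-layer tanh network, its hand-written time derivatives, and every constant are my own illustrative stand-ins; the linked repo uses a deeper network and an autodiff package to compute the same derivatives.

```python
import numpy as np

# A tiny one-hidden-layer tanh network stands in for u_theta; its time
# derivatives are written out by hand to show what autodiff computes in
# `net_f`. All parameter values below are made up for illustration.
rng = np.random.default_rng(0)
H = 8                                   # hidden units (arbitrary)
w1, b1 = rng.normal(size=H), rng.normal(size=H)
w2, b2 = rng.normal(size=H), 0.0

m1, d1, B12, V1, V2, P1 = 0.4, 0.15, 0.2, 1.0, 1.0, 0.1  # illustrative

def delta_and_derivs(t):
    """delta_hat(t), d(delta)/dt, d2(delta)/dt2 for an array of times t."""
    z = np.outer(t, w1) + b1            # (N, H) pre-activations
    s = np.tanh(z)
    delta = s @ w2 + b2
    ddelta = ((1.0 - s**2) * w1) @ w2                    # first derivative
    d2delta = ((-2.0 * s * (1.0 - s**2)) * w1**2) @ w2   # second derivative
    return delta, ddelta, d2delta

def physics_residual(t):
    """f_delta = m1*delta'' + d1*delta' + B12*V1*V2*sin(delta) - P1."""
    delta, ddelta, d2delta = delta_and_derivs(t)
    return m1 * d2delta + d1 * ddelta + B12 * V1 * V2 * np.sin(delta) - P1

t_col = np.linspace(0.0, 20.0, 100)     # collocation times
res = physics_residual(t_col)
loss_f = np.mean(res**2)                # physics term of the PINN loss
```

Training would minimize `loss_f` together with a data-fit term on measured \(\delta\).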
(Stiasny, Misyris, and Chatzivasileiadis 2021): Physics-Informed Neural Networks for Non-linear System Identification for Power System Dynamics
This paper does exactly what (Misyris, Venzke, and Chatzivasileiadis 2020) does, but for a 4-Bus 2-Generator system, and focuses on system identification (i.e. estimating the parameters of the PDEs). They compare PINNs to Unscented Kalman Filters. The results are mixed, but generally the PINNs are much more flexible to various undesirable conditions (noise, missing data).
(Stiasny, Chevalier, and Chatzivasileiadis 2021): Learning without Data: Physics-informed Neural Networks for Fast Time-Domain Simulation
This paper proposes a combination of Runge-Kutta methods and PINNs to perform time-domain simulation with variable time-steps. The main idea is to use a neural network to estimate each of the stages of a Runge-Kutta step. From input \(\zvec_0 = [\xvec^0, \uvec]\) (where \(\xvec^0\) is the initial condition and \(\uvec\) is the control input to the system), the neural network predicts \(\yvec = [{\hvec^1}^\top, \ldots, {\hvec^s}^\top]^\top\), the individual stages of the Runge-Kutta method, which could otherwise be calculated using the parametrized function \(\mathbf{f}\) describing the update rule of \(\xvec\). Note that this is not a PDE but an ODE, where the free variable for simulation is time once the initial condition \(\xvec^0\) and the external force \(\uvec\) are known. WE ARE NOT DOING SYSTEM IDENTIFICATION.
Using the fact that each Runge-Kutta stage is calculated via
\[ \hvec^k = \mathbf{f}\left ( t_0 + \gamma^k \Delta t, \xvec^0 + \Delta t \sum_{l=1}^{k-1} \hvec^l \right ) \]
we can define an error for each of the stages (each depending on the previous stages): \[ \epsilon^k(\xvec^0, \uvec) = \hvec^k - \mathbf{f}\left ( t_0 + \gamma^k \Delta t, \xvec^0 + \Delta t \sum_{l=1}^{k-1} \hvec^l \right ). \]
We can also use this to train an RK scheme with variable time-steps. Because the ODE is known, the goal is to estimate the simulation of the ODE in the time domain without using data.
They compare this method to full implicit RK schemes (with s set to 4 and 32), explicit RK-45, and Radau (an implicit RK method) on the Single Machine Infinite Bus system.
This method is orders of magnitude faster than the baseline RK methods and provides reasonable accuracy.
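A numpy sketch of the stage residuals \(\epsilon^k\) above for the two-state SMIB ODE \(\xvec = [\delta, \omega]\). The predicted stages `h_pred`, the abscissae `gamma`, and all constants are illustrative placeholders for the network output and the paper’s actual RK coefficients.

```python
import numpy as np

# Stage residuals eps^k for the swing-equation ODE x = [delta, omega];
# h_pred stands in for a neural network's stage predictions. Constants
# and gamma values are made up for illustration.
m1, d1, B12, V1, V2 = 0.4, 0.15, 0.2, 1.0, 1.0

def f(t, x, u):
    """Swing-equation ODE: x = [delta, omega], u = mechanical power P1."""
    delta, omega = x
    return np.array([omega, (u - d1 * omega - B12 * V1 * V2 * np.sin(delta)) / m1])

def stage_residuals(h, x0, u, t0, dt, gamma):
    """eps^k = h^k - f(t0 + gamma^k dt, x0 + dt * sum_{l<k} h^l)."""
    eps = np.zeros_like(h)
    for k in range(h.shape[0]):
        xk = x0 + dt * h[:k].sum(axis=0)    # running sum over earlier stages
        eps[k] = h[k] - f(t0 + gamma[k] * dt, xk, u)
    return eps

s = 4                                  # number of stages
x0 = np.array([0.1, 0.0])              # initial condition [delta0, omega0]
u, t0, dt = 0.1, 0.0, 0.05
gamma = np.linspace(0.0, 1.0, s)       # illustrative stage abscissae
h_pred = np.zeros((s, 2))              # stand-in for network stage predictions
eps = stage_residuals(h_pred, x0, u, t0, dt, gamma)
loss = np.mean(eps**2)                 # training signal; no simulated data needed
```

Minimizing `loss` over the network weights is what lets the method train without trajectory data.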
(Stiasny, Misyris, and Chatzivasileiadis 2023): Transient Stability Analysis with Physics-Informed Neural Networks
This paper looks at the application of PINNs (Raissi, Perdikaris, and Karniadakis 2019) for power system transient stability assessment. “At frequent intervals, operators assess if probable contingencies result in loss of synchronism, frequency instability, or violations of component limits during the transient phase.” Modern grids add significant non-linearities and uncertainties through “converter-connected devices” and renewable sources of energy. This causes the computational complexity to increase drastically, as a full EMT simulation is often required to evaluate the contingencies.
Some previous attempts to reduce the computational complexity:
- SIME: (Zhang et al. 1997)
- Lyapunov functions (linear assumptions): (Gless 1966; El-abiad and Nagappan 1966)
- Lyapunov functions (non-linear assumptions): (Vu and Turitsyn 2016)
The basic approach is to use a PINN to approximate EMT simulation. By simulating trajectories from a set of initial conditions, one can see whether there is a risk of instability at key moments in time. They have three loss terms:
- Supervised learning loss: \(\mathscr{L}_x^i = \frac{1}{N_x} \sum_{j=1}^{N_x} (x^i_j - \hat{x}^i_j)^2.\)
- The state update function \(\mathbf{f}(t, x(t), \uvec)\) evaluated at the data should match the temporal derivative of the NN’s approximation \(\frac{\partial}{\partial t} \hat{x}\) (calculated through autodiff): \[ \mathscr{L}^i_{dt} = \frac{1}{N_x} \sum_{j=1}^{N_x} \left ( f^i(t_j, x_j, \uvec_j) - \frac{\partial}{\partial t} \hat{x}^i_j \right )^2 \]
- Finally, the governing equations can be enforced at our estimate \(\hat{x}\) itself: \[ \mathscr{L}^i_f = \frac{1}{N_f} \sum_{j=1}^{N_f} \left ( f^i(t_j, \hat{x}_j, \uvec_j) - \frac{\partial}{\partial t} \hat{x}^i_j \right )^2 \]
The data-points used for \(\mathscr{L}_f\) are called collocation points and can be any point in the input domain.
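Since collocation points can be anywhere in the input domain, Latin Hypercube sampling (used by several of the papers in the tables above) is a common way to spread them evenly without a grid. A small sketch with SciPy’s quasi-Monte Carlo module; the bounds on time and the input \(P_1\) are made up:

```python
import numpy as np
from scipy.stats import qmc

# Draw 128 collocation points over an illustrative 2-D input domain:
# time in [0, 20] s and a control input P1 in [0.05, 0.2] p.u.
sampler = qmc.LatinHypercube(d=2, seed=0)
unit = sampler.random(n=128)                       # points in the unit square
points = qmc.scale(unit, [0.0, 0.05], [20.0, 0.2]) # rescale to the domain
t_col, p_col = points[:, 0], points[:, 1]
```

Each coordinate is stratified, so no region of the domain is left unsampled the way a random uniform draw can leave gaps.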
In their results they show how much faster PINNs can be compared with classical approaches like Runge-Kutta, but they don’t adequately show how well the PINNs can predict unstable points. While they do an in-depth analysis of the accuracy/error of the approach, this isn’t sufficient to get a full picture in my opinion. We need to see the scenarios in which this fails: does it miss unstable points, how well does it generalize, etc.
(Pagnier and Chertkov 2021): Physics-Informed Graphical Neural Network for Parameter & State Estimations in Power Systems
This paper proposes a graph neural network approach to estimate the state and parameters of a power system in partially observable scenarios (i.e. when not all nodes of the power system have PMUs). They do this by modeling the system using a graph neural network where the graph is a reduced form using only observed currents \(\mathbf{I}^{(o)}\) and voltages \(\mathbf{V}^{(o)}\):
\[ \mathbf{I}^{(o)} = \mathbf{Y}^{( r)} \mathbf{V}^{(o)} \]
where the admittance matrix \(\mathbf{Y}^{( r)}\) is a reduced form following the Kron Reduction.
They aim to solve the following \[ \min_{\psi, \mathbf{Y}^{( r)}} L_{\text{Power-GNN}} \]
where \[ L_{\text{Power-GNN}} = \frac{1}{N |\mathcal{V}^{(o)}|} \sum_{n=1}^N || \mathbf{S}_n^{(o)} - \Pi^{-1}_{\mathbf{Y}^{( r)}} (\mathbf{V}_n^{(o)}) - \Sigma_{\psi} (\mathbf{V}_n^{(o)}, \mathbf{S}_n^{(o)}) ||^2 + R(\psi) \]
They use three increasingly complex graphs: the IEEE 14-bus and 118-bus systems and PanTaGruEl.
In their experiments, they find that the Power-GNN performs quite well and is able to learn the admittance matrix to a high degree of accuracy (which in turn lets it estimate the state of the system accurately). The method obviously outperforms the vanilla NN approach, but the comparison isn’t fair. Hard to say if this is actually a good method.
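The Kron reduction behind \(\mathbf{Y}^{( r)}\) is easy to sketch: partition the admittance matrix into observed and hidden buses and eliminate the hidden ones, which carry zero current injection. The 4-bus line admittances below are invented for illustration:

```python
import numpy as np

# Kron reduction: Y^(r) = Y_oo - Y_oh @ inv(Y_hh) @ Y_ho, relating only
# observed currents and voltages. The 4-bus network here is made up.
def kron_reduce(Y, keep):
    """Eliminate all buses not in `keep` (assumed zero current injection)."""
    keep = np.asarray(keep)
    hide = np.setdiff1d(np.arange(Y.shape[0]), keep)
    Yoo = Y[np.ix_(keep, keep)]
    Yoh = Y[np.ix_(keep, hide)]
    Yho = Y[np.ix_(hide, keep)]
    Yhh = Y[np.ix_(hide, hide)]
    return Yoo - Yoh @ np.linalg.solve(Yhh, Yho)

# Build a symmetric bus admittance matrix from illustrative line admittances.
lines = {(0, 1): 2.0, (1, 2): 1.5, (2, 3): 1.0, (0, 3): 0.5}
Y = np.zeros((4, 4))
for (i, j), yij in lines.items():
    Y[i, j] = Y[j, i] = -yij
    Y[i, i] += yij
    Y[j, j] += yij

Yr = kron_reduce(Y, keep=[0, 1, 2])   # bus 3 has no PMU and zero injection
```

For any observed voltages, the currents predicted by `Yr` match the full network with the hidden bus solved out, which is exactly the relation \(\mathbf{I}^{(o)} = \mathbf{Y}^{( r)} \mathbf{V}^{(o)}\) above.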
(Nellikkath and Chatzivasileiadis 2021): Physics-Informed Neural Networks for Minimising Worst-Case violations in DC Optimal Power flow
This paper uses PINNs with Karush-Kuhn-Tucker conditions to predict DC-OPF solutions. They use the simplified DC-OPF problem setting as a way to gain insight on the application of PINNs on AC-OPF.
Their method incorporates KKT-conditions into the loss function for the PINNs. See the paper for these specific conditions.
Next they transform the PINN into a Mixed Integer Linear Program to get worst case guarantees as done in their previous work (Venzke et al. 2020).
They compare to a normal neural network in terms of average constraint violations and Mean Absolute Error. While they claim the PINNs are superior, the results actually show a mixed bag. They also don’t show any hyperparameter studies or confidence intervals to get a sense of how well the algorithms work in general.
IN-PROGRESS (Kim et al. 2019)
TODO (Liao et al. 2022)
TODO (Huang and Wang 2023)
TODO (Zhao et al. 2019)
TODO (Duchesne, Karangelos, and Wehenkel 2020)
TODO (Schweppe and Handschin 1974)
TODO (Wu and Liu 1989)
TODO (Zamzam, Fu, and Sidiropoulos 2019)
TODO (Feng et al. 2025)
TODO (Mestav, Luengo-Rozas, and Tong 2018)
TODO (Pan et al. 2021)
TODO (Fioretto, Mak, and Hentenryck 2020)
TODO (Subedi et al. 2021)
TODO (Cheng et al. 2025)
TODO (Weng et al. 2017)
TODO (Alimi, Ouahada, and Abu-Mahfouz 2020)
TODO (Mestav, Luengo-Rozas, and Tong 2019)
TODO (Venikov et al. 1975)
TODO (Kabalan, Singh, and Niebur 2017)
TODO (Nellikkath and Chatzivasileiadis 2022)
Honestly, I believe it is the same approach as in (Nellikkath and Chatzivasileiadis 2021).
TODO (Stock et al. 2024)
TODO (Singh et al. 2020)
Likely something similar to physics informed work.
TODO (Bolz, Rueß, and Zell 2019)
TODO (Li et al. 2019)
TODO (Liao et al. 2021)
TODO (Zhang, Wang, and Giannakis 2019)
TODO (Donon et al. 2019)
TODO (Singh, Kekatos, and Giannakis 2022)
TODO (Lin, Liu, and Zhu 2022)
TODO (Chen et al. 2020)
TODO (Owerko, Gama, and Ribeiro 2020)
TODO (Venzke et al. 2020)
TODO (Zhang, Chen, and Zhang 2022)
TODO (Zamzam and Baker 2020)
TODO (Pan 2021)
TODO (Lei et al. 2021)
TODO (Cui, Jiang, and Zhang 2023)
TODO (Zhao et al. 2022)
Other resources to look through
Power systems related papers and Background
NEXT (Hatziargyriou et al. 2021)
NEXT (Gomez-Exposito et al. 2011)
(Venzke, Molzahn, and Chatzivasileiadis 2021): Efficient Creation of Datasets for Data-Driven Power System Applications
This paper, and their previous work (Thams et al. 2020), works to generate a balanced dataset of secure and insecure points for AC-OPF datasets. The main issue with current methods is that they have to simulate the trajectories, which is computationally expensive. Because finding a balanced set of secure and insecure points is hard, datasets often do not have enough data for ML methods to approximate the security envelope well. This paper proposes using a convex relaxation of AC-OPF problems to consider N-1 security and uncertainty.
The math/power systems related stuff is currently beyond me so this will be deferred until later (or until we need it).
- TODO Re-read (Venzke, Molzahn, and Chatzivasileiadis 2021)
Kron Reduction
TODO (Molzahn and Hiskens 2019)
TODO (Stott, Jardim, and Alsac 2009)
NEXT (Babaeinejadsarookolaee et al. 2021)
Finite-dimensional operators (i.e. approximate the parametric map through convolutional networks)
(Guo, Li, and Iorio 2016) 2016
This paper focuses on using convolutional neural networks to estimate the steady-state laminar flow of air around arbitrary geometries. The main idea is to use supervised learning to estimate the output of an LBM (lattice Boltzmann) simulator and shorten the inference time of the design cycle. This paper is very specific to aerodynamics and likely can’t be extended to Power Systems Control. There are several issues that cannot be overcome.
In any case, the pattern being set is gather data from a simulation and speed up inference time through an architecture and objective function designed for the problem setting.
(Khoo, Lu, and Ying 2021) 2021
This paper focuses on two PDEs from physics:
The main goal is to estimate a quantity described by these PDEs and their random coefficients. Monte Carlo sampling is typically used for this problem, which gives an inherently noisy estimate of the desired quantities.
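For contrast, the Monte Carlo baseline: estimate a scalar \(\expected[g(a)]\) over random coefficients \(a\) by sampling, with \(O(1/\sqrt{N})\) noise that a learned map \(a \mapsto g(a)\) avoids at inference time. The quantity \(g\) below is a made-up placeholder for one computed by solving the PDE:

```python
import numpy as np

# Monte Carlo estimate of E[g(a)] over random coefficients a. The function
# g is an illustrative stand-in for a quantity obtained by solving a PDE
# at coefficients a; the point is the sampling noise, not g itself.
rng = np.random.default_rng(0)

def g(a):
    """Placeholder scalar quantity per draw of 8 random coefficients."""
    return np.exp(-a**2).mean(axis=-1)

def mc_estimate(n):
    a = rng.normal(size=(n, 8))    # n draws of 8 Gaussian coefficients
    return g(a).mean()

small, large = mc_estimate(100), mc_estimate(100_000)
# both estimate the same quantity; the larger sample has lower variance
```

A trained surrogate returns its estimate in one forward pass, but carries approximation bias instead of sampling noise.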
TODO (Jiang et al. 2020) 2020
TODO (Adler and Öktem 2017) 2017
Side Ideas:
- TODO (Zhu and Zabaras 2018)
- TODO Continuous convolution networks (Ummenhofer et al. 2019)
- TODO (Bhatnagar et al. 2019)
- TODO Theoretical analysis of deep neural networks and parametric pdes (Kutyniok et al. 2022)
Physics informed machine learning
(Raissi, Perdikaris, and Karniadakis 2019) 2019
The main idea of this paper is to use the underlying model, derived from prior knowledge of the problem, as a regularizer during the training phase of a neural network. This is accomplished by allowing the auto-differentiation tools to propagate through the space and time components of the input to the neural network.
The system looks like
\[ u_t + \mathcal{N}[u; \lambda] = 0, x\in \Omega, t\in[0, T] \]
Where \(u(t,x)\) denotes the latent solution, \(\mathcal{N}[u; \lambda]\) is a non-linear operator parameterized by \(\lambda\), \(\Omega\) is the spatial domain, and \([0, T]\) the time interval.
This paper also introduces a nice set of categories for data driven approaches to solving PDEs:
- Data-driven solutions of partial differential equations: inference, filtering, and smoothing. Given fixed model parameters \(\lambda\), what can be said about the unknown hidden state \(u(t,x)\) of the system?
- Data-driven discovery of partial differential equations: which is learning, system identification, or data-driven discovery of PDEs, i.e. What are the parameters \(\lambda\) which best describe the observed data?
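As a toy instance of this second category, the swing equation from (Misyris, Venzke, and Chatzivasileiadis 2020) is linear in \(\lambda = (m_1, d_1)\), so identification from an observed trajectory reduces to ordinary least squares; a PINN plays the same role when \(\mathcal{N}[u;\lambda]\) has no such structure. All constants here are illustrative:

```python
import numpy as np
from scipy.integrate import solve_ivp

# Recover lambda = (m1, d1) of the swing equation from an "observed"
# trajectory: m1*omega' + d1*omega = P1 - B12*V1*V2*sin(delta) is linear
# in (m1, d1). True parameter values below are made up for illustration.
m1_true, d1_true, B12, V1, V2, P1 = 0.4, 0.15, 0.2, 1.0, 1.0, 0.1

def swing(t, x):
    delta, omega = x
    return [omega, (P1 - d1_true * omega - B12 * V1 * V2 * np.sin(delta)) / m1_true]

# Simulate the "measurements" (in practice these would come from PMUs).
t = np.linspace(0.0, 20.0, 400)
sol = solve_ivp(swing, (t[0], t[-1]), [0.0, 0.0], t_eval=t, rtol=1e-9, atol=1e-9)
delta, omega = sol.y
domega = np.gradient(omega, t)      # accelerations via finite differences

# Least-squares fit: [omega', omega] @ [m1, d1] = P1 - B12 V1 V2 sin(delta)
A = np.column_stack([domega, omega])
b = P1 - B12 * V1 * V2 * np.sin(delta)
(m1_hat, d1_hat), *_ = np.linalg.lstsq(A, b, rcond=None)
```

With noisy or partially missing measurements the least-squares step degrades quickly, which is the situation where the PINN-based identification above is argued to help.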
TODO (Karniadakis et al. 2021)
TODO (Willard et al. 2022)
Runge-Kutta Physics Informed Neural Networks
TODO (Wang, Teng, and Perdikaris 2021)
(Ostrometzky et al. 2020)
TODO (Banerjee et al. 2023)
Neural Operators
IN-PROGRESS (Kovachki et al. 2024)
Neural operators are designed to learn maps between two function spaces and to be discretization-invariant. Current data-driven solutions using neural networks are not discretization-invariant and often require new networks/datasets/training for different levels of discretization. Neural operators describe a process to estimate these maps with the following properties:
- acts on any discretization of the input function, i.e. accepts any set of points in the input domain,
- can be evaluated at any point of the output domain
- converges to a continuum operator as the discretization is refined (i.e. as the discretization is refined, the function more closely approximates the true continuous operator).
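The three properties can be illustrated with the kernel-integral view of a neural-operator layer, \((\mathcal{K}u)(x) = \int \kappa(x,y)\,u(y)\,dy\). The Gaussian kernel below is a fixed stand-in for a learned one; the same operator accepts two different discretizations of \(u\) and is queried at arbitrary output points:

```python
import numpy as np

# Kernel-integral sketch of a neural-operator layer on [0, 1]. The kernel
# kappa is an illustrative fixed Gaussian standing in for a learned one.
def kappa(x, y):
    return np.exp(-((x - y) ** 2) / 0.02)

def apply_operator(u, grid, x_query):
    """Riemann-sum approximation of (K u)(x) for each x in x_query."""
    w = np.gradient(grid)                       # quadrature weights
    return np.array([np.sum(kappa(x, grid) * u * w) for x in x_query])

u_fn = lambda s: np.sin(2 * np.pi * s)          # the input function
x_query = np.array([0.25, 0.5, 0.75])           # arbitrary evaluation points

coarse = np.linspace(0.0, 1.0, 64)              # one discretization of u
fine = np.linspace(0.0, 1.0, 1024)              # a much finer one
out_coarse = apply_operator(u_fn(coarse), coarse, x_query)
out_fine = apply_operator(u_fn(fine), fine, x_query)
# the two outputs agree closely: same operator, two discretizations
```

A fixed-size convolutional network, by contrast, would have to be retrained (or at least re-architected) for the 1024-point input.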
While I think this is an interesting idea, I’m worried about the amount of data it might take to estimate a steady state pde. It is also unclear how this would interact w/