Surrogate model of a chaotic Lorenz waterwheel

In this tutorial we will walk through the process of simulating a chaotic waterwheel in Collimator and developing a simplified surrogate model using the SINDy nonlinear system identification algorithm.

The Malkus waterwheel is a mechanical analogue of the Lorenz equations. Actually, the Lorenz system can be derived exactly from the continuous form of the waterwheel model.

The Malkus waterwheel consists of a series of leaky "buckets" attached to a rotating wheel. The buckets are filled at the top of the wheel. This creates a time-varying center of mass and torque on the wheel. With certain parameters, the wheel will rotate back and forth chaotically. Although the system has $N+2$ degrees of freedom (where $N$ is the number of buckets and the extra two degrees of freedom are for the angle and angular velocity of the wheel), it can be closely approximated with a Lorenz-like system in three equations: the angular velocity of the wheel, and its $x$- and $y$-center of mass coordinates.

import numpy as np
import pysindy as ps
import pandas as pd

import matplotlib.pyplot as plt
from IPython.display import HTML
plt.style.use('dark_background')

Model derivation

The key feature of the model is the coupling between the buckets and the wheel dynamics. Since the relative position of the buckets is fixed, each bucket has only one degree of freedom corresponding to its mass $m_i(t)$, and its angular position can be defined as $\theta_n(t) = \theta(t) + 2 \pi n / N$, where $\theta(t)$ is the global orientation of the wheel (with zero at the "bottom" of the wheel). If the water is flowing into the system at a rate of $Q_0$ kg/s, and each bucket drains at a rate $Km_i(t)$ proportional to its mass, where $K$ has units of 1/s, then the mass conservation equation for the $n$-th bucket is

$$ \dot{m}_n = Q(\theta_n) - K m_n, \qquad n = 1, 2, \dots, N. $$

Here $Q(\theta_n)$ expresses the fact that even though the inflow rate is constant, only buckets in particular angular positions can catch the water. Assuming that the system is arranged so that the topmost bucket is always being filled,

$$ Q(\theta_n) = \begin{cases} Q_0, \qquad & | \theta_n(t) - \pi | < \pi / N \\ 0, \qquad & \mathrm{otherwise.} \end{cases} $$

The wheel dynamics are given by Newton's second law in angular form:

$$ \frac{d}{dt} I \omega = -\nu \omega + \sum_n \tau_n, $$

where $\nu$ models the braking effect of wheel friction and $\tau_n$ is the torque from the $n$-th bucket: $\tau_n = - m_n(t) g R \sin \theta_n(t)$.

The moment of inertia of the wheel is time-varying due to the changing masses of the buckets, so

$$ I(t) = I_0 + \sum_n m_n(t) R^2, $$

where $I_0$ is the moment of inertia of the wheel when all buckets are empty.

The full wheel dynamics are

$$ I \dot{\omega} = -\nu \omega - \sum_n m_n(t) g R \sin \theta_n(t) - \sum_n \dot{m}_n(t) \omega. $$

Nondimensionalization

In order to simplify the parameter of the space, we can introduce dimensionless equivalents of $t$ and $m_n$. First, define a length scale based on the wheel radius $R$ (i.e. set $R=1$). Then the unloaded moment of inertia $I_0$ can be used to define a mass scale $M_0 = I_0 / R^2$ so that dimensionless mass is given by $\tilde{m} = m / I_0 R^2$ and the dimensionless moment of inertia is $\tilde{I} = I_0(1 + \sum_n \tilde{m}_n)$. The wheel dynamics are then

$$ \tilde{I} \dot{\omega} = -\frac{\nu}{I_0} \omega - \frac{g}{R} \sum_n \tilde{m}_n \sin \theta_n - \omega \sum_n \dot{\tilde{m}}_n. $$

Note that $g / R$ would be the square of the natural oscillation frequency of a pendulum with arm length $R$. This suggests defining a time scale using the frequency $\omega_0 = \sqrt{g / R}$.

Then $\omega = \tilde{\omega} \omega_0 $ and $d / dt = \omega_0 d / d\tilde{t}$. Substituting into the dimensionless dynamics,

$$ \omega_0^2 \tilde{I} \frac{d \tilde{\omega}}{d \tilde{t}} = -\frac{\nu \omega_0}{I_0} \tilde{\omega} - \omega_0^2 \sum_n \tilde{m}_n \sin \theta_n - \omega_0^2 \tilde{\omega} \sum_n \frac{d \tilde{m}_n}{d\tilde{t}}. $$

Canceling out the frequency factors, defining the dimensionless damping constant $\gamma = \nu / I_0 \omega_0$, and dropping the tilde notation, this simplifies to

$$ I \dot{\omega} = -\gamma \omega - \sum_n m_n \sin \theta_n - \omega \sum_n \dot{m}_n. $$

Using the same nondimensionalization, the bucket mass conservation equations are

$$ \dot{m}_n = Q_0 q(\theta_n) - k m_n $$

where $q_0 = Q_0 R^2 / I_0 \omega_0$ is the dimensionless fill rate, $k = K / \omega_0$ is the dimensionless drain rate, and $q$ is the conditional equation that identifies the topmost bucket (above) with $Q_0$.

This is the form of the equations we will simulate, with dimensionless parameters for the fill rate ($q_0$), drain rate ($k$), and wheel friction ($\gamma$).

‍Simulating in Collimator

The waterwheel model derived above has a convenient structure for simulation. The individual buckets are only weakly coupled via the "global" state of the wheel, so we can use the Submodel and Replicator features in Collimator to easily develop a resuable, vectorized model. We can break the model into three pieces:

Model the mass conservation of an individual bucket as a reusable Submodel
Simultaneously simulate $N$ buckets using a Replicator block
Compute global variables for the wheel (center of mass, angular velocity, etc) and share these with the individual bucket models

1. Individual bucket model

The mass conservation equations for each individual bucket are a simple first-order linear system:

$$ \dot{m}_n = Q_0 q(\theta_n) - k m_n, \qquad n = 1, 2, \dots, N. $$

Although $\theta_n$ appears to be a state variable for the $n$-th bucket, since the relative position of the buckets with respect to one another is fixed, each $\theta_n$ is really determined by the global wheel position $\Theta$:

$$ \theta_n(t) = \Theta(t) + \frac{2 \pi n}{N}, \qquad n = 1, 2, \dots, N. $$

In other words, it makes sense to treat $\theta_n$ as an external input to the mass equation. Here is one way to implement the governing equation for an isolated bucket (in a Submodel called `BucketDynamics`):

The logic in this Submodel is straightforward: the fill rate of the bucket is $q_0$ if it is in the topmost position, otherwise it is zero. In either case water flows out of the bucket at a rate proportional to the weight of the current water.

We have also abstracted the logic of checking for the topmost bucket into the `IsTopBucket` Group:

The variable `fill_threshold` is a Submodel parameter that we set to be $-\cos \pi/N$. The Comparator block will therefore be true for one and only one bucket at any given time, giving a simple trick for checking which bucket is the closest to the top position without having to explicitly compute the positions of the other buckets.

One final note on the bucket submodel: often we can choose to use abstractions such as submodels either to represent the "right-hand-side" of an ODE (output the time-derivative of state variables) or to act as a "plant" model (output the integrated state or measurements). In this case we will need the time derivatives $\dot{m}$ in order to compute the time derivative of the moment of inertia, but in general this is a design decision.

2. Vectorize $N$ buckets

Now that we have defined the governing equations for a single bucket, the Replicator block in Collimator makes it easy to simulate as many copies of this Submodel as we want. The Replicator block consists of a single Submodel and an `Instances` parameter ($N$, in this case):

When you are planning to simulate $N$ Submodels using a Replicator block, one important feature is that the inputs to the Replicator should be arrays with leading dimension $N$. That is, the Replicator expects the leading dimensions of its inputs to be "batched" in a similar manner to machine learning libraries. The output will have a similar shape.

Conceptually, this collection of blocks is simple: the offset values `ThetaOffset_0` determine the local angular position $\theta_n$ as a function of the global position $\Theta$ as described above.

The Replicator block then computes the time derivative $\dot{m}_n$ of each bucket mass, and the vectorized $\dot{\boldsymbol{m}}$ is integrated in time with the Integrator block.

The more important feature of these blocks is the treatment of signals relating to the Replicator.

Here we handle the signals in three slightly different ways, depending on the nature of the signal:

The `ThetaOffset_0` Constant block is defined with a length-$N$ numpy array, so each bucket instance will receive a scalar value
The incoming value of $\Theta$ is a scalar. This would throw an error if passed directly to the Replicator, so we first need to use a ScalarBroadcast block to transform the signal into a length-$N$ vector (with identical entries).
The output of `VectorizedBuckets` is a length-$N$ vector signal. The Integrator block `Mass_0` handles this as it would any other signal, so we can feed the vector of bucket states back to the Replicator as usual

Again, each bucket instance within the Replicator will only have access to its own state $m_n$ and local position $\theta_n$. As a result, this approach is ideally suited to cases where a set of functionally identical subsystems should be simulated and are only coupled through global interactions.

3. Global wheel state

Finally, we are left with computing global attributes of the waterwheel given the vector of bucket masses. Since each of these calculations is relatively involved, we will continue to abstract each into its own Group. You may want to check to see how each block diagram corresponds to its equation.

We have to compute the following quantities (in dimensionless numbers as derived above):

Center of Mass

$$ \begin{aligned} x_{\mathrm{cm}} &= \frac{\sum_n m_n \sin \theta_n}{\sum_n m_n} \\ y_{\mathrm{cm}} &= \frac{\sum_n -m_n \cos \theta_n}{\sum_n m_n} \end{aligned} $$

Moment of Inertia

$$ I(t) = I_0 + \sum_n m_n(t) $$

Net Torque

$$ \tau = -\gamma \omega - \sum_n m_n \sin \theta_n $$

Angular Acceleration

$$ \dot{\omega} = \frac{1}{I(t)} \left(\tau - \omega \sum_n \dot{m}_n \right) $$

The full model is organized like this:

Note that the calculations of net torque and angular acceleration happen in the blue-shaded block along with the center-of-mass and moment-of-inertia computation. The angular velocity and position are then extracted by integrating the angular acceleration twice; this is the scalar value passed to the vectorized bucket components.

Running the simulation

We will simulate the waterwheel with $N=20$ buckets and use values of $\beta=0.1$ for the drain rate, $q0=0.5$ for the fill rate, and $\gamma = 2$ for the wheel damping. Analytic treatment of the waterwheel model shows that in the limit of infinite buckets the state can be determined by the center of mass location $(x_\mathrm{cm}, y_\mathrm{cm})$ and angular velocity $\omega$. Here's the time series of these variables for these parameters:

As seen in the time series of angular velocity `Omega`, the wheel appears to randomly change motions from clockwise to counter-clockwise. To reproduce the famous "butterfly attractor" we can plot these three variables against each other:

# Load the simulation results from CSV using pandas
df = pd.read_csv("Waterwheel_2023-06-06-17-28-27.csv", index_col=0)

t = np.array(df.index)
omega = df["Omega.out_0"]
x_cm = df["CenterOfMass.x_cm"]
y_cm = df["CenterOfMass.y_cm"]
z = np.array( (omega, x_cm, y_cm) )

ax = plt.figure().add_subplot(projection='3d')
ax.plot(z[0], z[1], -z[2], lw=0.1, c='k')
ax.view_init(10, 60, 0)

See simulation results here.

fig, axs = plt.subplots(3, 1, figsize=(10, 4), sharex='col')

axs[0].plot(t, z[0], lw=1.0)
axs[0].set_ylabel(r"$\omega$")
axs[0].grid()

axs[1].plot(t, z[1], lw=1.0)
axs[1].set_ylabel(r"$x_\mathrm{cm}$")
axs[1].grid()

axs[2].plot(t, z[2], lw=1.0)
axs[2].set_ylabel(r"$y_\mathrm{cm}$")
axs[2].grid()

axs[-1].set_xlabel(r"$t$")
plt.show()

SINDy model

So far, our waterwheel model has 22 degrees of freedom: the mass of each bucket, plus the angular position and velocity of the wheel.

However, the similarity to the Lorenz system suggests that we might be able to approximate the motion of the wheel with a simpler nonlinear dynamics model.

We can use the SINDy nonlinear system identification algorithm implemented in PySINDy to learn such a "surrogate model" from the data directly.

Given a time series of observed data, the SINDy algorithm attempts to approximate the dynamics using a model of the form

$$ \dot{\boldsymbol{x}} = \Theta(\boldsymbol{x}) \boldsymbol{\xi}, $$

where $\Theta(\boldsymbol{x})$ is a library of "candidate functions" (typically low-order polynomials) and $\boldsymbol{\xi}$ is a coefficient vector determined by regression against the observed time series.

By using sparse regression algorithms that encourage a limited number of nonzero entries in the coefficient vector, the algorithm effectively selects a small number of functions that best approximate the observed dynamics.

There are two ways to use SINDy in Collimator: in external analysis, or in Experiment mode.

For the purposes of this tutorial, we will work with the data outside of Collimator.

However, to leverage more advanced features of PySINDy such as ensemble-based regression and parametric modeling, it is better to use the scalability provided by the Experiment canvas.

Preprocessing

Since we are only using the basic features of PySINDy (finite difference differentiation, polynomial function library, one time series), we can nearly just follow the tutorial given in the package documentation.

However, we do need to take one pre-processing step: because Collimator uses adaptive time-stepping, the numerical differentiation routines in PySINDy can sometimes have trouble when the steps are very close together.

It's easiest to interpolate the data to a uniform sampling rate before doing system identification

t_interp = np.arange(0, t[-1], 0.01)
z_interp = np.array([np.interp(t_interp, t, z_i) for z_i in z])

Now we can proceed to apply the standard SINDy algorithm:

# Fit the SINDy model
opt = ps.STLSQ(threshold=0.05, alpha=0.5, verbose=True)
model = ps.SINDy(optimizer=opt)
model.fit(z_interp.T, t=t_interp)
model.print()

The identified model has three degrees of freedom and is similar, but not identical, to the original Lorenz system. In fact, the analytic derivation shows that it should be identical after a change of variables. To see how it performs compared to the full-resolution system, we can also simulate the SINDy model forward in time from the same initial condition:

# Simulate the SINDy model
z_sim = model.simulate(z[:, 0], t_interp)

ax = plt.figure().add_subplot(projection='3d')
ax.plot(z[0], z[1], -z[2], lw=0.1, c='k')
ax.plot(z_sim[:, 0], z_sim[:, 1], -z_sim[:, 2], lw=0.1, c='r')
ax.view_init(10, 60, 0)

With only three degrees of freedom, the low-dimensional surrogate model closely matches the behavior of the 22-dimensional system, at least statistically. Since the dynamics are chaotic, we can't expect the time series to match in detail.

Adding the surrogate model to Collimator

Finally, let's add the SINDy model back into Collimator so that we can compare the performance. The model learned by SINDy is

$$ \begin{aligned} \dot{x} &= -(y + 0.33x) \\ \dot{y} &= -(0.1 y + xz) \\ \dot{z} &= 0.08 - 0.1z + xy \end{aligned} $$

In the future we hope to support automatic model generation from symbolic equations, but for now we can do it the old-fashioned way:

The full 22-dimensional waterwheel model runs for 5000 time units on Collimator in approximately 25 seconds, while the SINDy approximation simulates the same time horizon in about 1 second (and can be exported to C code). This kind of speedup is what makes surrogate models attractive in applications like model-based control, where an order of magnitude speedup can make the difference between a viable online controller and untenable latency.

Next steps

If you want to build on this demo and experiment more with Collimator and system identification, here are a few things you could try:

Ensemble modeling with Experiments

PySINDy supports learning models from multiple trajectories, and Collimator supports running batches of models with varying parameters. If you compare the SINDy model to the Waterwheel implementation, you might notice that several of the coefficients are almost exactly 1.0, and others are near the value of the drain rate, $k = 0.1$. This suggests that the parametric dependence in the SINDy model is probably relatively simple. To explore this, try:

Simulating the waterwheel with varying drain rate, fill rate, and friction parameters using Collimator's Experiments mode
Use the ensemble feature of PySINDy to identify a parametric version of the low-dimensional model
See if the surrogate model can predict critical features of the nonlinear dynamics such as bifurcations and stability of fixed points

Controlling the waterwheel

One of the key use cases for surrogate models is as proxies for controller design. For example, suppose we allow the inflow rate to be a time-varying function that could be set by a controller, i.e. $q(t) = q_0 u$, where $u$ might be chosed with a closed-loop control law. Designing a feedback controller for this system would be relatively challenging, considering some of the more complicated aspects of the model like the `IsTopBucket` control flow logic. On the other hand, PySINDy can easily identify simple surrogate models of systems with external inputs, which could then be used for control design. To explore this, try:

Modifying the Waterwheel model to allow for a time-varying inflow rate or wheel friction, for instance modeling a mechanical brake
Simulate the actuated system over a long trajectory using high-frequency or (low-pass filtered) random inputs
Use the SINDYc algorithm to identify a model of the actuated system (see the PySINDy docs for an example)
Design a feedback controller based on the surrogate model and test it on the full Waterwheel model. Control objectives might include stabilizing one of the fixed points or an unstable limit cycle

Try it in Collimator