This section illustrates how to use the MLP generator to perform monotone fitting. The MLP generator can quickly achieve a close approximation to the fitting curve obtained from the optimization toolbox. In particular, the MLP generator saves time by avoiding repeated runs of the optimization toolbox over a continuum of $\lambda$ values: it only needs to be trained once to obtain the function $G(\lambda)$, which then returns the solution at any $\lambda = \lambda_0$ by simply evaluating $G(\lambda_0)$.

using MonotoneSplines
__init_pytorch__() # initialize support for the PyTorch backend
using Plots

We want to train an MLP generator $G(λ)$ to approximate the solution of the monotone spline problem:

\[\def\bfy{\mathbf{y}} \def\bB{\mathbf{B}} \def\bOmega{\boldsymbol{\Omega}} \def\subto{\mathrm{s.t.}} \begin{aligned} \min_{\gamma} & (\bfy - \bB\gamma)^T(\bfy - \bB\gamma) + \lambda\gamma^T\bOmega\gamma\\ \subto\, & \alpha\gamma_1 \le \cdots \le \alpha\gamma_J \end{aligned}\]
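
For intuition: without the monotonicity constraint, this is a generalized ridge problem with the closed-form solution $\hat\gamma = (\mathbf{B}^T\mathbf{B} + \lambda\boldsymbol{\Omega})^{-1}\mathbf{B}^T\mathbf{y}$. Below is a minimal sketch of that reference solution (not part of the package API; $\mathbf{B}$ and $\boldsymbol{\Omega}$ would be the B-spline basis and roughness penalty matrices).

using LinearAlgebra
# Unconstrained smoothing-spline coefficients solve (BᵀB + λΩ)γ = Bᵀy.
# The monotone problem additionally enforces αγ₁ ≤ ⋯ ≤ αγ_J, which the
# optimization toolbox handles and the MLP generator learns to approximate.
unconstrained_gamma(B, Ω, y, λ) = Symmetric(B'B + λ * Ω) \ (B'y)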

First of all, generate data $y = \exp(x) + ϵ$,

n = 20
σ = 0.1
x, y, x0, y0 = MonotoneSplines.gen_data(n, σ, exp, seed = 1234);
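
For reference, here is a rough sketch of what such a simulation does. This is illustrative only; the actual MonotoneSplines.gen_data may differ in details such as the support of $x$ or the grid size for the true curve.

using Random
# Hypothetical re-implementation for illustration only.
function gen_data_sketch(n, σ, f; seed = 1234)
    rng = MersenneTwister(seed)
    x = sort(rand(rng, n))            # design points (assumed uniform on [0, 1])
    y = f.(x) .+ σ .* randn(rng, n)   # noisy observations y = f(x) + ε
    x0 = collect(range(0, 1, length = 500))
    return x, y, x0, f.(x0)           # data plus the true curve on a fine grid
end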

Single $λ$

Here we train an MLP network $G(\lambda = λ_0)$ to approximate the solution.

λ = 1e-5;

By default, we use the Flux.jl deep learning framework,

@time Ghat, loss = mono_ss_mlp(x, y, λl = λ, λu = λ, device = :cpu, disable_progressbar = true);
┌ Warning: Layer with Float32 parameters got Float64 input.
│   The input will be converted, but any earlier layers may be very slow.
│   layer = Dense(28 => 100, gelu)  # 2_900 parameters
│   summary(x) = "28-element Vector{Float64}"
└ @ Flux ~/.julia/packages/Flux/n3cOc/src/layers/stateless.jl:60
  2.241329 seconds (7.45 M allocations: 2.576 GiB, 8.06% gc time)

We also support the well-known PyTorch backend with the help of PyCall.jl,

@time Ghat2, loss2 = mono_ss_mlp(x, y, λl = λ, λu = λ, device = :cpu, backend = "pytorch", disable_progressbar = true);
  0.459125 seconds (38.46 k allocations: 2.360 MiB, 3.52% compilation time)
Note

The progress bar is quite useful in practice, but it cannot display properly in the Documenter environment, so it is disabled here via disable_progressbar = true.

Plot the log training losses,

plot(log.(loss), label = "Flux")
plot!(log.(loss2), label = "Pytorch")

The fitted values can be obtained by evaluating the generator at $λ$,

yhat = Ghat(y, λ);
yhat2 = Ghat2(y, λ);
┌ Warning: Layer with Float32 parameters got Float64 input.
│   The input will be converted, but any earlier layers may be very slow.
│   layer = Dense(28 => 100, gelu)  # 2_900 parameters
│   summary(x) = "28-element Vector{Float64}"
└ @ Flux ~/.julia/packages/Flux/n3cOc/src/layers/stateless.jl:60

Compare them with the optimization solution,

yhat0 = mono_ss(x, y, λ, prop_nknots = 0.2).fitted;
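
As a quick sanity check (not part of the original demo), we can quantify the agreement and verify that the MLP fit is monotone,

using LinearAlgebra
# relative L2 distance between the MLP fit and the optimization solution
rel_err = norm(yhat - yhat0) / norm(yhat0)
# monotonicity up to a small numerical tolerance (x is assumed sorted, as plotted)
is_mono = all(diff(yhat) .>= -1e-6)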

Plot the fitted curves,

scatter(x, y, label = "")
plot!(x0, y0, label = "truth", legend = :topleft, ls = :dot)
plot!(x, yhat, label = "MLP generator (Flux)", ls = :dash, lw = 2)
plot!(x, yhat0, label = "OPT solution")
plot!(x, yhat2, label = "MLP generator (Pytorch)", ls = :dash, lw = 2)

The fitted curves obtained from the optimization solution and the MLP generators overlap quite well.
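
This is where the generator pays off: evaluating the trained $G$ at a $\lambda$ is a single forward pass through the network, whereas the optimization toolbox must re-solve the problem. An illustrative comparison,

@time Ghat(y, λ);                           # forward pass through the trained MLP
@time mono_ss(x, y, λ, prop_nknots = 0.2);  # re-solving the optimization problem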

Continuous $λ$

Here we train a generator $G(\lambda), \lambda\in [\lambda_l, \lambda_u]$,

λl = 1e-2
λu = 1e-1
@time Ghat, loss = mono_ss_mlp(x, y, λl = λl, λu = λu, prop_nknots = 0.2, device = :cpu);
┌ Warning: Layer with Float32 parameters got Float64 input.
│   The input will be converted, but any earlier layers may be very slow.
│   layer = Dense(28 => 100, gelu)  # 2_900 parameters
│   summary(x) = "28-element Vector{Float64}"
└ @ Flux ~/.julia/packages/Flux/n3cOc/src/layers/stateless.jl:60
  2.240657 seconds (7.46 M allocations: 2.577 GiB, 8.26% gc time, 0.41% compilation time)

Plot the training loss against the iterations.

plot(loss)

Evaluate the generator at $\lambda_l$, $\lambda_u$, and their midpoint $\lambda_m$, and compare with the corresponding optimization solutions,

λm = (λl + λu) / 2
yhat_l = Ghat(y, λl)
yhat_u = Ghat(y, λu)
yhat_m = Ghat(y, λm)
yhat0_l = mono_ss(x, y, λl, prop_nknots = 0.2).fitted;
yhat0_u = mono_ss(x, y, λu, prop_nknots = 0.2).fitted;
yhat0_m = mono_ss(x, y, λm, prop_nknots = 0.2).fitted;
┌ Warning: Layer with Float32 parameters got Float64 input.
│   The input will be converted, but any earlier layers may be very slow.
│   layer = Dense(28 => 100, gelu)  # 2_900 parameters
│   summary(x) = "28-element Vector{Float64}"
└ @ Flux ~/.julia/packages/Flux/n3cOc/src/layers/stateless.jl:60

Plot the fitted curves,

scatter(x, y, label = "")
plot!(x0, y0, label = "truth", legend = :topleft, ls = :dot)
plot!(x, yhat0_l, label = "OPT solution (λ = $λl)")
plot!(x, yhat_l, label = "MLP generator (λ = $λl)", ls = :dash, lw = 2)
plot!(x, yhat0_m, label = "OPT solution (λ = $λm)")
plot!(x, yhat_m, label = "MLP generator (λ = $λm)", ls = :dash, lw = 2)
plot!(x, yhat0_u, label = "OPT solution (λ = $λu)")
plot!(x, yhat_u, label = "MLP generator (λ = $λu)", ls = :dash, lw = 2)
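
Since $G(\lambda)$ was trained over the whole interval $[\lambda_l, \lambda_u]$, it can also be evaluated on an arbitrarily fine grid of $\lambda$ values without re-solving the optimization each time. An illustrative usage (not part of the original demo),

λs = range(λl, λu, length = 50)
fits = [Ghat(y, λ) for λ in λs]  # 50 fitted curves from a single trained generator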