# SIREN neural field model in NF2

NF2 represents the coronal magnetic field as a neural field: a network that maps a spatial coordinate directly to the magnetic field at that point,

$$N_{θ} : (x, y, z) \mapsto B .$$

The network is the field representation. This is different from a classical grid method, where the unknowns are the field values stored at mesh points.

Because $N_{θ}$ is differentiable, NF2 gets the spatial derivatives needed for $\nabla \times B$ and $\nabla \cdot B$ by automatic differentiation. It does not need finite-difference stencils for those derivatives.

## Why SIREN

The backbone is a SIREN: a sinusoidal representation network. It is an MLP with sine activations,

$$\sin(ω_{0} x),$$

instead of ReLU-style activations.

This matters because NF2 is not only fitting field values. It is also fitting the derivatives of the field through the force-free and divergence residuals.

For a ReLU network, the output is piecewise linear in the input coordinates. Within each linear region,

$$\frac{\partial B}{\partial x_{j}}$$

is constant, and the second derivative is zero. At the boundaries between regions, the derivative jumps discontinuously. That is a bad match for a coronal magnetic field, where the current density

$$J \propto \nabla \times B$$

should vary smoothly through the volume, not sit as piecewise-constant patches separated by artificial kinks.
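The piecewise-constant derivative can be seen on a toy one-dimensional ReLU network. This is a minimal sketch with hand-picked, hypothetical weights (`W1`, `B1`, `W2` are illustrative and have nothing to do with NF2 parameters); it evaluates the exact slope of the toy network rather than anything from an NF2 model.

```python
# Toy 1D network: y(x) = sum_k W2[k] * relu(W1[k] * x + B1[k]).
# Hypothetical weights, chosen so the hinges sit at x = -1, 0 and 0.5.
W1 = [1.0, 2.0, -1.0]
B1 = [1.0, 0.0, 0.5]
W2 = [0.5, -1.0, 2.0]


def relu_net(x):
    return sum(w2 * max(w1 * x + b1, 0.0) for w1, b1, w2 in zip(W1, B1, W2))


def relu_net_slope(x):
    # Exact dy/dx: only units whose pre-activation is positive contribute.
    return sum(w2 * w1 for w1, b1, w2 in zip(W1, B1, W2) if w1 * x + b1 > 0.0)


# Two points inside the same linear region (0 < x < 0.5) share one slope,
# so the second derivative between them is exactly zero.
slope_a = relu_net_slope(0.1)
slope_b = relu_net_slope(0.4)

# Crossing the hinge at x = 0.5 switches one unit off and the slope jumps.
slope_c = relu_net_slope(0.6)
```

In three dimensions the same thing happens to every entry of $\partial B_{i} / \partial x_{j}$, so a curl assembled from those entries inherits the constant patches and the jumps between them.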

The PINN loss makes this worse. The force-free and divergence terms are built from derivatives:

$$(\nabla \times B) \times B, \qquad \nabla \cdot B .$$

So if the activation gives poor derivatives, the physics loss is being computed from poor physics. A ReLU model can fit boundary values while producing derivative structure that is a numerical artefact of activation hinges rather than a smooth magnetic field. This is the classic failure mode: the function values look fine, but the derivatives are not physically meaningful.

There is also an optimisation issue. Inside one ReLU linear region, changing the coordinate slightly does not change the Jacobian. The residuals then have a blocky, low-information structure in space. For PINNs, where the residual field itself is the training signal, that is exactly the wrong inductive bias.

A sine activation avoids this. It is smooth, and its derivatives remain sinusoidal:

$$\frac{d}{dx} \sin(ω_{0} x) = ω_{0} \cos(ω_{0} x),$$

$$\frac{d^{2}}{dx^{2}} \sin(ω_{0} x) = - ω_{0}^{2} \sin(ω_{0} x) .$$

So the network can represent both the magnetic field and its derivative structure cleanly. This is especially useful for NF2 because the model repeatedly asks for curls, divergences, currents, and sometimes derivatives of fields derived from a vector potential.
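The two derivative identities can be checked with the same automatic differentiation that the physics loss relies on. A minimal sketch, assuming PyTorch is available; the value of `omega_0` is illustrative only.

```python
import torch

omega_0 = 5.0  # illustrative frequency, not a claim about any NF2 setting
x = torch.linspace(-1.0, 1.0, 101, dtype=torch.float64, requires_grad=True)
y = torch.sin(omega_0 * x)

# First derivative; create_graph=True keeps the result differentiable.
(dy,) = torch.autograd.grad(y.sum(), x, create_graph=True)
# Second derivative, obtained by differentiating through the first.
(d2y,) = torch.autograd.grad(dy.sum(), x)

first_ok = torch.allclose(dy, omega_0 * torch.cos(omega_0 * x))
second_ok = torch.allclose(d2y, -omega_0**2 * torch.sin(omega_0 * x))
```

Both checks compare the autograd result with the closed-form expressions, which remain non-trivial at second order. That is the property a curl-of-curl calculation depends on.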

## Why not tanh or sigmoid?

tanh and sigmoid are smoother than ReLU, so they avoid the piecewise-linear derivative problem. They are common in many PINNs. The issue is that they are saturating monotonic functions, which gives them a different inductive bias from sine.

For example,

$$\frac{d}{dx} \tanh x = 1 - \tanh^{2} x,$$

and

$$\frac{d}{dx} σ(x) = σ(x) \, (1 - σ(x)) .$$

Both derivatives become very small when the activation saturates. Their higher derivatives also become small away from the transition region. So a deep tanh or sigmoid network can easily end up with many units saturated over much of the domain: the field being modelled still varies there, but the derivative signal passing through those units is weak.

That is awkward for NF2 because the physics loss is mostly derivative information. If the activation saturates, the optimiser can struggle to adjust the curl and divergence structure across the volume.
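A quick numerical look at the activation slopes makes the saturation concrete. This sketch uses only the standard library and the derivative formulas above; the sample points are arbitrary.

```python
import math


def tanh_grad(x):
    return 1.0 - math.tanh(x) ** 2


def sigmoid_grad(x):
    s = 1.0 / (1.0 + math.exp(-x))
    return s * (1.0 - s)


def sine_grad(x, omega_0=1.0):
    return omega_0 * math.cos(omega_0 * x)


# Slopes at pre-activations of growing magnitude.
for x in (0.0, 2.0, 5.0, 10.0):
    print(f"x={x:5.1f}  tanh'={tanh_grad(x):.2e}  "
          f"sigmoid'={sigmoid_grad(x):.2e}  sine'={sine_grad(x):+.2e}")
```

The tanh and sigmoid slopes decay monotonically towards zero as the pre-activation grows. The sine slope passes through zero at isolated points, but its amplitude stays at $ω_{0}$ no matter how large the pre-activation is.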

There is also a representation issue. tanh and sigmoid are good at making smooth transitions, but they are not naturally oscillatory. A magnetic field over an active region can have alternating polarities, compact current structure, and spatial variation across several scales. A tanh network can represent this in principle, but it often needs many units layered together to build high-frequency structure. This is the usual spectral-bias problem: the network learns low-frequency, slowly varying structure first and may underfit fine magnetic detail.

A sine network starts with oscillatory basis functions built in. The frequency scale is controlled by $ω_{0}$, so the model can represent fine spatial structure without relying entirely on deep stacks of monotonic transitions. Just as importantly, the derivatives remain oscillatory rather than fading to zero over saturated regions.

So the comparison is:

| Activation | Smooth? | Main issue for NF2/PINNs |
| --- | --- | --- |
| ReLU | no | piecewise-linear field, discontinuous derivatives |
| sigmoid | yes | saturates, non-centred, weak derivative signal |
| tanh | yes | saturates, biased toward smooth low-frequency transitions |
| sine / SIREN | yes | derivatives remain structured and oscillatory |

In short: ReLU is too non-smooth, while tanh and sigmoid are smooth but too prone to saturation and low-frequency bias. NF2 needs the derivatives to be physically meaningful across the volume, so a smooth periodic activation is a better fit.

In the local NF2 code (`nf2/train/model.py`), the defaults are:

- 8 layers;
- width 256;
- hidden $ω_{0} = 1$;
- first-layer $ω_{0} = 5$.

The larger first-layer frequency spreads the input coordinates across the sine range. The SIREN initialisation keeps activations well scaled through the network.
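The defaults above can be put together into a small SIREN in PyTorch. This is a sketch under stated assumptions, not the code in `nf2/train/model.py`: the class names `SineLayer` and `SirenSketch` are hypothetical, "8 layers" is read as eight sine layers followed by a linear read-out, and the weight initialisation follows the scheme from the original SIREN paper, which may differ from what NF2 does.

```python
import math

import torch
from torch import nn


class SineLayer(nn.Module):
    """Linear layer followed by sin(omega_0 * .), with SIREN-style init."""

    def __init__(self, in_features, out_features, omega_0=1.0, is_first=False):
        super().__init__()
        self.omega_0 = omega_0
        self.linear = nn.Linear(in_features, out_features)
        # Assumed init (SIREN paper, not read from NF2): first layer
        # U(-1/n, 1/n), later layers U(-sqrt(6/n)/omega_0, +sqrt(6/n)/omega_0).
        if is_first:
            bound = 1.0 / in_features
        else:
            bound = math.sqrt(6.0 / in_features) / omega_0
        with torch.no_grad():
            self.linear.weight.uniform_(-bound, bound)

    def forward(self, x):
        return torch.sin(self.omega_0 * self.linear(x))


class SirenSketch(nn.Module):
    """Hypothetical stand-in for a field network: (x, y, z) -> (Bx, By, Bz)."""

    def __init__(self, in_dim=3, out_dim=3, width=256, n_layers=8,
                 first_omega_0=5.0, hidden_omega_0=1.0):
        super().__init__()
        layers = [SineLayer(in_dim, width, first_omega_0, is_first=True)]
        layers += [SineLayer(width, width, hidden_omega_0)
                   for _ in range(n_layers - 1)]
        self.hidden = nn.Sequential(*layers)
        self.out = nn.Linear(width, out_dim)  # linear read-out, no activation

    def forward(self, coords):
        return self.out(self.hidden(coords))


model = SirenSketch()
coords = torch.rand(16, 3) * 2.0 - 1.0  # sample points in [-1, 1]^3
b = model(coords)                       # shape (16, 3)
```

Keeping the first-layer and hidden-layer frequencies as separate constructor arguments mirrors the way the defaults are listed, and makes it easy to test how sensitive a fit is to each one.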

## Three field parameterisations

| Model | Network output | Magnetic field | Main consequence |
| --- | --- | --- | --- |
| `BModel` | $B$ | $B = N_{θ}$ | needs an explicit divergence loss |
| `VectorPotentialModel` | $A$ | $B = \nabla \times A$ | divergence-free by construction |
| `ScaledVectorPotentialModel` | $A$ with radial envelopes | $B = \nabla \times A$ | useful for spherical/global fields |

The vector-potential version uses the identity:

$$\nabla \cdot (\nabla \times A) \equiv 0 .$$

So solenoidality is exact by construction. The cost is that the model has to take an extra curl, which means more automatic differentiation work.
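The identity, and the cost of the extra derivative pass, can be illustrated with autograd on a hand-written vector potential. This is a sketch under assumptions: `vector_potential` is an arbitrary smooth function standing in for the network output, and the helpers `jacobian` and `curl` are hypothetical names, not NF2 functions.

```python
import torch


def vector_potential(x):
    # Hypothetical smooth A(x, y, z); in NF2 this role is played by the network.
    px, py, pz = x[:, 0], x[:, 1], x[:, 2]
    return torch.stack(
        [torch.sin(py * pz), px**2 * pz, torch.cos(px * py)], dim=-1
    )


def jacobian(field, x):
    # Result[n, i, j] = d field_i / d x_j at point n, one autograd call per i.
    rows = [
        torch.autograd.grad(field[:, i].sum(), x, create_graph=True)[0]
        for i in range(field.shape[-1])
    ]
    return torch.stack(rows, dim=1)


def curl(jac):
    return torch.stack([
        jac[:, 2, 1] - jac[:, 1, 2],
        jac[:, 0, 2] - jac[:, 2, 0],
        jac[:, 1, 0] - jac[:, 0, 1],
    ], dim=-1)


x = torch.rand(32, 3, dtype=torch.float64, requires_grad=True)
b = curl(jacobian(vector_potential(x), x))  # B = curl A: first derivative pass
jac_b = jacobian(b, x)                      # second pass, needed for curl B
div_b = jac_b[:, 0, 0] + jac_b[:, 1, 1] + jac_b[:, 2, 2]
max_div = div_b.abs().max().item()          # zero up to floating-point round-off
```

The divergence vanishes without any loss term enforcing it, while `jac_b` shows where the extra work goes: every quantity derived from $B$ now sits one differentiation deeper in the graph.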

The scaled version adds radial power-law factors so one SIREN can cover a global field whose magnitude changes strongly with radius.

## Current from the Jacobian

The physics losses need the curl and divergence of $B$. Both come from the Jacobian:

$$J_{ij} = \frac{\partial B_{i}}{\partial x_{j}} .$$

From this,

$$
\nabla \times B =
\begin{pmatrix}
\partial_{y} B_{z} - \partial_{z} B_{y} \\
\partial_{z} B_{x} - \partial_{x} B_{z} \\
\partial_{x} B_{y} - \partial_{y} B_{x}
\end{pmatrix},
\qquad
\nabla \cdot B = \partial_{x} B_{x} + \partial_{y} B_{y} + \partial_{z} B_{z} .
$$
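The same two expressions can be written as contractions of the Jacobian $J_{ij}$, which makes explicit that both use only its nine entries. This is a restatement in index notation with summation over repeated indices; $\varepsilon_{ijk}$ is the Levi-Civita symbol, introduced here for convenience.

$$
(\nabla \times B)_{i} = \varepsilon_{ijk} \, \partial_{j} B_{k} = \varepsilon_{ijk} \, J_{kj},
\qquad
\nabla \cdot B = \partial_{i} B_{i} = J_{ii} = \operatorname{tr} J .
$$

The curl uses the six off-diagonal entries and the divergence the three diagonal ones, so a single Jacobian evaluation serves both loss terms.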

NF2 builds the Jacobian with `torch.autograd.grad(..., create_graph=True)`, so the derivative calculation remains part of the training graph.
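A minimal sketch of that pattern, assuming one `torch.autograd.grad` call per field component (the actual loop layout in NF2 may differ). The function `b_field` is a hypothetical analytic stand-in for the network, chosen because it is a linear force-free field with $\nabla \times B = B$, so the expected residuals are known.

```python
import torch


def b_field(x):
    # Hypothetical stand-in for the network: B = (sin z, cos z, 0).
    pz = x[:, 2]
    # 0.0 * pz keeps the third component attached to the autograd graph.
    return torch.stack([torch.sin(pz), torch.cos(pz), 0.0 * pz], dim=-1)


x = torch.rand(64, 3, dtype=torch.float64, requires_grad=True)
b = b_field(x)

# One autograd call per component; create_graph=True keeps the derivatives
# differentiable so a loss built from them can itself be backpropagated.
jac = torch.stack([
    torch.autograd.grad(b[:, i].sum(), x, create_graph=True)[0]
    for i in range(3)
], dim=1)  # jac[n, i, j] = dB_i / dx_j at point n

curl_b = torch.stack([
    jac[:, 2, 1] - jac[:, 1, 2],
    jac[:, 0, 2] - jac[:, 2, 0],
    jac[:, 1, 0] - jac[:, 0, 1],
], dim=-1)
div_b = jac[:, 0, 0] + jac[:, 1, 1] + jac[:, 2, 2]

# Residuals from which force-free and divergence loss terms would be built.
force_free_residual = torch.cross(curl_b, b, dim=-1)
```

For this test field both residuals vanish. With a trained network in place of `b_field`, the same tensors would be squared, averaged over the sampled points and weighted into the loss; the exact weighting and normalisation are not specified here.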

For `VectorPotentialModel`, $B$ is already a curl of $A$, so computing the current requires a second derivative path. That is more expensive, but the divergence constraint is built in.
