NLFFF quality metrics

These are the quantitative checks used to judge an extrapolated magnetic field.

A low training loss is not proof of a good field. Neither is a field-line plot that looks plausible. The field has to be checked against the physics and, where possible, against a reference field.

The metrics split into two groups:

  • physical-consistency metrics: does the field obey the equations?
  • agreement metrics: does it match a reference field point by point?

Throughout this note, $i$ indexes the grid points, $N$ is their number, $\mathbf{B}$ is the model field, and

$$\mathbf{J} = \frac{1}{\mu_0}\,\nabla \times \mathbf{B}$$

is the current density.

Force-freeness

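A force-free field has current parallel to the magnetic field:

$$\mathbf{J} \times \mathbf{B} = 0$$

The pointwise angle $\theta_i$ between them is measured using:

$$\sin\theta_i = \frac{|\mathbf{J}_i \times \mathbf{B}_i|}{|\mathbf{J}_i|\,|\mathbf{B}_i|}$$

If the field is force-free, $\sin\theta_i = 0$ at every grid point.

The standard scalar summary is the current-weighted average sine:

$$\mathrm{CW}\sin\theta = \frac{\sum_i |\mathbf{J}_i| \sin\theta_i}{\sum_i |\mathbf{J}_i|}$$

This weights the metric toward regions where current actually matters. Weak-current regions are easy to satisfy and should not dominate the score.

A related number is the current-weighted angle:

$$\theta_J = \arcsin\left(\mathrm{CW}\sin\theta\right)$$

usually quoted in degrees.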

This metric is close to the sigma_j diagnostic used in the NF2 codebase.
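A minimal NumPy sketch of the current-weighted sine and the corresponding angle, assuming the field is stored as an array of shape (nx, ny, nz, 3) on a uniform Cartesian grid. The function names, the finite-difference curl, and the small `eps` guard are illustrative assumptions, not the NF2 implementation. The constant factor in the current density is omitted because it cancels in the ratio.

```python
import numpy as np

def curl(B, dx=1.0):
    """Finite-difference curl of B, shape (nx, ny, nz, 3), uniform spacing dx."""
    # np.gradient returns [d/dx, d/dy, d/dz] for each component.
    dBx = np.gradient(B[..., 0], dx)
    dBy = np.gradient(B[..., 1], dx)
    dBz = np.gradient(B[..., 2], dx)
    return np.stack([dBz[1] - dBy[2],   # dBz/dy - dBy/dz
                     dBx[2] - dBz[0],   # dBx/dz - dBz/dx
                     dBy[0] - dBx[1]],  # dBy/dx - dBx/dy
                    axis=-1)

def current_weighted_sine(J, B, eps=1e-12):
    """Sum of |J_i| sin(theta_i) divided by the sum of |J_i|."""
    jxb = np.linalg.norm(np.cross(J, B), axis=-1)
    j = np.linalg.norm(J, axis=-1)
    b = np.linalg.norm(B, axis=-1)
    # |J_i| sin(theta_i) = |J_i x B_i| / |B_i|; eps guards against |B| = 0.
    return np.sum(jxb / (b + eps)) / (np.sum(j) + eps)

def theta_J_degrees(J, B):
    """Current-weighted angle between J and B, in degrees."""
    s = np.clip(current_weighted_sine(J, B), 0.0, 1.0)
    return np.degrees(np.arcsin(s))
```

Typical use would be `J = curl(B, dx)` followed by `theta_J_degrees(J, B)`. Finite differences are used here for simplicity; a neural-field model could instead supply the derivatives by automatic differentiation.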

Divergence control

Raw $\nabla \cdot \mathbf{B}$ is dimensional and depends on resolution, so it needs normalising.

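The usual dimensionless measure is the average fractional flux, written here for $N$ cubic cells of side $\Delta x$:

$$\langle |f_i| \rangle = \frac{1}{N} \sum_i \frac{|\nabla \cdot \mathbf{B}|_i \, \Delta x}{6\,|\mathbf{B}_i|}$$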

Read it as: how much net flux leaks out of a cell, divided by the total unsigned flux passing through the cell surface.

Smaller values mean a cleaner field; the acceptable level depends on the use case.

A useful companion diagnostic is the energy in the non-solenoidal component of the field. That tells us whether divergence error is merely numerical noise or large enough to corrupt energy estimates and topology.
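The fractional-flux measure can be sketched in NumPy as follows, assuming a uniform grid of cubic cells and a field array of shape (nx, ny, nz, 3). The function names and the `eps` guard against zero field strength are assumptions made for illustration. The non-solenoidal energy diagnostic is not computed here.

```python
import numpy as np

def divergence(B, dx=1.0):
    """Finite-difference divergence of B on a uniform grid with spacing dx."""
    return (np.gradient(B[..., 0], dx, axis=0)
            + np.gradient(B[..., 1], dx, axis=1)
            + np.gradient(B[..., 2], dx, axis=2))

def mean_fractional_flux(B, dx=1.0, eps=1e-12):
    """Average of |div B| * dx / (6 |B|) over all cells.

    For a cube of side dx, net flux ~ div(B) * dx**3 and the surface
    area is 6 * dx**2, which gives the dx / 6 factor.
    """
    div = np.abs(divergence(B, dx))
    b = np.linalg.norm(B, axis=-1)
    return np.mean(div * dx / (6.0 * (b + eps)))
```

Because the result is dimensionless, values from runs at different resolutions can be compared directly, which is the reason for normalising in the first place.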

Boundary agreement

Boundary agreement asks whether the model stays tied to the observed magnetogram.

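A simple relative error, summed over the boundary pixels with $\mathbf{B}^{\mathrm{obs}}$ the observed magnetogram field, is:

$$E_{\mathrm{bnd}} = \frac{\sum_i |\mathbf{B}_i - \mathbf{B}^{\mathrm{obs}}_i|}{\sum_i |\mathbf{B}^{\mathrm{obs}}_i|}$$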

Interpret this together with the physics metrics.

Too much boundary agreement can be bad if the model is overfitting non-force-free photospheric structure. Too little boundary agreement means the extrapolation may no longer represent the observed active region.

In NF2, this trade-off is controlled directly by the weight given to the boundary term in the loss.
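As a rough sketch, a relative boundary error of this kind could be computed as below. The function name, the choice of a sum-of-norms ratio, and the array layout (ny, nx, 3) for the bottom layer are all assumptions for illustration; other norms are equally plausible.

```python
import numpy as np

def boundary_relative_error(B_model_bottom, B_obs):
    """Sum of |B_model - B_obs| over boundary pixels, divided by sum of |B_obs|.

    Both inputs are assumed to have shape (ny, nx, 3): the model field
    sampled on the lower boundary and the observed vector magnetogram.
    """
    diff = np.linalg.norm(B_model_bottom - B_obs, axis=-1)
    return diff.sum() / np.linalg.norm(B_obs, axis=-1).sum()
```

A value of 0 means the model reproduces the magnetogram exactly, which, as noted above, is not automatically desirable.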

Agreement with a reference field

When a reference field $\mathbf{B}^{\mathrm{ref}}$ exists, for example an analytic solution or the output of another extrapolation method, Schrijver-style comparison metrics are useful.

Vector correlation

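This is like a 3D correlation:

$$C_{\mathrm{vec}} = \frac{\sum_i \mathbf{B}_i \cdot \mathbf{B}^{\mathrm{ref}}_i}{\left(\sum_i |\mathbf{B}_i|^2 \, \sum_i |\mathbf{B}^{\mathrm{ref}}_i|^2\right)^{1/2}}$$

A perfect match gives $C_{\mathrm{vec}} = 1$.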

Cauchy-Schwarz metric

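This averages the pointwise direction cosine:

$$C_{\mathrm{CS}} = \frac{1}{N} \sum_i \frac{\mathbf{B}_i \cdot \mathbf{B}^{\mathrm{ref}}_i}{|\mathbf{B}_i|\,|\mathbf{B}^{\mathrm{ref}}_i|}$$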

It is sensitive to orientation errors even when magnitudes are similar.

Normalised vector error

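Written as a complement so that 1 is perfect:

$$E_n' = 1 - \frac{\sum_i |\mathbf{B}_i - \mathbf{B}^{\mathrm{ref}}_i|}{\sum_i |\mathbf{B}^{\mathrm{ref}}_i|}$$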

Mean vector error

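Also written as a complement:

$$E_m' = 1 - \frac{1}{N} \sum_i \frac{|\mathbf{B}_i - \mathbf{B}^{\mathrm{ref}}_i|}{|\mathbf{B}^{\mathrm{ref}}_i|}$$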

Total energy ratio

This compares total magnetic energy:

$$\epsilon = \frac{\sum_i |\mathbf{B}_i|^2}{\sum_i |\mathbf{B}^{\mathrm{ref}}_i|^2}$$

A value of 1 means the model carries the same total magnetic energy as the reference.
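The five comparison metrics above can be computed together. The following is a minimal NumPy sketch, assuming the model and reference fields are sampled on the same grid and stored with the vector components on the last axis. The function name, dictionary keys, and `eps` guard are illustrative choices.

```python
import numpy as np

def agreement_metrics(B, B_ref, eps=1e-12):
    """Schrijver-style agreement metrics between a model and a reference field."""
    B = np.asarray(B, dtype=float).reshape(-1, 3)
    B_ref = np.asarray(B_ref, dtype=float).reshape(-1, 3)
    dot = np.sum(B * B_ref, axis=-1)           # pointwise B . B_ref
    nb = np.linalg.norm(B, axis=-1)            # |B_i|
    nr = np.linalg.norm(B_ref, axis=-1)        # |B_ref_i|
    diff = np.linalg.norm(B - B_ref, axis=-1)  # |B_i - B_ref_i|
    return {
        "C_vec": dot.sum() / np.sqrt((nb**2).sum() * (nr**2).sum()),
        "C_CS": np.mean(dot / (nb * nr + eps)),
        "E_n_prime": 1.0 - diff.sum() / nr.sum(),
        "E_m_prime": 1.0 - np.mean(diff / (nr + eps)),
        "energy_ratio": (nb**2).sum() / (nr**2).sum(),
    }
```

Passing a model that is exactly twice the reference leaves `C_vec` and `C_CS` at 1 while `E_n_prime` and `E_m_prime` drop to 0 and `energy_ratio` becomes 4. The two correlation metrics are blind to a global amplitude error.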

How to read them

No single agreement metric is enough.

$C_{\mathrm{vec}}$ and $E_n'$ are dominated by strong-field regions. $C_{\mathrm{CS}}$ and $E_m'$ weight points more evenly, so they expose errors in weak coronal regions.

Use several metrics together. Each one is blind to something.
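A two-point toy case makes the difference in weighting concrete. The numbers are invented for illustration: one strong-field point is reproduced exactly, and one weak-field point is rotated by 90 degrees in the model.

```python
import numpy as np

# Reference: a strong point (|B| = 1000) and a weak point (|B| = 1), both along z.
B_ref = np.array([[0.0, 0.0, 1000.0],
                  [0.0, 0.0, 1.0]])
# Model: strong point exact, weak point turned from z to x.
B = np.array([[0.0, 0.0, 1000.0],
              [1.0, 0.0, 0.0]])

dot = np.sum(B * B_ref, axis=-1)
nb = np.linalg.norm(B, axis=-1)
nr = np.linalg.norm(B_ref, axis=-1)
diff = np.linalg.norm(B - B_ref, axis=-1)

C_vec = dot.sum() / np.sqrt((nb**2).sum() * (nr**2).sum())  # strong-field weighted
C_CS = np.mean(dot / (nb * nr))                             # evenly weighted
E_n_prime = 1.0 - diff.sum() / nr.sum()                     # strong-field weighted
E_m_prime = 1.0 - np.mean(diff / nr)                        # evenly weighted

print(f"C_vec={C_vec:.6f}  C_CS={C_CS:.3f}  "
      f"E_n'={E_n_prime:.4f}  E_m'={E_m_prime:.4f}")
```

Here the strong-field-weighted metrics stay above 0.99 and barely register the error, while the evenly weighted ones fall to 0.5 and about 0.29.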

Energy and topology

Two important diagnostics are not captured by a single pointwise comparison number.

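Free magnetic energy, with $\mathbf{B}_{\mathrm{pot}}$ the potential field:

$$E_{\mathrm{free}} = E - E_{\mathrm{pot}} = \frac{1}{2\mu_0} \int_V \left(|\mathbf{B}|^2 - |\mathbf{B}_{\mathrm{pot}}|^2\right) dV$$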

This measures the energy above the potential-field floor. It is detailed in Magnetic energy and free energy in active regions.
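On a uniform grid the volume integral reduces to a sum over cells. The sketch below assumes SI units (field in tesla, spacing in metres, energy in joules) and a simple Riemann sum. The function names and the unit convention are assumptions; a cgs pipeline would replace $1/(2\mu_0)$ with $1/(8\pi)$.

```python
import numpy as np

MU0 = 4e-7 * np.pi  # vacuum permeability in SI units (H/m)

def magnetic_energy(B, dx):
    """Total magnetic energy: sum of |B|^2 / (2 mu0) times the cell volume dx**3."""
    return np.sum(B**2) * dx**3 / (2.0 * MU0)

def free_energy(B, B_pot, dx):
    """Energy of B above that of the potential field B_pot on the same grid."""
    return magnetic_energy(B, dx) - magnetic_energy(B_pot, dx)
```

Both fields must share the same grid and volume, otherwise the difference mixes a physical excess with a change in integration domain.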

Topology: field-line connectivity, null points, separatrices, QSLs, sheared arcades, and flux ropes. These are extracted by tracing field lines and are discussed in Magnetic topology and field-line tracing.

Topology is partly qualitative, but it is often the physically interesting output for an eruptive region like AR 11158.

The rule

Report the metrics together:

  • force-freeness, e.g. the current-weighted angle $\theta_J$;
  • divergence, e.g. the average fractional flux $\langle |f_i| \rangle$;
  • boundary error;
  • free magnetic energy;
  • potential-field comparison;
  • topology or connectivity checks.

Then watch how they move over training and across hyperparameter choices. The individual residuals can improve or degrade in opposite directions, so the joint behaviour is the actual result.