End-to-End PSF Dataset Generation Pipeline¶

What you will learn:

How to generate a batch of PSFs with randomised Zernike aberrations
How to build a WFE budget (per-mode amplitude limits)
How to store and inspect a PSF dataset as a NumPy array
How to visualise the statistical spread of the generated PSFs

Prerequisites: 04_wavefront_aberrations.ipynb, 05_encircled_energy.ipynb

Motivation¶

Many applications — optical system design studies, machine-learning wavefront-sensing models, Monte-Carlo tolerance analyses — require not just a single PSF but a large collection of PSFs drawn from a realistic distribution of wavefront errors.

This notebook shows the complete workflow:

Define WFE budget
       ↓
Sample random Zernike coefficients
       ↓
Compute PSF for each coefficient set
       ↓
Collect results into a NumPy array
       ↓
Inspect and characterise the dataset

In [1]:

Copied!





%matplotlib inline
import psfcraft
from psfcraft.utils import generate_coefficients, build_wfe_budget
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
import psfcraft
from psfcraft.utils import generate_coefficients, build_wfe_budget
import numpy as np
import matplotlib.pyplot as plt

pysynphot is unavailable (`np.alltrue` was removed in the NumPy 2.0 release. Use `np.all` instead.). Polychromatic sources will use the built-in Planck-law implementation.

1. Define the Telescope and WFE Budget¶

A WFE budget assigns a maximum amplitude (in nm RMS) to each Zernike mode.
Modes that matter less (high-order) get smaller budgets.

build_wfe_budget() expands a radial-order budget into a per-Zernike list.

In [2]:

Copied!





# --- Telescope configuration ---
PRIMARY_RADIUS   = 0.5    # metres
SECONDARY_RADIUS = 0.1    # metres
PIXEL_SCALE      = 0.298  # arcsec/pixel
FOV_PIXELS       = 32     # image size (keep small for speed)
WAVELENGTH       = 1e-6   # 1 µm

# --- WFE budget: [0, tip, tilt, defocus, astig×2, coma×2, trefoil×2, spherical, …]
# Radial budget: amplitude in nm per radial order (0=piston, 1=tip/tilt, …)
radial_budget_nm = [0, 100, 50, 36, 18, 9, 5]
wfe_budget = build_wfe_budget(radial_budget_nm)

print(f"Budget has {len(wfe_budget)} Zernike coefficients.")
print("Per-mode limits (nm):", wfe_budget[:12], "…")

# Quick bar chart
fig, ax = plt.subplots(figsize=(10, 3))
ax.bar(range(1, len(wfe_budget) + 1), wfe_budget, color="steelblue", edgecolor="white")
ax.set_xlabel("Noll index")
ax.set_ylabel("Max amplitude (nm)")
ax.set_title("WFE budget per Zernike mode")
ax.set_xlim(0.5, len(wfe_budget) + 0.5)
ax.grid(True, axis="y", alpha=0.3)
plt.tight_layout()
plt.show()
# --- Telescope configuration ---
PRIMARY_RADIUS   = 0.5    # metres
SECONDARY_RADIUS = 0.1    # metres
PIXEL_SCALE      = 0.298  # arcsec/pixel
FOV_PIXELS       = 32     # image size (keep small for speed)
WAVELENGTH       = 1e-6   # 1 µm

# --- WFE budget: [0, tip, tilt, defocus, astig×2, coma×2, trefoil×2, spherical, …]
# Radial budget: amplitude in nm per radial order (0=piston, 1=tip/tilt, …)
radial_budget_nm = [0, 100, 50, 36, 18, 9, 5]
wfe_budget = build_wfe_budget(radial_budget_nm)

print(f"Budget has {len(wfe_budget)} Zernike coefficients.")
print("Per-mode limits (nm):", wfe_budget[:12], "…")

# Quick bar chart
fig, ax = plt.subplots(figsize=(10, 3))
ax.bar(range(1, len(wfe_budget) + 1), wfe_budget, color="steelblue", edgecolor="white")
ax.set_xlabel("Noll index")
ax.set_ylabel("Max amplitude (nm)")
ax.set_title("WFE budget per Zernike mode")
ax.set_xlim(0.5, len(wfe_budget) + 0.5)
ax.grid(True, axis="y", alpha=0.3)
plt.tight_layout()
plt.show()

Budget has 28 Zernike coefficients.
Per-mode limits (nm): [0, 100, 100, 50, 50, 50, 36, 36, 36, 36, 18, 18] …

No description has been provided for this image

2. Generate a Batch of PSFs with Random Aberrations¶

generate_coefficients(wfe_budget) draws a uniform random coefficient vector within the budget.
We call it repeatedly and compute one PSF per draw.

Performance tip: Keep N_PSFS small (≤ 20) in this tutorial notebook.
For large datasets (10⁴–10⁵ PSFs), use the PSF_Generator class from psfcraft.utils which handles parallelism and HDF5 output automatically.

In [3]:

Copied!





N_PSFS = 12   # small for tutorial speed
RNG_SEED = 42

rng = np.random.default_rng(RNG_SEED)  # reproducible random state

# Pre-build the telescope object (reused across draws)
tel = psfcraft.NewtonianTelescope(
    version="1_3",
    primary_radius=PRIMARY_RADIUS,
    secondary_radius=SECONDARY_RADIUS,
)
tel.pixelscale = PIXEL_SCALE

psf_stack   = []   # oversampled PSF images
coef_stack  = []   # corresponding Zernike coefficients

for i in range(N_PSFS):
    # Draw random coefficients within the per-mode budget
    coefs = generate_coefficients(wfe_budget)

    tel.wfe_coefficients = coefs
    psf = tel.calc_psf(monochromatic=WAVELENGTH, fov_pixels=FOV_PIXELS)

    psf_stack.append(psf[0].data)   # oversampled extension
    coef_stack.append(coefs)
    print(f"  PSF {i+1:02d}/{N_PSFS}  RMS WFE = {np.sqrt(np.sum(coefs**2))*1e9:.1f} nm", end="\r")

psf_array  = np.array(psf_stack)    # shape: (N_PSFS, H, W)
coef_array = np.array(coef_stack)   # shape: (N_PSFS, n_modes)

print(f"\nDataset ready: {psf_array.shape}  (N × H × W)")
N_PSFS = 12   # small for tutorial speed
RNG_SEED = 42

rng = np.random.default_rng(RNG_SEED)  # reproducible random state

# Pre-build the telescope object (reused across draws)
tel = psfcraft.NewtonianTelescope(
    version="1_3",
    primary_radius=PRIMARY_RADIUS,
    secondary_radius=SECONDARY_RADIUS,
)
tel.pixelscale = PIXEL_SCALE

psf_stack   = []   # oversampled PSF images
coef_stack  = []   # corresponding Zernike coefficients

for i in range(N_PSFS):
    # Draw random coefficients within the per-mode budget
    coefs = generate_coefficients(wfe_budget)

    tel.wfe_coefficients = coefs
    psf = tel.calc_psf(monochromatic=WAVELENGTH, fov_pixels=FOV_PIXELS)

    psf_stack.append(psf[0].data)   # oversampled extension
    coef_stack.append(coefs)
    print(f"  PSF {i+1:02d}/{N_PSFS}  RMS WFE = {np.sqrt(np.sum(coefs**2))*1e9:.1f} nm", end="\r")

psf_array  = np.array(psf_stack)    # shape: (N_PSFS, H, W)
coef_array = np.array(coef_stack)   # shape: (N_PSFS, n_modes)

print(f"\nDataset ready: {psf_array.shape}  (N × H × W)")

  PSF 12/12  RMS WFE = 125.1 nm
Dataset ready: (12, 128, 128)  (N × H × W)
  PSF 12/12  RMS WFE = 125.1 nm
Dataset ready: (12, 128, 128)  (N × H × W)

3. Visualise the Dataset¶

Display all generated PSFs in a grid to get a feel for the variety in the dataset.

In [4]:

Copied!





from matplotlib.colors import LogNorm

n_cols = 6
n_rows = int(np.ceil(N_PSFS / n_cols))

fig, axes = plt.subplots(n_rows, n_cols, figsize=(n_cols * 2.5, n_rows * 2.5))
axes = axes.flatten()

for i, (psf_img, ax) in enumerate(zip(psf_array, axes)):
    rms_nm = np.sqrt(np.sum(coef_array[i] ** 2)) * 1e9
    im = ax.imshow(np.maximum(psf_img, 1e-10), cmap="inferno",
                   norm=LogNorm(), origin="lower")
    ax.set_title(f"#{i+1}  {rms_nm:.0f} nm", fontsize=9)
    ax.axis("off")

# Hide unused axes
for ax in axes[N_PSFS:]:
    ax.axis("off")

plt.suptitle("Random PSF dataset  (log scale, labelled by total RMS WFE)", fontsize=12)
plt.tight_layout()
plt.show()
from matplotlib.colors import LogNorm

n_cols = 6
n_rows = int(np.ceil(N_PSFS / n_cols))

fig, axes = plt.subplots(n_rows, n_cols, figsize=(n_cols * 2.5, n_rows * 2.5))
axes = axes.flatten()

for i, (psf_img, ax) in enumerate(zip(psf_array, axes)):
    rms_nm = np.sqrt(np.sum(coef_array[i] ** 2)) * 1e9
    im = ax.imshow(np.maximum(psf_img, 1e-10), cmap="inferno",
                   norm=LogNorm(), origin="lower")
    ax.set_title(f"#{i+1}  {rms_nm:.0f} nm", fontsize=9)
    ax.axis("off")

# Hide unused axes
for ax in axes[N_PSFS:]:
    ax.axis("off")

plt.suptitle("Random PSF dataset  (log scale, labelled by total RMS WFE)", fontsize=12)
plt.tight_layout()
plt.show()

4. Statistical Characterisation¶

Compute per-PSF EE50 and FWHM to summarise PSF quality across the dataset.

In [5]:

Copied!





from astropy.io import fits as pyfits

# We need HDUList objects for psfcraft.measure_ee — rebuild them
rms_wfe_nm = []
ee50_arcsec = []

for i, coefs in enumerate(coef_array):
    tel.wfe_coefficients = coefs
    psf_hdul = tel.calc_psf(monochromatic=WAVELENGTH, fov_pixels=FOV_PIXELS)

    rms_wfe_nm.append(np.sqrt(np.sum(coefs**2)) * 1e9)

    # Find radius at EE = 0.5
    ee_fn = psfcraft.measure_ee(psf_hdul)
    # Binary search for EE50 radius
    radii = np.linspace(0.01, 1.0, 200)
    ee_vals = np.array([ee_fn(r) for r in radii])
    idx = np.searchsorted(ee_vals, 0.5)
    ee50_arcsec.append(radii[min(idx, len(radii) - 1)])

rms_wfe_nm  = np.array(rms_wfe_nm)
ee50_arcsec = np.array(ee50_arcsec)

fig, axes = plt.subplots(1, 2, figsize=(11, 4))

axes[0].hist(rms_wfe_nm, bins=8, color="steelblue", edgecolor="white")
axes[0].set_xlabel("Total RMS WFE (nm)")
axes[0].set_ylabel("Count")
axes[0].set_title("Distribution of RMS WFE")

axes[1].scatter(rms_wfe_nm, ee50_arcsec, c="steelblue", edgecolors="white", s=60)
axes[1].set_xlabel("Total RMS WFE (nm)")
axes[1].set_ylabel("EE50 radius (arcsec)")
axes[1].set_title("EE50 vs. WFE — larger WFE → broader PSF")
axes[1].grid(True, alpha=0.3)

plt.tight_layout()
plt.show()
from astropy.io import fits as pyfits

# We need HDUList objects for psfcraft.measure_ee — rebuild them
rms_wfe_nm = []
ee50_arcsec = []

for i, coefs in enumerate(coef_array):
    tel.wfe_coefficients = coefs
    psf_hdul = tel.calc_psf(monochromatic=WAVELENGTH, fov_pixels=FOV_PIXELS)

    rms_wfe_nm.append(np.sqrt(np.sum(coefs**2)) * 1e9)

    # Find radius at EE = 0.5
    ee_fn = psfcraft.measure_ee(psf_hdul)
    # Binary search for EE50 radius
    radii = np.linspace(0.01, 1.0, 200)
    ee_vals = np.array([ee_fn(r) for r in radii])
    idx = np.searchsorted(ee_vals, 0.5)
    ee50_arcsec.append(radii[min(idx, len(radii) - 1)])

rms_wfe_nm  = np.array(rms_wfe_nm)
ee50_arcsec = np.array(ee50_arcsec)

fig, axes = plt.subplots(1, 2, figsize=(11, 4))

axes[0].hist(rms_wfe_nm, bins=8, color="steelblue", edgecolor="white")
axes[0].set_xlabel("Total RMS WFE (nm)")
axes[0].set_ylabel("Count")
axes[0].set_title("Distribution of RMS WFE")

axes[1].scatter(rms_wfe_nm, ee50_arcsec, c="steelblue", edgecolors="white", s=60)
axes[1].set_xlabel("Total RMS WFE (nm)")
axes[1].set_ylabel("EE50 radius (arcsec)")
axes[1].set_title("EE50 vs. WFE — larger WFE → broader PSF")
axes[1].grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

5. Saving the Dataset¶

For large-scale production runs, save the PSF array and coefficient array to disk.

In [6]:

Copied!





output_dir = "/tmp/psfcraft_dataset"
import os
os.makedirs(output_dir, exist_ok=True)

# --- NumPy format (lightweight, fast) ---
np.save(os.path.join(output_dir, "psf_images.npy"),   psf_array)
np.save(os.path.join(output_dir, "wfe_coefficients.npy"), coef_array)

print(f"Saved {N_PSFS} PSFs to {output_dir}/")
print(f"  psf_images.npy      : {psf_array.shape}  dtype={psf_array.dtype}")
print(f"  wfe_coefficients.npy: {coef_array.shape}  dtype={coef_array.dtype}")

# --- Reload and verify ---
psf_reload  = np.load(os.path.join(output_dir, "psf_images.npy"))
coef_reload = np.load(os.path.join(output_dir, "wfe_coefficients.npy"))
assert np.allclose(psf_reload, psf_array), "Round-trip check failed!"
print("\nRound-trip verification: ✅ OK")
output_dir = "/tmp/psfcraft_dataset"
import os
os.makedirs(output_dir, exist_ok=True)

# --- NumPy format (lightweight, fast) ---
np.save(os.path.join(output_dir, "psf_images.npy"),   psf_array)
np.save(os.path.join(output_dir, "wfe_coefficients.npy"), coef_array)

print(f"Saved {N_PSFS} PSFs to {output_dir}/")
print(f"  psf_images.npy      : {psf_array.shape}  dtype={psf_array.dtype}")
print(f"  wfe_coefficients.npy: {coef_array.shape}  dtype={coef_array.dtype}")

# --- Reload and verify ---
psf_reload  = np.load(os.path.join(output_dir, "psf_images.npy"))
coef_reload = np.load(os.path.join(output_dir, "wfe_coefficients.npy"))
assert np.allclose(psf_reload, psf_array), "Round-trip check failed!"
print("\nRound-trip verification: ✅ OK")

Saved 12 PSFs to /tmp/psfcraft_dataset/
  psf_images.npy      : (12, 128, 128)  dtype=float64
  wfe_coefficients.npy: (12, 28)  dtype=float64

Round-trip verification: ✅ OK

Key Takeaways¶

build_wfe_budget(radial_budget) converts a compact radial-order budget into a full per-Zernike list.
generate_coefficients(wfe_budget) draws one random coefficient vector within those limits.
Reuse a single NewtonianTelescope object across draws — only update wfe_coefficients before each calc_psf().
EE50 grows monotonically with total RMS WFE, confirming that larger aberrations degrade PSF sharpness.
Save datasets as .npy files (small batches) or use PSF_Generator + HDF5 for production-scale generation.

Going Further¶

Topic	Where to look
Large-scale parallel generation	`psfcraft.utils.PSF_Generator`
HDF5 dataset management	`PSF_Generator.write_database_hdf5()`
Custom telescope apertures	`psfcraft.optics.NewtonianTelescopeAperture`
Polychromatic sources	`psfcraft.utils.build_polychromatic_star` + notebook 06
WebUI interactive exploration	`webui/psfcraft-webui.html`