Week in review: Vectorization, numerical stability, and biological accuracy

3 minute read

Published:

Modernizing eight research codebases at once sounds like a terrible idea, and some weeks it feels like one. But working across optics, biophysics, robotics, and geostatistics in parallel has surfaced a pattern: the same three engineering challenges keep showing up in every domain, wearing different clothes.

Vectorization

The single biggest performance improvement across all projects has been replacing explicit Python loops with NumPy broadcasting and vectorized operations. The most dramatic example: in the Cellular Potts Model, the adhesion energy computation between all cell pairs used a triple-nested loop.

Adhesion energy (before):
E_adh = Σ_i Σ_j Σ_neighbors J(σ_i, σ_j) · (1 − δ(σ_i, σ_j))   →   O(n³)

Reshaping the interaction matrix and using broadcasting brought it down to a vectorized pairwise computation:

Adhesion energy (after):
E_adh = J[σ_grid, σ_shifted] · mask   →   O(n²), vectorized

The constant factor shrank enough to handle hundreds of cells in real time. In the SOFI pipeline, cumulant computation moved from frame-by-frame accumulation to windowed batch processing over the full time stack. The optical tweezers Gerchberg-Saxton algorithm now runs entirely as FFT-plus-array operations with no Python-level iteration inside the main loop. In every case, wall-clock time dropped by one to two orders of magnitude.

Numerical stability

Each domain brought its own flavor of “the math is right but the numbers are wrong.”

Phase overflow fix

The optical tweezers had phase values overflowing at 10^16 radians because SI units produced a characteristic scale far beyond what float64 can resolve after a modulo operation. The fix was dimensionless reformulation:

Phase overflow fix:
SI: K_j = (2π/λf) · Δp² · N · (u·x_j + v·y_j) → 10¹⁶ rad
Dimensionless: K_j = α(u·x_j + v·y_j),   α ≈ π

The CPM collision resolver oscillated forever because naive repulsion and adhesion forces fought each other across time steps. The Darcy flow solver for well placement used a Jacobi iteration that diverged quietly when the permeability contrast exceeded 10³. In every case, the fix was not to change the physics but to change the numerical formulation – dimensionless coordinates, relaxation sweeps, preconditioning. Stability is an engineering constraint, not a mathematical afterthought.

Biological accuracy

The most humbling part of working on biophysical simulations is that nature already ran the experiment.

Collision resolution

Contact Inhibition of Locomotion means cells repolarize away from collisions, not through them:

CIL repolarization:
polarity_new = normalize(polarity − 0.3 × contact_direction)

Hertwig’s rule says cells divide along their longest axis – implementing anything else produces tissue geometries that look wrong to any biologist in the room. The EVL elastic coupling in the embryo simulator must pull but never push, because that is what Oteiza and colleagues measured in 2015:

EVL spring (pull-only):
F_EVL = k₀ · exp(−d / 0.3) × max(0, θ_EVL − θ_cell)

These are not optional features or nice-to-haves. They are hard constraints from decades of experimental observation, and ignoring them produces simulations that converge beautifully to answers that are biologically meaningless.

All eight repositories now include test suites, documentation with SVG diagrams, and working web demonstrations where applicable. They are publicly available on GitHub – not because the code is perfect, but because reproducibility matters more than polish.