Week in review: Vectorization, numerical stability, and biological accuracy
Published:
Modernizing eight research codebases at once sounds like a terrible idea, and some weeks it feels like one. But working across optics, biophysics, robotics, and geostatistics in parallel has surfaced a pattern: the same three engineering challenges keep showing up in every domain, wearing different clothes.
Vectorization
The single biggest performance improvement across all projects has been replacing explicit Python loops with NumPy broadcasting and vectorized operations. The most dramatic example: in the Cellular Potts Model, the adhesion energy computation between all cell pairs used a triple-nested loop.
E_adh = Σ_i Σ_j Σ_neighbors J(σ_i, σ_j) · (1 − δ(σ_i, σ_j)) → O(n³)
Reshaping the interaction matrix and using broadcasting brought it down to a vectorized pairwise computation:
E_adh = J[σ_grid, σ_shifted] · mask → O(n²), vectorized
The constant factor shrank enough to handle hundreds of cells in real time. In the SOFI pipeline, cumulant computation moved from frame-by-frame accumulation to windowed batch processing over the full time stack. The optical tweezers Gerchberg-Saxton algorithm now runs entirely as FFT-plus-array operations with no Python-level iteration inside the main loop. In every case, wall-clock time dropped by one to two orders of magnitude.
Numerical stability
Each domain brought its own flavor of “the math is right but the numbers are wrong.”
The optical tweezers had phase values overflowing at 10^16 radians because SI units produced a characteristic scale far beyond what float64 can resolve after a modulo operation. The fix was dimensionless reformulation:
SI: K_j = (2π/λf) · Δp² · N · (u·x_j + v·y_j) → 10¹⁶ rad
Dimensionless: K_j = α(u·x_j + v·y_j), α ≈ π
The CPM collision resolver oscillated forever because naive repulsion and adhesion forces fought each other across time steps. The Darcy flow solver for well placement used a Jacobi iteration that diverged quietly when the permeability contrast exceeded 10³. In every case, the fix was not to change the physics but to change the numerical formulation – dimensionless coordinates, relaxation sweeps, preconditioning. Stability is an engineering constraint, not a mathematical afterthought.
Biological accuracy
The most humbling part of working on biophysical simulations is that nature already ran the experiment.
Contact Inhibition of Locomotion means cells repolarize away from collisions, not through them:
polarity_new = normalize(polarity − 0.3 × contact_direction)
Hertwig’s rule says cells divide along their longest axis – implementing anything else produces tissue geometries that look wrong to any biologist in the room. The EVL elastic coupling in the embryo simulator must pull but never push, because that is what Oteiza and colleagues measured in 2015:
F_EVL = k₀ · exp(−d / 0.3) × max(0, θ_EVL − θ_cell)
These are not optional features or nice-to-haves. They are hard constraints from decades of experimental observation, and ignoring them produces simulations that converge beautifully to answers that are biologically meaningless.
All eight repositories now include test suites, documentation with SVG diagrams, and working web demonstrations where applicable. They are publicly available on GitHub – not because the code is perfect, but because reproducibility matters more than polish.
