Evaluation: PyImageJ/CLIJ2 vs Pure Python for Extended Depth of Field (EDF)
⚠️ DEPRECATION NOTICE (2025): This evaluation document is historical. Based on this analysis, KINTSUGI has adopted the pure Python approach (CuPy/NumPy) as the default implementation. PyImageJ/CLIJ2 has been deprecated and removed as a required dependency. See `src/kintsugi/edf.py` for the current implementation.
Executive Summary
This document evaluates the current PyImageJ/CLIJ2 implementation for 3D to 2D Extended Depth of Field (EDF) conversion and assesses pure Python alternatives for both notebook testing and batch processing workflows.
UPDATE (2025): The pure Python implementation using CuPy (GPU) or NumPy (CPU) has been adopted as the default. PyImageJ/CLIJ2 is now deprecated and no longer required.
~~Recommendation: A hybrid approach is recommended:~~

1. ~~Keep PyImageJ/CLIJ2 for production batch processing (GPU-accelerated, proven performance)~~
2. ~~Implement a pure Python fallback for environments without Java/GPU or for testing~~
1. Current Implementation Analysis
1.1 Architecture Overview
The current EDF implementation uses:
- PyImageJ (`pyimagej >= 1.4.0`) as the Python-Java bridge
- CLIJ2 plugin for GPU-accelerated processing
- The `extendedDepthOfFocusVarianceProjection` algorithm

Key Files:

- `fiji_macros/EDF_Macro.ijm` - Standalone macro (72 lines)
- `notebooks/1_Single_Channel_Eval.ipynb` - Interactive testing (Section 6)
- `notebooks/2_Cycle_Processing.ipynb` - Batch processing (Section 5)
1.2 Algorithm Parameters
From `EDF_Macro.ijm`:

```
radius_x = 5.0   # Gaussian blur radius X
radius_y = 5.0   # Gaussian blur radius Y
sigma    = 20.0  # Variance projection sigma
xTiles   = 3     # X-dimension tile splits
yTiles   = 3     # Y-dimension tile splits
z_start  = 3     # Starting Z-slice
z_end    = 15    # Ending Z-slice
```
1.3 Performance Benchmarks
From macro comments:
| Device | Time (2x2 tiling) |
|---|---|
| NVIDIA RTX A4500 | 42.1 seconds |
| Intel UHD Graphics 770 | 1m 59.4s |
Image size: (17, 9484, 9962) - approximately 17 Z-slices at ~9500x10000 pixels
1.4 Current Dependencies
```
# From env-*.yml and pyproject.toml
pyimagej >= 1.4.0
jpype1 >= 1.5.0
scyjava >= 1.0.0
openjdk = 11 (or 21 on Windows)
maven = 3.9.9
```
Plus local Fiji installation with CLIJ2 plugin.
2. CLIJ2 Algorithm Details
2.1 What extendedDepthOfFocusVarianceProjection Does
The CLIJ2 variance projection algorithm, for each (x, y) position in the image:

1. Smooths each Z-slice with a Gaussian of width `sigma`
2. Calculates the local variance of the smoothed slice in a window of size `(2*radius_x + 1, 2*radius_y + 1)`
3. Selects the pixel value from the Z-slice whose local variance is maximal
Rationale: High local variance indicates sharp edges/features (in-focus regions). Low variance indicates blur (out-of-focus regions).
Mathematical formulation:

```
For each (x, y):
    variance[z]  = Var(GaussianBlur(I[z], sigma)[x-rx : x+rx, y-ry : y+ry])
    z_best       = argmax_z(variance[z])
    output[x, y] = I[z_best, x, y]
```
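This selection rule is easy to exercise on a toy stack. The following is a minimal NumPy sketch (synthetic data, a 3×3 variance window, and the Gaussian smoothing step omitted for brevity) confirming that the per-pixel argmax picks the slice containing the sharp feature:

```python
import numpy as np
from scipy.ndimage import uniform_filter

# Toy 3-slice stack: slice 1 contains a sharp edge (high local variance),
# slices 0 and 2 are perfectly flat (zero variance everywhere).
stack = np.zeros((3, 8, 8), dtype=np.float32)
stack[1, :, 4:] = 100.0  # sharp vertical edge in slice 1

# Local variance per slice over a 3x3 window: Var(X) = E[X^2] - E[X]^2
mean = uniform_filter(stack, size=(1, 3, 3))
mean_sq = uniform_filter(stack ** 2, size=(1, 3, 3))
variance = mean_sq - mean ** 2

# Pick, per pixel, the slice with maximum local variance
z_best = np.argmax(variance, axis=0)
edf = np.take_along_axis(stack, z_best[None], axis=0)[0]

print(z_best[0, 3:5])  # -> [1 1]: pixels near the edge come from slice 1
```

Away from the edge all slices have zero variance, so `argmax` falls back to slice 0; in a real stack every pixel has some texture and the tie-break rarely matters.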
3. Pure Python Alternatives
3.1 Available Libraries
| Library | Method | GPU Support | Notes |
|---|---|---|---|
| NumPy/SciPy | Variance-based | No (but vectorized) | Most flexible, requires custom implementation |
| focus-stack | Laplacian gradient | No | PyPI package, designed for microscopy |
| OpenCV | Laplacian pyramid | Yes (CUDA) | Well-tested, multi-platform |
| Mahotas | Sobel-based focus measure | No | Microscopy-focused, good documentation |
| CuPy | Custom variance | Yes (CUDA) | NumPy API with GPU acceleration |
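For a feel of the gradient-family measures in the table, a Laplacian-based selector can be sketched with SciPy alone. This is an illustrative stand-in, not the actual focus-stack or OpenCV API, and `edf_laplacian` is a name invented here:

```python
import numpy as np
from scipy.ndimage import laplace, uniform_filter

def edf_laplacian(stack: np.ndarray, window: int = 7) -> np.ndarray:
    """Pick, per pixel, the slice with the strongest local Laplacian
    response (a common gradient-based focus measure)."""
    # Focus measure: locally averaged squared Laplacian, per slice
    focus = np.stack([
        uniform_filter(laplace(s.astype(np.float32)) ** 2, size=window)
        for s in stack
    ])
    z_best = np.argmax(focus, axis=0)
    return np.take_along_axis(stack, z_best[None], axis=0)[0]
```

Gradient-based measures respond to edges rather than raw intensity spread, so they behave differently from the variance projection on low-contrast texture; that is one reason the variance-based approach was chosen for parity with CLIJ2.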
3.2 Recommended Pure Python Implementation
A variance-based implementation matching CLIJ2’s approach:
```python
import numpy as np
from scipy.ndimage import uniform_filter, gaussian_filter


def extended_depth_of_focus_variance(
    stack: np.ndarray,
    radius_x: int = 5,
    radius_y: int = 5,
    sigma: float = 20.0,
    use_gpu: bool = False,
) -> np.ndarray:
    """
    Extended depth of focus using local variance projection.
    Matches CLIJ2's extendedDepthOfFocusVarianceProjection algorithm.

    Parameters
    ----------
    stack : np.ndarray
        3D image stack with shape (z, y, x)
    radius_x : int
        Half-width of variance calculation window in X
    radius_y : int
        Half-height of variance calculation window in Y
    sigma : float
        Gaussian smoothing sigma before variance calculation
    use_gpu : bool
        If True and CuPy is available, use GPU acceleration

    Returns
    -------
    np.ndarray
        2D projected image with shape (y, x)
    """
    xp, uf, gf = np, uniform_filter, gaussian_filter
    if use_gpu:
        try:
            import cupy as cp
            from cupyx.scipy.ndimage import uniform_filter as gpu_uniform_filter
            from cupyx.scipy.ndimage import gaussian_filter as gpu_gaussian_filter

            xp, uf, gf = cp, gpu_uniform_filter, gpu_gaussian_filter
            stack = cp.asarray(stack)
        except ImportError:
            use_gpu = False  # CuPy unavailable: fall back to the CPU path

    n_slices, height, width = stack.shape
    variances = xp.zeros_like(stack, dtype=xp.float32)

    # Window size for variance calculation
    window_size = (2 * radius_y + 1, 2 * radius_x + 1)

    for z in range(n_slices):
        # Apply Gaussian smoothing
        smoothed = gf(stack[z].astype(xp.float32), sigma=sigma)
        # Calculate local variance: Var(X) = E[X^2] - E[X]^2
        local_mean = uf(smoothed, size=window_size)
        local_mean_sq = uf(smoothed ** 2, size=window_size)
        variances[z] = local_mean_sq - local_mean ** 2

    # Find the Z-index with maximum variance at each (x, y)
    max_var_idx = xp.argmax(variances, axis=0)

    # Select pixels from the best-focus slices
    y_idx, x_idx = xp.meshgrid(
        xp.arange(height),
        xp.arange(width),
        indexing='ij',
    )
    result = stack[max_var_idx, y_idx, x_idx]

    if use_gpu:
        result = xp.asnumpy(result)  # transfer back to host memory
    return result
```
3.3 Tiled Processing for Large Images
```python
def edf_tiled(
    stack: np.ndarray,
    tile_size: tuple = (3000, 3000),
    overlap: int = 50,
    **edf_kwargs,
) -> np.ndarray:
    """
    Process large stacks in tiles to manage memory.

    Parameters
    ----------
    stack : np.ndarray
        3D image stack (z, y, x)
    tile_size : tuple
        (height, width) of each tile
    overlap : int
        Pixel overlap between tiles for seamless blending
    **edf_kwargs
        Arguments passed to extended_depth_of_focus_variance

    Returns
    -------
    np.ndarray
        2D EDF result
    """
    n_slices, height, width = stack.shape
    tile_h, tile_w = tile_size

    # Accumulate in float32 so weighted sums of integer stacks are exact
    result = np.zeros((height, width), dtype=np.float32)
    weight = np.zeros((height, width), dtype=np.float32)

    # Create blending weights (linear ramps across the overlap region)
    blend_y = np.ones(tile_h)
    blend_x = np.ones(tile_w)
    if overlap > 0:
        ramp = np.linspace(0, 1, overlap)
        blend_y[:overlap] = ramp
        blend_y[-overlap:] = ramp[::-1]
        blend_x[:overlap] = ramp
        blend_x[-overlap:] = ramp[::-1]
    blend_weights = np.outer(blend_y, blend_x)

    # Process tiles
    for y_start in range(0, height, tile_h - overlap):
        for x_start in range(0, width, tile_w - overlap):
            y_end = min(y_start + tile_h, height)
            x_end = min(x_start + tile_w, width)
            tile = stack[:, y_start:y_end, x_start:x_end]
            tile_result = extended_depth_of_focus_variance(tile, **edf_kwargs)

            # Apply blending weights (cropped for border tiles)
            tile_blend = blend_weights[:y_end - y_start, :x_end - x_start]
            result[y_start:y_end, x_start:x_end] += tile_result * tile_blend
            weight[y_start:y_end, x_start:x_end] += tile_blend

    # Normalize by accumulated weights and restore the input dtype
    result = (result / np.maximum(weight, 1e-10)).astype(stack.dtype)
    return result
```
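The tiling stride above (tile size minus overlap, with the last tile clipped at the border) can be isolated into a small helper to check coverage. `tile_windows` is a hypothetical name for illustration, not part of the module:

```python
def tile_windows(length: int, tile: int, overlap: int):
    """Yield (start, end) index pairs covering [0, length) with the
    given tile size and overlap, matching the tiled loop's stride."""
    step = tile - overlap
    for start in range(0, length, step):
        yield start, min(start + tile, length)

# Example: a 9962-pixel axis with 3000-pixel tiles and 50-pixel overlap
wins = list(tile_windows(9962, 3000, 50))
# Every pixel is covered, and consecutive tiles share `overlap` pixels
```

Because the overlap is constant, the blend ramps from adjacent tiles sum to one across each seam, which is what makes the final weight normalization well conditioned.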
4. Comparison: PyImageJ/CLIJ2 vs Pure Python
4.1 Performance Comparison
| Aspect | PyImageJ/CLIJ2 | Pure Python (NumPy) | Pure Python (CuPy) |
|---|---|---|---|
| Speed (GPU) | ~42s | N/A | ~45-60s estimated |
| Speed (CPU) | ~2 min | ~3-5 min | N/A |
| Memory | High (JVM + GPU) | Moderate | Moderate (GPU) |
| Setup complexity | High | Low | Medium |
| Dependencies | Java, Fiji, Maven | NumPy, SciPy | NumPy, CuPy, CUDA |
4.2 Pros and Cons
PyImageJ/CLIJ2
Pros:

- Proven, battle-tested CLIJ2 implementation
- Excellent GPU utilization via OpenCL
- Access to ImageJ's extensive plugin ecosystem
- Already integrated and working

Cons:

- Complex dependency chain (Java, Fiji, Maven, OpenCL)
- Long initialization time (30s - 3min on first run)
- JVM memory overhead
- Harder to debug and modify
- Platform-specific setup issues
Pure Python
Pros:

- Simple installation (`pip install numpy scipy`)
- Easy to understand, modify, and debug
- No JVM overhead
- Better integration with Python ML/analysis pipelines
- Portable across all platforms

Cons:

- CPU implementation ~2-3x slower
- Need to implement and validate the algorithm
- No access to ImageJ plugins
5. Recommendations
5.1 Hybrid Architecture (Recommended)
```
┌─────────────────────────────────────────────────────────┐
│                    EDF Processing                       │
├─────────────────────────────────────────────────────────┤
│                                                         │
│  ┌─────────────┐         ┌─────────────────────────┐    │
│  │   Input:    │         │  Backend Selection:     │    │
│  │  3D Stack   │────────▶│  - GPU + CLIJ2 avail?   │    │
│  │  (Z,Y,X)    │         │  - CuPy available?      │    │
│  └─────────────┘         │  - CPU fallback         │    │
│                          └───────────┬─────────────┘    │
│                                      │                  │
│             ┌────────────────────────┼────────────┐     │
│             ▼                        ▼            ▼     │
│  ┌──────────────────┐  ┌─────────────────┐  ┌────────┐  │
│  │  PyImageJ/CLIJ2  │  │   CuPy (GPU)    │  │ NumPy  │  │
│  │  (Production)    │  │  (Alternative)  │  │ (CPU)  │  │
│  └────────┬─────────┘  └───────┬─────────┘  └───┬────┘  │
│           │                    │                │       │
│           └────────────────────┼────────────────┘       │
│                                ▼                        │
│                        ┌─────────────┐                  │
│                        │   Output:   │                  │
│                        │  2D Image   │                  │
│                        └─────────────┘                  │
└─────────────────────────────────────────────────────────┘
```
5.2 Implementation Plan
Phase 1: Create Pure Python EDF Module
- Implement `extended_depth_of_focus_variance()` function
- Add tiled processing support
- Add optional CuPy GPU acceleration
- Validate output against CLIJ2 results
Phase 2: Create Unified Interface
```python
# src/kintsugi/edf.py
class EDFProcessor:
    def __init__(self, backend='auto'):
        """
        backend: 'clij2', 'cupy', 'numpy', or 'auto'
        """
        self.backend = self._detect_backend(backend)

    def process(self, stack, radius_x=5, radius_y=5, sigma=20.0,
                z_start=None, z_end=None, tiles=(3, 3)):
        """Unified EDF processing interface."""
        pass
```
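The `'auto'` mode could be resolved by a cascade of import probes. A minimal sketch of what `_detect_backend` might do; the probe order, the use of a plain import check, and the standalone function name are assumptions, not the shipped implementation:

```python
def detect_backend(requested: str = 'auto') -> str:
    """Resolve 'auto' to the best available backend by probing imports:
    CLIJ2 via PyImageJ first, then CuPy, then the NumPy fallback."""
    if requested != 'auto':
        return requested  # caller explicitly chose a backend
    for name, module in (('clij2', 'imagej'), ('cupy', 'cupy')):
        try:
            __import__(module)  # probe: is the backend importable?
            return name
        except ImportError:
            pass
    return 'numpy'  # always-available CPU fallback
```

A production version would likely also verify that a GPU device is actually present (an importable CuPy does not guarantee usable CUDA hardware).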
Phase 3: Integration
- Update notebooks to use the unified interface
- Add backend selection in batch processing
- Update documentation
5.3 Use Case Recommendations
| Use Case | Recommended Backend | Rationale |
|---|---|---|
| Notebook testing | Pure Python (NumPy) | Fast iteration, no setup needed |
| Small batches (<10 images) | Pure Python (CuPy if available) | Avoid JVM startup overhead |
| Large batches (>10 images) | PyImageJ/CLIJ2 | Best sustained throughput |
| CI/CD testing | Pure Python (NumPy) | No Java dependencies |
| Cloud/HPC | Pure Python (CuPy) | Better CUDA integration |
| Local workstation | PyImageJ/CLIJ2 | Leverage existing setup |
6. Validation Strategy
To ensure the pure Python implementation matches CLIJ2:
```python
import numpy as np
from skimage.metrics import structural_similarity


def validate_edf_implementations():
    """Compare Python EDF output to the CLIJ2 reference."""
    # Load test stack
    test_stack = load_test_data()

    # Run CLIJ2 (reference)
    clij2_result = run_clij2_edf(test_stack)

    # Run Python implementation
    python_result = extended_depth_of_focus_variance(test_stack)

    # Compare (cast to float so the difference cannot wrap for
    # unsigned integer images)
    diff = clij2_result.astype(np.float64) - python_result
    mse = np.mean(diff ** 2)
    ssim = structural_similarity(clij2_result, python_result)
    assert mse < 1e-3, f"MSE too high: {mse}"
    assert ssim > 0.99, f"SSIM too low: {ssim}"
```
7. Conclusion
A pure Python implementation of the EDF variance projection algorithm is feasible and recommended as a complement to the existing PyImageJ/CLIJ2 workflow. The implementation should:
1. Match CLIJ2's algorithm exactly for consistency
2. Support GPU acceleration via CuPy for competitive performance
3. Provide a unified interface that auto-selects the best backend
4. Be validated against CLIJ2 output to ensure correctness
This approach provides flexibility for different deployment scenarios while maintaining the proven performance of CLIJ2 for production batch processing.
Appendix: Quick Reference
Current EDF Call in Notebooks
```python
# From 1_Single_Channel_Eval.ipynb, cell 83
macro = """
run("CLIJ2 Macro Extensions", "cl_device=[" + device + "]");
Ext.CLIJ2_extendedDepthOfFocusVarianceProjection(input, output, radius_x, radius_y, sigma);
"""
ij.py.run_macro(macro, args)
```
Proposed Python Equivalent
```python
# Proposed usage
from kintsugi.edf import EDFProcessor

processor = EDFProcessor(backend='auto')  # or 'clij2', 'cupy', 'numpy'
result = processor.process(
    stack,
    radius_x=5,
    radius_y=5,
    sigma=20.0,
    z_start=3,
    z_end=15,
    tiles=(3, 3),
)
```