Variational inference (VI) is a powerful method for principled posterior inference in scientific inverse imaging. VI learns the posterior distribution, often with a flow-based network, which can cheaply generate posterior samples once optimized and can flexibly incorporate score-based or classical priors. However, its application to large-scale image reconstruction is severely hindered by the poor scalability of flow-based networks. In this work, we introduce ShuffleFlow, a scalable VI framework that addresses this challenge. Our method breaks the problem into three parts: a pixel-unshuffling-based image coordinate sampler, a neural field as a feature encoder, and a conditional normalizing flow (CNF) as a posterior estimator. Specifically, our framework partitions an image into a stack of sub-images with pixel-unshuffling and uses a shared CNF to model the joint distribution of the sub-image stack. We condition the CNF on the output of a neural field, which embeds feature vectors corresponding to the pixel-unshuffling sample locations to capture spatial structure, and we share the flow's latent variable across channels to model their correlations. We demonstrate our method's effectiveness and efficiency on both linear and nonlinear imaging inverse problems, and show that it generates a high-sample-count posterior more rapidly than diffusion samplers.
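To make the partitioning step concrete, the following is a minimal NumPy sketch of pixel-unshuffling for a single-channel image. It illustrates only the generic operation named in the abstract, not the ShuffleFlow implementation: the function names `pixel_unshuffle` and `pixel_shuffle`, the downsampling factor `r`, and the ordering of sub-images in the stack are assumptions made for this example, and the neural field and CNF are not shown.

```python
import numpy as np


def pixel_unshuffle(image, r):
    """Split an (H, W) image into r*r sub-images of shape (H//r, W//r).

    Sub-image k = i*r + j holds the pixels image[i::r, j::r], so each
    sub-image is a strided, low-resolution view of the full image.
    The ordering of the stack is an assumption of this sketch.
    """
    h, w = image.shape
    if h % r or w % r:
        raise ValueError("image height and width must be divisible by r")
    # Axes after reshape: (row block, row offset, column block, column offset)
    blocks = image.reshape(h // r, r, w // r, r)
    # Move the two offset axes to the front, then merge them into one stack axis
    return blocks.transpose(1, 3, 0, 2).reshape(r * r, h // r, w // r)


def pixel_shuffle(stack, r):
    """Inverse of pixel_unshuffle: reassemble the (H, W) image from the stack."""
    _, hs, ws = stack.shape
    blocks = stack.reshape(r, r, hs, ws).transpose(2, 0, 3, 1)
    return blocks.reshape(hs * r, ws * r)


image = np.arange(16).reshape(4, 4)
stack = pixel_unshuffle(image, 2)  # four 2x2 sub-images
restored = pixel_shuffle(stack, 2)  # lossless round trip
```

Because the operation is an invertible rearrangement of pixels, no information is lost: the stack can be reassembled into the original image, which is what allows a model over the sub-image stack to describe the full-resolution image.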
@inproceedings{li2026shuffleflow,
title={ShuffleFlow: Scalable Posterior Inference for Bayesian Inverse Imaging},
author={Li, Tianao and Starkenburg, Tjitske and Sun, Yu and Alexander, Emma},
booktitle={IEEE International Conference on Computational Photography (ICCP)},
year={2026},
organization={IEEE}
}
We gratefully acknowledge the support of the NSF-Simons AI Institute for the Sky (SkAI) via grants NSF AST-2421845 and Simons Foundation MPS-AI-00010513. This material is based upon work supported by the U.S. National Science Foundation under Award No. 2542022. The authors would like to thank Bryan Pardo, He Sun, and Yi-Chun Hung for helpful discussions.