Measuring localization confidence for quantifying accuracy and heterogeneity in single-molecule super-resolution microscopy

Hesam Mazidi^a, Tianben Ding^a, Arye Nehorai^a, and Matthew D. Lew^a,1

^a Department of Electrical and Systems Engineering, Washington University in St. Louis, MO 63130

Single-molecule localization microscopy (SMLM) measures the positions of individual blinking molecules to reconstruct images of biological and abiological structures with nanoscale resolution. The attainable resolution and accuracy of various SMLM methods are routinely benchmarked using simulated data, calibration “rulers”, or secondary imaging modalities. However, these methods cannot quantify the nanoscale imaging accuracy of any particular SMLM dataset without ground-truth knowledge of the sample. Here, we show that by measuring estimation stability under a well-chosen perturbation and with accurate knowledge of the imaging system, we can robustly quantify the confidence of every individual localization within an experimental SMLM dataset, without ground-truth knowledge of the sample. We demonstrate our broadly-applicable method, termed Wasserstein-induced flux (WIF), in measuring the accuracy of various reconstruction algorithms directly on experimental data of microtubules and amyloid fibrils. We further show that the estimated confidences or WIFs can be used to evaluate the experimental mismatch of computational imaging models, enhance the accuracy and resolution of reconstructed structures, and discover sample heterogeneity due to hidden molecular parameters.

localization accuracy, statistical confidence, localization software, model mismatch, Wasserstein distance

Single-molecule localization microscopy (SMLM) has become an important tool for resolving nanoscale structures and answering fundamental questions in biology (1–5) and materials science (6, 7). SMLM uses repeated localizations of blinking fluorescent molecules to reconstruct high-resolution images of a target structure. In this way, quasi-static features of the sample are estimated from noisy individual images captured from a fluorescence microscope. These quantities, such as fluorophore positions (i.e., a map of fluorophore density), “on” times, emission wavelengths, and orientations, influence the random blinking events that are captured within an SMLM dataset. By using a mathematical model of the microscope, SMLM reconstruction algorithms seek to estimate the most likely set of fluorophore positions and brightnesses (i.e., a super-resolution image) that is consistent with the observed noisy images.

A key question left unresolved by existing SMLM methodologies is: How well do the SMLM data support an algorithm’s statistical estimates comprising a super-resolved image, i.e., what is our statistical confidence in each measurement? Intuitively, one’s interpretation of an SMLM reconstruction could dramatically change knowing how trustworthy each localization is.

Existing metrics for assessing SMLM image quality can be categorized broadly into two classes: those that require knowledge of the ground-truth positions of fluorophores (8, 9), and those that operate directly on SMLM reconstructions alone, possibly incorporating information from other measurements (e.g., diffraction-limited imaging) (10, 11). One popular approach is the Jaccard index (JAC) (8, 12), which measures localization accuracy, but has limited applicability for SMLM experiments as it requires exact knowledge of ground-truth molecule positions. Therefore, data-driven methods have been proposed to quantify the reliability of a localization without knowing the ground truth (13). A drawback of these methods, however, is their reliance on a user to identify accurate localizations versus inaccurate ones, which suffers from low throughput and poor accuracy in low signal-to-noise-ratio (SNR) datasets.

Methods that quantify performance by analyzing SMLM reconstructions exploit some aspect of prior knowledge of the target structure or SMLM data. Calculating a Fourier ring correlation (FRC) utilizes correlations within SMLM datasets to measure image resolution with the expectation that SMLM reconstructions should be stable upon random partitioning (10). However, the FRC cannot detect localization biases that result in systematic distortions in the SMLM reconstruction. In contrast, other methods quantify errors between a pixelated SMLM image and a reference image, which is taken as a ground truth (11). While these methods are able to provide summary or aggregate measures of performance, none of them directly measures the accuracy of individual localizations. Such knowledge is critical for harnessing fully the power of SMLM for scientific discovery.

Here, we leverage two fundamental insights of the SMLM measurement process: 1) we possess highly accurate mathematical models of the imaging system, and 2) we know the precise statistics of noise within each image. This knowledge, when combined with an analysis algorithm, enables us to assess quantitatively the confidence of each individual localization within an SMLM dataset without knowledge of the ground truth. With these confidences in hand, the experimenter may filter unreliable localizations from SMLM images without removing accurate ones necessary to resolve fine features. These confidences may also be used to detect mismatches in the mathematical imaging model that create image artifacts (14), such as misfocusing of the microscope, dipole-induced localization errors (15–17), and the presence of optical aberrations (18–20).

In the present paper, we describe the principles and operation of our method, Wasserstein-induced flux (WIF), whose underlying algorithm is built on the theory of optimal transport (21). Given a certain mathematical imaging model, WIF reliably quantifies the confidence of individual localizations within an SMLM reconstruction. We show that these confidences yield a consistent measure of localization accuracy under various imaging conditions, such as changing molecular density and optical aberrations, without knowing the ground-truth positions a priori. We demonstrate that our WIF confidence map outperforms other image-based methods in detecting artifacts in high-density SMLM while revealing detailed and accurate features of the target structure. We then quantify the accuracy of various algorithms on real SMLM images of microtubules. Finally, we demonstrate the benefits of localization confidences to improve reconstruction accuracy and image resolution in super-resolution Transient Amyloid Binding imaging (22) under low SNR. Notably, WIF reveals heterogeneities in the interaction of Nile red molecules with amyloid fibrils (23).

1 To whom correspondence should be addressed. E-mail: [email protected]

Mazidi et al. July 31, 2019, 1–10

bioRxiv preprint doi: https://doi.org/10.1101/721837. The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license.

Problem statement

In SMLM we may model a variety of physical influences on stochastic fluorescence emission using a hidden variable $\beta$ (Fig. 1A). For example, $\beta$ can encode where molecules activate, how densely they are activated, or how freely they rotate. For each frame, we may represent a set of $N$ activated molecules as $\mathcal{M} = \sum_{i=1}^{N} s_i \delta(\eta - \eta_i)$, where $s_i > 0$ and $\eta_i \in \mathbb{R}^d$ represent the brightness and related physical parameters (i.e., a $d$-dimensional object space comprising position, orientation, etc.) of the $i$th molecule, respectively. In general, $N$, $s_i$, and $\eta_i$ are random variables whose probability distributions depend on $\beta$. We assume that the measured images of molecular blinks $g \in \mathbb{R}^m$ (i.e., $m$ pixels of photon counts captured by a camera) are generated according to a statistical model with the negative log-likelihood $\mathcal{L}(q, \mathcal{M}; g)$ (see Methods and Fig. 1A). Here, $q$ is the point spread function (PSF, or the image of a single molecule) of the microscope that can depend on $\mathcal{M}$. In typical SMLM, an algorithm $\mathcal{A}$ equipped with a PSF model, $q$, is used to estimate molecular positions. Let us denote the output of such a localization algorithm by $\hat{\mathcal{M}} = \sum_{i=1}^{\hat{N}} \hat{s}_i \delta(\eta - \hat{\eta}_i)$, where $\hat{\eta}_i = \hat{r}_i$ represents the estimated positions. Generally, $\beta$ affects the accuracy with which an algorithm localizes molecules, and uncertainty in $\beta$ can cause degraded image resolution or even bias in estimating $\mathcal{M}$. This uncertainty may arise from miscalibration of the PSF model due to optical aberrations as well as neglecting the full molecular parameters $\eta_i$ that affect the PSF $q$, e.g., the dipole emission pattern of fluorescent molecules (16, 17, 24). A more subtle uncertainty may arise for difficult measurements even with a well-calibrated PSF, e.g., overcounting or undercounting molecules due to image overlap (8, 12).
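To make the role of the negative log-likelihood $\mathcal{L}(q, \mathcal{M}; g)$ concrete, the following minimal sketch evaluates it for a toy one-dimensional frame. It rests on assumptions that are not taken from the paper's Methods: Poisson photon statistics (the text later calls $\mathcal{L}$ the negative Poisson log-likelihood), a Gaussian PSF sampled at pixel centers, a brightness parameter that scales the PSF peak, and the hypothetical function names, PSF width, and background level below. Only the 58.5 nm pixel size is borrowed from the Fig. 1 caption.

```python
import math

def gaussian_psf(pixel_nm, center_nm, sigma_nm):
    """Hypothetical 1D Gaussian PSF, sampled at a pixel center (peak value 1)."""
    return math.exp(-0.5 * ((pixel_nm - center_nm) / sigma_nm) ** 2)

def expected_image(pixels_nm, molecules, sigma_nm, background):
    """Mean photon count per pixel for molecules given as (brightness, position) pairs."""
    return [
        background + sum(s * gaussian_psf(x, r, sigma_nm) for s, r in molecules)
        for x in pixels_nm
    ]

def poisson_nll(g, mu):
    """Negative Poisson log-likelihood, omitting the log(g!) term that is constant in mu."""
    return sum(m - k * math.log(m) for k, m in zip(g, mu))

# Toy frame: 21 pixels of 58.5 nm, one molecule at the origin (all values illustrative).
pixels = [58.5 * j for j in range(-10, 11)]
truth = [(1000.0, 0.0)]
g = expected_image(pixels, truth, sigma_nm=100.0, background=5.0)  # noise-free "measurement"

nll_true = poisson_nll(g, expected_image(pixels, truth, 100.0, 5.0))
nll_shifted = poisson_nll(g, expected_image(pixels, [(1000.0, 35.0)], 100.0, 5.0))
print(nll_true < nll_shifted)
```

With noise-free data the likelihood is minimized by the true parameters, so a candidate shifted by 35 nm scores worse. With shot noise added, the minimizer would scatter around the truth instead of coinciding with it.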

Fig. 1. Quantifying confidence in single-molecule localization microscopy (SMLM). (A) Image formation and localization. Here, $\beta$ is a hidden variable that describes parameters that affect molecular fluorescence, including blinking rates, molecular density, etc. For each frame, activated molecules are represented by $\mathcal{M}$ in which $N$, $s_i$, and $\eta_i$ denote number of molecules, photons emitted, and related physical parameters of the $i$th molecule, respectively. $q$ denotes the PSF of the imaging system that can vary with $\mathcal{M}$. $g \in \mathbb{R}^m$ represents the vectorized image quantifying the number of photons detected consisting of $m$ pixels. Localization refers to estimating $\mathcal{M}$ from $g$ via an algorithm $\mathcal{A}$ that uses a PSF model $q$. (B) Proposed confidence quantification framework. $\mathcal{P}$ is a perturbation operator that applies a small distortion to $\hat{\mathcal{M}}$. The perturbed molecules $\mathcal{M}_0$ and the measurements $g$ are then analyzed via a confidence analysis algorithm $\mathcal{C}$ that uses its own PSF model $q_c$. The estimated confidences are represented by $c = (c_1, c_2, \ldots, c_N)$ taking values between 1 (highest confidence) and −1 (lowest confidence). (C) Example of localizing and quantifying confidence using 100 simulated images of an isotropic SM analyzed by ThunderSTORM (TS). Scatter plot: localizations (black dots) and the true positions of molecules (red triangles). Black histogram: fitted widths of the PSF ($\sigma$) estimated by TS. Magenta histogram: estimated confidences using the proposed method. (D) Similar to (C) but for two closely-spaced molecules. (E) Similar to (C) but for a focused, dipole-like molecule. (F) Similar to (C) but for a dim isotropic molecule. Colorbars: photons per 58.5 × 58.5 nm². Scalebars: (C-F) left: 500 nm, right: 50 nm. Histogram axes in (C-F): $\sigma$ (nm) and confidence (horizontal), count (vertical).

As $\beta$ is often hidden in an SMLM experiment, we must estimate the degree of uncertainty or confidence of each localization in truly representing $\mathcal{M}$ (Fig. 1B). For 2D SMLM, the fitted width $\sigma$ of the standard PSF is commonly used; if $\sigma$ is significantly smaller or larger than the expected width of $q$, then the corresponding localization has low confidence. However, such a strategy fails when a localization is not a single molecule (SM), but in fact two or more closely-spaced ones. As an illustrative example, we consider two scenarios: an isotropic molecule located at $(x = 0, y = 0, z = 0)$ (Fig. 1C) and two close molecules located at $(0, 0, 0)$ and $(0, 70\ \text{nm}, 0)$ (Fig. 1D). We use ThunderSTORM (TS (25)) to localize the molecules, which also provides fitted widths $\sigma$. Due to significant image overlap, TS almost always localizes one molecule for both scenarios, such that in the latter, the estimated positions exhibit a significant deviation from the true ones (Fig. 1C,D). However, the distributions of $\sigma$ in both cases are virtually identical, suggesting that $\sigma$ is a poor metric for quantifying confidence and detecting localization errors due to overlapping molecules (Fig. 1C,D).
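A back-of-the-envelope calculation suggests why the fitted width barely changes in the two-molecule scenario. The sketch below is a moment-based approximation, not the ThunderSTORM fitting procedure: it treats each PSF as a Gaussian of standard deviation `sigma0_nm` and uses the fact that an equal-weight mixture of two such Gaussians separated by a distance $d$ has standard deviation $\sqrt{\sigma_0^2 + (d/2)^2}$ along the separation axis. The 100 nm single-molecule width is a hypothetical value chosen for illustration, and the function name is an assumption.

```python
import math

def merged_width(sigma0_nm, separation_nm):
    """Std. dev. (along the separation axis) of two equal Gaussians merged into one blob."""
    return math.sqrt(sigma0_nm ** 2 + (separation_nm / 2.0) ** 2)

sigma0 = 100.0                       # hypothetical single-molecule PSF width (nm)
width = merged_width(sigma0, 70.0)   # two molecules 70 nm apart, as in Fig. 1D
relative_increase = width / sigma0 - 1.0
print(f"merged width: {width:.1f} nm, relative increase: {relative_increase:.1%}")
```

Under these assumptions the merged spot is only a few percent wider than a single molecule, a change that shot noise in the fitted widths can easily mask.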

More fundamentally, mismatches in SMLM between model and measurement generally depend on $\beta$ in a way that cannot be quantified via simple image-based features such as PSF width. We illustrate this situation by localizing a rotationally fixed molecule located at $(0, 0, 200\ \text{nm})$. The anisotropic emission pattern induces a significant bias in TS localizations (Fig. 1E). The distribution of fitted widths is noisy due to photon-shot noise and broadening of the PSF (Fig. 1E). Unfortunately, this rather wide distribution is comparable to that of a dim, isotropic molecule whose localizations have no systematic bias (Fig. 1F). These observations suggest that quantifying subtle model mismatches in SMLM and, thus, localization confidence, requires a new mathematical “metric.”

Proposed methodology

Our goal is to quantify the confidence $c_i$ of each localization in $\hat{\mathcal{M}}$ produced by a given algorithm, given the measurements $g$ (Fig. 1B). In its simplest form, we can formulate the localization task as minimizing the negative log-likelihood $\mathcal{L}$ of observing an unknown number of molecules, $N$, each with a photon count $s_i$ and a position $r_i$. If we know $N$, then the localization task reduces to simultaneously fitting $\{s_1, r_1, \ldots, s_N, r_N\}$. The difficulty stems from not knowing $N$ a priori, which renders localization a non-convex optimization problem. Non-convexity makes algorithms susceptible to being trapped in a saddle point of the negative log-likelihood landscape, while correct localizations are associated with its global minima (26). In our example of localizing two closely-spaced molecules (Fig. 1D), almost all position estimates lie near $(0, 35\ \text{nm})$, which exactly matches the centroid of the two true molecules. At the same time, the photon count estimates are twice as large as the ground-truth photon counts, and this point represents a saddle point of the negative log-likelihood (Fig. S1A).

A pivotal observation here is that these saddle points are unstable (in the sense of being a minimizer of the negative log-likelihood) upon a well-chosen perturbation. Put differently, for an accurate localization, the negative log-likelihood surface has a convex curvature as a function of the estimated position of a molecule. Therefore, if we locally perturb the position as well as photon count of a particular estimated molecule, relaxing this perturbation along the likelihood surface will most likely result in a localization very “close” to the unperturbed one. On the other hand, for an unreliable localization, we expect that the negative log-likelihood landscape changes arbitrarily in a local neighborhood (Fig. S1B). As a result, re-localizing most likely will alter the original localization. The stability in the position of a molecule upon a careful perturbation is precisely what we denote as the quantitative confidence of an SMLM localization. Motivated by this observation, we devise a robust method to measure the stability and therefore statistical confidence of each localization within an SMLM dataset.

Localization stability for measuring confidence. Intuitively, stability is a measure of discrepancy between a source point and a perturbed instance of this point after following a certain trajectory. To clarify, consider a strongly convex, differentiable function $f$ over some open set $\Omega \subset \mathbb{R}$ taking its minimum at $\omega^* \in \Omega$. Since we are mostly interested in minimizers of some functional, as they are in a sense the best “fit” to the ground truth, we think of the confidence of a point estimate $\omega$ as a measure of its distance to $\omega^*$. Since $\omega^*$ is unknown, we seek to measure the confidence of $\omega$ without knowing $\omega^*$. To this end, we construct a simple single-step gradient-descent update and find a representation of stability to quantify the said confidence.

Consider the following gradient-descent update with a small step size $\epsilon > 0$:

$$\omega_1 = \omega_0 - \epsilon \nabla f(\omega_0), \qquad \omega_0 = \mathcal{P}(\omega), \tag{1}$$

where $\omega_0$ is a local perturbation of $\omega$ according to the operator $\mathcal{P}(\omega) = \omega + (1 - 2e)\Delta\omega$ with $e \sim \text{Bern}(0.5)$ and perturbation distance $\Delta\omega = |\omega - \omega_0|$. Eq. (1) describes the movement of $\omega_0$ in the gradient vector field, $\nabla f$, transporting $\omega_0$ in the direction of decreasing $f$. If the estimate $\omega$ is stable, we have $|\omega_1 - \omega| < |\omega_0 - \omega|$ as a result of our gradient-descent update, while for an unstable estimate, we can find a perturbation that results in $|\omega_1 - \omega| > |\omega_0 - \omega|$. Since $\omega^*$ is the minimizer of $f$, we have $|\omega_1 - \omega^*| < |\omega_0 - \omega^*|$ for any local perturbation of $\omega^*$. In other words, the gradient vector field pushes the perturbed point $\omega_0$ toward $\omega^*$. This observation tells us that we may quantify the confidence of $\omega$ by measuring the average convergence of $\omega_0$ toward $\omega$. We may define the confidence of a point $\omega$ simply as

$$c = \frac{\mathbb{E}\left[\operatorname{sgn}\left[(\omega - \omega_0) \cdot (\omega_1 - \omega_0)\right] \cdot |\epsilon \nabla f(\omega_0)|\right]}{\mathbb{E}\left[|\epsilon \nabla f(\omega_0)|\right]}, \tag{2}$$

where $\mathbb{E}$ denotes expectation over random perturbations and $\operatorname{sgn}(x)$ takes the sign of a real number $x$. We call $c$ in Eq. (2) the normalized gradient flux, for reasons that become apparent later. A stable point has the maximum inward gradient flux, i.e., $c = 1$, while an unstable point has some degree of outward gradient flux, i.e., $c < 1$. Thus, $c$ represents a confidence score for any point in $\Omega$ without knowing $\omega^*$. As an example, for $f(\omega) = \omega^2$, which implies $\omega^* = 0$, we find $c = \frac{2\Delta\omega}{|\omega - \Delta\omega| + |\omega + \Delta\omega|}$. Thus $\omega = \omega^* = 0$ is the most stable point with the highest confidence, and the further away $\omega$ is from 0, the lower the confidence.
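The normalized gradient flux of Eq. (2) can be checked numerically for the quadratic example. The sketch below is illustrative only: the function names are assumptions, and the expectation over $e \sim \text{Bern}(0.5)$ is evaluated exactly by averaging the two equally likely perturbations rather than by sampling.

```python
import math

def sgn(x):
    """Sign of a real number: -1, 0, or 1."""
    return (x > 0) - (x < 0)

def normalized_gradient_flux(grad_f, omega, delta, eps=1e-3):
    """Eq. (2), with the expectation taken over both outcomes of e ~ Bern(0.5)."""
    numerator = 0.0
    denominator = 0.0
    for e in (0, 1):
        omega0 = omega + (1 - 2 * e) * delta      # perturbed point, P(omega)
        step = -eps * grad_f(omega0)              # single gradient-descent step, Eq. (1)
        omega1 = omega0 + step
        weight = abs(step)                        # |eps * grad f(omega0)|
        numerator += 0.5 * sgn((omega - omega0) * (omega1 - omega0)) * weight
        denominator += 0.5 * weight
    return numerator / denominator

def closed_form(omega, delta):
    """Closed-form confidence quoted in the text for f(w) = w**2."""
    return 2 * delta / (abs(omega - delta) + abs(omega + delta))

def grad_quadratic(w):
    """Gradient of f(w) = w**2."""
    return 2.0 * w

for omega in (0.0, 0.5, 2.0, 4.0):
    c = normalized_gradient_flux(grad_quadratic, omega, delta=1.0)
    print(omega, round(c, 6), round(closed_form(omega, 1.0), 6))
```

For this quadratic, the score equals 1 whenever $|\omega| \le \Delta\omega$ and decays as $\Delta\omega / |\omega|$ beyond that, so the perturbation distance sets the scale at which an estimate starts to be flagged as unreliable.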

We can gain more insight if we consider the recursive variational form of Eq. (1) as

$$\omega_k = \arg\min_{\omega \in \Omega} \left\{ \frac{1}{2}\|\omega - \omega_{k-1}\|_2^2 + \epsilon_k f(\omega) \right\}, \qquad k > 0. \tag{3}$$

Informally, Eq. (3) defines a discrete trajectory $\{\omega_k\}$ by minimizing $f$ while preserving a “local Euclidean distance” constraint. In the limit of $\epsilon_k \to 0$, i.e., considering continuous trajectories, we recover the Cauchy problem, that is, $\frac{d\omega(t)}{dt} = -\nabla f(\omega(t))$, which defines the evolution of $\omega \in \Omega$ from an initial point $\omega_0$. The resulting curve $\{\omega(t)\}_{t \ge 0}$ is called a gradient flow.
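The link between Eq. (1), Eq. (3), and the gradient flow can be illustrated for $f(\omega) = \omega^2$. Setting the derivative of the objective in Eq. (3) to zero gives $(\omega - \omega_{k-1}) + 2\epsilon_k\omega = 0$, i.e., $\omega_k = \omega_{k-1}/(1 + 2\epsilon_k)$, while the continuous flow $d\omega/dt = -2\omega$ has the solution $\omega(t) = \omega(0)e^{-2t}$. The sketch below, with assumed function names and an arbitrary constant step size, compares both discrete schemes with the continuous solution.

```python
import math

def implicit_steps(omega_start, eps, n_steps):
    """Iterate Eq. (3) for f(w) = w**2, whose minimizer is w_prev / (1 + 2 * eps)."""
    omega = omega_start
    for _ in range(n_steps):
        omega = omega / (1.0 + 2.0 * eps)
    return omega

def explicit_steps(omega_start, eps, n_steps):
    """Iterate the explicit update of Eq. (1), w <- w - eps * f'(w), for f(w) = w**2."""
    omega = omega_start
    for _ in range(n_steps):
        omega = omega - eps * 2.0 * omega
    return omega

def gradient_flow(omega_start, t):
    """Exact solution of the Cauchy problem dw/dt = -2w."""
    return omega_start * math.exp(-2.0 * t)

total_time, n_steps = 1.0, 1000
eps = total_time / n_steps
w_implicit = implicit_steps(1.0, eps, n_steps)
w_explicit = explicit_steps(1.0, eps, n_steps)
w_exact = gradient_flow(1.0, total_time)
print(w_explicit, w_exact, w_implicit)
```

Both discrete trajectories bracket the continuous solution and approach it as the step size shrinks, which is the sense in which the limit $\epsilon_k \to 0$ recovers the gradient flow.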

Wasserstein-induced flux. Molecular brightnesses $s_i > 0$ and positions $r_i \in \mathbb{R}^2$ in a single SMLM frame are expressed as $\mathcal{M} = \sum_{i=1}^{N} s_i \delta(r - r_i)$, which is a multi-parameter distribution in the space of non-negative finite measures $M(\mathbb{R}^2)$. To extend our discussion to SMLM, we must define the distance between two candidate “guesses” $\mathcal{S}$ and $\mathcal{Q} \in M(\mathbb{R}^2)$ for molecular parameters. We utilize the elegant theory of optimal transport, where roughly speaking, the optimal transport distance between any two measures is the minimum cost of transporting mass from one to the other as measured via some ground metric (21). The Wasserstein distance is particularly suitable, because its ground metric is simply Euclidean distance. The type-2 Wasserstein distance between two measures $\mathcal{S}, \mathcal{Q} \in M(\mathbb{R}^2)$ is defined as

$$\mathcal{W}_2(\mathcal{S}, \mathcal{Q}) = \min_{\pi \in \Pi(\mathcal{S}, \mathcal{Q})} \sqrt{\int_{\mathbb{R}^2 \times \mathbb{R}^2} \|r - r'\|_2^2 \, d\pi(r, r')}, \tag{4}$$

where $\Pi(\mathcal{S}, \mathcal{Q})$ is the set of all couplings or transportation plans between $\mathcal{S}$ and $\mathcal{Q}$ satisfying a mass-conservation constraint (21). Equipped with Wasserstein distance, let us re-write the recursive dynamics of Eq. (3) as

$$\mathcal{S}_k = \arg\min_{\mathcal{S} \in M(\mathbb{R}^2)} \left\{ \frac{1}{2}\mathcal{W}_2^2(\mathcal{S}, \mathcal{S}_{k-1}) + \epsilon_k \mathcal{L}(\mathcal{S}) \right\}, \qquad k > 0, \tag{5}$$

where $\mathcal{L} : M(\mathbb{R}^2) \to \mathbb{R}$ is the negative Poisson log-likelihood, which is a convex functional.
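For discrete measures, the distance in Eq. (4) can be computed directly in small cases. The sketch below is a simplified illustration, not the algorithm used to compute WIF: it assumes both measures consist of the same number of point sources with one common mass, in which case an optimal transportation plan is a one-to-one matching and can be found by brute force over permutations. The function name and the choice to represent a double-brightness localization as two unit masses at the same position are assumptions made for the example.

```python
import itertools
import math

def wasserstein2_equal_mass(points_s, points_q, mass=1.0):
    """Type-2 Wasserstein distance between two sets of equal-mass 2D point sources.

    Brute-force search over matchings; only practical for a handful of points.
    """
    if len(points_s) != len(points_q):
        raise ValueError("both measures must contain the same number of points")
    best_cost = math.inf
    for perm in itertools.permutations(range(len(points_q))):
        cost = sum(
            mass * ((xs - points_q[j][0]) ** 2 + (ys - points_q[j][1]) ** 2)
            for (xs, ys), j in zip(points_s, perm)
        )
        best_cost = min(best_cost, cost)
    return math.sqrt(best_cost)

# Two true molecules 70 nm apart versus one localization at their centroid that
# carries twice the brightness (written as two unit masses at the same point).
truth = [(0.0, 0.0), (0.0, 70.0)]
merged = [(0.0, 35.0), (0.0, 35.0)]
print(wasserstein2_equal_mass(truth, merged))  # positions in nm, unit masses
```

Each unit of mass must travel 35 nm, so the squared cost is $2 \times 35^2$ and the distance is $\sqrt{2450} \approx 49.5$ in these units. With unequal brightnesses the plan may split mass between sources, and a linear-programming solver would be needed instead of a permutation search.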

We recall that our goal is to obtain a useful representation of point stability in the space of measures. Since stability is coupled with evolution of the measures, we analyze the properties of Wasserstein gradient flows, i.e., $\{\mathcal{S}_t\}_{t \ge 0}$. A set of intriguing results from the theory of Wasserstein gradient flow assert that if $\mathcal{S}$ has a smooth density, 1) there exists a unique transport map, $T_k : M(\mathbb{R}^2) \to M(\mathbb{R}^2)$, such that $\mathcal{W}_2^2(\mathcal{S}, \mathcal{S}_{k-1}) = \int_{\Omega} |T_k(r) - r|^2 \, d\mathcal{S}_{k-1}(r)$; that is, the mass-weighted displacement distance for the transport plan is given by the type-2 Wasserstein distance; and 2) the backward velocity field $v(r)$, i.e., the ratio between the displacement $T(r) - r$ and the time step $\epsilon_k$, obtained in the transport of $\mathcal{S}_k$ to $\mathcal{S}_{k-1}$ is given by $\nabla\left(\frac{\delta \mathcal{L}}{\delta \mathcal{S}}(\mathcal{S})\right)(r)$ (in the limit of $\epsilon_k \to 0$) (27). The functional $\frac{\delta \mathcal{L}}{\delta \mathcal{S}}(\mathcal{S})$ is the so-called first variation of $\mathcal{L}$. In fact, the gradient flows satisfy the continuity equation (27):

$$\frac{\partial \mathcal{S}}{\partial t} - \nabla \cdot \left(\mathcal{S} \nabla\left(\frac{\delta \mathcal{L}}{\delta \mathcal{S}}(\mathcal{S})\right)\right) = 0. \tag{6}$$

We now invoke the divergence theorem and define the Wasserstein-induced flux (WIF) corresponding to the perturbation volume around a molecule $V$ as:

$$\text{WIF} \triangleq \int_{V} \left(\nabla \cdot \left(\mathcal{S} \nabla\left(\frac{\delta \mathcal{L}}{\delta \mathcal{S}}(\mathcal{S})\right)\right)\right) dV = \int_{S} \left(\mathcal{S} \nabla\left(\frac{\delta \mathcal{L}}{\delta \mathcal{S}}(\mathcal{S})\right) \cdot n\right) dS, \tag{7}$$

where $S$ and $n$ represent the closed surface on the boundary of $V$ and its normal vector, respectively. We posit that WIF serves as a mathematically-grounded representation for stability that accounts for local interactions of point sources on the likelihood surface. We note source molecules have various brightnesses; photons are the conserved mass under our perturbation. We therefore normalize WIF w.r.t. the flux associated with an isolated source in $V$. Henceforth, we denote WIF as its normalized quantity, which means that it takes on values within $[-1, 1]$ with 1 representing the maximum statistical confidence.

As stated previously, we can write the gradient field $\nabla\left(\frac{\delta \mathcal{L}}{\delta \mathcal{S}}(\mathcal{S})\right)(r) = v(r) \approx [T_1(r) - r]/\epsilon$ in transporting $\mathcal{S}_1$ to $\mathcal{S}_0$. This equivalence effectively gives us a strategy to approximate WIF by finding an estimate of $T_1$, that is, the transport map. Unfortunately, it is computationally expensive to solve for the infinite-dimensional measure $\mathcal{S}_1$. In addition, molecules are in actuality point sources, which means that our object space $M(\mathbb{R}^2)$ consists of discrete measures and not smooth densities. Even though the uniqueness condition of the transport map requires measures with smooth densities, we show that even with these approximations, our WIF dynamics mirror those predicted by Eq. (6). We designed an efficient, iterative algorithm to approximately compute $T_1$, which ultimately allows us to compute WIF (SI section 2, Fig. S2) using 1) raw SMLM images of blinking molecules and 2) a computational model of the imaging system (Fig. 1B).

Returning to our earlier examples (Fig. 1C-F), we see that when estimated localizations are close to the ground-truth positions, their estimated confidences or WIFs are concentrated close to 1 (Fig. 1C,F). On the other hand, for inaccurate estimates, localization confidences become significantly smaller, implying their unreliability (Fig. 1D,E). Note that knowledge of the ground truth molecule location is not needed to compute these confidence values.

Results

Localization confidence of an isolated molecule. To test our confidence metric, we analyze images of fluorescent molecules, generated using a vectorial image formation model (24), having various hidden physical parameters such as defocus and rotational mobility. These analyses characterize not only how well WIF measures mismatches introduced by these parameters, but also its limitations due to statistical shot noise, especially for low photon counts. As a baseline, we fix the PSF model in our confidence analysis to that of an isotropic molecule with zero defocus. To determine our confidence metric’s robustness to shot noise, we use RoSE to localize an isotropic emitter from 200 noisy, independent realizations of its image for a wide range of detected photons. Computing WIF for these localizations, we observe that the confidences are mostly close to 1 for all photon counts, taking values in $[0.95, 1]$ (Fig. S3). There is a slight reduction in estimated confidences for large photon counts, most likely due to the first-order approximation in our PSF model (SI section 2B).

Next, we quantify how hidden variables that are not accounted for within the model affect the confidences. For a dim molecule (800 photons and 20 background photons/pixel) at modest defocus values (z ∈ [0, 200 nm]), we observe that the confidences mostly remain above 0.9 (Fig. S4A,B). As defocus increases beyond 200 nm, approximately 50% of localizations exhibit confidence lower than 0.9. In particular, for z = 300 nm, the median confidence decreases to 0.62, a reduction of approximately 40% from z = 0 (Fig. S4A). Our confidence metric is markedly more sensitive to defocus than estimates of normalized PSF width (w.r.t. the PSF width at focus), which fluctuate mostly within 10% of their nominal values. For z = 300 nm, the median width decreases, somewhat counter-intuitively, by 13% from its nominal value (Fig. S4A), most likely because of the low SNR.

To explore how shot noise affects WIF and width estimates, we consider a bright molecule (2000 photons) (Fig. S4D). Interestingly, as soon as the defocus increases beyond 140 nm, the confidences sharply drop below 0.9 such that at z = 200 nm the median confidence approaches 0.3. In contrast, normalized width estimates remain mostly within 5% of their nominal values with their medians consistently close to 1 (Fig. S4C). Therefore, WIF detects even subtle defocus-induced model mismatches for brighter molecules with sufficient SNRs.

Lastly, we study how well WIF can quantify dipole-induced imaging errors, further exacerbated by defocus. We consider a molecule inclined at 45° with respect to the optical axis and with various degrees of rotational motion: effectively unconstrained or isotropic (uniform rotation within a cone of half angle α = 90°), moderate confinement (α = 30°), and strong constraint (α = 15°) (Fig. S5C). For a photon count of 1000, notably, we observe consistent decreases in median confidences (below 0.85) for both α = 30° and α = 15° across all z, while for the isotropic molecule, the median confidence drops below 0.9 only for z greater than 160 nm. In addition, confidences for α = 15° are smaller than those of α = 30°, which shows our confidence metric’s consistency, trending smaller as the degree of mismatch increases (Fig. S5A). On the other hand, normalized width estimates are practically indistinguishable for all α and z values (Fig. S5B).

We next consider a brighter molecule (2000 photons) (Fig. S6C), and observe that confidences for both α = 30° and α = 15° significantly decrease below 0.5 for almost all z positions (Fig. S6A). Surprisingly, the normalized width estimates for α = 30° and α = 15° converge to their nominal (in focus) value as z approaches 200 nm (Fig. S6B). Therefore, confounding molecular parameters (e.g., a defocused dipole-like emitter) may cause estimates of PSF width to appear unbiased, while our WIF metric consistently detects these image distortions and yields small confidence values, resulting in a quantitative, interpretable measure of image trustworthiness. Collectively, these analyses demonstrate that WIF provides a consistent and reliable measure of localization confidence for various forms of experimental mismatches.

Quantifying localization accuracy without ground truth. Assume we are given a series of camera frames of blinking SMs and the corresponding set of localizations returned by an arbitrary algorithm. Our main objective is to assess the trustworthiness of each localization and to quantify the aggregate accuracy of the said algorithm. Obviously, the difficulty is that, in practice, we cannot access an oracle that knows the ground-truth positions of the SMs for comparison.

We propose average confidence WIFavg as a novel metric for quantifying the collective accuracy of these localizations (with confidences $c_1, \ldots, c_N$):

$$\mathrm{WIF}_{\mathrm{avg}} \triangleq \frac{1}{N} \sum_{i=1}^{N} c_i. \tag{8}$$

We can gain insight into Eq. (8) by examining its correspondence to the well-known Jaccard index (JAC), which uses an oracle to determine the credibility of a localization based on its distance to the ground-truth SM. In particular, we may define JAC = TP/(TP + FN + FP), where TP, FN, and FP denote the number of true positives, false negatives, and false positives, respectively. An undetected molecule, that is, a false negative, would increase the denominator of JAC, thereby reducing its value. We posit that this same undetected molecule adversely affects the confidence of a nearby localized molecule, thereby reducing WIFavg. This intuitive connection between JAC and WIFavg suggests that the average confidence may serve as a good surrogate for localization accuracy.
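As a minimal sketch of the two quantities compared here, the snippet below computes the average confidence of Eq. (8) and the Jaccard index from counts of true positives, false negatives, and false positives. The function names `wif_avg` and `jaccard` and the toy numbers are illustrative assumptions, not part of the authors' software.

```python
def wif_avg(confidences):
    """Average confidence, Eq. (8): (1/N) * sum of c_i."""
    if not confidences:
        raise ValueError("need at least one localization")
    return sum(confidences) / len(confidences)


def jaccard(tp, fn, fp):
    """Jaccard index JAC = TP / (TP + FN + FP); 0.0 if all counts are zero."""
    total = tp + fn + fp
    return tp / total if total else 0.0


# Hypothetical confidences for four localizations, each in [-1, 1].
example_confidences = [0.9, 0.8, -0.2, 0.5]
print(round(wif_avg(example_confidences), 3))

# Hypothetical oracle counts: 6 matched, 2 missed, 2 spurious localizations.
print(jaccard(tp=6, fn=2, fp=2))
```

Note that `jaccard` needs oracle counts, whereas `wif_avg` needs only the per-localization confidences.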

Using WIFavg, we quantify the performance of two algorithms, RoSE (28) and ThunderSTORM (TS) (25), for localizing emitters at various blinking densities (defined as number of molecules per µm², see Methods) (Fig. 2A,B). Examining the localizations returned by the algorithms, we calculate the Jaccard index using ground-truth information from the oracle and WIFavg using only the simulated images of SM blinking (see Methods). For both RoSE and TS, we observe excellent agreement between WIFavg and Jaccard index for densities as high as 5 mol./µm². For higher densities, WIFavg monotonically decreases at a rate differing from that of Jaccard index. For instance, at high densities JAC for TS saturates to 0.1, whereas WIFavg further decreases due to high FN and low TP, thereby demonstrating the non-convexity of the negative log-likelihood landscape (Fig. 2C).

Fig. 2. Wasserstein-induced flux (WIFavg) quantifies localization accuracy without ground truth. (A) From left to right: images of molecules for blinking densities of 3, 5, 7, and 9 mol. per µm², respectively. (B) RoSE localizations (colored dots represent calculated confidence) corresponding to images in (A). Open red circles represent ground-truth positions. (C) Jaccard index for RoSE (solid, red) and TS (solid, green) at various blinking densities. The dashed lines represent WIFavg for RoSE (red) and TS (green). For each blinking density, 200 independent realizations were used. (D) Precision for all localizations (solid) and localizations with confidence greater than 0.5 (dotted) using RoSE (red) and TS (green), respectively. (E) Recall for all localizations (solid) and localizations with confidence greater than 0.5 (dotted) using RoSE (red) and TS (green), respectively. Colorbars: (A) photons per 58.5 × 58.5 nm²; (B) confidence. Scalebar: 500 nm.

A natural application of our confidence metric is to remove localizations with poor accuracy. We filter localizations with confidence smaller than 0.5, corresponding to half of the perturbed photons “returning” toward a particular localization, and calculate the resulting precision = TP/(TP + FP) and recall = TP/(TP + FN). If the filtered localizations truly represent false positives, we expect to see an increase in precision and a relatively unchanged recall after filtering. Our results show a precision enhancement as high as 180% for TS and a desirable increase of 23% for RoSE (density = 9 mol./µm²) (Fig. 2D). Remarkably, these improvements come with only a small loss in recall (13% in the worst case) across all densities for both algorithms (Fig. 2E). Overall, these simulation studies show that WIFavg is a reliable means of quantifying localization accuracy without having access to ground-truth molecular parameters.
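The filtering step described above can be sketched as follows, assuming each localization carries a confidence and, for simulated data only, an oracle flag saying whether it matched a ground-truth molecule. The helper names, the toy data, and the choice to keep localizations with confidence exactly equal to the threshold are assumptions made for illustration.

```python
def precision_recall(matched_flags, n_ground_truth):
    """precision = TP/(TP + FP), recall = TP/(TP + FN)."""
    tp = sum(1 for flag in matched_flags if flag)
    fp = len(matched_flags) - tp
    fn = n_ground_truth - tp
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def filter_by_confidence(confidences, matched_flags, threshold=0.5):
    """Drop localizations whose confidence is smaller than the threshold."""
    return [flag for c, flag in zip(confidences, matched_flags) if c >= threshold]


# Toy example: five localizations, four ground-truth molecules.
confidences = [0.95, 0.9, 0.1, 0.7, -0.3]
matched = [True, True, False, True, False]  # oracle labels (simulation only)

print(precision_recall(matched, n_ground_truth=4))
kept = filter_by_confidence(confidences, matched)
print(precision_recall(kept, n_ground_truth=4))
```

In this toy case the two low-confidence localizations are both false positives, so precision rises while recall is unchanged, which is the behavior the text expects when filtered localizations truly are false positives.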


Fig. 3. Confidence map reveals artifacts in recovering a tubulin network from high-density SMLM data. (A) Recovered structure (red) using FALCON overlaid with the ground truth (green). (B) Error map recovered by SQUIRREL (brighter colors correspond to larger errors). (C) Confidence map (brighter colors indicate higher confidence) obtained by averaging localization confidences in each pixel. Colorbars: (A) number of localizations, (B) error, and (C) confidence per 20 × 20 nm². Scalebars: (C) 1 µm, insets: 200 nm.

Quantifying and revealing artifacts in high-density datasets. A key difficulty encountered in high-density (HD) localization, when images of molecules overlap on the camera, is the presence of image artifacts that distort SMLM reconstructions in a structured, or vectorial, manner (28). Constructing an SMLM error map using a reference image has been proposed to certify the reliability of an SMLM reconstruction, but such a map does not quantify the reliability of each individual localization within the image. Here, we illustrate the power of measuring SM confidence in quantifying and revealing artifacts in a challenging HD localization experiment.

We use FALCON (29), an HD localization algorithm, to reconstruct a simulated benchmark SMLM dataset (12) consisting of 360 HD frames of a tubulin network (Fig. 3A). In regions where the tubules coalesce, corresponding to higher blinking densities, we see numerous inaccurate localizations (Fig. 3A, insets). In particular, we see fused and broadened tubules instead of thin and separate structures. A reliable error map should assign low confidence or high error to such regions while discriminating fine but accurate details of the structure. Interestingly, we notice significant differences between an error map (Fig. 3B, obtained via SQUIRREL (11)) and the proposed confidence map (Fig. 3C). First, the error map appears to overestimate errors in regions with accurate localizations, while our confidence map exhibits low confidence for inaccurate localizations and assigns high confidence to neighboring, well-resolved parallel tubules (Fig. 3B,C, top insets). Second, the error map underestimates the error in the regions where tubules are apparently fused, whereas the confidence map assigns an overall low confidence to this region, suggesting potential artifacts (Fig. 3B,C, bottom insets). Overall, our confidence map enables scientists to discriminate specific SM localizations that are trustworthy, while also assigning low confidence values to those that are not, thereby maximizing the utility of SMLM datasets without throwing away useful localizations.
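A confidence map of the kind shown in Fig. 3C averages localization confidences within each pixel. The sketch below illustrates that averaging under simple assumptions: localizations are given as hypothetical `(x_nm, y_nm, confidence)` tuples, and pixels are square bins of side `pixel_nm` (20 nm here, matching the figure's colorbar units). It is not the authors' implementation.

```python
from collections import defaultdict


def confidence_map(localizations, pixel_nm=20.0):
    """Return {(col, row): mean confidence} for every occupied pixel."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for x_nm, y_nm, confidence in localizations:
        # Floor division assigns each localization to its square bin.
        key = (int(x_nm // pixel_nm), int(y_nm // pixel_nm))
        sums[key] += confidence
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}


# Toy data: two confident localizations share a pixel; a third sits alone.
toy_localizations = [(5.0, 5.0, 0.9), (15.0, 10.0, 0.7), (25.0, 5.0, -0.1)]
for pixel, mean_conf in sorted(confidence_map(toy_localizations).items()):
    print(pixel, round(mean_conf, 3))
```

Pixels without any localization are simply absent from the returned dictionary, so empty regions are not mistaken for low-confidence regions.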

Calibrating and validating WIF using SMLM of microtubules. A super-resolution dataset often contains well-isolated images of molecules, e.g., after a significant portion of them are bleached. These images can therefore serve as a useful internal control, taken under realistic conditions, to assess the performance of a PSF model as well as SMLM algorithms themselves on a particular dataset. As a practical example, we examine an SMLM dataset of blinking Alexa Fluor 647-labeled microtubules (see Methods). We randomly selected 600 images of bright molecules sampled over the entire field of view (Fig. 4A). We used an ideal PSF model to localize these molecules using RoSE, but found that the mean confidence of these localizations is notably small (WIFavg = −0.36), implying the presence of significant aberrations and PSF model mismatch (Fig. S7). We therefore calibrated our physics-based PSF model and re-analyzed the data (see Methods). After calibration, the estimated confidences of RoSE’s localizations show a notable average increase of 0.79 (WIFavg = 0.43). We also observe a rather broad distribution of confidences, suggesting that optical aberrations, such as defocus, vary throughout the structure (Fig. S7). RoSE’s use of this calibrated PSF produces localizations with higher confidence values (WIFavg = 0.43) compared to TS’s use of an elliptical Gaussian PSF (WIFavg = 0.15) (Fig. 4A). The higher average confidence score for RoSE suggests that it should recover the underlying structure with greater accuracy compared to TS.

We confirm the consistency of localization confidences, in the absence of the ground truth, through the perceived quality of the super-resolution reconstructions (Fig. 4B). We expect more confident localizations to result in an image with greater resolution, whereas localizations with poor confidence fail to resolve fine details and potentially distort the underlying structure. Within a region containing a few parallel and well-separated microtubules, we see similar confidences for both algorithms (Fig. 4H), resulting in images of similar quality (Fig. 4F,G). Conversely, for a region with intersecting microtubules, we observe marked qualitative and quantitative differences between the two reconstructions (Fig. 4C,D). RoSE is able to resolve structural details near the intersections, while the TS image contains missing and blurred localizations near the crossing points. Moreover, RoSE recovers the curved microtubule faithfully, whereas TS fails to reconstruct its central part (lower red arrow in Fig. 4C,D). Quantitatively, RoSE exhibits significantly greater confidence in its localizations compared to TS, which shows negative confidences for an appreciable number of localizations (Fig. 4E). This confidence gap, in part, can be caused by hidden parameters such as high blinking density. These SMLM reconstructions demonstrate that localization confidences obtained from both images of isolated molecules and HD datasets are a consistent and quantitative measure of algorithmic performance.

Fig. 4. Comparison of SMLM algorithms on experimental images of labeled microtubules. (A) Left: isolated images of Alexa Fluor 647 molecules. Right: localization confidences for 600 isolated molecules using RoSE (red) and TS (green). (B) Super-resolution image of Alexa Fluor 647-labeled microtubules recovered by RoSE. (C,D) Enlarged top-left region in (B) for RoSE and TS, respectively. (E) Histogram of confidences corresponding to localizations in (C) and (D) for RoSE (red) and TS (green), respectively. (F,G) Similar to (C,D) but for the middle-right region in (B). (H) Similar to (E) but for localizations in (F) and (G). Colorbars: (A) photons detected per 160 × 160 nm², (B) number of localizations per 40 × 40 nm². Scalebars: (A) 500 nm, (B) 1 µm, and (G) 500 nm.

Quantifying algorithmic robustness and molecular heterogeneity. Next, we used WIF to characterize algorithmic performance on a Transient Amyloid Binding (TAB) (22) dataset imaging amyloid fibrils. Here, the relatively large shot noise in images of Nile red (<1000 photons per frame) tests the robustness of three distinct algorithms: TS with weighted least squares (WLS) using a weighted Gaussian noise model; TS with maximum likelihood estimation (MLE) using a Poisson noise model; and RoSE, which uses a Poisson noise model but also is robust to image overlap. Qualitative and quantitative differences are readily noticeable between reconstructed images, particularly where the fibrillar bundle unwinds (Fig. 5A-C, insets). We attribute the poor localization of WLS, exemplified by broadening of the fibrils, to its lack of robustness to shot noise. By using instead a Poisson noise model, MLE recovers thinner and better resolved fibrils, but struggles to resolve fibrils at the top end of the structure (Fig. 5B,E). This inefficiency is probably due to algorithmic failure on images containing overlapping molecules. In contrast, RoSE localizations have greater precision and accuracy, thereby enabling the parallel unbundled filaments to be resolved (Fig. 5C,F). These perceived image qualities can be reliably quantified via WIF. Indeed, RoSE localizations show the greatest confidence of the three algorithms with WIFavg = 0.78, while WLS shows a low WIFavg of 0.18, attesting to their excellent and poor recovery, respectively (Fig. 5G-I). Interestingly, we found that, in terms of FRC, RoSE has only 3% better resolution compared to MLE.

To further demonstrate that WIF is a reliable measure of accuracy at the single-molecule level, we filtered out all localizations with confidence smaller than 0.5. Remarkably, filtered reconstructions from all three algorithms appear to resolve unbundled fibrils (Fig. 5J-L). In contrast, filtering based on estimated PSF width produces sub-optimal results. Notably, retaining MLE localizations within a strict width range, W1 ∈ [90, 110 nm], improves filament resolvability at the cost of compromising sampling continuity (Fig. S8A). For a slightly larger range, W2 ∈ [70, 130 nm], the filtering is ineffective and the fibrils are not well resolved (Fig. S8B). In contrast, filtered localizations based on WIF, qualitatively and quantitatively, resolve fine fibrillar features (Fig. S8C).

A powerful feature of WIF is its ability to quantify an arbitrary discrepancy between a computational imaging model and SMLM measurements. This property is particularly useful since hidden physical parameters, which may be difficult to model accurately, can induce perturbations in the observed PSF. Therefore, we can use WIF to interrogate variations in the interactions of Nile red with amyloid fibrils that are encoded as subtle features within SMLM images. To demonstrate this capability, we analyzed TAB fibrillar datasets using RoSE and calculated the WIFs of localizations with greater than 400 detected photons (Fig. 6). Interestingly, WIF density plots reveal heterogeneous regions along both fibrils. Specifically, for segments of fibrils that are oriented away from the vertical axis, we see a larger fraction of localizations that have low confidence (<0.5) compared to regions that are vertically oriented (Fig. 6A,B). Quantitatively, the upper regions of two fibrils have 17% (Fig. 6C) and 37% (Fig. 6D) more localizations with confidence greater than 0.8.

To examine the origin of this heterogeneity, we directly compare observed PSFs from high- and low-confidence regions. Curiously, PSFs in the bottom regions are slightly elongated along an axis parallel to the fibril itself, whereas PSFs from the top regions better match our model (Fig. S9). These features may be attributed to the orientation of Nile red molecules upon binding to fibrils (30–32). We stress that the influence of molecular orientation on these PSFs is detected and quantified by WIF and cannot otherwise be distinguished by estimates of PSF width (Fig. S9).

Fig. 5. Quantifying algorithmic robustness and enhancing reconstruction accuracy in SMLM of amyloid fibrils. Super-resolution image of twisted fibrils recovered by (A) TS weighted least-squares (WLS), (B) TS maximum-likelihood estimation (MLE), and (C) RoSE. (D-F) Histogram of localizations projected onto the gold line in (C, top inset) from (D) WLS, (E) MLE, and (F) RoSE. (G) WIFs for all WLS localizations in (A). (H) WIFs for all MLE localizations in (B). (I) WIFs for all RoSE localizations in (C). In (G-I), green regions denote localizations with confidence greater than 0.5. (J-L) Histograms of localizations with confidence greater than 0.5 projected onto the golden line in (C, top inset) and corresponding filtered inset images for (J) WLS, (K) MLE, and (L) RoSE. Colorbar: number of localizations per 58.8 × 58.5 nm². Scalebars: (C) 500 nm, (C inset) 150 nm, (L inset) 150 nm. Panel annotations: (G) WIFavg = 0.18, (H) WIFavg = 0.44, (I) WIFavg = 0.78.

Discussion

WIF is a computational tool that utilizes mathematical models of the imaging system and measurement noise to measure the statistical confidence of each localization within an SMLM image. We used WIF to benchmark the accuracy of SMLM algorithms on a variety of simulated and experimental datasets. We also demonstrated WIF for analyzing how sample properties (e.g., defocus, dipole emission, molecular density) affect reconstruction accuracy. WIF robustly quantifies deterministic model mismatch between experimental images and a computational imaging model, which affects accuracy, in the presence of stochastic measurement noise, which affects precision. Intuitively, low SNRs make the detection of minor model mismatches, such as defocus, comparatively difficult (Fig. S4). While WIF has excellent sensitivity for detecting overlapping molecules (Fig. 1D) and dipole-like emission patterns (Figs. S5-S6), WIF cannot explain the source of low confidence values that cause localization inaccuracies or heterogeneities; rather, it detects and quantifies these effects.

Fig. 6. Heterogeneity in Nile red interactions with amyloid fibrils. (A, B) WIFs of bright localizations (>400 detected photons) detected by RoSE on two fibrils. (C, D) WIFs of localizations within corresponding boxed regions in (A, upper magenta and lower black) and (B, upper orange and lower blue), respectively. In (C) and (D), green regions indicate localizations with WIF > 0.8, corresponding to (A, magenta) 63%, (A, black) 54%, (B, orange) 62%, and (B, blue) 45% of the localizations. Colorbar: confidence. Scalebar: 500 nm.

WIF exhibits several advantages over existing methods for quantifying reconstruction accuracy in experimental SMLM. First, WIF does not require labeled training data to judge the trustworthiness of predetermined image features; a model of the imaging system, i.e., its PSF, suffices. Second, it does not need ground-truth knowledge of SM positions, which would be prohibitive in most SMLM applications. Third, it obviates the need for a secondary imaging modality for comparison and is therefore more robust than such methods; it does not require alignment between modalities. More fundamentally, WIF exploits a unique property of SMLM compared to other non-SM super-resolution optical methodologies (e.g., structured illumination, RESOLFT, and STED): imaging the entirety (peak and spatial decay) of each SM PSF synergistically creates well-behaved gradient flows along the likelihood surface that are used in computing WIF (SI Section 2). WIF quantifies errors by using knowledge from its PSF model to explore the object space of molecular positions and brightnesses; leveraging Wasserstein distance ensures that meaningful perturbations to SM positions are tested. In contrast, computing mismatches in image space (e.g., PSF width in Figs. 1, S4-S6) is insensitive to molecular overlap, defocus, and dipole emission artifacts without assuming strong statistical priors on the spatial distribution of molecules or a simplified PSF (33).

We believe that WIF will become a valuable tool for the SMLM community as it offers the unique capability of quantifying localization accuracy and heterogeneity in experimental SMLM datasets. WIF can also be used for online tuning of parameters (e.g., activation density and imaging buffer conditions) during an experiment to maximize imaging accuracy and resolution. It also offers a reliable means to detect otherwise hidden properties β of the sample of interest, such as molecular orientation shown here, allowing for the discovery of new biophysical and biochemical phenomena at the nanoscale. While a majority of neural network training methods in SMLM utilize simulated data (34) or experimental data assuming a perfectly matched model (35), the discriminative power of WIF may enable these networks to be trained robustly on experimental data in the presence of mismatches stemming from hidden parameters. Finally, WIF can be readily extended to 3D (36) or molecular orientation (37–42) SMLM using calibrated models.

Inspired by measuring distribution model risk (43), i.e., quantifying the impact of using a “best-guess” model instead of the true model, WIF represents an advance in statistical quantification in imaging science (44–46), where the reliability of each quantitative feature within a scientific image can now be evaluated. The benefits of integrating WIF into downstream analysis (47, 48) (e.g., SM clustering, counting, and co-localization) and even in other imaging modalities (e.g., astronomical imaging, positron emission tomography, and computed tomography) are exciting opportunities yet to be explored.

Materials and Methods

Super-resolution imaging of labeled microtubules. The microtubules of BSC-1 cells were immunolabeled with Alexa Fluor 647 (Invitrogen) and imaged under blinking conditions (49) with glucose oxidase/catalase and mM concentrations of mercaptoethylamine (MEA) as in (50). The sample was imaged using an Olympus IX71 epifluorescence microscope equipped with a 100× 1.4 NA oil-immersion objective lens (Olympus UPlan-SApo 100×/1.40). Fluorophores were excited using a 641-nm laser source (Coherent Cube, peak intensity ~10 kW/cm²). Fluorescence from the microscope was filtered using a dichroic beamsplitter (Semrock, Di01-R635) and bandpass filter (Omega, 3RD650-710) and separated into two orthogonally-polarized detection channels, but only one channel was used for this analysis. Fluorescence photons were captured using an electron-multiplying (EM) CCD camera (Andor iXon+ DU897-E) at an EM gain setting of 300 with a pixel size of 160 × 160 nm² in object space. For the 284k localizations shown in Fig. 4B, 2287 photons were detected on average with a background of 76 photons per pixel.

Transient Amyloid Binding imaging. The 42 amino-acid residue amyloid-beta peptide (Aβ42, ERI Amyloid Laboratory) was dissolved in hexafluoro-2-propanol (HFIP) and sonicated at room temperature for one hour. After flash freezing in liquid nitrogen, HFIP was removed by lyophilization and the peptide was stored at −20 °C. To further purify the protein, the lyophilized Aβ42 was dissolved in 10 mM NaOH, sonicated for 25 min in a cold water bath and filtered first through a 0.22 µm and then through a 30 kD centrifugal membrane filter (Millipore Sigma, UFC30GV and UFC5030) as described previously (22). To prepare fibrils, we incubated 10 µM monomeric Aβ42 in phosphate-buffered saline (PBS, 150 mM NaCl, 50 mM Na3PO4, pH 7.4) at 37 °C with 200 rpm shaking for 20-42 hours. The aggregated structures were adsorbed to an ozone-cleaned cell culture chamber (Lab Tek, No. 1.5H, 170 ± 5 µm thickness) for 1 hour followed by a rinse using PBS. A PBS solution (200 µL) containing 50 µM Nile red (Fisher Scientific, AC415711000) was placed into the amyloid-adsorbed chambers for transient amyloid binding.

Blinking of Nile red molecules on fibrils was imaged using a home-built epifluorescence microscope equipped with a 100× 1.4 NA oil-immersion objective lens (Olympus, UPlan-SApo 100×/1.40). The samples were excited using a 561-nm laser source (Coherent Sapphire, peak intensity ~0.45 kW/cm²). Fluorescence was filtered by a dichroic beamsplitter (Semrock, Di03-R488/561) and a bandpass filter (Semrock, FF01-523/610) and separated into two orthogonally-polarized detection channels by a polarizing beamsplitter cube (Meadowlark Optics). Both channels were captured by a scientific CMOS camera (Hamamatsu, C11440-22CU) with a pixel size of 58.5 × 58.5 nm² in object space. Only one of the channels was analyzed in this work. For the 12k localizations shown in Fig. 5C, 390 photons were detected on average with a background of 5 photons per pixel. For the 931 localizations shown in Fig. 6B, 785 photons were detected on average with a background of 2.4 photons per pixel.

Synthetic data. We generated images of molecules via a vectorial image-formation model (24), assuming unpolarized ideal PSFs. Briefly, a molecule is modeled as a dipole rotating uniformly within a cone with a half-angle α. A rotationally fixed dipole corresponds to α = 0°, while α = 90° represents an isotropic molecule. Molecular blinking trajectories were simulated using a two-state Markov chain (28). We used a wavelength of 637 nm, NA = 1.4, and a spatially uniform background. We simulated a camera with 58.5 × 58.5 nm² square pixels in object space.
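As an illustration of the blinking model mentioned above, the following is a minimal sketch of a two-state (dark/emitting) Markov chain. The per-frame switching probabilities `p_on` and `p_off`, the function name, and the choice to start in the dark state are assumptions made for this example; the actual parameters are those of ref. (28) and are not stated here.

```python
import random


def simulate_blinking(n_frames, p_on, p_off, seed=None):
    """Simulate an on/off trajectory with a two-state Markov chain.

    p_on  : hypothetical per-frame probability of switching dark -> emitting.
    p_off : hypothetical per-frame probability of switching emitting -> dark.
    Returns a list of 0 (dark) / 1 (emitting) states, one per frame.
    The molecule is assumed to start in the dark state.
    """
    rng = random.Random(seed)
    state = 0
    trajectory = []
    for _ in range(n_frames):
        if state == 0:
            # Dark molecule: may switch on in this frame.
            if rng.random() < p_on:
                state = 1
        else:
            # Emitting molecule: may switch off in this frame.
            if rng.random() < p_off:
                state = 0
        trajectory.append(state)
    return trajectory
```

In such a chain, the long-run fraction of emitting frames is p_on / (p_on + p_off), so a small `p_on` relative to `p_off` yields the sparse blinking that SMLM relies on.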

Poisson log likelihood. Consider a set of $N$ molecules

$$M = \sum_{i=1}^{N} s_i \, \delta(r - r_i),$$

where $s_i \ge 0$ and $r_i \in \mathbb{R}^3$ denote the $i$th molecule's brightness (in photons) and position, respectively. The resulting intensity $\mu_j$, that is, the expected number of photons detected on the camera, for each pixel $j$ can be written as

$$\mu_j = \sum_{i=1}^{N} s_i \, q_j(r_i) + b_j, \quad j \in \{1, \dots, m\},$$

where $q_j(r_i)$ represents the value of the PSF $q$ (for the $i$th molecule) at the $j$th pixel, and $b_j$ denotes the expected number of background photons at the $j$th pixel.

If we denote by $g \in \mathbb{R}^m$ the photon counts captured by the $m$ pixels of a camera, the negative Poisson log likelihood is then given by

$$L(q, M, g) = \sum_{j=1}^{m} \left[ \mu_j - g_j \log(\mu_j) \right].$$
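The two expressions above can be evaluated directly. The sketch below is illustrative only: the function names and the nested-list layout `psf_values[i][j]` for $q_j(r_i)$ are assumptions of this example, and every $\mu_j$ is assumed to be strictly positive so that the logarithm is defined.

```python
import math


def expected_counts(brightness, psf_values, background):
    """Compute mu_j = sum_i s_i * q_j(r_i) + b_j for every pixel j.

    brightness : list of s_i (photons), one per molecule.
    psf_values : psf_values[i][j] holds q_j(r_i), the PSF of molecule i at pixel j.
    background : list of b_j, one per pixel.
    """
    mu = list(background)
    for s_i, q_i in zip(brightness, psf_values):
        for j in range(len(mu)):
            mu[j] += s_i * q_i[j]
    return mu


def negative_poisson_log_likelihood(mu, g):
    """Compute sum_j [mu_j - g_j * log(mu_j)]; assumes every mu_j > 0."""
    return sum(mu_j - g_j * math.log(mu_j) for mu_j, g_j in zip(mu, g))
```

For a fixed measurement $g$ with positive entries, each term $\mu_j - g_j \log(\mu_j)$ is minimized at $\mu_j = g_j$, which is why a model that reproduces the measured counts attains the lowest value.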

Jaccard index. Following (8), given a set of ground-truth positions and corresponding localizations, we first match these points by solving a bipartite graph-matching problem of minimizing the sum of distances between the two elements of a pair. We say that a pairing is successful if the distance between the corresponding two elements is smaller than twice the full width at half maximum (FWHM) of the localization precision σ, which is calculated using the theoretical Cramér-Rao bound (51) (σ = 3.4 nm with 2000 photons detected). The elements that are paired with a ground-truth position are counted as true positives (TP) and those without a pair are counted as false positives (FP). Finally, the ground-truth molecules without a match are counted as false negatives (FN).
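The matching and counting procedure described above can be sketched as follows. This is not the implementation used in this work: for brevity, the bipartite matching is solved by brute-force enumeration, which is only practical for a handful of points (a Hungarian-type solver would be used in practice). The function name is hypothetical, the FWHM is taken as $2\sqrt{2\ln 2}\,\sigma$ under a Gaussian assumption, and the index is computed with the standard definition TP / (TP + FP + FN).

```python
import itertools
import math


def jaccard_index(ground_truth, localizations, sigma):
    """Return (jaccard, tp, fp, fn) for two small lists of coordinate tuples."""
    # Success threshold: twice the FWHM of a Gaussian with standard deviation sigma.
    threshold = 2.0 * (2.0 * math.sqrt(2.0 * math.log(2.0)) * sigma)

    # Give every point of the shorter list exactly one partner in the longer list.
    if len(localizations) <= len(ground_truth):
        short, long_ = localizations, ground_truth
    else:
        short, long_ = ground_truth, localizations

    # Brute-force search for the assignment with the minimum sum of distances.
    best_cost, best_dists = math.inf, []
    for perm in itertools.permutations(range(len(long_)), len(short)):
        dists = [math.dist(short[i], long_[k]) for i, k in enumerate(perm)]
        cost = sum(dists)
        if cost < best_cost:
            best_cost, best_dists = cost, dists

    tp = sum(1 for d in best_dists if d < threshold)  # successful pairings
    fp = len(localizations) - tp                      # localizations without a valid pair
    fn = len(ground_truth) - tp                       # ground-truth molecules without a match
    total = tp + fp + fn
    # With no points at all the index is undefined; this sketch returns 0.0.
    return (tp / total if total else 0.0), tp, fp, fn
```

With σ = 3.4 nm, the threshold evaluates to roughly 16 nm, so a localization more than about 16 nm from its matched ground-truth position is counted as a false positive.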

PSF modeling for computing Wasserstein-induced flux. For simulation studies, we used an ideal, unpolarized standard PSF resulting from an isotropic emitter (Figs. 1, 2, 3, S3, S4, S5, S6), while for experimental data (Figs. 4, 5, 6, S7, S8, S9), we used a linearly-polarized PSF, also resulting from an isotropic emitter (SI Section 2).

In addition to the ideal PSFs modeled above, we needed to calibrate the aberrations present in the PSF used for microtubule imaging (Fig. 4). We modeled the microscope pupil function $P$ as

$$P(u, v) = \exp\left( j \sum_{i=3}^{l} a_i Z_i(u, v) \right) \cdot P_0(u, v),$$


where $(u, v)$ are the microscope's pupil coordinates; $Z_i$ and $a_i$ represent the $i$th Zernike basis function and its corresponding coefficient; and $P_0$ denotes the pupil function of the uncalibrated model. We used 33 Zernike modes corresponding to $l = 35$.
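To make the structure of this pupil model concrete, the sketch below applies a phase aberration, written as a weighted sum of basis functions, to an unaberrated pupil. Everything named here is an assumption for illustration: the two example basis functions are unnormalized defocus-like and astigmatism-like polynomials rather than the 33 modes used in this work, and `clear_aperture` is a hypothetical stand-in for $P_0$.

```python
import cmath


def pupil_function(u, v, coefficients, basis, p0):
    """Evaluate P(u, v) = exp(j * sum_i a_i * Z_i(u, v)) * P0(u, v)."""
    phase = sum(a * z(u, v) for a, z in zip(coefficients, basis))
    return cmath.exp(1j * phase) * p0(u, v)


def defocus(u, v):
    """Hypothetical example mode: unnormalized defocus, 2*rho^2 - 1."""
    return 2.0 * (u * u + v * v) - 1.0


def astigmatism(u, v):
    """Hypothetical example mode: unnormalized astigmatism, u^2 - v^2."""
    return u * u - v * v


def clear_aperture(u, v):
    """Hypothetical uncalibrated pupil: 1 inside the unit disk, 0 outside."""
    return 1.0 if u * u + v * v <= 1.0 else 0.0


example_basis = [defocus, astigmatism]
```

Because the aberration enters only through the exponent, it changes the phase of the pupil but not its amplitude; setting all coefficients to zero recovers $P_0$.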

Using RoSE, we localized well-isolated molecules over a large field-of-view (FOV) corresponding to Fig. 4. Next, for each localization, we obtained an image of size 11 × 11 pixels with the localized molecule at its center. We excluded molecules with brightnesses less than 3000 photons or with positions away from the origin by more than a pixel. Next, we randomly selected 600 of these images to estimate the Zernike coefficients, i.e., $a_3, \dots, a_l$, as described previously (52). The calibrated PSF (Fig. S7) is then computed based on the recovered $P$.
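The selection of calibration images described above could be sketched as follows. The record layout (dictionaries with `photons`, `dx`, and `dy` keys), the function name, and the use of Euclidean distance for "away from the origin" are assumptions of this example; the Zernike coefficient estimation itself follows ref. (52) and is not reproduced here.

```python
import math
import random


def select_calibration_images(localizations, min_photons=3000.0,
                              max_offset_px=1.0, n_select=600, seed=None):
    """Filter localizations and randomly pick a subset for PSF calibration.

    localizations : list of dicts with keys 'photons' (brightness) and
                    'dx', 'dy' (offset from the crop center, in pixels).
    """
    # Keep bright molecules that lie within one pixel of the crop center.
    eligible = [
        loc for loc in localizations
        if loc["photons"] >= min_photons
        and math.hypot(loc["dx"], loc["dy"]) <= max_offset_px
    ]
    if len(eligible) <= n_select:
        return eligible
    # Draw n_select distinct localizations uniformly at random.
    return random.Random(seed).sample(eligible, n_select)
```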

Data and code availability. The data and code that support the findings of this study are available from the corresponding author upon reasonable request.

Acknowledgements. We thank W.E. Moerner and Steffen J. Sahl for contributing the SMLM microtubule dataset. We also thank Jin Lu, Oumeng Zhang, and Tingting Wu for helpful discussion. Research reported in this publication was supported by the National Science Foundation under grant number ECCS-1653777 and by the National Institute of General Medical Sciences of the National Institutes of Health under grant number R35GM124858 to M.D.L.

References

1. Moerner WE, Shechtman Y, Wang Q (2015) Single-molecule spectroscopy and imaging over the decades. Faraday Discussions 184:9–36.
2. Sauer M, Heilemann M (2017) Single-molecule localization microscopy in Eukaryotes. Chemical Reviews 117(11):7478–7509.
3. Sahl SJ, Hell SW, Jakobs S (2017) Fluorescence nanoscopy in cell biology. Nature Reviews Molecular Cell Biology 18(11):685–701.
4. Baddeley D, Bewersdorf J (2018) Biological insight from super-resolution microscopy: What we can learn from localization-based images. Annual Review of Biochemistry 87(1):965–989.
5. Moringo NA, Shen H, Bishop LDC, Wang W, Landes CF (2018) Enhancing analytical separations using super-resolution microscopy. Annual Review of Physical Chemistry 69(1):353–375.
6. Chen T, et al. (2017) Optical super-resolution imaging of surface reactions. Chemical Reviews 117(11):7510–7537.
7. Willets KA, Wilson AJ, Sundaresan V, Joshi PB (2017) Super-resolution imaging and plasmonics. Chemical Reviews 117(11):7538–7582.
8. Sage D, et al. (2019) Super-resolution fight club: assessment of 2D and 3D single-molecule localization microscopy software. Nature Methods 16(5):387–395.
9. Cohen EA, Abraham AV, Ramakrishnan S, Ober RJ (2019) Resolution limit of image analysis algorithms. Nature Communications 10(1):793.
10. Nieuwenhuizen RPJ, et al. (2013) Measuring image resolution in optical nanoscopy. Nature Methods 10(6):557–562.
11. Culley S, et al. (2018) Quantitative mapping and minimization of super-resolution optical imaging artifacts. Nature Methods 15(4):263–266.
12. Sage D, et al. (2015) Quantitative evaluation of software packages for single-molecule localization microscopy. Nature Methods 12(8):717–724.
13. Fox-Roberts P, et al. (2017) Local dimensionality determines imaging speed in localization microscopy. Nature Communications 8:13558.
14. Deschout H, et al. (2014) Precisely and accurately localizing single emitters in fluorescence microscopy. Nature Methods 11(3):253–266.
15. Enderlein J, Toprak E, Selvin PR (2006) Polarization effect on position accuracy of fluorophore localization. Optics Express 14(18):8111–8120.
16. Engelhardt J, et al. (2010) Molecular orientation affects localization accuracy in superresolution far-field fluorescence microscopy. Nano Letters 11(1):209–213.
17. Lew MD, Backlund MP, Moerner W (2013) Rotational mobility of single molecules affects localization accuracy in super-resolution fluorescence microscopy. Nano Letters 13(9):3967–3972.
18. Burke D, Patton B, Huang F, Bewersdorf J, Booth MJ (2015) Adaptive optics correction of specimen-induced aberrations in single-molecule switching microscopy. Optica 2(2):177–185.
19. von Diezmann A, Lee MY, Lew MD, Moerner W (2015) Correcting field-dependent aberrations with nanoscale accuracy in three-dimensional single-molecule localization microscopy. Optica 2(11):985–993.
20. Ji N (2017) Adaptive optical fluorescence microscopy. Nature Methods 14(4):374–380.
21. Villani C (2008) Optimal transport: old and new. (Springer Science & Business Media) Vol. 338.
22. Spehar K, et al. (2018) Super-resolution imaging of amyloid structures over extended times by using transient binding of single Thioflavin T molecules. ChemBioChem 19(18):1944–1948.
23. Lee JE, et al. (2018) Mapping surface hydrophobicity of α-synuclein oligomers at the nanoscale. Nano Letters 18(12):7494–7501.

24. Backer AS, Moerner WE (2014) Extending single-molecule microscopy using optical Fourier processing. The Journal of Physical Chemistry B 118:8313–8329.
25. Ovesný M, Krížek P, Borkovec J, Svindrych Z, Hagen GM (2014) ThunderSTORM: a comprehensive ImageJ plug-in for PALM and STORM data analysis and super-resolution imaging. Bioinformatics 30:2389–2390.
26. Dauphin YN, et al. (2014) Identifying and attacking the saddle point problem in high-dimensional non-convex optimization in Advances in Neural Information Processing Systems. pp. 2933–2941.
27. Santambrogio F (2017) Euclidean, metric, and Wasserstein gradient flows: an overview. Bulletin of Mathematical Sciences 7(1):87–154.
28. Mazidi H, Lu J, Nehorai A, Lew MD (2018) Minimizing structural bias in single-molecule super-resolution microscopy. Scientific Reports 8:13133.
29. Min J, et al. (2015) FALCON: fast and unbiased reconstruction of high-density super-resolution microscopy data. Scientific Reports 4:4577.
30. Duboisset J, et al. (2013) Thioflavine-T and Congo red reveal the polymorphism of insulin amyloid fibrils when probed by polarization-resolved fluorescence microscopy. The Journal of Physical Chemistry B 117(3):784–788.
31. Shaban HA, Valades-Cruz CA, Savatier J, Brasselet S (2017) Polarized super-resolution structural imaging inside amyloid fibrils using Thioflavine T. Scientific Reports 7:12482.
32. Varela JA, et al. (2018) Optical structural analysis of individual α-synuclein oligomers. Angewandte Chemie International Edition 57(18):4886–4890.
33. Lindén M, Curic V, Amselem E, Elf J (2017) Pointwise error estimates in localization microscopy. Nature Communications 8(1):15115.
34. Hershko E, Weiss LE, Michaeli T, Shechtman Y (2019) Multicolor localization microscopy and point-spread-function engineering by deep learning. Optics Express 27(5):6158–6183.
35. Kim T, Moon S, Xu K (2019) Information-rich localization microscopy through machine learning. Nature Communications 10(1):1996.
36. von Diezmann A, Shechtman Y, Moerner W (2017) Three-dimensional localization of single molecules for super-resolution imaging and single-particle tracking. Chemical Reviews 117(11):7244–7275.
37. Aguet F, Geissbühler S, Märki I, Lasser T, Unser M (2009) Super-resolution orientation estimation and localization of fluorescent dipoles using 3-D steerable filters. Optics Express 17(8):6829–6848.
38. Stallinga S, Rieger B (2012) Position and orientation estimation of fixed dipole emitters using an effective Hermite point spread function model. Optics Express 20(6):5896–5921.
39. Backlund MP, et al. (2012) Simultaneous, accurate measurement of the 3D position and orientation of single molecules. Proceedings of the National Academy of Sciences of the United States of America 109(47):19087–19092.
40. Backlund MP, Lew MD, Backer AS, Sahl SJ, Moerner W (2014) The role of molecular dipole orientation in single-molecule fluorescence microscopy and implications for super-resolution imaging. ChemPhysChem 15(4):587–599.
41. Zhang O, Lu J, Ding T, Lew MD (2018) Imaging the three-dimensional orientation and rotational mobility of fluorescent emitters using the Tri-spot point spread function. Applied Physics Letters 113(3):031103.
42. Mazidi H, King ES, Zhang O, Nehorai A, Lew MD (2019) Dense super-resolution imaging of molecular orientation via joint sparse basis deconvolution and spatial pooling. IEEE 16th International Symposium on Biomedical Imaging (ISBI 2019) pp. 325–329.
43. Breuer T, Csiszár I (2016) Measuring distribution model risk. Mathematical Finance 26(2):395–411.
44. Beck RN (1993) Overview of imaging science. Proceedings of the National Academy of Sciences 90(21):9746–9750.
45. Barrett HH, Myers KJ (2013) Foundations of image science. (John Wiley & Sons).
46. Pereyra M (2017) Maximum-a-posteriori estimation with Bayesian confidence regions. SIAM Journal on Imaging Sciences 10(1):285–302.
47. Nicovich PR, Owen DM, Gaus K (2017) Turning single-molecule localization microscopy into a quantitative bioanalytical tool. Nature Protocols 12(3):453.
48. Schnitzbauer J, et al. (2018) Correlation analysis framework for localization-based super-resolution microscopy. Proceedings of the National Academy of Sciences 115(13):3219–3224.
49. Dempsey GT, Vaughan JC, Chen KH, Bates M, Zhuang X (2011) Evaluation of fluorophores for optimal performance in localization-based super-resolution imaging. Nature Methods 8(12):1027–1036.
50. Lee HlD, Sahl SJ, Lew MD, Moerner W (2012) The double-helix microscope super-resolves extended biological structures by localizing single blinking molecules in three dimensions with nanoscale precision. Applied Physics Letters 100(15):153701.
51. Snyder DL, Miller MI (2012) Random point processes in time and space. (Springer Science & Business Media).
52. Petrov PN, Shechtman Y, Moerner WE (2017) Measurement-based estimation of global pupil functions in 3D localization microscopy. Optics Express 25(7):7945–7959.

bioRxiv preprint doi: https://doi.org/10.1101/721837. The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license.