Generalized Redundant Calibration of Radio Interferometers

Prakruth Adari and Anze Slosar

Brookhaven National Laboratory, Upton NY 11973 and

Physics and Astronomy Department, Stony Brook University, Stony Brook, NY 11794

Redundant calibration is a technique in radio astronomy that allows calibration of radio arrays whose antennas lie on a lattice, by exploiting the fact that redundant baselines should see the same sky signal. Because the number of measured visibilities scales quadratically with the number of antennas, while the number of unknowns describing the individual antenna responses and the available information about the sky scales only linearly with the array size, the problem is always over-constrained as long as the array is big and dense enough. This is true even for non-lattice array configurations. In this work we study a generalized algorithm in which the single per-antenna gain is replaced with a number of gains. We show that it can successfully describe data from an approximately redundant array on a square lattice with pointing and geometry errors. We discuss the parameterization, limitations and possible extensions of this algorithm.

I. INTRODUCTION

The 21 cm emission from neutral hydrogen promises to transform our understanding of the universe across the ages: from low-redshift observations of neutral hydrogen in galaxies, through the epoch of reionization, all the way to the dark ages at redshift z ∼ 100 in the future.

The field received a major boost when it was realized that developments in computing and RF technology allow telescopes to be built almost entirely in software. A number of experiments were born: some are already operating, such as CHIME [1], Tianlai [2], HERA [3] and MWA [4]; some are under construction, including HIRAX [5], BINGO [6] and CHORD [7]; and some are proposals for future very large facilities, such as PUMA [8].

Large interferometric radio arrays require a large number of calibration parameters. It was soon noticed that arrays of indistinguishable elements on a regular lattice possess strong redundancy. The total number of possible pairs of antennas, corresponding to the total number of measured visibilities, scales quadratically with the number of elements in the array. On the other hand, the number of unique baselines (given by the number of unique separation vectors between all possible pairs of antennas) scales only linearly with the number of elements. Since all baselines made of identical elements and spanning the same distance vector should measure the same signal, we can use this to back out the calibration factors of individual antennas without ever knowing anything about the actual sky signal. The solution is unique up to intrinsic degeneracies of the system, corresponding to an overall phase shift, a scaling in amplitude and a translation of the sky signal (i.e. applying a phase gradient across the u-v plane). This calibration procedure is known as redundant calibration and has been worked out in detail in [9] and [10].

Unfortunately, it soon became clear that real arrays are not redundant enough for the simplest form of redundant calibration to be sufficient. For example, [11] studied redundancy in HERA and found that real-life non-redundancy produces spurious temporal structure in gain solutions. In [12] the shortcomings of redundant calibration for HERA are further studied in simulations.

In some sense, these findings indicate that redundant calibration is the victim of its own success. A possible way to look at the problem is that it is not that the arrays are too non-redundant, but that they are too sensitive. They are sensitive enough that particularities of individual elements produce big enough effects that a good fit cannot be found for the data. One possible approach has been studied in [13]: a pair of nearly, but not perfectly, redundant baselines will have very strongly, but not perfectly, correlated measured visibilities. These slight decorrelations can be propagated self-consistently using a quadratic estimator formalism to allow stable solutions and, almost by construction, a good fit. This process has the advantage of gracefully dealing with outlier antennas: there is no need to cut them out if we can instead model them as such. A more recent paper [14] proposed a unified approach to sky-based and redundant calibration, in which both can be used concurrently in a self-consistent Bayesian model, assuming we have a model of the telescope and its calibration uncertainties.

However, model complexity is like the European Union: if it does not solve your problems, you are not using enough of it. So in this paper we take a different approach: instead of describing every single antenna with one complex number, that is a single gain, we look for a description in terms of multiple numbers per element that can be adjusted in order to achieve a good fit as well as give some physical insight into the type of imperfections. Note that this does not break the basic premise of redundant calibration: the number of unknowns still scales linearly with the number of antennas, while the number of measurements scales quadratically. So, for an arbitrarily complex description of per-antenna non-idealities, the system will be over-constrained for a large enough array (as long as the number of free parameters per antenna is finite).

There are two canonical types of errors in a redundant radio array. Pointing errors refer to the fact that the beams of individual elements are not aligned (i.e. the telescope points away from the zenith in a transit array configuration), while geometry errors mean that the entire dish is displaced from its nominal lattice position, affecting the effective baseline lengths. These are the kinds of errors that we study, but we do not explicitly fit for them. Instead, we build a general pixelized description of the response of each element, whose resolution can be made arbitrarily fine through a tunable parameter $M$.

arXiv:2107.10186v2 [astro-ph.IM] 30 Jul 2021

II. PROBLEM SET-UP

Let us consider a general interferometric array observing, for simplicity, in a single polarization mode. We consider a pair of antennas pointing at the zenith and observing in a narrow frequency range. The noiseless observed visibility for a pair of antennas $i$ and $j$ in the flat-sky approximation is given by

$$ V_{ij} = \int I(\theta)\, e^{-2\pi i (x_i - x_j)\cdot\theta}\, B_i(\theta)\, B_j^*(\theta)\, d^2\theta = FT[I B_i B_j^*](x_i - x_j) \tag{1} $$

where $x_i$ is the geometric position of antenna $i$ measured in wavelengths, $I(\theta)$ is the intensity signal coming from the sky as a function of angle on the sky $\theta$, and $B_i$ is the beam of antenna $i$. The operator $FT[X]$ denotes a Fourier transform of $X$. The signal $I(\theta)$ is real and in principle spans the entire sky. The beams $B_i$ are in general complex and are compact in Fourier space (i.e. on the u-v plane). They correspond to the complex amplitude response to an incident uniform plane wave just above the aperture.

Multiplication in real space corresponds to convolution in Fourier space. Addition of the random noise component $\varepsilon$ gives the general equation for the observed visibility for pairs of antennas

$$ V^o_{ij} = \int U(x_i - x_j + u)\, (\mathcal{B}_i \circledast \mathcal{B}_j^\dagger)(-u)\, d^2u + \varepsilon \tag{2} $$

where the uv-plane $U = FT[I]$ contains the image of the sky (in Fourier space), and $\mathcal{B}_i = FT[B_i]$ are the beam representations in this domain. The variable $\varepsilon$ is a random variable describing the noise realization. We have used the notation $\mathcal{B}^\dagger(u) = FT[B^*](u) = FT[B]^*(-u)$, i.e. complex conjugation of a real-space quantity results in conjugation and mirroring across the origin in the Fourier domain. The crucial insight is that beams are compact in Fourier space, essentially corresponding to the physical extent of the dishes, and thus $\mathcal{B}_i \circledast \mathcal{B}_j^\dagger$ is also compact. Therefore, the limits of integration in Equation 2 need to extend only as far as the support of $\mathcal{B}_i \circledast \mathcal{B}_j^\dagger$.

It is instructive to compare this equation with the one that is typically used in the set-up of redundant calibration:

$$ V^o_{ij} = U_{i-j}\, g_i g_j^* + \varepsilon, \tag{3} $$

where $i-j$ indexes redundant baselines corresponding to the baseline vector $x_i - x_j$. If we set our beams to be the same, scaled only by a complex gain factor for each dish, namely $B_i(\theta) = g_i B(\theta)$, we find that Equation 2 reduces to Equation 3 with

$$ U_{i-j} = \int U(x_i - x_j + u)\, (\mathcal{B} \circledast \mathcal{B}^\dagger)(-u)\, d^2u \tag{4} $$

In other words, redundant calibration assumes that all dishes have exactly the same sky response, which is only allowed to vary by an overall complex gain that is assumed to come from electronics, amplifiers, etc. It reconstructs these gains, but also the uv-plane information after the beam convolution.
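The reduction from Equation 2 to Equation 3 can be spelled out as a short worked step. This is only a sketch using the definitions above, with $\mathcal{B} = FT[B]$ denoting the common Fourier-space beam; by linearity of the Fourier transform, $\mathcal{B}_i = g_i \mathcal{B}$ and $\mathcal{B}_j^\dagger = g_j^* \mathcal{B}^\dagger$, so that

```latex
\begin{align}
(\mathcal{B}_i \circledast \mathcal{B}_j^\dagger)(-u)
  &= g_i g_j^* \, (\mathcal{B} \circledast \mathcal{B}^\dagger)(-u), \\
V^o_{ij}
  &= g_i g_j^* \int U(x_i - x_j + u)\,
     (\mathcal{B} \circledast \mathcal{B}^\dagger)(-u)\, d^2u + \varepsilon
   = U_{i-j}\, g_i g_j^* + \varepsilon .
\end{align}
```

The gains factor out of the integral because they do not depend on $u$, which is why only the beam-convolved quantity $U_{i-j}$, and not the underlying uv-plane, is recovered in this limit.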

But in general, massive redundancy in compact arrays allows us to go beyond this approximation (we will quantify this in the next section). The main idea is to start with Equation 2 as the main ingredient and reconstruct both $\mathcal{B}$ for each antenna and the uv-plane $U$. It is important to note from the beginning that we cannot separate the variations in complex gain coming from electronics from those arising from the imperfect dish. Therefore the action of a gain $g_i$ is equivalent to multiplying the entire beam by the same factor.

A. Discretization

We will build, in steps, an intuition that prepares us for the formulation of generalized redundant calibration. In short, we will construct a redundant calibration that replaces the parameters $U_{i-j}$ and $g_i$ of the standard redundant calibration in Equation 3 with a new, larger, but still finite number of parameters that can still be solved for using the inherent redundancy in the system, while allowing some freedom for non-redundancy.

In order to proceed, we need to express the integral of Equation (2) as a sum over a finite number of degrees of freedom. Our approach is to pixelize the beam maps $\mathcal{B}_i$ and to choose the pixelization of the uv-plane $U$ so that it matches the beam pixelization, in a way that expresses Equation (2) as a sum that is cubic in the input parameters without entailing any interpolation.

For concreteness, let us consider an interferometric array on a regular square grid of size $N_s \times N_s$ with a total of $N_a = N_s^2$ antennas. We will measure all distances in units of wavelength and take the lattice spacing to be $L$. We start by introducing the "oversampling" parameter $M$. The beam map is pixelized in $(2M+1)^2$ pixels, i.e. a central pixel and $M$ additional pixels on each side. For $M = 0$ the beam is reduced to a single pixel and the method reduces to standard redundant calibration. The requirement of an odd number of pixels in the beam description ensures that a single-pixel description of each beam is a natural limit of the problem.

FIG. 1. Illustration of the pixelization scheme for $M = 1$. Labeled points illustrate the lattice points on the u-v plane for a square close-packed array. For example, (1, 0) corresponds to baselines one lattice spacing apart in the E-W direction and (0, 1) is the same for N-S baselines. (0, 0) is the origin of the u-v plane and corresponds to single-dish observations. An individual beam is described by a 3 × 3 pixelized grid, so $(\mathcal{B} \circledast \mathcal{B}^\dagger)$ is the 5 × 5 pixelized grid shown in blue around (1, 0). Making a prediction for a particular (1, 0) baseline amounts to multiplying the values of the beam-convolved map on the blue grid by the corresponding points in the u-v plane and summing them up. See text for further discussion.

The convolution of two beams $(\mathcal{B}_i \circledast \mathcal{B}_j^\dagger)$ is of size $(4M+1) \times (4M+1)$, which sets the natural pixel size to be $L/(2M+1)$ for both the beam grid and the $U$ grid. We can therefore rewrite Equation 2 in a pixelized form schematically as

$$ V_{ij} = \sum_{i=-2M}^{+2M}\ \sum_{j=-2M}^{2M} U_{i_o+i;\, j_o+j}\, (\mathcal{B}_i \circledast \mathcal{B}_j^\dagger)_{ij}, \tag{5} $$

where $i_o$ and $j_o$ are the grid offsets corresponding to the $i-j$ redundant baseline on the uv-plane. Note that the sum is over the $(4M+1) \times (4M+1)$ postage stamp corresponding to the convolution of two $(2M+1) \times (2M+1)$ sized beam maps. This is illustrated in Figure 1. We urge the reader to spend some time trying to understand this figure, as this pixelization is essential for the understanding of our approach. Consider the shortest E-W baseline of a closely-packed dish array. Dishes have diameter $D$, so the distances between pieces of reflector surface on the same dish vary between 0 and $D$. The possible distances between pairs of pieces of reflector surface on two dishes vary between 0 (at the point at which the dishes nominally touch) and $2D$. In the formalism, this is encoded by the postage stamp being $(2M+1)$ pixels in size for an individual beam and $(4M+1)$ for the convolution of the two beams. This then defines the extent on the u-v plane to which this particular baseline is sensitive. In the limit of large $M$ it comes arbitrarily close to, but never reaches, the origin of the u-v plane (corresponding to the constant background) and similarly gets arbitrarily close to, but never reaches, two lattice spacings (corresponding to sensitivity due to pieces of reflector surface that are furthest apart). In Figure 1 this is illustrated with the blue grid.

In Figure 1 we also show squares around other nominal lattice points. We see that most u-v points are probed by multiple baselines, except those on precise grid spacings. This severely limits the degeneracies present in the problem. We also note that we need to describe just one half of the u-v plane, with the other half given by the reality of the observed field. Together with the rules about Fourier transforms of conjugated fields, this ensures that the prediction for the antenna pair $i, j$ is always the complex conjugate of the prediction for the antenna pair $j, i$.

To continue to build intuition about the pixelization, consider the pixel $m$ in the antenna $i$ and pixel $n$ in the antenna $j$. The visibility for the pair of dishes $i$ and $j$ is given by

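$$ V_{ij} = \sum_{\substack{m \in \text{Beam } i \\ n \in \text{Beam } j}} U_{\Delta x; m; n}\, \mathcal{B}_{i;m}\, \mathcal{B}^*_{j;n}, \tag{6} $$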

where the index to $U$ schematically implies picking the correct index in the u-v plane. Instead of being the value of the plane at the separation of the dishes, it corresponds to the vector separating the pixel $m$ on dish $i$ and pixel $n$ on dish $j$. This equation now has the form of Equation 3, but summed over all the pixel pairs formed by the two beams. In other words, the pixelization is in effect modelling the redundant array made of $N_s \times N_s$ antennas as a grid of $N_{\rm eff} \times N_{\rm eff}$ independent antenna elements, with $N_{\rm eff} = (2M+1)N_s$, which are phased up in blocks of $(2M+1) \times (2M+1)$.
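The equivalence between the convolution form of Equation (5) and the pixel-pair form of Equation (6) can be checked numerically. The following is a minimal sketch, not the authors' code: the function names, the random test arrays and the `center` index are hypothetical, and the sign convention for the uv-plane offset follows Equation 2 as written (the offset is the pixel index on beam $j$ minus the pixel index on beam $i$).

```python
import numpy as np
from scipy.signal import convolve


def dagger(beam):
    """Fourier-space image of conjugating the real-space beam:
    conjugate the map and mirror it through the origin."""
    return np.conj(beam[::-1, ::-1])


def visibility_conv(u_plane, beam_i, beam_j, center):
    """Equation (5): uv-plane stamp around `center` times the
    (4M+1) x (4M+1) beam convolution evaluated at -u."""
    m = beam_i.shape[0] // 2                       # oversampling parameter M
    conv = convolve(beam_i, dagger(beam_j), mode="full")
    ci, cj = center
    stamp = u_plane[ci - 2 * m: ci + 2 * m + 1, cj - 2 * m: cj + 2 * m + 1]
    return np.sum(stamp * conv[::-1, ::-1])        # the flip implements the (-u) argument


def visibility_pairs(u_plane, beam_i, beam_j, center):
    """Equation (6): explicit sum over pixel m of beam i and pixel n of beam j."""
    size = beam_i.shape[0]
    ci, cj = center
    total = 0j
    for m0 in range(size):
        for m1 in range(size):
            for n0 in range(size):
                for n1 in range(size):
                    u_val = u_plane[ci + n0 - m0, cj + n1 - m1]
                    total += u_val * beam_i[m0, m1] * np.conj(beam_j[n0, n1])
    return total


rng = np.random.default_rng(0)
M = 1
pix = 2 * M + 1
shape = (6 * pix, 6 * pix)
u_plane = rng.normal(size=shape) + 1j * rng.normal(size=shape)
beam_a = rng.normal(size=(pix, pix)) + 1j * rng.normal(size=(pix, pix))
beam_b = rng.normal(size=(pix, pix)) + 1j * rng.normal(size=(pix, pix))
center = (3 * pix, 4 * pix)    # hypothetical grid index of one lattice baseline

v_conv = visibility_conv(u_plane, beam_a, beam_b, center)
v_pairs = visibility_pairs(u_plane, beam_a, beam_b, center)
```

The two numbers `v_conv` and `v_pairs` should agree to floating-point precision, which makes explicit that each visibility is trilinear in one uv-plane value and two beam pixels.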

The total number of unknowns to be determined is thus given by i) the total number of pixels required to describe all the beams, $N_a(2M+1)^2$, and ii) the pixels required to describe the uv-plane, which are given by the number of redundant baselines in the effective phased-up array, $2N_{\rm eff}(N_{\rm eff}-1) = 2N_s(2M+1)(2MN_s+N_s-1)$. The total number of measurements, on the other hand, is the total number of baselines with measured visibilities, which is given by $N_a(N_a-1)/2$. We can thus construct the redundancy factor

$$ r = \frac{\sqrt{N_a}\,(N_a - 1)}{2\left[(2M+1)\left(6M\sqrt{N_a} + 3\sqrt{N_a} - 2\right)\right]} \tag{7} $$

which describes the ratio of the number of measured quantities divided by the number of unknowns. The $M = 0$ case reduces to the standard redundant calibration scheme. Note that since we are assuming a square array, $\sqrt{N_a}$ is always an integer. When $r \geq 1$, the system is over-constrained and we can hope to solve it, subject to known degeneracies. Note that the ratio in Equation (7) is for the particular case of a square lattice. For an imperfectly filled array or a hexagonal close-packing, the ratio would be different. But note that this formalism does not even require antennas to be on a lattice. The ratio can be calculated for any array, and even an irregular but sufficiently packed array will have $r > 1$.
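As a quick cross-check of the counting argument, the ratio can be evaluated both from the individual counts and from the closed form of Equation (7). This is a small sketch with hypothetical function names; it assumes the filled square lattice described above.

```python
import math


def redundancy_ratio(n_side, m):
    """Measurements divided by unknowns for an n_side x n_side square array."""
    n_ant = n_side ** 2
    n_beam = n_ant * (2 * m + 1) ** 2            # beam pixels for all antennas
    n_eff = (2 * m + 1) * n_side                 # side of the effective phased-up array
    n_uv = 2 * n_eff * (n_eff - 1)               # unique baselines of the effective array
    n_meas = n_ant * (n_ant - 1) / 2             # measured visibilities
    return n_meas / (n_beam + n_uv)


def redundancy_ratio_closed(n_ant, m):
    """Closed form of Equation (7)."""
    root = math.sqrt(n_ant)
    return root * (n_ant - 1) / (2 * (2 * m + 1) * (6 * m * root + 3 * root - 2))


# Ratio for a 32 x 32 array (the HIRAX-like case discussed in the text)
ratios_32 = {m: redundancy_ratio(32, m) for m in range(0, 6)}
```

For the 32 × 32 case the ratio stays above 1 up to $M = 5$ and above 5 up to $M = 2$, in line with the thresholds drawn in Figure 2.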


FIG. 2. Redundancy ratio as a function of $N_a$ with varying $M$. The solid black line is $r = 1$ and the dashed black line is $r = 5$, corresponding to significant redundancy.

In Figure 2 we plot $r$ as a function of $N_a$ for a few values of $M$. We see that for the traditional redundant calibration, even very small arrays containing tens of antennas are sufficiently redundant. HIRAX, with a 32 × 32 array, can theoretically be modelled up to $M = 5$, but realistically more likely up to $M = 2$. In what follows we will focus on the $M = 1$ case, which is perhaps sufficient for current arrays such as HERA. In this exploratory work, we have not done an explicit calculation for a cylinder array configuration like that of CHIME, although that would be a straightforward generalization.

III. NUMERICAL IMPLEMENTATION

In the previous section we presented a general method for modeling the signals from an imperfect interferometer. The method contains the "resolution" parameter $M$, which controls how fine the pixels describing the response of each individual beam are. $M = 0$ corresponds to standard redundant calibration and $M \to \infty$ is a completely general description. The hope is that a relatively small $M$ will be sufficient to describe a typical level of non-redundancy. The purpose of this section is to test to what extent this is true, and our logic is to attempt to use $M = 1$ on relatively small arrays, focusing on pointing and geometry errors. The path we follow is to assume a concrete, perfectly redundant array observing a given sky and then see how the generalized redundant calibration performs compared to standard redundant calibration as we introduce beam non-redundancies.

FIG. 3. Radius response function with R = 0.4L and taper=0.05L.

FIG. 4. The beam $\mathcal{B}$ used is plotted on the left and its Fourier transform, the actual primary beam $B$, on the right. In the limit of no taper, the beam on the right would be an Airy disc pattern.

A. Model Beam and its Imperfections

Our model beam is a circularly symmetric, tapered, filled circle in Fourier space:

$$ \mathcal{B}_c(r) = 1 - \frac{1}{1 + e^{-2\frac{r - r_0}{\Delta r}}}. \tag{8} $$

Note that this is a description of the Fourier transform of the beam, rather than the beam itself, i.e. the real-space beam becomes an Airy disc in the limit of $\Delta r$ going to zero. We specify the beam in Fourier space partly because this is the input quantity we need for making predictions, but more importantly because it corresponds to the space with a compact beam representation. We use units of lattice spacing, so a radius $r_0 = 0.5L$ would correspond to dishes that just touch. In our simulation we use $r_0 = 0.4L$ and $\Delta r = 0.05L$. We show the radial profile in Figure 3 and the beam in real and Fourier space in Figure 4.

In this work we focus on the two most common types of errors encountered in redundant arrays: pointing errors (the center of the beam of the dish is misplaced) and geometry errors (the dish is not at its nominal location on the lattice). Pointing errors are displacements of the beam in real (angular) space and thus correspond to applying a phase gradient across the complex Fourier-space beam response of a single dish. Geometry errors, on the other hand, are just simple displacements of the dish in the Fourier domain. A full expression for the beam is thus given by

$$ \mathcal{B}(x, y) = A\, \mathcal{B}_c\!\left(\sqrt{(x - x_o)^2 + (y - y_o)^2}\right) e^{-i(p_x x + p_y y)}, \tag{9} $$

where $A$ corresponds to the overall complex gain factor, $(x_o, y_o)$ is the geometric error and $(p_x, p_y)$ is the pointing error. We draw the overall complex gains from a Gaussian distribution centered around $1 + 0i$ with a variance of $\sigma_g$. Geometric and pointing errors are drawn from 2D Gaussians with 1D variances of $\sigma_g$ and $\sigma_p$ respectively.
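To make Equations (8) and (9) concrete, the sketch below evaluates the model beam on a $(2M+1) \times (2M+1)$ grid. It is an illustration under stated assumptions rather than the pipeline used in the paper: the beam is sampled at pixel centres (the text does not say whether pixels are sampled or averaged), lengths are in units of $L$, the error amplitudes are passed to the random generator as standard deviations, and the names `tapered_disc` and `beam_map` are hypothetical.

```python
import numpy as np


def tapered_disc(r, r0=0.4, dr=0.05):
    """Equation (8): tapered filled circle in Fourier space (units of L)."""
    return 1.0 - 1.0 / (1.0 + np.exp(-2.0 * (r - r0) / dr))


def beam_map(m, gain=1.0 + 0.0j, offset=(0.0, 0.0), pointing=(0.0, 0.0),
             r0=0.4, dr=0.05):
    """Equation (9) sampled at the centres of a (2M+1) x (2M+1) pixel grid.

    Assumption: pixel size is L/(2M+1) and the grid is centred on the
    nominal dish position.
    """
    n_pix = 2 * m + 1
    coords = np.arange(-m, m + 1) / n_pix
    x, y = np.meshgrid(coords, coords, indexing="ij")
    radius = np.hypot(x - offset[0], y - offset[1])      # geometry error shifts the disc
    phase = np.exp(-1j * (pointing[0] * x + pointing[1] * y))  # pointing error
    return gain * tapered_disc(radius, r0, dr) * phase


# One imperfect dish with fiducial-sized errors (drawn here as standard deviations)
rng = np.random.default_rng(1)
sigma_gain, sigma_g, sigma_p = 0.01, 0.05, 0.05
gain = 1.0 + sigma_gain * (rng.normal() + 1j * rng.normal())
imperfect = beam_map(1, gain=gain,
                     offset=rng.normal(scale=sigma_g, size=2),
                     pointing=rng.normal(scale=sigma_p, size=2))
perfect = beam_map(1)
```

A pure pointing error changes only the phase of each pixel, while a geometry error changes the amplitudes near the tapered edge; at $M = 1$ both effects are captured only coarsely by the nine pixels.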

In Figure 5 we illustrate a perfect and an imperfect array with the types of errors discussed above.

B. Simulating Signal

To simulate the noiseless signal we take two approaches. In the first approach, we simply generate data using a large value of $M$. We use $M = 14$ (corresponding to beam images of 29 × 29 pixels). In this case we take the true values of the u-v plane as random Gaussian variates, corresponding to a white-noise signal on the sky. We have checked that increasing $M$ beyond the value used does not affect our results.

In this approach, the code used for generating the signal and for fitting is essentially the same, with only the value of $M$ being different. On one hand this is elegant, but it is also prone to potential bugs inadvertently canceling out between data generation and fitting. Moreover, realistic skies are often dominated by a few bright sources.

In the second approach we generate the signal as a sum over discrete sources. This is to take into account the fact that some sources are considerably brighter than others. For each source and baseline we

• Calculate the primary beam response at the location of the source for both dishes. Given pointing errors, this is different for each baseline.

• Directly calculate the response of the interferometer for the baseline length spanned by a given pair of dishes (taking into account geometry errors).

We use sources whose fluxes are randomly drawn uniformly in log from 1 to 1000.
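The two steps above amount to evaluating Equation (1) for a sky made of delta functions. The sketch below illustrates this under explicit assumptions that are not taken from the paper: the real-space primary beam is replaced by a hypothetical Gaussian stand-in (`gaussian_beam`), the lattice spacing is set to 10 wavelengths, the source positions are drawn from a Gaussian on the flat sky, and the error amplitudes are used as standard deviations.

```python
import numpy as np


def gaussian_beam(thetas, pointing, width):
    """Hypothetical real-space primary beam centred on the pointing direction."""
    offset2 = np.sum((thetas - pointing) ** 2, axis=-1)
    return np.exp(-0.5 * offset2 / width ** 2)


def point_source_visibilities(positions, pointings, fluxes, thetas, width):
    """V_ij = sum_s S_s B_i(theta_s) B_j^*(theta_s) exp(-2 pi i (x_i - x_j).theta_s)."""
    # Beam response of every dish at every source, shape (n_dish, n_src)
    response = np.array([gaussian_beam(thetas, p, width) for p in pointings])
    # Geometric phase of every dish towards every source
    phase = np.exp(-2j * np.pi * positions @ thetas.T)
    field = response * phase
    return (field * fluxes) @ field.conj().T


rng = np.random.default_rng(4)
n_side, spacing = 3, 10.0           # assumed lattice spacing L in wavelengths
sigma_g, sigma_p = 0.05 * spacing, 0.05
grid = np.array([(a, b) for a in range(n_side) for b in range(n_side)], dtype=float)
positions = spacing * grid + rng.normal(scale=sigma_g, size=grid.shape)  # geometry errors
pointings = rng.normal(scale=sigma_p, size=grid.shape)                   # pointing errors
n_src = 50
thetas = rng.normal(scale=0.1, size=(n_src, 2))          # flat-sky source positions
fluxes = 10.0 ** rng.uniform(0.0, 3.0, size=n_src)       # log-uniform between 1 and 1000
vis = point_source_visibilities(positions, pointings, fluxes, thetas, width=0.1)
```

The resulting matrix is Hermitian, so only the entries with $i < j$ are independent; the diagonal holds auto-correlations, which are not used in the fits described in this paper.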

In both approaches, once the noiseless visibilities are calculated, we multiply each visibility by the total complex gain error contribution $g_i g_j^*$, where $g_i$ is the gain error for dish $i$. Note that the method is completely general with respect to individual gains, i.e. they can be perfectly absorbed into the beam description. Therefore we keep the variance small to avoid dealing with the solver falling into a local minimum.

Finally, we add complex noise $\varepsilon$ drawn from a Gaussian with variance $\sigma^2_\varepsilon$. Since the only quantity that matters is the level of noise compared to the level of signal, we will report our results in terms of signal-to-noise ratio (SNR). We define the SNR per visibility as

$$ \mathrm{SNR}^2 = \frac{1}{N_{\rm visibilities}} \sum_{{\rm baselines}\ i-j} \left(\frac{|V^2_{i-j}|}{\sigma^2_\varepsilon}\right)^2 \tag{10} $$

There are thus 6 additional degrees of freedom per dish: 2 for pointing error, 2 for geometry error and 2 for overall gain. Even with $M = 1$, each beam is described by 9 complex numbers (i.e. 18 degrees of freedom), so there is reasonable hope that the description is sufficient. However, we deliberately chose to consider types of errors that are not perfectly reproducible by our model in order to assess its flexibility. Had we instead decided to model dish imperfections as bird droppings 1/9th the size of the dish, our model would be guaranteed to perform better.

IV. SOLVING FOR Bi AND U

In this work we use simple iterative solvers for the beams $\mathcal{B}_i$ and the true u-v plane $U$. Since we are focusing on the modelling side, i.e. how well the solutions perform, these solvers are not designed to be particularly stable or fast.

In general, we are trying to maximize the log likelihood of the model, which is equivalent to minimizing the $\chi^2$, given by

$$ \chi^2 = \sum_{\text{all pairs } i,j} \frac{\left|V^o_{ij} - V^p_{ij}(\mathcal{B}, U)\right|^2}{\sigma^2_\varepsilon}, \tag{11} $$

where $V^p_{ij}$ denotes the predicted visibilities, which are a function of all beam parameters and u-v plane values. At each step we optimize for either the visibilities or the beam parameters.

A. Visibility

To solve for the visibilities, $U$, we rely on the fact that observed visibilities are linear in the input visibilities:

$$ V^o_i = M_{ik} U_k + \varepsilon \tag{12} $$

The matrix $M$ is quite sparse, but because beams overlap, it is not a block matrix. In other words, neighbouring baselines do probe some of the same sky signals, as illustrated in Figure 1. The matrix $M$ depends on all beams, and so we assume the current best guess for the beams (which improves with every iteration). We rely on stock Python infrastructure. To calculate $(\mathcal{B}_i \circledast \mathcal{B}_j^\dagger)$ we use scipy.signal.convolve and to solve the sparse system we rely on scipy.sparse.linalg.lsqr. The rest is rather painful but otherwise straightforward housekeeping. For the interested reader, we point out some of the more subtle technical issues in Appendix A.

FIG. 5. Example of the real part of beam responses for a 4 × 4 array. What we plot is an illustration of the complex gain response of the entire array. The left image shows a perfectly redundant array, where the only calibration factor is the complex gain (which is a random quantity for each beam). The right image shows the same, but with geometric errors with $\sigma_g = 0.1L$ and pointing errors with $\sigma_p = 0.5$. Pointing errors are modeled as a phase gradient on each beam and geometry errors are modeled as circle offsets. The top images were generated using $M = 30$. The bottom two plots show the same, but in the reduced $M = 1$ resolution, which is used in the numerical tests in this paper.

There are advantages to incorporating a Wiener filter, as written in [15], via

$$ \left[S^{-1} + (M_{ik})^\dagger N^{-1} M_{ik}\right] U_k = \left((M_{ik})^\dagger N^{-1}\right) V^o_i, $$

with $S$ and $N$ the signal and noise covariances in the usual Wiener-filter notation, instead of directly jumping to a sparse least-squares solver. With this we can use scipy.sparse.linalg.spsolve to solve the system, and although there is more overhead, it is ultimately faster than directly solving Equation (12). It is also considerably better at reducing $\chi^2$ per iteration compared to directly solving the system, which makes the filter a worthwhile implementation.
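A toy version of this uv-plane step is sketched below. It is not the authors' implementation: the random sparse matrix stands in for the real $M_{ik}$, the covariances $S$ and $N$ are assumed to be proportional to the identity, and `wiener_solve` is a hypothetical helper name.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import spsolve


def wiener_solve(matrix, data, signal_var, noise_var):
    """Solve [S^-1 + M^H N^-1 M] U = M^H N^-1 V with S, N proportional to identity."""
    n_unknown = matrix.shape[1]
    m_h = matrix.conj().T                                   # Hermitian conjugate of M
    lhs = sp.identity(n_unknown, dtype=complex) / signal_var + (m_h @ matrix) / noise_var
    rhs = (m_h @ data) / noise_var
    return spsolve(sp.csc_matrix(lhs), rhs)


# Toy sparse system standing in for Equation (12)
rng = np.random.default_rng(2)
n_vis, n_uv = 60, 20
mask = rng.random((n_vis, n_uv)) < 0.2
dense = mask * (rng.normal(size=mask.shape) + 1j * rng.normal(size=mask.shape))
matrix = sp.csr_matrix(dense)
truth = rng.normal(size=n_uv) + 1j * rng.normal(size=n_uv)
data = matrix @ truth                                       # noiseless toy visibilities

weak_prior = wiener_solve(matrix, data, signal_var=1e12, noise_var=1.0)
strong_prior = wiener_solve(matrix, data, signal_var=1e-2, noise_var=1.0)
```

With a very broad signal prior the solution approaches the plain least-squares answer, while a tight prior shrinks the solution towards zero; the $S^{-1}$ term also keeps the normal equations well conditioned for uv-plane pixels that are poorly constrained by the data.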

B. Beams

While Equation (2) is nominally quadratic in the beams, this is not an issue in our actual problem, because we do not consider the auto-correlation signal. To solve for the beam $\mathcal{B}_i$, we fix all the remaining beams and the solved u-v plane $U$, so that

$$ V^o_k = M_{kl}\, \mathcal{B}_{i,l} + \varepsilon \tag{13} $$

Here the index $k$ runs over all visibilities that depend on the beam $i$ (i.e. all baselines that contain antenna $i$) and $l$ over all pixels of beam $\mathcal{B}_i$. This is a dense system that we solve using scipy.optimize.lsq_linear separately for each beam $i$.
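The beam step can be illustrated with a toy problem in which one beam is recovered from noiseless visibilities. This is a sketch under assumptions not stated in the paper: since `scipy.optimize.lsq_linear` works with real arrays, the complex system is split into real and imaginary parts; the design matrix follows the pixel-pair sum of Equation (6) with the partner beam held fixed; and the partner beams, uv-plane and `centers` are random, hypothetical test inputs.

```python
import numpy as np
from scipy.optimize import lsq_linear


def beam_design_matrix(u_plane, partner_beams, centers):
    """Row k, column l: coefficient of pixel l of beam i in visibility k."""
    size = partner_beams[0].shape[0]
    rows = []
    for beam_j, (ci, cj) in zip(partner_beams, centers):
        row = np.empty((size, size), dtype=complex)
        for m0 in range(size):
            for m1 in range(size):
                # uv-plane pixels paired with pixel (m0, m1) of beam i
                patch = u_plane[ci - m0: ci - m0 + size, cj - m1: cj - m1 + size]
                row[m0, m1] = np.sum(patch * np.conj(beam_j))
        rows.append(row.ravel())
    return np.array(rows)


def solve_beam(design, data):
    """Least-squares solve of a complex system via real/imaginary stacking."""
    stacked = np.block([[design.real, -design.imag],
                        [design.imag, design.real]])
    rhs = np.concatenate([data.real, data.imag])
    solution = lsq_linear(stacked, rhs).x
    n_pix = design.shape[1]
    return solution[:n_pix] + 1j * solution[n_pix:]


rng = np.random.default_rng(3)
pix = 3                                                    # M = 1


def complex_normal(shape):
    return rng.normal(size=shape) + 1j * rng.normal(size=shape)


u_plane = complex_normal((30, 30))
partners = [complex_normal((pix, pix)) for _ in range(15)]
centers = [(int(a), int(b)) for a, b in rng.integers(5, 25, size=(15, 2))]
true_beam = complex_normal((pix, pix))

design = beam_design_matrix(u_plane, partners, centers)
data = design @ true_beam.ravel()                          # noiseless toy visibilities
recovered = solve_beam(design, data).reshape(pix, pix)
```

With 15 baselines and 9 complex pixels the toy system is over-determined, so `recovered` should match `true_beam`; in a real array each antenna takes part in $N_a - 1$ baselines, which is what keeps this per-beam solve well constrained.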


V. PERFECT DEGENERACIES

The standard redundant calibration has perfect degeneracies spanning 4 degrees of freedom:

• Multiply the gains by a complex factor $\alpha$ and divide the sky signal by $\alpha\alpha^*$ (this is often split into the overall amplitude degeneracy and the overall phase degeneracy);

• Translate the sky by a vector $p$ and apply a commensurate phase gradient across the gain solution.

The same degeneracies continue to exist in our case. One would naively expect that we also have a similar set of per-element degeneracies; however, these are not, in general, present because neighbouring antennas actually measure many of the same points in the u-v plane, thus introducing "interlocking" of the u-v plane solutions.
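The two degeneracies of the standard scheme can be verified directly on the model of Equation 3. The sketch below uses a hypothetical 3 × 3 lattice with random gains and one random sky value per unique baseline; the variable names and the particular values of `alpha` and `shift` are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(1)
n_side = 3
positions = np.array([(a, b) for a in range(n_side) for b in range(n_side)])
n_ant = len(positions)
gains = 1 + 0.1 * (rng.normal(size=n_ant) + 1j * rng.normal(size=n_ant))

# One complex sky value U_{i-j} per unique baseline vector x_i - x_j
pairs = [(i, j) for i in range(n_ant) for j in range(i + 1, n_ant)]
baselines = {tuple(int(c) for c in positions[i] - positions[j]) for i, j in pairs}
sky = {b: rng.normal() + 1j * rng.normal() for b in sorted(baselines)}


def predict(gains, sky):
    """Noiseless Equation (3): V_ij = U_{i-j} g_i g_j^*."""
    return np.array([
        sky[tuple(int(c) for c in positions[i] - positions[j])]
        * gains[i] * np.conj(gains[j])
        for i, j in pairs
    ])


reference = predict(gains, sky)

# Degeneracy 1: rescale gains by alpha, divide the sky by alpha * conj(alpha)
alpha = 1.7 * np.exp(0.4j)
scaled = predict(alpha * gains, {b: u / abs(alpha) ** 2 for b, u in sky.items()})

# Degeneracy 2: phase gradient across the gains, compensated on the sky
shift = np.array([0.3, -0.2])
tilted = predict(gains * np.exp(1j * positions @ shift),
                 {b: u * np.exp(-1j * np.dot(b, shift)) for b, u in sky.items()})
```

Both transformed parameter sets reproduce `reference` exactly, so no amount of data can distinguish them; the complex factor accounts for two real degrees of freedom and the shift vector for the other two.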

However, we have a different kind of degeneracy present. We know that if an array is truly redundant, then one needs only $\sim 3N_s^2$ numbers to describe the data. So if the array is actually fully redundant, we are free to pick any "shape" of the beam $\mathcal{B}$ for $M > 0$ and still have sufficient freedom in the $U$ array to form a model that gives precisely the same predictions. In the other limit, if the array is really non-redundant, then this degeneracy disappears. Therefore this is not really a model degeneracy, but a degeneracy associated with perfectly redundant array solutions which are, from a mathematical perspective, pathological. This is analogous to solving a matrix equation $Mx = y$, which for a general matrix $M$ is solvable by $x = M^{-1}y$, unless the matrix $M$ is singular.

In practice, in the presence of noise the solver will instead use the extra model freedom to "fit the noise" and find a nominally better solution, as we will describe in the Results section. We have implemented one possible regularization with some success, as described in the next section. A formally correct way would be to perform a strict Bayesian model comparison, where we weight the solutions by the Bayesian evidence in favor of a certain model: if the model with $M > 0$ can fit the data equally well as the standard redundant calibration, then the standard redundant calibration would be strongly favored due to having many fewer free parameters.

VI. REGULARIZATION

To prevent overfitting, we implement regularization as follows. We introduce a prior on the beam parameters that attempts to pull the beam solution back to the fiducial, redundant beams. The total likelihood is thus given by

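$$ \log L = -\frac{1}{2}\chi^2 + \sum_{\text{beam } i}\ \sum_{\text{pixel } k} \left( -2\log\sigma_B - \frac{\left|\mathcal{B}_{ik} - \mathcal{B}^0_k\right|^2}{2\sigma_B^2} \right), \tag{14} $$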

FIG. 6. The best fit $\chi^2$ as a function of SNR per visibility (see Equation 10) for different values of the pointing error variance $\sigma_p$. Values of other parameters are held at their fiducial values described in the text. Different colors correspond to different values of $\sigma_p$ as per the legend. The solid line is for the generalized redundant calibration, while the dashed line is for the standard redundant calibration that assumes perfect redundancy. The two dashed boxes show the bounds on the 5th and 95th percentiles of the expected $\chi^2$ for both cases.

where $\chi^2$ is given by Equation 11 and $\mathcal{B}^0$ corresponds to the perfect beam. The parameter $\sigma_B$, describing the typical deviation from perfect beams, is crucially left to be determined by the data. If the array is close to truly redundant, the system can achieve a good fit by floating the beams towards the nominal beam values and lowering $\sigma_B$ (and thus achieve a large likelihood improvement through the normalization term proportional to $-\log\sigma_B$). However, for a non-redundant array, it is better to raise $\sigma_B$ and instead improve the $\chi^2$.

In terms of the numerical solution, we now iterate over three steps: i) fit for the visibilities given the data and beams; ii) fit for the beams, taking into account the visibilities, data, and $\sigma_B$; and iii) fit for $\sigma_B$, which in effect amounts to calculating the variance of the beam solutions with respect to their nominal values.
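Step iii) has a closed form if the prior term is taken exactly as written in Equation (14). The sketch below is an illustration under that assumption, with hypothetical function names and a random toy set of beams; maximizing $-2\log\sigma_B - |\Delta\mathcal{B}|^2/(2\sigma_B^2)$ summed over all pixels gives $\sigma_B^2$ equal to half the mean squared modulus of the deviations.

```python
import numpy as np


def log_prior(beams, fiducial, sigma_b):
    """Prior term of Equation (14), summed over all beams and pixels."""
    deviation2 = np.abs(beams - fiducial) ** 2
    return np.sum(-2.0 * np.log(sigma_b) - deviation2 / (2.0 * sigma_b ** 2))


def best_sigma_b(beams, fiducial):
    """sigma_B that maximizes the prior term at fixed beams."""
    deviation2 = np.abs(beams - fiducial) ** 2
    return np.sqrt(np.mean(deviation2) / 2.0)


# Toy beams: 100 antennas, 9 pixels each, scattered around a flat fiducial beam
rng = np.random.default_rng(5)
true_sigma = 0.03
fiducial = np.ones(9, dtype=complex)
scatter = true_sigma * (rng.normal(size=(100, 9)) + 1j * rng.normal(size=(100, 9)))
beams = fiducial + scatter

sigma_hat = best_sigma_b(beams, fiducial)
```

In this toy case `sigma_hat` recovers the per-component scatter used to generate the beams, which is the sense in which the step amounts to measuring the variance of the beam solutions about their nominal values.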

VII. RESULTS

In what follows we will apply our algorithm to simulated data. Our main measure of success will be the goodness of fit $\chi^2$. A poor $\chi^2$, which is too high given the number of degrees of freedom, indicates that the model is not able to fit the data; on the other hand, a $\chi^2$ which is too good indicates that the model is fitting the noise and that it is too complex for the data at hand, and we will check if the regularization scheme described above helps. We naturally expect $\chi^2$ to be monotonically increasing with SNR; in numerical practice, this was not quite the case. Instead, we simulated a single set of visibilities and started by optimizing the array at the highest SNR, and then took the high-SNR solution as the starting point in the minimizer when lowering the SNR.

FIG. 7. Same as Figure 6 but varying the geometry error variance $\sigma_g$.

FIG. 8. Same as Figures 6 and 7 but varying the side of the array $N_s$ and plotting the reduced $\chi^2$ (degrees of freedom vary with array size).

Our fiducial model has $N_s = 10$, corresponding to $N_a = 100$ antennas, with $\sigma_p = 0.05$ radians (approximately 3 degrees) and $\sigma_g = 0.05L$, which corresponds to 30 cm geometry errors for 6 m close-packed dishes. These are somewhat on the large side.

FIG. 9. $\chi^2$ minimization with a self-regularizing prior on the beam as explained in Section VI. In this test we increase both geometric and pointing errors by the same amount, as indicated by the legend. In the top plot, solid lines are with the prior, dashed without. The bottom plot is the fitted $\sigma_B$ for the prior.

Our main results are shown in Figures 6, 7 and 8, where we show the result of our minimizer when we vary $\sigma_p$, $\sigma_g$ and $N_s$ respectively. A few things are immediately apparent from these plots. First, the redundant calibration is behaving as expected. When the array is sufficiently redundant it fits the data perfectly: it has exactly the right number of degrees of freedom to explain the measurements and we recover the expected $\chi^2$. As we increase the non-redundancy by increasing either type of error, the fitting fails. With the generalized redundant calibration we see that for a truly redundant array looking at a single sky, it produces a $\chi^2$ which is too good. In other words, it fits the noise. As we increase the non-redundancy, the $\chi^2$ increases, and for sufficiently high SNR and sufficiently bad non-redundancy, this method fails as well. Nevertheless, the generalized model considerably increases the range in SNR over which a satisfactory fit to the data can be found.

When varying $N_s$ in Figure 8, we plot the reduced $\chi^2$. We see that at fixed SNR per visibility, increasing the size of the array lessens the amount of over-fitting, because the number of unknowns scales with $N_s^2$ and the total number of visibilities with $N_s^4$. We also find that higher values of $N_s$ lead to a shallower slope in reduced $\chi^2$, but note that the absolute $\chi^2$ excess is still greater for higher $N_s$ when everything else is held constant.


FIG. 10. Values of $1 - r^2$ of visibility $V_{ij}$ between solved and errorless data with varying SNR. Solid lines are results using the regularized solution. The unregularized solution looks essentially the same. Dashed lines are for standard redundant calibration. Straight dashed lines are the approximate theory expectation $(\mathrm{SNR}^2 \times N_p)^{-1/2}$, where $N_p$ is the total number of visibilities entering the calculation.

In Figure 9 we plot the results when we apply our self-regularization scheme. We see that at some level it behaves as expected: at high SNR, when the model becomes unsatisfactory, it finds a larger prior size and largely the same χ2 and solution as in the unregularized case. When SNR is low, however, it lets the value of σB drop, which should relax the beam shapes to their fiducial values and thus prevent noise fitting. While the χ2 rises, it does not reach ∼ 1 per degree of freedom, so the method remains prone to fitting the noise.

To investigate how well we are actually doing, we can compare the noiseless observed visibilities $V^o_{ij}$ (that is, the data vector with noise set to zero) with the fitted $V^p_{ij}(b, u)$ as a function of baseline length. This measures how well our fitted solution is related to the underlying truth. Of course, the better the SNR the better we will do. We find that at low SNR (for a given non-redundancy), the standard redundant calibration indeed performs better (solid lines above dashed lines). As we increase the SNR and are able to detect the non-redundancy, the standard redundant calibration becomes limited by systematic error (the inability of its model to describe the data), while the generalized redundant calibration continues to improve.
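As a rough illustration of this comparison, the sketch below computes a quantity of the form $1 - r^2$ between two sets of complex visibilities. The definition of r used here, a normalized complex cross-correlation, is an assumption made for illustration; the exact estimator behind Figure 10 is not spelled out in this section. The function name `one_minus_r2` is likewise hypothetical.

```python
def one_minus_r2(v_true, v_fit):
    """1 - r^2 for two equal-length sequences of complex visibilities.

    Assumed definition (not taken from the paper):
        r^2 = |sum conj(a) * b|^2 / (sum |a|^2 * sum |b|^2)
    """
    cross = sum(a.conjugate() * b for a, b in zip(v_true, v_fit))
    power_true = sum(a.real ** 2 + a.imag ** 2 for a in v_true)
    power_fit = sum(b.real ** 2 + b.imag ** 2 for b in v_fit)
    r2 = (cross.real ** 2 + cross.imag ** 2) / (power_true * power_fit)
    return 1.0 - r2


if __name__ == "__main__":
    truth = [1 + 2j, 3 - 1j, -2 + 0.5j]
    fitted = [1.1 + 1.9j, 2.9 - 1.2j, -2.1 + 0.4j]
    print(one_minus_r2(truth, fitted))
```

To reproduce a curve as a function of baseline length, one would apply the same function separately to the visibilities falling in each baseline-length bin.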

VIII. CONCLUSIONS

We have presented a new method for calibrating imperfect redundant arrays. The method is a derivative of redundant calibration and models each independent beam element as a phased-up array of (2M + 1) × (2M + 1) sub-elements, each with its own complex gain factor. In the limit of large M, the method is capable of modeling any array, by having complete freedom to represent the response of each dish as a pixelized (2M + 1) × (2M + 1) complex beam response. In the limit of M = 0, the method reduces to the standard redundant calibration.

We studied how well the method can perform with the smallest non-trivial value of M, which is M = 1. We examined whether the method can explain simulated data with pointing and geometric errors. The answer, of course, depends on the available signal-to-noise. In the limit of very small signal-to-noise, the method overfits the data; however, it is able to explain the data over a considerably larger range of SNRs than standard redundant calibration.

To avoid fitting the noise we have attempted a regularization scheme that models the departures from the perfect beams using a Gaussian with diagonal scatter. The magnitude of the scatter, σB, is a fitted parameter. As expected, we found that when the signal-to-noise is low and the data are sufficiently well described by the standard redundant calibration, the beam solutions relax to their priors and σB becomes essentially zero. In this limit, the system has less tendency to fit the noise, although we find that the χ2 remains too low and noise fitting remains an issue. When the signal-to-noise is sufficient to detect non-redundancy, the value of σB rises, and for a sufficiently non-redundant array the solutions approach those without regularization.
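The behaviour of a fitted prior width can be seen in a generic sketch of a Gaussian prior with diagonal scatter. This is not the scheme of Section VI, which is not reproduced here; it assumes that the real and imaginary parts of each beam coefficient are independent with standard deviation σB, and that the prior contributes a term of the form S/σB² + n ln σB² to the quantity being minimized (constants dropped). The names `prior_penalty` and `best_sigma_b` are hypothetical.

```python
import math


def squared_departure(beams, fiducial):
    """Sum of squared real and imaginary departures from the fiducial beam."""
    total = 0.0
    for b, f in zip(beams, fiducial):
        d = b - f
        total += d.real ** 2 + d.imag ** 2
    return total


def prior_penalty(beams, fiducial, sigma_b):
    """Assumed -2 ln(prior), up to a constant, for a diagonal Gaussian prior."""
    n_real = 2 * len(beams)  # real and imaginary parts counted separately
    s = squared_departure(beams, fiducial)
    return s / sigma_b ** 2 + n_real * math.log(sigma_b ** 2)


def best_sigma_b(beams, fiducial):
    """Width that minimizes prior_penalty for fixed beam coefficients."""
    n_real = 2 * len(beams)
    return math.sqrt(squared_departure(beams, fiducial) / n_real)


if __name__ == "__main__":
    fiducial = [1 + 0j, 1 + 0j]
    beams = [1.1 + 0j, 0.9 + 0.1j]
    print(best_sigma_b(beams, fiducial))
```

In this toy form the preferred width shrinks with the departures themselves, which is consistent with σB dropping toward zero when the data do not require any departure from the fiducial beams.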

In this paper we have focused on methodological aspects of this method, namely, whether the method is capable of producing good fits to the data. In practice, while this might be true, the very high dimensionality of this problem makes finding these solutions difficult. We found that, unless we start with a good approximate guess, the method is likely to fall into a local minimum. Therefore, in order to make this method practically usable, it is necessary to first find efficient minimizers. Moreover, the method currently works with a single sky snapshot and should be generalized to time-stream data. These are technical problems that we leave for the future.

While we have focused on nearly redundant arrays, our method is trivially generalizable to only partially redundant arrays in arbitrary configurations. A fixed value of M defines a grid with spacing D/(λM) in the u-v plane. Any array containing dishes (even heterogeneous ones!) whose response with respect to some arbitrary origin can be satisfactorily described on this grid in the u-v plane can in general be fit with generalized redundant calibration. Of course, this model is interesting only if the array has sufficient redundancy that the number of unknowns does not exceed the number of observed visibilities, since otherwise it is capable of trivially explaining any measurement.

One downside of this method is that it is non-trivial to connect the measured values to the underlying quantities of interest. In the standard redundant calibration, the interpretation of the fitted uv-values is clear: they are exactly the values of the true uv-plane convolved with the appropriate beam responses. In generalized redundant calibration, however, the actual beams are not made up of identical sub-beams. Therefore, it is non-trivial to precisely connect the recovered (2M + 1) × (2M + 1) pixelized beam back to actual pointing and geometric offsets.

The beauty of the proposed scheme is that it is very general, especially for large arrays that could afford to go beyond M = 1. On the other hand, if we knew that the dominant errors are pointing and geometry errors, one could design a model that fits explicitly for those. In that case, the contents of the u-v plane could be modeled by the values and spatial derivatives at the lattice points. We leave extended comparisons of these methods for the future.

ACKNOWLEDGEMENTS

The authors acknowledge useful feedback obtained during the PUMA collaboration seminar series. This work was supported in part by the U.S. Department of Energy, Office of Science, Office of Workforce Development for Teachers and Scientists (WDTS) under the Science Undergraduate Laboratory Internships Program (SULI).

Appendix A: Technical considerations in fitting

Since we are probing nearby baselines, it is possible that we probe the conjugate of a known visibility, i.e. while looking at the baseline (0, 1) we will pick up some amount of signal from the baseline (−1/3, 1). It is important that we are careful to enforce that

\begin{equation}
U_{(-\frac{1}{3},\,1)} = \left( U_{(\frac{1}{3},\,-1)} \right)^{*}. \tag{A1}
\end{equation}

This means that even if we construct a large sparse matrix M (from Equation 12) with each row corresponding to the probed baseline α = (i, j) with coefficients from Pij (the postage between convolved beams), we cannot simply solve for $\vec{U}$ using this system.
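One way to keep the constraint of Equation A1 from being violated is to store only one member of each conjugate pair and derive the other on access. The sketch below does this for u-v coordinates expressed as exact fractions of the lattice spacing. The choice of which half-plane is stored and the names `canonical` and `UVGrid` are assumptions made for illustration, not part of the paper's implementation.

```python
from fractions import Fraction


def canonical(u, v):
    """Map (u, v) to its stored representative and a conjugation flag.

    Assumed convention: store the half-plane v > 0, or v == 0 with u >= 0.
    """
    if v < 0 or (v == 0 and u < 0):
        return (-u, -v), True
    return (u, v), False


class UVGrid:
    """u-v plane values with U(-u, -v) = conj(U(u, v)) enforced by construction."""

    def __init__(self):
        self._store = {}

    def set(self, u, v, value):
        key, conj = canonical(u, v)
        self._store[key] = value.conjugate() if conj else value

    def get(self, u, v):
        key, conj = canonical(u, v)
        value = self._store[key]
        return value.conjugate() if conj else value


if __name__ == "__main__":
    grid = UVGrid()
    grid.set(Fraction(-1, 3), 1, 2 + 3j)
    # The conjugate partner from Equation A1 is available without being stored.
    print(grid.get(Fraction(1, 3), -1))
```

With this bookkeeping each independent u-v value appears once among the unknowns, and the conjugation flag tells the solver which sign pattern to use for that coefficient.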

Thus the correct way to implement Equation 12 would be to split our $\vec{U}$ into real and imaginary components and do the same with our data. Explicitly, given generic a, b ∈ C, with a = a1 + a2 i and similarly for b, we can say

\begin{align}
a \cdot b &= (a_1 b_1 - a_2 b_2) + (a_1 b_2 + a_2 b_1)\,i, \tag{A2} \\
a^{*} \cdot b &= (a_1 b_1 + a_2 b_2) + (a_1 b_2 - a_2 b_1)\,i. \tag{A3}
\end{align}

So it simply comes down to flipping the sign on a few of the coefficients of our postage Pij for some of the baselines.
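The sign flips in Equations A2 and A3 can be written as 2 × 2 real blocks acting on the real and imaginary parts of an unknown. The following minimal sketch treats the coefficient p as known and the unknown u as either entering directly or through its conjugate; the function names are hypothetical.

```python
def real_block(p, conjugate_unknown=False):
    """2x2 real block B such that B @ [Re u, Im u] = [Re z, Im z].

    z = p * u          if conjugate_unknown is False (Equation A2),
    z = p * conj(u)    if conjugate_unknown is True  (Equation A3).
    """
    p1, p2 = p.real, p.imag
    if conjugate_unknown:
        return [[p1, p2], [p2, -p1]]
    return [[p1, -p2], [p2, p1]]


def apply_block(block, u):
    """Apply a 2x2 real block to the real and imaginary parts of u."""
    u1, u2 = u.real, u.imag
    re = block[0][0] * u1 + block[0][1] * u2
    im = block[1][0] * u1 + block[1][1] * u2
    return complex(re, im)


if __name__ == "__main__":
    p, u = 2 + 3j, 1 - 4j
    print(apply_block(real_block(p), u), p * u)
    print(apply_block(real_block(p, True), u), p * u.conjugate())
```

Only the second column of the block changes sign between the two cases, so a solver can build every row in the same way and flip those entries for baselines that reference a conjugated u-v value.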

While the housekeeping for visibility solving boils down to changing signs and splitting our complex data into two parts, writing out Equation 13 to solve for the beams is a bit more involved.

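To write our 2D convolution as a matrix product we can use a blocked Toeplitz matrix. For each Bi we can write its effect in convolution as a blocked Toeplitz matrix βi, and for $B_j^{\dagger}$ we simply have $\beta_j^{\dagger}$. Then our data can be written as

\begin{align}
V_{ij} &= \left( B_j^{\dagger} \ast B_i \right) \cdot \vec{U}_\alpha \tag{A4} \\
&= \vec{U}_\alpha \cdot \left( \beta_j^{\dagger} B_i \right) \tag{A5} \\
&= \left( \vec{U}_\alpha^{T} \beta_j^{\dagger} \right) \cdot B_i \tag{A6} \\
&= M_{ij} B_i. \tag{A7}
\end{align}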

Here we have a fixed i and iterate over the other beams j. Unlike the visibility solver, there is no relation between any of the beams and thus we are able to solve each one as an independent linear system.
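The chain of equalities in Equations A4 to A7 can be checked numerically with a small sketch. It assumes a "full" 2D convolution of two square arrays, treats the dot product as a plain sum of products, and takes `K` to stand for $B_j^{\dagger}$ with any conjugation or flipping already applied by the caller. The function names are hypothetical and the pure-Python loops are chosen for clarity rather than speed.

```python
def conv2d_full(K, B):
    """Direct 2D 'full' convolution of square arrays K (k x k) and B (m x m)."""
    k, m = len(K), len(B)
    n = k + m - 1
    out = [[0j] * n for _ in range(n)]
    for a in range(k):
        for b in range(k):
            for c in range(m):
                for d in range(m):
                    out[a + c][b + d] += K[a][b] * B[c][d]
    return out


def conv_matrix(K, m):
    """Block-Toeplitz matrix T with T @ vec(B) == vec(conv2d_full(K, B))."""
    k = len(K)
    n = k + m - 1
    T = [[0j] * (m * m) for _ in range(n * n)]
    for a in range(k):
        for b in range(k):
            for c in range(m):
                for d in range(m):
                    # entry depends only on the offsets (a, b): Toeplitz structure
                    T[(a + c) * n + (b + d)][c * m + d] = K[a][b]
    return T


def design_row(U, K, m):
    """Row M = U^T T, so that the visibility equals sum_q M[q] * vec(B)[q]."""
    T = conv_matrix(K, m)
    u_flat = [x for row in U for x in row]
    return [sum(u_flat[p] * T[p][q] for p in range(len(T)))
            for q in range(m * m)]


if __name__ == "__main__":
    K = [[1, 2, 0], [0, 1j, 0], [1, 0, 1]]
    B = [[1, 0, 2], [0, 1, 0], [3, 0, 1j]]
    U = [[complex(r + 1, c) for c in range(5)] for r in range(5)]
    row = design_row(U, K, 3)
    b_flat = [x for r in B for x in r]
    print(sum(mq * bq for mq, bq in zip(row, b_flat)))
```

For a fixed beam i, stacking one such row per partner beam j (split into real and imaginary parts) gives a linear system in the coefficients of Bi alone, which is what allows each beam to be solved independently.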

[1] K. Bandura, G. E. Addison, M. Amiri, J. R. Bond, D. Campbell-Wilson, L. Connor, J.-F. Cliche, G. Davis, M. Deng, N. Denman, M. Dobbs, M. Fandino, K. Gibbs, A. Gilbert, M. Halpern, D. Hanna, A. D. Hincks, G. Hinshaw, C. Hofer, P. Klages, T. L. Landecker, K. Masui, J. Mena Parra, L. B. Newburgh, U.-l. Pen, J. B. Peterson, A. Recnik, J. R. Shaw, K. Sigurdson, M. Sitwell, G. Smecher, R. Smegal, K. Vanderlinde, and D. Wiebe, Canadian Hydrogen Intensity Mapping Experiment (CHIME) Pathfinder, Proc. SPIE Int. Soc. Opt. Eng. Society of Photo-Optical Instrumentation Engineers (SPIE) Conference Series, 9145, 22 (2014), arXiv:1406.2288 [astro-ph.IM].

[2] F. Wu, J. Zhang, Y. Wang, X. Chen, and H. Shi, Tianlai: A 21 cm Radio Telescope Array for BAO and Dark Energy, Status and Progress, in Proceedings, 51st Rencontres de Moriond, Cosmology session: La Thuile, Italy, March 19-26, 2016 (ARISF, 2016) p. 315.

[3] D. Jacobs and HERA Collaboration, Hydrogen Epoch of Reionization Array - Report on last phases of construction and first light, in American Astronomical Society Meeting Abstracts #236, American Astronomical Society Meeting Abstracts, Vol. 236 (2020) p. 308.04.

[4] C. M. Trott, C. H. Jordan, S. Midgley, N. Barry, B. Greig, B. Pindor, J. H. Cook, G. Sleap, S. J. Tingay, D. Ung, P. Hancock, A. Williams, J. Bowman, R. Byrne, A. Chokshi, B. J. Hazelton, K. Hasegawa, D. Jacobs, R. C. Joseph, W. Li, J. L. B. Line, C. Lynch, B. McKinley, D. A. Mitchell, M. F. Morales, M. Ouchi, J. C. Pober, M. Rahimi, K. Takahashi, R. B. Wayth, R. L. Webster, M. Wilensky, J. S. B. Wyithe, S. Yoshiura, Z. Zhang, and Q. Zheng, Deep multiredshift limits on Epoch of Reionization 21 cm power spectra from four seasons of Murchison Widefield Array observations, MNRAS 493, 4711 (2020), arXiv:2002.02575 [astro-ph.CO].

[5] L. B. Newburgh, K. Bandura, M. A. Bucher, T. C. Chang, H. C. Chiang, J. F. Cliche, R. Dave, M. Dobbs, C. Clarkson, K. M. Ganga, T. Gogo, A. Gumba, N. Gupta, M. Hilton, B. Johnstone, A. Karastergiou, M. Kunz, D. Lokhorst, R. Maartens, S. Macpherson, M. Mdlalose, K. Moodley, L. Ngwenya, J. M. Parra, J. Peterson, O. Recnik, B. Saliwanchik, M. G. Santos, J. L. Sievers, O. Smirnov, P. Stronkhorst, R. Taylor, K. Vanderlinde, G. Van Vuuren, A. Weltman, and A. Witzemann, HIRAX: A Probe of Dark Energy and Radio Transients, Proc. SPIE Int. Soc. Opt. Eng. Society of Photo-Optical Instrumentation Engineers (SPIE) Conference Series, 9906, 99065X (2016), arXiv:1607.02059 [astro-ph.IM].

[6] C. A. Wuensche and the BINGO Collaboration, The BINGO Telescope: A New Instrument Exploring the New 21 cm Cosmology Window, J. Phys. Conf. Ser. 1269, 012002 (2019), arXiv:1803.01644 [astro-ph.IM].

[7] K. Vanderlinde, A. Liu, B. Gaensler, D. Bond, G. Hinshaw, C. Ng, C. Chiang, I. Stairs, J.-A. Brown, J. Sievers, J. Mena, K. Smith, K. Bandura, K. Masui, K. Spekkens, L. Belostotski, M. Dobbs, N. Turok, P. Boyle, M. Rupen, T. Landecker, U.-L. Pen, and V. Kaspi, The Canadian Hydrogen Observatory and Radio-transient Detector (CHORD), in Canadian Long Range Plan for Astronomy and Astrophysics White Papers, Vol. 2020 (2019) p. 28, arXiv:1911.01777 [astro-ph.IM].

[8] E. Castorina, S. Foreman, D. Karagiannis, A. Liu, K. W. Masui, P. D. Meerburg, L. B. Newburgh, P. O’Connor, A. Obuljen, H. Padmanabhan, J. R. Shaw, A. Slosar, P. Stankus, P. T. Timbie, B. Wallisch, and M. White, Packed Ultra-wideband Mapping Array (PUMA): Astro2020 RFI Response, arXiv e-prints, arXiv:2002.05072 (2020), arXiv:2002.05072 [astro-ph.IM].

[9] A. Liu, M. Tegmark, S. Morrison, A. Lutomirski, and M. Zaldarriaga, Precision Calibration of Radio Interferometers Using Redundant Baselines, MNRAS 408, 1029 (2010), arXiv:1001.5268 [astro-ph.IM].

[10] J. S. Dillon and A. R. Parsons, Redundant Array Configurations for 21 cm Cosmology, ApJ 826, 181 (2016), arXiv:1602.06259 [astro-ph.IM].

[11] J. S. Dillon, M. Lee, Z. S. Ali, A. R. Parsons, N. Orosz, C. D. Nunhokee, P. La Plante, A. P. Beardsley, N. S. Kern, Z. Abdurashidova, J. E. Aguirre, P. Alexander, Y. Balfour, G. Bernardi, T. S. Billings, J. D. Bowman, R. F. Bradley, P. Bull, J. Burba, S. Carey, C. L. Carilli, C. Cheng, D. R. DeBoer, M. Dexter, E. de Lera Acedo, J. Ely, A. Ewall-Wice, N. Fagnoni, R. Fritz, S. R. Furlanetto, K. Gale-Sides, B. Glendenning, D. Gorthi, B. Greig, J. Grobbelaar, Z. Halday, B. J. Hazelton, J. N. Hewitt, J. Hickish, D. C. Jacobs, A. Julius, J. Kerrigan, P. Kittiwisit, S. A. Kohn, M. Kolopanis, A. Lanman, T. Lekalake, D. Lewis, A. Liu, Y.-Z. Ma, D. MacMahon, L. Malan, C. Malgas, M. Maree, Z. E. Martinot, E. Matsetela, A. Mesinger, M. Molewa, M. F. Morales, T. Mosiane, S. Murray, A. R. Neben, B. Nikolic, R. Pascua, N. Patra, S. Pieterse, J. C. Pober, N. Razavi-Ghods, J. Ringuette, J. Robnett, K. Rosie, M. G. Santos, P. Sims, C. Smith, A. Syce, M. Tegmark, N. Thyagarajan, P. K. G. Williams, and H. Zheng, Redundant-baseline calibration of the hydrogen epoch of reionization array, MNRAS 499, 5840 (2020), arXiv:2003.08399 [astro-ph.IM].

[12] S. Choudhuri, P. Bull, and H. Garsden, Patterns of primary beam non-redundancy in close-packed 21cm array observations, MNRAS 10.1093/mnras/stab1795 (2021), arXiv:2101.02684 [astro-ph.CO].

[13] J. L. Sievers, Calibration of Quasi-Redundant Interferometers, arXiv e-prints, arXiv:1701.01860 (2017), arXiv:1701.01860 [astro-ph.IM].

[14] R. Byrne, M. F. Morales, B. J. Hazelton, and M. Wilensky, A unified calibration framework for 21 cm cosmology, MNRAS 503, 2457 (2021), arXiv:2004.08463 [astro-ph.IM].

[15] M. Tegmark, How to Make Maps from Cosmic Microwave Background Data without Losing Information, ApJ 480, L87 (1997), arXiv:astro-ph/9611130 [astro-ph].
