
Deep Placental Vessel Segmentation for Fetoscopic Mosaicking

Sophia Bano1 (✉), Francisco Vasconcelos1, Luke M. Shepherd1, Emmanuel Vander Poorten2, Tom Vercauteren3, Sebastien Ourselin3, Anna L. David4, Jan Deprest5, and Danail Stoyanov1

1 Wellcome/EPSRC Centre for Interventional and Surgical Sciences (WEISS) and Department of Computer Science, University College London, London, UK
[email protected]
2 Department of Mechanical Engineering, KU Leuven University, Leuven, Belgium
3 School of Biomedical Engineering and Imaging Sciences, King's College London, London, UK
4 Fetal Medicine Unit, University College London Hospital, London, UK
5 Department of Development and Regeneration, University Hospital Leuven, Leuven, Belgium

Abstract. During fetoscopic laser photocoagulation, a treatment for twin-to-twin transfusion syndrome (TTTS), the clinician first identifies abnormal placental vascular connections and laser ablates them to regulate blood flow in both fetuses. The procedure is challenging due to the mobility of the environment, poor visibility in amniotic fluid, occasional bleeding, and limitations in the fetoscopic field-of-view and image quality. Ideally, anastomotic placental vessels would be automatically identified, segmented and registered to create expanded vessel maps to guide laser ablation; however, such methods have yet to be clinically adopted. We propose a solution utilising the U-Net architecture for performing placental vessel segmentation in fetoscopic videos. The obtained vessel probability maps provide sufficient cues for mosaicking alignment by registering consecutive vessel maps using the direct intensity-based technique. Experiments on 6 different in vivo fetoscopic videos demonstrate that the vessel intensity-based registration outperformed image intensity-based registration approaches, showing better robustness in qualitative and quantitative comparison. We additionally reduce drift accumulation to a negligible level even for sequences with up to 400 frames, and we incorporate a scheme for quantifying drift error in the absence of the ground truth. Our paper provides a benchmark for fetoscopic placental vessel segmentation and registration by contributing the first in vivo vessel segmentation and fetoscopic video dataset.

Keywords: Fetoscopy · Deep learning · Vessel segmentation · Vessel registration · Mosaicking · Twin-to-twin transfusion syndrome


1 Introduction

Twin-to-twin transfusion syndrome (TTTS) is a rare condition during pregnancy that affects the placenta shared by genetically identical twins [6]. It is caused by abnormal placental vascular anastomoses on the chorionic plate of the placenta between the twin fetuses that disproportionately allow transfusion of blood from one twin to the other. Fetoscopic laser photocoagulation is a minimally invasive procedure that uses a fetoscopic camera and a laser ablation tool. After insertion into the amniotic cavity, the surgeon uses the scope to identify the inter-twin anastomoses and then photocoagulates them to treat the TTTS. Limited field-of-view (FoV), poor visibility [17], unusual placenta position [13] and limited maneuverability of the fetoscope may hinder the photocoagulation, resulting in increased procedural time and incomplete ablation of anastomoses that leads to persistent TTTS. Automatic segmentation of placental vessels, together with mosaicking for FoV expansion and the creation of a registered placental vessel map, could provide computer-assisted intervention (CAI) support for TTTS treatment by helping to identify abnormal vessels and their ablation status.

CAI techniques in fetoscopy have concentrated efforts on visual mosaicking to create RGB maps of the placental vasculature for surgical planning and navigation [4,12,19,20,24]. Several approaches have been proposed for generating mosaics based on: (1) detection and matching of visual point features [12,20]; (2) fusion of visual tracking with electromagnetic pose sensing to cater for drifting error in ex vivo experiments [24]; (3) direct pixel-wise alignment of gradient orientations for a single in vivo fetoscopic video [19]; (4) deep learning-based homography estimation for fetoscopic videos captured from various sources [4]; and (5) detection of stable regions using R-CNN and use of these regions as features for placental image registration in an underwater phantom setting [15]. While promising results have been achieved for short video sequences [4], long-term mapping remains a significant challenge [19] due to a variety of factors that include occlusion by the fetus, non-planar views, floating amniotic fluid particles, and poor video quality and resolution. Some of these challenges can be addressed by identifying occlusion-free views in fetoscopic videos [5]. Immersion in the amniotic fluid also causes distortions due to light refraction [9,10] that are often hard to model. Moreover, ground-truth homographies are not available for in vivo fetoscopic videos, making qualitative (visual) evaluation the widely used standard for judging the quality of generated mosaics. Quantitative evaluation of registration errors in fetoscopic mosaicking has been limited to ex vivo, phantom, or synthetic experiments.

Fetoscopic mosaicking aims to densely reconstruct the surface of the placenta, but limited effort has been directed at the identification and localisation of vessels in the fetoscopic video. Placental vasculature can be pre-operatively imaged with MRI for surgical planning [2], but there is currently no integration with fetoscopic imaging that enables intra-operative CAI navigation. Moreover, there are no publicly available annotated datasets to perform extensive supervised training. Methods based on a multi-scale vessel enhancement filter [14] have been developed for segmenting vasculature structures from ex vivo high-resolution photographs of the entire placental surface [1,11]. However, such methods fail on in vivo fetoscopy [22], where captured videos have significantly poorer visibility conditions, lower resolution, and a narrower FoV.

Identifying vessels and creating an expanded vessel map can support laser ablation but remains an open problem to date. In this paper, we propose a framework for generating placental vasculature maps that expand the FoV of the fetoscopic camera by performing vessel segmentation in fetoscopic images, registration of consecutive frames, and blending of vessel prediction probability maps from multiple views into a unified mosaic image. We use the U-Net architecture for vessel segmentation since it is robust even with limited training data [21]. Comparison with the available alternative [22] confirmed the superior performance of U-Net for placental vessel segmentation. Alignment of vessel segmentations from consecutive frames is performed via direct registration of the probability maps provided by the U-Net. Finally, multiple overlapping probability maps are blended in a single reference frame. Additionally, we propose the use of quantitative metrics to evaluate the drifting error in sequential mosaicking without relying on the ground truth (GT). Such temporal evaluation is crucial when GT is not available in surgical video. Our contributions can be summarised as follows:

– A placental vessel segmentation deep learning network trained on 483 manually annotated in vivo fetoscopic images that significantly outperforms the available alternative [22], showing accurate results on 6 in vivo sequences from different patients with significant changes in appearance.

– Validation of fetoscopic image registration driven exclusively by vessel segmentation maps. We show that, when vessels are visible, this approach is more reliable than performing direct image registration. Many of the visibility challenges caused by lighting conditions and moving occlusions are filtered out, and vessels are found to have unique, recognisable shapes.

– A quantitative evaluation of drift registration error for in vivo fetoscopic sequences, by analysing the similarity between overlapping warped images and predicted segmentation maps. This measures the registration consistency after sequential registration of multiple frames.

– Contribution of the first placental vessel segmentation dataset (483 images from 6 subjects) and 6 in vivo video clips, useful for benchmarking results in this domain. Completely anonymised videos of the TTTS fetoscopic procedure were obtained from University College London Hospital. This dataset is made publicly available for research purposes at: https://www.ucl.ac.uk/interventional-surgical-sciences/fetoscopy-placenta-data.

2 Vessel Segmentation-based Mosaicking

Our proposed framework consists of a segmentation block followed by the registration block, as shown in Fig. 1. A U-Net architecture is used for obtaining the prediction maps for the vessels (Sec. 2.1). Vessel probability maps from two consecutive frames are then aligned through affine registration (Sec. 2.2). These transformations are accumulated in sequence with respect to the first frame to generate an expanded view of the placental vessels.

Fig. 1: An overview of the proposed framework, which is composed of the segmentation block followed by the direct registration block for mosaic generation.
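This sequential accumulation can be sketched as follows; it is a minimal NumPy illustration under the assumption that pairwise transforms are expressed as 3x3 homogeneous affine matrices, and the helper name is our own rather than from the paper's code:

import numpy as np

def accumulate_transforms(pairwise_affines):
    """pairwise_affines[i]: 3x3 matrix (last row [0, 0, 1]) mapping frame i+1 into frame i."""
    global_transforms = [np.eye(3)]  # frame 0 is the mosaic reference frame
    for T in pairwise_affines:
        # Compose: frame (i+1) -> frame i -> ... -> frame 0
        global_transforms.append(global_transforms[-1] @ T)
    return global_transforms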

2.1 U-Net for Placental Vessel Segmentation

U-Net [21] provides the standard architecture for semantic segmentation and continues to be the basis of many state-of-the-art segmentation models [3,18]. U-Net is a fully convolutional network which is preferred for medical image segmentation since it results in accurate segmentation even when the training dataset is relatively small. Placental vessel segmentation is treated as a pixel-wise binary classification problem. Unlike [21], which used the binary cross-entropy loss ($\mathcal{L}_{bce}$), we use the sum of the binary cross-entropy loss and the intersection over union (Jaccard) loss during training, given by:

$$\mathcal{L}(p,\hat{p}) = \mathcal{L}_{bce}(p,\hat{p}) + \mathcal{L}_{iou}(p,\hat{p}) = -\frac{\sum\left[\,p\log\hat{p} + (1-p)\log(1-\hat{p})\,\right]}{N} + \left[1 - \frac{\sum(p \cdot \hat{p}) + \delta}{\sum(p + \hat{p}) - \sum(p \cdot \hat{p}) + \delta}\right], \tag{1}$$

where $p$ is the flattened ground-truth label tensor, $\hat{p}$ is the flattened predicted label tensor, $N$ is the total number of pixels in the image, and $\delta = 10^{-5}$ is arbitrarily chosen to avoid division by zero. We empirically found that the combined loss (Eq. 1) results in improved accuracy compared to using $\mathcal{L}_{bce}$ alone. A detailed description of the U-Net can be found in [21]. The vessel probability maps from the U-Net are then used for the vessel registration.
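For clarity, a minimal PyTorch sketch of Eq. (1) is given below; the function name and the mean reduction are our own assumptions, while the $\delta$ term follows the text:

import torch
import torch.nn.functional as F

def combined_bce_iou_loss(logits, target, delta=1e-5):
    """Binary cross-entropy plus IoU (Jaccard) loss for (B, 1, H, W) tensors, target in {0, 1}."""
    prob = torch.sigmoid(logits)                     # vessel probability map
    bce = F.binary_cross_entropy(prob, target, reduction="mean")
    intersection = (prob * target).sum()
    union = (prob + target).sum() - intersection
    iou_loss = 1.0 - (intersection + delta) / (union + delta)
    return bce + iou_loss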

2.2 Vessel Map Registration and Mosaicking

Segmentations from consecutive frames are aligned via registration of the probability maps provided by the segmentation network. We perform registration on the probability maps, rather than on binary segmentation masks, as this provides smoother cost transitions for iterative optimisation frameworks, facilitating convergence. We approximate the registration between consecutive frames with affine transformations. We confirmed the observations in [19] that projective registration leads to worse results. In our in vivo datasets, camera calibration is not available and lens distortion cannot be compensated; therefore, the use of projective transformations can lead to excessive over-fitting to distorted patterns at the image edges. Existing literature [19] shows that the direct-registration approach is robust in fetoscopy, where feature-based methods fail due to lack of texture and resolution [4]. Therefore, we use a standard pyramidal Lucas-Kanade registration framework [7] that minimises the bidirectional least-squares difference, also referred to as photometric loss, between a fixed image and a warped moving image. A solution is found with the Levenberg-Marquardt iterative algorithm. Several available implementations of Levenberg-Marquardt offer the option of computing the optimisation step direction through finite differences (e.g. MATLAB's lsqnonlin implementation); however, we have observed that explicitly computing it using the Jacobian of the cost function is necessary for convergence. Given that the fetoscopic images have a circular FoV, the registration is performed with a visibility mask, and a robust error metric is applied to deal with complex overlap cases [8].
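A minimal single-scale sketch of this kind of direct affine registration is shown below, assuming two vessel probability maps of equal size as float arrays in [0, 1] and a circular FoV mask. It uses SciPy's Levenberg-Marquardt with finite-difference Jacobians for brevity, whereas, as noted above, the full method is pyramidal and relies on an explicit Jacobian; function names are illustrative, not from the authors' code:

import numpy as np
import cv2
from scipy.optimize import least_squares

def warp_affine(moving, params):
    """Warp `moving` with a 2x3 affine built from 6 parameters (identity at zero)."""
    a, b, tx, c, d, ty = params
    M = np.array([[1 + a, b, tx],
                  [c, 1 + d, ty]], dtype=np.float32)
    h, w = moving.shape
    return cv2.warpAffine(moving.astype(np.float32), M, (w, h), flags=cv2.INTER_LINEAR)

def photometric_residuals(params, fixed, moving, fov_mask):
    """Masked pixel-wise difference between the fixed map and the warped moving map."""
    warped = warp_affine(moving, params)
    return ((fixed - warped) * fov_mask).ravel()

def register_probability_maps(fixed, moving, fov_mask):
    """Estimate affine parameters aligning `moving` onto `fixed` via Levenberg-Marquardt."""
    x0 = np.zeros(6)  # start from the identity transform
    result = least_squares(photometric_residuals, x0, method="lm",
                           args=(fixed, moving, fov_mask))
    return result.x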

After multiple sequential registrations are performed, they can be blended into a single image, which not only expands the FoV of the vessel map but also filters out single-frame segmentation errors whenever multiple results overlap. This is achieved by taking the average probability over all available overlapping values for each pixel, leading to the results presented in Sec. 4.
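As a sketch of this blending step, assuming every probability map has already been warped into the common mosaic frame together with a matching validity mask (the function name is our own):

import numpy as np

def blend_probability_maps(warped_probs, warped_masks):
    """Per-pixel average of all overlapping warped vessel probability maps."""
    acc = np.zeros_like(warped_probs[0], dtype=np.float64)
    count = np.zeros_like(warped_probs[0], dtype=np.float64)
    for prob, mask in zip(warped_probs, warped_masks):
        acc += prob * mask     # accumulate probabilities where the frame is valid
        count += mask          # count how many frames cover each pixel
    return np.divide(acc, count, out=np.zeros_like(acc), where=count > 0)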

3 Experimental Setup and Evaluation Protocol

We annotated 483 sampled frames from 6 different in vivo TTTS laser ablation videos. The image appearance and quality varied in each video due to the variation in intra-operative environment among different cases, artefacts and lighting conditions, resulting in increased variability in the data (sample images are shown in Fig. 2). We first selected the non-occluded (no fetus or tool presence) frames through a separate frame-level fetoscopic event identification approach [5], since vessels are mostly visible in such frames. The videos were then down-sampled from 25 to 1 fps to avoid including very similar frames in the annotated samples, and each frame was resized to 448 × 448 pixel resolution. The number of clear-view samples varied in each in vivo video, hence the number of annotated images varied in each fold (Table 1). We use the pixel annotation tool6 for annotating and creating a binary mask of vessels in each frame. A fetal medicine specialist further verified our annotations to confirm the correctness of our labels. For direct image registration, we use continuous unannotated video clips from the 6 in vivo videos.

6 Pixel annotation tool: https://github.com/abreheret/PixelAnnotationTool
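The frame sampling and resizing step described above can be sketched as follows, assuming a 25 fps video and OpenCV; the function name and file handling are illustrative only:

import cv2

def sample_and_resize(video_path, step=25, size=(448, 448)):
    """Keep roughly one frame per second from a 25 fps video and resize to 448x448."""
    cap = cv2.VideoCapture(video_path)
    frames, idx = [], 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if idx % step == 0:
            frames.append(cv2.resize(frame, size, interpolation=cv2.INTER_AREA))
        idx += 1
    cap.release()
    return frames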


Fig. 2: Qualitative comparison of U-Net (ResNet101) with the baseline (Sadda et al. [22]) network for placental vessel segmentation.

Table 1: Six-fold cross-validation comparison of the baseline [22] with the U-Net architecture (with different backbones) for placental vessel segmentation.

Method                        Metric   Fold 1        Fold 2        Fold 3        Fold 4        Fold 5        Fold 6        Overall
No. of validation images               121           101           39            88            37            97            483
Sadda et al. [22] (Baseline)  Dice     0.61 ± 0.17   0.57 ± 0.17   0.54 ± 0.17   0.59 ± 0.17   0.56 ± 0.16   0.50 ± 0.16   0.57 ± 0.17
                              IoU      0.46 ± 0.16   0.42 ± 0.15   0.39 ± 0.16   0.43 ± 0.16   0.40 ± 0.15   0.35 ± 0.16   0.41 ± 0.16
U-Net (Vanilla) [21]          Dice     0.82 ± 0.12   0.73 ± 0.17   0.82 ± 0.09   0.70 ± 0.20   0.67 ± 0.19   0.72 ± 0.13   0.75 ± 0.15
                              IoU      0.71 ± 0.13   0.60 ± 0.18   0.70 ± 0.19   0.57 ± 0.22   0.53 ± 0.19   0.58 ± 0.15   0.62 ± 0.17
U-Net (VGG16)                 Dice     0.82 ± 0.12   0.69 ± 0.14   0.84 ± 0.08   0.73 ± 0.19   0.70 ± 0.18   0.74 ± 0.14   0.75 ± 0.14
                              IoU      0.71 ± 0.14   0.55 ± 0.16   0.73 ± 0.11   0.61 ± 0.21   0.56 ± 0.19   0.60 ± 0.16   0.63 ± 0.16
U-Net (ResNet50)              Dice     0.84 ± 0.10   0.74 ± 0.14   0.83 ± 0.09   0.74 ± 0.19   0.72 ± 0.17   0.72 ± 0.16   0.77 ± 0.14
                              IoU      0.74 ± 0.12   0.61 ± 0.16   0.73 ± 0.12   0.62 ± 0.21   0.58 ± 0.18   0.58 ± 0.17   0.65 ± 0.16
U-Net (ResNet101)             Dice     0.85 ± 0.07   0.77 ± 0.16   0.83 ± 0.08   0.75 ± 0.18   0.70 ± 0.18   0.75 ± 0.12   0.78 ± 0.13
                              IoU      0.74 ± 0.10   0.64 ± 0.17   0.72 ± 0.12   0.62 ± 0.20   0.56 ± 0.19   0.62 ± 0.15   0.66 ± 0.15

We perform 6-fold cross-validation to compare and verify the robustness of the segmentation algorithms. Mean Intersection over Union (IoU) and Dice (F1) scores are used to evaluate the segmentation performance (reported in Table 1). We experiment with the vanilla U-Net [21] and with VGG16 [23], ResNet50 [16] and ResNet101 [16] backbones (with pre-trained ImageNet weights) to search for the best performing architecture. In each training iteration, a sub-image of size 224 × 224 is cropped at random after randomly augmenting the image with rotation, horizontal or vertical flip, and illumination intensity change. This helps in increasing the data and variation during training. A learning rate of 3e-4 with the Adam optimiser and our combined loss (Eq. 1) is used. For each fold, training is performed for 1000 epochs with early stopping, and the best performing weights on the training dataset are captured and used to validate the performance on the left-out (hold-out) set of frames.
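One possible realisation of this training setup is sketched below; the use of the segmentation_models_pytorch U-Net with a ResNet101 encoder, the 90-degree rotations and the brightness-jitter range are our own assumptions, not the authors' exact configuration:

import random
import numpy as np
import torch
import segmentation_models_pytorch as smp

# U-Net with an ImageNet-pretrained ResNet101 encoder and one output channel (vessel).
model = smp.Unet(encoder_name="resnet101", encoder_weights="imagenet",
                 in_channels=3, classes=1)
optimizer = torch.optim.Adam(model.parameters(), lr=3e-4)

def augment_and_crop(image, mask, crop=224):
    """Random flips, 90-degree rotation, illumination change, then a random 224x224 crop.
    `image` is HxWx3 in [0, 1]; `mask` is HxW binary; both are NumPy arrays."""
    if random.random() < 0.5:
        image, mask = np.fliplr(image).copy(), np.fliplr(mask).copy()
    if random.random() < 0.5:
        image, mask = np.flipud(image).copy(), np.flipud(mask).copy()
    k = random.randint(0, 3)
    image, mask = np.rot90(image, k).copy(), np.rot90(mask, k).copy()
    image = np.clip(image * random.uniform(0.8, 1.2), 0.0, 1.0)  # illumination jitter
    h, w = mask.shape
    y, x = random.randint(0, h - crop), random.randint(0, w - crop)
    return image[y:y + crop, x:x + crop], mask[y:y + crop, x:x + crop]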

In order to evaluate our segmentation-driven registration approach, we compare it against standard intensity-based registration. We use the same registration pipeline (described in Sec. 2.2) for both approaches, only changing the input data. Quantification of fetoscopic mosaicking performance with in vivo data remains a difficult challenge in the absence of GT registration. We propose to indirectly characterise the drift error accumulated by consecutive registrations by aligning non-consecutive frames with overlapping FoVs and measuring their structural similarity (SSIM) and the IoU of their predicted vessel maps (results shown in Fig. 3). We use a non-overlapping sliding window of 5 frames and compute the SSIM and IoU between frame 1 and the reprojected non-consecutive frames (frames 2 to 5). We highlight that this evaluation is mostly suitable for strictly sequential registration, where non-consecutive frames can provide an unbiased measure of accumulated drift.
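This drift measure can be sketched as follows, assuming grayscale intensity images in [0, 1], vessel probability maps, and the accumulated 3x3 affine transforms mapping each frame into frame 1 of the window; helper names are illustrative, not the authors' code:

import numpy as np
import cv2
from skimage.metrics import structural_similarity

def warp_to_reference(image, T_ref_from_frame, shape):
    """Warp an image into the reference frame using the top two rows of a 3x3 affine."""
    M = T_ref_from_frame[:2, :].astype(np.float32)
    return cv2.warpAffine(image.astype(np.float32), M, (shape[1], shape[0]))

def drift_scores(images, vessel_probs, transforms_to_frame1):
    """SSIM and vessel-map IoU between frame 1 and the reprojected frames 2..5."""
    ssim_scores, iou_scores = [], []
    ref_img = images[0]
    ref_vessels = vessel_probs[0] > 0.5
    for img, prob, T in zip(images[1:], vessel_probs[1:], transforms_to_frame1[1:]):
        warped_img = warp_to_reference(img, T, ref_img.shape)
        warped_vessels = warp_to_reference(prob, T, ref_img.shape) > 0.5
        ssim_scores.append(structural_similarity(ref_img, warped_img, data_range=1.0))
        inter = np.logical_and(ref_vessels, warped_vessels).sum()
        union = np.logical_or(ref_vessels, warped_vessels).sum()
        iou_scores.append(inter / union if union > 0 else 1.0)
    return ssim_scores, iou_scores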

4 Results and Discussion

We compare the U-Net (with different backbones) against the existing baseline placental vessel segmentation network [22]. The qualitative comparison is shown in Fig. 2 and the 6-fold cross-validation comparison is reported in Table 1. Sadda et al. [22] implemented a 3-stage U-Net with 8 convolutions in each stage and designed a positive-class-weighted loss function for optimisation. Compared to the vanilla U-Net (IoU = 0.62), the performance of [22] is lower (IoU = 0.41). When experimenting with U-Net with different backbones, we found that the results are comparable, though U-Net with the ResNet101 backbone overall showed an improvement of 3% Dice (4% IoU) over the vanilla U-Net. Therefore, we selected U-Net with ResNet101 as the segmentation architecture for obtaining vessel probability maps for completely unlabelled clips from the 6 fetoscopic videos. The robustness of the obtained vessel probability maps is further verified by the generated mosaics. From Table 1, we note that the segmentation results on fold 5 are significantly lower for all methods. This is mainly because most of the vessels in this validation set were thin, and there were a few erroneous manual annotations which were rightly detected as false negatives. From Fig. 2 (video 3, blue circle), we observe that U-Net (ResNet101) even managed to detect small vessels which were originally missed by the human annotator. In video 5, some vessels around the lower right side of the image were missed due to poor illumination (indicated by the red circle).

The performance comparison between the vessel-based and image-based registration shown in Fig. 3 reports the SSIM and IoU scores for all image/prediction-map pair overlaps that are up to 5 frames apart. For each of the 6 unseen and unlabelled test video clips, we use the segmentation network trained on the frames from the remaining 5 videos to predict the vessel maps used for registration. It is worth noting that our method usually has lower SSIM values than the baseline between consecutive frames (leftmost box of the plots in Fig. 3a). This is to be expected since the baseline overfits to occlusions in the scene (amniotic fluid particles, fetus occlusions, etc.). However, when analysing the overlap 5 frames apart, our method becomes more consistent. Note from Fig. 3b (videos 3 to 6) the gradually decreasing IoU over the 5 consecutive frames for the image-based registration compared to the vessel-based registration. This effect is due to the increasing drift in the image-based registration, resulting in heavy misalignment. The qualitative comparison in Fig. 4 further supports our method's better performance.


Fig. 3: Registration performance while using vessel prediction maps (blue) and intensity images (red) as input over a frame distance of 1 to 5. (a) Structural Similarity Index (SSIM); (b) Intersection over Union (IoU). One panel per clip: Video 1 (400 frames), Video 2 (200 frames), Video 3 (50 frames), Video 4 (100 frames), Video 5 (100 frames) and Video 6 (100 frames).

Fig. 4: Image registration on video 1 (400 frames duration): (a) vessel-based registration; (b) image-based registration. First (blue) and last (red) frames are highlighted. Mosaics are generated relative to the frame closest to the centre (i.e. this frame has the same shape as the original image).

our method’s better performance. Using vessel prediction maps are more robustsince it overcomes challenges such as poor visibility and resolution that accountsfor introducing drift in the image-based registration. Figure 5 shows the qualita-tive results of the vessel-based registration for the 6 leave-one-out unlabelled invivo video clips (refer to the supplementary video for the sequential qualitativecomparison). Note that vessel-based mosaicking not only generated an increasedFoV image but also helped in improving vessel segmentation (occluded or missedvessel) results by blending several registered frames.

Fig. 5: Visualisation of the vessel maps generated from the segmentation predictions for the 6 in vivo clips. First (blue) and last (red) frames are highlighted. Refer to the supplementary video for the qualitative comparison.

Acknowledgments. This work was supported by the Wellcome/EPSRC Centre for Interventional and Surgical Sciences (WEISS) at UCL (203145Z/16/Z), EPSRC (EP/P027938/1, EP/R004080/1, NS/A000027/1), the H2020 FET (GA 863146) and Wellcome [WT101957]. Danail Stoyanov is supported by a Royal Academy of Engineering Chair in Emerging Technologies (CiET1819/2/36) and an EPSRC Early Career Research Fellowship (EP/P012841/1). Tom Vercauteren is supported by a Medtronic/Royal Academy of Engineering Research Chair [RCSRF1819/7/34].

5 Conclusion

We proposed a placental vessel segmentation-driven framework for generating chorionic plate vasculature maps from in vivo fetoscopic videos. Vessel probability maps were created by training a U-Net on 483 manually annotated placental vessel images. Direct vessel-based registration was performed using the vessel probability maps, which not only helped in minimising error due to drift but also corrected missing vessels that occurred due to partial occlusion in some frames, alongside providing a vascular map of the chorionic plate of the monochorionic placenta. The proposed framework was evaluated through both quantitative and qualitative comparison with existing methods to validate the segmentation and registration blocks. Six different in vivo video clips were used to validate the generated mosaics. Our proposed framework, along with the contributed vessel segmentation and in vivo fetoscopic video datasets, provides a benchmark for future research on this problem.


References

1. Almoussa, N., Dutra, B., Lampe, B., Getreuer, P., Wittman, T., Salafia, C., Vese, L.: Automated vasculature extraction from placenta images. In: Medical Imaging 2011: Image Processing. vol. 7962, p. 79621L. International Society for Optics and Photonics (2011)

2. Aughwane, R., Ingram, E., Johnstone, E.D., Salomon, L.J., David, A.L., Melbourne, A.: Placental MRI and its application to fetal intervention. Prenatal Diagnosis (2019)

3. Badrinarayanan, V., Kendall, A., Cipolla, R.: SegNet: A deep convolutional encoder-decoder architecture for image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence 39(12), 2481–2495 (2017)

4. Bano, S., Vasconcelos, F., Amo, M.T., Dwyer, G., Gruijthuijsen, C., Deprest, J., Ourselin, S., Vander Poorten, E., Vercauteren, T., Stoyanov, D.: Deep sequential mosaicking of fetoscopic videos. In: International Conference on Medical Image Computing and Computer-Assisted Intervention. pp. 311–319. Springer (2019)

5. Bano, S., Vasconcelos, F., Vander Poorten, E., Vercauteren, T., Ourselin, S., Deprest, J., Stoyanov, D.: FetNet: A recurrent convolutional network for occlusion identification in fetoscopic videos. International Journal of Computer Assisted Radiology and Surgery 15(5), 791–801 (2020)

6. Baschat, A., Chmait, R.H., Deprest, J., Gratacos, E., Hecher, K., Kontopoulos, E., Quintero, R., Skupski, D.W., Valsky, D.V., Ville, Y., et al.: Twin-to-twin transfusion syndrome (TTTS). Journal of Perinatal Medicine 39(2), 107–112 (2011)

7. Bouguet, J.Y.: Pyramidal implementation of the affine Lucas Kanade feature tracker: description of the algorithm. Intel Corporation 5(1-10), 4 (2001)

8. Brunet, F., Bartoli, A., Navab, N., Malgouyres, R.: Direct image registration without region of interest. In: Vision, Modeling, and Visualization. pp. 323–330 (2010)

9. Chadebecq, F., Vasconcelos, F., Dwyer, G., Lacher, R., Ourselin, S., Vercauteren, T., Stoyanov, D.: Refractive structure-from-motion through a flat refractive interface. In: Proceedings of the IEEE International Conference on Computer Vision. pp. 5315–5323 (2017)

10. Chadebecq, F., Vasconcelos, F., Lacher, R., Maneas, E., Desjardins, A., Ourselin, S., Vercauteren, T., Stoyanov, D.: Refractive two-view reconstruction for underwater 3D vision. International Journal of Computer Vision, pp. 1–17 (2019)

11. Chang, J.M., Huynh, N., Vazquez, M., Salafia, C.: Vessel enhancement with multiscale and curvilinear filter matching for placenta images. In: International Conference on Systems, Signals and Image Processing. pp. 125–128. IEEE (2013)

12. Daga, P., Chadebecq, F., Shakir, D.I., Herrera, L.C.G.P., Tella, M., Dwyer, G., David, A.L., Deprest, J., Stoyanov, D., Vercauteren, T., et al.: Real-time mosaicing of fetoscopic videos using SIFT. In: Medical Imaging 2016: Image-Guided Procedures, Robotic Interventions, and Modeling. vol. 9786, p. 97861R. International Society for Optics and Photonics (2016)

13. Deprest, J., Van Schoubroeck, D., Van Ballaer, P., Flageole, H., Van Assche, F.A., Vandenberghe, K.: Alternative technique for Nd:YAG laser coagulation in twin-to-twin transfusion syndrome with anterior placenta. Ultrasound in Obstetrics and Gynecology 11(5), 347–352 (1998)

14. Frangi, A.F., Niessen, W.J., Vincken, K.L., Viergever, M.A.: Multiscale vessel enhancement filtering. In: International Conference on Medical Image Computing and Computer-Assisted Intervention. pp. 130–137. Springer (1998)

15. Gaisser, F., Peeters, S.H., Lenseigne, B.A., Jonker, P.P., Oepkes, D.: Stable image registration for in-vivo fetoscopic panorama reconstruction. Journal of Imaging 4(1), 24 (2018)

16. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: IEEE Conference on Computer Vision and Pattern Recognition. pp. 770–778 (2016)

17. Lewi, L., Deprest, J., Hecher, K.: The vascular anastomoses in monochorionic twin pregnancies and their clinical consequences. American Journal of Obstetrics and Gynecology 208(1), 19–30 (2013)

18. Litjens, G., Kooi, T., Bejnordi, B.E., Setio, A.A.A., Ciompi, F., Ghafoorian, M., Van Der Laak, J.A., Van Ginneken, B., Sanchez, C.I.: A survey on deep learning in medical image analysis. Medical Image Analysis 42, 60–88 (2017)

19. Peter, L., Tella-Amo, M., Shakir, D.I., Attilakos, G., Wimalasundera, R., Deprest, J., Ourselin, S., Vercauteren, T.: Retrieval and registration of long-range overlapping frames for scalable mosaicking of in vivo fetoscopy. International Journal of Computer Assisted Radiology and Surgery 13(5), 713–720 (2018)

20. Reeff, M., Gerhard, F., Cattin, P., Gabor, S.: Mosaicing of endoscopic placenta images. INFORMATIK 2006 – Informatik für Menschen, Band 1 (2006)

21. Ronneberger, O., Fischer, P., Brox, T.: U-Net: Convolutional networks for biomedical image segmentation. In: International Conference on Medical Image Computing and Computer-Assisted Intervention. pp. 234–241. Springer (2015)

22. Sadda, P., Imamoglu, M., Dombrowski, M., Papademetris, X., Bahtiyar, M.O., Onofrey, J.: Deep-learned placental vessel segmentation for intraoperative video enhancement in fetoscopic surgery. International Journal of Computer Assisted Radiology and Surgery 14(2), 227–235 (2019)

23. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. In: International Conference on Learning Representations (2015)

24. Tella-Amo, M., Peter, L., Shakir, D.I., Deprest, J., Stoyanov, D., Vercauteren, T., Ourselin, S.: Pruning strategies for efficient online globally consistent mosaicking in fetoscopy. Journal of Medical Imaging 6(3), 035001 (2019)