Look Over Here: Attention-Directing Composition of Manga Elements
Ying Cao, Rynson W.H. Lau, Antoni B. Chan
SIGGRAPH 2014



Outline
• Introduction
• Overview
• Data Acquisition and Preprocessing
• Probabilistic Graphical Model
• Learning
• Interactive Composition Synthesis
• Evaluation and Results
• Discussion

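Introduction
• Goal
[Figure: an input storyboard and the resulting five-panel composition. The input gives the subjects' script lines ("1. Rabbit, I came here for gold," "2. and I'm gonna get it!" "3. I gotcha, you rabbit! I'll show you!" "You can't do this to me!" "Eureka! Gold at last!"), shot type and motion state pairs (Close-up/Fast, Long/Medium, Close-up/Medium, Big Close-up/Medium, Medium/Medium), and an inter-subject constraint (Talk).]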

• The composition of manga elements, especially subjects and balloons.
• Manga artists guide the viewer’s eyes through the page via subject and balloon placement.
• The path guiding the readers through the artwork is the underlying artist’s guiding path (AGP).
• The viewer’s eye-gaze path through the page is the actual viewer attention.

• We introduce a novel probabilistic graphical model for subject-balloon composition.
• Based on this model, we propose an approach for placing a set of subjects and their balloons on a page in response to high-level user specification, and evaluate its effectiveness through a series of visual perception studies.

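Overview
[Figure: system overview. Learn: annotation data and eye-tracking data are used to train the probabilistic graphical model, which links the artist’s guiding path (G), the composition (C) and viewer attention (A). Generate: a layout is generated from the input storyboard. Infer: the model takes the input and produces the resulting composition.]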

Data Acquisition and Preprocessing
• To train our probabilistic model, we have collected a data set comprising 80 manga pages from three different series.
[Figure: an annotated manga page (shot type, motion state, balloons, subject) alongside the eye movements of viewers.]

Probabilistic Graphical Model
• We propose a novel probabilistic graphical model that hierarchically connects the artist’s guiding path, composition and viewer attention in a probabilistic network.
• The model abstracts the artist’s guiding path (AGP) as a latent variable.

[Figure: the three layers of the model: artist’s guiding path (G), composition (C) and viewer attention (A).]
• Our proposed model consists of 6 components, representing different factors that influence the placement of elements on the page.

(1)-Model Components and Variables

• In our model, the page consists of a set of panels.

• Each panel has subjects, each of which has balloons.

• Artist’s Guiding Path (AGP)
The underlying AGP, f(t), and the actual AGP, I(t), are represented as smooth splines over the page. Each curve is described by control points sampled uniformly along the curve length, giving a control-point vector I for the actual AGP and f for the underlying AGP.
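The uniform sampling of control points along the curve length can be illustrated with a minimal sketch. It assumes the curve is available as a dense polyline; the function name `resample_by_arc_length` and the choice of five control points are hypothetical and do not come from the slides.

```python
import math

def resample_by_arc_length(points, n_samples):
    """Return n_samples points spaced uniformly along a polyline's length."""
    if n_samples < 2 or len(points) < 2:
        raise ValueError("need at least two points and two samples")
    # Cumulative arc length at each polyline vertex.
    cum = [0.0]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        cum.append(cum[-1] + math.hypot(x1 - x0, y1 - y0))
    total = cum[-1]
    samples, seg = [], 0
    for k in range(n_samples):
        target = total * k / (n_samples - 1)
        # Advance to the segment that contains the target arc length.
        while seg < len(points) - 2 and cum[seg + 1] < target:
            seg += 1
        seg_len = cum[seg + 1] - cum[seg]
        t = 0.0 if seg_len == 0 else (target - cum[seg]) / seg_len
        (x0, y0), (x1, y1) = points[seg], points[seg + 1]
        samples.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return samples

# An L-shaped path of total length 2, sampled at 5 control points.
control_points = resample_by_arc_length([(0, 0), (1, 0), (1, 1)], 5)
```

Sampling by arc length rather than by vertex index keeps the control points evenly spread even when the input polyline has unevenly spaced vertices.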

• Panel Properties and Local Composition Model
We consider both semantic (i.e., shot type and motion state) and geometric (i.e., rough shape) properties of the panels.
- shot type t: {long = 1, medium = 2, close-up = 3, big close-up = 4}
- motion state m: {slow = 1, medium = 2, fast = 3}
- geometric style g: {geometric style 1 = 1, geometric style 2 = 2, geometric style 3 = 3}
We define variables for the possible subject locations and sizes according to the local composition in the panel.

• Subject Placement
The actual placement of a subject is a mixture of its local position and an associated point on the global AGP. We denote the subject’s location and size as S.

• Balloon Placement
The placement of a balloon depends on its subject’s configuration, its size, and its reading order, as well as an associated point on the AGP. We denote the balloon’s position and size as B.

• Viewer Attention Transitions
For each panel, we define a set of binary variables U, each of which indicates whether there is a viewer transition between a pair of elements.

• The complete model is obtained by putting the six model components together.

(2)-Probability Distributions
• Each random variable x in our model is associated with a conditional probability distribution (CPD), p(x | pa(x)), which represents the probability of observing x given its parents pa(x).
• We next describe the CPDs used for each variable in our model.

• Artist’s Guiding Path (f, I).
The two coordinate components of the curve are modeled as two independent Gaussian processes, each with a squared exponential covariance function.
The actual AGP I is a noisy version of the underlying AGP f, modeled with a multivariate Gaussian distribution N(x | µ, Σ) of x with mean µ and covariance Σ.
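To make the Gaussian process description concrete, here is a small sketch that draws an underlying path from two independent Gaussian processes with a squared exponential covariance and then adds Gaussian noise to obtain an actual path. The zero mean, the kernel hyperparameters, the noise level and all function names are assumptions chosen for illustration; the slides do not give these values.

```python
import math
import random

def squared_exponential(t1, t2, variance=1.0, length_scale=0.3):
    """Squared exponential covariance k(t1, t2) (hyperparameters assumed)."""
    return variance * math.exp(-0.5 * ((t1 - t2) / length_scale) ** 2)

def cholesky(matrix):
    """Lower-triangular Cholesky factor of a symmetric positive definite matrix."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - s)
            else:
                lower[i][j] = (matrix[i][j] - s) / lower[j][j]
    return lower

def sample_gp(ts, rng, jitter=1e-6):
    """Draw one zero-mean GP sample at curve parameters ts."""
    n = len(ts)
    # Covariance matrix with a small diagonal jitter for numerical stability.
    cov = [[squared_exponential(ts[i], ts[j]) + (jitter if i == j else 0.0)
            for j in range(n)] for i in range(n)]
    lower = cholesky(cov)
    z = [rng.gauss(0.0, 1.0) for _ in range(n)]
    return [sum(lower[i][k] * z[k] for k in range(i + 1)) for i in range(n)]

rng = random.Random(0)
ts = [k / 9 for k in range(10)]
# Underlying AGP f: x and y coordinates drawn from two independent GPs.
underlying_agp = list(zip(sample_gp(ts, rng), sample_gp(ts, rng)))
# Actual AGP I: the underlying path plus independent Gaussian noise (assumed std).
noise_std = 0.05
actual_agp = [(x + rng.gauss(0.0, noise_std), y + rng.gauss(0.0, noise_std))
              for x, y in underlying_agp]
```

The squared exponential kernel makes nearby control points strongly correlated, which is what yields a smooth path.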

• Panel Properties (P).
The shot type t, motion state m and geometric style g are all discrete random variables with categorical distributions.
• Local Composition.
To describe the complexities of local foreground placement, we use a Gaussian mixture model (GMM). The local subject size is Gaussian.
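A Gaussian mixture over a subject's position inside a panel can be evaluated as in the sketch below. The two components, their weights, and the diagonal covariances are invented for illustration (positions are in normalized panel coordinates); they are not the learned parameters from the presentation.

```python
import math

def gaussian_pdf_2d(point, mean, var):
    """Density of an axis-aligned 2D Gaussian (diagonal covariance assumed)."""
    (x, y), (mx, my), (vx, vy) = point, mean, var
    norm = 2.0 * math.pi * math.sqrt(vx * vy)
    return math.exp(-0.5 * ((x - mx) ** 2 / vx + (y - my) ** 2 / vy)) / norm

def gmm_pdf(point, weights, means, variances):
    """Mixture density: weighted sum of the component densities."""
    return sum(w * gaussian_pdf_2d(point, m, v)
               for w, m, v in zip(weights, means, variances))

# Hypothetical two-component mixture: subjects placed left or right of center.
weights = (0.5, 0.5)
means = ((0.33, 0.5), (0.67, 0.5))
variances = ((0.01, 0.02), (0.01, 0.02))

density_at_mode = gmm_pdf((0.33, 0.5), weights, means, variances)
density_at_corner = gmm_pdf((0.0, 0.0), weights, means, variances)
```

A mixture is useful here because a single Gaussian could only express one preferred location, whereas several components can capture several typical placements.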

• Subjects and Balloons (S, B).
The CPD of the subject S is defined over the continuous parent variables of S. Similarly, the CPD of the balloon B is defined over the continuous parent variables of B.
- For the subject size, the distribution is defined with a weight parameter ω and a variance parameter.

• Viewer Attention Transitions (U).
The CPD of each transition variable in U is defined over its set of parent random variables, using a potential function that is a linear combination of two terms.

Learning
• The goal of the offline learning stage is to estimate the parameters θ in the CPDs of all random variables in the probabilistic model from the training set D.
• The parameters are estimated with the expectation-maximization (EM) algorithm [Bishop 2006].

BISHOP, C. 2006. Pattern Recognition and Machine Learning. Springer.
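As a reminder of how EM alternates between its two steps, the sketch below fits a two-component 1D Gaussian mixture to synthetic data. This is a textbook toy example, not the training procedure for the full manga model; the data, initialization and iteration count are all assumptions.

```python
import math
import random

def normal_pdf(x, mean, var):
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def fit_gmm_em(data, n_iter=50):
    """Fit a two-component 1D Gaussian mixture with EM."""
    weights = [0.5, 0.5]
    means = [min(data), max(data)]   # simple initialization at the data extremes
    variances = [1.0, 1.0]
    for _ in range(n_iter):
        # E-step: responsibility of each component for each data point.
        resp = []
        for x in data:
            p = [w * normal_pdf(x, m, v)
                 for w, m, v in zip(weights, means, variances)]
            total = sum(p)
            resp.append([pk / total for pk in p])
        # M-step: re-estimate the parameters from the weighted data.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            weights[k] = nk / len(data)
            means[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, data)) / nk
            variances[k] = max(var, 1e-6)  # floor to avoid a collapsed component
    return weights, means, variances

rng = random.Random(1)
data = ([rng.gauss(0.0, 0.5) for _ in range(200)]
        + [rng.gauss(5.0, 0.5) for _ in range(200)])
weights, means, variances = fit_gmm_em(data)
```

The same E-step/M-step alternation applies when some variables are latent, as the AGP is in the model described here: the E-step infers the hidden quantities and the M-step updates θ.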

Interactive Composition Synthesis
• Generate a composition, subject to user-specified semantics.
• Layout Generation + Composition Synthesis
[Figure: example input. Subjects and script ("1. Rabbit, I came here for gold," "2. and I'm gonna get it!" "3. I gotcha, you rabbit! I'll show you!"), shot type and motion state (Close-up, Fast), and an inter-subject constraint (Talk).]

(1)-Layout Generation
• We use a simple search algorithm to retrieve the best-fitting layout from our database of labeled pages.
• The search compares the i-th panel of the input with the i-th panel of each layout candidate in terms of:
- shot type
- motion state
- the number of elements
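The layout retrieval step can be pictured as a nearest-neighbor search over labeled pages. The sketch below is one possible reading: the cost function (a sum of absolute differences over shot type, motion state and element count), the `Panel` type and the layout names are hypothetical, since the slides' exact matching formula did not survive.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Panel:
    shot_type: int      # long = 1, medium = 2, close-up = 3, big close-up = 4
    motion_state: int   # slow = 1, medium = 2, fast = 3
    n_elements: int     # number of elements in the panel

def layout_cost(query, candidate):
    """Assumed mismatch cost between two panel sequences (lower is better)."""
    if len(query) != len(candidate):
        return float("inf")  # only layouts with the same panel count can match
    return sum(abs(q.shot_type - c.shot_type)
               + abs(q.motion_state - c.motion_state)
               + abs(q.n_elements - c.n_elements)
               for q, c in zip(query, candidate))

def retrieve_layout(query, database):
    """Return the name of the database layout with the lowest cost."""
    return min(database, key=lambda name: layout_cost(query, database[name]))

query = [Panel(3, 3, 2), Panel(1, 2, 3)]
database = {
    "layout_a": [Panel(3, 3, 2), Panel(1, 2, 4)],
    "layout_b": [Panel(1, 1, 1), Panel(4, 3, 1)],
    "layout_c": [Panel(3, 3, 2)],
}
best = retrieve_layout(query, database)
```

With a database of 80 pages, as in the data set described earlier, an exhaustive scan like this is cheap enough that no index is needed.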

(2)-Composition via MAP Inference
• The objective of MAP (maximum a posteriori) inference is to find the configurations of elements that maximize the posterior probability, given the input elements and semantics plus the layout (Y) and the constraints (C).
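MAP inference picks the configuration with the highest posterior, which is proportional to the product of a model term and a likelihood term. The toy sketch below shows this on a single 1D position with a grid search; the Gaussian stand-ins for both terms and the grid resolution are assumptions, and the actual inference over full compositions is far richer than this.

```python
import math

def log_gaussian(x, mean, std):
    return -0.5 * ((x - mean) / std) ** 2 - math.log(std * math.sqrt(2.0 * math.pi))

def map_estimate(candidates, log_prior, log_likelihood):
    """Return the candidate maximizing log prior + log likelihood."""
    return max(candidates, key=lambda c: log_prior(c) + log_likelihood(c))

# Candidate horizontal positions on a normalized page, in steps of 0.01.
candidates = [k / 100 for k in range(101)]
best = map_estimate(
    candidates,
    # Stand-in for what the learned model prefers (assumed).
    log_prior=lambda x: log_gaussian(x, 0.3, 0.1),
    # Stand-in for what the constraints prefer (assumed).
    log_likelihood=lambda x: log_gaussian(x, 0.5, 0.1),
)
```

Working in log space turns the product of probabilities into a sum, which avoids numerical underflow when many terms are combined.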

• Constraint-based Likelihood.
The likelihood combines four terms: an overlap term, an order term, a boundary term and a subject relation term.
- {ρi} are weights controlling the importance of the different terms.
- Our implementation uses ρ1 = ρ2 = 0.3, ρ3 = ρ4 = 0.2.
[Figure: the subject relation term, illustrated with elements labeled r_i, r_j and v_ij.]
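The weighted combination of constraint terms can be sketched as follows. Only the weight values come from the slides; the assignment of weights to terms (taken here in the order overlap, order, boundary, subject relation), the box-based overlap and boundary penalties, and the plain weighted sum are assumptions made for illustration.

```python
# Weights from the slides; the term order (overlap, order, boundary,
# subject relation) is an assumption.
RHO = (0.3, 0.3, 0.2, 0.2)

def overlap_area(a, b):
    """Intersection area of two boxes given as (x, y, width, height)."""
    w = min(a[0] + a[2], b[0] + b[2]) - max(a[0], b[0])
    h = min(a[1] + a[3], b[1] + b[3]) - max(a[1], b[1])
    return max(w, 0.0) * max(h, 0.0)

def outside_area(box, panel):
    """Area of a box that falls outside its panel (a simple boundary penalty)."""
    return box[2] * box[3] - overlap_area(box, panel)

def weighted_penalty(terms, rho=RHO):
    """Weighted sum of the four penalty terms (lower means fewer violations)."""
    return sum(r * t for r, t in zip(rho, terms))

balloon = (0.0, 0.0, 2.0, 2.0)
subject = (1.0, 1.0, 2.0, 2.0)
panel = (0.0, 0.0, 3.0, 3.0)
terms = (overlap_area(balloon, subject), 0.0, outside_area(balloon, panel), 0.0)
penalty = weighted_penalty(terms)
```

In this example the balloon overlaps the subject by one unit of area and lies fully inside the panel, so only the overlap term contributes.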

Evaluation and Results
(1)-Comparison to Heuristic Method
• Visual Perception Study.
The goal of the visual perception study is to investigate whether the participants have a strong preference for our results over those produced by the heuristic method [Chun et al. 2006].

CHUN, B., RYU, D., HWANG, W., AND CHO, H. 2006. An automated procedure for word balloon placement in cinema comics. LNCS 4292, 576–585.

• Eye-tracking experiment and analysis.
We measure the consistency in both unordered and ordered eye fixations across different viewers.
- Inlier percent [Judd et al. 2009]
- Root Mean Squared Distance (RMSD)

JUDD, T., EHINGER, K., DURAND, F., AND TORRALBA, A. 2009. Learning to predict where humans look. In ICCV’09.
[Figure: inlier percent: viewer A, saliency map, classification of viewer B, inliers. RMSD: between viewer A and viewer B.]
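A root mean squared distance between two viewers' fixation sequences can be computed as in the sketch below. It assumes the two sequences have already been brought to the same length and are compared point by point in order; how the presentation aligns or resamples the fixations is not stated, so that pairing is an assumption.

```python
import math

def rmsd(path_a, path_b):
    """Root mean squared distance between two equal-length 2D point sequences."""
    if len(path_a) != len(path_b) or not path_a:
        raise ValueError("paths must be non-empty and of equal length")
    squared = [(ax - bx) ** 2 + (ay - by) ** 2
               for (ax, ay), (bx, by) in zip(path_a, path_b)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical fixation sequences for two viewers (page coordinates).
viewer_a = [(0.0, 0.0), (0.0, 0.0)]
viewer_b = [(3.0, 4.0), (3.0, 4.0)]
distance = rmsd(viewer_a, viewer_b)
```

A lower value means the two viewers' gaze paths follow each other more closely, which is the sense of consistency measured here.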

[Figure: example compositions with eye-tracking data.]
(2)-Comparison to Manual Method
[Figure: participant preference voting, and time for one composition (chart values 140 and 45).]
(3)-Comparison to Existing Manga Pages
(4)-Recovering Artist’s Guiding Path

(5)-Limitations
• Our work has two limitations.
1. It assumes that the variations in spatial location and scale of elements are the only factors driving viewer attention.
2. For panels with more than four subjects, our approach can fail to produce satisfying results automatically.

Discussion
• We have proposed a probabilistic graphical model for representing the dependency among the artist’s guiding path, composition and viewer attention.
• We show that compositions from our approach are more visually appealing and provide a smoother reading experience than those produced by a heuristic method.
• Our approach enables easy and quick creation of attention-directing compositions.
• The approach could be extended to other graphic design tasks.
References
• manga pic http://goo.gl/O2HNXb