Learning to Generate Posters of Scientific Papers by Probabilistic Graphical Models

02/21/2017 ∙ by Yu-ting Qiang, et al. ∙ Disney Research ∙ Fudan University

Researchers often summarize their work in the form of scientific posters. Posters provide a coherent and efficient way to convey core ideas expressed in scientific papers. Generating a good scientific poster, however, is a complex and time-consuming cognitive task, since such posters need to be readable, informative, and visually aesthetic. In this paper, for the first time, we study the challenging problem of learning to generate posters from scientific papers. To this end, we propose a data-driven framework that utilizes graphical models. Specifically, given content to display, the key elements of a good poster, including attributes of each panel and arrangements of graphical elements, are learned and inferred from data. During the inference stage, a MAP inference framework is employed to incorporate some design principles. In order to bridge the gap between panel attributes and the composition within each panel, we also propose a recursive page splitting algorithm to generate the panel layout for a poster. To learn and validate our model, we collect and release a new benchmark dataset, called the NJU-Fudan Paper-Poster dataset, which consists of scientific papers and corresponding posters with exhaustively labelled panels and attributes. Qualitative and quantitative results indicate the effectiveness of our approach.


1 Introduction

The large number of scientific papers emerging in various academic fields and venues (conferences and journals) is noteworthy. For example, arXiv, a premier online scientific repository, reported an upload rate of over 9,000 papers and reports per month in 2016. Reading and digesting all of these papers is time-consuming for researchers, particularly those who want to assess the state of the art holistically or to understand just the core scientific ideas explored in the last year. Converting a scientific paper into a poster provides an important way to efficiently and coherently convey the core ideas and findings of the original paper.

To achieve this goal, it is therefore essential to keep the posters readable, informative and visually aesthetic. It is challenging, however, to design a high-quality scientific poster which meets all of the above design principles, particularly for novice researchers who may not be proficient at design tasks or familiar with design packages (e.g., Adobe Illustrator). In general, poster design is a complicated and time-consuming task; both understanding of the paper content and experience in design are required.

Automatic tools for scientific poster generation would help researchers by providing them with an easier way to effectively share their research. Further, given the vast number of scientific papers on arXiv and other online repositories, such tools may also provide a way for other researchers to consume the content more easily. Rather than browsing raw papers, they may be able to browse automatically generated poster previews (potentially constructed with their specific preferences in mind).

Page layout generation [16, 13, 24] has been popular in recent years, with the goal of generating graphical design layouts such as photo collages [8], furniture arrangements [33, 21], and comic panel layouts [4]. These works pay more attention to visual aesthetics than to informativeness and readability. On the other hand, many works study presentation layout automation [14, 18, 27], which aims at document generation. These works often focus on micro-typography problems such as line breaking and margin inference. In addition, some works take templates as input to their layout algorithms [5].

In general, in order to generate a scientific poster in accordance with, and representative of, the original paper, many problems need to be solved: (1) Content extraction. Important textual and graphical content both need to be extracted from the original paper; (2) Panel layout. Content should fit each panel, and the shape and position of panels should be optimized for readability and design appeal; (3) Graphical element (figure and table) arrangement. Within each panel, textual content can typically be presented sequentially, but for graphical elements, their size and placement should be carefully considered. Due to these challenges, to our knowledge, no automatic tools for scientific poster generation exist.

In this paper, we propose a data-driven method for automatic scientific poster generation (given a corresponding paper). Content extraction and layout generation are two key components in this process. For content extraction, we use TextRank [22] to extract textual content, and provide an interface for extraction of graphical content (e.g., figures, tables, etc.). Our approach focuses primarily on poster layout generation. We address the layout in three steps. First, we propose a probabilistic graphical model to infer panel attributes. Second, we introduce a tree structure to represent panel layout, based on which we further design a recursive algorithm to generate new layouts. Third, in order to synthesize layout within each panel, we train another probabilistic graphical model to infer the attributes of graphical elements.

Compared with manual poster design by the authors, our approach is more efficient and versatile. It can generate results that adapt to different paper sizes, aspect ratios, or styles by training our model on different datasets.

To the best of our knowledge, this paper presents the first method for generating scientific posters from the original academic papers. A preliminary version of this work appeared as a conference paper [34]. This paper extends the previous version in the following respects: (1) Enlarged dataset. We have enlarged our dataset and released it to the community (see http://www.ytqiang.com/) as a new benchmark for evaluating scientific poster generation. (2) Improved methodology. We improve our method in two ways: (a) we propose a novel loss function to evaluate the panel arrangement, which helps our algorithm find better panel layouts; (b) we refine the probabilistic graphical model framework for element composition within each panel, taking some design principles into consideration and making our approach more effective. (3) Additional experiments. We provide a more detailed performance analysis and extensive experiments to show the effectiveness of the new method.

The remainder of this paper is organized as follows. Related work is briefly reviewed in Section 2. In Section 3, we describe our dataset and preprocessing in detail. In Sections 4 and 5, we present a high-level overview and the key components of our method, respectively. Experiments and evaluation are discussed in Section 6.

2 Related Work

In this section, we review three heavily studied topics of page layout generation, i.e., general graphical design (Sec. 2.1), comic layout generation (Sec. 2.2) and presentation layout automation (Sec. 2.3), and the differences between these topics and our task of scientific poster generation.

2.1 General Graphical Design

Graphical design has been studied extensively in the computer graphics community. This involves several related, yet different topics. Geigel et al. [8] made use of a genetic algorithm [12, 9] for photo album layout, which addresses the placement of each photo in an album. Yu et al. [33] automatically synthesized furniture arrangements using a simulated annealing algorithm. In contrast, Merrell et al. [21] applied some simple design guidelines to solve a similar problem. Other graphical design problems such as interface design [7], circuit board layout [31], and graph layout [2] have also been studied. These works often present an optimization framework along with some design guidelines to synthesize and evaluate plausible layouts.

Nevertheless, these works are concerned more about graphical elements (e.g., photo, furniture), and they take visual aesthetics as the highest priority. In contrast, for scientific poster generation, textual content, original paper structure and the order of contents need to be considered to ensure the readability of a scientific poster.

2.2 Comic Layout Generation

Due to the popularity of comics, many related research topics, such as manga retargeting [20], comic episode generation [11] and manga-like rendering [30], have drawn considerable attention in the computer graphics community. In particular, several techniques have been studied to facilitate layout generation. For example, Arai et al. [1] and Pang et al. [26] studied how to automatically extract each panel from e-comics and display e-comics on different devices. In order to convert conversational videos to comics, Jing et al. [17] made use of a rule-based optimization scheme for layout generation. Cao et al. [3] presented a generative probabilistic framework to arrange input artworks into a manga page, and then used optimization techniques to refine it. Furthermore, Cao et al. [4] took text balloons and picture subjects into consideration for manga layout generation and to guide the reader's attention. However, in poster generation, one has to consider the composition of both text and graphical elements within each panel, which has not been discussed previously.

Our panel layout generation method is partly inspired by the recent work on manga layout [3]. We use a binary tree to represent the panel layout. The manga layout method, however, trains a Dirichlet distribution to sample a splitting configuration, so a different Dirichlet distribution has to be trained for each kind of instance. Instead, we propose a recursive algorithm to search for the best splitting configuration along a binary tree.

2.3 Presentation Layout Automation

The growing amount of data and information that we need to present challenges our ability to present it manually; thus, automated layout of presentations is becoming increasingly important [14]. For automated document formatting, early works, such as [18, 27], focused largely on line breaking, paragraph arrangement and other micro-typography problems. A common way to solve these problems is to cast them as a constrained optimization problem [19]. More recent works pay attention to presentation document layout. Jacobs et al. [15] presented a grid-based dynamic programming method to select a page layout template. Damera-Venkata et al. [5] made use of a Probabilistic Document Model (PDM) to facilitate document layout. By contrast, we focus on both macro-typography problems (e.g., panel layout) and micro-typography problems (e.g., deciding the size of graphical elements) in this paper. Additionally, rather than using simple design guidelines as in previous work [18, 27], we learn our layout generation model from annotated training data.

Another piece of related work is single-page graphic design [24], which made use of an energy-based model derived from design principles for graphic design layout. However, it regards text as a rectangular block rather than a text flow, which is inappropriate for scientific poster generation. Harrington et al. [10] described a measure of document aesthetics, and an aesthetics-driven layout engine is proposed in [29]. However, these approaches do not put constraints on the ordering of content, which is clearly important for scientific poster generation.

3 The NJU-Fudan Paper-Poster Dataset

Fig. 1: An example of human designed poster.

In this paper, we propose a new research topic of learning to generate posters of scientific papers. In general, a good poster for a scientific paper should follow the general design principles. One section in the paper should correspond to one panel in the poster. Each panel usually includes several bullet points and sentences that explain the corresponding bullet point, and each bullet point often corresponds to a sub-section or a paragraph in the paper. Important figures and tables in each paper section would also be included in the corresponding poster panel. Figure 1 shows such an example of human designed poster [28]. This type of scientific poster is readable, informative and visually aesthetic since it considers both the structure and key messages conveyed by the original paper, which makes it easy for readers to understand.

To further study the task of poster generation for scientific papers, we introduce the NJU-Fudan Paper-Poster dataset, which contains pairs of scientific posters and their corresponding papers. A total of 85 computer science research paper-poster pairs were collected from an online website.

We further annotate the meta information of each paper-poster pair to facilitate research on this topic. For each poster, we label both layout attributes (e.g., panel position, figure size) and content attributes (e.g., text length in each panel). In the corresponding paper, layout-related information (e.g., figure size in the original paper) is also manually labelled. We also provide an annotation tool that enables the annotation and labeling of further data. Both the dataset and the annotation tool will be released.

4 Overview

Overview. To generate a readable, informative and aesthetic poster, we mimic the rules of thumb that researchers follow when designing posters in practice. We first generate the panel layout for a scientific poster, and then arrange the textual and graphical elements within each panel. As shown in Figure 2, the overall framework has four steps, namely content extraction, panel attributes inference, panel layout generation, and composition within each panel.

Fig. 2: Overview of the proposed approach.

Problem Formulation. We formally introduce the problem of learning to generate posters of scientific papers before developing our contributions to each section. We have a set of posters $M$ and their corresponding scientific papers. Each poster $m \in M$ includes a set of panels $P_m$, and each panel $p \in P_m$ has a set of graphical elements (figures and tables) $G_p$. Each panel $p$ is characterized by six attributes:

  • text length within a panel ($l_p$);

  • text ratio ($t_p$), text length within a panel relative to text length of the whole poster, $t_p = l_p / \sum_{q \in P_m} l_q$;

  • number of graphical elements within a panel ($n_p$);

  • graphical elements ratio ($g_p$), the size of graphical elements within a panel relative to the total size of graphical elements in the poster. Note that, instead of predicting the fixed figure size in the poster, we directly use the size of the corresponding figure in the original paper;

  • panel size ($s_p$) and aspect ratio ($r_p$), $s_p = w_p \times h_p$ and $r_p = w_p / h_p$, where $w_p$ and $h_p$ denote the width and height of a panel with respect to the poster.

Each graphical element $g \in G_p$ has four attributes:

  • graphical element size ($s_g$) and aspect ratio ($r_g$), $s_g = w_g \times h_g$ and $r_g = w_g / h_g$, where $w_g$ and $h_g$ denote the width and height of a graphical element relative to the whole paper, respectively;

  • horizontal position ($h_g$); inspired by the way LaTeX Beamer lays out a poster, we arrange panel content sequentially from top to bottom, hence only the relative horizontal position needs to be considered, which is defined by a discrete variable $h_g \in \{\text{left}, \text{center}, \text{right}\}$;

  • graphical element size in poster ($u_g$), the ratio of the width of the graphical element to the width of the panel it belongs to.

To learn how to generate a poster, our goal is to determine the above attributes for each panel $p \in P_m$ and each graphical element $g \in G_p$, as well as the arrangement of the panels.
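The panel attributes listed above can be computed directly from annotated measurements. The following is a minimal sketch, assuming a hypothetical `Panel` container and field names that are not part of the released dataset; it only illustrates the definitions of the text ratio, graphical element ratio, size and aspect ratio.

```python
from dataclasses import dataclass


@dataclass
class Panel:
    """Raw measurements of one poster panel (hypothetical container)."""
    text_length: int      # l_p: amount of text in the panel
    figure_area: float    # total size of the panel's figures and tables
    n_figures: int        # n_p: number of graphical elements
    width: float          # w_p: panel width relative to the poster
    height: float         # h_p: panel height relative to the poster


def panel_attributes(panels):
    """Return one dict with t_p, n_p, g_p, s_p and r_p per panel."""
    total_text = sum(p.text_length for p in panels)
    total_fig = sum(p.figure_area for p in panels)
    rows = []
    for p in panels:
        rows.append({
            "t": p.text_length / total_text if total_text else 0.0,  # text ratio
            "n": p.n_figures,
            "g": p.figure_area / total_fig if total_fig else 0.0,    # graphics ratio
            "s": p.width * p.height,                                 # panel size
            "r": p.width / p.height,                                 # aspect ratio
        })
    return rows


# Two illustrative panels (made-up numbers).
panels = [
    Panel(text_length=600, figure_area=0.02, n_figures=1, width=0.5, height=0.4),
    Panel(text_length=400, figure_area=0.06, n_figures=2, width=0.5, height=0.6),
]
attrs = panel_attributes(panels)
```

Here the first panel holds 60% of the poster text, so its text ratio is 0.6, while the second panel holds three quarters of the total figure area.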

Intuitively, a trivial solution is to use a learning model (e.g., Support Vector Regression (SVR)) to learn how to regress these attributes, including $s_p$, $r_p$, $h_g$, and $u_g$, while regarding attributes that can be known from the corresponding scientific paper (i.e., $t_p$, $n_p$, $g_p$, $s_g$, and $r_g$) as features. However, such a solution takes those features as a whole, and thus lacks a mechanism that offers insight into the relationships between specific attributes. It may fail to meet the requirements of readability, informativeness and aesthetics. We thus propose a Bayesian network to characterize the relationships among those attributes, where the Bayesian network is trained on the paper-poster dataset we collected. Then, according to the trained Bayesian network, we can infer the layout attributes by using the likelihood-weighted sampling method.

5 Methodology

In this section, we further explain each step of our framework, as illustrated in Figure 2. In particular: (1) In Sec. 5.1, we extract the textual and graphical content from the paper. The textual content can be summarized by text summarization algorithms; the graphical content (figures and tables) usually occupies a rectangular area of the poster and is extracted through user interaction. All extracted contents are arranged sequentially. (2) In Sec. 5.2, the key attributes of the initial panels (such as panel size $s_p$ and aspect ratio $r_p$) are inferred by learning a probabilistic graphical model from the training data. (3) In Sec. 5.3, our recursive algorithm synthesizes the panel layout, further updating these key attributes to generate an informative and aesthetic panel layout. (4) Finally, in Sec. 5.4, we compose the panels, using our graphical model to synthesize the visual properties of each panel (such as the size and position of its graphical elements).

5.1 Content Extraction

Content extraction, which includes both textual content extraction and graphical content extraction, is the first step in our proposed scientific poster generation system.

For textual content, we employ a state-of-the-art text summarization algorithm to summarize the content of each section. In particular, we use TextRank [22].
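To make the summarization step concrete, here is a minimal sketch of TextRank-style sentence extraction. It is not the implementation used in our system: the sentence splitter, the tokenizer and the `ratio` parameter (the fraction of sentences kept, mirroring the per-section extraction ratio) are simplifying assumptions. Sentences are graph nodes, edges are weighted by normalized word overlap, and scores come from a weighted PageRank iteration.

```python
import math
import re


def textrank_summary(text, ratio=0.3, damping=0.85, iterations=50):
    """Return the top-ranked sentences of `text`, in their original order."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    words = [set(re.findall(r"[a-z0-9]+", s.lower())) for s in sentences]
    n = len(sentences)
    if n == 0:
        return []

    # Edge weight: shared words, normalised by the log sentence lengths.
    def similarity(a, b):
        if len(a) < 2 or len(b) < 2:
            return 0.0
        return len(a & b) / (math.log(len(a)) + math.log(len(b)))

    weight = [[similarity(words[i], words[j]) if i != j else 0.0
               for j in range(n)] for i in range(n)]
    out_sum = [sum(row) for row in weight]

    # Weighted PageRank by power iteration.
    score = [1.0] * n
    for _ in range(iterations):
        score = [
            (1 - damping) + damping * sum(
                weight[j][i] / out_sum[j] * score[j]
                for j in range(n) if out_sum[j] > 0)
            for i in range(n)
        ]

    keep = max(1, round(ratio * n))
    top = sorted(range(n), key=lambda i: score[i], reverse=True)[:keep]
    return [sentences[i] for i in sorted(top)]


example = ("Posters summarize papers. Posters summarize papers with panels. "
           "Panels hold figures and text from papers. Bananas are yellow.")
summary = textrank_summary(example, ratio=0.5)
```

A sentence that shares no words with the rest of the section receives only the baseline score and is dropped first, which is the behaviour we want when condensing a section into a few bullet points.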

For graphical content, our algorithm parses the key layout metadata (i.e., width and height) of each figure and table. To better select the most important figures and tables, we add user interaction here to rank their importance.

5.2 Panel Attributes Inference

We assume that in the poster each section should be represented by one rectangular panel, which should not only be of an appropriate size to contain the textual and graphical content of each corresponding section, but also be in a reasonable shape (aspect ratio) to maximize visually aesthetic appearance.

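To this end, we learn a Bayesian network to infer the initial size and aspect ratio of each panel. As each panel is composed of both textual description and graphical elements, we assume that panel size ($s_p$) and aspect ratio ($r_p$) are conditionally dependent on text ratio $t_p$, number of graphical elements $n_p$ and graphical element ratio $g_p$. Therefore, we define the joint probability of a set of panels $P$ as,

$$\Pr(S, R \mid T, N, G) = \prod_{p \in P} \Pr(s_p \mid t_p, n_p, g_p)\, \Pr(r_p \mid t_p, n_p, g_p) \tag{1}$$

where $S$, $R$, $T$, $N$ and $G$ denote the sets of the corresponding attributes over all panels in $P$. $\Pr(s_p \mid t_p, n_p, g_p)$ and $\Pr(r_p \mid t_p, n_p, g_p)$ are conditional probability distributions (CPDs) of $s_p$ and $r_p$ given $t_p$, $n_p$ and $g_p$. We further model them as two conditional linear Gaussian distributions:

$$\Pr(s_p \mid t_p, n_p, g_p) = \mathcal{N}\left(s_p;\ \mathbf{w}_s \cdot [t_p, n_p, g_p, 1]^{\top},\ \sigma_s^2\right) \tag{2}$$
$$\Pr(r_p \mid t_p, n_p, g_p) = \mathcal{N}\left(r_p;\ \mathbf{w}_r \cdot [t_p, n_p, g_p, 1]^{\top},\ \sigma_r^2\right) \tag{3}$$

where $t_p$, $n_p$ and $g_p$ are given by the content extraction step shown in Figure 2; $\mathbf{w}_s$ and $\mathbf{w}_r$ are parameters that weigh the influence of the various factors; and $\sigma_s^2$ and $\sigma_r^2$ are the variances. The parameters ($\mathbf{w}_s$, $\mathbf{w}_r$, $\sigma_s$ and $\sigma_r$) are estimated using the maximum likelihood estimator.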
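For a conditional linear Gaussian, the maximum likelihood estimate of the weights is the ordinary least-squares solution, and the variance estimate is the mean squared residual. The sketch below shows this in pure Python under that standard result; the function name and the toy data are illustrative assumptions, and in practice a toolbox or a linear algebra library would be used instead of the hand-written elimination.

```python
def fit_linear_gaussian(features, targets):
    """ML estimate of (w, sigma^2) for y ~ N(w . [x, 1], sigma^2)."""
    X = [list(x) + [1.0] for x in features]      # append the bias term
    d = len(X[0])
    # Normal equations (X^T X) w = X^T y, stored as an augmented matrix.
    A = [[sum(row[i] * row[j] for row in X) for j in range(d)]
         + [sum(row[i] * y for row, y in zip(X, targets))]
         for i in range(d)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(d):
        pivot = max(range(col, d), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(d):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    w = [A[i][d] / A[i][i] for i in range(d)]
    residuals = [y - sum(wi * xi for wi, xi in zip(w, row))
                 for row, y in zip(X, targets)]
    sigma2 = sum(e * e for e in residuals) / len(targets)   # ML (biased) variance
    return w, sigma2


# Toy data: (text ratio, number of figures, graphics ratio) -> panel size.
feats = [(0.1, 0, 0.0), (0.3, 1, 0.2), (0.2, 2, 0.5), (0.4, 1, 0.3), (0.25, 3, 0.1)]
sizes = [0.5 * t + 0.02 * n + 0.3 * g + 0.05 for t, n, g in feats]
w_s, var_s = fit_linear_gaussian(feats, sizes)
```

Because the toy targets are an exact linear function of the features, the fit recovers the generating weights and a variance of essentially zero; on real annotations the residual variance is what makes sampled panel sizes vary.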

Note that in order to learn from limited data, this step actually employs two assumptions: (1) $s_p$ and $r_p$ are conditionally independent; (2) the attribute sets of different panels are independent.

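Algorithm 1 Panel layout generation
Input: panels $(p_1, \dots, p_k)$ whose attributes were learned from the graphical model; rectangular page area $x$, $y$, $w$, $h$.
Output: loss and panel arrangement.
Panel Arrangement($(p_1, \dots, p_k)$, $x$, $y$, $w$, $h$):
  if $k = 1$ then
     adjust the panel to fill the whole rectangular area and return the aesthetic loss of the adjusted panel;
  else
     initialize $Loss$ to infinity;
     for each $i \in [1, k-1]$ do
        $t = \sum_{j=1}^{i} s_{p_j} / \sum_{j=1}^{k} s_{p_j}$;
        $Loss_1$ = Panel Arrangement($(p_1, \dots, p_i)$, $x$, $y$, $w$, $h \times t$);
        $Loss_2$ = Panel Arrangement($(p_{i+1}, \dots, p_k)$, $x$, $y + h \times t$, $w$, $h \times (1 - t)$);
        if the loss of this split is lower than $Loss$ then
           update $Loss$ with the loss of this split;
           record this arrangement;
        end if
        $Loss_1$ = Panel Arrangement($(p_1, \dots, p_i)$, $x$, $y$, $w \times t$, $h$);
        $Loss_2$ = Panel Arrangement($(p_{i+1}, \dots, p_k)$, $x + w \times t$, $y$, $w \times (1 - t)$, $h$);
        if the loss of this split is lower than $Loss$ then
           update $Loss$ with the loss of this split;
           record this arrangement;
        end if
     end for
  end if
  return $Loss$ and arrangement.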

We need the panels to be neither too small in size ($s_p$), nor too distorted in aspect ratio ($r_p$), to ensure a readable, informative and aesthetic poster. The two assumptions introduced here are sufficient for this task. Furthermore, the attribute values estimated in this step are just good initial values for the properties of each panel. We use the next two steps to further relax these assumptions and discuss the relationship between $s_p$ and $r_p$, as well as the relationship among different panels (Algorithm 1).

To ease exposition, we denote the set of panels as $L = \{(s_{p_1}, r_{p_1}), \dots, (s_{p_k}, r_{p_k})\}$, where $s_{p_i}$ and $r_{p_i}$ are the size and aspect ratio of the $i$-th panel $p_i$, respectively, and $|L| = k$.

Fig. 3: Panel layout and the corresponding tree structure. The tree structure describes a poster layout containing five panels. The first splitting is vertical with the splitting ratio (0.5, 0.5), which divides the whole page into two equal columns. The poster is further divided into three panels on the left and two panels on the right. For the left column, we resort to a horizontal splitting with the splitting ratio (0.33, 0.67). The larger part is further horizontally divided into two panels with the splitting ratio (0.5, 0.5). We only split the right column once, with the splitting ratio (0.5, 0.5).

5.3 Panel Layout Generation

One conventional way to design posters is to simply arrange the panels in a two- or three-column style. This scheme, although simple, makes posters designed in this way look similar. Inspired by manga layout generation [3], we propose a more vivid panel layout generation method. Specifically, we use a binary tree structure to represent the panel layout. Panel layout generation is then formulated as a process of recursively splitting a page, as illustrated and explained in Figure 3.
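As an illustration of the tree representation, the sketch below encodes the five-panel example of Fig. 3 as a binary tree and converts it into panel rectangles on a unit page. The `Node` class, the panel names A to E and the `layout` function are hypothetical names introduced for this example only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """A leaf is a panel; an inner node splits its area in two."""
    name: str = ""
    direction: Optional[str] = None   # "vertical" (columns) or "horizontal" (rows)
    ratio: float = 0.5                # share of the area given to `first`
    first: Optional["Node"] = None
    second: Optional["Node"] = None


def layout(node, x=0.0, y=0.0, w=1.0, h=1.0):
    """Return {panel name: (x, y, width, height)} for the subtree at `node`."""
    if node.direction is None:
        return {node.name: (x, y, w, h)}
    if node.direction == "vertical":      # split the width: left | right
        a = layout(node.first, x, y, w * node.ratio, h)
        b = layout(node.second, x + w * node.ratio, y, w * (1 - node.ratio), h)
    else:                                 # split the height: top / bottom
        a = layout(node.first, x, y, w, h * node.ratio)
        b = layout(node.second, x, y + h * node.ratio, w, h * (1 - node.ratio))
    return {**a, **b}


# The five-panel example of Fig. 3.
left = Node(direction="horizontal", ratio=0.33,
            first=Node("A"),
            second=Node(direction="horizontal", ratio=0.5,
                        first=Node("B"), second=Node("C")))
right = Node(direction="horizontal", ratio=0.5, first=Node("D"), second=Node("E"))
poster = Node(direction="vertical", ratio=0.5, first=left, second=right)
rects = layout(poster)
```

Each leaf ends up with a rectangle in page coordinates, and the rectangles tile the page exactly, which is why a splitting tree is a convenient search space for panel layouts.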

Conveying information is the most important goal for a scientific poster, thus we attempt to maintain the relative size for each panel during panel layout generation. This motivates the following loss for the panel shape variation,

(4)

where is the aspect ratio of a panel after optimization.

On the other hand, we also evaluate the aesthetics of the splitting configuration. In our approach, the splitting configuration is composed of several splittings. Each splitting divides a set of panels into two parts, and the splitting ratio is decided by the ratio of the total sizes of the two parts. Since symmetry is an important guideline in design, we evaluate the aesthetics of the panel layout configuration based on the symmetry of each partition. In particular, if a panel set is divided by a splitting into two subsets, then the aesthetic loss for this splitting is defined as follows:

(5)

The loss for panel shape variation (Eq. 4) and splitting configuration (Eq. 5) lead to a combined loss for the panel layout arrangement

(6)

where is the panel set after optimization and is the set of splitting steps.

In each splitting step, the combinatorial choices for splitting positions can be recursively computed and compared with respect to the loss function above. We choose the panel attributes with the lowest loss (Eq. 6). The whole algorithm is summarized in Algorithm 1.
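The recursive search can be sketched as follows. This is an illustrative sketch rather than our exact procedure: the two loss terms are assumptions standing in for Eq. 4 and Eq. 5 (absolute aspect-ratio deviation for a single panel, and `lam * |2t - 1|` as a symmetry penalty for a split with ratio `t`), and `lam` is a hypothetical weighting parameter.

```python
def arrange(panels, x, y, w, h, lam=0.1):
    """Best recursive split of `panels` into the rectangle (x, y, w, h).

    `panels` is an ordered list of (size, aspect_ratio) pairs.
    Returns (loss, rectangles), rectangles in the same order as `panels`.
    """
    if len(panels) == 1:
        # Assumed shape loss: deviation from the inferred aspect ratio.
        return abs(panels[0][1] - w / h), [(x, y, w, h)]

    total = sum(size for size, _ in panels)
    best_loss, best_rects = float("inf"), None
    for i in range(1, len(panels)):
        first, second = panels[:i], panels[i:]
        t = sum(size for size, _ in first) / total     # splitting ratio
        symmetry = lam * abs(2 * t - 1)                # assumed aesthetic term
        candidates = (
            # horizontal splitting: `first` on top of `second`
            ((x, y, w, h * t), (x, y + h * t, w, h * (1 - t))),
            # vertical splitting: `first` to the left of `second`
            ((x, y, w * t, h), (x + w * t, y, w * (1 - t), h)),
        )
        for area_a, area_b in candidates:
            loss_a, rects_a = arrange(first, *area_a, lam)
            loss_b, rects_b = arrange(second, *area_b, lam)
            loss = loss_a + loss_b + symmetry
            if loss < best_loss:
                best_loss, best_rects = loss, rects_a + rects_b
    return best_loss, best_rects


# Two square panels and one wide panel on a unit page.
loss, rects = arrange([(0.25, 1.0), (0.25, 1.0), (0.5, 2.0)], 0.0, 0.0, 1.0, 1.0)
```

In this toy case the search places the two square panels side by side in the top half and the wide panel across the bottom half, a layout in which every panel keeps its inferred aspect ratio. The search is exhaustive and therefore exponential in the number of panels, which is acceptable for the handful of panels on a typical poster.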

5.4 Composition within a Panel

Having inferred the layout of panels, we turn our attention to the composition of raw contents within each panel. Generally, each panel in a scientific poster is composed of textual and graphical content. Considering the readability of a scientific poster, each panel can be filled by these contents sequentially. However, for aesthetic consideration, the horizontal position and size of each graphical element need to be specified carefully. Therefore, we pose automated panel composition as an inference problem in a Bayesian network that incorporates some design constraints.

Designing the composition for each panel is complicated; both panel attributes and raw contents need to be considered. We aim at designing a Bayesian network to characterize how these variables interact with each other. Given the placement of each graphical element, textual contents can be filled into the panel sequentially; therefore, the composition of a panel can be defined by the horizontal position ($h_g$) and the size ($u_g$) of each graphical element. In our approach, the layout within each panel is composed by first sampling the random variable $h_g$ representing the choice of horizontal position (left, right, center), and then sampling the variable $u_g$ representing the size of a graphical element.

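In our Bayesian network, the horizontal position ($h_g$) of a graphical element relies on both the shape ($r_p$) of the panel to which the element belongs and the attributes of the element itself ($s_g$, $r_g$). For example, a portrait figure is more likely to be presented on the left or right of a landscape panel. To describe this relationship, the horizontal position of a graphical element $g$ in panel $p$ is sampled from a soft-max function,

$$\Pr(h_g = i \mid r_p, s_g, r_g) = \frac{\exp\left(\mathbf{w}_{h_i} \cdot [r_p, s_g, r_g, 1]^{\top}\right)}{\sum_{j=1}^{H} \exp\left(\mathbf{w}_{h_j} \cdot [r_p, s_g, r_g, 1]^{\top}\right)} \tag{7}$$

where $H$ is the cardinality of the value set of $h_g$; $\mathbf{w}_{h_i}$ is the $i$-th row of $W_h$.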
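Sampling the horizontal position from a soft-max distribution can be sketched as below. The weight matrix `W` (one row per position, with a trailing bias weight) and the feature tuple (panel aspect ratio, element size, element aspect ratio) are illustrative assumptions; the numbers are made up and are not learned parameters.

```python
import math
import random

POSITIONS = ("left", "center", "right")


def position_probabilities(W, features):
    """Soft-max over the rows of W applied to [features, 1]."""
    x = list(features) + [1.0]
    logits = [sum(wi * xi for wi, xi in zip(row, x)) for row in W]
    m = max(logits)                       # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_position(W, features, rng=random):
    """Draw one horizontal position according to the soft-max probabilities."""
    probs = position_probabilities(W, features)
    return rng.choices(POSITIONS, weights=probs, k=1)[0]


# Illustrative weights: rows follow POSITIONS, last column is the bias.
W = [[0.8, -0.5, -1.0, 0.0],
     [-0.6, 1.5, 0.7, 0.2],
     [0.8, -0.5, -1.0, 0.0]]
probs = position_probabilities(W, (1.6, 0.05, 0.7))
choice = sample_position(W, (1.6, 0.05, 0.7), random.Random(0))
```

With all-zero weights the three positions are equally likely; training shifts probability mass so that, for instance, narrow figures in wide panels favour the left or right position.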

The size of a graphical element ($u_g$) has to meet two requirements: on the one hand, it needs to be appropriate to fill the panel; on the other hand, it also needs to be consistent with the space the graphical element occupies in the original paper. To this end, in our model, the size of each graphical element is governed by both the attributes of its panel and its own properties. We sample the size of the graphical elements from a conditional linear Gaussian distribution,

(8)

where is the parameter to balance the influence of different factors.

For a set of graphical elements $G_p$ that belong to the same panel $p$, the probability of the sampling process described above is simply the product of the probabilities of all design choices made during the sampling process; it can be represented by the following distribution,

(9)

where and denote the assignments of horizontal position and size in panel for all graphical elements in , respectively; and represent the input attributes of .

Learning. The goal of the learning stage in this step is to estimate the parameters of our Bayesian network from training data. This can be done by maximizing the complete-data log-likelihood, since all the random variables in our model are observed. For the conditional linear Gaussian distribution (Eq. 8), with some algebraic manipulation we can compute the optimal ML estimate of its parameters in closed form:

(10)

where denote the training data. For soft-max function (Eq. 7), while there is no known closed-form ML solution, we can resort to an iterative optimization algorithm – iteratively reweighted least squares (IRLS) algorithm.

The Bayesian network described above models the relationships between different variables explicitly. However, it is also desirable to consider the relationship between panel size and content occupation. In a human-designed poster, contents usually fill each panel exactly, which makes the poster look clean and informative. Therefore, we incorporate this design principle into our Bayesian network, and our goal is to find the solution that maximizes the following objective:

(11)

In the equation above, the first term, defined in Eq. 9, is a likelihood that measures how well the solution fits our Bayesian network. The second term measures how well the contents fit the panel size; it assigns high probability if the contents fill the panel precisely and lower probability for deviations from the ideal.

Since the exact MAP inference is not tractable in our model, we perform approximate inference by using likelihood-weighted sampling method [23].

6 Experimental Results

6.1 Experimental Setup

NJU-Fudan Paper-Poster Dataset. Our dataset includes well-designed pairs of scientific papers and their corresponding posters, which are selected from the publicly available pairs we collected. These papers are all on computer science topics, and their posters have relatively similar design styles. We further annotate panel attributes, such as panel width and panel height. The annotated metadata is saved in an XML file.

Implementation details. The input content to our scientific poster generation approach is also specified in an XML file. This file specifies the structure and contents of a scientific paper, including chapters, sections, paragraphs and graphical elements. Other attributes, such as captions and keywords, are also saved in the corresponding content block. Note that equations and formulas are treated as normal text since they can be written in LaTeX format. For graphical elements, we only save the width and height in the XML file. In our experiments, sections and sub-sections correspond to panels and bullets, respectively. We use TextRank to extract textual content from the XML file. To give different sections different importance, we can set a different extraction ratio for each of them; important sections then generate more content and hence occupy bigger panels. For simplicity, this paper uses equal importance weights for all sections. The Bayesian Network Toolbox (BNT) [23] is used for parameter estimation and sampling. For graphical element attributes inference, we generate samples by the likelihood-weighted sampling method [6] for Eq. 11. With the inferred metadata, the final poster is generated in LaTeX Beamerposter format with the Lankton theme. We will release all code upon paper acceptance.
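To give an idea of what such an input file might look like, the sketch below parses a small XML description of a paper into per-section text and figure sizes. The element and attribute names (`paper`, `section`, `paragraph`, `figure`, `width`, `height`) are hypothetical and chosen only for this example; they are not the schema of our released files.

```python
import xml.etree.ElementTree as ET

# Hypothetical input format, for illustration only.
EXAMPLE = """
<paper title="An example paper">
  <section title="Introduction">
    <paragraph>Posters summarize papers.</paragraph>
    <figure width="0.45" height="0.30" caption="Overview"/>
  </section>
  <section title="Method">
    <paragraph>We split the page recursively.</paragraph>
    <paragraph>Each panel holds one section.</paragraph>
  </section>
</paper>
"""


def load_sections(xml_text):
    """Return one dict per section with its title, text and figure sizes."""
    root = ET.fromstring(xml_text.strip())
    sections = []
    for sec in root.iter("section"):
        text = " ".join((p.text or "").strip() for p in sec.iter("paragraph"))
        figures = [(float(f.get("width")), float(f.get("height")))
                   for f in sec.iter("figure")]
        sections.append({"title": sec.get("title"), "text": text, "figures": figures})
    return sections


sections = load_sections(EXAMPLE)
```

Each resulting section then maps to one panel: its text is passed to the summarizer, and its figure sizes provide the graphical element attributes.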

Competitors and evaluation metrics

We compare against several baselines on different parts of our model to evaluate the methods of attribute inference. In particular, we compare with ridge regression, regression tree, and support vector regression (SVR) with linear and RBF kernels. For the inference of graphical element position ($h_g$), which we regard as a classification problem, we compare the performance of our method with nearest neighbors classification (KNN), decision tree, and support vector classification (SVC) with linear and RBF kernels. We employ the corresponding values from the original (human-designed) posters as the ground truth. We split the dataset into 80 pairs for training and validation, and the remaining 5 pairs for testing.

Comparing with human designed posters.

We then evaluate how well our approach facilitates scientific poster generation, compared with novice designers and the original posters (designed by the authors). We invited three second-year PhD students, who were not familiar with our project, to design posters by hand for the test set. These three students work in computer vision and machine learning and have not yet published any papers on these topics; hence they are novices in research. Given the test set papers, we asked the students to work together and design a poster for each paper.

Running time. Our framework is very efficient in terms of running cost. Our experiments were done on a PC with an Intel Xeon GHz CPU and GB RAM. Tab. I shows the average time needed for each step. The total running time is significantly less than the time people require to design a good poster; it is also less than the time the three novices spent making the posters in Sec. 6.2.

Stage                        Step    Average time
Text extraction              -       9.2362 s
Panel attributes inference   learn   0.33 s
                             infer   0.004 s
Panel layout generation      -       0.001 s
Composition within panel     learn   0.57 s
                             infer   0.913 s
TABLE I: Running time of each step.

6.2 Quantitative Evaluation

Effectiveness of attribute inferences. To validate the effectiveness of this step, our model is compared against several state-of-the-art regression methods, including ridge regression, regression tree, linear SVR and RBF-SVR.

The results are shown in Table II. We use the panel attributes of the original posters as the ground truth and compute the Root-Mean-Square Error (RMSE) of the inferred size and aspect ratio of each panel:

$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(s_i - \hat{s}_i\right)^2} \qquad (12)$$

where $s_i$ represents the size of the $i$-th panel in the original posters, $\hat{s}_i$ represents the panel size inferred by the learned model, and $N$ indicates the total number of panels of all the posters. Eq. (12) uses the panel size as an example; the RMSE for the panel aspect ratio and the graphical element size can be calculated in the same way.
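As a minimal sketch of the RMSE computation described above, the following Python snippet compares ground-truth and inferred values over all panels. The function name `rmse` and the sample panel sizes are illustrative assumptions, not values from the paper.

```python
import math


def rmse(original, inferred):
    """Root-mean-square error between ground-truth and inferred values."""
    if len(original) != len(inferred) or not original:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(o - p) ** 2 for o, p in zip(original, inferred)]
    return math.sqrt(sum(squared_errors) / len(original))


# Hypothetical relative panel sizes for four panels (illustrative only).
original_sizes = [0.30, 0.20, 0.25, 0.25]
inferred_sizes = [0.25, 0.25, 0.25, 0.25]
print(round(rmse(original_sizes, inferred_sizes), 4))
```

The same function applies unchanged to aspect ratios or graphical element sizes, since only the two lists of values differ.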

To infer the panel size and aspect ratio, we use two attributes of each panel as features. The RMSE of our method is 0.071 and 0.695, respectively, which is lower than that of all the other methods (Table II). This shows that our algorithm estimates the panel attributes better than the other methods, owing to our probabilistic graphical formulation, which effectively models the correlations and dependencies among variables.

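For the graphical element size and horizontal position, we use several attributes as features and again compare our model against all the other methods. RMSE is used to evaluate each method on the size, and accuracy is used to evaluate each method on the horizontal position. The accuracy is computed as

$$\mathrm{Accuracy} = \frac{1}{M}\sum_{j=1}^{M}\mathbb{1}\left[h_j = \hat{h}_j\right] \qquad (13)$$

where $h_j$ represents the horizontal position of the $j$-th graphical element in the original posters, $\hat{h}_j$ represents the horizontal position inferred by the learned model, $M$ is the number of graphical elements evaluated, and $\mathbb{1}[\cdot]$ equals 1 when its argument holds and 0 otherwise. As shown in Table II and Table III, our results beat those of all the other methods, since design constraints are introduced in the inference stage by Eq. (11).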
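The accuracy measure for horizontal position can be sketched in a few lines of Python. The label names (`"left"`, `"center"`, `"right"`) and the count of 18 test elements are assumptions made only for illustration; the count was chosen because 16 correct out of 18 happens to reproduce a value of 88.9%.

```python
def accuracy(original, inferred):
    """Fraction of items whose inferred label equals the ground-truth label."""
    if len(original) != len(inferred) or not original:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(1 for o, p in zip(original, inferred) if o == p)
    return correct / len(original)


# Hypothetical ground-truth positions for 18 graphical elements.
original_positions = ["left"] * 6 + ["center"] * 6 + ["right"] * 6
# Hypothetical predictions: copy the ground truth, then corrupt two entries.
inferred_positions = list(original_positions)
inferred_positions[0] = "center"
inferred_positions[17] = "left"

print(f"{accuracy(original_positions, inferred_positions):.1%}")  # prints 88.9%
```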

| Attributes | Our Method | Ridge Regression | Regression Tree | Linear-SVR | RBF-SVR |
| --- | --- | --- | --- | --- | --- |
| panel size | 0.071 | 0.075 | 0.090 | 0.073 | 0.120 |
| panel aspect ratio | 0.695 | 0.696 | 0.819 | 0.702 | 0.737 |
| graphical element size | 0.0144 | 0.289 | 0.287 | 0.361 | 1.041 |

TABLE II: Comparison of the methods on panel size, panel aspect ratio and graphical element size, with RMSE as the metric. Note that here we only consider the relative size of each panel, which is normalized. Lower values indicate better performance.

| Attributes | Our Method | KNN | Decision Tree | Linear-SVC | RBF-SVC |
| --- | --- | --- | --- | --- | --- |
| horizontal position | 88.9% | 66.7% | 66.7% | 72.2% | 72.2% |

TABLE III: The accuracy of predicting horizontal position. Higher values indicate better performance.

| Metric | Readability | Informativeness | Aesthetics | Avg. |
| --- | --- | --- | --- | --- |
| Our method | 7.32 | 7.08 | 6.70 | 7.03 |
| Posters by novices | 6.82 | 6.80 | 6.58 | 6.73 |
| Original posters | 7.36 | 7.10 | 7.44 | 7.30 |

TABLE IV: User study of the different posters generated.

6.3 Qualitative User Study Evaluation

(a) Our method generates the poster of [32].

(b) Our method generates the poster of [25].
Fig. 4: Example of our results

User study. A user study is employed to compare our results with the original posters and the posters made by novices. We invited 10 researchers (experts on the evaluated topics who were unaware of our project) to evaluate these results on readability, informativeness and aesthetics. Each researcher is sequentially shown the three results (in randomized order) and asked to score each of them on a numeric scale with defined lowest, middle and highest scores for each metric. The final results are averaged across subjects. Note that, since our method mainly considers the layout of a poster, we provide the novice designers and our method with the same contents as the original poster. We argue that this is a more objective way to evaluate our method: the text extracted by TextRank or chosen by the novice designers may not be as good as the text in the original poster, which is summarized by the authors of the paper, and different contents would affect the researchers' evaluation.

In Table IV, on readability and informativeness, our result is comparable to the original poster (7.32 vs. 7.36 and 7.08 vs. 7.10) and better than the posters made by novices (6.82 and 6.80). This validates the effectiveness of our method. First, the inferred panel attributes and the generated panel layout preserve the most valuable and important information. Second, the composition within each panel inferred by our method gives proper emphasis to figures and tables, which may be overlooked by novice designers. In contrast, our method scores lower than the original posters on the aesthetics metric (yet still higher than the posters from novices). This is reasonable, since aesthetics is a relatively subjective metric and generally involves a lot of human interaction. Human designers can adjust the poster layout again and again via many LaTeX commands. In general, generating more aesthetic posters from papers remains an open problem.
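As a quick arithmetic check, the Avg. column of Table IV can be reproduced as the unweighted mean of the three metric scores. The short Python sketch below assumes that this is how the column was obtained; the dictionary layout is only an illustrative choice.

```python
# Scores from Table IV: (readability, informativeness, aesthetics).
scores = {
    "Our method": (7.32, 7.08, 6.70),
    "Posters by novices": (6.82, 6.80, 6.58),
    "Original posters": (7.36, 7.10, 7.44),
}

# Unweighted mean of the three metrics, rounded to two decimals.
averages = {
    name: round(sum(values) / len(values), 2)
    for name, values in scores.items()
}

for name, avg in averages.items():
    print(f"{name}: {avg:.2f}")
```

Under this assumption the computed means match the reported 7.03, 6.73 and 7.30.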

Qualitative Evaluation of Three Methods. We qualitatively compare our results (Figure 5(b) and Figure 5(e)) with the posters from novices (Figure 5(a) and Figure 5(d)) and the original posters (Figure 5(c) and Figure 5(f)). The posters in each row are for the same paper and have the same contents.

Interestingly, our panel layout looks more similar to that of the original poster than the layout by novices does. This is because, firstly, the posters in the Paper-Poster dataset share relatively similar, high-quality graphical designs, and secondly, our split and panel layout algorithm works well at simulating the way people design posters. In the first row of Figure 5, we can see that, in order to arrange the contents aesthetically, the novice designers rearranged the order of the panels (Figure 5(a)), which affects the readability of the poster. The second row of Figure 5 shows that, compared with the novice designers, our method also achieves good performance on attribute inference for graphical elements. The sizes of the graphical elements inferred by our method are similar to those in the original poster. In contrast, the poster designed by novices in Figure 5(d) loses emphasis on figures in order to make the content fit each panel.

(a) Designed by novice

(b) Our result

(c) Original poster[35]

(d) Designed by novice

(e) Our result

(f) Original poster[36]
Fig. 5: Results generated by different ways

6.4 Qualitative Evaluation by Design Principles

We further qualitatively evaluate our results (Figure 4) by the general graphical design principles [24], i.e., flow, alignment, and overlap and boundaries.

Flow. It is essential for a scientific poster to present information in a clear reading order, i.e., to be readable. People generally read a scientific poster from left to right and from top to bottom. Since Algorithm 1 recursively splits the page of the poster into left and right, or top and bottom, the panel layout we generate ensures that the reading order matches the section order of the original paper. Within each panel, our algorithm also organizes the contents sequentially, following the section order of the original paper, which improves readability.
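To illustrate why recursive splitting preserves reading order, here is a simplified Python sketch. It is not the paper's Algorithm 1: it assumes the page is split in alternating directions (left/right, then top/bottom) and that each panel receives area proportional to its relative size. The names `Rect` and `split_page` are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Rect:
    x: float  # left edge
    y: float  # top edge
    w: float  # width
    h: float  # height


def split_page(rect: Rect, sizes: List[float], vertical: bool = True) -> List[Rect]:
    """Recursively split `rect` among panels, keeping the input order.

    Earlier panels always end up to the left of (vertical split) or above
    (horizontal split) later panels, so section order is preserved.
    """
    if not sizes:
        raise ValueError("at least one panel size is required")
    if len(sizes) == 1:
        return [rect]
    mid = len(sizes) // 2
    first, second = sizes[:mid], sizes[mid:]
    frac = sum(first) / sum(sizes)  # share of the area given to the first group
    if vertical:  # split into left | right
        a = Rect(rect.x, rect.y, rect.w * frac, rect.h)
        b = Rect(rect.x + rect.w * frac, rect.y, rect.w * (1 - frac), rect.h)
    else:  # split into top / bottom
        a = Rect(rect.x, rect.y, rect.w, rect.h * frac)
        b = Rect(rect.x, rect.y + rect.h * frac, rect.w, rect.h * (1 - frac))
    return split_page(a, first, not vertical) + split_page(b, second, not vertical)


# Hypothetical relative panel sizes, listed in section order.
panels = split_page(Rect(0.0, 0.0, 1.0, 1.0), [0.2, 0.3, 0.5])
for panel in panels:
    print(panel)
```

A real layout procedure would also take the inferred aspect ratios into account when choosing the split direction; the alternation used here is only a stand-in for that choice.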

Alignment. Compared with the complex alignment constraint in [24], our formulation is much simpler and uses an enumeration variable to indicate the horizontal position of graphical elements. This simplification does not spoil our results, which still have reasonable alignment, as illustrated in Figure 4 and quantitatively evaluated by the three metrics in Table IV.

Overlap and boundaries. Overlapping panels make the poster less readable and less aesthetic. To avoid this, our approach (1) recursively splits the page for the panel layout; (2) sequentially arranges the panels; and (3) incorporates a design constraint into our Bayesian network (Eq. 11) to penalize overlap between graphical elements and panel boundaries. As a result, our algorithm achieves reasonable results without significant overlap or boundary crossing. Similar to the manually created poster (Figure 5(c)), our result (Figure 5(b)) does not have significantly overlapping panels or boundaries.
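The absence of overlap can also be checked after the fact. The following Python sketch is an illustrative post-hoc check, not the constraint of Eq. (11): it assumes panels are axis-aligned rectangles given as `(x, y, width, height)` tuples, and the function names are hypothetical.

```python
from itertools import combinations


def overlaps(a, b, eps=1e-9):
    """True if rectangles a and b share interior area (touching edges do not count)."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return (ax < bx + bw - eps and bx < ax + aw - eps and
            ay < by + bh - eps and by < ay + ah - eps)


def inside(inner, outer, eps=1e-9):
    """True if rectangle `inner` lies within rectangle `outer`."""
    ix, iy, iw, ih = inner
    ox, oy, ow, oh = outer
    return (ix >= ox - eps and iy >= oy - eps and
            ix + iw <= ox + ow + eps and iy + ih <= oy + oh + eps)


def layout_is_clean(panels, page):
    """True if no two panels overlap and every panel stays inside the page."""
    no_overlap = all(not overlaps(a, b) for a, b in combinations(panels, 2))
    in_page = all(inside(p, page) for p in panels)
    return no_overlap and in_page


# Hypothetical three-panel layout on a unit page.
page = (0.0, 0.0, 1.0, 1.0)
panels = [(0.0, 0.0, 0.5, 1.0), (0.5, 0.0, 0.5, 0.4), (0.5, 0.4, 0.5, 0.6)]
print(layout_is_clean(panels, page))  # prints True
```

The same `inside` test could be applied to a graphical element and its enclosing panel to detect boundary crossing.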

7 Conclusion and Future Work

Automatic tools for scientific poster generation are important for poster designers, who can save a lot of time with such tools. Design is hard work, especially for scientific posters, which require careful consideration of both utility and aesthetics. Abstract principles of scientific poster design cannot help designers directly. In contrast, we propose an approach that learns design patterns from existing examples and can be used as an assistant tool to aid designers in scientific poster generation.

As future work, our framework could also be applied to directly learn other general design patterns, such as web-page design and single-page graphical design, given the corresponding layout styles. Currently, we do not consider the font types of posters, which will be addressed in the future.

References

  • [1] K. Arai and T. Herman. Method for automatic e-comic scene frame extraction for reading comic on mobile devices. In Information Technology: New Generations (ITNG), 2010 Seventh International Conference on, pages 370–375. IEEE, 2010.
  • [2] G. D. Battista, P. Eades, R. Tamassia, and I. G. Tollis. Graph Drawing: Algorithms for the Visualization of Graphs. Prentice Hall PTR, Upper Saddle River, NJ, USA, 1st edition, 1998.
  • [3] Y. Cao, A. B. Chan, and R. W. H. Lau. Automatic stylistic manga layout. ACM Trans. Graph., 31(6):141:1–141:10, Nov. 2012.
  • [4] Y. Cao, R. W. Lau, and A. B. Chan. Look over here: Attention-directing composition of manga elements. ACM Transactions on Graphics (TOG), 33(4):94, 2014.
  • [5] N. Damera-Venkata, J. Bento, and E. O’Brien-Strain. Probabilistic document model for automated document composition. In Proceedings of the 11th ACM symposium on Document engineering, pages 3–12. ACM, 2011.
  • [6] R. M. Fung and K.-C. Chang. Weighing and integrating evidence for stochastic simulation in bayesian networks. pages 209–220, 1990.
  • [7] K. Gajos and D. S. Weld. Preference elicitation for interface optimization. In Proceedings of the 18th annual ACM symposium on User interface software and technology, pages 173–182. ACM, 2005.
  • [8] J. Geigel and A. Loui. Using genetic algorithms for album page layouts. IEEE multimedia, (4):16–27, 2003.
  • [9] D. E. Goldberg. Genetic Algorithms in Search, Optimization and Machine Learning. Addison-Wesley Longman Publishing Co., Inc., Boston, MA, USA, 1st edition, 1989.
  • [10] S. J. Harrington, J. F. Naveda, R. P. Jones, P. Roetling, and N. Thakkar. Aesthetic measures for automated document layout. In Proceedings of the 2004 ACM symposium on Document engineering, pages 109–111. ACM, 2004.
  • [11] K. Hoashi, C. Ono, D. Ishii, and H. Watanabe. Automatic preview generation of comic episodes for digitized comic search. In Proceedings of the 19th ACM international conference on Multimedia, pages 1489–1492. ACM, 2011.
  • [12] J. H. Holland. Adaptation in Natural and Artificial Systems: An Introductory Analysis with Applications to Biology, Control and Artificial Intelligence. MIT Press, Cambridge, MA, USA, 1992.
  • [13] A. Hunter, D. Slatter, and D. Greig. Web-based magazine design for self publishers, 2011.
  • [14] N. Hurst, W. Li, and K. Marriott. Review of automatic document formatting. In Proceedings of the 9th ACM symposium on Document engineering, pages 99–108. ACM, 2009.
  • [15] C. Jacobs, W. Li, E. Schrier, D. Bargeron, and D. Salesin. Adaptive grid-based document layout. 22(3):838–847, 2003.
  • [16] A. Jahanian, J. Liu, D. R. Tretter, Q. Lin, N. Damera-Venkata, E. O’Brien-Strain, S. Lee, J. Fan, and J. P. Allebach. Automatic design of magazine covers. In IS&T/SPIE Electronic Imaging, pages 83020N–83020N. International Society for Optics and Photonics, 2012.
  • [17] G. Jing, Y. Hu, Y. Guo, Y. Yu, and W. Wang. Content-aware video2comics with manga-style layout. Multimedia, IEEE Transactions on, 17(12):2122–2133, Dec 2015.
  • [18] D. E. Knuth and M. F. Plass. Breaking paragraphs into lines. Software: Practice and Experience, 11(11):1119–1184, 1981.
  • [19] S. Lok and S. Feiner. A survey of automated layout techniques for information presentations. Proceedings of SmartGraphics, 2001, 2001.
  • [20] Y. Matsui, T. Yamasaki, and K. Aizawa. Interactive manga retargeting. In ACM SIGGRAPH 2011 Posters, page 35. ACM, 2011.
  • [21] P. Merrell, E. Schkufza, Z. Li, M. Agrawala, and V. Koltun. Interactive furniture layout using interior design guidelines. ACM Transactions on Graphics (TOG), 30(4):87, 2011.
  • [22] R. Mihalcea and P. Tarau. Textrank: Bringing order into texts. Association for Computational Linguistics, 2004.
  • [23] K. Murphy. Bayes net toolbox for matlab. 2002.
  • [24] P. O’Donovan, A. Agarwala, and A. Hertzmann. Learning layouts for single-page graphic designs. Visualization and Computer Graphics, IEEE Transactions on, 20(8):1200–1213, 2014.
  • [25] T. Oron-Gilad and Y. Parmet. Can a driving simulator assess the effectiveness of hazard perception training in young novice drivers? Advances in Transportation Studies, 1(special issue):65–76, 2014.
  • [26] X. Pang, Y. Cao, R. W. Lau, and A. B. Chan. A robust panel extraction method for manga. In Proceedings of the ACM International Conference on Multimedia, ACM MM, 2014.
  • [27] A. J. H. Peels, N. J. M. Janssen, and W. Nawijn. Document architecture and text formatting. ACM Trans. Inf. Syst., 3(4):347–369, Oct. 1985.
  • [28] A. Pinto, H. Pedrini, W. R. Schwartz, and A. Face spoofing detection through visual codebooks of spectral temporal cubes. IEEE Transactions on Image Processing, 24(12):4726–4740, 2015.
  • [29] L. Purvis, S. Harrington, B. O’Sullivan, and E. C. Freuder. Creating personalized documents: An optimization approach. In Proceedings of the 2003 ACM Symposium on Document Engineering, DocEng ’03, pages 68–77, New York, NY, USA, 2003. ACM.
  • [30] Y. Qu, W.-M. Pang, T.-T. Wong, and P.-A. Heng. Richness-preserving manga screening. 27(5):155, 2008.
  • [31] M. Sarrafzadeh and D. T. Lee. Algorithmic Aspects of VLSI Layout. World Scientific Publishing Co., Inc., River Edge, NJ, USA, 1993.
  • [32] Y. Suh, R. Snodgrass, and R. Zhang. AZDBLab: A laboratory information system for large-scale empirical DBMS studies, volume 7, pages 1641–1644. Association for Computing Machinery, 13 edition, 2014.
  • [33] L.-F. Yu, S.-K. Yeung, C.-K. Tang, D. Terzopoulos, T. F. Chan, and S. J. Osher. Make it home: automatic optimization of furniture arrangement. ACM Transactions on Graphics (TOG)-Proceedings of ACM SIGGRAPH 2011, v. 30, no. 4, July 2011, article no. 86, 2011.
  • [34] Q. Yuting, F. Yanwei, G. Yanwen, Z. Zhi-Hua, and S. Leonid. Learning to generate posters of scientific papers. In Proceedings of the 30th AAAI Conference on Artificial Intelligence, AAAI’16, pages 51–57, 2016.
  • [35] Y. Zhao and S. chun Zhu. Image parsing with stochastic scene grammar. In J. Shawe-Taylor, R. S. Zemel, P. L. Bartlett, F. Pereira, and K. Q. Weinberger, editors, Advances in Neural Information Processing Systems 24, pages 73–81. Curran Associates, Inc., 2011.
  • [36] C. Zhiyuan, M. Arjun, L. Bing, H. Meichun, C. Malu, and G. Riddhiman. Leveraging multi-domain prior knowledge in topic models. In Proceedings of the 23rd International Joint Conference on Artificial Intelligence, IJCAI’13, pages 2071–2077, 2013.