Learning to Generate Posters of Scientific Papers

04/05/2016 ∙ by Yuting Qiang, et al. ∙ Nanjing University ∙ Disney Research

Researchers often summarize their work in the form of posters. Posters provide a coherent and efficient way to convey core ideas from scientific papers. Generating a good scientific poster, however, is a complex and time-consuming cognitive task, since such posters need to be readable, informative, and visually aesthetic. In this paper, for the first time, we study the challenging problem of learning to generate posters from scientific papers. To this end, we propose a data-driven framework that utilizes graphical models. Specifically, given content to display, the key elements of a good poster, including panel layout and attributes of each panel, are learned and inferred from data. Then, given the inferred layout and attributes, the composition of graphical elements within each panel is synthesized. To learn and validate our model, we collect and make public a Poster-Paper dataset, which consists of scientific papers and corresponding posters with exhaustively labelled panels and attributes. Qualitative and quantitative results indicate the effectiveness of our approach.

1 Introduction

The large number of scientific papers emerging in various academic fields and venues (conferences and journals) is noteworthy. For example, the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) accepted over 600 papers in 2016 alone. Reading all of these papers is time-consuming for researchers, particularly those who want to assess the state of the art holistically or to come away with an understanding of the core scientific ideas explored in the last year. Converting a conference paper into a poster provides an important means of conveying the core ideas and findings of the original paper efficiently and coherently. To achieve this goal, it is essential to keep posters readable, informative and visually aesthetic. It is challenging, however, to design a high-quality scientific poster that meets all of these design constraints, particularly for researchers who are not proficient at design tasks or familiar with design packages (e.g., Adobe Illustrator).

In general, poster design is a complicated and time-consuming task; both understanding of the paper content and experience in design are required.

Automatic tools for scientific poster generation would help researchers by providing them with an easier way to share their research effectively. Further, given the vast number of scientific papers on arXiv and other online repositories, such tools may also provide a way for other researchers to consume the content more easily. Rather than browsing raw papers, they may be able to browse automatically generated poster previews (potentially constructed with their specific preferences in mind).

However, in order to generate a scientific poster in accordance with, and representative of, the original paper, many problems need to be solved: 1) Content extraction. Both important textual and graphical content needs to be extracted from the original paper; 2) Panel layout. Content should fit each panel, and the shape and position of panels should be optimized for readability and design appeal; 3) Graphical element (figures and tables) arrangement. Within each panel, textual content can typically be sequentially itemized, but for graphical elements, their size and placement should be carefully considered. Due to these challenges, there are few automatic tools for scientific poster generation.

In this paper, we propose a data-driven method for automatic scientific poster generation (given a corresponding paper). Content extraction and layout generation are the two key components in this process. For content extraction, we use TextRank [Mihalcea and Tarau2004] to extract textual content, and provide an interface for extraction of graphical content (e.g., figures and tables). Our approach focuses primarily on poster layout generation. We address the layout in three steps. First, we propose a simple probabilistic graphical model to infer panel attributes. Second, we introduce a tree structure to represent panel layout, based on which we further design a recursive algorithm to generate new layouts. Third, in order to synthesize the layout within each panel, we train another probabilistic graphical model to infer the attributes of the graphical elements.

Compared with posters designed by the authors, our approach can generate different results to adapt to different paper sizes, aspect ratios or styles by training our model with different datasets, and thus provides more expressiveness in poster layout. To the best of our knowledge, this paper presents the first framework for poster generation from the original scientific paper.
Our paper makes the following contributions:

  • Probabilistic graphical models are proposed to learn scientific poster design patterns, including panel attributes and graphical element attributes, from existing posters.

  • A new algorithm that considers both the information conveyed and aesthetics is developed to generate the poster layout.

  • We also collect and make available a Poster-Paper dataset with labelled poster panels and attributes.

2 Related Work

General Graphical Design. Graphical design has been studied extensively in the computer graphics community. This involves several related, yet different topics, including text-based layout generation [Jacobs et al.2003, Damera-Venkata, Bento, and O’Brien-Strain2011, Hurst, Li, and Marriott2009], single-page graphical design [O’Donovan, Agarwala, and Hertzmann2014, Harrington et al.2004], photo album layout [Geigel and Loui2003], furniture layout [Merrell et al.2011, Yu et al.2011], and even interface design [Gajos and Weld2005]. Among them, text-based layout pays more attention to informativeness, while attractiveness also needs to be considered in poster generation. Other topics take aesthetics as the highest priority. However, some principles (such as alignment or read-order) need to be followed in poster design. In summary, poster generation needs to consider the readability, informativeness and aesthetics of the generated posters simultaneously.

Manga Layout Generation. Several techniques have been studied to facilitate layout generation for western comics or manga. Examples include scene frame extraction [Arai and Herman2010, Pang et al.2014], automatic stylistic manga layout generation [Cao, Chan, and Lau2012, Jing et al.2015], and graphical element composition [Cao, Lau, and Chan2014]. For preview generation of comic episodes [Hoashi et al.2011], both frame extraction and layout generation are considered. Other research areas, such as manga retargeting [Matsui, Yamasaki, and Aizawa2011] and manga-like rendering [Qu et al.2008], also draw considerable attention. However, none of these methods can be directly used to generate scientific posters, which is the focus of this paper.

Our panel layout generation is inspired by the recent work on manga layout [Cao, Chan, and Lau2012]. Like that work, we use a binary tree to represent the panel layout. The manga layout method, however, trains a Dirichlet distribution to sample a splitting configuration, and a different Dirichlet distribution needs to be trained for each kind of instance. Instead, we propose a recursive algorithm to search for the best splitting configuration along a tree.

3 Overview

Problem Formulation. Assume that we have a set of posters $M$ and their corresponding scientific papers. Each poster $m \in M$ includes a set of panels $P_m$, and each panel $p \in P_m$ has a set of graphical elements (figures and tables) $G_p$. Each panel $p$ is characterized by five attributes:

  • text length ($l_p$): the text length within a panel;

  • text ratio ($t_p$): the text length within a panel relative to the text length of the whole poster, $t_p = l_p / \sum_{q \in P_m} l_q$;

  • graphical elements ratio ($g_p$): the size of the graphical elements within a panel relative to the total size of graphical elements in the poster. (Note that this variable differs slightly from the text ratio $t_p$: we do not use the figure size in the poster. Instead, we use the corresponding figure from the original paper.)

  • panel size ($s_p$) and aspect ratio ($r_p$): $s_p = w_p \times h_p$ and $r_p = w_p / h_p$, where $w_p$ and $h_p$ denote the width and height of a panel with respect to the poster, respectively.

Each graphical element $g \in G_p$ has four attributes:

  • graphical element size ($s_g$) and aspect ratio ($r_g$): $s_g = w_g \times h_g$ and $r_g = w_g / h_g$, where $w_g$ and $h_g$ denote the width and height of a graphical element relative to the whole paper, respectively;

  • horizontal position ($h_g$): we assume that panel content is arranged sequentially from top to bottom (this holds true when using LaTeX Beamer to make posters); hence only the relative horizontal position needs to be considered, which is defined by a discrete variable $h_g \in \{\text{left}, \text{center}, \text{right}\}$;

  • graphical element size in poster ($u_g$): $u_g$ is the ratio of the width of the graphical element to the width of the panel.

To learn how to generate the poster, our goal is to determine the above attributes of each panel $p$ and each graphical element $g$, as well as to infer the arrangement of all panels.
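As an illustration of the panel attributes defined above, the following minimal Python sketch computes them from annotated panels. The `Panel` record, its field names and the dictionary keys are hypothetical and chosen only for this example; they are not part of the released dataset format.

```python
from dataclasses import dataclass


@dataclass
class Panel:
    text_length: int    # number of characters of text in the panel
    figure_area: float  # total area of the panel's figures, measured in the paper
    width: float        # panel width relative to the poster width
    height: float       # panel height relative to the poster height


def panel_attributes(panels):
    """Return one attribute dictionary per panel of a single poster."""
    total_text = sum(p.text_length for p in panels)
    total_figures = sum(p.figure_area for p in panels)
    attributes = []
    for p in panels:
        attributes.append({
            "text_length": p.text_length,
            # share of the poster's text that falls in this panel
            "text_ratio": p.text_length / total_text if total_text else 0.0,
            # share of the poster's figure area that falls in this panel
            "graphics_ratio": p.figure_area / total_figures if total_figures else 0.0,
            "panel_size": p.width * p.height,
            "aspect_ratio": p.width / p.height,
        })
    return attributes
```

For a poster with two panels, the text ratios and the graphical element ratios each sum to one, which gives a quick sanity check on annotated data.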

Intuitively, a trivial solution is to use a learning model (e.g., SVR) to learn how to regress these attributes, including $s_p$, $r_p$, $u_g$ and $h_g$, while regarding $t_p$, $g_p$, $l_p$, $r_g$ and $s_g$ as features. However, such a solution lacks a mechanism for exploring the relationships between the panel attributes and the graphical element attributes, and it may fail to meet the requirements of readability, informativeness and aesthetics. We thus propose a novel framework to solve our problem.

Figure 1: Overview of the proposed approach.

Overview. To generate a readable, informative and aesthetic poster, we simulate the rule of thumb that people follow when designing posters in practice: we first generate the panel layout, then arrange the textual and graphical elements within each panel.

Our framework has four steps overall (as shown in Figure 1), but its core consists of three algorithms designed to facilitate poster generation. We first extract textual content from the paper using TextRank [Mihalcea and Tarau2004] (TextRank can be replaced with other state-of-the-art text summarization algorithms); this is detailed in the Experimental Results section. Non-textual content (figures and tables) is extracted through user interaction. All of the extracted content is sequentially arranged and represented by the first blob in Figure 1. The initial key attributes of each panel (such as panel size $s_p$ and aspect ratio $r_p$) are then inferred by learning a probabilistic graphical model from the training data. Next, the panel layout is synthesized by a recursive algorithm that further updates these key attributes (i.e., $s_p$ and $r_p$) and generates an informative and aesthetic panel layout. Finally, we compose the panels by using a graphical model to synthesize the visual properties of each panel (such as the size and position of its graphical elements).

4 Methodology

Panel Attribute Inference. Our approach tries to divide a scientific poster into several rectangular panel blocks. Each panel should not only be of an appropriate size, to contain corresponding textual and graphical content, but also be in a suitable shape (aspect ratio) to maximize aesthetic appeal. Our approach learns a probabilistic graphical model to infer the initial values for the size and aspect ratio of each panel.

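As each panel is composed of both textual description and graphical elements, we assume that panel size ($s_p$) and aspect ratio ($r_p$) are conditionally dependent on text ratio $t_p$ and graphical element ratio $g_p$. Therefore, the likelihood of a set of panels $P$ can be defined as:

$$\Pr(s_P, r_P \mid t_P, g_P) = \prod_{p \in P} \Pr(s_p \mid t_p, g_p)\, \Pr(r_p \mid t_p, g_p) \tag{1}$$

where $\Pr(s_p \mid t_p, g_p)$ and $\Pr(r_p \mid t_p, g_p)$ are the conditional probability distributions (CPDs) of $s_p$ and $r_p$ given $t_p$ and $g_p$. We define them as two conditional linear Gaussian distributions:

$$\Pr(s_p \mid t_p, g_p) = \mathcal{N}\big(s_p;\ w_s \cdot [t_p, g_p, 1]^T,\ \sigma_s^2\big) \tag{2}$$

$$\Pr(r_p \mid t_p, g_p) = \mathcal{N}\big(r_p;\ w_r \cdot [t_p, g_p, 1]^T,\ \sigma_r^2\big) \tag{3}$$

where $t_p$ and $g_p$ are defined by the content extraction step demonstrated in Figure 1; $w_s$ and $w_r$ are the parameters that leverage the influence of various factors; and $\sigma_s^2$ and $\sigma_r^2$ are the variances. The parameters ($w_s$, $w_r$, $\sigma_s$ and $\sigma_r$) are estimated using maximum likelihood from training data. Using the learned parameters, initial attributes of each panel can be inferred.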
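For a conditional linear Gaussian, the maximum likelihood estimate of the weights is the ordinary least-squares solution, and the estimate of the variance is the mean squared residual. The sketch below illustrates this with plain Python; the function names are hypothetical, and the experiments in this paper use the Bayesian Network Toolbox rather than this code.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting.

    Raises ZeroDivisionError if the system is singular.
    """
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_linear_gaussian(features, targets):
    """Fit target ~ N(w . [text_ratio, graphics_ratio, 1], variance).

    features: list of (text_ratio, graphics_ratio) pairs, one per panel.
    targets:  list of panel sizes (or aspect ratios), one per panel.
    Returns (weights, variance) with weights = [w_text, w_graphics, bias].
    """
    rows = [[t, g, 1.0] for t, g in features]
    dim = 3
    # Normal equations: (X^T X) w = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(dim)] for i in range(dim)]
    xty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(dim)]
    weights = solve_linear_system(xtx, xty)
    residuals = [y - sum(w * v for w, v in zip(weights, r)) for r, y in zip(rows, targets)]
    variance = sum(e * e for e in residuals) / len(rows)
    return weights, variance


def predict_mean(weights, text_ratio, graphics_ratio):
    """Mean of the fitted Gaussian, used as the initial attribute value."""
    return weights[0] * text_ratio + weights[1] * graphics_ratio + weights[2]
```

Calling `fit_linear_gaussian` once with panel sizes and once with aspect ratios as targets yields the two independent distributions; `predict_mean` then gives the initial value for a new panel.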

Note that in order to learn from limited data, this step actually employs two assumptions: (1) $s_p$ and $r_p$ are conditionally independent; (2) the attribute sets of different panels are independent. We need the panels to be neither too small in size ($s_p$) nor too distorted in aspect ratio ($r_p$), to ensure a readable, informative and aesthetic poster. The two assumptions introduced here are sufficient for this task. Furthermore, the attribute values estimated in this step are just good initial values for the properties of each panel. We use the next two steps to relax these assumptions and to account for the relationship between $s_p$ and $r_p$, as well as the relationship among different panels (Algorithm 1).

To ease exposition, we denote the set of panels as $L = \{(s_{p_1}, r_{p_1}), \ldots, (s_{p_k}, r_{p_k})\}$, where $s_{p_i}$ and $r_{p_i}$ are the size and aspect ratio of the $i$th panel $p_i$, respectively, with $|L| = k$.

Panel Layout Generation. One conventional way to design posters is to simply arrange the panels in a two- or three-column style. This scheme, although simple, makes all posters look similar and unattractive. Inspired by manga layout generation [Cao, Chan, and Lau2012], we propose a more vivid panel layout generation method. Specifically, we represent the panel layout with a binary tree structure. Panel layout generation is then formulated as a process of recursively splitting a page, as illustrated and explained in Figure 2.

Conveying information is the most important goal for a scientific poster; thus we attempt to maintain the relative size of each panel during panel layout generation. This motivates the following loss function for the panel shape variation,

$$l_{var}(p_i) = |r_{p_i} - r'_{p_i}| \tag{4}$$

where $r_{p_i}$ is the inferred aspect ratio of panel $p_i$ and $r'_{p_i}$ is the aspect ratio of the panel after optimization. This will lead to a combined aesthetic loss for the poster,

$$Loss(L, L') = \sum_{i=1}^{k} l_{var}(p_i) \tag{5}$$

where $L$ is the inferred set of $k$ panels and $L'$ is the poster panel set after optimization. In each splitting step, the combinatorial choices for splitting positions can be recursively computed and compared with respect to the loss function above. We choose the panel attributes with the lowest loss (Eq. 5). The whole algorithm is summarized in Algorithm 1.

Figure 2: Panel layout and the corresponding tree structure. The tree structure of a poster layout contains five panels. The first splitting is vertical with the splitting ratio (0.5, 0.5), which divides the whole page into two equal columns, with three panels on the left and two panels on the right. For the left column, we resort to a horizontal splitting with the splitting ratio (0.4, 0.6). The larger part is further horizontally divided into two panels with the splitting ratio (0.33, 0.67). We only split the right column once, with the splitting ratio (0.5, 0.5).
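To make the tree representation concrete, the sketch below encodes the layout described in the Figure 2 caption as a nested tuple and converts it into panel rectangles on a unit page. The tuple encoding and the panel names `p1` to `p5` are assumptions made for this example only.

```python
def layout_rectangles(node, x=0.0, y=0.0, w=1.0, h=1.0):
    """Convert a splitting tree into {panel_name: (x, y, w, h)}.

    A leaf is a panel name; an inner node is
    (direction, ratio, first_child, second_child), where ratio is the
    share of the rectangle given to the first child.
    """
    if isinstance(node, str):
        return {node: (x, y, w, h)}
    direction, ratio, first, second = node
    if direction == "vertical":
        # vertical split: two side-by-side columns
        rects = layout_rectangles(first, x, y, w * ratio, h)
        rects.update(layout_rectangles(second, x + w * ratio, y, w * (1 - ratio), h))
    else:
        # horizontal split: two stacked rows
        rects = layout_rectangles(first, x, y, w, h * ratio)
        rects.update(layout_rectangles(second, x, y + h * ratio, w, h * (1 - ratio)))
    return rects


# The five-panel layout described in the Figure 2 caption.
figure2_tree = (
    "vertical", 0.5,
    ("horizontal", 0.4, "p1", ("horizontal", 0.33, "p2", "p3")),
    ("horizontal", 0.5, "p4", "p5"),
)
rectangles = layout_rectangles(figure2_tree)
```

Because every split partitions its rectangle exactly, the resulting panels tile the page without gaps or overlaps.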

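Input: Panels $(p_1, \ldots, p_k)$ with sizes $s_{p_i}$ and aspect ratios $r_{p_i}$ learned from the graphical model; rectangular page area $x$, $y$, $w$, $h$.
Output: Loss and arrangement.
  if $k = 1$ then
     adjust the panel to fill the whole rectangular page area and return the aesthetic loss $|r_{p_1} - w/h|$;
  else
     $Loss = \infty$;
     for each $i \in \{1, \ldots, k-1\}$ do
        $t = \sum_{j=1}^{i} s_{p_j} / \sum_{j=1}^{k} s_{p_j}$;
        $Loss_1$ = PanelArrangement($(p_1, \ldots, p_i)$, $x$, $y$, $w$, $h \cdot t$);
        $Loss_2$ = PanelArrangement($(p_{i+1}, \ldots, p_k)$, $x$, $y + h \cdot t$, $w$, $h \cdot (1 - t)$);
        if $Loss > Loss_1 + Loss_2$ then
           $Loss = Loss_1 + Loss_2$;
           record this arrangement;
        end if
        $Loss_1$ = PanelArrangement($(p_1, \ldots, p_i)$, $x$, $y$, $w \cdot t$, $h$);
        $Loss_2$ = PanelArrangement($(p_{i+1}, \ldots, p_k)$, $x + w \cdot t$, $y$, $w \cdot (1 - t)$, $h$);
        if $Loss > Loss_1 + Loss_2$ then
           $Loss = Loss_1 + Loss_2$;
           record this arrangement;
        end if
     end for
  end if
  return Loss and arrangement.
Algorithm 1 Panel layout generation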
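The recursive search of Algorithm 1 can be sketched in a few lines of Python. This is an illustrative reading of the algorithm rather than our exact implementation: the function name, the tuple representation of panels and the choice to return rectangles are assumptions, and panel sizes are assumed to be positive.

```python
import math


def arrange_panels(panels, x, y, w, h):
    """Recursively split the page rectangle (x, y, w, h) among the panels.

    panels: list of (size, aspect_ratio) pairs in reading order.
    Returns (loss, rects), where rects[i] = (x, y, w, h) for panels[i].
    """
    if len(panels) == 1:
        _, wanted_ratio = panels[0]
        # Leaf: the panel fills the rectangle; the loss is its aspect-ratio change.
        return abs(wanted_ratio - w / h), [(x, y, w, h)]

    total = sum(size for size, _ in panels)
    best_loss, best_rects = math.inf, []
    head = 0.0
    for i in range(1, len(panels)):
        head += panels[i - 1][0]
        t = head / total  # share of the area given to the first i panels
        first, second = panels[:i], panels[i:]
        candidates = (
            # stacked: first group on top, second group below
            ((x, y, w, h * t), (x, y + h * t, w, h * (1 - t))),
            # side by side: first group on the left, second group on the right
            ((x, y, w * t, h), (x + w * t, y, w * (1 - t), h)),
        )
        for rect_a, rect_b in candidates:
            loss_a, rects_a = arrange_panels(first, *rect_a)
            loss_b, rects_b = arrange_panels(second, *rect_b)
            if loss_a + loss_b < best_loss:
                best_loss = loss_a + loss_b
                best_rects = rects_a + rects_b
    return best_loss, best_rects
```

For two equally sized panels that both prefer an aspect ratio of 0.5 on a square page, the search picks the side-by-side split, since stacking would give both panels an aspect ratio of 2. The search only ever splits a contiguous prefix from the rest, so the reading order of the panels is preserved.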

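Composition within a Panel. Having inferred the layout of the panels, we turn our attention to the composition of graphical elements within the panels. We model and infer attributes of graphical elements using another probabilistic graphical model. In particular, the key attributes we need to estimate are the horizontal position $h_g$ and the graphical element size $u_g$. In our model, horizontal position $h_g$ relies on $s_p$, $l_p$ and $s_g$, while $u_g$ relies on $r_p$, $s_g$ and $h_g$, so the likelihood is

$$\Pr(h_g, u_g \mid s_p, r_p, l_p, s_g) = \prod_{p \in P} \prod_{g \in G_p} \Pr(h_g \mid s_p, l_p, s_g)\, \Pr(u_g \mid r_p, s_g, h_g) \tag{6}$$

$\Pr(h_g \mid s_p, l_p, s_g)$ and $\Pr(u_g \mid r_p, s_g, h_g)$ are the conditional probability distributions (CPDs) of $h_g$ and $u_g$ given $\{s_p, l_p, s_g\}$ and $\{r_p, s_g, h_g\}$ respectively. The conditional linear Gaussian distribution is also used here,

$$\Pr(u_g \mid r_p, s_g, h_g) = \mathcal{N}\big(u_g;\ w_u \cdot [r_p, s_g, h_g, 1]^T,\ \sigma_u^2\big) \tag{7}$$

where $w_u$ is the parameter to balance the influence of different factors. Since we take horizontal position $h_g$ as an enumerated variable, a natural way to estimate it is to make it a classification problem by using the softmax function,

$$\Pr(h_g = i \mid s_p, l_p, s_g) = \frac{\exp\big(w_{h_i} \cdot [s_p, l_p, s_g, 1]^T\big)}{\sum_{j=1}^{H} \exp\big(w_{h_j} \cdot [s_p, l_p, s_g, 1]^T\big)} \tag{8}$$

where $H$ is the cardinality of the value set of $h_g$, i.e. $H = 3$, and $w_{h_i}$ is the $i$th row of $w_h$. The maximum likelihood method is used to estimate the parameters, including $w_u$, $w_h$ and $\sigma_u$.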
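The softmax classification of horizontal position can be sketched as follows. The weight matrix, the feature vector layout (panel and element features followed by a constant 1.0 for the bias) and the position labels are illustrative assumptions; learned weights would come from maximum likelihood training.

```python
import math

POSITIONS = ("left", "center", "right")  # assumed value set of the position variable


def softmax_probabilities(weights, features):
    """Class probabilities for one graphical element.

    weights:  one row of coefficients per position class.
    features: feature values with a trailing 1.0 acting as the bias input.
    """
    scores = [sum(w * f for w, f in zip(row, features)) for row in weights]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    normaliser = sum(exps)
    return [e / normaliser for e in exps]


def most_likely_position(weights, features):
    """Return the position label with the highest probability."""
    probs = softmax_probabilities(weights, features)
    return POSITIONS[probs.index(max(probs))]
```

With all-zero weights every position is equally likely; a positive bias on one row shifts the probability mass to that position.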

Different from Eq. 1, directly inferring $h_g$ and $u_g$ is not advisable, since the panel content may exceed the panel bounding box and affect the aesthetic measure of a poster. To avoid this problem, we employ the likelihood-weighted sampling method [Fung and Chang1990] to generate samples from the model, by maximizing the likelihood function (Eq. 6) with this strict constraint,

$$\sum_{g \in G_p} \frac{u_g\, w_p}{r_g} + \frac{\alpha\, \beta\, l_p}{w_p} < h_p \tag{9}$$

where $\alpha$ and $\beta$ denote the width and height of a single character respectively. The first term of the above constraint indicates the height of graphical elements while the second term represents the height of textual contents.
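The constraint can be read as a simple height budget: the stacked heights of the graphical elements plus the height of the wrapped text must stay below the panel height. The sketch below checks this under two stated assumptions: each element is scaled to a fraction of the panel width and keeps its aspect ratio, and text is wrapped into lines of fixed-size characters. The function and parameter names are hypothetical.

```python
def fits_in_panel(panel_width, panel_height, text_length, elements,
                  char_width, char_height):
    """Check whether graphics and text fit vertically inside a panel.

    elements: list of (width_fraction, aspect_ratio) pairs, where
              width_fraction is the element width over the panel width and
              aspect_ratio is the element width over its height.
    All lengths share one unit (e.g. points).
    """
    # Height of each element after scaling it to its share of the panel width.
    graphics_height = sum(
        fraction * panel_width / ratio for fraction, ratio in elements
    )
    # Text height: number of wrapped lines times the character height.
    chars_per_line = panel_width / char_width
    text_height = (text_length / chars_per_line) * char_height
    return graphics_height + text_height < panel_height
```

A sampler can use this predicate as a filter, keeping only sampled element sizes for which it returns `True` and then choosing the candidate with the highest likelihood.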

5 Experimental Results

Experimental Setup. We collect and make available to the community the first Poster-Paper dataset. Specifically, we selected well-designed pairs of scientific papers and their corresponding posters from the publicly available pairs we collected. These papers are all about scientific topics, and their posters have relatively similar design styles. We further annotate panel attributes, such as panel width and panel height. We split the dataset into a training set and a test set, with five pairs held out for testing; the annotated panels are split accordingly.

We use TextRank to extract textual content from the original paper. To give different sections different importance, we can set a different extraction ratio for each section. Important sections then generate more content and hence occupy bigger panels. For simplicity, this paper uses equal importance weights for all sections. User interaction is also required to highlight and select important figures and tables from the original paper. We use the Bayesian Network Toolbox (BNT) [Murphy2002] to estimate key parameters. For graphical element attribute inference, we generate samples with the likelihood-weighted sampling method [Fung and Chang1990] for Eq. 6, subject to the constraint in Eq. 9. With the inferred metadata, the final poster is generated in LaTeX Beamerposter format with the Lankton theme.
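A minimal sketch of TextRank-style sentence extraction with an adjustable extraction ratio is given below. It follows the published TextRank formulation (word-overlap similarity normalised by log sentence lengths, ranked with a damped PageRank iteration), but the function names, the simple tokenizer and the default parameters are assumptions for illustration and not the exact configuration used in our experiments.

```python
import math
import re


def sentence_similarity(tokens_a, tokens_b):
    """Word overlap normalised by the log lengths of both sentences."""
    norm = math.log(len(tokens_a) or 1) + math.log(len(tokens_b) or 1)
    if norm <= 0:
        return 0.0
    return len(set(tokens_a) & set(tokens_b)) / norm


def textrank_extract(sentences, ratio=0.3, damping=0.85, iterations=50):
    """Return the top-ranked share of sentences, kept in document order."""
    n = len(sentences)
    if n == 0:
        return []
    tokens = [re.findall(r"[a-z0-9]+", s.lower()) for s in sentences]
    sim = [[0.0 if i == j else sentence_similarity(tokens[i], tokens[j])
            for j in range(n)] for i in range(n)]
    out_weight = [sum(row) for row in sim]
    scores = [1.0] * n
    for _ in range(iterations):
        # Weighted PageRank update over the sentence similarity graph.
        scores = [
            (1 - damping) + damping * sum(
                sim[j][i] / out_weight[j] * scores[j]
                for j in range(n) if out_weight[j] > 0
            )
            for i in range(n)
        ]
    keep = max(1, round(ratio * n))
    top = sorted(range(n), key=lambda i: scores[i], reverse=True)[:keep]
    return [sentences[i] for i in sorted(top)]
```

Calling `textrank_extract` once per section with a section-specific `ratio` gives important sections more extracted text and therefore larger panels.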

For baseline comparison, we invite three second-year Ph.D. students, who are not familiar with our project, to design posters by hand for the test set. These three students work in computer vision and machine learning and have not yet published any papers on these topics; hence they are novices to research. Given the test set papers, we ask the students to work together and design a poster for each paper.

Running time. Our framework is efficient. Our experiments were done on a PC with an Intel Xeon 2.0 GHz CPU and 144GB RAM. Tab. 1 shows the average time needed for each step. Strictly speaking, we cannot compare with previous methods, since ours is the first work on poster generation and there is no directly comparable existing work. Nevertheless, we argue that the total running time is significantly less than the time people require to design a good poster. It is also less than the time the three novices spent making their posters (see the Qualitative Evaluation section).

Figure 3: Results generated in different ways. (a) Designed by novices. (b) Our result. (c) Original poster.
Stage                                 Average time
Text extraction                       28.81s
Panel attributes inference (learn)    0.85s
Panel attributes inference (infer)    0.013s
Panel layout generation               0.13s
Composition within panel (learn)      2.17s
Composition within panel (infer)      0.03s + 19.09s*
Table 1: Running time of each step. *Inference computation takes 0.03s and LaTeX file generation takes 19.09s.

Quantitative Evaluation. We quantitatively evaluate the effectiveness of our approach.

(1) Effectiveness of panel inference. For this step, we compare the inferred size and aspect ratio of panels with the trivial solution, SVR, which trains a linear regressor to predict the panel size and panel aspect ratio from training data. (Text ratio $t_p$ and graphical elements ratio $g_p$ are used as features for SVR. The parameters are chosen using cross-validation. Nonlinear kernels, such as RBF, perform worse due to over-fitting on training data.) We use the panel attributes from the original posters as the ground truth; although the panels of the original poster may not be the best ones, they are the best candidates to serve as ground truth here. We compute the mean-square error (MSE) of inferred values versus ground-truth values. Our method achieves 3650.4 and 0.67 for panel size and aspect ratio, respectively. By contrast, the values for the SVR method are 3831.3 and 0.76. This shows that our algorithm estimates the panel attributes better than SVR.
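For reference, the mean-square error used in this comparison is the average of the squared differences between inferred and ground-truth values. A minimal sketch, with a hypothetical function name:

```python
def mean_squared_error(predicted, actual):
    """Average squared difference between predictions and ground truth."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(predicted)
```

The same function applies to panel sizes and to aspect ratios; only the scale of the values differs, which is why the two reported errors are of such different magnitude.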

Metric               Readability   Informativeness   Aesthetics   Avg.
Our method           6.94          7.06              6.86         6.95
Posters by novices   6.69          6.83              6.12         6.54
Original posters     7.08          7.03              7.43         7.18
Table 2: User study of different posters generated.

(2) User study. A user study is employed to compare our results with the original posters and the posters made by novices. We invited 10 researchers (who are experts on the evaluated topic and unaware of our project) to evaluate these results on readability, informativeness and aesthetics. Each researcher is sequentially shown the three results (in randomized order) and asked to score each of them on a fixed numerical scale for each metric. The final results are averaged for each metric.

As shown in Tab. 2, our method is comparable to the original posters on readability and informativeness, and it scores higher than the posters made by novices on every metric. This validates the effectiveness of our method, since the inferred panel attributes and generated panel layout preserve the most valuable and important information. In contrast, our method scores lower than the original posters on the aesthetics metric (yet still higher than the posters from novices). This is reasonable, since aesthetics is a relatively subjective metric and generally requires a “human touch”. Generating more aesthetic posters from papers remains an open problem.

Figure 4: Qualitative comparison of our result (b) and novice’s result (a). Please refer to supplementary material for larger size figures.

Qualitative Evaluation of Three Methods. We qualitatively compare our result (Figure 3(b)) with the poster from novices in Figure 3(a) and the original poster in Figure 3(c). All of them are for the same paper.

Interestingly, our panel layout looks more similar to that of the original poster than the layout produced by the novices does. There are two reasons for this: first, the posters in the Poster-Paper dataset share a relatively similar, high-quality graphical design; second, our splitting and panel layout algorithms simulate well the way people design posters. In contrast, the poster designed by novices in Figure 3(a) has two columns, which appears less attractive to our 10 researchers. It took the novices around 2 hours to finish all the posters.

Further Qualitative Evaluation. We further qualitatively evaluate our results (Figure 4) by the general graphical design principles [O’Donovan, Agarwala, and Hertzmann2014], i.e., flow, alignment, and overlap and boundaries.

Flow. It is essential for a scientific poster to present information in a clear read-order, i.e., to be readable. People typically read a scientific poster from left to right and from top to bottom. Since Algorithm 1 recursively splits the poster page into left and right or top and bottom parts, the panel layout we generate ensures that the read-order matches the section order of the original paper. Within each panel, our algorithm also organizes contents sequentially, following the section order of the original paper, which improves readability.

Alignment. Compared with the complex alignment constraint in [O’Donovan, Agarwala, and Hertzmann2014], our formulation is much simpler and uses an enumeration variable to indicate the horizontal position of graphical elements. This simplification does not spoil our results, which still have reasonable alignment, as illustrated in Figure 4 and quantitatively evaluated by the three metrics in Tab. 2.

Overlap and boundaries. Overlapped panels make the poster less readable and less aesthetic. To avoid this, our approach (1) recursively splits the page for panel layout; (2) sequentially arranges the panels; and (3) enforces the constraint in Eq. 9 to rule out overlaps between graphical elements and panel boundaries. As a result, our algorithm achieves reasonable results without significant overlapping or boundary crossing. Similar to the manually created poster in Figure 3(c), our result in Figure 3(b) does not have significantly overlapped panels or boundaries.

6 Conclusion and Future Work

Automatic tools for scientific poster generation are important for poster designers, who can save a lot of time with such tools. Design is hard work, especially for scientific posters, which require careful consideration of both utility and aesthetics. Abstract principles about scientific poster design cannot help designers directly. By contrast, we propose an approach to learn design patterns from existing examples, and this approach will hopefully lead to an automatic tool for scientific poster generation to aid designers.

Beyond scientific poster design, our approach also provides a framework for learning other kinds of design patterns, for example web-page design and single-page graphical design. By providing different sets of training data, our approach could generate different layout styles. Our work has several limitations. We do not consider font types in our current implementation and only adopt a simple yet effective aesthetic metric. We plan to address these problems in the future.

7 Acknowledgements

We would like to thank the anonymous reviewers for their insightful suggestions in improving this paper.

References

  • [Arai and Herman2010] Arai, K., and Herman, T. 2010. Method for automatic e-comic scene frame extraction for reading comic on mobile devices. In Information Technology: New Generations (ITNG), 2010 Seventh International Conference on, 370–375. IEEE.
  • [Cao, Chan, and Lau2012] Cao, Y.; Chan, A. B.; and Lau, R. W. H. 2012. Automatic stylistic manga layout. ACM Trans. Graph. 31(6):141:1–141:10.
  • [Cao, Lau, and Chan2014] Cao, Y.; Lau, R. W.; and Chan, A. B. 2014. Look over here: Attention-directing composition of manga elements. ACM Transactions on Graphics (TOG) 33(4):94.
  • [Damera-Venkata, Bento, and O’Brien-Strain2011] Damera-Venkata, N.; Bento, J.; and O’Brien-Strain, E. 2011. Probabilistic document model for automated document composition. In Proceedings of the 11th ACM symposium on Document engineering, 3–12. ACM.
  • [Fung and Chang1990] Fung, R. M., and Chang, K.-C. 1990. Weighing and integrating evidence for stochastic simulation in bayesian networks. 209–220.
  • [Gajos and Weld2005] Gajos, K., and Weld, D. S. 2005. Preference elicitation for interface optimization. In Proceedings of the 18th annual ACM symposium on User interface software and technology, 173–182. ACM.
  • [Geigel and Loui2003] Geigel, J., and Loui, A. 2003. Using genetic algorithms for album page layouts. IEEE multimedia (4):16–27.
  • [Harrington et al.2004] Harrington, S. J.; Naveda, J. F.; Jones, R. P.; Roetling, P.; and Thakkar, N. 2004. Aesthetic measures for automated document layout. In Proceedings of the 2004 ACM symposium on Document engineering, 109–111. ACM.
  • [Hoashi et al.2011] Hoashi, K.; Ono, C.; Ishii, D.; and Watanabe, H. 2011. Automatic preview generation of comic episodes for digitized comic search. In Proceedings of the 19th ACM international conference on Multimedia, 1489–1492. ACM.
  • [Hurst, Li, and Marriott2009] Hurst, N.; Li, W.; and Marriott, K. 2009. Review of automatic document formatting. In Proceedings of the 9th ACM symposium on Document engineering, 99–108. ACM.
  • [Jacobs et al.2003] Jacobs, C.; Li, W.; Schrier, E.; Bargeron, D.; and Salesin, D. 2003. Adaptive grid-based document layout. 22(3):838–847.
  • [Jing et al.2015] Jing, G.; Hu, Y.; Guo, Y.; Yu, Y.; and Wang, W. 2015. Content-aware video2comics with manga-style layout. Multimedia, IEEE Transactions on 17(12):2122–2133.
  • [Matsui, Yamasaki, and Aizawa2011] Matsui, Y.; Yamasaki, T.; and Aizawa, K. 2011. Interactive manga retargeting. In ACM SIGGRAPH 2011 Posters, 35. ACM.
  • [Merrell et al.2011] Merrell, P.; Schkufza, E.; Li, Z.; Agrawala, M.; and Koltun, V. 2011. Interactive furniture layout using interior design guidelines. ACM Transactions on Graphics (TOG) 30(4):87.
  • [Mihalcea and Tarau2004] Mihalcea, R., and Tarau, P. 2004. Textrank: Bringing order into texts. Association for Computational Linguistics.
  • [Murphy2002] Murphy, K. 2002. Bayes net toolbox for matlab.
  • [O’Donovan, Agarwala, and Hertzmann2014] O’Donovan, P.; Agarwala, A.; and Hertzmann, A. 2014. Learning layouts for single-page graphic designs. Visualization and Computer Graphics, IEEE Transactions on 20(8):1200–1213.
  • [Pang et al.2014] Pang, X.; Cao, Y.; Lau, R. W.; and Chan, A. B. 2014. A robust panel extraction method for manga. In Proceedings of the ACM International Conference on Multimedia, ACM MM.
  • [Qu et al.2008] Qu, Y.; Pang, W.-M.; Wong, T.-T.; and Heng, P.-A. 2008. Richness-preserving manga screening. 27(5):155.
  • [Yu et al.2011] Yu, L.-F.; Yeung, S.-K.; Tang, C.-K.; Terzopoulos, D.; Chan, T. F.; and Osher, S. J. 2011. Make it home: automatic optimization of furniture arrangement. ACM Transactions on Graphics (TOG)-Proceedings of ACM SIGGRAPH 2011, v. 30, no. 4, July 2011, article no. 86.