An Advert Creation System for Next-Gen Publicity

08/01/2018 ∙ by Atul Nautiyal, et al.

With the rapid proliferation of multimedia data on the internet, there has been a fast rise in the creation of videos for viewers. However, viewers can skip the advertisement breaks in these videos using ad blockers and 'skip ad' buttons, bringing online marketing and publicity to a stall. In this paper, we demonstrate a system that can effectively integrate a new advertisement into a video sequence. We use state-of-the-art techniques from deep learning and computational photogrammetry for effective detection of existing adverts and seamless integration of new adverts into video sequences. This is helpful for targeted advertisement, paving the path for next-gen publicity.




1 Introduction

With the ubiquity of multimedia videos, there has been massive interest from advertisement and marketing agencies in providing targeted advertisements to customers. Such targeted advertisements are useful from the perspectives of both marketing agents and end users. Advertisement agencies gain a powerful medium for marketing and publicity, and users get a personalized consumer experience. In this paper, we address this need by designing an online advert creation system for next-gen publicity. We develop and implement an end-to-end system that automatically detects an existing billboard in a video and seamlessly replaces it with a new advert. This system will help online marketers and content developers create video content for a targeted audience.

Figure 1 illustrates our system. Our system automatically detects the presence of a billboard in an image frame from the video sequence. After detecting the billboard, our system localizes its position in the image frame. The user is given an opportunity to manually adjust and refine the four detected corners of the billboard. Finally, a new advertisement is integrated into the image and tracked across all frames of the video sequence. We thereby generate a new composite video with the integrated advert.

Figure 1: New advert integrated into the scene at the place of an existing billboard.

Currently, no existing framework in the literature helps marketing agents seamlessly integrate a new advertisement into an original video sequence. However, a few companies, e.g. Mirriad [1], use patented advertisement plantation techniques to integrate 3D objects into a video sequence.

2 Technology

The backbone of our advert creation system is based on state-of-the-art techniques from deep learning and image processing. In this section, we briefly describe the underlying techniques used in the various components of the demo system. The modules of our system are advert recognition, advert localization, and advert integration.

2.1 Advert Recognition

The first module of our advert creation system recognizes billboards: does an image frame from the video sequence contain a billboard? (In this paper, we use the terms billboard and advert interchangeably to indicate a candidate object for new advertisement integration in an image frame.) This lets the system user automatically detect the presence of a billboard in an image frame of the video. We use a deep neural network (DNN) as a binary classifier, where the two classes represent the presence and absence of a billboard in the video frame, respectively. We use a VGG-based network [4] for billboard detection, and apply transfer learning with pre-trained ImageNet weights. We freeze the weights of all layers apart from the last layers. We add fully connected layers, with a softmax layer as the output layer. We train this deep network on our annotated dataset, which contains both billboard and non-billboard images, and achieve good accuracy on billboard recognition.
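The idea of training only a small softmax head on top of frozen features can be illustrated with a minimal sketch. This is not the VGG-based network described above: the 2-D feature vectors are made-up stand-ins for the activations of a frozen pre-trained network, and the function names, learning rate, and class order (0 = no billboard, 1 = billboard) are assumptions chosen for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of class scores."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def class_scores(weights, biases, features):
    """Linear layer: one score per class for a single feature vector."""
    return [sum(w * f for w, f in zip(weights[c], features)) + biases[c]
            for c in range(len(biases))]


def train_head(feature_vectors, labels, epochs=200, learning_rate=0.5):
    """Train a 2-class softmax layer with SGD on fixed (frozen) features.

    Only the head's weights change; the feature extractor is never touched,
    which mirrors freezing the pre-trained layers.
    """
    dim = len(feature_vectors[0])
    weights = [[0.0] * dim for _ in range(2)]
    biases = [0.0, 0.0]
    for _ in range(epochs):
        for features, label in zip(feature_vectors, labels):
            probs = softmax(class_scores(weights, biases, features))
            for c in range(2):
                # Gradient of the cross-entropy loss w.r.t. the class-c score.
                grad = probs[c] - (1.0 if c == label else 0.0)
                for j in range(dim):
                    weights[c][j] -= learning_rate * grad * features[j]
                biases[c] -= learning_rate * grad
    return weights, biases


def predict(weights, biases, features):
    """Return (predicted class, class probabilities) for one feature vector."""
    probs = softmax(class_scores(weights, biases, features))
    return probs.index(max(probs)), probs


# Hypothetical frozen-network features: label 1 = billboard, 0 = no billboard.
toy_features = [[1.0, 0.2], [0.9, 0.1], [0.1, 0.8], [0.2, 1.0]]
toy_labels = [1, 1, 0, 0]
head_weights, head_biases = train_head(toy_features, toy_labels)
```

Calling `predict(head_weights, head_biases, features)` on a new feature vector returns the predicted class together with the two softmax probabilities, which is the form of output a binary billboard classifier would produce per frame.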

2.2 Advert Localization

The second module of our advert creation system localizes the position of the recognized billboard: where is the billboard located in the image frame? We use an encoder-decoder based deep neural network that localizes the billboard position in an image. We train this model on our billboard dataset, comprising input images (cf. Fig. 2(a)) and the corresponding binary ground truth images (cf. Fig. 2(b)). We train the model for several thousand epochs. The output of the model is a probabilistic image (heatmap) that denotes the probability of each image pixel belonging to the billboard class. We generate a binary image from this heatmap using thresholding, and detect the closed contours in the binary image. We then select the contour with the largest area as our localized billboard position. Finally, we compute the initial four corners by circumscribing a rectangle of minimum bounding area around the selected contour. The localized advert is shown in Fig. 2.

(a) Input Image
(b) Ground Truth
(c) Detected Advert
(d) Localized Advert
Figure 2: Localization of billboard using our advert creation system. We localize the advert from the probabilistic heatmap, by circumscribing a rectangle with minimum bounding area.
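The post-processing of the heatmap can be sketched as follows. This is a simplified illustration under stated assumptions, not the system's implementation: it measures region size by pixel count rather than contour area, and it returns an axis-aligned bounding box instead of the minimum-area (possibly rotated) rectangle described above. The function name and the threshold of 0.5 are hypothetical.

```python
from collections import deque


def localize_largest_region(heatmap, threshold=0.5):
    """Threshold a probability heatmap and return the four corners
    (x, y) of the axis-aligned box around the largest connected region.

    Returns None when no pixel reaches the threshold.
    """
    rows, cols = len(heatmap), len(heatmap[0])
    mask = [[p >= threshold for p in row] for row in heatmap]
    seen = [[False] * cols for _ in range(rows)]
    best_region = []

    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill over 4-connected foreground pixels.
            region = []
            queue = deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                region.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(region) > len(best_region):
                best_region = region

    if not best_region:
        return None

    ys = [y for y, _ in best_region]
    xs = [x for _, x in best_region]
    left, right, top, bottom = min(xs), max(xs), min(ys), max(ys)
    # Corner order: top-left, top-right, bottom-right, bottom-left.
    return [(left, top), (right, top), (right, bottom), (left, bottom)]
```

The four returned corners play the role of the initial corner estimate that the user can later refine by hand. A library routine for minimum-area rectangles would be needed to handle billboards that appear rotated in the frame.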

2.3 Advert Integration

The third and final module of our system is advert integration: how do we integrate a new advert into the video? In this stage, the localized billboard is replaced with a new advert in a seamless and temporally consistent manner. We apply Poisson image editing [3] to the new advert, so that it matches the local illumination and local color tone of the original video sequence. Furthermore, the relative motion of the billboard within the scene is tracked using the Kanade-Lucas-Tomasi (KLT) [2] tracking technique.
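The principle behind Poisson image editing can be shown on a one-dimensional toy signal. The sketch below is an illustration only, not the system's C++ implementation: it keeps the second differences (the discrete Laplacian) of the source inside the pasted region while forcing the result to agree with the target at the region boundary, and solves the resulting equations with Gauss-Seidel iterations. The function name, the iteration count, and the sample values are assumptions.

```python
def poisson_blend_1d(target, source, start, end, iterations=500):
    """Blend source[start:end] into target in the gradient domain.

    Inside the region, the result f satisfies
        f[i-1] - 2*f[i] + f[i+1] == source[i-1] - 2*source[i] + source[i+1]
    and outside the region f equals the target, so target[start-1] and
    target[end] act as boundary conditions.
    Requires 1 <= start < end <= len(target) - 1.
    """
    if not (1 <= start < end <= len(target) - 1):
        raise ValueError("region must leave one boundary sample on each side")

    result = [float(v) for v in target]
    # Initial guess: copy the raw source values into the region.
    for i in range(start, end):
        result[i] = float(source[i])

    for _ in range(iterations):
        for i in range(start, end):
            laplacian = source[i - 1] - 2 * source[i] + source[i + 1]
            # Gauss-Seidel update derived from the equation in the docstring.
            result[i] = (result[i - 1] + result[i + 1] - laplacian) / 2.0
    return result


# Toy data: a flat, bright target and a darker source with some texture.
target_row = [10.0] * 8
source_row = [0.0, 1.0, 3.0, 2.0, 4.0, 1.0, 0.0, 0.0]
blended_row = poisson_blend_1d(target_row, source_row, start=1, end=7)
```

In this toy case the source is darker than the target by a constant amount at both boundaries, so the blended values keep the source's texture but are lifted to the target's brightness level. The same idea in two dimensions is what lets a pasted advert pick up the illumination of the surrounding frame.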

3 Design and Interface

We have designed an online system to demonstrate the functionalities of the various modules. (A demonstration video of our advert creation system can be accessed via [link missing].) The web user interface is built with Vue.js, the progressive JavaScript framework. The back end is built on Express, the Node.js web application framework. The deep neural networks for advert recognition and localization are implemented in pure Python, and the advert integration is implemented in C++. The web service that supports advert detection is implemented in Python with Flask. The integration of a new advert into the existing video is executed on the web server by a C++ binary.

Figure 3 illustrates a sample snapshot of our web-based tool. The web interface consists of three main sections: Home, Demo and Images. The Home page provides an overview of the system. The next page, Demo, presents the entire working prototype of our system. The user selects a sample video from the list and runs the billboard detection module to localize the billboard in sample image frames of the video. The detection module estimates the four corners of the billboard. However, the user also has the option to refine the four corners manually if the detected corners are not completely accurate. The refined four corners of the billboard are subsequently used for tracking and for integrating a new advertisement into the video sequence. The third and final page, Images, contains the list of all candidate adverts that can be integrated into the selected video sequence.

Figure 3: Interface of the demo for advert detection and integration.

Finally, our system integrates the new advertisement into the detected billboard position, and generates a new composite video with the implanted advertisement.
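One common way to place a rectangular advert onto four billboard corners is a planar perspective transform (homography). The text does not state which geometric mapping the system uses, so the sketch below is an assumption offered for illustration: it solves for the eight homography parameters from four corner correspondences using plain Gaussian elimination. All function names and the example coordinates are hypothetical.

```python
def solve_linear(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [[float(v) for v in row] + [float(b)] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("degenerate corner configuration")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    solution = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][k] * solution[k] for k in range(r + 1, n))
        solution[r] = (aug[r][n] - tail) / aug[r][r]
    return solution


def homography_from_corners(advert_corners, billboard_corners):
    """Return the 8 parameters h0..h7 mapping advert (x, y) to frame (u, v):

        u = (h0*x + h1*y + h2) / (h6*x + h7*y + 1)
        v = (h3*x + h4*y + h5) / (h6*x + h7*y + 1)
    """
    rows, rhs = [], []
    for (x, y), (u, v) in zip(advert_corners, billboard_corners):
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        rhs.append(u)
        rows.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        rhs.append(v)
    return solve_linear(rows, rhs)


def warp_point(h, x, y):
    """Map one advert pixel coordinate into the video frame."""
    w = h[6] * x + h[7] * y + 1.0
    return ((h[0] * x + h[1] * y + h[2]) / w,
            (h[3] * x + h[4] * y + h[5]) / w)


# Hypothetical example: a 100x50 advert mapped onto a skewed billboard quad.
advert = [(0, 0), (100, 0), (100, 50), (0, 50)]
billboard = [(10, 20), (90, 30), (85, 70), (12, 65)]
params = homography_from_corners(advert, billboard)
```

With `params` in hand, `warp_point(params, x, y)` gives the frame position of any advert pixel. Re-estimating the parameters from the tracked corners in each frame would keep the advert attached to the billboard as the camera moves.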

4 Conclusion and Future Work

In this paper, we have presented an online advert creation system for personalized and targeted advertisement in multimedia videos. We use techniques from deep neural networks and image processing for seamless integration of new adverts into existing videos. Our system is trained on datasets that comprise outdoor scenes and views. Our future work involves further refining the performance of the system and generalizing it to other types of video sequences.


The ADAPT Centre for Digital Content Technology is funded under the SFI Research Centres Programme (Grant 13/RC/2106) and is co-funded under the European Regional Development Fund.