ScreenSeg: On-Device Screenshot Layout Analysis

by Manoj Goyal, et al.

We propose a novel end-to-end solution that performs hierarchical layout analysis of screenshots and document images on resource-constrained devices such as mobile phones. Our approach segments entities such as Grid, Image, Text, and Icon blocks occurring in a screenshot. We provide an option for smart editing by automatically highlighting these entities for saving or sharing. Further, this multi-level layout analysis of screenshots has many use cases, including content extraction, keyword-based image search, and style transfer. We address the limitations of known baseline approaches, support a wide variety of semantically complex screenshots, and develop an approach that is highly optimized for on-device deployment. In addition, we present a novel weighted NMS (non-maximum suppression) technique for filtering object proposals. We achieve an average precision of about 0.95 with a latency of around 200 ms on a Samsung Galaxy S10 device for a screenshot of 1080p resolution. The solution pipeline is already commercialized in Samsung device applications, namely Samsung Capture, Smart Crop, My Filter in the Camera application, and Bixby Touch.
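The abstract does not describe how its weighted NMS works, so the following is only a minimal sketch of one common, generic weighted-NMS formulation, not the authors' technique. In this variant, overlapping proposals are fused into a score-weighted average box instead of being discarded. The function names, the `(x1, y1, x2, y2)` box format, and the `iou_thresh` default of 0.5 are all assumptions made for illustration.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # assumed format: (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def weighted_nms(
    boxes: List[Box], scores: List[float], iou_thresh: float = 0.5
) -> List[Tuple[Box, float]]:
    """Generic weighted NMS (hypothetical sketch, not the paper's method).

    Repeatedly takes the highest-scoring remaining box, gathers every
    remaining box whose IoU with it is at least iou_thresh, and replaces
    the group with its score-weighted average box.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    merged: List[Tuple[Box, float]] = []
    while order:
        top = order[0]
        # The top box always belongs to its own cluster.
        cluster = [top] + [
            i for i in order[1:] if iou(boxes[top], boxes[i]) >= iou_thresh
        ]
        total = sum(scores[i] for i in cluster)
        if total > 0:
            fused = tuple(
                sum(scores[i] * boxes[i][k] for i in cluster) / total
                for k in range(4)
            )
        else:
            # All scores are zero: fall back to the top box unchanged.
            fused = boxes[top]
        merged.append((fused, scores[top]))
        members = set(cluster)
        order = [i for i in order if i not in members]
    return merged


example_boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
example_scores = [0.9, 0.6, 0.8]
example_result = weighted_nms(example_boxes, example_scores)
```

In this example the first two boxes overlap with an IoU of about 0.68, so they are fused into a single box pulled toward the higher-scoring proposal, while the third box is kept as is. Standard NMS would instead drop the second box outright.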




