Grale: Designing Networks for Graph Learning

07/23/2020
by Jonathan Halcrow, et al.

How can we find the right graph for semi-supervised learning? In real-world applications, the choice of which edges to use for computation is the first step in any graph learning process. Interestingly, there are often many types of similarity available to choose as the edges between nodes, and the choice of edges can drastically affect the performance of downstream semi-supervised learning systems. However, despite the importance of graph design, most of the literature assumes that the graph is static. In this work, we present Grale, a scalable method we have developed to address the problem of graph design for graphs with billions of nodes. Grale operates by fusing together different measures of (potentially weak) similarity to create a graph which exhibits high task-specific homophily between its nodes. Grale is designed to run on large datasets. We have deployed Grale in more than 20 different industrial settings at Google, including datasets which have tens of billions of nodes and hundreds of trillions of potential edges to score. By employing locality-sensitive hashing techniques, we greatly reduce the number of pairs that need to be scored, allowing us to learn a task-specific model and build the associated nearest neighbor graph for such datasets in hours, rather than the days or even weeks that might be required otherwise. We illustrate this through a case study where we examine the application of Grale to an abuse classification problem on YouTube with hundreds of millions of items. In this application, we find that Grale detects a large number of malicious actors on top of hard-coded rules and content classifiers, increasing the total recall by 89% over those approaches alone.
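
The abstract only names the ingredients (locality-sensitive hashing to limit which pairs are scored, then a nearest neighbor graph built from the scored pairs), so the following is a minimal sketch of that general idea and not Grale's actual implementation. It assumes random-hyperplane (SimHash-style) bucketing, and it uses plain cosine similarity as a stand-in for the learned task-specific pair model. The names `simhash_key`, `candidate_pairs`, `cosine`, and `build_knn_graph`, along with the parameters `num_bits`, `num_tables`, and `top_k`, are hypothetical and chosen for illustration.

```python
import itertools
import math
import random
from collections import defaultdict


def simhash_key(vec, hyperplanes):
    """Bucket key: the sign of vec's dot product with each random hyperplane."""
    return tuple(
        1 if sum(v * h for v, h in zip(vec, plane)) >= 0 else 0
        for plane in hyperplanes
    )


def candidate_pairs(points, num_bits=4, num_tables=3, seed=0):
    """Return index pairs (i, j), i < j, that share a bucket in some hash table.

    Only these pairs are scored later, instead of all n * (n - 1) / 2 pairs.
    """
    rng = random.Random(seed)
    dim = len(points[0])
    pairs = set()
    for _ in range(num_tables):
        # Each table draws its own set of random hyperplanes.
        hyperplanes = [
            [rng.gauss(0, 1) for _ in range(dim)] for _ in range(num_bits)
        ]
        buckets = defaultdict(list)
        for idx, vec in enumerate(points):
            buckets[simhash_key(vec, hyperplanes)].append(idx)
        for members in buckets.values():
            pairs.update(itertools.combinations(members, 2))
    return pairs


def cosine(a, b):
    """Stand-in pair scorer (assumption); the paper describes a learned model."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_knn_graph(points, score_fn, top_k=2, **lsh_kwargs):
    """Score only the LSH candidate pairs and keep each node's top_k neighbors."""
    scored = defaultdict(list)
    for i, j in candidate_pairs(points, **lsh_kwargs):
        s = score_fn(points[i], points[j])
        scored[i].append((s, j))
        scored[j].append((s, i))
    return {
        node: [nbr for _, nbr in sorted(edges, reverse=True)[:top_k]]
        for node, edges in scored.items()
    }


if __name__ == "__main__":
    pts = [[1.0, 0.0], [1.0, 0.0], [-1.0, 0.0]]
    print(build_knn_graph(pts, cosine))
```

In this sketch, raising `num_bits` makes buckets smaller (fewer pairs to score, lower recall), while raising `num_tables` gives similar items more chances to collide. Nodes that never share a bucket with another node simply do not appear in the resulting graph.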

research
09/09/2016

Semi-Supervised Classification with Graph Convolutional Networks

We present a scalable approach for semi-supervised learning on graph-str...
research
03/11/2019

L^γ-PageRank for Semi-Supervised Learning

PageRank for Semi-Supervised Learning has shown to leverage data structu...
research
02/23/2020

End-To-End Graph-based Deep Semi-Supervised Learning

The quality of a graph is determined jointly by three key factors of the...
research
02/20/2021

GLAM: Graph Learning by Modeling Affinity to Labeled Nodes for Graph Neural Networks

Graph Neural Networks have shown excellent performance on semi-supervise...
research
12/05/2022

Stars: Tera-Scale Graph Building for Clustering and Graph Learning

A fundamental procedure in the analysis of massive datasets is the const...
research
02/06/2021

Speaker attribution with voice profiles by graph-based semi-supervised learning

Speaker attribution is required in many real-world applications, such as...
research
12/26/2017

Scalable Prototype Selection by Genetic Algorithms and Hashing

Classification in the dissimilarity space has become a very active resea...
