Fighting Redundancy and Model Decay with Embeddings

09/18/2018
by Dan Shiebler, et al.

Every day, hundreds of millions of new Tweets in over 40 languages of ever-shifting vernacular flow through Twitter. Models that attempt to extract insight from this firehose of information must face the torrential covariate shift that is endemic to the Twitter platform. While regularly retrained algorithms can maintain performance in the face of this shift, fixed model features that fail to represent new trends and tokens can quickly become stale, resulting in performance degradation. To mitigate this problem, we employ learned features, or embedding models, that can efficiently represent the most relevant aspects of a data distribution. Sharing these embedding models across teams can also reduce redundancy and multiplicatively increase cross-team modeling productivity. In this paper, we detail the commoditized tools, algorithms, and pipelines that we have developed and are developing at Twitter to regularly generate high-quality, up-to-date embeddings and share them broadly across the company.

research
07/25/2020

Storywrangler: A massive exploratorium for sociolinguistic, cultural, socioeconomic, and political timelines using Twitter

In real-time, Twitter strongly imprints world events, popular culture, a...
research
04/10/2022

Decay No More: A Persistent Twitter Dataset for Learning Social Meaning

With the proliferation of social media, many studies resort to social me...
research
01/01/2021

Tweeting for the Cause: Network analysis of UK petition sharing

Online government petitions represent a new data-rich mode of political ...
research
06/13/2020

Through the Twitter Glass: Detecting Questions in Micro-Text

In a separate study, we were interested in understanding people's Q&A ...
research
04/21/2019

Probabilistic Face Embeddings

Embedding methods have achieved success in face recognition by comparing...
research
05/12/2019

The Secret Lives of Names? Name Embeddings from Social Media

Your name tells a lot about you: your gender, ethnicity and so on. It ha...
