
Going Beyond T-SNE: Exposing whatlies in Text Embeddings

09/04/2020
by Vincent D. Warmerdam, et al.

We introduce whatlies, an open-source toolkit for visually inspecting word and sentence embeddings. The project offers a unified and extensible API with current support for a range of popular embedding backends, including spaCy, tfhub, huggingface transformers, gensim, fastText and BytePair embeddings. The package combines a domain-specific language for vector arithmetic with visualisation tools that make exploring word embeddings more intuitive and concise. It supports many popular dimensionality reduction techniques as well as interactive visualisations that can either be statically exported or shared via Jupyter notebooks. The project documentation is available from https://rasahq.github.io/whatlies/.
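To make the workflow concrete, the sketch below shows how a session with the spaCy backend might look: embeddings are retrieved by indexing into a language backend, combined with ordinary vector arithmetic, reduced with PCA or UMAP, and drawn as interactive charts inside a Jupyter notebook. The names used here (SpacyLanguage, EmbeddingSet, Pca, Umap, plot_interactive) follow the project documentation linked above, but the snippet is an illustrative sketch rather than a canonical reference.

    # Illustrative sketch of a whatlies session; class and method names follow
    # the project docs and may differ between versions.
    from whatlies import EmbeddingSet
    from whatlies.language import SpacyLanguage   # needs the en_core_web_md spaCy model
    from whatlies.transformers import Pca, Umap   # Umap additionally needs umap-learn

    # Pick one of the interchangeable embedding backends (spaCy here).
    lang = SpacyLanguage("en_core_web_md")

    words = ["man", "woman", "king", "queen", "prince", "princess",
             "doctor", "nurse", "cat", "dog"]
    emb = EmbeddingSet(*[lang[w] for w in words])

    # Vector arithmetic via the domain-specific language; the result is a new
    # Embedding whose name records the expression, e.g. "(king - man) + woman".
    analogy = lang["king"] - lang["man"] + lang["woman"]

    # Interactive scatter plot using two word embeddings as axes, rendered
    # inline in a notebook.
    emb.plot_interactive(x_axis="man", y_axis="woman")

    # The same set after two dimensionality reductions, composed side by side
    # (plot_interactive returns an Altair chart, so charts concatenate with "|").
    p1 = emb.transform(Pca(2)).plot_interactive("pca_0", "pca_1")
    p2 = emb.transform(Umap(2)).plot_interactive("umap_0", "umap_1")
    p1 | p2

Because the resulting charts are regular Altair objects, they can be saved as static files or shared as part of a notebook, which is how the "statically exported or shared via Jupyter notebooks" behaviour mentioned in the abstract would typically be used.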


Related Research

Fusing Vector Space Models for Domain-Specific Applications (09/05/2019)
We address the problem of tuning word embeddings for specific use cases ...

Simple and Effective Dimensionality Reduction for Word Embeddings (08/11/2017)
Word embeddings have become the basic building blocks for several natura...

ETNLP: A Toolkit for Extraction, Evaluation and Visualization of Pre-trained Word Embeddings (03/11/2019)
In this paper, we introduce a comprehensive toolkit, ETNLP, which can ev...

Interactive Visualization of Spatial Omics Neighborhoods (12/02/2021)
Dimensionality reduction of spatial omic data can reveal shared, spatial...

VERB: Visualizing and Interpreting Bias Mitigation Techniques for Word Representations (04/06/2021)
Word vector embeddings have been shown to contain and amplify biases in ...

Enabling Open-World Specification Mining via Unsupervised Learning (04/27/2019)
Many programming tasks require using both domain-specific code and well-...

ActUp: Analyzing and Consolidating tSNE and UMAP (05/12/2023)
tSNE and UMAP are popular dimensionality reduction algorithms due to the...

Code Repositories

whatlies

toolkit to help visualise - what lies in word embeddings

