Scattertext: a Browser-Based Tool for Visualizing how Corpora Differ

03/02/2017
by Jason S. Kessler, et al.

Scattertext is an open-source tool for visualizing linguistic variation between document categories in a language-independent way. The tool presents a scatterplot in which each axis corresponds to the rank-frequency with which a term occurs in a category of documents. Through a tie-breaking strategy, the tool can display thousands of visible term-representing points and find space to legibly label hundreds of them. Scattertext also lends itself to a query-based visualization of how the use of terms with similar embeddings differs between document categories, as well as a visualization for comparing the importance scores of bag-of-words features to univariate metrics.
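
To make the rank-frequency axes concrete, here is a minimal Python sketch of how each term could be assigned a coordinate from its frequency rank in two categories. It does not use the Scattertext library, and it is not the tool's implementation: the function names `dense_rank_scale` and `rank_frequency_coords`, the whitespace tokenization, and the use of dense ranks (tied frequencies share one position) are all assumptions made for illustration. In particular, the tie-breaking strategy mentioned in the abstract is not reproduced here.

```python
from collections import Counter


def dense_rank_scale(counts):
    """Map each term to the dense rank of its frequency, scaled to [0, 1].

    Terms with equal frequency share a rank (an illustrative assumption;
    this is not the tool's tie-breaking strategy).
    """
    distinct = sorted(set(counts.values()))
    rank_of = {freq: i for i, freq in enumerate(distinct)}
    top = max(len(distinct) - 1, 1)  # avoid division by zero
    return {term: rank_of[freq] / top for term, freq in counts.items()}


def rank_frequency_coords(docs_a, docs_b):
    """Return {term: (x, y)}: x is the scaled frequency rank in category A,
    y is the scaled frequency rank in category B."""
    # Hypothetical tokenizer: lowercase and split on whitespace.
    counts_a = Counter(tok for doc in docs_a for tok in doc.lower().split())
    counts_b = Counter(tok for doc in docs_b for tok in doc.lower().split())
    vocab = set(counts_a) | set(counts_b)
    # Terms absent from a category get a count of zero there.
    full_a = {t: counts_a.get(t, 0) for t in vocab}
    full_b = {t: counts_b.get(t, 0) for t in vocab}
    xs = dense_rank_scale(full_a)
    ys = dense_rank_scale(full_b)
    return {t: (xs[t], ys[t]) for t in vocab}


if __name__ == "__main__":
    coords = rank_frequency_coords(
        ["tax cut tax", "jobs tax"],
        ["health care jobs", "care care"],
    )
    for term, (x, y) in sorted(coords.items()):
        print(f"{term:8s} x={x:.2f} y={y:.2f}")
```

In this toy input, a term frequent in only one category lands in a corner of the unit square, while a term used about equally in both sits near the diagonal.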