Self-organized Hierarchical Softmax

07/26/2017
by Yikang Shen, et al.

We propose a new self-organizing hierarchical softmax formulation for neural-network-based language models over large vocabularies. Instead of using a predefined hierarchical structure, our approach is capable of learning word clusters with clear syntactical and semantic meaning during the language model training process. We provide experiments on standard benchmarks for language modeling and sentence compression tasks. We find that this approach is as fast as other efficient softmax approximations, while achieving comparable or even better performance relative to similar full softmax models.
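
Since the abstract describes learning word clusters for a hierarchical softmax, the sketch below shows the generic two-level factorization such cluster-based softmax layers rely on: the probability of a word is the probability of its cluster times the probability of the word within that cluster. This is a minimal illustration assuming PyTorch; the class and parameter names are hypothetical, and it deliberately omits the paper's self-organization step, which learns the word-to-cluster assignment during training instead of fixing it in advance.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class TwoLevelSoftmax(nn.Module):
        """Generic two-level (cluster -> word) hierarchical softmax sketch.

        Not the paper's method: cluster membership is fixed here, whereas the
        paper learns it jointly with the language model.
        """

        def __init__(self, hidden_size, cluster_sizes):
            super().__init__()
            # One linear layer scores the clusters, one per cluster scores its words.
            self.cluster_scorer = nn.Linear(hidden_size, len(cluster_sizes))
            self.word_scorers = nn.ModuleList(
                [nn.Linear(hidden_size, size) for size in cluster_sizes]
            )

        def log_prob(self, hidden, cluster_id, word_id):
            # log P(w | h) = log P(c | h) + log P(w | c, h)
            log_p_cluster = F.log_softmax(self.cluster_scorer(hidden), dim=-1)[:, cluster_id]
            log_p_word = F.log_softmax(self.word_scorers[cluster_id](hidden), dim=-1)[:, word_id]
            return log_p_cluster + log_p_word

    # Example: a 10,000-word vocabulary split into 100 clusters of 100 words each,
    # so each prediction touches ~200 output units instead of 10,000.
    model = TwoLevelSoftmax(hidden_size=256, cluster_sizes=[100] * 100)
    h = torch.randn(1, 256)  # a batch of one hidden state
    print(model.log_prob(h, cluster_id=3, word_id=42))

The speedup comes from the factorization: scoring one cluster distribution plus one within-cluster distribution is much cheaper than a full softmax over the whole vocabulary, which is why the authors can match full-softmax quality at the cost of the faster approximations.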
