QIGen: Generating Efficient Kernels for Quantized Inference on Large Language Models

07/07/2023
by Tommaso Pegolotti, et al.

We present ongoing work on a new automatic code generation approach for supporting quantized generative inference on LLMs such as LLaMA or OPT on off-the-shelf CPUs. Our approach is informed by the target architecture and a performance model, including both hardware characteristics and method-specific accuracy constraints. Results on CPU-based inference for LLaMA models show that our approach can lead to high performance and high accuracy, comparing favorably to the best existing open-source solution. A preliminary implementation is available at https://github.com/IST-DASLab/QIGen.
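To make the kernel-generation target concrete, the sketch below shows, in plain C, the kind of group-quantized matrix-vector product that CPU inference kernels of this type compute. It is a minimal scalar version assuming 4-bit weights packed two per byte with per-group scales and zero points; the layout, group size, and names are illustrative assumptions, not QIGen's actual generated code, which is specialized and vectorized for the target architecture.

    #include <stdint.h>
    #include <stddef.h>

    /* Illustrative sketch only: a 4-bit group-quantized matrix-vector
     * product of the kind a quantized-inference kernel computes.
     * Layout, group size, and names are assumptions for this example. */

    #define GROUP_SIZE 64  /* assumed quantization group size */

    /* qweight: packed 4-bit weights, two values per byte, row-major
     * scales, zeros: one float scale and zero point per (row, group)
     * x: input activations (cols floats); y: output (rows floats) */
    void qgemv_4bit(const uint8_t *qweight,
                    const float *scales, const float *zeros,
                    const float *x, float *y,
                    size_t rows, size_t cols)
    {
        size_t groups_per_row = cols / GROUP_SIZE;
        for (size_t r = 0; r < rows; r++) {
            float acc = 0.0f;
            for (size_t g = 0; g < groups_per_row; g++) {
                float scale = scales[r * groups_per_row + g];
                float zero  = zeros[r * groups_per_row + g];
                for (size_t k = 0; k < GROUP_SIZE; k += 2) {
                    size_t col = g * GROUP_SIZE + k;
                    uint8_t byte = qweight[(r * cols + col) / 2];
                    /* unpack two 4-bit values, dequantize, accumulate */
                    float w0 = ((float)(byte & 0x0F) - zero) * scale;
                    float w1 = ((float)(byte >> 4)  - zero) * scale;
                    acc += w0 * x[col] + w1 * x[col + 1];
                }
            }
            y[r] = acc;
        }
    }

A generated kernel would additionally tile over rows and use SIMD intrinsics selected from the performance model; the scalar loops above only fix the arithmetic being performed.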
