# A parallel algorithm for Gaussian elimination over finite fields

In this paper we describe a parallel Gaussian elimination algorithm for matrices with entries in a finite field. Unlike previous approaches, our algorithm partitions a very large input matrix into roughly square blocks by subdividing both rows and columns, with the blocks sized so that computing with individual blocks on individual processors provides adequate concurrency. The algorithm also returns the transformation matrix, which encodes the row operations used. While keeping track of the row operations, we go to some lengths to avoid storing unnecessary data, such as block columns of the transformation matrix that are known to be zero. The algorithm is accompanied by a concurrency analysis which shows that the improvement in concurrency is of the same order of magnitude as the number of blocks. An implementation of the algorithm has been tested on matrices as large as 1 000 000 × 1 000 000 over small finite fields.
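To make the role of the transformation matrix concrete, here is a minimal sequential sketch in Python. It is not the blocked parallel algorithm described above. It assumes a prime field GF(p) and stores matrices as dense lists of rows. The names `echelon_with_transform` and `matmul_mod` are hypothetical and chosen only for illustration. The sketch row-reduces `A` and returns a matrix `T` that records the row operations, so that `T * A` equals the reduced matrix modulo `p`.

```python
def echelon_with_transform(A, p):
    """Row-reduce A over GF(p) (p prime, an assumption of this sketch).

    Returns (R, T, rank) where R is the reduced row echelon form of A
    and T is the transformation matrix with T * A == R (mod p).
    """
    m = len(A)
    n = len(A[0]) if m else 0
    R = [[x % p for x in row] for row in A]
    # T starts as the identity and receives every row operation applied to R.
    T = [[int(i == j) for j in range(m)] for i in range(m)]
    rank = 0
    for col in range(n):
        # Look for a nonzero entry in this column at or below the current row.
        pivot = next((r for r in range(rank, m) if R[r][col] != 0), None)
        if pivot is None:
            continue
        R[rank], R[pivot] = R[pivot], R[rank]
        T[rank], T[pivot] = T[pivot], T[rank]
        # Scale the pivot row so the pivot becomes 1 (modular inverse, Python 3.8+).
        inv = pow(R[rank][col], -1, p)
        R[rank] = [(x * inv) % p for x in R[rank]]
        T[rank] = [(x * inv) % p for x in T[rank]]
        # Clear the pivot column in every other row.
        for r in range(m):
            if r != rank and R[r][col] != 0:
                f = R[r][col]
                R[r] = [(a - f * b) % p for a, b in zip(R[r], R[rank])]
                T[r] = [(a - f * b) % p for a, b in zip(T[r], T[rank])]
        rank += 1
        if rank == m:
            break
    return R, T, rank


def matmul_mod(X, Y, p):
    """Multiply matrices X and Y with entries reduced modulo p."""
    return [[sum(x * y for x, y in zip(row, col)) % p for col in zip(*Y)]
            for row in X]


A = [[1, 2, 3], [2, 4, 6], [1, 0, 1]]
R, T, rank = echelon_with_transform(A, 7)
```

In this example the second row of `A` is twice the first, so the rank is 2, and `matmul_mod(T, A, 7)` reproduces `R`. A blocked variant would apply the same kind of operations to submatrices rather than to individual rows.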
