- Wikipedia article on sparse matrices
- Official SciPy reference guide on sparse matrices
- Stack Overflow question on converting a SciPy sparse matrix to a PyTorch sparse tensor
- Official PyTorch reference on sparse tensors
- GitHub PyTorch issue on batch matmul with sparse matrices
In numerical analysis and scientific computing, a sparse matrix or sparse array is a matrix in which most of the elements are zero. When storing and manipulating sparse matrices on a computer, it is beneficial and often necessary to use specialized algorithms and data structures that take advantage of the sparse structure of the matrix. Specialized computers have been made for sparse matrices, as they are common in the machine learning field. Operations using standard dense-matrix structures and algorithms are slow and inefficient when applied to large sparse matrices as processing and memory are wasted on the zeros. Sparse data is by nature more easily compressed and thus requires significantly less storage. Some very large sparse matrices are infeasible to manipulate using standard dense-matrix algorithms.
- Dictionary of keys (DOK): consists of a dictionary that maps (row, column) pairs to the values of the elements.
- Good for incrementally constructing a sparse matrix in random order,
- but poor for iterating over non-zero values in lexicographical order.
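As a minimal sketch of this format, SciPy's `dok_matrix` lets us assign elements in arbitrary order:

```python
from scipy.sparse import dok_matrix

m = dok_matrix((3, 3))
m[2, 0] = 7.0   # elements can be assigned in any order
m[0, 1] = 5.0
print(m.nnz)     # number of stored (non-zero) entries
print(m[0, 1])
```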
- List of lists (LIL): stores one list per row, with each entry containing the column index and the value.
- These entries are kept sorted by column index for faster lookup.
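SciPy's `lil_matrix` exposes these per-row lists directly, via its `rows` (sorted column indices) and `data` (values) attributes:

```python
from scipy.sparse import lil_matrix

m = lil_matrix((3, 4))
m[0, 2] = 1.0
m[0, 1] = 3.0     # inserted before column 2 so the list stays sorted
print(m.rows[0])  # column indices of row 0
print(m.data[0])  # corresponding values
```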
- Coordinate list (COO): stores a list of (row, column, value) tuples.
- Entries are sorted first by row index and then by column index, to improve random access times.
- Compressed sparse row (CSR): similar to COO, but compresses the row indices, hence the name.
- This format allows fast row access and matrix-vector multiplications.
For the matrix

```
10 20  0  0  0  0
 0 30  0 40  0  0
 0  0 50 60 70  0
 0  0  0  0  0 80
```

the CSR arrays are

```
V         = [ 10, 20, 30, 40, 50, 60, 70, 80 ]
COL_INDEX = [  0,  1,  1,  3,  2,  3,  4,  5 ]
ROW_INDEX = [  0,  2,  4,  7,  8 ]
```

ROW_INDEX records where each row's entries start and end in V:

- V[0:2] = [10, 20] is row 0, whose corresponding column indices are [0, 1];
- V[2:4] = [30, 40] is row 1, whose corresponding column indices are [1, 3];
- V[4:7] = [50, 60, 70] is row 2, whose corresponding column indices are [2, 3, 4];
- V[7:8] = [80] is row 3, whose corresponding column index is [5].

When the number of rows is smaller than the number of non-zero entries, this representation takes less storage than COO, since ROW_INDEX holds only (number of rows + 1) values instead of one row index per entry.
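This saving can be checked with SciPy on the example matrix: COO stores one row index per non-zero entry, while CSR stores only one pointer per row (plus one):

```python
import numpy as np
from scipy.sparse import coo_matrix

dense = np.array([[10, 20,  0,  0,  0,  0],
                  [ 0, 30,  0, 40,  0,  0],
                  [ 0,  0, 50, 60, 70,  0],
                  [ 0,  0,  0,  0,  0, 80]])
coo = coo_matrix(dense)
csr = coo.tocsr()
print(len(coo.row))     # 8: one row index per non-zero entry
print(len(csr.indptr))  # 5: rows + 1 pointers
```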
- Can be instantiated from a dense matrix or another sparse matrix.
- Can also be instantiated as an empty matrix, which supports incremental construction.
```python
import numpy as np
from scipy.sparse import coo_matrix
```

- `coo_matrix(D)` with a dense matrix `D`
- `coo_matrix(S)` with another sparse matrix `S` (equivalent to `S.tocoo()`)
- `coo_matrix((data, (i, j)), [shape=(M, N)])` from three arrays of values, row indices, and column indices
Suppose we want to create the adjacency matrix of the graph (1-2-3-4), which consists of only 4 nodes. The COO format makes this easy.
```python
from scipy.sparse import coo_matrix
```
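A sketch of that construction, assuming 0-indexed nodes and treating the path 1-2-3-4 as undirected with unit edge weights:

```python
import numpy as np
from scipy.sparse import coo_matrix

# undirected edges of the path 1-2-3-4, written with 0-indexed nodes
row = np.array([0, 1, 1, 2, 2, 3])
col = np.array([1, 0, 2, 1, 3, 2])
data = np.ones(len(row), dtype=int)
adj = coo_matrix((data, (row, col)), shape=(4, 4))
print(adj.toarray())
# [[0 1 0 0]
#  [1 0 1 0]
#  [0 1 0 1]
#  [0 0 1 0]]
```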
- `csr_matrix(D)` with a dense matrix `D`
- `csr_matrix(S)` with another sparse matrix `S` (equivalent to `S.tocsr()`)
- `csr_matrix((data, (row_ind, col_ind)), [shape=(M, N)])`
- `csr_matrix((data, indices, indptr), [shape=(M, N)])`
The `(data, (row_ind, col_ind))` form takes the same parameters as the COO constructor, while the `(data, indices, indptr)` form uses the compressed representation directly and is worth checking for correctness. Using the earlier example, we have
```python
from scipy.sparse import csr_matrix

data    = [10, 20, 30, 40, 50, 60, 70, 80]
indices = [0, 1, 1, 3, 2, 3, 4, 5]   # column indices
indptr  = [0, 2, 4, 7, 8]            # row pointers
csr_matrix((data, indices, indptr), shape=(4, 6)).toarray()
```

the running result of which is:

```
array([[10, 20,  0,  0,  0,  0],
       [ 0, 30,  0, 40,  0,  0],
       [ 0,  0, 50, 60, 70,  0],
       [ 0,  0,  0,  0,  0, 80]])
```
The only difference between CSC and CSR is that CSC compresses the column indices rather than the row indices.
PyTorch supports sparse tensors in COO(rdinate) format, which can efficiently store and process tensors in which the majority of elements are zero.

```python
import torch

i = torch.LongTensor([[0, 1, 1],
                      [2, 0, 2]])   # first row: row indices; second row: column indices
v = torch.FloatTensor([3, 4, 5])
torch.sparse_coo_tensor(i, v, (2, 3)).to_dense()
# tensor([[0., 0., 3.],
#         [4., 0., 5.]])
```
The same matrix in SciPy:

```python
from scipy.sparse import coo_matrix

coo = coo_matrix(([3, 4, 5], ([0, 1, 1], [2, 0, 2])), shape=(2, 3))
```
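A SciPy COO matrix can be converted to a PyTorch sparse tensor with a common recipe (a sketch; it stacks the SciPy `row` and `col` arrays into the index tensor PyTorch expects):

```python
import numpy as np
import torch
from scipy.sparse import coo_matrix

coo = coo_matrix(([3, 4, 5], ([0, 1, 1], [2, 0, 2])), shape=(2, 3))
indices = torch.LongTensor(np.vstack((coo.row, coo.col)))
values = torch.FloatTensor(coo.data)
t = torch.sparse_coo_tensor(indices, values, torch.Size(coo.shape))
print(t.to_dense())
# tensor([[0., 0., 3.],
#         [4., 0., 5.]])
```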
Create a list of SciPy `coo_matrix` objects:

```python
import numpy as np
import scipy as sp
import scipy.sparse

coos = np.array([sp.sparse.coo_matrix(([3, 4, 5], ([0, 1, 1], [2, 0, 2])), shape=(2, 3))
                 for _ in range(4)])
```

The result is an object array holding four such matrices, each printed as:

```
<2x3 sparse matrix of type '<class 'numpy.int64'>'
        with 3 stored elements in COOrdinate format>
```
Build a tensor out of them. A sketch of such a helper, stacking each 2-D matrix along a new leading batch dimension:

```python
import numpy as np
import torch

def coos2tensor(coos, shape):
    # shape = (batch, rows, cols)
    idx, vals = [], []
    for b, coo in enumerate(coos):
        # prepend the batch index b to every (row, col) coordinate
        idx.append(np.vstack([np.full(coo.nnz, b), coo.row, coo.col]))
        vals.append(coo.data)
    i = torch.LongTensor(np.hstack(idx))
    v = torch.FloatTensor(np.hstack(vals))
    return torch.sparse_coo_tensor(i, v, torch.Size(shape))
```
Dense tensors nowadays support broadcasting during multiplication. For example, take a matrix $A$ of size $N \times D$ with batch size $M$, i.e. a tensor of size $M \times N \times D$. When it is multiplied by another matrix, usually a weight matrix $W$ of size $D \times D^\prime$, we expect an output of size $M \times N \times D^\prime$. To achieve this, we simply use
```python
import numpy as np
import torch

tensor_a = torch.from_numpy(np.arange(2*3*4).reshape([2, 3, 4]))  # M=2, N=3, D=4
tensor_w = torch.from_numpy(np.arange(4*5).reshape([4, 5]))       # weight matrix for illustration, D=4, D'=5
torch.matmul(tensor_a, tensor_w).shape   # torch.Size([2, 3, 5])
```
For sparse tensors, however, the multiplication has to be done iteratively: PyTorch supports no such broadcasting mechanism for sparse tensors.
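So a batched sparse-dense product has to loop over the batch. A sketch using `torch.sparse.mm` (the loop-and-stack batching is a workaround, not an official batched API):

```python
import torch
from scipy.sparse import coo_matrix

w = torch.ones(3, 5)   # dense weight matrix, D=3, D'=5
batch = []
for _ in range(4):     # four copies of the 2x3 sparse matrix from before
    coo = coo_matrix(([3.0, 4.0, 5.0], ([0, 1, 1], [2, 0, 2])), shape=(2, 3))
    i = torch.LongTensor([coo.row.tolist(), coo.col.tolist()])
    v = torch.FloatTensor(coo.data)
    batch.append(torch.sparse_coo_tensor(i, v, (2, 3)))

# multiply each sparse matrix by the dense weight, then stack the dense results
out = torch.stack([torch.sparse.mm(m, w) for m in batch])
print(out.shape)   # torch.Size([4, 2, 5])
```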