- Wikipedia article on sparse matrices
- Official SciPy reference guide on sparse matrices
- Stack Overflow question on converting a SciPy sparse matrix to a PyTorch sparse tensor
- Official PyTorch reference on sparse tensors
- GitHub PyTorch issue on batch matmul with sparse matrices
In numerical analysis and scientific computing, a sparse matrix or sparse array is a matrix in which most of the elements are zero. When storing and manipulating sparse matrices on a computer, it is beneficial and often necessary to use specialized algorithms and data structures that take advantage of the sparse structure of the matrix. Specialized computers have been made for sparse matrices, as they are common in the machine learning field. Operations using standard dense-matrix structures and algorithms are slow and inefficient when applied to large sparse matrices as processing and memory are wasted on the zeros. Sparse data is by nature more easily compressed and thus requires significantly less storage. Some very large sparse matrices are infeasible to manipulate using standard dense-matrix algorithms.
- Dictionary of keys (DOK): consists of a dictionary that maps (row, column) pairs to the values of the elements.
- Good for incrementally constructing a sparse matrix in random order,
- but poor for iterating over non-zero values in lexicographical order.
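As a minimal sketch of this format, SciPy's `dok_matrix` lets us assign elements in arbitrary order:

```python
from scipy.sparse import dok_matrix

m = dok_matrix((3, 3))
m[2, 0] = 7.0   # elements can be assigned in any order
m[0, 1] = 5.0
print(m.nnz)     # number of stored (non-zero) entries
print(m[0, 1])
```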
- List of lists (LIL): stores one list per row, with each entry containing the column index and the value.
- These entries are kept sorted by column index for faster lookup.
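SciPy's `lil_matrix` exposes these per-row lists directly, via its `rows` (sorted column indices) and `data` (values) attributes:

```python
from scipy.sparse import lil_matrix

m = lil_matrix((3, 4))
m[0, 2] = 1.0
m[0, 1] = 3.0     # inserted before column 2 so the list stays sorted
print(m.rows[0])  # column indices of row 0
print(m.data[0])  # corresponding values
```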
- Coordinate list (COO): stores a list of (row, column, value) tuples.
- Entries are sorted first by row index and then by column index, to improve random access times.
- Compressed sparse row (CSR): similar to COO, but compresses the row indices, hence the name.
- This format allows fast row access and matrix-vector multiplications.
For the matrix

```
10 20  0  0  0  0
 0 30  0 40  0  0
 0  0 50 60 70  0
 0  0  0  0  0 80
```

the CSR arrays are

```
V         = [ 10, 20, 30, 40, 50, 60, 70, 80 ]
COL_INDEX = [  0,  1,  1,  3,  2,  3,  4,  5 ]
ROW_INDEX = [  0,  2,  4,  7,  8 ]
```

ROW_INDEX records where each row's entries start and end in V:

- V[0:2] = [10, 20] is row 0, whose corresponding column indices are [0, 1];
- V[2:4] = [30, 40] is row 1, whose corresponding column indices are [1, 3];
- V[4:7] = [50, 60, 70] is row 2, whose corresponding column indices are [2, 3, 4];
- V[7:8] = [80] is row 3, whose corresponding column index is [5].

When the number of rows is smaller than the number of non-zero entries, this representation takes less storage than COO, since ROW_INDEX holds only (number of rows + 1) values instead of one row index per entry.
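This saving can be checked with SciPy on the example matrix: COO stores one row index per non-zero entry, while CSR stores only one pointer per row (plus one):

```python
import numpy as np
from scipy.sparse import coo_matrix

dense = np.array([[10, 20,  0,  0,  0,  0],
                  [ 0, 30,  0, 40,  0,  0],
                  [ 0,  0, 50, 60, 70,  0],
                  [ 0,  0,  0,  0,  0, 80]])
coo = coo_matrix(dense)
csr = coo.tocsr()
print(len(coo.row))     # 8: one row index per non-zero entry
print(len(csr.indptr))  # 5: rows + 1 pointers
```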
- Can be instantiated from a dense matrix or another sparse matrix.
- Can also be instantiated as an empty matrix, which supports incremental construction.
```python
import numpy as np
from scipy.sparse import coo_matrix
```

- `coo_matrix(D)` with a dense matrix `D`
- `coo_matrix(S)` with another sparse matrix `S` (equivalent to `S.tocoo()`)
- `coo_matrix((data, (i, j)), [shape=(M, N)])` from three arrays of values, row indices, and column indices
Suppose we want to create the adjacency matrix of the graph (1-2-3-4), which consists of only 4 nodes. The COO format makes this easy.
```python
from scipy.sparse import coo_matrix
```
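A sketch of that construction, assuming 0-indexed nodes and treating the path 1-2-3-4 as undirected with unit edge weights:

```python
import numpy as np
from scipy.sparse import coo_matrix

# undirected edges of the path 1-2-3-4, written with 0-indexed nodes
row = np.array([0, 1, 1, 2, 2, 3])
col = np.array([1, 0, 2, 1, 3, 2])
data = np.ones(len(row), dtype=int)
adj = coo_matrix((data, (row, col)), shape=(4, 4))
print(adj.toarray())
# [[0 1 0 0]
#  [1 0 1 0]
#  [0 1 0 1]
#  [0 0 1 0]]
```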
- `csr_matrix(D)` with a dense matrix `D`
- `csr_matrix(S)` with another sparse matrix `S` (equivalent to `S.tocsr()`)
- `csr_matrix((data, (row_ind, col_ind)), [shape=(M, N)])`
- `csr_matrix((data, indices, indptr), [shape=(M, N)])`
The `(data, (row_ind, col_ind))` form takes the same parameters as the COO constructor, while the `(data, indices, indptr)` form uses the compressed representation directly and is worth checking for correctness. Using the earlier example, we have
```python
from scipy.sparse import csr_matrix

data    = [10, 20, 30, 40, 50, 60, 70, 80]
indices = [0, 1, 1, 3, 2, 3, 4, 5]   # column indices
indptr  = [0, 2, 4, 7, 8]            # row pointers
csr_matrix((data, indices, indptr), shape=(4, 6)).toarray()
```

the running result of which is:

```
array([[10, 20,  0,  0,  0,  0],
       [ 0, 30,  0, 40,  0,  0],
       [ 0,  0, 50, 60, 70,  0],
       [ 0,  0,  0,  0,  0, 80]])
```
The only difference between CSC and CSR is that CSC compresses the column indices rather than the row indices.
PyTorch supports sparse tensors in COO(rdinate) format, which can efficiently store and process tensors in which the majority of elements are zero.

```python
import torch

i = torch.LongTensor([[0, 1, 1],
                      [2, 0, 2]])   # first row: row indices; second row: column indices
v = torch.FloatTensor([3, 4, 5])
torch.sparse_coo_tensor(i, v, (2, 3)).to_dense()
# tensor([[0., 0., 3.],
#         [4., 0., 5.]])
```
The same matrix in SciPy:

```python
from scipy.sparse import coo_matrix

coo = coo_matrix(([3, 4, 5], ([0, 1, 1], [2, 0, 2])), shape=(2, 3))
```
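A SciPy COO matrix can be converted to a PyTorch sparse tensor with a common recipe (a sketch; it stacks the SciPy `row` and `col` arrays into the index tensor PyTorch expects):

```python
import numpy as np
import torch
from scipy.sparse import coo_matrix

coo = coo_matrix(([3, 4, 5], ([0, 1, 1], [2, 0, 2])), shape=(2, 3))
indices = torch.LongTensor(np.vstack((coo.row, coo.col)))
values = torch.FloatTensor(coo.data)
t = torch.sparse_coo_tensor(indices, values, torch.Size(coo.shape))
print(t.to_dense())
# tensor([[0., 0., 3.],
#         [4., 0., 5.]])
```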
Create a list of SciPy `coo_matrix` objects:

```python
import numpy as np
import scipy as sp
import scipy.sparse

coos = np.array([sp.sparse.coo_matrix(([3, 4, 5], ([0, 1, 1], [2, 0, 2])), shape=(2, 3))
                 for _ in range(4)])
```

The result is an object array holding four such matrices, each printed as:

```
<2x3 sparse matrix of type '<class 'numpy.int64'>'
        with 3 stored elements in COOrdinate format>
```
Build a tensor out of them. A sketch of such a helper, stacking each 2-D matrix along a new leading batch dimension:

```python
import numpy as np
import torch

def coos2tensor(coos, shape):
    # shape = (batch, rows, cols)
    idx, vals = [], []
    for b, coo in enumerate(coos):
        # prepend the batch index b to every (row, col) coordinate
        idx.append(np.vstack([np.full(coo.nnz, b), coo.row, coo.col]))
        vals.append(coo.data)
    i = torch.LongTensor(np.hstack(idx))
    v = torch.FloatTensor(np.hstack(vals))
    return torch.sparse_coo_tensor(i, v, torch.Size(shape))
```
Dense tensors nowadays support broadcasting during multiplication. For example, take a matrix $A$ of size $N \times D$ with batch size $M$, i.e. a tensor of size $M \times N \times D$. When it is multiplied by another matrix, usually a weight matrix $W$ of size $D \times D^\prime$, we expect an output of size $M \times N \times D^\prime$. To achieve this, we simply use
```python
import numpy as np
import torch

tensor_a = torch.from_numpy(np.arange(2*3*4).reshape([2, 3, 4]))  # M=2, N=3, D=4
tensor_w = torch.from_numpy(np.arange(4*5).reshape([4, 5]))       # weight matrix for illustration, D=4, D'=5
torch.matmul(tensor_a, tensor_w).shape   # torch.Size([2, 3, 5])
```
For sparse tensors, however, the multiplication has to be done iteratively: PyTorch supports no such broadcasting mechanism for sparse tensors.
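So a batched sparse-dense product has to loop over the batch. A sketch using `torch.sparse.mm` (the loop-and-stack batching is a workaround, not an official batched API):

```python
import torch
from scipy.sparse import coo_matrix

w = torch.ones(3, 5)   # dense weight matrix, D=3, D'=5
batch = []
for _ in range(4):     # four copies of the 2x3 sparse matrix from before
    coo = coo_matrix(([3.0, 4.0, 5.0], ([0, 1, 1], [2, 0, 2])), shape=(2, 3))
    i = torch.LongTensor([coo.row.tolist(), coo.col.tolist()])
    v = torch.FloatTensor(coo.data)
    batch.append(torch.sparse_coo_tensor(i, v, (2, 3)))

# multiply each sparse matrix by the dense weight, then stack the dense results
out = torch.stack([torch.sparse.mm(m, w) for m in batch])
print(out.shape)   # torch.Size([4, 2, 5])
```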