Matrix dialect #180

NeuralCoder3 · 2023-02-01T14:14:41Z

A simple matrix dialect.
It exposes a matrix type Mat: Π [n: .Nat, S: «n; .Nat», T: *] -> * such that
Mat (n, S, T) is an n-dimensional tensor of size S_0 * ... * S_{n-1} with element type T.

The matrix operations are:

shape -- returns the size along the i-th dimension
constMat -- returns a new matrix filled with a constant value
read -- reads an entry of a matrix
insert -- replaces an entry of a matrix
init -- creates a matrix but does not initialize its entries
prod -- computes the matrix-matrix product of two two-dimensional matrices (over a floating point type)
transpose -- transposes a two-dimensional matrix
sum -- sums up all entries of a matrix
mapReduce -- performs an arbitrary mapping and reduction operation (see below)

All operations are inside the memory monad, allowing for a matrix implementation involving side effects.
In fact, the current matrices are nested pointers to arrays that are manipulated in-place.

An alternative might be an immutable array implementation, such as a skew binary random access list or one of the array implementations from Haskell (e.g., diff arrays).

MapReduce

mapReduce is inspired by Einstein summation notation and by implementations such as:

Tensorflow / XLA: einsum
Pytorch: einsum
NumPy: einsum
Halide
Haskell: Tensor DSL
Ricci Calculus
Einstein Notation
Pytorch DSL

It takes m matrices, a zero element, and a combination function.
The combination function takes the accumulator (initially zero) and one element from each input matrix, and returns the new accumulator.
The result is a matrix.

Pseudocode:

out_matrix = init
for output_indices:
  acc = zero
  for input_indices:
    elements[0..m-1] = read(matrix[0..m-1], indices)
    acc = f(acc, elements)
  insert(out_matrix, output_indices, acc)
return out_matrix
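
The pseudocode can be made concrete with a small Python sketch. This is not the dialect's implementation. It assumes an einsum-like interface in which every dimension carries a label, and it models a matrix as a dict from index tuples to values. The names map_reduce, sizes, out_labels, and inputs are made up for this illustration.

```python
from itertools import product

def map_reduce(sizes, out_labels, zero, f, inputs):
    """Sketch of mapReduce (hypothetical interface).

    sizes:      dict mapping each index label to its extent
    out_labels: labels that index the output matrix
    inputs:     list of (matrix, labels) pairs; a matrix is a dict
                from index tuples to values
    """
    # Labels that do not appear in the output are reduced over.
    in_labels = [l for l in sizes if l not in out_labels]
    out = {}                                   # plays the role of init
    for out_idx in product(*(range(sizes[l]) for l in out_labels)):
        env = dict(zip(out_labels, out_idx))
        acc = zero
        for in_idx in product(*(range(sizes[l]) for l in in_labels)):
            env.update(zip(in_labels, in_idx))
            # read one element from each input matrix
            elements = [mat[tuple(env[l] for l in labels)]
                        for mat, labels in inputs]
            acc = f(acc, elements)
        out[out_idx] = acc                     # plays the role of insert
    return out

A = {(0, 0): 1, (0, 1): 2, (1, 0): 3, (1, 1): 4}
B = {(0, 0): 5, (0, 1): 6, (1, 0): 7, (1, 1): 8}

# prod: C[i, j] = sum over k of A[i, k] * B[k, j]
C = map_reduce({"i": 2, "j": 2, "k": 2}, ("i", "j"), 0,
               lambda acc, e: acc + e[0] * e[1],
               [(A, ("i", "k")), (B, ("k", "j"))])

# transpose: At[j, i] = A[i, j]; nothing is reduced, so f ignores acc
At = map_reduce({"i": 2, "j": 2}, ("j", "i"), 0,
                lambda acc, e: e[0], [(A, ("i", "j"))])

# sum: no output indices, so the result has the single entry total[()]
total = map_reduce({"i": 2, "j": 2}, (), 0,
                   lambda acc, e: acc + e[0], [(A, ("i", "j"))])
```

In this sketch, prod, transpose, and sum differ only in their labels, zero, and combination function.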

Optimization Pipeline

The matrix operations and type are translated using a staging approach that allows intercepting the process at different levels.

High-Level Rewrites

First, high-level operations like transpose, sum, and prod are rewritten into the mapReduce form.
To do so, pre-defined functions of the form internal_mapRed_matrix_[name] are looked up. Each function must have the same type as the corresponding axiom.

High-Level Externalization

Alternatively, certain operations like prod could be dispatched to external libraries such as BLAS.
However, this is not implemented in the current version.

Medium-Level Lowering

The next step is to lower mapReduce to affine for loops.
The conceptual idea corresponds to the pseudocode above.

Low-Level Lowering

The last step is to eliminate all remnants of the matrix dialect.
We remove the remaining internal_mapRed_ functions (due to a missing association dialect).

Afterward, we lower the low-level matrix operations and types.

The matrix type is replaced by a pointer to n nested arrays.
init is replaced with alloc
read becomes lea+load
insert becomes lea+store
constMat becomes alloc+pack+store
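
The replacements above can be modelled in Python as a rough sketch. It is not the pass itself. Nested lists stand in for the pointer to n nested arrays (n >= 1 is assumed), and alloc, pack, lea, load, and store are simplified stand-ins for the corresponding memory operations, not their real signatures.

```python
def pack(shape, value):
    """Build n nested arrays in which every entry is `value`."""
    if len(shape) == 1:
        return [value] * shape[0]
    return [pack(shape[1:], value) for _ in range(shape[0])]

def alloc(shape):
    """Model of alloc: entries are uninitialized (None in this sketch)."""
    return pack(shape, None)

def lea(mat, idx):
    """Model of lea: follow the outer arrays and return the address
    of the entry as a pair (innermost array, offset)."""
    for i in idx[:-1]:
        mat = mat[i]
    return mat, idx[-1]

def load(ptr):
    arr, off = ptr
    return arr[off]

def store(ptr, value):
    arr, off = ptr
    arr[off] = value

# The low-level matrix operations expressed with the primitives above.
def init(shape):
    return alloc(shape)

def read(mat, idx):
    return load(lea(mat, idx))

def insert(mat, idx, value):
    store(lea(mat, idx), value)   # in-place update
    return mat

def const_mat(shape, value):
    mat = alloc(shape)
    mat[:] = pack(shape, value)   # store the packed array into the allocation
    return mat

m = init((2, 3))
insert(m, (1, 2), 7)
z = const_mat((2, 2), 0)
```

Here insert mutates the matrix in place, in line with the side-effecting implementation described earlier.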

Low-Level Functional Lowering

We could lower the matrix to a functional array representation like Haskell arrays or random access lists at this point.

Additional Operations

One could implement further operations either deeply or shallowly:

parallel versions of other operations
a specialized map
a fold (functional speak for reduce)
zipWith (a map on two matrices)

Known Issues

mapReduce does not yet handle all edge cases correctly, for example zero inputs or zero outputs.

NeuralCoder3 · 2023-03-17T10:49:56Z

The current issue is in counting_for in lower_matrix_mediumlevel.cpp.
Specifically, the computation of the accumulator type fails.
This is probably caused by uninitialized components in the acc generated in line 241.

Fixed in fd73b1e

leissa

Really cool :)

dialects/affine/affine.h

dialects/core/be/ll/ll.cpp

dialects/matrix/matrix.h

dialects/matrix/matrix.thorin

dialects/matrix/passes/lower_matrix_lowlevel.h

dialects/matrix/passes/lower_matrix_mediumlevel.cpp

dialects/matrix/passes/lower_matrix_mediumlevel.h

leissa · 2023-03-27T22:12:32Z

Side note: I need to adjust my email settings. Sometimes I only see days later that you tagged me as a reviewer. Sorry for that.

NeuralCoder3 and others added 30 commits March 8, 2022 10:27

simple peephole optimization for 0+x=x

bb3e879

added partial eval optim to optim pass 3

65a3074

tests with flat cn

c00f797

temp fix for tuple coupling

4f05015

temp fix for sigma problems (conditional, return tuple)

2ec0969

fixed ptr arg init

45a06d8

merge

fa3dbb1

replaced lam->app

e64e8a5

Merge https://github.com/AnyDSL/thorin2 into autodiff

08c190f

cleanup, fat pointer signature

c54dd02

alloc fat-ptr implementation

d0da3ec

fat_ptr alloc fix

7ec5f04

correct left & tangent type for ptr to arrays

f2fbea0

fixed unreachable

18d453d

forwarded correct A (instead left transformed one)

cf10485

array fat ptr input pb

4826c15

wip lea fat ptr

724f68d

lea fat_ptr

4aa790c

fix lea, alloc, bitcast

a06e48c

zero for arrays

185bd71

merge master

a78f65b

fixed changes from merge

63bc50c

correct zero arrays

c379aa4

vec_add fat_ptr implementation

eb3647c

vec add in call position

fde5754

loop over fat ptr

183c344

vec add for pointer

5b1530b

added correct ptr sum & fixed mem

9ce2e5e

fix non-flat one-hot

68b95a7

temporary fix to preserve fat pointer in extract pb

fe0b12e

Marcel Ullrich added 7 commits March 15, 2023 16:13

temporarily add fix from AnyDSL#187

e163d31

fixed test case

f50d141

revisited unresolved tests

7c2a31a

attempt to fix register_pass not found

465fdd7

removed old tag relict

da8bc3d

explicitely include pipelinebuilder

680fb6c

replaced casted initializer lists

e81096f

Marcel Ullrich added 5 commits March 20, 2023 10:55

replaced span with array

fd73b1e

disable timing for non-linux platforms

ecefff0

fixed doxygen

4e63757

c++ code cleanup

13c3360

thorin code cleanup

a66d590

NeuralCoder3 marked this pull request as ready for review March 20, 2023 12:42

removed comments in normalizers

fb5e7d8

NeuralCoder3 requested review from leissa and fodinabor March 23, 2023 08:23

leissa reviewed Mar 27, 2023

leissa and others added 7 commits March 28, 2023 21:51

Merge branch 'anydsl_master' into matrix_dialect

9cce49a

compile fixes/fix warnings

b88cf04

Merge branch 'anydsl_master' into matrix_dialect

e34828d

refactor

78c4ec2

Merge branch 'anydsl_master' into matrix_dialect

1532a18

sort indices to avoid non-deterministic map access

e147add

fixed type error

22a608c

leissa mentioned this pull request Mar 29, 2023

Application Error with matching domain #174

Open

updated lit commands

e70c5ce

leissa merged commit 905cf5a into AnyDSL:master Mar 29, 2023

leissa deleted the matrix_dialect branch March 29, 2023 19:30
