ONNX-MLIR Linalg Dialect Integration: Compilation Flow and Optimization Benefits
1. Problems with the Current ONNX-MLIR Compilation Flow
\[\text{ONNX} \xrightarrow{\text{Lowering}} \text{Krnl} \xrightarrow{\text{Lowering}} \text{Affine} \xrightarrow{\text{Lowering}} \text{LLVM IR}\]
In this flow, sophisticated optimizations specialized for matrix operations, such as tiling and fusion, require complex hand-written passes at the Krnl level.
2. Linalg Dialect
Linalg provides named operations with a well-defined structure, such as linalg.matmul and linalg.conv_2d_nhwc_hwcf. The dialect is designed so that the following transformations are easy to apply:
- Parametric Tiling: Divides large operations into smaller blocks (tiles) considering the memory hierarchy (cache).
- Tiled Fusion: Fuses producer-consumer operations within tile boundaries so that intermediate data stays in cache, reducing memory traffic.
- Promotion to Temporary Buffer: Copies data from slow memory into fast temporary buffers (scratchpad memory) to speed up data access.
- Vectorization: Converts Linalg operations to the Vector dialect so that SIMD instructions (AVX, NEON) can be used.
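The first three transformations above can be illustrated with a small loop-level sketch. The code below is plain Python, not ONNX-MLIR or MLIR output, and the function names and tile-size parameters (`tile_m`, `tile_n`, `tile_k`) are hypothetical names chosen for illustration. It computes a matmul followed by a ReLU: once as two separate whole-matrix steps, and once tiled, with a small local accumulator standing in for a promoted buffer and the ReLU fused into the tile loop.

```python
def matmul_then_relu_naive(a, b):
    """Reference version: full matmul, then a separate ReLU pass."""
    m, k, n = len(a), len(b), len(b[0])
    c = [[0.0] * n for _ in range(m)]
    for i in range(m):
        for j in range(n):
            for p in range(k):
                c[i][j] += a[i][p] * b[p][j]
    # The consumer (ReLU) re-reads the whole intermediate matrix.
    return [[max(v, 0.0) for v in row] for row in c]


def matmul_relu_tiled(a, b, tile_m=2, tile_n=2, tile_k=2):
    """Tiled version with a fused ReLU (illustrative sketch only)."""
    m, k, n = len(a), len(b), len(b[0])
    out = [[0.0] * n for _ in range(m)]
    # Parametric tiling: iterate over (tile_m x tile_n) output blocks.
    for i0 in range(0, m, tile_m):
        i1 = min(i0 + tile_m, m)  # clamp the last, partial tile
        for j0 in range(0, n, tile_n):
            j1 = min(j0 + tile_n, n)
            # "Promotion": a small local accumulator for this tile,
            # standing in for a fast temporary buffer.
            acc = [[0.0] * (j1 - j0) for _ in range(i1 - i0)]
            # The reduction dimension is tiled as well.
            for p0 in range(0, k, tile_k):
                p1 = min(p0 + tile_k, k)
                for i in range(i0, i1):
                    for j in range(j0, j1):
                        s = acc[i - i0][j - j0]
                        for p in range(p0, p1):
                            s += a[i][p] * b[p][j]
                        acc[i - i0][j - j0] = s
            # Tiled fusion: apply the consumer to the tile right away,
            # so the full intermediate matrix is never materialized.
            for i in range(i0, i1):
                for j in range(j0, j1):
                    out[i][j] = max(acc[i - i0][j - j0], 0.0)
    return out
```

Both functions should produce the same result for any positive tile sizes, including sizes that do not evenly divide the matrix dimensions. In Linalg the corresponding loop structure is derived by transformations on the operation rather than written by hand, which is the motivation for the change described next.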
By replacing the Krnl dialect with the Linalg dialect, we can reuse these transformations instead of writing custom passes for them:
\[\text{ONNX} \xrightarrow{\text{Lowering}} \text{Linalg} \xrightarrow{\text{Tiling/Bufferization/Vectorization}} \text{...}\]
3. Linalg Dialect-Based Compilation Pipeline
The final target pipeline is as follows:
graph TD
ONNX["ONNX Dialect<br/>(High-level Operations)"]
LinalgTensor["Linalg Dialect<br/>(Tensor-level)"]
subgraph "Optimization Phase"
Tiling["Tiling Passes"]
Fusion["Fusion Passes"]
Vectorization["Vectorization Passes"]
end
Bufferization["Bufferization Pass<br/>(LinalgBufferize)"]
LinalgMemRef["Linalg Dialect<br/>(MemRef-level)"]
subgraph "Lowering Phase"
Affine["Affine Dialect<br/>(Explicit Loops)"]
Vector["Vector Dialect<br/>(SIMD Operations)"]
end
LLVM["LLVM Dialect<br/>(Target IR)"]
LLVMIR["LLVM IR<br/>(Final Code)"]
ONNX -->|"ONNXToLinalg Conversion"| LinalgTensor
LinalgTensor --> Tiling
LinalgTensor --> Fusion
LinalgTensor --> Vectorization
Tiling --> Bufferization
Fusion --> Bufferization
Vectorization --> Bufferization
Bufferization -->|"Tensor → MemRef"| LinalgMemRef
LinalgMemRef --> Affine
LinalgMemRef --> Vector
Affine --> LLVM
Vector --> LLVM
LLVM --> LLVMIR
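As a quick sanity check of the diagram, the same edges can be written down as data and ordered topologically. This is only a sketch of the graph above, not an ONNX-MLIR API: `PIPELINE_EDGES` and `stage_order` are hypothetical names, and the node labels are copied from the diagram.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Edges copied from the diagram: stage -> stages that follow it.
PIPELINE_EDGES = {
    "ONNX": ["LinalgTensor"],
    "LinalgTensor": ["Tiling", "Fusion", "Vectorization"],
    "Tiling": ["Bufferization"],
    "Fusion": ["Bufferization"],
    "Vectorization": ["Bufferization"],
    "Bufferization": ["LinalgMemRef"],
    "LinalgMemRef": ["Affine", "Vector"],
    "Affine": ["LLVM"],
    "Vector": ["LLVM"],
    "LLVM": ["LLVMIR"],
}


def stage_order(edges):
    """Return one valid linear order of the stages.

    TopologicalSorter expects a mapping of node -> predecessors,
    so the successor lists are inverted first.
    """
    preds = {}
    for src, dsts in edges.items():
        preds.setdefault(src, set())
        for dst in dsts:
            preds.setdefault(dst, set()).add(src)
    return list(TopologicalSorter(preds).static_order())


if __name__ == "__main__":
    print(" -> ".join(stage_order(PIPELINE_EDGES)))
```

Any order the function returns places the tensor-level optimizations before bufferization and the Affine and Vector lowerings after it, matching the two phases in the diagram. A cycle in the edges would raise `graphlib.CycleError`.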
Series Posts
- Previous: What is an onnx-mlir?
- Next: ONNXToLinalg Pipeline Construction: MatMul Operation Conversion Implementation
