MLIR-Tutorial

Creating an out-of-tree MLIR dialect means describing your operations in TableGen (.td) files and then wiring the generated code into C++. This is not a step-by-step tutorial for beginners; it is a technical walkthrough of the workflow that explains how each part works and why it matters.


The process begins by defining the dialect itself in a file like include/TutoDialect/TutoDialect.td. This file is concise: it gives MLIR the dialect’s name (tuto), which becomes the prefix of every operation in textual IR (for example tuto.add), and the C++ namespace in which TableGen places the generated classes. For example:

include "mlir/IR/OpBase.td"

def Tuto_Dialect : Dialect {
    let name = "tuto";
    let summary = "A Tuto out-of-tree MLIR dialect.";
    let description = [{
        This dialect is an example of an out-of-tree MLIR dialect designed to
        illustrate the basic setup required to develop MLIR-based tools without
        working inside of the LLVM source tree.
    }];
    let cppNamespace = "::mlir::tuto";
}

On the same principle, you then describe the dialect’s operations in a dedicated TableGen file, typically include/TutoDialect/TutoDialectOps.td. This file gathers all operations: each operation is declared with its input/output types and names, and its MLIR assembly format. For instance, an addition of floats:

include "TutoDialect.td"
include "mlir/IR/AttrTypeBase.td"
include "mlir/IR/DialectBase.td"
include "mlir/Interfaces/InferTypeOpInterface.td"
include "mlir/Interfaces/SideEffectInterfaces.td"


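// Base class for every op in the dialect. Define it exactly once; it is shown
// here only because the TutoDialect.td excerpt above does not define it.
class Tuto_Op<string mnemonic, list<Trait> traits = []> :
    Op<Tuto_Dialect, mnemonic, traits>;

def AddOp : Tuto_Op<"add", [Pure]> {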
  let summary = "Addition";
  let description = [{
    Addition operation between two values.
  }];

  let arguments = (ins F64:$lhs, F64:$rhs);
  let results = (outs F64:$res);

  let assemblyFormat = [{
    $lhs `,` $rhs `:` type($lhs) attr-dict
  }];
}

This file is the blueprint for generating the full C++ class for the operation via TableGen, ensuring parsing, syntax, and printing are consistent and correct.
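
To make the assemblyFormat line concrete, the sketch below hand-parses the text that follows the op name, such as %a, %b : f64, using only the C++ standard library. It is a toy model and not the parser that TableGen generates; ParsedAdd and parseToyAdd are hypothetical names introduced for illustration.

```cpp
#include <sstream>
#include <string>

// Fields recovered from the text after the op name, e.g. "%a, %b : f64".
struct ParsedAdd {
  std::string lhs, rhs, type;
  bool ok = false;
};

// Hypothetical helper that follows the format `$lhs , $rhs : type($lhs)`
// token by token. It ignores attr-dict and does no type checking.
inline ParsedAdd parseToyAdd(const std::string &text) {
  // Surround commas with spaces so that "%a," splits into two tokens.
  std::string spaced;
  for (char c : text) {
    if (c == ',')
      spaced += " , ";
    else
      spaced += c;
  }

  ParsedAdd result;
  std::istringstream in(spaced);
  std::string comma, colon;
  // Expected token order: $lhs ',' $rhs ':' type
  if (!(in >> result.lhs >> comma >> result.rhs >> colon >> result.type))
    return result;

  // The literals must match and both operands must be SSA names.
  result.ok = comma == "," && colon == ":" && result.lhs[0] == '%' &&
              result.rhs[0] == '%';
  return result;
}
```

Calling parseToyAdd on "%a, %b : f64" fills in the two operand names and the type. The generated parser does the same job against MLIR's real lexer, and TableGen derives the matching printer from the same format string.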

Articulating TableGen and C++: The Skeleton of an Out-of-Tree MLIR Dialect

Designing an MLIR dialect outside the LLVM source tree is fundamentally about separating declaration from implementation. This architectural split, with declarative TableGen on one side and connecting C++ on the other, is what allows MLIR to scale and remain maintainable, even as dialects grow.

1. TableGen: Declarative Core of the Dialect

Everything starts with TableGen .td files.

Key Point:
TableGen .td files are the sole source of truth for the syntax, signatures, and metadata of your dialect and ops.
All boilerplate and repetitive code (parsing, printing, verification stubs, etc.) is generated from here.


2. The C++ Headers: Connecting Generated Code to the Project

The glue between TableGen and the MLIR C++ API consists of two headers:

TutoOps.h

#ifndef TUTO_TUTOOPS_H
#define TUTO_TUTOOPS_H

#include "mlir/IR/BuiltinTypes.h"
#include "mlir/IR/Dialect.h"
#include "mlir/IR/OpDefinition.h"
#include "mlir/Interfaces/InferTypeOpInterface.h"
#include "mlir/Interfaces/SideEffectInterfaces.h"

#define GET_OP_CLASSES
#include "TutoDialect/TutoOps.h.inc"

#endif // TUTO_TUTOOPS_H

TutoDialect.h

#ifndef TUTO_TUTODIALECT_H
#define TUTO_TUTODIALECT_H

#include "mlir/IR/Dialect.h"
#include "TutoDialect/TutoOpsDialect.h.inc"

#endif // TUTO_TUTODIALECT_H

3. The .cpp Files: Registration and Implementation

TutoOps.cpp

#include "TutoDialect/TutoOps.h"
#include "TutoDialect/TutoDialect.h"
#include "mlir/IR/OpImplementation.h"

#define GET_OP_CLASSES
#include "TutoDialect/TutoOps.cpp.inc"

TutoDialect.cpp

#include "TutoDialect/TutoDialect.h"
#include "TutoDialect/TutoOps.h"
#include "mlir/IR/DialectImplementation.h"

using namespace mlir;
using namespace mlir::tuto;

void TutoDialect::initialize() {
  addOperations<
#define GET_OP_LIST
#include "TutoDialect/TutoOps.cpp.inc"
  >();
}
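
The include placed between the angle brackets of addOperations can look odd at first. The sketch below is a toy model of the mechanism using only the standard library: a macro stands in for the GET_OP_LIST section of the generated file, which is assumed here to expand to a bare comma-separated list of op class names. ToyDialect, TOY_GET_OP_LIST, and MulOp are hypothetical and are not part of MLIR or of the tuto dialect.

```cpp
#include <string>
#include <vector>

// Toy stand-ins for generated op classes. Each exposes its textual name,
// loosely mirroring getOperationName() on real MLIR op classes.
struct AddOp {
  static const char *getOperationName() { return "tuto.add"; }
};
struct MulOp { // hypothetical second op, for illustration only
  static const char *getOperationName() { return "tuto.mul"; }
};

// Stand-in for the GET_OP_LIST section of the generated .cpp.inc file:
// nothing but a comma-separated list of class names.
#define TOY_GET_OP_LIST AddOp, MulOp

class ToyDialect {
public:
  // Records the name of every op type in the parameter pack, in order.
  template <typename... Ops>
  void addOperations() {
    // Pack expansion inside a braced list runs left to right.
    int expand[] = {0, (registered.push_back(Ops::getOperationName()), 0)...};
    (void)expand;
  }

  void initialize() {
    // In the real file, the #include pastes the list at this position.
    addOperations<TOY_GET_OP_LIST>();
  }

  std::vector<std::string> registered;
};
```

After initialize() runs, registered holds "tuto.add" followed by "tuto.mul". The point of the pattern is that the list of ops lives in one generated place, so adding an operation to the .td file does not require touching the dialect's C++ source.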

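4. The Build Process: Automation via CMake and TableGen

During the build, CMake runs mlir-tblgen on the .td files and emits the .h.inc and .cpp.inc files that the headers and sources above include.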


5. Using the Dialect

Once this pipeline is in place, you can write your custom operations in .mlir files.
Your driver binary (tuto-opt) is then able to parse those files, verify the operations, and print the resulting IR.


Key Takeaways

Once these definitions are written, TableGen (invoked by CMake during the build) generates the C++ boilerplate (the .h.inc and .cpp.inc files). You then only need to implement the minimal glue in C++: the main dialect file, for example lib/TutoDialect/TutoDialect.cpp, registers all operations of the dialect with MLIR. It does so in a short initialize() method that adds your operations to the dialect. There is nothing magic here, but this registration is what makes your operations usable in an mlir-opt-style tool such as your own binary (tuto-opt).

The driver's main file looks like this:

#include "mlir/IR/Dialect.h"
#include "mlir/IR/MLIRContext.h"
#include "mlir/InitAllDialects.h"
#include "mlir/InitAllPasses.h"
#include "mlir/Pass/Pass.h"
#include "mlir/Pass/PassManager.h"
#include "mlir/Support/FileUtilities.h"
#include "mlir/Tools/mlir-opt/MlirOptMain.h"
#include "llvm/Support/CommandLine.h"
#include "llvm/Support/InitLLVM.h"
#include "llvm/Support/SourceMgr.h"
#include "llvm/Support/ToolOutputFile.h"
#include "mlir/Dialect/Arith/IR/Arith.h"
#include "mlir/Dialect/Linalg/IR/Linalg.h"
#include "mlir/Dialect/Math/IR/Math.h"
#include "mlir/Dialect/MemRef/IR/MemRef.h"
#include "mlir/Dialect/SCF/IR/SCF.h"
#include "mlir/Dialect/Tensor/IR/Tensor.h"
#include "mlir/Dialect/Affine/IR/AffineOps.h"
#include "mlir/Dialect/LLVMIR/LLVMDialect.h"
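#include "TutoDialect/TutoDialect.h"
// Generated dialect definitions; they must be included in exactly one
// translation unit of the final binary.
#include "TutoDialect/TutoOpsDialect.cpp.inc"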
#include "mlir/Dialect/Func/IR/FuncOps.h"

int main(int argc, char **argv) {

  mlir::DialectRegistry registry;
  registry.insert<mlir::tuto::TutoDialect>();
  registry.insert<mlir::arith::ArithDialect>();
  registry.insert<mlir::math::MathDialect>();
  registry.insert<mlir::tensor::TensorDialect>();
  registry.insert<mlir::affine::AffineDialect>();
  registry.insert<mlir::linalg::LinalgDialect>();
  registry.insert<mlir::memref::MemRefDialect>();
  registry.insert<mlir::LLVM::LLVMDialect>();
  registry.insert<mlir::func::FuncDialect>();
  return mlir::asMainReturnCode(
      mlir::MlirOptMain(argc, argv, "Tuto optimizer driver\n", registry));
}

In an out-of-tree MLIR project, the main driver source (as shown) serves as the interface between your dialect and the MLIR ecosystem. Its purpose is not to hard-code logic but to register the set of dialects you want your tool to support, including your own, and to delegate all actual IR handling (verification, parsing, pass execution, and pretty-printing) to MLIR’s infrastructure.

Including the dialect headers only makes the dialect classes visible to the C++ compiler; it is the registry that tells MLIR which operations and types to recognize and parse. The dialect registry object is the central component: by inserting your dialect (mlir::tuto::TutoDialect) and any others (arith, math, tensor, affine, linalg, memref, LLVM, func), you make their ops available as first-class citizens in your IR. This registry becomes the catalogue that MLIR uses at runtime for all dialect resolution and IR manipulation.
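
The role of the registry as a catalogue can be pictured with a small standard-library model. This is only a sketch under simplifying assumptions and is unrelated to how mlir::DialectRegistry is actually implemented; ToyRegistry and isKnownOp are hypothetical names. It shows the one idea that matters here: an operation name such as tuto.add is resolved by taking the part before the first dot as the dialect namespace, and the lookup fails if that dialect was never inserted.

```cpp
#include <map>
#include <set>
#include <string>

// Toy registry: maps a dialect namespace to the op mnemonics it owns.
class ToyRegistry {
public:
  void insert(const std::string &dialect, const std::set<std::string> &ops) {
    dialects[dialect] = ops;
  }

  // Resolves a qualified name such as "tuto.add": split at the first '.',
  // look up the dialect, then look up the mnemonic inside that dialect.
  bool isKnownOp(const std::string &qualifiedName) const {
    std::string::size_type dot = qualifiedName.find('.');
    if (dot == std::string::npos)
      return false; // no dialect prefix at all
    auto it = dialects.find(qualifiedName.substr(0, dot));
    if (it == dialects.end())
      return false; // dialect was never registered
    return it->second.count(qualifiedName.substr(dot + 1)) > 0;
  }

private:
  std::map<std::string, std::set<std::string>> dialects;
};
```

With only tuto and func inserted, tuto.add and func.return resolve, while arith.addf does not. This is the toy analogue of a driver rejecting operations from a dialect that was left out of its registry.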

The key function is MlirOptMain, which is a generic driver for IR files and passes, provided directly by MLIR. It expects to be handed a dialect registry and takes care of everything else: loading IR, handling passes, running analyses, producing diagnostics, and emitting transformed IR. It abstracts away boilerplate so that your binary focuses solely on declaring support for dialects, not reimplementing existing tooling.

There is no stepwise logic or custom orchestration here; the code is deliberately minimal, reflecting the compositional, declarative design MLIR encourages. Your dialect coexists with the standard dialects simply by being registered. Note that this minimal driver registers dialects only: to run standard passes from the command line, the passes must be registered as well, for example by calling mlir::registerAllPasses() (declared in mlir/InitAllPasses.h) before MlirOptMain. The out-of-tree nature is reflected in the lack of special-casing: your dialect is just another extension point, managed at runtime via the registry, never hardwired into MLIR itself.

This is the architectural pattern that enables scalability, extensibility, and modularity in the MLIR ecosystem.


With these files in place, you simply build the project. The dialect can then be used in a .mlir file like:

func.func @add_example(%arg0: f64, %arg1: f64) -> f64 {
  %res = tuto.add %arg0, %arg1 : f64
  return %res : f64
}

And you can test it using your binary:

./bin/Tuto-opt test/TutoTest.mlir

Result: your dialect and operations are fully integrated into MLIR and ready to be extended with types, patterns, lowerings, or whatever else you need.


Logical summary: