MLIR (software)

MLIR
Original author(s): Chris Lattner, Mehdi Amini, Uday Bondhugula, and others
Developer(s): LLVM Developer Group
Initial release: 2019
Written in: C++
Operating system: Cross-platform
Type: Compiler
License: Apache License 2.0 with LLVM Exceptions
Website: mlir.llvm.org

MLIR (Multi-Level Intermediate Representation) is an open-source compiler infrastructure project developed as a sub-project of the LLVM project. It provides a modular and extensible intermediate representation (IR) framework intended to facilitate the construction of domain-specific compilers and improve compilation for heterogeneous computing platforms. MLIR supports multiple abstraction levels in a single IR and introduces dialects, a mechanism for defining custom operations, types, and attributes tailored to specific domains.[1] The name "Multi-Level Intermediate Representation" reflects the system’s ability to model computations at various abstraction levels and progressively lower them toward machine code.

MLIR was originally developed in 2018 by Chris Lattner at Google, and publicly released as part of LLVM in 2019.[2] It was designed to address challenges in building compilers for modern workloads such as machine learning, hardware acceleration, and high-level synthesis by providing reusable components and standardizing the representation of intermediate computations across different programming languages and hardware targets.[1][3]

MLIR is used in a range of systems including TensorFlow, Mojo, TPU-MLIR, and others.[4] It is released under the Apache License 2.0 with LLVM Exceptions and is maintained as part of the LLVM project.[1]

History

Work on MLIR began in 2018, led by Chris Lattner at Google in collaboration with Mehdi Amini, River Riddle, and others, as a response to the growing complexity of modern compiler toolchains.[1][2] The project aimed to improve the modularity, composability, and maintainability of compiler infrastructures, particularly in domains such as machine learning, high-level synthesis, and hardware acceleration. It was formally introduced at the 2019 LLVM Developer Meeting and was open-sourced later that year as part of the LLVM monorepository.[5][6]


MLIR’s architecture was shaped by prior experiences building compilers such as XLA and LLVM, where limitations in existing intermediate representations hindered optimization and reuse across abstraction levels. To address this, MLIR introduced a novel concept of multi-level IRs that could coexist in the same system and be gradually lowered through well-defined transformations. A foundational design feature was the use of dialects, allowing different domains and hardware targets to define custom operations and type systems while maintaining interoperability.[2]

Since its release, MLIR has been adopted by multiple compiler ecosystems and research efforts. In TensorFlow, MLIR serves as the foundation for rewriting and lowering transformations in components such as XLA and TensorFlow Runtime. The language Mojo, developed by Modular Inc., relies on MLIR to achieve ahead-of-time compilation for artificial intelligence workloads.[3] Additional projects that have built on MLIR include TPU-MLIR for compiling models to Tensor Processing Unit hardware,[7] ONNX-MLIR for interoperable machine learning models,[8] MLIR-AIE for targeting Xilinx AI Engines,[9] IREE for compiling and executing machine learning models across CPUs, GPUs, and accelerators,[10] DSP-MLIR, a compiler infrastructure tailored for digital signal processing (DSP) applications,[11] and torch-mlir, which brings MLIR-based compilation capabilities to the PyTorch ecosystem.[12][4]

MLIR continues to evolve as part of the LLVM Project and follows the project's release schedule and development policies. It is developed collaboratively by contributors from industry, academia, and the broader open-source community.

Dialects

In MLIR, a dialect defines a self-contained namespace of operations, types, attributes, and other constructs. Dialects are the primary mechanism for extensibility, allowing developers to introduce domain-specific abstractions while maintaining compatibility within the broader MLIR framework. Each operation within a dialect is identified by a unique name and may include optional operands, results, attributes, and regions. Operands and results follow the static single-assignment form (SSA), and each result is associated with a type. Attributes represent compile-time metadata, such as constant values. Regions consist of ordered blocks, each of which may take input arguments and contain a sequence of nested operations.[13] While MLIR is designed around SSA, it avoids traditional PHI nodes by using block arguments in conjunction with the operands of control-flow operations to model value merging.[14]

The general syntax for an operation is the following:

%res:2 = "mydialect.morph"(%input#3) ({
            ^bb0(%arg0: !mydialect<"custom_type"> loc("mysource.cc":10:8)):
                // nested operations
         }) { some.attribute = true, other_attribute = 1.5 }
         : (!mydialect<"custom_type">) -> (!mydialect<"other_type">, !mydialect<"other_type">)
         loc(callsite("foo" at "mysource.cc":10:8))

This operation, named morph, belongs to the mydialect dialect. It takes one input operand (%input#3) of type custom_type and produces two output values of type other_type. The operation includes two attributes—some.attribute and other_attribute—and contains a region with a single block (^bb0) that accepts one argument. The loc keyword specifies source-level location information, which can be used for debugging or diagnostic reporting.[15]
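
For instance, the following sketch (the function and block names are illustrative) shows how a value produced along two different control-flow paths is merged through a block argument of the cf (control flow) dialect rather than a PHI node:

func.func @select_value(%cond: i1, %a: i32, %b: i32) -> i32 {
  cf.cond_br %cond, ^bb1, ^bb2

^bb1:
  cf.br ^bb3(%a : i32)   // pass %a to the merge block

^bb2:
  cf.br ^bb3(%b : i32)   // pass %b to the merge block

^bb3(%merged: i32):      // the block argument plays the role of a PHI node
  func.return %merged : i32
}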

The syntax of operations, types, and attributes can also be customized by implementing appropriate parsing and printing functions within the operation definition.[16]
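
For example, the arith.addf operation can always be written in the verbose generic form shown above, while its custom parser and printer enable an equivalent, more concise representation:

// Generic form, available for any operation:
%sum = "arith.addf"(%lhs, %rhs) : (f32, f32) -> f32

// Equivalent custom form, enabled by the dialect's parser and printer:
%sum = arith.addf %lhs, %rhs : f32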

Core dialects

The MLIR dialects ecosystem is open and extensible, allowing end-users to define new dialects that capture the semantics of specific computational domains. At the same time, the MLIR codebase provides a variety of built-in dialects that address common patterns found in intermediate representations. These core dialects are designed to be self-contained and interoperable, making them suitable for reuse across different compiler stacks.[17]

For example, the arith dialect includes basic mathematical operations over integers and floating-point types, while the memref dialect provides operations for memory allocation and access. Control-flow abstractions are handled by dialects such as affine, which supports affine loop nests suitable for polyhedral optimization, and scf, which provides structured control flow using constructs like for, if, and while. The func dialect supports function definitions and calls, while the gpu dialect introduces primitives for GPU programming models. Additionally, the tosa dialect defines a portable and quantization-friendly operator set for machine learning inference. Finally, the llvm dialect provides a one-to-one mapping to LLVM IR, enabling seamless lowering to LLVM’s backend and reuse of its optimization and code generation infrastructure.[4]

The following code defines a function that takes two floating-point matrices and performs element-wise addition:

func.func @matrix_add(%arg0: memref<10x20xf32>, %arg1: memref<10x20xf32>) -> memref<10x20xf32> {
    %result = memref.alloc() : memref<10x20xf32>

    affine.for %i = 0 to 10 {
        affine.for %j = 0 to 20 {
            %lhs = memref.load %arg0[%i, %j] : memref<10x20xf32>
            %rhs = memref.load %arg1[%i, %j] : memref<10x20xf32>
            %sum = arith.addf %lhs, %rhs : f32
            memref.store %sum, %result[%i, %j] : memref<10x20xf32>
        }
    }

    func.return %result : memref<10x20xf32>
}

Although different dialects may be used to express similar computations, the level of abstraction and the intended compilation flow may vary. In the example above, the affine dialect enables polyhedral analysis and optimizations, while the memref and arith dialects express memory and arithmetic operations, respectively.[17]

Operation definition specification

The operations of a dialect can be defined using the C++ language, but also in a more convenient and robust way through the Operation Definition Specification (ODS).[18] The C++ code for declarations and definitions is then automatically generated using TableGen.[19]

The autogenerated code can include parsing and printing methods – based on a simple string describing the structure of the desired textual representation – together with all the boilerplate code for accessing fields and performing common actions such as verification of the semantics of each operation, canonicalization, and folding.[20]

The same declaration mechanism can also be used for types and attributes, the other two categories of elements constituting a dialect.[20]

The following example illustrates how to specify the assembly format of an operation expecting a variadic number of operands and producing zero results. The textual representation consists of the attribute dictionary, followed by an optional list of operands, a colon, and the types of the operands.[18]

let assemblyFormat = "attr-dict ($operands^ `:` type($operands))?";
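
Such a format string is normally embedded in a complete ODS record. The following is a minimal, hypothetical TableGen definition of an operation using the format above (the dialect and operation names are illustrative):

include "mlir/IR/OpBase.td"

// Hypothetical dialect definition hosting the operation.
def MyDialect : Dialect {
  let name = "mydialect";
}

// Hypothetical ODS record for an operation that takes a variadic
// number of operands and produces zero results.
def MyDialect_MorphOp : Op<MyDialect, "morph"> {
  let summary = "illustrative operation with variadic operands";
  let arguments = (ins Variadic<AnyType>:$operands);
  let results = (outs);
  let assemblyFormat = "attr-dict ($operands^ `:` type($operands))?";
}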

Transformations

Transformations can always be performed directly on the IR, without relying on built-in coordination mechanisms. However, to ease both implementation and maintenance, MLIR provides an infrastructure for IR rewriting composed of different rewrite drivers. Each driver receives a set of objects named patterns, each of which has its own internal logic for matching operations with certain properties. When an operation is matched, the rewrite process is performed and the IR is modified according to the logic within the pattern.[21]
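
Patterns are commonly defined in C++ by subclassing OpRewritePattern. The following is a minimal sketch, assuming the upstream MLIR C++ API (header paths and accessor names may vary across versions), of a pattern that folds the addition of a floating-point zero:

#include "mlir/Dialect/Arith/IR/Arith.h"
#include "mlir/IR/PatternMatch.h"

using namespace mlir;

// Rewrites `arith.addf %x, %zero` into plain `%x` when the right-hand
// operand is a floating-point constant equal to zero.
struct FoldAddZero : public OpRewritePattern<arith::AddFOp> {
  using OpRewritePattern<arith::AddFOp>::OpRewritePattern;

  LogicalResult matchAndRewrite(arith::AddFOp op,
                                PatternRewriter &rewriter) const override {
    // Match phase: the right operand must be a zero constant.
    auto cst = op.getRhs().getDefiningOp<arith::ConstantFloatOp>();
    if (!cst || !cst.value().isZero())
      return failure();
    // Rewrite phase: all uses of the result are replaced by the left operand.
    rewriter.replaceOp(op, op.getLhs());
    return success();
  }
};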

Dialect conversion driver

This driver operates according to the legality of existing operations: it receives a set of rules determining which operations are to be considered illegal, and expects the patterns to match and convert them into legal ones. The logic behind those rules can be arbitrarily complex: it may be based simply on the dialect to which the operations belong, but it can also inspect more specific properties such as attributes or nested operations.[22]

As the name suggests, this driver is typically used for converting the operations of one dialect into operations belonging to a different one. In this scenario, the whole source dialect would be marked as illegal, the destination one as legal, and patterns for the source dialect operations would be provided. The dialect conversion framework also supports type conversion, which must be performed on operands and results to convert them to the type system of the destination dialect.[22]
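
In the C++ API, the legality rules are expressed through a ConversionTarget, which is handed to the conversion driver together with the patterns. The following sketch, based on upstream class and function names that may vary across MLIR versions, marks the affine dialect as illegal and lowers it towards scf:

// Inside a pass: configure legality and run the conversion driver.
ConversionTarget target(getContext());
// Every affine operation must be rewritten away...
target.addIllegalDialect<affine::AffineDialect>();
// ...while scf, arith, and memref operations are accepted as-is.
target.addLegalDialect<scf::SCFDialect, arith::ArithDialect,
                       memref::MemRefDialect>();

RewritePatternSet patterns(&getContext());
populateAffineToStdConversionPatterns(patterns); // upstream lowering patterns

if (failed(applyPartialConversion(getOperation(), target,
                                  std::move(patterns))))
  signalPassFailure();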

MLIR allows multiple conversion paths to be taken. Returning to the matrix addition example, one possible lowering strategy is to generate for-loops from the scf dialect, obtaining code suitable for execution on CPUs:

#map = affine_map<(d0, d1) -> (d0, d1)>

module {
    func.func @matrix_add(%arg0: memref<10x20xf32>, %arg1: memref<10x20xf32>) -> memref<10x20xf32> {
        %alloc = memref.alloc() : memref<10x20xf32>
        %c0 = arith.constant 0 : index
        %c10 = arith.constant 10 : index
        %c1 = arith.constant 1 : index
        
        scf.for %arg2 = %c0 to %c10 step %c1 {
            %c0_0 = arith.constant 0 : index
            %c20 = arith.constant 20 : index
            %c1_1 = arith.constant 1 : index
            
            scf.for %arg3 = %c0_0 to %c20 step %c1_1 {
                %0 = memref.load %arg0[%arg2, %arg3] : memref<10x20xf32>
                %1 = memref.load %arg1[%arg2, %arg3] : memref<10x20xf32>
                %2 = arith.addf %0, %1 : f32
                memref.store %2, %alloc[%arg2, %arg3] : memref<10x20xf32>
            }
        }
        
        return %alloc : memref<10x20xf32>
    }
}

Another possible strategy is to use the gpu dialect to generate code for GPUs:

#map = affine_map<(d0, d1) -> (d0, d1)>

module {
    func.func @matrix_add(%arg0: memref<10x20xf32>, %arg1: memref<10x20xf32>) -> memref<10x20xf32> {
        %alloc = memref.alloc() : memref<10x20xf32>
        %c0 = arith.constant 0 : index
        %c10 = arith.constant 10 : index
        %0 = arith.subi %c10, %c0 : index
        %c1 = arith.constant 1 : index
        %c0_0 = arith.constant 0 : index
        %c20 = arith.constant 20 : index
        %1 = arith.subi %c20, %c0_0 : index
        %c1_1 = arith.constant 1 : index
        %c1_2 = arith.constant 1 : index
        
        gpu.launch blocks(%arg2, %arg3, %arg4) in (%arg8 = %0, %arg9 = %c1_2, %arg10 = %c1_2) threads(%arg5, %arg6, %arg7) in (%arg11 = %1, %arg12 = %c1_2, %arg13 = %c1_2) {
            %2 = arith.addi %c0, %arg2 : index
            %3 = arith.addi %c0_0, %arg5 : index
            %4 = memref.load %arg0[%2, %3] : memref<10x20xf32>
            %5 = memref.load %arg1[%2, %3] : memref<10x20xf32>
            %6 = arith.addf %4, %5 : f32
            memref.store %6, %alloc[%2, %3] : memref<10x20xf32>
            gpu.terminator
        }
        
        return %alloc : memref<10x20xf32>
    }
}

Greedy pattern rewrite driver

The driver greedily applies the provided patterns according to their benefit until a fixed point is reached or a maximum number of iterations is exceeded. The benefit of a pattern is self-attributed; in case of ties, the relative order within the pattern list is used.[21]
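
A typical invocation of the greedy driver, again sketched against the upstream C++ API (the entry point has been renamed across MLIR versions), collects a set of patterns and applies them until convergence:

// Inside a pass: collect patterns (e.g., the FoldAddZero sketch shown
// earlier) and run the greedy driver until a fixed point is reached
// or the iteration limit is exceeded.
RewritePatternSet patterns(&getContext());
patterns.add<FoldAddZero>(&getContext());

if (failed(applyPatternsAndFoldGreedily(getOperation(),
                                        std::move(patterns))))
  signalPassFailure();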

Traits and interfaces

MLIR allows existing optimizations (e.g., common subexpression elimination, loop-invariant code motion) to be applied to custom dialects by means of traits and interfaces. These two mechanisms enable transformation passes to operate on operations without knowing their actual implementation, relying only on the properties that traits or interfaces provide.[23][24]

Traits are attached to operations without requiring any additional implementation; their purpose is to declare that the operation satisfies certain properties (e.g., having exactly two operands).[23] Interfaces, by contrast, are a more powerful tool through which an operation can be queried about a specific aspect whose value may differ between instances of the same kind of operation. An example of an interface is the representation of memory effects: each operation that touches memory may implement the interface, but the reported effects may depend on the actual operands (e.g., a function call whose arguments may be constants or references to memory).[24]
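
As a sketch of how a pass can use such an interface generically (assuming the upstream MemoryEffectOpInterface; details may vary across MLIR versions), an operation can be asked for its memory effects without knowing its concrete type:

#include "mlir/Interfaces/SideEffectInterfaces.h"
#include "llvm/ADT/STLExtras.h"

using namespace mlir;

// Returns true when the operation implements the memory-effects
// interface and reports only read effects.
bool readsMemoryOnly(Operation *op) {
  auto memOp = dyn_cast<MemoryEffectOpInterface>(op);
  if (!memOp)
    return false; // no interface: conservatively assume arbitrary effects
  SmallVector<MemoryEffects::EffectInstance> effects;
  memOp.getEffects(effects); // the operation reports its own effects
  return llvm::all_of(effects, [](const MemoryEffects::EffectInstance &e) {
    return isa<MemoryEffects::Read>(e.getEffect());
  });
}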

Applications

The freedom in modeling intermediate representations enables MLIR to be used in a wide range of scenarios. These include traditional programming languages,[25] but also high-level synthesis,[26][27] quantum computing,[28] and homomorphic encryption.[29][30][31] Machine learning applications also take advantage of built-in polyhedral compilation techniques, together with dialects targeting accelerators and other heterogeneous systems.[32][33][34][35][36]

References

  1. ^ a b c d "Multi-Level Intermediate Representation Overview". mlir.llvm.org. Retrieved 2025-06-05.
  2. ^ a b c Lattner, Chris; Amini, Mehdi; Bondhugula, Uday; Cohen, Albert; Davis, Andy; Pienaar, Jacques; Riddle, River; Shpeisman, Tatiana; Vasilache, Nicolas; Zinenko, Oleksandr (2021). MLIR: Scaling Compiler Infrastructure for Domain Specific Computation. 2021 IEEE/ACM International Symposium on Code Generation and Optimization (CGO). pp. 2–14. doi:10.1109/CGO51591.2021.9370308.
  3. ^ a b "Why Mojo". docs.modular.com. Retrieved 2025-06-05.
  4. ^ a b c "Users of MLIR". mlir.llvm.org. Retrieved 2025-06-05.
  5. ^ "LLVM Developer Meetings". llvm.org. Retrieved 2025-06-05.
  6. ^ "llvm-project". GitHub. LLVM Project. Retrieved 2025-06-16.
  7. ^ "TPU-MLIR Developer Manual". doc.sophgo.com. Sophgo. Retrieved 2025-06-16.
  8. ^ "ONNX-MLIR". GitHub. ONNX Project. Retrieved 2025-06-16.
  9. ^ "MLIR-AIE". GitHub. Xilinx. Retrieved 2025-06-16.
  10. ^ "IREE". iree.dev. Retrieved 2025-06-16.
  11. ^ Kumar, Abhinav; Khedkar, Atharva; So, Hwisoo; Kuo, Megan; Gurjar, Ameya; Biswas, Partha; Shrivastava, Aviral (2025). "DSP-MLIR: A Domain-Specific Language and MLIR Dialect for Digital Signal Processing". Proceedings of the 26th ACM SIGPLAN/SIGBED International Conference on Languages, Compilers, and Tools for Embedded Systems (LCTES '25). New York, NY, USA: Association for Computing Machinery. pp. 146–157. doi:10.1145/3735452.3735527.
  12. ^ "torch-mlir". GitHub. LLVM Project. Retrieved 2025-06-16.
  13. ^ "MLIR Language Reference - MLIR". mlir.llvm.org. Retrieved 2023-07-05.
  14. ^ "MLIR Rationale - MLIR". mlir.llvm.org. Retrieved 2023-07-05.
  15. ^ Amini, Mehdi; Riddle, River. "MLIR Tutorial" (PDF). Retrieved 2025-06-05.
  16. ^ Stroustrup, Bjarne (2015). The C++ programming language: C++ 11 (4. ed., 4. print ed.). Upper Saddle River, NJ: Addison-Wesley. ISBN 978-0-321-56384-2.
  17. ^ a b "Dialects - MLIR". mlir.llvm.org. Retrieved 2023-07-07.
  18. ^ a b "Operation Definition Specification (ODS) - MLIR". mlir.llvm.org. Retrieved 2023-07-05.
  19. ^ "TableGen Overview — LLVM 17.0.0git documentation". llvm.org. Retrieved 2023-07-05.
  20. ^ a b "Defining Dialects - MLIR". mlir.llvm.org. Retrieved 2023-07-07.
  21. ^ a b "Pattern Rewriting : Generic DAG-to-DAG Rewriting - MLIR". mlir.llvm.org. Retrieved 2023-07-06.
  22. ^ a b "Dialect Conversion - MLIR". mlir.llvm.org. Retrieved 2023-07-06.
  23. ^ a b "Traits - MLIR". mlir.llvm.org. Retrieved 2023-07-05.
  24. ^ a b "Interfaces - MLIR". mlir.llvm.org. Retrieved 2023-07-05.
  25. ^ Moses, William S.; Chelini, Lorenzo; Zhao, Ruizhe; Zinenko, Oleksandr (2021). Polygeist: Raising C to Polyhedral MLIR. 30th International Conference on Parallel Architectures and Compilation Techniques (PACT). pp. 45–59. doi:10.1109/PACT52795.2021.00011. ISBN 978-1-6654-4278-7.
  26. ^ Agostini, Nicolas Bohm; Curzel, Serena; Amatya, Vinay; Tan, Cheng; Minutoli, Marco; Castellana, Vito Giovanni; Manzano, Joseph; Kaeli, David; Tumeo, Antonino (2022-10-30). "An MLIR-based Compiler Flow for System-Level Design and Hardware Acceleration". Proceedings of the 41st IEEE/ACM International Conference on Computer-Aided Design. Association for Computing Machinery. pp. 1–9. doi:10.1145/3508352.3549424. hdl:11311/1229389. ISBN 978-1-4503-9217-4.
  27. ^ Ruizhe, Zhao; Jianyi, Cheng (2021). "Phism: Polyhedral High-Level Synthesis in MLIR". arXiv:2103.15103 [cs.PL].
  28. ^ McCaskey, Alexander; Nguyen, Thien (October 2021). "A MLIR Dialect for Quantum Assembly Languages". 2021 IEEE International Conference on Quantum Computing and Engineering (QCE). IEEE. pp. 255–264. arXiv:2101.11365. doi:10.1109/QCE52317.2021.00043. ISBN 978-1-6654-1691-7. S2CID 231718965.
  29. ^ Park, Sunjae; Song, Woosung; Nam, Seunghyeon; Kim, Hyeongyu; Shin, Junbum; Lee, Juneyoung (2023-06-06). "HEaaN.MLIR: An Optimizing Compiler for Fast Ring-Based Homomorphic Encryption". Proceedings of the ACM on Programming Languages. 7 (PLDI): 196–220. doi:10.1145/3591228. ISSN 2475-1421.
  30. ^ Govindarajan, Sanath; Moses, William S. "SyFER-MLIR: Integrating Fully Homomorphic Encryption Into the MLIR Compiler Framework" (PDF).
  31. ^ "HEIR: Homomorphic Encryption Intermediate Representation". GitHub. Retrieved 2023-09-05.
  32. ^ Jin, Tian; Bercea, Gheorghe-Teodor; Le, Tung D.; Chen, Tong; Su, Gong; Imai, Haruki; Negishi, Yasushi; Leu, Anh; O'Brien, Kevin; Kawachiya, Kiyokuni; Eichenberger, Alexandre E. (2020). "Compiling ONNX Neural Network Models Using MLIR". arXiv:2008.08272 [cs.PL].
  33. ^ Pienaar, Jacques (2020), MLIR in TensorFlow Ecosystem, retrieved 2023-07-06
  34. ^ Hu, Pengchao; Lu, Man; Wang, Lei; Jiang, Guoyue (2022). "TPU-MLIR: A Compiler For TPU Using MLIR". arXiv:2210.15016 [cs.PL].
  35. ^ Katel, Navdeep; Khandelwal, Vivek; Bondhugula, Uday (2022-03-19). "MLIR-based code generation for GPU tensor cores". Proceedings of the 31st ACM SIGPLAN International Conference on Compiler Construction. ACM. pp. 117–128. doi:10.1145/3497776.3517770. ISBN 978-1-4503-9183-2. S2CID 247522110.
  36. ^ Bik, Aart; Koanantakool, Penporn; Shpeisman, Tatiana; Vasilache, Nicolas; Zheng, Bixia; Kjolstad, Fredrik (2022-12-31). "Compiler Support for Sparse Tensor Computations in MLIR". ACM Transactions on Architecture and Code Optimization. 19 (4): 1–25. arXiv:2202.04305. doi:10.1145/3544559. ISSN 1544-3566. S2CID 246680261.