NaleRaphael/mlir_toy
Learn MLIR the hard way (probably) with Zig.
I believe this is a good way for me to learn more and gain a solid understanding of LLVM/MLIR internals.
Please check out utils/llvm/README.md.
# Basic usage: build with specific chapter
$ zig build -Dllvm_dir=${LLVM_DIR} -Dmlir_dir=${MLIR_DIR} -Dchapters=${CHAPTER}
Please check out build.sh for more details.
# - Run the output with data provided in MLIR toy example
$ ./zig-out/bin/toyc-chX ./toy_example/ChX/XXX [OPTIONS]
# - Or compare the output with the binary compiled from official toy example
# (remember to update the variables in this script based on your case)
$ ./compare_output.sh
NOTE
It's recommended to create symlinks in this directory to the compiled binaries and data of the MLIR toy example. Otherwise, you might need to change the default paths set in compare_output.sh accordingly.
# Assume that you cloned LLVM source repo to "~/workspace/tool/llvm-17", and
# ran the script "./utils/llvm/build_llvm_mlir.sh" to build MLIR.
$ LLVM_ROOT_DIR=~/workspace/tool/llvm-17
$ MLIR_INST_DIR=${LLVM_ROOT_DIR}/out/mlir
$ ln -s ${MLIR_INST_DIR}/examples toy_bin
$ ln -s ${LLVM_ROOT_DIR}/mlir/test/Examples/Toy toy_examples
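Before running the comparison script, it can help to confirm that both symlinks resolve. The following is a minimal sketch, assuming the link names toy_bin and toy_examples used above; check_toy_links is a hypothetical helper and is not part of this repository.

```shell
# Sketch: verify that the toy_bin and toy_examples symlinks resolve to directories.
# check_toy_links is a hypothetical helper, not a script shipped with this project.
check_toy_links() {
    links_root="${1:-.}"
    links_status=0
    for link_name in toy_bin toy_examples; do
        # [ -d ] follows symlinks, so a dangling link is reported as broken.
        if [ -d "${links_root}/${link_name}" ]; then
            echo "ok: ${link_name}"
        else
            echo "missing or broken: ${link_name}"
            links_status=1
        fi
    done
    return "${links_status}"
}
```

Calling check_toy_links with no argument checks the current directory; a non-zero return value means at least one link needs to be recreated.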
$ cd tests/mlir
# Remember to update the variables `LLVM_DIR`, `MLIR_DIR`, `FILECHECK_BIN`
$ ./run_tests.sh
# Usage: ./tests/compare_toyc_output.sh <ChN>
# - ChN: chapter number (0 ~ 7)
#
# Environment variables:
# - VERBOSE: (0 or 1) show details when running test
# - ENABLE_DEBUG: (0 or 1) add debug options to CLI when running toyc binaries
# Example: compare the outputs of C++ and our Zig implementation for Ch7, and
# show the details while running.
$ VERBOSE=1 ./tests/compare_toyc_output.sh 7
NOTE
When comparing the output for Ch7, the test will fail when ENABLE_DEBUG=1 because there is an extra canonicalizer pass added to the pipeline (see here). It's considered a redundant pass, but the IR before and after it is still shown because --mlir-print-ir-before-all and --mlir-print-ir-after-all are enabled. So you can safely comment these 2 options out if you want to run the script with ENABLE_DEBUG=1 for Ch7.
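As an alternative to commenting the options out by hand, the choice could be made per chapter. This is only a sketch of that idea: ir_debug_flags is a hypothetical function, and the actual variable names and structure inside compare_toyc_output.sh may differ.

```shell
# Sketch: pick the IR-printing debug flags based on chapter and ENABLE_DEBUG.
# ir_debug_flags is hypothetical; it does not mirror the real script's internals.
ir_debug_flags() {
    flags_chapter="$1"
    flags_enable_debug="${2:-0}"
    # Print nothing unless debugging was requested.
    if [ "${flags_enable_debug}" != "1" ]; then
        return 0
    fi
    # Ch7 runs an extra canonicalizer pass, so IR dumps would never match; skip them.
    if [ "${flags_chapter}" = "7" ]; then
        return 0
    fi
    echo "--mlir-print-ir-before-all --mlir-print-ir-after-all"
}
```

For example, ir_debug_flags 7 "${ENABLE_DEBUG:-0}" prints nothing, so the Ch7 comparison would run without the two IR-printing options.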
Here are the things that currently cannot be done as initially planned. I will come back to them if possible.
The build of an MLIR dialect library heavily depends on things defined in CMake modules like AddMLIR.cmake and AddLLVM.cmake. Though I think it's possible to replicate those macros without using CMake, it's really time-consuming, and the result would have to be maintained whenever I switch to a different version of LLVM/MLIR. But the major barrier that stops me from working on this for now is the lack of support for some compiler and linker features in zig cc/c++, e.g., unsupported linker arg: -rpath-link. So my current workaround is to use CMake within a build script, build_dialect.sh, and integrate this step in build.zig. You can check this out in each chapter folder to see how it works, and see also "src/sample/" for some notes and code I've written while digging into this topic.
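To give a rough idea of what delegating the dialect build to CMake involves, here is a dry-run sketch that only prints the configure and build commands. It is not the content of build_dialect.sh: dialect_cmake_cmds and its two arguments are hypothetical, and it assumes LLVM_DIR and MLIR_DIR point at the directories containing the LLVM and MLIR CMake package files.

```shell
# Sketch: print (not run) the CMake commands a dialect build step might use.
# dialect_cmake_cmds is hypothetical and is not taken from build_dialect.sh.
dialect_cmake_cmds() {
    cmds_src_dir="$1"
    cmds_build_dir="$2"
    # LLVM_DIR and MLIR_DIR let find_package() locate the LLVM/MLIR CMake modules,
    # which is what makes AddMLIR.cmake and AddLLVM.cmake available.
    echo "cmake -S ${cmds_src_dir} -B ${cmds_build_dir} -DLLVM_DIR=${LLVM_DIR:?set LLVM_DIR} -DMLIR_DIR=${MLIR_DIR:?set MLIR_DIR}"
    echo "cmake --build ${cmds_build_dir}"
}
```

Printing the commands first makes it easy to inspect the paths before actually invoking CMake from a build script.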
If I understand correctly, it's currently not possible to implement the Ops of a custom dialect without directly writing C++, because they rely on C++'s CRTP for type traits and on support for some internal features like IR verification. That's why we still keep some C++ implementations in the "ChX/toy/cpp" folder.
Regarding passes, it's possible to work with Zig and the MLIR C-API directly. You can check out how it works in "tests/mlir/CAPI/pass.zig".
As for other things that are not supported in the current MLIR C-API, we have to extend it ourselves. See also the "ChX/toy/c" folder for details.
Use --verbose-cc and --verbose-cimport to get details for debugging.
C-API headers can be found in $LLVM_SRC/mlir/include/mlir-c, but some C-APIs are defined in "Passes.td" files and they will be generated as ".capi.h" and ".capi.cpp" files after being built.
The files in tests/mlir/CAPI are also resources to learn how to use the MLIR C-API in Zig.
Honestly, I didn't even know it was possible to complete this tutorial using Zig and the MLIR C-API at first. But I just wanted to push my limits and see what I could achieve, even though it was my first time building something related to a compiler.
Using Zig forced me to learn more low-level details. It made me struggle at times and exposed things I hadn't considered before, such as polymorphism via vtable/static dispatch and memory management while handling errors.
Working with the MLIR C-API was also challenging. It pushed me to understand how specific steps work behind the scenes in MLIR when I couldn't directly manipulate data in C++.
And issues related to the build system (like issue #1) also forced me to dive deeper into understanding how compilers, linkers and CMake work when building ELFs.
But I really enjoyed it because I did learn a lot in this process.
Now this project is almost finished, though there are still plenty of things to learn before I can build the thing I want with MLIR. Working on this project has sparked my interest in exploring other techs for building compilers without relying on LLVM/MLIR. For instance, I really want to see how the Zig compiler will evolve after the announcement about removing LLVM-related dependencies.
So, if you're looking to learn MLIR in an unconventional way, I hope this project is helpful to you as well.
And here are more things that might interest you about Zig: