A first try on Zig and C interop
Tuesday 18 June 2024 · 44 mins read
Introduction
Zig is still in development, and the language API is not stable. The code in this article may not work in future versions of Zig.
The version of Zig used in this article is 0.13.0-dev.380+b32aa99b8.
Lately, I've been programming in Zig for the Advent of Code 2023. What I've learned about the language is that it works perfectly well for low-level programming, but is lacking in some areas (my biggest gripe is the lack of HTTP/2 support).
However, Zig claims to be able to interop with C, which is something that I've been wanting to try for a long time. As you know, I used to use CGO in Go, which lets me wrap complex C libraries behind a high-level layer in Go.
In this article, I will share my experience with Zig and C interop. I will use a simple example: a C library that transcodes a video into AV1 format, and a Zig program that uses this library.
The concept
Transcoding a video seems quite a complex task, but thanks to the libavcodec and libavformat C libraries, it is quite "easy" to do (or at least, to understand).
Transcoding a video follows these steps:
- Open the input video file.
- Demux the input video file: read packets from the input video file.
- Decode the packets: decode the packets into frames by passing them to the decoder.
- Encode the frames: encode the frames into packets by passing them to the encoder.
- Mux the packets: write the packets to the output video file.
The aim of this article is to demonstrate how to use a C library without the need to make a Zig wrapper around it. This is a common practice in Go, where you can use CGO to call C functions directly.
Zig and tricks
I'm assuming that people reading this article are not familiar with Zig, so I will explain some concepts that I've learned while programming in Zig.
Memory allocators
Zig does not have a garbage collector, so you have to manage memory "yourself". By yourself, I mean that you have to allocate and deallocate memory manually.
Compared to C, you are free to choose the type of allocator you want to use. The standard library provides a std.mem.Allocator interface that you can implement to create your own allocator.
In this article, I will use two allocators:
- std.heap.GeneralPurposeAllocator: a simple allocator. We could have used the std.heap.page_allocator or the std.heap.c_allocator, but for the sake of simplicity, I will use the GeneralPurposeAllocator.
- std.heap.ArenaAllocator: an allocator that groups allocations in arenas. This is useful when you want to deallocate a group of allocations at once, like the arguments of the program.
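To make the arena idea concrete, here is a minimal sketch. The helper `totalLength` is hypothetical and only exists for illustration: it copies a few strings into an arena and releases all of them with a single `deinit`.

```zig
const std = @import("std");

// Hypothetical helper: copies every word into an arena and returns the total length.
fn totalLength(backing: std.mem.Allocator, words: []const []const u8) !usize {
    var arena = std.heap.ArenaAllocator.init(backing);
    defer arena.deinit(); // frees every allocation made below, all at once
    const allocator = arena.allocator();

    var total: usize = 0;
    for (words) |word| {
        const copy = try allocator.dupe(u8, word); // no matching free needed
        total += copy.len;
    }
    return total;
}

test "arena frees everything at once" {
    const words = [_][]const u8{ "zig", "and", "c" };
    // std.testing.allocator fails the test if anything leaks.
    try std.testing.expectEqual(@as(usize, 7), try totalLength(std.testing.allocator, &words));
}
```

The individual copies are never freed one by one, which is the whole point: the arena owns them as a group.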
In Zig, the GeneralPurposeAllocator can be used to detect memory leaks:
const std = @import("std");

var gpa = std.heap.GeneralPurposeAllocator(.{}){};
const gpa_allocator = gpa.allocator();

pub fn main() !void {
    // Detect memory leaks: deinit() returns .leak if something was not freed
    defer std.debug.assert(gpa.deinit() == .ok);
    // Your code here
}
Oh yeah, Zig has a defer statement, which is quite similar to Go's defer, but it is scoped to the curly braces, whereas Go's defer is scoped to the function. And the .ok is an enum value.
About the gpa
declaration, I'm calling a function with an "empty" struct as an argument, which returns a type. Do note I'm quoting "empty", because Zig has default values for structs, which is quite useful.
To instantiate the type std.heap.GeneralPurposeAllocator(.{}), we add the curly braces after the type name:
const my_struct: type = struct {
    a: u8 = 0, // Default values are what make the "empty" initializer valid
    b: u8 = 0,
};

const object_a = my_struct{};
const object_b: my_struct = .{}; // Equivalent
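The "function that takes an options struct and returns a type" pattern can be sketched on a smaller scale. `Options` and `FixedBuffer` below are hypothetical names made up for this example; they only mimic the shape of `GeneralPurposeAllocator(.{}){}`.

```zig
const std = @import("std");

// Hypothetical options struct: every field has a default, so `.{}` is valid.
const Options = struct {
    capacity: usize = 4,
    fill: u8 = 0,
};

// A function that takes compile-time options and returns a new type.
fn FixedBuffer(comptime options: Options) type {
    return struct {
        data: [options.capacity]u8 = [_]u8{options.fill} ** options.capacity,
    };
}

test "type-returning function with default options" {
    const small = FixedBuffer(.{}){}; // same shape as GeneralPurposeAllocator(.{}){}
    const large = FixedBuffer(.{ .capacity = 16, .fill = 0xff }){};

    try std.testing.expectEqual(@as(usize, 4), small.data.len);
    try std.testing.expectEqual(@as(u8, 0xff), large.data[15]);
}
```

The first pair of braces configures the type, the second pair creates a value of it.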
Easy memory management with defer
The best feature of Zig is the defer statement because it completes the "flow" of the function. Similar to Go, defer can be used to clean up resources at the end of the function:
fn do() !void {
    const array = try allocator.alloc(i64, array_size);
    defer allocator.free(array);

    // Your code here
}
Compared to C:
int do() {
    int ret = 0;
    int *array = malloc(array_size * sizeof(int));
    if (array == NULL) {
        ret = 1;
        goto end;
    }

    // Your code here

end:
    if (array != NULL) free(array);
    return ret;
}
But, one issue that I've had is that since defer is scoped to the curly braces, it's almost unusable in if statements:
var data: ?[]u8 = null;
if (first_file) {
    data = try allocator.alloc(u8, data_size);
    defer allocator.free(data.?);
} // The memory is freed here, so data is no longer usable afterwards
Zig has errdefer, which is "almost" what I want, but it only triggers when an error occurs:
var data: ?[]u8 = null;
if (first_file) {
    data = try allocator.alloc(u8, data_size);
    errdefer allocator.free(data.?);
}

return; // errdefer is not triggered: no error returned.
It would be nice to have an actual equivalent of Go's defer in Zig.
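One possible workaround, assuming the optional is declared in the function's outer scope, is to register the defer there and test the optional inside it. The function `process` below is a hypothetical sketch, not code from the transcoder.

```zig
const std = @import("std");

// Hypothetical function: allocates only when `first_file` is true,
// but frees at function exit rather than at the end of the if block.
fn process(allocator: std.mem.Allocator, first_file: bool, data_size: usize) !usize {
    var data: ?[]u8 = null;
    // Registered at function scope: runs when `process` returns.
    defer if (data) |d| allocator.free(d);

    if (first_file) {
        data = try allocator.alloc(u8, data_size);
    }

    // `data` is still valid here, after the if block.
    if (data) |d| {
        @memset(d, 0);
        return d.len;
    }
    return 0;
}

test "conditional allocation freed at function exit" {
    try std.testing.expectEqual(@as(usize, 16), try process(std.testing.allocator, true, 16));
}
```

It is not as direct as Go's function-scoped defer, but it keeps the cleanup next to the declaration.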
Error handling
Zig's error handling is similar to Go's, but slightly more primitive. To compare:
C:
int my_function() {
    if (error_condition) {
        return -1;
    }
    return 0;
}
Go:
func myFunction() error {
	if errorCondition {
		return errors.New("my error")
	}
	return nil
}
Zig:
fn my_function() !void {
    if (error_condition) {
        return error.MyError;
    }
}
In Zig, errors are not implementations of an error interface like in Go, but are enums. And errors can have subsets and supersets, which is somewhat confusing at first, but quite powerful:
const std = @import("std");

const FileOpenError = error{
    AccessDenied,
    OutOfMemory,
    FileNotFound,
};

const AllocationError = error{
    OutOfMemory,
};

test "coerce subset to superset" {
    const err = subset_to_superset(AllocationError.OutOfMemory);
    try std.testing.expect(err == FileOpenError.OutOfMemory);
}

fn subset_to_superset(err: AllocationError) FileOpenError {
    return err;
}
anyerror is the global error set: the superset of all error sets.
As you can see, there is some flexibility in error handling in Zig, but it has one downside: an error does not carry any value (no message, no custom data). However, Zig records an error return trace, which is quite useful for debugging and could help avoid the need to pass custom data in the error.
Zig's quick error handling
In Go, we often have this pattern:
func myFunction() error {
	if err := someFunction(); err != nil {
		return err
	}
	return nil
}
In Zig, we can use the try keyword to return an error immediately:
fn my_function() !void {
    try some_function();
}
This helps to streamline the error handling in Zig:
fn complex_function() !void {
    try handle_data(try fetch_data());
}
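For readers coming from Go, it may help to see that try is shorthand for "catch the error and return it". The functions below are hypothetical and only exist to compare the two spellings.

```zig
const std = @import("std");

// Hypothetical fallible function used by both variants.
fn mayFail(fail: bool) !u8 {
    if (fail) return error.Boom;
    return 42;
}

// Short form.
fn withTry(fail: bool) !u8 {
    const value = try mayFail(fail);
    return value + 1;
}

// Long form: what `try` expands to.
fn withCatch(fail: bool) !u8 {
    const value = mayFail(fail) catch |err| return err;
    return value + 1;
}

test "try is shorthand for catch-and-return" {
    try std.testing.expectEqual(@as(u8, 43), try withTry(false));
    try std.testing.expectEqual(@as(u8, 43), try withCatch(false));
    try std.testing.expectError(error.Boom, withTry(true));
    try std.testing.expectError(error.Boom, withCatch(true));
}
```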
C interop
Zig ships its own C compiler and its own "C translator". To include a C library in Zig, you have to use the @cImport builtin:
const c = @cImport({
    @cInclude("libavcodec/avcodec.h");
    @cInclude("libavformat/avformat.h");
    @cInclude("libavutil/avutil.h");
});

pub fn main() !void {
    c.call_some_c_function();
}
Which is quite similar to Go:
/*
#cgo pkg-config: libavformat libavcodec libavutil
#include <libavcodec/avcodec.h>
#include <libavformat/avformat.h>
#include <libavutil/avutil.h>
*/
import "C"

func main() {
	C.call_some_c_function()
}
About pkg-config, Zig has its own build system. You can create a build.zig and pass the C libraries you want to link to:
const std = @import("std");

pub fn build(b: *std.Build) void {
    const target = b.standardTargetOptions(.{});
    const optimize = b.standardOptimizeOption(.{});

    const exe = b.addExecutable(.{
        .name = "av1-transcoder",
        .root_source_file = b.path("src/main.zig"),
        .target = target,
        .optimize = optimize,
        .link_libc = true,
    });

    exe.addIncludePath(.{
        .src_path = .{ .owner = b, .sub_path = "src" },
    });
    exe.linkSystemLibrary2("libavcodec", .{ .preferred_link_mode = .static });
    exe.linkSystemLibrary2("libavutil", .{ .preferred_link_mode = .static });
    exe.linkSystemLibrary2("libavformat", .{ .preferred_link_mode = .static });
    exe.linkSystemLibrary2("swresample", .{ .preferred_link_mode = .static });
    exe.linkSystemLibrary2("SvtAv1Dec", .{ .preferred_link_mode = .static });
    exe.linkSystemLibrary2("SvtAv1Enc", .{ .preferred_link_mode = .static });

    b.installArtifact(exe);
}
Your IDE/LSP won't be able to detect the C symbols at first, but after compiling the project, it will be able to detect them.
But one difference is certain: Zig has fewer "bridges" between Zig and C, which makes the code more readable than Go's:
// In the .zig-cache directory, there is the translation of the C library to Zig
pub extern fn avformat_open_input(ps: [*c][*c]AVFormatContext, url: [*c]const u8, fmt: [*c]const AVInputFormat, options: [*c]?*AVDictionary) c_int;

// Usage
fn test_avformat_open_input(input: [*:0]const u8) c_int {
    var ifmt_ctx: ?[*]c.AVFormatContext = null;
    return c.avformat_open_input(&ifmt_ctx, input, null, null);
}
Compared to Go:
// This is stored in the $HOME/.cache/go-build directory. The translation is not readable.
//go:cgo_unsafe_args
func _Cfunc_avformat_open_input(p0 **_Ctype_struct_AVFormatContext, p1 *_Ctype_char, p2 *_Ctype_struct_AVInputFormat, p3 **_Ctype_struct_AVDictionary) (r1 _Ctype_int) {
	_cgo_runtime_cgocall(_cgo_fa42a779fc4c_Cfunc_avformat_open_input, uintptr(unsafe.Pointer(&p0)))
	if _Cgo_always_false {
		_Cgo_use(p0)
		_Cgo_use(p1)
		_Cgo_use(p2)
		_Cgo_use(p3)
	}
	return
}

// Usage
func testAVFormatOpenInput(input string) C.int {
	var ifmt *C.AVFormatContext = nil
	return C.avformat_open_input(&ifmt, C.CString(input), nil, nil)
}
Oh wait, again! Here, Zig has multiple features that enhance the safety of the code:
- [*:0]const u8 is a many-item pointer ([*]const u8) that is sentinel-terminated by a 0 byte (the :0 part). It is the pointer counterpart of the slice []const u8 and of the sentinel-terminated slice [:0]const u8. To summarize, this is a C string. Zig strings do not need to be manipulated with a pointer.
- ?[*]c.AVFormatContext is a nullable pointer to a c.AVFormatContext struct. Zig has basic null safety. Since C does not have any null safety, you may instead see [*c]c.AVFormatContext, which is a C pointer to a c.AVFormatContext struct and can be null.
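The relationship between Zig strings, C strings and optional pointers can be sketched without any C library. `cStringLength` is a hypothetical stand-in for a C function taking a `const char *`.

```zig
const std = @import("std");

// Hypothetical stand-in for a C function that receives a `const char *`.
fn cStringLength(s: [*:0]const u8) usize {
    // std.mem.span scans for the 0 sentinel and returns a slice with a length.
    return std.mem.span(s).len;
}

test "slices, C strings and nullable pointers" {
    const slice: [:0]const u8 = "input.mkv"; // Zig string: pointer + length
    const c_str: [*:0]const u8 = slice.ptr; // C string: pointer only

    try std.testing.expectEqual(@as(usize, 9), cStringLength(c_str));

    // An optional pointer has to be unwrapped before it can be dereferenced.
    var value: u8 = 7;
    const maybe: ?*u8 = &value;
    if (maybe) |ptr| {
        ptr.* += 1;
    }
    try std.testing.expectEqual(@as(u8, 8), value);
}
```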
Both languages suffer from one major issue: the comments are not passed to the translation, which means that deprecation notices or warnings are not passed to the Zig/Go code.
Overall, Zig has a slightly better C interop than Go.
Limitations of the C interop
Zig has some limitations when it comes to C interop:
- Some macros are translated to Zig, but not all of them. You may have to write the Zig equivalent of the macro:
  pub const av_err2str = @compileError("unable to translate C expr: expected ')' instead got '['");
  The worst issue that I had is due to the strictness of Zig's type system. Some macros do not translate well between u64, c_int, usize... This is quite a pain, because FFmpeg (libavutil) uses a macro to define errors at compile time.
- const-hell. Zig is able to handle const at pointer ("pointer's value is immutable") and struct level ("struct is immutable"). However, Zig is quite picky when passing a const pointer to a C function (developer's fault):
  c.av_guess_frame_rate(@constCast(ifmt_ctx), @constCast(in_stream), null);
  Technically, av_guess_frame_rate could accept const pointers, because it does not modify what they point to.
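As an illustration of writing the Zig equivalent of a macro, here is a sketch that ports libavutil's MKTAG and FFERRTAG macros by hand, assuming their usual definitions (`MKTAG` packs four bytes little-endian, `FFERRTAG` negates the result as an int). The function names `mktag` and `fferrtag` are my own.

```zig
const std = @import("std");

// Hand-written port of MKTAG(a, b, c, d): a | b << 8 | c << 16 | d << 24.
fn mktag(a: u8, b: u8, c: u8, d: u8) u32 {
    return @as(u32, a) | (@as(u32, b) << 8) | (@as(u32, c) << 16) | (@as(u32, d) << 24);
}

// Hand-written port of FFERRTAG(a, b, c, d): -(int)MKTAG(a, b, c, d).
fn fferrtag(a: u8, b: u8, c: u8, d: u8) i32 {
    // Explicit reinterpretation: Zig refuses to mix u32 and i32 silently.
    const tag: i32 = @bitCast(mktag(a, b, c, d));
    return -%tag;
}

// Evaluated at compile time, like the C macro.
const AVERROR_EOF = fferrtag('E', 'O', 'F', ' ');

test "AVERROR_EOF matches the C definition" {
    try std.testing.expectEqual(@as(i32, -541478725), AVERROR_EOF);
}
```

The explicit `@as` and `@bitCast` calls are exactly the integer conversions that the automatic macro translation struggles with.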
Visibility
Zig visibility is scoped to the file, which is more similar to Python or C than Go. This forces you to mostly develop in a single file, which is quite a pain when you have a large project.
To export a function, you have to use the pub keyword:
pub fn my_function() void {
    // Your code here
}
To import a function, you have, well..., to import it:
const my_module = @import("my_module.zig");

pub fn main() void {
    my_module.my_function();
}
Personally, I prefer Go's visibility, which is scoped to the package and allows you to separate responsibilities easily.
I mean, just look at the std package in Zig. It's quite a mess. (Example: general_purpose_allocator.zig). Tests are also in the same file, which, I guess, is fine.
The reason why I think that Zig is messier than C and Python is that C's headers indicate explicitly what is exported and what is not. And Python doesn't really have a visibility system, but it's quite easy to understand what is exported simply by looking at the variable and function names.
Overall, Zig's visibility tries to be the best of both worlds: everything private in one file like in C, but without the hassle of header files, at the cost of having one messy file. I hope there will be some styling guidelines in the future.
Syntax
Zig's syntax has a lot of quality of life improvements compared to C and Go. I won't go too much into detail, but here are some examples:
- Dereferencing and null-safety chaining (.? unwraps an optional and panics in safe build modes if it is null):
  my_ptr.?.*.another_ptr.?.*.nullable_struct.?.ptr.*
  Go equivalent:
  if myPtr != nil && myPtr.anotherPtr != nil && myPtr.anotherPtr.nullableStruct != nil {
      myPtr.anotherPtr.nullableStruct.ptr // Implicit dereference
  }
  C equivalent:
  if (my_ptr != NULL && my_ptr->another_ptr != NULL && my_ptr->another_ptr->nullable_struct != NULL) {
      *(my_ptr->another_ptr->nullable_struct->ptr);
  }
- For loops can use ranges and zip (simultaneous iteration)
- Etc...
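The range and zip forms of the for loop look like this in practice. The two helper functions are hypothetical examples, not part of the transcoder.

```zig
const std = @import("std");

// Zip: iterate over two slices at once (they must have the same length).
fn dot(a: []const i32, b: []const i32) i32 {
    var sum: i32 = 0;
    for (a, b) |x, y| {
        sum += x * y;
    }
    return sum;
}

// Range: iterate over 0, 1, ..., n - 1.
fn sumBelow(n: usize) usize {
    var sum: usize = 0;
    for (0..n) |i| {
        sum += i;
    }
    return sum;
}

test "for loops with zip and ranges" {
    try std.testing.expectEqual(@as(i32, 32), dot(&.{ 1, 2, 3 }, &.{ 4, 5, 6 }));
    try std.testing.expectEqual(@as(usize, 10), sumBelow(5));
}
```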
Developing the AV1 transcoder
Developing the Remuxer
I had some issues developing with the libav* libraries due to the lack of resources. But after a while, I was able to do it in Zig without any wrapper.
To transcode a video, you must first think about remuxing the video: read packets and write them into a new container. Remuxing is relatively easy to do, since it's all about concatenating packets.
The steps are:
- Open the input video file.
- Open the output video file.
- Demux the input video file: read packets (av_read_frame) from the input video file in a while loop.
- Process the packet: rescale the timestamps of the packet, and fix any discontinuities.
- Mux the packets: write the packets to the output video file (av_interleaved_write_frame).
- Close the input and output video files. Clean up everything.
The example given by the FFmpeg documentation is quite accurate (minus the mpegts discontinuities fix).
I won't go too much into details, but here is the nice stuff that Zig improved:
- No more goto, no more dangling ret. Zig's defer is powerful enough to handle most of the memory management:
  // Using Zig's allocator
  const stream_mapping = try allocator.alloc(i64, stream_mapping_size);
  defer allocator.free(stream_mapping);

  // Using C's (libav) allocator
  var enc_ctx = c.avcodec_alloc_context3(enc);
  if (enc_ctx == null) {
      // Error handling
  }
  defer c.avcodec_free_context(&enc_ctx);
- Slightly object-oriented, to bind multiple lifecycles into one:
  const Context = struct {
      stream_mapping: []i64,
      dts_offset: []i64,
      // ...
      allocator: std.mem.Allocator,

      fn init(allocator: std.mem.Allocator, size: usize) !Context {
          return .{
              .stream_mapping = try allocator.alloc(i64, size),
              .dts_offset = try allocator.alloc(i64, size),
              // ...
              .allocator = allocator,
          };
      }

      fn deinit(self: *Context) void {
          self.allocator.free(self.stream_mapping);
          self.allocator.free(self.dts_offset);
          // ...
      }
  };
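A complete, testable variant of this pattern could look like the sketch below. It is a simplified assumption of what such a Context might contain (only two buffers), and it adds an errdefer so that a failure of the second allocation does not leak the first one.

```zig
const std = @import("std");

const Context = struct {
    stream_mapping: []i64,
    dts_offset: []i64,
    allocator: std.mem.Allocator,

    fn init(allocator: std.mem.Allocator, size: usize) !Context {
        const stream_mapping = try allocator.alloc(i64, size);
        // Undo the first allocation if the second one fails.
        errdefer allocator.free(stream_mapping);
        const dts_offset = try allocator.alloc(i64, size);

        return .{
            .stream_mapping = stream_mapping,
            .dts_offset = dts_offset,
            .allocator = allocator,
        };
    }

    fn deinit(self: *Context) void {
        self.allocator.free(self.stream_mapping);
        self.allocator.free(self.dts_offset);
    }
};

test "one defer releases every buffer" {
    // std.testing.allocator reports a failure if deinit forgets a buffer.
    var ctx = try Context.init(std.testing.allocator, 4);
    defer ctx.deinit();

    @memset(ctx.stream_mapping, -1);
    try std.testing.expectEqual(@as(usize, 4), ctx.dts_offset.len);
}
```

This is where errdefer fits well: inside an init function that only hands over ownership on success.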
Now, let's develop the transcoding side of the program.
Developing the Transcoder
Transcoding a video adds 6 steps:
- Initializing the decoder.
- Initializing the encoder.
- Add a while loop to decode frames.
- Add a while loop to encode frames.
- Flush the encoder.
- Flush the decoder.
The steps also include fixing the timestamps and the frame rate.
You can pretty much use the example given by the FFmpeg documentation (minus the filters).
The code looks like this:
// .. First loop, in the remuxer
while (true) {
    ret = c.av_read_frame(ifmt_ctx, pkt);
    if (ret < 0) {
        // No more packets
        break;
    }
    defer c.av_packet_unref(pkt);

    const in_stream_index = @as(usize, @intCast(pkt.stream_index));

    // Packet is blacklisted
    if (in_stream_index >= stream_mapping_size or stream_mapping[in_stream_index] < 0) {
        continue;
    }
    const out_stream_index = @as(usize, @intCast(stream_mapping[in_stream_index]));
    pkt.stream_index = @as(c_int, @intCast(out_stream_index));

    const stream_ctx = stream_ctxs[out_stream_index];

    try stream_ctx.fix_discontinuity_ts(pkt);

    // Input to decoder timebase
    try stream_ctx.transcode_write_frame(pkt);
} // while packets.

// ...

// Second loop, in the decoder (stream_ctx)
fn transcode_write_frame(self: StreamContext, pkt: ?*c.AVPacket) !void {
    // Send packet to decoder
    try self.decoder.send_packet(pkt);

    while (true) {
        // Fetch decoded frame from decoded packet
        const frame = self.decoder.receive_frame() catch |e| switch (e) {
            AVError.EAGAIN => return,
            AVError.EOF => return,
            else => return e,
        };
        defer c.av_frame_unref(frame);

        frame.*.pts = frame.*.best_effort_timestamp;

        if (frame.*.pts != c.AV_NOPTS_VALUE) {
            frame.*.pts = c.av_rescale_q(frame.*.pts, self.decoder.dec_ctx.?.*.pkt_timebase, self.encoder.enc_ctx.?.*.time_base);
        }

        try self.encode_write_frame(frame);
    }
}

// Third loop, in the encoder (also in stream_ctx)
fn encode_write_frame(self: StreamContext, dec_frame: ?*c.AVFrame) !void {
    self.encoder.unref_pkt();

    try self.encoder.send_frame(dec_frame);

    while (true) {
        // Read encoded data from the encoder.
        var pkt = self.encoder.receive_packet() catch |e| switch (e) {
            AVError.EAGAIN => return,
            AVError.EOF => return,
            else => return e,
        };

        // Remux the packet
        pkt.stream_index = @as(c_int, @intCast(self.stream_index));

        // Encoder to output timebase
        c.av_packet_rescale_ts(pkt, self.encoder.enc_ctx.?.*.time_base, self.out_stream.*.time_base);

        try self.fix_monotonic_ts(pkt);

        // Write packet
        const ret = c.av_interleaved_write_frame(self.ofmt_ctx, pkt);
        if (ret < 0) {
            err.print("av_interleaved_write_frame", ret);
            return ret_to_error(ret);
        }
    }
}
Or in simple words:
- Read a packet from the input video file by calling av_read_frame.
- Send packet to the decoder by calling avcodec_send_packet.
- Receive a frame from the decoder by calling avcodec_receive_frame.
- Send frame to the encoder by calling avcodec_send_frame.
- Receive a packet from the encoder by calling avcodec_receive_packet.
- Write the packet to the output video file by calling av_interleaved_write_frame.
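The send/receive pattern in these steps can be sketched without FFmpeg. Everything below is hypothetical: `FakeDecoder` stands in for a libav decoder, `retToError` is one way a helper like ret_to_error might map C return codes to Zig errors, and the numeric codes are assumptions (EAGAIN = 11 as on Linux, and FFmpeg's usual AVERROR_EOF value).

```zig
const std = @import("std");

// Hypothetical error set mirroring the AVError names used above.
const AVError = error{ EAGAIN, EOF, Unknown };

// Assumed values: -EAGAIN with EAGAIN = 11 (Linux), and FFmpeg's AVERROR_EOF.
const AVERROR_EAGAIN: c_int = -11;
const AVERROR_EOF: c_int = -541478725;

// Hypothetical helper: map a negative C return code to a Zig error.
fn retToError(ret: c_int) AVError {
    return switch (ret) {
        AVERROR_EAGAIN => AVError.EAGAIN,
        AVERROR_EOF => AVError.EOF,
        else => AVError.Unknown,
    };
}

// Stand-in for a decoder with a C-style API: 0 on success, negative code otherwise.
const FakeDecoder = struct {
    pending: u32 = 0,

    fn sendPacket(self: *FakeDecoder) void {
        self.pending += 2; // pretend every packet decodes into two frames
    }

    fn receiveFrame(self: *FakeDecoder) c_int {
        if (self.pending == 0) return AVERROR_EAGAIN;
        self.pending -= 1;
        return 0;
    }
};

// Drain loop: receive until the decoder asks for more input or reaches the end.
fn drain(dec: *FakeDecoder) AVError!u32 {
    var frames: u32 = 0;
    while (true) {
        const ret = dec.receiveFrame();
        if (ret < 0) {
            switch (retToError(ret)) {
                AVError.EAGAIN, AVError.EOF => return frames,
                else => |e| return e,
            }
        }
        frames += 1;
    }
}

test "drain receives every pending frame" {
    var dec = FakeDecoder{};
    dec.sendPacket();
    dec.sendPacket();
    try std.testing.expectEqual(@as(u32, 4), try drain(&dec));
}
```

EAGAIN and EOF are not failures here: they simply mean "nothing more to receive for now", which is why both end the loop quietly.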
And that's it! You have a video transcoder in Zig. (You'll also need to fix the timestamps and discontinuities, but that's another story).
Last part: the build system
The build.zig file that I've given earlier is quite enough to build the project. Zig automatically statically links the libraries, making the executable portable.
Oh, and you'll need to fork SVT-AV1 to enable the static flag:
# svt-av1-9999.ebuild, using Gentoo's Portage to build the package

multilib_src_configure() {
    append-ldflags -Wl,-z,noexecstack

    local mycmakeargs=(
        -DBUILD_TESTING=OFF
        -DCMAKE_OUTPUT_DIRECTORY="${BUILD_DIR}"
        -DBUILD_SHARED_LIBS="$(usex static-libs OFF ON)" # Enable static libraries based on the USE flag "static-libs"
    )

    [[ ${ABI} != amd64 ]] && mycmakeargs+=(-DCOMPILE_C_ONLY=ON)

    cmake_src_configure
}
However, I have one MAJOR issue: the libc is not statically linked, which means I'm unable to create a distroless Docker image. It would have been nice to have a preferred_link_mode for the libc.
When disabling link_libc, the executable does not compile since symbols are missing (even with linkSystemLibrary2("c", .{ .preferred_link_mode = .static })). Normally, Zig automatically links the libc statically, but it seems that this isn't the case here.
To force the static linking, you can enable .linkage = .static in the addExecutable function. And instead of using linkSystemLibrary2, you can use addObjectFile. The issue with this technique is that you have to do everything manually, and you cannot use pkg-config to find the missing includes and libraries:
const std = @import("std");

pub fn build(b: *std.Build) void {
    const target = b.standardTargetOptions(.{});
    const optimize = b.standardOptimizeOption(.{});

    const exe = b.addExecutable(.{
        .name = "av1-transcoder",
        .root_source_file = b.path("src/main.zig"),
        .target = target,
        .optimize = optimize,
        .linkage = .static,
        .link_libc = true,
    });

    exe.addIncludePath(.{
        .src_path = .{ .owner = b, .sub_path = "src" },
    });
    exe.addIncludePath(.{
        .src_path = .{ .owner = b, .sub_path = "/usr/include" },
    });
    exe.addObjectFile(.{ .src_path = .{
        .owner = b,
        .sub_path = "/usr/lib/libavcodec.a",
    } });
    exe.addObjectFile(.{ .src_path = .{
        .owner = b,
        .sub_path = "/usr/lib/libavutil.a",
    } });
    exe.addObjectFile(.{ .src_path = .{
        .owner = b,
        .sub_path = "/usr/lib/libavformat.a",
    } });
    exe.addObjectFile(.{ .src_path = .{
        .owner = b,
        .sub_path = "/usr/lib/libswresample.a",
    } });
    exe.addObjectFile(.{ .src_path = .{
        .owner = b,
        .sub_path = "/usr/lib/libSvtAv1Dec.a",
    } });
    exe.addObjectFile(.{ .src_path = .{
        .owner = b,
        .sub_path = "/usr/lib/libSvtAv1Enc.a",
    } });

    b.installArtifact(exe);
}
At this point, the generated artifact is a static executable:
$ ldd ./zig-out/bin/av1-transcoder
ldd: ./zig-out/bin/av1-transcoder: Not a valid dynamic program
Yay!
Small issue: Because the paths are hardcoded, it will be quite difficult to cross-compile the project at the moment.
Conclusion
Zig C interop is almost impeccable, and at least way better than Go's. Symbols are directly translated to Zig, and the memory management is quite easy to handle. The Zig API plugs well with the C API, making C code slightly safer.
However, the build system, while quite powerful, lacks flexibility around the libc linking and pkg-config support. Perhaps sticking to a Makefile would be better for now.
Overall, Zig presents some potential for low-level programming, or at least for dynamic libraries development. The features and syntax complement C very well. But, I would still recommend C++ if you want to develop stable and production-ready software. While C++ is complex due to the richness of the language, C++ offers its own kind of safety (smart pointers), which can help you avoid memory leaks and dangling pointers.
Lastly, because Zig is still in development, it lacks high-level libraries and frameworks, which limits the use of Zig in production (no gRPC).
So, to conclude, I will use Zig for competitive programming and for basic APIs like my AV1 transcoder bot. But for production, I will stick to Go.