Troubleshooting GCC Fatal Errors with act and Docker
Hey guys, ever run into a brick wall when trying to get your CI/CD pipelines humming? I recently wrestled with a particularly nasty issue involving `act` and some `gcc` fatal errors, and I figured I'd share my experience. It might save you some headaches down the road.
The Setup: act, CETL, and Docker
First off, let's set the stage. I was working on a project called CETL, which is part of the OpenCyphal ecosystem. CETL uses build containers defined in the docker_toolchains repository. I wanted to use `act` to simulate my GitHub Actions workflows locally; `act` is a fantastic tool that runs your GitHub Actions workflows locally using Docker.
I'm on an M3 Max laptop, and I can manually build and run the containers just fine. The problem came when I tried to do the exact same thing with `act`. Same repository, same Docker Desktop instance, same everything; yet when I ran the build through `act`, `gcc` crashed inexplicably.
Here's a snapshot of the problem. When `act` tried to compile the code using `g++`, the compiler was terminated by a `Killed` signal, which led to a cascade of compilation failures like this:
```
FAILED: suites/unittest/CMakeFiles/test_pf17_variant_ctor_3__googletest_objlib.dir/Release/test_pf17_variant_ctor_3.cpp.o
/usr/bin/g++ -DCETL_ENABLE_DEBUG_ASSERT=0 -DCETL_VERSION=\"0.0.0\" -D_LIBCPP_HARDENING_MODE=_LIBCPP_HARDENING_MODE_FAST -DCMAKE_INTDIR=\"Release\" -I/Users/thirtytwobits/workspace/github/thirtytwobits/CETL/cetlvast/include -I/Users/thirtytwobits/workspace/github/thirtytwobits/CETL/include -I/Users/thirtytwobits/workspace/github/thirtytwobits/CETL/cetlvast/build_external/o1heap/o1heap -isystem /Users/thirtytwobits/workspace/github/thirtytwobits/CETL/cetlvast/build_external/googletest/googletest/include -isystem /Users/thirtytwobits/workspace/github/thirtytwobits/CETL/cetlvast/build_external/googletest/googletest -isystem /Users/thirtytwobits/workspace/github/thirtytwobits/CETL/cetlvast/build_external/googletest/googlemock/include -isystem /Users/thirtytwobits/workspace/github/thirtytwobits/CETL/cetlvast/build_external/googletest/googlemock -O3 -DNDEBUG -std=c++17 -pedantic -Wall -Wextra -Werror -Wfloat-equal -Wconversion -Wunused-parameter -Wunused-variable -Wunused-value -Wcast-align -Wmissing-declarations -Wmissing-field-initializers -Wdouble-promotion -Wswitch-enum -Wtype-limits -Wno-error=array-bounds -O3 -fno-delete-null-pointer-checks -Wsign-conversion -Wsign-promo -Wold-style-cast -Wzero-as-null-pointer-constant -Wnon-virtual-dtor -Woverloaded-virtual -MD -MT suites/unittest/CMakeFiles/test_pf17_variant_ctor_3__googletest_objlib.dir/Release/test_pf17_variant_ctor_3.cpp.o -MF suites/unittest/CMakeFiles/test_pf17_variant_ctor_3__googletest_objlib.dir/Release/test_pf17_variant_ctor_3.cpp.o.d -o suites/unittest/CMakeFiles/test_pf17_variant_ctor_3__googletest_objlib.dir/Release/test_pf17_variant_ctor_3.cpp.o -c /Users/thirtytwobits/workspace/github/thirtytwobits/CETL/cetlvast/suites/unittest/test_pf17_variant_ctor_3.cpp
g++: fatal error: Killed signal terminated program cc1plus
compilation terminated.
[7/179] Building CXX object suites/unittest/CMakeFiles/test_pf17_variant_assignment_2__googletest_objlib.dir/Release/test_pf17_variant_assignment_2.cpp.o
```
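As an aside, `Killed signal terminated program cc1plus` almost always means the kernel's OOM killer terminated the compiler because the build ran out of memory. A quick way to confirm this pattern is to capture the build log and count such failures; a minimal sketch (the here-doc excerpt is just the two error lines reproduced from the failure above):

```shell
# Simulated capture of the failing build output; in practice you'd use
# something like: act -v ... 2>&1 | tee act-build.log
cat > act-build.log <<'EOF'
g++: fatal error: Killed signal terminated program cc1plus
compilation terminated.
EOF

# Count compiler processes killed out from under g++ (a telltale OOM symptom)
grep -c 'Killed signal' act-build.log
```

If this count is nonzero, the memory-allocation checks discussed later are a good place to start.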
The Culprit: Container Mismatch?
The core problem appeared to be that `act` wasn't quite using the containers I was specifying in my `action.yml` file, which defines the container that should be used for the builds:
```yaml
runs-on: ubuntu-latest
container: ghcr.io/opencyphal/toolshed:ts24.4.3
```
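For context, here is roughly where those keys sit in a full workflow job. This is a hedged sketch of my own; the job name and steps are hypothetical and not taken from the CETL repository:

```yaml
# Hypothetical workflow sketch; job name and steps are illustrative only.
jobs:
  verification-amd64:
    runs-on: ubuntu-latest
    container: ghcr.io/opencyphal/toolshed:ts24.4.3   # the toolchain image
    steps:
      - uses: actions/checkout@v4
      - run: cmake --build build
```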
This tells GitHub Actions to use a specific container image which, in turn, contains the required toolchains and dependencies for my project. However, `act` wasn't behaving as expected, so it seemed this container wasn't being properly utilized. The crashes strongly suggested that the build environment within `act` was not the one I had set up, and the mismatch was causing `gcc` to choke, likely due to missing dependencies or incompatible tool versions.
Digging Deeper: Investigation and Debugging
To get to the bottom of this, I needed to confirm whether `act` was correctly pulling and using the specified Docker image (`ghcr.io/opencyphal/toolshed:ts24.4.3`). Here are the steps I took:
- Verify Docker setup: I made sure that Docker Desktop was running smoothly on my machine and that I could pull and run the container image manually using `docker run`. This confirmed that the image itself was valid and accessible.
- Verbose mode in act: I ran `act` with the `-v` (verbose) flag. This provided detailed output, letting me see which Docker images `act` was attempting to use and any errors along the way. It proved very helpful in identifying exactly what `act` was doing.
- Check environment variables: I reviewed my `actrc` file (located at `~/Library/Application Support/act/actrc`) and my environment variables. Misconfigurations here can mess up how `act` interacts with Docker, so I double-checked that all my container definitions were correct.
- Inspect the build process: I studied the detailed build logs produced by `act` to pinpoint exactly where the `gcc` errors were occurring. This information was key to understanding the underlying issue.
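To make the verbose-mode step concrete: one way to sanity-check the output is to capture it and grep for the image reference. The `DEBU` line below is a simulated stand-in for `act`'s actual log format, so treat this as illustrative rather than an exact transcript:

```shell
# In practice: act -v -j <job-name> 2>&1 | tee act.log
# Simulated verbose output for illustration:
cat > act.log <<'EOF'
DEBU[0000] pulling image=ghcr.io/opencyphal/toolshed:ts24.4.3 platform=
EOF

# Extract which toolshed image act claims to be pulling
grep -o 'ghcr.io/opencyphal/toolshed:[^ ]*' act.log
```

If the tag printed here doesn't match what your workflow specifies, you've found your mismatch.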
Potential Solutions and Workarounds
After a fair bit of head-scratching, I identified a few areas worth investigating; each is a potential fix for the problems described above.
- Ensure container architecture compatibility: When using the `--container-architecture` flag, be sure to specify the correct architecture (e.g., `linux/amd64`, `linux/arm64`). In my case this seemed to make a difference, as it forces `act` to use the right architecture. I tried the following combinations:

  ```shell
  act -v -j verification-arm64 release --container-architecture linux/amd64
  act -v -j verification-amd64 release --container-architecture linux/amd64
  act -v -j verification-arm64 release --container-architecture linux/arm64
  act -v -j verification-amd64 release --container-architecture linux/arm64
  ```

  This ensures that `act` pulls and runs a container of the expected architecture.
- Act configuration: Double-check your `actrc` file and any environment variables. Make sure there aren't any conflicting settings that could be interfering with the container selection.
- Clean up: Sometimes Docker can be a bit finicky. Try cleaning up your Docker environment by removing unused images and containers with a command like `docker system prune -a`. This can help prevent conflicts and ensure `act` is working with a clean slate.
- Check for resource limitations: Ensure that your Docker Desktop setup has sufficient resources allocated (CPU, memory). Insufficient resources can lead to compilation failures and other issues, especially with complex builds.
- Update act: Make sure you're running the latest version of `act`. Older versions might have bugs related to container handling.
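To avoid fat-fingering the architecture flag, a tiny helper can derive the platform string from the machine name reported by `uname -m`. This is an illustrative shell sketch of my own, not part of `act` itself:

```shell
# Map a machine name (as printed by `uname -m`) to the platform string
# that act's --container-architecture flag expects. Illustrative helper.
arch_flag() {
  case "$1" in
    arm64|aarch64) echo "linux/arm64" ;;
    x86_64|amd64)  echo "linux/amd64" ;;
    *)             echo "unknown" ;;
  esac
}

arch_flag "$(uname -m)"
```

You could then run, e.g., `act -v -j verification-arm64 release --container-architecture "$(arch_flag "$(uname -m)")"` to always match the host.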
The Resolution (and What Worked for Me)
In the end, the key to resolving the issue was a combination of these steps. I made sure that:
- the correct container architecture was specified using the `--container-architecture` flag,
- Docker Desktop had sufficient resources allocated, and
- I was running the latest version of `act`.
By focusing on these key areas, I was able to get `act` to play nicely with my build containers. The `gcc` errors disappeared, and my workflows ran successfully.
Conclusion: Sharing the Pain (and the Fix!)
Hopefully, this breakdown helps you if you run into similar issues with `act` and Docker. The key takeaways are:
- Verify container usage: Always double-check that `act` is using the correct container image and architecture.
- Use verbose mode: The `-v` flag is your friend! It provides valuable insight into what's going on under the hood.
- Resource management: Ensure Docker has enough resources.
Debugging CI/CD pipelines can be a real pain, but with a systematic approach, you can track down these issues. Good luck, and happy coding!