CI: work around LeakSanitizer crashes with use_tls=0

Pekka Paalanen requested to merge pq/weston:mr/leaksanity into main

Without this fix, CI has been failing intermittently because LeakSanitizer itself crashes after all the tests in a program have succeeded. This has been happening for a long time, but !1486 (merged) made it reliably reproducible in the job x86_64-debian-full-build (and no other job), in the test-subsurface-shot program.

--- Fixture 2 (GL) ok: passed 4, skipped 0, failed 0, total 4
Tracer caught signal 11: addr=0x1b8 pc=0x7f6b3ba640f0 sp=0x7f6b2cc77d10
==489==LeakSanitizer has encountered a fatal error.

I was also able to get a core file after twiddling, but there it ended up with lsan aborting itself rather than a segfault.

We got some clues that use_tls=0 might work around this, from https://github.com/google/sanitizers/issues/1342 and https://github.com/google/sanitizers/issues/1409 and some other projects that have cargo-culted the same workaround.

Using that causes more false leaks to appear, so they need to be suppressed. I suppose we are not interested in catching leaks in glib-using code, so I opted to suppress g_malloc0 altogether. Pinpointing it better might have required much slower stack tracing.

wl_shm_buffer_begin_access() uses TLS, so no wonder it gets flagged.

ld-*.so is simply uninteresting to us, and it got flagged too.
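
For illustration, here is a minimal sketch of how the option and the three suppressions described above could be wired together by hand. It is not the change in this merge request: the temporary file and the `supp_file` variable are hypothetical, and the actual commit may place the option and the suppression list elsewhere. Only `LSAN_OPTIONS`, the `use_tls` and `suppressions` flags, and the `leak:` pattern syntax come from LeakSanitizer itself.

```shell
# Hypothetical location for the suppression list; the real file in the
# commit may live elsewhere and have a different name.
supp_file="$(mktemp)"

# Each "leak:" line suppresses leak reports whose stack trace contains a
# matching function, source file or module name.
cat > "$supp_file" <<'EOF'
# glib-using code: suppressed wholesale
leak:g_malloc0
# uses TLS, so it gets flagged once TLS is no longer scanned
leak:wl_shm_buffer_begin_access
# dynamic loader internals, uninteresting here
leak:ld-*.so
EOF

# use_tls=0 tells LeakSanitizer not to treat thread-local storage as part
# of the root set; options are separated by ':'.
export LSAN_OPTIONS="use_tls=0:suppressions=$supp_file"
```

Any test program started from the same shell afterwards would pick up these settings. Because TLS is no longer scanned, allocations reachable only through TLS are reported as leaks, which is why the extra suppressions are needed.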

Since this might have been fixed already in LeakSanitizer upstream, who knows, leave some notes to revisit this when we upgrade that in CI.

This fix seems to make the branch of !1486 (merged) pass in my quick testing.

Suggested-by: @derekf


I have tested this in the https://gitlab.freedesktop.org/pq/weston/-/tree/wip/fragshort-debug?ref_type=heads branch on top of !1486 (merged) twice and it worked. First time there were two infrastructure failures, but a retry helped those. Then I retried all the test jobs, and all succeeded.

This seems like an acceptable trade-off to me, getting more reliable leak checking at the cost of the added suppressions. I don't know of other disadvantages.
