Leaking file handles stalling the sound server
I've been having an issue where LibreWolf (Firefox) doesn't fail over correctly when the sound server has problems, so audio stays dead until the browser is restarted.
The first thing I did was check whether it's reproducible:
$ pactl subscribe
Event 'new' on client #17163 //librewolf started
Event 'new' on sink-input #17164 //audio starts playing
Event 'change' on sink-input #17164
Event 'change' on sink-input #17164
Event 'change' on sink-input #17164
Event 'change' on sink-input #17164
Event 'change' on sink-input #17164
Event 'change' on sink-input #17164
Event 'change' on sink-input #17164
Event 'remove' on sink-input #17164 //second media source started, audio is now permanently dead until librewolf is restarted
Event 'remove' on client #17163 //librewolf closed
It is extremely reproducible and appears to stall the entire sound server (pactl list fails when this happens).
I narrowed it down to too many audio streams being started at once, and pipewire-pulse reported:
$ journalctl -xe | grep pipewire
Jun 14 21:30:10 hostname pipewire-pulse[2319]: mod.adapter: can't load spa node: Too many open files
Jun 14 21:30:10 hostname pipewire-pulse[2319]: mod.adapter: usage: node.name=<string>
Jun 14 21:30:10 hostname pipewire-pulse[2319]: pw.resource: usage: node.name=<string>
Jun 14 21:30:10 hostname pipewire-pulse[2319]: pw.stream: 0x5a6b9ff3fd80: can't make node: Invalid argument
Jun 15 14:16:21 hostname pipewire-pulse[2319]: mod.protocol-pulse: 0x5a6b99bc8b40: failed to connect client: Too many open files
Jun 15 14:16:21 hostname pipewire-pulse[2319]: mod.protocol-pulse: client 0x5a6b9fc01d40 [LibreWolf]: ERROR command:9 (SET_CLIENT_NAME) tag:1 error:10 (Too many open files)
$ journalctl -xe | grep wireplumber
Jun 14 21:12:33 hostname wireplumber[1288]: <WpAsyncEventHook:0x5b215678f750> failed: <WpSiStandardLink:0x5b21569d4fa0> link failed: some node was destroyed before the link was created
#There are a lot of these, removed for brevity
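To see which process is actually running out of descriptors, the journal can be tallied instead of read by eye. This is only a sketch: count_emfile is a name I made up, and it assumes the default short journal format shown above, where the fifth field is "name[pid]:".

```shell
# Count "Too many open files" lines per logging process.
# Reads journal text on stdin; assumes field 5 is "name[pid]:".
count_emfile() {
    grep -F 'Too many open files' |
        awk '{ n[$5]++ } END { for (p in n) print n[p], p }' |
        sort -rn
}
```

Something like `journalctl -b --no-pager | count_emfile` would then print one line per process, most errors first.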
Alright, so something is leaking way too many file handles.
$ cat /proc/sys/fs/file-max
9223372036854775807
$ cat /proc/sys/fs/nr_open
1073741816
The issue doesn't appear to be at the system level; those limits are sane.
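Those two values are system-wide ceilings, though. "Too many open files" is the message for EMFILE, which is the per-process limit (the system-wide one, ENFILE, reads "Too many open files in system"), so the number worth checking is the soft limit of the pipewire-pulse process itself, which is often far lower. A rough sketch, with fd_usage being a helper name I made up and /proc assumed to be available:

```shell
# Print "<open fds> <soft limit>" for a PID, read from /proc (Linux only).
fd_usage() {
    pid=$1
    open=$(ls "/proc/$pid/fd" | wc -l)
    # Line looks like: "Max open files  <soft>  <hard>  files"
    soft=$(awk '/^Max open files/ { print $4 }' "/proc/$pid/limits")
    echo "$open $soft"
}
```

Running `fd_usage "$(pidof pipewire-pulse)"` shows how close the process is to its own ceiling.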
$ pactl list
output:
I don't see any obvious problems here.
$ lsof -p `pidof pipewire-pulse` | wc -l
1402
$ lsof -p `pidof pipewire-pulse`
That does look like a potential problem. Note how there are tons of DEL entries just being held.
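To put numbers on that, the lsof output can be grouped by its FD column. In lsof, DEL in that column marks a memory mapping of a file that has since been deleted; a mapping on its own is not a descriptor, so the numbered entries are the ones to compare against the open-files limit. A sketch, assuming the default lsof column layout (FD is field 4); lsof_summary is a made-up name:

```shell
# Summarise `lsof -p PID` output on stdin by kind of entry.
# Assumes the default column layout, where field 4 is the FD column.
lsof_summary() {
    awk 'NR > 1 {
        kind = $4
        if (kind ~ /^[0-9]+/) kind = "fd"   # numbered descriptors such as 12u
        count[kind]++
    }
    END { for (k in count) print count[k], k }' | sort -rn
}
```

Usage would be `lsof -p "$(pidof pipewire-pulse)" | lsof_summary`, which prints one count per kind (DEL, fd, mem, cwd and so on).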
Quitting pavucontrol immediately dropped the count to 81. Starting some new audio streams and closing them while pavucontrol was running ballooned it to 241, and quitting pavucontrol again dropped it back down to 107:
We can therefore conclude that pavucontrol, which I like to keep open to manage program sounds on the fly, holds onto file handles long after they should have been released, and is therefore leaking them. 1402 is still nowhere near the system-wide limits, but it is probably near some PipeWire limit; even if that limit were raised, the problem would only take longer to manifest.
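The before/after counts are easier to compare if the descriptor count is sampled continuously while pavucontrol is opened and closed. A minimal sketch, assuming Linux /proc; watch_fds is a hypothetical helper, not an existing tool:

```shell
# Print "<HH:MM:SS> <open fds>" for a PID, <count> times, <interval> seconds apart.
watch_fds() {
    pid=$1 count=$2 interval=$3
    i=0
    while [ "$i" -lt "$count" ]; do
        printf '%s %s\n' "$(date +%T)" "$(ls "/proc/$pid/fd" | wc -l)"
        i=$((i + 1))
        # Skip the sleep after the final sample.
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
    done
}
```

For example, `watch_fds "$(pidof pipewire-pulse)" 60 1` in one terminal while starting and quitting pavucontrol in another should show the count climbing and falling in step.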
Just starting pavucontrol almost doubles the number of file handles, but the leaked ones are held forever: