Tootfinder

Opt-in global Mastodon full text search. Join the index!

@mgorny@social.treehouse.systems
2025-11-09 17:02:23

Yeah, so the GNU #make jobserver protocol is trivial, which can be a blessing and a curse. It puts the job management entirely on the clients, which means that they must reliably return job tokens, or otherwise the jobserver will be left with no jobs available and everything will hang. The make documentation is clear on this:
> Your tool should be sure to write back the tokens it read, even under error conditions. This includes not only errors in your tool but also outside influences such as interrupts (SIGINT), etc. You may want to install signal handlers to manage this write-back.
#NinjaBuild jobserver implementation may not handle this correctly, but fortunately it does. The irony is, it turns out that GNU make does not…
#Gentoo

@mgorny@social.treehouse.systems
2025-12-23 13:11:20

I'm building webkit-gtk right now. It's one of these messy packages where a few source files need a lot of memory to compile, and ninja can randomly order jobs so that all of them suddenly start compiling simultaneously. So to keep things going smoothly without OOM-ing, I've been dynamically adjusting the available job count via steve the #jobserver.
While doing that, I've noticed that ninja isn't taking new jobs immediately after I increased the job count. So I've started debugging steve, and couldn't find out anything wrong with it. Finally, I've looked into ninja and realized how lazy their code is.
So, there are two main approaches to acquiring job tokens. Either you do blocking reads, and therefore wait for a token to become available, or you use polling to get noticed when it becomes available. Ninja instead does non-blocking reads, and if there are no more tokens available… it waits till one of its own jobs finish.
This roughly means that as other processes release tokens, ninja won't take them until one of its own jobs finish. And if ninja didn't manage to acquire any job tokens to begin with, it is just running a single process via implicit slot, and that process finishing provides it with the only chance to acquire additional tokens. So realistically speaking, as long as there are other build jobs running in parallel, ninja is going to need to be incredibly lucky to ever get a job token, since all other processes will grab the available tokens immediately.
This isn't something that steve can fix.
#Gentoo #NinjaBuild