On December 3rd, 2021, a friend of mine needed help: their program was crashing on Wine, and they wanted to know how to fix it.
(And I’m only getting around to publishing this now, several months later. I’m a bit disorganized…)
Normally, the answer to this question is not very complicated, because a lot of stuff works just fine in Wine provided the environment is set up correctly, and some stuff simply doesn’t work. Usually problems fall into one of those two categories; however, in this case, there really wasn’t any obvious reason (at least to me) why it shouldn’t work.
The program in question is compiled with MSYS2’s MinGW-w64 package, using GCC 10.3. It contains a few other libraries also compiled with the same toolchain, as DLLs, including libpng and zlib, which I am about to become very familiar with.
Some debugging had already been done, and they knew which call the application was failing in: a call to png_read_info. I ran it under Wine with the WINEDEBUG=+all environment variable set and generated a gigantic log file of mostly useless information. After a bit of correlation, I found the last API call before the crash: an msvcrt._read, returning successfully… and then we crash.
After a bit of misdirection, I realized a crucial detail that I had been glancing over for a bit: the access violation was an execute, not a read or write. That means that RIP is landing in the middle of a page that is not executable. Hmmm. Stack corruption, somehow?
Something interesting about Wine is that you can run it under Valgrind, which, with a few flags, does actually work correctly. But upon doing this, I discovered nothing particularly interesting, certainly nothing that would suggest stack corruption, so I moved on.
At this point I decided to break out rr, a special debugger that can record and replay program execution. Honestly, it’s a bit overkill here, but it does make it easier to analyze crashes, and this seemed like a good excuse to pull it out. There is a bit of trickiness with using rr on top of Wine, but it works more or less just fine; it’s just a bit of a pain to get the replay working. I never quite figured out how to get debug symbols to map correctly with this Wine-under-GDB setup going on, so I had to manually explore the address space to figure out what I was looking at.
After much ado, the program crashes into… nowhere. It crashes at 0x2'fe8f'2910. Nothing is mapped here. Hmm.
Using the magic of rr, I can replay to some point directly before the crash and then step into it. A few hundred stepis later, and I found the culprit: e8 20 45 7e 96, at the address 0x3'6810'e3eb. AKA, CALL 0x2fe8f2910. In other words: there is an explicit CALL to nowhere.
At this point, I threw libpng in a disassembler, and found the instruction at 0x36810e3eb. The instruction?

Bizarre. That CALL has a completely different address. It is e8 78 9e 03 00, not e8 20 45 7e 96, which is nonsense and points backwards to before the entire module.
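To make the difference between the two encodings concrete, here is a minimal sketch that decodes a CALL rel32 by hand. The helper name decode_call_rel32 is my own, and the sketch assumes the plain five-byte E8 form with no prefixes. The bytes and the instruction address are the ones quoted above.

```python
import struct

def decode_call_rel32(address: int, encoding: bytes) -> int:
    """Return the absolute target of a 5-byte x86-64 CALL rel32 instruction."""
    if len(encoding) != 5 or encoding[0] != 0xE8:
        raise ValueError("expected a 5-byte E8 (CALL rel32) encoding")
    # The four bytes after the opcode are a signed little-endian displacement.
    (displacement,) = struct.unpack("<i", encoding[1:])
    # The displacement is relative to the address of the *next* instruction.
    return address + len(encoding) + displacement

CALL_ADDRESS = 0x3_6810_E3EB

# Bytes observed in memory at crash time, and bytes seen in the disassembler.
in_memory = decode_call_rel32(CALL_ADDRESS, bytes.fromhex("e8 20 45 7e 96"))
on_disk = decode_call_rel32(CALL_ADDRESS, bytes.fromhex("e8 78 9e 03 00"))

print(hex(in_memory))  # 0x2fe8f2910
print(hex(on_disk))    # 0x368148268
```

The in-memory displacement decodes to a large negative number, which is why the target lands far below the call site, while the on-disk one is a small forward jump to an address close to the call site.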
So who’s modifying the CALL? Is it the program? Is it libpng? Is it Wine?

One thing we do know about the .text segment is that it’s read-only. Of course, you should at least verify this in your disassembler, but I did, and indeed, it’s read-only. That means that in order to modify the segment, someone would need to deliberately mark it writable. On UNIX-like platforms, you would use a syscall like mprotect, whereas Windows provides VirtualProtect in kernel32. Thankfully, there’s really no way that libpng would link to VirtualPro–

…what exactly is libpng doing calling this?

Apparently, it reaches back to sub_3680F1200, which is just the entry point of the DLL. There’s a stub over at the “true” entry point, but IDA does not count the jmp in the call graph, so you can’t see it here.
In order to try to identify what this code was, I used the tried-and-true strategy of looking for interesting strings, and quickly found a few, but the most interesting was this one: "Unknown pseudo relocation bit size %d". Hrm, what’s a pseudo relocation?
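The string hunt itself is easy to approximate outside a disassembler. Below is a rough sketch of the idea, similar in spirit to the strings utility: scan a blob for runs of printable ASCII. The function name ascii_strings and the sample bytes are made up for illustration; a real run would read the DLL from disk instead.

```python
import re

def ascii_strings(data: bytes, min_length: int = 6) -> list[str]:
    """Return runs of printable ASCII that are at least min_length bytes long."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_length)
    return [match.group().decode("ascii") for match in pattern.finditer(data)]

# Hypothetical stand-in for a chunk of a DLL: text mixed with non-printable bytes.
sample = (
    b"\x00\x01Unknown pseudo relocation bit size %d\x00"
    b"\x90\x90abc\x00VirtualProtect\x00"
)

for text in ascii_strings(sample):
    print(text)
```

Short runs such as "abc" are dropped by the length threshold, which keeps the output down to strings that are likely to be meaningful.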
I’ve mostly glossed over many of the lower level details in this post, but I think this one merits some more attention.
What’s a normal relocation?
Before answering what a pseudo relocation is, I’d like to discuss regular relocations. When a linker links a program module, it has to pick some arbitrary “base address” to use for position-dependent code and data. What does that mean? Let’s say you have a global, statically-initialized variable that is a pointer to another global variable. This is allowed. The pointer written into the executable file at link time is the address that would be correct if the program module were loaded at its preferred base address. Much code and data is position-independent, and thus does not need relocations, but anywhere an absolute address must be written, such as a static pointer, a relocation is needed.
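As a toy illustration of that static pointer, the sketch below builds a fake module image and writes one pointer into it the way a linker would. Every number here (the preferred base, the two offsets, the alternate load address) is hypothetical and chosen only to make the arithmetic visible.

```python
import struct

# All values below are hypothetical, chosen only for illustration.
PREFERRED_BASE = 0x1_4000_0000  # base address the linker assumed
TARGET_RVA = 0x3000             # offset of the pointed-to global in the module
POINTER_RVA = 0x3008            # offset of the pointer variable itself

def link_image(size: int = 0x4000) -> bytearray:
    """Build a toy module image containing one statically initialized pointer."""
    image = bytearray(size)
    # The linker stores the absolute address the target would have
    # if the module were loaded at its preferred base.
    struct.pack_into("<Q", image, POINTER_RVA, PREFERRED_BASE + TARGET_RVA)
    return image

def stored_pointer(image: bytearray) -> int:
    """Read back the 64-bit pointer value baked into the image."""
    (value,) = struct.unpack_from("<Q", image, POINTER_RVA)
    return value

image = link_image()
for actual_base in (PREFERRED_BASE, 0x7FF6_1234_0000):
    wanted = actual_base + TARGET_RVA
    # True only when the module really sits at its preferred base.
    print(hex(actual_base), stored_pointer(image) == wanted)
```

The stored value only matches the real address of the target when the module sits at the base the linker assumed; at any other base it is stale.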
However, being loaded at your preferred base address is somewhat rare these days. For one thing, almost all executable loaders have to support relocating the module to a different base address, because otherwise, it’d be impossible to simultaneously load two modules whose preferred base addresses lead to an overlap, and these cannot be coordinated ahead of time in most cases. In addition, modern AMD64 machines have plenty of address space, so for security reasons, a mitigation called ASLR is almost always used, which essentially just randomizes the base address of program modules even if they are not initially overlapping. (This is a bit of an oversimplification.)
If we move (that is, change the location of) the program module in memory, the addresses that the linker had to write based off of the preferred base address don’t line up, as the module is now at a different address, and all offsets are now shifted by some value. In order to adjust this