Üllar Seerme
Using UPX for compression might work against you
May 30, 2023
So I’m reading Hacker News, as one does, about starting a Go project in 2023, and one commenter advises against using UPX (as does the original author, to a degree). I only have a passing familiarity with UPX, from using it in one of my Go-based pet projects, Wishlist Lite. It’s not that the binary’s size was of any real significance to begin with, but it seemed like low enough hanging fruit that I would be foolish not to go for it. That commenter made me reconsider, though. Here’s why.
Memory usage
While UPX succeeded in shrinking the binary from 3.2MB (built with CGO_ENABLED=0 go build -ldflags="-s -w" -trimpath) to 1.3MB after upx --best, it has the hidden cost of increasing memory usage, in both single-instance and multiple-instance scenarios. To quickly validate this I relied on ‘ps_mem’ and used the following steps:
- build the application (and compress it with UPX once the regular build has been measured);
- run two instances of it;
- grab the memory usage statistics for both processes;
- compare results.
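The steps above can be roughed out as a small shell helper. This is only a sketch: measure_instances is a made-up name, and it assumes the program is happy to start in the background without its own terminal, which a TUI may well not be. If it isn't, start the instances by hand in separate terminals and keep only the ps_mem line.

```shell
# Sketch only: start COUNT instances of a command, sample them with ps_mem,
# then clean up. measure_instances is a hypothetical helper name.
measure_instances() {
    count=$1           # how many instances to start
    shift              # the rest is the command to run, e.g. ./wishlistlite
    pids=""

    i=0
    while [ "$i" -lt "$count" ]; do
        "$@" &
        pids="${pids:+$pids,}$!"   # build a comma-separated PID list
        i=$((i + 1))
    done

    sleep 1            # assumption: one second is enough for start-up to settle
    ps_mem -p "$pids"

    # Stop the instances again and reap them quietly.
    kill $(echo "$pids" | tr ',' ' ')
    wait $(echo "$pids" | tr ',' ' ') 2>/dev/null || true
}

# Usage (regular build first, then the UPX-compressed one):
#   measure_instances 2 ./wishlistlite
#   measure_instances 2 ./wishlistliteupx
```

Passing the collected PIDs straight to ps_mem, instead of going through pgrep, avoids picking up unrelated processes that happen to share the name.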
For the regular build (using the command above):
$ ps_mem -p $(pgrep wishlistlite -d ",")
Private + Shared = RAM used Program
4.5 MiB + 2.6 MiB = 7.2 MiB wishlistlite (2)
---------------------------------
7.2 MiB
=================================
The Shared column indicates that some amount of memory is shared between the two instances. In a single instance scenario the Shared value is, as expected, low:
$ ps_mem -p $(pgrep wishlistlite -d ",")
Private + Shared = RAM used Program
4.9 MiB + 0.5 KiB = 4.9 MiB wishlistlite
---------------------------------
4.9 MiB
=================================
Now, after compressing with UPX in a single instance scenario:
$ ps_mem -p $(pgrep wishlistliteupx -d ",")
Private + Shared = RAM used Program
5.3 MiB + 0.5 KiB = 5.3 MiB wishlistlite
---------------------------------
5.3 MiB
=================================
The memory usage has gone up somewhat, which isn’t all that noticeable for an executable that is already this small, but crank up the instance count by just one and the difference becomes more stark:
$ ps_mem -p $(pgrep wishlistliteupx -d ",")
Private + Shared = RAM used Program
10.5 MiB + 1.0 KiB = 10.5 MiB wishlistlite (2)
---------------------------------
10.5 MiB
=================================
There is now almost no shared memory between the two instances, as each one has to unpack its own complete copy of the program into memory in order to work. Two instances now take 10.5 MiB instead of 7.2 MiB, roughly 46% more. This is still all very trivial in terms of memory usage given the times that we’re living in, and Wishlist Lite is an exceedingly simple application as well, but as evidenced the overhead can add up quite quickly as the instance count grows.
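To double-check the private versus shared split without ps_mem, the per-mapping counters that the Linux kernel exposes in /proc/PID/smaps can be summed directly. A minimal sketch, assuming a Linux system; smaps_summary is a made-up helper name, and its accounting is not necessarily identical to how ps_mem arrives at its Shared column.

```shell
# Sketch only: add up the Private_* and Shared_* counters (reported in kB)
# from a smaps file such as /proc/PID/smaps. smaps_summary is hypothetical.
smaps_summary() {
    awk '
        /^Private_(Clean|Dirty):/ { priv += $2 }   # pages only this process maps
        /^Shared_(Clean|Dirty):/  { shr  += $2 }   # pages mapped by others too
        END { printf "private_kib=%d shared_kib=%d\n", priv, shr }
    ' "$1"
}

# Usage: compare one running instance of each build, e.g.
#   smaps_summary /proc/<pid>/smaps
```

If the explanation above holds, the shared figure for the compressed build should stay close to zero no matter how many instances are running.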
Start-up speed
Start-up speed was also mentioned, which is something I’m extremely picky about, especially for command-line programs, so I wanted to investigate that too. Since Wishlist Lite is a terminal user interface, it just sits there displaying the interface until the user does something. To get around that I added a simple quitting procedure right after the first draw of the interface, which, in my eyes at least, simulates a complete start-up of the application and can be repeated rapidly N times.
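The "repeat it N times" part can be roughed out as a tiny shell helper. This is just a sketch: repeat_run is a made-up name, and it assumes the binary exits on its own (hence the quit-after-first-draw change).

```shell
# Sketch only: run a command COUNT times in a row, stopping at the first
# failure. repeat_run is a hypothetical helper name.
repeat_run() {
    count=$1
    shift              # the rest is the command to run
    i=0
    while [ "$i" -lt "$count" ]; do
        "$@" || return 1
        i=$((i + 1))
    done
}

# Usage (in bash, where the time keyword can wrap a shell function):
#   time repeat_run 10 ./wishlistlite
```

This only gives a total wall-clock time for all runs, so it is a crude first look rather than a proper measurement.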
The quickest way for me to measure things these days is a simple alias that relies on the ‘perf’ command, and the difference between the two executables is apparent. First, here’s the regular build:
$ perf stat --null --table --repeat 10 ./wishlistlite
Performance counter st