Written 2022-12-07, updated 2022-12-07
The first post on this blog was “What’s bad about Julia” – a collection of the worst things about my favourite language, which turned out to be quite the Hacker News bait. The most common response I got was along the lines of: “If Julia has all these flaws, why not just use another language?”. At the time, I just said that despite its flaws, Julia was still amazing, that it would take another 4,000 word post to elaborate on why, and then I left it at that.
Recently I’ve been thinking a lot about one of Julia’s major drawbacks, and have been drafting a post that goes in depth on the subject. But honestly, posting another verbose criticism of Julia would risk giving a misleadingly bad impression of my experience with the lovely language, even if I bracket a wall of criticism with a quick endorsement. After all, I chose to use the language for my daily work about two years ago, and I don’t regret that choice in the slightest.
Now is the right time for that 4,000 word post on the best parts of Julia.
It’s both fast and dynamic
Speed is Julia’s first selling point, and for good reason. It is not the most groundbreaking or novel feature of Julia – that award probably goes to making multiple dispatch the only dispatch paradigm – but it’s the aspect that makes Julia an un-ignorable option for some use cases. Simply put, for dynamic languages like Python, R or Perl, there are no good options for performance, only a wide selection of poor choices. Before moving my work to Julia, I had the misfortune of being exposed to several of Python’s awkward performance hacks:
- I’ve shoehorned my program logic to be vectorizable by Numpy, and run into plenty of problems when I reached fundamentally serial code.
- I’ve used Numba and run into its arbitrary limitations in supporting normal Python code, such as custom classes.
- I’ve used Cython and experienced un-debuggable errors, linker issues on package installation, the inability to distribute my package as source code, and the clunkiness of a separate compilation step in a scripting language.
After having dealt with all those bullshit workarounds, moving my work to Julia was like suddenly breathing in fresh air. I just wrote my code – and optimised it – and then it was as fast as I could want it. And suddenly, all the awkward gymnastics I had been doing simply due to the limitations of Python seemed silly.
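As a minimal sketch of the kind of code that is awkward in the workarounds listed above, here is a fundamentally serial loop over a user-defined type, written as plain Julia. The `Particle` type, the `simulate` function and the toy spring-like update rule are all invented for illustration; they are not from any real project.

```julia
# A small user-defined type (hypothetical, for illustration only).
struct Particle
    position::Float64
    velocity::Float64
end

# Each step needs the result of the previous step, so this loop is
# inherently serial and cannot be expressed as one vectorised array operation.
function simulate(p::Particle, steps::Integer, dt::Float64)
    for _ in 1:steps
        velocity = p.velocity - p.position * dt  # toy pull toward the origin
        position = p.position + velocity * dt
        p = Particle(position, velocity)         # immutable struct, so build a new one
    end
    return p
end

final = simulate(Particle(1.0, 0.0), 1_000, 0.01)
println(final)
```

Nothing here is special-cased for performance: it is an ordinary struct and an ordinary `for` loop, compiled like any other Julia function.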
That great combination of speed and dynamism is sometimes phrased as “As easy as Python, as fast as C”. The phrase is a little off, in my opinion – it’s not really possible to have a language where you write as carelessly as you do for a casual Python script, and it still runs like optimised C code. Code can only ever be fast if it’s written with the constraints of computer hardware in mind, and idiomatic Python isn’t.
A better catchphrase for Julia might be “The best expressiveness / performance tradeoff you have ever seen”. Idiomatic Julia code remains high-level, generic and readable while being optimised – only at the most extreme level of optimisation, when you have to micro-optimise assembly code or manually unroll loops, does the code degrade and begin to appear low-level and clunky.
The gradual and subtle difference between high-level Python-like Julia code and high-performance Julia means that it feels natural to prototype and iterate on inefficient, carelessly thrown together code, and then incrementally optimise only the bottlenecks once performance becomes an issue. Often, you’ll find only a small fraction of the code actually needs to be optimised for the whole program to run fast. This kind of gradual performance is not something I’ve seen in the other languages I’ve coded in. They tend to be either slow but expressive, like Perl and Python, or fast but rigid, like Rust and Zig.
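A small sketch of what that incremental step can look like, assuming a made-up bottleneck that computes a mean of squares. Both function names are hypothetical, and whether the second version is actually worth the change depends on profiling your own code.

```julia
# First draft: short and obvious, but `xs .^ 2` allocates a temporary array.
mean_square_draft(xs) = sum(xs .^ 2) / length(xs)

# After finding this to be a bottleneck: the same computation in a single pass,
# without the temporary array. It is still generic over the element type.
mean_square(xs) = sum(x -> x^2, xs) / length(xs)

data = [1.0, 2.0, 3.0]
println(mean_square_draft(data) ≈ mean_square(data))
```

The optimised version is no less readable than the draft, which is the point: the rest of the program does not need to change when one function is tightened up.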
The “dynamic” half of the “fast and dynamic” duo should not be understated, either. I’m a scientist, which means my job description can be paraphrased as working with stuff I don’t understand, trying to make sense of it. In that context, it’s critically important to be able to pivot and iterate on a small script quickly as you test out and explore ideas – preferably in an interactive manner on a dataset already in memory.
This process is cumbersome and awkward to do with static languages. Rust, for example, may have a wonderfully expressive type system, but it’s also boilerplate heavy, and its borrow checker makes writing any code that compiles at all quite a time investment. Most of the time, that investment gives no returns when you’re still trying to figure out how to approach the problem in the first place. It’s also not entirely clear how I would interactively visualise and manipulate a dataset using a static language like Rust.
The package manager is amazing
These days, the package manager is probably the most central piece of software written for a programming language, other than the compiler itself. Here, Julia shines: Pkg.jl is an absolute joy to work with. Even after having used Julia for about 5 years, I’m still occasionally surprised by the thoughtfulness and convenience of Pkg. Coming from Python, which admittedly has a particularly bad package management story, Pkg is an absolute godsend.
Like Rust’s package manager Cargo, but unlike, say, Python’s Conda, Pkg separates the environment specification (the “project”) from the resolved environment (the “manifest”). This allows you to distinguish between direct and indirect dependencies, and means that unused indirect dependencies are automatically removed. For software engineering, only the project is necessary, and the manifest can be considered ephemeral. If you’re a scientist and want to completely reproduce the environment that the code was originally run with, you can simply command Pkg to instantiate an exact environment from the manifest.
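For the reproduction case, a minimal sketch using Pkg's function API might look like the following. The helper name `reproduce_environment` is hypothetical, and the directory passed to it is assumed to contain the `Project.toml` and `Manifest.toml` saved alongside the original analysis.

```julia
using Pkg

# Hypothetical helper: `dir` is assumed to hold a Project.toml and, for exact
# reproduction, the Manifest.toml that was recorded when the code was first run.
function reproduce_environment(dir::AbstractString)
    Pkg.activate(dir)   # make `dir` the active environment
    Pkg.instantiate()   # install the exact versions recorded in the manifest
    Pkg.status()        # print the direct dependencies listed in the project
end
```

If no manifest is present, `Pkg.instantiate()` resolves one from the project instead, which is the ordinary software engineering case described above.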
Pkg is also delightfully fast. Resolving environments feels instant, as opposed to the glacially slow Conda that Python offers. The global “general” registry is downloaded as a single gzipped tarball, and read directly from the compressed tarball, making registry updates way faster than updating Cargo’s crates.io. Or, if you want, you can easily toggle Pkg to offline mode and skip updating the index altogether. The ease and speed of making environments and installing packages into them encourages users to create separate environments for each little experiment or task, which in turn leads to smaller environments, which reduces the risk of upgrade deadlock.
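As a rough illustration of how little ceremony this takes, the two calls below switch Pkg to offline mode and start a throwaway environment. This is only a sketch of one possible workflow, not a recommended setup.

```julia
using Pkg

Pkg.offline(true)        # avoid network access where possible, reuse what is already downloaded
Pkg.activate(temp=true)  # fresh, empty environment in a temporary directory

println(Base.active_project())  # path to the new environment's Project.toml
```

Packages added from here on go into the temporary environment only, so an experiment cannot disturb the dependencies of any other project.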
Besides specifying a version or a range of versions of a package you want to install, Pkg also allows you to install specific git commits or git branches. You can seamlessly install packages from remote git repositories, from local files, or from various registries. I say “various” registries, because Pkg is federated, and allows you to easily and freely mix multiple public and private package registries, even if they have no knowledge of each other and contain different packages with the same names.
The ease of making and using custom registries makes it attractive for even small organisations to maintain their own private registry of Julia packages, instead of using large Julia monorepositories. For example, in my last job, I created my own registry to keep track of the software used in my department. This way, different packages in the same code base can pick their own versions of internal packages to use. This makes incremental upgrades, or simultaneous development of two interdependent packages, much easier.
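The different sources can be sketched with `Pkg.PackageSpec`, which only describes what to install and downloads nothing by itself. The `Example` package is used as a stand-in, its `master` branch name is an assumption, and the local path and the registry helper are hypothetical placeholders.

```julia
using Pkg

# A registered package at a given version specifier.
released = Pkg.PackageSpec(name="Example", version="0.5")

# A git repository at a branch; a commit hash can be given as `rev` instead.
from_git = Pkg.PackageSpec(url="https://github.com/JuliaLang/Example.jl", rev="master")

# A package in a local directory (placeholder path).
from_disk = Pkg.PackageSpec(path="/path/to/MyPackage")

# Hypothetical helper: register an additional, e.g. private, registry by its git URL.
add_registry(url::AbstractString) = Pkg.Registry.add(Pkg.RegistrySpec(url=url))
```

Passing one of these specs to `Pkg.add`, for example `Pkg.add(released)`, installs it into the active environment.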
The package manager also manages arbitrary binary artifacts, such as compiled libraries and executables. The BinaryBuilder package allows you to cross-compile the same program to all platforms supported by Julia, and generates a small Julia library (jll) package that selects and wraps the correct binary for the user’s platform, allowing it to be executed with a single Julia function. This means you can create Julia packages which depend on, say, C++ executables, and still have them automatically installed by Pkg. In my experience it has been much, much easier to create binary packages this way than with Conda.
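To give a feel for what using a jll wrapper looks like from the consumer's side, here is a small sketch based on `Zlib_jll`. It assumes Julia 1.6 or later, where `Zlib_jll` is bundled with Julia itself rather than installed from a registry, but it follows the same wrapper pattern. The function name `zlib_version` is made up for this example.

```julia
using Zlib_jll  # jll wrapper around the zlib C library

# `libz` is provided by the wrapper and refers to the zlib binary for the
# current platform; the caller never deals with file paths or linker flags.
zlib_version() = unsafe_string(ccall((:zlibVersion, libz), Cstring, ()))

println(zlib_version())
```

Wrappers around executables work along the same lines, except that they expose the program as something you can interpolate into a command and `run`.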
Optimising Julia code is pure joy
Julia code is not exactly the fastest compiled language, but it’s through