In response to last week’s post, I received the following e-mail:
“I might be a little late for a question about your recent post, but just in case I’m not: Do you have any strategies for, or stories about, dealing with an external library that you couldn’t get rid of that violated some (or all) of these API design guidelines? It’s a vague question, but I’m really just asking about any past experience as a user of an API that really sticks out in your mind.”
This reminded me that I’d always wanted to go write down the steps necessary to use a bad API, just to highlight how terrible it can be for the programmer. I don’t think people who make APIs really appreciate how important it is to get them right, and how much unnecessary work their mistakes can cause for hundreds, thousands, sometimes even millions of other programmers. So I felt like it was important to spend an article walking through an API and showing just how much unnecessary work an API can manufacture.
It’d probably be a nice column on its own — a weekly dissection of a bad API. But since I don’t have time for something like that, if I was only going to dissect one API, the most important question was, which API should I choose?
Event Tracing for Windows
It’s a great time in the history of computing to be writing an article about bad APIs (which is another way of saying it’s a terrible time to actually have to program for a living). There’s so many bad APIs out there, I could have picked one at random and been very likely to find enough problems to fill a 3000-word article. But if I was only going to pick apart one specific operation in one API, it seemed only right to try to pick the worst API I’d ever actually used.
Now there are a lot of APIs out there that routinely turn in top-ranking efforts for the “worst API” leaderboard. CSS, for example, can probably claim half the spots on the top 10 for any year in which there’s a new version. DirectShow, while it was still a going concern, certainly dominated the rankings for its era. And in the modern age, newcomers like the Android SDK are showing real potential with development environments so convoluted that the quality of the APIs when called from actual C++ code are the last thing you’ll worry about when trying to ship something with them.
But when I thought long and hard about who the all-time heavyweight bad API champion was, there was one clear winner: Event Tracing for Windows.
Event Tracing for Windows is an API that does something very simple: it allows any component of the system (including end-user software) to announce “events” which any other component can then “consume”. It is a logging system, and it is used to record performance and debugging information by everything from the kernel upwards.
Now, normally, a game developer would have no reason to use the Event Tracing for Windows API directly. You can use tools like PerfMon to view logged information about your game, like how much working set it was using or how much disk I/O it did. But there is one specific thing that directly accessing Event Tracing gives you that you can’t get anywhere else: context switch timing.
Yes, if you have any relatively recent version of Windows (like 7 or 8), the kernel will log all thread context switches, and using the CPU timestamp included in those events, you can actually correlate them with your own in-game profiling. This is incredibly useful information to have, and is the kind of thing you often only get from console hardware. It’s the reason tools like RAD’s Telemetry can show you when your running threads were interrupted and had to wait for system threads to do work, something that can often be critical to debugging weird performance problems.
So far, the API is sounding pretty good. I mean, context switch timing is very vaulable information, so even if the API was a little janky, it’d still be pretty great, right?
Right?
Write the Usage Code First
Before we take a look at the actual Event Tracing for Windows API, I want to walk the walk here and do exactly what I said to do in last week’s lecture: write the usage code first. Whenever you evaluate an API, or create a new one, you must always, always, ALWAYS start by writing some code as if you were a user trying to do the thing that the API is supposed to do. This is the only way to get a nice, clean perspective on how the API would work if it had no constraints on it whatsoever. If it was “magical”, as it were. And then, once you have that, you can move forward and start thinking about the practical problems, and what the best way is for you to get to something implementable.
So, if I were a programmer, with no knowledge of the Event Tracing for Windows API, how would I want to get a list of context switches? Well, two methods come to mind.
The most straightforward approach would be something like this:
{
{
}
}
which would imply an API design that looks like this:
{
};
{
};
{
{
};
};
{
};
That’s one way to do it. Very simple, trivial to understand, pretty hard to mess up. Someone stepping into this with the debugger would be able to see exactly what was going on, and you’d be able to tell pretty easily if you’d done something wrong.
However, I could imagine a scenario where performance-critical code would not want to pay the cost of the copy from the kernel’s buffer to your buffer, which this API requires (ETWGetEvents must copy the events from some OS-internal buffer, since it has to get them from somewhere). So a slightly more complex version would be to get some mapped memory back from the API that you use as a reading buffer:
{
{
}
}
All I have done here is changed the return mechanism from a copy to a ranged pointer. In ETWBeginTrace, the user now passes in the number of events they want to buffer at maximum, and the kernel reserves room in the user’s address space for that many events. It then writes directly into that memory if it can, avoiding unnecessary copies. When the user calls ETWBeginEventRead(), a begin and end pointer are returned that span some part of the event memory. Since it will be treated as a circular buffer, the caller is expected to loop on in case there are two ranges (a “head” and “tail”) that need to be returned. I included an end call, since certain methods of implementation might require the kernel to know what part of the buffer the user is looking at, so it can avoid writing into memory that is actively being read. I don’t really know that this sort of thing would be necessary, but if you wanted to cover your bases and give the kernel the maximum implementation flexibility, this definitely supports more implementations than the ETWGetEvents() version.
The API would be updated like this:
{
};
If one were so inclined, one could even support both retrieval methods with the same API just by still allowing the ETWGetEvents() call. Also, to complete the API with some error reporting, it might also be nice to have something like:
to allow you to check after each ETWGetEvents() whether too many events had occurred since the last check, and the kernel was forced to throw some away.
To each his own, but I suspect that most of the programmers I know wouldn’t have a lot of complaints with my API proposal as written. Everyone has their own taste, so I’m sure they would each tweak something to be more to their liking, but I doubt anyone would say it was horrible. It’s all pretty straightforward, and I suspect most programmers could integrate it into their code trivially without really thinking too much about it.
The reason the API is so straightforward is not because I employed a sophisticated set of API design practices to finesse my way to a good API. Quite the contrary. The API is simple because the problem it’s solving is trivial. It’s essentially the simplest API problem you can have in a system: how to move data from one place to another. It’s a glorified memcpy().
But it is precisely the simplicity of the problem that allows the Event Tracing for Windows API to really shine. Even though all it has to do is move memory from one place to another, it manages to involve almost every kind of complexity you can see in an API.
I don’t know how anyone is supposed to actually learn how to use the Event Tracing for Windows API. Maybe there are good examples floating around that I just never found. I had to piece the usage code together over the course of many hours of experimentation, pulling from various snippets of documentation. Each time I figured out another step of the process I thought, “Wait, seriously?” And each time Microsoft implicitly replied, “Seriously.”
Having me tell you how to call the API does take some away from the sheer awe of the experience, so I will say, if you want the full monty, stop reading now and go try to get context switch timestamps on your own. I can assure you it will be hours of fun and excitement. Those who’d rather save time at the expense of a day full of facepalm moments, read on.
OK, here we go. The equivalent to my proposed ETWBeginTrace() is Microsoft’s StartTrace() call. At first glance, it seems innocent enough:
However, when you look at what you need to pass in for the Properties parameter, things start to get a little hairy. The EVENT_
{
};
where the first member is actually a s