It’s common to be asked a question like “what’s the hardest bug you’ve debugged?” at job interviews in tech. This post is about the bug I usually describe. The snag is that it’s quite involved and I don’t actually understand it all the way through – there are one or two aspects to it I often hand-wavingly gloss over. The hope was that by writing it out and fact checking it I’d have a better handle on it; this is what came out.
This came up during my first internship at Google in 2012. I was on a team in Ads, but I was mostly working on Google Web Toolkit, which was what many Ads applications were written in. My project was to add support in the GWT compiler for measuring code coverage from browser automation tests using WebDriver. This has nothing to do with the bug except that it also involved coverage runs, and because I had been messing with coverage stuff I’d been cc’d on the bug tracker and decided to look into it.
The bug manifested like this. There was a team in Ads that had a GWT application and a whole bunch of tests written against it. A single one of these tests had a strange property: it would pass during regular test runs, but it would fail during coverage runs. (In particular, there was an automatic nightly coverage run which always showed up as failing because of this.)
So this was weird – why would running a test for coverage change its outcome?
I don’t remember what the test was actually testing, but modulo identifiers the relevant bit looked something like this:
ImmutableSet refers to this Guava collection, and Option is some enum defined by the application. In Java, all enums are automatically equipped with a values() method, which returns an array of all the values on that enum, and so the intent here is to build a set containing all the enum values and compare that against the set built up by the test’s setup logic.
Digging through the error logs of the coverage run, the error looked like this:
If you’re a Java person, the problem is fairly obvious: the thing on the left (corresponding to the first argument to assertEquals) is an ImmutableSet<Option[]> containing the result of calling Option.values(), instead of an ImmutableSet<Option> containing a copy of the array. If you look at the docs linked above, you’ll notice that ImmutableSet doesn’t actually have an of(E[]) method, which means that it wouldn’t copy the contents of the array. Instead, type inference comes up with ImmutableSet<Option[]>, and the method called is the of(E) method where E = Option[].
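The inference behaviour described above can be reproduced without Guava. The following is a minimal sketch using only the JDK: the class OverloadDemo, the enum constants A, B, C, and the two static helpers are hypothetical stand-ins for the application's Option enum and for single-element and copying factories. They are not Guava's actual implementation.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

public class OverloadDemo {
    // Hypothetical enum standing in for the application's Option.
    public enum Option { A, B, C }

    // Stand-in for a single-element factory like of(E): wraps whatever it is given.
    public static <E> Set<E> of(E element) {
        return Collections.singleton(element);
    }

    // Stand-in for a copying factory like copyOf(E[]): one entry per array element.
    public static <E> Set<E> copyOf(E[] elements) {
        return new LinkedHashSet<>(Arrays.asList(elements));
    }

    public static void main(String[] args) {
        // Only of(E) is available, so E is inferred as Option[] and the set
        // holds the array itself as its single element.
        Set<Option[]> wrapped = of(Option.values());

        // Here E is inferred as Option and the set holds the three constants.
        Set<Option> copied = copyOf(Option.values());

        System.out.println(wrapped.size()); // prints 1
        System.out.println(copied.size());  // prints 3
    }
}
```

Nothing here fails at compile time, because an array is a perfectly good value for E. The mismatch only shows up when the two sets are compared.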
So the fix was to use copyOf instead of of, and that’s that. But now it seemed like the test shouldn’t have been passing in the first place, even during regular runs, since it was relying on this non-existent of(E[]) method that copies arrays. So what was going on really?
Since the GWT compiler takes in Java source code, it can’t deal with native methods, which are common in the standard library classes. It can also be useful to reimplement standard classes in such a way as to take advantage of JavaScript built-ins: Array, for instance.
Given this, GWT has a notion of emulated classes: classes are reimplemented in GWT-friendly Java, and a special directive can be passed to the GWT compiler to let it find the emulated source. GWT includes emulated versions of a subset of the standard library, and the abovementioned directive is also available for applications or other libraries to use.
As it happens, one of these libraries was Guava – several Guava collections, including ImmutableSet, had emulated versions for use by GWT applications.
Switching gears slightly, one of the big selling points of GWT when it came out was that it was compatible with a lot of popular Java tools – you could step through your app using Eclipse’s debugger, run tests with JUnit, and measure code coverage using Emma, a popular open source Java coverage tool.
Now Emma is a coverage tool that works with JVM bytecode – in a manner similar to what I described in my last post, it instruments class files, so that as they run, they create some files containing coverage data, which are later picked up to produce coverage reports. GWT’s Emma support is the aspect of this whole thing that I understand least. (There exists a wiki page with some notes on Emma support, but it’s not very enlightening.) It seems that whatever magic GWT needs to perform to play nice with Emma is not done for emulated classes. The only thing I could find related to this is an old thread from the Scala+GWT project, which also isn’t very enlightening, but seems to say that when Emma support was being developed it was decided that emulated classes weren’t worth supporting since, among other things, they weren’t used much. In any case, this means that for coverage runs, not only does the code run as Java in a JVM, but it runs against the “real” versions of classes.
The final piece of the puzzle is that this mystery ImmutableSet.of(E[]) method used to exist; you can browse an old version of the docs to see it. It was deprecated for a long time and eventually removed. However, for whatever reason, the GWT-emulated version was not kept in sync with these changes. So the GWT-emulated version still had that method, and this was why the regular test run, which ran against the emulated classes, passed. The coverage run, which ran against the “real” Guava, failed.
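To see how the same call site can behave differently against two versions of a library, here is a small JDK-only sketch. RealSets and EmulatedSets are hypothetical stand-ins (not the actual Guava or GWT sources): the first has only the single-element factory, while the second also keeps a stale array overload. When both overloads are visible, Java's overload resolution picks the more specific of(E[]).

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.EnumSet;
import java.util.LinkedHashSet;
import java.util.Set;

public class EmulationDriftDemo {
    // Hypothetical enum standing in for the application's Option.
    public enum Option { A, B, C }

    // Stand-in for the up-to-date library: only the single-element factory remains.
    static final class RealSets {
        static <E> Set<E> of(E element) {
            return Collections.singleton(element);
        }
    }

    // Stand-in for the out-of-sync emulated copy: the array overload is still there.
    static final class EmulatedSets {
        static <E> Set<E> of(E element) {
            return Collections.singleton(element);
        }

        static <E> Set<E> of(E[] elements) {
            return new LinkedHashSet<>(Arrays.asList(elements));
        }
    }

    // Mimics the regular test run: resolves to of(E[]), so the sets are equal.
    public static boolean passesAgainstEmulated() {
        Set<Option> expected = EnumSet.allOf(Option.class);
        return expected.equals(EmulatedSets.of(Option.values()));
    }

    // Mimics the coverage run: resolves to of(E) with E = Option[], so they differ.
    public static boolean passesAgainstReal() {
        Set<Option> expected = EnumSet.allOf(Option.class);
        return expected.equals(RealSets.of(Option.values()));
    }

    public static void main(String[] args) {
        System.out.println(passesAgainstEmulated()); // prints true
        System.out.println(passesAgainstReal());     // prints false
    }
}
```

The source text of the call is identical in both methods; only the set of overloads visible at compile time differs, which is the kind of drift described above.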