A Cake for Kotlin Part 3: Compiler performance in Scala
In the first article in this series, I presented my motivation for wanting to do dependency injection inspired by the Cake Pattern in Kotlin (and other languages without self-types), despite the known issues with that style of coding. I presented a way of doing so that I believe will allow coders to keep the good things from the Cake, while avoiding some of the problems. In the second article, I presented a larger example of what a program written in this style would look like.
While I didn’t go into detail at the time, the first article did mention that we at Knowit have suspected the Cake Pattern of worsening compiler performance, admittedly based on very circumstantial evidence. Compiler performance is already fairly bad in Scala (one of the reasons more and more projects at Knowit are looking at Kotlin), so anything making it worse is a potential problem.
The reason we were suspicious of the Cake Pattern comes from memory problems. We’ve had several cases in our larger Cake-based Scala projects where the compiler has run out of memory after we’ve added new modules to the program. The error messages involved also seemed to indicate these problems could be related to the use of self-types. Of course, memory use doesn’t necessarily correlate with speed, so we’ve never been entirely sure — just suspicious.
After finding a way that Cake Pattern-like dependency injection could be done without self-types, we decided to rewrite a program into the new style — still keeping it in Scala — to see if there was any improvement to performance from just switching styles. The program we chose belongs to one of our customers, so I regrettably can’t show you the code — but it is simply a change from a classic Scala Cake Pattern to the style described in the previous articles of this series.
The program is an application for reporting and handling unwanted incidents in a large organization. It has several integrations with other systems. The frontend is written in JavaScript with React.js and is not relevant to the current discussion. The backend, however, is written entirely in Scala, using Scalatra and some minor libraries. Database interaction is done directly through JDBC, with no extra framework on top. The Scala code is organized into 41 Cake Pattern modules, each of which depends on anywhere from 0 to 17 other modules. The code has high unit test coverage, and each test depends on a Cake Pattern test registry to mock out other modules (using Mockito). The code base isn’t very large compared to some out there, but it is among the largest we have that are based on the Cake Pattern.
Rewriting to the new style
Rewriting from the original Cake Pattern to the new “Kotlin-friendly” cake style presented in the previous articles took me an afternoon. The work mostly consisted of removing the enclosing, self-typed module traits and replacing them with constructor parameters. I discovered we had several circular dependencies I didn’t know about, which is telling: circular dependencies are so easily supported in the (original) Cake Pattern that you can have them without knowing it. Several of these circular dependencies were easily resolved by moving small bits of functionality from one module to another, a refactoring that made the code much tidier at little cost. For the other cases, we had to rely on the solution suggested in the first article in this series and depend directly on the Component Registry.
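To make the kind of change concrete, here is a minimal sketch of the same tiny module written both ways. The customer code cannot be shown, so every name here (UserRepository, UserService, Registry) is hypothetical, and the exact shape of the new style is an assumption based on the description above rather than a copy of the code from the earlier articles.

```scala
// Hypothetical example; none of these names come from the real project.

// Classic Cake Pattern: the enclosing trait uses a self-type to declare
// that it must be mixed together with UserRepositoryComponent.
object ClassicCake {
  trait UserRepositoryComponent {
    def userRepository: UserRepository
    trait UserRepository { def find(id: Int): Option[String] }
  }

  trait UserServiceComponent { this: UserRepositoryComponent =>
    def userService: UserService
    class UserService {
      // userRepository is reachable only because of the self-type above.
      def greet(id: Int): String =
        userRepository.find(id).fold("Hello, stranger")(name => s"Hello, $name")
    }
  }

  // The registry mixes all components together and supplies the instances.
  object Registry extends UserServiceComponent with UserRepositoryComponent {
    lazy val userRepository: UserRepository = new UserRepository {
      def find(id: Int): Option[String] = if (id == 1) Some("Ada") else None
    }
    lazy val userService: UserService = new UserService
  }
}

// Assumed shape of the self-type-free style: plain classes whose
// dependencies arrive as constructor parameters.
object ConstructorCake {
  trait UserRepository { def find(id: Int): Option[String] }

  class InMemoryUserRepository extends UserRepository {
    def find(id: Int): Option[String] = if (id == 1) Some("Ada") else None
  }

  class UserService(userRepository: UserRepository) {
    def greet(id: Int): String =
      userRepository.find(id).fold("Hello, stranger")(name => s"Hello, $name")
  }

  // The registry wires modules explicitly; a cycle between two modules
  // would show up here because neither could be constructed first.
  object Registry {
    lazy val userRepository: UserRepository = new InMemoryUserRepository
    lazy val userService: UserService = new UserService(userRepository)
  }
}
```

Both registries behave the same from the outside: calling Registry.userService.greet(1) returns "Hello, Ada" in either object. The difference is that the second version states each dependency at the point of construction instead of resolving it through the mixed-in trait.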
The rewrite also revealed that no fewer than 14 modules weren’t properly mocked out in the test registry, a weakness of the Cake Pattern that I discussed in the first article.
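This article does not show how those gaps arose, so the following is only a sketch of one plausible mechanism, under stated assumptions: a test registry that inherits real defaults and overrides only some of them. All names (Mailer, AuditLog, IncidentService and the registries) are hypothetical, and hand-written stubs stand in for Mockito mocks so that the example needs no library.

```scala
// Hypothetical names throughout; stubs replace Mockito mocks in this sketch.
object MockGapSketch {
  trait Mailer { def send(to: String): String }
  trait AuditLog { def record(event: String): String }

  class RealMailer extends Mailer {
    def send(to: String): String = s"real mail to $to"
  }
  class RealAuditLog extends AuditLog {
    def record(event: String): String = s"real audit: $event"
  }

  // Assumed registry shape: every member comes with a real default.
  trait ProductionRegistry {
    lazy val mailer: Mailer = new RealMailer
    lazy val auditLog: AuditLog = new RealAuditLog
  }

  // The test registry replaces the mailer but forgets the audit log.
  // This compiles, and tests would silently use the real audit log.
  object LeakyTestRegistry extends ProductionRegistry {
    override lazy val mailer: Mailer = new Mailer {
      def send(to: String): String = s"stub mail to $to"
    }
  }

  // Constructor style: every collaborator must be handed in explicitly.
  class IncidentService(mailer: Mailer, auditLog: AuditLog) {
    def report(user: String): List[String] =
      List(mailer.send(user), auditLog.record("reported"))
  }

  object TestWiring {
    val stubMailer: Mailer = new Mailer {
      def send(to: String): String = s"stub mail to $to"
    }
    val stubAuditLog: AuditLog = new AuditLog {
      def record(event: String): String = s"stub audit: $event"
    }
    // Leaving out either argument here is a compile error.
    val service = new IncidentService(stubMailer, stubAuditLog)
  }
}
```

Constructor parameters do not stop anyone from passing a real implementation on purpose. They only force the test author to make a visible choice for every dependency, which is the property that surfaced the forgotten modules during the rewrite.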
Rewriting the code to the new style of cake injection involved no major difficulties, and it revealed several weaknesses that I could then fix. It could be argued that any simple but involved rewrite of a program would reveal weaknesses. But the weaknesses revealed in this case consisted entirely of things the old-style Cake Pattern allows without checks, and that the new style rejects with compiler errors. This leads me to think that the new style is an improvement on the old when it comes to helping the developer maintain code quality.
Compiler performance in Scala
I decided to keep things simple when measuring performance: the project would be built using “mvn clean compile test-compile -DskipFrontend”, which in our setup forces compilation of both normal and test classes, without triggering anything related to the frontend, running tests or building artifacts. The project uses Scala 2.12.4, the Zulu distribution of Java 8 and Maven 3.5.3 for building.
Using the default output from Maven to measure time, the old code (the version using self-types for dependency injection) took 1 minute and 15 seconds to compile on a developer machine. Variance from build to build was no more than 2 seconds. After building the old and the new code several times, it was obvious that the change made no practical difference to compilation speed. The version without self-types was never slower than the old version, but it was only between 1 and 4 seconds faster, a gain of at most about 5% that is close to the build-to-build variance. In other words: while staying on Scala, avoiding self-types in your Cake Pattern seems to have no significant impact on build time.
What about memory use? We used VisualVM to monitor the build and discovered a fairly significant difference: the original version, the one using self-types, peaked at approximately 870 megabytes of used heap. The refactored version, avoiding self-types, peaked at about 530 megabytes. While memory use in Java is a bit unpredictable due to garbage collection, this behaviour was consistent across several builds. The difference amounts to a reduction of roughly 39% in peak heap use during compilation ((870 − 530) / 870 ≈ 0.39), just from a minor change in code style. Of course, memory use during compilation is unlikely to be a major problem for most projects, but the difference is substantial, and in a larger code base it could become a concern.
We also tried measuring various other things: application footprint and performance, as well as artifact size. We found no measurable differences between the two styles of code.
Conclusions
It is well known that Kotlin has much better compiler performance than Scala, even though Scala compile times have improved recently. It would be interesting to rewrite the program discussed above in Kotlin (as well as upgrading it to the most recent version of Scala) and look at the difference, but the code base is a bit too large to do so easily. (Even upgrading Scala to the most recent version is likely to be a time-consuming task due to dependencies.)
It should be mentioned that going from Scala to Kotlin isn’t just about improving compile time. There are many other differences as well, and it seems far from decided which of the two languages will “win” on the JVM — if either of them will. But as we mentioned in our previous article, most programmers and projects at Knowit currently favour Kotlin over Scala.
Our hypothesis that the approach based on self-types caused slower compilation has been rejected: the experiment uncovered nothing that indicates a noticeable improvement in compilation speed with the new style. The only gain on that front is that the new style allows us to move to Kotlin. But neither did the experiment show any downsides to using the “Kotlin-friendly” way of writing Cake Pattern-like code in Scala. Instead, it confirmed other advantages of the new style. Memory use by the compiler is significantly lower than with the traditional Cake Pattern. More importantly, our proposed way of writing the code seems to make it easier to maintain high code quality, by helping us avoid circular dependencies and keep test registries properly mocked.