Lessons Learned: Implementing ETag Support for Microsoft Research's Garnet Store



Link to my monolith PR: Native ETag Support by hamdaankhalid, Pull Request #908, microsoft/garnet (https://github.com/microsoft/garnet/pull/908)


Over the past few months, as part of my role at Microsoft Azure, I was tasked with implementing ETag support in Garnet, our RESP-compliant key-value store. Garnet is Microsoft's competitor to Redis, known for its ability to handle data larger than memory while still delivering exceptional performance.


This article documents the challenges I faced, the mistakes I made, and the lessons I learned while implementing this feature, particularly given Garnet’s focus on nanosecond-level performance optimizations.



----------------------------------------


Timeline Estimation


The core functionality took two weeks to develop and test end-to-end. Achieving 100% unit test coverage required an additional two weeks, followed by two more weeks for PR feedback and RESP API standardization.

However, I underestimated the complexity of optimizing for nanosecond-level performance. My first implementation introduced performance regressions, and addressing them doubled my original timeline, requiring extensive benchmarking and tuning. Key efforts included:

  • Rearranging if-else statements to improve branch prediction.
  • Addressing cache misses caused by poor struct alignment.
  • Reading Intermediate Language (IL) and Assembly to understand unintended instructions introduced by features like pattern matching.
  • Experimenting with branchless programming, only to find it didn’t yield significant improvements in my case.
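To make the struct alignment point concrete, here is a minimal sketch. The `PackedHeader` and `AlignedHeader` types and their fields are hypothetical illustrations, not Garnet's actual record layout. The sketch only shows how field order and packing change where an 8-byte field lands; a field that starts at an unaligned offset can straddle two cache lines.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical layout: Pack = 1 removes padding, so the 8-byte Etag
// starts at offset 1 and is not 8-byte aligned.
[StructLayout(LayoutKind.Sequential, Pack = 1)]
public struct PackedHeader
{
    public byte Flags;
    public long Etag;
}

// Hypothetical layout: the 8-byte field comes first, so it sits at
// offset 0; the runtime pads the struct after Flags (typically to
// 16 bytes on 64-bit runtimes).
[StructLayout(LayoutKind.Sequential)]
public struct AlignedHeader
{
    public long Etag;
    public byte Flags;
}

public static class LayoutDemo
{
    // Byte offset of the field named "Etag" inside T.
    public static int EtagOffset<T>() where T : struct =>
        (int)Marshal.OffsetOf<T>("Etag");

    // Total size of T in bytes, including any padding.
    public static int Size<T>() where T : struct =>
        Marshal.SizeOf<T>();
}
```

Calling `LayoutDemo.EtagOffset<PackedHeader>()` and `LayoutDemo.EtagOffset<AlignedHeader>()` shows where the field lands in each layout. Whether the misalignment actually costs anything in a given workload is something only a benchmark can tell you.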

Lesson Learned: Always budget additional time for benchmarking when working on performance-critical systems. The process of measuring and optimizing each change takes far longer than anticipated.


--------------------


Branch Prediction vs. Branchless Programming


Modern processors use branch prediction to optimize conditional control flow. When conditions are predictably true or false, branch prediction can be incredibly efficient. However, contrary to my intuition, I found that traditional if-else conditions often outperformed branchless programming for predictable cases.

Micro-benchmarking was critical in uncovering this insight. Without it, my use of branchless programming would have added unnecessary complexity without any measurable benefits.
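As a rough illustration of the comparison, the sketch below puts a branchy and a branchless version of the same loop side by side, with a naive `Stopwatch` timer. The `BranchDemo` class and its methods are made up for this example and are not code from the PR. The branchless version also assumes `v - threshold` does not overflow, which holds for small non-negative inputs.

```csharp
using System;
using System.Diagnostics;

public static class BranchDemo
{
    // Branchy: a plain conditional that the CPU's branch predictor
    // handles well when the outcome is predictable.
    public static long SumAboveBranchy(int[] values, int threshold)
    {
        long sum = 0;
        foreach (int v in values)
        {
            if (v >= threshold)
                sum += v;
        }
        return sum;
    }

    // Branchless: (v - threshold) >> 31 is -1 when v < threshold and 0
    // otherwise, so its complement is an all-ones mask for values that
    // qualify and an all-zeros mask for values that do not.
    public static long SumAboveBranchless(int[] values, int threshold)
    {
        long sum = 0;
        foreach (int v in values)
        {
            int mask = ~((v - threshold) >> 31);
            sum += v & mask;
        }
        return sum;
    }

    // Naive timer: one warm-up call so the JIT compiles the method
    // first, then a timed loop. A real measurement should use a proper
    // benchmarking harness rather than this.
    public static TimeSpan Time(Func<int[], int, long> f, int[] data,
                                int threshold, int iterations)
    {
        f(data, threshold);
        long sink = 0;
        var sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
            sink += f(data, threshold);
        sw.Stop();
        GC.KeepAlive(sink); // keep the result observable
        return sw.Elapsed;
    }
}
```

Running `BranchDemo.Time` on both methods with sorted and shuffled input is a quick way to see how much the result depends on how predictable the branch is. The numbers vary by machine and runtime, so treat them as a starting point for a proper benchmark.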

Lesson Learned: Your optimization is meaningless if you don’t measure it. Until you benchmark it, an optimization is a “Schrödinger’s cat”: it may be faster or slower, and you cannot know which. Always benchmark before and after making changes.


---------------------


The Hidden Costs of Syntactic Sugar


I’m a fan of C# when it feels like C++—straightforward and easy to mentally map to IL/Assembly. However, newer C# features, like pattern matching, significantly improve readability when used correctly.

Initially, I used pattern matching extensively to simplify complex control flows. The result was more readable code, but benchmarking revealed a major performance hit. Investigating the compiled IL showed additional control flow instructions that the compiler couldn’t optimize effectively. While the if-else approach was less readable and more prone to future bugs, it produced faster, more efficient code.

Lesson Learned: Always analyze the IL (using tools like SharpLab or ILSpy) or Assembly to understand the cost of syntactic sugar. What looks elegant in source code may carry hidden performance penalties.
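For a feel of the kind of comparison involved, here is a small sketch with the same decision written both ways. The `Cmd` enum and `EtagChecks` class are hypothetical and much simpler than the actual diff in the PR. Pasting both methods into SharpLab or opening the compiled assembly in ILSpy shows the IL each one produces; this sketch makes no claim about which is faster.

```csharp
// Hypothetical command set, for illustration only.
public enum Cmd { Get, Set, SetIfMatch, GetIfNotMatch, Delete }

public static class EtagChecks
{
    // Pattern-matching version: a switch expression over a tuple.
    public static bool NeedsEtagCompare(Cmd cmd, bool recordHasEtag) =>
        (cmd, recordHasEtag) switch
        {
            (Cmd.SetIfMatch, true) => true,
            (Cmd.GetIfNotMatch, true) => true,
            _ => false,
        };

    // Hand-written if-else version of the same decision.
    public static bool NeedsEtagCompareIfElse(Cmd cmd, bool recordHasEtag)
    {
        if (!recordHasEtag) return false;
        if (cmd == Cmd.SetIfMatch) return true;
        if (cmd == Cmd.GetIfNotMatch) return true;
        return false;
    }
}
```

Both methods return the same result for every input, so any difference a benchmark shows comes from the code the compiler and JIT generate, not from the logic.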


--------------


TDD for Adding Features Paid Off


I wrote 2,000 lines of test code and rewrote the core implementation at least eight times. Testing against a well-defined high-level API saved my sanity, allowing me to quickly identify and fix regressions after each iteration.

This approach also helped during performance tuning and refactoring. The efficiency of the developer loop—where you can confidently hit Run Tests and get immediate feedback—proved invaluable.

Lesson Learned: None—this was one thing I got right from the start! Test-driven development (TDD) is a lifesaver when dealing with complex feature implementations.



Implementing ETag support for Garnet was a challenging but rewarding experience. It pushed me to refine my understanding of performance tuning and taught me the importance of proper benchmarking, understanding syntactic sugar costs, and maintaining a robust testing strategy.




Comments

Commenter: Hamdaan Khalid

Forgot this one: if you are manually fixing the size of a struct, make it memory aligned. https://learn.microsoft.com/en-us/cpp/cpp/alignment-cpp-declarations?view=msvc-170

Commenter: Joshua Gooden

Great read! I wonder, is there a team inside Microsoft that might want to know about the loose IL being generated with the new syntactic sugar? Or is it an unavoidable consequence of how the features themselves work?

Commenter: Hamdaan Khalid

The most optimal pattern matching would have eventually gotten to the same if-else, but it would have been far from readable. At that point, the if-else based statement was the better option. Going to find the diff from the PR and email it.

Commenter: Hamdaan Khalid

Link to the pattern matching diff: https://github.com/microsoft/garnet/pull/908/commits/7f5170d4c9591f47342c0b3303519e7026075bf5
