Link to my monolith PR: Native ETag Support by hamdaankhalid · Pull Request #908 · microsoft/garnet
Over the past few months, as part of my role at Microsoft Azure, I was tasked with implementing ETag support in Garnet, our RESP-compliant key-value store. Garnet is Microsoft's competitor to Redis, known for its ability to handle data larger than memory while still delivering exceptional performance.
This article documents the challenges, mistakes, and lessons from implementing this feature, particularly given Garnet’s focus on nanosecond-level performance optimizations.
----------------------------------------
The core functionality took two weeks to develop and test end-to-end. Achieving 100% unit test coverage required an additional two weeks, followed by two more weeks for PR feedback and RESP API standardization.
However, I underestimated the complexity of optimizing for nanosecond-level performance. Addressing the resulting performance regressions doubled my original timeline, requiring extensive benchmarking and tuning. Key efforts included reworking if-else statements to improve branch prediction.
Lesson Learned: Always budget additional time for benchmarking when working on performance-critical systems. Measuring and optimizing each change takes far longer than anticipated.
Modern processors use branch prediction to optimize conditional control flow. When conditions are predictably true or false, branch prediction can be incredibly efficient. Contrary to my initial intuition, I found that traditional if-else conditions often outperformed branchless programming for predictable cases.
Micro-benchmarking was critical in uncovering this insight. Without it, my use of branchless programming would have added unnecessary complexity without any measurable benefits.
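To make the two styles concrete, here is a minimal C# sketch. The `BranchStyles` class and its `Min` helpers are hypothetical illustrations, not code from the Garnet PR, and the branchless version assumes `a - b` does not overflow.

```csharp
using System;

public static class BranchStyles
{
    // Conventional branch: cheap when the predictor guesses the outcome correctly.
    public static int MinBranching(int a, int b)
    {
        if (a < b)
            return a;
        return b;
    }

    // Branchless variant (hypothetical): derives a mask from the sign bit of (a - b).
    // Assumption: a - b does not overflow the int range.
    public static int MinBranchless(int a, int b)
    {
        int diff = a - b;
        int mask = diff >> 31;      // all ones when a < b, otherwise zero
        return b + (diff & mask);   // b + (a - b) == a when a < b, else b
    }
}
```

Both methods return the same result for in-range inputs. Which one is faster depends on how predictable the condition is and on what the JIT emits, so this is something to measure rather than assume.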
Lesson Learned: Your optimization is meaningless if you don’t measure it. Avoid “Schrödinger’s cat” scenarios with performance—always benchmark before and after making changes.
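A before-and-after measurement can be sketched with nothing more than `System.Diagnostics.Stopwatch`. The `MicroBench` helper below is a hypothetical illustration, not the harness used for the PR, and it only gives a rough number: the delegate call itself adds overhead that matters at nanosecond scale.

```csharp
using System;
using System.Diagnostics;

public static class MicroBench
{
    // Runs `action` `iterations` times after a warm-up phase and
    // returns the mean wall-clock time per call in nanoseconds.
    public static double MeasureNanoseconds(Action action, int iterations, int warmup = 1000)
    {
        // Warm-up: gives the JIT a chance to compile the code and lets caches settle.
        for (int i = 0; i < warmup; i++) action();

        long start = Stopwatch.GetTimestamp();
        for (int i = 0; i < iterations; i++) action();
        long ticks = Stopwatch.GetTimestamp() - start;

        // Stopwatch.Frequency is ticks per second; convert to nanoseconds per call.
        return ticks * (1_000_000_000.0 / Stopwatch.Frequency) / iterations;
    }
}
```

Calling `MeasureNanoseconds` once on the old code path and once on the new one, with the same iteration count, gives a quick comparison. For results you intend to act on, a dedicated harness such as BenchmarkDotNet handles warm-up, outliers, and statistical reporting more rigorously.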
---------------------
I’m a fan of C# when it feels like C++—straightforward and easy to mentally map to IL/Assembly. However, newer C# features, like pattern matching, significantly improve readability when used correctly.
Initially, I used pattern matching extensively to simplify complex control flows. The result was more readable code, but benchmarking revealed a major performance hit. Investigating the compiled IL showed additional control flow instructions that the compiler couldn’t optimize effectively. While the if-else approach was less readable and more prone to future bugs, it produced faster, more efficient code.
Lesson Learned: Always analyze the IL (using tools like SharpLab or ILSpy) or Assembly to understand the cost of syntactic sugar. What looks elegant in source code may carry hidden performance penalties.
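As an illustration of the kind of pair worth comparing, here is the same decision written both ways. The `ControlFlowStyles` class, the ETag comparison rules, and the return codes are all hypothetical and are not taken from the PR diff.

```csharp
using System;

public static class ControlFlowStyles
{
    // Switch expression over a tuple, using positional patterns and case guards.
    public static int ClassifyWithPatterns(long stored, long incoming) =>
        (stored, incoming) switch
        {
            (0, _) => 0,                    // hypothetical: no ETag stored
            var (s, i) when s == i => 1,    // ETags match
            var (s, i) when i > s => 2,     // incoming ETag is newer
            _ => 3                          // incoming ETag is older
        };

    // The same decision as a plain if-else chain.
    public static int ClassifyWithIfElse(long stored, long incoming)
    {
        if (stored == 0) return 0;
        if (stored == incoming) return 1;
        if (incoming > stored) return 2;
        return 3;
    }
}
```

The two methods are behaviorally equivalent. Whether they compile to different IL, and whether that difference is measurable, depends on the compiler version and the exact patterns used, so pasting both into SharpLab and benchmarking them is the only reliable way to tell.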
--------------
I wrote 2,000 lines of test code and rewrote the core implementation at least eight times. Testing against a well-defined high-level API saved my sanity, allowing me to quickly identify and fix regressions after each iteration.
This approach also helped during performance tuning and refactoring. The efficiency of the developer loop—where you can confidently hit Run Tests and get immediate feedback—proved invaluable.
Lesson Learned: None—this was one thing I got right from the start! Test-driven development (TDD) is a lifesaver when dealing with complex feature implementations.
Implementing ETag support for Garnet was a challenging but rewarding experience. It pushed me to refine my understanding of performance tuning and taught me the importance of proper benchmarking, understanding syntactic sugar costs, and maintaining a robust testing strategy.
Forgot this one: if you are manually fixing the size of a struct, make it memory-aligned. See https://learn.microsoft.com/en-us/cpp/cpp/alignment-cpp-declarations?view=msvc-170
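In C#, fixing a struct's size by hand usually means `StructLayout` with explicit field offsets. The sketch below uses a hypothetical `AlignedHeader` type (the field names and the 16-byte size are assumptions for illustration, not Garnet's actual layout) in which every 8-byte field starts at a multiple of 8 and the total size is a multiple of 8.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical 16-byte header: two 8-byte fields, each at an 8-byte-aligned offset.
[StructLayout(LayoutKind.Explicit, Size = 16)]
public struct AlignedHeader
{
    [FieldOffset(0)] public long Etag;        // offset 0, a multiple of 8
    [FieldOffset(8)] public long Expiration;  // offset 8, a multiple of 8
}

public static class LayoutCheck
{
    // Returns the marshaled size of the struct in bytes.
    public static int SizeOfHeader() => Marshal.SizeOf<AlignedHeader>();
}
```

Checking `LayoutCheck.SizeOfHeader()` in a unit test is a cheap way to catch an accidental layout change later.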
Great read! I wonder, is there a team inside Microsoft that might want to know about the loose IL being generated with the new syntactic sugar? Or is it an unavoidable consequence of how the features themselves work?
The most optimal pattern matching would eventually have reduced to the same if-else, but it would have been far from readable. At that point, the if-else version was the better choice. Going to find the diff from the PR and email it.
Link to the pattern matching diff: https://github.com/microsoft/garnet/pull/908/commits/7f5170d4c9591f47342c0b3303519e7026075bf5