HallPass.dev: Consuming Rate-Limited APIs More Easily

TLDR

Nuget: https://www.nuget.org/packages/HallPass/

GitHub: https://github.com/BambooSoftwareLLC/HallPass.NET

Web: https://hallpass.dev/

Background

A while back, a client wanted to create a monthly report from data in their Shopify store. They had tons of products, and getting the data required calling a couple different Shopify endpoints repeatedly. Easy to write, except Shopify also rate-limits their API (as all real APIs do), so I couldn’t just blast the endpoints all at once (or in rapid succession) and expect good results.

I retrofitted some code to (mostly) respect the rate limits, but it was never a great solution. It was annoying to write, and it still wouldn’t let me scale the solution out horizontally. The solution was implemented with Google Cloud Functions, and it would have been great to launch multiple functions at once to crunch through the data, but then they’d all need to talk to each other to make sure they all respected the same rate limit.

There are bits and pieces of libraries and services that can help with these issues across various platforms. Istio might have something with their service mesh stuff, but if you can figure out how to use it, you’re one of the smartest people I know. Polly has a Rate Limit policy available, but it uses a specific algorithm, is actually designed for the server side (not the client side), and can’t support coordinating multiple instances (since it’s a library run locally). On the JS side, bottleneck seems incredibly popular with over 1.4M weekly downloads, but it hasn’t been updated in 3 years, and it requires you to set up and manage your own Redis/cache layer in order to coordinate multiple instances.

Basically, nothing I found seemed very easy to configure and use for my dumb brain.

I’m going on a journey now to see if I can help solve some of these challenges with HallPass.

What is a Rate Limit?

If you’re not a developer, I’m surprised you’re reading this, but allow me to briefly explain what a rate limit is.

Actually, I can’t explain it any better than this link does.

What is HallPass?

Right now…

Right now, HallPass is a .NET library in pre-release status that allows developers calling rate-limited APIs to easily respect those API rate limits within a single instance, using the basic Token Bucket algorithm. Ease of use is the most important motivation behind HallPass, even more so than pristine accuracy.
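For reference, the Token Bucket idea is simple: a bucket holds up to N hall passes, refills on a fixed period, and callers wait when it’s empty. Here’s a minimal illustration of the concept (not HallPass’s actual implementation):

using System;
using System.Threading;
using System.Threading.Tasks;

// A minimal, illustrative token bucket: up to `capacity` passes per period.
public sealed class SimpleTokenBucket
{
    private readonly int _capacity;
    private readonly TimeSpan _period;
    private readonly SemaphoreSlim _mutex = new(1, 1);
    private int _tokens;
    private DateTimeOffset _nextRefill;

    public SimpleTokenBucket(int capacity, TimeSpan period)
    {
        _capacity = capacity;
        _period = period;
        _tokens = capacity;
        _nextRefill = DateTimeOffset.UtcNow + period;
    }

    public async Task WaitForPassAsync(CancellationToken ct = default)
    {
        while (true)
        {
            TimeSpan delay;
            await _mutex.WaitAsync(ct);
            try
            {
                var now = DateTimeOffset.UtcNow;
                if (now >= _nextRefill)
                {
                    _tokens = _capacity;       // refill the bucket
                    _nextRefill = now + _period;
                }

                if (_tokens > 0)
                {
                    _tokens--;                 // take a pass and proceed
                    return;
                }

                delay = _nextRefill - now;     // empty: wait for the next refill
            }
            finally
            {
                _mutex.Release();
            }

            await Task.Delay(delay, ct);
        }
    }
}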

EXAMPLE: .NET 6 Configuration and Usage

Native HttpClient

What’s easier than using the same HttpClient that we already use in .NET?

(Shoutout to Jarrod for the insight… let me know if you want your last name mentioned)

I want HallPass to be as unobtrusive as possible. After a simple configuration, it should just work (whatever that means). For .NET, I think something close to the following is the ideal look.
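Since the package is still pre-release, treat this as a sketch of the intended shape rather than the final API (the method and parameter names here are illustrative):

using System;
using System.Net.Http;
using HallPass; // NuGet: HallPass (pre-release)
using Microsoft.Extensions.DependencyInjection;

var services = new ServiceCollection();

// describe the rate limit of the API we’re calling,
// e.g. 40 requests per minute for a Shopify-like endpoint
services.AddHallPass(hallPass =>
{
    hallPass.UseTokenBucket(
        uriPattern: "myshopify.com/admin/api",
        requests: 40,
        duration: TimeSpan.FromMinutes(1));
});

// then use HttpClient exactly as we already do in .NET; requests matching
// the pattern will wait for a hall pass automatically
var provider = services.BuildServiceProvider();
var httpClient = provider
    .GetRequiredService<IHttpClientFactory>()
    .CreateClient("hallpass"); // client name illustrative

var response = await httpClient.GetAsync(
    "https://my-store.myshopify.com/admin/api/products.json");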

When branching out to other languages (NodeJS/TypeScript is high on the list of priorities), I’ll try to keep the configuration and usage as native-feeling as I can, as well. Since JS devs use many different HTTP clients, I figure I’ll need to provide some easy hooks for some of the more popular ones (axios for sure).

Anyway, the other nice thing about hooking into .NET’s native HttpClient is that we can still transparently use other great libraries (like Polly) for things like automatic retries.

Also, most third-party HTTP clients in .NET – like RestSharp – are switching over to using .NET’s native HttpClient under the hood, so HallPass should (not tested!) be compatible with those, as well.

Thread-safe

As of now, HallPass.NET is built assuming that developers are working purely in async/await flows. If you’re calling an external API from .NET code, it would be very rare to do so via synchronous code.

To that end, we only offer async versions of methods. It’s also built from the start to assume that multiple threads will interact with it, so it needs to be thread-safe all the way through.

Given it’s a pre-release, I’m still writing tests to make sure this claim holds up, but so far it looks pretty good. Also, all of the SDKs (for .NET, JS (soon), and other languages) will be open-source… so hopefully we can get some good feedback to fix any holes I will undoubtedly miss.

As an example, here’s one of my tests:

The test exercises multiple threads; AcceleratedTimeService just makes the test run 200 times faster than real time.
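The actual test lives in the repo; here’s a paraphrase of its shape (TokenBucket and GetTicketAsync are names I’m using loosely here):

using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading.Tasks;
using Xunit;

public class TokenBucketTests
{
    [Fact]
    public async Task Requests_from_many_threads_respect_the_rate_limit()
    {
        // 10 requests allowed per 5-second window, with time accelerated 200x
        var time = new AcceleratedTimeService(200);
        var bucket = new TokenBucket(10, TimeSpan.FromSeconds(5), time);

        var timestamps = new ConcurrentBag<DateTimeOffset>();

        // fire 100 concurrent requests at the bucket
        var tasks = Enumerable
            .Range(0, 100)
            .Select(_ => Task.Run(async () =>
            {
                await bucket.GetTicketAsync();
                timestamps.Add(time.Now);
            }));

        await Task.WhenAll(tasks);

        // the 11th request after any given request must land in a later window
        var ordered = timestamps.OrderBy(t => t).ToList();
        for (var i = 0; i + 10 < ordered.Count; i++)
        {
            Assert.True(ordered[i + 10] - ordered[i] >= TimeSpan.FromSeconds(5));
        }
    }
}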

Soon…

HallPass will soon enable developers to easily respect API rate limits, even if their calling clients are distributed horizontally across multiple instances. For example:

Suppose you have a service implemented as a serverless function, which could easily spin up multiple instances at the same time. And suppose this function calls a rate-limited API. How can you ensure that you easily respect this rate limit, shared across all of your function instances? You could spin up your own DB or cache layer and figure out how to implement a fault-tolerant and concurrent rate-limit strategy yourself. But that’s hard. Remember, HallPass wants to make all of this easy.

Or suppose you have an application implemented with micro-services, and some of these micro-services call a rate-limited API. They also need to share rate-limit consumption information amongst each other in a robust and performant manner.

To accomplish this, HallPass plans to offer a remote service from which calling clients can request chunks of hall passes, which are then used by the SDK library to get local permission to call the rate-limited APIs in question.

Though there will be a REST API, I’m hoping that the SDK itself is still the preferred way to use the remote service. Here’s an example of how that would look:

Connecting to HallPass Remote takes just one more small line of code.
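Again, a sketch rather than the final API: the idea is that the local registration from the earlier example grows by a single call (names illustrative):

services.AddHallPass(hallPass =>
{
    hallPass
        .UseTokenBucket("myshopify.com/admin/api", 40, TimeSpan.FromMinutes(1))
        .ForMultipleInstances("my-client-id", "my-client-secret"); // coordinate via HallPass Remote
});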

Eventually…

  • More languages: JavaScript/TypeScript, Python, Java, Go …
  • More rate limit algorithms: Leaky Bucket, Sliding Window, Fixed Window …
  • Composable rate limits: “global limit of 100/min + endpoint specific limit of 30/sec”
  • Easier configuration
  • Better performance: more accuracy, less resource consumption, higher throughput …
  • Better cloud integrations: “deploy to AWS in region us-east-1”
  • More deployment methods: dedicated cloud hosting, custom configurations for on-prem / hybrid clouds, etc.

How Does it Work?

For full details, check out the source code.

TLDR: Each rate limit has a bucket. When we make an HTTP request using the HallPass HttpClient, the request first passes through the HallPass DelegatingHandler before proceeding to the actual request. The DelegatingHandler checks whether the request should be rate-limited. If so, it finds the corresponding bucket and asks it for a hall pass, waiting until it gets one. Once it has a hall pass, it proceeds with the actual HTTP request.
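In code, the core of that flow is just a DelegatingHandler sitting in the HttpClient pipeline. Here’s a minimal sketch of the pattern (not the library’s actual source; IBucketRegistry and GetTicketAsync are placeholder names):

using System;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

// placeholder abstractions for the sketch
public interface IBucket
{
    Task GetTicketAsync(CancellationToken ct);
}

public interface IBucketRegistry
{
    IBucket? FindFor(Uri? uri); // null when the URI isn’t rate-limited
}

public sealed class HallPassHandler : DelegatingHandler
{
    private readonly IBucketRegistry _buckets;

    public HallPassHandler(IBucketRegistry buckets) => _buckets = buckets;

    protected override async Task<HttpResponseMessage> SendAsync(
        HttpRequestMessage request, CancellationToken cancellationToken)
    {
        // if this request is rate-limited, wait for a hall pass first
        var bucket = _buckets.FindFor(request.RequestUri);
        if (bucket is not null)
            await bucket.GetTicketAsync(cancellationToken);

        return await base.SendAsync(request, cancellationToken);
    }
}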

Local vs. Remote

The biggest difference between the Local and Remote buckets is that Remote buckets refill their local stash of hall passes in batches by calling the REST API. Once a Remote bucket has a batch of hall passes, it operates essentially as a local bucket until it needs more.

So the REST API is responsible for the following in order to coordinate everything appropriately:

  • Registering individual instances using the same shared key (for “fairly” distributing hall passes to the various instances)
  • Refilling the central bucket (atomically)
  • Taking from the central bucket (atomically)

To manage state, the REST API is currently using Redis in the cloud. Redis is fast, but it’s slower than in-memory, so there’s some baked-in complexity that I’m not yet sure how to best solve for scenarios that require high throughput (rate limits of 1,000 requests per second, for example)… but I have some ideas that I’ll be testing shortly.

Fun Stuff for Programmers

One thing that was fun to implement was a ConcurrentSortedStack<T>. I (think I) needed this for the RemoteTokenBucket, because it’s possible that different threads ask for refills from the remote service at the same time, and each gets back hall passes with different validity windows that are not in order.

For example, let’s say threads A and B ask for refills. A gets theirs first on the API server, but B’s come back to the client code first, and B’s hall passes have ValidFrom times that are later than A’s. If I used a normal ConcurrentQueue, I would add B’s first and start grabbing them FIFO-style. I would need to wait until they became valid before proceeding, though. Eventually, I would get through the B tickets, and then start pulling the A tickets. But the A tickets would be expired by this point, so that refill was worthless.

If instead of a normal FIFO queue I had a list sorted by ValidFrom times, then I would be assured of using as many of the requested hall passes as possible.

To implement the ConcurrentSortedStack, I ended up using a LinkedList under the hood rather than a tree or other structure, because I wanted to optimize two operations:

  • Removals, specifically from the top
  • Insertions of groups

The most common operation would be plucking items from the top of the stack, so that needed to be the greatest priority. A LinkedList in .NET offers removal from the front or back in O(1).

For insertions, I could do better than the O(n) worst case of a sorted LinkedList, because I would generally be inserting a group of items at a time (naively, inserting a group item-by-item would be O(n*m) in the worst case, where m is the size of the group of new items to be inserted). That meant I could do the following:

  • Sort the input group first
  • Insert the first item from the new sorted group, starting from the front, and remember the node that it was inserted before
  • Insert the next item starting from the last insertion point, knowing that it must be greater than or equal to the previous item
  • Repeat until all items are inserted

This should bring the cost down to O(n + m log m): one O(m log m) sort of the group, plus a single O(n + m) merge pass through the list.

But practically speaking, I expect the cost to often be just the O(m log m) sort, thanks to .NET’s LinkedList also keeping a reference to the last item. Since each new group being inserted will likely belong after the last item, I can check the tail first before walking item-by-item from the start of the list. In that case, I only pay for sorting the new group.
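Here’s a minimal sketch of that idea (the real implementation is in the repo; this lock-based version just shows the insertion strategy):

using System;
using System.Collections.Generic;

public class ConcurrentSortedStack<T> where T : IComparable<T>
{
    private readonly LinkedList<T> _list = new();
    private readonly object _lock = new();

    public bool TryPop(out T item)
    {
        lock (_lock)
        {
            if (_list.First is null)
            {
                item = default!;
                return false;
            }

            item = _list.First.Value;
            _list.RemoveFirst(); // O(1) removal from the top
            return true;
        }
    }

    public void AddRange(IEnumerable<T> items)
    {
        var group = new List<T>(items);
        group.Sort(); // O(m log m)

        lock (_lock)
        {
            // common case: the whole group belongs after the current tail
            if (_list.Last is null || group[0].CompareTo(_list.Last.Value) >= 0)
            {
                foreach (var item in group)
                    _list.AddLast(item);
                return;
            }

            // otherwise walk from the front, resuming from the last insertion point
            var node = _list.First;
            foreach (var item in group)
            {
                while (node is not null && node.Value.CompareTo(item) <= 0)
                    node = node.Next;

                if (node is null)
                    _list.AddLast(item);
                else
                    _list.AddBefore(node, item);
            }
        }
    }
}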

What Now?

Anyway, I’m looking for feedback, testers, and contributors. Reach out if you have interest.

Self-taught .NET Developer: How I Learned

Early stage

Books

Pro C# 8

I actually read the version for C# 5, but this is the latest one I found on Amazon. It’s more than 1,500 pages long. My pace was around 30 pages per evening after work, which worked out to about a chapter per night.

If I started to get sleepy, I let myself sleep. When I’m learning new things, I often get sleepy. My theory is that my brain is working extra hard and needs to rest to crystallize the new knowledge.

I knew that I would neither understand nor remember everything I read. This was simply to provide a broad overview… at least after the initial chapters on the core of the language, which I did need to learn. In the future, I expected that I would either:

  • Remember that such-and-such thing existed, and look for more details to actually use the feature, or
  • Be able to learn the forgotten feature more easily the second time around after priming my mind initially

I think both cases happened in actuality.

Effective C#

I learned a lot about how to write clean C#. Each “Item” is a couple pages long and covers an important lesson/guideline/rule. The chapters are the following:

  1. C# Language Idioms
  2. .NET Resource Management
  3. Expressing Designs in C#
  4. Working with the Framework
  5. Dynamic Programming in C#
  6. Miscellaneous

Some of the Items include:

  • Use Properties Instead of Accessible Data Members
  • Prefer Query Syntax to Loops
  • Distinguish Between Value Types and Reference Types
  • Limit Visibility of Your Types
  • Avoid ICloneable
  • and more…

I feel like it definitely provided some solid foundations at the beginning of my C# learning to keep me from developing bad habits early on.

Videos

While reading those two books, I tried to line up videos as much as possible, as well.

One of my early sources for video content was Bob Tabor’s LearnVisualStudio.net (now just https://bobtabor.com/). The progression in his videos largely lined up with the progression in Pro C# 8, which helped to reinforce the learning from two angles. I also liked that his courses included simple exercises at the end of each, allowing me to punch out some basic code in Visual Studio to start getting familiar with that.

After finishing most of his content and spending a couple aimless weeks wandering around YouTube, I finally discovered the fantastically curated C# learning path at Pluralsight.

The quality of the content is excellent. The user experience is fantastic. The guided path is intelligent and covers many important topics very well. I was hooked and began spending around 1-2 hours per evening going through course after course. They track your progress with very simple metrics, which almost felt like a video game to me and kept me even more motivated to keep going.

I’ve since gotten rid of my TV. If I want something to watch while eating dinner, I turn on a Pluralsight video and improve myself a tiny bit more. Since finishing the C# path, I’ve watched a number of other courses in various paths and as individual units (64 courses completed so far). I’m 100% convinced this has helped to dramatically ramp up my learning curve.

Before starting officially as a professional developer, I was very nervous that all of my theoretical knowledge wouldn’t transfer to on-the-job skills. It turned out that it actually set me up with a solid foundation to succeed, and I had a very successful first year as a developer.

Anyway, in order, here’s what I watched from Pluralsight during the first 12 months (I think the C# path has changed since then):

Beyond the Elementary Basics

Books

I don’t think the list below is in any particular order. I just know that I read these – some partly, others fully – throughout my first year on the job.

Dependency Injection in .NET

During my first year, I led the development of my team’s first Web API. It was a relatively new push throughout the organization, so there weren’t a ton of examples that I could leverage from other teams.

After learning about unit testing, and realizing that I had no idea how my Web API code would actually behave, I wanted to unit-test everything very thoroughly. Motivated by this, I quickly realized how using dependency injection to manage dependencies throughout the application would be essential for promoting testability.

This book looked like the standard-bearer on Amazon. It turned out to be exactly what I needed. Going through it, combined with some pinpointed videos online, gave me what I needed to apply good DI practices in building the API, helping me reach my primary goal of straightforward testability.

Of course, testability isn’t the only reason to use DI, and the book does a good job of covering everything else. I won’t repeat it here.

The Art of Unit Testing: with examples in C#

I had come across this book on Amazon about a month before finally purchasing it. Later, I stumbled upon an hour-long video on YouTube of the author, Roy Osherove, giving a talk at a conference on unit testing. The talk was fantastic, so I had to buy the book.

As an intro to unit testing, with a good mix of philosophy and practicality, this book was an enjoyable read. It helped me to shift my mindset away from normal best practices in software development, to the slightly different best practices for writing good unit tests. Namely, repetition can be more acceptable for unit tests if it helps to make the test understandable.

It also discusses how doing TDD can improve code quality by enforcing good patterns, gives an overview of various testing and mocking frameworks, and discusses different testing paradigms like TDD, ATDD, and others.

I learned a lot, and should probably revisit it from time to time. It’s a good intro and a solid resource.

Design Patterns

At the very beginning of my journey to self-taught professional programmer, I interviewed a senior PM and dev at my then-employer. They were able to give me a fantastic breakdown of the industry, providing a bunch of jargon and things to look into and keep in mind. One was TDD and the idea of a failing test. Another was the importance of design patterns.

This is the bible of design patterns, written by the Gang of Four. Unfortunately, I didn’t get as far as I would have liked. Reading UML still doesn’t come easy to me, and it definitely didn’t at that time, either. I think I need a little more interactivity, which is why the 15+ hour course on Pluralsight covering various design patterns was much easier for me to digest.

Nevertheless, it got me off to a good start. Thanks to those two individuals, and thanks to some early education on the subject, I think I was able to develop some good habits when evaluating a problem. I think in patterns in real life, so thinking in design patterns for software problems just feels right to me, as well.

This is definitely a book that I need to take a second look at.

Patterns of Enterprise Application Architecture

During this first year, I discovered and fell in love with Martin Fowler’s work. I like the way he writes, and I like the way he thinks.

This is one of his classic books, and it has helped me to think through and analyze different architectures at a couple different employers so far. It’s best when you can speak with another developer who has spent time in Fowler’s world, because then you have a rich vocabulary that you can use to discuss otherwise complex topics. I think this was one of the primary aims of the book (and any design pattern endeavor).

Refactoring: Improving the Design of Existing Code

Another Fowler book, this one goes into detail introducing various patterns for making systematic refactorings, as well as the reasons why you would want to refactor in the first place.

I’m amazed at how the authors could judiciously categorize so many different types of refactoring actions. On top of categorization, they go another step and explain why you’d want to do one action, or why you wouldn’t want to, and how it interacts with other refactoring patterns.

The intro chapters are especially insightful, because he speaks more freely about the way he thinks when looking at legacy code and/or refactoring any code, as well as how he likes to make small changes to code to continually make it more legible. All good ideas to keep in mind, and pretty cool to see a master at his craft transform a piece of mediocre code into something clean and buttoned up.

Pro ASP.NET Web API Security

This book was rated highly on Amazon, and I made it through about 5 chapters. The content looks impressive when scanning the table of contents and the chapters themselves, but there was just something about the way the author writes that never seemed to flow for me. I found myself constantly re-reading different sentences or passages multiple times to understand what he was trying to say.

The content is probably good, but the English is awkward. Maybe I’ll give it another try eventually, but probably not.

Videos

Oddly, I spent a good amount of time watching videos on Angular, but read no books on it. Also during this second year, I was working at a new company. During the interviews with them, they asked more standard computer science questions on things like algorithm efficiency (Big O Notation) and implementations of hash tables (dictionaries).

Despite successfully passing the interviews and getting hired, I felt that I was missing some core pieces of knowledge, so that was also reflected in the videos watched during this time.

Finally, near the end of my time with my initial employer, I spent some time setting up the new API and Angular projects I’d created on our new CI/CD platform of TeamCity and Octopus. For this, I ended up using a decent amount of PowerShell, so I thought I might need to get some more familiarity with it.

Here’s the list:

Discovering Domain Driven Design

My second coding employer had developed an exceptionally rich and supple domain model. It was mostly a joy to work with (would be more so if you could compare against the alternative universe of what would have been without the rich model), but I never knew that the principles behind it were a “thing”. I just thought that everybody that worked there was insanely smart, which they were.

At my next employer, we didn’t have the luxury of a rich and supple (love that word) domain model… but there were aspirations to get there. On my team particularly, we scoped out a chunk of code that we believed could serve as a good starting point for Strangling the Monolith, allowing us to use good Domain Driven Design principles and practices to build something useful and easy to maintain.

To ramp up my knowledge for the task, I read a lot of books and watched the majority of the Pluralsight DDD path.

Unfortunately, that project has thus far been shelved at work. It’s a big strategic mistake, in my opinion, but what do I know… I’m just a lowly developer with no significant business education at all…………

Books

Domain-Driven Design Distilled

Fantastic intro to the topic, written by the same author of Implementing Domain-Driven Design.

When colleagues at work complained that it was 70+ pages, I joked that it’s a very quick read because there are a lot of pictures… which is true. It’s an easy read, lots of diagrams, and it touches on the topics enough for non-developers to understand pretty much everything they need to know about DDD. For developers, it provides a soft intro so that learning the meat comes more easily in the later stages.

Domain-Driven Design

This is the reference for the subject. It’s a bit more philosophically written than the more practical Implementing Domain-Driven Design, but I preferred this one over the other two in this list. Maybe it helped that I read it last and so already had a decent grasp of the concepts, making it easier to go through this and appreciate the higher-level philosophy and principles.

It has changed the way I look at software for the better.

To practice, I’ve been working on a private project in my spare time at home, trying to stay as true to the principles of DDD as possible. So far, I love it. As complexity increases, so far my system is still very easy to understand and modify where needed.

The author, Eric Evans, is a big proponent of the word “supple” to describe an ideal Domain Model, and I’m enjoying making refactorings along the way in my own project to continually progress toward more “suppleness”.

Implementing Domain-Driven Design

I read this one after Domain-Driven Design Distilled, but before Domain-Driven Design. It was pretty good, but I didn’t use a number of his specific implementation patterns in my hobby project. Maybe that will come back to bite me; we’ll see. I’m also lucky in that I can refer back to my previous employer’s fantastic Domain Model and the patterns they used as alternatives to those found in these books.

Others have told me they really preferred this book over the other main one. The style is a bit different, so that might be a thing. Overall, though, it felt like the overall quality was comparable between the two. If you can only read one, flip a coin and you’ll probably be fine… but why not read both?

Videos

I didn’t watch the videos in the advanced section yet. I’m mostly looking forward to a more in-depth look at Event Sourcing, but that one hasn’t been posted (as of the last time I checked).

I also skipped a course by Esposito or something. I’m sure it was decent, but I couldn’t stand his voice. Nothing personal.

Connecting the Dots

The next step in my development is being able to tie the pieces all together into a cohesive software system – deployed and usable. Shipping software involves more than just writing beautiful code and committing it to the cloud somewhere. Databases need to be provisioned. CI/CD pipelines need to be set up. Authentication needs to be configured. Code needs to be deployed. Etc.

The only way to do that is to work on real projects from start to finish. Work provides some of these opportunities, but it’s difficult to cover all the bases in our siloed teams and on complex projects, most of which are likely legacy and already have a number of infrastructural pieces long established.

Also, I really don’t have any direct mobile UI experience, so some of the projects in my queue are chosen specifically to get that covered.

Finally, another goal of this stage is to use implicit peer pressure to force me to lift my coding standards. With open-source projects, I’m displaying my code for all to see (and criticize). With deployable apps and open-source projects, I’m displaying my finished products for all to see (and criticize). With blogging, I’m displaying my thoughts and dreams for all to see (and criticize).

The criticism will hurt. So the threat of criticism provides additional motivation to go the extra mile, to stretch myself. And of course, the actual criticism will (hopefully) teach me new ways to do/see/think about things.

It’s all a giant intense learning experience!

Projects

Super Secret App

This is my main baby, but I think it might actually have commercial value when it’s finished… so I can’t talk about it too much.

But it has a ton of cool technical stuff in it for me to learn, including:

  • Domain Driven Design
  • Message-based asynchronous architecture
  • Event sourcing
  • Cloud hosting
  • Mobile development
  • End-to-end web development
  • Public API built for integrations with third-parties
  • Modern Authentication
  • Individual user accounts
  • Field level encryption
  • and more…

JsonCryption

This small project spun off from the one above. I needed to modify my JSON serializer to work with Marten so that I could easily encrypt different fields of my C# objects with a straightforward API. MongoDB has a similar feature, but it requires using MongoDB. I’m using Marten specifically because I like how they handle Event Sourcing… but Marten requires using Postgres.

I’ve already learned a ton from this short project, including:

  • Optimizing .NET reflection with Expression trees
  • Using GitHub Actions for CI/CD
  • Publishing to NuGet in a CI/CD pipeline
  • OWASP best practices for encryption
  • Key Management Systems
  • and more…

Todo Bubbles

Every developer needs to do their own todo list at some point, right? I have an idea for a slight variation on the usual todo list, making use of what I think is a key psychological trait in order to fill a specific niche.

More details will come probably later this summer. I want to finish the first project before starting this one. I’m not sure if anybody else will find it useful, but I know I could use the specific feature set that this guy will provide.

And all the while, I expect to learn:

  • More mobile development
  • Mobile app deployment
  • Probably some fun little UI stuff

This Blog

I like to relate things to seemingly unrelated things. It helps me to understand both at a deeper level, I find. I’m not sure if anybody else will agree, or if they’ll agree with the weird relationships I find with programming and business (religion, war, seduction…).

But I like writing about it, if nothing more than to store my thoughts. Also, as many people often say, forcing ourselves to write down our thoughts and ideas helps us to curate and refine them.

Writing is a skill by itself. It’s useful in life to be a good writer. It’s useful as a programmer to be a good writer. We all have complicated things to communicate with people we care about, about things we care about. I’m not a great writer, but I think I can improve, and I think it will improve other areas in my life – including the code I write.

Faster Reflection in .NET for JsonCryption.Utf8Json

  1. I needed to use Reflection to add support for Utf8Json to JsonCryption
  2. I wanted to support Utf8Json because it’s good and fast…
  3. … but, reflection in .NET is sloooowwww…

Thankfully, through C#’s Expression class, we can cache getters, setters, and methods that we discover via System.Reflection once, and then use them in the future without going through reflection each time thereafter.

I’m late to this game, as Jon Skeet first wrote about the technique back in 2008. And I believe others had written about it before him.

Adding support for Utf8Json

From a high-level view, I needed to provide an alternative implementation of Utf8Json.IJsonFormatterResolver, as well as implementations of Utf8Json.IJsonFormatter<T>, in order to offer a similar usage API for JsonCryption:

using Utf8Json;

class Foo
{
    [Encrypt]
    public string LaunchCode { get; }
    ...
}

// setup
IJsonFormatterResolver encryptedResolver = new EncryptedResolver(…);

// serialize/deserialize
var myFoo = new Foo { LaunchCode = "password1" };
string json = JsonSerializer.Serialize(myFoo, encryptedResolver);
Foo deserialized = JsonSerializer.Deserialize<Foo>(json, encryptedResolver);

The implementation of IJsonFormatterResolver is trivial, just getting from a cache or creating an instance of IJsonFormatter<T> for each type T. The fun starts with the implementation of IJsonFormatter<T>.
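For completeness, that trivial resolver might look something like this (a sketch – the real EncryptedFormatter<T> constructor takes more dependencies than shown):

using System;
using System.Collections.Concurrent;
using Utf8Json;

public sealed class EncryptedResolver : IJsonFormatterResolver
{
    private readonly ConcurrentDictionary<Type, object> _cache = new();

    public IJsonFormatter<T> GetFormatter<T>() =>
        (IJsonFormatter<T>)_cache.GetOrAdd(
            typeof(T),
            _ => new EncryptedFormatter<T>()); // ctor args omitted in this sketch
}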

First, an overview

Stepping back for a moment… I don’t want to write a JSON serializer. Whenever possible, JsonCryption should leverage the serialization logic of the given serializer, and only encrypt/decrypt at the correct point in the serialization chain. Something like this:

Without Encryption
  1. .NET Object (POCO)
  2. (serialize)
  3. JSON
  4. (deserialize)
  5. POCO
With Encryption
  1. POCO
  2. (serialize)
  3. JSON
  4. (encrypt)
  5. Encrypted JSON
  6. (decrypt)
  7. JSON
  8. (deserialize)
  9. POCO

Except, this isn’t exactly accurate, since JsonCryption is doing Field Level Encryption (FLE). As written, the encryption path shown above would produce a single blob of cipher text for the Encrypted JSON. We instead want a nice JSON document with only the encrypted fields represented in cipher text:

{
  id: 123,
  launchCode: <cipher text here...>
}

So really, the process is something more like this:

  1. POCO
  2. (serialize)
  3. (resolve fields)
  4. (serialize/encrypt fields)
  5. JSON …
(serialize/encrypt fields) for a single field
  1. field
  2. (write JSON property name)
  3. (serialize data)
  4. JSON chunk
  5. (encrypt serialized data)
  6. cipher text
  7. (write cipher text as JSON value)

This way, I (mostly) don’t have to worry about serializing/encrypting primitive, non-primitive, or user-defined objects. For example, if I have something like this…

class Foo
{
    [Encrypt]
    public Bar MyBar { get; }
}

class Bar
{
    public int Countdown { get; }
    public string Message { get; }
}

… then I will first get something like this during the serialization/encryption of MyBar

{ Countdown: 99, Message: "Bottles of beer on the wall" }

That intermediate JSON is itself just a string, and therefore straightforward to encrypt, so the final serialized form of Foo would be something like:

{
  MyBar: <cipher text here...>
}

Finally, since I only want to encrypt properties/fields on custom C# objects that are decorated with EncryptAttribute, I can safely cache an instance of IJsonFormatter<T> for each type that I serialize via JsonSerializer.Serialize(…). This is good news, and now we can begin the fun stuff…

EncryptedFormatter<T> : IJsonFormatter<T>

As mentioned earlier, for each type T, EncryptedFormatter<T> needs to get all properties and fields that should be serialized, serialize each one, encrypt those that should be encrypted, and write everything to the resulting JSON representation of T.

Getting the properties and fields

Getting a list of properties and fields to be serialized is easy with reflection. I can cache the list of resulting MemberInfos to use each time. So far, not bad.

Serialize each MemberInfo, encrypting when necessary

When serializing each one, however, some things I need to do include:

  • Get the value from the MemberInfo
  • Determine if it needs to be encrypted
  • Serialize (and possibly encrypt) the value

Get the value from the MemberInfo

With reflection, this is easy, but slow:

object value = fieldInfo.GetValue(instance);

We could be calling this getter many times in client code, so this should be optimized more for speed. Using .NET’s Expression library to build delegates at run-time has a much larger scope than this post, so I’m only going to show end results and maybe discuss a couple points of interest. For now, this was my resulting code to build a compiled delegate at run-time of the getter for a given MemberInfo (PropertyInfo or FieldInfo), so that I could cache it for reuse:

// ObjectType is just a cached typeof(object)
private static readonly Type ObjectType = typeof(object);

Func<object, object> BuildGetter(MemberInfo memberInfo, Type parentType)
{
    // builds: (object obj) => (object)((TParent)obj).Member
    var parameter = Expression.Parameter(ObjectType, "obj");
    var typedParameter = Expression.Convert(parameter, parentType);
    var body = Expression.MakeMemberAccess(typedParameter, memberInfo);
    var objectifiedBody = Expression.Convert(body, ObjectType);
    var lambda = Expression.Lambda<Func<object, object>>(objectifiedBody, parameter);
    return lambda.Compile();
}

This gives me a delegate to use for this particular MemberInfo instance to get its value, bypassing the need to use reflection’s much slower GetValue(object instance) method:

// using reflection
object value = fieldInfo.GetValue(instance);

// using the cached delegate
object value = cachedGetter(instance);

As others on the interwebs have mentioned when using this technique, it’s initially slow since we have to compile code at run-time. But after that, it’s essentially as fast as a direct access of the property or field.

Determine if it needs to be encrypted

This is trivial. Just check whether the member is decorated with EncryptAttribute and cache that Boolean.
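Cached once per member when building the formatter, it’s essentially a one-liner:

using System.Reflection;

// computed once per MemberInfo and cached alongside the getter
bool shouldEncrypt = memberInfo.GetCustomAttribute<EncryptAttribute>() != null;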

Serialize (and possibly encrypt) the value

Initially, I thought I could get away with just using Utf8Json’s dynamic support when serializing to avoid having to explicitly call the typed JsonSerializer.Serialize<T>(…) method for each MemberInfo. I got it to work for primitives, but not for more complex types.

Hence, I would need to once again use reflection to get the typed Serialize<T> method to use for each MemberInfo at run-time. Since reflection is slow, I also needed to cache this as a compiled delegate:

// signature: JsonSerializer.Serialize<T>(ref JsonWriter writer, T value, IJsonFormatterResolver resolver)

internal delegate void FallbackSerializer(
    ref JsonWriter writer,
    object value,
    IJsonFormatterResolver fallbackResolver);

FallbackSerializer BuildFallbackSerializer(Type type)
{
    var method = typeof(JsonSerializer)
        .GetMethods()
        .Where(m => m.Name == "Serialize")
        .Select(m => (MethodInfo: m, Params: m.GetParameters(), Args: m.GetGenericArguments()))
        .Where(x => x.Params.Length == 3)
        .Where(x => x.Params[0].ParameterType == typeof(JsonWriter).MakeByRefType())
        .Where(x => x.Params[1].ParameterType == x.Args[0])
        .Where(x => x.Params[2].ParameterType == typeof(IJsonFormatterResolver))
        .Single().MethodInfo;

    var generic = method.MakeGenericMethod(type);

    var writerExpr = Expression.Parameter(typeof(JsonWriter).MakeByRefType(), "writer");
    var valueExpr = Expression.Parameter(ObjectType, "obj");
    var resolverExpr = Expression.Parameter(typeof(IJsonFormatterResolver), "resolver");

    var typedValueExpr = Expression.Convert(valueExpr, type);
    var body = Expression.Call(generic, writerExpr, typedValueExpr, resolverExpr);
    var lambda = Expression.Lambda<FallbackSerializer>(body, writerExpr, valueExpr, resolverExpr);
    return lambda.Compile();
}

For this, I needed to use a custom delegate due to the JsonWriter being passed in by reference, which isn’t allowed with the built-in Func<>. Beyond that, everything else should more or less flow from what we did before with the MemberInfo getter.

Ultimately, this allowed me to do something like:

static void WriteDataMember(
    ref JsonWriter writer,
    T value,
    ExtendedMemberInfo memberInfo,
    IJsonFormatterResolver formatterResolver,
    IJsonFormatterResolver fallbackResolver,
    IDataProtector dataProtector)
{
    writer.WritePropertyName(memberInfo.Name);
    object memberValue = memberInfo.Getter(value);
    var valueToSerialize = memberInfo.ShouldEncrypt
        ? BuildEncryptedValue(memberValue, memberInfo, fallbackResolver, dataProtector)
        : BuildNormalValue(memberValue, memberInfo, memberInfo.HasNestedEncryptedMembers, formatterResolver);
    JsonSerializer.Serialize(ref writer, valueToSerialize, fallbackResolver);
}

static string BuildEncryptedValue(
    dynamic memberValue,
    ExtendedMemberInfo memberInfo,
    IJsonFormatterResolver fallbackResolver,
    IDataProtector dataProtector)
{
    var localWriter = new JsonWriter();
    memberInfo.FallbackSerializer(ref localWriter, memberValue, fallbackResolver);
    return dataProtector.Protect(localWriter.ToString());
}

static object BuildNormalValue(
    dynamic memberValue,
    ExtendedMemberInfo memberInfo,
    bool hasNestedEncryptedMembers,
    IJsonFormatterResolver formatterResolver)
{
    if (!hasNestedEncryptedMembers)
        return memberValue;

    var localWriter = new JsonWriter();
    memberInfo.FallbackSerializer(ref localWriter, memberValue, formatterResolver);
    return localWriter.ToString();
}

There are a couple things going on here…

First, I needed to use the localWriter when leaning on Utf8Json to serialize at the intermediate stage, because otherwise it would restart its internal JsonWriter when calling the JsonSerializer.Serialize(instance, fallbackResolver) overload. Things were very weird before I realized what was happening with this.

Second, you’ll see that I needed to do one additional special stage for properties that aren’t marked to be encrypted themselves. This is to take into account nested classes/structs whose children may themselves have encrypted members:

class FooParent
{
    public FooChild Child { get; }
}

class FooChild
{
    [Encrypt]
    public string LaunchCode { get; }
}

Because of the possibility of nesting, when building the cached EncryptedFormatter<T>, I also needed to traverse every nested property and field of T to determine whether any were decorated with EncryptAttribute. If a nested member needs to be encrypted, then I need to encrypt T itself using the EncryptedResolver, eventually returning a JSON string. Otherwise, I could do the entire thing normally with the default Utf8Json resolver configured by the client, and therefore only need to return the original object directly.

Conclusion: All theory without benchmarking

Is this actually faster than using regular reflection? Did I make the code needlessly complicated?

Theoretically, it should be significantly faster, but until I actually benchmark it, I won’t know for sure.

I’ve been talking about benchmarking JsonCryption for a while now, so it will likely be the next thing I do on this project. Unfortunately, I have other projects going on that are more important, so I’m not sure when I’ll be able to get to it. I’m also not thrilled about slightly rewriting JsonCryption.Utf8Json to use reflection just so that I can benchmark it.

Encryption itself is slow. I expect the encryption part alone to be a very significant piece of the total time spent serializing a given object. But again, I won’t know until I look into it.

Finally, working on this port of JsonCryption taught me some new techniques that I would like to see incorporated into the version for Newtonsoft.Json. I’m guessing/hoping I might find some low hanging fruit to optimize that one a bit more.

Another Self-taught Developer

Learning how to Pronounce Integer

When I was ten years old, my dad dropped off a couple old programming books after one of his visits. One was an intro to C.

At the time, I was in love with a little-known football video game that was decades ahead of its time in terms of off-the-field team management features, called Total Control Football. Gamespot says:

“I’m not sure if there was a demand for it, but here it is: a football simulation for pigskin fans who love micromanagement.”

Gamespot

It was glorious. Madden is still catching up.

So shortly after failing to build a flying set of glider wings out of sticks and duct-tape, I set out learning how to code so I could build an awesome game of my own…

I don’t remember much of my early reading from the book, but I remember generally just re-typing the example programs listed in its pages. Attempting to compile them taught me early on how rage-inducing a missing semicolon could be back then before intelligent IDEs could spot the obvious. The one thing I do know is that, thanks to this book, I learned the word integer long before finally hearing it pronounced (it’s NOT a hard ‘G’).

Over the years, I’d convince my mother and myself that what I really wanted for Christmas was the newest version of Visual C++, even though I really couldn’t use it.

Discovering a Taste

In college, I was able to get a copy of SQL Server Management Studio. I’m not sure if I used up a Christmas or birthday gift for it, or if I was able to obtain an academic version through my university. With my budding passion for stock options, and exposure to data mining thanks to an undergrad professor who let me unofficially audit his masters-level course on the subject, I had a grand plan to discover a money-printing covered call trading strategy. For a couple hundred bucks (I had a credit card and wasn’t yet afraid of debt), I was able to purchase a few years’ worth of historical stock option market data. My goal was to load it into my database, clean it up, and then unleash the power of SSMS’s data mining module to find the Holy Grail. I knew it would work, which kept the motivation strong. At the same time, I knew there was no chance that it would work, which led to my ultimate justification:

Even if it doesn’t work, I’ll learn a bit of SQL and how to work with a database.

me

I took a couple programming courses in college: Intro to Java and Numerical Methods I/II (using Matlab). Otherwise, I wouldn’t dabble in any coding until the end of my first internship in Paris.

Getting in the Zone

While attending business school for an MBA, I started an internship on the trading floor of BNP Paribas in Paris at the height (trough?) of the 2009 financial crisis. I had no work left to do on my last day, so I spent it building an “art macro” in Excel. It did nothing more than randomly cycle through colors within cells, influenced by neighboring cells, creating a swaying mosaic of color.

In my next internship (now at Societe Generale), I actually used my budding VBA skills for real work. After a month or so, I was told that I would be taking over the weekly responsibility of creating/updating a standard internal report on the Credit market for our traders. The intern passing off her mantle showed me all the tedious manual steps she would take to update it, ultimately spending around a half-day every week. There had to be a better way! Thankfully, my managers gave me the freedom to explore automating the entire process. In the end, I just had to update a small paragraph of text and click one button. In 5 minutes, my report was updated and ready to go.

I learned two key things from that experience:

  1. It was very easy for me to get “in the zone” and forget about lunch or stay late working, and
  2. Projects that seem too large can be broken down into smaller and more manageable units with a little discipline, almost turning the project into one big video game (I just want to do one… more… thing.. for tonight……)

Before finishing my studies and moving back to the States, I used my new skills (horrible, ugly, embarrassing coding “skills” at the time) on a couple fun personal projects, including a study on a historical equity trading strategy using Value and Quality to rank companies for buying/selling.

Looking for More

My full-time career started in Manhattan at the same bank. Although I was technically in a marketing role on the trading floor, it was a relatively quantitative position. From time to time, our team could justify coding up a tool quickly with VBA, and I was always the one to jump on these opportunities.

After a couple years, a few things were going on in my mind:

  1. If I want to leave this company, my options for similar employment are very limited.
  2. If I want to leave this city, my options for location are pretty much limited to the handful of major global financial cities.
  3. If I want to work for myself, my options are virtually non-existent.
  4. The first three problems are wiped clean if I’m a developer.
  5. I like programming very much, and I think I’d be good.

So should I go back to school? With a few degrees and a small mortgage in student loans… no.

As luck would have it, I elected to participate in our official mentorship program and somehow landed with our COO. Following his introductions, I continued meeting people down the chain until reaching a Team Lead from MIT and a nice Senior Developer willing to work with me a bit.

At this point, I wasn’t even sure what language to learn.

“Java is popular, right? Should I learn that?”

“Well, we use C#.”

“I’ve heard of that,” he said, trying to sound impressive…

The Team Lead went on to give me a long list of vocabulary to start my research: design patterns, write tests that fail (what??), message queues, etc. Finally, I was beginning to know what I didn’t know.

Here’s a breakdown of my learning regimen…

Learning C# Regimen:

  • Read Troelsen’s 1,500 page book on C#/.NET
  • Watch topical YouTube videos to reinforce the readings
  • Code along with the book, at least to get used to typing it and using Visual Studio
  • Watch C# path in Pluralsight
  • Do problems on Codewars (“why does everybody use LINQ so much?”)

Shortly before confirming an internal transfer to a dev team, I was able to participate in our annual coding contest. Using the online editor without any IDE support, I managed to score at about the 50th percentile, so my learning was paying off a little.

Fast Forward to Today

I’ve now been a full-time developer for just over 3 years. The best part of this career is being able to (and required to!) constantly continue learning. I’ve been blessed with a mix of open-ended opportunities to explore, combined with mentorships by some of the brightest programming minds on Wall Street.

Other self-taught developers would likely agree with the main benefit of being self-taught. In teaching ourselves to code, we learn:

  1. How to learn complex things
  2. We’re capable of learning new complex things
  3. With the right system, applied consistently, major transformations are not only possible, they’re inevitable

Closing, the famous bamboo story…

When finally making the switch to a full-time dev, I came across the famous parable of bamboo for the first time. TLDR, once a bamboo seedling begins growing, it doesn’t sprout above ground for the first five years.

During that time, it stays busy building a strong and elaborate root system to anchor itself when it finally grows to be more than 80 feet tall. When it finally does sprout, it shoots up multiple stories in a matter of weeks.

I’m at year 3. There is still MUCH to learn to develop a solid system of coding roots. I hope that initiating a dialog here will help.