Introducing JsonCryption!

I couldn’t find a useful .NET library for easy and robust JSON property-level encryption/decryption, so I made one.

The GitHub page covers more details, but this is the gist:

Installation:

Install-Package JsonCryption.Newtonsoft
// There's also a version for System.Text.Json, but the implementation
// for Newtonsoft.Json is better, owing to Newtonsoft.Json's greater
// feature surface and customizability at this time.

Configuration:

// pseudo code (assuming using Newtonsoft.Json for serialization)
container.Register<JsonSerializer>(() => new JsonSerializer()
{
    ContractResolver = new JsonCryptionContractResolver(container.Resolve<IDataProtectionProvider>())
});

Usage:

var myFoo = new Foo("some important value", "something very public");
class Foo
{
    [Encrypt]
    public string EncryptedString { get; }
  
    public string UnencryptedString { get; }

    public Foo(string encryptedString, string unencryptedString)
    {
        ...
    }
}
var serializer = // resolve JsonSerializer
using var textWriter = ...
serializer.Serialize(textWriter, myFoo);
// pseudo output: '{ "EncryptedString": "akjdfkldjagkldhtlfkjk...", "UnencryptedString": "something very public" }'

Why I need JsonCryption

My main project (not fully operational) is a .NET Core app that handles contact information for users. Being on the OCD spectrum, I wanted this data to have stronger protection than just disk-level and/or database-level encryption.

Property/field-level encryption – in addition to disk-level and database-level encryption – sounded pretty nice. But I needed to be able to easily control which fields/properties were encrypted from each object.

This project is also using Marten, which uses PostgreSQL as a document DB. Marten stores documents (C# objects, essentially) in tables with explicit lookup columns, and one column for the JSON blob. From what I could tell, the best hook offered by Marten’s API to encrypt/decrypt documents automatically is at the point of serialization/deserialization by providing an alternative ISerializer. If I encrypted the entire blob, I wouldn’t be able to query anything very well. So I needed a way to leave certain columns unencrypted when serializing – the ones that would serve as lookups in queries.

Discovery path

First Stop: Newtonsoft.Json.Encryption

This library provided a lot of inspiration. It intends to be very easy to use by requiring a single EncryptAttribute to decorate what is to be encrypted, and it plugs into Newtonsoft.Json via the ContractResolver approach (similar to JsonCryption above).

However, I felt that it had a few fatal flaws that would make using it more difficult than it first appears.

That it doesn’t store the initialization vector (IV) with the generated ciphertext was a non-starter for me. This requires consumers of the library to figure out how and where to store it themselves. I’m not a cryptographic expert (use JsonCryption at your own risk!), but it seems to be pretty standard practice to include the IV with the ciphertext to enable later decryption with just the symmetric key. In any case, this would be a bigger issue after later discoveries.
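
To make the IV point concrete, here is a minimal sketch of the "store the IV with the ciphertext" convention using the built-in AES classes. The IvPrefixedAes helper and its byte layout are my own illustration, not code from Newtonsoft.Json.Encryption or JsonCryption, and it skips authentication (no MAC), so treat it as a demonstration rather than something to ship.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper for illustration only; not part of any library mentioned here.
public static class IvPrefixedAes
{
    private const int IvSizeInBytes = 16; // AES block size is 128 bits

    public static byte[] Encrypt(string plaintext, byte[] key)
    {
        using var aes = Aes.Create(); // defaults to CBC mode with PKCS7 padding
        aes.Key = key;
        aes.GenerateIV(); // fresh random IV for every message

        using var encryptor = aes.CreateEncryptor();
        byte[] plainBytes = Encoding.UTF8.GetBytes(plaintext);
        byte[] cipherBytes = encryptor.TransformFinalBlock(plainBytes, 0, plainBytes.Length);

        // Layout: [ IV (16 bytes) | ciphertext ]
        byte[] result = new byte[IvSizeInBytes + cipherBytes.Length];
        Buffer.BlockCopy(aes.IV, 0, result, 0, IvSizeInBytes);
        Buffer.BlockCopy(cipherBytes, 0, result, IvSizeInBytes, cipherBytes.Length);
        return result;
    }

    public static string Decrypt(byte[] ivAndCiphertext, byte[] key)
    {
        using var aes = Aes.Create();
        aes.Key = key;

        // Recover the IV from the front of the blob; only the key is needed from outside.
        byte[] iv = new byte[IvSizeInBytes];
        Buffer.BlockCopy(ivAndCiphertext, 0, iv, 0, IvSizeInBytes);
        aes.IV = iv;

        using var decryptor = aes.CreateDecryptor();
        byte[] plainBytes = decryptor.TransformFinalBlock(
            ivAndCiphertext, IvSizeInBytes, ivAndCiphertext.Length - IvSizeInBytes);
        return Encoding.UTF8.GetString(plainBytes);
    }
}
```

Because the IV travels inside the blob, a caller only has to keep track of the key: IvPrefixedAes.Decrypt(IvPrefixedAes.Encrypt(text, key), key) returns the original text.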

Overriding JsonConverter

Next, I came across this blog post by Thomas Freudenberg that used a slightly different approach. Rather than provide a custom ContractResolver, he decorated each property needing encryption with a custom JsonConverter. His approach also offered a more conventional way to handle the initialization vectors.

public class Settings {
    [JsonConverter(typeof(EncryptingJsonConverter), "#my*S3cr3t")]
    public string Password { get; set; }
}

This was interesting, but it would be annoying to have to type all of that for each property needing encryption. Also, I would obviously need a way to inject the secret into the converter rather than hard-code it here.

Nevertheless, it gave me an idea for an approach to use with .NET Core’s new System.Text.Json library…

Initial Attempt for System.Text.Json

Microsoft recently released System.Text.Json with .NET Core 3.0 as an open-source alternative to the also-open-source Newtonsoft.Json, which had been the default JSON serialization library for .NET Core up to now. Wanting to be cutting edge, and not knowing much about this new library, I started writing my solution around this.

The library has decent documentation, is open-source (as already mentioned), and enables powerful serialization customization via an unsealed public JsonConverterAttribute. By deriving my own attribute from it, I could essentially implement Freudenberg’s approach with much less code:

public sealed class EncryptAttribute : JsonConverterAttribute
{
    public EncryptAttribute() : base(typeof(EncryptedJsonConverterFactory))
    {
    }
}

Then I just needed to write a custom EncryptedJsonConverterFactory to provide the correct converter given the datatype being serialized.

But this approach also carried critical issues…

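  • Deriving from JsonConverterAttribute ultimately required using a Singleton pattern rather than clean Dependency Injection, because the attribute only names a converter type and leaves no constructor to inject dependencies into.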
  • System.Text.Json currently offers no ability to serialize non-public properties, nor fields of any visibility. For most DDD scenarios, this was also a non-starter.
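
For a sense of what the attribute-plus-factory approach looks like, here is a rough sketch against the System.Text.Json converter API. It is not the actual JsonCryption code: EncryptionHooks, the two converters, and SampleAccount are hypothetical names, the factory only handles string and int, and the default hooks are plain Base64 (a placeholder, NOT encryption). The static EncryptionHooks class is exactly the kind of singleton workaround the first bullet complains about.

```csharp
using System;
using System.Globalization;
using System.Text;
using System.Text.Json;
using System.Text.Json.Serialization;

// Hypothetical static holder: the stand-in for real dependency injection.
public static class EncryptionHooks
{
    // Placeholder transforms (Base64, NOT encryption); swap in real protect/unprotect calls.
    public static Func<string, string> Protect { get; set; } =
        s => Convert.ToBase64String(Encoding.UTF8.GetBytes(s));
    public static Func<string, string> Unprotect { get; set; } =
        s => Encoding.UTF8.GetString(Convert.FromBase64String(s));
}

public sealed class EncryptAttribute : JsonConverterAttribute
{
    public EncryptAttribute() : base(typeof(EncryptedJsonConverterFactory)) { }
}

// Picks a converter for the decorated property's type.
public sealed class EncryptedJsonConverterFactory : JsonConverterFactory
{
    public override bool CanConvert(Type typeToConvert) =>
        typeToConvert == typeof(string) || typeToConvert == typeof(int);

    public override JsonConverter CreateConverter(Type typeToConvert, JsonSerializerOptions options)
    {
        if (typeToConvert == typeof(string)) return new EncryptedStringConverter();
        return new EncryptedInt32Converter();
    }
}

public sealed class EncryptedStringConverter : JsonConverter<string>
{
    public override string Read(ref Utf8JsonReader reader, Type typeToConvert, JsonSerializerOptions options) =>
        EncryptionHooks.Unprotect(reader.GetString());

    public override void Write(Utf8JsonWriter writer, string value, JsonSerializerOptions options) =>
        writer.WriteStringValue(EncryptionHooks.Protect(value));
}

public sealed class EncryptedInt32Converter : JsonConverter<int>
{
    // The int is written as a protected JSON string, so it is read back from a string token.
    public override int Read(ref Utf8JsonReader reader, Type typeToConvert, JsonSerializerOptions options) =>
        int.Parse(EncryptionHooks.Unprotect(reader.GetString()), CultureInfo.InvariantCulture);

    public override void Write(Utf8JsonWriter writer, int value, JsonSerializerOptions options) =>
        writer.WriteStringValue(EncryptionHooks.Protect(value.ToString(CultureInfo.InvariantCulture)));
}

// Hypothetical document type; note that the properties have to be public.
public sealed class SampleAccount
{
    [Encrypt] public string Password { get; set; }
    [Encrypt] public int Pin { get; set; }
    public string Username { get; set; }
}
```

Serializing a SampleAccount with JsonSerializer.Serialize writes Password and Pin as transformed strings and leaves Username readable; JsonSerializer.Deserialize<SampleAccount> reverses it.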

Newtonsoft.Json

Newtonsoft.Json supports serializing fields and properties of any visibility, from private to public. It’s a well-known, mature library with a highly extensible API. Its JsonConverterAttribute is currently sealed, so we can’t derive from it… but there are better options for configuring it anyway, ones that take advantage of Dependency Injection and other better patterns than I was forced to use with System.Text.Json.

The good news is that the exercise of implementing a solution for System.Text.Json forced me to develop some core logic for converting different datatypes to and from byte arrays, which would come in handy for encrypting a wide variety of datatypes. Another issue with the other libraries and approaches I mentioned earlier is that they only handled a tiny number of potential datatypes. I wanted a set-and-forget solution that would work widely, so being able to convert all built-in types and any nested combination thereof was essential.
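
As an illustration of that byte-array plumbing, here is a small sketch for a handful of built-in types. The ByteConversions class is hypothetical and much narrower than what the library needs; it also relies on BitConverter, which uses the machine's byte order, so bytes written on one architecture are not guaranteed to read back correctly on another.

```csharp
using System;
using System.Text;

// Hypothetical helper showing round trips between a few built-in types and byte arrays.
public static class ByteConversions
{
    public static byte[] ToBytes(int value) => BitConverter.GetBytes(value);
    public static int ToInt32(byte[] bytes) => BitConverter.ToInt32(bytes, 0);

    public static byte[] ToBytes(double value) => BitConverter.GetBytes(value);
    public static double ToDouble(byte[] bytes) => BitConverter.ToDouble(bytes, 0);

    public static byte[] ToBytes(string value) => Encoding.UTF8.GetBytes(value);
    public static string ToText(byte[] bytes) => Encoding.UTF8.GetString(bytes);

    // DateTime.ToBinary packs ticks and Kind into a single 64-bit value.
    public static byte[] ToBytes(DateTime value) => BitConverter.GetBytes(value.ToBinary());
    public static DateTime ToDateTime(byte[] bytes) =>
        DateTime.FromBinary(BitConverter.ToInt64(bytes, 0));

    // decimal has no BitConverter overload; go through its four 32-bit parts.
    public static byte[] ToBytes(decimal value)
    {
        int[] bits = decimal.GetBits(value);
        byte[] bytes = new byte[16];
        Buffer.BlockCopy(bits, 0, bytes, 0, 16);
        return bytes;
    }

    public static decimal ToDecimal(byte[] bytes)
    {
        int[] bits = new int[4];
        Buffer.BlockCopy(bytes, 0, bits, 0, 16);
        return new decimal(bits);
    }
}
```

Once every supported type can be turned into bytes and back, a single encrypt/decrypt routine over byte arrays covers all of them.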

Adding support for Cryptography best practices

I began with a custom implementation and abstraction of the core Encrypter that I was using throughout the library. It was basic and structured largely using inspiration from the two approaches discussed earlier.

It worked.

But then I attended a great session at CodeMash 2020 called Practical Cryptography for Developers. Without getting into the weeds of cryptography, I was exposed for the first time to the concept of key/algorithm rotation and management and cryptographic best practices.

Writing these features into my library would take me far outside its immediate domain, and far outside my expertise. Surely, I thought, there must be some libraries that handle this already…

Switching to Microsoft.AspNetCore.DataProtection underneath

… yes, there is. Obviously.

The open-source package Microsoft.AspNetCore.DataProtection was designed to provide

a simple, easy to use cryptographic API a developer can use to protect data, including key management and rotation

https://docs.microsoft.com/en-us/aspnet/core/security/data-protection/introduction?view=aspnetcore-3.1

It’s highly configurable, easy to bootstrap, built to promote testability, and built for .NET Core. It handles key management and algorithm management, and it’s written by dedicated experts in the field.

So I used that instead of my own Encrypter.
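
For readers who haven’t used the package, here is a minimal sketch of its protect/unprotect round trip. It assumes the Microsoft.AspNetCore.DataProtection NuGet package is referenced; the ProtectionSketch class and the purpose string are made up for this example, and the ephemeral provider keeps its keys in memory only, so it suits demos and tests rather than stored data. It is not how JsonCryption wires things up internally.

```csharp
using System;
using Microsoft.AspNetCore.DataProtection;

// Hypothetical demo class; requires the Microsoft.AspNetCore.DataProtection package.
public static class ProtectionSketch
{
    public static (string ProtectedPayload, string RoundTripped) Demo(string plaintext)
    {
        // In-memory keys only: anything protected here is unreadable after the process exits.
        IDataProtectionProvider provider = new EphemeralDataProtectionProvider();

        // The purpose string isolates this protector from others built on the same keys.
        IDataProtector protector = provider.CreateProtector("JsonCryption.Sketch");

        string protectedPayload = protector.Protect(plaintext);
        string roundTripped = protector.Unprotect(protectedPayload);
        return (protectedPayload, roundTripped);
    }
}
```

In a real app the IDataProtectionProvider would come from the container (as in the configuration snippet at the top), so key storage and rotation are configured once and reused everywhere.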

Closing

In the end, I kept both the System.Text.Json implementation (JsonCryption.System.Text.Json), and the Newtonsoft.Json implementation (JsonCryption.Newtonsoft).

JsonCryption.Newtonsoft is better for the moment, allowing encryption/serialization of fields and properties of any visibility (private to public), shallow or nested, of (theoretically) any data type that is also serializable by Newtonsoft.Json.

Check it out. Try it out.

And tell me what you think needs to change to make it better.

Intellectual Obesity

Eating food is good.

Eating a lot of food is good if you’re Michael Phelps and use it all. Or anybody working very hard and using the calories in productive ways to improve the world.

Eating a lot of food and not using it leads to obesity.

Obesity is bad.

Here’s something that’s even more cool: the food that’s stable across the entire domain of food isn’t food: it’s information. It’s information, and we use the same bloody circuits in our brain to forage for information that animals use to forage for food. It’s the same circuit. Why is that? Because we figured out that knowing where the food is is more important than having the food. Knowing where the food is is a form of meta-food—information is a form of meta food, and that’s why we’re information foragers. That idea is embedded into the story of Adam and Eve: whatever it is that they ingest is a form of meta food. It’s information.

Jordan Peterson – Biblical Series IV

Like almost all software developers, especially the self-taught ones, I love learning new things. It’s probably even a coping mechanism for stress… the trick being to at least channel it toward learning useful things.

Throughout life, one of my greatest character struggles has been converting knowledge into productivity. This blog is part of my attempt to burn off the fact-fat into chiseled apps and a more powerful bench(mark).

I am intellectually obese.

Learning new things is good.

Learning new useful things is better.

Learning a lot of new useful things is great, as long as the knowledge gets converted into productive actions.

Learning a lot of new useful things and failing to use any of the knowledge gained leads to intellectual obesity.

Intellectual obesity is bad.

Another Self-taught Developer

Learning how to Pronounce Integer

When I was ten years old, my dad dropped off a couple old programming books after one of his visits. One was an intro to C.

At the time, I was in love with a little-known football video game that was decades ahead of its time in terms of off-the-field team management features, called Total Control Football. Gamespot says:

“I’m not sure if there was a demand for it, but here it is: a football simulation for pigskin fans who love micromanagement.”

Gamespot

It was glorious. Madden is still catching up.

So shortly after failing to build a flying set of glider wings out of sticks and duct-tape, I set out learning how to code so I could build an awesome game of my own…

I don’t remember much of my early reading from the book, but I remember generally just re-typing the example programs listed in its pages. Attempting to compile them taught me early on how rage-inducing a missing semicolon could be back then before intelligent IDEs could spot the obvious. The one thing I do know is that, thanks to this book, I learned the word integer long before finally hearing it pronounced (it’s NOT a hard ‘G’).

Over the years, I’d convince my mother and myself that what I really wanted for Christmas was the newest version of Visual C++, even though I really couldn’t use it.

Discovering a Taste

In college, I was able to get a copy of SQL Server Management Studio. I’m not sure if I used up a Christmas or birthday gift for it, or if I was able to obtain an academic version through my university. With my now budding passion for stock options, and exposure to data mining thanks to an undergrad professor who let me unofficially audit his masters-level course on the subject, I had a grand plan to discover a money-printing covered call trading strategy. For a couple hundred bucks (I had a credit card and wasn’t yet afraid of debt), I was able to purchase a few years’ worth of historical stock option market data. My goal was to load it into my database, clean it up, and then unleash the power of SSMS’s data mining module to find the Holy Grail. I knew it would work, which kept the motivation strong. At the same time, I knew there was no chance that it would work, which led to my ultimate justification:

Even if it doesn’t work, I’ll learn a bit of SQL and how to work with a database.

me

I took a couple programming courses in college: Intro to Java and Numerical Methods I/II (using Matlab). Otherwise, I wouldn’t dabble in any coding until the end of my first internship in Paris.

Getting in the Zone

While attending business school for an MBA, I started an internship on the trading floor of BNP Paribas in Paris at the height (trough?) of the 2009 financial crisis. At the end, I was out of work on my last day, so I spent the day building an “art macro” in Excel. It did nothing more than randomly cycle through colors within cells, influenced by neighboring cells, creating a swaying mosaic of color.

In my next internship (now Societe Generale), I actually used my budding VBA skills for real work. After a month or so, I was told that I would be taking over the weekly responsibility of creating/updating a standard internal report on the Credit market for our traders. The intern passing off her mantle showed me all the tedious manual steps she would take to update it, ultimately spending around a half-day every week. There had to be a better way! Thankfully, my managers gave me the freedom to explore automating the entire process. In the end, I just had to update a small paragraph of text and click one button. In 5 minutes, my report was updated and ready to go.

I learned two key things from that experience:

  1. It was very easy for me to get “in the zone” and forget about lunch or stay late working, and
  2. Projects that seem too large can be broken down into smaller and more manageable units with a little discipline, almost turning the project into one big video game (I just want to do one… more… thing.. for tonight……)

Before finishing my studies and moving back to the States, I used my new skills (horrible, ugly, embarrassing coding “skills” at the time) on a couple fun personal projects, including a study on a historical equity trading strategy using Value and Quality to rank companies for buying/selling.

Looking for More

My full-time career started in Manhattan at the same bank. Although I was technically in a marketing role on the trading floor, it was a relatively quantitative position. From time to time, our team could justify coding up a tool quickly with VBA, and I was always the one to jump on these opportunities.

After a couple years, a few things were going on in my mind:

  1. If I want to leave this company, my options for similar employment are very limited.
  2. If I want to leave this city, my options for location are pretty much limited to the handful of major global financial cities.
  3. If I want to work for myself, my options are virtually non-existent.
  4. The first three problems are wiped clean if I’m a developer.
  5. I like programming very much, and I think I’d be good.

So should I go back to school? With a few degrees and a small mortgage in student loans… no.

As luck would have it, I elected to participate in our official mentorship program and somehow landed with our COO. Following his introductions, I continued meeting people down the chain until reaching a Team Lead from MIT and a nice Senior Developer willing to work with me a bit.

At this point, I wasn’t even sure what language to learn.

“Java is popular, right? Should I learn that?”

“Well, we use C#.”

“I’ve heard of that,” he said, trying to sound impressive…

The Team Lead went on to give me a long list of vocabulary to start my research: design patterns, write tests that fail (what??), message queues, etc. Finally, I was beginning to know what I don’t know.

Here’s a breakdown of my learning regimen…

Learning C# Regimen:

  • Read Troelsen’s 1,500 page book on C#/.NET
  • Watch topical YouTube videos to reinforce the readings
  • Code along with the book, at least to get used to typing it and using Visual Studio
  • Watch C# path in Pluralsight
  • Do problems on codewars (“why does everybody use LINQ so much?”)

Shortly before confirming an internal transfer to a dev team, I was able to participate in our annual coding contest. Using the online editor without any IDE support, I managed to score at about the 50th percentile, so my learning was paying off a little.

Fast Forward to Today

I’ve now been a full-time developer for just over 3 years. The best part of this career is being able to (and required to!) constantly continue learning. I’ve been blessed with a mix of open-ended opportunities to explore, combined with mentorships by some of the brightest programming minds on Wall Street.

Other self-taught developers would likely agree with the main benefit of being self-taught. In teaching ourselves to code, we learn:

  1. How to learn complex things
  2. We’re capable of learning new complex things
  3. With the right system, applied consistently, major transformations are not only possible, they’re inevitable

Closing, the famous bamboo story…

When finally making the switch to a full-time dev, I came across the famous parable of bamboo for the first time. TLDR, once a bamboo seedling begins growing, it doesn’t sprout above ground for the first five years.

During that time, it stays busy building a strong and elaborate root system to anchor itself when it finally grows to be more than 80 feet tall. When it finally does sprout, it shoots up multiple stories in a matter of weeks.

I’m at year 3. There is still MUCH to learn to develop a solid system of coding roots. I hope that initiating a dialog here will help.