Are there any solutions to this conundrum?
If you have lots of money and lawyers, you can try your luck in court to stop your cloud provider from handing your encryption keys over to third parties, but there are cheaper and more reliable options.
Client-side encryption is your best friend
The first obvious option is client-side encryption: encrypting data on the client (sender) side before transmitting it to a server such as a cloud storage service or database. Client-side encryption uses an encryption key that is not available to the service provider, so the provider cannot decrypt the data it hosts. Note that you must keep your keys, as well as the client computer (e.g., your laptop), strictly out of the cloud; otherwise your effort is immediately compromised.
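To make the idea concrete, here is a minimal sketch of encrypting on the client before anything leaves your machine. It assumes the third-party `cryptography` package and its Fernet recipe; the function names and the sample record are made up for illustration, and the upload/download step itself is left out.

```python
# Requires the third-party "cryptography" package (pip install cryptography).
from cryptography.fernet import Fernet

# Generate the key on the client and keep it there. Assumption: in practice
# it would live in a local key store, never in the cloud provider's KMS.
local_key = Fernet.generate_key()
cipher = Fernet(local_key)


def encrypt_for_upload(plaintext: bytes) -> bytes:
    """Encrypt on the client; only the returned token is sent to the cloud."""
    return cipher.encrypt(plaintext)


def decrypt_after_download(token: bytes) -> bytes:
    """Decrypt on the client after fetching the token back from the cloud."""
    return cipher.decrypt(token)


record = b"patient: Jane Doe, diagnosis: confidential"  # hypothetical data
token = encrypt_for_upload(record)         # this is all the provider ever sees
restored = decrypt_after_download(token)   # only possible with local_key
```

The provider stores `token` only; without `local_key` it has nothing to hand over but ciphertext.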
Is client-side encryption mainstream? There are lots of open-source cryptographic libraries you can use to encrypt your data before sending it to the cloud, so you are all set. But can you ‘transparently’ integrate client-side encryption with, say, cloud databases? Transparently means you spend little or no effort on encrypting and decrypting and just do your usual database CRUD operations. Here the choice is quite limited.
AWS offers the DynamoDB Encryption Client in Java and Python only. You can use these SDKs with AWS KMS (no real privacy) or provide your own cryptographic material, which you keep away from the cloud (full privacy). Unfortunately, at the time of writing, relational databases in AWS lack client-side encryption support.
How about Azure Cosmos DB? Microsoft created the so-called Always Encrypted client-side option for Cosmos DB, available in .NET and Java. Unfortunately, at the time of writing, the only place Microsoft suggests for keeping your customer-managed keys (CMK) for client-side encryption, which should be your most valuable private property, is Azure Key Vault. That undermines the whole privacy idea straight away (like entrusting goats to wolves).
Primary keys are NOT encrypted
It is important to note that both AWS DynamoDB and Azure Cosmos DB require that you always leave the primary keys (partition/hash and range/sort keys) unencrypted. The reason is that NoSQL databases need primary keys to store and fetch data correctly and efficiently (think of having to repartition your DB whenever encryption keys are rotated and the encrypted primary keys change accordingly). The requirement is necessary but a bit disappointing, since it exposes part of your important data. If you have ever worked with NoSQL databases, you know that primary keys are normally composed of several other attributes to make searches efficient, which exposes your data even more. For example, you may use name, surname, and DOB as primary keys (some exposure) while client-side encrypting Bitcoin credentials (extremely sensitive). Another example: you use a personal number as an unencrypted primary key but keep all other patient information fully client-side encrypted. Full client-side encryption is of course possible if you store and operate your DB completely outside the cloud, then encrypt it and upload it. This works with blob storage in any cloud.
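The split between plaintext keys and encrypted attributes can be sketched in a few lines. This is not the DynamoDB or Cosmos DB client API, only an illustration of the idea; `encrypt_item`, `decrypt_item`, and their parameters are hypothetical names, and the cipher is passed in as a pair of callables so any library can be plugged in.

```python
from typing import Callable, Dict, Set, Union

StoredItem = Dict[str, Union[str, bytes]]


def encrypt_item(item: Dict[str, str], key_attrs: Set[str],
                 encrypt: Callable[[bytes], bytes]) -> StoredItem:
    """Leave primary-key attributes in plaintext; encrypt every other value."""
    return {
        name: value if name in key_attrs else encrypt(value.encode("utf-8"))
        for name, value in item.items()
    }


def decrypt_item(stored: StoredItem, key_attrs: Set[str],
                 decrypt: Callable[[bytes], bytes]) -> Dict[str, str]:
    """Reverse encrypt_item on the client after reading the item back."""
    return {
        name: value if name in key_attrs else decrypt(value).decode("utf-8")
        for name, value in stored.items()
    }
```

With a cipher such as Fernet from the `cryptography` package, `encrypt` and `decrypt` would be `Fernet(key).encrypt` and `Fernet(key).decrypt`, with `key` kept on the client. For the patient example, `key_attrs` would be `{"personal_number"}`: the database can still look up by personal number, while it sees every other attribute only as ciphertext.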
In contrast to AWS and Azure, GCP offers client-side encryption for Cloud SQL. However, the recommended encryption suite, Google’s Tink, leaves you no option of keeping encryption keys on your premises, only in the cloud KMS. This immediately opens a back door to your encrypted data. As an alternative, you may use another encryption suite, such as the AWS Encryption SDK. Either way, with Cloud SQL you need to encrypt data manually before storing it and decrypt it after fetching it, in contrast to AWS DynamoDB client-side encryption, which encrypts and decrypts transparently.
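Manual encryption around a relational database might look like the sketch below. It makes two loud assumptions: an in-memory SQLite database stands in for a cloud-hosted SQL instance, and Fernet from the third-party `cryptography` package stands in for whichever encryption suite you choose. The table and function names are hypothetical.

```python
# Requires the third-party "cryptography" package (pip install cryptography).
import sqlite3
from cryptography.fernet import Fernet

cipher = Fernet(Fernet.generate_key())  # the key stays on the client

# Assumption: in-memory SQLite stands in for a cloud-hosted relational DB.
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE patients (personal_number TEXT PRIMARY KEY, notes BLOB)"
)


def save_patient(personal_number: str, notes: str) -> None:
    """Encrypt the sensitive column by hand before the INSERT."""
    token = cipher.encrypt(notes.encode("utf-8"))
    db.execute("INSERT INTO patients VALUES (?, ?)", (personal_number, token))


def load_patient(personal_number: str) -> str:
    """Fetch the row, then decrypt the sensitive column by hand."""
    row = db.execute(
        "SELECT notes FROM patients WHERE personal_number = ?",
        (personal_number,),
    ).fetchone()
    if row is None:
        raise KeyError(personal_number)
    return cipher.decrypt(row[0]).decode("utf-8")


save_patient("19800101-1234", "diagnosis: confidential")
stored = db.execute("SELECT notes FROM patients").fetchone()[0]  # ciphertext
```

Every read and write path has to go through wrappers like these, which is exactly the extra effort a transparent client would take off your hands.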
Client-side encryption options are available, but not abundant. Expect more to emerge, since the motivation is pretty obvious.
- Amazon DynamoDB Encryption Client
- Use client-side encryption with Always Encrypted for Azure Cosmos DB
- Google Cloud: About client-side encryption
Just keep the keywords “client-side encryption” in your searches.
See the commented Python code coming with this blog at:
If there’s interest, in a forthcoming blog we may address another option: FHE – Fully Homomorphic Encryption.