Parallelizing and auto-scaling bcrypt

How we greatly scaled our ability to perform login transactions

tl;dr

Keeping passwords safe involves using hash functions that are CPU-intensive and intentionally slow. This presents an interesting scaling challenge when you have to handle thousands of login transactions per second.

Moving to an automatically scalable, distributed worker pool model allowed us to avoid bottlenecks, save money, and stop worrying about variable load.

One of the features we have had for a really long time is what we call Database Connections. This means that instead of using Google, Facebook or an Active Directory instance as an identity provider, we store the emails/usernames and password hashes for users and they log in using those credentials.

This is one of our most used features: about 70% of our users log in through Database Connections. Keeping the components involved well optimized is always top of mind.

As our customer base started growing, performance was one of the things we had to pay more attention to. The more customers you have, the more varied their use cases and the higher the load you get. Making sure we could handle that load became one of our top priorities.

In the extreme, bad performance can be interpreted as downtime.

We usually run many instances of each of our services in our cloud deployments. For the performance tests, we decided to run a single instance of each service to find the most fundamental bottlenecks first.

When running a test like this, the expected response-time distribution usually:

  • Has a minimum greater than 0; no request takes 0 ms.
  • Has relatively similar counts across the lower response times.
  • Has a steadily decreasing tail; some requests just take a bit longer.

A good example could look like this:

[Figure: an example of an expected response-time distribution]

We set up a separate environment for the tests, created a couple of JMeter scripts to simulate users logging in and got this as the response time distribution:

[Figure: response-time distribution observed during the test]

As you can see, the shapes of the curves are quite different. In reality, the majority of the requests were taking a very long time to complete. This pointed to a clear bottleneck.

Conceptually, our setup looks something like this:

[Figure: conceptual view of the authentication flow]
  1. A client performs a request to authenticate against our Auth API.
  2. Auth0 verifies the user's credentials (active authentication).
  3. Eventually the client gets a token as a response.

Note: The database and caches are queried while the above is taking place.
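
As a rough illustration of that flow, here is a minimal client-side sketch; the URL, payload, and response shape are illustrative assumptions, not Auth0's actual API:

    // Hypothetical sketch of steps 1-3 above. The endpoint, payload, and
    // response fields are assumptions for illustration only.
    async function login(username: string, password: string): Promise<string> {
      const res = await fetch("https://example-auth-api.local/oauth/token", {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify({ username, password, connection: "database" }),
      });
      if (!res.ok) {
        throw new Error(`Authentication failed: ${res.status}`);
      }
      // By this point the server has verified the credentials (step 2),
      // which includes the expensive bcrypt comparison discussed below.
      const { access_token } = await res.json();
      return access_token;
    }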

The first thing you hope for in situations like these is that you are just missing an index and that adding it will make everything better, but that just wasn't the case.

Our next hypothesis was that a CPU-intensive task (or set of tasks) was blocking the event loop and causing requests to be queued. Instead of going through our entire code base, we used flame graphs to try to find the cause of the issue. Fortunately, we got something that looked like the following figure when profiling Auth0:

[Figure: flame graph captured while profiling the Auth0 service]

Most of our time was being spent on bcrypt, the algorithm we use for hashing passwords. It is CPU-intensive and intentionally slow, and the number of iterations it performs can be increased to keep up with faster hardware in the future.
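
To make that cost concrete, here is a small sketch using the common bcrypt npm package (an assumption about the exact library involved). Each increment of the cost factor roughly doubles the work, and a synchronous call holds the event loop for the whole duration:

    import bcrypt from "bcrypt";

    // Each +1 to the cost factor roughly doubles the work (2^cost rounds),
    // which is how bcrypt stays slow as hardware gets faster.
    for (const cost of [8, 10, 12]) {
      const start = process.hrtime.bigint();
      bcrypt.hashSync("correct horse battery staple", cost); // blocks the event loop
      const ms = Number(process.hrtime.bigint() - start) / 1e6;
      console.log(`cost=${cost} took ${ms.toFixed(1)} ms`);
    }
    // While hashSync runs, this Node process serves no other request,
    // which is exactly the queueing behavior the flame graph pointed to.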

All of that is great from a security perspective, but it is a scaling and performance nightmare.

We considered a few alternatives:
  1. Faster hash: This is a simple alternative (for new users), but it was out of the question for us. We like the security traits of bcrypt and were not willing to give them up.
  2. Caching: This does not fix the issue. We are trying to scale real logins; people don't log in more than once in a while, so there is no temporal locality for a cache to exploit.
  3. Scaling up: This was definitely the way to go for a quick win. You get more cores, run more processes, and you are good. The problem is that, running in a cloud environment, you are going to hit a ceiling at some point.
  4. Scaling out: That's what you want in general, but which VMs do you use? With single-core VMs you are not wasting any $, but every time an authentication request comes in, all other requests on that VM are stalled; that's not good :( With VMs that have more cores and async mode (which basically means using libuv's thread pool, as shown in the sketch after this list), you will have idle cores during periods when authentication requests are not coming in. This looks like a step in the right direction, but finer-grained units of work seem even better.
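
To illustrate what "async mode" means here, the sketch below (again assuming the native bcrypt npm package) offloads the comparison to libuv's thread pool so the event loop stays free; the thread-pool sizing shown is an illustrative tweak, not our production setting:

    import os from "node:os";
    import bcrypt from "bcrypt";

    // libuv's thread pool defaults to 4 threads. Matching it to the core count
    // is a common tweak, usually set in the environment before Node starts:
    //   UV_THREADPOOL_SIZE=8 node server.js
    console.log(`cores available: ${os.cpus().length}`);

    async function verify(password: string, hash: string): Promise<boolean> {
      // The async API runs the expensive comparison on a libuv worker thread,
      // so the event loop keeps accepting other requests while it runs.
      return bcrypt.compare(password, hash);
    }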

We liked a lot of the benefits that solution #4 brought to the table, but we wanted more control over the bcrypt operations. That's how we came up with BaaS (bcrypt as a service).

We created a TCP service that uses Protocol Buffers to handle two types of messages:

  • Comparing a password to a hash
  • Hashing a password

Each BaaS instance is like a worker in a distributed thread pool (a kind of "distributed" async mode). The trivially parallelizable nature of these operations makes them an ideal fit for this model.
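
As a conceptual sketch of what travels over that TCP connection, the shapes below are illustrative only; the real Protocol Buffers schema lives in the BaaS repo:

    // Illustrative message shapes; not the actual BaaS schema.
    interface HashRequest {
      id: number;        // correlates responses on a multiplexed TCP connection
      password: string;
      cost: number;      // bcrypt work factor
    }

    interface CompareRequest {
      id: number;
      password: string;
      hash: string;      // the stored bcrypt hash to compare against
    }

    interface BaasResponse {
      id: number;
      hash?: string;     // set when answering a hash request
      success?: boolean; // set when answering a compare request
    }

Because each request is independent and purely CPU-bound, any idle worker can take it, which is what makes the pool easy to scale horizontally.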

Auth0 nodes connect to a load balancer (actually two, for HA purposes), which distributes the load among the different BaaS instances.

Update (added 2016-09-21):

Note: the traffic between the Auth0 nodes and the BaaS nodes is encrypted using TLS.

[Figure: Auth0 nodes reaching BaaS instances through load balancers]

Since the metrics that characterize the BaaS service are very specific (e.g., requests per second, CPU usage), we can do more granular capacity planning and implement automatic scaling to handle variable levels of load.
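
As an example of how specific those metrics make the math, here is a back-of-the-envelope sketch; the per-hash latency, instance size, and target throughput are made-up numbers, not our measured figures:

    // Rough capacity-planning sketch with made-up numbers.
    const msPerHash = 250;            // assumed bcrypt time per compare on one core
    const coresPerInstance = 4;       // assumed BaaS instance size
    const targetLoginsPerSec = 1000;  // assumed peak load

    const hashesPerSecPerInstance = (1000 / msPerHash) * coresPerInstance; // 16
    const instancesNeeded = Math.ceil(targetLoginsPerSec / hashesPerSecPerInstance);

    console.log(`~${instancesNeeded} BaaS instances for ${targetLoginsPerSec} logins/sec`);
    // An autoscaling policy can watch requests/sec and CPU and adjust this count.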

Failing gracefully

Whenever a dependency on an external service is introduced, the possibility of failure has to be handled gracefully.

[Figure: handling a failure when calling BaaS]

There are different, non-exclusive options, but their common goal is to prevent degrading the user experience:

  1. Retrying the operation against the same or different destination
  2. Falling back to an alternative implementation
  3. Returning an error

In our case we went with option #2. If a failure occurs when trying to call BaaS, we perform the hash calculation locally. This is not ideal, but it gives our team time to figure out what is going on and get a fix in place.
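
In code, the fallback looks roughly like this sketch; baasCompare stands in for the real BaaS client call, and its name and signature are assumptions for illustration:

    import bcrypt from "bcrypt";

    // Placeholder for the real BaaS client; name and signature are assumed.
    declare function baasCompare(password: string, hash: string): Promise<boolean>;

    async function verifyPassword(password: string, hash: string): Promise<boolean> {
      try {
        return await baasCompare(password, hash);
      } catch (err) {
        // BaaS is unreachable or errored: fall back to hashing locally.
        // Slower and CPU-heavy, but logins keep working while we investigate.
        console.warn("BaaS unavailable, falling back to local bcrypt", err);
        return bcrypt.compare(password, hash);
      }
    }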

We are really happy with the outcome; BaaS has been running smoothly in production for some time now.

Since this is a critical component of our infrastructure, we frequently look for improvement opportunities. For example, we want to give Avro a try, as we saw some performance improvements in internal tests.

The explanation of BaaS in this post is a simplified version. You can keep up with the latest changes in the GitHub repo.
