Background

A few weeks ago my friend Rob and I relaunched a small beer website we took over. We had rewritten the entire site in Ratpack over the previous two months and were excited to show it off to the world. Almost immediately after launch, our login and password reset pages slowed to a crawl and timed out for most requests. After several hours of debugging we determined two main causes:

I had picked a woefully inadequate instance type for our application. (Because I’m cheap.)
We had compounded the problem by picking a log rounds value for password hashing that took a very long time in ec2, particularly on the instance type I selected.

After moving to a sane ec2 instance type and decreasing the log rounds our performance issues have gone away.

This experience made me wonder whether anyone had benchmarked this in ec2 before. A quick Google search turned up nothing, so I decided to do it myself.

Methodology

JMH (Java Microbenchmark Harness) is a framework for writing micro-benchmarks in Java, and I’ve been wanting to try it for a while. JMH takes care of the tricky parts of benchmarking code on the JVM, such as JVM warmup and garbage collection. It provides annotations to generate the benchmarking code as well as the benchmark runner. The JMH documentation isn’t fantastic. I preferred this blog post, and this set of samples.

For this case I wanted to benchmark how long it takes to hash a string using a salt generated by bcrypt, across a range of log rounds values. The higher the log rounds value, the longer it takes to hash the password and the more expensive a brute-force attack becomes. However, the time to hash the password is also a lower bound on the time it takes to authenticate a user. Pick a number that is too high and users could be waiting a very long time. (Particularly if your server has 1 cpu and a slow clock speed.) The method being benchmarked is BCrypt.hashpw, called with salts generated for different log rounds values. The salt is generated by BCrypt.gensalt(logRounds), and log rounds varies from 6 to 12, increasing by 2 each time. Because gensalt is called inside the benchmarked method, its cost is included in the measurement.
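To make the cost of each step concrete: bcrypt's work factor is exponential, so a log rounds value of n means 2^n rounds of its expensive key setup. The sketch below only prints that count for the values benchmarked here. The class and method names are made up for illustration, and it does not call jBCrypt.

```java
// Illustrative only: shows how the bcrypt round count grows with log rounds.
// Class and method names are hypothetical; no hashing is performed here.
final class LogRoundsCost {

    // bcrypt performs 2^logRounds rounds of its expensive key setup.
    static long rounds(int logRounds) {
        return 1L << logRounds;
    }

    public static void main(String[] args) {
        // Same values as the benchmark: 6, 8, 10, 12.
        for (int logRounds = 6; logRounds <= 12; logRounds += 2) {
            System.out.println("logRounds=" + logRounds + " -> " + rounds(logRounds) + " rounds");
        }
    }
}
```

Each step of 2 multiplies the work by 4, so hashing time should roughly quadruple from one benchmarked value to the next.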

After I wrote the JMH benchmark, I needed an easy way to create ec2 instances, copy the jar file to them, execute the benchmark and capture the results. I also wanted to be able to run this on several different ec2 instance types and compare the results. I didn’t find anything that could do this already, so I wrote a small Groovy script that leverages the Gramazon library to help out.

Results

###MacBook Pro - i7 2.5GHz

Benchmark	(logRounds)	Mode	Samples	Score	Error	Units
b.b.LogRoundsBenchmark.hashPw	6	avgt	200	5.194	± 0.015	ms/op
b.b.LogRoundsBenchmark.hashPw	8	avgt	200	20.570	± 0.075	ms/op
b.b.LogRoundsBenchmark.hashPw	10	avgt	200	82.559	± 0.319	ms/op
b.b.LogRoundsBenchmark.hashPw	12	avgt	200	330.919	± 1.509	ms/op

###m3.large

Benchmark	(logRounds)	Mode	Samples	Score	Error	Units
b.b.LogRoundsBenchmark.hashPw	6	avgt	200	6.219	± 0.002	ms/op
b.b.LogRoundsBenchmark.hashPw	8	avgt	200	24.471	± 0.017	ms/op
b.b.LogRoundsBenchmark.hashPw	10	avgt	200	97.416	± 0.050	ms/op
b.b.LogRoundsBenchmark.hashPw	12	avgt	200	388.989	± 0.225	ms/op

###c3.large

Benchmark	(logRounds)	Mode	Samples	Score	Error	Units
b.b.LogRoundsBenchmark.hashPw	6	avgt	200	5.811	± 0.002	ms/op
b.b.LogRoundsBenchmark.hashPw	8	avgt	200	22.859	± 0.015	ms/op
b.b.LogRoundsBenchmark.hashPw	10	avgt	200	90.972	± 0.031	ms/op
b.b.LogRoundsBenchmark.hashPw	12	avgt	200	363.898	± 0.653	ms/op

Although we didn’t have these numbers before we launched, they would have been really useful. We started on an m3.medium with a log rounds value of 16. I didn’t include that value in this test, but each hash took multiple seconds. And there was only a single vCPU, so our server was quickly overwhelmed when we sent out the email announcing the big new release.

Having these numbers allows us to make an informed choice about the tradeoffs between security and performance.
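As a rough way to use these numbers, the sketch below extrapolates from a measured score to a higher log rounds value and turns a per-hash time into a ceiling on hashes per second. It assumes hashing time doubles with each additional log round and that one hash occupies one core for its whole duration. The class and method names are hypothetical, and the output is an estimate, not a measurement.

```java
// Back-of-the-envelope sketch; names are hypothetical and results are estimates.
final class HashBudget {

    // Estimate ms per hash at targetLogRounds from a measurement at measuredLogRounds,
    // assuming time doubles with each additional log round.
    static double estimateMs(double measuredMs, int measuredLogRounds, int targetLogRounds) {
        return measuredMs * Math.pow(2, targetLogRounds - measuredLogRounds);
    }

    // Upper bound on hashes per second for a given number of cores,
    // assuming each hash keeps one core busy for msPerHash.
    static double maxHashesPerSecond(double msPerHash, int cores) {
        return cores * 1000.0 / msPerHash;
    }

    public static void main(String[] args) {
        double m3LargeAt12 = 388.989; // ms/op, from the m3.large table above
        double at16 = estimateMs(m3LargeAt12, 12, 16);
        System.out.printf("estimated ms/hash at 16 log rounds: %.1f%n", at16);
        System.out.printf("max hashes/sec on one core: %.2f%n", maxHashesPerSecond(at16, 1));
    }
}
```

With the m3.large figure for 12 log rounds, this works out to roughly 6.2 seconds per hash at 16 log rounds, or about one login every six seconds per core. That estimate is for an m3.large core; the m3.medium was not measured.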

Code

First the JMH test itself:

package benchmarks.bcrypt;
 
import java.util.concurrent.TimeUnit;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Param;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;

import org.mindrot.jbcrypt.BCrypt;

@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MILLISECONDS)
@State(Scope.Thread)
public class LogRoundsBenchmark {

	@Param({"6", "8", "10", "12"})
	public int logRounds;

	@Benchmark
	public String hashPw() {
		// Return the hash so JMH consumes the result and the JIT cannot discard the work.
		return BCrypt.hashpw("password", BCrypt.gensalt(logRounds));
	}
}

This is all the code you have to write for a simple JMH test. I used Gradle to build it:

plugins {
  id "java"
  id "me.champeau.gradle.jmh" version "0.1.3"
}

sourceCompatibility = 1.7

repositories {
    mavenCentral()
}

dependencies {
	jmh 'org.mindrot:jbcrypt:0.3m'
}

The other piece required was a script to execute the benchmark on an ec2 instance. A lot could be done to make this script much more generic, but it fit the bill for what I needed. To try this on your own you would need an Amazon access key and secret key, as well as the pem file for the key pair used to create the ec2 instance.

@GrabResolver(name='oss', root='https://oss.sonatype.org/content/repositories/snapshots/')
@Grab('com.aestasit.infrastructure.aws:gramazon:0.3.6-SNAPSHOT')
@Grab(group='com.jcraft', module='jsch', version='0.1.51')

import com.aestasit.infrastructure.aws.*
import com.aestasit.infrastructure.aws.model.Instance
import com.jcraft.jsch.*

// Set the credentials first so they are in place however the client loads them.
System.setProperty("aws.accessKeyId", 'ACCESSKEY')
System.setProperty("aws.secretKey", 'SECRETKEY')
def ec2 = new EC2Client('us-east-1')

println "Starting ec2 instance"

Instance instance = ec2.startInstance(
	'kyle-key-pair',
	'ami-52a3153a',
	'default',
	'm3.xlarge',
	true)

println instance.host
println instance.instanceId

sshToEc2Instance(instance) { Session session -> 
	installJava(session) 
	scpJarToConnection(session)
	runBenchmark(session)
}

println "Stopping ec2 instance" 

ec2.terminateInstance(instance.instanceId)

println "ALL DONE"

Let’s take the script in pieces. First is the main script, which relies on Gramazon and JSch for the heavy lifting. All we’re left with is starting the instance, then sshing to the instance and running several commands. Finally we terminate the instance, which is important if you don’t want to continue to pay for it once you’re finished. Each time I wanted to try a new instance type I would just change the instance type string (m3.xlarge in the listing above) to the next value.

void sshToEc2Instance(Instance instance, Closure closure) {
	Session session = null
	try {
		JSch jsch = new JSch()
		jsch.addIdentity("/Users/kboon/Documents/workspace/bcrypt-jmh-benchmark/kyle-key-pair.pem")
		// Host key checking is disabled because every new instance presents an unknown host key.
		JSch.setConfig("StrictHostKeyChecking", "no")

		session = jsch.getSession("ubuntu", instance.host, 22)
		session.connect()
		closure(session)
	} catch (all) {
		println all.message
		all.printStackTrace()
	} finally {
		// Always close the session, even if the closure throws.
		session?.disconnect()
	}
}

This SSHes to the instance and then executes the provided closure, passing the ssh session to it.

void scpJarToConnection(Session session) {
	// The transfer uses an SFTP channel over the existing SSH session.
	ChannelSftp channelSftp = (ChannelSftp) session.openChannel("sftp")
	channelSftp.connect()
	File fileToTransfer = new File('/Users/kboon/Documents/workspace/bcrypt-jmh-benchmark/build/libs/bcrypt-jmh-benchmark-jmh.jar')
	// withInputStream closes the local stream once the upload finishes.
	fileToTransfer.withInputStream { stream ->
		channelSftp.put(stream, fileToTransfer.getName())
	}
	channelSftp.disconnect()
	println "Done scping file."
}

This copies the jar file built with Gradle to the instance over SFTP. This should probably be parameterized so the jar path isn’t hard coded.
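One way to do that, sketched here in Java: take the jar path from the first command-line argument and fall back to the Gradle output location. The class name, method name, and the choice of a command-line argument are assumptions for illustration, not what the script currently does.

```java
import java.io.File;

// Hypothetical helper: resolves which jar to upload instead of hard coding it.
final class JarLocation {

    // Default is the Gradle output path for this project, relative to the project root.
    static final String DEFAULT_JAR = "build/libs/bcrypt-jmh-benchmark-jmh.jar";

    // Use the first argument as the jar path when one is given, otherwise the default.
    static File resolveJar(String[] args) {
        boolean hasArg = args != null && args.length > 0 && !args[0].isEmpty();
        return new File(hasArg ? args[0] : DEFAULT_JAR);
    }

    public static void main(String[] args) {
        File jar = resolveJar(args);
        // getName() is the remote file name, matching what the upload step passes to put().
        System.out.println("Would upload " + jar.getPath() + " as " + jar.getName());
    }
}
```

The same idea carries over to the key file path and the instance type, which are also hard coded in the script.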

void installJava(Session session) {
	// Refresh the package index and, only if that succeeds, install a JRE without prompting.
	String command = "sudo apt-get update && sudo apt-get -y install default-jre"
	executeCommand(command, session)
	println "Done installing java."
}

The AMI I selected doesn’t have java installed so this takes care of that. I could select a different AMI or build my own if that was important enough.

void runBenchmark(Session session) {
	String command = "java -jar bcrypt-jmh-benchmark-jmh.jar"
	executeCommand(command, session)	
	println "Done running benchmark."
}	

This executes the JMH benchmark harness. It takes about 20-30 minutes because JMH runs many warmup and measurement iterations to get accurate numbers.
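The run time follows from how many iterations JMH executes. The arithmetic below assumes the defaults of the JMH 1.x releases current at the time (10 forks, each with 20 warmup and 20 measurement iterations of about 1 second). Those defaults are an assumption rather than something configured in the benchmark, although 10 forks of 20 measurement iterations does match the 200 samples reported in the results. The class and method names are illustrative.

```java
// Rough run-time estimate; fork and iteration counts are assumed JMH defaults.
final class JmhRunTime {

    // Total iteration time in seconds, ignoring JVM startup and per-fork overhead.
    static long totalSeconds(int paramValues, int forks, int warmupIters,
                             int measurementIters, int secondsPerIter) {
        return (long) paramValues * forks * (warmupIters + measurementIters) * secondsPerIter;
    }

    public static void main(String[] args) {
        // 4 logRounds values (6, 8, 10, 12) from the @Param annotation.
        long seconds = totalSeconds(4, 10, 20, 20, 1);
        System.out.println(seconds + " s, about " + Math.round(seconds / 60.0) + " minutes");
    }
}
```

That comes to 1600 seconds, a little under 27 minutes, which is in the 20-30 minute range observed.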

void executeCommand(String command, Session session) {
	ChannelExec channel = (ChannelExec) session.openChannel("exec")
	channel.setCommand(command)
	channel.setErrStream(System.err)

	// Obtain the output stream before connecting, as the JSch exec example does.
	InputStream input = channel.getInputStream()
	channel.connect()

	// Stream the remote command's standard output until the channel closes.
	byte[] tmp = new byte[1024]
	while (true) {
		while (input.available() > 0) {
			int i = input.read(tmp, 0, 1024)
			if (i < 0) break
			print(new String(tmp, 0, i))
		}
		if (channel.isClosed()) {
			// Drain anything that arrived between the last read and the close.
			if (input.available() > 0) continue
			println("exit-status: " + channel.getExitStatus())
			break
		}
		sleep(1000)
	}

	channel.disconnect()
}

This is just a bit of JSch to execute a command on the instance and stream its output back to standard out. I think I cut and pasted this from an example somewhere.