Don’t ignore response codes: they tell a story about your system

On a dark, gloomy day in Melbourne, with rain pouring down our office windows, the team and I were debating whether we should performance test AWS S3 for our new architecture. After some healthy debate, we agreed to run a performance test against the new architecture, which uses AWS CloudFront and S3. The objective of the test was to check that we had configured CloudFront and S3 correctly, and to see whether they could handle the anticipated static content load. At this point, you might say, “Well, CloudFront can handle 100,000 requests/sec per distribution and S3 can handle 5,500 GET requests/sec per prefix, so why the need?” We thought so too, until we discovered an issue none of us had anticipated, one that, left unfixed, would have hurt the end-user experience (and potentially sales).

To test this scenario, we copied all the static content from our existing production infrastructure to the new infrastructure and used the current CDN to work out the load profile we needed to generate. Once that was done, I quickly created a JMeter script and generated the load using OctoPerf.

The following is the HTTP response code pie chart OctoPerf generated after the test run. Nothing too exciting, except that 7.6% of the responses were HTTP 404 and the remaining 92.4% were HTTP 200.

[Figure: OctoPerf pie chart of HTTP response codes (92.4% HTTP 200, 7.6% HTTP 404)]

At this point, I started thinking, “I don’t see ~7.5% 404s for static content in production, and since the data was copied from production and I am using static content URLs from production, something doesn’t feel right.” I raised my observation with the rest of the team, and within an hour of investigating, we found the real issue.
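
This kind of comparison can also be scripted instead of read off a chart. Below is a minimal Python sketch, not what we ran on the day. It assumes the load test results are available as a CSV file with a `responseCode` column (JMeter's CSV output normally has one), and the function names, the sample rows, and the production baseline are all made up for illustration.

```python
import csv
import io
from collections import Counter


def response_code_shares(csv_file, column="responseCode"):
    """Return {response code: percentage of all samples} for a results CSV."""
    counts = Counter(row[column] for row in csv.DictReader(csv_file))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {code: 100.0 * n / total for code, n in counts.items()}


def unexpected_codes(shares, baseline, tolerance=1.0):
    """Return codes whose share exceeds the baseline share by more than
    `tolerance` percentage points."""
    return {
        code: share
        for code, share in shares.items()
        if share - baseline.get(code, 0.0) > tolerance
    }


# Tiny made-up sample standing in for a real results file.
sample = io.StringIO(
    "timeStamp,elapsed,label,responseCode,success\n"
    "1,12,img-a,200,true\n"
    "2,15,img-b,200,true\n"
    "3,11,img-c,404,false\n"
    "4,14,img-d,200,true\n"
)
shares = response_code_shares(sample)

# Hypothetical production baseline: essentially everything returns 200.
flagged = unexpected_codes(shares, baseline={"200": 100.0})

print(shares)
print(flagged)
```

With a real results file, you would open it with `open(path, newline="")` and pass the file object in place of `sample`. Any code that shows up in `flagged` is worth a conversation with the team before you look at response times.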

The real issue was how Windows and Linux interpret file paths. The existing architecture runs on Windows, which by default is not case-sensitive for file paths, whereas Linux (with the S3 bucket mounted) is. From a customer's point of view, this means they would receive a 404 for product images (just as an example): Linux tries to resolve the image path to a location in the S3 bucket, but because the case differs, that path does not exist, so the customer gets a 404, and that won’t be a good experience. For example, the following is the response I got from CloudFront for one of the static requests. Notice the “N” in the file path: the real folder in the S3 bucket ends with “n” and NOT “N.” Therefore, I get a 404 on the new architecture, whereas Windows is not case-sensitive, treats the two paths as the same, and returns the right response.

[Figure: 404 response from CloudFront for a static request whose file path contains an uppercase “N”]
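
A mismatch like this can be hunted down before the load test by comparing the URLs you plan to request against what is actually stored. The Python sketch below is one way to do that, under stated assumptions: the function name and the paths are hypothetical, and it expects you to already have the requested paths and the stored object keys as plain lists of strings (for example, from your test data and a bucket listing).

```python
def find_case_mismatches(requested_paths, stored_keys):
    """Map each requested path that misses only because of letter case
    to the stored key it would have matched on a case-insensitive system."""
    stored = set(stored_keys)

    # Index stored keys by their lowercased form (first key wins on a clash).
    by_lower = {}
    for key in stored_keys:
        by_lower.setdefault(key.lower(), key)

    mismatches = {}
    for path in requested_paths:
        if path in stored:
            continue  # exact match, so this request would be served
        candidate = by_lower.get(path.lower())
        if candidate is not None:
            mismatches[path] = candidate  # same path apart from case
    return mismatches


# Made-up paths for illustration only.
stored_keys = ["images/collection/shoe.jpg", "css/site.css"]
requested = ["images/collectioN/shoe.jpg", "css/site.css", "js/missing.js"]

print(find_case_mismatches(requested, stored_keys))
```

Paths that are missing altogether (like `js/missing.js` here) are deliberately left out of the result, so the output lists only the 404s that a case-insensitive file system would have hidden.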

During your performance testing, make sure you also look at the response codes, as they can tell you a lot about your system. And for the record, AWS S3 and CloudFront were able to handle the anticipated load.