CS6650 Fall 2023 Assignment 2

Let’s Store Some Data

This is conceptually pretty simple. We’re going to implement the doPost() and doGet() methods so that they write to and read from a database, respectively. As always, simple things are not that simple in this course!

Client

You only need to make minor changes to your client from assignment 1 (part 2). Unless, of course, it has unnecessary synchronization: if you had low throughput in assignment 1, revisit your design and implementation with your instructors. There will be a reason ;)

The required change is to print out the following after the test has finished, along with the other statistics:

number of successful requests
number of failed requests

We of course want to see the latter as zero, or maybe a handful at most for each test.
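One way to keep these two counts without adding synchronization bottlenecks is a pair of lock-free counters shared by all client threads. The following is a minimal sketch, not a required design: the class name RequestCounters and the rule that any 2xx status counts as a success are assumptions made for illustration.

```java
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical helper: thread-safe success/failure tallies shared by all client threads. */
class RequestCounters {
    // LongAdder is built for counters that many threads update at the same time.
    private final LongAdder successful = new LongAdder();
    private final LongAdder failed = new LongAdder();

    /** Records one completed request. Assumption: any 2xx status is a success. */
    void record(int statusCode) {
        if (statusCode >= 200 && statusCode < 300) {
            successful.increment();
        } else {
            failed.increment();
        }
    }

    long successful() { return successful.sum(); }

    long failed() { return failed.sum(); }

    /** Builds the two lines to print alongside the other statistics. */
    String report() {
        return "Number of successful requests: " + successful() + System.lineSeparator()
             + "Number of failed requests: " + failed();
    }
}
```

Each client thread would call record() once per response, and the main thread would print report() only after all threads have finished.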

Add a Database

You need to deploy, design and implement a database that enables you to:

Persist new album information, including the image and JSON-supplied data, during the doPost() method
Retrieve album information by primary key in the doGet() method.
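The two requirements above amount to a key-value access pattern: one write that creates a row, and one read by primary key. The sketch below shows that shape in Java. Everything in it is hypothetical (the names AlbumRecord, AlbumStore, and InMemoryAlbumStore, the profileJson field, and the use of a random UUID as the key), and the in-memory map is only a stand-in that does not persist anything; a real implementation would back the same two operations with your chosen database.

```java
import java.util.Optional;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical row: primary key, JSON-supplied data, and the image bytes. */
final class AlbumRecord {
    final String albumId;     // primary key looked up by doGet()
    final String profileJson; // JSON-supplied data, stored as received
    final byte[] image;       // image bytes (a real design might store a file/object reference instead)

    AlbumRecord(String albumId, String profileJson, byte[] image) {
        this.albumId = albumId;
        this.profileJson = profileJson;
        this.image = image.clone();
    }
}

/** The two storage operations the servlet methods need. */
interface AlbumStore {
    String save(String profileJson, byte[] image);   // called from doPost(), returns the new key
    Optional<AlbumRecord> findById(String albumId);  // called from doGet()
}

/** Illustration only: an in-memory stand-in that does NOT give durable writes. */
final class InMemoryAlbumStore implements AlbumStore {
    private final ConcurrentMap<String, AlbumRecord> rows = new ConcurrentHashMap<>();

    @Override
    public String save(String profileJson, byte[] image) {
        String albumId = UUID.randomUUID().toString(); // assumption: server-generated key
        rows.put(albumId, new AlbumRecord(albumId, profileJson, image));
        return albumId;
    }

    @Override
    public Optional<AlbumRecord> findById(String albumId) {
        return Optional.ofNullable(rows.get(albumId));
    }
}
```

Keeping the servlet code behind an interface like this makes it easier to swap databases while tuning, since only the implementation class changes.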

You are free to choose any database you like that gives you the necessary safety guarantees (i.e., you can’t lose writes) and hopefully high performance. Obvious choices include:

AWS RDS (MySQL or PostgreSQL). You can initially deploy on a free tier instance to keep costs low
MongoDB: There are managed services you can use but costs/latencies may be prohibitive. Installing MongoDB on its own instance could be a straightforward starting point
DynamoDB: Easy to access, fast. Cheap? Depending on how you configure it. Be careful and see additional notes below.
Others - talk to us …

Bear in mind you have a balanced workload - 50% write and 50% read. This should inform your data model design.

Use the same three workloads for your client as in assignment 1, and see what throughput you can achieve.

Incorporate Load Balancing

One free tier server for your servlets will probably get pretty busy, so you will want to introduce load balancing.

You can set up AWS Elastic Load Balancing using either Application or Network load balancers. Enable load balancing with 2 free tier EC2 instances and see what effect this has on your performance. Depending on your database, you may have to allocate connections to each server so that you don’t exceed maximum connections.
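The connection allocation is simple arithmetic: divide the connections the database allows, minus a few held back for admin or monitoring sessions, by the number of servers. Here is a minimal sketch; the class and method names are hypothetical, and the numbers in the usage note are made up, so check your own database's actual limit.

```java
/** Hypothetical helper: splits a database's connection limit across load-balanced servers. */
final class ConnectionBudget {
    private ConnectionBudget() {}

    /**
     * @param dbMaxConnections    maximum connections your database accepts (look this up for your setup)
     * @param reservedConnections connections held back, e.g. for a console or monitoring session
     * @param serverCount         number of servlet servers behind the load balancer
     * @return the largest connection pool size each server can use safely
     */
    static int perServerPoolSize(int dbMaxConnections, int reservedConnections, int serverCount) {
        if (serverCount <= 0) {
            throw new IllegalArgumentException("serverCount must be positive");
        }
        int usable = dbMaxConnections - reservedConnections;
        if (usable < serverCount) {
            throw new IllegalArgumentException("not enough connections for every server");
        }
        // Integer division rounds down, so the combined pools never exceed the usable limit.
        return usable / serverCount;
    }
}
```

For example, with an assumed limit of 60 connections, 4 reserved, and 2 servers, perServerPoolSize(60, 4, 2) gives 28 connections per server.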

A tutorial here should help. Remember to create AWS templates for your instances.

Tune the System

Run your client against the load balanced servlets and see what effect it has on overall throughput.

Somewhere you will probably have a bottleneck that you can try to address. Use the available monitoring tools to find it, then think about how to remove it, for example:

Database bottleneck - increase capacity (e.g. bigger server, higher throughput configuration)
Servlet bottleneck - increase capacity (e.g. more load balanced free tier VMs, beefed up instances)

There are a lot of variables here, so do your best. See if you can increase the throughput for the 30 Thread Group client configuration.
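To judge whether a tuning change helped, compare throughput before and after as a percentage. This is a minimal sketch of that arithmetic, assuming throughput is measured as total requests divided by wall time; the class and method names are hypothetical.

```java
/** Hypothetical helper for comparing two test runs. */
final class Throughput {
    private Throughput() {}

    /** Throughput in requests per second from a request count and wall time in milliseconds. */
    static double requestsPerSecond(long totalRequests, long wallTimeMillis) {
        if (wallTimeMillis <= 0) {
            throw new IllegalArgumentException("wallTimeMillis must be positive");
        }
        return totalRequests * 1000.0 / wallTimeMillis;
    }

    /** Percentage change from the baseline run to the optimized run (negative means slower). */
    static double percentImprovement(double baselineRps, double optimizedRps) {
        if (baselineRps <= 0) {
            throw new IllegalArgumentException("baselineRps must be positive");
        }
        return (optimizedRps - baselineRps) / baselineRps * 100.0;
    }
}
```

With made-up numbers, going from 1500 to 2100 requests per second is a 40% improvement. Compare runs that use the same client configuration, otherwise the percentage mixes the effect of your change with the effect of a different load.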

Submission

1. URL for your code repo
2. A short description of your data model (5 points). Please state the size of the image used if not using the stock image, and also your database/file storage solution.
3. Output windows for the 3 client configuration tests run against a single server/DB (5 points)
4. Output windows for the 3 client configuration tests run against two load balanced servers/DB (15 points)
5. Output window for the optimized server configuration for the client with 30 Thread Groups. Briefly describe what configuration changes you made and what % throughput improvement you achieved (15 points)

Additional Notes for Submission

For submission items 3, 4, and 5, “output window” means an output window similar to A1 client part II, which contains:

Configuration used
Overall wall time and throughput
Number of successful and failed requests
Statistics (mean, median, p99, min, max, etc.) reported separately for GET and POST requests
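The per-request-type statistics listed above can be computed from the recorded latencies roughly as follows. This is a sketch under stated assumptions: latencies are in milliseconds, the class name LatencyStats is hypothetical, and percentiles use the nearest-rank method (so for an even number of samples the median is the lower of the two middle values). Other percentile definitions are equally valid.

```java
import java.util.Arrays;

/** Hypothetical summary of one request type's latencies, in milliseconds. */
final class LatencyStats {
    final double mean;
    final long median;
    final long p99;
    final long min;
    final long max;

    private LatencyStats(double mean, long median, long p99, long min, long max) {
        this.mean = mean;
        this.median = median;
        this.p99 = p99;
        this.min = min;
        this.max = max;
    }

    /** Computes the summary from raw samples; the input array is not modified. */
    static LatencyStats of(long[] latenciesMillis) {
        if (latenciesMillis.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        long[] sorted = latenciesMillis.clone();
        Arrays.sort(sorted);
        double mean = Arrays.stream(sorted).average().getAsDouble();
        return new LatencyStats(mean, percentile(sorted, 50), percentile(sorted, 99),
                sorted[0], sorted[sorted.length - 1]);
    }

    /** Nearest-rank percentile: the smallest sample with at least percent% of samples at or below it. */
    private static long percentile(long[] sorted, int percent) {
        // Ceiling of percent * n / 100 using integer arithmetic to avoid floating-point rounding.
        int rank = (int) ((percent * (long) sorted.length + 99) / 100);
        return sorted[Math.max(rank, 1) - 1];
    }
}
```

To report GET and POST separately, keep one latency array per request type and call LatencyStats.of() on each.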

A table of the results for each stage (items 3, 4, and 5), and an overall table comparing results across all three.

Optional, but highly recommended, in your submission:

A screenshot of your ALB/ELB setup
A screenshot of your database after testing
Any other screenshots that demonstrate the effect of your configuration changes for item 5, for example your servers’ CPU utilization before and after the changes, showing whether you successfully removed the bottleneck

Deadline: 11/3, 11:59pm

Addendum - DynamoDB

For those interested in DynamoDB

If you are interested in using DynamoDB, here are some hopefully useful resources:

See https://docs.aws.amazon.com/sdk-for-java/latest/developer-guide/credentials.html for documentation explaining how to set up credentials before you can make requests to AWS using the AWS SDK for Java 2.x. I have verified that the first two options in that section work well. The credentials (aws_access_key_id, aws_secret_access_key, aws_session_token) can be found on your learner’s lab page by clicking “aws details” in the top right corner.
See https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/GettingStarted.html for examples of interacting with DynamoDB using the AWS SDK for Java 2.x. I also found the API reference included in each example really helpful.

DynamoDB pricing: Pay attention to the difference between “provisioned capacity mode” and “on-demand capacity mode”. As to how to set billing mode in code, check here