CS6650 Fall 2023 Assignment 2

CS6650 Building Scalable Distributed Systems

Let’s Store Some Data

This is conceptually pretty simple. We’re going to implement the doPost() and doGet() methods so that they write to and read from a database, respectively. As always, simple things are not that simple in this course!

Client

You only need to make minor changes to your client from assignment 1 (part 2), unless, of course, it has unnecessary synchronization. If you had low throughput in assignment 1, revisit your design and implementation with your instructors. There will be a reason ;)

The required change is to print out the following after the test has finished, along with the other statistics:

We of course want to see the latter as zero, or maybe a handful at most for each test.
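The list of figures to print did not survive in this copy of the page, so the following is only a sketch under an assumption: that the two figures are the number of successful and the number of failed requests (the second being the one that should be zero). The class name `RequestTally` and its methods are hypothetical. It uses atomic counters so that client threads can record outcomes without adding the kind of synchronization that hurts throughput.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Thread-safe tally of request outcomes, shared by all client threads. */
class RequestTally {
    // Assumption: the required figures are successful and failed request counts.
    private final AtomicLong successful = new AtomicLong();
    private final AtomicLong failed = new AtomicLong();

    /** Record one completed request; 2xx status codes count as success. */
    void record(int statusCode) {
        if (statusCode >= 200 && statusCode < 300) {
            successful.incrementAndGet();
        } else {
            failed.incrementAndGet();
        }
    }

    long successful() { return successful.get(); }

    long failed() { return failed.get(); }

    /** Summary line to print alongside the other end-of-test statistics. */
    String summary() {
        return "Successful requests: " + successful.get()
             + ", failed requests: " + failed.get();
    }
}
```

Each client thread would call `record(statusCode)` once per response, and the main thread would print `summary()` after all threads have finished.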

Add a Database

You need to deploy, design and implement a database that enables you to:

  1. Persist new album information, including the image and JSON-supplied data, during the doPost() method

  2. Retrieve album information by primary key in the doGet() method.

You are free to choose any database you like that gives you the necessary safety guarantees (i.e. you can’t lose writes) and hopefully high performance. Obvious choices include:

Bear in mind you have a balanced workload - 50% write and 50% read. This should inform your data model design.
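One possible shape for such a data model is a single record per album, looked up by primary key, so that both the write path and the read path touch exactly one row or item. The sketch below is not a prescribed design: `AlbumRecord`, `AlbumStore`, and `InMemoryAlbumStore` are hypothetical names, the in-memory map merely stands in for whichever database you choose, and generating the key with a UUID is an assumption.

```java
import java.util.Optional;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

/** One stored album: the JSON-supplied data plus the raw image bytes. */
final class AlbumRecord {
    final String profileJson;
    final byte[] image;

    AlbumRecord(String profileJson, byte[] image) {
        this.profileJson = profileJson;
        this.image = image;
    }
}

/** The two operations the servlet needs: one write by doPost(), one read by doGet(). */
interface AlbumStore {
    /** Persists the record and returns its newly assigned primary key. */
    String put(AlbumRecord record);

    /** Looks up a record by primary key; empty if no such album exists. */
    Optional<AlbumRecord> get(String albumId);
}

/** In-memory stand-in for a real database, useful only for local testing. */
final class InMemoryAlbumStore implements AlbumStore {
    private final ConcurrentHashMap<String, AlbumRecord> rows = new ConcurrentHashMap<>();

    @Override
    public String put(AlbumRecord record) {
        String id = UUID.randomUUID().toString(); // assumption: server-generated key
        rows.put(id, record);
        return id;
    }

    @Override
    public Optional<AlbumRecord> get(String albumId) {
        return Optional.ofNullable(rows.get(albumId));
    }
}
```

Keeping the servlet code behind an interface like this may also make it easier to swap databases while tuning, since only the implementation class changes.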

Use the same three workloads for your client as in assignment 1, and see what throughput you can achieve.

Incorporate Load Balancing

One free tier server for your servlets will probably get pretty busy, so you will want to introduce load balancing.

You can set up AWS Elastic Load Balancing using either Application or Network load balancers. Enable load balancing with 2 free tier EC2 instances and see what effect this has on your performance. Depending on your database, you may have to allocate connections to each server so that you don’t exceed the maximum number of connections.
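The connection allocation mentioned above is simple arithmetic, sketched here under stated assumptions: the database enforces one global connection limit, a few connections are held back for admin or monitoring sessions, and the rest are split evenly across the servers behind the load balancer. The class and method names are hypothetical, and the numbers in the usage note are illustrative only.

```java
/** Splits a database's connection limit evenly across load-balanced servers. */
final class ConnectionBudget {
    private ConnectionBudget() {}

    /**
     * @param dbMaxConnections    the database's global connection limit
     * @param reservedConnections connections held back for admin or monitoring use
     * @param serverCount         number of servlet servers behind the load balancer
     * @return the largest pool size each server can use without exceeding the limit
     */
    static int perServerPoolSize(int dbMaxConnections, int reservedConnections, int serverCount) {
        if (serverCount <= 0) {
            throw new IllegalArgumentException("serverCount must be positive");
        }
        int usable = dbMaxConnections - reservedConnections;
        if (usable < serverCount) {
            throw new IllegalArgumentException("not enough connections for every server");
        }
        // Integer division rounds down, so the combined pools never exceed the limit.
        return usable / serverCount;
    }
}
```

For example, with a limit of 60 connections, 4 held in reserve, and 2 servers, `perServerPoolSize(60, 4, 2)` gives each server a pool of 28.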

A tutorial here should help. Remember to create AWS templates for your instances.

Tune the System

Run your client against the load balanced servlets and see what effect it has on overall throughput.

Somewhere you will probably have a bottleneck that you can try to address - use available monitoring tools to find this. Then think about how to remove it, i.e.:

There are a lot of variables here, so do your best. See if you can increase the throughput for the 30 Thread Group client configuration.

Submission

  1. URL for your code repo

  2. A short description of your data model (5 points) - Please state the size of the image used if not using the stock image, and also your database/file storage solution.

  3. Output windows for the 3 client configuration tests run against a single server/DB (5 points)

  4. Output windows for the 3 client configuration tests run against two load-balanced servers/DB (15 points)

  5. Output window for optimized server configuration for client with 30 Thread Groups. Briefly describe what configuration changes you made and what % throughput improvement you achieved (15 points)
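The percentage improvement asked for in item 5 can be computed as below. This is a sketch that assumes throughput is measured as total requests divided by wall time, and that the improvement is taken relative to the configuration before optimization. The class and method names are hypothetical, and the figures in the usage note are illustrative, not measured results.

```java
/** Helpers for reporting throughput and the relative gain after tuning. */
final class ThroughputReport {
    private ThroughputReport() {}

    /** Requests per second, given a request count and wall time in milliseconds. */
    static double throughput(long requests, long wallTimeMillis) {
        if (wallTimeMillis <= 0) {
            throw new IllegalArgumentException("wallTimeMillis must be positive");
        }
        return requests * 1000.0 / wallTimeMillis;
    }

    /** Percentage change from the baseline throughput to the optimized throughput. */
    static double improvementPercent(double baseline, double optimized) {
        if (baseline <= 0) {
            throw new IllegalArgumentException("baseline must be positive");
        }
        return (optimized - baseline) / baseline * 100.0;
    }
}
```

For instance, a baseline of 2000 requests per second that rises to 2500 after tuning is a 25% improvement, since (2500 - 2000) / 2000 = 0.25.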

Additional Notes for Submission

For 3, 4 and 5, the output window means an output window similar to that of A1 client part II, which contains

A table of the results for each stage for 3, 4 and 5, and an overall table comparing the results across 3, 4 and 5.

Other items that are optional but highly recommended in your submission:

Deadline: 11/3 11:59pm

Addendum - DynamoDB

For those interested in DynamoDB

If you are interested in using DynamoDB, here are some hopefully useful resources:

  1. Documentation explaining how to set up credentials before you can make requests to AWS using the AWS SDK for Java 2.x: https://docs.aws.amazon.com/sdk-for-java/latest/developer-guide/credentials.html. I have already verified that the first two options in this section work well. The credentials (aws_access_key_id, aws_secret_access_key, aws_session_token) can be found on your learner’s lab page by clicking “aws details” in the top-right corner.
  2. Examples of interacting with DynamoDB using the AWS SDK for Java 2.x: https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/GettingStarted.html. I also found the API reference included in each example really helpful.

DynamoDB Pricing: Pay attention to the difference between “provisioned capacity mode” and “on-demand capacity mode”. As to how to set the billing mode in code, check here