Load testing for content served from SDR

laurensorensen opened this issue 2 years ago · comments

Per our 12/6/2022 meeting with Tom, he's requesting load testing for the content being served via SDR, particularly streaming media. @corylown suggested getting a day or two of stats together. Here is one day, soon after the release of the NTA documents in Spotlight, with a very high volume of traffic.

From VT exhibit on Spotlight:
October 1-2, 2021

Cathy helped me find this, and asked if we should get examples from Spotlight as a single instance (all exhibits) or if it's better to get stats from just one exhibit, as above.

Lauren Sorensen · Answer 1 · Thu Dec 08 2022 03:11:18 GMT+0800 (China Standard Time)

Some more discussion on this: https://stanfordlib.slack.com/archives/C0RK5EM9N/p1670366042288899

Marlo Longley · Answer 2 · Thu Dec 08 2022 05:10:03 GMT+0800 (China Standard Time)

@laurensorensen it seems like there are two types of load to test: traffic on the web app, and traffic for streaming media plays.
This graphic covers traffic on the web app, so basically page views. Is there any data on video playing, play clicks, or something like that?

@corylown suggested that we could maybe roughly estimate the number of video plays per page visit.

Justin Coyne · Answer 3 · Thu Dec 08 2022 05:26:33 GMT+0800 (China Standard Time)

Can I ask why we are doing load testing? What are we trying to find? If we find that we are under-provisioned, what are we going to do about it, given Julian's comments?

Marlo Longley · Answer 4 · Thu Dec 08 2022 05:36:45 GMT+0800 (China Standard Time)

@jcoyne The reason is that Tom specifically brought this up twice in meetings about the project. He is expecting a lot of video plays.

Lauren Sorensen · Answer 5 · Thu Dec 08 2022 05:37:46 GMT+0800 (China Standard Time)

@marlo-longley No data on video plays/clicks. Julian said "We’ve got no stats, I’m afraid."

Justin Coyne · Answer 6 · Thu Dec 08 2022 05:41:18 GMT+0800 (China Standard Time)

@marlo-longley Yes, I got that Tom asked for it. But what specifically does he mean? Percentage of HTTP request errors? Loading time of the playlist? Loading time of chunks? Bitrate? Time to start the stream? Lag time? Average? Percentiles? Standard deviations?

What will you do with this data after we collect it? Give me some understanding of what we want to find out, and then we can pick an appropriate test and metrics.
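For reference, here is a minimal sketch of how several of the metrics listed above (error percentage, average, standard deviation, percentiles) could be computed from one test run. The function name `summarize` and its inputs are hypothetical; it assumes the test tool can export one latency in milliseconds and one HTTP status code per request, and it treats anything outside 2xx/3xx as an error.

```python
import statistics


def summarize(latencies_ms, statuses):
    """Summarize one load-test run (hypothetical helper, not an existing tool).

    latencies_ms: one response time per request, in milliseconds (needs >= 2 values)
    statuses: one HTTP status code per request
    """
    # quantiles(n=100) returns 99 cut points; index 49 is p50, 94 is p95, 98 is p99.
    cuts = statistics.quantiles(latencies_ms, n=100, method="inclusive")
    # Assumption: anything that is not 2xx or 3xx counts as a failed request.
    errors = sum(1 for s in statuses if not 200 <= s < 400)
    return {
        "requests": len(statuses),
        "error_rate": errors / len(statuses),
        "mean_ms": statistics.mean(latencies_ms),
        "stdev_ms": statistics.stdev(latencies_ms),
        "p50_ms": cuts[49],
        "p95_ms": cuts[94],
        "p99_ms": cuts[98],
    }
```

The same summary could be run separately for playlist requests and chunk requests, since those are listed above as distinct loading times.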

Lauren Sorensen · Answer 7 · Thu Dec 08 2022 05:42:28 GMT+0800 (China Standard Time)

I exported the data on the most-viewed pages from the Google Analytics data report, and it suggests that searches on "film" or related terms happen fairly often:

9 views: film - Virtual Tribunals - Spotlight at Stanford Search Results
4 views: universum film - Virtual Tribunals - Spotlight at Stanford Search Results
2 views: Film - Virtual Tribunals - Spotlight at Stanford Search Results
1 view: Video / Language: German - Virtual Tribunals - Spotlight at Stanford Search Results
6 views: Video - Virtual Tribunals - Spotlight at Stanford Search Results
1 view: Videos - Virtual Tribunals - Spotlight at Stanford Search Results

Marlo Longley · Answer 8 · Thu Dec 08 2022 05:43:32 GMT+0800 (China Standard Time)

Nice thanks @laurensorensen -- sometimes there are also events you can see as far as clicks on a player in GA. I wonder if we can see into that at all.

Marlo Longley · Answer 9 · Thu Dec 08 2022 05:45:19 GMT+0800 (China Standard Time)

@jcoyne I love this list! Well, Tom didn't get too technical. He said something along the lines of: if we know we can handle double or quadruple the traffic of a spike day (like the day after a press release, or the highest-traffic day) for the Spotlight exhibit, he will be happy. I realize that doesn't answer your question.
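A rough way to turn "double or quadruple a spike day" into a number to aim for is sketched below. Every input is a placeholder, not a measurement: the thread has no video-play statistics, so the plays-per-view ratio and the peak-hour share are assumptions that would need to be replaced with real figures. The function name `target_load` is hypothetical.

```python
def target_load(spike_day_page_views, multiplier, plays_per_view, peak_hour_share):
    """Estimate the load a test should generate (hypothetical helper).

    spike_day_page_views: page views on the busiest observed day
    multiplier: safety factor, e.g. 2 or 4
    plays_per_view: ASSUMED fraction of page views that start a video
    peak_hour_share: ASSUMED fraction of the day's plays landing in the busiest hour
    """
    views = spike_day_page_views * multiplier
    plays = views * plays_per_view
    peak_hour_plays = plays * peak_hour_share
    return {
        "page_views": views,
        "video_plays": plays,
        "peak_plays_per_minute": peak_hour_plays / 60,
    }


# Placeholder inputs only; none of these numbers come from the analytics report.
estimate = target_load(10_000, 4, plays_per_view=0.05, peak_hour_share=0.2)
```

The peak-hour figure matters more than the daily total, because streaming load is driven by how many plays overlap in time.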

Lauren Sorensen · Answer 10 · Thu Dec 08 2022 05:47:51 GMT+0800 (China Standard Time)

> Nice thanks @laurensorensen -- sometimes there are also events you can see as far as clicks on a player in GA. I wonder if we can see into that at all.

Cathy said she copied this report interface from a colleague at Harvard and creates the reports one by one at the request of the Spotlight exhibit project. I get the impression that other data isn't very accessible, but I'm sure she wouldn't mind if you wanted to look around Google Analytics, if you're more familiar with it than I am. I'd ask her directly for credentials if you want to go that route (or I could ask, just LMK).

Justin Coyne · Answer 11 · Thu Dec 08 2022 06:10:46 GMT+0800 (China Standard Time)

@marlo-longley Do we want to spin up a cluster or EC2 instances to generate a 4x daily load? I'm not sure we're going to be able to accurately simulate something like that from a single remote laptop. Furthermore, testing something like this in production may degrade the quality of service for our real patrons. AFAIK, our stage and production environments have dissimilar hardware, network, firewall, etc., so it isn't possible to do an accurate test on stage.

Nick Budak · Answer 12 · Thu Dec 08 2022 06:24:53 GMT+0800 (China Standard Time)

I don't think there's any barrier to testing it in production right now, since production itself isn't live. We might need to give ops a heads-up, but otherwise it seems fine to start it on fire with a bunch of requests 🤷🏻

In terms of actually executing the test, I've always wanted to try something like drill, which I think is designed to be invoked from a single machine. Maybe it won't approach "real traffic", but at least it'd be enough to know if we're going to hit problems very quickly (which it sounds like might be the case).
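To make the single-machine idea concrete, here is a minimal sketch using only the Python standard library. It is a stand-in for illustration, not drill itself and not drill's configuration format. The names `fetch` and `run_load` are hypothetical, and the target URL, request count, and concurrency are all left to the caller.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from urllib.error import HTTPError, URLError
from urllib.request import urlopen


def fetch(url, timeout=10):
    """Issue one GET and return (status, elapsed_ms); status 0 means no HTTP response."""
    start = time.perf_counter()
    try:
        with urlopen(url, timeout=timeout) as resp:
            resp.read()  # download the body so transfer time is included
            status = resp.status
    except HTTPError as err:
        status = err.code  # the server answered, but with an error status
    except (URLError, OSError):
        status = 0  # connection refused, DNS failure, timeout, etc.
    return status, (time.perf_counter() - start) * 1000


def run_load(url, total_requests, concurrency, fetch_fn=fetch):
    """Send total_requests GETs to url using `concurrency` worker threads."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        return list(pool.map(fetch_fn, [url] * total_requests))
```

A run would look like `results = run_load(url, total_requests=200, concurrency=20)`, with `url` chosen by whoever runs the test. A single machine like this is limited by its own network and CPU, so it is better suited to finding early failures than to reproducing real traffic.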

Justin Coyne · Answer 13 · Thu Dec 08 2022 06:34:24 GMT+0800 (China Standard Time)

@thatbudakguy I'm speaking of the media server https://sul-mediaserver.stanford.edu/. Tom asked for "the content being served via SDR" not the VT site itself.

Marlo Longley · Answer 14 · Thu Dec 08 2022 06:58:58 GMT+0800 (China Standard Time)

So just to clarify, I think the worries about testing production that @jcoyne mentioned were about testing the production SUL media servers.

Lauren Sorensen · Answer 15 · Thu Dec 08 2022 07:47:32 GMT+0800 (China Standard Time)

To be clear, the impression I got was that the media streaming is "of special interest" re: knowing about load testing results, but he's game to hear about any load testing (content from SDR or the Arclight instance)

Lauren Sorensen · Answer 16 · Tue Dec 13 2022 06:26:17 GMT+0800 (China Standard Time)

Meeting 12/12/2022 notes: https://docs.google.com/document/d/1OJBXu3Dt7fd5OVzXm4gAuAbB2nTY8JRUYyw-k80Cl70/edit#bookmark=id.dw3kkgn4xlpx

Marlo Longley · Answer 17 · Thu Dec 15 2022 07:09:03 GMT+0800 (China Standard Time)

Closing in favor of #401 and #400