What do you do when peak demand is the new normal?
We might currently be in Covid-19-induced lockdown, but it doesn’t stop us from putting our heads together – virtually, of course – to help clients address some of their most pressing business challenges during these unprecedented times.
Obviously, for some, a sharp decline in business is cause for concern; for others, it’s the increase in demand for their products and services that’s creating additional pressure, as websites struggle to maintain availability.
At thinkTRIBE we help organisations optimise the online experience for their users all year round, solving problems and boosting brand performance. We recently undertook a performance audit to explore the practical steps clients can take to improve web pages in the short and medium term as traffic peaks persist – which sparked the idea of holding a webinar to discuss strategy.
We gathered five of thinkTRIBE’s finest testing and monitoring professionals to share their stories about the problems prompted by sudden and sustained spikes in demand and to discuss what we consider to be the most effective tactics. We also took live questions during the broadcast. These are some of the things we talked about.
Is it possible to plan for unexpected peaks?
This latest situation has placed unexpected demand on sectors that supply necessities such as groceries, as well as electronics, DIY and gardening equipment. The last time we saw anything like this was in 2014, the first time Black Friday was widely observed in the UK. Nobody really expected it to come across from the US quite that early and in the way it did. So many sites went down – even the big players were taken by surprise.
While none of us can hope to predict the future, what we can do is be ready all year round to respond quickly to sudden events. We work with clients to prepare for planned peaks and to ensure that websites are in the best condition possible at all times – which includes testing when changes are made throughout the year, not just in the run-up to an expected peak. If you can cope robustly with elevated demand, it will result in customer engagement wins further down the line.
When things start returning to normal after the lockdown, experts are predicting that some retailers will start benefiting from so-called ‘revenge shopping’. The theory is backed by experience in China, where consumers who have been largely focusing on buying essentials during the crisis emerge with a fresh determination to splash disposable income on the luxury goods they’ve been craving. We’re advising retailers in this sector to begin preparing now for the expected snapback.
It’s complicated – and it’s not getting any simpler!
The shift to cloud suppliers has accelerated in the last few years. While auto-scaling is now more common, it shouldn’t be seen as a silver bullet – it can be slow (or slower than expected) to scale and requires testing, like anything else.
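To make the point about scaling lag concrete, here is a minimal sketch of a toy simulation. Every name and number in it is a hypothetical assumption for illustration (fixed-size capacity steps, a fixed provisioning delay in minutes); it does not model any particular cloud provider's auto-scaler.

```python
def unserved_during_scale_up(demand, initial_capacity, step, delay_minutes):
    """Return, minute by minute, the requests that exceed live capacity.

    demand           -- requests per minute (illustrative figures)
    initial_capacity -- requests per minute the site can serve at the start
    step             -- capacity added by each scale-up action (assumed fixed)
    delay_minutes    -- time before requested capacity is usable (assumed fixed)
    """
    capacity = initial_capacity
    pending = []  # (minute_ready, extra_capacity) for capacity still provisioning
    shortfall = []
    for minute, load in enumerate(demand):
        # Bring online any capacity whose provisioning delay has elapsed.
        capacity += sum(extra for ready, extra in pending if ready <= minute)
        pending = [(ready, extra) for ready, extra in pending if ready > minute]
        # Request more capacity until live plus pending capacity covers the load.
        planned = capacity + sum(extra for _, extra in pending)
        while planned < load:
            pending.append((minute + delay_minutes, step))
            planned += step
        shortfall.append(max(0, load - capacity))
    return shortfall


if __name__ == "__main__":
    spike = [100, 100, 500, 500, 500, 500]
    print(unserved_during_scale_up(spike, initial_capacity=200, step=100, delay_minutes=2))
```

With these made-up figures the scaler reacts the moment the spike arrives, yet the site still falls 300 requests a minute short for two minutes while the new capacity comes online. That gap is what a load test against the real scaling configuration is meant to measure.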
Likewise, the exponential growth in the use of third parties is also impacting website functionality. We recently worked on a website with a large number of third parties where a simple code change triggered a flood of requests that overwhelmed the database. We always recommend an audit of third-party widgets and a review of dependencies, as we know there’s a risk from the back-end integration of third parties, which may not manifest itself until the site is under a high load. It’s worth putting contingencies in place in case of failure – so you can drop them out or swap replacements in as necessary – but you’ll need to test them in advance so you’re not firefighting under pressure.
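One common way to "drop out" a failing third party is a circuit breaker. The sketch below is a simplified illustration, not a description of any client's setup: the class name, the failure threshold and the fallback value are all hypothetical, and a production breaker would also need a way to retry the third party after a cool-off period.

```python
class ThirdPartyBreaker:
    """Stop calling a third party after repeated failures and serve a fallback."""

    def __init__(self, max_failures=3):
        self.max_failures = max_failures  # assumed threshold, tune per widget
        self.failures = 0

    @property
    def is_open(self):
        # "Open" means the third party has been dropped out.
        return self.failures >= self.max_failures

    def call(self, fetch, fallback):
        """Run fetch() unless the breaker is open; return fallback on any failure."""
        if self.is_open:
            return fallback
        try:
            result = fetch()
        except Exception:
            self.failures += 1
            return fallback
        self.failures = 0  # a success resets the consecutive-failure count
        return result


def broken_reviews_widget():
    # Stand-in for a third-party call that is timing out under load.
    raise TimeoutError("reviews provider did not respond")


if __name__ == "__main__":
    breaker = ThirdPartyBreaker(max_failures=3)
    for _ in range(4):
        print(breaker.call(broken_reviews_widget, fallback="reviews unavailable"))
    print("dropped out:", breaker.is_open)
```

Testing the fallback path in advance, as recommended above, means deliberately forcing the breaker open during a load test and confirming the page still works without the widget.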
Bottom line? You really need to know what your third parties are doing. It’s all very well monitoring your infrastructure but if you’re relying on someone else to provide a key front-end function that’s not going through your back-end API, you have to make sure the provider can cope with increased demand. Invest in the right preparation processes and you can engage more effectively with your third parties so you can rely on a quick response when things go wrong.
Crunching the numbers
Queueing systems have become more commonplace as businesses look to strike a balance between availability and customer experience. Even with a scalable infrastructure, you may still have bottlenecks within the system that will fail under extreme pressure, so queueing systems have their place.
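As a rough illustration of how a queueing system protects a bottleneck, the sketch below caps the number of active visitors and holds everyone else in a first-come, first-served queue. The class, its method names and the cap of two visitors are hypothetical; commercial waiting-room products are considerably more sophisticated.

```python
from collections import deque


class WaitingRoom:
    """Admit visitors up to a fixed cap; queue the rest in arrival order."""

    def __init__(self, max_active):
        self.max_active = max_active  # assumed safe concurrency for the bottleneck
        self.active = set()
        self.queue = deque()

    def arrive(self, visitor_id):
        """Return 0 if admitted straight away, else the visitor's queue position."""
        if len(self.active) < self.max_active and not self.queue:
            self.active.add(visitor_id)
            return 0
        self.queue.append(visitor_id)
        return len(self.queue)

    def leave(self, visitor_id):
        """Free a slot and admit queued visitors while there is room."""
        self.active.discard(visitor_id)
        while self.queue and len(self.active) < self.max_active:
            self.active.add(self.queue.popleft())


if __name__ == "__main__":
    room = WaitingRoom(max_active=2)
    for visitor in ["a", "b", "c", "d"]:
        print(visitor, "position", room.arrive(visitor))
    room.leave("a")
    print("active:", sorted(room.active), "queued:", list(room.queue))
```

The cap is the number to get from load testing: set it too high and the bottleneck still fails, too low and customers wait longer than they need to.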
We’ve talked before about streamlining third parties and stripping out all the non-essentials to lighten the load as much as possible. But a lot of web stacks are now so complex that it’s hard to simplify your offering with any degree of confidence. Some clients have plans in place, or a list of functions they can switch off – wish lists, recommendations and such – along with special highly cached versions of their sales product listings, to reduce the amount of load on the back end.
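A switch-off list like this can be written down as an ordered plan, so the decision is made calmly in advance rather than under pressure. In the sketch below the feature names echo the examples above, but the utilisation thresholds and the function name are purely hypothetical.

```python
# Each entry: (back-end utilisation at which to act, feature to switch off).
# Thresholds are illustrative assumptions, not recommendations.
DEGRADATION_PLAN = [
    (0.70, "recommendations"),
    (0.80, "wish_list"),
    (0.90, "live_product_listings"),  # serve the highly cached listings instead
]


def features_to_disable(utilisation):
    """Return the features that should be off at the given utilisation (0.0 to 1.0)."""
    return [name for threshold, name in DEGRADATION_PLAN if utilisation >= threshold]


if __name__ == "__main__":
    for level in (0.50, 0.85, 0.95):
        print(level, features_to_disable(level))
```

Each step in the plan is worth exercising during a load test, both to confirm the switch works and to measure how much load it actually removes.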
What does effective testing look like?
Traffic surges can happen for lots of reasons. One client recently told us about the web performance problems they experienced after a social media influencer featured one of the company’s products in a high-profile post. Effective and regular testing will help prepare for this kind of traffic – we recommend three load testing methods:
- A benchmark test looks at historical peaks, replicating traffic to repeat the same profile and measuring how the site responds, where it fails and how it fails.
- A soak test takes a figure that represents the most users that a site could host at any one time without sacrificing the user experience (UX) too much and measures how long this load could be sustained before the site starts to fail.
- A spike test hits the site hard with peak load in a very short time span to replicate a sudden surge.
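The three methods differ mainly in the shape of the load they apply. The sketch below expresses each as a list of target concurrent users per minute that could be fed to a load generator. The function names, parameters and figures are hypothetical and simplified; they are not thinkTRIBE's tooling.

```python
def benchmark_profile(historical_peak):
    """Benchmark: replay a recorded traffic shape unchanged."""
    return list(historical_peak)


def soak_profile(max_users, duration_minutes):
    """Soak: hold the highest acceptable load steady for a long period."""
    return [max_users] * duration_minutes


def spike_profile(base_users, peak_users, ramp_minutes, hold_minutes):
    """Spike: climb from base to peak over a short ramp (at least 1 minute), then hold."""
    step = (peak_users - base_users) / ramp_minutes
    ramp = [round(base_users + step * (i + 1)) for i in range(ramp_minutes)]
    return ramp + [peak_users] * hold_minutes


if __name__ == "__main__":
    recorded = [200, 450, 900, 700, 300]  # made-up users per minute from a past peak
    print("benchmark:", benchmark_profile(recorded))
    print("soak:     ", soak_profile(max_users=500, duration_minutes=6))
    print("spike:    ", spike_profile(base_users=100, peak_users=1000, ramp_minutes=3, hold_minutes=2))
```

In practice the benchmark shape would come from real analytics data, and a soak test would run for hours rather than the few minutes shown here.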
Everything we do is based on real data and carried out from a customer perspective. No-one can prepare for the unknown but if you take what you do know and use that as your baseline for the future (with a healthy percentage on top), you’ll know what your headroom looks like.
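The arithmetic behind that baseline is simple enough to write down. In this sketch the 50% uplift and the traffic figures are invented for illustration; the right percentage depends on the business.

```python
def target_load(historical_peak, uplift=0.5):
    """Load to test against: the known peak plus a percentage on top."""
    return historical_peak * (1 + uplift)


def headroom(tested_capacity, historical_peak):
    """Spare capacity above the known peak, as a fraction of that peak."""
    return (tested_capacity - historical_peak) / historical_peak


if __name__ == "__main__":
    peak = 2000      # made-up figure: concurrent users at the last known peak
    capacity = 2600  # made-up figure: where a load test showed the site degrading
    print("test up to:", target_load(peak, uplift=0.5))
    print("headroom:  {:.0%}".format(headroom(capacity, peak)))
```

Here the site has 30% headroom over its last peak, which falls short of the 50% target, so there is work to do before the next surge.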
How important is monitoring UX vs server health?
Ideally, you should be monitoring both.
Server health is the thing that will give you advance warning when you’re hitting the limits of your infrastructure – especially with some of the increasingly large stacks that we’re seeing on modern ecommerce sites.
That said, there are lots of problems that may appear to the user but won’t trigger a server alert, which makes it equally important to monitor from the user side, especially where third parties are integrated. The user’s ability to interact freely with the site is also a focus of thinkTRIBE’s load testing: we execute journeys in real browsers, checking that content is coming back properly (not just cursory checks on various HTTP exchanges). This means we can catch the odd cases that impact functionality client-side.
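The difference between a cursory HTTP check and a content check can be shown in a few lines. This is a deliberately simplified sketch that works on a response body as text; it is not a real-browser journey, and the function name and marker strings are hypothetical.

```python
def journey_step_ok(status_code, body, required_markers):
    """Pass only if the response succeeded AND contains the content a user needs."""
    if status_code != 200:
        return False
    return all(marker in body for marker in required_markers)


if __name__ == "__main__":
    # A page that returns 200 but has lost its checkout button, for example
    # because a third-party script failed to load.
    degraded_page = "<h1>Your basket</h1><p>2 items</p>"
    healthy_page = "<h1>Your basket</h1><p>2 items</p><button>Checkout</button>"
    markers = ["Your basket", "Checkout"]

    print("status-only check:", 200 == 200)
    print("degraded page:    ", journey_step_ok(200, degraded_page, markers))
    print("healthy page:     ", journey_step_ok(200, healthy_page, markers))
```

A status-only monitor would report the degraded page as healthy, which is exactly the kind of problem that is visible to the user but never triggers a server alert.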
Prepare for peaks that are significantly higher than the business projections might be. If you call it a day at +10%, say, you’ll be heading into unknown territory if you do experience a sudden spike in demand. Better to know how your site will perform now so you can plan your strategy.
With so many clients moving to cloud solutions, the kind of blocking factors associated with fixed infrastructure aren’t really there, although scaling in the cloud is not as simple as it seems. In any case, you should still test in advance because you’ll be in a sticky spot if that spike kicks in and you have to start manually scaling things up at speed.
Even if there are things you can’t fix right now, learn from your experience. Instead of mourning lost productivity, focus on how you could prevent it happening again in the future.
It’s never too early to prepare. Start a dialogue sooner rather than later with all the people who are keeping the wheels turning on your complex online infrastructure, so you can be ready for the unexpected.