With product roadmaps evolving from learnings during 2020, we look at how tech can help with 2021 challenges.
We are joined by Matthew Gracey-McMinn from Netacea. Netacea provides a revolutionary bot management solution that protects websites, mobile apps and APIs from malicious attacks such as scraping, credential stuffing and account takeover.
2020 saw some big product releases, most notably the PS5 and Xbox. These, in turn, attracted bots disguised amongst the barrage of traffic, scalper bots in particular. These malicious bots target merchandise that is in high demand or limited supply and snap it up faster than any human user can, before selling it on for a tidy profit.
It’s estimated that over 50% of traffic on eCommerce sites is made up of bots. In this episode, we look at how bot traffic can affect the performance of your site and how to mitigate the risk.
What is bot traffic?
Matthew: Bot is a fairly broad term. It essentially describes a computer program that performs an action automatically for someone, whether that’s talking to customers or providing some level of support. Most people will be familiar with chatbots (AI) used in messaging apps and with Googlebot, Google’s web crawler.
A bot is just a simple computer program that automates a process. It can do good things automatically or it can do bad things – which is where Netacea steps in.
Scalper bots generally work by monitoring for when something becomes available, say, a PlayStation 5. They’ll be constantly making lots of requests to a website, looking for the exact moment that the PlayStation 5 goes live. Now, if a bot is monitoring perhaps once every five minutes, it’s not so different to a keen customer just hitting F5. But it can get a lot worse. Some bot operators will make thousands of requests every minute, if not every second. Having ten people connect every second isn’t so bad. Having 10,000, though, quickly starts to increase infrastructure costs significantly. You can see how even a small-scale bot could create 500 to 1,000 times the footprint of an ordinary user. And that’s not the end of it. Once the product has been identified, another bot will put it in the cart and try to complete the purchase.
Bots cause issues elsewhere, too. Bots may be involved in ‘click’ fraud: automatically clicking on ads for pay-per-click rewards or simply to mess with marketing statistics or to burn through a competitor’s marketing budget. Some botnets even launch serious DDoS attacks. Bots can cause a lot of different types of damage.
What do traffic profiles look like?
Alistair: At Tribe, we monitor peaks, whether driven by high-demand items or by influencers or traditional peaks like Black Friday or other sale periods. And we also analyse web analytics to understand what those customers are doing at that time – to see what those profiles look like. But is that a challenge from a bot perspective as well? Are bots being captured in analytics?
Matthew: Bots span the whole gamut. There are basic bots, which are often quite easy to identify, through to those that are very advanced. We’re seeing increasing professionalism in the bot-operator industry as profits go up. The PS5s really highlighted this, with some groups making more than a million pounds’ worth of profit. These are really professional outfits.
You can get a free bot off GitHub to target PS5s or shoes or rare Pokémon cards. But you also get much more expensive bots: $27,500 was the price tag on one of the more expensive ones we saw recently. That’s a business-level investment – more of an organisational tool than something for an individual hobbyist trying to get one PlayStation 5.
One computer making 10,000 requests a minute, that’s fairly easy to spot. And you’re probably going to assume that someone’s not hammering their F5 key 10,000 times a minute. But you also have slightly more advanced bots. We’re seeing an escalating ‘arms race’ between bot operators and the methods implemented to detect and defend against them.
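The "one computer, 10,000 requests a minute" case can be illustrated with a simple per-IP count over access-log entries. This is only a minimal sketch, not Netacea's method: the function name `flag_high_rate_ips`, the `max_per_minute` threshold of 600 and the sample traffic are all assumptions made up for illustration, and the IP addresses come from ranges reserved for documentation.

```python
from collections import Counter
from datetime import datetime, timedelta


def flag_high_rate_ips(requests, max_per_minute=600):
    """Return {ip: peak requests in one calendar minute} for IPs above the threshold.

    `requests` is an iterable of (ip, timestamp) pairs. `max_per_minute` is an
    assumed cut-off for illustration, not a recommended value.
    """
    per_minute = Counter()
    for ip, timestamp in requests:
        # Bucket each request into the calendar minute it arrived in.
        bucket = timestamp.replace(second=0, microsecond=0)
        per_minute[(ip, bucket)] += 1

    peaks = {}
    for (ip, _bucket), count in per_minute.items():
        if count > max_per_minute:
            peaks[ip] = max(count, peaks.get(ip, 0))
    return peaks


# Made-up traffic: one address firing 10,000 requests inside a minute,
# and one shopper refreshing every five seconds.
start = datetime(2021, 1, 1, 9, 0, 0)
bot = [("203.0.113.7", start + timedelta(milliseconds=6 * i)) for i in range(10_000)]
shopper = [("198.51.100.20", start + timedelta(seconds=5 * i)) for i in range(12)]

flagged = flag_high_rate_ips(bot + shopper)
print(flagged)  # {'203.0.113.7': 10000}
```

A check like this only catches the basic end of the spectrum; it does nothing against traffic spread across many addresses.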
To get around the giveaway of a single computer, we see a lot of attackers using proxies with rotating IPs. You get, say, 20 requests from one IP and then a couple of minutes later, you get some from another IP, and the IP changes very regularly, possibly even for each request. You can spot the low-hanging fruit from that fairly easily. If a company looks at these logs and runs a few WHOIS lookups on the IPs, it’s easy to see that they probably don’t have 5,000 genuine customers sitting in a data centre in the Netherlands looking at the site.
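Once the owners of the IP ranges are known, the "customers sitting in a data centre" check amounts to asking what share of distinct visitor addresses fall inside hosting-provider networks. The sketch below assumes the ranges have already been gathered (for example from WHOIS results); `DATA_CENTRE_NETWORKS`, `data_centre_share` and the sample addresses are hypothetical, and the ranges used are ones reserved for documentation rather than real providers.

```python
import ipaddress
from collections import defaultdict

# Hypothetical hosting-provider ranges. In practice these would be built from
# WHOIS or similar ownership data, not hard-coded.
DATA_CENTRE_NETWORKS = [ipaddress.ip_network("203.0.113.0/24")]


def data_centre_share(ips, networks=DATA_CENTRE_NETWORKS):
    """Return (share of distinct IPs inside `networks`, distinct-IP count per network)."""
    unique = {ipaddress.ip_address(ip) for ip in ips}
    by_network = defaultdict(set)
    for addr in unique:
        for net in networks:
            if addr in net:
                by_network[net].add(addr)
                break  # count each address against one network only

    matched = sum(len(addrs) for addrs in by_network.values())
    share = matched / len(unique) if unique else 0.0
    counts = {str(net): len(addrs) for net, addrs in by_network.items()}
    return share, counts


# Made-up visitors: 20 rotating addresses from one data-centre range,
# plus 5 addresses from elsewhere.
visitors = [f"203.0.113.{i}" for i in range(1, 21)]
visitors += ["198.51.100.5", "198.51.100.9", "192.0.2.44", "192.0.2.45", "192.0.2.46"]

share, counts = data_centre_share(visitors)
print(f"{share:.0%} of distinct IPs are in data-centre ranges: {counts}")
```

Counting distinct addresses rather than requests matters here, because a rotating proxy keeps the per-IP request count deliberately low.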
As a general rule, most of our customers come from residential IPs rather than data centre IPs. However, again, attackers are getting wise to this. They move up to residential proxies, in some cases using legitimate residential proxies, in other cases using illegally acquired botnets. And some of these operators aren’t even aware that they’re possibly piggybacking off illegal botnets. The criminals behind them are selling access to people who think they’re acting legitimately.
Detection gets harder and harder as you’re dealing with the more advanced groups. And in many ways, the more advanced groups are those likely to be the ones snapping up all of your products, finding ways of exploiting your APIs that give them a slight edge.
And that’s really where bot management solutions can help a lot. We use machine learning, a little bit of secret sauce in there and real-time examination of the data by human analysts who are very skilled, very experienced.
One of the interesting things we saw was that as Google searches for PS5s rose, so did the interest in scalper bots. More people are inclined to use bots when they have a need to get a product.
What are bots costing the business?
Alistair: We talked about brand. We talked about revenue. And about things like scaling your infrastructure to accommodate the traffic you’re expecting. But if over half of that traffic is bots, you’re accelerating costs to host bots that are trying to buy products that you can’t sell to your actual customers.
A lot of the work we do around the testing and scalability is around addressing actual need. Sometimes clients want to know if they can decommission some infrastructure – if they’re in fact overprovisioned. But if you can achieve a balance by cutting out 50% of your traffic from bots, then that’s a win-win for everyone.
Matthew: It usually goes down well with our customers when they’re talking to their decision-makers, the CFO, for instance, and saying, we need to cut some of our infrastructure costs because 50% of it is going to robots rather than to actual humans. It’s a huge number.
Alistair: Then there’s the impact on performance as well: the speed of the customer journey is something we’re very focussed on and talk about a lot. Something we’ve covered a lot in our earlier podcasts is that you’ve got to think about that end-to-end customer journey. And if a high number of bots are making a disproportionately high number of requests, it can massively inflate your page delivery time or your browser speed times, adversely affecting customers’ journey completion times and negatively impacting the customer experience and the brand.
How do you manage bot traffic?
The best place to start is by understanding the nature of your threat. If you know you’ve got a Black Friday sale, a PS5 release or another popular launch, and start seeing a lot of unusual traffic at an unexpected time, you can probably expect that a lot of this is going to be from bots.
If you’re aware of these threats ahead of time, you can take some steps towards protecting your site from them.
Similarly, if your marketing team have reported a lot of traffic in response to a promotion that hasn’t actually translated into as many sales as expected, that’s also a red flag.
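The traffic-versus-sales red flag can be expressed as a comparison between the conversion rate seen during a promotion and the rate the site normally sees. This is a rough sketch under stated assumptions: `conversion_red_flag`, the 2% baseline and the `tolerance` factor are illustrative choices, not figures from the discussion.

```python
def conversion_red_flag(sessions, orders, baseline_rate, tolerance=0.5):
    """Return True when observed conversion falls below `tolerance` x the baseline.

    `baseline_rate` is the site's usual orders-per-session ratio; both it and
    `tolerance` are assumptions each business would set for itself.
    """
    if sessions <= 0:
        return False  # no traffic, nothing to judge
    observed = orders / sessions
    return observed < baseline_rate * tolerance


# Made-up promotion: traffic jumps tenfold but orders barely move.
print(conversion_red_flag(sessions=100_000, orders=400, baseline_rate=0.02))  # True
# A normal day: conversion sits close to the baseline.
print(conversion_red_flag(sessions=10_000, orders=190, baseline_rate=0.02))   # False
```

A flag here is a prompt to look at the logs more closely, not proof of bot activity; a poorly targeted campaign can produce the same pattern.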
Often, the numbers just don’t add up. If stock sells out in minutes, despite limiting products to one per customer or only allowing 500 people on the website at any one time, that result feels wrong. If sales are accelerating beyond your expectations, you’re probably being targeted by bots.
Once you have that knowledge, you can start reverse engineering that attack from that point and use it to inform better decision-making in the future.