ULTIMATE GUIDE TO
BOT MANAGEMENT
TABLE OF CONTENTS
CHAPTER TOPIC AND KEY QUESTIONS ADDRESSED
PREFACE
1.0 OVERVIEW
    Bot Basics
    Good Bots: Beneficial to Online Businesses
    Bad Bots: Up to No Good
    HTTP vs. IoT Botnet: What’s the Difference?
    Business Impacts of Bad Bots: How Bots Impact Various Industries and Business Functions
    Industry-Specific Impacts
    Some of the Most Well-Known “Celebrity” Bot Attacks
2.0 FOUR GENERATIONS OF BOTS
    First Generation: A Closer Look
    Second Generation: A Closer Look
    Third Generation: A Closer Look
    Fourth Generation: A Closer Look
3.0 TECHNICAL OVERVIEW: BOT TYPES BY CHANNEL
    APIs
    APIs Under Siege
    Mobile Apps
    Websites
4.0 DETECTION & MITIGATION TECHNIQUES
    Detection
    Mitigation
    A Healthy Bot Management Strategy
    Deployment Options
5.0 ADDITIONAL CONSIDERATIONS
    WAF or Bot Management?
6.0 BUYERS CHECKLIST
    Key Considerations When Evaluating Bot Management Solutions
CLOSING
APPENDIX
    OWASP Top 21 Automated Threats
    How to Evaluate Bot Management Solutions
    The Top Free and Paid Web Scraping Tools and Services
    The 50 Most Common Web Scraping Tools
    Contact Us
    For More Information
BOTS TOUCH VIRTUALLY EVERY PART OF OUR DIGITAL LIVES, AND NOW ACCOUNT FOR OVER HALF OF ALL WEB TRAFFIC. [1]
Some help populate our news feeds, tell the weather, provide stock quotes and control search rankings. We use bots to book travel, access online customer support, even to turn our lights on and off and unlock our doors.
But other bots are designed for more mischievous purposes, including account takeover, content scraping, payment fraud and denial-of-service (DoS) attacks. These bots account for as much as 26% of total internet traffic, and their attacks are often carried out by competitors looking to undermine your competitive advantage, steal your information or increase your online marketing costs. [2] These “bad bots” represent one of the fastest-growing and gravest threats to websites, mobile applications and application programming interfaces (APIs).
This e-book provides an overview of evolving bot threats, outlines options for detection and mitigation and offers a concise buyers guide to help evaluate potential bot management solutions.
[1] Application Security in a Digitally Connected World, Nov. 2017, Radware
[2] Application Security in a Digitally Connected World, Nov. 2017, Radware
PREFACE
Although bots have been in use for about five decades, modern bots are a complicated bunch. Some are immensely helpful; others are harmful by design.
GOOD BOTS: BENEFICIAL TO ONLINE BUSINESSES
Good bots are legitimate bots whose actions are beneficial. These bots crawl a website to support search engine optimization (SEO), aggregation and market intelligence/analytics.
Here are some general categories of good bots and the functions they perform:
Monitoring bots (example: Pingdom) monitor websites’ uptime and system health, periodically checking and reporting on page load times and downtime duration, among other metrics.
Backlink checker bots (example: UASlinkChecker) confirm the inbound URLs that a website is getting, so marketers and SEO specialists can understand trends and optimize pages accordingly.
Social network bots (example: Facebook bot) are run by social networking websites that give visibility to your website and drive engagement on their platforms.
Partner bots (example: PayPal IPN) enable important functionality on websites that are transactional in nature.
Aggregator/Feedfetcher bots (example: WikioFeedBot) collate information from websites and keep users or subscribers updated on news, events or blog posts.
Search engine crawler bots (also known as spiders) crawl and index webpages, making them available on search engines, such as Google and Bing. You can control their crawl rates, as well as specify rules in the robots.txt. Search crawlers will then follow your rules when indexing your webpages. Search engine crawler bots are essential to indexing your webpage so that it becomes visible to everyone using the internet. Without them, most online businesses would struggle to establish their brand value and attract new customers.
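As a rough illustration of how a well-behaved crawler consults robots.txt before fetching a page, the sketch below uses Python's standard `urllib.robotparser` module. The robots.txt rules, the crawler name `ExampleCrawler` and the `shop.example` URLs are made up for this example; compliance is voluntary, so bad bots can simply ignore the file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 10
Disallow: /checkout/
Disallow: /account/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# A compliant crawler asks before each request.
product_allowed = parser.can_fetch("ExampleCrawler", "https://shop.example/products/1")
checkout_allowed = parser.can_fetch("ExampleCrawler", "https://shop.example/checkout/cart")
delay_seconds = parser.crawl_delay("ExampleCrawler")

print(product_allowed, checkout_allowed, delay_seconds)
```

A good bot would skip the disallowed URL and wait the advertised delay between requests.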
BOT BASICS
BOTS ARE AUTOMATED PROGRAMS CREATED TO PERFORM
REPETITIVE TASKS. WITH THE COMPUTING POWER AVAILABLE
TO THEM, PROGRAMMERS CAN CREATE BOTS TO EXECUTE TASKS
AT VERY HIGH SPEEDS — SO HIGH, IN FACT, IT’S UNTHINKABLE
FOR A HUMAN TO KEEP PACE WHEN DOING THE SAME TASKS.
1.0 OVERVIEW
Many of these good bots are highly beneficial, if not critical, to your business. They play important roles in establishing and maintaining your online presence and enabling a favorable customer experience. While you may decide to selectively stop one or more of them, blocking them could reduce the visibility your website gets on search engines and other social platforms.
BAD BOTS: UP TO NO GOOD
Bad bots are automated programs that don’t play by the rules. Mostly unregulated, they have a definite “malicious” pattern. For example, imagine thousands of page visits originating from a single IP address within a very short period of time. That activity stresses web servers, chokes available bandwidth and directly impacts genuine users trying to access a product or service on a website. And that’s just one example of how bad bots can wreak havoc on a website and business.
Here are some general categories of bad bots and the havoc they can wreak on business:
Scraper bots can be sent by third-party
scrapers or competitors to steal information
from your website — even content unique
to your business. That might include
product reviews, breaking news, dynamic
pricing information of products listed, a
product catalog or even user-generated
content on community forums. By
scraping your content and then publishing
it elsewhere, bots can affect your website’s
search engine rankings. There have been
instances of stolen content outranking
the original on Google search pages. This
theft directly impacts the bottom line of
companies that have invested budget and
resources to create original content.
Spam bots primarily target community
portals, blog comment sections and
lead collection forms. They arrive in the
middle of user conversations and insert
unwanted ads, links and banners. These
insertions frustrate genuine users who are
participating in forums and commenting
on blog posts. What’s more, spam bots
insert links that may be malicious — for
example, directing users to phishing
sites that may persuade them to divulge
sensitive information, such as bank
account numbers and passcodes.
Scalper bots target ticketing websites
and make bulk purchases. The modus
operandi is to buy hundreds of tickets as
soon as the bookings open and then sell
them to reseller websites at many times
the original ticket price. They emulate
humanlike behavior to remain undetected
by conventional or in-house bot detection
methodologies.
HTTP VS. IOT BOTNET: WHAT’S THE DIFFERENCE?
Depending on a person’s background, bots and botnets have different meanings. For some, the word “botnet” conjures up a picture of a distributed denial-of-service (DDoS) attack involving vast numbers of internet of things (IoT) devices. While DDoS can be a widespread attack originating from botnets, botnets are known to carry various payloads and are used in various types of attacks.
Of the known botnets, some are used to mine cryptocurrency on infected devices. Others use infected devices as anonymizing proxies to conceal attacks or illegal activity. Still others use infected devices as mail relays for massive spam campaigns. The threat emerging from botnets is only limited by the creativity of their creators.
In this guide, we discuss bots and botnets and how they are used to perform attacks against online services or to deliver a legitimate service to an online service. It is important to emphasize that legitimate bots provide advantages or services and are considered “good bots.” “Bad bots,” on the other hand, perform attacks against websites and APIs. Both types of bots leverage the same HTTP protocols, meaning that the tactics and methods to provide a righteous service can be identical to those used for malicious intent.
Attacks from bad bots are as diverse as they are frequent. They include clicker bots that perform ad fraud, spam bots that illegally post advertisements, web scrapers, and bots that try to exhaust an e-commerce site’s inventory. These are the kinds of bad bots that affect many online commercial and noncommercial services.
In the remainder of this document, we generally refer to bots and specifically indicate HTTP bots as devices or programs that primarily use the HTTP protocols for malicious or legitimate behavior against an online website or API.
BUSINESS IMPACTS OF BAD BOTS: HOW BOTS IMPACT VARIOUS INDUSTRIES AND BUSINESS FUNCTIONS
Any company with an e-commerce presence must prepare to detect and mitigate bad bots seeking to execute any or all of these attacks:
Account Takeover: Scammers use bots to make fraudulent purchases using stolen user credentials. Hackers target user accounts to harvest personal information and purchase history. They can also make unauthorized transfers of virtual currencies, from reward points and wallet money to gift cards and air miles.
Carding: Scammers use bots to test thousands of stolen credit card numbers against a merchant’s payment processes. Since owners of stolen cards can claim a refund for the fraudulent transaction, carding attacks lead to chargebacks, penalties and poor merchant history. Frequent carding activities and too many chargebacks can eventually result in a merchant being prevented from accepting credit cards.
Scraping of Pricing, Content and Inventory Information: Competitors scrape prices and product listings to attract your customers. Such aggressive tactics sabotage a retailer’s revenue stream. Scraping of unique and proprietary content is another common problem for online businesses. Duplication of exclusive content can negatively impact SEO efforts as well.
Cart Abandonment and Inventory Exhaustion: Competitors’ bots add hundreds of items to carts and abandon them later to prevent real consumers from buying products. These automated attacks create artificial inventory exhaustion, reduce sales, skew conversion rates and hurt brand reputation.
Application DDoS: Application DDoS attacks affect the availability of websites. The surge in nonhuman traffic on checkout pages can increase the load on inventory databases and payment processing resources. Botnets perform large-scale Layer 7 DDoS attacks that are often “low and slow” (that is, making just one or two hits per IP address used) to go undetected by conventional security measures. DDoS attacks also create a poor buying experience for customers.
Scalping Products and Tickets: Malicious bots are active during sales and buy valuable goods, such as consumer electronics, to resell later at a much higher price. Bots are deployed to scoop up tickets for popular events as soon as they go on sale.
Fake Account Creation: Criminals employ bots to create fake accounts to commit various forms of cybercrime, such as content spam, laundering virtual cash, spreading malware and skewing surveys and SEO.
INDUSTRY-SPECIFIC IMPACTS
Bots fuel distinctive sets of challenges for these types of organizations:
Advertising Networks and Digital Publishers: Scammers use botnets to generate false clicks and to support fraudulent displays of digital ads. Fake traffic artificially inflates advertising costs. Bots also perform retargeting fraud to illegally monetize the invalid traffic on publishing sites. Such attacks sabotage ad networks’ efforts to connect advertisers with quality inventory, help marketers reach a wider audience and offer customers more value from campaigns. Ultimately, bad bots generate invalid traffic that adversely affects an ad network’s brand reputation and undermines its claim of providing a trustworthy media buying environment.
Financial Services: Banking, financial services and insurance organizations represent high-value targets for scammers. The use of botnets to commit fraud has ramped up the speed of attacks in recent years. Hackers deploy botnets on financial institutions to take over accounts, execute DDoS attacks or scrape content. Again, large-scale sophisticated bots are often low and slow to bypass conventional security measures.
Marketplace and Classifieds: The essential asset for any classified site is fresh content and unique listings. Malicious bots sent by competitors and third-party scrapers crawl information from these websites to publish it elsewhere or even sell it. In addition to stealing new listings, malicious bots fill web forms with fake details, ending up as dead leads that don’t convert. Sales teams may waste a significant amount of time and effort chasing them. Eventually, these bad bots also skew analytics and drive down server performance.
Travel: Scraper bots are the most obvious threat for online travel websites. Scraper bots can be deployed by competitors and scammers to scrape dynamic pricing of airline tickets. Scraping pricing information can give competitors an unfair advantage and prove to be a long-term business loss. Keeping pricing safe from competitors is critical to retaining customers, partners and brand competitiveness. Another threat: fake queries for flight tickets. Costs levied by the airline global distribution system (GDS) will increase significantly because of fake GDS queries made by bots. Competitors can use these fake queries to scrape ticket prices. These fake queries cost a travel company money, and none of these queries will ever generate a legitimate booking.
SOME OF THE MOST WELL-KNOWN “CELEBRITY” BOT ATTACKS
Timeline years shown: 2019, 2018, 2017, 2016
Many bots never gain acclaim for their mischief. These, however, made headlines:
APR: Scalpers were selling tickets for the hotly anticipated movie Avengers: Endgame [3] the same day tickets went on sale, at highly inflated prices. Earlier in 2019, other victims of scalping campaigns included famous artists such as the Korean boy band BTS [4] and British singer Ed Sheeran [5]. Back in 2015, Coldplay [6] fans found their favorite band’s tickets resold at a 3,000% premium.
FEB: Ryanair, a low-cost carrier based in Ireland, filed a U.S. lawsuit against Expedia over screen scraping. The suit argues that Expedia’s unauthorized web scraping of the airline’s site violates the U.S. Computer Fraud and Abuse Act (CFAA). Ryanair further asserts that Expedia caused reputational damage to the airline by levying opaque fees on consumers and that Expedia’s unauthorized activities tax Ryanair’s website, causing poor response times and other errors. [7]
NOV: The FBI, the Department of Homeland Security, Google and other private security companies disrupted a major ad-fraud network. 3ve involved 1.7 million IP addresses and caused tens of millions of dollars in losses. At its peak, the 3ve botnet consisted of more than 700,000 compromised machines and more than 60,000 accounts selling garbage ad inventory. [8]
SEP: British Airways [9] (BA) was the victim of a data breach potentially affecting 380,000 customers. Just five lines of JavaScript code were added by the attackers to existing JavaScript libraries hosted on the web server. The five-line bot executed in the context of the client browser and hooked into the payment form submission event, giving the bot access to submitted credit card information, including the card’s verification code, before it was sent to BA’s payment processing service. The information was scraped and stored on a server created and hosted by the attackers. The attack was linked to Magecart, which previously affected a breadth of providers including AdMaxim, CloudCMS and Picreel [10] through a large-scale, automated, web-based supply chain attack. In July 2019, a Magecart group was injecting skimmer code in JavaScript libraries from third-party web suppliers that were storing their scripts in world-writable Amazon Simple Storage Service (S3) buckets, potentially affecting several thousands of websites using the services of these providers. [11]
APR: For about eight months, Panerabread.com was leaking customer records in plain text, affecting as many as 7 million people who had signed up to order food through the fast-casual chain’s website. A security vulnerability in one of the APIs that the company uses across its digital platforms made it potentially simple for anyone to scrape all available customer records using a basic script. [12]
JUN: A police raid in Thailand provided a glimpse of the underbelly of the internet: primitive metal shelving holding 500 smartphones, each wired to a computer monitor for the purposes of click fraud. This and other click fraud farms profit by generating fake traffic that makes websites, social media posts and advertisements appear to be more popular than they actually are. [13]
MAR: A McDonald’s India app, McDelivery, was leaking the personal data of more than 2.2 million users. An unprotected and publicly accessible API made it possible to get user details. That, along with serially enumerable integers as customer IDs, made it far too easy for anyone to scrape the personal data of all the site’s registered users. [14]
MAY: Cambridge Analytica scraped the personal information of 87 million U.S. residents from Facebook and leveraged data science on the scraped information to target residents with custom Facebook ad campaigns in an attempt to influence their voting behavior. [15]
[3] https://www.asiaone.com/singapore/scalpers-selling-tickets-avengers-endgame-888-carousell
[4] https://mustsharenews.com/bts-ticket-scalpers/
[5] https://theindustryobserver.thebrag.com/ed-sheeran-cancels-tickets-fight-scalpers/
[6] https://www.cnbc.com/2016/11/23/sold-out-coldplay-concert-tickets-in-singapore-being-resold-at-3000-premium-by-scalpers.html
[7] https://skift.com/2018/02/25/ryanair-files-u-s-lawsuit-against-expedia-over-screen-scraping/
[8] https://digitalguardian.com/blog/all-about-3ve
[9] https://www.riskiq.com/blog/labs/magecart-british-airways-breach/
[10] https://www.riskiq.com/blog/labs/cloudcms-picreel-magecart/
[11] https://www.riskiq.com/blog/labs/magecart-amazon-s3-buckets/
[12] https://www.csoonline.com/article/3268025/panera-bread-blew-off-breach-report-for-8-months-leaked-millions-of-customer-records.html
[13] https://www.vice.com/en_us/article/43yqdd/look-at-this-massive-click-fraud-farm-that-was-just-busted-in-thailand
[14] https://www.securityweek.com/mcdonalds-app-leaks-details-22-million-customers
[15] https://www.theguardian.com/news/2018/may/06/cambridge-analytica-how-turn-clicks-into-votes-christopher-wylie
Bots now leverage full-fledged browsers and are programmed to mimic human behavior in the way they traverse a website or application, move the mouse, tap and swipe on mobile devices and generally try to simulate real visitors to evade security systems.
This chapter looks back at the first three generations of HTTP bots and provides an overview of current fourth-generation bots.
WITH THE ESCALATING RACE BETWEEN BOT DEVELOPERS
AND SECURITY EXPERTS — ALONG WITH THE INCREASING
USE OF JAVASCRIPT AND HTML5 WEB TECHNOLOGIES — BOTS
HAVE EVOLVED SIGNIFICANTLY FROM THEIR ORIGINS AS SIMPLE
SCRIPTING TOOLS THAT USED COMMAND LINE INTERFACES.
2.0 FOUR GENERATIONS OF BOTS
First- and second-generation bots typically use very few IP addresses and make thousands of hits from each one. On the other hand, third- and fourth-generation bots can rotate through thousands of IP addresses. As a result, they make only one or two hits from each address. This evasion technique, known as “low and slow,” enables them to slip past basic security systems.
FIGURE 2. BASIC VS. SOPHISTICATED BOTS: HITS PER IP ADDRESS
FIGURE 1. EVOLUTION OF BOTS
FIRST GENERATION
• Typically use just one or two IP addresses to execute thousands of webpage visits to scrape content or spam forms
• Easy to detect and blacklist thanks to repetitive attack patterns and the small number of originating IP addresses
SECOND GENERATION
• Leverage “headless browsers” (which are essentially website development and testing tools) to tap into their ability to run JavaScript and maintain cookies
THIRD GENERATION
• Perform simple actions, such as moving the mouse, scrolling and clicking links to traverse a website
• Exhibit sophisticated behaviors that may overcome certain challenges but still cannot overcome interaction-based detection (examples: CAPTCHA or invisible challenges)
FOURTH GENERATION
• Rotate through large numbers of user agents (UAs) and device IDs, making just a few hits from each to avoid detection
• Make random mouse movements (not just in a straight line like third-generation bots) and other humanlike browsing characteristics
• Record real user interactions, such as taps and swipes, on hijacked or malware-laden mobile apps, so they can be replicated, “blend in” with human traffic and circumvent security measures
Observed November 1–30, 2018
FIRST GENERATION: A CLOSER LOOK
Definition: First-generation bots were built with basic scripting tools and make cURL-like requests to websites using a small number of IP addresses (often just one or two). They do not have the ability to store cookies or execute JavaScript, so they do not possess the capabilities of a real web browser.
Impact: These bots are generally used to carry out scraping, carding and form spam.
Mitigation: These simple bots generally originate from data centers and use proxy IP addresses and inconsistent UAs. They often make thousands of hits from just one or two IP addresses. They also operate through scraping tools, such as ScreamingFrog and DeepCrawl. They are the easiest to detect since they cannot maintain cookies, which most websites use. In addition, they fail JavaScript challenges because they cannot execute them. First-generation bots can be blocked by blacklisting their IP addresses and UAs, as well as combinations of IPs and UAs.
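The blacklisting approach for first-generation bots can be sketched as a simple counter over (IP, UA) pairs. This is a minimal illustration, not a production rule: the `Hit` record, the `build_blocklist` helper and the threshold of 1,000 hits are assumptions chosen for the example, and the addresses come from documentation-reserved ranges.

```python
from collections import Counter
from typing import Iterable, NamedTuple


class Hit(NamedTuple):
    """One request as it might appear in an access log (hypothetical shape)."""
    ip: str
    user_agent: str


def build_blocklist(hits: Iterable[Hit], max_hits: int = 1000) -> set[tuple[str, str]]:
    """Return the (IP, UA) combinations that made more than max_hits requests."""
    counts = Counter((hit.ip, hit.user_agent) for hit in hits)
    return {pair for pair, count in counts.items() if count > max_hits}


# Sample traffic: one script hammering the site, one ordinary visitor.
hits = (
    [Hit("203.0.113.7", "curl/7.58.0")] * 1500
    + [Hit("198.51.100.4", "Mozilla/5.0")] * 12
)
blocklist = build_blocklist(hits, max_hits=1000)
print(blocklist)
```

A rule this simple only works because first-generation bots concentrate their traffic on very few addresses; it does nothing against bots that spread requests thinly across many IPs.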
FIGURE 3. DISPARITY OF NUMBER OF IPs USED BY SOPHISTICATED BOTS (GENERATION 3 AND 4) VERSUS BASIC BOTS (GENERATION 1 AND 2)
FIGURE 4. THE MAJORITY OF BAD BOT TRAFFIC TO THESE WEBSITES WERE FIRST-GENERATION BOTS
DOMAIN        BOT %
dtlbs.ru      97.50%
alibaba.com   93.94%
ovh.com       90.18%
These stats are from across our client base, January–December 2018.
FIGURE 5. OVERVIEW OF OUTDATED WEB BROWSERS USED BY FIRST-GENERATION BOTS
First-generation bots usually have UAs from
outdated versions of popular browsers such as
Google Chrome, Firefox and Internet Explorer.
They cannot run JavaScript or store cookies.
FIGURE 6. CONTRAST BETWEEN FIRST- AND SECOND-GENERATION BOTS
Second-generation bots (plotted in red) can load cookies and
JavaScript. Compared to first-generation bots (plotted in blue),
generally their visits have a longer session length, although they
make fewer hits per session.
Source data is from ShieldSquare subscriber IDs.
FIGURE 7. TRAFFIC VARIATIONS ACROSS SECTIONS OF A WEBSITE CAN REVEAL TELLTALE SIGNS OF AN ATTACK
Most real website and app users exhibit consistent patterns. They generally start at the login page and then move
on to search pages or product pages. They typically conclude by adding products to the shopping cart and paying
for their purchase — or exiting the website without buying. Bots programmed to carry out account takeover and
scraping attacks have page traversal patterns noticeably different from those of genuine visitors.
SECOND GENERATION: A CLOSER LOOK
Definition: These bots operate through website development and testing tools known as “headless” browsers (examples: PhantomJS and SimpleBrowser), as well as later versions of Chrome and Firefox, which allow for operation in headless mode. Unlike first-generation bots, they can maintain cookies and execute JavaScript. Botmasters began using headless browsers in response to the growing use of JavaScript challenges in websites and applications.
Impact: These bots are used for application DDoS attacks, scraping, form spam, skewed analytics and ad fraud.
Mitigation: These bots can be identified through their browser and device characteristics, including the presence of specific JavaScript variables, iframe tampering, sessions and cookies. Once the bot is identified, it can be blocked based on its fingerprints. Another method of detecting these bots is to analyze metrics and typical user journeys and then look for large discrepancies in the traffic across different sections of a website. Those discrepancies can provide telltale signs of bots intending to carry out different types of attacks, such as account takeover and scraping (see Figure 7).
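The section-by-section comparison described for second-generation bots can be approximated by comparing each section's share of traffic against a baseline. The sketch below is one simple way to do that under stated assumptions: the section names, the hit counts and the 3x ratio threshold are all invented for illustration.

```python
def section_shares(hits_by_section: dict[str, int]) -> dict[str, float]:
    """Convert raw hit counts into each section's share of total traffic."""
    total = sum(hits_by_section.values())
    if total == 0:
        return {}
    return {section: count / total for section, count in hits_by_section.items()}


def flag_discrepancies(
    baseline: dict[str, int], observed: dict[str, int], ratio: float = 3.0
) -> list[str]:
    """List sections whose traffic share is at least `ratio` times the baseline share."""
    base = section_shares(baseline)
    flagged = []
    for section, share in section_shares(observed).items():
        expected = base.get(section, 0.0)
        # A section with no baseline traffic at all is flagged outright.
        if expected == 0.0 or share / expected >= ratio:
            flagged.append(section)
    return sorted(flagged)


# Hypothetical counts: a normal day versus a day with a login-heavy surge.
baseline = {"login": 200, "search": 500, "product": 250, "checkout": 50}
observed = {"login": 700, "search": 150, "product": 130, "checkout": 20}
flagged_sections = flag_discrepancies(baseline, observed)
print(flagged_sections)
```

A disproportionate share of hits on the login section, as in this sample, is the kind of discrepancy that could point to an account takeover attempt; a surge concentrated on product pages would be more suggestive of scraping.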
THIRD GENERATION: A CLOSER LOOK
Definition: These bots use full-fledged browsers (dedicated or hijacked by malware) for their operation. They can simulate basic humanlike interactions, such as simple mouse movements and keystrokes. However, they may fail to demonstrate humanlike randomness in their behavior.
Impact: Third-generation bots are used for account takeover, application DDoS, API abuse, carding and ad fraud, among other purposes.
Mitigation: Third-generation bots are difficult to detect based on device and browser characteristics. Interaction-based user behavioral analysis is required to detect such bots, which generally follow a programmatic sequence of URL traversals. Figure 8 outlines the attack strategy that third-generation bots use to carry out scraping attacks.
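One narrow slice of that behavioral analysis, spotting a programmatic sequence of URL traversals, can be sketched by grouping sessions that walk exactly the same path. This is a deliberately simplified stand-in for real behavioral analysis; the `repeated_paths` helper, its thresholds and the sample sessions are assumptions made for the example.

```python
from collections import defaultdict


def repeated_paths(
    sessions: dict[str, list[str]], min_sessions: int = 3, min_length: int = 3
) -> dict[tuple[str, ...], list[str]]:
    """Group session IDs by identical URL sequence and keep sequences many sessions share."""
    by_path: dict[tuple[str, ...], list[str]] = defaultdict(list)
    for session_id, urls in sessions.items():
        # Very short visits are too common among real users to be meaningful.
        if len(urls) >= min_length:
            by_path[tuple(urls)].append(session_id)
    return {path: ids for path, ids in by_path.items() if len(ids) >= min_sessions}


# Hypothetical sessions: three walk the catalog in lockstep, one browses freely.
sessions = {
    "s1": ["/catalog?page=1", "/catalog?page=2", "/catalog?page=3"],
    "s2": ["/catalog?page=1", "/catalog?page=2", "/catalog?page=3"],
    "s3": ["/catalog?page=1", "/catalog?page=2", "/catalog?page=3"],
    "s4": ["/", "/search?q=shoes", "/product/42", "/cart"],
}
suspects = repeated_paths(sessions)
print(suspects)
```

Exact-match grouping is easy to evade with small variations, so in practice it would be one signal among many rather than a verdict on its own.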
FIGURE 8. EXAMPLE OF A THIRD-GENERATION BOT SCRAPING ATTACK
In this representation of a large attempted scraping attack on a retailer, third-generation bots leveraged multiple
IP addresses and user agents and originated from several ISPs across the globe in a coordinated manner.
FOURTH GENERATION: A CLOSER LOOK
Definition: The latest generation of bots have advanced humanlike interaction characteristics, including moving the mouse pointer in a random, humanlike pattern instead of in straight lines. These bots also can change their UAs while rotating through thousands of IP addresses. There is growing evidence that points to bot developers carrying out “behavior hijacking”: recording the way in which real users touch and swipe on hijacked mobile apps to more closely mimic human behavior on a website or app. Behavior hijacking makes them much harder to detect, as their activities cannot easily be differentiated from those of real users. What’s more, their wide distribution is attributable to the large number of users whose browsers and devices have been hijacked.
Impact: Fourth-generation bots are used for account takeover, application DDoS, API abuse, carding and ad fraud.
Mitigation: These bots are massively distributed across tens of thousands of IP addresses, often carrying out “low and slow” attacks to slip past security measures. Detecting these bots based on shallow interaction characteristics, such as mouse movement patterns, will result in a high number of false positives. Prevailing techniques are therefore inadequate for mitigating such bots. Machine learning-based technologies, such as intent-based deep behavioral analysis (IDBA), which uses semi-supervised machine learning models to identify the intent of bots with the highest precision, are required to accurately detect fourth-generation bots with zero false positives.
Such analysis spans the visitor’s journey through the entire web property, with a focus on interaction patterns, such as mouse movements, scrolling and taps, along with the sequence of URLs traversed, the referrers used and the time spent at each page. This analysis should also capture additional parameters related to the browser stack, IP reputation, fingerprints and other characteristics.
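To make the inputs to such analysis more concrete, the sketch below summarizes one visitor journey into a handful of numeric features. It is not IDBA or any vendor's model: the `PageView` record, the feature names and the sample values are assumptions for illustration, and a real system would feed many more signals into a trained model.

```python
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass
class PageView:
    """One step of a visitor journey (hypothetical record shape)."""
    url: str
    referrer: str
    seconds_on_page: float
    mouse_events: int
    scroll_events: int


def session_features(views: list[PageView]) -> dict[str, float]:
    """Summarize a journey into features a behavioral model could consume."""
    if not views:
        raise ValueError("a session needs at least one page view")
    dwell = [view.seconds_on_page for view in views]
    interactions = sum(view.mouse_events + view.scroll_events for view in views)
    return {
        "pages": float(len(views)),
        "distinct_urls": float(len({view.url for view in views})),
        "mean_dwell_s": mean(dwell),
        # Near-zero spread in dwell time hints at scripted pacing.
        "dwell_stdev_s": pstdev(dwell),
        "interactions_per_page": interactions / len(views),
        "missing_referrer_ratio": sum(1 for view in views if not view.referrer) / len(views),
    }


journey = [
    PageView("/login", "", 2.0, 4, 0),
    PageView("/account", "/login", 2.0, 4, 0),
    PageView("/account/cards", "/account", 2.0, 4, 0),
]
features = session_features(journey)
print(features)
```

Perfectly uniform timing and interaction counts, as in this sample, are unusual for people; no single feature is conclusive, which is why the text calls for models that weigh the whole journey.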
FIGURE 9. OVERVIEW OF IP ADDRESSES USED BY FOURTH-GENERATION BOTS, WHICH USE MANY IPs BUT ONLY MAKE A FEW HITS FROM EACH ONE
Operating through a large and globally distributed number of IP addresses and often hijacked browsers, fourth-generation
bots can sneak past basic security systems by making very few hits per IP address used. This approach allows their
traffic to blend in with that of real users.
FIGURE 10. OVERVIEW OF HOW FOURTH-GENERATION BOTS ROTATE THROUGH THOUSANDS OF IP ADDRESSES AND DEVICE IDs TO EVADE DETECTION
FIGURE 11. EXAMPLE OF AN ATTACK BY A FOURTH-GENERATION BOT FOR ACCOUNT TAKEOVER
To be camouflaged among genuine visitors,
fourth-generation bots present themselves with
a multitude of IP addresses and device IDs. This
example of an actual attempted bot attack reveals
how they leverage multiple IP addresses while using
one device ID, as well as employing thousands of
device IDs while using just one IP address.
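The two patterns described here (one device ID seen across many IP addresses, and one IP address presenting many device IDs) can be measured with a simple fan-out count. The sketch below assumes request logs reduced to (IP, device ID) pairs; the `fan_out` and `suspicious` helpers, the thresholds and the sample data are hypothetical.

```python
from collections import defaultdict
from typing import Iterable


def fan_out(events: Iterable[tuple[str, str]]) -> tuple[dict[str, set[str]], dict[str, set[str]]]:
    """Map each device ID to the IPs it used, and each IP to the device IDs it presented."""
    ips_per_device: dict[str, set[str]] = defaultdict(set)
    devices_per_ip: dict[str, set[str]] = defaultdict(set)
    for ip, device_id in events:
        ips_per_device[device_id].add(ip)
        devices_per_ip[ip].add(device_id)
    return ips_per_device, devices_per_ip


def suspicious(mapping: dict[str, set[str]], threshold: int) -> list[str]:
    """Return keys whose fan-out reaches the threshold."""
    return sorted(key for key, values in mapping.items() if len(values) >= threshold)


# Hypothetical events: device "d-rot" hops across 50 IPs, and one IP shows 40 device IDs.
events = [(f"192.0.2.{i}", "d-rot") for i in range(50)]
events += [("203.0.113.9", f"d-{i}") for i in range(40)]
events += [("198.51.100.1", "d-home")] * 5

ips_per_device, devices_per_ip = fan_out(events)
rotating_devices = suspicious(ips_per_device, threshold=20)
crowded_ips = suspicious(devices_per_ip, threshold=20)
print(rotating_devices, crowded_ips)
```

Shared networks such as offices or mobile carriers can legitimately show many devices behind one IP, so thresholds like these would need tuning against real traffic.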
An actual account takeover attempt on a client’s
website reveals the multitude of ISPs, geographical
origins, domains, user agents and cookies that
were leveraged along with millions of keystrokes
and mouse movements (obtained through behavior
hijacking) to make more than 4 million hits.
WAYS IN WHICH BOTS CAN INFILTRATE THE THREE MAIN
CHANNELS OF THE ONLINE WORLD:
• APIs, which provide access to machine-to-machine communications and frequently involve high volumes of calls, making them easier targets for bot attacks
• Mobile apps, which can be exploited by bots to wage attacks on online services that the apps interact with to provide content to users
• Websites, which present content in human-readable formats, which must be scraped and translated into machine-readable formats for use by bots
These channels are highly interconnected, with APIs playing a major role and fueling major risks when it comes to bot management.
3.0 TECHNICAL OVERVIEW: BOT TYPES BY CHANNEL
BOTS TARGET ALL CHANNELS
APIs MOBILE APPS WEBSITES
APIs
APIs are software intermediaries that make it possible for systems to communicate with each other. Use of web APIs has grown exponentially since 2005 (see Figure 12). In fact, it’s safe to say that APIs are, and will continue to be, everywhere. They are critical enablers of countless systems and services. APIs power organizations’ back-end systems, mobile apps and increasingly even websites, and will become even more critical as the IoT continues to connect everything from toasters to cars.
API growth has been significant to date, and several industry trends will further fuel use of APIs:
• IoT: The industry is poised for an IoT explosion through 5G. IoT and industrial IoT solutions will connect to clouds directly. Cloud APIs supporting telemetry registration (sensors) and rich functionality (example: If This Then That (IFTTT), Alexa and Google Assistant) will be exposed directly.
• Mobile apps: IoT and 5G networking will also fuel the growth in mobile applications and their reliance on APIs.
• Cloud migrations: As these migrations continue accelerating, multicloud environments are a fact of life. Applications can mix and match best-in-class services from Amazon Web Services (AWS), Azure and Google Cloud. All these interconnecting cloud applications require APIs.
As organizations create APIs to power their businesses, they sometimes decide to make the API available for sale and use by other enterprises. This practice is fueling a fast-growing “API economy” alongside the trends in IoT, 5G and the cloud.
FIGURE 12. GROWTH IN WEB APIs SINCE 2005 [16]
[16] https://www.programmableweb.com/news/apis-show-faster-growth-rate-2019-previous-years/research/2019/07/17
The trouble is, APIs can be highly vulnerable, making them frequent attack targets. And because APIs are powered by machine-to-machine communication, it can be far more difficult to determine if an API call is originating from a good source for a helpful purpose or from a bad actor with ill intentions for your business and your customers. That’s because APIs are built for machines to talk to, which makes it easier for bots to interact with an API than with a website. Bots don’t have to mimic users or scrape and decode PDFs or HTML tables; they can simply “speak” the computer language with the API and obtain all the information they need.
Mobile apps, websites and even desktop applications regularly rely on third-party data or functionalities that they consume through web APIs. Web APIs provide applications with otherwise inaccessible resources, such as access to global social networks (examples: web APIs provided by Twitter, Facebook or LinkedIn), advanced machine learning capabilities (examples: web APIs provided by IBM Watson or Google Cloud’s AI) or complex transaction processing (examples: web APIs by Stripe or PayPal for payment processing or the Flight Booking API).
Although most companies are well-versed as to where their web applications reside, they may have little to no visibility into the full complement of APIs on which their businesses depend.
In other words, application developers now rely heavily on third parties (entities beyond their control sphere) for core functionality of their applications. User experience and, by extension, application reputation are directly affected by actions and nonactions of the API provider(s). Service-level agreements (SLAs) might come into play for commercial API offerings, but by and large, developers are no longer in control of their apps. And APIs make it much more difficult to distinguish good bots from bad bots.
APIs UNDER SIEGE
Scammers exploit API vulnerabilities to steal sensitive data, including user information and business-critical content. Modern application architecture trends (such as mobile devices, use of cloud systems and microservice design patterns) complicate security of APIs because they involve multiple gateways to facilitate interoperability among diverse web applications. What’s more, extensive deployment of internal APIs, combined with mobile access and increased dependence on cloud-based APIs, means that web application security defense systems that defend only the external perimeter are ineffective. Also, as businesses continually add and consume new APIs, API security cannot be a one-time exercise.
Following is a roundup of the most common bot attacks against APIs:
Application DDoS Attacks: Attackers overwhelm APIs by sending traffic from multiple clients. They target business-critical services, including login services, session management and other services vital to application reliability. Attackers also generate API calls that require extensive resources and affect server response time. Detecting and filtering unwanted traffic, including requests from automation scripts, are essential for stopping DDoS attacks on Layer 7. The key is analyzing every API request, including payload and HTTP headers, to identify anomalous behavior patterns and performing intent analysis to understand the actual intent behind an API request to filter bad API calls.
Account Takeover
Hackers deploy botnets to programmatically send API calls to test stolen credentials. Although API management systems reject invalid login attempts, these systems are incapable of stopping bot herders from trying different combinations of credentials using multiple IPs. Hackers also keep the API requests below the rate limit to make it difficult for conventional API security measures to detect such sophisticated account takeover attempts. It is important to accurately distinguish between genuine login attempts and malicious credential stuffing attacks.
Web Scraping
Scrapers extract data from APIs and execute automated form filling. Hackers reverse-engineer web and mobile apps to hijack API calls and scrape content.
MOBILE APPS
With the ubiquitous adoption of the mobile internet, cyberattackers have found yet another attack surface to exploit. Attackers target mobile apps to prey on business-critical data, customers’ personal information, credentials and payment card details. These bots change their identity, behavior and IP address to operate under permissible limits of conventional security measures. Additionally, mobile traffic characteristics are less predictable than web browsers’ traffic. Tackling such sophisticated bots requires an advanced approach that improves its logic faster than continuously evolving bot patterns.
Hackers abuse mobile apps by creating virtual machines: that is, virtualizing a smartphone, running the app on that virtual machine and then duplicating it 1,000 times or more. It’s a perfect formula for committing click fraud using a legitimate mobile app. And hackers can usually get away with it unless and until the app developer implements a capability to detect whether the app is running on an iPhone or Android device or in a virtual machine environment.
More advanced hackers take the tactics to another level: reverse-engineering a mobile app to reuse the API calls it makes and placing those within their own app. This raises the need for app owners to distinguish the source of calls against their APIs. And it requires the ability to verify where the app is compiled. If it hasn’t been compiled in a safe and approved environment, those API calls should be ignored.
WEBSITES
Bad bot tools are a dime a dozen, making it relatively easy for attackers to acquire the resources they need to spam, scrape, scalp or otherwise abuse your website. Web scraping has emerged as one of the predominant uses for bots. Programs that execute these content-stealing attacks are as diverse as they are popular, ranging from simple, manually tuned scripts to highly automated, cloud-based services. Other tools serve in a supporting role.
This piece, The Top Free and Paid Web Scraping Tools and Services, provides an overview of the top web scraping tools, cloud-based services and IP rotation solutions currently used to conduct web scraping attacks.
4.0 DETECTION & MITIGATION TECHNIQUES
WHEN IT COMES TO DETECTION AND MITIGATION, SECURITY
AND MEDICAL TREATMENT HAVE MORE IN COMMON THAN
YOU MIGHT THINK. BOTH REQUIRE CAREFUL EVALUATION
OF THE RISKS, TRADE-OFFS AND IMPLICATIONS OF FALSE
POSITIVES AND FALSE NEGATIVES.
In medicine as well as security, detecting a problem is not the same as detecting the problem. For example, although it’s easy to identify a high fever, the presence of a fever does not clearly indicate a certain disease and, by extension, a course of treatment. The same holds true when diagnosing bot activity. You may be able to see a higher-than-usual volume of bot traffic, but that doesn’t reveal bot intent or risk factors.
In both disciplines, it’s critical to use the right treatment or tool for the problem at hand. Taking antibiotics when you have a viral infection can introduce unwanted side effects and does nothing to resolve your illness. Similarly, using CAPTCHA isn’t a cure-all for every bot attack. It simply won’t work for some bot types, and if you deploy it broadly, it’s sure to cause negative customer experience “side effects.”
And in both medicine and security, treatment is rarely a one-size-fits-all exercise. Treating or mitigating a problem is an entirely different exercise from diagnosing or detecting it. Figuring out the “disease” at hand may be long and complex, but effective mitigation can be surprisingly simple. It depends on several variables and requires expert knowledge, skills and judgment.
Medicine is increasingly moving to a “personalized” approach based on a patient’s specific genetic and lifestyle variables. In the realm of bot management, security needs to do the same. Every business is running a distinct set of websites, applications and mobile apps and relies on a unique set of APIs. And each has different priorities and levels of comfort with false positives and false negatives.
The only way to meet those nuanced requirements is with a tailored approach that leverages machine learning to continually monitor the “genetics” of your business, the environment in which it operates and the attackers seeking to harm it.
A closer look at bot detection and mitigation will explain why.
DETECTION
On the surface, bot detection seems simple: You want to accurately detect bad bots with a low rate of false positives (to avoid blocking legitimate human users and good bots) and a low rate of false negatives (to ensure that you’re detecting ALL bad bots). Go below the surface though, and the challenges of detection become much more complex.
There’s a good reason why analyst firm Forrester has cited attack detection as one of the major selection considerations for bot management solutions.17 The quality of detection determines the quality of the solution. And as attacking bots become ever more sophisticated, detection becomes ever more challenging.
Chapter 2 presented the four generations of bots, highlighting the increasing levels of sophistication over time. For simpler bots, the detection process will be comparatively fast, easy and inexpensive. However, as bots become more sophisticated, detection will become longer, more challenging and more costly (see Figure 13).
17 The Forrester New Wave: Bot Management, Q3 2018
FIRST GENERATION: IP and HTTP header combinations
  BOTS: Script bot
  TECHNOLOGY: Blacklists (IP, UA)
SECOND GENERATION: Device fingerprints (format checks, logical checks across attributes, reputation database)
  BOTS: Headless browser bot
  TECHNOLOGY: Device/browser (cookie, JS, fingerprinting)
THIRD GENERATION: Shallow machine learning
  BOTS: Humanlike bot
  TECHNOLOGY: Interaction, shallow (mouse movement & keystroke anomalies)
FOURTH GENERATION: Deep machine learning
  BOTS: Distributed bot
  TECHNOLOGY: Intent, deep (correlation in intent signatures across devices)
FIGURE 13. MITIGATION OPTIONS BY BOT GENERATION
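To make the first-generation row of Figure 13 concrete, here is a minimal sketch of a blacklist check on IP address and User-Agent (UA). The blocked network, the User-Agent substrings and the function name are illustrative assumptions, not values taken from this guide or from any product.

```python
import ipaddress

# Hypothetical blacklist entries; a real deployment would load these from a feed.
BLOCKED_NETWORKS = [ipaddress.ip_network("203.0.113.0/24")]
BLOCKED_UA_SUBSTRINGS = ("python-requests", "curl", "scrapy")


def is_first_generation_bot(client_ip: str, user_agent: str) -> bool:
    """Return True when the IP or User-Agent matches a blacklist entry."""
    address = ipaddress.ip_address(client_ip)
    if any(address in network for network in BLOCKED_NETWORKS):
        return True
    ua = user_agent.lower()
    # An empty User-Agent is treated as suspicious in this sketch.
    return ua == "" or any(token in ua for token in BLOCKED_UA_SUBSTRINGS)
```

A check like this is cheap, which matches the point that simple bots are fast and inexpensive to detect. It does nothing against a bot that presents a browser-like User-Agent from an unlisted address, which is why the later generations in the figure need other techniques.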
User Behavioral Analysis
To illustrate these points, consider the example of a bot attack aimed at cracking passwords. A bot management solution could apply several methodologies to detect the attack by:
1. Identifying the average activity rates and abnormal rates of unsuccessful login attempts. Unfortunately, this approach is not sufficiently accurate and, more important, does not identify the attack source. Thus, any mitigation will be ineffective or will have a significant customer experience impact.
2. Looking at each source IP address and correlating activity over time to allow detection of active IPs generating unsuccessful login attempts. However, if the attack source is dynamically rotating its IP addresses, this methodology will be blind to the attack.
3. Correlating the activity over time for each source by device fingerprint. But again, if the attack source is dynamically modifying its device fingerprint, the methodology will miss the mark.
A more sophisticated detection will correlate activity over time across IPs, device fingerprints, mobile device attributes and sensors, as well as other attributes, to provide comprehensive analysis for accurate attack source detection.
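The difference between correlating failed logins by IP address and by device fingerprint can be shown with a small sketch. The event shape, the threshold and the sample addresses are assumptions made for illustration; the fingerprint is assumed to be supplied by a separate fingerprinting component.

```python
from collections import Counter
from typing import Iterable, NamedTuple


class FailedLogin(NamedTuple):
    ip: str
    fingerprint: str  # assumed to come from a fingerprinting component


def flag_sources(events: Iterable[FailedLogin], threshold: int) -> dict[str, set[str]]:
    """Count failed logins per IP and per fingerprint; flag keys at or over threshold."""
    events = list(events)
    by_ip = Counter(e.ip for e in events)
    by_fp = Counter(e.fingerprint for e in events)
    return {
        "ip": {ip for ip, n in by_ip.items() if n >= threshold},
        "fingerprint": {fp for fp, n in by_fp.items() if n >= threshold},
    }


# One device rotating through six IP addresses, one failure per address.
attack = [FailedLogin(f"198.51.100.{i}", "fp-a1") for i in range(6)]
flags = flag_sources(attack, threshold=5)
```

With this input the per-IP view flags nothing, while the per-fingerprint view flags `fp-a1`. An attacker that also rotates its fingerprint would slip past both counters, which is the case for correlating across several attributes at once.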
MITIGATION
Blocking bots may seem like the obvious approach to mitigation; however, mitigation isn’t always about eradicating bots. Instead, you can focus on managing them. What follows is a roundup of mitigation techniques worth consideration:
1. Feed fake data to the bot. Keep the bot active and allow it to continue attempting to attack your app. But rather than replying with real content, reply with fake data. You could reply with modified faked values (that is, wrong pricing values). In this way, you manipulate the bot to receive the value you want rather than the real price. Another option is to redirect the bot to a similar fake app, where content is reduced and simplified and the bot is unable to access your original content.
2. Challenge the bot with a visible CAPTCHA. CAPTCHA can function as an effective mitigation tool in some scenarios, but you must use it carefully. If detection is not effective and accurate, the use of CAPTCHA could have a significant usability impact. Since CAPTCHA is a challenge by nature, it may also help improve the quality of detection. After all, clients who resolve a CAPTCHA are more than likely not bots. On the other hand, sophisticated bots may be able to resolve CAPTCHA. Consequently, it is not a bulletproof solution.
3. Use throttling. When an attack source is persistently attacking your apps, a throttling approach may be effective while still allowing legitimate sources access to the application in a scenario of false positives.
4. Implement an invisible challenge. Invisible challenges can involve an expectation to move the mouse or type data in mandatory form fields, actions that a bot would be unable to complete.
5. Block the source. When a source is being blocked, there’s no need to process its traffic, no need to apply protection rules and no logs to store. Considering that bots can generate more than 90% of traffic for highly attacked targets and applications, this cost savings may be significant. Thus, this approach may appear to be the most effective and cost-efficient approach. The bad news? A persistent attack source that updates its bot code frequently may find this mitigation easy to identify and overcome. It will simply update the bot code immediately, and in this way, a simple first-generation bot can evolve into a more sophisticated bot that will be challenging to detect and block in future attack phases.
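Of the techniques above, throttling is the easiest to sketch. The token bucket below is one common way to implement it and is offered as an assumption; the guide does not prescribe an algorithm. The class name, the parameters and the optional `now` argument (added so the behavior can be checked without waiting) are all illustrative.

```python
import time
from typing import Optional


class TokenBucket:
    """Allow bursts of up to `capacity` requests, refilled at `rate` tokens per second."""

    def __init__(self, rate: float, capacity: float, now: Optional[float] = None) -> None:
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity
        self.updated = time.monotonic() if now is None else now

    def allow(self, now: Optional[float] = None) -> bool:
        """Refill for the elapsed time, then spend one token if one is available."""
        now = time.monotonic() if now is None else now
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

Keeping one bucket per suspected source slows a persistent attacker without cutting it off entirely, so a legitimate source that was flagged by mistake still gets through at a reduced rate.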
A HEALTHY BOT MANAGEMENT STRATEGY
Here’s an overview of the basic functionality you need to mitigate (or manage) bots:
1. A session is a single context from a single user or client accessing your app. A bot manager must add a cookie in the web environment or a token in the API environment in order to monitor and analyze session context.
2. A bot manager must correlate all the behaviors of all sources across all sessions for the purpose of attack detection. Those behaviors should include volume, nature, frequency of transactions and navigation flow.
3. A bot manager should be able to uniquely identify sources. Consider the simple example of an attacker trying to crack a particular user’s password. Suppose it tries three times to log in with a dictionary password before switching to another IP. In such a scenario, IP-based identification of the attack source is ineffective, and you’re blind to the attack.
To correlate across those multiple attack attempts, you need a device fingerprint to gather IP-agnostic information. Even if the same attack source uses a dictionary of the 1,000 most common passwords and keeps switching IP addresses, you need the ability to identify the behavior and the context over multiple sessions. To do so requires you to embed device fingerprint JavaScript into the secured application or into the application responses. In other words, there is a need to modify the response if JavaScript is not embedded into the application.
Finally, while device fingerprinting is effective in a web environment, a mobile device that may not execute JavaScript requires a different approach. In that case, you need a collection of mobile device sensor data for source identification. By integrating the application with a mobile software development kit (SDK), you can enable access to mobile device sensor data.
“It’s easy to embed a CAPTCHA into my apps. Why would I need a bot manager?”
Not so fast. Whom should you challenge with your CAPTCHA? How do you identify your attacker? Do you really want to challenge everyone?
You can’t know how to deploy CAPTCHA without effective detection. Without it, either you’d be challenging the wrong attackers or you would be challenging everyone — impacting visibility and user experience.
4. A bot manager also needs to offer a rules engine with deterministic rules that support immediate attack detection and mitigation.
5. Finally, a bot manager needs machine learning capabilities to detect sophisticated bots whose behavior cannot be detected by deterministic rules. A behavior that is legitimate for one app may be completely illegitimate in a different app, and recognizing that difference allows bot activity to be detected.
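Two items in this list, IP-agnostic source identification and a deterministic rules engine, can be sketched together. Everything below is an assumption made for illustration: the attribute names, the thresholds, the action labels and the choice of SHA-256 are not taken from this guide or from any specific product.

```python
import hashlib
from dataclasses import dataclass
from typing import Callable, Optional


def device_fingerprint(attributes: dict[str, str]) -> str:
    """Hash client attributes (deliberately excluding the IP) into a stable identifier."""
    canonical = "\n".join(f"{key}={attributes[key]}" for key in sorted(attributes))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


@dataclass
class SessionStats:
    fingerprint: str
    failed_logins: int
    requests_per_minute: float


@dataclass
class Rule:
    name: str
    matches: Callable[[SessionStats], bool]
    action: str  # hypothetical labels such as "block" or "throttle"


RULES = [
    Rule("password-guessing", lambda s: s.failed_logins >= 10, "block"),
    Rule("high-request-rate", lambda s: s.requests_per_minute > 300, "throttle"),
]


def evaluate(stats: SessionStats, rules: list[Rule] = RULES) -> Optional[str]:
    """Return the action of the first matching rule, or None when no rule matches."""
    for rule in rules:
        if rule.matches(stats):
            return rule.action
    return None
```

Because the rules are plain predicates evaluated in order, the outcome for a given session is fully predictable, which is what makes immediate mitigation possible. Sessions that return `None` are the ones left for the machine learning layer described in item 5.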
DEPLOYMENT OPTIONS
Although a bot management service may offer cloud control and one centralized, multitenant portal, the primary provisioning/deployment option for data path integration should be reviewed.
Following are the three options:
1. Cloud integrated. Various vendors offer cloud-integrated bot services that require redirection of client traffic, so the bot manager can process all traffic for bot detection. The common approach is domain name system (DNS) redirection. Such services are offered by cloud web application firewall (WAF) integrated bot management service providers and by content delivery network (CDN) service providers.
2. Application integrated. With this deployment option, there is no redirection of traffic to a cloud data path service. The data path visibility, detection and mitigation component may be integrated in:
• A reverse proxy, such as Envoy, NGINX or HAProxy, which is usually implemented as a plug-in or a module integrated into the reverse proxy or as a script implementing the bot management functionality
• The web server running the application where a plug-in implements the bot management functionality
• The web application itself using SDK integration
• The mobile app using mobile SDK integration
3. Appliance. Usually an appliance implementation will be based on a prepackaged reverse proxy option (described above) delivered as a physical or virtual appliance.
IMPLEMENTATION OPTIONS
API ENDPOINT
The data path component has visibility into the application traffic, while the actual analysis of the traffic is applied in a centralized cloud/on-premise API endpoint component. Mitigation may be applied in the same data path component or elsewhere.
ADVANTAGES: Reduced data path risk allowing controlled latency with configurable timeout, no point of failure with bypass options and an option to implement an asynchronous approach.
DISADVANTAGES: None
SPAN PORT
A concept available on every switch that allows you to take a copy of the traffic being sent from client to web server and back to client. Although you can’t modify the traffic (request or response), you do see them and can collect this information for complete analysis without impacting data latency or response times.
ADVANTAGE: Zero impact to traffic.
DISADVANTAGES: No response modification and limited detection capabilities. Just visibility, NOT mitigation. While some organizations prioritize visibility into the problem over mitigating it, this approach isn’t practical for bot management solutions.
IN-LINE
Delivered as a reverse proxy or as a bridge.
ADVANTAGES: Allows inspection before forwarding the request to the secured application. Allows response modification for embedding JavaScript.
DISADVANTAGES: May introduce latency and point of failure.
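The "configurable timeout" and "bypass" advantages listed for the API endpoint option can be illustrated with a short sketch. The function names, the verdict strings and the two stand-in analyzers are assumptions; a real data path component would call a remote analysis service rather than a local function.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout
from typing import Callable

# Shared worker pool for calls to the (hypothetical) analysis endpoint.
_executor = ThreadPoolExecutor(max_workers=4)


def verdict_with_bypass(analyze: Callable[[dict], str], request: dict, timeout_s: float) -> str:
    """Return the analyzer's verdict, or "allow" if it is too slow or fails."""
    future = _executor.submit(analyze, request)
    try:
        return future.result(timeout=timeout_s)
    except FutureTimeout:
        return "allow"  # bypass: analysis exceeded the configured timeout
    except Exception:
        return "allow"  # bypass: analysis raised, so fail open


def slow_analyzer(request: dict) -> str:
    """Stand-in for a remote analysis call that responds too late."""
    time.sleep(0.5)
    return "block"


def fast_analyzer(request: dict) -> str:
    """Stand-in for a remote analysis call that responds in time."""
    return "block"
```

Failing open keeps the application reachable when analysis is slow or unavailable, at the cost of letting some traffic through unchecked. Whether that trade-off is acceptable depends on the application being protected.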
5.0 ADDITIONAL CONSIDERATIONS
WAF OR BOT MANAGEMENT?
WAFs ARE PRIMARILY CREATED TO SAFEGUARD WEBSITES
AGAINST APPLICATION VULNERABILITY EXPLOITATIONS. WAFs
USUALLY FEATURE BASIC BOT MITIGATION CAPABILITIES AND
CAN BLOCK THOSE BASED ON IPs (THAT IS, ACCESS CONTROL
LISTS OR ACLs). IF THE WAF IS A BIT MORE SOPHISTICATED,
IT MAY ALSO BE ABLE TO PERFORM DEVICE FINGERPRINTING.
However, WAFs fall short when faced with some automated threats (example: content scraping). Moreover, fourth-generation bots use sophisticated techniques to evade detection, from mimicking human behavior and abusing open-source tools to making multiple violations in different sessions.
When it comes to detecting and mitigating fourth-generation bots, WAFs simply don’t do the job well enough.
Two to Tango
Complete application protection must bring together the ability to detect and control malicious bot attacks as well as to secure the “by-design” flaws. In many cases, bots are programmed to exploit these vulnerabilities, but again, they can do a lot more.
WHEN DO WAFs DO THE JOB? WHEN DO YOU ALSO NEED TO BRING IN A BOT MANAGER? The table below provides an at-a-glance view of each one’s strengths — and in what cases they play well together:
SECURITY CAPABILITIES | BOT MANAGER | TRADITIONAL WAFs | HAVING BOTH
Protection from simple bots | YES | YES | YES
Fingerprinting malicious devices | YES | YES | YES
Mitigation of dynamic IP and headless browser attacks | YES | LIMITED | YES
Detection of sophisticated bot attacks | YES | NO | YES
Risk of blocking genuine users (false positives) | NONE | HIGH | NONE
Collective bot intelligence (for example, IPs, fingerprints and behavioral patterns) | YES | NO | YES
Customized actions against suspicious bot types | YES | NO | YES
Protection from OWASP Top 10 vulnerabilities | NO | YES | YES
Protection from API vulnerabilities | LIMITED | YES | YES
Protection against Layer 7 DoS | LIMITED | YES | YES
HTTP traffic inspection | NO | YES | YES
Masking sensitive data | NO | YES | YES
Compliance with HIPAA and PCI | LIMITED | YES | YES
Integration with DevOps | NO | YES | YES
6.0 BUYERS CHECKLIST
KEY CONSIDERATIONS WHEN EVALUATING BOT MANAGEMENT SOLUTIONS
WE’VE SUMMARIZED THE CHALLENGES. WE’VE OUTLINED
OPTIONS FOR DETECTING AND MITIGATING BOT ATTACKS. THIS
CHAPTER EXPLORES KEY CONSIDERATIONS WHEN EVALUATING
BOT MANAGEMENT SOLUTIONS. USE THESE CRITERIA TO SELECT
THE BEST SOLUTION FOR YOUR ENVIRONMENT.
CONSIDERATION #1: SCOPE OF DETECTION TECHNIQUES
The rise of highly sophisticated, humanlike bots requires advanced techniques in detection and response.
Ask the following:
• What detection and response techniques does the solution support?
• How many methodologies are included, and what is their level of sophistication?
For the most advanced capabilities, look for a solution that offers a full complement of techniques, including device and browser fingerprinting, intent and behavioral analyses, collective bot intelligence and threat research, as well as other foundational techniques.
CONSIDERATION #2: ADAPTABILITY TO DYNAMIC THREATS
Bots never stop evolving. Neither should your bot management solution.
Ask the following:
• Does the solution include deep learning and self-optimizing capabilities? These capabilities are essential for identifying and blocking bots as they alter characteristics to evade detection.
• Does the solution match the deception capabilities of sophisticated bots? Request examples of sophisticated attacks that the solution has successfully detected and thwarted.
CONSIDERATION #3: MULTIGENERATIONAL DETECTION
Each of the four current bot generations requires different approaches to mitigation.
Ask the following:
• How does the solution defeat earlier generations? Techniques such as blacklists, fingerprinting and JavaScript are the most common approaches.
• How does the solution take on modern bots, which extend their capabilities beyond scripts and headless browsers? Humanlike bots and advanced distributed bots require complex user behavioral analysis.
• How does the solution understand and neutralize a bot’s intent?
CONSIDERATION #4: ROBUST AUTOMATED RESPONSE
It’s important to choose a solution that offers multiple response mechanisms to bot traffic.
Ask the following:
• Does the solution include not only blocking but also limiting custom actions based on threat identification?
• Can it serve fake data to the bot?
CONSIDERATION #5: DEPLOYMENT FLEXIBILITY
Every network is different.
Ask the following:
• How well does the solution accommodate your network’s unique needs?
• Can the provider deploy the solution exactly the way you need it? Look for a bot management solution that provides easy, seamless deployment without infrastructure changes or the risk of rerouting traffic.
• Does your architecture require an in-line solution or something out-of-path?
• Do you want it to detect and mitigate or only to detect and notify?
Be sure to look for options that can be stand-alone or integrated with a WAF for complete coverage.
CONSIDERATION #6: CLEAN, FEATURE-RICH REPORTING FOR OPTIMAL VISIBILITY
Reporting is a critical aspect of any bot management tool. Consider how each solution provides reporting information. Having access to granular reports can be crucial, yet too much information can also hide what you’re looking for.
Ask the following:
• Does it offer clean, easy-to-understand reporting? It should present granular detail when you want it to but also integrate with popular analytics platforms from Adobe or Google to provide reports on nonhuman traffic.
CONSIDERATION #7: GOVERNANCE AND COMPLIANCE FACTORS
For many organizations, their applications and supporting data are among their most valued assets.
• Does the bot mitigation solution ensure that traffic does not leave a network?
• If so, does it transform data to an encrypted and hashed format to maximize privacy and compliance?
Ensuring that the bot mitigation solution is compliant with the General Data Protection Regulation (GDPR) pertaining to data at rest and data in transit will help to avoid personal data breaches and the risk of financial and legal penalties.
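One possible reading of "hashed format" is to replace identifying fields with keyed hashes before an event is sent for analysis. The sketch below assumes HMAC-SHA256 and a hypothetical event layout with `ip` and `username` fields; it is not a description of how any particular product handles data, and it does not cover the encryption half of the question.

```python
import hashlib
import hmac


def pseudonymize(value: str, key: bytes) -> str:
    """Return a keyed SHA-256 digest so the raw value is not sent in the clear."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def scrub_event(
    event: dict[str, str], key: bytes, sensitive: tuple[str, ...] = ("ip", "username")
) -> dict[str, str]:
    """Copy an event, replacing the sensitive fields with their pseudonyms."""
    return {
        name: pseudonymize(value, key) if name in sensitive else value
        for name, value in event.items()
    }
```

The same input and key always give the same digest, so activity from one source can still be correlated across events, while the original value cannot be read back without the key.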
FOR MORE DETAILS ON THIS TOPIC, SEE THE RADWARE WHITE PAPER HOW TO EVALUATE BOT MANAGEMENT SOLUTIONS.
CLOSING
IN AN IDEAL SECURITY WORLD, IT WOULD BE EASY
TO IDENTIFY GOOD BOTS VS. BAD BOTS. WE WOULD
SIMPLY ENABLE HELPFUL BOTS AND BLOCK THEIR
HARMFUL COUNTERPARTS. AS WITH MANY ASPECTS
OF TECHNOLOGY MANAGEMENT — AND DAY-TO-DAY
LIFE — THE REALITY IS FAR MORE NUANCED. SIMPLISTIC
APPROACHES MAY HELP THWART BAD BOTS, BUT THE
PRICE WILL BE TOO HIGH IN TERMS OF LOST BENEFITS
FROM GOOD BOTS — NOT TO MENTION CUSTOMER
FRUSTRATION AND DISENFRANCHISEMENT.
Ultimately, your goal should not be to eradicate bots but rather to manage them effectively. By using mitigation and detection techniques that are sophisticated and continually refined, you can ensure that your organization enjoys bot benefits while reducing the impact of bad actors.
APPENDIX
OWASP TOP 21 AUTOMATED THREATS
ACCOUNT TAKEOVER
• CREDENTIAL CRACKING
• CREDENTIAL STUFFING
• ACCOUNT CREATION
• ACCOUNT AGGREGATION
• TOKEN CRACKING
AVAILABILITY OF INVENTORY
• DENIAL OF INVENTORY
• SCALPING
• SNIPING
ABUSE OF FUNCTIONALITY
• DATA SCRAPING
• SKEWING
• SPAMMING
• CAPTCHA DEFEAT
• AD FRAUD
• EXPEDITING
PAYMENT DATA ABUSE
• CARDING
• CARD CRACKING
• CASHING OUT
VULNERABILITY IDENTIFICATION
• FINGERPRINTING
• FOOTPRINTING
• VULNERABILITY SCANNING
RESOURCE DEPLETION
• DENIAL OF SERVICE
HOW TO EVALUATE BOT MANAGEMENT SOLUTIONS
THE TOP FREE AND PAID WEB SCRAPING TOOLS AND SERVICES
THE 50 MOST COMMON WEB SCRAPING TOOLS
CONTACT US ABOUT BOT MANAGEMENT SOLUTIONS
FOR MORE INFORMATION VISIT OUR BOT MANAGEMENT RESOURCE CENTER
ABOUT RADWARE Radware® (NASDAQ: RDWR) is a global leader of cybersecurity and application delivery solutions for physical, cloud and software-defined data centers. Its award-winning solutions portfolio secures the digital experience by providing infrastructure, application and corporate IT protection and availability services to enterprises globally. Radware’s solutions empower more than 12,500 enterprise and carrier customers worldwide to adapt quickly to market challenges, maintain business continuity and achieve maximum productivity while keeping costs down. For more information, please visit www.radware.com.
Radware encourages you to join our community and follow us on: Radware Blog, LinkedIn, Facebook, Twitter, SlideShare, YouTube, Radware Connect app for iPhone® and our security center DDoSWarriors.com that provides a comprehensive analysis of DDoS attack tools, trends and threats.
© 2019 Radware Ltd. All rights reserved. The Radware products and solutions mentioned in this document are protected by trademarks, patents and pending patent applications of Radware in the U.S. and other countries. For more details, please see: https://www.radware.com/LegalNotice/. All other trademarks and names are property of their respective owners.