# https://en.wikipedia.org/wiki/Robots_exclusion_standard

# See a list of visiting robots with:
#   cPanel > Metrics > Awstats
# Then on the left:
#   Robots/Spiders visitors -- Full list


# Blackhole for Bad Bots
# WordPress plugin
# Disabled because it doesn't play nice with the Wayback Machine.
#   https://perishablepress.com/blackhole-bad-bots/
#   https://wordpress.org/plugins/blackhole-bad-bots/
#   https://blog.spiralofhope.com/wp-admin/admin.php?page=blackhole_settings
#   https://blog.spiralofhope.com/?p=54811
#User-agent: *
#Disallow: /*blackhole
#Disallow: /?blackhole


# ------------------
# Marketing crawlers
# ------------------

# https://ahrefs.com/robot
# > ... for Ahrefs online marketing toolset
# > ... used by thousands of digital marketers around the world to plan,
# > execute, and monitor their online marketing campaigns.
User-agent: AhrefsBot
Disallow: /

# https://moz.com/
# An SEO/marketing business.
# https://moz.com/help/moz-procedures/crawlers/dotbot
# > ... gathers web data for the Moz Link Index. This data we collect
# > through Dotbot is available in the Links section of your Moz Pro
# > campaign, Link Explorer, and the Moz Links API.
User-agent: dotbot
Disallow: /

# https://www.semrush.com/bot/
# Backlink analytics, SEO, prospect finding, brand monitoring
User-agent: SemrushBot
Disallow: /


# --------------
# Others of note
# --------------

# (Unknown; I think this is a search engine.)
# https://www.2345.com/
#User-agent: 2345Explorer
#Disallow: /

# http://webmeup-crawler.com/
# > BLEXBot assists internet marketers to get information on the link
# > structure of sites and their interlinking on the web, to avoid any
# > technical and possible legal issues and improve overall online
# > experience.
User-agent: BLEXBot
Disallow: /
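

# A possible alternative layout (a sketch only, not applied here):
# the Robots Exclusion Protocol lets several User-agent lines share one
# group of rules, so the per-crawler blocks in this file could be
# collapsed into a single record. It is left commented out so that the
# current behaviour is unchanged.
#User-agent: AhrefsBot
#User-agent: dotbot
#User-agent: SemrushBot
#User-agent: BLEXBot
#Disallow: /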