Managing Web Crawler Traffic – Freestar Help Center

1. What Are Web Crawlers?

Web crawlers (also known as bots or spiders) are automated programs that visit websites to gather data. They may be:

Search engine crawlers (e.g., Googlebot, Bingbot) → Index content for search results.
Ad crawlers (e.g., AmazonAdBot, AppNexusBot) → Scan content and ads.txt for ad targeting and eligibility.
Monitoring/security bots → Check uptime or scan for vulnerabilities.

Crawlers identify themselves by their user agent string and typically follow the rules set in robots.txt.
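
As a rough illustration of identification by user agent, the sketch below sorts a user agent string into the categories listed above. The `KNOWN_CRAWLERS` table and `classify_crawler` function are hypothetical names, and the substrings are assumed from the crawler names mentioned here. Check each vendor's documentation for the exact strings. A user agent can be spoofed, so treat the result as a hint rather than proof.

```python
from typing import Optional

# Assumed substrings based on the crawler names above; extend as needed.
KNOWN_CRAWLERS = {
    "googlebot": "search",
    "bingbot": "search",
    "amazonadbot": "ad",
    "appnexusbot": "ad",
}


def classify_crawler(user_agent: str) -> Optional[str]:
    """Return "search" or "ad" for a known crawler token, else None."""
    ua = user_agent.lower()
    for token, category in KNOWN_CRAWLERS.items():
        if token in ua:
            return category
    return None


print(classify_crawler(
    "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
))
print(classify_crawler("Mozilla/5.0 (Windows NT 10.0; Win64; x64)"))
```

The first call matches the `googlebot` token. The second is an ordinary browser string and matches nothing.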

Crawler traffic affects three areas of a site:

Search traffic → Blocking search crawlers can reduce visibility in search results.
Advertising revenue → Blocking ad crawlers may prevent them from:
- Accessing ads.txt, which is required to validate authorized sellers.
- Scanning content for contextual targeting, which can lower CPMs.
Server performance → Some crawlers request pages very frequently, which strains hosting resources.

Method: Slow the crawler down with a crawl delay in robots.txt.
Directive: Crawl-delay (note: not all crawlers honor this; Googlebot, for example, ignores it).
Effect: Reduces server strain by lowering crawl frequency, without blocking the crawler.
Benefit: Maintains contextual scanning while reducing server load.

Example:

User-agent: ExampleBot
Crawl-delay: 10
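
To confirm how a rule like this is read, here is a minimal sketch using Python's standard `urllib.robotparser`. The helper name `crawl_delay_for` is made up for illustration, and the rules are supplied as a string so that nothing is fetched over the network.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: ExampleBot
Crawl-delay: 10
"""


def crawl_delay_for(robots_txt: str, user_agent: str):
    """Return the Crawl-delay that applies to user_agent, or None."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.crawl_delay(user_agent)


print(crawl_delay_for(ROBOTS_TXT, "ExampleBot"))  # 10
print(crawl_delay_for(ROBOTS_TXT, "OtherBot"))    # None: no rule applies
```

This only shows what the file asks for. Whether a given crawler respects the delay is up to that crawler.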

Method: Block the crawler at the firewall.
Effect: Completely prevents the crawler from accessing the site, including ads.txt.
Risk: Can directly reduce ad revenue if ad crawlers cannot confirm ads.txt or scan pages.
Use case: Only for malicious or abusive crawlers (scrapers, spam bots).

Method: Restrict the crawler with Disallow rules in robots.txt.
Effect: Crawler cannot access the disallowed pages or directories, but can still fetch ads.txt.
Risk: Prevents contextual scanning of the disallowed pages, which may lower CPMs from ad partners.
Use case: Limit access to sensitive or non-monetized areas of the site.

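Example (/private/ is a placeholder for the area to restrict; Disallow: / would cover the whole site, ads.txt included):

User-agent: ExampleBot
Disallow: /private/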
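
Before publishing a Disallow rule, it can help to check what it leaves reachable, ads.txt in particular. The sketch below uses Python's standard `urllib.robotparser`. The `allowed` helper, the `/private/` directory, and the `example.com` URLs are placeholders, not real paths.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical rule sets: one directory vs. the whole site.
PARTIAL = "User-agent: ExampleBot\nDisallow: /private/\n"
FULL = "User-agent: ExampleBot\nDisallow: /\n"

ADS_TXT = "https://example.com/ads.txt"
PRIVATE_PAGE = "https://example.com/private/report.html"

print(allowed(PARTIAL, "ExampleBot", ADS_TXT))       # True
print(allowed(PARTIAL, "ExampleBot", PRIVATE_PAGE))  # False
print(allowed(FULL, "ExampleBot", ADS_TXT))          # False
```

Under these rules, a directory-level Disallow keeps ads.txt reachable, while a site-wide `Disallow: /` also covers ads.txt for any crawler that obeys robots.txt.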

Best practices:

Never block ad crawlers at the firewall – this cuts off ads.txt access and can reduce ad revenue.
Use robots.txt for crawl control – it is a safe way to manage the frequency or scope of crawling.
Monitor logs – identify which bots are hitting the site most often, using IPs and user agents.
Whitelist legitimate crawlers – search engines, ad crawlers, monitoring services.
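
For the log monitoring step, a minimal sketch might count requests per IP and user agent pair. It assumes the access log uses the common "combined" format, where the client IP is the first field and the user agent is the last quoted field. The function name and the sample lines are invented for illustration, so adjust the parsing to your server's actual log format.

```python
import re
from collections import Counter

# Assumes combined log format: the user agent is the last quoted field.
UA_PATTERN = re.compile(r'"([^"]*)"\s*$')


def count_crawler_hits(log_lines):
    """Count requests per (client IP, user agent) pair."""
    counts = Counter()
    for line in log_lines:
        match = UA_PATTERN.search(line)
        if not match:
            continue  # skip lines that do not fit the assumed format
        ip = line.split(" ", 1)[0]
        counts[(ip, match.group(1))] += 1
    return counts


# Made-up sample lines using documentation-only IP ranges.
SAMPLE_LOG = [
    '203.0.113.5 - - [01/Jan/2025:00:00:01 +0000] "GET / HTTP/1.1" 200 512 "-" "ExampleBot/1.0"',
    '203.0.113.5 - - [01/Jan/2025:00:00:02 +0000] "GET /a HTTP/1.1" 200 734 "-" "ExampleBot/1.0"',
    '198.51.100.7 - - [01/Jan/2025:00:00:03 +0000] "GET /ads.txt HTTP/1.1" 200 90 "-" "OtherBot/2.0"',
]

for (ip, agent), hits in count_crawler_hits(SAMPLE_LOG).most_common():
    print(hits, ip, agent)
```

Sorting by hit count surfaces the heaviest crawlers first, which is a reasonable starting point for deciding whether a crawl delay or a Disallow rule is worth adding.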