Our Bots

How GEO Tracker AI identifies itself when fetching public web pages

This page exists so domain owners can identify GEO Tracker AI in their server logs, understand what we fetch and why, and opt out cleanly via robots.txt. Questions or opt-out requests: legal@geotrackerai.com.

1. What we are

GEO Tracker AI (operated by Ing. Petra Vlčková, Czech Republic) is a SaaS tool that helps brand and product owners monitor how AI search systems (e.g., ChatGPT, Perplexity, Google AI Mode) cite their websites and the sources around them. We do not crawl the web at scale and we do not train AI models on the content we fetch. We fetch a small number of pages, on-demand, in two narrow contexts: (a) the user's own domain (Crawlability Monitor / Content Audit / Discovery Readiness), and (b) URLs that third-party AI systems already cited in response to the user's queries (Citation Source Intelligence).

2. Our user-agent strings

We use two stable, descriptive User-Agent strings. Both contain a link back to this page so you can identify us:

User-Agent	Used for
GeoTrackerAI/1.0 (+https://geotrackerai.com/bots)	Crawlability Monitor, Content Audit, Discovery Readiness — fetches your own robots.txt, headers, homepage, and sitemap-ranked pages on the domain you added to your account.
GeoTrackerAI-Citations/1.0 (+https://geotrackerai.com/bots)	Citation Source Intelligence — fetches public URLs (Reddit, Hacker News, GitHub, blogs, listicles, etc.) already cited by AI systems in response to a logged-in user's monitored queries.

The Free GEO Snapshot at /grader may also fetch a small number of pages on the submitter's declared domain, using the same GeoTrackerAI/1.0 user-agent as the Crawlability Monitor and Content Audit.
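For domain owners who want to spot these requests in their logs, the following is a minimal sketch of a matcher for the two strings listed above. The function name `classify_user_agent` and the labels "audit" and "citations" are illustrative assumptions, not part of our service. The pattern accepts any dotted version number, on the assumption that the version may change over time.

```python
import re
from typing import Optional

# Assumed pattern, derived from the two User-Agent strings documented above:
# product token, optional "-Citations" suffix, a dotted version, and the back link.
GEO_UA = re.compile(
    r"\bGeoTrackerAI(?P<citations>-Citations)?/(?P<version>\d+(?:\.\d+)*)"
    r" \(\+https://geotrackerai\.com/bots\)"
)


def classify_user_agent(user_agent: str) -> Optional[str]:
    """Return "audit" or "citations" for a GEO Tracker AI User-Agent, else None."""
    match = GEO_UA.search(user_agent)
    if match is None:
        return None
    return "citations" if match.group("citations") else "audit"
```

A User-Agent match is only a hint, since any client can send the same header value.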

3. How we behave (defaults)

Polite identity. Every request includes one of the User-Agent strings above, each of which carries a back link to this page.
Per-host rate limiting. Concurrency and request rate per host are bounded at the orchestrator layer.
Body size caps. Approximately 1 MB on Citation Source Intelligence fetches and 256 KB on robots.txt reads. We never read past the cap.
Timeouts. 5–8 seconds per request, with at most one retry for transient errors.
Public content only. We do not bypass paywalls, login walls, soft-walls, JS challenges, or anti-scraping measures.
No re-publication. We may store a small excerpt (typically up to ~4 KB) and structured data (Open Graph, JSON-LD) of fetched pages, used only inside the user's authenticated dashboard for the features described above.
Respect for robots.txt where reasonably possible, and respect for emerging conventions such as llms.txt and the IETF ai.txt proposal.
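To illustrate what a body size cap means in practice, here is a minimal sketch of a capped read. It is not our production code: the names `read_capped`, `ROBOTS_CAP`, and `CITATION_CAP` and the 16 KB chunk size are assumptions for illustration, and the byte values simply restate the approximate caps listed above.

```python
import io
from typing import BinaryIO

ROBOTS_CAP = 256 * 1024      # 256 KB, the robots.txt cap stated above
CITATION_CAP = 1024 * 1024   # roughly 1 MB, the citation fetch cap stated above


def read_capped(stream: BinaryIO, cap: int, chunk_size: int = 16 * 1024) -> bytes:
    """Read at most `cap` bytes from `stream` and never request more than remains."""
    chunks = []
    remaining = cap
    while remaining > 0:
        chunk = stream.read(min(chunk_size, remaining))
        if not chunk:  # end of body reached before the cap
            break
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


# Usage with an in-memory body that is larger than the cap.
oversized = io.BytesIO(b"x" * (ROBOTS_CAP + 100))
data = read_capped(oversized, ROBOTS_CAP)
```

The same function works on any binary stream, for example the response object returned by `urllib.request.urlopen(url, timeout=8)`, so the timeout and the cap can be applied together.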

4. How to block us

If you do not wish GEO Tracker AI to fetch any pages on your domain, add the following to your robots.txt:

User-agent: GeoTrackerAI
Disallow: /

User-agent: GeoTrackerAI-Citations
Disallow: /

We will honor an updated robots.txt on the next fetch cycle (typically within 24 hours for our own audit cron).

Alternatively, write to legal@geotrackerai.com with the hostname you control and we will add a server-side block list entry within a reasonable timeframe.
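If you want to confirm that the robots.txt rules above have the intended effect before deploying them, one option is a quick check with Python's standard `urllib.robotparser`. This is a sketch under assumptions: the helper name `is_allowed` and the URL `https://example.com/page` are placeholders, and Python's matching rules may differ in detail from those of any particular crawler.

```python
from urllib.robotparser import RobotFileParser

# The block rules from this section, exactly as they would appear in robots.txt.
ROBOTS_TXT = """\
User-agent: GeoTrackerAI
Disallow: /

User-agent: GeoTrackerAI-Citations
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Parse robots.txt text and report whether `user_agent` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


AUDIT_UA = "GeoTrackerAI/1.0 (+https://geotrackerai.com/bots)"
CITATIONS_UA = "GeoTrackerAI-Citations/1.0 (+https://geotrackerai.com/bots)"

audit_ok = is_allowed(ROBOTS_TXT, AUDIT_UA, "https://example.com/page")
citations_ok = is_allowed(ROBOTS_TXT, CITATIONS_UA, "https://example.com/page")
other_ok = is_allowed(ROBOTS_TXT, "SomeOtherBot/2.0", "https://example.com/page")
```

With these rules, both of our user-agents are denied while unrelated bots remain unaffected, because the file contains no wildcard group.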

5. We are not these other bots

The Crawlability Monitor reports the access status of bots operated by third parties (e.g., GPTBot, ClaudeBot, PerplexityBot, Google-Extended, Applebot-Extended, Bytespider, Meta-ExternalAgent, CCBot). We do not operate those bots; we only check whether your robots.txt allows them. To control them, edit your robots.txt directly, following each vendor's documentation.

6. Contact

legal@geotrackerai.com — opt-out, abuse, legal questions.
support@geotrackerai.com — operational questions.
See also our Privacy Policy and Terms of Service.