# CitrusBurn™ - Maximum 2026 SEO & AEO Crawler Configuration
# Version: 2026.4.17 - Optimized for AI/LLM Indexing & Traditional Search
# Site: https://ignite-well-boost.lovable.app
# Last updated: 2026-04-17

# ============================================
# GOOGLE ECOSYSTEM (Search, Images, Video, Shopping, AI)
# ============================================

User-agent: Googlebot
Allow: /
Allow: /images/
Allow: /assets/
Crawl-delay: 1

User-agent: Googlebot-Image
Allow: /images/
Allow: /assets/
Allow: /*.webp$
Allow: /*.png$
Allow: /*.jpg$
Allow: /*.jpeg$
Allow: /*.svg$

User-agent: Googlebot-Video
Allow: /
Allow: /videos/
Allow: /*.mp4$
Allow: /*.webm$

User-agent: Googlebot-News
Allow: /

User-agent: Google-InspectionTool
Allow: /
Allow: /sitemap.xml
Allow: /robots.txt

User-agent: Storebot-Google
Allow: /
Allow: /shop/
Allow: /products/

User-agent: Google-Extended
Allow: /
Allow: /faq/
Allow: /about/

User-agent: GoogleOther
Allow: /

User-agent: GoogleOther-Image
Allow: /images/

User-agent: GoogleOther-Video
Allow: /videos/

# ============================================
# MICROSOFT/BING ECOSYSTEM (Search, Copilot AI)
# ============================================

User-agent: Bingbot
Allow: /
Allow: /images/
Allow: /assets/
Crawl-delay: 2

User-agent: BingPreview
Allow: /

User-agent: MicrosoftPreview
Allow: /

User-agent: msnbot
Allow: /
Allow: /images/

User-agent: msnbot-media
Allow: /images/
Allow: /videos/

User-agent: AdIdxBot
Allow: /

User-agent: MicrosoftResearch
Allow: /

User-agent: MicrosoftOptimize
Allow: /

# ============================================
# YAHOO/VERIZON MEDIA
# ============================================

User-agent: Slurp
Allow: /
Crawl-delay: 2

User-agent: Yahoo Link Preview
Allow: /

User-agent: Yahoo Gemini
Allow: /

User-agent: Y!J
Allow: /

# ============================================
# DUCKDUCKGO (Privacy-focused Search)
# ============================================

User-agent: DuckDuckBot
Allow: /
Allow: /images/
Allow: /faq/

User-agent: DuckDuckGo-Favicons-Bot
Allow: /favicon.ico
Allow: /apple-touch-icon.png

# ============================================
# BAIDU (China Search)
# ============================================

User-agent: Baiduspider
Allow: /
Crawl-delay: 2

User-agent: Baiduspider-image
Allow: /images/
Allow: /*.jpg$
Allow: /*.png$
Allow: /*.webp$

User-agent: Baiduspider-video
Allow: /videos/

User-agent: Baiduspider-news
Allow: /

User-agent: Baiduspider-favo
Allow: /favicon.ico

# ============================================
# YANDEX (Russia/CIS Search)
# ============================================

User-agent: YandexBot
Allow: /
Allow: /images/
Crawl-delay: 2

User-agent: YandexImages
Allow: /images/

User-agent: YandexMedia
Allow: /

User-agent: YandexAccessibilityBot
Allow: /

User-agent: YandexMobileBot
Allow: /

User-agent: YandexCalendar
Allow: /

User-agent: YandexSitelinks
Allow: /

# ============================================
# ASIA-PACIFIC SEARCH ENGINES
# ============================================

User-agent: Sogou web spider
Allow: /
Crawl-delay: 3

User-agent: Sogou inst spider
Allow: /

User-agent: Sogou spider2
Allow: /

User-agent: 360Spider
Allow: /

User-agent: HaosouSpider
Allow: /

User-agent: NaverBot
Allow: /

User-agent: Yeti
Allow: /

User-agent: Gigabot
Allow: /

User-agent: Exabot
Allow: /

User-agent: SeznamBot
Allow: /

User-agent: SeznamBot/3.2
Allow: /

User-agent: szukacz
Allow: /

# ============================================
# SOCIAL MEDIA PLATFORMS (Sharing, Previews, Cards)
# ============================================

User-agent: Twitterbot
Allow: /
Allow: /images/
Allow: /*.webp$
Allow: /*.png$
Allow: /*.jpg$

User-agent: facebookexternalhit
Allow: /
Allow: /images/
Allow: /og-image*

User-agent: Facebot
Allow: /

User-agent: LinkedInBot
Allow: /
Allow: /images/

User-agent: Pinterestbot
Allow: /
Allow: /images/

User-agent: Slackbot
Allow: /
Allow: /images/

User-agent: WhatsApp
Allow: /
Allow: /images/

User-agent: TelegramBot
Allow: /
Allow: /images/

User-agent: Discordbot
Allow: /
Allow: /images/

User-agent: Redditbot
Allow: /
Allow: /images/

User-agent: Embedly
Allow: /

User-agent: Quora-Bot
Allow: /

User-agent: SkypeUriPreview
Allow: /

User-agent: Snapchat
Allow: /

User-agent: TikTokBot
Allow: /

User-agent: Instagram
Allow: /

# ============================================
# AI / LLM / ANSWER ENGINE CRAWLERS (2026 OPTIMIZED)
# ============================================

# OpenAI / ChatGPT
User-agent: GPTBot
Allow: /
Allow: /faq/
Allow: /about/

User-agent: ChatGPT-User
Allow: /

User-agent: OAI-SearchBot
Allow: /

User-agent: OpenAI-Image
Allow: /images/

# Anthropic / Claude
User-agent: anthropic-ai
Allow: /
Allow: /faq/

User-agent: ClaudeBot
Allow: /
Allow: /faq/
Allow: /about/

User-agent: Claude-Web
Allow: /

User-agent: Claude-Image
Allow: /images/

# Perplexity AI Search
User-agent: PerplexityBot
Allow: /
Allow: /faq/

User-agent: Perplexity-User
Allow: /

# Cohere AI
User-agent: cohere-ai
Allow: /

User-agent: cohere-training
Allow: /

# Apple Intelligence / Safari
User-agent: Applebot
Allow: /
Allow: /faq/

User-agent: Applebot-Extended
Allow: /

User-agent: Apple-PubSub
Allow: /

# Meta AI / Facebook/Instagram AI
User-agent: Meta-ExternalAgent
Allow: /
Allow: /faq/

User-agent: Meta-ExternalFetcher
Allow: /

User-agent: Meta-AI-Crawler
Allow: /

User-agent: meta-externalagent-llm
Allow: /

User-agent: meta-externalfetcher-llm
Allow: /

User-agent: llama
Allow: /

User-agent: llama-crawler
Allow: /

# Google AI (Gemini, Bard, SGE)
User-agent: Google-Extended
Allow: /
Allow: /faq/

User-agent: Bard
Allow: /

User-agent: Gemini-Crawler
Allow: /

User-agent: Vertex-AI
Allow: /

# Microsoft AI (Copilot, Bing Chat)
User-agent: BingCopilot
Allow: /

User-agent: MicrosoftCopilot
Allow: /

User-agent: Copilot-User
Allow: /

User-agent: GPT4Turbo
Allow: /

# Amazon AI (Alexa, Bedrock, Titan)
User-agent: Amazonbot
Allow: /

User-agent: Alexa
Allow: /

User-agent: Amazon-Kendra
Allow: /

# You.com / YouBot
User-agent: YouBot
Allow: /
Allow: /faq/

User-agent: You-Proxy
Allow: /

User-agent: You.com-Crawler
Allow: /

# Phind AI Search
User-agent: Phind-Bot
Allow: /

User-agent: Phind
Allow: /

# Brave Search / AI
User-agent: Brave
Allow: /

User-agent: BraveBot
Allow: /

User-agent: Brave-AI
Allow: /

# Neeva Search
User-agent: Neeva
Allow: /

User-agent: NeevaBot
Allow: /

# ByteDance / TikTok AI
User-agent: Bytespider
Allow: /

User-agent: TikTokSpider
Allow: /

# AI Data Scrapers & Research Bots
User-agent: Diffbot
Allow: /

User-agent: Diffbot-Analyzer
Allow: /

User-agent: ImagesiftBot
Allow: /images/

User-agent: Kangaroo Bot
Allow: /

User-agent: Kangaroo-Bot-Image
Allow: /images/

User-agent: CCBot
Allow: /

User-agent: CommonCrawler
Allow: /

User-agent: Timpibot
Allow: /

User-agent: Imagesift
Allow: /images/

User-agent: OAI-Image-Crawler
Allow: /images/

# Academic/Research AI
User-agent: Omgilibot
Allow: /

User-agent: Omgili
Allow: /

User-agent: Webz.io
Allow: /

User-agent: Webzio
Allow: /

User-agent: AcademicBot
Allow: /

# Cloud AI Services
User-agent: Google-Vertex
Allow: /

User-agent: AWS-Bedrock
Allow: /

User-agent: Azure-OpenAI
Allow: /

User-agent: IBM-Watson
Allow: /

# ============================================
# SEO & MARKETING TOOLS
# ============================================

User-agent: AhrefsBot
Allow: /
Crawl-delay: 3

User-agent: AhrefsSiteAudit
Allow: /

User-agent: SemrushBot
Allow: /
Crawl-delay: 3

User-agent: SemrushBot-BA
Allow: /

User-agent: SemrushBot-SA
Allow: /

User-agent: SemrushBot-CT
Allow: /

User-agent: SemrushBot-SI
Allow: /

User-agent: DotBot
Allow: /

User-agent: OpenSiteExplorer
Allow: /

User-agent: MJ12bot
Allow: /

User-agent: Majestic
Allow: /

User-agent: Screaming Frog SEO Spider
Allow: /

User-agent: rogerbot
Allow: /

User-agent: SiteAuditBot
Allow: /

User-agent: Uptimebot
Allow: /

User-agent: PetalBot
Allow: /

User-agent: DataForSeoBot
Allow: /

User-agent: search.marginalia.nu
Allow: /

User-agent: archive.org_bot
Allow: /

User-agent: ia_archiver
Allow: /

User-agent: AlexaBot
Allow: /

User-agent: Feedfetcher-Google
Allow: /

User-agent: Google-Adwords-Display
Allow: /

User-agent: Google-Safety
Allow: /

# ============================================
# VALIDATION & TESTING TOOLS
# ============================================

User-agent: Google-Structured-Data-Testing-Tool
Allow: /

User-agent: Google-AMPHTML
Allow: /

User-agent: Validator.nu
Allow: /

User-agent: W3C-checklink
Allow: /

User-agent: W3C_Validator
Allow: /

User-agent: W3C_CSS_Validator
Allow: /

User-agent: W3C_I18n-Checker
Allow: /

User-agent: W3C_Unicorn
Allow: /

# ============================================
# MONITORING & UPTIME SERVICES
# ============================================

User-agent: UptimeRobot
Allow: /

User-agent: Pingdom
Allow: /

User-agent: GTmetrix
Allow: /

User-agent: WebPageTest
Allow: /

User-agent: Chrome-Lighthouse
Allow: /

User-agent: Google-PageSpeed-Insights
Allow: /

User-agent: Site24x7
Allow: /

User-agent: NewRelicPing
Allow: /

User-agent: StatusCake
Allow: /

User-agent: BetterUptime
Allow: /

# ============================================
# DEFAULT - ALL OTHER BOTS
# ============================================

User-agent: *
Allow: /
Allow: /faq/
Allow: /about/
Allow: /images/
Allow: /assets/
Disallow: /api/
Disallow: /admin/
Disallow: /private/
Disallow: /cgi-bin/
Disallow: /tmp/
Disallow: /*.json$
Disallow: /*.xml$
Disallow: /*.txt$
Disallow: /src/
Disallow: /node_modules/
Disallow: /.git/
Disallow: /.env*
Disallow: /*?
Crawl-delay: 5

# ============================================
# SITEMAP & HOST DIRECTIVES
# ============================================

Sitemap: https://ignite-well-boost.lovable.app/sitemap.xml
Sitemap: https://ignite-well-boost.lovable.app/sitemap-images.xml
Host: https://ignite-well-boost.lovable.app

# ============================================
# CRAWL BUDGET OPTIMIZATION HINTS
# ============================================

# Priority crawling for high-value pages
# Allow: / (homepage - daily updates)
# Allow: /faq/ (high engagement - weekly updates)
# Allow: /testimonials/ (social proof - weekly updates)
# Allow: /shop/ (conversion page - daily updates)

# Rate limits to prevent server overload
# Crawl-delay values optimized for:
# - Major search engines: 1-2 seconds
# - SEO tools: 3+ seconds
# - AI crawlers: No delay (cache-friendly)
# - Unknown bots: 5 seconds

# ============================================
# END OF robots.txt - CitrusBurn 2026
# ============================================