• jaschen@lemm.ee
    29 days ago

    Web manager here. Don’t do this unless you wanna accidentally send Google’s crawlers to the same fate and get your site delisted.

      • Zexks@lemmy.world
        29 days ago

        Lol. And they’ll delist you. Unless you’re really important, good luck with that.

        robots.txt

        User-agent: *
        Disallow: /some-page.html

        If you disallow a page in robots.txt, Google won’t crawl it. Even when Google finds links to the page and knows it exists, Googlebot won’t download the page or see its contents. Google usually won’t index the URL either, but that isn’t guaranteed: it may still include the URL in the search index, along with words from the anchor text of links pointing to it, if it judges the page to be important.
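For anyone curious what "won't crawl the page" looks like from the crawler's side, here is a minimal sketch using Python's standard `urllib.robotparser`. It only shows how a well-behaved crawler could check a rule like the one above before fetching; it is not how Googlebot is actually implemented. The `example.com` URLs, the `User-agent: *` line, and the `is_allowed` helper are assumptions made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content. The "User-agent: *" line is an assumption
# added so that the Disallow rule belongs to a group and applies to all crawlers.
ROBOTS_TXT = """\
User-agent: *
Disallow: /some-page.html
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # A compliant crawler would skip the first URL and fetch the second.
    for url in (
        "https://example.com/some-page.html",
        "https://example.com/other-page.html",
    ):
        print(url, is_allowed("Googlebot", url))
```

Note that this check is entirely voluntary on the crawler's part: nothing on the server enforces it, and it says nothing about whether the URL ends up in a search index.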

        • rosco385@lemm.ee
          29 days ago

          It’d be more naive to have a robots.txt file on your webserver and be surprised when webcrawlers don’t stay away. 😂