John Mueller, a webmaster trends analyst at Google, recently discussed how Googlebot discovers sites that lack external links.
The discussion arose from a Reddit thread where someone asked:
“How does Googlebot find a site if no one is linking to the site, and it hasn’t been submitted to Search Console?”
Mueller acknowledged that identifying precisely how these sites are located can be “tricky.”
Some possible ways include:
– Third parties monitoring domain registrations
– Unintentional backlinks due to URL typos
– Toolbars linking to related content
– Sitemaps or RSS/Atom feeds auto-generated by the CMS (see the sketch after this list)
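As a quick illustration of that last point, here is a minimal Python sketch that probes a freshly launched domain for discovery surfaces a CMS may have created on its own. The endpoint paths are common conventions rather than an exhaustive or authoritative list, and example.com is a placeholder:

```python
from urllib.error import HTTPError, URLError
from urllib.request import Request, urlopen

# Common auto-generated discovery surfaces; a CMS often creates these
# without the site owner asking for them. These paths are conventions,
# not guarantees.
COMMON_ENDPOINTS = [
    "/sitemap.xml",   # XML sitemap, often emitted by default
    "/robots.txt",    # may itself advertise a Sitemap: URL
    "/feed/",         # WordPress-style RSS feed
    "/rss.xml",
    "/atom.xml",
]


def probe(domain: str) -> None:
    """Issue a HEAD request to each conventional endpoint and report the result."""
    for path in COMMON_ENDPOINTS:
        url = f"https://{domain}{path}"
        try:
            with urlopen(Request(url, method="HEAD"), timeout=5) as resp:
                print(f"{url} -> {resp.status}")
        except (HTTPError, URLError) as err:
            print(f"{url} -> {err}")


if __name__ == "__main__":
    probe("example.com")  # placeholder domain
```

If any of these respond with a 200, the site is already advertising itself to anything that knows where to look.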
Mueller advises using the noindex tag if you absolutely want a site kept out of search results. Don't assume that a site will go unnoticed by search engines just because it isn't promoted or linked.
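For reference, the noindex directive can be served either as an X-Robots-Tag HTTP response header or as a robots meta tag in the HTML. The following stdlib-only Python sketch checks a page for both, under the assumption that you want to audit a site before launch; the URL at the bottom is a placeholder:

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class RobotsMetaParser(HTMLParser):
    """Collect the content of every <meta name="robots"> tag on a page."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "robots":
            self.directives.append((attrs.get("content") or "").lower())


def has_noindex(url: str) -> bool:
    """Return True if the page serves noindex via header or meta tag."""
    with urlopen(url) as resp:
        # The directive may arrive as an HTTP response header...
        if "noindex" in (resp.headers.get("X-Robots-Tag") or "").lower():
            return True
        # ...or be embedded in the HTML itself.
        parser = RobotsMetaParser()
        parser.feed(resp.read().decode("utf-8", errors="replace"))
    return any("noindex" in d for d in parser.directives)


if __name__ == "__main__":
    print(has_noindex("https://example.com/"))  # placeholder URL
```

Note that noindex only works if crawlers can actually fetch the page to see it, which is why it keeps a site out of results rather than out of crawl logs.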
He also shared advice for those who want to launch a new site with maximum impact:
“If you want to launch something new with a bang (if that’s your goal with a new and unknown domain), one approach is to use the site removal tool to hide the site in search, and then cancel that request when making it live — this allows Google to crawl and index the content in advance, but keeps it hidden from search results.”
This approach is quicker than switching a site from noindex to indexable, but there is no guarantee the site won't be discovered by search engines other than Google in the meantime.
The only sure way to keep a site out of search results is to use a noindex tag.