How do I prevent search engines from indexing private content?

I remember that in July 2012 there was a wave of discussion about search engines exposing content that, in theory, should never have appeared in their results, even in cases where the private sections of a site were protected by user authorization.


My site also has sections that I would not like to see in search results. So the question is: how do I configure robots.txt (and maybe something else) to keep private user content out of public access?


My site uses authorization; only after logging in can a user open their profile and see their content.
October 3rd 19 at 04:27
7 answers
October 3rd 19 at 04:29
robots.txt does not always work (why is an open question), but the <meta name="robots" content="noindex,nofollow"> tag has never failed me.
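For clarity, a minimal sketch of where that tag lives, assuming you can edit the templates of the private pages:

<head>
  ...
  <!-- tells compliant crawlers not to index this page or follow its links -->
  <meta name="robots" content="noindex,nofollow">
</head>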
October 3rd 19 at 04:31
You can also use robots.txt (for example, you can generate it here) and wrap the content in <noindex>...</noindex> tags (see the link).
Judging by the description on the wiki, only Yandex takes that tag into account. You can try nofollow - Chesley_Ols commented on October 3rd 19 at 04:34
October 3rd 19 at 04:33
In robots.txt

User-agent: *
Disallow: /corenell

And for your own peace of mind, also put this in the <head> of every private page:
<meta name="robots" content="none">
none is shorthand for noindex, nofollow
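If editing the HTML is inconvenient, the same directive can also be sent as an HTTP header. A rough nginx sketch, assuming the private pages live under /profile (the path is a placeholder for your own):

# nginx: send the robots directive as a response header for the private section
location /profile {
    add_header X-Robots-Tag "noindex, nofollow" always;
    # ...the usual proxy_pass / fastcgi_pass for the application goes here...
}

Keep in mind that add_header inside a location replaces any add_header directives inherited from the server level, so repeat there whatever else you still need.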
October 3rd 19 at 04:35
Well, in our case Googlebot blew right past even basic authorization; it feels like the passwords get leaked and it then indexes closed sites with impunity, paying no attention to robots.txt at all. After that I honestly don't know how to defend yourself; apparently the only option is simply not to use Chrome. We started rejecting it by User-Agent right away, so the basic-auth password could not be leaked.
I went digging just now to shut Googlebot out and found a hole on our side. More precisely, not even ours; it looks like either a bug or a feature of nginx: the request gets handled by the HTTPS virtual host of another site (server_name is ignored), so in the end the bot crawls in through the other upstream. - Chesley_Ols commented on October 3rd 19 at 04:38
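For reference, that nginx behaviour is documented: when no server_name matches, the request is handled by the default server for that listen socket, so a bot probing by IP or a stray hostname falls into whichever vhost happens to be first. A rough sketch of an explicit catch-all that closes the hole (the certificate paths are placeholders):

# nginx: explicit catch-all so requests with an unmatched Host never reach a real site
server {
    listen 443 ssl default_server;
    server_name _;
    ssl_certificate     /etc/nginx/ssl/dummy.crt;  # any self-signed certificate will do
    ssl_certificate_key /etc/nginx/ssl/dummy.key;
    return 444;  # nginx-specific: close the connection without sending a response
}

On nginx 1.19.4 and newer, ssl_reject_handshake on; can replace the dummy certificate.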
Chrome does leak content (via translation and a bunch of other things), but!
Google does not ignore robots.txt and does not show password-protected content (unless you wanted it to) - Jaclyn_Ro commented on October 3rd 19 at 04:41
October 3rd 19 at 04:37
Well, I don't use basic authorization, and I really like Chrome.
October 3rd 19 at 04:39
Won't HTTPS save you?
October 3rd 19 at 04:41
Well, first you need to understand how the robot gets onto the pages that you consider private... If, as people here assume, Google leaks passwords from Chrome :) and the same happened to you, then the authorized session should have turned up in your logs; if it didn't, you have a PROBLEM, because the robot is getting in where nobody invited it, just like that :).
And, shooting from the hip: the robot is easy enough to detect, so don't let it fetch anything it isn't supposed to get, and nothing unwanted will be indexed.
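One rough way to "not let it fetch" the private section, assuming nginx sits in front of the application (the /profile path and the bot list are illustrative, and User-Agent is trivial to spoof, so this only complements the application-level authorization):

# at the http{} level: flag the well-known crawlers by User-Agent
map $http_user_agent $is_bot {
    default                          0;
    ~*(googlebot|yandexbot|bingbot)  1;
}

# inside the site's server{} block: crawlers never reach the private pages
location /profile {
    if ($is_bot) {
        return 403;
    }
    # ...normal handling for logged-in users...
}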
