Search engines generally crawl a website using computer programs known as bots. Google, for example, crawls websites using Googlebot. A robots.txt file tells a bot which folders it should not access, such as folders that contain confidential or unnecessary data.
The file format is explained below with examples.
This example tells all robots that they can visit all files, because the wildcard * stands for all robots and the Disallow directive has no value, meaning no pages are disallowed:
    User-agent: *
    Disallow:
The same result can be accomplished with an empty or missing robots.txt file.
This example tells all robots to stay out of a website:
    User-agent: *
    Disallow: /
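The two policies above (allow everything, block everything) can be checked with a short script. This is a minimal sketch using Python's standard urllib.robotparser module; the helper name allowed, the bot name AnyBot, and the example.com URL are assumptions made up for illustration.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_txt: str, agent: str, url: str) -> bool:
    """Parse robots.txt text and report whether `agent` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# The two policies from the examples, one directive per line.
ALLOW_ALL = "User-agent: *\nDisallow:\n"
BLOCK_ALL = "User-agent: *\nDisallow: /\n"

# Hypothetical URL and bot name, used only for this check.
URL = "https://example.com/page.html"

print("allow all :", allowed(ALLOW_ALL, "AnyBot", URL))
print("block all :", allowed(BLOCK_ALL, "AnyBot", URL))
print("empty file:", allowed("", "AnyBot", URL))
```

The empty string stands in for an empty robots.txt file, which this parser treats the same way as the allow-all policy.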
This example tells all robots to stay away from one specific file:
    User-agent: *
    Disallow: /directory/file.html
Note that all other files in the specified directory will be processed.
This example tells all robots not to enter three directories:
    User-agent: *
    Disallow: /cgi-bin/
    Disallow: /tmp/
    Disallow: /junk/
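To see how file and directory rules behave, the sketch below merges the single-file rule and the three directory rules into one file (an assumption made for brevity; the examples show them separately) and tests a few paths with Python's standard urllib.robotparser. The bot name AnyBot, the example.com host, and the tested file names are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# File rule and directory rules combined into one group for this sketch.
ROBOTS_TXT = """\
User-agent: *
Disallow: /directory/file.html
Disallow: /cgi-bin/
Disallow: /tmp/
Disallow: /junk/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def may_fetch(path: str) -> bool:
    """Check a path on a hypothetical host for a hypothetical bot."""
    return parser.can_fetch("AnyBot", "https://example.com" + path)


for path in ("/directory/file.html", "/directory/other.html",
             "/tmp/cache.txt", "/index.html"):
    print(path, "->", "allowed" if may_fetch(path) else "blocked")
```

Only the named file is blocked inside /directory/, while everything under the three listed directories is blocked.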
This example tells a specific robot to stay out of a website:
    User-agent: BadBot # replace 'BadBot' with the actual user-agent of the bot
    Disallow: /
This example tells two specific robots not to enter one specific directory:
    User-agent: BadBot # replace 'BadBot' with the actual user-agent of the bot
    User-agent: Googlebot
    Disallow: /private/
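A group that names specific robots leaves every other robot unrestricted. The sketch below checks this with Python's standard urllib.robotparser; OtherBot, the example.com host, and the file names are hypothetical, and the version suffix in BadBot/2.1 is only there to show that this particular parser ignores it and compares names case-insensitively.

```python
from urllib.robotparser import RobotFileParser

# Two named robots share one rule; comments after "#" are ignored.
ROBOTS_TXT = """\
User-agent: BadBot  # replace 'BadBot' with the actual user-agent of the bot
User-agent: Googlebot
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

PRIVATE = "https://example.com/private/data.html"  # hypothetical URL
PUBLIC = "https://example.com/public/data.html"    # hypothetical URL

for agent in ("BadBot", "BADBOT/2.1", "Googlebot", "OtherBot"):
    print(agent, "private:", parser.can_fetch(agent, PRIVATE),
          "public:", parser.can_fetch(agent, PUBLIC))
```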
Example demonstrating how comments can be used:
    # Comments appear after the "#" symbol at the start of a line, or after a directive
    User-agent: * # match all bots
    Disallow: / # keep them out
It is also possible to list multiple robots with their own rules. The actual robot string is defined by the crawler. A few robot operators, such as Google, support several user-agent strings, so a site operator can deny access to a subset of their services by naming a specific user-agent string.
Example demonstrating multiple user-agents:
    User-agent: googlebot        # all Google services
    Disallow: /private/          # disallow this directory

    User-agent: googlebot-news   # only the news service
    Disallow: /                  # disallow everything

    User-agent: *                # any robot
    Disallow: /something/        # disallow this directory
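One way to picture how a crawler chooses among several groups is the hand-written sketch below. It is a simplified illustration, not how any real crawler is implemented: it assumes a robot uses the group whose name equals its own exactly (ignoring case) and otherwise falls back to the * group, and it only understands User-agent and Disallow lines. The names parse_groups, is_allowed, and otherbot are hypothetical.

```python
ROBOTS_TXT = """\
User-agent: googlebot        # all Google services
Disallow: /private/          # disallow this directory

User-agent: googlebot-news   # only the news service
Disallow: /                  # disallow everything

User-agent: *                # any robot
Disallow: /something/        # disallow this directory
"""


def parse_groups(robots_txt: str) -> dict[str, list[str]]:
    """Map each lower-cased user-agent name to its Disallow prefixes."""
    groups: dict[str, list[str]] = {}
    current: list[str] = []   # agents named by the group being read
    reading_rules = False     # True once the group has seen a rule line
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if reading_rules:                 # a new group starts here
                current, reading_rules = [], False
            current.append(value.lower())
            groups.setdefault(value.lower(), [])
        elif field == "disallow":
            reading_rules = True
            if value:                         # empty Disallow blocks nothing
                for agent in current:
                    groups[agent].append(value)
    return groups


def is_allowed(groups: dict[str, list[str]], agent: str, path: str) -> bool:
    """Use the agent's own group if present, else the '*' group."""
    rules = groups.get(agent.lower(), groups.get("*", []))
    return not any(path.startswith(rule) for rule in rules)


groups = parse_groups(ROBOTS_TXT)
for agent in ("googlebot", "googlebot-news", "otherbot"):
    for path in ("/private/a.html", "/something/b.html", "/index.html"):
        print(agent, path, is_allowed(groups, agent, path))
```

Under these assumptions, googlebot is kept out of /private/ only, googlebot-news is kept out of the whole site, and any other robot is kept out of /something/ only.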