This help topic describes how to prevent sections of a document from being indexed. To prevent an entire document from being indexed, see the topics above.
Spiderline supports the proprietary "robots" comment tag. This tag allows a web author to apply robots exclusion rules to arbitrary sections of a
document. The tag has one attribute, content, with the following possible values:
- noindex - the text enclosed in the tag is not saved in the index
- nofollow - links are not extracted from the text enclosed
- none - the text enclosed in the tag is neither indexed nor searched for links
The values "index", "follow", and "all" are also valid, but they have
no effect because they are the implicit defaults.
This feature is intended for excluding certain parts of a document,
such as a navigational sidebar, from the search index.
Example:
<HTML>
<BODY>
This text will be indexed.
<A HREF="foo.html"> this link will be followed </A>
<!-- robots content="none" -->
This text will NOT be indexed.
<A HREF="bar.html"> this link will NOT be followed </A>
<!-- /robots -->
<!-- robots content="noindex" -->
This text will NOT be indexed.
<A HREF="bar1.html"> this link WILL be followed </A>
<!-- /robots -->
<!-- robots content="nofollow" -->
This text WILL be indexed.
<A HREF="bar2.html"> this link will NOT be followed </A>
<!-- /robots -->
This text will be indexed.
</BODY>
</HTML>
For a navigational sidebar, the "noindex" value is the best choice:
the sidebar text stays out of the index, but its links are still followed.
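A minimal sketch of the sidebar case, using only the tag syntax described in this topic. The page layout and the file names (products.html, support.html) are hypothetical and serve only as illustration:

```html
<HTML>
<BODY>
<!-- robots content="noindex" -->
Sidebar text: NOT indexed.
<A HREF="products.html"> sidebar link, still followed </A>
<A HREF="support.html"> sidebar link, still followed </A>
<!-- /robots -->
Main article text: indexed as usual.
</BODY>
</HTML>
```

Using "none" here instead would also stop the crawler from following the sidebar links, which is usually not wanted for navigation.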
This syntax was designed to match the robots META tag.
For documents that have both the "robots" META tag and
the "robots" comment tag, Spiderline applies the most restrictive
interpretation, always erring on the side of not indexing or not following.
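As an illustration of how the two tags might combine, the sketch below places a "nofollow" META tag on a page that also uses a "noindex" comment tag. The annotations reflect our reading of the most-restrictive rule stated above, not verified crawler output, and the file names a.html and b.html are hypothetical:

```html
<HTML>
<HEAD>
<META NAME="robots" CONTENT="nofollow">
</HEAD>
<BODY>
This text will be indexed.
<A HREF="a.html"> this link will NOT be followed (META nofollow) </A>
<!-- robots content="noindex" -->
This text will NOT be indexed (comment tag noindex).
<A HREF="b.html"> this link will NOT be followed (META nofollow still applies) </A>
<!-- /robots -->
</BODY>
</HTML>
```

Under the same reading, a comment tag with content="follow" inside this document would not re-enable link extraction, because the document-level "nofollow" is the more restrictive of the two.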