Spiderline
custom search engine solutions
Your Own Search Engine.
Just seconds after registering, your web site can be searchable with the features you want and reliability you need. No software to install or maintenance required. Search results can match your website design seamlessly.

Site Search Knowledge Base


Crawl Questions

Crawling and Indexing Questions

article Robot Exclusion Guide
The robots.txt file and robot META tags are methods used to allow and disallow crawling portions of your site by robots (web robots, spiders). Website administrators and content providers can...

  2005-01-20    Views: 22000   
article Learn How to Use NOINDEX & NOFOLLOW
"Crawling" is the process of finding content on your web site. Finding web pages is similar to a web user browsing through a site and clicking on links. Spiderline spiders also browse your...

  2005-01-20    Views: 14790   
article Learn How to Configure URLs
Configuring URLs can be as simple or detailed as needed for your website. The Starting URL and Pattern fields in combination with the INDEX and FOLLOW options allow you to control exactly...

  2005-01-20    Views: 8829   
article Excluding the crawler from sections of pages
This help topic describes how to prevent sections of a document from being indexed. To prevent an entire document from being indexed, see the topics above. Spiderline supports the proprietary...

  2005-01-20    Views: 189604   
article How do I exclude parts of my site from being crawled?
To exclude areas from being indexed, you will need to add commands to the URLs section of the Crawl settings. These Patterns tell the crawler what to index and what to avoid indexing. Type...

  2005-04-27    Views: 20087   
article How many pages are on my site?
You or your webmaster is the best judge of the size of your site. (Spiderline cannot know how many pages are on your site without physically crawling it.) You can purchase a lower plan than needed...

  2005-04-20    Views: 18866   
article Will Spiderline follow links in frames?
Spiderline will follow links found in framesets.

(No rating)  2005-03-31    Views: 16301   
article Server HTTP response (error) codes
Errors from an HTTP server that display in a browser are the same ones that the Spiderline Crawler receives. Common codes are: 301 Moved Permanently; 302 Found Redirect; 303 See Other; 307...

  2005-07-12    Views: 14715   
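As a rough illustration of how a crawler might interpret the response codes listed in that article, here is a minimal Python sketch using the standard library's `http.HTTPStatus`. The `describe` helper is hypothetical and is not part of Spiderline; note that the standard phrase for 302 is simply "Found".

```python
from http import HTTPStatus

# Map the first digit of a status code to its general class.
STATUS_CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}

def describe(code: int) -> str:
    """Return a readable summary of an HTTP status code (hypothetical helper)."""
    try:
        phrase = HTTPStatus(code).phrase
    except ValueError:
        # Code is not registered in the standard library's table.
        phrase = "Unknown"
    return f"{code} {phrase} ({STATUS_CLASSES.get(code // 100, 'unknown')})"

for code in (301, 302, 303):
    print(describe(code))
```

A crawler would typically follow the 3xx codes to the new location and record 4xx and 5xx codes as errors in its crawl log.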
article My account is not getting crawled!
Reasons an account may not be crawled. Log into your account, check the crawl log and last crawl date. Is your account out of crawls? Is your account expired? Is your website up and...

(No rating)  2005-01-19    Views: 13304   
article Can Spiderline index ASP sites?
Yes, Spiderline can index ASP, PHP, JSP, CFM, MPE, and any other dynamically produced HTML.

  2005-01-19    Views: 12227   
article Why didn't Spiderline find all of my pages?
There are several reasons that all of your web site pages may not have been crawled. Account Document Limit: The default Document Limit is 100 pages. In order for Spiderline to crawl over this...

(No rating)  2005-01-19    Views: 11608   
article Does Spiderline allow manual and scheduled indexing?
Yes. Spiderline provides all accounts (under any service plan) with the ability to request a recrawl/update of your website at any time. Our automated crawl scheduling feature will let you choose...

  2005-01-19    Views: 11050   
article Why do I get duplicate pages in my results?
If duplicate pages are identical in all respects except for their URLs, you should get a report of "document content matches previously processed document" next to the page in your crawl report....

(No rating)  2005-01-20    Views: 10925   
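One common way to detect pages that are identical except for their URLs is to compare a hash of each page's content. The Python sketch below shows that general technique only; it is an assumption for illustration, not a description of how Spiderline detects duplicates, and the `find_duplicates` function and example URLs are hypothetical.

```python
import hashlib
from typing import Dict

def find_duplicates(pages: Dict[str, str]) -> Dict[str, str]:
    """Map each duplicate URL to the first URL seen with the same content."""
    first_seen: Dict[str, str] = {}   # content digest -> first URL with that content
    duplicates: Dict[str, str] = {}   # duplicate URL -> original URL
    for url, content in pages.items():
        digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
        if digest in first_seen:
            duplicates[url] = first_seen[digest]
        else:
            first_seen[digest] = url
    return duplicates

# Hypothetical crawl results: two URLs serve the same content.
pages = {
    "http://www.example.com/": "<html>Home</html>",
    "http://www.example.com/index.html": "<html>Home</html>",
    "http://www.example.com/about.html": "<html>About</html>",
}
duplicates = find_duplicates(pages)
```

Here `duplicates` maps `http://www.example.com/index.html` back to `http://www.example.com/`, the first URL processed with that content.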
article My crawl never completed.
If your crawl is not returning after an adequate time. Adequate meaning enough time for you to upload your full site on a 56K connection, time for PDF files and text documents to be opened and read...

(No rating)  2005-04-27    Views: 9891   
article How do robot meta tags work?
The Robots META tag is another method that may be used to indicate to visiting robots whether a page should be indexed (crawled), or links on the page should be followed. It differs from the...

(No rating)  2005-04-27    Views: 9562   
article Why didn't Spiderline crawl documents on other sites that I linked to?
Check your URL Configuration Patterns. In order to make documents on other websites searchable, but only the documents you link to and not the entire other website, enter "/  INDEX  NOFOLLOW" on...

  2005-01-20    Views: 8943   
article Robot META tags Tutorial.
The Robots META tag is another method that may be used to indicate to visiting robots whether a page should be indexed (crawled), or links on the page should be followed. It differs from the...

  2005-04-27    Views: 8838   
article Using NOINDEX and NOFOLLOW patterns
"Crawling" is the process of finding content on your web site. Finding web pages is similar to a web user browsing through a site and clicking on links. Spiderline spiders also browse your...

(No rating)  2005-05-03    Views: 8048   
article Does NOINDEX keep documents from being counted?
Yes, if the page or directory is listed with NOINDEX in the URL patterns; this is because the document is avoided by the crawler. No, if the pages are being omitted by robot META tags; this is because...

(No rating)  2005-06-27    Views: 7956   
article Does Spiderline honor the robot exclusion protocol?
Yes, Spiderline does honor the robot exclusion protocol. Our spiders will not index directories or follow links that have been disallowed in the robots.txt configuration file located on your...

(No rating)  2005-01-20    Views: 7686   
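To see how robots.txt rules apply to specific URLs, the following Python sketch uses the standard library's `urllib.robotparser`. The robots.txt content, the `ExampleBot` user-agent name, and the example URLs are all hypothetical; this shows the general protocol, not Spiderline's own implementation.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content: all robots are barred from two directories.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Disallow: /tmp/
"""

def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text directly, without fetching it over the network."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser

parser = build_parser(ROBOTS_TXT)

# A page outside the disallowed directories may be crawled.
allowed = parser.can_fetch("ExampleBot", "http://www.example.com/index.html")

# A page inside /private/ may not.
blocked = parser.can_fetch("ExampleBot", "http://www.example.com/private/report.html")

print(allowed, blocked)
```

A robot that honors the protocol performs a check like this before requesting each URL and skips any URL for which the check fails.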
article What type of documents will Spiderline index?
Presently, Spiderline supports Microsoft Word, Adobe PDF, HTML, RTF and text. We can add additional document types on request.

(No rating)  2005-01-19    Views: 7514   
article What changes to my account require re-crawling?
Changes to any of the following configurations will require re-crawling your website: URL configuration, Exclude Word List, any authentication, Robots handling. If you have an...

(No rating)  2005-01-20    Views: 7089   
article Do I have to wait for Spiderline to finish crawling my web site?
Search results are not available until the first crawl is completed. Any future crawls will replace the current index. The search box in the design section of your control panel will search...

  2005-01-19    Views: 7050   
article JavaScript Navigation
Spiderline (like most site search engines) does not have the ability to interpret Java or JavaScript code to harvest links. Java and JavaScript need to be executed to be read, and it is...

  2005-03-30    Views: 6851   
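The Python sketch below illustrates why links produced by JavaScript are invisible to a crawler that only parses HTML. It uses the standard library's `html.parser`; the `LinkHarvester` class and the sample page are hypothetical and are not Spiderline code.

```python
from html.parser import HTMLParser

class LinkHarvester(HTMLParser):
    """Collect href values from plain <a> tags (hypothetical example class)."""

    def __init__(self) -> None:
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            # Skip javascript: pseudo-links, which cannot be fetched as URLs.
            if name == "href" and value and not value.lower().startswith("javascript:"):
                self.links.append(value)

# Hypothetical page: one plain link, one link written by a script,
# and one javascript: pseudo-link.
PAGE = (
    '<a href="/about.html">About</a>'
    "<script>document.write('<a href=\"/hidden.html\">Hidden</a>');</script>"
    '<a href="javascript:openPage(\'news\')">News</a>'
)

harvester = LinkHarvester()
harvester.feed(PAGE)
harvester.close()
links = harvester.links
```

Only `/about.html` is harvested. The link inside the `<script>` element is treated as script text rather than markup, so it would only exist after the script is executed in a browser.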
article How do I use the Patterns fields to specify what should and should not be crawled?
To exclude areas from being indexed, you will need to add commands to the URLs section of the Crawl settings. These Patterns tell the crawler what to index and what to avoid indexing. Type...

  2005-01-20    Views: 6732   
article What parts of a document does Spiderline crawl?
For HTML documents, the page title, Meta Keywords, Meta Description, and body text will be crawled. Image ALT tags and Robot comments are read but not indexed; you have the ability to use these...

(No rating)  2005-01-20    Views: 6321   
article No new crawls. Crawls stopped.
First check your account to see if you are either out of crawls or your account is expired. If your account is active, with crawls available, does the crawl status say "crawl in progress",...

  2005-04-27    Views: 5817   

