
Robots.txt file and other spider guiders

        Worried about how the search engines see your website? If so, it is time to learn how to make their spiders see what you want them to. Search engine user agents, which we call spiders, robots, or crawlers, are the most important visitors your website receives (excepting, perhaps, your customers), so it matters to your traffic whether they are indexing your whole site or just parts of it, and you will want to check up on their progress to find out.

        On top of "feeding them" well, (developing great site content) you also get to tell your eight-legged visitors a few other things, like which pages on your site are off limits, and which spiders are even allowed to index your site in the first place! They can be guided through your website in only two ways that you have control over. The first is by using a simple META Tag, (see our META Tag page for more info on that method) and more importantly, by a simple, plain 'ol text (.txt) file, located at your root web directory.

Example: http://www.yourdomain.com/robots.txt

        It should be noted that if you want all of your pages indexed and all of your links followed, by every search engine's spider, then the contents of your robots.txt file should look exactly like this:

User-agent: *
Disallow:

        That's it; nothing else at all is needed. All of the other directives exist for DISALLOWING pages, directories, links, or particular spiders. If you are trying to get every page on your site indexed by every search engine's spider, and to have all of your links followed (maximum search engine exposure), then go make a ".txt" file right now (MS Write or WordPad will do nicely). Simply copy and paste the two short lines above into it, name it "robots.txt" (without the quotes), and upload it to the root directory of your server. That's it, and you won't need the first two tools below, which others will use to create and analyze their more complicated robots.txt files. All of the other tools on this page are quite useful to everyone, as they emulate the spiders themselves, allowing you to stay one step ahead of your eight-legged indexers.
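        By way of contrast, here is a sketch of a more restrictive robots.txt file (the directory names and the "BadBot" agent name are made-up examples, not real spiders). The first block bars every spider from two directories; the second bars one particular spider, named by its User-agent string, from the whole site:

User-agent: *
Disallow: /cgi-bin/
Disallow: /private/

User-agent: BadBot
Disallow: /

        Each User-agent line starts a new block, and each Disallow line within that block names one path that the matching spiders must leave alone.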


Submit Corner.com's Robots.txt file generator

        This great little tool from Submit Corner is a real timesaver for building simple robots.txt files. No explanations necessary!

Who will you allow?
There are well over 200 user agents (spiders) that can read through your web pages and index them. Since new spiders are added and removed daily, we have limited the choices below to the major search engines (which make their User-Agent names public). The default setting is to apply your preferences to All Agents.

Imposing restrictions:

You may restrict which web pages spiders are allowed to index. By default, most users will want to allow all directories except their /cgi-bin directory, which commonly holds scripts. To allow all web pages, check the "Enable All Webpages" checkbox. Otherwise, enter each web page or directory path in the exclusion box, one per line (all directory paths must end with a "/"). If you checked "Enable All Webpages", this box will not be read.

Example: "http://www.sample.com/cgi-bin/" (Excludes /cgi-bin/ directory)
Example: "http://www.sample.com/hello.html" (Excludes /hello.html webpage)

WARNING: Do NOT leave a "/" in the box by itself, or you will exclude your entire site from being indexed!
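        Taking the two examples above, the file this generator produces would come out looking something like this. Note that robots.txt entries are written as paths relative to your domain root, not as full URLs:

User-agent: *
Disallow: /cgi-bin/
Disallow: /hello.html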



Robots.txt file validator

        Writing a robots.txt file? Use this handy tool from SearchEngineWorld.com to make sure it does just what you had in mind. While you're there, read up a bit on search engine spiders; these guys are the authority!
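        To give you an idea of what a validator catches, here are two classic mistakes (illustrative only); spiders reading a file like this would ignore or misread the bad lines:

User-agent: *
Disallow: http://www.yourdomain.com/cgi-bin/
Disallow: /cgi-bin/ /temp/

        The first Disallow is wrong because entries must be root-relative paths (/cgi-bin/), not full URLs; the second is wrong because each Disallow line may name only one path.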


Search Engine Spider Simulator

        This fast, free tool shows you the output of a spider's crawl across your website, or anyone else's. Simply enter a URL below to see the standard data reported back to the search engines, just as many of the spiders out there report it right now. It is an excellent tool to run on your competition if you want to try to rank more like them!
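        If you are curious what such a simulator is doing under the hood, here is a minimal sketch in Python (the URL and the "ExampleSpider" user agent are placeholders, and this is an illustration, not the tool's actual code). It fetches a page the way a crawler would and reports the parts a search engine actually reads:

# spider_view.py -- a rough sketch of what a spider simulator does:
# fetch a page like a crawler and report what a search engine reads.
from html.parser import HTMLParser
from urllib.request import Request, urlopen

class SpiderView(HTMLParser):
    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""          # contents of the <title> tag
        self.description = ""    # contents of the description META tag
        self.links = []          # every href found on the page

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self.in_title = True
        elif tag == "meta" and (attrs.get("name") or "").lower() == "description":
            self.description = attrs.get("content") or ""
        elif tag == "a" and attrs.get("href"):
            self.links.append(attrs["href"])

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data

# Placeholder URL and User-Agent; swap in your own site to try it.
req = Request("http://www.yourdomain.com/",
              headers={"User-Agent": "ExampleSpider/1.0"})
page = urlopen(req).read().decode("utf-8", errors="replace")

view = SpiderView()
view.feed(page)
print("Title:      ", view.title.strip())
print("Description:", view.description)
print("Links found:", len(view.links))

        Run it against your own homepage and compare the title, description, and link count with what you expected; any gap between the two is exactly what the simulator above helps you find.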


Poodle Predictor

        We have the zany guys at GRI Technologies to thank for this excellent, Google™-specific tool. Simply enter your URL below to see how your site will look in Google's search results, and from there you can see what Googlebot (Google's main spider) would have read to produce that listing, so you know how to rank better in Google™ specifically. Can Googlebot crawl your site easily? Will it get good rankings? Poodle Predictor is the answer to many of your Google™ questions.


Link Validation Utility 1.0

        This excellent free tool allows you to spider your website completely, test for broken links across your site, and even test the links on the sites you link to! It is great at what it does, with plenty of options, including reports and even a sitemap generator. It can even be told to obey your robots.txt file, which makes it a lot smarter than other link validators out there these days. Use it whenever you make moderate or larger changes, to dramatically reduce errors related to site presentation and usability.

        The only drawback of the freeware version: you're limited to spidering three levels deep. That's no problem for a site even this size, but "us.gov" might have a problem...
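        For the curious, here is a minimal sketch in Python of the robots.txt-aware part of a link checker like this one (not the utility's actual code; the site name, the link list, and the "ExampleChecker" agent are placeholders). It loads robots.txt with the standard library's robotparser, skips anything the file disallows, and flags broken links:

# check_links.py -- a rough sketch of a robots.txt-aware link checker
# (not the Link Validation Utility's actual code).
from urllib import robotparser
from urllib.error import HTTPError, URLError
from urllib.request import Request, urlopen

SITE = "http://www.yourdomain.com"            # placeholder domain
LINKS = [SITE + "/",
         SITE + "/about.html",
         SITE + "/private/notes.html"]        # placeholder links to test

# Load the site's robots.txt, just as a polite spider would.
rp = robotparser.RobotFileParser(SITE + "/robots.txt")
rp.read()

for url in LINKS:
    # Honor the robots.txt file: skip anything it disallows.
    if not rp.can_fetch("ExampleChecker/1.0", url):
        print("SKIPPED by robots.txt:", url)
        continue
    try:
        status = urlopen(Request(url, method="HEAD")).status
        print("OK", status, url)
    except HTTPError as err:      # e.g. 404 -- a broken link
        print("BROKEN", err.code, url)
    except URLError as err:       # DNS failure, connection refused, etc.
        print("ERROR", err.reason, url)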

Software specs:
Name: Link Validation Utility 1.0
File name: linkval.exe
File size: 4.6 MB
License: Freeware
Support site: http://www.hisoftware.com/linkvalidate/index.html
Last known update: 4/4/2003
Developer: HiSoftware
OSs supported: Win 9x, Me, 2K, NT, XP


File not downloading properly? Contact us and we will send it to you through email.



        ION posts all SEO tools, and links to tools, on a first-come-first-posted basis, and does not intend to exclude any submissions or links that would fit on this site unless we feel that a better tool doing the exact same job is already present. We will continually update our site to make sure that all links are fresh and that all of the best free tools in the industry are accounted for. If you have any comments, suggestions, or possible additions, please email them to: Library@InternetOptimization.Net.

        All links to off-site applications on these pages are free to use, but such applications are not the property of ION. Please respect the terms of use for each online tool, as defined on the owner's page or on the page that you are sent to while using the application.

© Copyright 2004-2008 Internet Optimization Network - All Rights Reserved.