Making efforts to optimize a site is great, but what counts is how search engines see those efforts. Even the most careful optimization does not guarantee a top position in search results; on the other hand, if your site does not follow basic search engine optimization principles, it is almost certain that it will not score well with search engines. One way to check in advance how your SEO efforts are seen by search engines is to use a search engine spider simulator.
Spiders Explained
Basically, all search engine spiders function on the same principle – they crawl the Web and index pages, which are stored in a database; later, various algorithms are applied to the collected pages to determine their ranking, relevancy and so on. While the algorithms for calculating ranking and relevancy differ widely among search engines, the way spiders index sites is more or less uniform, so it is very important to know what spiders are interested in and what they neglect.
Search engine spiders are robots and they do not read your pages the way a human does. Instead, they tend to see only particular things and are blind to many extras (Flash, JavaScript) that are intended for humans. Since spiders determine whether humans will find your site at all, it is worth considering what spiders like and what they don't.
Flash, JavaScript, Image Text or Frames?!
Flash, JavaScript and text embedded in images are NOT visible to search engines, and frames are a real disaster in terms of SEO ranking. All of them might be great in terms of design and usability, but for search engines they are a liability. One of the worst mistakes you can make is a Flash intro page (frames or no frames, this will hardly make the situation worse) with the keywords buried in the animation. Run a page with Flash and images (and preferably no text or inbound or outbound hyperlinks) through the Search Engine Spider Simulator tool and you will see that, to search engines, such a page appears almost blank.
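To see why, consider a sketch of such a page (the file name intro.swf and the markup are invented for illustration): every word a visitor sees is rendered inside the Flash movie, so a spider simulator will report little more than the page title.

    <html>
    <head><title>Welcome</title></head>
    <body>
      <!-- All the text and keywords live inside intro.swf (a made-up file name),
           so a spider that ignores Flash finds no indexable content here. -->
      <object type="application/x-shockwave-flash" data="intro.swf" width="800" height="600">
        <param name="movie" value="intro.swf">
      </object>
    </body>
    </html>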
Running your site through this simulator will show you more than the fact that Flash and JavaScript are not SEO favorites. In a way, spiders are like text browsers: they don't see anything that is not a piece of text. So text placed inside an image means nothing to a spider and will simply be ignored. A workaround (recommended as an SEO best practice) is to include a meaningful description of the image in the ALT attribute of the <IMG> tag, but be careful not to cram too many keywords into it, because you risk penalties for keyword stuffing. The ALT attribute is especially essential when you use images rather than text for links. You can also use ALT text to describe what a Flash movie is about but, again, be careful not to cross the line between optimization and over-optimization.
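As a rough illustration of the workaround described above (the file name and wording are invented), compare how much a spider gets out of each of these variants:

    <!-- The spider sees nothing: the text lives inside the image. -->
    <img src="summer-sale-banner.gif">

    <!-- A short, natural ALT description the spider can index. -->
    <img src="summer-sale-banner.gif" alt="Summer sale – 20% off garden furniture">

    <!-- Over-optimized ALT text like this risks a keyword-stuffing penalty. -->
    <img src="summer-sale-banner.gif" alt="sale cheap furniture sale garden furniture best cheap sale">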
Are Your Hyperlinks Spiderable?
The search engine spider simulator can be of great help when you are trying to figure out whether your hyperlinks lead to the right place. For instance, link exchange websites often put fake links to your site using JavaScript (with mouseover events and similar tricks to make the link look genuine), but this is not a link that search engines will see and follow. Since the spider simulator does not display such links, you'll know that something is wrong with the link.
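The difference is easy to spot in the markup. The snippet below is a hypothetical reconstruction of such a fake link: the first element only reacts to mouse events and exposes no URL a spider can follow, while the second is an ordinary anchor that both visitors and spiders can use.

    <!-- Looks and behaves like a link for visitors, but there is no href to follow. -->
    <span onclick="window.location='http://www.example.com/'"
          onmouseover="this.style.cursor='pointer'">Visit our partner</span>

    <!-- A genuine, spiderable link. -->
    <a href="http://www.example.com/">Visit our partner</a>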
It is highly recommended to use the <noscript> tag alongside JavaScript-based menus. The reason is that JavaScript-based menus are not spiderable, so all the links in them will be ignored. The solution to this problem is to put all menu item links inside a <noscript> tag. The <noscript> tag can hold a lot, but please avoid using it for link stuffing or any other kind of SEO manipulation.
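A minimal sketch of that approach (the menu items, URLs and the menu.js script are invented): the JavaScript menu remains in place for visitors, and the same destinations are repeated as plain links inside <noscript> so spiders can follow them.

    <!-- JavaScript-driven menu for visitors. -->
    <script type="text/javascript" src="menu.js"></script>
    <div id="menu"></div>

    <!-- The same menu links as plain, spiderable anchors. -->
    <noscript>
      <a href="/products.html">Products</a>
      <a href="/services.html">Services</a>
      <a href="/contact.html">Contact</a>
    </noscript>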
If you happen to have tons of hyperlinks on your pages (although it is highly recommended to have fewer than 100 hyperlinks per page), you might have a hard time checking whether they are all OK. For instance, if a link leads to a page that returns “403 Forbidden”, “404 Page Not Found” or a similar error that prevents the spider from accessing it, that page will certainly not be indexed. It is worth mentioning that a spider simulator does not deal with 403 and 404 errors, because it checks where links lead to, not whether the target of the link is actually in place, so you need other tools to check that the targets of your hyperlinks are the intended ones.
Looking for Your Keywords
While there are specific tools, like the Keyword Playground or the Website Keyword Suggestions, which deal with keywords in more detail, search engine spider simulators also help you see, through the eyes of a spider, where keywords are located within the text of the page. Why is this important? Because keywords in the first paragraphs of a page weigh more than keywords in the middle or at the end. And even if keywords visually appear to be at the top of the page, this may not be the way spiders see them. Consider a standard Web page laid out with tables. In source order, the code that describes the page layout (navigation links, or cells with text that is the same site-wide) may come first and, what is worse, can be so long that the actual page-specific content ends up screens away from the top of the page. When we look at the page in a browser, everything seems fine – the page-specific content is at the top – but since the HTML code is ordered just the opposite way, the page will not be recognized as keyword-rich.
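The effect is easier to see in the markup itself. In this invented table layout, the navigation cell comes first in the source, so the spider wades through site-wide links before it ever reaches the page-specific text that appears "at the top" in a browser:

    <table>
      <tr>
        <!-- Site-wide navigation: identical on every page and first in the source. -->
        <td width="200">
          <a href="/">Home</a>
          <a href="/about.html">About us</a>
          <a href="/products.html">Products</a>
          <!-- often dozens more layout and navigation lines -->
        </td>
        <!-- The keyword-rich, page-specific content only starts here. -->
        <td>
          <h1>Handmade oak garden furniture</h1>
          <p>The page-specific text with the keywords you actually care about.</p>
        </td>
      </tr>
    </table>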
Are Dynamic Pages Too Dynamic to Be Seen At All?
Dynamic pages (especially ones with question marks in the URL) are another thing spiders do not love, although many search engines do index dynamic pages as well. Running the spider simulator will give you an idea of how well your dynamic pages are accepted by search engines. Useful suggestions on how to deal with search engines and dynamic URLs can be found in the Dynamic URLs vs. Static URLs article.
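As a simple illustration (both URLs are invented), the first link below is the kind of dynamic, question-mark URL this section warns about, while the second is a static-looking alternative to the same page that spiders handle more readily:

    <!-- Dynamic URL with a query string. -->
    <a href="/catalog.php?category=7&amp;item=42">Oak garden bench</a>

    <!-- Static-looking URL for the same content. -->
    <a href="/catalog/garden-benches/oak-bench.html">Oak garden bench</a>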
Meta Keywords and Meta Description
Meta keywords and meta description, as the names imply, are found in the <META> tags of an HTML page. Meta keywords and meta descriptions were once among the most important criteria for determining the relevance of a page, but search engines now employ other mechanisms for determining relevancy, so you can safely skip listing keywords and a description in meta tags (unless you want to add instructions there for the spider about what to index and what not to; apart from that, meta tags are not very useful anymore).
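For reference, this is roughly what the tags in question look like (the wording is invented); the robots meta tag is the one that still carries instructions about what to index and which links to follow:

    <head>
      <title>Handmade Oak Garden Furniture</title>
      <!-- Largely ignored for ranking purposes nowadays. -->
      <meta name="keywords" content="oak garden furniture, garden benches">
      <meta name="description" content="Handmade oak garden benches and tables, built to order.">
      <!-- Instructions for the spider: index this page and follow its links. -->
      <meta name="robots" content="index, follow">
    </head>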