Website optimization
When you're building a new website or completely renovating an old one,
it's important to create your design in a search engine friendly way. The
choices you make are going to be with you for a long time and errors will
be very time-consuming to repair at a later stage.
In other parts of this site, we've looked at how to make individual
pages rank well. Now, let's focus on website optimization and examine your
site as a whole. We'll go over the design techniques and principles that
the search engines like, but we'll also take a brief glimpse at some
potential pitfalls. Welcome aboard! I hope you enjoy the trip.
Use as much text as possible
When the World Wide Web was born in the early 1990s, it was mainly a
text-based medium. Sounds, images and complex animations were either very
rare or completely unheard of. Not surprisingly, the first major search
engines that came around a couple of years later were built to classify and
rank WWW pages largely based on textual content. After all, the WWW
consisted of text and would continue to do so for the foreseeable future,
right?
Towards the late 1990s, the web had started to change. Although the
role of text was still very important, it was now common for web pages to
contain large images, Flash animations and other bells and whistles.
However, due to numerous technical difficulties, the search engines were
unable to widen their reach beyond the world of text. While search engines
that specifically search for images have been created, general-purpose
engines still mostly ignore everything that is not text.
The moral of the story is, unless your pages are built to contain a lot
of text, they're unlikely to do well in most search engines. This doesn't
mean that you should drop all the images from your website, but keep in
mind that as far as the search engines are concerned, images, Flash
animation and sounds do not exist.
Keep non-HTML code in external files
Many of today's sites use JavaScript, CSS, or both in their designs.
Some of them have quite a lot of code in these languages on each of their
pages and have placed it above the HTML containing the text used on the
page. In terms of website optimization, this is a bad idea.
First of all, it forces the spider to wade through something that it is
not at all interested in before being able to read the text. While modern
spiders are probably quite well-accustomed to such unfriendly pages, it's
safe to say that filling your pages with non-HTML code is more likely to
hurt than to help you.
Second, the less the search engines know about the CSS and
JavaScript you use, the better. If your code is attached to the HTML,
search engine spiders can freely read and analyze it if they want to. On
the other hand, if you place your code in external files and use a
robots.txt file to forbid search engines from downloading them, your code
is fairly secure. Of course the search engines could still get it if they
wanted to, but then they would have to both disobey your robots.txt and
grab the .css or .js file, both things that they're unlikely to do.
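As a rough sketch of that setup, assuming you keep your scripts in a directory called /scripts/ and your style sheets in /styles/ (both names are made up for this example), the robots.txt file in the root of your site could look something like this:

```text
# Applies to every spider that honors robots.txt
User-agent: *
# Keep spiders out of the directories holding the external .js and .css files
Disallow: /scripts/
Disallow: /styles/
```

Your pages would then pull the code in with a SCRIPT tag that has a src attribute and a LINK tag that points at the style sheet, instead of carrying the code inline.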
But why would you want to keep your CSS and JavaScript away from the
eyes of the search engines if you're not doing anything wrong? Well, the
problem is that search engines define what is acceptable and what is not,
and it often seems like they have a lot of trouble making up their minds.
For example, using a JavaScript redirect is occasionally "OK, if you have
a legitimate reason for doing it" and occasionally "spamming, and we'll
skin you from head to toe if we catch you". The point is that it's better
to be safe than sorry, because the rules change all the time.
Frames or tables - or CSS?
The layout of your website and the way it is created is another factor
that can either boost or reduce your search engine success. Here at the
APG site, I've decided to use a table-based layout, which is usually
considered something both human visitors and search engines can
appreciate. However, it is not the only method available and all of them
have their pros and cons.
Tables
Search engines generally don't have any trouble reading a table-based
page, provided that the layout is not overly complex or incorrectly
designed. The only serious problem arises if you wish to have a navigation
menu on the left side of the screen, just like I do. Placing the menu on
the left causes its contents to be displayed above the rest of the content
on the page in your source code. Humans won't mind that, but because
search engines read your source code rather than what you see on the
screen, this kind of arrangement may damage your ranking in them.
You see, most search engines consider the text at the very top of the
page to be more important than the text in the middle. This sounds a bit
odd, but it's actually a very reasonable assumption. Take a look at some
of the pages on this site for example; if you begin reading from the top,
it won't take long before you've got a general idea about the contents of
the page. But if you start from the middle, it will take on average
substantially longer to determine what subject is being discussed.
So, if your menu pushes the actual content of your page downwards in
your source code, the search engine will have difficulty determining what
your page is about, which might cause your ranking to drop. Fortunately,
there is a solution to this problem that allows you to use
tables, keep your menu on the left and please the search engines at the
same time. If you plan to use tables, I recommend using the table trick.
Frames
Some like them, some hate them. Think of them what you will, but
generally frames are not as search engine friendly as tables. That is not
to say that it's impossible to build a site that uses frames and does well
in the engines; it is just harder to do than with tables.
If you already have a site that uses frames, or if you are simply
determined to use them, it would be a good idea to implement a few website
optimization tricks to prevent some of the most common problems.
To begin with, use a <NOFRAMES> tag on your frameset page. In it,
place a simplified version (fewer graphics, no Flash, no JavaScript, etc.)
of the content page your frameset points to, along with links to all of
your other content pages. By having a good NOFRAMES tag, you'll make it
easier for
the search engines that can't read framesets to index your pages. As an
added bonus, the NOFRAMES tag enables those who are using browsers that
can't read frames to access your site.
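To make this more concrete, here is a minimal sketch of a frameset page with a NOFRAMES section. The file names (menu.html, home.html, products.html, contact.html), the frame names and the site title are all invented for the example, so replace them with your own:

```html
<html>
<head>
<title>Example Site</title>
</head>
<frameset cols="20%,80%">
  <frame src="menu.html" name="menu">
  <frame src="home.html" name="content">
  <noframes>
  <body>
    <!-- Simplified copy of the home page text for spiders and old browsers -->
    <h1>Example Site</h1>
    <p>A plain-text summary of the home page goes here.</p>
    <!-- Links to every other content page so that spiders can follow them -->
    <p><a href="home.html">Home</a> | <a href="products.html">Products</a> |
    <a href="contact.html">Contact</a></p>
  </body>
  </noframes>
</frameset>
</html>
```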
However, there's another serious problem caused by frames that can't be
solved with the NOFRAMES tag. Usually, a typical design that uses frames
has the site navigation in one frame and the content in another. After
you submit your content pages to the search engines, they will eventually
be indexed and hopefully start receiving visitors. The trouble is that
when someone arrives directly at one of the content pages, the navigation
frame will not load. This can deter visitors from venturing further into
your site and thus reduce the usefulness of the traffic sent to you by the
search engines.
While this is a difficult situation, there are things you can do to
correct it. The simplest of them is to add the following JavaScript to
all of your content pages:
<script type="text/javascript">
<!--
// If this page is not inside the frameset, load the frameset instead.
if (top == self) {
  location.replace("FILENAME OF YOUR FRAMESET PAGE");
}
// -->
</script>
As long as you remember to place the name of your frameset page into
the script, you can get it to work simply by copying and pasting it between
the <HEAD> and </HEAD> tags in your HTML. However, as
mentioned above, it would be best to spend some extra time and place the
script in an external file instead.
So, what will the script do? Quite simply, it'll check whether the
frameset is loaded and if not, it will load it. This will give the
visitors who arrive directly at your content pages the opportunity to see
your navigation menu and thus browse your site. Sounds great, right?
Unfortunately, the script is not as good as it seems. If you point it
to your entry frameset page, you'll notice that while it loads the
navigation, it will also load your homepage. You've given the visitor a
way to navigate your site, but in return, you're redirecting him to
a page that might be completely different from the one he found in the
search engine. In my opinion, this is better than doing nothing, but it is
still a very unsatisfactory solution.
Luckily, there are some more refined ways of handling the issue with
JavaScript. They'll require a bit more effort and skill, but can
deliver both the navigation menu and the correct page to the user at the
same time. While these scripts have their own problems, such as not being
100% valid HTML code, they're far superior to any other solutions I've
seen. So, if you're using frames and want to offer a satisfying experience
to those of your users who arrive through the search engines, using them
instead of that simple script I showed you is really the way to go.
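One common way to build such a script, and this is only a sketch of the general idea rather than any particular published script, is to pass the name of the requested content page to the frameset in the URL and let the frameset load it into the content frame. Everything named here is an assumption made for the example: the "page" parameter, the index.html frameset, the home.html default page and the frame called "content".

```javascript
// Runs on a content page: build a frameset URL that remembers which
// content page the visitor actually asked for.
function buildFramesetUrl(framesetPage, contentPage) {
  return framesetPage + "?page=" + encodeURIComponent(contentPage);
}

// Runs on the frameset page: work out which content page to show.
// Only plain local .htm/.html file names are accepted, so the parameter
// cannot be used to pull somebody else's site into your frame.
function contentPageFromQuery(search, defaultPage) {
  var match = /[?&]page=([^&]*)/.exec(search);
  if (!match) return defaultPage;
  var page;
  try {
    page = decodeURIComponent(match[1]);
  } catch (e) {
    return defaultPage; // malformed escape sequence in the URL
  }
  return /^[A-Za-z0-9_-]+\.html?$/.test(page) ? page : defaultPage;
}

// Hypothetical wiring, shown as comments because it only works in a browser:
//
// On each content page (here article.html):
//   if (top == self) location.replace(buildFramesetUrl("index.html", "article.html"));
//
// On the frameset page, once it has loaded:
//   frames["content"].location.replace(
//     contentPageFromQuery(location.search, "home.html"));
```

With this arrangement, a visitor who lands on article.html is sent to index.html?page=article.html, and the frameset then shows the menu together with the page he was looking for instead of the homepage.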
To sum it up, by implementing the above suggestions, you can create
frame-based sites that get along with search engines a lot better than
they would normally do. They won't be perfect, but what in this world
really is?
Cascading Style Sheets
Search engine-wise, using CSS to create your layout is probably the
best possible solution. In addition to being more flexible than frames and
tables, CSS also lets you easily rearrange the order of your source
code. This is a helpful ability, because you can use it to ensure that the
spiders always read the most important and well-optimized content on the
page first without having to make changes to the layout itself.
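As a small illustration, and assuming a simple two-column page, the main content can come first in the source while floats still put the menu on the left of the screen. The id names and widths below are invented for the example, and in practice the style rules would live in an external style sheet rather than inline:

```html
<!-- First in the source, so spiders read it first; shown on the right -->
<div id="content" style="float: right; width: 70%;">
  <h1>Page heading</h1>
  <p>The main, well-optimized text of the page goes here.</p>
</div>

<!-- Second in the source, but shown on the left of the screen -->
<div id="menu" style="float: left; width: 25%;">
  <a href="index.html">Home</a>
</div>
```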
Even though it has many excellent properties, it feels like a CSS
layout is a bit ahead of its time at the moment. While it is completely
possible to implement, it will cause problems with older browsers, for
example with Netscape Navigator 4. CSS is likely to ultimately become the
layout method of choice, but for now it is still better to stick with
tables.
Avoid non-HTML filetypes
Due to the great success of Adobe's Acrobat and Microsoft's Word and
Excel, many sites now make parts of their content available in files
created with these programs. While this may be the fastest and easiest way
to post content on the Web, it can make getting your information listed on
the search engines very difficult.
Although the search engines are continuously getting better at
finding and indexing information, most of them can't read .PDF
(Acrobat), .DOC (Word) or .XLS (Excel) files. Google is ahead of the rest
in this area, as it supports all of these filetypes. Another major player,
FAST, is able to index .PDF files, but not Word or Excel documents. If you
want your file to be found on the rest of the engines, you're going to
have to stick with HTML.
However, it must also be noted that even plain old HTML pages may cause
trouble with search engines if they are generated dynamically, for example
with a CGI script. There are several good ways of taking care of these
problems without having to sacrifice the flexibility of generating HTML
dynamically, but it's important to be aware that the problems exist.
Conclusion
In order to get your pages listed at the search engines and get them to
rank well, you'll have to do more than just add META tags and get a couple
of links to point to your site. By designing and constructing your site
correctly, you're building a solid foundation on which it is possible to
apply various optimization techniques in the future.
Changing an existing site structure to one that works better with the
search engines can feel like a large task, and it often is one. However,
if you're planning to make improvements, it's better to start your website
optimization project as quickly as possible. Sites tend to become larger
and more complex with age, so the job is unlikely to get any smaller as
time passes.