A Beginner's Guide to SEO

SEOmoz is a Seattle-based Search Engine Optimization (SEO) firm and community resource for those seeking knowledge in the SEO/M field. You can learn more about SEOmoz here. We provide a great variety of free information via a daily blog, automated tools and advanced articles.

This article is offered as a resource to help individuals, organizations, and companies inexperienced with search engine optimization learn the basics of how search engines and the optimization process operate. It is our goal to improve your ability to drive search traffic to your site and debunk major myths about SEO. We share this knowledge to help businesses, government, educational, and non-profit organizations benefit from being listed in the major search engines.

What is SEO?

SEO is the active practice of optimizing a web site by improving internal and external aspects in order to increase the traffic the site receives from search engines. Firms that practice SEO can vary; some have a highly specialized focus, while others take a broader, more general approach. Optimizing a web site for search engines can require looking at so many unique elements that many practitioners of SEO (SEOs) consider themselves to be in the broad field of website optimization (since so many of those elements intertwine).

This guide is designed to describe all areas of SEO - from discovery of the terms and phrases that will generate traffic, to making a site search engine friendly, to building the links and marketing the unique value of the site/organization's offerings.

Why does my company/organization/website need SEO?

The majority of web traffic is driven by the major commercial search engines - Yahoo!, MSN, Google & AskJeeves (although AOL gets nearly 10% of searches, their engine is powered by Google's results). If your site cannot be found by search engines or your content cannot be put into their databases, you miss out on the incredible opportunities available to websites provided via search - people who want what you have visiting your site. Whether your site provides content, services, products, or information, search engines are a primary method of navigation for almost all Internet users.

Search queries, the words that users type into the search box which contain terms and phrases best suited to your site, carry extraordinary value. Experience has shown that search engine traffic can make (or break) an organization's success. Targeted visitors to a website can provide publicity, revenue, and exposure like no other. Investing in SEO, whether through time or finances, can have an exceptional rate of return.

Why can't the search engines figure out my site without SEO help?

Search engines are always working towards improving their technology to crawl the web more deeply and return increasingly relevant results to users. However, there is and will always be a limit to how search engines can operate. Whereas the right moves can net you thousands of visitors and attention, the wrong moves can hide or bury your site deep in the search results where visibility is minimal. In addition to making content available to search engines, SEO can also help boost rankings so that content that has been found will be placed where searchers will more readily see it. The online environment is becoming increasingly competitive, and those companies who perform SEO will have a decided advantage in visitors and customers.

How much of this article do I need to read?

If you are serious about improving search traffic and are unfamiliar with SEO, I recommend reading this guide front-to-back. There's a printable MS Word version for those who'd prefer it, and dozens of linked-to resources on other sites and pages that are worthy of your attention. Although this guide is long, I've attempted to remain faithful to Mr. Strunk's famous quote:

"A sentence should contain no unnecessary words, a paragraph no unnecessary sentences, for the same reason that a drawing should have no unnecessary lines and a machine no unnecessary parts."

Every section and topic in this report is critical to understanding the best known and most effective practices of search engine optimization.

How Search Engines Operate

Search engines have a short list of critical operations that allow them to provide relevant web results when searchers use their system to find information.

  1. Crawling the Web
    Search engines run automated programs, called "bots" or "spiders", that use the hyperlink structure of the web to "crawl" the pages and documents that make up the World Wide Web. Estimates are that of the approximately 20 billion existing pages, search engines have crawled between 8 and 10 billion.
  2. Indexing Documents
    Once a page has been crawled, its contents can be "indexed" - stored in a giant database of documents that makes up a search engine's "index". This index needs to be tightly managed so that requests which must search and sort billions of documents can be completed in fractions of a second.
  3. Processing Queries
    When a request for information comes into the search engine (hundreds of millions do each day), the engine retrieves from its index all the documents that match the query. A match is determined if the terms or phrases are found on the page in the manner specified by the user. For example, a search for car and driver magazine at Google returns 8.25 million results, but a search for the same phrase in quotes ("car and driver magazine") returns only 166,000 results. In the first system, commonly called "Findall" mode, Google returned all documents which had the terms "car", "driver", and "magazine" (it ignores the term "and" because it's not useful in narrowing the results), while in the second search, only those pages with the exact phrase "car and driver magazine" were returned. Other advanced operators (Google has a list of 11) can change which results a search engine will consider a match for a given query.
  4. Ranking Results
    Once the search engine has determined which results are a match for the query, the engine's algorithm (a mathematical equation commonly used for sorting) runs calculations on each of the results to determine which is most relevant to the given query. They sort these on the results pages in order from most relevant to least so that users can make a choice about which to select.

Although a search engine's operations are not particularly lengthy, systems like Google, Yahoo!, AskJeeves, and MSN rank among the most complex, processing-intensive computer systems in the world, performing millions of calculations each second and serving information to an enormous group of users.

Speed Bumps & Walls

Certain types of navigation may hinder or entirely prevent search engines from reaching your website's content. As search engine spiders crawl the web, they rely on the architecture of hyperlinks to find new documents and revisit those that may have changed. In the analogy of speed bumps and walls, complex links and deep site structures with little unique content may serve as "bumps." Data that cannot be accessed by spiderable links qualify as "walls."

Possible "Speed Bumps" for SE Spiders:

  • URLs with 2+ dynamic parameters; e.g. http://www.url.com/page.php?id=4&CK=34rr&User=%Tom% (spiders may be reluctant to crawl complex URLs like this because they often result in errors with non-human visitors)
  • Pages with more than 100 unique links to other pages on the site (spiders may not follow each one)
  • Pages buried more than 3 clicks/links from the home page of a website (unless there are many other external links pointing to the site, spiders will often ignore deep pages)
  • Pages requiring a "Session ID" or Cookie to enable navigation (spiders may not be able to retain these elements as a browser user can)
  • Pages split into "frames" (frames can hinder crawling and cause confusion about which pages to rank in the results)

Possible "Walls" for SE Spiders:

  • Pages accessible only via a select form and submit button
  • Pages requiring a drop-down menu (an HTML select element) to access them
  • Documents accessible only via a search box
  • Documents blocked purposefully (via a robots meta tag or robots.txt file; see the examples following this list)
  • Pages requiring a login
  • Pages that re-direct before showing content (search engines call this cloaking or bait-and-switch and may actually ban sites that use this tactic)

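To illustrate the purposeful blocking mentioned in the list above, both methods use a few lines of standard syntax. A minimal robots.txt file (placed at the root of the domain) and its per-page meta tag equivalent look like this - the /private/ folder is purely illustrative:

User-agent: *
Disallow: /private/

<meta name="robots" content="noindex, nofollow">

The robots.txt rules tell all spiders not to request anything under /private/, while the meta tag (placed in a page's <head>) allows the page to be fetched but asks engines not to index it or follow its links.
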
The key to ensuring that a site's contents are fully crawlable is to provide direct, HTML links to each page you want the search engine spiders to index. Remember that if a page cannot be accessed from the home page (where most spiders are likely to start their crawl), it is likely that it will not be indexed by the search engines. A sitemap (which is discussed later in this guide) can be of tremendous help for this purpose.

Measuring Relevance and Popularity

Modern commercial search engines rely on the science of information retrieval (IR). That science has existed since the middle of the 20th century, when retrieval systems powered computers in libraries, research facilities, and government labs. Early in the development of search systems, IR scientists realized that two critical components made up the majority of search functionality:

Relevance - the degree to which the content of the documents returned in a search matched the user's query intention and terms. The relevance of a document increases if the terms or phrase queried by the user occurs multiple times and shows up in the title of the work or in important headlines or subheaders.

Popularity - the relative importance, measured via citation (the act of one work referencing another, as often occurs in academic and business documents) of a given document that matches the user's query. The popularity of a given document increases with every other document that references it.

These two items were translated to web search 40 years later and manifest themselves in the form of document analysis and link analysis.

In document analysis, search engines look at whether the search terms are found in important areas of the document - the title, the meta data, the heading tags, and the body of text content. They also attempt to automatically measure the quality of the document (through complex systems beyond the scope of this guide).

In link analysis, search engines measure not only who is linking to a site or page, but what they are saying about that page/site. They also have a good grasp on who is affiliated with whom (through historical link data, the site's registration records, and other sources), who is worthy of being trusted (links from .edu and .gov pages are generally more valuable for this reason), and contextual data about the site the page is hosted on (who links to that site, what they say about the site, etc.).

Link and document analysis combine and overlap hundreds of factors that can be individually measured and filtered through the search engine algorithms (the set of instructions that tells the engines what importance to assign to each factor). The algorithm then determines scoring for the documents and (ideally) lists results in decreasing order of importance (rankings).

Information Search Engines Can Trust

As search engines index the web's link structure and page contents, they find two distinct kinds of information about a given site or page - attributes of the page/site itself and descriptions of that site/page from other pages. Since the web is such a commercial place, with so many parties interested in ranking well for particular searches, the engines have learned that they cannot always rely on websites to be honest about their importance. Thus, the days when artificially stuffed meta tags and keyword-rich pages dominated search results (pre-1998) have vanished and given way to search engines that measure trust via links and content.

The theory goes that if hundreds or thousands of other websites link to you, your site must be popular, and thus, have value. If those links come from very popular and important (and thus, trustworthy) websites, their power is multiplied to even greater degrees. Links from sites like NYTimes.com, Yale.edu, Whitehouse.gov, and others carry with them inherent trust that search engines then use to boost your ranking position. If, on the other hand, the links that point to you are from low-quality, interlinked sites or automated garbage domains (aka link farms), search engines have systems in place to discount the value of those links.

The most well-known system for ranking sites based on link data is the simplistic formula developed by Google's founders - PageRank. PageRank, which relies on a mathematical formula (modeled on the chance of reaching a given document through a random pattern of clicking on links), is described by Google in their technology section:

PageRank relies on the uniquely democratic nature of the web by using its vast link structure as an indicator of an individual page's value. In essence, Google interprets a link from page A to page B as a vote, by page A, for page B. But, Google looks at more than the sheer volume of votes, or links a page receives; it also analyzes the page that casts the vote. Votes cast by pages that are themselves "important" weigh more heavily and help to make other pages "important."

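For reference, the formula from Brin and Page's original 1998 paper expresses this voting system mathematically. Here T1...Tn are the pages linking to page A, C(T) is the number of links cast by page T, and d is a damping factor (typically set around 0.85):

PR(A) = (1 - d) + d * ( PR(T1)/C(T1) + ... + PR(Tn)/C(Tn) )

In other words, each linking page passes along a share of its own importance, split evenly among all the links it casts.
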
Google uses a PageRank “proxy” value, which logarithmically translates the actual PageRank of a document to a value between 0 and 10, to rank Web sites listed in its directory (which offers a PageRank order or an alphabetical order for listings) and in its toolbar (below).


Google's toolbar includes an icon that shows a PageRank value from 0-10

PageRank is, in essence, a rough system for estimating the value of a given link based on the links that point to the host page. Since PageRank's inception in the late '90s, more subtle and sophisticated link analysis systems have taken its place. Thus, in the modern era of SEO, the PageRank measurement in Google's toolbar, directory, or through sites that query the service is of limited value. Pages with PR8 can be found ranked 20-30 positions below pages with a PR3 or PR4. In addition, the toolbar numbers are updated only every 3-6 months by Google, making the values even less useful. Rather than focusing on PageRank, it's important to think holistically about a link's worth.

Here's a small list of the most important factors search engines look at when attempting to value a link:

  • The Anchor Text of Link - Anchor text describes the visible characters and words that hyperlink to another document or location on the web. For example, in the phrase "CNN is a good source of news, but I actually prefer the BBC's take on events," two unique pieces of anchor text exist - "CNN" is the anchor text pointing to http://www.cnn.com, while "the BBC's take on events" points to http://news.bbc.co.uk. Search engines use this text to help them determine the subject matter of the linked-to document. In the example above, the links would tell the search engine that the linking page (here, a page on SEOmoz.org) considers http://www.cnn.com a relevant site for the term "CNN" and http://news.bbc.co.uk relevant to "the BBC's take on events". If hundreds or thousands of sites think that a particular page is relevant for a given set of terms, that page can manage to rank well even if the terms NEVER appear in the text itself (for example, see the BBC's explanation of why Google ranks certain pages for the term "Miserable Failure").
  • Global Popularity of the Site - More popular sites, as denoted by the number and power of the links pointing to them, provide more powerful links. Thus, while a link from SEOmoz may be a valuable vote for a site, a link from bbc.co.uk or cnn.com carries far more weight. This is one area where PageRank (assuming it was accurate) could be a good measure, as it's designed to calculate global popularity.
  • Popularity of Site in Relevant Communities - In the example above, the weight or power of a site's vote is based on its raw popularity across the web. As search engines became more sophisticated and granular in their approach to link data, they acknowledged the existence of "topical communities"; sites on the same subject that often interlink with one another, referencing documents and providing unique data on a particular topic. Sites in these communities provide more value when they link to a site/page on a relevant subject rather than a site that is largely irrelevant to their topic.
  • Text Directly Surrounding the Link - Search engines have been noted to weight the text directly surrounding a link with greater importance and relevance than the other text on the page. Thus, a link from inside an on-topic paragraph may carry greater weight than a link in the sidebar or footer.
  • Subject Matter of the Linking Page - The topical relationship between the subject of a given page and the sites/pages linked to on it may also factor into the value a search engine assigns to that link. Thus, it will be more valuable to have links from pages that are related to the site/page's subject matter than those that have little to do with the topic.

These are only a few of the many factors search engines measure and weigh when evaluating links. For a more complete list, see SEOmoz's search engine ranking factors article.

Link metrics are in place so that search engines can find information to trust. In the academic world, greater citation meant greater importance, but in a commercial environment, manipulation and conflicting interests interfere with the purity of citation-based measurements. Thus, on the modern WWW, the source, style, and context of those citations is vital to ensuring high quality results.

The Anatomy of a Hyperlink

A standard hyperlink in HTML code looks like this:

<a href="http://www.seomoz.org">SEOmoz</a>

In this example, the code simply indicates that the text "SEOmoz" (called the "anchor text" of the link) should be hyperlinked to the page http://www.seomoz.org. A search engine would interpret this code as a message that the page carrying this code believed the page http://www.seomoz.org to be relevant to the text on the page and particularly relevant to the term "SEOmoz".

A more complex piece of HTML code for a link may include additional attributes such as:

<a href="http://www.seomoz.org" title="Rand's Site" rel="nofollow">SEOmoz</a>

In this example, new elements such as the link title and rel attribute may influence how a search engine views the link, despite its appearance on the page remaining unchanged. The title attribute may serve as an additional piece of information, telling the search engine that http://www.seomoz.org, in addition to being related to the term "SEOmoz", is also relevant to the phrase "Rand's Site". The rel attribute, originally designed to describe the relationship between the linked-to page and the linking page, has, with the recent emergence of the "nofollow" value, become more complex.

"Nofollow" is a value designed specifically for search engines. When ascribed to a link in the rel attribute, it tells the engine's ranking system that the link should not be considered an editorially approved "vote" for the linked-to page. Currently, three major search engines (Yahoo!, MSN, & Google) all support "nofollow". AskJeeves, due to its unique ranking system, does not support nofollow and ignores its presence in link code. For more information about how this works, visit Danny Sullivan's description of nofollow's inception on the SEW blog.

Some links may be assigned to images, rather than text:

<a href="http://www.seomoz.org/randfish.php"><img src="rand.jpg" alt="Rand Fishkin of SEOmoz"></a>


This example shows an image named "rand.jpg" linking to the page - http://www.seomoz.org/randfish.php. The alt attribute, designed originally to display in place of images that were slow to load or on voice-based browsers for the blind, reads "Rand Fishkin of SEOmoz" (in many browsers, you can see the alt text by hovering the mouse over the images). Search engines can use the information in an image-based link, including the name of the image and the alt attribute to interpret what the linked-to page is about.

Other types of links may also be used on the web, many of which pass no ranking or spidering value due to their use of redirects, JavaScript, or other technologies. A link that does not have the classic <a href="URL">text</a> format, be it image or text, should generally be assumed not to pass link value with the search engines (although in rare instances, engines may attempt to follow these more complex styles of links).

<a href="redirect/jump.php?url=%2Fgro.zomoes.www%2F%2F%3Aptth" title="http://www.seomoz.org/" target="_blank" class="postlink">SEOmoz</a>

In this example, the redirect used scrambles the URL by writing it backwards, but unscrambles it later with a script and sends the visitor to the site. It can be assumed that this passes no search engine link value.

<a href="javascript:openPage('redirectiontarget.htm')">SEOmoz</a>

This sample shows a very simple piece of JavaScript: the link calls a function referenced in the document to pull up a specified page. Creative uses of JavaScript like this can also be assumed to pass no link value to a search engine.

It's important to understand that, based on a link's anatomy, search engines can (or cannot) interpret and use the data therein. Whereas the right sort of links can provide great value, the wrong sort will be virtually useless (for search ranking purposes). More detailed information on links is available at this resource - anatomy and deployment of links.

Keywords and Queries

Search engines rely on the terms queried by users to determine which results to put through their algorithms, order, and return to the user. But, rather than simply recognizing and retrieving exact matches for query terms, search engines use their knowledge of semantics (the study of meaning in language) to construct intelligent matching for queries. An example might be a search for loan providers that also returned results that did not contain that specific phrase, but instead had the term lenders.

The engines collect data based on the frequency of use of terms and the co-occurrence of words and phrases throughout the web. If certain terms or phrases are often found together on pages or sites, search engines can construct intelligent theories about their relationships. Mining semantic data through the incredible corpus that is the Internet has given search engines some of the most accurate data about word ontologies and the connections between words ever assembled artificially. This immense knowledge of language and its usage gives them the ability to determine which pages in a site are topically related, what the topic of a page or site is, how the link structure of the web divides into topical communities, and much, much more.

Search engines' growing artificial intelligence on the subject of language means that queries will increasingly return more intelligent, evolved results. This heavy investment in the field of natural language processing (NLP) will help the engines achieve a greater understanding of the meaning and intent behind their users' queries. Over the long term, users can expect the results of this work to produce increased relevancy in the SERPs (Search Engine Results Pages) and more accurate interpretations of what a user is searching for.

Sorting the Wheat from the Chaff

In the classic world of Information Retrieval, when no commercial interests existed in the databases, very simplistic algorithms could be used to return high quality results. On the world wide web, however, the opposite is true. Commercial interests in the SERPs are a constant issue for modern search engines. With every new focus on quality control and growth in relevance metrics, there are thousands of individuals (many in the field of SEO) dedicated to manipulating these metrics in order to control the SERPs, typically by aiming to list their sites/pages first.

The worst kind of results are what the industry refers to as "search spam" - pages and sites with little real value that contain primarily re-directs to other pages, lists of links, scraped (copied) content, etc. These pages are so irrelevant and useless that search engines are highly focused on removing them from the index. Naturally, the monetary incentives are similar to email spam - although few visit and fewer click on the links (which are what provide the spam publisher with revenue), the sheer quantity is the decisive factor in producing income.

Other "spam" results range from sites that are of low quality or affiliate status that search engines would prefer not to list, to high quality sites and businesses that are using the link structure of the web to manipulate the results in their favor. Search engines are focused on clearing out all types of manipulation and hope to eventually achieve fully relevant, organic algorithms for determining ranking order. So-called "search engine spammers" push back constantly, seeking new loopholes and methods of manipulation, in a never-ending struggle.

This guide is NOT about how to manipulate the search engines to achieve rankings, but rather how to create a website that search engines and users will be happy to have ranking permanently in the top positions, thanks to its relevance, quality, and user friendliness.

Paid Placement and Secondary Sources in the Results

The search engine results pages contain not only listings of documents found to be relevant to the user's query, but other content, including paid advertisements and secondary source results. Google, for example, serves up ads from its well-known AdWords program (which currently fuels more than 99% of Google's revenues), as well as secondary content from its local search, product search (called Froogle), and image search results.

On Google's search engine results page, each area of the page draws its content from one of these sources - paid listings, organic results, and secondary (local, product, and image) results.

The sites/pages ranking in the "organic" search results receive the lion's share of searcher eyeballs and clicks - between 60% and 70%, depending on factors such as the prominence of ads, relevance of secondary content, etc. The practice of optimizing for the paid search results is called SEM, or Search Engine Marketing, while optimizing to rank in the secondary results requires unique, advanced methods of targeting specific searches in arenas such as local search, product search, image search, and others. While all of these practices are a valuable part of any online marketing campaign, they are beyond the scope of this guide. Our sole focus remains on the "organic" results, although links at the bottom of this paper can help direct you to resources on other subjects.

Keyword Research

Keyword research is critical to the process of SEO. Without this component, your efforts to rank well in the major search engines may be misdirected toward the wrong terms and phrases, resulting in rankings that no one will ever see. The process of keyword research involves several phases:

  1. Brainstorming - Thinking of what your customers/potential visitors would be likely to type into search engines in an attempt to find the information/services your site offers (including alternate spellings, wordings, synonyms, etc.).
  2. Surveying Customers - Surveying past or potential customers is a great way to expand your keyword list to include as many terms and phrases as possible. It can also give you a good idea of what's likely to be the biggest traffic drivers and produce the highest conversion rates.
  3. Applying Data from KW Research Tools - Several tools online (including Wordtracker) offer information about the number of times users perform specific searches. Using these tools can offer concrete data about trends in keyword selection.
  4. Term Selection - The next step is to create a matrix or chart that analyzes the terms you believe are valuable and compares traffic, relevancy, and the likelihood of conversions for each. This will allow you to make the best informed decisions about which terms to target. SEOmoz's KW Difficulty Tool can also aid in choosing terms that will be achievable for the site.
  5. Performance Testing and Analytics - After keyword selection and implementation of targeting, analytics programs (like Indextools and ClickTracks) that measure web traffic, activity, and conversions can be used to further refine keyword selection.

Wordtracker & Overture

Currently, the two most popular sources of keyword data are Wordtracker, whose statistics come primarily from use of the meta-search engine Dogpile (which has ~1% of the share of searches performed online) and Overture (recently re-branded as Yahoo! Search Marketing), which offers data collected from searches performed on Yahoo!'s engine (with a 22-28% share). While neither's data is flawless or entirely accurate, both provide good methods for measuring comparative numbers. For example, while Overture and Wordtracker may disagree on numbers and say that "red bicycles" gets 240 vs. 380 searches per day (across all engines), both will generally indicate that this is a more popular term than "scarlet bicycles", "maroon bicycles", or even "blue bicycles."

In Wordtracker, which provides more detail but has a considerably smaller share of data, terms and phrases are separated by capitalization, plurality, and word ordering. In the Overture tool, multiple search phrases are combined. For example, Wordtracker would independently show numbers for "car loans", "Car Loans", "car loan", and "cars Loan", whereas Overture would give a single number that encompasses all of these. Wordtracker's granularity can be useful for analyzing searches that may produce unique results pages (plurals often do, and different word orders almost always do), but capitalization is of less consequence, as the search engines don't deliver different results based on capitalization.

Remember that Wordtracker and Overture are both useful tools for relative keyword data, but can be highly inaccurate when compared to the actual number of searches performed. In other words, use the tools to select which terms to target, but don't rely on them for predicting the amount of traffic you can achieve. If your goal is estimating traffic numbers, use programs like Google's AdWords to test the number of impressions a particular term/phrase gets.

Targeting the Right Terms

Targeting the best possible terms is of critical importance. This encompasses more than merely measuring traffic levels and choosing the highest trafficked terms. An intelligent process for keyword selection will measure each of the following:

  • Conversion Rate - the percent of users searching with the term/phrase who convert (click an ad, buy a product, complete a transaction, etc.)
  • Predicted Traffic - An estimate of how many users will be searching for the given term/phrase each month
  • Value per Customer - An average amount of revenue earned per customer using the term or phrase to search - comparing big-ticket search terms vs. smaller ones.
  • Keyword Competition - A rough measurement of the competitive environment and the level of difficulty for the given term/phrase. This is typically measured by metrics that include the number of competitors, the strength of those competitors' links, and the financial motivation to be in the sector. SEOmoz's Keyword Difficulty Tool can assist in this process.

Once you've analyzed each of these elements, you can make effective decisions about the terms and phrases to target. When starting a new site, it's highly recommended to target only one or possibly two unique phrases on a single page. Although it is possible to optimize for more phrases and terms, it's generally best to keep separate terms on separate pages, as you can provide individualized information for each in this manner. As websites grow and mature, gaining links and legitimacy with the engines, targeting multiple terms per page becomes more feasible.

The Long Tail of Search

The "long tail" is a concept pioneered by Chris Anderson (the editor-in-chief of Wired magazine, who runs the Long Tail blog). From Chris's description:

The theory of the Long Tail is that our culture and economy is increasingly shifting away from a focus on a relatively small number of "hits" (mainstream products and markets) at the head of the demand curve and toward a huge number of niches in the tail. As the costs of production and distribution fall, especially online, there is now less need to lump products and consumers into one-size-fits-all containers. In an era without the constraints of physical shelf space and other bottlenecks of distribution, narrowly-targeted goods and services can be as economically attractive as mainstream fare.

This concept relates exceptionally well to keyword search terms in the major engines. Although the largest traffic numbers are typically for broad terms at the "head" of the keyword curve, great value lies in the thousands of unique, rarely used, niche terms in the "tail." These terms can provide higher conversion rates and more interested and valuable visitors to a site, as these specific terms can relate to exactly the topics, products, and services your site provides.

For example:

Keyword Term/Phrase         | # of Searches per Month
men's suit                  | 27,770
armani men's suit           | 723
italian men's suit          | 615
Jones New York Men's Suit   | 424
Men's 39S Suit              | 310
Gucci Men's Suit            | 222
Versace Men's Suit          | 178
Hugo Boss Men's Suit        | 138
Men's Custom Made Suit      | 126
*Source - Overture Keyword Selection Tool (Sept. '05 data)

In the scenario in the table above, the traffic for the term "men's suit" is far greater, but the more specific terms bring more valuable visitors. A searcher for "Hugo Boss Men's Suit" is more likely to make a purchase decision than one searching simply for a "men's suit." There are also thousands of other terms, each garnering far fewer monthly searches, that, when taken together, have a value greater than the few terms garnering the most searches. Thus, targeting many dozens or hundreds of smaller terms individually can be both easier (on a competitive level) and more profitable.

Sample Keyword Research Chart

The following chart diagrams how we conduct basic keyword research at SEOmoz. You are welcome to copy and use this format for your own keywords:

Term/Phrase                 | KW Difficulty | Top 3 OV Bids          | OV Mthly Pred. Traf. | WT Mthly Pred. Traf. | Relevance Score
San Diego Zoo               | 63%           | $0.41 / $0.41 / $0.40  | 116,229              | 42,360               | 25%
Joe Dimaggio                | 51%           | $0.28 / $0.19 / $0.11  | 5,847                | 7,590                | 10%
Starsky and Hutch           | 53%           | $0.16 / $0.00 / $0.00  | 19,769               | 16,950               | 30%
Art Museum                  | 77%           | $0.51 / $0.50 / $0.25  | 19,244               | 7,410                | 5%
DUI Attorney                | 52%           | $1.63 / $1.62 / $1.60  | 13,923               | 3,960                | 60%
Search Engine Marketing     | 83%           | $4.99 / $3.26 / $3.25  | 1,183,633            | 74,430               | 40%
Microsoft                   | 89%           | $0.69 / $0.51 / $0.32  | 1,525,265            | 256,620              | 10%
Interest Only Mortgage Loan | 50%           | $4.60 / $4.39 / $4.39  | 3,745                | 8,910                | 75%


Key

  • KW Difficulty - The score from SEOmoz's tool
  • Top 3 OV Bids - The bid amount from the top 3 listings in Yahoo!'s PPC results
  • Overture Monthly Predicted Traffic - The amount of traffic estimated via Overture for the previous month's data
  • Wordtracker Monthly Predicted Traffic - The amount of traffic estimated via Wordtracker (note that you must add up all terms in their database that match and multiply by the number of days in the month; e.g. a combined 247 searches per day across matching terms × 30 days = 7,410 per month - the "exact/precise search" function can help make this easier)
  • Relevance Score - The % of searchers using this term/phrase that you feel are likely to be interested in your site's products/services/offerings. Although this is a subjective number, you can use conversion rates or click-through rates from previous campaigns to more accurately estimate this in the future.

In selecting final terms, those with lower difficulty, higher relevance, and more traffic will offer the greatest value.


Optimizing a Site

Each of the following components is critical to a site's ability to be crawled, indexed, and ranked by search engine spiders. When properly used in the construction of a website, these features give a site/page the best chance of ranking well for targeted keywords.

Accessibility

An accessible site is one that ensures delivery of its content successfully as often as possible. The functionality of pages, validity of HTML elements, uptime of the site's server, and working status of site coding and components all figure into site accessibility. If these features are ignored or faulty, both search engines and users will select other sites to visit.

The biggest problems in accessibility that most sites encounter fit into the following categories. Addressing these issues satisfactorily will avoid problems getting search engines and visitors to and through your site.

  • Broken Links - If an HTML link is broken, the contents of the linked-to page may never be found. In addition, some surmise that search engines degrade the rankings of sites & pages with many broken links.
  • Valid HTML & CSS - Although arguments exist about the necessity for full validation of HTML and CSS in accordance with W3C guidelines, it is generally agreed that code must meet minimum requirements of functionality and successful display in order to be spidered and cached properly by the search engines.
  • Functionality of Forms and Applications - If form submissions, select boxes, JavaScript, or other input-required elements block content from being reached via direct hyperlinks, search engines may never find them. Keep data that you want accessible to search engines on pages that can be directly accessed via a link. In a similar vein, the successful functionality and implementation of any of these pieces is critical to a site's accessibility for visitors. A non-functioning page, form, or code element is unlikely to receive much attention from visitors.
  • File Size - With the exception of a select few documents that search engines consider to be of exceptional importance, web pages greater than 150K in size are typically not fully cached. This is done to reduce index size, bandwidth, and load on the servers, and is important to anyone building pages with exceptionally large amounts of content. If it's important that every word and phrase be spidered and indexed, keeping file size under 150K is highly recommended. As with any online endeavor, smaller file size also means faster download speed for users - a worthy metric in its own right.
  • Downtime & Server Speed - The performance of your site's server may have an adverse impact on search rankings and visitors if downtime and slow transfer speeds are common. Invest in high quality hosting to prevent this issue.

URLs, Title Tags & Meta Data

URLs, title tags and meta tag components all describe your site and page to visitors and search engines. Keeping them relevant, compelling, and accurate is key to ranking well. You can also use these areas as launching points for your keywords, and indeed, successful rankings require their use.

The URL of a document should ideally be as descriptive and brief as possible. If, for example, your site's structure has several levels of files and navigation, the URL should reflect this with folders and subfolders. Individual pages' URLs should also be descriptive without being overly lengthy, so that a visitor who sees only the URL could have a good idea of what to expect on the page. Several examples follow:

Comparison of URLs for a Canon Powershot SD400 Camera

Amazon.com - http://www.amazon.com/gp/product/B0007TJ5OG/102-8372974-4064145?v=glance&n=502394&m=ATVPDKIKX0DER&n=3031001&s=photo&v=glance

Canon.com - http://consumer.usa.canon.com/ir/controller?act=ModelDetailAct&fcategoryid=145&modelid=11158

DPReview.com - http://www.dpreview.com/reviews/canonsd400/

With both Canon and Amazon, a user has virtually no idea what the URL might point to. With DPReview's logical URL, however, it is easy to surmise that a review of a Canon SD400 is the likely topic of the page.

In addition to the issues of brevity and clarity, it's also important to keep URLs limited to as few dynamic parameters as possible. A dynamic parameter is a part of the URL that provides data to a database so the proper records can be retrieved, e.g. n=3031001, v=glance, categoryid=145, etc.

Note that in both Amazon's and Canon's URLs, the dynamic parameters number 3 or more. In an ideal site, there should never be more than two. Search engine representatives have confirmed on numerous occasions that URLs with more than 2 dynamic parameters may not be spidered unless they are perceived as significantly important (i.e. have many, many links pointing to them).

Well written URLs have the additional benefit of serving as their own anchor text when copied and pasted as links in forums, blogs, or other online venues. In the DPReview example, a search engine might see the URL http://www.dpreview.com/reviews/canonsd400/ and give ranking credit to the page for terms in the URL like dpreview, reviews, canon, sd, 400. The parsing and breaking of terms is subject to the search engine's analysis, but the chance of earning this additional credit makes writing friendly, usable URLs even more worthwhile.
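
Friendly URLs like DPReview's are typically produced by rewriting requests to the underlying dynamic script behind the scenes. Below is a minimal sketch, assuming an Apache server with mod_rewrite enabled; the script name review.php and its model parameter are hypothetical stand-ins:

RewriteEngine On
RewriteRule ^reviews/([a-z0-9]+)/?$ /review.php?model=$1 [L]

Visitors and spiders see only the clean URL (/reviews/canonsd400/), while the server quietly maps the request onto a single dynamic parameter.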

Title tags, in addition to their invaluable use in targeting keyword terms for rankings, also help drive click-through rates (CTRs) from the results pages. Most of the search engines will use a page's title tag as the blue link text and headline for a result, and thus it is important to make them informative and compelling without being overly "salesy". The best title tags will make the targeted keywords prominent, help brand the site, and be as clear and concise as possible.

Examples and Recommendations for Title Tags

Page on Alexander Calder from the Calder Foundation:
- Current Title: Alexander Calder
- Recommended: Alexander Calder - Biography of the Artist from the Calder Foundation

Page on Plasma TVs from Tiger Direct:
- Current Title: Plasma Televisions, Plasma TV, Plasma Screen TVs, SONY Plasma TV, LCD TV at TigerDirect.com
- Recommended: Plasma Screen & LCD Televisions at TigerDirect.com

For each of these, the idea behind the recommendations is to distill the information into the clearest, most useful snippet while retaining the primary keyword phrase as the first words in the tag. The title tag provides the first impression of a web page and can either serve to draw the visitor in or compel him or her to choose another listing in the results.

Meta Tag Recommendations:

Meta tags once held the distinction of being the primary realm of SEO specialists. Today, the use of meta tags, particularly the meta keywords tag, has diminished to the extent that search engines no longer use them in their ranking of pages. However, the meta description tag can still be of some importance, as several search engines use this tag to display the snippet of text below the clickable title link in the results pages.

On a Google SERP (Search Engine Results Page), the title tag supplies the clickable link for each listing and the meta description often supplies the snippet of text beneath it. It is on this page that searchers generally make their decision as to which result to click, and thus, while the meta description tag may have little to no impact on where a page ranks, it can significantly impact the number of visitors the page receives from search engine traffic. Note that meta descriptions are NOT always used on the SERPs, but can be shown (at the discretion of the search engine) if the description is accurate, well-written, and relevant to the searcher's query.
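
Putting the title and meta description recommendations together, the head of a page might look like the following sketch (the wording follows the TigerDirect example above and is purely illustrative):

<head>
<title>Plasma Screen &amp; LCD Televisions at TigerDirect.com</title>
<meta name="description" content="Shop for plasma screen and LCD televisions from brands like Sony at TigerDirect.com.">
</head>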

Search-Friendly Text

Making the visible text on a page "search-friendly" isn't complicated, but it is an issue that many sites struggle with. Text styles that cannot be indexed by search engines include:

  • Text embedded in a Java Application or Macromedia Flash file
  • Text in an image file - jpg, gif, png, etc
  • Text accessible only via a form submit or other on-page action

If the search engines can't see your page's text, they cannot spider and index that content for visitors to find. Thus, making search-friendly text in HTML format is critical to ranking well and getting properly indexed. If you are forced to use a format that hides text from search engines, try to use the right keywords and phrases in headlines, title tags, URLs, and image/file names on the page. Don't go overboard with this tactic, and never try to hide text (by making it the same color as the background or using CSS tricks). Even if the search engines can't detect this automatically, a competitor can easily report your site for spamming and have you de-listed entirely.

Along with making text visible, it's important to remember that search engines measure the terms and phrases in a document to extract a great deal of information about the page. Writing well for search engines is both an art and a science (as SEOs are not privy to the exact, technical methodology of how search engines score text for rankings), and one that can be harnessed to achieve better rankings.

In general, the following are basic rules that apply to optimizing on-page text for search rankings:

  • Make the primary term/phrase prominent in the document - Measurements like keyword density are useless (see kw density myth thread), but general frequency can help rankings.
  • Make the text on-topic and high quality - Search engines use sophisticated lexical analysis to help find quality pages, as well as teams of researchers identifying common elements in high quality writing. Thus, great writing can provide benefits to rankings, as well as visitors.
  • Use an optimized document structure - The best practice is generally to follow a journalistic format wherein the document starts with a description of the content, then flows from broad discussion of the subject to narrow. The benefits of this are arguable, but in addition to SEO value, they provide the most readable and engaging informational document. Obviously, in situations where this would be inappropriate, it's not necessary.
  • Keep text together - Many folks in SEO recommend using CSS rather than table layouts in order to keep the text flow of the document together and prevent the breaking up of text via coding. This can also be achieved with tables - simply make sure that text sections (content, ads, navigation, etc.) flow together inside a single table or row and don't have too many "nested" tables that make for broken sentences and paragraphs.

Keep in mind that the text layout and keyword usage in a document no longer carries high importance in search engine rankings. While the right structure and usage can provide a slight boost, obsessing over keyword placement or layout will provide little overall benefit.

Information Architecture

The document and link structure of a website can provide benefits to search rankings when performed properly. The keys to effective architecture are to follow the rules that govern human usability of a site:

  • Make Use of a Sitemap - It's wise to have the sitemap page linked to from every other page in the site, or at the least from important high-level category pages and the home page. The sitemap should, ideally, offer links to all of the site's internal pages. However, if more than 100-150 pages exist on the site, a wiser system is to create a sitemap that will link to all of the category-level pages, so that no page in a site is more than 2 clicks from the home page. For exceptionally large sites, this rule can be expanded to 3 clicks from the home page. (A minimal sitemap sketch follows this list.)
  • Use a Category Structure that Flows from Broad > Narrow - Start with the broadest topics as hierarchical category pages, then expand to deep pages with specific topics. Using the most on-topic structure tells search engines that your site is highly relevant and covers a topic in-depth.
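
As a minimal sketch of the sitemap idea mentioned in the list above, a plain HTML page of direct links to the category-level pages is all that's required (the URLs here are illustrative):

<h1>Sitemap</h1>
<ul>
<li><a href="/products/">Products</a></li>
<li><a href="/products/widgets/">Widgets</a></li>
<li><a href="/services/">Services</a></li>
<li><a href="/about/">About Us</a></li>
</ul>

Because every entry is a classic <a href> link, spiders can follow each one and reach any page on the site within the recommended 2-3 clicks of the home page.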

Canonical Issues & Duplicate Content

One of the most common and problematic issues for website builders, particularly those with larger, dynamic sites powered by databases, is the issue of duplicate content. Search engines are primarily interested in unique documents and text, and when they find multiple instances of the same content, they are likely to select a single one as "canonical" and display that page in their results.

If your site has multiple pages with the same content, either through a content management system that creates duplicates through separate navigation, or because copies exist from multiple versions, you may be hurting those pages' chances of ranking in the SERPs. In addition, the value that comes from anchor text and link weight, through both internal and external links to the page, will be diluted by multiple versions.

The solution is to take any current duplicate pages and use a 301 re-direct (described in detail here) to point all versions to a single, "canonical" edition of the content.

One very common place to look for this error is on a site's homepage - oftentimes, a website will have the same content on http://www.url.com, http://url.com, and http://www.url.com/index.html. That separation alone can cause lost link value and severely damage rankings for the site's homepage. If you find many links outside the site pointing to both the non-www and the www version, it may be wise to use a 301 rewrite rule so that all pages at one version point to the other.
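
As a sketch of that rewrite rule, assuming an Apache server with mod_rewrite enabled, a few lines in the site's .htaccess file will 301 re-direct the non-www version of every URL to the www version (substitute your own domain):

RewriteEngine On
RewriteCond %{HTTP_HOST} ^url\.com$ [NC]
RewriteRule ^(.*)$ http://www.url.com/$1 [R=301,L]

A request for http://url.com/any-page.html then answers with a permanent (301) re-direct to http://www.url.com/any-page.html, consolidating link value onto the single canonical version.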

Building a Site Worthy of Top Rankings

One of the most important (and often overlooked) subjects in SEO is building a site deserving of top rankings at the search engines. A site that ranks #1 for a set of terms in a competitive industry or market segment must be able to justify its value or risk losing out to competitors who offer more. Search engines' goals are to rank the best, most usable, functional, and informative sites first. By intertwining your site's content and performance with these goals, you can help to ensure its long-term prospects in the search engine rankings.

Usability

Usability represents the ease-of-use inherent in your site's design, navigation, architecture, and functionality. The idea behind the practice is to make your site intuitive so that visitors will have the best possible experience on the site. A whole host of features figure into usability, including:

  • Design
    The graphical elements and layout of a website have a strong influence on how easily usable the site is. Standards like blue, underlined links, top and side menu bars, and logos in the top, left-hand corner may seem like rules that can be bent, but adherence to these elements (with which web users are already familiar) will help to make a site usable. Design also encompasses important topics like visibility & contrast, affecting how easy it is for users to read and interact with the text and image elements of the site. Separation of unique sections like navigation, advertising, content, search bars, etc. is also critical, as users follow design cues to help them understand a page's content. Finally, critical elements in a site's design (like menus, logos, colors, and layout) should be used consistently throughout the site.
  • Information Architecture
    The organizational hierarchy of a site can also strongly affect usability. Topics and categorization impact the ease with which a user can find the information they need on your site. While an intuitive, intelligently designed structure will seamlessly guide the user to their goals, a complex, obfuscated hierarchy can make finding information on a site disturbingly frustrating.
  • Navigation
    A navigation system that guides users easily through both top-level and deep pages and makes a high percentage of the site easily accessible is critical to good usability. Since navigation is one of a website's primary functions, provide users with obvious navigation systems: breadcrumbs, alt tags for image links, and well-written anchor text that clearly describes what the user will get if he or she clicks a link. Navigation standards like these can drastically improve usability performance.
  • Functionality
    To create compelling usability, ensure that tools, scripts, images, links, etc. all function as they are intended and don't provide errors to non-standard browsers, alternative operating systems, or uninformed users (who often don't know what/where to click).
  • Accessibility
    Accessibility refers primarily to the technical ability of users to access and move through your site, as well as the ability of the site to serve disabled or impaired users. For SEO purposes, the most important aspects are keeping code errors to a minimum, fixing broken links, and making sure that content is accessible and visible in all browsers without special actions.
  • Content
    The usability of content itself is often overlooked, but its importance cannot be overstated. The descriptive nature of headlines, the accuracy of information and the quality of content all factor highly into a site's likelihood to retain visitors and gain links.

Overall, usability is about gearing a site towards the potential users. Success in this arena garners increased conversion rates, a higher chance that other sites will link to yours, and a better relationship with your users (fewer complaints, lower instance of problems, etc.).

Professional Design

Elegant, high quality, high impact design is critical to gaining the trust of your users. If your site appears "low budget" or only marginally professional, it can hurt the chances of gaining a link and, more importantly, the chances of engendering trust in your visitors. The first impression of a website by a user occurs in less than 7 seconds. That's all the time you have to convey the importance and authority of your company through the site's design. I've prepared two examples below:

Although the above examples are not perfect (note that Haworth is missing a critical element - a search bar, while Workplace Office UK has one), it's easy to see why consumers visiting websites like these would be more inclined to trust and buy from Haworth rather than Workplace Office. The application of professional design to sites can induce greater numbers of links from visiting content creators, greater number of users who return to the site, higher conversion rates, and a better overall perception of your site by visitors.

Although high quality, professional design is not one of the factors directly ranked by search engines, it indirectly influences many factors that do affect the rankings (e.g. link-building, trust, usability, etc.).

Authoring High Quality Content

Why Should a Search Engine Rank Your Site Above All the Others in its Field?

If you cannot answer this question clearly and precisely, the task of ranking higher will be exponentially more difficult. Search engines attempt to rank the very best sites with the most relevant content first in their results, and until your site's content is the best in its field, you will always struggle against the engines rather than bringing them to your doorstep.

It is in content quality that a site's true potential shows through, and although search engines cannot measure the likelihood that users will enjoy a site, the vote-via-links system operates as a proxy for identifying the best content in a market. With great content, therefore, come great links and, ultimately, high rankings. Deliver the content that users need, and the search engines will reward your site.

Content quality, however, like professional design, is not always dictated by strict rules and guidelines. What passes for "best of class" in one sector may be below average in another market. The competitiveness and interests of your peers and competitors in a space often determine what kind of content is necessary to rank. Despite these variances, however, several guidelines can be almost universally applied to produce content that is worthy of attention:

  • Research Your Field
    Get out into the forums, blogs, and communities where folks in your industry spend their time.