SEO Site Audit – Crawl Analysis for the Faint of Heart

Jp Mendez | SEO Techniques

I remember the first time I heard “Go ahead and see how search engine spiders view the site”. What?! There are spiders online? I’d rather swat them than have em mess with my SEO!

But seriously, crawl analysis is one of the most important factors, if not the most important, in an SEO site audit. And it’s one thing many SEO specialists overlook. They want to dive in, create a linking plan, and start generating backlinks from day one. Not a very good idea!

Crawling the site will give you a wealth of ideas on how to better optimize it for search engines. In most cases, in my experience at least, you’ll be busy for a month or two correcting all the errors you find. Lucky for you if the local business owner knows SEO and has been optimizing the site correctly, but that is not always the case.

What to Look for in an SEO Site Audit?

Long URLs, Dynamic URLs

A dynamic URL is the default URL created when you publish a page. It looks something like

Just looking at it you’ll agree that it’s not very pleasing and, in a way, search engines also agree with that. Though Google can read the URL, it does not contain keywords that can give big G a hint of what keyword you are targeting. Thus, pretty URLs are the way to go.

So instead of the above URL, you should make it look like

Just how long should a URL be? 70 characters, 75 at most, and it should include the keyword you are targeting.

Dr. Pete wrote an awesome article about URLs for SEO. That should give you a more in-depth analysis of the subject.

You should also avoid stop words as they are nothing but a waste of characters in your URL.
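To make the URL advice above concrete, here is a minimal Python sketch that turns a page title into a pretty URL, drops stop words, and checks the result against the 75-character limit. The stop-word list, the function names, and the example.com address are all assumptions made up for illustration; your CMS probably has its own way of doing this.

```python
import re

# Hypothetical stop-word list for illustration; extend it to suit your content.
STOP_WORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "to", "with"}

MAX_URL_LENGTH = 75  # the upper limit suggested above


def make_slug(title):
    """Turn a page title into a lowercase, hyphenated slug without stop words."""
    words = re.findall(r"[a-z0-9]+", title.lower())
    kept = [word for word in words if word not in STOP_WORDS]
    return "-".join(kept)


def pretty_url(base, title):
    """Join a base URL and a slug, and report whether the result fits the limit."""
    url = base.rstrip("/") + "/" + make_slug(title)
    return url, len(url) <= MAX_URL_LENGTH


# example.com is a placeholder domain, not a real client site.
url, fits = pretty_url("http://www.example.com", "SEO Site Audit for the Faint of Heart")
print(url, fits)
```

If the second value comes back False, trim the title down to the keyword you are targeting before publishing.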

404 Errors

404 errors occur when a page is deleted, or when its URL is changed without redirecting the old one. This is a common issue, especially after the good ol’ Penguin hit. Many sites are restructuring their URLs, creating tons of 404s and 301s.

But do 404s affect your site’s health and ranking? Luckily, they do not.

For conversion purposes, and to keep users from leaving your site after seeing the 404 page, you can customize the page to contain links to your most important or most irresistible pages. You can also make it entertaining, just like these

What should you do if you find 404 errors in your SEO site audit? Check whether the pages returning a 404 should really be returning a 404. Some pages may have been meant to be 301 redirected but were not. Also check for misspellings: some webmasters may be trying to link to your content or website but are misspelling your URL. If that is the case, and the potential traffic is huge, 301 redirect the misspelled URL.
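Spotting misspelled inbound URLs by eye gets old fast, so here is a rough Python sketch of one way to do it. It compares each 404 path from your crawl against your live paths using the standard library’s difflib and suggests the closest match as a redirect target. The paths, the function name, and the 0.8 similarity cutoff are assumptions for illustration; always eyeball the suggestions before redirecting anything.

```python
import difflib


def suggest_redirects(not_found_paths, live_paths, cutoff=0.8):
    """Map each 404 path to the most similar live path, or None if nothing is close."""
    suggestions = {}
    for path in not_found_paths:
        matches = difflib.get_close_matches(path, live_paths, n=1, cutoff=cutoff)
        suggestions[path] = matches[0] if matches else None
    return suggestions


# Hypothetical crawl data, for illustration only.
live = ["/seo-site-audit", "/keyword-research", "/contact"]
broken = ["/seo-site-adit", "/old-promo-2012"]

for bad_path, target in suggest_redirects(broken, live).items():
    print(bad_path, "->", target)
```

A path that maps to None is probably a page that was deleted on purpose, so it may be fine to leave it as a 404.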

301 redirects


If you are moving a page from one URL to another, you must 301 redirect the old URL so users land on the new page whenever they click an old link. This is very useful, especially if you have tons of backlinks pointing to the old URL. Plus, it passes on link juice, so whatever credibility the old URL has gets passed on to the new one.

How to do a 301 redirect?

If you’re using WordPress and the awesome Thesis theme, all the SEO goodies are built in. Just scroll down when creating your new post and you’ll see a question asking if you want to redirect the page. You can also use plugins like the All in One SEO Pack.

If, however, you built your website from scratch, you’ll have to use .htaccess (on an Apache server) to redirect your pages.
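If you have a long list of moved pages, typing the rules by hand is error-prone. The Python sketch below generates one `Redirect 301` line per page, which is the syntax Apache’s mod_alias module understands in .htaccess. It assumes your host runs Apache with mod_alias enabled, and the old paths and new URLs are made-up placeholders.

```python
def htaccess_redirects(mapping):
    """Build one mod_alias 'Redirect 301' line per old-path/new-URL pair."""
    lines = []
    for old_path, new_url in mapping.items():
        # Redirect expects the old location as a URL path starting with a slash.
        if not old_path.startswith("/"):
            raise ValueError(f"old path must start with '/': {old_path!r}")
        lines.append(f"Redirect 301 {old_path} {new_url}")
    return "\n".join(lines)


# Hypothetical old and new URLs, for illustration only.
moves = {
    "/old-page.html": "http://www.example.com/new-page/",
    "/seo-site-adit": "http://www.example.com/seo-site-audit",
}
print(htaccess_redirects(moves))
```

Paste the printed lines into the .htaccess file in your site root, then test a couple of the old URLs in your browser to confirm they land where you expect.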

5xx Errors

A 5xx error indicates a problem with your server, so take it seriously. In most cases 5xx errors are temporary, and a simple refresh will make the webpage magically reappear. But it’s always good to contact your hosting provider and tell them about the error. Chances are they already know about it, but at least you’re not the passive customer.

Here is a list of HTTP status codes. Keep it handy because you may encounter codes other than the ones listed above.
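When a crawl report hands you hundreds of status codes, a small helper can sort them into “ignore”, “check”, and “fix” piles. This Python sketch uses the standard library’s HTTPStatus to name each code and attaches an audit note based on the code’s family. The function name and the wording of the notes are my own assumptions, following the advice in the sections above.

```python
from http import HTTPStatus


def triage(status_code):
    """Return a short audit note for an HTTP status code."""
    try:
        phrase = HTTPStatus(status_code).phrase
    except ValueError:
        phrase = "Unknown"  # not a code the standard library knows about

    family = status_code // 100
    if family == 2:
        action = "fine, nothing to do"
    elif family == 3:
        action = "redirect, confirm it points where you intended"
    elif family == 4:
        action = "client error, decide whether the page should be redirected"
    elif family == 5:
        action = "server error, recheck and contact your host if it persists"
    else:
        action = "unexpected, look it up"
    return f"{status_code} {phrase}: {action}"


for code in (200, 301, 404, 503):
    print(triage(code))
```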

Meta Title

<title>My Awesomest SEO Optimized Title</title>

This is a very important ranking factor and should contain your exact keywords whenever possible. Be on the lookout for title tags during your SEO site audit, because you will be spending time creating great, search engine optimized titles for your pages or blog posts.

That reminds me of keyword research. I wrote an article a while back about keyword research for local businesses. You should have somebody do this for you while you crawl the site, so the keywords are ready when you get down to the nitty-gritty of optimization. Here’s a great free tool I use for keyword research.

Additionally, here are some best practices when it comes to optimizing title tags.
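If you want to check titles in bulk, here is a minimal Python sketch that pulls the title out of a page’s HTML and tells you whether your target keyword is in it. It only uses the standard library’s html.parser. The class and function names are assumptions of mine, and the keyword check is a simple case-insensitive substring match, nothing smarter.

```python
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collect the text inside the first <title> element of a page."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self._done = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title" and not self._done:
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            self._done = True

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def check_title(page_html, keyword):
    """Return the page title and whether it contains the target keyword."""
    parser = TitleParser()
    parser.feed(page_html)
    title = parser.title.strip()
    return title, keyword.lower() in title.lower()


page = "<html><head><title>My Awesomest SEO Optimized Title</title></head></html>"
print(check_title(page, "seo"))
```

An empty title in the result means the page has no title tag at all, which goes straight to the top of your fix list.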

Meta Description


Meta descriptions are more for users than for search engines. This is where you can promote the content of your page and tell users “Click me, I have exactly what you’re looking for”. It is always good to include keywords in your meta description, though.

Here’s a simple strategy I use that increased the number of phone calls my clients receive.

If your service entails providing estimates and phone consultations, make sure to add that in your meta description. And go the extra mile to attract customers when they see your site on SERPs.

For example, if you are a tree service company, offer a free onsite estimate instead of just a free estimate over the phone, where you can give nothing but a ballpark figure that could easily blow up. If you are an attorney, offer a free phone consultation.

You get what I mean? Your meta description could look like this.

Sure Win Defense Attorney in Vanuatu. I bring criminals back in the street. Free Phone Consultation (123) 456 7892
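For completeness, here is how that text could end up in the page’s head section. The Python sketch below wraps a description in a meta description tag and escapes any characters, such as double quotes, that would otherwise break the attribute. The function name is an assumption, and most CMS plugins will do this step for you.

```python
import html


def meta_description_tag(description):
    """Render a meta description tag, escaping characters that would break the attribute."""
    safe_text = html.escape(description, quote=True)
    return '<meta name="description" content="{}">'.format(safe_text)


print(meta_description_tag(
    "Sure Win Defense Attorney in Vanuatu. I bring criminals back in the street. "
    "Free Phone Consultation (123) 456 7892"
))
```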

In some cases, though, you can leave the meta description blank and let the search engines pull a snippet from your content based on the words being searched. This is true for high authority pages.

Duplicate Page Content and Page Title

These are another thing to look out for during your SEO site audit, because they can give you a hard time ranking your pages. Duplicate content is a known no-no, and duplicate titles will confuse the search engines.
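Most crawlers will flag duplicate titles for you, but if all you have is a list of URLs and titles, a few lines of Python will do. This sketch groups URLs by title, ignoring case and extra spaces, and returns only the titles used on more than one page. The URLs and titles are made-up examples, and the function name is an assumption.

```python
from collections import defaultdict


def find_duplicate_titles(pages):
    """Group URLs by normalized title and return only titles used more than once."""
    by_title = defaultdict(list)
    for url, title in pages:
        key = " ".join(title.lower().split())  # ignore case and extra whitespace
        by_title[key].append(url)
    return {title: urls for title, urls in by_title.items() if len(urls) > 1}


# Hypothetical crawl output, for illustration only.
crawl = [
    ("http://www.example.com/", "Tree Service in Vanuatu"),
    ("http://www.example.com/home", "Tree service  in Vanuatu"),
    ("http://www.example.com/contact", "Contact Us"),
]
print(find_duplicate_titles(crawl))
```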

A few words from the Big G on duplicate content.

This brings us to,

Canonical tags

You can use canonical tags to solve duplicate content issues.

Many sites make the same HTML content or files available via different URLs. Say you have a clothing site and one of your top items is a green dress. The product page for the dress may be accessible through several different URLs, especially if you use session IDs or other parameters:

To gain more control over how your URLs appear in search results, and to consolidate properties, such as link popularity, we recommend that you pick a canonical (preferred) URL as the preferred version of the page.

If you want a given URL to be the canonical URL for your listing, you can indicate this to search engines by adding a <link> element with the attribute rel="canonical" to the <head> section of the non-canonical pages. To do this, create a link as follows:

<link rel="canonical" href="">
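To see how several URL variations collapse into one canonical URL, here is a small Python sketch. It strips the query parameters that do not change the page content, such as session IDs, and builds the matching canonical link tag. The parameter names, the example.com URLs, and the function names are all assumptions for illustration. Which parameters are safe to ignore depends entirely on your site.

```python
import html
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical parameter names that do not change the page content.
IGNORED_PARAMS = {"sessionid", "sort", "utm_source"}


def canonical_url(url):
    """Drop ignorable query parameters and any fragment from a URL."""
    parts = urlsplit(url)
    kept = [(key, value) for key, value in parse_qsl(parts.query)
            if key.lower() not in IGNORED_PARAMS]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def canonical_link_tag(url):
    """Render the <link rel="canonical"> tag for the cleaned-up URL."""
    return '<link rel="canonical" href="{}">'.format(html.escape(canonical_url(url), quote=True))


# Placeholder URLs that all point at the same hypothetical product page.
variants = [
    "http://www.example.com/dresses/green-dress.html",
    "http://www.example.com/dresses/green-dress.html?sessionid=123",
    "http://www.example.com/dresses/green-dress.html?sort=price&sessionid=456",
]
for variant in variants:
    print(canonical_link_tag(variant))
```

All three variants should print the same tag, which is the whole point: every version of the page tells search engines to credit one preferred URL.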

Meta Robots Tags

Robots play a huge role in making the internet work. They are hard workers that scour the web for new content and index amazing articles just like this one. You should agree with me. Don’t make me beg. 🙂

But no matter how amazing these robots are, you should learn how to control them as they do things that could hurt your SEO sometimes.

It looks like this, and you’ll find it in the <head>:

<meta name="robots" content="noindex, nofollow">

The name="robots" attribute means the instruction applies to all search engine bots. You can target specific bots, such as “googlebot” for Google or “slurp” for Yahoo, but I don’t see why you’d want to single out a bot, so “robots” will do just fine.

NOINDEX & NOFOLLOW means you do not want the page indexed on search engines and you do not want any link on the page followed.

But why would you want to do such a thing?

One reason is to avoid duplicate content issues. And sometimes we just want certain pages out of the search engines’ index, like your nude picture or whatever page you do not want appearing in search results.

Nofollowing, on the other hand, is sometimes used to keep the page’s PageRank to itself. No link love outta here 🙂

List of values you can use

NOINDEX – prevents the page from being included in the index.
NOFOLLOW – prevents Googlebot from following any links on the page. (Note that this is different from the link-level NOFOLLOW attribute, which prevents Googlebot from following an individual link.)
NOARCHIVE – prevents a cached copy of this page from being available in the search results.
NOSNIPPET – prevents a description from appearing below the page in the search results, as well as prevents caching of the page.
NOODP – blocks the Open Directory Project description of the page from being used in the description that appears below the page in the search results.
NONE – equivalent to “NOINDEX, NOFOLLOW”
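During an audit you will want to know which pages carry these values, especially a NOINDEX that someone forgot to remove. Here is a rough Python sketch that reads the robots meta tag from a page’s HTML and reports whether the page can be indexed and its links followed. It only looks at name="robots", not bot-specific tags, and the class and function names are assumptions.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives from every <meta name="robots"> tag on a page."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() != "robots":
            return
        for token in attr.get("content", "").split(","):
            token = token.strip().lower()
            if token:
                self.directives.add(token)


def robots_policy(page_html):
    """Report whether the page may be indexed and whether its links may be followed."""
    parser = RobotsMetaParser()
    parser.feed(page_html)
    found = parser.directives
    return {
        "index": not ({"noindex", "none"} & found),
        "follow": not ({"nofollow", "none"} & found),
    }


page = '<head><meta name="robots" content="NOINDEX, NOFOLLOW"></head>'
print(robots_policy(page))
```

A page with no robots meta tag comes back as indexable and followable, which matches the default behavior of search engine bots.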

What’s Next?

OK. So now what?! The SEO site audit is done and we have a ton of information at hand. What do we do with it? This is the point where you draft an actionable SEO plan. The data you collected should give you a clear view of what you need to do as far as on-page SEO is concerned.

Hey! Let me know what you think. It’s way past my bedtime and I’m seeing double already double already.

I’d love to hear your suggestions and whatever valuable info you can add.
