How do search engines index sites? July 7, 2008

Posted by seonlinks in seo learn.

The first step in the indexing process is discovery: a search engine has to know that your pages exist. Search engines generally learn about pages by following links, and this works well. If you have new pages, make sure relevant sites link to them, and link to them from within your own site. For instance, if you have a blog for your business, you could link from your main site to the latest blog post. You can also tell search engines about the pages of your site by submitting a Sitemap file. Google, Yahoo!, and Microsoft all support the Sitemaps protocol, and if you have a blog, it couldn’t be easier: simply submit your blog’s RSS feed. Each time you update your blog and the RSS feed is updated, the search engines can extract the URL of the latest post, so they learn about updates quickly.
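To make the feed idea concrete, here is a minimal Python sketch of how post URLs could be pulled out of an RSS feed and written into a Sitemap file. The feed content, the example.com URLs, and the function names `post_urls` and `build_sitemap` are illustrative assumptions, not how any particular search engine does it. It also assumes the feed lists the newest post first, which is common but not guaranteed.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the Sitemaps protocol (sitemaps.org).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

# Hypothetical feed content; a real feed would be fetched from the blog.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Example Business Blog</title>
    <link>http://www.example.com/blog/</link>
    <item>
      <title>Newest post</title>
      <link>http://www.example.com/blog/newest-post</link>
    </item>
    <item>
      <title>Older post</title>
      <link>http://www.example.com/blog/older-post</link>
    </item>
  </channel>
</rss>"""


def post_urls(rss_text):
    """Return the <link> of every <item> in the feed, in document order."""
    root = ET.fromstring(rss_text)
    urls = []
    for item in root.iter("item"):
        link = item.findtext("link")
        if link and link.strip():
            urls.append(link.strip())
    return urls


def build_sitemap(urls):
    """Build a bare-bones Sitemap document with one <url><loc> per URL."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    urls = post_urls(SAMPLE_RSS)
    print("Latest post:", urls[0])
    print(build_sitemap(urls))
```

A real Sitemap file can carry optional fields such as a last-modified date for each URL; the sketch leaves them out to keep the structure visible.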

Once a search engine knows about the pages, it has to be able to access them. You can use the crawl errors reports in the search engine's webmaster tools to see whether it is having any trouble crawling your site. These reports show exactly which pages could not be crawled, when the crawl was attempted, and what the error was.
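You can also run a rough check of your own before the crawl errors reports catch a problem. The Python sketch below requests each URL and records which page failed, when the attempt was made, and what the error was. The function names `check_url` and `crawl_errors` and the `opener` parameter are assumptions made for illustration; this is not how a search engine crawler works, only a way to spot obviously unreachable pages.

```python
import urllib.error
import urllib.request
from datetime import datetime, timezone


def check_url(url, opener=urllib.request.urlopen):
    """Request one URL and return (status_code_or_None, error_or_None)."""
    try:
        with opener(url, timeout=10) as response:
            return response.status, None
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 404 or 500.
        return err.code, str(err.reason)
    except urllib.error.URLError as err:
        # No usable answer at all: DNS failure, refused connection, timeout.
        return None, str(err.reason)


def crawl_errors(urls, opener=urllib.request.urlopen):
    """Return one record per URL that could not be fetched."""
    report = []
    for url in urls:
        status, error = check_url(url, opener)
        if error is not None:
            report.append({
                "url": url,
                "checked_at": datetime.now(timezone.utc).isoformat(),
                "status": status,
                "error": error,
            })
    return report


if __name__ == "__main__":
    # Placeholder URLs; replace them with pages from your own site.
    for row in crawl_errors(["http://www.example.com/"]):
        print(row)
```

The `opener` parameter exists only so the function can be exercised without network access; by default it uses the standard library's `urlopen`.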

Once a search engine can access the pages, it extracts the content. You want to make sure that what your page is about is represented by text. What does the page look like with JavaScript, Flash, and images turned off in the browser? Use ALT text and descriptive filenames for images. For instance, if your company name is in a graphic, the ALT text should be the company name rather than “logo”. Put text in HTML rather than in Flash or images. This not only helps search engines like Google index your content, but also makes your site more accessible to visitors with mobile browsers, screen readers, or older browsers.
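One way to approximate that text-only view is to parse the HTML yourself. The Python sketch below collects the text outside scripts and stylesheets, counts descriptive ALT text as page text, and flags images whose ALT text is missing or generic. The class name `IndexableText`, the `GENERIC_ALT` word list, and the sample markup are assumptions chosen for illustration; real search engines extract content in their own ways.

```python
from html.parser import HTMLParser

# Assumed list of ALT values that say nothing about the image.
GENERIC_ALT = {"", "logo", "image", "picture"}


class IndexableText(HTMLParser):
    """Collect visible text plus ALT text, and note weakly described images."""

    def __init__(self):
        super().__init__()
        self.text = []          # text a text-only reader would get
        self.weak_images = []   # src of images with missing or generic ALT
        self._skip_depth = 0    # greater than zero inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1
        elif tag == "img":
            attributes = dict(attrs)
            alt = (attributes.get("alt") or "").strip()
            if alt.lower() in GENERIC_ALT:
                self.weak_images.append(attributes.get("src"))
            else:
                self.text.append(alt)

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.text.append(data.strip())


def analyze(html):
    """Return (extracted_text, list_of_weak_image_sources)."""
    parser = IndexableText()
    parser.feed(html)
    parser.close()
    return " ".join(parser.text), parser.weak_images


# Hypothetical page: one well-described image, one generic one.
SAMPLE_HTML = """<html><body>
<img src="acme-widgets-logo.png" alt="Acme Widgets">
<img src="img001.png" alt="logo">
<h1>Hand-made widgets</h1>
<script>document.write("only visible with scripting on");</script>
</body></html>"""

if __name__ == "__main__":
    text, weak = analyze(SAMPLE_HTML)
    print("Text:", text)
    print("Images needing better ALT text:", weak)
```

If the extracted text does not say what the page is about, that is a sign that important content lives only in images, Flash, or script output.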