How Does Google Understand Webpages Today?

How Google has evolved
Back in 1998, when Google first started, it was little more than a text-based search algorithm that matched query strings against the websites it knew about and returned the best matches, based largely on the keywords in a page's text and the links pointing to it. Webpages back then were just as simple, with plain text content and little styling. Webmasters didn't have to worry about a thing - they built textual content that showed up in search results if it was relevant. A lot has changed since then. Technologies like JavaScript and CSS are now everywhere, and website developers need to take care of a lot more things.

How Google understands a webpage

Since websites were collections of simple text-based webpages in Google's infancy, Google could easily index a webpage and match it against search queries. It only looked at the raw text returned in the HTTP response body, and didn't interpret what a typical browser running JavaScript would see.

Now, however, the web is full of rich websites that make heavy use of JavaScript and CSS. Many of these are dynamic websites whose content is rendered by JavaScript, so it is not ordinarily visible to a text-only crawler.
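
To make the difference concrete, here is a minimal Python sketch of what a text-only crawler "sees". It is only an illustration, not how Google's crawler actually works. The sample page, the VisibleTextExtractor class and the text_only_view function are all made up for this example. The parser reads the raw HTML and never runs the script, so the article text that JavaScript would insert never shows up.

```python
from html.parser import HTMLParser


class VisibleTextExtractor(HTMLParser):
    """Collects page text like a simple text-only crawler: no JavaScript runs."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # greater than 0 while inside <script> or <style>
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside script and style blocks.
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def text_only_view(html):
    """Return the text found in the raw HTML, joined by single spaces."""
    parser = VisibleTextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


# Hypothetical page: the article body is only added by JavaScript at load time.
RAW_HTML = """
<html>
  <head><title>My Blog</title><style>body { color: #333; }</style></head>
  <body>
    <h1>Welcome</h1>
    <div id="post"></div>
    <script>
      document.getElementById("post").textContent = "Full article text";
    </script>
  </body>
</html>
"""

if __name__ == "__main__":
    print(text_only_view(RAW_HTML))  # prints: My Blog Welcome
```

A visitor's browser would show "Full article text" inside the post, but the text-only view contains just the title and the heading.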

To solve this problem, Google started executing JavaScript to understand the content on a page. Doing this for a web that is so large is a difficult task. But Google has gradually been improving how it does this, and in the past few months its indexing system has been rendering a substantial number of webpages more like an average user's browser with JavaScript turned on. Some problems remain, however.

How can webmasters help?

Rendering so much content isn't a walk in the park, and the sheer variety of development techniques makes it even harder. So naturally, things don't always go perfectly, which may negatively impact search results for your site. Webmasters can follow these guidelines to ensure that the content on their site can be accessed easily.

Note: We discussed some of these techniques in our recent post about helping Google recognize the mobile version of a webpage. Go check it out if you have a separate mobile site, and see if you're doing everything correctly.
  • If resources like JavaScript or CSS in separate files are blocked (say, with robots.txt), Google won’t be able to see your site like an average user. It is recommended that you allow Googlebot to retrieve JavaScript and CSS so that your content can be indexed better. This is especially important for mobile websites, where external resources like CSS and JavaScript help search algorithms understand that the pages are optimized for mobile.
  • If your web server is unable to handle the volume of crawl requests for resources, it may have a negative impact on Google's ability to render your pages. If you’d like to ensure that your pages can be rendered by Google, make sure your servers are able to handle crawl requests for resources.
  • It's always a good idea to have your site degrade gracefully. This will help users enjoy your content even if their browser doesn't have a compatible JavaScript implementation. It will also help visitors who have JavaScript turned off, as well as search engines that can't execute JavaScript yet.
  • Sometimes the JavaScript may be too complex or arcane for Google to execute, in which case the page(s) can't be rendered fully and accurately.
  • Some JavaScript removes content from the page rather than adding to it, which prevents Google from indexing that content.
Make sure you rectify any of these problems on your site as soon as possible!
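
For the first tip, a quick way to spot blocked resources is to test your robots.txt rules against the JavaScript and CSS URLs your pages load. The sketch below uses Python's built-in urllib.robotparser module. The robots.txt content, the example.com URLs and the blocked_resources helper are assumptions made up for illustration. Python's parser is also not Googlebot's own matcher, so treat the result as a rough check rather than a guarantee.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that blocks the script and style folders for every crawler.
ROBOTS_TXT = """\
User-agent: *
Disallow: /js/
Disallow: /css/
"""


def blocked_resources(robots_txt, resource_urls, user_agent="Googlebot"):
    """Return the URLs that the given robots.txt rules forbid user_agent to fetch."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in resource_urls if not parser.can_fetch(user_agent, url)]


# Hypothetical resources referenced by a page on the site.
RESOURCES = [
    "https://example.com/js/app.js",
    "https://example.com/css/style.css",
    "https://example.com/images/logo.png",
]

if __name__ == "__main__":
    for url in blocked_resources(ROBOTS_TXT, RESOURCES):
        print("Blocked for Googlebot:", url)
```

With these sample rules, the script and the stylesheet are reported as blocked while the image is not. Removing the two Disallow lines (or adding matching Allow rules) would let the crawler fetch them.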

Found these tips helpful? Do remember to give us a shout-out in the comments section below. And don't forget to share any tips of your own. Peace :)
