| Domain | IP Address |
| --- | --- |
| www-ex.google.com | 216.239.33.100 |
| www-sj.google.com | 216.239.35.100 |
| www-va.google.com | 216.239.37.100 |
| www-dc.google.com | 216.239.39.100 |
| www-ab.google.com | 216.239.51.100 |
| www-in.google.com | 216.239.53.100 |
| www-zu.google.com | 216.239.55.100 |
| www-cw.google.com | 216.239.57.100 |
| www-fi.google.com | 216.239.41.100 |
| www-gv.google.com | 216.239.59.100 |
| www-kr.google.com | 66.102.11.100 |
| www-mc.google.com | 66.102.7.100 |
| www-lm.google.com | 66.102.9.100 |
Note: Searches at www-zu and www-sj are currently redirected to other data centers. Results for searches at their IP addresses fluctuate heavily during a Google Dance, which suggests that these searches are also routed internally to other data centers. Our statistics for Google's DNS records show that no searches at www.google.com are currently directed to www-zu or www-sj, so we can assume that these two data centers are offline.
Those who keep an eye on Google's index updates often think that the Google Dance is over when they see the new index at www.google.com, or when they have not seen the old index there for some time. In fact, the update is not finished until all the domains listed above provide results from the new index.
The index update at each individual data center seems to happen at a single point in time. As soon as a data center shows results from the new index, it does not switch back to the old one. Most likely, this is because the index is stored redundantly at each data center and, at first, only one part of the servers (possibly half of them) is updated. During this period, only the other half of the servers is active and provides search results. As soon as the first half has been updated, it becomes active and provides search results while the other half receives the new index. Thus, from the user's perspective, the update of one data center happens at one point in time.
Finally, it should be noted that access to the individual data centers is generally controlled by DNS alone, but queries are sometimes redirected. This is easy to detect: if, for a query at one of the domains listed above, the links to Google's cache do not match the IP address that belongs to the domain, the query has been redirected. When this happens, Google is blocking access to that data center, for whatever reason.
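The redirect check can be sketched in Python. This is only a minimal sketch: it assumes that cache links in a result page embed the serving IP address in the form `http://<ip>/search?q=cache:...`. That pattern, the function names, and the sample HTML are illustrative assumptions, not a documented Google format.

```python
import re

# Assumed shape of a cache link: http://<ip>/search?q=cache:...
CACHE_LINK_IP = re.compile(r"https?://(\d{1,3}(?:\.\d{1,3}){3})/search\?q=cache:")


def cache_link_ips(html):
    """Return the set of IP addresses found in the cache links of a result page."""
    return set(CACHE_LINK_IP.findall(html))


def is_redirected(html, expected_ip):
    """True if any cache link points to an IP other than the expected one."""
    return any(ip != expected_ip for ip in cache_link_ips(html))


# Hypothetical result snippet: a query sent to www-ex (216.239.33.100)
# whose cache link carries the address of www-va (216.239.37.100).
sample = '<a href="http://216.239.37.100/search?q=cache:abc:www.example.com/">Cached</a>'
print(is_redirected(sample, "216.239.33.100"))
```

A page without any cache links yields no evidence either way, so the sketch reports it as not redirected.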
The beginning of a Google Dance can always be observed at the test domains www2.google.com and www3.google.com. These domains normally have stable DNS records that make them resolve to only one (often the same) IP address. Before the Google Dance begins, at least one of the test domains is assigned the IP address of the data center that receives the new index first.
Building a completely new index once a month can cause quite some trouble. After all, Google has to spider several billion documents and then process many terabytes of data. Therefore, testing the new index is indispensable. Of course, the folks at Google do not need the test domains themselves. They most certainly have many ways to check a new index internally, but they do not have much time to conduct the tests.
So, the reason for having www2 and www3 is rather to show the new index to webmasters who are interested in their upcoming rankings. Many of these webmasters discuss the new index in the Google forums out on the web, and these discussions can be observed by Google employees. At that time, the general public cannot see the new index yet, because when the update begins the DNS records for www.google.com normally do not point to the IP address of the data center that is updated first.
As soon as Google's test community of forum members has found no severe malfunctions caused by the new index, Google's DNS records are ready to make www.google.com resolve to the data center that is updated first. This is when the Google Dance begins. If severe malfunctions become obvious during this test phase, however, it is still possible to cancel the update at the other data centers. The domain www.google.com would then not resolve to the data center that has the flawed index, and the general public would not notice it. In this case, the index could be rebuilt or the web could be spidered again.
So, the search results that can be seen at www2.google.com and www3.google.com will always appear at www.google.com later on, as long as there is a regular index update. However, there may be minor fluctuations. On the one hand, the index at one data center is never exactly identical to the index at another. We can easily check this by comparing the number of results for the same query at the data center domains listed above, which often differ from each other. On the other hand, it is often assumed that the iterative PageRank calculation is not yet finished when the Google Dance begins, so that preliminary values influence rankings at that point in time.
Most webmasters are interested in ranking changes for their web site during the Google Dance. Besides that, many also want to know their new PageRank values. Normally, the Google Toolbar fetches the PageRank values from the data center whose IP address is given in the current DNS record for www.google.com. Hence, when the Google Dance begins, the Toolbar usually displays the old PageRank values.
Google submits PageRank values to the Toolbar in simple text files. Formerly, this happened via XML. The switch to text files occurred in August 2002. The PageRank files can be requested directly from the domain www.google.com. Basically, the URLs for those files look as follows (without line breaks):
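To see which data center a test domain currently points to, the resolved addresses can be matched against the table above. The following is a minimal sketch: the dictionary simply restates that table, the function names are illustrative, and the addresses are the ones listed in this article, which may no longer be valid.

```python
import socket

# Data center domains and IP addresses as listed in the table above.
DATA_CENTERS = {
    "216.239.33.100": "www-ex",
    "216.239.35.100": "www-sj",
    "216.239.37.100": "www-va",
    "216.239.39.100": "www-dc",
    "216.239.51.100": "www-ab",
    "216.239.53.100": "www-in",
    "216.239.55.100": "www-zu",
    "216.239.57.100": "www-cw",
    "216.239.41.100": "www-fi",
    "216.239.59.100": "www-gv",
    "66.102.11.100": "www-kr",
    "66.102.7.100": "www-mc",
    "66.102.9.100": "www-lm",
}


def resolve(hostname):
    """Return all IPv4 addresses the hostname currently resolves to."""
    _name, _aliases, addresses = socket.gethostbyname_ex(hostname)
    return addresses


def data_centers_for(ips):
    """Map each IP address to its data center name, or None if it is not in the table."""
    return {ip: DATA_CENTERS.get(ip) for ip in ips}
```

Calling `data_centers_for(resolve("www2.google.com"))` and comparing the result with the same call for `www.google.com` would show whether a test domain already points to a different data center than the main domain.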
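The comparison of result counts can be sketched as follows. The counts in the example are invented for illustration, and the function names are hypothetical. Differing counts only hint at differing indexes; they do not prove it.

```python
from collections import defaultdict


def group_by_result_count(counts):
    """Group data center names by the result count each one reported."""
    groups = defaultdict(list)
    for name, count in counts.items():
        groups[count].append(name)
    return dict(groups)


def indexes_differ(counts):
    """True if at least two data centers reported different result counts."""
    return len(set(counts.values())) > 1


# Invented counts for one query, noted by hand at three data center domains.
observed = {"www-ex": 1520000, "www-va": 1520000, "www-dc": 1490000}
print(group_by_result_count(observed))
print(indexes_differ(observed))
```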
http://www.google.com/search?
client=navclient-auto&
ch=0123456789&
features=Rank&
q=info:http://www.domain.com/
There is only one line of text in the PageRank files. The last number in this line is the PageRank value.
The parameters in the URL shown above are required for displaying the PageRank files in a browser. The value "navclient-auto" for the parameter "client" identifies the Toolbar. The parameter "q" carries the URL of the page in question. The value "Rank" for the parameter "features" specifies that the PageRank files are requested. If it is omitted, Google's servers still transmit XML files. The parameter "ch" passes a checksum for the URL to Google. This checksum can only change when Google updates the Toolbar version.
The PageRank files requested by the Google Toolbar are cached by Internet Explorer. So, their URLs and checksums can simply be found by looking in the folder Temporary Internet Files. Knowing the checksums of your URLs, you can view the PageRank files in your browser. Since the PageRank files are kept in the browser cache and are thus clearly visible, viewing them in a browser should not be a violation of Google's Terms of Service as long as requests are not automated. However, you should be cautious. The Toolbar submits its own User-Agent to Google. It is:
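As a minimal sketch, the URL layout shown above and the reading of the value from the end of the line can be written in Python. The checksum is the placeholder from the example URL, the `host` parameter and function names are illustrative, and the sample line passed to the parser is an assumed format used only to demonstrate the parsing.

```python
import re
from urllib.parse import urlencode


def pagerank_url(page_url, checksum, host="www.google.com"):
    """Assemble a PageRank file URL from the four parameters described in the text."""
    params = {
        "client": "navclient-auto",  # identifies the Toolbar
        "ch": checksum,              # checksum for the URL, taken from the browser cache
        "features": "Rank",          # ask for the text file
        "q": "info:" + page_url,     # the page whose PageRank is requested
    }
    # Keep ':' and '/' unescaped so the result matches the layout shown above.
    return "http://" + host + "/search?" + urlencode(params, safe=":/")


def parse_pagerank(line):
    """Return the last number in the single line of a PageRank file."""
    numbers = re.findall(r"\d+", line)
    if not numbers:
        raise ValueError("no number found in PageRank line")
    return int(numbers[-1])


print(pagerank_url("http://www.domain.com/", "0123456789"))
print(parse_pagerank("Rank_1:1:7"))  # sample line is an assumption
```

The sketch only builds and parses strings. It does not send any request.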
Mozilla/4.0 (compatible; Google Toolbar 1.1.60-deleon; OS SE 4.10)
1.1.60-deleon is a Toolbar version, which may of course change. OS stands for the operating system that you have installed. So, Google is able to tell requests made by a browser apart from those made by the Toolbar, unless they go out via a proxy that modifies the User-Agent accordingly.
Now, let's see how we can get the new PageRank values. Taking a look at IE's cache, you will notice that the PageRank files are not requested from the domain www.google.com but from IP addresses like 216.239.33.102. Additionally, the PageRank files' URLs often contain a parameter "failedip" that is set to values like "216.239.35.102;1111" (its function is not entirely clear). However, it is pretty easy to get the new PageRank values. Simply modify the IP address in the URL so that the request goes to one of the data centers that already has the new index. The necessary information is given above.
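Changing the address in a cached PageRank URL amounts to replacing the host part and leaving everything else untouched. A minimal sketch, with a hypothetical function name; the article does not state whether the last octet of the target address should stay ".102" as in the cached URLs or be ".100" as in the table, so the caller has to supply the full address:

```python
from urllib.parse import urlsplit, urlunsplit


def retarget(url, new_host):
    """Return the same URL with only the host part replaced."""
    parts = urlsplit(url)
    return urlunsplit(parts._replace(netloc=new_host))


cached = (
    "http://216.239.33.102/search?client=navclient-auto&ch=0123456789"
    "&features=Rank&q=info:http://www.domain.com/"
)
print(retarget(cached, "216.239.37.102"))
```

The query string, including the checksum and any "failedip" parameter, is passed through unchanged.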
Article Copyright: dance.efactory.de