• Aucun résultat trouvé

Redirection Using 301 and 302

Dans le document Search Engine Optimization with PHP (Page 104-108)

The 301 and 302 codes are the HTTP status codes used for redirection. These codes indicate that another request must be made in order to fulfill the HTTP request — the content is located elsewhere. When a web page replies with either of these codes, it does not return any HTML content, but includes an additional Location:HTTP header that indicates another URL where the content is found.

Figure 4-3 shows an example of how redirects occur in practice. As you can see, when a redirect occurs, the URL that issues the redirect doesn’t return any content, but indicates the new URL that should be referenced instead.

Figure 4-3

Note that in case the user agent is a search engine spider or a software application, there is not a user involved in the process, as shown in Figure 4-3. Search engines follow the same basic process to update SERPs when they encounter a redirect.

Redirections can be chained, that is, one redirect can point to a page that, in turn, redirects again. However, multiple redirects should be avoided to the extent that it is possible. A maximum of five redirections was stipulated by an older version of RFC 2616, but that limit was later lifted. Regardless, it is wise to avoid chained redirects because they can slow down site spidering — spiders may only schedulethe result of the redirection for spidering instead of immediately fetching it.

We recommend chaining no more than three redirects.

The HTTP standard actually contains many redirection status codes. They are listed in Table 4-1.

In practice only the 301 and 302 status codes are used for redirection. Furthermore, because browsers are known to struggle with certain of the other status codes, it is probably wise to avoid them, even if they seem more relevant or specific. It can only be assumed that search engines may also struggle with them, or at least that it is not entirely understood how they should be interpreted.

Web client User

URL A replies with a status code indicating a redirect to URL B Web server at URL A

URL B replies with a status code of 200 and some HTML content Web server at URL B User requests URL A

User is displayed the content received from URL B

Web client requests URL A

Web client requests URL B, updating the address bar to display URL B

Web client reads the content received from URL B

Table 4-1

301

The 301 status code indicates that a resource has been permanently moved to the new location specified by the Location:header that follows. It indicates that the old URL is obsolete and should replace any references to the old URL with the indicated URL.

Take as an example a fictional page named http://www.example.com/old_page.php, which returns this header:

HTTP/1.1 301 Moved Permanently Date: Wed, 14 Jun 2006 09:50:39 GMT Server: Apache/2.0.54 (Unix) X-Powered-By: PHP/5.0.4

Location: http://www.example.com/new_page.php Content-Length: 0

Connection: close

Content-Type: text/html; charset=ISO-8859-1

When loading the page in a web browser, the response will be automatically redirected to the new loca-tion specified by the Locationheader. After the redirection, the back button in your browser won’t ref-erence the initially requested page, as a result of the old page being permanentlyredirected.

The 301 status code also indicates to search engines that link equity from the previous URL should be credited to the new one. In theory, the new page will inherit the rankings of the original page. In prac-tice, however, it may take some time for this to occur. It would be wise not to frivolously change URLs regardless, if this is a concern.

Status Code Description

300 Multiple choices

301 Moved permanently

302 Found

303 See other

304 Not modified

305 Use Proxy

307 Temporary Redirect

The 301 status code is arguably the most important one when it comes to search engine optimization. The largest part of this chapter is dedicated to this status code, and to exercises demonstrating its use. But before getting to the exercises, the following sections examine the 302, 404, and 500 status codes. These are important to understand as well.

302

The 302 status code is a bit ambiguous in meaning. It indicates that a resource is “temporarily” moved. The old URL is not obsolete at all, and clients will not cache the result unless indicated explicitly by a Cache-Controlor Expiresheader. To confuse things even further, 302 is also used for some paid advertising links, but that is not the usage discussed here.

The big problem with the 302 status code is that its meaning for search engines depends on its context.

In practice, it is worth dividing them into internaltemporary redirects, that is, from a page on domain A to another page on domain A, and externaltemporary redirects, from a page on domain A to a page on domain B.

Browsers always abide by the definitions for interpreting a 302 redirect — both internal and external.

However, today, most search engines (Google and Yahoo! included) only use it for an internal302. For an internal 302 redirect, then, search engines will not cache the result of the redirect, and continue to list domain A in the SERPs. This is consistent with the definition.

External 302 redirects are more of a problem. Matt Cutts of Google states that more than 99% of the time, Google will list the result with the destination result, that is, domain B, instead of domain A. This is against the standard, and Google behaves like this to mitigate a vulnerability called “302 hijacking.”

302 hijacking refers to the practice of using a page on domain A to link to a page on domain B, which has fresh quality content. Typically, that page would rank well based on that “stolen” fresh content from domain B, and employ a form of cloaking to redirect users to another page. This practice became prevalent enough to warrant a change in policy from both Google and Yahoo!, and, according to Matt Cutts, “Google is moving to a set of heuristics that return the destination page more than 99% of the time. Why not 100% of the time? Most search engines reserve the right to make exceptions when we think the source page will be better for users, even though we’ll only do that rarely.”

In the article at http://www.mattcutts.com/blog/seo-advice-discussing-302-redirects/, Matt Cutts discusses external 302s. In this case, the RFC definition is not the rule — it is the exception!

For the most part, external 302s are treated as 301s, but they do not affect the transference of link equity.

In practice, on a dynamic site, you should evaluate whether 302s are necessary anyway. If you want a URL to host some different content than the usual temporarily, it is better to change the content transparently.

Possible implementations include using an include(), or fetching and displaying the alternative content remotely, obviating the need for the 302 in the first place. To do this you can use the cURL functions in PHP — Client URL Library.

Supposing the old page was called old_page.php, and the page that contains the required content is called new_page.php, you could simply include the content from the latter page as follows:

include(‘new_page.php’);

To include the content from a different server using cURL, you could do something like this:

$ch = curl_init();

curl_setopt($ch, CURLOPT_URL,’http://www.example.com/new_page.php’);

curl_setopt($ch, CURLOPT_RETURNTRANSFER,1);

echo curl_exec($ch);

Dans le document Search Engine Optimization with PHP (Page 104-108)