Redirecting Dynamic Affiliate URLs - Search Engine Optimization with PHP

1.

Create a new file named url_utils.inc.phpin your seophp/includefolder, and type this code in:

<?php

// transforms a query string into an associative array function parse_query_string($query_string)

{

// split the query string into individual name-value pairs

$items = explode(‘&‘, $query_string);

// split the name-value pair and save the elements to $qs_array

$pair = explode(‘=’, $i);

// removes a parameter from the query string function remove_query_param($url, $param) {

// extract the query string from $url

$tokens = explode(‘?’, $url);

$url_path = $tokens[0];

$query_string = $tokens[1];

// transform the query string into an associative array

$qs_array = parse_query_string($query_string);

// remove the $param element from the array unset($qs_array[$param]);

// create the new query string by joining the remaining parameters

$new_query_string = ‘’;

if ($qs_array) {

foreach ($qs_array as $name => $value)

{

$new_query_string .= ($new_query_string == ‘’ ? ‘?’ : ‘&‘) . urlencode($name) . ‘=’ . urlencode($value);

} }

// return the URL that doesn’t contain $param return $url_path . $new_query_string;

}

2.

In your ^seophpfolder, create a new script named aff_test.phpand type the following code:

<?php

// obtain the URL with no affiliate ID

$clean_url = SITE_DOMAIN . remove_query_param($_SERVER[‘REQUEST_URI’], ‘aff_id’);

// 301 redirect to the new URL

header(‘HTTP/1.1 301 Moved Permanently’);

header(‘Location: ‘ . $clean_url);

}

// display affiliate details

echo ‘You got here through affiliate: ‘;

if (!isset($_SESSION[‘aff_id’]))

3.

Load http://seophp.example.com/aff_test.phpand expect to get the results displayed in Figure 5-3.

Figure 5-3

4.

_Loadhttp://seophp.example.com/aff_test.php?a=1&aff_id=34&b=2, and expect to be redirected to http://localhost/seophp/aff_test.php?a=1&b=2, with the affiliated ID being retained, as shown in Figure 5-4.

Figure 5-4

Although you’ve written quite a bit of code, it’s pretty straightforward. There’s no URL rewriting involved, so you’re working with plain-vanilla PHP scripts this time.

Your aff_test.phpscript starts by loading url_utils.inc.phpand config.inc.php, and starting the PHP session:

<?php

// include URL utils library

require_once ‘include/url_utils.inc.php’;

// load configuration script

require_once ‘include/config.inc.php’;

// start PHP session session_start();

Then you verify if the ^aff_idquery string parameter is present. In case it is, you save its value to the user’s session, and redirect to a version of the URL that doesn’t contain ^aff_id. The remove_query_param() function from the URL utils library is very handy, because it correctly preserves all existing parameters:

// redirect affiliate links if (isset($_REQUEST[‘aff_id’]))

{

// save the affiliate ID

$_SESSION[‘aff_id’] = $_REQUEST[‘aff_id’];

// obtain the URL with no affiliate ID

$clean_url = SITE_DOMAIN . remove_query_param($_SERVER[‘REQUEST_URI’], ‘aff_id’);

// 301 redirect to the new URL

header(‘HTTP/1.1 301 Moved Permanently’);

header(‘Location: ‘ . $clean_url);

}

Finally, aff_test.phpchecks whether there’s a session parameter named ^aff_idset. In case it is, your user got to the page through an affiliate link, and you’ll need to take this into account when dealing with this particular user. For the purposes of this simple exercise, you’re simply displaying the affiliate ID, or (no affiliate)in case the page was first loaded without an ^aff_idparameter.

// display affiliate details

echo ‘You got here through affiliate: ‘;

if (!isset($_SESSION[‘aff_id’]))

It’s also worth analyzing what happens inside the remove_query_param()function from ^url_utils .inc.php. The strategy you took for removing one query string parameter is to transform the existing query string to an associative array, and then compose the associative array back to a query string, after removing the ^aff_idelement.

To keep the code cleaner, you have a separate function that transforms a query string to an associative array. This function is named parse_query_string(), and receives as parameter a query string, such as a=1&aff_id=34&b=2, and returns an associative array with these elements:

‘a’ => ‘1’

‘aff_id’ => ‘34’

‘b’ => ‘2’

The function starts by using ^explode()to split the query string on the ^&character:

// transforms a query string into an associative array function parse_query_string($query_string)

{

// split the query string into individual name-value pairs

$items = explode(‘&‘, $query_string);

If $query stringis a=1&aff_id=34&b=2, then the ^itemsarray will have these three elements:

$items[0] => ‘a=1’

$items[1] => ‘aff_id=34’

$items[2] => ‘b=2’

You then parse each of these items using the ^foreach,and you split each item on the ⁼character using explode. Using the resulting data you build an associative array named ^$qs_array, which is returned at the end. You use the urldecode()function when saving each query string parameter, because special characters may have been encoded:

// split the name-value pair and save the elements to $qs_array

$pair = explode(‘=’, $i);

We looked at parse_query_string()because it’s used by remove_query_param(), which is the function of interest — as you remember, it’s called from aff_test.phpto remove the ^aff_idparameter from the query string.

The remove_query_param()function receives two parameters: the URL and the parameter you want to remove from that URL. The URL will be something like /aff_test.php?a=1&b=2. Because the query string is what you need for further processing, remove_query_param()starts by splitting the ^$url parameter into an ^$url_pathand a $query_string:

// removes a parameter from the query string function remove_query_param($url, $param) {

// extract the query string from $url

$tokens = explode(‘?’, $url);

$url_path = $tokens[0];

$query_string = $tokens[1];

Next, you use the parse_query_string()function to transform the query string into an associative array:

// transform the query string into an associative array

$qs_array = parse_query_string($query_string);

As mentioned earlier, the associative array will be something like this:

‘a’ => ‘1’

‘aff_id’ => ‘34’

‘b’ => ‘2’

The good thing with associative arrays, and the reason for which it’s sometimes worth using them, is that they’re easy to manipulate. Now that you have the query string broken into an associative array, you only need to ^unset()the parameter you want to remove. In this case, ^$paramwill be ^‘aff_id’:

// remove the $param element from the array unset($qs_array[$param]);

You now have an associative array that contains the elements you need in the new query string, so you join these elements back in a string formatted like a query string. Note that you only do this if the asso-ciative array is not empty, and that you use the ternary operator to compose the query string:

// create the new query string by joining the remaining parameters

$new_query_string = ‘’;

if ($qs_array) {

foreach ($qs_array as $name => $value) {

$new_query_string .= ($new_query_string == ‘’ ? ‘?’ : ‘&‘) . urlencode($name) . ‘=’ . urlencode($value);

} }

As soon as the new query string is composed, you join in back with the URL path you initially stripped (which in this case is /aff_test.php), and return it:

// return the URL that doesn’t contain $param return $url_path . $new_query_string;

}

We hope this has proven to be an interesting string handling exercise. For those of you in love with regu-lar expressions, we’re sure you may guess they can be successfully used to strip the ^aff_idparameter from the query string. Here it is if you’re curious:

function remove_query_param($url, $param) {

// remove $param

$new_url = preg_replace(“#aff_id=?(.*?(&|$))?#“, ‘’, $url);

// remove ending ? or &, in case it exists

if (substr($new_url, -1) == ‘?’ || substr($new_url, -1) == ‘&‘) {

$new_url = substr($new_url, 0, strlen($new_url) - 1);

}

What about the ternary operator? If you aren’t familiar with it, here’s a quick expla-nation. The ternary operator has the form (condition ? valueA : valueB). In case the condition is true, it returns ^valueA, otherwise it returns ^valueB.

In your case, you verify if $new_query_stringis empty; if this is true, then you first add the ?character to it, which is the delimiter for the first query string parameter.

For all the subsequent parameters you use the &character.

// return the new URL return $new_url;

}

We’ll leave understanding this regular expression for you as homework.

Summar y

We would summarize the entire chapter right here in this paragraph, but that would be duplicate content!

Because we don’t wish to frustrate our readers, we will keep it short.

Ideally, every URL on a web site would lead to a page that is unique. Unfortunately, it is rarely possible to accomplish this in reality. You should simply eliminate as much duplication as possible. Parameters in URLs that do not effect significant changes in presentation should be avoided. Failing that, as in the prob-lems posed by using breadcrumb navigation, URLs that yield duplicate content should be excluded. The two tools in your arsenal you can use to accomplish this are ^robots.txtexclusion and meta-exclusion.

SE-Friendly HTML

Dans le document Search Engine Optimization with PHP (Page 137-144)