Stopping Duplicate Related Links and Who Voted Content in Drigg for Drupal

The other day we were working on a Drigg based site and noticed that many of the posted articles had up to three separate links to the same content indexed in Google. This happens because Drigg puts the Who Voted and Related Links tabs of a submitted story on their own indexable pages, separate from the main article, so Google indexes three instances of the same content. That is not good. We ended up modifying the Drigg module to stop these pages being followed from the main story page, and this article shows how we did it.

First I will explain exactly what the duplicate content issue with Drigg is. On a posted article's story page you will see three tabs under the article submission, titled Comments, Who Voted and Related Links. These tabs are indexable by Google and are the source of the duplicate content.

Below is a typical Drigg URL, which we will use to explain the duplicate content issue.

Drigg URL Example

http://yourdriggsite.com/Programming/Google-Dont-Like-Duplication

The URL above is the link to your main article page after you have submitted the article to your Drigg website. However, Drigg also creates two other links for each article page, one for Who Voted and another for Related Links, and this is where the problem lies. Below are examples of what the Who Voted and Related Links URLs would look like for our submitted article.

Drigg Main URL Example

http://yourdriggsite.com/Programming/Google-Dont-Like-Duplication

Drigg Who Voted Tab URL Example

http://yourdriggsite.com/Programming/Google-Dont-Like-Duplication/who_voted

Drigg Related Links Tab URL Example

http://yourdriggsite.com/Programming/Google-Dont-Like-Duplication/related_links

As the examples above show, these links can lead to Google indexing essentially the same story up to three times. On closer inspection, there are actually four separate links through which the same story can be reached.

The Comments tab links to the article with a trailing slash added to the URL, which Google could index as a fourth URL. The example below shows what we mean.

Drigg Main URL Example

http://yourdriggsite.com/Programming/Google-Dont-Like-Duplication

Drigg Comments Tab URL Example

http://yourdriggsite.com/Programming/Google-Dont-Like-Duplication/

Note the trailing slash on the Comments tab URL. We have seen with WordPress before that Google treats a URL without a trailing / and the same URL ending with a / as different addresses, and can in fact index both separately.
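To make the four addresses concrete, here is a minimal sketch that lists every URL under which one story can be reached, using the tab paths shown above. The function name story_url_variants is our own invention for illustration and is not part of Drigg.

```php
<?php
// Hypothetical helper (not part of Drigg): returns the URLs under which
// a single story can be reached, based on the tab paths described above.
function story_url_variants($story_url) {
  // Strip any trailing slash so every variant starts from the same base.
  $base = rtrim($story_url, '/');
  return array(
    'main'          => $base,
    'comments'      => $base . '/',
    'who_voted'     => $base . '/who_voted',
    'related_links' => $base . '/related_links',
  );
}

// Print each tab name with the URL it produces for the example story.
$variants = story_url_variants('http://yourdriggsite.com/Programming/Google-Dont-Like-Duplication');
foreach ($variants as $tab => $url) {
  echo $tab . ': ' . $url . "\n";
}
```

All four strings differ, which is why a crawler can treat them as four separate pages even though they show the same story.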

Now that we know how so many URLs with the same content are getting indexed from your Drigg based website, let's apply a quick fix by adding rel="nofollow" to the article page tabs. Because Drigg separates most of its code from the templates, you won't find the tab code in your template files. It is located in the Drigg module's drigg_ui.module file.

Instructions

First, open the file below in a text editor (we recommend PSPad).

/sites/all/modules/drigg/drigg_ui/drigg_ui.module

Look for the code snippet below. In our copy it was on line 1066, though the exact line can differ between Drigg versions.

    $output .= "
  • ".''. $item_value .'
  • ';

Replace that code with the snippet below.

        $output .= "
  • ".''. $item_value .'
  • ';

Save the file and upload it to your server, and you're done.

This will stop Google following those links when it indexes your main article page within Drigg. You can use a tool like SEOQuake for Firefox to check your links: set SEOQuake to strike out any link that carries rel="nofollow".
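The HTML inside the two snippets above did not survive formatting, so here is a rough sketch of the kind of change involved rather than the exact Drigg code. It assumes each tab is rendered as a list item wrapping a link. The function name build_tab_item and the $item_link variable are hypothetical, and only $item_value appears in the surviving snippet.

```php
<?php
// Hypothetical sketch, not the actual Drigg source: builds one tab entry
// as a list item wrapping a link, optionally marked rel="nofollow".
// $item_link is an assumed name; only $item_value is in the original.
function build_tab_item($item_link, $item_value, $nofollow = TRUE) {
  // Add the rel attribute only when nofollow is requested.
  $rel = $nofollow ? ' rel="nofollow"' : '';
  return '<li><a href="' . htmlspecialchars($item_link, ENT_QUOTES) . '"' . $rel . '>'
    . htmlspecialchars($item_value, ENT_QUOTES) . '</a></li>';
}

// Append a nofollow tab link to the output, as the module does with .=
$output = '';
$output .= build_tab_item('/Programming/Google-Dont-Like-Duplication/who_voted', 'Who Voted');
echo $output . "\n";
```

In the real module the only change you need is the extra rel="nofollow" attribute on the anchor tag inside the line that builds each tab. The surrounding code stays as it is.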

Article Details

Posted on January 27th, 2009

Category: Drigg, Drupal

    1. Adam says:

      Great tip – worked perfectly.

      Just a note: for me, that snippet of code in /sites/all/modules/drigg/drigg_ui/drigg_ui.module was on line 973.

      Custom editing of Drigg will probably move it up or down, since it's towards the bottom, but if you can’t find it, it’s in there. CTRL+F a part of it and you’ll locate it somewhere towards the bottom.

    2. Lincoln says:

      Hi Adam,

      Thanks for the update on the location. Just out of interest, was it the Drupal 5 or Drupal 6 version of Drigg that you modified? The reason I ask is that there have been several instances of the code being on a different line from the original post in different Drigg versions.

      Any more info would be great, thanks.

    3. Jabber says:

      It seems the code didn’t get pasted properly due to a formatting and parsing problem. Is there any change u can paste it again?

    4. Jabber says:

      I meant, is there any chance you can paste the proper code again? 🙂

    5. nikdenis says:

      Why such a complicated method? It can be done much more easily by blocking who_voted in robots.txt ( Disallow: /*who_voted$ )

    6. nikdeni says:

      OK, Google is still indexing pages blocked by robots.txt, so a meta nofollow will be much more effective.