GazJ webmaster
Joined: Mar 20, 2007 Posts: 29
Posted: Fri Apr 01, 2011 4:59 pm Post subject: add nofollow and target _blank through a filter function
Not sure I would call it a filter function, but oh well.
Anyway, I was attempting to manipulate the output buffer before it gets sent and soon found drawbacks to this, so I moved on and decided to apply this to my template system instead. I suppose it could be added to the check_html function.
The function itself:
Code:
function playWithHtml($OutputHtml){
    // capture the opening tag (without its closing ">") of every <a ...>...</a> pair
    if(!preg_match_all("/(<a\s[^>]+)>(.*)<\/a>/Usi",$OutputHtml,$Links)){
        return $OutputHtml;
    }
    // array_unique so a tag that appears several times is only rewritten once
    foreach(array_unique($Links[1]) as $OldLinkTag){
        // only touch absolute (http/https) links
        if(!preg_match("/href=[\"']?http/i",$OldLinkTag)){
            continue;
        }
        // rtrim, not trim: the space after "<a" has to stay
        $LinkTag=rtrim($OldLinkTag);
        if(!preg_match("/\starget=/i",$LinkTag)){
            $LinkTag.=' target="_blank"';
        }
        if(!preg_match("/\srel=/i",$LinkTag)){
            $LinkTag.=' rel="nofollow"';
        }
        // replace the whole opening tag so look-alike text elsewhere is left alone
        $OutputHtml = str_replace($OldLinkTag.'>',$LinkTag.'>',$OutputHtml);
    }
    return $OutputHtml;
}
Now apply it to a string:
Code:
$string = playWithHtml($string);
What does it do exactly? It uses regex to find link tags that point offsite (any href starting with http, such as http://somelink.com). If the tag has no target attribute it adds target="_blank", and if it has no rel attribute it adds rel="nofollow".
Just a quick convo starter; any suggestions to improve this would be welcome.
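One possible refinement, as a minimal sketch rather than a drop-in replacement: preg_replace_callback() hands each opening tag to a callback exactly once, so a link that appears several times in the page cannot be rewritten twice. The function name addLinkAttributes is made up for this example, and it still assumes reasonably well-formed markup; a real HTML parser would be more robust than any regex.

```php
<?php
// Hypothetical helper (name invented for this sketch): adds target="_blank"
// and rel="nofollow" to <a> tags whose href starts with http: or https:.
function addLinkAttributes($html)
{
    // \b stops the pattern from matching tags such as <abbr>; every opening
    // <a ...> tag is passed to the callback exactly once.
    return preg_replace_callback('/<a\b([^>]*)>/i', function ($match) {
        $attrs = rtrim($match[1]);
        // Leave relative links and anchors without an absolute href untouched.
        if (!preg_match('/\shref\s*=\s*["\']?https?:/i', $attrs)) {
            return $match[0];
        }
        // Only add an attribute when the tag does not already carry one.
        if (!preg_match('/\starget\s*=/i', $attrs)) {
            $attrs .= ' target="_blank"';
        }
        if (!preg_match('/\srel\s*=/i', $attrs)) {
            $attrs .= ' rel="nofollow"';
        }
        return '<a' . $attrs . '>';
    }, $html);
}
```

It would be called the same way as the original, e.g. $string = addLinkAttributes($string);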
Guardian webmaster
Joined: Dec 25, 2005 Posts: 364 Location: Vsetin, Czech Republic
Posted: Fri Apr 01, 2011 5:34 pm Post subject:
I have seen some simple and also very elaborate ways to do the same thing using JavaScript. The only problem is that the people who devised these efforts forgot one simple fact: generally, spiders don't run JavaScript, so using it to add the nofollow attribute is pointless.
The PHP approach, like your example, is really the only way, though it is also pretty simple to add the nofollow ability to the FCKeditor for links in things like News and comments.
You should also keep in mind that only Google adheres to the nofollow attribute religiously. A few bots like MSN/Slurp crawl the links but don't index them (even though they still count it as an outward link in terms of link juice), but the majority simply ignore it.
If your sole goal is to reduce link dilution, the only really effective way to do it is to hide the link and expose it with JavaScript (so it's visible in a browser) or use PHP to hide the link from specific user agents:
http://www.code-authors.com/modules.php?name=CA_Snips&op=view_snip&sid=11
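As a rough sketch of the "hide the link from specific user agents" idea (this is not the snippet behind the link above): the function names, the short bot list and passing the user agent in as a parameter are all assumptions made for illustration.

```php
<?php
// Hypothetical helpers for illustration; the names and the short bot list
// are assumptions, not part of any existing module.
function looksLikeSpider($userAgent)
{
    $spiders = array('Googlebot', 'Slurp', 'msnbot');
    foreach ($spiders as $spider) {
        // stripos() is a case-insensitive substring search, no regex needed
        if (stripos($userAgent, $spider) !== false) {
            return true;
        }
    }
    return false;
}

function renderOutboundLink($url, $label, $userAgent)
{
    $safeLabel = htmlspecialchars($label, ENT_QUOTES);
    if (looksLikeSpider($userAgent)) {
        // Known crawlers only get the plain label, so no outbound link is exposed
        return $safeLabel;
    }
    // Everyone else gets the normal link
    return '<a href="' . htmlspecialchars($url, ENT_QUOTES)
        . '" target="_blank" rel="nofollow">' . $safeLabel . '</a>';
}
```

In a page it might be called with the request's user agent, falling back to an empty string when the header is missing: renderOutboundLink($url, $label, isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '');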
GazJ
Posted: Fri Apr 01, 2011 6:33 pm Post subject:
Good thinking, I will adjust my code to exclude bots, thanks.
Oh, and the code needs updating, bud, there's an eregi:
Code:
/**
 * @author Guardian
 * @return boolean
 * EXAMPLE USAGE:
 * if(!is_spider()) {
 *     // display hidden content here
 * }
 */
function is_spider(){
    $spiders = array(
        'Googlebot', 'Yammybot', 'Openbot', 'Yahoo', 'Slurp', 'msnbot',
        'ia_archiver', 'Lycos', 'Scooter', 'AltaVista', 'Teoma', 'Gigabot',
        'Googlebot-Mobile'
    );
    // No User-Agent header sent: nothing to match against
    if (empty($_SERVER['HTTP_USER_AGENT'])) {
        return FALSE;
    }
    // Loop through each spider and check if it appears in
    // the User Agent
    foreach ($spiders as $spider)
    {
        // preg_quote() escapes any regex metacharacters in the name
        if (preg_match('/'.preg_quote($spider, '/').'/i', $_SERVER['HTTP_USER_AGENT']))
        { return TRUE; }
    }
    return FALSE;
}
Guardian
Posted: Fri Apr 01, 2011 8:03 pm Post subject:
Good catch, I have updated the snippet.
Thanks also for the PM - I don't even have a clue where that newuser.css file came from lol
GazJ
Posted: Fri Apr 01, 2011 9:57 pm Post subject:
I noticed the notice about spam emails on your registration page. I know you already fixed the issue, but I was searching for a solution to this myself, as my site had over 2000 users and all but a few were, let's say, not right.
Anyway, I found this function for Drupal, modified it slightly and added it to nuke's validate_mail function:
Code:
function user_validate_bogus_email($mail){
    global $nukeurl;
    // http://drupal.org/node/780476#comment-2887028
    // reads the entire file into an array, one list entry per element;
    // the flags strip the trailing newline and skip blank lines
    $spamlist_array = file($nukeurl.'/includes/spamlist.txt', FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
    if ($spamlist_array === false){
        // assumption: if the list cannot be read, let the address through
        return true;
    }
    foreach ($spamlist_array as $value){
        $value = trim($value);
        // fail as soon as any listed entry appears in $mail
        if ($value !== '' && stripos($mail, $value) !== false){
            return false;
        }
    }
    // every line was checked and none matched
    return true;
}
function validate_mail($email) {
if(strlen($email) < 7 || !preg_match("/\w+([-+.]\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*/", $email) ||
user_validate_bogus_email($email) === false) {
// These next 3 lines have been commented out by Raven on 1/14/2007.
// Reason being, this function should only validate the email and return to the calling script.
// The calling script should handle the validation results.
// OpenTable();
// echo _ERRORINVEMAIL;
// CloseTable();
return false;
} else {
return $email;
}
}
Spam list link: http://compuweb.com/url-domain-bl.txt
Any thoughts on this? As of yet it is untested. I will also be redoing the user registration to confuse things a little.
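A substring search can also flag innocent addresses, since a listed domain may just be part of a longer one. One variation, sketched here under the assumption that the list holds one bare domain per line, compares only the part after the @ and requires an exact match. The function name and the $blockedDomains parameter are hypothetical.

```php
<?php
// Hypothetical helper: $blockedDomains is assumed to be an array of bare
// domain names, e.g. loaded with file($path, FILE_IGNORE_NEW_LINES).
function emailDomainIsBlocked($email, array $blockedDomains)
{
    // Use the last "@" so the domain is whatever follows it
    $at = strrpos($email, '@');
    if ($at === false) {
        return false; // no domain part to check
    }
    $domain = strtolower(substr($email, $at + 1));
    // Normalise the list (trim whitespace, lowercase) and flip it into a
    // lookup table keyed by domain for an exact match
    $lookup = array_flip(array_map('strtolower', array_map('trim', $blockedDomains)));
    return isset($lookup[$domain]);
}
```

With this, bob@spam.example is rejected when spam.example is on the list, while bob@notspam.example still passes.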
Guardian
Posted: Fri Apr 01, 2011 11:50 pm Post subject:
It really depends on what you are trying to achieve.
I spent a long time trying to find something to correctly validate an email address for a form builder Class I'm working on, and the consensus seems to be that it is something of a holy grail.
I have not yet seen any single piece of code that validates an email address 100% correctly to the required RFC specifications.
If you are validating to "the address conforms to the RFC specification", the one used in RN is probably the closest, as I know Raven did a lot of research, but it is really heavy on resources due to those regexes.
I was speaking with one of the Facebook guys a few weeks ago about this very thing and he said they use the perfect code (which he wouldn't share), but in actual fact they don't. If you try to change your Facebook email address to a legitimate address with a hyphen in it, it falls over - unless they have fixed it.
What I'm doing at the moment, for the sake of efficiency, is:
Code:
$email = 'you@atyourdomain.com';
if(filter_var($email, FILTER_VALIDATE_EMAIL)) {
// this is valid proceed
}
else {
// filter again with RN function in mainfile.php
validate_email($email);
}
If you are trying to validate it as "this doesn't belong to a spammer", you might want to hang on a couple of weeks for Site Guardian to be released, as I'm building hooks into RNYA to prevent known bad domains from being used for registrations. And I have a LOT of them.
GazJ
Posted: Sat Apr 02, 2011 12:38 am Post subject:
Well, I'm currently working on a new site for myself, so I have time to wait, as cleaning up stock nuke with the latest patch takes a while. Damn ereg's lol
Also I'm removing intval() calls in favour of (int) casts and doing other performance-related stuff, just to help speed things up without the use of a cache. Then it's on to rewriting the Your Account, News and Downloads modules, so there's a lot to do. So yup, I can wait lol
kguske Site Admin
Joined: May 12, 2005 Posts: 876
Posted: Tue Apr 05, 2011 9:55 pm Post subject:
GazJ, regarding my other post referring you to Site Guardian...looks like you already know of it.