Here’s another bit of code to help protect your site from email harvesters and other robots that troll through the HTML source of your site looking for <a href="mailto:..."> tags.
Most current email obfuscation works by disguising the email address, i.e. everything after the mailto: part of the link, in order to hide the @-sign.
I’m guessing, though, that trollers have gotten wise to these tricks and will now harvest everything after the characters mailto: up to the closing quote or end-of-tag, then try a bunch of techniques to decode the characters they found. Converting a string like “&#102;&#111;&#111;&#64; ...” into a useable email address doesn’t take much computing power, does it?
However, what about not using mailto: at all, and instead using a PHP script that does an HTTP redirect to a mailto: URI?
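To illustrate how little work that decoding takes, here is a minimal sketch using PHP’s built-in html_entity_decode(). The sample string is simply the numeric-entity encoding of foo@; how a real harvester does this is not known and may well differ.

```php
<?php
// Numeric character references for "f", "o", "o" and "@".
$obfuscated = '&#102;&#111;&#111;&#64;';

// A single built-in call turns the entities back into plain characters.
$decoded = html_entity_decode($obfuscated, ENT_QUOTES | ENT_HTML401, 'UTF-8');

echo $decoded, "\n";
```

One function call is enough, which is why entity encoding on its own is probably weak protection.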
Your links could look something like this now:
<a href="/send.php?address=foo@example.com">mail me</a>
Let’s also obfuscate the actual address, in case the harvesters are simply looking for string@string.string combinations of characters:
<a href="/send.php?address=foo/example.com">mail me</a>
Finally, let’s use some simple request-string parsing and an Apache ForceType directive to make that even simpler:
<a href="/send/foo/example.com/Subject">mail me</a>
All that’s left to do is write the code for the send PHP script. Oh look … here it is.
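For readers without the linked script to hand, the following is a rough sketch of what such a send script might look like. It is not the author’s code. The function name build_mailto_uri() is hypothetical, and the sketch assumes the /send/user/domain/Subject URL layout shown above, with the request path available in $_SERVER['REQUEST_URI'].

```php
<?php
/**
 * Hypothetical helper: turn a path such as "/send/foo/example.com/Subject"
 * into a mailto: URI, or return null if the path does not look valid.
 */
function build_mailto_uri(string $path): ?string
{
    // Split the path, drop empty segments, then drop the script name.
    $parts = array_values(array_filter(explode('/', $path), 'strlen'));
    if (count($parts) > 0 && $parts[0] === 'send') {
        array_shift($parts);
    }
    if (count($parts) < 2) {
        return null;
    }

    $user   = rawurldecode($parts[0]);
    $domain = rawurldecode($parts[1]);

    // Reject anything that is not a plausible address. This also blocks
    // CR/LF characters that could be used for header injection.
    if (filter_var($user . '@' . $domain, FILTER_VALIDATE_EMAIL) === false) {
        return null;
    }

    $uri = 'mailto:' . rawurlencode($user) . '@' . $domain;
    if (isset($parts[2])) {
        // Normalise the optional subject so it is safely percent-encoded.
        $uri .= '?subject=' . rawurlencode(rawurldecode($parts[2]));
    }
    return $uri;
}

// Only send headers when running under a web server, not on the command line.
if (PHP_SAPI !== 'cli') {
    $path = parse_url($_SERVER['REQUEST_URI'] ?? '', PHP_URL_PATH);
    $uri  = build_mailto_uri(is_string($path) ? $path : '');
    if ($uri === null) {
        http_response_code(404);
        exit('Unknown address.');
    }
    header('Location: ' . $uri);
    exit;
}
```

Validating the reassembled address before putting it into the Location header is a design choice of this sketch: without it, the script would redirect to whatever string a visitor supplies.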
Combine that with the Turing Protection I mentioned before, and you should have a fairly troll-proof site.
Comments and improvements are always welcome.
Copyright © 2000-2010 Colin Viebrock • All Rights Reserved
It seems to me that we are both right and that it works quite well. But one unsolved problem remains: some browsers (if not all) get redirected to an empty page that contains only the header information to open an email dialog. That solves the problem, but I don’t think it’s nice to leave my users on an empty page. It should be filled with some content that makes sense.
So I’m not sure whether I want to continue with this kind of “spam protection” or whether it would be more useful to use a contact form, as I’ve done before.
What are your thoughts on this topic?
14 April 2004, 04:04 • PermaLink
15 April 2004, 04:08 • PermaLink
I’m glad to hear that I’m not the only one using this technique.
The “problem” you are seeing with the blank page (I’m guessing) is that the browser opens a new blank page before it receives the HTTP header redirecting it to a mailto: URI. The better-behaved browsers wait a bit and, when they see the redirect header, just follow the redirect without opening a blank page.
I don’t know if there is a way around this. I’m guessing the misbehaving browsers are old versions of Netscape and IE, so one consolation is that as people upgrade, they will experience this less and less.
You could try outputting some text after the header is sent. This is pretty much The Wrong Way to do things, against the HTTP spec, etc., but it may work.
Let me know how it goes!
15 April 2004, 06:29 • PermaLink
I’m using the PHP version of FormMail.
In the form I use the following:
<input type="hidden" name="recipient" value="realname">
Then at the top of formmail.php I parse the contents of $recipient and replace it with a regular e-mail address:
if ($recipient == "harold") $recipient = "h.bakker@my.domain.nl";
Works in all browsers.
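The same idea can be sketched as a lookup table, which may scale better than a chain of if statements. The constant RECIPIENT_ALIASES and the function resolve_recipient() are hypothetical names and not part of FormMail. The sketch also assumes the alias arrives as a string; on current PHP versions that would usually be read from $_POST rather than from a global $recipient.

```php
<?php
// Hypothetical alias table: the form only ever exposes the key,
// never the real address.
const RECIPIENT_ALIASES = [
    'harold' => 'h.bakker@my.domain.nl',
];

/**
 * Return the real address for a known alias, or null for anything else,
 * so that arbitrary addresses submitted through the form are rejected.
 */
function resolve_recipient(string $alias): ?string
{
    return RECIPIENT_ALIASES[$alias] ?? null;
}

// Example use with a value that would normally come from the form.
$recipient = resolve_recipient('harold');
```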
19 April 2004, 15:43 • PermaLink
I don’t like using forms for email. I’d much rather use my (more powerful) mail client. What if I wanted to send you an attachment? Or cc your ISP on what I thought was bad behaviour on your part? Unless you think of all of those scenarios in advance, an HTML form isn’t going to cut it.
20 April 2004, 17:48 • PermaLink
29 April 2004, 15:29 • PermaLink
> You could try outputting some text after
> the header is sent. This is pretty much
> The Wrong Way to do things, against the
> HTTP spec, etc., but it may work.
Actually, it’s the right way. Section 10.3.3 of RFC 2616 states:
“Unless the request method was HEAD, the entity of the response SHOULD contain a short hypertext note with a hyperlink to the new URI.”
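As a rough illustration of that recommendation, the sketch below sends the redirect header and then a short note linking to the same URI. The helper name redirect_note(), the wording of the note, and the placeholder address are all assumptions made for the example.

```php
<?php
/**
 * Hypothetical helper: build a short HTML note that links to the
 * redirect target, for browsers that show the response body.
 */
function redirect_note(string $uri): string
{
    // Escape the URI so it is safe inside an HTML attribute.
    $safe = htmlspecialchars($uri, ENT_QUOTES, 'UTF-8');
    return '<p>Your mail program should open shortly. If it does not, '
         . '<a href="' . $safe . '">click here to send the email</a>.</p>';
}

// Only send headers when running under a web server, not on the command line.
if (PHP_SAPI !== 'cli') {
    $uri = 'mailto:foo@example.com'; // placeholder address
    header('Location: ' . $uri);     // PHP sets a 302 status by default here
    echo redirect_note($uri);
}
```

Whether a given browser displays this body or leaves the previous page in place still seems to vary, so this may reduce the blank-page problem without removing it.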
29 April 2004, 21:40 • PermaLink
20 May 2004, 12:14 • PermaLink
I tried meta refresh instead of the header and surprise! It’s OK! :)
I will test soon with Opera! :))
Thank you Colin! This is a great script! :)
The HTML:
[META HTTP-EQUIV=Refresh CONTENT="0; URL=mailto:xxx@xxx.com"]
Thank you for pass the Turing test.
15 October 2004, 12:03 • PermaLink
Can you please include the specifics of getting this script to work with Apache ForceType?
Do you have to rename the file without the php extension for it to work?
16 November 2004, 21:05 • PermaLink
The script doesn’t have a closing PHP tag, correct. You actually don’t need them. I’ve been making it a habit of not including them in my code, if there is no HTML afterwards, of course. It has the added benefit of not potentially introducing a few extra newlines or white-space of output at the end of the script—a problem if your script is not outputting text or HTML (e.g. images, HTTP redirection, etc.).
Anyway, to use ForceType, you just need something like this in your .htaccess or httpd.conf file:
<Files send>
ForceType application/x-httpd-php
</Files>
And no, you don’t need the .php extension if you don’t want it. That’s what ForceType does: it forces that file to be parsed by PHP, regardless of the name or extension. I chose to keep it extensionless because I think it makes the URL look prettier.
16 November 2004, 22:53 • PermaLink
One last question: is the audio not working on your site? I can’t get it to read me the letters on the image. Windows Media Player opens, but it never plays anything.
Great script by the way.
17 November 2004, 10:00 • PermaLink
I really appreciate your solution and chose it for our new website ethersound.com, but…
I still get different behaviour depending on the browser. I added a Back link at the end of the send.php code and it gives me the following results:
in Mozilla, the page remains the same and a new e-mail client window appears (perfect)
in Firefox, the page displays the back link and a new e-mail client window appears (close to perfect)
in IE, a new e-mail client window appears but the page is blank with no back link.
Any idea how to fix this?
Thank you
Jocelyn
10 January 2005, 06:04 • PermaLink
25 February 2005, 03:55 • PermaLink
Not anymore I won’t … read the latest article.
25 February 2005, 10:56 • PermaLink
1 March 2005, 05:37 • PermaLink
I haven’t checked your site for a long time :-) But still, every time I come back, I find new stuff that I like!
I wanted to code some e-mail protection for myself that puts a random number in front of the @. Each time a harvester parses the page, it gets a different number, so I can catch the spammer’s IP.
But I like yours more :-)
Thx
Tom
19 August 2005, 10:41 • PermaLink
9 September 2005, 23:45 • PermaLink
[a href="/email.php?name=john"]mail me[/a]
And in email.php there’s (simplified for demonstration purposes):
[?php header("Location: mailto:".$_GET['name']."@domain.com");?]
The problem is: won’t the e-mail-bots get back the “mailto:”-location too and simply include the address in their database?
Thanks!
Jakob
24 October 2005, 07:33 • PermaLink
Quite possibly. But, by using a PHP script instead of a simple mailto link, you can add other protective features, like the Turing test I have in place on this site.
24 October 2005, 09:07 • PermaLink