Email Protection

Here’s another bit of code to help protect your site from email harvesters and other robots that troll through the HTML source of your site looking for <a href="mailto:…"> tags.

Most current email obfuscation works by disguising the email address, i.e. everything after the mailto: part of the link, in order to hide the @-sign.

I’m guessing, though, that trollers have gotten wise to these tricks and now harvest everything after the characters mailto: up to the closing quote or end of tag, then try a bunch of techniques to decode the characters they found. Converting a string like “&#102;&#111;&#111;&#64; …” into a usable email address doesn’t take much computing power, does it?
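
To show how little effort that decoding takes, here is a minimal sketch using PHP’s built-in html_entity_decode(). The helper name decode_obfuscated_address is hypothetical, and a real harvester would presumably try several decoders, not just this one.

```php
<?php
// Hypothetical helper: what a harvester could do with an entity-encoded address.
function decode_obfuscated_address(string $encoded): string
{
    // html_entity_decode() turns numeric character references such as &#64;
    // back into the characters they stand for ("@" in this case).
    return html_entity_decode($encoded, ENT_QUOTES | ENT_HTML5, 'UTF-8');
}

echo decode_obfuscated_address('&#102;&#111;&#111;&#64;example.com'), "\n"; // prints foo@example.com
```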

However, what about not using mailto: at all, and instead using a PHP script that does an HTTP redirect to a mailto: URI?

Your links could look something like this now:


<a href="/email.php?address=foo@example.com">mail me</a>

Let’s also obfuscate the actual address, in case the harvesters are simply looking for string@string.string combinations of characters:


<a href="/send.php?address=foo/example.com">mail me</a>

Finally, let’s use some simple request-string parsing, and an Apache ForceType setting, to make that even simpler:


<a href="/send/foo/example.com/Subject">mail me</a>

All that’s left to do is write the code for the send PHP script. Oh look … here it is.
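
For readers who just want the gist, here is a minimal sketch of what a send script along these lines might look like. It is an illustration under stated assumptions, not the linked script itself: the function name build_mailto, the 404 response, and the use of FILTER_VALIDATE_EMAIL are all assumptions, and it expects URLs shaped like /send/foo/example.com/Subject.

```php
<?php
// Hypothetical sketch of a "send" script; names and error handling are assumptions.

/**
 * Turn a path such as "/send/foo/example.com/Subject" into a mailto: URI.
 * Returns null when the path is not /send/<user>/<domain>[/<subject>].
 */
function build_mailto(string $path): ?string
{
    $parts = explode('/', trim($path, '/'));
    if (count($parts) < 3 || $parts[0] !== 'send') {
        return null;
    }

    // Rebuild the address from its two halves; the "@" never appears in the page source.
    $address = rawurldecode($parts[1]) . '@' . rawurldecode($parts[2]);

    // Reject anything that is not a plausible address (this also blocks CR/LF injection).
    if (filter_var($address, FILTER_VALIDATE_EMAIL) === false) {
        return null;
    }

    $uri = 'mailto:' . $address;
    if (isset($parts[3]) && $parts[3] !== '') {
        $uri .= '?subject=' . rawurlencode(rawurldecode($parts[3]));
    }
    return $uri;
}

// Only redirect when run by the web server, so the function can be reused elsewhere.
if (PHP_SAPI !== 'cli' && isset($_SERVER['REQUEST_URI'])) {
    $path = parse_url($_SERVER['REQUEST_URI'], PHP_URL_PATH);
    $mailto = is_string($path) ? build_mailto($path) : null;
    if ($mailto === null) {
        http_response_code(404);
        exit('Unknown address');
    }
    header('Location: ' . $mailto, true, 302);
    exit;
}
```

Keeping the parsing in a function separate from the redirect makes it easy to check the URL handling on its own before putting the script on a live site.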

Combine that with the Turing Protection I mentioned before, and you should have a fairly troll-proof site.

Comments and improvements are always welcome.

20 Responses to Email Protection

  1. I’ve been using this technique too, for several months now.

    It seems to me, you and I are right and it kinda works quite well. But one unsolved problem remains: Some browsers (if not all) get redirected to an empty page only containing the header information to open an email dialog. That solves the problem, but I think it isn’t nice to leave my users on an empty page. It should be filled with some content that makes sense.

    So I’m in doubt if I want to continue this kind of “spam protection” or if it would be more useful to use a contact form, as I’ve done before.

    What are your considerations on this topic?

  2. Christian:

    I’m glad to hear that I’m not the only one using this technique.

    The “problem” you are seeing with the blank page (I’m guessing), is that the browser opens up a new blank page before it receives the HTTP header redirecting them to a mailto. The better-behaving browsers wait a bit and, when they see the redirecting header, just do the redirect and don’t open a blank page.

    I don’t know if there is a way around this. I’m guessing the misbehaving browsers are old versions of Netscape and IE, so one consolation is that as people upgrade, they will experience this less and less.

    You could try outputting some text after the header is sent. This is pretty much The Wrong Way to do things, against the HTTP spec, etc., but it may work.

    Let me know how it goes!

  3. Here’s what I have done to hide e-mail addresses:

    I’m using the php version of FormMail.

    In the form I use the following:
    <input type="hidden" name="recipient" value="realname">

    Then at the top of formmail.php I parse the contents of $recipient and replace it with a regular e-mail address:

    if ($recipient == "harold") $recipient = "h.bakker@my.domain.nl";
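
    With more than one recipient, the same idea can be written as a lookup table. This is only a sketch: the RECIPIENTS constant and the resolve_recipient function are hypothetical names, not part of FormMail, and the single entry is the one from the line above.

    ```php
    <?php
    // Hypothetical alias table; only names listed here can receive mail.
    const RECIPIENTS = [
        'harold' => 'h.bakker@my.domain.nl',
    ];

    function resolve_recipient(string $alias): ?string
    {
        // Unknown aliases return null, so the form cannot be pointed at arbitrary addresses.
        return RECIPIENTS[$alias] ?? null;
    }
    ```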

    Works in all browsers.

  4. The problem with forms (IMHO) is that you are limiting the user.

    I don’t like using forms for email. I’d much rather use my (more powerful) mail client. What if I wanted to send you an attachment? Or cc your ISP on what I thought was bad behaviour on your part? Unless you think of all of those scenarios in advance, an HTML form isn’t going to cut it.

  5. > You could try outputting some text after
    > the header is sent. This is pretty much
    > The Wrong Way to do things, against the
    > HTTP spec, etc., but it may work.

    Actually, it’s the right way. Section 10.3.3 of RFC 2616 states:

    “Unless the request method was HEAD, the entity of the response SHOULD contain a short hypertext note with a hyperlink to the new URI.”
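
    A minimal sketch of that advice, assuming a script that already knows the mailto: URI it wants to redirect to. The function names redirect_note and redirect_to_mailto and the wording of the note are hypothetical.

    ```php
    <?php
    // Hypothetical helper: the short hypertext note the RFC recommends.
    function redirect_note(string $mailto): string
    {
        // Escape the URI before placing it in an HTML attribute.
        $safe = htmlspecialchars($mailto, ENT_QUOTES, 'UTF-8');
        return '<p>If your mail program did not open, use <a href="' . $safe . '">this link</a>.</p>';
    }

    // Hypothetical helper: send the redirect header, then the note as the response body.
    function redirect_to_mailto(string $mailto): void
    {
        header('Location: ' . $mailto, true, 302);
        echo redirect_note($mailto);
    }
    ```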

  6. I have tested this redirect method with IE6 and Firefox. IE6 gave me mailto:xxx@xxx.com in address field. :( Firefox is good.

    I tried meta refresh instead of the header and surprise! It’s OK! :)

    I will test soon with Opera! :))

    Thank you Colin! This is a great script! :)

    the html:

    [META HTTP-EQUIV=Refresh CONTENT="0; URL=mailto:xxx@xxx.com"]
    Thank you for passing the Turing test.
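
    Generated from PHP, the meta refresh variant might look like the sketch below. The function name meta_refresh_page and the page skeleton are assumptions; the only part taken from the snippet above is the zero-second refresh to a mailto: URL.

    ```php
    <?php
    // Hypothetical helper: build a page that uses a meta refresh instead of a Location header.
    function meta_refresh_page(string $mailto, string $message): string
    {
        // Escape both values before placing them in HTML.
        $url = htmlspecialchars($mailto, ENT_QUOTES, 'UTF-8');
        $text = htmlspecialchars($message, ENT_QUOTES, 'UTF-8');

        return '<!DOCTYPE html><html><head>'
            . '<meta http-equiv="refresh" content="0; url=' . $url . '">'
            . '<title>Email</title></head><body><p>' . $text . '</p></body></html>';
    }
    ```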

  7. Is the script missing the closing php tag(?>)?

    Can you please include the specifics of getting this script to work with Apache Force-Type?

    Do you have to rename the file without the php extension for it to work?

  8. Glenn:

    The script doesn’t have a closing PHP tag, correct. You actually don’t need them. I’ve been making it a habit of not including them in my code, if there is no HTML afterwards, of course. It has the added benefit of not potentially introducing a few extra newlines or white-space of output at the end of the script—a problem if your script is not outputting text or HTML (e.g. images, HTTP redirection, etc.).

    Anyway, to use ForceType, you just need something like this in your .htaccess or httpd.conf file:

    <Files send>
    ForceType application/x-httpd-php
    </Files>

    And no, you don’t need the .php extension if you don’t want it. That’s what ForceType does: it forces that file to be parsed by PHP, regardless of the name/extension. I chose to keep it extensionless because I think it makes the URL look prettier.

  9. Thanks Colin, it works now. Had to change the header location to an absolute url. For some reason it didn’t like the relative url.

    One last question, is the audio not working on your site? I can’t get it to read me the letters on the image. Windows Media Player opens, but it never plays anything.

    Great script by the way.

  10. Hi Colin,
    I really appreciate your solution and chose it for our new website ethersound.com, but…

    I still get different behaviour depending on the browser; I added a “Back” link at the end of the send.php code and it gives me the following results:

    in Mozilla, the page remains the same and a new e-mail client window appears (perfect)
    in Firefox, the page displays the back link and a new e-mail client window appears ( close to perfect)
    in IE, a new e-mail client window appears but the page is blank with no back link.

    Any idea to fix this?

    Thank you
    Jocelyn

  11. Hi,

    For a long time I have not checked your site :-) But still, every time I come back, I find new stuff that I like!

    I wanted to code some e-mail protection for myself that puts a random number in front of the @. When they come to parse, they will always get a different number. So I can catch the spammer’s IP.

    But i like yours more :-)

    Thx

    Tom

  12. I wrote a script very similar to yours in the form of:

    [a href="/email.php?name=john"]mail me[/a]

    And in email.php there’s (simplified for demonstration purposes)

    [?php header("Location: mailto:".$_GET['name']."@domain.com");?]

    The problem is: won’t the e-mail-bots get back the “mailto:”-location too and simply include the address in their database?

    Thanks!
    Jakob