How To Phish, Protect Your Email, and Defeat Copy-And-Paste with CSS
It’s not often that you learn something from spam, besides that there are an extraordinary number of generous Nigerians (replete with theme song) and an amazing number of variations in the spelling of Viagra. Yet I recently got a piece of spam whose offer was written in pristine English: no numbers replacing letters, no images, and no misspellings. How had such a brazen piece of spam gotten through my filters? The answer, it turns out, was some clever CSS that left the HTML markup garbled but its visual rendering readable. I’ll show you how to use this for both good and evil.
The Good
Obfuscating HTML with CSS lets you protect your email address in a quick-as-you-type way. No need for unwieldy inline images or arcane JavaScript functions. For example, here’s my email address:
azSPAMa@masREMOVEsivehealth.com
This email address cannot be copied and pasted. Try it.
Notice that it appears to be in normal, selectable text. Don’t be fooled by appearances. Copy and paste it. You’ll get the following: azSPAMa@masREMOVEsivehealth.com, which is easy for a human to fix but difficult for a bot. How does it work? Here’s the sample code:
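<style>
/* Decoy text: stays in the markup and the clipboard, but is invisible on screen */
.z{
float:right;
font-size:.001px;
color:transparent;
display:inline-block;
width:0px;
}
</style>
...
<!-- class "y" has no rule; it wraps real text so that stripping every span breaks the address -->
az<span class="z">SPAM</span>a@mas<span class="z">
REMOVETHIS</span>sive<span class="y">health.c</span>om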
This takes anything marked with the class z and visually hides it, leaving only the email address rendered as if nothing were amiss. When you go to copy my email address, however, the browser doesn’t treat the garbage text as hidden, so it gets selected along with my email address. This is the same way that your DNA protects itself, by hiding the real information in a huge amount of noise. It would be difficult for a scraper to even know that there was an email address there, let alone implement a program that understands CSS well enough to parse it out: there is a practically unlimited number of ways to use CSS to make text visually disappear. If you wanted to get really tricky, you could add multiple classes to each little span; because CSS cascades and inherits, it becomes very hard to know what the text will end up saying without just looking at the result.
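As a rough illustration of the idea above, here is a minimal Python sketch that generates this kind of markup. It assumes the page already defines the `.z` rule from the sample; the address `aza@example.com`, the decoy word list, and the function names are all hypothetical. `copied_text` only approximates what a clipboard or a tag-stripping scraper would see, and `visible_text` only works because it knows which class is the decoy.

```python
import html
import random
import re

# Hypothetical filler words; any text a human can spot and delete would do.
DECOYS = ["SPAM", "REMOVE", "NOTME", "JUNK"]


def obfuscate(address, pieces=3, seed=None):
    """Return HTML that renders as `address` but carries hidden decoy spans."""
    rng = random.Random(seed)
    # Choose distinct cut points strictly inside the address.
    cuts = sorted(rng.sample(range(1, len(address)), pieces))
    out, last = [], 0
    for cut in cuts:
        out.append(html.escape(address[last:cut]))
        out.append('<span class="z">%s</span>' % rng.choice(DECOYS))
        last = cut
    out.append(html.escape(address[last:]))
    return "".join(out)


def copied_text(markup):
    """Approximate the clipboard (or a naive scraper): drop tags, keep all text."""
    return html.unescape(re.sub(r"<[^>]+>", "", markup))


def visible_text(markup):
    """Approximate the rendering: drop the decoy spans together with their text."""
    return html.unescape(re.sub(r'<span class="z">.*?</span>', "", markup))


if __name__ == "__main__":
    markup = obfuscate("aza@example.com", seed=7)
    print(markup)
    print("copied: ", copied_text(markup))
    print("visible:", visible_text(markup))
```

Because the cut points and decoys come from a random generator, each page load can emit different markup for the same address.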
Foiling spammers with a bit of their own medicine brings a smile to my lips.
The Bad & Evil
Besides letting spammers get around your Bayesian spam filters, this trick can be used for other bad things. The first is that misguided holy grail of publishers: copy-protection for their words. A publisher could generate, on the server side, a new random mess of HTML and CSS that renders their text uncopyable. This also has the side effect of making the pages very hard for search engines to index sensibly; it’s an easy way to keep your information human-readable but cloaked from Google’s all-seeing Sauronic eye.
Here’s a simplified example of how a publisher might use this:
Try copying and pasting this. It will only give you garbage. Then look at the source. It’s pretty unintelligible. Note that this is a quickly-coded example and could be made much harder to reverse engineer.
mTheIismM trpexsmt aEiscy uexnc nopiNyanxblaie.vk KnAeiLAthev Roaup zerant dwMutnrde.oc‘mh eswaaIs svbo LrnsB inyn .MehMlbilou Crnnae wGineo 1gx93 z1.oC Advt vothlae pBtiavmefI, rGhi rs fkfaizthsherld, nGKetxitarh MMuourdlyoc rh,es wdkasle avE rryegdlioblna ml iFnelzwsTEpaaqpe ar eomaezgnrvathbe robaBfseasd cmou yt aIof P MizelrwbopIursOneih a vndhO ayNs rIa Hrel.suoPltun tiphegt ftnameGilrLy kvwaozs lMwedfalevthtMy. n Reeupeuerost fwa hs oAgrmdooomme Ed npbyul hisis F fhwattFhecHr uzfrdAombM a Jn eneaiqrleOy eMag ze,yp apwndlx w.venest msofrof sptoaL scatueudy b piGhinploaasovqphnFy,Ld peNolotit Micads dkant d etec‘LoneHomazicsas natsp Onyxf .orhrd ieUn civnPerwksieytygF i vn owEndnglvcanlcd.p
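The server-side generation described above might look roughly like the following Python sketch. It is not the code behind the demo: the function names, the `decoy_rate` parameter, and the sample sentence are assumptions, and it relies on a `.z` rule like the one in the email example to hide the decoys.

```python
import html
import random
import re
import string


def scramble(text, seed=None, decoy_rate=0.5):
    """Follow each real character, with probability `decoy_rate`, by a hidden decoy letter."""
    rng = random.Random(seed)
    parts = []
    for ch in text:
        parts.append(html.escape(ch))
        if rng.random() < decoy_rate:
            decoy = rng.choice(string.ascii_letters)
            parts.append('<span class="z">%s</span>' % decoy)
    return "".join(parts)


def copied(markup):
    """Text with tags removed but decoys kept: roughly what a paste would contain."""
    return html.unescape(re.sub(r"<[^>]+>", "", markup))


def rendered(markup):
    """Text with the decoy spans removed: roughly what the reader sees."""
    return html.unescape(re.sub(r'<span class="z">[^<]*</span>', "", markup))


if __name__ == "__main__":
    page = scramble("This text is uncopyable.", seed=1)
    print(copied(page))    # garbage
    print(rendered(page))  # the original sentence
```

Passing a fresh seed per request gives every visitor a different mess of markup, which is what makes a one-off cleanup script less useful. The cost is the one raised in the comments: one extra span per decoy character.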
The other evil thing this could be used for is phishing attacks. Sometimes you run across an unlinked URL on a web page. To go there, you copy and paste it into your location bar. In fact, we are often told that the only way to really trust a website’s location is to put it there yourself. Try copying and pasting this URL into your location bar:
http://facebook.com.evil.com
This URL will phish you if copied and pasted into the location bar.
You think you copied one URL, but what you paste takes you to an entirely different, evil URL. This particular example is easy to detect, for pedagogical reasons, but you can use all of the standard phishing methods of disguising a URL to make it look more legitimate.
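To make the gap between what is shown and what is copied concrete, here is a small Python sketch that walks a fragment of markup and collects both versions of the text. The markup in the usage example is a guess at how such a demo could be built, not the page’s actual source, and the parser assumes it is told which classes hide text. A real page can hide text in many other ways, so treat this as an illustration rather than a detector.

```python
from html.parser import HTMLParser

# Elements that never get a closing tag, so they must not be pushed on the stack.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta", "wbr"}


class CopyVersusVisible(HTMLParser):
    """Collect all text (what gets copied) and non-hidden text (what gets shown)."""

    def __init__(self, hidden_classes):
        super().__init__(convert_charrefs=True)
        self.hidden_classes = set(hidden_classes)
        self.open_hidden = []  # one flag per open element: is it a hidden one?
        self.copied = []
        self.visible = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        self.open_hidden.append(bool(self.hidden_classes & set(classes)))

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self.open_hidden:
            self.open_hidden.pop()

    def handle_data(self, data):
        self.copied.append(data)
        # Text is visible only if no enclosing element is hidden.
        if not any(self.open_hidden):
            self.visible.append(data)


def compare(markup, hidden_classes=("z",)):
    """Return (copied_text, visible_text) for a fragment of HTML."""
    parser = CopyVersusVisible(hidden_classes)
    parser.feed(markup)
    parser.close()
    return "".join(parser.copied), "".join(parser.visible)


if __name__ == "__main__":
    copied, visible = compare('http://facebook.com<span class="z">.evil.com</span>')
    print("you see:  ", visible)
    print("you paste:", copied)
```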
Brainstorm
I’m sure there are other good and bad things you can do with this technique; these are just the ones I came up with in a couple of minutes. What are your ideas?
Sridhar Ratnakumar
When the CSS is not used for rendering HTML (feed readers like Google Reader ignore the site css), your original email is never shown correctly.
One simple solution is to embed your email address in a TABLE. LiveJournal uses this for user profile pages. No CSS dependency. For example, view the HTML source of this page to see how the email address is embedded in a TABLE – http://lustymonk.livejournal.com/profile
nakliyat
Yes, Google Reader is actually nice, but it is confusing.
Simon
Same remark here. I was reading your post in Google Reader and missed the point, you need the CSS. Does an RSS reader strip inline CSS?
nakliyat
I use RSS too, but I have never seen any benefit from it.
TrueChoice
There are two other CSS methods mentioned at
http://en.wikipedia.org/wiki/Address_munging#Alternatives
The text mini-logo one would still be readable if the inline css wasn’t applied. It’d just turn into a big-arse logo :)
George
I have seen script such as the following work well.
<!–
x5h=’‘,l6m,’@',g8x,
”); // –>
Blair McBride
I was about to post about the possibility of a bot simply stripping anything enclosed in additional tags.
But I now see you’ve already done what I was about to suggest: add an additional tag for a valid piece of the address (in the example the “a.c”).
I’d like to see more of these types of techniques on the net, rather than some of the horrible ways some sites use.
George
this might work this time
<!–
x5h=’‘,l6m,’@',g8x,
”); // –>
Stripped out the script tags.
Mossop
You are placing a lot of faith in people actually noticing that the email has changed when they paste it into their mail. I know I wouldn’t
Kevin
Hi,
I’m using Safari 4 Beta and when I’m copypasting your e-mail, I actually get the real one.
Would a Webkit-using spam robot get the same ?
Mansoor Ahmed
I have the exact same query Kevin! :)
Aza Raskin
Good catch—I’ve changed the CSS so that it works in Safari, Chrome, Opera, and Firefox.
Dave
Works too well… When this post shows up on planet.mozilla.org the example shows as azspama@moremovethiszilla.com both times. I was confused on first read. ;)
Kuno
Same remark here. Did you try to read your post at planet.mozilla.org?
[ICR]
The good thing is that it degrades fairly gracefully. Yes, if you don’t have CSS it will show the obfuscated one, but (if you don’t mangle it quite as much as Aza) it’s not really terribly different from what hundreds of people do already.
You could do something interesting like:
.z{ position:absolute; visibility: hidden; }
#username::after { content: “@”; }
#domain::after { content: “.”; }
andrewjanuary at gmail dot com
Though that doesn’t work in most older browsers.
As someone already said, it does rely on people realising it’s changed, so I think it’s probably safer to just list the email with the obfuscation in place.
[ICR]
Ack, when will people realise eating tags is bad!
Hopefully this will work:
<style>
.z{ position:absolute; visibility: hidden; }
#username::after { content: "@"; }
#domain::after { content: "."; }
</style>
<span id="username">andrewjanuary</span><span class="z"> at </span><span id="domain">gmail</span><span class="z"> dot </span>com
James Heaver
How well will screen readers and accessibility devices handle the above?
Also, could the above technique be used to produce captchas? Mechanical turk techniques have made captchas easy to beat, but perhaps a technique like this could make it more difficult
There isn’t a single image to pick up and pass to a human; you would either have to identify the bit of text to display, or pass the whole page. With typing captchas being such a quick process, perhaps this could increase the cost by a factor of four or five, as the human has to break the repetitive routine to search the page for the captcha.
I don’t know CSS at all, but could a similar technique be used with traditional captcha images as well, displaying a real image along with a number of coded, but hidden, fake captchas?
voracity
These guys are implementing “DRM” using a slightly different technique:
http://www.misaustralia.com/viewer.aspx?EDP://20080708000020876768&magsection=news-headlines-list&portal=_misnews&section=news&title=Marriage+made+in+customer+heaven&source=/_xmlfeeds/mis/news/feed.xml
If you try to copy and paste, you’ll only get every second letter. They have to use fixed-width fonts, though, and it’s easier to defeat.
Dao
Not particularly user friendly …
I prefer this:
<a>foo at bar.com</a>
respectively:
<a>contact me (foo at bar.com)</a>
… and then fix it up with JS:
http://phpfi.com/330174
Dao
Btw, when using CSS, I think you want display:none rather than position:absolute;left:-100px;.
Aza Raskin
@Sridhar et al: Bleck. I forgot entirely about feed readers. It would certainly be nice if they didn’t strip tags. I like the TABLE trick, but it seems to be attackable simply by stripping the tags, which I think is one of the more common ways of writing a scraper.
@Simon, Dan: I normally put the garbage characters in all caps, which helps readability. Or I’ll put in other delimiter letters. That way it looks like azSPAMa@moREMOVETHISzilla.com. I think I’ll update the article to put it back this way. I just didn’t want to get comments saying that a spammer could remove any all-caps letters…
@ICR: That’s a wonderful use of CSS!
@Dao: The problem with foo at bar.com style email address obfuscation is that it is trivial to write a regexp to find and scrape the email address. In fact, I would assume that because that method is so popular, scrapers have long since gotten wise. And although the user interaction isn’t great when copying, visually it looks perfect on-page.
@James: Good question. I think the reader would probably read it out, so it won’t sound perfect, but a human should be able to figure it out. It’s at least better than an image.
@Kevin: That’s great from a user perspective. I was trying to find some magic CSS to make that happen in Firefox, but couldn’t manage it. It wouldn’t be practical to write a WebKit scraper, because it would be very difficult for a script to know visually where the email address was.
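To back up the regexp point in the reply to Dao above, here is a minimal Python sketch of how a scraper might undo "foo at bar.com" style obfuscation. The patterns and the `harvest` name are illustrative assumptions, not taken from any real harvester, and the approach will happily produce false positives on ordinary prose that contains both " at " and " dot ".

```python
import re

# Common spelled-out separators, with optional brackets: at, (at), [at], dot, ...
AT = re.compile(r"\s+(?:at|\(at\)|\[at\])\s+", re.IGNORECASE)
DOT = re.compile(r"\s+(?:dot|\(dot\)|\[dot\])\s+", re.IGNORECASE)
# A deliberately loose email pattern; real address syntax is more permissive.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def harvest(text):
    """Rewrite spelled-out separators, then pull out anything that looks like an address."""
    normalized = DOT.sub(".", AT.sub("@", text))
    return EMAIL.findall(normalized)


if __name__ == "__main__":
    print(harvest("contact me (foo at bar.com)"))
    print(harvest("kourge AT gmail DOT com"))
    # With tags stripped, the CSS-obfuscated address is found, but it is the wrong one.
    print(harvest("azSPAMa@masREMOVEsivehealth.com"))
```

The last call shows the trade-off being argued in this thread: the scraper still walks away with something that looks like an address, just not a deliverable one.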
kourge
My normal approach would be something like:
kourge AT gmail DOT com
And then use JavaScript to replace the AT and DOT with the correct characters. Not only are users able to copy and paste correctly, if the user is using a screen reader, the reader would read it out loud correctly as well.
kourge
Oops, I guess those tags got swallowed up. I mean to wrap the text with a span tag whose class is “email”.
elliottcable
I just don’t even bother.
Hey, spammers! Here’s my e-mail address!
azarask.in-spam@elliottcable.com
I haven’t gotten more than one spam mail to any given domain in 6 years – instead of removing/moving to a folder I ignore, my scripts flag mail they think is spam, so I can decide whether or not to blacklist a particular address (and notify the owner of the relevant domain that their site is somehow leaking e-mail addresses).
Wildcards are the shit d-;
NICCAI
The table solution would cripple a screen reader. Also, what about making it clickable? I guess you could add some js to do that using your classes as selectors, but it seems like overkill when you could just use the js to begin with. I also throw in a vote to @Mossop’s comment regarding cut and paste – I see a lot of bounced emails with this solution.
All of this said, the value here is in the different approach. It will hopefully lead to more thought in using css for obfuscation or machine blurring.
Vijay Chakravarthy
Thinking about this from the flip side –
This could be an interesting way for spammers to build messages that are human readable, but get past the bayesian filters.
jiimiona
Viva La Evolucion ;)
Jack
Hi,
I am using
http://www.mobilefish.com/services/hideemail/hideemail.php to protect my email address against spam bots.
This site also contains other useful tools.
DR
Stuart Langridge (of LugRadio fame) mentioned something similar: see http://www.kryogenix.org/days/2008/08/21/readable-non-harvestable-email-addresses-with-css .
tekkie
CSS obfuscation must be the most inconvenient way to do it. Obfuscation based on HTML entities is slightly better, but still not the best available option, as you can also confirm from this chart.
Mac OS X Dashboard widget called obfuscatr provides JavaScript encoding, which is more convenient (for both ends) than with CSS. The other possible option is just plain hexadecimal encoding of your email addy, involving above mentioned HTML entities. So 2 alternatives available from obfuscatr. See the details at flash tekkie.
obfuscatr was also featured in MacWorld Italy of March 2008.
Manmohanjit Singh
Never thought of that, heh. Good idea.
Littlebtc
Using hidden text to make text uncopyable is already common on Chinese forums.
This feature was included in Discuz, a popular forum system, five years ago.
sep332
> Yet, I recently got spam where offers was written in pristine English:
Irony!
Aza Raskin
Very ironic. And fixed now :)
Mike Smith
It’s probably not going to show up correctly in this comment, but I like the idea of using the unicode mirroring character (U+202E) in front of your email, reversed.
moc.allizom@aza
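A small Python sketch of the mirroring idea, for anyone who wants to generate such a string: the address is stored reversed and prefixed with U+202E so that a bidi-aware renderer draws it the right way round. The closing U+202C (POP DIRECTIONAL FORMATTING) is an addition that is not part of the suggestion above; it is there so the override does not spill into the text that follows. The function names are hypothetical.

```python
RLO = "\u202e"  # U+202E RIGHT-TO-LEFT OVERRIDE
PDF = "\u202c"  # U+202C POP DIRECTIONAL FORMATTING (added here to end the override)


def mirror(address):
    """Store the address backwards, wrapped so that it is displayed forwards."""
    return RLO + address[::-1] + PDF


def unmirror(stored):
    """What a scraper would have to do to recover the real address."""
    return stored.strip(RLO + PDF)[::-1]


if __name__ == "__main__":
    stored = mirror("aza@mozilla.com")
    print(repr(stored))
    print(unmirror(stored))
```

Undoing it is a one-liner once a scraper knows to look for the control character, so this mostly helps against harvesters that are not looking for it.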
Aza Raskin
That is very cool. I didn’t even know unicode had a mirroring character.
Alix Axel
Very cool indeed. Too bad this has problems with copy & paste and needs to be enclosed in specific HTML tags.
Alix Axel
*Problems with copy and pasting with specific applications, EditPlus for instance will paste the reversed email, while on Notepad and Firefox the email will be pasted correctly but also the control char – which makes navigating the text with the keyboard arrows very difficult.
Abdulla
yeah but I don’t think it will work in most cases because of the naming issue..
I have read about this last week from trend micro when they talked about SASFIS Malware
http://blog.trendmicro.com/sasfis-malware-uses-a-new-trick/
Mike Smith
Oh, sweet. It worked. Try copying the above email address.
Matthijs
When posting essays online, use the CSS to prevent plagiarism.
Tony Mechelynck
Without the need to use the U+202E control character, wouldn’t
<span dir="ltr">moc.allizom@aza</span>
do the same?
Aza Raskin
It would, but it would be easy for spammers to modify their scripts to nab your email address. The CSS trick relies on the difficulty of processing CSS rules and their visual effects.
Alix Axel
LTR? Don’t you mean RTL? Anyway, even RTL doesn’t seem to work for me.
cc
Great, so the technology to prevent web pages from being machine-readable is already available. I guess firefoxen’s copy&paste will resort to on-screen OCR by the time such schemes become common as ‘copyright-protection’.
Lugovskoy
Wow, so simple and useful!
Wladimir Palant
Actually, I wondered a while ago why Gecko would copy invisible symbols in a selection. It seems to do more harm than good, ignoring any symbols that don’t have any dimensions sounds like a viable idea – and Kevin’s comment sounds like Webkit already does exactly that.
Btw, any scraper using a real rendering engine (with DOM access and everything) can easily get around your trick – exactly by ignoring the DOM nodes that have no dimensions and only extracting the text from the other nodes.
PS: Is it intentional that your blog displays no dates whatsoever? When I saw 40 comments I tried to figure out whether this is really an old post that only bubbled up due to a minor modification – no luck…
Wladimir Palant
PPS: Ok, apparently this is intentional – I found the dates commented out in the source code. And this post is ancient as I suspected…
Aza Raskin
It is intentional—I feel that a lot of these blog entries are more akin to articles than time-limited pieces. This post, for example, was based on a thought from 2008 which I then entirely rewrote and extended dramatically.
kl
Opera 10.6 and Safari 5 are immune to this, so this clipboard hack seems more like a Firefox bug.
Aza Raskin
I’ve tested with Safari 4, Chrome, and Firefox 3.6. I haven’t tested with Opera or Safari 5. I’m sure slight tweaks to the CSS would fix the problem.
Adam A
I faintly remember reading a blog post about someone where a spam mail passed his filters using css tricks. He had a clever idea of using it against spam harvesters. Hmm.. dejavu?
Robert Accettura
I’ve done something similar, but found that using JS to print your email is just as effective at deterring spam bots, and as a bonus can leave the email easy to copy paste (I don’t want to make people’s lives hard).
Even more interesting is to just list your email on pages that use SSL. I’ve done that as well now. Since I force my contact page over SSL I’ve yet to get spam go through my contact form, I’m pretty sure no bots scrape it either. I still use JS to print my email address though.
Dan
Hey Aza, on a completely unrelated note, Are you involved with the Enzo zenPad (http://www.enso-now.com/), or did they steal your logo?
Wo0T
Hello,
I’ve made a little python script to generate the code.
You can view it here:
http://dpaste.com/210601/
Zayıflama Lida Fx15 Ve Biber Hapı Zlfvbh
I’ve done something similar, but found that using JS to print your email is just as effective at deterring spam bots, and as a bonus can leave the email easy to copy paste (I don’t want to make people’s lives hard).
Constantine c69
spambot detected ;)
Debayan Gupta
I’ve been using a combination of js and css to hide my email – I usually use a variation of what you’ve mentioned here. Say,
john dot smith @ xyz dot com
Here, I put the “dot”s inside spans, resize them, and give them black backgrounds to make them look like dots – that way, a human who copies my address ends up with understandable text.
You can even reorder the letters using css, instead of hiding them – just use letter-spacing with a normal span and a floated one (so that they overlap – you could use multiple z-indexes or something if you’re feeling particularly vindictive).
I’ve also experimented with using different fonts and languages (“Foiling spammers with a bit of their own medicine brings a smile to my lips.” – absolutely!). The unicode mirror character is also particularly useful.
Constantine c69
Bot evasion, copy prevention, and gray-hat SEO of varying shades of gray: those are the main applications of this broad family of CSS tricks.
ps: You forgot pixel.gif, styled with width, height and background.jpg
sildur
Any CSS-based DRM can be easily defeated by exporting the web page to PDF and copying the resulting text.
mido
thank you bro very nice post
Clément F
Great to use against gen-pop.
Meanwhile, easy to bypass.
With developer tools, or directly in the source, change the display property to none for the
.z, .x, .y classes.
Fred Jhones
How do you apply it to an existing email address?
Asha Saini
i need a software or an application that can help me protect ma chat from being copy pasted by the person on the other side
can any1 help
Dan
Asha: That’s like asking if you can prevent someone from eating the sandwich you gave them. The solution is obvious: don’t give him the sandwich in the first place (ie don’t say anything you wouldn’t want him to share with everyone).
Except it’s even worse because he can make a sandwich himself and claim you gave it to him (it is fairly easy to make your own fake chat logs).
Tab Atkins
(Sorry, I didn’t read the preceding comments. If my comments have already been stated elsewhere, consider them doubly important.)
Your first usage is completely unnecessary. Your email is already known to spammers. Ruining the ability to copypaste is a high cost compared to the tiny benefit of reducing one avenue among many for spammers to get your email address. Use a spam filter, or an email provider that filters for you. Don’t hurt your users.
Your second usage is stupid and evil. Stupid because there’s already a vastly easier way to keep Google from indexing your page, called “robots.txt”. Extra stupid because it bloats your page with hundreds or thousands of extraneous span elements, which not only make it take somewhat longer to download and increase your bandwidth bill, but can slow down other operations due to the increased number of elements. Evil because it ruins the page for the blind and other people forced to use screen-readers.
Your third example is evil, but that’s the point, and you can’t really do anything to stop it. There are tons of ways to hide letters, and you can still phish people just as easily by just linking to the evil page while displaying the good url.
I beg you to reconsider publishing these as “tips”, particularly the second one which has a massive cost to some of your users for literally *no* benefit to you over just using robots.txt (and may even trigger cloaking penalties, if our algorithms are smart enough).
Fluff
‘Top of my head thinking’ would say that the early captcha bots could OCR the image, so all the bots would do here is grab the page, then render, snap then OCR the result.
Given they have the OCR section already, the scrape & rendering part is a doddle.
Kwpolska
I’ve found a better way. http://kwpolska.co.cc/hidemail/
Phil
This code, used as a Firefox bookmarklet will defeat this:
http://pastebin.com/wd2JE01a
pepe
Chrome 9 on Mac – easily c&p to whatever text editor.
Alex Lane
Hmmm. I’m using Chrome and copied your examples by highlighting them and then chording Ctrl-C. Then I pasted the result in a text editor.
The result was eminently readable.
PeterStJ
I don’t get the point really.
Wouldn’t it be easy to just cycle the CSS rules for font-size with JavaScript and ‘fix’ the ones that seem too small? (for URL readability, third example), or remove elements by class name if rule is found to set the size too small? This seems so logical to me.
If it is so easy for even a novice coder to overcome this, what is the point of putting it there at all?
Maybe one can set the style inline; then one needs to cycle through all elements instead of the CSS rule collection, but with the speed of today’s engines this should still be fast enough, especially if one wants the content really badly.
Please explain if I am wrong.
John Wilger
Actually, it wouldn’t be too hard to decode this. For one thing, your browser already knows how to do it, even though it presents the result visually rather than in a readily machine-parsable way. And even short of borrowing code from an open source rendering engine, one could always just screen-grab and OCR it.
M. Elkstein
Here’s the thing: most spammers don’t get their email addresses off websites. Too much hassle. It’s way easier to just grab the address book of hijacked accounts. In most cases, the account owner doesn’t even know his account was hijacked; but his friends start getting spam…
This also defeats the “unique email used only on website X” scheme, which caused people to report that site X is showing emails in an insecure way, and thus leaking addresses to spammers (it’s not: some human picked it from the site, and his account was compromised).
Ilya Vassilevsky
This is amazing. Thank you so much!
Ben
Here’s another bookmarklet – designed to be more adaptable to various hiding techniques. It hinges on jQuery’s :visible pseudo-selector, which should allow it to detect hidden blips of code regardless of how they’re hidden.
http://pastebin.com/CKsdcfpy
teddlesruss
It’s basically steganography. Can it hide / obfuscate any executable lines in a page, i.e. JS or HTML5? That could be really nasty…
Sourav Chakraborty
That is a great idea, Ruskin!
Matt
The one problem I’ve found with this is that if one copies and pastes into a web based email, such as hotmail or gmail, the hidden characters are still hidden, and will trigger an error, making the email unusable, and for no apparent reason. Pasting into notepad reveals the hidden characters.
GlitchMr
Open Opera source editor and insert this anywhere on site (well, not in middle of tag…)
.z,.x,.y{display:none !important}
Now you can easily copy-paste it, as browsers ignore “display:none” while copying stuff.
gordon
thanx