Another paper from the Fifth Workshop on the Economics of Information Security (WEIS 2006) is Proof of Work can Work by Debin Liu and L. Jean Camp of Indiana University. Proof of work (p-o-w) systems are a variation on e-postage that uses computation rather than money. A mail sender solves a lengthy computational problem and presents the result along with the message. The problem takes long enough to solve that the sender can only do a modest number per time period, and so cannot send a lot of messages, which in theory prevents spamming. But on a net full of zombies, proof of work doesn't work.
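To make the mechanics concrete, here is a minimal sketch of a hashcash-style puzzle (my illustration of the general technique, not the specific construction in the paper): the sender grinds for a nonce whose hash over the message falls under a target, and the recipient verifies it with a single hash.

```python
import hashlib
from itertools import count

def solve_pow(message: bytes, bits: int) -> int:
    # Grind nonces until SHA-256(message || nonce) falls below the target,
    # i.e. the hash starts with roughly `bits` zero bits.
    target = 1 << (256 - bits)
    for nonce in count():
        digest = hashlib.sha256(message + nonce.to_bytes(8, "big")).digest()
        if int.from_bytes(digest, "big") < target:
            return nonce  # evidence of roughly 2**bits expected hash operations

def verify_pow(message: bytes, nonce: int, bits: int) -> bool:
    # Verification costs the recipient a single hash.
    digest = hashlib.sha256(message + nonce.to_bytes(8, "big")).digest()
    return int.from_bytes(digest, "big") < (1 << (256 - bits))

if __name__ == "__main__":
    msg = b"From: sender@example.com\r\nTo: you@example.org\r\n\r\nhello"
    nonce = solve_pow(msg, bits=20)   # about a million hashes on average
    assert verify_pow(msg, nonce, bits=20)
```

The difficulty parameter (here `bits`) is what any p-o-w scheme tunes: each extra bit roughly doubles the sender's expected work while verification stays a single hash.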
The current definitive analysis of proof of work is a paper by Ben Laurie and Richard Clayton that looks at the amount of compute power available to spammers who use zombies compared to legitimate mailers who don’t, and concludes that any p-o-w system demanding enough to deter spammers would prevent a significant number of legitimate users from sending their mail. I talked to Cynthia Dwork, who co-invented p-o-w in 1992, at the CEAS meeting a couple of years ago and she agreed that zombies made it unusable.
Camp and Liu attempt to address this situation by combining p-o-w with a reputation system. They note that most hosts on the Internet send no mail, and when a new host starts sending mail, the odds are upwards of 95% that it's sending spam. So they track existing hosts, giving each host a reputation score that increases every time recipients get legitimate mail from it and decreases, or is reset, when they get spam from it, and they adjust the size of the p-o-w problem demanded of a sender based on its reputation. They show that a plausible reputation function would allow essentially all the legitimate mail through while keeping spam hosts from sending more than a trickle.
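As a purely illustrative sketch of that coupling (the function names, thresholds, and update rule here are made up, not the paper's actual scoring), the recipient side might map reputation to puzzle difficulty something like this:

```python
def required_bits(reputation: float, min_bits: int = 10, max_bits: int = 26) -> int:
    # An unknown or bad host (reputation 0.0) gets the hardest puzzle;
    # a well-established host (reputation 1.0) gets a trivially cheap one.
    rep = max(0.0, min(1.0, reputation))
    return round(max_bits - rep * (max_bits - min_bits))

def update_reputation(reputation: float, was_spam: bool) -> float:
    # Legitimate mail nudges the score up slowly; spam resets it to zero.
    return 0.0 if was_spam else min(1.0, reputation + 0.01)
```

A host that keeps delivering legitimate mail drifts toward the cheap end of the scale, while a single spam report throws it back to the expensive end.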
This is all well and good, but it misses a major point—spammers adapt. The number of zombie computers on the net is stupendously large, and it is reasonable to assume that only a small fraction of them are in use at any given time, and that the bad guys have a large number in reserve that they have never used. Assuming that the bad guys can query the reputation function for a given host, which is likely since a useful reputation system has to be a shared one that aggregates data from a lot of recipients, they can check to see what hosts have what reputation and send spam from the ones with good reputations. They already do this to circumvent blacklists, with people telling me they see zombies switching targets when they get blacklisted. They don’t have to send spam at full speed to defeat a reputation system like this, merely screw it up enough to provoke a chorus of complaints from people who suddenly lose the ability to send mail. (In a more sensible world, the response would be “tough, fix your zombie” but ISPs have never been able to make that stick.)
So although this is an interesting idea and the analysis is fine as far as it goes, it’s too simplistic, and I just don’t believe that spammers would sit still for it. So proof of work still doesn’t work.
Good analysis, John. Let’s explore that ‘spammers adapt’ scenario a bit more. While clearly it’s not a perfect solution, that’s an unreasonably high bar to set.
Let's follow through and see what happens when spammers try to send from the zombies with good reputations. You say "people who suddenly lose the ability to send mail," but how does that happen in this situation? Consider: only machines that send large amounts of ham would have good reputations - typical machines start out with AND MAINTAIN reputations that keep them from being used to send lots of spam. So people in general wouldn't have the ability to send lots of mail to people they haven't emailed before in the first place. If spammers break into and spam from these machines, they can't send much spam, and they aren't doing much damage to the zombie's reputation - it already has a lousy one. So they aren't able to make what you have in quotes above happen to such a user.

Perhaps they can make it happen to users with good reputations. These would include folks originating large amounts of ham and little spam. Well, we CAN expect these folks to be able to keep their machines from being cracked. There aren't many of them, and they can afford to keep them secure.

But what about senders with mixed and good reputations because they're "too big to block"? While they send spam and ham, the spam will be coming largely from small compromised systems that use these big machines (e.g. bigsmarthost.bigisp.dom and virtualhost.sharedhostinggiant.dom), and it's those small systems that will be particularly targeted because of their reputations. While some folks are reluctant to hold bigisp and sharedhostinggiant responsible for the activities of their users, we can find rough consensus on holding them responsible for at least not letting their own machines be broken into. In a p-o-w system, it won't be too hard to tie reputation to the little machine using bigsmarthost.bigisp.dom, via the secondary Received line (something AOL does). Sharedhostinggiant and its peers are doing a fairly decent job policing their users already, and if a lousy p-o-w reputation sends CPU usage through the roof, we can expect them to be more diligent.

In other words, the p-o-w is letting our reputation system be more fine-grained than the IP level. This is the same thing that plain A&R (domain Authentication and Reputation) systems do. Is this hybrid better than A&R alone?
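As a rough sketch of that Received-line idea (simplified parsing, made-up function name, and no attempt to handle forged headers below the trust boundary):

```python
import email
from typing import Optional

def second_received_header(raw_message: bytes) -> Optional[str]:
    # Return the second Received header, which in the smarthost case names
    # the small client machine that handed the mail to bigsmarthost.bigisp.dom.
    # Real Received headers are messy and spoofable, so a production system
    # would need far more defensive parsing than this.
    msg = email.message_from_bytes(raw_message)
    received = msg.get_all("Received") or []
    return received[1] if len(received) > 1 else None
```

Keying the reputation score on that hop, rather than on the smarthost's IP, is what keeps the scoring fine-grained.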
Well, I haven’t thought for very long about this new twist to an old scheme, and I’ve only read half the paper, but maybe this’ll spark further thinking.
P.S. has anyone calculated how many megatons of carbon dioxide a successful p-o-w system would pump into the atmosphere?
The situation I foresee is that a spammer hijacks Grandma's computer, and then Grandma can't send mail any more. She never sent much in the first place, and now she can't send any at all. Or more likely it's not Grandma, it's some random little business with a Windows box running Exchange; same problem.
Of course, this is what should be happening now when Grandma’s computer or the little Exchange server starts sending spam, but far too few ISPs do anything about it, because fielding Grandma’s certain phone call is much more expensive than dealing with the possible outside spam complaints.
Finished the (not even half decently proofread) paper. Grandma can still send mail! Grandma’s PC will just have to do a 6 min. POW to send each spam or other new-recipient email.
Well, that will give Grandma plenty of time to wait on hold when she calls to complain.
Grandma's PC isn't likely to have trouble with her email, because it isn't set up to send her email directly - it's using her ISP's SMTP server, or some webmail server, or whatever AOL morphs into, and in any case it's that commercial mail server that'll be doing the delivery, not her box. Now, if the rate she's spamming is low enough to fit through those servers, and those servers aren't filtering outbound spam, she may still have trouble, but in either case there's an admin to fix things - and the proof-of-work is unlikely to be running on her PC.
Not only can Grandma still send email, as you’ve conceded, John, but existing improvements besides the one in this paper mean that her emails to her grandchildren don’t require her PC to compute a hash.
http://www.fussp.info/Topic21.html
Bill: It’s not clear that ISP MTAs will be generating ePostage instead of their clients’ PCs. It could work either way. I assume the ISPs mostly refuse to shoulder the responsibility.
When you talk about the ISP’s server, you’re missing the point.
The ISP has no more idea than anyone else what mail from Grandma’s PC is real and what is bot spam. You’re conceding that the ISP is going to rate limit mail from her PC and guess what—the whole point of POW is rate limiting.
I suppose that if the only place Grandma can send is the ISP's relay they can limit with a timer rather than POW, but POW advocates have usually claimed that it will make other sorts of limiting and filtering unnecessary. If you have to do all the same stuff anyway, why bother?