PhishTank is operated by OpenDNS.

'Voting' Posts

The Tank is bubbling

posted by John Roberts on April 11th, 2007 in PhishTank, Members, Community, Voting

By now, most of the PhishTank community has seen the dramatic surge in submissions. It’s not malicious, but it is quite noticeable.

In the last few days, two different organizations decided, independently, to start submitting the suspicious URLs they receive to PhishTank. They benefit because the data is further validated and distributed far and wide. PhishTank benefits from some high-quality submissions, and broader coverage in its free data distribution.

Clearly, though, the new volume is dramatic.

And it didn’t help that one of the feeds went awry. The submissions were still phish (or possible phish), but the filter wasn’t tight enough. Those have been removed. Still, lots to verify at the moment.

The community has some work to do in catching up. Thank you for your patience. We are digging into small, immediate steps we can take to speed things up and make the volume manageable. Also, we’re revisiting the thorny problem of how to judge a domain.TLD combination (example.com) as a phish, so that all the wildcarded submissions which match that domain.TLD combo get the same designation. We know this would help dramatically.
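To make the domain.TLD idea concrete, here is a rough sketch in Python of grouping submitted URLs by their last two hostname labels. It is not how PhishTank does (or will do) it; the function names and sample URLs are made up for illustration, and the two-label shortcut is an assumption that breaks on suffixes such as co.uk.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def naive_domain_key(url: str) -> str:
    """Return the last two hostname labels, e.g. 'example.com'.

    Assumption: a two-label key is good enough for illustration. It is
    wrong for suffixes such as 'co.uk', which need a public-suffix list.
    """
    host = urlsplit(url).hostname or ""
    labels = host.split(".")
    return ".".join(labels[-2:])


def group_by_domain(urls):
    """Map each domain.TLD key to the submitted URLs that share it."""
    groups = defaultdict(list)
    for url in urls:
        groups[naive_domain_key(url)].append(url)
    return dict(groups)


# Hypothetical submissions: two wildcarded hosts on one domain, one unrelated.
submissions = [
    "http://login.example.com/verify",
    "http://a1b2c3.example.com/verify",
    "http://www.example.org/",
]
groups = group_by_domain(submissions)
for key, urls in groups.items():
    print(key, len(urls))
```

A single judgment on a key could then, in principle, be applied to every URL in its group, which is exactly where the human-judgment questions begin.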

This is not simple, but it has been discussed before, so we’re not starting from scratch. The community’s time and attention are valuable; we do not want to waste them. We also don’t want to lose the collaborative human judgment that makes PhishTank so useful to the Internet at large.

Please don’t stop telling us where we can get better, and don’t stop voting/submitting/flagging. I’d remind you all about the mailing lists, especially the user list.

Please do invite your friends to join this fight. We can always use some more help. ;-)


Note: the organizations in question would like to remain discreet for now; that’s fine with us, although we like to share where possible. If your organization would like to submit suspected phishing URLs/emails to PhishTank at a higher volume, please let us know.

The case of the mysterious hostname

posted by John Roberts on February 9th, 2007 in PhishTank, Community, Voting, Verifying phishes, Moderators, Hosts

The following post was written by funchords Moderator. If you don’t recognize the username, check the stats page. Without further ado…


Question: What do the following web addresses have in common?

  1. http://66.135.40.79/
  2. http://1116153935/
  3. http://0x42.0207.10319/
  4. http://0102.8857679/

Answer: Don’t look here; try them out and see! (Caveat: in most browsers and operating systems, all four URLs will work. If your computer had trouble with a link, see “Something Not Working” below to understand why.)

So why did that happen?

We websurfers are trained to think of Internet sites as Double-U, Double-U, Double-U, Dot, Google, Dot, Com — because that is easier to remember than http://1208941928/. The network translates those names into numbers, so we don’t have to. But, every computer accessible on the Internet has a long and unique number as an address. It’s like a telephone number — uniquely yours.

The hostnames in the four web addresses at the top of this page are all different ways of expressing the same Internet address number.

Just as websurfers use a method that is easy to remember, programmers do, too. If they’re working in a system or programming language that prefers base-16 or hexadecimal numbers, they’re likely to express a 3 like 0x3 and a 12 like 0xC. An octal system would likely replace those with base-8 numbers, expressed as 03 and 014.
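If you want to see those notations for yourself, here is a small Python sketch that writes a number the way a C-style programmer would. The two helper names are invented for this example.

```python
def c_style_hex(value: int) -> str:
    """Write a non-negative integer in base 16 with the usual '0x' prefix."""
    return f"0x{value:X}"


def c_style_octal(value: int) -> str:
    """Write a non-negative integer in base 8 with the usual leading '0'."""
    return f"0{value:o}"


for number in (3, 12):
    print(number, c_style_hex(number), c_style_octal(number))
# prints:
# 3 0x3 03
# 12 0xC 014
```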

Why do this, when the rest of the world speaks in base-10 (decimal)? You’ll see in a moment — multiplication and division are much easier when you’re speaking the same language as the system.

The third example at the top of the page begins with 0x42, which is a hexadecimal number (66 in decimal). The next segment of example 3 is 0207, an octal number equal to 135. But what about that third number?

The “dots” in the address are meant for organization. Twenty-five years ago, our internet founders segmented the IP space into 256 (0x100) segments. Those segments were split among five address types: large, medium, small, private, and special-use/future. The number before the first dot indicates this segment.

Knowing this, you can begin to do the math to make the above conversions.

If there is a first dot, the number before it is multiplied by 0x1000000 (or 16777216 to us Base-10 users). The number after it is not multiplied. This would work just fine for a very large organization: they would have their unique organizational number and over 16 million IP addresses that they could use on the Internet.

A second dot would help mid-size organizations — the first two segments would be assigned to the business and the final segment was theirs to divide as they pleased. And so on, for smaller businesses and the fourth segment. That sounded good back in the early 1980s, and it worked for a while. But, more importantly for our topic, it set the stage for how IP addressing works.

Let’s untwist our 4th example. 0102 is the octal equivalent of 66. This means that http://66.8857679/ should work? Does it? So we multiply that 66 by 16777216, and we get 1107296256. We add the last half of example 4 to that. 1107296256 plus 8857679 is 1116153935. That number is hard to remember, but it is the same number we tried in Example 2, above! So, the unique network address of PhishTank is 1116153935!

If there are two or three dots, the first number is multiplied by 0x1000000, the second by 0x10000, and the last is not multiplied. If there are four segments, the third segment is multiplied by 0x100.
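The multiplication rules above can be sketched in a few lines of Python. This is an illustration only, not the code any browser or operating system actually runs; the function names are invented, and the prefix handling (0x for hexadecimal, a leading 0 for octal) is an assumption based on the examples in this post.

```python
def parse_segment(text: str) -> int:
    """Parse one segment: '0x..' is hex, a leading '0' is octal, else decimal."""
    text = text.lower()
    if text.startswith("0x"):
        return int(text[2:], 16)
    if len(text) > 1 and text.startswith("0"):
        return int(text, 8)
    return int(text, 10)


def host_to_int(host: str) -> int:
    """Collapse a 1- to 4-segment numeric host into a single integer."""
    parts = [parse_segment(piece) for piece in host.split(".")]
    if not 1 <= len(parts) <= 4:
        raise ValueError(f"expected 1 to 4 segments, got {len(parts)}")
    *leading, last = parts
    total = 0
    # Each segment before the final one is a single byte with a fixed weight.
    for value, multiplier in zip(leading, (0x1000000, 0x10000, 0x100)):
        if not 0 <= value <= 0xFF:
            raise ValueError(f"segment {value} does not fit in one byte")
        total += value * multiplier
    # The final segment is not multiplied; it fills the bytes left over.
    remaining_bytes = 4 - len(leading)
    if not 0 <= last < 256 ** remaining_bytes:
        raise ValueError(f"final segment {last} is too large")
    return total + last


for host in ("66.135.40.79", "1116153935", "0x42.0207.10319", "0102.8857679"):
    print(host, "->", host_to_int(host))
# every line ends in 1116153935
```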

Remember that the dots are there for organization — for human convenience. Computers do not need them (as we have shown here).

Now you can turn any dotted decimal (what most would call “normal”) IP address into its actual single-integer address, and back again! Reverse the process using division…

1116153935 ÷ 16777216 (0x1000000) = 66, with a remainder of 8857679
8857679 ÷ 65536 (0x10000) = 135, with a remainder of 10319
10319 ÷ 256 (0x100) = 40, with a remainder of 79
79 ÷ 1 (0x1) = 79

… and that leaves us back at 66.135.40.79, the dotted-decimal IP address that we used in Example 1.
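The same repeated division is easy to sketch in Python with divmod, which returns the quotient and the remainder together. The function name is invented for this example; the cross-check at the end uses the standard ipaddress module.

```python
import ipaddress


def int_to_dotted(address: int) -> str:
    """Turn a single-integer IPv4 address back into dotted decimal."""
    if not 0 <= address <= 0xFFFFFFFF:
        raise ValueError("address must fit in 32 bits")
    segments = []
    remainder = address
    for divisor in (0x1000000, 0x10000, 0x100, 0x1):
        # The quotient is the next segment; the remainder carries forward.
        quotient, remainder = divmod(remainder, divisor)
        segments.append(str(quotient))
    return ".".join(segments)


print(int_to_dotted(1116153935))  # prints 66.135.40.79

# Cross-check against the standard library's own conversion.
assert int_to_dotted(1116153935) == str(ipaddress.IPv4Address(1116153935))
```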

Something not working, or working differently? In twenty-five years, programmers and administrators have grown accustomed to the four-segment dotted-decimal IP addresses, even in the largest organizations. While most network software still accepts these other forms of an address, some do not.

Although these forms of addressing are valid, almost nobody is used to them. Spammers and Phishing Fraudsters are taking advantage of this. They attempt to get around detection by changing the IP address into something other than a dotted-decimal form. It also tends to make a Phishing URL look more legitimate. Here are some examples:

So when you see such an address, don’t panic. Know that the address is a number, and not a name that can be resolved in DNS. Submit the Phishing Site to the PhishTank “As-Is,” using the same style address that the Phisher put in his spam email. Then, if you want, deconstruct the dotted decimal IP address and submit the site using the more “normal” form. Doing this will help remove some of the confusion for verifiers, down-stream users, and others who aren’t as smart as you!
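For anyone who would rather let a script do the spotting, here is a minimal Python sketch that checks whether a URL’s host is written purely as numbers (decimal, hexadecimal, or octal) rather than as a DNS name. The function name and the pattern are assumptions made for this example, not part of PhishTank.

```python
import re
from urllib.parse import urlsplit

# One segment: hexadecimal with 0x, octal with a leading 0, or plain decimal.
_SEGMENT = r"(?:0x[0-9a-f]+|0[0-7]*|[1-9][0-9]*)"
# A numeric host is one to four such segments separated by dots.
_NUMERIC_HOST = re.compile(rf"{_SEGMENT}(?:\.{_SEGMENT}){{0,3}}")


def has_numeric_host(url: str) -> bool:
    """Return True if the URL's host is a number rather than a DNS name."""
    host = urlsplit(url).hostname or ""
    return _NUMERIC_HOST.fullmatch(host) is not None


for url in (
    "http://0x42.0207.10319/",
    "http://1116153935/",
    "http://www.53.com/",
):
    print(url, has_numeric_host(url))
```

Note that a numbers-only domain name such as 53.com is still a name: the “com” label fails the numeric test, so it goes to DNS like any other hostname.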

Isn’t that cool?


Like to write a post for PhishTank? Let us know.

PayPal wants to wish you a Merry Christmas

posted by John Roberts on December 9th, 2006 in PhishTank, Voting, Verifying phishes

Submission 40965 is NOT a phish.

The PhishTank community is slowly reaching the right conclusion. Emphasis on slowly. But it’s hardly the community’s fault.

The site is http://www.paypalchristmas.co.uk/. It is not operated by PayPal, as you can tell on the Technical Details tab of #40965, adding to the confusion!

But the site is affiliated with and approved by PayPal.

Given their high profile (#2 in November 2006, for example), PayPal should think very carefully about using alternate URLs for anything with their name on it. Submissions 42483 and 42482 are additional examples where the site is legitimately affiliated with PayPal, but it is very hard to know that without digging deep.

But a company’s domains are their choice. I simply wanted to draw the attention of the PhishTank community to this example, as I’ve done with other examples.

Firefox 2.0 improperly calls this site a phish. IE 7 is confused, sometimes saying it’s a phish, sometimes saying it doesn’t know. I’d like to encourage PhishTank to get it right.

So, vote wisely. Vote NOT a phish. Please. ;-)

P.S. eBay (parent company of PayPal) hosts images and other, well, static content at ebaystatic.com, which is a genuine domain, so submission 46522 is also NOT a phish.

P.P.S. 42482, 42483 and 40965 were submitted by MASA as tests, with approval: they were known to be confusing, but legitimate. The community is passing the test, but I wanted to hurry the process along. Just wanted to make it clear that MASA is not polluting the Tank here; in fact, MASA is a moderator.

Another real bank site which confuses people: nwolb.com

posted by John Roberts on November 30th, 2006 in PhishTank, Voting, Verifying phishes, Banks

Four weeks ago, I shared the interesting case of 53.com, a real bank website whose numerical domain name confuses some members of the PhishTank community (not easy… discerning bunch!). The submission cited in that post remains undecided, although it’s (correctly) leaning toward “NOT a phish.”

I want to call attention to another example today.

The submission is 36895. There are nearly 250 votes on this submission, with a slight majority correctly recognizing that this is NOT a phish.

Why the confusion? The website is branded as NatWest, a major bank in the United Kingdom, but the domain name is nwolb.com (go to the submission to see the entire URL submitted).

The registrant for nwolb.com is:

The Royal Bank of Scotland Group plc
Waterhouse Square
138-142 Holborn
London EC1N 2TH
UK

NatWest was purchased by Royal Bank of Scotland Group in 2000, so this is legit.

You can also simply start at NatWest.com. Click the button at the top right titled “Log in.” The link redirects to…you guessed it…https://www.nwolb.com/ (with lots of other session/security stuff on the end of the URL).

I’m sure there are technical reasons, or historical business reasons, why the online bank lives on a different URL than the corporate website, but it’s certainly led to some confusion among an ever-more cautious online crowd.

If you have not yet voted on 36895, please vote “NOT a phish.”

Related note

In the comments about 53.com, some asked why we (the PhishTank administrators) don’t go ahead and decide this submission once and for all. My answer remains the same: as long as this is undecided, we will not step in. PhishTank administrators will step in to overrule false positives, if necessary. It rarely has been: maybe three times in nearly 25,000 submissions as I write this post.

The moderators are instrumental in flagging confusing submissions and drawing attention to possible problems, though they don’t overrule the community.

Money Mules: laundering out the phish smell

posted by John Roberts on November 10th, 2006 in PhishTank, Members, Voting, Safety, Verifying phishes, Mules

The following post was written by PhishTank member funchords, a very active member of the community, and currently the top submitter to PhishTank.


Submission 22779 is such a professional-looking employment ad, one might even wonder why it was submitted as a suspected phish site. Most likely, redpriest realized that the ad was looking for a Money Mule — a person who launders phishy money through their personal accounts and moves it overseas.

It’s both illegal and risky — and most Money Mules end up getting burned as soon as the phish-site victims realize that their credit cards or identities have been compromised. In addition to possible trouble with the police, the Money Mule gets to pay back the banks and institutions that were involved in the fraud. Money Mules take all the heat while the real crooks disappear into anonymity.

So why was Submission 22779 marked “Verified: Is NOT a phish?” Because, even though it probably is related to phishing, it really is not a phish. It isn’t masquerading as an institution one already trusts in order to obtain financial information.

While PhishTank endeavors to quickly and accurately identify Phish, our friends at CastleCops.com specialize in working with government and internet concerns to shut these criminals down. CastleCops has an e-mail address to report suspected Money Mule advertisements: mules@castlecops.com.

Got a phish? As always, throw it in the PhishTank. But if the crooks are “fishing” for a Money Mule, then report it to mules@castlecops.com.

53.com is a real bank

posted by John Roberts on October 31st, 2006 in PhishTank, Voting, Verifying phishes

Submission 19715 continues to await final judgment from the community. The phish URL is:

http://www.53.com/wps/portal/contenttype/secure/confirm_context.id

The screenshot shows Fifth Third Bank.

The technical details give the strongest evidence. Admittedly, the technical details tab did not exist when this was submitted on October 17, 2006.

Registrant:
Fifth Third Bank
38 Fountain Square Plaza
Cincinnati, OH 45263-0001
US

There are 250+ votes so far, with 60% saying “Is NOT a phish.”

Hint: This bank exists, and this site is real. If you have not voted, please vote Is NOT a phish.

The lesson is that number-only domain names do not inspire trust, but don’t dismiss them out of hand.

Better screenshots running on PhishTank

posted by John Roberts on October 24th, 2006 in PhishTank, Voting, Site changes, Screenshots

Site screenshots are mugshots for phish URLs. So, I’m happy to say that miked has just improved the PhishTank “camera” — the software that takes screenshots.

The results? More screenshots. Faster screenshots. Better screenshots.

This is a leap forward since a good screenshot, in concert with a close examination of the phish URL, is enough to judge “phishiness” right there and then, without needing to visit a potentially shady site.

We haven’t re-taken every site’s screenshot, as it’s impossible for those that are down and may be confusing for those already judged, but all new submissions (and most of the “living” ones from the past) should now be represented.

Please continue to “flag” bad or missing screenshots — it’s been helpful in debugging. Site admins can now retake screenshots more easily, too.

When the community doesn’t reach a consensus

posted by John Roberts on October 10th, 2006 in PhishTank, Community, Voting

We set up community voting at PhishTank because we think multiple insights make for a better community judgment. This is similar to “Linus’s Law,” as formulated by Eric Raymond: “Given enough eyeballs, all bugs are shallow.”

We’re not the first to re-word that concept, but here’s the PhishTank version:

Given enough eyeballs, all phishes can be identified.

In a related post, Jeff Veen wrote about bloggers and the media and ways of reacting to changing forces:

Or will [organizations] find inspiration in, say, the Digg model, harnessing countless tiny points of participation to harness the collective intelligence of their audience and feeding it back into their product?

PhishTank is certainly about collective intelligence.

But sometimes it’s not that easy. Intelligent people can disagree!

Suspected phish ID 11983 is the first really challenging submission, where the community has not reached consensus yet despite over a week of vigorous voting. As we approach midnight UTC on Tuesday, October 10, this submission has over 315 votes, and it’s nearly 50-50 as to whether this is a phish or not. (Note: The # of votes is never shown publicly to non-admins.)

To me, this is not a phish, and I voted that way. My thinking? The URL is greatstudentloanpayoff.com, and when you get there… it’s for Great Student Loan Payoff. This looks less than beneficial, and I’m not going to give my information, but there is no attempt to pretend to be something other than what it is: an attempt to legally get your Social Security Number and permission to email you marketing messages.

My take? Don’t do it. But it’s not a phish.

For the terminally undecided among you, we have some site changes now live which I’ll talk about in a separate post shortly. While you wait for those words, go ahead and vote.

More details about how PhishTank works and what is coming next

posted by John Roberts on October 6th, 2006 in PhishTank, API, Community, Voting, Email, RSS, ASN

We’ve been thrilled with the enthusiastic embrace of PhishTank by an active community. Check those stats! Despite our unspoken office contest to submit and verify as many phishes as possible, all the OpenDNS employees are being blown off the Top Submitters/Verifiers lists (or soon will be) by active individuals around the Internet. That’s a good sign!

This is day five. We’ve been making adjustments and changes all week in response to comments and learnings. We’re not done, so keep telling us how to improve.

There are a lot of different questions we’ve fielded, and ideas we’ve heard. Here are some answers and comments and a quick look ahead on PhishTank.

Screenshots

We know that screenshots of suspected phish sites are valuable in judging a submitted URL, and help avoid visiting a potential phishing site (which should be done with care!). We also know that sometimes the screenshot doesn’t work very well. Please use the “Something wrong with this submission?” link on the right-hand side to alert us. We’ll add a specific choice for “Screenshot problem” shortly. The development team has a ticket for improving this key feature. It’s not a binary issue, but it will get better.

Duplicate URLs

There should not be any…but there are some as I type this. We know why this mistake happened, and it’s being fixed today. My apologies.

Wrong URL picked from email submissions

With some phish submissions via email, the PhishTank software chooses the wrong URL as the phish URL to judge. We’re working to improve our choice, of course. If we’ve got it wrong, please tell us via the “Something wrong with this submission?” link, rather than voting on an obviously biffed URL.

Redirects

Some phishing sites mask their final destination URL by using open redirect URLs at legitimate services. The final destination should certainly be marked as a phish, but the phish URL being judged is often the masked URL. Our take, for now, is that both the full original URL, including the redirect, and the final destination URL are phishing. The point? If someone can click on the URL and get to a phishing site, it’s bad news. This is an understandably grey area, and we’re happy to revisit as the data tells new stories.

Flags

Flags are what we call the notes appended to individual phish IDs via the “Something wrong with this submission?” link. These are read with interest, and help us as PhishTank administrators know where to focus our attention. Please continue to use them!

We are considering whether or not to make them visible to more than just administrators. They are informative, but we are wondering whether they will bias votes. PhishTank doesn’t tell you how others have voted on a submission until you vote because we hope you make your own judgment.

We’re undecided here. Thoughts on making these notes visible?

Judging a site that is offline

We’re continuing to tweak our code for judging (and re-checking) whether a submission is online or offline. We know it’s not 100% accurate, in part due to the normal volatility of phishing sites. If a site is offline, please do not vote. Instead, flag it for review via the “Something wrong with this submission?” link. We use these examples to test and improve our software for checking online status. Our belief is that it’s inappropriate to vote on a site that is not available. Of course, some URLs on their own show phishing intent and no possibility of mistakenly hurting legit folks if identified as phishes; there are grey areas. Help us work to define them further.

Making a mistake

I’ve made a few mistakes already where I mistakenly judged a submission as a phish (or “NOT a phish”) because my mouse finger was moving faster than my brain.

The good news? The community gets it right, and a single mistake vote won’t damage the overall judgment.

There is no need to notify us if you make a mistake. We’re not going to change individual votes. Your choices do matter: better choices will increase the “weight” of your future votes. Still, we’re also going to bake in a (small) allowance for this kind of mistake when judging an individual’s contributions.

We’re going to modify the two links (Is a phish / NOT a phish) to try and make them more distinct and less prone to mistakes.

Displaying suspected phish emails

Several people have asked why we don’t display the suspected phish emails, too. We do store the submitted email, and try to append extra information based on headers where possible. Viewing the email might help in making a better judgment, but there are two elements holding us up.

First, we’re concerned about usability. Before launch, some of the email information was displayed. The individual phish detail page was cluttered. We didn’t solve that problem before launch, but it is solvable.

Second, under no circumstances should PhishTank display personal information about the submitter. With email submissions, that requires extra care. Until we get it right, we will leave the source of the email (for example) behind the scenes.

We are considering screenshots of the emails, although the rendering in different email clients is notably more varied even than web browsers.

MTA (Mail Transfer Agent) information from the email is something we hope to break out, too, for display and API query.

In any event, we know the email itself has valuable information for PhishTank beyond just the phishing URL, and we’re thinking it through.

whois and ASN data

We are adding whois and ASN (Autonomous System Number) data to the submissions, although it is not currently displayed, primarily because the output of these two fields (especially whois) is so varied. We’ll figure it out.

Coming sooner, probably, are RSS feeds by ASN, so webhosts, ISPs and other organizations can subscribe to notifications about verified phishes on their networks. PhishTank doesn’t do takedowns, but certainly hopes that the data proves useful for those in a position to act.

RSS feeds

The focus for sharing information has been the API (check out the new diagram). But we believe in the simplicity of RSS feeds, too. Beyond the RSS feed for this blog, the site already offers individuals a personal feed to track their contributions. Find it on the My Account page.

We will offer more RSS feeds over time, like the ASN feeds noted above.

Text file of all verified phishes

The API does not offer a way to pull every single verified phish, purposefully. It would not be efficient for developers or PhishTank. However, we’ve heard many requests for a straightforward text file, updated frequently, that lists every verified phish.

We will offer such a file. The goal is to have this up and running sometime next week, barring other interruptions. Availability will be announced on this blog (http://www.phishtank.com/blog/) and in the API documentation.

More API calls coming

There’s more to come with the API. Most immediately, the API will offer calls to submit an email or URL to PhishTank, in addition to checking them, as it does now. All that’s needed is some documentation. Stay tuned. If you want something else from the API, just ask. We’ll try to say yes to all reasonable requests; we don’t want to build applications, we want to enable application building.

A few people have written in asking about API limits. I’ll just quote the specific section of the FAQ:

There is no set usage limit. Extreme use will be noted, and we would ask that you contact PhishTank if you plan to use the API heavily. We welcome such usage, but would prefer to hear about it before it begins. PhishTank reserves the right to terminate API usage for accounts which abuse the free privilege.

As we learn more, we’ll get more specific.

Phew… more than enough for now. Comments invited and expected.
