Simple developer method for checking individual URLs
posted by miked on October 30th, 2006 in PhishTank, API, Developers
This post was updated November 15, 2006 with the POST method to work around a limit of the original method.
When launching PhishTank, one goal was to release reliable verified phishing data to the community free of charge in an open and easily accessible format. Over the past weeks, I have had the privilege of working with many committed developers and integrators to whom we owe a great deal of gratitude for supporting this effort and helping to make PhishTank an amazing success.
Building on the API we have exposed and the downloadable data file we publish, these developers have implemented protection at layers from the mail server to the web browser (coming soon!).
However, there is still work to be done. Today we are releasing a simplified interface for checking URLs against the PhishTank database. This new interface could be used for anything from mitigating new threats on mobile platforms to easing development of check-only plugins for browsers and mail clients.
Usage is simple and straightforward, in either of two ways: POST, or Base 64 encoded in the request URL.
1. POST
This method is preferred, as POST eliminates the limit on URL length imposed by the original Base 64 encoded method.
- Start with the URL you would like to check.
  http://www.evil.com/
- Base 64 encode the URL string.
  http://www.evil.com/ becomes aHR0cDovL3d3dy5ldmlsLmNvbS8=
- Send a POST to http://checkurl.phishtank.com/checkurl/ with the Base 64 encoded string as the url parameter.
The response will be in XML, in an identical format to that returned by the API check.url action.
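The POST steps above can be sketched in Ruby roughly as follows. This is a minimal sketch, not an official client: the helper names encode_for_phishtank, build_check_request, and check_url are made up for illustration, and it assumes the endpoint accepts requests exactly as described in this post.

```ruby
require "net/http"
require "uri"

CHECKURL_ENDPOINT = "http://checkurl.phishtank.com/checkurl/"

# Base 64 encode the URL without line breaks.
# pack("m0") is core Ruby's equivalent of Base64.strict_encode64.
def encode_for_phishtank(url)
  [url].pack("m0")
end

# Hypothetical helper: returns the endpoint URI and the form fields for the POST.
def build_check_request(url)
  [URI.parse(CHECKURL_ENDPOINT), { "url" => encode_for_phishtank(url) }]
end

# Hypothetical helper: sends the POST and returns the raw XML response body.
def check_url(url)
  uri, form = build_check_request(url)
  Net::HTTP.post_form(uri, form).body
end

# Example (performs a network request):
#   xml = check_url("http://www.evil.com/")
```

Net::HTTP.post_form URL-encodes the form values itself, so the trailing = in the Base 64 string should not need any extra escaping on the client side.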
2. Base 64 encoded
Originally, this was the only method. However, some URLs may end up too long when Base 64 encoded and included in the URL. So, while this method is still supported and live, consider it deprecated: use the first method if you’re starting from scratch.
- Start with the URL you would like to check.
  http://www.evil.com/
- Base 64 encode the URL string.
  http://www.evil.com/ becomes aHR0cDovL3d3dy5ldmlsLmNvbS8=
- Make the Base 64 string URL safe (aka, URL encode it to remove illegal characters).
  aHR0cDovL3d3dy5ldmlsLmNvbS8= becomes aHR0cDovL3d3dy5ldmlsLmNvbS8%3D
- Access http://checkurl.phishtank.com/checkurl/<string>
  http://checkurl.phishtank.com/checkurl/aHR0cDovL3d3dy5ldmlsLmNvbS8%3D
The response will be in XML, in an identical format to that returned by the API check.url action.
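For the deprecated method, the address can be built with a few lines of Ruby. Again, this is only a sketch: legacy_check_address is a hypothetical helper name, and the choice of URI.encode_www_form_component for the "URL safe" step is an assumption about what the service expects.

```ruby
require "uri"

LEGACY_CHECKURL_BASE = "http://checkurl.phishtank.com/checkurl/"

# Hypothetical helper: builds the legacy GET address for a URL to check.
def legacy_check_address(url)
  encoded = [url].pack("m0")                         # Base 64, no line breaks
  url_safe = URI.encode_www_form_component(encoded)  # "=" -> "%3D", "+" -> "%2B", "/" -> "%2F"
  LEGACY_CHECKURL_BASE + url_safe
end

puts legacy_check_address("http://www.evil.com/")
# prints http://checkurl.phishtank.com/checkurl/aHR0cDovL3d3dy5ldmlsLmNvbS8%3D
```

Long URLs make this address grow quickly, which is the limit the POST method works around.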
Let us know how you use it.


astrogeek
What’s the scoop on maybe getting a Firefox add-on for PhishTank that uses the data? Of course the bad side might be that if I’m using Firefox (which I do) and have an add-on in to block Phishes it might make being a validator a problem.
I’d like to hear about what’s in the works.
Later,
astrogeek
— posted by astrogeek on October 31st, 2006 at 12:16 am
MASA
Dear astrogeek,
The extension for firefox is in progress/done. It blocks phishing sites from loading and is really cool.
The url to the extension is coming very soon.
— posted by MASA on November 1st, 2006 at 1:30 am
John Roberts
http://www.phishtank.com/sitechecker leads to the first Firefox extension to use PhishTank data that we know about.
— posted by John Roberts on November 9th, 2006 at 11:49 pm
Ilgaz
http://en.wikipedia.org/wiki/Bookmarklet
Can’t the URL check functionality be done via a JavaScript “bookmarklet”? There are very advanced ones out there, such as Alexa’s.
A very simple example is the TinyURL “toolbar” button, http://tinyurl.com/#toolbar ; note that it is a simple piece of JavaScript that sends the current URL to TinyURL to make it small.
So, a bookmarklet would do the necessary encoding, make it “safe” and pass it to PhishTank. When the user is in doubt, he/she would simply click the button.
Good things are:
1) It is very simple install, just via drag/drop and fail safe
2) Near universal (will work in any browser supporting javascript)
— posted by Ilgaz on November 15th, 2006 at 7:21 pm
John Roberts
Ilgaz, absolutely… we encourage anyone to write one! We’ll promote it.
— posted by John Roberts on November 15th, 2006 at 8:37 pm
Moike
This works much better! I’m still having a problem verifying 2 URLs - for example, try the URLS from #11912 and #29228 - but I get a ‘false’ for present_in_database instead of ‘true’.
— posted by Moike on November 16th, 2006 at 2:35 am
Amit Chakradeo
Here are the bookmarklets as wished (tested only on Firefox 2.0)
http://amit.chakradeo.net/2006/11/17/check-if-a-site-is-phishing-site/
— posted by Amit Chakradeo on November 17th, 2006 at 7:56 pm
my public notepad
PhishTank bookmarklet…
On Friday Thomas discovered OpenDNS - clicking around we soon discovered PhishTank, and boy - that service looks awesome. I’m already thinking about how I can talk Marc into integrating it with our CGPro mailserver. So being a fan…
— posted by my public notepad on November 19th, 2006 at 7:30 pm
Ram
I have tried a number of valid phishes such as 93684 and 93656 and they all return in_database as false. However http://www.google.com returned as in_database true.
HELP!!!
— posted by Ram on February 2nd, 2007 at 7:01 pm
EP
I’m wondering if there is any way to search PhishTank’s archive of verified active and verified inactive. I’m doing a little research project for school and wouldn’t mind finding something along those lines. Thanks much.
EP
— posted by EP on February 23rd, 2007 at 12:06 pm
John Nagle
Possible bug in API query:
The mechanics of querying the API work fine, but the “valid” flag may have problems. I tried querying:
“http://pardonus.nl/update/”,
which is listed as a “valid phish”, and got back this XML:
2007-03-05T15:07:04-08:00
966f835b
71.139.204.250.45eca29866e790.42437624
true
119617
false
Note that “verified” is “false”. That’s wrong. If you check the PhishTank details URL returned, that page says that this is a verified phish.
Sometimes I do get back “true” for the “verified” value, but the XML doesn’t consistently agree with the details page.
Is there an out-of-sync server somewhere?
— posted by John Nagle on March 6th, 2007 at 6:28 am
John Nagle
Re previous message: posting the reply XML into blog comments did not work.
What I got back was, essentially
in_database = true
verified = false
with correct url and details links.
— posted by John Nagle on March 6th, 2007 at 6:31 am
Eric Pugh
I wanted to comment on the API: I have found it very complex to use. I am working on a Rails app, http://www.fish4brains.com, and I looked at the chunk of code provided here, http://www.edazzle.net/#phishtank, as well as using the legacy method:
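require 'base64'
require 'net/http'
require 'hpricot'

# Returns true if PhishTank reports @uri as a verified phish.
def phish?
  # strict_encode64 avoids the line breaks that encode64 inserts into long strings
  enc = Base64.strict_encode64(@uri)
  res = Net::HTTP.post_form(URI.parse('http://checkurl.phishtank.com/checkurl/'),
                            { 'url' => enc })
  doc = Hpricot(res.body)
  # The response carries an errortext element when the request was rejected
  raise "Found error in url: #{@uri}" if doc.search('//errortext').size > 0
  doc.search('//verified').inner_html == 'true'
end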
The SSL based API with frobs is very difficult to implement, and I am not sure why, at least for checking data, it is so complex. The same applies to URL encoding using the legacy developer method: still too complex. I don’t understand why it isn’t as simple as http://api.phishtank.com/url?url=http://someurl.com
It would reduce the complexity of integration, and boost adoption. At least right now I’ve attempted to use the service twice, and both times run out of steam. I thought the simple url encoding would work, and it seemed to the first time around, but a couple weeks later the results started failing with in_database=false for everything.
— posted by Eric Pugh on December 2nd, 2007 at 4:54 pm
Angsuman Chakraborty
The POST method doesn’t work for most URLs. The problem is the Base 64 encoded string: it sometimes contains = (equals), which doesn’t work well in a POST context. The solution is to URL-encode the Base 64 string. However, PhishTank’s POST processing code doesn’t appear to recognize it, resulting in a malformed URL in either case (with or without URL-encoding).
However the older GET method works fine and is recommended.
We are using PhishTank API in Comment Guard Pro, to be released on 28th January 2008, to detect spams (one of the modules). Any feedback would be appreciated.
— posted by Angsuman Chakraborty on January 20th, 2008 at 4:49 am