
Simple developer method for checking individual URLs

posted by miked on October 30th, 2006 in API, Developers, PhishTank

This post was updated November 15, 2006 with the POST method to work around a limit of the original method.

When launching PhishTank, one goal was to release reliable, verified phishing data to the community free of charge in an open and easily accessible format. Over the past few weeks, I have had the privilege of working with many committed developers and integrators, to whom we owe a great deal of gratitude for supporting this effort and helping to make PhishTank an amazing success.

Building on the API we have exposed and the downloadable data file we publish, these developers have implemented protection at layers from the mail server to the web browser (coming soon!).

However, there is still work to be done. Today we are releasing a simplified interface for checking URLs against the PhishTank database. This new interface could be used for anything from mitigating new threats on mobile platforms to easing development of check-only plugins for browsers and mail clients.

Usage is simple and straightforward. There are two methods: sending the Base 64 encoded URL in a POST request, or appending the Base 64 encoded URL to the request URL.

1. POST

This method is preferred, as POST eliminates the limit on URL length imposed by the original Base 64 encoded method.

  1. Start with the URL you would like to check.
    http://www.evil.com/
  2. Base 64 encode the URL string.
    http://www.evil.com/ becomes aHR0cDovL3d3dy5ldmlsLmNvbS8=
  3. Send a POST to http://checkurl.phishtank.com/checkurl/ with the Base 64 encoded string as the url parameter.
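
The three steps above can be sketched in Ruby roughly as follows. This is a minimal sketch, not an official client: build_check_request and CHECK_URL are hypothetical names chosen for illustration, and the request is only built here, not sent. It also assumes the service accepts a standard form-encoded body, in which the "=" padding of the Base 64 string is percent-encoded.

```ruby
require 'base64'
require 'net/http'
require 'uri'

# Endpoint as given in step 3.
CHECK_URL = URI.parse('http://checkurl.phishtank.com/checkurl/')

# Hypothetical helper: builds (but does not send) the POST request.
def build_check_request(url)
  # Step 2: Base 64 encode the URL; strict_encode64 adds no line breaks.
  encoded = Base64.strict_encode64(url)
  # Step 3: send the encoded string as the "url" form parameter.
  # set_form_data percent-encodes the value, so "=" becomes "%3D".
  request = Net::HTTP::Post.new(CHECK_URL.request_uri)
  request.set_form_data('url' => encoded)
  request
end

request = build_check_request('http://www.evil.com/')
puts request.body # prints url=aHR0cDovL3d3dy5ldmlsLmNvbS8%3D

# To actually send it (needs network access):
# response = Net::HTTP.start(CHECK_URL.host, CHECK_URL.port) { |http| http.request(request) }
# puts response.body
```

Building the request separately from sending it makes it easy to inspect exactly what goes over the wire before pointing the code at the live service.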

The response will be in XML, in an identical format to that returned by the API check.url action.

2. Base 64 encoded

Originally, this was the only method. However, some URLs may end up too long when Base 64 encoded and included in the URL. So, while this method is still supported and live, consider it deprecated: use the first method if you’re starting from scratch.

  1. Start with the URL you would like to check.
    http://www.evil.com/
  2. Base 64 encode the URL string.
    http://www.evil.com/ becomes aHR0cDovL3d3dy5ldmlsLmNvbS8=
  3. Make the Base 64 string URL safe (that is, URL encode it to escape characters that are not legal in a URL).
    aHR0cDovL3d3dy5ldmlsLmNvbS8= becomes aHR0cDovL3d3dy5ldmlsLmNvbS8%3D
  4. Access http://checkurl.phishtank.com/checkurl/<string>
    http://checkurl.phishtank.com/checkurl/aHR0cDovL3d3dy5ldmlsLmNvbS8%3D
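
The four steps above can be sketched in Ruby as follows. This is only an illustration: build_check_url and CHECK_BASE are hypothetical names, and the sketch assumes that standard percent-encoding of the Base 64 characters "+", "/", and "=" is what the service expects, which matches the %3D in the example.

```ruby
require 'base64'
require 'uri'

# Base address as given in step 4.
CHECK_BASE = 'http://checkurl.phishtank.com/checkurl/'

# Hypothetical helper: returns the address to request for a given URL.
def build_check_url(url)
  # Step 2: Base 64 encode; strict_encode64 adds no line breaks.
  encoded = Base64.strict_encode64(url)
  # Step 3: percent-encode "+", "/" and "=" so the string is URL safe.
  safe = URI.encode_www_form_component(encoded)
  # Step 4: append the safe string to the base address.
  CHECK_BASE + safe
end

puts build_check_url('http://www.evil.com/')
# prints http://checkurl.phishtank.com/checkurl/aHR0cDovL3d3dy5ldmlsLmNvbS8%3D
```

Because the encoded string grows by about a third over the original, long URLs can push the final address past length limits, which is the reason the POST method is preferred.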

The response will be in XML, in an identical format to that returned by the API check.url action.
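
Reading the result might look like the sketch below. The post does not show a response, so the sample document is hypothetical: the element names in_database, verified, and errortext are taken from readers' reports and code in the comments on this post, and the wrapper elements are invented for illustration. The classify and read_flag helpers are likewise assumptions, not part of any PhishTank library.

```ruby
require 'rexml/document'

# Hypothetical response: only the in_database and verified element names
# come from reader reports; the wrapper elements are assumed.
SAMPLE_RESPONSE = <<~XML
  <response>
    <results>
      <in_database>true</in_database>
      <verified>false</verified>
    </results>
  </response>
XML

# Returns true when the first element with the given name contains "true".
def read_flag(doc, name)
  node = REXML::XPath.first(doc, "//#{name}")
  !node.nil? && node.text.to_s.strip == 'true'
end

# Maps a response to :unknown, :unverified, or :verified_phish.
def classify(xml)
  doc = REXML::Document.new(xml)
  error = REXML::XPath.first(doc, '//errortext')
  raise ArgumentError, "PhishTank error: #{error.text}" if error
  return :unknown unless read_flag(doc, 'in_database')

  read_flag(doc, 'verified') ? :verified_phish : :unverified
end

puts classify(SAMPLE_RESPONSE) # prints unverified
```

Searching with "//name" rather than a full path keeps the sketch independent of the exact nesting, which should be confirmed against a real response before relying on it.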

Let us know how you use it.

14 Responses to “Simple developer method for checking individual URLs”

  1. astrogeek says:

    What’s the scoop on maybe getting a Firefox add-on for PhishTank that uses the data? Of course the bad side might be that if I’m using Firefox (which I do) and have an add-on in to block Phishes it might make being a validator a problem.

    I’d like to hear about what’s in the works.

    Later,
    astrogeek

  2. MASA says:

    Dear astrogeek,

    The extension for Firefox is in progress/done. It blocks phishing sites from loading and is really cool.

    The url to the extension is coming very soon.

  3. John Roberts says:

    http://www.phishtank.com/sitechecker leads to the first Firefox extension to use PhishTank data that we know about.

  4. Ilgaz says:

    http://en.wikipedia.org/wiki/Bookmarklet

    Can’t the URL check functionality be done via a JavaScript “bookmarklet”? There are very advanced ones out there, such as Alexa’s.

    A very simple example is the TinyURL “toolbar” button, http://tinyurl.com/#toolbar ; note that it is a simple piece of JavaScript that sends the current URL to TinyURL to make it small.

    So, a bookmarklet would do the necessary encoding, make it “safe”, and pass it to PhishTank. When the user is in doubt, he/she would simply click the button.

    The good things are:
    1) It is very simple to install, just via drag and drop, and fail-safe
    2) Near universal (will work in any browser supporting JavaScript)

  5. John Roberts says:

    Ilgaz, absolutely… we encourage anyone to write one! We’ll promote it.

  6. Moike says:

    This works much better! I’m still having a problem verifying 2 URLs, though. For example, try the URLs from #11912 and #29228: I get a ‘false’ for present_in_database instead of ‘true’.

  7. Here are the bookmarklets as wished (tested only on Firefox 2.0)

    http://amit.chakradeo.net/2006/11/17/check-if-a-site-is-phishing-site/

  8. PhishTank bookmarklet…

    On Friday Thomas discovered OpenDNS – clicking around we soon discovered PhishTank, and boy – that service looks awesome. I’m already thinking about how I can talk Marc into integrating it with our CGPro mailserver. So being a fan……

  9. Ram says:

    I have tried a number of valid phishes such as 93684 and 93656 and they all return in_database as false. However, http://www.google.com returned in_database as true.

    HELP!!!

  10. EP says:

    I’m wondering if there is any way to search PhishTank’s archive of verified active and verified inactive. I’m doing a little research project for school and wouldn’t mind finding something along those lines. Thanks much.

    EP

  11. John Nagle says:

    Possible bug in API query:

    The mechanics of querying the API work fine, but the “valid” flag may have problems. I tried querying:

    “http://pardonus.nl/update/”,

    which is listed as a “valid phish”, and got back this XML:

    2007-03-05T15:07:04-08:00

    966f835b

    71.139.204.250.45eca29866e790.42437624

    true

    119617

    false

    Note that “verified” is “false”. That’s wrong. If you check the PhishTank details URL returned, that page says that this is a verified phish.

    Sometimes I do get back “true” for the “verified” value, but the XML doesn’t consistently agree with the details page.

    Is there an out-of-sync server somewhere?

  12. John Nagle says:

    Re previous message: posting the reply XML into blog comments did not work.

    What I got back was, essentially

    in_database = true
    verified = false

    with correct url and details links.

  13. Eric Pugh says:

    I wanted to comment on the API: I have found it very complex to use. I am working on a Rails app, http://www.fish4brains.com, and I looked at the chunk of code provided here, http://www.edazzle.net/#phishtank, as well as using the legacy method:

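    require 'base64'
    require 'net/http'
    require 'hpricot'

    # Returns true if PhishTank reports @uri as a verified phish.
    def phish?
      # strict_encode64 emits no line breaks, so nothing needs to be chopped off
      enc = Base64.strict_encode64(@uri)
      res = Net::HTTP.post_form(URI.parse('http://checkurl.phishtank.com/checkurl/'),
                                { 'url' => enc })
      doc = Hpricot(res.body)
      # Treat any errortext element in the response as a failure
      raise "Found error in url: #{@uri}" if doc.search('//errortext').size > 0

      doc.search('//verified').inner_html == 'true'
    end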

    The SSL-based API with Frobs is very difficult to implement, and I am not sure why it is so complex, at least for checking data. The same applies to URL encoding with the legacy developer method: still too complex. I don’t understand why it isn’t as simple as http://api.phishtank.com/url?url=http://someurl.com

    It would reduce the complexity of integration, and boost adoption. At least right now I’ve attempted to use the service twice, and both times run out of steam. I thought the simple url encoding would work, and it seemed to the first time around, but a couple weeks later the results started failing with in_database=false for everything.

  14. The POST method doesn’t work for most URLs. The problem is the Base 64 encoded string. Sometimes it contains = (equals), which doesn’t work well in a POST context. The solution is to URL encode the Base 64 string. However, the PhishTank POST processing code doesn’t appear to recognize it, resulting in a malformed URL in either case (with or without URL encoding).

    However, the older GET method works fine and is recommended.

    We are using the PhishTank API in Comment Guard Pro, to be released on 28th January 2008, to detect spam (one of the modules). Any feedback would be appreciated.
