<?xml version="1.0" encoding="UTF-8"?><!-- generator="wordpress/2.0.4" -->
<rss version="2.0" 
	xmlns:content="http://purl.org/rss/1.0/modules/content/">
<channel>
	<title>Comments on: PhishTank improvements, including a third choice and new API calls</title>
	<link>http://www.phishtank.com/blog/2006/10/11/improvements-third-choice-new-api-calls/</link>
	<description>A blog about and from PhishTank, a collaborative clearinghouse for data about phishing.</description>
	<pubDate>Tue, 07 Oct 2008 11:47:45 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.0.4</generator>

	<item>
		<title>by: jaded</title>
		<link>http://www.phishtank.com/blog/2006/10/11/improvements-third-choice-new-api-calls/#comment-106</link>
		<pubDate>Thu, 19 Oct 2006 21:58:00 +0000</pubDate>
		<guid>http://www.phishtank.com/blog/2006/10/11/improvements-third-choice-new-api-calls/#comment-106</guid>
					<description>One thing I've noticed is many people have submitted "spam" instead of "phish."  Perhaps if there were a link to flag them as "spam" we could properly identify them as legit instead of having people mark them as phish because they don't like spam.

On a related note, is there a way for a voter to add a note regarding a particular phish?  This site: http://www.phishtank.com/phish_detail.php?phish_id=20327
was a phish, but it was not obvious until I checked the URL:
http://israelibrokerageserviceslimired.com/index.php?sect_id=6&#38;form_id=1&#38;position=Financial+manager+for+cooperation+with+private+individuals&#38;country=usa
Note the R substituted for a T in the word "limired".  And after I voted, I discovered that 50% of the voters have been taken in by this phish.  A simple note might have helped them recognize phish from legit.</description>
		<content:encoded><![CDATA[<p>One thing I&#8217;ve noticed is many people have submitted &#8220;spam&#8221; instead of &#8220;phish.&#8221;  Perhaps if there were a link to flag them as &#8220;spam&#8221; we could properly identify them as legit instead of having people mark them as phish because they don&#8217;t like spam.</p>
<p>On a related note, is there a way for a voter to add a note regarding a particular phish?  This site: <a href='http://www.phishtank.com/phish_detail.php?phish_id=20327' rel='nofollow'>http://www.phishtank.com/phish_detail.php?phish_id=20327</a><br />
was a phish, but it was not obvious until I checked the URL:<br />
<a href='http://israelibrokerageserviceslimired.com/index.php?sect_id=6&amp;form_id=1&amp;position=Financial+manager+for+cooperation+with+private+individuals&amp;country=usa' rel='nofollow'>http://israelibrokerageserviceslimired.com/index.php?sect_id=6&amp;form_id=1&amp;position=Financial+manager+for+cooperation+with+private+individuals&amp;country=usa</a><br />
Note the R substituted for a T in the word &#8220;limired&#8221;.  And after I voted, I discovered that 50% of the voters have been taken in by this phish.  A simple note might have helped them recognize phish from legit.
</p>
]]></content:encoded>
				</item>
	<item>
		<title>by: micha</title>
		<link>http://www.phishtank.com/blog/2006/10/11/improvements-third-choice-new-api-calls/#comment-100</link>
		<pubDate>Wed, 18 Oct 2006 15:44:03 +0000</pubDate>
		<guid>http://www.phishtank.com/blog/2006/10/11/improvements-third-choice-new-api-calls/#comment-100</guid>
					<description>Hello,

Nice work! One minor detail which would be helpful: show all timestamps with an offset, adjusted to the user's timezone.

Greetings</description>
		<content:encoded><![CDATA[<p>Hello,</p>
<p>Nice work! One minor detail which would be helpful: show all timestamps with an offset, adjusted to the user&#8217;s timezone.</p>
<p>Greetings
</p>
]]></content:encoded>
				</item>
	<item>
		<title>by: John Roberts</title>
		<link>http://www.phishtank.com/blog/2006/10/11/improvements-third-choice-new-api-calls/#comment-95</link>
		<pubDate>Tue, 17 Oct 2006 16:22:23 +0000</pubDate>
		<guid>http://www.phishtank.com/blog/2006/10/11/improvements-third-choice-new-api-calls/#comment-95</guid>
					<description>Blain, we're working on the email parsing, and improving our whitelisting so that w3.org (for example) doesn't crop up.

We go through the flags, and often change the URL based on the feedback there.</description>
		<content:encoded><![CDATA[<p>Blain, we&#8217;re working on the email parsing, and improving our whitelisting so that w3.org (for example) doesn&#8217;t crop up.</p>
<p>We go through the flags, and often change the URL based on the feedback there.
</p>
]]></content:encoded>
				</item>
	<item>
		<title>by: Blain</title>
		<link>http://www.phishtank.com/blog/2006/10/11/improvements-third-choice-new-api-calls/#comment-94</link>
		<pubDate>Tue, 17 Oct 2006 15:44:07 +0000</pubDate>
		<guid>http://www.phishtank.com/blog/2006/10/11/improvements-third-choice-new-api-calls/#comment-94</guid>
					<description>The email submission process is pretty seriously flawed.  I've looked at my last few email submissions, and the URI that's being grabbed for verification is innocuous.  The last one is for the W3C site, while the phish I submitted was about a credit union, complete with an easy-to-see fake link (when you view the phish as text).  If email submission is going to be viable, we need a human intervention step that verifies which URI is the problem, or the automagic process has to be able to strip out obviously okay URIs (like w3.org).</description>
		<content:encoded><![CDATA[<p>The email submission process is pretty seriously flawed.  I&#8217;ve looked at my last few email submissions, and the URI that&#8217;s being grabbed for verification is innocuous.  The last one is for the W3C site, while the phish I submitted was about a credit union, complete with an easy-to-see fake link (when you view the phish as text).  If email submission is going to be viable, we need a human intervention step that verifies which URI is the problem, or the automagic process has to be able to strip out obviously okay URIs (like w3.org).
</p>
]]></content:encoded>
				</item>
	<item>
		<title>by: funchords</title>
		<link>http://www.phishtank.com/blog/2006/10/11/improvements-third-choice-new-api-calls/#comment-92</link>
		<pubDate>Sun, 15 Oct 2006 20:51:55 +0000</pubDate>
		<guid>http://www.phishtank.com/blog/2006/10/11/improvements-third-choice-new-api-calls/#comment-92</guid>
					<description>&lt;i&gt;If you flag a suspected phish via the “Something wrong with this submission?” link, that suspected phish should not show up via the “Next Unverified Phish” link until the flag is resolved. This is useful for power users.&lt;/i&gt;

I don't think this is working.  It seems to me that it keeps coming up until I finally give up and mark "I don't know."

Here's one issue.  The phishers are buying domain names, and new names take 0-48 hours to propagate through DNS and clear caches.  So when a phish URL fails DNS lookup, it might be that my DNS servers don't have the new data yet.  When DNS lookup fails (unknown host), my response should be "I don't know."  Yet the URL has obvious phish poop in it, such as paypal/https/update -- I worry that I should mark these as "Phish" even though the site won't come up.

Lead us, oh great ones!!</description>
		<content:encoded><![CDATA[<p><i>If you flag a suspected phish via the “Something wrong with this submission?” link, that suspected phish should not show up via the “Next Unverified Phish” link until the flag is resolved. This is useful for power users.</i></p>
<p>I don&#8217;t think this is working.  It seems to me that it keeps coming up until I finally give up and mark &#8220;I don&#8217;t know.&#8221;</p>
<p>Here&#8217;s one issue.  The phishers are buying domain names, and new names take 0-48 hours to propagate through DNS and clear caches.  So when a phish URL fails DNS lookup, it might be that my DNS servers don&#8217;t have the new data yet.  When DNS lookup fails (unknown host), my response should be &#8220;I don&#8217;t know.&#8221;  Yet the URL has obvious phish poop in it, such as paypal/https/update &#8212; I worry that I should mark these as &#8220;Phish&#8221; even though the site won&#8217;t come up.</p>
<p>Lead us, oh great ones!!
</p>
]]></content:encoded>
				</item>
	<item>
		<title>by: Char</title>
		<link>http://www.phishtank.com/blog/2006/10/11/improvements-third-choice-new-api-calls/#comment-86</link>
		<pubDate>Thu, 12 Oct 2006 09:47:01 +0000</pubDate>
		<guid>http://www.phishtank.com/blog/2006/10/11/improvements-third-choice-new-api-calls/#comment-86</guid>
					<description>Maybe a bug???  I've noticed the site will go a very long period of time with no phishes to verify, and then bam! there are hundreds.  Is this intentional?  Some glitch?  Wouldn't it be easier to verify them as they come in?</description>
		<content:encoded><![CDATA[<p>Maybe a bug???  I&#8217;ve noticed the site will go a very long period of time with no phishes to verify, and then bam! there are hundreds.  Is this intentional?  Some glitch?  Wouldn&#8217;t it be easier to verify them as they come in?
</p>
]]></content:encoded>
				</item>
	<item>
		<title>by: Blain</title>
		<link>http://www.phishtank.com/blog/2006/10/11/improvements-third-choice-new-api-calls/#comment-85</link>
		<pubDate>Thu, 12 Oct 2006 03:52:27 +0000</pubDate>
		<guid>http://www.phishtank.com/blog/2006/10/11/improvements-third-choice-new-api-calls/#comment-85</guid>
					<description>Okay, all of these improvements are to the good.  A couple more suggestions:

1.  Automagic screening of obvious phish.

That is, cases where the URI includes the name of an obvious phish target like eBay or PayPal, but in a way that attempts to mask that it's not actually that target.  If this is the case, then automagically pinging the document to see if it loads should be enough to get it listed -- there is no way you're going to find legitimate sites in those situations.

2.  Bayesian Analysis for pre-screening.
A la POPfile for email classification.  This could be done on the text of the emails submitted as well as on the source code of the documents themselves.  A little bit of training can go a long way in teaching a classifier what is and isn't phish.  This would make it easier for project members who would only need to confirm or reject the program's classification.  

The latter would take a different kind of programming, but there are open source projects using this kind of classification to draw upon.  It could also spawn phishers throwing lots of bloating code into sites to try to confuse the classifiers, but it would be less likely to be successful in this environment than it would in email, and it's not very effective in the world of email as it is.  And, if you GPLish it, it could make the basis for a number of products used to classify bad websites of various kinds.</description>
		<content:encoded><![CDATA[<p>Okay, all of these improvements are to the good.  A couple more suggestions:</p>
<p>1.  Automagic screening of obvious phish.</p>
<p>That is, cases where the URI includes the name of an obvious phish target like eBay or PayPal, but in a way that attempts to mask that it&#8217;s not actually that target.  If this is the case, then automagically pinging the document to see if it loads should be enough to get it listed &#8212; there is no way you&#8217;re going to find legitimate sites in those situations.</p>
<p>2.  Bayesian Analysis for pre-screening.<br />
A la POPfile for email classification.  This could be done on the text of the emails submitted as well as on the source code of the documents themselves.  A little bit of training can go a long way in teaching a classifier what is and isn&#8217;t phish.  This would make it easier for project members who would only need to confirm or reject the program&#8217;s classification.  </p>
<p>The latter would take a different kind of programming, but there are open source projects using this kind of classification to draw upon.  It could also spawn phishers throwing lots of bloating code into sites to try to confuse the classifiers, but it would be less likely to be successful in this environment than it would in email, and it&#8217;s not very effective in the world of email as it is.  And, if you GPLish it, it could make the basis for a number of products used to classify bad websites of various kinds.
</p>
]]></content:encoded>
				</item>
	<item>
		<title>by: Char</title>
		<link>http://www.phishtank.com/blog/2006/10/11/improvements-third-choice-new-api-calls/#comment-83</link>
		<pubDate>Wed, 11 Oct 2006 20:37:05 +0000</pubDate>
		<guid>http://www.phishtank.com/blog/2006/10/11/improvements-third-choice-new-api-calls/#comment-83</guid>
					<description>Can you guys, as the admins, ban the submission of certain sites?  One user in particular is constantly submitting his personal home page, yet I have never seen him submit an actual phish.</description>
		<content:encoded><![CDATA[<p>Can you guys, as the admins, ban the submission of certain sites?  One user in particular is constantly submitting his personal home page, yet I have never seen him submit an actual phish.
</p>
]]></content:encoded>
				</item>
	<item>
		<title>by: John Roberts</title>
		<link>http://www.phishtank.com/blog/2006/10/11/improvements-third-choice-new-api-calls/#comment-81</link>
		<pubDate>Wed, 11 Oct 2006 18:15:19 +0000</pubDate>
		<guid>http://www.phishtank.com/blog/2006/10/11/improvements-third-choice-new-api-calls/#comment-81</guid>
					<description>Amit, not yet. I've emailed you privately.

Others with API questions... please reach out to me at my first name at opendns dot com.

We'll consider a mailing list or forum.</description>
		<content:encoded><![CDATA[<p>Amit, not yet. I&#8217;ve emailed you privately.</p>
<p>Others with API questions&#8230; please reach out to me at my first name at opendns dot com.</p>
<p>We&#8217;ll consider a mailing list or forum.
</p>
]]></content:encoded>
				</item>
	<item>
		<title>by: Amit Chakradeo</title>
		<link>http://www.phishtank.com/blog/2006/10/11/improvements-third-choice-new-api-calls/#comment-80</link>
		<pubDate>Wed, 11 Oct 2006 17:50:33 +0000</pubDate>
		<guid>http://www.phishtank.com/blog/2006/10/11/improvements-third-choice-new-api-calls/#comment-80</guid>
					<description>Hi, 

  Is there a group/mailing list to discuss API issues?  I am trying to hack up a Ruby API library for this, but am constantly getting the response "Cant get shared secret."

--Amit</description>
		<content:encoded><![CDATA[<p>Hi, </p>
<p>  Is there a group/mailing list to discuss API issues?  I am trying to hack up a Ruby API library for this, but am constantly getting the response &#8220;Cant get shared secret.&#8221;</p>
<p>&#8211;Amit
</p>
]]></content:encoded>
				</item>
</channel>
</rss>
