Researchers: Ms Rizka Purwanto, Dr Alan Blair, Prof Sanjay Jha (UNSW) with Dr Arindam Pal (Data61)
Funding body: Cybersecurity CRC PhD topup
Our goal is to detect and predict phishing websites before they can harm users. Previous phishing detection methods relied on traditional machine learning classifiers such as naive Bayes, logistic regression, k-nearest neighbours, support vector machines, decision trees and artificial neural networks. These algorithms cope poorly with the dynamic nature of phishing, as fraudsters constantly change the webpage design and hyperlinks, often every couple of hours.
Unlike machine learning-based models, PhishZip does not require model training or HTML parsing. Instead, we compress the HTML file to determine whether it belongs to a phishing website. Classification with compression algorithms is therefore faster and simpler.
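The compression idea can be illustrated with a minimal Python sketch. This is not the PhishZip implementation: it assumes, for illustration only, that each class is represented by a preset dictionary of characteristic words, that the page is compressed once against each dictionary with DEFLATE, and that the dictionary giving the smaller output indicates the class. The names `compressed_size` and `classify`, and the example word lists, are hypothetical.

```python
import zlib


def compressed_size(data: bytes, dictionary: bytes) -> int:
    """Return the DEFLATE-compressed size of data using a preset dictionary."""
    # zdict primes the compressor with byte sequences expected in the input,
    # so content that resembles the dictionary compresses to fewer bytes.
    compressor = zlib.compressobj(level=9, zdict=dictionary)
    return len(compressor.compress(data) + compressor.flush())


def classify(html: str, phishing_dict: bytes, legitimate_dict: bytes) -> str:
    """Label a page by whichever dictionary compresses it better (assumed rule)."""
    data = html.encode("utf-8")
    phishing_size = compressed_size(data, phishing_dict)
    legitimate_size = compressed_size(data, legitimate_dict)
    # Ties fall back to "legitimate" in this sketch.
    return "phishing" if phishing_size < legitimate_size else "legitimate"


# Hypothetical word lists, chosen only to make the example concrete.
PHISHING_WORDS = b"verify your account password login suspended confirm identity"
LEGITIMATE_WORDS = b"weather forecast recipes gardening tips travel blog"

page = "<html><body>Please verify your account password to confirm identity</body></html>"
print(classify(page, PHISHING_WORDS, LEGITIMATE_WORDS))
```

No model is trained and the HTML is never parsed: the only work per page is two compression passes, which is where the speed and simplicity come from.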
The project has a significant impact on combating phishing and spam emails and websites. We have applied this algorithm to several phishing websites that are clones of PayPal, Facebook, Microsoft, ING Direct and other popular websites.
Reference Paper: PhishZip: A new Compression-based Algorithm for Detecting Phishing Websites, Rizka Purwanto, Arindam Pal, Alan Blair, and Sanjay Jha, IEEE Conference on Communications and Network Security (CNS2020), Avignon, France.