SHED: Spam Ham Email Dataset

dc.contributor.author: Sharma, Upasana
dc.contributor.author: Khurana, Surinder Singh
dc.date.accessioned: 2018-01-08T05:53:35Z
dc.date.accessioned: 2024-08-14T05:05:37Z
dc.date.available: 2018-01-08T05:53:35Z
dc.date.available: 2024-08-14T05:05:37Z
dc.date.issued: 2017
dc.description.abstract: Automatic filtering of spam emails has become an essential feature of a good email service provider. Organizations and individuals send large volumes of spam email to gain direct or indirect benefits. Such email activity not only distracts the user but also consumes considerable resources, including processing power, memory, and network bandwidth. These unwanted emails also raise security issues, as they may contain malicious content and/or links. Content-based spam filtering is one of the effective approaches used for filtering. However, its efficiency depends on the training set. Most of the existing datasets were collected and prepared long ago, and spammers have been changing their content to evade filters trained on these datasets. In this paper, we introduce the Spam Ham Email Dataset (SHED): a dataset consisting of spam and ham emails. We evaluated the performance of filtering techniques trained on previous datasets and of filtering techniques trained on SHED. We observed that the filtering techniques trained on SHED outperformed those trained on other datasets. Furthermore, we classified the spam emails into various categories. [en_US]
dc.identifier.citation: Sharma, U., & Khurana, S. S. (2016). SHED Ham Email Dataset. International Journal of Advanced Research in Computer Science, 5(6), 1078-1082. [en_US]
dc.identifier.issn: 2321-8169
dc.identifier.uri: https://kr.cup.edu.in/handle/32116/507
dc.language.iso: en_US [en_US]
dc.publisher: International Journal on Recent and Innovation Trends in Computing and Communication ( [en_US]
dc.subject: Spam Email [en_US]
dc.subject: Non-spam emails [en_US]
dc.subject: WEKA [en_US]
dc.subject: feature selection [en_US]
dc.subject: classifiers [en_US]
dc.subject: Parameters [en_US]
dc.title: SHED: Spam Ham Email Dataset [en_US]
dc.type: Article [en_US]

Files

Original bundle

Name: 6.pdf
Size: 400.9 KB
Format: Adobe Portable Document Format
