Hate Speech and Offensive Language Detection in Twitter Data Using Machine Learning Classifiers

dc.contributor.authorShah, Seyed Muzaffar Ahmad
dc.contributor.authorSingh, Satwinder
dc.date.accessioned2024-01-21T10:48:41Z
dc.date.accessioned2024-08-14T05:05:35Z
dc.date.available2024-01-21T10:48:41Z
dc.date.available2024-08-14T05:05:35Z
dc.date.issued2023-05-03T00:00:00
dc.description.abstractSocial media is rapidly growing in popularity and has both advantages and disadvantages. Users posting their daily updates and opinions on social media may inadvertently hurt the feelings of others. Detecting hate speech and harmful information on social media is critical these days, lest it lead to calamity. In this research, machine learning classifiers such as Naïve Bayes, support vector machines, and logistic regression, along with the pre-trained models BERT and RoBERTa, developed by Google and Facebook, respectively, are used to detect hate speech and offensive content in a newly created dataset that included tweets and articles/blogs. The sentiments were obtained using the VADER sentiment analyzer. The results showed that the pre-trained classifiers outperformed the machine learning classifiers utilized in this study. BERT and RoBERTa achieved accuracy scores of 96% and 93%, respectively, on the tweet dataset, and 97% and 98%, respectively, on the dataset of articles/blogs, outperforming the other classifiers used in this work. The results further indicate that neutral content is shared more in articles/blogs, hate content is shared about equally in tweets and articles/blogs, and offensive content is shared more in tweets than in articles/blogs. © 2023, The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.en_US
dc.identifier.doi10.1007/978-981-19-7455-7_17
dc.identifier.isbn9789811974540
dc.identifier.issn23673370
dc.identifier.urihttps://kr.cup.edu.in/handle/32116/3921
dc.identifier.urlhttps://link.springer.com/10.1007/978-981-19-7455-7_17
dc.language.isoen_USen_US
dc.publisherSpringer Science and Business Media Deutschland GmbHen_US
dc.subjectBERTen_US
dc.subjectHate speechen_US
dc.subjectOffensive languageen_US
dc.subjectRoBERTaen_US
dc.subjectTweetsen_US
dc.subjectVADERen_US
dc.titleHate Speech and Offensive Language Detection in Twitter Data Using Machine Learning Classifiersen_US
dc.title.journalLecture Notes in Networks and Systemsen_US
dc.typeConference paperen_US
dc.type.accesstypeClosed Accessen_US
