【Data Science Project】 Wikipedia Toxic Comments Classification with Convolutional Neural Network (CNN)

btd
14 min read · Nov 25, 2023

I. Introduction

Welcome to this hands-on introduction to text classification using 1D convolutions with Keras. By the end of this project, you will be able to apply word embeddings to text classification, use 1D convolutions as feature extractors in natural language processing (NLP), and perform binary text classification with deep learning.

As a case study, we will classify a large number of Wikipedia comments as either toxic or non-toxic, where a toxic comment is one that is rude, disrespectful, or otherwise likely to make someone leave a discussion. This problem is especially important given the conversations the global community and tech companies are having about content moderation, online harassment, and inclusivity. The dataset we will use comes from the Toxic Comment Classification Challenge on Kaggle.
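To make the three ideas from this introduction concrete before moving to Keras, here is a minimal pure-Python sketch of the forward pass such a model performs: an embedding lookup, a 1D convolution with ReLU, global max pooling, and a sigmoid output for the toxic/non-toxic decision. It is only an illustration under stated assumptions: the weights are random rather than trained, the function names (`embed`, `conv1d`, `predict_toxic_probability`) and the tiny sizes are hypothetical, and the token ids stand in for a tokenized comment. In Keras, these steps correspond roughly to the `Embedding`, `Conv1D`, `GlobalMaxPooling1D`, and `Dense` layers.

```python
import math
import random


def embed(token_ids, embedding_table):
    """Word embedding: look up one dense vector per token id."""
    return [embedding_table[t] for t in token_ids]


def conv1d(sequence, kernel, bias=0.0):
    """Slide one filter of shape (width, dim) along the sequence.

    Uses 'valid' padding (no padding) and a ReLU activation, so the
    sequence must be at least as long as the kernel is wide.
    """
    width = len(kernel)
    outputs = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        total = bias
        for vector, kernel_row in zip(window, kernel):
            total += sum(x * w for x, w in zip(vector, kernel_row))
        outputs.append(max(0.0, total))  # ReLU
    return outputs


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_toxic_probability(token_ids, embedding_table, kernels,
                              dense_weights, dense_bias=0.0):
    """Embedding -> 1D convolution -> global max pooling -> sigmoid."""
    vectors = embed(token_ids, embedding_table)
    # Global max pooling keeps the strongest response of each filter.
    pooled = [max(conv1d(vectors, kernel)) for kernel in kernels]
    logit = dense_bias + sum(p * w for p, w in zip(pooled, dense_weights))
    return sigmoid(logit)


# Hypothetical toy sizes and random (untrained) weights, for illustration only.
rng = random.Random(0)
vocab_size, embed_dim, kernel_width, num_filters = 10, 4, 3, 2
embedding_table = [[rng.uniform(-1, 1) for _ in range(embed_dim)]
                   for _ in range(vocab_size)]
kernels = [[[rng.uniform(-1, 1) for _ in range(embed_dim)]
            for _ in range(kernel_width)]
           for _ in range(num_filters)]
dense_weights = [rng.uniform(-1, 1) for _ in range(num_filters)]

comment = [1, 5, 2, 7, 3]  # token ids standing in for a tokenized comment
probability = predict_toxic_probability(
    comment, embedding_table, kernels, dense_weights)
print(f"Predicted probability of 'toxic': {probability:.3f}")
```

Each filter acts as a detector for a short run of consecutive words, and max pooling reports whether that pattern fired anywhere in the comment, which is why 1D convolutions work as feature extractors for text regardless of where a phrase appears.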

II. Load and Explore the Data
