On today's internet, people all around the globe can speak openly through online forums. There are, however, a few kinds of content that one cannot post on an online forum; we call this hate speech.
So this blog gives you info about how to build an AI model that recognises hate speech on your forum.
Before diving into the actual technical aspects, a bit of context: artificial intelligence has taken a big step forward with products such as ChatGPT and DALL-E 2, which use natural language processing and language understanding to produce their results.
The model that we are building also uses language understanding: it reads a piece of text and returns a negativity score, which is used to classify whether the text is hate speech or not.
We will be using the Python library TensorFlow to train our language model.
Importing modules
import matplotlib.pyplot as plt
import os
import re
import shutil
import string
import tensorflow as tf
from tensorflow.keras import layers
from tensorflow.keras import losses
print(tf.__version__)
Downloading dataset
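We will be using the sentiment analysis dataset from Stanford University: a set of IMDB movie reviews, each labelled as a positive or a negative review. Our model therefore learns to score how negative a text is, and that negativity score is the signal we use for flagging posts.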
url = "https://ai.stanford.edu/~amaas/data/sentiment/aclImdb_v1.tar.gz"
# download the archive into the current folder and unpack it
dataset = tf.keras.utils.get_file("aclImdb_v1", url, untar=True, cache_dir='.', cache_subdir='')
dataset_dir = os.path.join(os.path.dirname(dataset), 'aclImdb')
train_dir = os.path.join(dataset_dir, 'train')
os.listdir(train_dir)
# unsup holds unlabelled reviews; remove it so only the pos and neg folders are read as classes
remove_dir = os.path.join(train_dir, 'unsup')
shutil.rmtree(remove_dir)
os.listdir(dataset_dir)
batch_size = 32
seed = 42
# use 80% of aclImdb/train for training; the other 20% is kept for validation
raw_train_ds = tf.keras.utils.text_dataset_from_directory(
    'aclImdb/train', batch_size=batch_size, validation_split=0.2, subset='training', seed=seed
)
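The above code downloads and unpacks the dataset, removes the unlabelled unsup folder, and loads 80% of the reviews in the train folder as the training set. The remaining 20% are held back for validation.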
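It helps to see how the labels come out of the folder layout. text_dataset_from_directory treats every subfolder as one class and, by default, orders the class names alphanumerically, so neg becomes label 0 and pos becomes label 1. The sketch below mimics that rule in plain Python on a throwaway folder. Note that infer_class_names is a hypothetical helper written for this illustration, not a Keras function.

```python
import os
import tempfile


def infer_class_names(directory):
    """Hypothetical helper: one class per subfolder, sorted by name."""
    return sorted(
        name
        for name in os.listdir(directory)
        if os.path.isdir(os.path.join(directory, name))
    )


# Build a throwaway folder that looks like aclImdb/train before the cleanup.
with tempfile.TemporaryDirectory() as root:
    for name in ("pos", "neg", "unsup"):
        os.makedirs(os.path.join(root, name))
    before_cleanup = infer_class_names(root)

    os.rmdir(os.path.join(root, "unsup"))  # same idea as shutil.rmtree above
    after_cleanup = infer_class_names(root)

print(before_cleanup)  # the unlabelled folder would count as a third class
print(after_cleanup)   # index 0 is 'neg', index 1 is 'pos'
```

This is why the unsup folder is deleted before loading: left in place, it would be read as a third class.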
Preprocessing the data
You can look at a few reviews and their labels using the code below.
for text_batch, label_batch in raw_train_ds.take(1):
    for i in range(3):
        print("Review", text_batch.numpy()[i])
        print("Label", label_batch.numpy()[i])
Next we load the validation and test datasets into variables.
raw_val_ds = tf.keras.utils.text_dataset_from_directory(
    'aclImdb/train',
    batch_size=batch_size,
    validation_split=0.2,
    subset='validation',
    seed=seed
)
raw_test_ds = tf.keras.utils.text_dataset_from_directory(
    'aclImdb/test',
    batch_size=batch_size
)
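Both calls on aclImdb/train pass the same seed and the same validation_split, and that matters: the files are shuffled with the seed before they are split, so matching seeds keep the training and validation subsets from overlapping. Here is the idea as a small sketch in plain Python. The function split_indices is a hypothetical stand-in for illustration, not the code Keras runs internally.

```python
import random


def split_indices(n_items, validation_split, subset, seed):
    """Hypothetical stand-in for a seeded train/validation split."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)  # same seed gives the same order
    cut = n_items - int(n_items * validation_split)
    return indices[:cut] if subset == "training" else indices[cut:]


train_idx = split_indices(10, 0.2, "training", seed=42)
val_idx = split_indices(10, 0.2, "validation", seed=42)

print(len(train_idx), len(val_idx))        # sizes of the two subsets
print(set(train_idx).isdisjoint(val_idx))  # no item lands in both
```

With the 25,000 labelled reviews in the train folder, a split of 0.2 gives 20,000 reviews for training and 5,000 for validation.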
Because the reviews were taken from web pages, they still contain HTML line-break tags (<br />) and punctuation. So we define a custom standardization function that lowercases the text, replaces the <br /> tags with spaces and strips the punctuation. We then pass this function to a vectorization layer so that the same preprocessing can be used in our model.
def custom_standardisation(input_data):
    lowercase = tf.strings.lower(input_data)
    stripped_html = tf.strings.regex_replace(lowercase, '<br />', ' ')
    return tf.strings.regex_replace(
        stripped_html, '[%s]' % re.escape(string.punctuation), '')
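max_features = 10000
sequence_length = 250
# creating a layer that applies custom standardization
vectorize_layer = layers.TextVectorization(
    standardize=custom_standardisation, max_tokens=max_features,
    output_mode='int',
    output_sequence_length=sequence_length
)
# build the vocabulary from the training text only (the labels are dropped)
train_text = raw_train_ds.map(lambda x, y: x)
vectorize_layer.adapt(train_text)
# helper that turns a (text, label) pair into (integer sequence, label)
def vectorize_text(text, label):
    text = tf.expand_dims(text, -1)
    return vectorize_layer(text), label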
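If you want to check what custom_standardisation does to a review without running TensorFlow, the same three steps can be written in plain Python. This is only a sketch for experimenting: standardise_plain is a hypothetical helper and the sample sentence is made up.

```python
import re
import string


def standardise_plain(text):
    """Plain-Python mirror of the three standardization steps."""
    lowered = text.lower()
    no_breaks = lowered.replace("<br />", " ")
    # re.escape makes every punctuation character safe inside [...]
    return re.sub("[%s]" % re.escape(string.punctuation), "", no_breaks)


sample = "Great movie!<br />Loved it, 10/10."
print(standardise_plain(sample))  # great movie loved it 1010
```

Notice that only ASCII punctuation is stripped, so other symbols such as emojis pass through unchanged.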
You can view the output of the vectorization layer using the code below (optional).
# take one batch and pick its first review and label
text_batch, label_batch = next(iter(raw_train_ds))
first_review, first_label = text_batch[0], label_batch[0]
print("Review", first_review)
print("label", raw_train_ds.class_names[first_label])
print("vectorized Review", vectorize_text(first_review, first_label))
print("1287 ----->", vectorize_layer.get_vocabulary()[1287])
print(" 313 ---->", vectorize_layer.get_vocabulary()[313])
print("vocabulary size:{}".format(len(vectorize_layer.get_vocabulary())))
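To make the integer output less mysterious, here is a rough sketch of the encoding idea in plain Python. In 'int' mode the layer keeps index 0 for padding and index 1 for words outside the vocabulary, and gives the remaining indices to the most frequent words. The helpers build_vocab and encode and the tiny corpus are made up for this illustration and simplify what the real layer does.

```python
from collections import Counter


def build_vocab(texts, max_tokens):
    """Map the most frequent words to ids, keeping 0 and 1 reserved."""
    counts = Counter(word for text in texts for word in text.split())
    frequent = [word for word, _ in counts.most_common(max_tokens - 2)]
    return {word: index + 2 for index, word in enumerate(frequent)}


def encode(text, vocab, sequence_length):
    """Turn text into ids: 1 for unknown words, 0 for padding."""
    ids = [vocab.get(word, 1) for word in text.split()][:sequence_length]
    return ids + [0] * (sequence_length - len(ids))


corpus = ["good film", "bad film", "good good acting"]
vocab = build_vocab(corpus, max_tokens=4)

print(vocab)
print(encode("good film unseen", vocab, sequence_length=5))  # [2, 3, 1, 0, 0]
```

The same thing happens in our layer at a larger scale: at most 10000 vocabulary entries, and every review cut or padded to 250 ids.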
Defining the model
Now we define our actual model. First we apply the vectorization to all three datasets, so that the model receives standardized integer sequences as input, and then we build the network itself.
train_ds = raw_train_ds.map(vectorize_text)
val_ds = raw_val_ds.map(vectorize_text)
test_ds=raw_test_ds.map(vectorize_text)
AUTOTUNE= tf.data.AUTOTUNE
train_ds=train_ds.cache().prefetch(buffer_size=AUTOTUNE)
val_ds=val_ds.cache().prefetch(buffer_size=AUTOTUNE)
test_ds = test_ds.cache().prefetch(buffer_size=AUTOTUNE)
# building the model with custom layer
embedding = 16
model = tf.keras.Sequential([
    # vectorize_layer,
    layers.Embedding(max_features + 1, embedding),
    layers.Dropout(0.2),
    layers.GlobalAvgPool1D(),
    layers.Dropout(0.2),
    layers.Dense(1),
    # layers.Activation('sigmoid')
])
model.summary()
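To get a feel for what the Embedding, GlobalAvgPool1D and Dense layers do together, here is a tiny sketch in plain Python. Every number in it (the 2-dimensional embedding table, the dense weights and the bias) is made up for illustration; the real model learns 16-dimensional vectors during training.

```python
def mean_pool(vectors):
    """Average a list of equal-length vectors position by position."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


# Made-up 2-dimensional embedding table: token id -> vector.
# In the real model every row, including the one for padding id 0, is learned.
embedding_table = {
    0: [0.0, 0.0],
    2: [0.5, 1.0],
    3: [-0.5, 0.0],
}

token_ids = [2, 3, 0, 0]  # a short review padded to length 4
embedded = [embedding_table[token_id] for token_id in token_ids]
pooled = mean_pool(embedded)  # one vector for the whole review

# Made-up Dense(1) weights and bias that turn the pooled vector into one logit.
weights, bias = [2.0, 4.0], -0.5
logit = sum(w * x for w, x in zip(weights, pooled)) + bias

print(pooled)  # [0.0, 0.25]
print(logit)   # 0.5
```

Averaging over the sequence is what lets the model handle reviews of any length with one fixed-size vector.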
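We compile the model with a loss function, an optimizer and an accuracy metric, and then train it for a fixed number of epochs.
model.compile(
    optimizer='adam',
    loss=losses.BinaryCrossentropy(from_logits=True),
    # the model outputs raw logits, so accuracy is thresholded at 0.0 instead of 0.5
    metrics=[tf.keras.metrics.BinaryAccuracy(threshold=0.0)])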
epochs=10
history= model.fit(
train_ds,
validation_data=val_ds,
epochs=epochs
)
loss , accuracy= model.evaluate(test_ds)
print("loss: ",loss)
print("Accuracy: ",accuracy)
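The loss printed here is the binary cross-entropy computed from the raw output (the logit) of the last Dense layer, which is what from_logits=True asks for. As a sketch of what that means for a single example, here is the numerically stable form of the formula in plain Python. The function name bce_from_logit and the sample values are chosen just for this illustration.

```python
import math


def bce_from_logit(logit, label):
    """Binary cross-entropy for one example, computed from a raw logit."""
    # Stable form of -[y*log(sigmoid(x)) + (1-y)*log(1-sigmoid(x))]
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))


print(bce_from_logit(0.0, 1))  # undecided model: loss is log(2)
print(bce_from_logit(5.0, 1))  # confident and right: small loss
print(bce_from_logit(5.0, 0))  # confident and wrong: large loss
```

The model is punished most when it is confident and wrong, which is what pushes the logits in the right direction during training.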
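We will now wrap the trained model into a final model that includes the vectorization layer and a sigmoid activation, so that it accepts raw text and outputs a probability between 0 and 1.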
final_model = tf.keras.Sequential([
vectorize_layer,
model,
layers.Activation('sigmoid')
])
final_model.compile(loss=losses.BinaryCrossentropy(from_logits=False),optimizer='adam',metrics=['accuracy'])
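The sigmoid activation is what turns the raw logit of the inner model into a probability. A small sketch in plain Python shows the mapping, and also that a logit of 0 sits exactly at a probability of 0.5, so thresholding logits at 0 and probabilities at 0.5 give the same decision. The sigmoid function below is written out by hand for illustration.

```python
import math


def sigmoid(logit):
    """Numerically stable sigmoid: maps any logit into the range 0 to 1."""
    if logit >= 0:
        return 1.0 / (1.0 + math.exp(-logit))
    z = math.exp(logit)
    return z / (1.0 + z)


for logit in (-3.0, 0.0, 3.0):
    probability = sigmoid(logit)
    same_decision = (logit > 0.0) == (probability > 0.5)
    print(logit, round(probability, 3), same_decision)
```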
Hurray, you have created a model that takes raw text and scores it automatically, which automates the process of detecting hate speech.
You can plot the learning curves of the model using the code below.
history_dict = history.history
history_dict.keys()
acc = history_dict['binary_accuracy']
val_acc = history_dict['val_binary_accuracy']
loss = history_dict['loss']
val_loss = history_dict['val_loss']
epochs = range(1, len(acc) + 1)
# bo for blue dot
plt.plot(epochs, loss, 'bo', label='Training loss')
# b for blue solid line
plt.plot(epochs, val_loss, 'b', label='Validation loss')
plt.xlabel('Epochs')
plt.ylabel('Loss')
plt.legend()
plt.show()
# bo for blue dot
plt.plot(epochs, acc, 'bo', label='Training accuracy')
# b for blue solid line
plt.plot(epochs, val_acc, 'b', label='Validation accuracy')
plt.xlabel('Epochs')
plt.ylabel('Accuracy')
plt.legend()
plt.show()
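One thing worth reading off the loss curve is the epoch where the validation loss is lowest, because training much longer than that usually means the model is starting to overfit. Here is a minimal sketch for finding that epoch. The helper best_epoch is hypothetical, and the list of numbers is invented purely to show the call; it is not the result of a real training run.

```python
def best_epoch(val_losses):
    """Return the 1-based epoch with the lowest validation loss."""
    best_index = min(range(len(val_losses)), key=val_losses.__getitem__)
    return best_index + 1


# Invented values, only to demonstrate the helper.
example_val_loss = [0.60, 0.45, 0.38, 0.36, 0.37, 0.40]
print(best_epoch(example_val_loss))  # 4
```

On your own run you would call it as best_epoch(val_loss), using the val_loss list taken from the training history.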
User Interface
In this blog we build a very basic user interface that asks the user for a piece of text and returns its negativity score. The code below is the implementation of the same.
examples = []
q = input("enter a review: \n")
examples.append(q)
# the model outputs the probability of the positive class (label 1),
# so the negativity score is one minus that probability
p = final_model.predict(tf.constant(examples))
print('negativity score', end=' ')
print(1 - p[-1])
You can go on to deploy the model on your website, so that it sends you an alert whenever it detects hate speech on your forum.
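As a rough idea of what such an alert could look like, here is a small sketch in plain Python that takes posts with their negativity scores and returns the ones to flag. Everything in it is an assumption for illustration: the function names, the sample posts and scores, and especially the threshold of 0.8, which you would tune on your own forum data.

```python
def should_alert(negativity_score, threshold=0.8):
    """Hypothetical rule: flag a post when its score reaches the threshold."""
    return negativity_score >= threshold


def posts_to_review(scored_posts, threshold=0.8):
    """Return the posts whose negativity score triggers an alert."""
    return [post for post, score in scored_posts if should_alert(score, threshold)]


# Made-up posts and scores; in practice the scores come from final_model.
scored_posts = [("post A", 0.95), ("post B", 0.10), ("post C", 0.80)]
print(posts_to_review(scored_posts))  # ['post A', 'post C']
```

A lower threshold catches more posts but also raises more false alarms, so it is worth having a human check the flagged posts.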
Thank you for reading the blog.
For more informative blogs like this one, do follow.
Share it with your friends and family.😁