Methodology

Computer Information

Experiments were performed on Ubuntu 20.04 running under Windows Subsystem for Linux on an 8-core, 16-thread AMD Ryzen 3700X with an NVIDIA RTX 2070 SUPER, with virtual memory usage capped at 32 GB. This memory limit was quickly reached when analyzing full article content, so future experimentation on a dedicated virtual machine or on upgraded hardware should be carried out to extend the methods in this paper.

Software

Python 3.7

Linear Algebra and Data: NumPy, pandas

Neural Network Architecture: Keras with the TensorFlow backend

Statistics, Classification, and Evaluation: scikit-learn

Other Packages: re, time
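
For reference, a minimal import block covering the packages listed above might look like the sketch below. The specific scikit-learn utilities shown are illustrative assumptions; only the package itself is named in the text, and no versions beyond Python 3.7 are specified.

```python
import re
import time

import numpy as np
import pandas as pd
import tensorflow as tf
from tensorflow import keras
from sklearn.model_selection import train_test_split   # assumed utility; not named in the text
from sklearn.metrics import classification_report      # assumed utility; not named in the text
```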

Preprocessing

For experiments where the input length was constrained, all titles between 30 and 96 characters in length (inclusive) were retained, and all others were discarded. For every experiment the data were balanced by finding the greatest multiple of 100 less than or equal to the size of the smaller class and capping the number of samples in each class at that value. Directional quotation marks were then normalized to standard nondirectional single and double quotes, and every non-whitespace character in the chosen character set was assigned an integer ID, with characters outside the set mapped to a single UNK ID. Because the size of this ID table depends on the character set used, the lowercase-only and case-sensitive configurations do not share the same ID assignments. Finally, each string of IDs was padded with 0s to a uniform length so that the model could accept it as input.
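
The following sketch illustrates this pipeline under stated assumptions: a hypothetical lowercase character set, the 123-character input length from Table 2, and helper names chosen here for illustration. It is not the exact code used in the experiments.

```python
import re
from tensorflow.keras.preprocessing.sequence import pad_sequences

# Hypothetical lowercase character set; the actual set is an experimental parameter.
CHARSET = "abcdefghijklmnopqrstuvwxyz0123456789.,;:!?'\"-()"
CHAR_TO_ID = {c: i + 1 for i, c in enumerate(CHARSET)}   # IDs 1..N; 0 is reserved for padding
UNK_ID = len(CHARSET) + 1                                # characters outside the set map to UNK

def clean_quotes(title):
    """Normalize directional quotes to standard nondirectional quotes."""
    title = re.sub(r"[\u2018\u2019]", "'", title)
    title = re.sub(r"[\u201c\u201d]", '"', title)
    return title

def encode(title, max_len):
    """Map each non-whitespace character to its ID (lowercase configuration shown)."""
    ids = [CHAR_TO_ID.get(c, UNK_ID)
           for c in clean_quotes(title.lower()) if not c.isspace()]
    return ids[:max_len]

def balance(class_a, class_b):
    """Cap both classes at the greatest multiple of 100 <= the smaller class size."""
    cap = (min(len(class_a), len(class_b)) // 100) * 100
    return class_a[:cap], class_b[:cap]

# Keep titles of 30-96 characters, encode them, and zero-pad to a uniform input length.
titles = [t for t in ["example headline about a topic of at least thirty chars"]
          if 30 <= len(t) <= 96]
X = pad_sequences([encode(t, 123) for t in titles], maxlen=123, padding="post", value=0)
```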

Classification

Experimentation was performed initially using a classifier with internals nearly identical to those of Zhang, Zhao, and LeCun (2016). From there, we varied the input length, character set, and number of nodes in the fully-connected layers. The activation layers are governed by the rectified linear unit, \(\mathrm{ReLU}(x)=\max\{0,x\}\), and the training batch size was set to 64 samples; a sketch of one such configuration is given after Table 2.

Table 1: Table of constant parameters. Key: L = lowercase, U = uppercase; C = C1D = convolution, M = MP = max pooling, D = FC = fully connected layer.
Layer Order                Conv1D Filters    Kernel Sizes    Max Pooling Factor
C-M-C-M-C-C-C-C-M-D-D      256               6-6-2-2-2-2     3
Table 2: Comparative table of models tested; all models share the constant parameters of Table 1.
Samples    Case    Input Length    FC Size    Parameters
42800      L       339             1024       4786143
42800      U       339             1024       4837025
42800      L       339             256        2031327
42800      U       339             256        2082209
29800      L       123             256        1507039
29800      U       123             256        1557921
29800      L       123             64         1395871
29800      U       123             64         1446753
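
As an illustration, one of the Table 2 configurations (lowercase, input length 123, FC size 64) could be assembled in Keras from the Table 1 constants roughly as follows. The character encoding, vocabulary size, output layer, optimizer, and loss shown here are assumptions not specified above, so the resulting parameter count will not necessarily match Table 2.

```python
from tensorflow.keras import layers, models

VOCAB_SIZE = 47   # assumed size of the lowercase character set (not specified above)
INPUT_LEN = 123   # input length from Table 2
FC_SIZE = 64      # fully-connected layer size from Table 2
KERNELS = [6, 6, 2, 2, 2, 2]   # kernel sizes from Table 1
POOL_AFTER = {0, 1, 5}         # pool after conv layers 1, 2, and 6 per the C-M-C-M-C-C-C-C-M order

def build_model():
    inputs = layers.Input(shape=(INPUT_LEN,))
    # Character embedding stands in here; the exact character encoding is not specified above.
    x = layers.Embedding(VOCAB_SIZE + 2, VOCAB_SIZE + 2)(inputs)
    for i, k in enumerate(KERNELS):
        x = layers.Conv1D(256, k, activation="relu")(x)    # 256 filters per Table 1
        if i in POOL_AFTER:
            x = layers.MaxPooling1D(3)(x)                  # pooling factor 3 per Table 1
    x = layers.Flatten()(x)
    x = layers.Dense(FC_SIZE, activation="relu")(x)        # first fully-connected layer (D)
    outputs = layers.Dense(1, activation="sigmoid")(x)     # second fully-connected layer (D); binary output assumed
    return models.Model(inputs, outputs)

model = build_model()
# Adam and binary cross-entropy are assumptions; only the batch size of 64 is stated in the text.
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
# model.fit(X_train, y_train, batch_size=64, epochs=...)
```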