Empirical Study on the Effect of Zero-Padding in Text Classification with CNN

Henry Cheng

MAS, 2020

Wu, Yingnian

In CNN-based text classification, where a CNN model is trained on top of pre-trained word vectors, padding is applied so that every input has the same dimensions, which the CNN architecture requires. There is no established rule for how padding should be applied; conventionally, it is added to the bottom of the text to reach a uniform length. Borrowing an idea from computer vision, we show that there is no significant difference between applying zero-padding to the bottom of the text embeddings and applying it to both sides of the text embeddings.
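The two padding schemes compared in the abstract can be illustrated with a minimal NumPy sketch. It is not the thesis implementation: the function names `pad_bottom` and `pad_both_sides`, the (tokens x embedding dimension) matrix layout, and the choice to split the padding as evenly as possible (with any odd leftover row going to the bottom) are all assumptions made for illustration.

```python
import numpy as np


def pad_bottom(embedded, max_len):
    """Append all-zero rows after the last token (hypothetical helper)."""
    extra = max_len - embedded.shape[0]
    if extra < 0:
        raise ValueError("text is longer than max_len")
    # Pad only the token axis; the embedding axis is left unchanged.
    return np.pad(embedded, ((0, extra), (0, 0)), mode="constant")


def pad_both_sides(embedded, max_len):
    """Place zero rows before and after the text (hypothetical helper).

    Assumption: padding is split as evenly as possible, and an odd
    leftover row goes to the bottom.
    """
    extra = max_len - embedded.shape[0]
    if extra < 0:
        raise ValueError("text is longer than max_len")
    top = extra // 2
    bottom = extra - top
    return np.pad(embedded, ((top, bottom), (0, 0)), mode="constant")


# A toy "text" of 3 tokens with 2-dimensional word vectors.
text = np.arange(1, 7, dtype=float).reshape(3, 2)
bottom_padded = pad_bottom(text, max_len=6)
centered = pad_both_sides(text, max_len=6)
```

Both functions return a matrix of shape (6, 2) here; they differ only in where the original token rows sit relative to the zero rows, which is the variable the study examines.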

2020
