Incorporating Pre-Training in Long Short-Term Memory Networks for Tweets Classification
Training, Logistics, Data models, Twitter, Semantics, Tagging, Logic gates, Artificial intelligence, Pattern classification, Recurrent neural nets, Regression analysis, Social networking, LSTM, Tweets classification, Pre-training, Deep learning, Long short-term memory networks, Binary classification, Long-term dependencies, Semantic tweet representation, Logistic regression, Tweet label, LSTM-TC model, Well-labeled training data, Weakly-labeled data, Hashtag information, Tweet representation, Logistic regression classifier, Weakly-labeled tweets
This paper presents deep learning models for binary classification of tweets. Our approach is based on the Long Short-Term Memory (LSTM) recurrent neural network and is therefore expected to capture long-term dependencies among words. We develop two models for tweet classification. The basic model, called LSTM-TC, takes word embeddings as input, uses an LSTM layer to derive a semantic tweet representation, and applies logistic regression to predict the tweet label. Like other deep learning models, the basic LSTM-TC model requires a large amount of well-labeled training data to achieve good performance. To address this challenge, we further develop an improved model, called LSTM-TC*, that incorporates a large amount of weakly-labeled data for classifying tweets. We present two approaches to constructing the weakly-labeled data. One is based on hashtag information, and the other is based on the prediction output of a traditional classifier that does not need a large amount of well-labeled training data. Our LSTM-TC* model first learns the tweet representation from the weakly-labeled data, and then trains the logistic regression classifier on the small amount of well-labeled data. Experimental results show that: (1) the proposed method can be successfully used for tweet classification and outperforms existing state-of-the-art methods, and (2) pre-training the tweet representation on weakly-labeled tweets can significantly improve the accuracy of tweet classification.
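The pipeline described for the basic model (word embeddings, an LSTM layer, then logistic regression) can be illustrated with a minimal forward-pass sketch. This is not the authors' implementation: the class name `LSTMTweetClassifier`, the dimensions, the random initialization, the skipping of out-of-vocabulary tokens, and the use of the final hidden state as the tweet representation are all assumptions made for illustration, and no training code is shown.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    # Multiply a matrix (list of rows) by a vector.
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def init_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class LSTMTweetClassifier:
    """Hypothetical forward-only sketch: embeddings -> LSTM -> logistic regression."""

    def __init__(self, vocab, embed_dim=4, hidden_dim=3, seed=0):
        rng = random.Random(seed)
        self.vocab = {word: idx for idx, word in enumerate(vocab)}
        self.hidden_dim = hidden_dim
        self.embeddings = init_matrix(len(vocab), embed_dim, rng)
        # One weight matrix and bias per gate:
        # i = input, f = forget, o = output, g = candidate cell state.
        self.W = {k: init_matrix(hidden_dim, embed_dim + hidden_dim, rng) for k in "ifog"}
        self.b = {k: [0.0] * hidden_dim for k in "ifog"}
        # Logistic regression layer applied to the tweet representation.
        self.w_out = [rng.uniform(-0.1, 0.1) for _ in range(hidden_dim)]
        self.b_out = 0.0

    def represent(self, tokens):
        """Run the LSTM over the tokens and return the final hidden state."""
        h = [0.0] * self.hidden_dim
        c = [0.0] * self.hidden_dim
        for tok in tokens:
            if tok not in self.vocab:
                continue  # assumption: out-of-vocabulary tokens are skipped
            z = self.embeddings[self.vocab[tok]] + h  # concatenate [x_t; h_{t-1}]
            pre = {k: [a + b for a, b in zip(matvec(self.W[k], z), self.b[k])] for k in "ifog"}
            i_gate = [sigmoid(v) for v in pre["i"]]
            f_gate = [sigmoid(v) for v in pre["f"]]
            o_gate = [sigmoid(v) for v in pre["o"]]
            cand = [math.tanh(v) for v in pre["g"]]
            c = [f * c_prev + i * g for f, c_prev, i, g in zip(f_gate, c, i_gate, cand)]
            h = [o * math.tanh(c_new) for o, c_new in zip(o_gate, c)]
        return h

    def predict_proba(self, tokens):
        """Logistic regression on the tweet representation: P(label = 1)."""
        h = self.represent(tokens)
        return sigmoid(sum(w * x for w, x in zip(self.w_out, h)) + self.b_out)
```

Under the two-stage scheme described for LSTM-TC*, the embedding and LSTM parameters would be fitted first on the weakly-labeled tweets, after which only the logistic regression parameters (`w_out` and `b_out` in this sketch) would be fitted on the small well-labeled set.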
Yuan, S., Wu, X., & Xiang, Y. (2016). Incorporating pre-training in long short-term memory networks for tweets classification. doi:10.1109/ICDM.2016.0181