Computer Science and Computer Engineering Faculty Publications and Presentations

Wikipedia Vandal Early Detection: From User Behavior to User Embedding

Shuhan Yuan, University of Arkansas, Fayetteville
Panpan Zheng, University of Arkansas, Fayetteville
Xintao Wu, University of Arkansas, Fayetteville
Yang Xiang, University of Arkansas, Fayetteville

Document Type

Conference Proceeding

Publication Date

2017

Keywords

User embedding, US edition, vandalism detection, benign users, state reverts

Abstract

Wikipedia is the largest online encyclopedia that allows anyone to edit articles. In this paper, we propose the use of deep learning to detect vandals based on their edit history. In particular, we develop a multi-source long short-term memory network (M-LSTM) to model user behavior, using a variety of user edit aspects as inputs, including the history of edit reversion information, edit page titles, and categories. With M-LSTM, we can encode each user into a low-dimensional real vector, called a user embedding. Meanwhile, as a sequential model, M-LSTM updates the user embedding each time the user commits a new edit. Thus, we can dynamically predict whether a user is benign or a vandal based on the up-to-date user embedding. Furthermore, these user embeddings are crucial for discovering collaborative vandals. Code and data related to this chapter are available at: https://bitbucket.org/bookcold/vandal_detection.
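
The idea of updating a user embedding after every edit can be illustrated with a small sketch. This is not the authors' M-LSTM: it is a standard LSTM cell with random, untrained weights, and it assumes the per-edit sources (a reversion flag, a title vector, a category vector) are simply concatenated into one input. The names TinyLSTMCell, UserEmbeddingTracker, observe_edit, and vandal_probability are hypothetical, as are all dimensions and feature values.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    # Multiply matrix W (list of rows) by vector v.
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class TinyLSTMCell:
    """Standard LSTM cell with random (untrained) weights, for illustration."""

    def __init__(self, input_dim, hidden_dim, rng):
        # One weight matrix and bias per gate, acting on [x; h_prev].
        # Gates: i = input, f = forget, o = output, c = candidate cell state.
        self.W = {g: rand_matrix(hidden_dim, input_dim + hidden_dim, rng) for g in "ifoc"}
        self.b = {g: [0.0] * hidden_dim for g in "ifoc"}

    def step(self, x, h_prev, c_prev):
        z = x + h_prev  # list concatenation of input and previous hidden state
        pre = {g: [a + b for a, b in zip(matvec(self.W[g], z), self.b[g])] for g in "ifoc"}
        i = [sigmoid(v) for v in pre["i"]]
        f = [sigmoid(v) for v in pre["f"]]
        o = [sigmoid(v) for v in pre["o"]]
        c_tilde = [math.tanh(v) for v in pre["c"]]
        c = [fv * cp + iv * ct for fv, cp, iv, ct in zip(f, c_prev, i, c_tilde)]
        h = [ov * math.tanh(cv) for ov, cv in zip(o, c)]
        return h, c


class UserEmbeddingTracker:
    """Keeps one user's embedding (the LSTM hidden state) up to date."""

    def __init__(self, source_dims, embed_dim, seed=0):
        rng = random.Random(seed)
        # Assumption: sources are concatenated, so input size is their sum.
        self.cell = TinyLSTMCell(sum(source_dims), embed_dim, rng)
        self.w_out = [rng.uniform(-0.1, 0.1) for _ in range(embed_dim)]
        self.h = [0.0] * embed_dim  # user embedding
        self.c = [0.0] * embed_dim  # LSTM cell state

    def observe_edit(self, sources):
        # Flatten the per-source feature vectors of one edit into one input.
        x = [v for src in sources for v in src]
        self.h, self.c = self.cell.step(x, self.h, self.c)
        return self.h

    def vandal_probability(self):
        # Logistic read-out on the current embedding (untrained here).
        return sigmoid(sum(w * h for w, h in zip(self.w_out, self.h)))


# Hypothetical edits: (reversion flag, title vector, category vector).
edits = [
    ([1.0], [0.2, 0.0, 0.5], [1.0, 0.0]),
    ([0.0], [0.1, 0.7, 0.0], [0.0, 1.0]),
]
tracker = UserEmbeddingTracker(source_dims=[1, 3, 2], embed_dim=4)
for reverted, title_vec, category_vec in edits:
    embedding = tracker.observe_edit([reverted, title_vec, category_vec])
    print(len(embedding), tracker.vandal_probability())
```

Because the embedding is just the recurrent state, a prediction is available after any edit rather than only at the end of the history, which is what makes early detection possible. In a real system the weights would be learned from labeled edit histories; here they are random, so the printed probabilities carry no meaning.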

Comments

Principal Investigator: Xintao Wu

Acknowledgements: The authors acknowledge the support from the 973 Program of China (2014CB340404), the National Natural Science Foundation of China (71571136), and the Research Projects of Science and Technology Commission of Shanghai Municipality (16JC1403000, 14511108002) to Shuhan Yuan and Yang Xiang, and from the National Science Foundation (1564250) to Panpan Zheng and Xintao Wu. This research was conducted while Shuhan Yuan visited the University of Arkansas.

Citation

Yuan S., Zheng P., Wu X., Xiang Y. (2017) Wikipedia Vandal Early Detection: From User Behavior to User Embedding. In: Ceci M., Hollmén J., Todorovski L., Vens C., Džeroski S. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2017. Lecture Notes in Computer Science, vol 10534. doi: https://doi.org/10.1007/978-3-319-71249-9_50

