Founder: @Coder-Yu
Main Contributors: @somnussyq @hustzhoutian @DouTong
Code Reviewer: @mingaoo
SDLib: A Python library (Python 2.7.x) that collects shilling attack detection methods for academic research.
- 1.Configure the **xx.conf** file in the directory named config. (xx is the name of the method you want to run)
- 2.Run **main.py** in the project, then enter input as prompted.
Entry | Example | Description |
---|---|---|
ratings | ../dataset/averageattack/ratings.txt | Set the path to the dirty recommendation dataset. Format: the fields in each row are separated by a space, tab, or comma. |
label | ../dataset/averageattack/labels.txt | Set the path to the labels (for users). Format: the fields in each row are separated by a space, tab, or comma. |
ratings.setup | -columns 0 1 2 | -columns: the (user, item, rating) columns of the rating data that are used; -header: skip the first (header) line when reading data. |
MethodName | DegreeSAD/PCASelect/etc. | The name of the detection method |
evaluation.setup | -testSet ../dataset/testset.txt | Main options: -testSet, -ap, -cv. -testSet path/to/test/file: specify the test set manually. -ap ratio: automatically partition the user set (including items and ratings) into a training set and a test set, where ratio is the share assigned to the test set, e.g. -ap 0.2. -cv k: cross validation with k folds, e.g. -cv 5. |
output.setup | on -dir ./Results/ | Main option (e.g. on): whether to output the results. -dir path: the directory where the results are written. |
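
To make the `ratings.setup` and `-ap` entries above more concrete, here is a minimal sketch of how such settings could be interpreted. It is not SDLib's actual loader: the helper names `parse_options`, `read_ratings`, and `split_users` are hypothetical, and the flag-parsing rules are an assumption based only on the table above.

```python
import random
import re


def parse_options(setup):
    """Turn a string such as '-columns 0 1 2 -header' into a dict.

    Each '-flag' maps to the list of tokens that follow it (possibly empty).
    """
    options = {}
    current = None
    for token in setup.split():
        if re.match(r'-[A-Za-z]', token):
            current = token
            options[current] = []
        elif current is not None:
            options[current].append(token)
    return options


def read_ratings(lines, setup='-columns 0 1 2'):
    """Parse rating rows whose fields are separated by space, tab, or comma."""
    options = parse_options(setup)
    columns = [int(c) for c in options.get('-columns', ['0', '1', '2'])]
    rows = list(lines)
    if '-header' in options:
        rows = rows[1:]  # skip the header line
    records = []
    for row in rows:
        fields = re.split(r'[,\t ]+', row.strip())
        if len(fields) <= max(columns):
            continue  # skip blank or too-short rows
        user, item, rating = [fields[c] for c in columns]
        records.append((user, item, float(rating)))
    return records


def split_users(records, test_ratio, seed=0):
    """Assumed meaning of '-ap ratio': hold out a share of the users for testing."""
    users = sorted(set(user for user, _, _ in records))
    random.Random(seed).shuffle(users)
    n_test = int(round(len(users) * test_ratio))
    test_users = set(users[:n_test])
    train = [r for r in records if r[0] not in test_users]
    test = [r for r in records if r[0] in test_users]
    return train, test


sample = ['user,item,rating', 'u1,i1,4', 'u2\ti1\t5.0', 'u3 i2 1']
records = read_ratings(sample, '-columns 0 1 2 -header')
train, test = split_users(records, 0.34)
```

The split is done per user rather than per rating because the labels in a detection task belong to users; this is a design assumption of the sketch, not a statement about the library's behavior.
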
- 1.Make your new algorithm inherit from the proper base class.
- 2.Override the following functions as needed.
- printAlgorConfig()
- initModel()
- buildModel()
- saveModel()
- loadModel()
- predict()
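
The extension steps above can be sketched as follows. This is only an illustration under stated assumptions: `DetectorBase` is a hypothetical stand-in written for this example (not the real SDLib base class), the call order in `execute()` is assumed, and `ExtremeRatingDetector` is a toy rule rather than one of the published methods. Only the overridden method names come from the list above.

```python
class DetectorBase(object):
    """Hypothetical stand-in for a base class; not SDLib's real one."""

    def __init__(self, profiles):
        # profiles: dict mapping user id -> dict of item id -> rating
        self.profiles = profiles

    def printAlgorConfig(self):
        print('Algorithm: ' + self.__class__.__name__)

    def initModel(self):
        pass

    def buildModel(self):
        pass

    def saveModel(self):
        pass

    def loadModel(self):
        pass

    def predict(self):
        raise NotImplementedError

    def execute(self):
        # Assumed call order; the real library may differ.
        self.printAlgorConfig()
        self.initModel()
        self.buildModel()
        return self.predict()


class ExtremeRatingDetector(DetectorBase):
    """Toy detector: flags users who give the top score to nearly every item."""

    def __init__(self, profiles, max_score=5.0, min_share=0.9):
        super(ExtremeRatingDetector, self).__init__(profiles)
        self.max_score = max_score
        self.min_share = min_share

    def initModel(self):
        self.share = {}

    def buildModel(self):
        # Share of each user's ratings that equal the maximum score.
        for user, ratings in self.profiles.items():
            top = sum(1 for r in ratings.values() if r >= self.max_score)
            self.share[user] = top / float(len(ratings)) if ratings else 0.0

    def predict(self):
        # 1 marks a suspected spammer, 0 a presumably genuine user.
        return dict((user, 1 if s >= self.min_share else 0)
                    for user, s in self.share.items())


profiles = {
    'u1': {'i1': 5.0, 'i2': 5.0},
    'u2': {'i1': 3.0, 'i2': 5.0},
}
labels = ExtremeRatingDetector(profiles).execute()
```

Only `initModel()`, `buildModel()`, and `predict()` are overridden here; the remaining hooks keep their default behavior, which matches the "as needed" wording of step 2.
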
- 1.Configure the **xx.conf** file in shillingmodels/config/.
- 2.Modify /shillingmodels/generateData.py as needed and run it.
Entry | Example | Description |
---|---|---|
ratings | ../dataset/averageattack/ratings.txt | Set the path to the recommendation dataset. Format: the fields in each row are separated by a space, tab, or comma. |
ratings.setup | -columns 0 1 2 | -columns: the (user, item, rating) columns of the rating data that are used; -header: skip the first (header) line when reading data. |
attackSize | 0.01 | The ratio of the injected spammers to genuine users |
fillerSize | 0.01 | The ratio of the filler items to all items |
selectedSize | 0.001 | The ratio of the selected items to all items |
linkSize | 0.01 | The ratio of the users maliciously linked by a spammer to all users |
targetCount | 20 | The count of the targeted items |
targetScore | 5.0 | The score given to the target items |
threshold | 3.0 | An item with an average score lower than threshold may be chosen as one of the target items |
minCount | 3 | An item with a rating count larger than minCount may be chosen as one of the target items |
maxCount | 50 | An item with a rating count smaller than maxCount may be chosen as one of the target items |
outputDir | ./data/ | User profiles and labels will be output here |
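
As an illustration of how the entries above interact, the sketch below selects target items and builds simplified average-attack profiles. It is not the code in generateData.py: the function names are hypothetical, filler items are rated with the plain item mean (a simplification chosen for this example), and `selectedSize` and `linkSize` are left out.

```python
import random


def mean(scores):
    return sum(scores) / float(len(scores))


def select_targets(item_scores, target_count, threshold, min_count, max_count, rng):
    """Pick up to target_count items that satisfy threshold/minCount/maxCount."""
    candidates = sorted(
        item for item, scores in item_scores.items()
        if min_count < len(scores) < max_count and mean(scores) < threshold
    )
    rng.shuffle(candidates)
    return candidates[:target_count]


def build_attack_profiles(item_scores, n_genuine_users, attack_size,
                          filler_size, targets, target_score, rng):
    """Create spammer profiles: filler items get the item mean, targets get target_score."""
    items = sorted(item_scores)
    n_spammers = int(round(n_genuine_users * attack_size))
    n_fillers = int(round(len(items) * filler_size))
    non_targets = [i for i in items if i not in targets]
    profiles = {}
    for k in range(n_spammers):
        fillers = rng.sample(non_targets, min(n_fillers, len(non_targets)))
        profile = dict((i, mean(item_scores[i])) for i in fillers)
        for t in targets:
            profile[t] = target_score
        profiles['spammer_%d' % k] = profile
    return profiles


rng = random.Random(0)
item_scores = {
    'a': [1, 2, 1, 2],   # low mean, 4 ratings: eligible target
    'b': [5, 5, 5, 5],   # mean too high
    'c': [2, 2],         # too few ratings
    'd': [1] * 60,       # too many ratings
    'e': [4, 4, 4, 4],   # mean too high
}
targets = select_targets(item_scores, target_count=20, threshold=3.0,
                         min_count=3, max_count=50, rng=rng)
spammers = build_attack_profiles(item_scores, n_genuine_users=200,
                                 attack_size=0.01, filler_size=0.4,
                                 targets=targets, target_score=5.0, rng=rng)
```

With 200 genuine users and `attackSize` 0.01, two spammer profiles are generated, each containing two filler items (40% of five items) plus the single eligible target.
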
Algorithm | Paper |
---|---|
DegreeSAD | Wentao Li and Min Gao. "Shilling Attack Detection in Recommender Systems via Selecting Patterns Analysis." IEICE Transactions on Information and Systems (2016) |
PCASelectUsers | Mehta, Bhaskar, and Wolfgang Nejdl. "Unsupervised strategies for shilling detection and robust collaborative filtering." User Modeling and User-Adapted Interaction (2009) |
SemiSAD | Cao, Jie, et al. "Shilling attack detection utilizing semi-supervised learning method for collaborative recommender system." World Wide Web 16.5-6 (2013) |
FAP | Zhang, Yongfeng, et al. "Catch the Black Sheep: Unified Framework for Shilling Attack Detection Based on Fraudulent Action Propagation." IJCAI (2015) |
CoDetector | Tong Dou, Junliang Yu, et al. "Collaborative Shilling Detection Bridging Factorization and User Embedding." COLLABORATECOM (2017) |