xinleihe / medium-open-data

An anonymized dataset of 50135 users of Medium

This dataset was collected for identifying influential users in “cold start” scenarios, i.e., predicting whether a newly registered or currently inactive user on an emerging online social network (OSN) will become influential.

Licensed under Creative Commons Attribution Share Alike 4.0.

File

You can download medium_open_dataset.csv.
Each line contains one user’s information, covering both their Twitter and Medium profiles.
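
As a minimal sketch of reading the file, the snippet below uses only Python's standard `csv` module. It assumes the CSV starts with a header row whose column names match the feature list in this README and that the file is UTF-8 encoded; neither is confirmed here. The function names `read_users` and `load_users` are hypothetical helpers, not part of the dataset.

```python
import csv


def read_users(lines):
    """Parse CSV lines (header row first) into a list of per-user dicts.

    All values are kept as strings; convert them as needed.
    """
    return list(csv.DictReader(lines))


def load_users(path="medium_open_dataset.csv"):
    """Open the dataset file and parse it with read_users.

    Assumption: the file is UTF-8 encoded and has a header row.
    """
    with open(path, newline="", encoding="utf-8") as f:
        return read_users(f)
```

Calling `load_users()` in the directory that holds the downloaded file would return one dict per user, keyed by column name.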

Feature description

  • Index features:
    • First column: Index of the user;
    • mid: Anonymized Medium user ID;
    • tid: Anonymized Twitter user ID;


  • Twitter features:
    • bio_words_num: Number of words in the user’s Twitter biography;
    • has_location: Whether the user has added a location on Twitter;
    • utc_offset: UTC offset on Twitter;
    • has_extended_profile: Whether the user has added another homepage;
    • t_account_age: Age of the Twitter account;
    • default_profile_img: Whether the user has changed the default profile image on Twitter;
    • has_profile_bg_img: Whether the user has changed the default profile background image on Twitter;
    • verified: Whether the user has been verified by Twitter;
    • follower_count: Number of followers on Twitter;
    • following_count: Number of accounts the user follows on Twitter;
    • total_tweets_num: Total number of tweets on Twitter;
    • geo_enabled: Whether the user has geo tags on Twitter;
    • listed_count: Number of lists subscribed to on Twitter;
    • t_num: Number of original tweets on Twitter;
    • rt_num: Number of retweets on Twitter;
    • t_favourite: Number of “likes” received on Twitter;
    • t_rt: Number of retweets of the user’s original tweets on Twitter;
    • avg_favourite: Average number of “likes” received per original tweet on Twitter;
    • avg_rt: Average number of retweets per original tweet on Twitter;


  • Medium features:
    • follower: Number of followers on Medium;
    • following: Number of accounts the user follows on Medium;
    • latest_num: Number of latest stories published by the user on Medium;
    • bio_words: Number of words in the user’s Medium biography;
    • account_age: Age of the Medium account;
    • has_facebook: Whether the user has linked a Facebook account on Medium;
    • has_p_img: Whether the user has a profile image on Medium;
    • has_pb_img: Whether the user has a background image on Medium;
    • interest_tag_num: Number of interest tags on Medium;
    • claps_num: Number of stories the user has clapped for on Medium;
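
The feature groups above can be written down as column lists, which is convenient when a model should see only the Twitter side or only the Medium side of a user. The following is a rough sketch: the column names are copied from the list above, but it is an assumption that the CSV header uses exactly these names. `split_features` and `missing_columns` are hypothetical helpers introduced for illustration.

```python
# Column names copied from the feature description.
# Assumption: the CSV header spells them exactly like this.
TWITTER_FEATURES = [
    "bio_words_num", "has_location", "utc_offset", "has_extended_profile",
    "t_account_age", "default_profile_img", "has_profile_bg_img", "verified",
    "follower_count", "following_count", "total_tweets_num", "geo_enabled",
    "listed_count", "t_num", "rt_num", "t_favourite", "t_rt",
    "avg_favourite", "avg_rt",
]
MEDIUM_FEATURES = [
    "follower", "following", "latest_num", "bio_words", "account_age",
    "has_facebook", "has_p_img", "has_pb_img", "interest_tag_num", "claps_num",
]


def split_features(row):
    """Split one user record (dict keyed by column name) into two dicts:
    the Twitter features and the Medium features present in the record."""
    twitter = {name: row[name] for name in TWITTER_FEATURES if name in row}
    medium = {name: row[name] for name in MEDIUM_FEATURES if name in row}
    return twitter, medium


def missing_columns(header):
    """Return the documented column names that do not appear in the header."""
    expected = ["mid", "tid"] + TWITTER_FEATURES + MEDIUM_FEATURES
    return [name for name in expected if name not in header]
```

Running `missing_columns` on the actual header is a quick way to check whether the assumed names match the file before relying on `split_features`.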

BibTeX Entry

@inproceedings{Gong_Medium18,
	author = {Qingyuan Gong and Yang Chen and Xinlei He and Fei Li and Yu Xiao and Pan Hui and Xin Wang and Xiaoming Fu},
	title = {Identification of Influential Users in Emerging Online Social Networks Using Cross-Site Linking},
	booktitle = {Proc. of the 13th CCF Chinese Conference on Computer Supported Cooperative Work (ChineseCSCW’18)},
	year = {2018},
} 
