Reducing Repetition Rate: Unbiased Delay Sampling in Online Social Networks

Author(s): Bingxian Chen, Lianggui Liu*, Huiling Jia, Yu Zhang.

Journal Name: Recent Patents on Computer Science

Volume 10, Issue 4, 2017


Abstract:

Background: Because online social networks (OSNs) are so large, it is hard to collect extensive data from them. Moreover, the large number of social nodes and links makes network data analysis time-consuming. Sampling a large-scale online social network in a way that preserves the topological properties of the original network has therefore become a problem in its own right. The purpose of this paper is to study an unbiased sampling method that can extract a representative sample from the social graph.

Methods: We propose an improved algorithm based on the Metropolis-Hastings random walk (MHRW), called Unbiased Delay sampling (the UD algorithm). We then evaluate it by comparing it with the sampling methods described in some recent patents.
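
For readers unfamiliar with the baseline, the following is a minimal sketch of a standard Metropolis-Hastings random walk on an undirected graph, not of the UD algorithm itself, whose delay mechanism is not specified in this abstract. The function name `mhrw_sample`, the adjacency-dictionary representation, and the toy graph are assumptions made for illustration only.

```python
import random


def mhrw_sample(adj, start, steps, rng=None):
    """Standard MHRW (baseline only; NOT the paper's UD algorithm).

    adj   : dict mapping each node to a list of its neighbours
            (hypothetical representation, undirected graph assumed)
    start : node at which the walk begins
    steps : number of samples to draw
    Returns the list of sampled nodes, repeats included.
    """
    rng = rng or random.Random()
    current = start
    walk = []
    for _ in range(steps):
        # Propose a neighbour uniformly at random.
        candidate = rng.choice(adj[current])
        # Accept with probability min(1, deg(current) / deg(candidate)).
        # This correction removes the bias towards high-degree nodes
        # that a simple random walk would have.
        if rng.random() <= len(adj[current]) / len(adj[candidate]):
            current = candidate
        # On rejection the walk stays put, so the same node is sampled
        # again; this is the source of repeated nodes in MHRW samples.
        walk.append(current)
    return walk


# Toy undirected graph, invented for this example.
toy_graph = {0: [1, 2, 3], 1: [0], 2: [0, 3], 3: [0, 2]}
sample = mhrw_sample(toy_graph, start=0, steps=200, rng=random.Random(1))
```

Because a rejected proposal re-samples the current node, plain MHRW samples tend to contain many duplicates, which is the behaviour the abstract says the UD algorithm is designed to reduce.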

Results: Different sampling methods extract subnetworks with different topological properties. We find that UD adapts to networks with different degrees of connectivity. On the one hand, UD yields a better degree distribution when repeated nodes are excluded from the sample; on the other hand, it reduces the probability that already-sampled nodes are selected again and improves network discovery.

Conclusion: To the best of our knowledge, this is the first unbiased sampling method that achieves a good degree distribution when the sample set contains no duplicate nodes. More specifically, we add a parameter α to the sampling process, and the value of α controls the repetition rate of the sample set.
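
The abstract does not define the repetition rate formally. As a rough sketch, assuming it means the fraction of draws that revisit an already-sampled node, it could be measured as below. The helper names `repetition_rate` and `unique_in_order` are hypothetical, and the code does not model the parameter α, whose role in the sampling process is not described here.

```python
def repetition_rate(walk):
    """Fraction of draws that repeat an earlier node.

    Assumed definition (not taken from the paper):
    1 - (number of distinct nodes) / (number of draws).
    """
    if not walk:
        return 0.0
    return 1 - len(set(walk)) / len(walk)


def unique_in_order(walk):
    """Drop repeated nodes, keeping the order of first visits."""
    return list(dict.fromkeys(walk))


# Invented example walk with two repeated draws out of five.
example_walk = [1, 1, 2, 3, 2]
rate = repetition_rate(example_walk)
distinct = unique_in_order(example_walk)
```

Under this assumed definition, a lower repetition rate means that more of the sampling budget is spent discovering new nodes.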

Keywords: Social network, MHRW, Twitter, degree distribution, independent sample, unbiased sampling.

Article Details

VOLUME: 10
ISSUE: 4
Year: 2017
Page: [308 - 314]
Pages: 7
DOI: 10.2174/2213275911666180403110851
