Global and Local Feature Learning for Ego-Network Analysis

by Fatemeh Salehi Rizi, et al.
Universität Passau

In an ego-network, an individual (ego) organizes its friends (alters) in different groups (social circles). This social network can be efficiently analyzed after learning representations of the ego and its alters in a low-dimensional, real vector space. These representations are then easily exploited via statistical models for tasks such as social circle detection and prediction. Recent advances in language modeling via deep learning have inspired new methods for learning network representations. These methods can capture the global structure of networks. In this paper, we extend these techniques to also encode the local structure of neighborhoods. Our local representations thereby capture network features that are hidden in the global representation of large networks. We show that the task of social circle prediction benefits from a combination of global and local features generated by our technique.



I Introduction

With the exponential growth of social networks, extracting features for nodes and detecting distinct neighborhood patterns becomes increasingly impractical for the full network. One effective way to succinctly describe certain aspects of large networks is to break up the network into smaller sub-networks [1]. This is accomplished by considering certain node-level or subgraph-level locality statistics specified on local regions of a network. These local regions are defined as neighborhoods around a focal node (called ego). Ego-networks are thus social networks made up of an ego along with all the social ties it has with other people (called alters). Usually an ego categorizes its alters into different groups (called social circles) such as family members, friends, colleagues, etc. Figure 1 shows an exemplary ego-network.

Figure 1: An ego-network with four social circles

Ego-networks are an important subject of investigation in anthropology, as several fundamental properties of social relationships can be characterised by studying them. In particular, it has been shown that neighborhoods around egos can exhibit different patterns. Based on prototypes of interactions between alters, prototypical neighborhood trends around egos can be dense, complete, star, etc. [27]. Therefore, finding vector representations which convey the local neighborhood structure of egos is very helpful for ego-network analysis. These vector representations can easily be exploited by statistical models for tasks such as social circle detection and prediction. Indeed, the local neighborhood analysis of nodes can reveal patterns and features of the network which are concealed when only the global analysis is considered [24].

There have been several studies on learning vector representations for nodes based on global features. For instance, DeepWalk [9] learns global representations for nodes in the social graph utilizing deep learning techniques. In DeepWalk, nodes are first sampled from the underlying network to turn the network into ordered sequences of nodes, in the same way that a document is an ordered sequence of words. Then, the Skip-gram model [7] is applied to learn feature representations for nodes by optimizing a neighborhood-preserving likelihood objective. Similarly, node2vec [25] learns a mapping of nodes to a low-dimensional space of features which preserves a flexible notion of nodes’ neighborhoods. Node2vec samples nodes using Breadth First Search or Depth First Search strategies. However, applying node2vec local sampling over ego-networks cannot capture different structures since it exceeds the ego neighborhood.

Since an ego-network consists of a certain number of alters, doing random walks over an ego-network can build an artificial paragraph. Paragraph Vector [10], in turn, is an unsupervised framework which learns continuous distributed vector representations for pieces of text. In this paper, we exploit the Paragraph Vector to learn neighborhood structures of egos in the social graph. We investigate the interplay of global and local representations and make the following contributions.

  • We introduce local vector representations for nodes in ego-networks to complement the global representations for capturing the neighborhood structure; learning relations in a small neighborhood instead of relations in the entire graph. (section II)

  • We apply global and local feature learning to the circle prediction problem. (section III)

  • We replace global representations by local representations to improve the performance. (section III)

The remainder of the paper is organized as follows. In section II, we elaborate on global and local feature learning of nodes in ego-networks. In section III, we describe the problem of circle prediction and our approach. In section IV, we evaluate our approach on three datasets from real-world social networks. We conclude in section V.

II Global and Local Feature Learning

An undirected graph is denoted by $G = (V, E)$, where $V$ is a set of nodes and $E$ is a set of edges. Furthermore, let $U \subseteq V$ contain the egos $u$. For an ego $u \in U$, we have the ego-network $G_u = (V_u, E_u)$ as a sub-graph of $G$, where $V_u$ is the neighborhood of $u$ and $E_u$ is the intersection of $E$ with $V_u \times V_u$. We call a node $v \in V_u$ an alter of $u$ and denote the set of alters of $u$ by $A_u$. Sets of alters for different egos may overlap.

In this section, we apply the techniques which have been used to model sentences and paragraphs of natural languages to model community structure in networks. Therefore, we capture information on the global and local network topology as follows:

II-A Learning global representation for each node

According to DeepWalk, global feature learning consists of two main components: first, a random walk generator, and second, an update procedure. Let $v_1, \dots, v_n$ be all nodes in the node set $V$ of the graph $G$; the idea is to do random walks started from every single node. Then, given a sequence of nodes $s_1, s_2, \dots, s_T$ produced by such walks and a context length $c$, we update the representations to maximize the average log probability:

$$\frac{1}{T} \sum_{t=1}^{T} \; \sum_{-c \le j \le c,\; j \ne 0} \log p(s_{t+j} \mid s_t)$$

Therefore, we have a mapping function $\Phi_{glo}: V \rightarrow \mathbb{R}^{d}$, where $d$ is the embedding size.
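The random walk generator can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the graph is given as an adjacency dictionary, walks are uniform and of fixed length, and the names `random_walk`, `build_corpus`, `walks_per_node` and `walk_length` are hypothetical.

```python
import random

def random_walk(adj, start, walk_length, rng):
    """Uniform random walk over an undirected graph given as an adjacency dict."""
    walk = [start]
    while len(walk) < walk_length:
        neighbors = adj[walk[-1]]
        if not neighbors:  # isolated node: the walk cannot continue
            break
        walk.append(rng.choice(neighbors))
    return walk

def build_corpus(adj, walks_per_node, walk_length, seed=0):
    """Start `walks_per_node` walks from every node; each walk acts as one sentence."""
    rng = random.Random(seed)
    corpus = []
    for _ in range(walks_per_node):
        nodes = list(adj)
        rng.shuffle(nodes)  # visit the start nodes in a random order on each pass
        for node in nodes:
            corpus.append(random_walk(adj, node, walk_length, rng))
    return corpus

# Toy graph (hypothetical): a triangle a-b-c with a pendant node d attached to c.
toy_adj = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "d"], "d": ["c"]}
corpus = build_corpus(toy_adj, walks_per_node=2, walk_length=5)
```

Each walk in `corpus` plays the role of a sentence whose words are node identifiers, so the list can be handed to any Skip-gram implementation that accepts tokenized sentences.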

II-B Learning local representation for each ego

Inspired by Paragraph Vector, we learn a vector representation for every ego $u$ in the set of egos $U$. Given ego $u$, we first do random walks on its ego-network $G_u$ to compose an artificial paragraph, which is called an ego-walk. This means an ego-walk is a stream of short random walks started at every node $v \in V_u$ of the ego-network. Then, having the ego-walk $s_1, s_2, \dots, s_T$ for ego $u$, we aim to update the representations in order to maximize the average log probability:

$$\frac{1}{T} \sum_{t=c+1}^{T-c} \log p(s_t \mid s_{t-c}, \dots, s_{t+c}, u)$$

where $T$ is the length of the ego-walk with $s_t \in V_u$, and $c$ is the context length. Therefore, we introduce a mapping function $\Phi_{loc}: U \rightarrow \mathbb{R}^{d}$, where $d$ is the embedding size.

In our technique (see Figure 2), every ego is mapped to a unique vector, represented by a column in the ego matrix $D$, and every alter is also mapped to a unique vector, represented by a column in the matrix $W$. The ego vector and alter vectors are concatenated to predict the next alter in a context.

Figure 2: A technique for learning an ego vector. The concatenation of an ego vector with the context of three alters is used to predict the fourth alter.
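Composing an ego-walk can be illustrated as follows. This is a sketch under stated assumptions rather than the exact procedure of the paper: the ego-network is an adjacency dictionary restricted to the ego's neighborhood, each short walk is uniform and of fixed length, and the names `ego_walk` and `walk_length` are hypothetical.

```python
import random

def ego_walk(ego_adj, walk_length, seed=0):
    """Concatenate one short random walk per node of an ego-network into one stream."""
    rng = random.Random(seed)
    stream = []
    for start in ego_adj:
        node = start
        stream.append(node)
        for _ in range(walk_length - 1):
            neighbors = ego_adj[node]
            if not neighbors:  # node without ties inside the ego-network
                break
            node = rng.choice(neighbors)
            stream.append(node)
    return stream

# Toy ego-network (hypothetical): v1 is tied to v2 and v3, v4 has no ties to other alters.
toy_ego_net = {"v1": ["v2", "v3"], "v2": ["v1"], "v3": ["v1"], "v4": []}
paragraph = ego_walk(toy_ego_net, walk_length=3)
```

The resulting `paragraph`, tagged with the identifier of its ego, corresponds to one artificial paragraph for a Paragraph Vector model; walks never leave the ego-network because the adjacency lists only contain its own nodes.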

III Circle Prediction

In online social networks, users need to organize their personal social networks to cope with the information overload generated by their friends [1]. However, this manual process is laborious, error-prone and does not adapt to changes. It is therefore meaningful and essential to study how to automatically organize a user’s friends into social circles when they are added to the network. These organized social circles could help solve many practical problems. For example, they can preserve a user’s privacy by showing updates and information only to friends who belong to the specific circles allowed by the user. They can also help a user who wants to read the latest news from his colleagues instead of scrolling through all the news from other users.

However, most current social circle identification methods [1, 2, 3, 4, 5, 11, 12, 13] are unsupervised learning methods, which lack emphasis on dataset quality and cannot predict well when there is a missing value in the query. The main supervised approach was proposed by McAuley & Leskovec [1], who trained a binary classifier for each circle. Their probabilistic model discriminates members from non-members based on node features. Node features combine information from both the network’s topological structure and the users’ profiles. Although their model deals with weak supervision to predict the circle for a new alter, it fails to refit the model for every new alter that is added to the network. In this section, we study the problem of social circle prediction exploiting the global and local neighborhood structures.

III-A Approach

We formulate the problem of circle prediction as a classification task on an alter newly added to the graph. We thus leverage the topological structure of the alter and also its profile information. Indeed, alters’ and egos’ profile information helps with the circle prediction task. For example, if the ego and the alter both go to the same university, this alter probably belongs to the university friends circle. Therefore, we add the vector of common profile features between an ego and its alter to the topological representations to perform a more accurate circle prediction. More formally, we denote the profile features of the alter as $f_v$, and those of the ego as $f_u$. Given ego $u$ and alter $v$, we formulate the ego and alter profile similarity as $\mathrm{sim}(u, v) = (s_1, \dots, s_m)$, where

$$s_i = \begin{cases} 1 & \text{if } f_u^{(i)} = f_v^{(i)} \\ 0 & \text{otherwise} \end{cases}$$

Therefore, for each pair of ego and alter, we have the binary vector $\mathrm{sim}(u, v) \in \{0, 1\}^{m}$, where $m$ is the number of profile features.
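The feature-by-feature comparison of two profiles can be sketched as below. The function name `profile_similarity`, the toy profiles, and the choice to treat a missing value (`None`) as a non-match are illustrative assumptions, not details given in the paper.

```python
def profile_similarity(ego_profile, alter_profile):
    """Binary vector with 1 where ego and alter share a known feature value, else 0."""
    if len(ego_profile) != len(alter_profile):
        raise ValueError("profiles must have the same number of features")
    return [
        1 if ego_value is not None and ego_value == alter_value else 0
        for ego_value, alter_value in zip(ego_profile, alter_profile)
    ]

# Toy profiles (hypothetical): education, gender, language, hometown.
ego = ["Uni Passau", "female", "de", None]
alter = ["Uni Passau", "male", "de", None]
sim_vector = profile_similarity(ego, alter)  # [1, 0, 1, 0]
```

The shared university and language yield ones, while the differing gender and the unknown hometown yield zeros.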

III-B Classifier

Since some alters are members of several circles, we need to use a multi-label classifier. Neural network classifiers have the ability to detect all possible interactions between predictor variables. Furthermore, they need less formal statistical training to develop [14]. In particular, feed-forward neural networks are appropriate for modeling relationships between a set of input variables and one or more output variables. In fact, they are suitable for any functional mapping problem where we want to know how a number of input variables affect the output variable [18]. We thus define our classifier as a multi-layer feed-forward neural network with the following possible input layers:

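  • Where the input layer is the concatenation ($\oplus$) of the global and local representations, with $\Phi_{loc}(u)$ the local representation of ego $u$ and $\Phi_{glo}(\cdot)$ the global representation of ego $u$ or alter $v$:

    • locglo: $\Phi_{loc}(u) \oplus \Phi_{glo}(v)$

    • gloglo: $\Phi_{glo}(u) \oplus \Phi_{glo}(v)$

    • locgloglo: $\Phi_{loc}(u) \oplus \Phi_{glo}(u) \oplus \Phi_{glo}(v)$

  • Where the input layer is the concatenation of the global representation, the local representation and the profile similarity vector $\mathrm{sim}(u, v)$ between ego and alter:

    • locglosim: $\Phi_{loc}(u) \oplus \Phi_{glo}(v) \oplus \mathrm{sim}(u, v)$

    • gloglosim: $\Phi_{glo}(u) \oplus \Phi_{glo}(v) \oplus \mathrm{sim}(u, v)$

    • locgloglosim: $\Phi_{loc}(u) \oplus \Phi_{glo}(u) \oplus \Phi_{glo}(v) \oplus \mathrm{sim}(u, v)$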
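The six input variants listed above can be assembled by plain vector concatenation. The sketch below is only an illustration: the helper `build_input`, the ordering of the parts (ego vectors first, then the alter vector, then the similarity vector) and the toy values are assumptions inferred from the variant names.

```python
def build_input(variant, loc_ego, glo_ego, glo_alter, sim):
    """Concatenate the vectors that make up one classifier input variant."""
    layouts = {
        "locglo": [loc_ego, glo_alter],
        "gloglo": [glo_ego, glo_alter],
        "locgloglo": [loc_ego, glo_ego, glo_alter],
        "locglosim": [loc_ego, glo_alter, sim],
        "gloglosim": [glo_ego, glo_alter, sim],
        "locgloglosim": [loc_ego, glo_ego, glo_alter, sim],
    }
    if variant not in layouts:
        raise ValueError(f"unknown input variant: {variant}")
    features = []
    for part in layouts[variant]:
        features.extend(part)
    return features

# Toy vectors (hypothetical values): embedding size 2, three profile features.
loc_ego = [0.1, 0.2]
glo_ego = [0.3, 0.4]
glo_alter = [0.5, 0.6]
sim = [1, 0, 1]
x = build_input("locglosim", loc_ego, glo_ego, glo_alter, sim)
```

With embedding size $d$ and $m$ profile features, the input width ranges from $2d$ (locglo, gloglo) to $3d + m$ (locgloglosim).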

Overall, the architecture of our classifier is described as follows:

  • Input layer: It can be one of six possible inputs which were described above.

  • Hidden layer: We have a hidden layer with ReLU activation units [21].

  • Output layer: The output layer has as many units as there are social circles in the graph, with a softmax activation function [22].

  • Optimizer: We used RMSprop, which is an adaptive learning rate method that has found much success in practice [19]. RMSprop divides the learning rate for a weight by a running average of the magnitudes of recent gradients for that weight.
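The RMSprop update described in the last item can be written out as a short sketch. The hyperparameter names and defaults (`lr`, `decay`, `eps`) and the toy objective are illustrative assumptions and are not the settings used in the experiments.

```python
import math

def rmsprop_step(weights, grads, cache, lr=0.001, decay=0.9, eps=1e-8):
    """One RMSprop update: scale each step by a running average of squared gradients."""
    new_weights, new_cache = [], []
    for w, g, c in zip(weights, grads, cache):
        c = decay * c + (1.0 - decay) * g * g  # running average of g^2 for this weight
        new_cache.append(c)
        new_weights.append(w - lr * g / (math.sqrt(c) + eps))
    return new_weights, new_cache

# Toy usage (hypothetical objective): minimize f(w) = w^2, whose gradient is 2w.
w, cache = [1.0], [0.0]
for _ in range(500):
    grads = [2.0 * w[0]]
    w, cache = rmsprop_step(w, grads, cache, lr=0.01)
```

Because each gradient is divided by the root of its own running average, the effective step is roughly of size `lr` regardless of the raw gradient scale, which is why the weight approaches the minimum at zero and then hovers within a few step sizes of it.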

IV Experiments

In this section, we first provide an overview of the datasets that we used in the experiments. We then present an experimental analysis of the proposed approach.

IV-A Datasets

Since our approach is supervised, we require labeled ground-truth data in order to evaluate its performance. We obtained ego-networks and ground truth from three major social networking sites, Facebook, Google+, and Twitter, available from Stanford University [1]. Table I describes the details of the datasets we used in our experiments.

            Facebook    Twitter      Google+
nodes       4,039       81,306       107,614
edges       88,234      1,768,149    13,673,453
egos        10          973          132
circles     46          100          468
features    576         2,271        4,122
Table I: Statistics of Social Network Datasets

The number of circles refers to the number of different social circles such as family members, high school, sport, colleagues, etc.

IV-B Experimental setup

In order to learn global representations for nodes in the Facebook, Google+, and Twitter graphs, we first do random walks to compose three artificial corpora. We then apply word2vec of gensim [23], which is an implementation of the Skip-gram model, to our artificial corpora. We set the embedding size $d$ following [26], together with the context length $c$. word2vec then scans over the nodes, and for each node it tries to embed it such that the node’s features can predict nearby nodes. The node feature representations are learned by optimizing the likelihood objective using SGD with negative sampling [8].
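To make the idea of a node predicting its nearby nodes concrete, the sketch below enumerates the (center, context) training pairs that a Skip-gram model derives from one walk. It is a simplified illustration: the function `skipgram_pairs` is hypothetical, it uses a fixed symmetric window, and it omits the subsampling and window shrinking that practical implementations may apply.

```python
def skipgram_pairs(walk, window):
    """List (center, context) pairs for every node and its neighbors within `window`."""
    pairs = []
    for i, center in enumerate(walk):
        lo = max(0, i - window)
        hi = min(len(walk), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # a node is not its own context
                pairs.append((center, walk[j]))
    return pairs

# Toy walk (hypothetical) with a context length of 1.
pairs = skipgram_pairs(["a", "b", "c", "d"], window=1)
# [('a', 'b'), ('b', 'a'), ('b', 'c'), ('c', 'b'), ('c', 'd'), ('d', 'c')]
```

Each pair contributes one term $\log p(\text{context} \mid \text{center})$ to the likelihood objective, so nodes that co-occur often in walks end up with similar vectors.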

Similarly, we set the embedding size and the context length to learn local representations for egos in these social graphs. First, we generate ego-walks by doing random walks on each ego-network separately. For example, for the Facebook graph with 10 egos, we have a corpus with 10 ego-walks. Then, we apply doc2vec of gensim [23], which modifies the word2vec algorithm to learn continuous representations for paragraphs, to our artificial corpora. Therefore, every ego is represented by a vector which holds the semantics of its neighborhood structure.

To obtain common features for each pair of ego and alter, we select the first features of their profiles, which include birthday, education, gender, hometown, languages, location, and work along with their sub-branches. We then compare the ego’s features to its alters’ features one by one to generate a binary feature vector. This vector is concatenated to the topological structure vectors as input to the classifier.

We create the feature matrices for locglo and gloglo by concatenating local and global vectors. We also create two further feature matrices, for locglosim and gloglosim, by additionally appending the common profile feature vectors. In the same manner, we obtain the feature matrices for locgloglo and locgloglosim.

Regarding the ground-truth matrix, we have circle labels for each alter available in the dataset. We need to convert the multi-label ground truth to a binary form, which is more suitable for the classification algorithm.
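One common way to obtain such a binary form is an indicator matrix with one row per alter and one column per circle. The sketch below assumes this encoding; the function `binarize_labels` and the toy circle names are hypothetical.

```python
def binarize_labels(alter_circles, circle_ids):
    """Turn per-alter circle lists into rows of 0/1 indicators, one column per circle."""
    column = {circle: j for j, circle in enumerate(circle_ids)}
    rows = []
    for circles in alter_circles:
        row = [0] * len(circle_ids)
        for circle in circles:
            row[column[circle]] = 1
        rows.append(row)
    return rows

# Toy ground truth (hypothetical): three alters, the second one belongs to two circles.
circle_ids = ["family", "university", "work"]
alter_circles = [["family"], ["university", "work"], []]
y = binarize_labels(alter_circles, circle_ids)
# [[1, 0, 0], [0, 1, 1], [0, 0, 0]]
```

An alter that belongs to several circles simply has several ones in its row.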

We finally perform the classification task with the six different inputs (gloglo, locglo, locgloglo, gloglosim, locglosim, and locgloglosim) to compare the prediction results. In the multi-label classification setting, every alter is assigned one or more labels from a finite set of circle labels. During the training phase, we observe a fraction of the alters and all their labels. The task is to predict the labels for the remaining alters. The batch size of the stochastic gradient descent is set separately for Facebook and for Google+ and Twitter, since the latter two have bigger graphs. We use a fixed learning rate for the RMSprop optimizer over a fixed number of iterations. We use a $K$-fold cross-validation approach for estimating the test error. The idea is to randomly divide the data into $K$ equal-sized parts. We leave out part $k$, fit the model to the other $K-1$ parts (combined), and then obtain predictions for the left-out $k$-th part. This is done in turn for each part $k = 1, 2, \dots, K$, and then the results are combined. We set $K = 10$ in our experiments.
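The cross-validation split can be sketched as follows. This is an illustration under stated assumptions: the generator `k_fold_indices` is hypothetical, it works on sample indices only, and folds differ in size by at most one element when the data does not divide evenly.

```python
import random

def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_indices, test_indices) for each of the k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # random assignment of samples to folds
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train_idx, test_idx

# Toy usage (hypothetical size): 25 alters split into 10 folds.
splits = list(k_fold_indices(25, k=10))
```

Every sample appears in exactly one test fold, so combining the per-fold predictions gives one out-of-sample prediction per alter.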

IV-C Results

We classify the alters of the Facebook, Google+, and Twitter graphs into their respective social circles and report the average performance in terms of $F_1$-score. To compute the $F_1$-score, we follow the evaluation metrics described in [1] with 10-fold cross validation. Table II shows the average performance of the classifier. As can be seen, replacing the global representation with the local one improved the performance of circle prediction. Moreover, considering the profile similarity between ego and alter also improved the performance of the classifier. However, adding the global representations of egos to the input did not improve the performance.

Approach                  Facebook    Twitter    Google+
gloglo                    0.37        0.46       0.49
locglo                    0.42        0.50       0.52
locgloglo                 0.37        0.44       0.48
gloglosim                 0.40        0.49       0.51
locglosim                 0.45        0.53       0.55
locgloglosim              0.38        0.46       0.47
McAuley & Leskovec [1]    0.38        0.54       0.59
Table II: Performance ($F_1$-score) of different embeddings for circle prediction on three datasets. The standard deviation stays below a common threshold for all experiments.
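For reference, the $F_1$-score of a single predicted circle against its ground-truth circle can be computed as sketched below. This shows only the basic measure; the full evaluation protocol of [1] involves further steps that are not reproduced here, and the function `f1_score` and the toy sets are hypothetical.

```python
def f1_score(predicted, actual):
    """Harmonic mean of precision and recall between two sets of circle members."""
    predicted, actual = set(predicted), set(actual)
    true_positives = len(predicted & actual)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(actual)
    return 2 * precision * recall / (precision + recall)

# Toy circles (hypothetical): two of three predicted members are correct.
score = f1_score({"a", "b", "c"}, {"b", "c", "d", "e"})
```

Here precision is 2/3 and recall is 1/2, which gives an $F_1$-score of 4/7.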

V Conclusion

We described a technique for ego-network analysis based on the concept of local network neighborhoods. We applied recent advances in language modeling to learn latent social representations for egos. This allows analysis of large social networks and can reveal aspects of neighborhood structure that cannot be ascertained in a global network analysis. We provided an example of social circle prediction on different social graphs, displaying the ability of our approach to capture local neighborhood structure. As future work, we intend to study how the local representations can improve other graph analysis tasks (e.g., link prediction, shortest path, etc.).


  • [1] McAuley, Julian, and Jure Leskovec. "Discovering social circles in ego networks." ACM Transactions on Knowledge Discovery from Data (TKDD) 8, no. 1, pp. 4, 2014.
  • [2] Y. Wang, L. Gao. "An edge-based clustering algorithm to detect social circles in ego networks." Journal of Computers, vol. 8, pp. 2575-2582, 2013.
  • [3] H. Qin, T. Liu, Y. Ma. "Mining user’s real social circle in microblog." Proceedings of 2012 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM’12), pp. 348-352, 2012.
  • [4] J. Zheng, M. N. Lionel. "An unsupervised learning approach to social circles detection in ego bluetooth proximity network." Proceedings of the 2013 ACM International Joint Conference on Pervasive and Ubiquitous Computing (UbiComp’13), pp. 721-724, 2013.
  • [5] Z. Yongmian, Z. Yifan, E. Swears, et al. "Modeling temporal interactions with interval temporal Bayesian Networks for complex activity recognition." IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 35, issue 10, pp. 2468-2483, 2013.
  • [6] G. E. Hinton and R. R. Salakhutdinov. "Reducing the dimensionality of data with neural networks." Science, 2006.
  • [7] T. Mikolov, K. Chen, G. Corrado, and J. Dean. "Efficient estimation of word representations in vector space." In ICLR, 2013.
  • [8] Goldberg, Yoav, and Omer Levy. "word2vec explained: Deriving Mikolov et al.’s negative-sampling word-embedding method." arXiv preprint arXiv:1402.3722, 2014.
  • [9] B. Perozzi, R. Al-Rfou, and S. Skiena. "Deepwalk: Online learning of social representations." In SIGKDD, pp. 701-710. ACM, 2014.
  • [10] Le, Quoc V., and Tomas Mikolov. "Distributed Representations of Sentences and Documents." In ICML, vol. 14, pp. 1188-1196, 2014.
  • [11] A. Streich, M. Frank, D. Basin, and J. Buhmann. "Multi-assignment clustering for boolean data." JMLR, 2012.
  • [12] Madani, Alaaldin, and Mohammad Marjan. "Mining social networks to discover ego sub-networks." In 2016 3rd MEC International Conference on Big Data and Smart City (ICBDSC), pp. 1-5. IEEE, 2016.
  • [13] Miao, Qiguang, Xing Tang, Yining Quan, and Kai Deng. "Detecting Circles on Ego Network Based on Structure." In Computational Intelligence and Security (CIS), 2014 Tenth International Conference on, pp. 213-217. IEEE, 2014.
  • [14] Tu, Jack V. "Advantages and disadvantages of using artificial neural networks versus logistic regression for predicting medical outcomes." Journal of Clinical Epidemiology 49, no. 11, pp. 1225-1231, 1996.
  • [15] J. B. Tenenbaum, V. De Silva, and J. C. Langford. "A global geometric framework for nonlinear dimensionality reduction." Science, 290(5500), pp. 2319-2323, 2000.
  • [16] M. Belkin and P. Niyogi. "Laplacian eigenmaps for dimensionality reduction and data representation." Neural Computation, 15(6), pp. 1373-1396, 2003.
  • [17] Cao, Shaosheng, Wei Lu, and Qiongkai Xu. "Grarep: Learning graph representations with global structural information." In Proceedings of the 24th ACM International on Conference on Information and Knowledge Management, pp. 891-900. ACM, 2015.
  • [18] Svozil, Daniel, Vladimir Kvasnicka, and Jiri Pospichal. "Introduction to multi-layer feed-forward neural networks." Chemometrics and Intelligent Laboratory Systems 39, no. 1, pp. 43-62, 1997.
  • [19] Tieleman, Tijmen, and Geoffrey Hinton. "Lecture 6.5-rmsprop: Divide the gradient by a running average of its recent magnitude." COURSERA: Neural networks for machine learning 4, no. 2, 2012.
  • [20] Zwicklbauer, Stefan, Christin Seifert, and Michael Granitzer. "DoSeR-a knowledge-base-agnostic framework for entity disambiguation using semantic embeddings." In International Semantic Web Conference, pp. 182-198. Springer International Publishing, 2016.
  • [21] Nair, Vinod, and Geoffrey E. Hinton. "Rectified linear units improve restricted boltzmann machines." In Proceedings of the 27th International Conference on Machine Learning (ICML-10), pp. 807-814, 2010.
  • [22] Bishop, C. "Pattern Recognition and Machine Learning (Information Science and Statistics), 1st edn. 2006. corr. 2nd printing edn." Springer, New York, 2007.
  • [23] Radim Rehurek and Petr Sojka. "Software Framework for Topic Modelling with Large Corpora." Proceedings of the LREC 2010 Workshop on New Challenges for NLP Frameworks, pp. 45-50, 2010.
  • [24] Porter, Michael D., and Ryan Smith. "Network neighborhood analysis." In Intelligence and Security Informatics (ISI), 2010 IEEE International Conference on, pp. 31-36. IEEE, 2010.
  • [25] Grover, Aditya, and Jure Leskovec. "node2vec: Scalable feature learning for networks." In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 855-864. ACM, 2016.
  • [26] Lau, Jey Han, and Timothy Baldwin. "An empirical evaluation of doc2vec with practical insights into document embedding generation." arXiv preprint arXiv:1607.05368, 2016.
  • [27] Muhammad, Syed Agha, and Kristof Van Laerhoven. "DUKE: A Solution for Discovering Neighborhood Patterns in Ego Networks." In ICWSM, pp. 268-277, 2015.