The past decade has been marked by the explosion of social networks in the online world. Such networks contain information either in their structure, or in the content posted on them by their users. Extracting this information is valuable for understanding these networks, but also for various applications. In this talk, we will consider two problems in the area of Online Social Network mining.
The first problem we consider regards the understanding of the strength of relationships in online social networks. To this end we will use the principle of Strong Triadic Closure which states that it is not possible for two individuals to have a strong relationship with a common friend and not know each other. We consider the problem of labeling the ties of a social network as strong or weak so as to enforce the Strong Triadic Closure property. We formulate the problem as a novel combinatorial optimization problem, and we study it theoretically. Although the problem is NP-hard, we are able to identify cases where there exist efficient algorithms with provable approximation guarantees. Experiments on real data indicate that our labeling agrees with practical measures of tie strength.
The second problem we consider regards the summarizing of micro-reviews posted on Location Based Social Networks such as FourSquare. Micro-reviews are bite-size reviews (usually under 200 characters), that capture the immediate reaction of users. They are rich in information, concise, and to the point. However, the abundance of micro-reviews and their telegraphic nature makes it difficult for users to extract the useful information from them, especially when going through them on a mobile device. We consider the problem of producing a summary for a micro-review collection for an entity that is representative, compact, and readable. We define the problem, as the problem of synthesizing a new ``review'' using snippets of full-text reviews. To balance compactness and representativeness, we formulate our problem using the Minimum Description Length principle. We propose approximation and heuristic algorithms, and we evaluate them on real-life data collected from Foursquare and Yelp. We demonstrate that our summaries outperform individual reviews, as well as existing summarization approaches.