The wide adoption of social media and networks has recently given rise to a new type of social network that focuses on online event management, called Event-based Social Networks (EBSN) [7]. In the most prominent EBSN platforms, such as Meetup, Eventbrite or Whova, users organize, manage and share social events and activities. Alongside the events’ organizers in EBSNs, several entities such as event planning and marketing companies (e.g., Jack Morton, GPJ), organizations (e.g., IEEE), as well as venues (e.g., theaters, night clubs), organize and manage a variety of social events (music concerts, conferences, promotion parties). A major challenge for event organizers is attracting the maximum number of participants, since it has a great impact on the success of the event and, consequently, on the expected gains from it for all involved (e.g., revenue, artist/brand publicity).
Consider the following real-world scenario. A company is going to organize the Summerfest festival. Summerfest is an 11-day music festival, featuring 11 stages and attracting more than 800K people each year. Throughout the festival, in addition to the music concerts, numerous multi-themed events take place (e.g., theatrical performances). Assume that Alice enjoys listening to Pop music, and is a fashion lover. On Monday from 7:00 to 10:00pm, a concert of a famous Pop band is scheduled to take place at the festival. On the same day, on a different stage, a fashion show is taking place from 7:00 to 9:00pm. Furthermore, from 6:00 to 8:00pm on that day, a music concert of a Pop singer has been organized by a nearby (competing) venue. Although Alice is interested in all three events, she is only able to attend one of them. In another scenario, assume that a Pop concert is hosted by the festival on Tuesday evening, but Alice is unable to attend this event, because on Tuesdays she works until late at night.
The above example illustrates the major aspects that should be considered in event scheduling scenarios. In order to attract as many attendees as possible, organizers have to carefully select the events that are going to take place during the festival, possibly picking from among numerous candidate events, as well as the date/time on which each event is going to take place. During the event scheduling process, at least the following aspects have to be considered: user preferences, user habits (e.g., availability), spatiotemporal conflicts between scheduled events, and possible third-party events (e.g., organized by a third-party company) which might attract potential attendees (i.e., competing events).
In this work, we introduce the Social Event Scheduling (SES) problem, which considers the aforementioned aspects and whose goal is to maximize the overall number of participants in the scheduled events. In short, given a set of events, a set of time periods and a set of users, our objective is to determine how to assign events to the time periods, so that the maximum participant enrollment is achieved.
Recently, a number of works have been proposed in the context of event-participant planning [6, 12, 13, 11, 15, 14, 5, 2]. These works examine the problem from a different perspective: given a set of pre-scheduled events, they focus on finding the most appropriate assignments for the users (i.e., participants) attending the events. The determined user-event assignments aim at maximizing the satisfaction of the users. However, these works fail to consider a crucial issue in event management, which is the “satisfaction” (e.g., revenue, publicity) of the entities involved in the event organization (e.g., organizer, artist, sponsors, service providers). Here, in contrast to existing works, our objective is to maximize the satisfaction of the event-side entities. To this end, instead of assigning users to events, we assign events to time intervals, so that the number of events’ attendees is maximized. Briefly, we study an “event-centric” problem, while the existing approaches focus on “user-centric” problems.
Therefore, our objective is substantially different compared to the existing works. The same holds for the solution; in our problem, the solution is a set of event-time assignments, while in existing works it is a set of user-event assignments. Additionally, in order to solve our problem we have to select a subset of the candidate events (i.e., some events may not be included in the solution), while in other works the solution contains all the users (i.e., each user is assigned to events). Finally, beyond the user and event entities which are also considered in existing works, in our problem more core entities are involved (e.g., event organizer, competing events). Thus, overall, the objective, the solution and the setting of our problem substantially differ from existing works.
II Social Event Scheduling Problem
In this section we first introduce the Social Event Scheduling (SES) problem; and then we study its complexity. Before we formally introduce our problem, we present some necessary definitions.
Organizer & Time intervals. We assume that the event organizer (e.g., company, venue) is associated with a number of (available) resources $\theta$. For example, as resources we can consider the agents (i.e., staff) who are responsible for setting up and coordinating the events. Let $T$ be a set of candidate time intervals, representing time periods that are available for organizing events. Note that the intervals contained in $T$ are disjoint.
Candidate Events. Assume a set $E$ of available events to be scheduled, referred to as candidate events. Each $e \in E$ is associated with a location $\ell_e$ representing the place (e.g., a stage) that is going to host the event. Further, each event $e$ requires a specific amount of resources $\xi_e$ for its organization, referred to as required resources.
Schedule & Assignment. An assignment $\alpha_e^t$ denotes that the candidate event $e \in E$ is scheduled to take place at $t \in T$. An event schedule $S$ is a set of assignments, where there exist no two assignments referring to the same event. Given a schedule $S$, we denote as $E(S)$ the set of all candidate events that are scheduled by $S$; and as $E_t(S)$ the candidate events that are scheduled by $S$ to take place at $t$ (i.e., assigned to $t$). Formally, $E(S) = \{e \mid \alpha_e^{t'} \in S \text{ for some } t' \in T\}$ and $E_t(S) = \{e \mid \alpha_e^{t} \in S\}$. Further, for a candidate event $e \in E(S)$, we denote as $t_e(S)$ the time interval on which $S$ assigns $e$.
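Feasibility. A schedule $S$ is said to be feasible if the following constraints are satisfied: (1) $\forall t \in T$ holds that $\nexists\, e_i, e_j \in E_t(S)$, $i \neq j$, with $\ell_{e_i} = \ell_{e_j}$ (location constraint); and (2) $\forall t \in T$ holds that $\sum_{e \in E_t(S)} \xi_e \leq \theta$ (resources constraint), where $\ell_e$ and $\xi_e$ are the location and the required resources of event $e$, and $\theta$ is the number of available resources. In analogy, an assignment $\alpha_e^t$ is said to be feasible if the aforementioned constraints hold for $t$. Further, we call valid assignment, an assignment $\alpha_e^t$ when the assignment is feasible and $e \notin E(S)$.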
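To make the two feasibility constraints concrete, the following Python sketch checks them for a given schedule. The data layout (a dict mapping each scheduled event to its interval, plus dicts for event locations and required resources) and the name `is_feasible` are illustrative assumptions and not part of the original formulation; required resources are assumed to be non-negative.

```python
from collections import defaultdict


def is_feasible(schedule, location, required, available):
    """Check the location and resources constraints of a schedule.

    schedule:  dict event -> interval (hypothetical representation)
    location:  dict event -> location hosting the event
    required:  dict event -> required resources (assumed non-negative)
    available: resources the organizer has available in each interval
    """
    used_locations = defaultdict(set)    # interval -> locations already taken
    used_resources = defaultdict(float)  # interval -> resources consumed so far
    for event, interval in schedule.items():
        # Location constraint: no two events share a location in one interval.
        if location[event] in used_locations[interval]:
            return False
        used_locations[interval].add(location[event])
        # Resources constraint: per-interval demand must not exceed availability.
        used_resources[interval] += required[event]
        if used_resources[interval] > available:
            return False
    return True
```

For instance, two events hosted on the same stage can be scheduled in different intervals but not in the same one, and two events on different stages fit in one interval only if their combined required resources do not exceed the available resources.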
Competing Events. Let $C$ be a set of competing events. As competing events we define events that have already been scheduled by third parties (e.g., organized by a third-party marketing company), and will possibly attract potential attendees of the candidate events. Based on its scheduled time, each competing event $c \in C$ is associated with a time interval $t_c \in T$. Further, as $C_t$ we denote the competing events that are associated with the time interval $t$; i.e., $C_t = \{c \in C \mid t_c = t\}$.
Users. Consider a set of users $U$. For each user $u \in U$ and event $h \in E \cup C$, there is a function $\mu : U \times (E \cup C) \to [0, 1]$, denoted as $\mu_{u,h}$, that models the interest of user $u$ over $h$. The interest value (i.e., affinity) can be estimated by considering a large number of factors (e.g., preferences, social connections).
Moreover, for each user $u$ and time interval $t$, a social activity probability $\sigma_u^t$ is considered, representing the probability of user $u$ participating in a social activity at $t$. Formally we have $\sigma : U \times T \to [0, 1]$. This probability can be estimated by examining the user’s past behavior (e.g., number of check-ins). Note that user data can either be gathered by analyzing organizer data (e.g., registered users’ profiles) or be provided by a market research company.
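Attendance. Assume a user $u$ and a candidate event $e$ that is scheduled by $S$ to take place at time interval $t$; $\rho_{u,e}^t$ denotes the probability of $u$ attending $e$ at $t$. Considering Luce’s choice theory, the probability $\rho_{u,e}^t$ is influenced by the social activity probability $\sigma_u^t$ of $u$ at $t$, and the interest of $u$ over $e$, over the competing events $C_t$ at $t$, and over the events $E_t(S)$ that $S$ assigns to $t$. Writing $\mu_{u,h}$ for the interest of $u$ over an event $h$, we define the probability of $u$ attending $e$ at $t$ as (see footnote 1):

$$\rho_{u,e}^t = \sigma_u^t \, \frac{\mu_{u,e}}{\sum_{c \in C_t} \mu_{u,c} + \sum_{p \in E_t(S)} \mu_{u,p}} \tag{1}$$

Footnote 1: Event-based mining methods can be used to compute this value, e.g., [17, 1, 18, 10, 4, 16, 3]. However, this is beyond the scope of this work.

Furthermore, considering all users $U$, we define the expected attendance $\omega_e^t$ for an event $e$ scheduled to take place at $t$ as:

$$\omega_e^t = \sum_{u \in U} \rho_{u,e}^t \tag{2}$$

Total Utility. The total utility for a schedule $S$, denoted as $\Omega(S)$, is computed by considering the expected attendance over all scheduled events. Thus, we have:

$$\Omega(S) = \sum_{e \in E(S)} \omega_e^{t_e(S)} \tag{3}$$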
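The attendance model can be illustrated with a short Python sketch. It assumes a Luce-style form in which a user’s attendance probability is the social activity probability multiplied by the interest in the event, divided by the total interest over the competing events and the scheduled events of the same interval. The nested-dict layout and all function names are hypothetical choices made for this illustration.

```python
def attendance_probability(user, event, schedule, interest, activity, competing):
    """Probability that `user` attends `event` (Luce-style choice, assumed form)."""
    interval = schedule[event]
    # Alternatives in the same interval: competing events and co-scheduled events.
    rivals = competing.get(interval, [])
    co_scheduled = [p for p, tp in schedule.items() if tp == interval]
    total = (sum(interest[user][c] for c in rivals)
             + sum(interest[user][p] for p in co_scheduled))
    if total == 0:
        return 0.0  # the user has no interest in anything offered at this interval
    return activity[user][interval] * interest[user][event] / total


def expected_attendance(event, schedule, users, interest, activity, competing):
    """Expected attendance of one scheduled event: sum over all users."""
    return sum(attendance_probability(u, event, schedule, interest, activity, competing)
               for u in users)


def total_utility(schedule, users, interest, activity, competing):
    """Total utility of a schedule: sum of expected attendance over scheduled events."""
    return sum(expected_attendance(e, schedule, users, interest, activity, competing)
               for e in schedule)


# Toy instance loosely following the Alice scenario (all values are made up).
users = ["alice"]
interest = {"alice": {"pop_concert": 0.5, "fashion_show": 0.25, "rival_concert": 0.25}}
activity = {"alice": {"mon_evening": 0.5}}
competing = {"mon_evening": ["rival_concert"]}
schedule = {"pop_concert": "mon_evening", "fashion_show": "mon_evening"}
utility = total_utility(schedule, users, interest, activity, competing)
```

In this toy instance Alice’s interest is split over three alternatives in the same interval, so the two scheduled events share her attendance probability with the rival concert, and the total utility is 0.375.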
We formally define the Social Event Scheduling (SES) problem as follows:
Social Event Scheduling Problem (SES). Given an integer $k$; a set of candidate time intervals $T$; a set of competing events $C$; a set of candidate events $E$; and a set of users $U$; our goal is to find a feasible schedule $S_k$ that determines how to assign $k$ candidate events such that the total utility $\Omega$ is maximized; i.e., $S_k = \arg\max_{S} \Omega(S)$ and $|S_k| = k$.
Next, we show that even in highly restricted instances the SES problem is strongly NP-hard.
Theorem 1. The SES problem is strongly NP-hard.
Proof Sketch. Our reduction is from the Multiple Knapsack Problem with Identical bin capacities (MKPI), which is known to be strongly NP-hard [8]. In MKPI there are multiple items and bins. Each item has a weight and a profit, and all bins have the same capacity. In the reduction, we use the following associations between the MKPI and the SES: bins to time intervals; capacity to number of available resources; items to events; weight to number of required resources; item profit to likeness; and total profit to expected attendance. Further, in the proof, we consider the following restricted instance of SES: the users are as many as the candidate events; there is only one competing event in each time interval; all users have the same interest value over the competing events; each user likes only one event and each event is liked by only one user; the interest function is defined in terms of the item profit; the social activity probability is the same for each user and time interval; and there are no location constraints.
III Greedy Algorithm (GRD)
First, we define the assignment score; and then we present the GRD algorithm.
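Assignment Score. Assume a schedule $S$ and an assignment $\alpha_r^t$, where $r$ is not previously assigned by $S$ (i.e., $r \notin E(S)$). As assignment score (also referred to as score) of an assignment $\alpha_r^t$, denoted as $\alpha_r^t.S$, we define the gain in the expected attendance by including $\alpha_r^t$ in $S$. The assignment score (based on Eq. 2) is defined as:

$$\alpha_r^t.S = \sum_{e \in E_t(S) \cup \{r\}} \omega'^{\,t}_e - \sum_{e \in E_t(S)} \omega_e^t \tag{4}$$

where $\omega_e^t$ and $\omega'^{\,t}_e$ denote the expected attendance of $e$ at $t$ before and after including $\alpha_r^t$ in $S$, respectively.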
Note that the expected attendance of each event already assigned to an interval changes once a further event is assigned to that interval; it differs from the expected attendance before the assignment. Also, based on Eq. 2 & 4, it is apparent that the score of an assignment referring to an interval $t$ is determined based on all the events assigned to $t$. Finally, given a set of assignments, the term top assignment refers to the assignment with the largest score.
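The score of an assignment can be sketched in Python as the difference between the summed expected attendance of an interval after and before adding the new event. The sketch assumes the Luce-style attendance model (social activity probability times the event’s share of the user’s total interest over competing and scheduled events in the interval); the helper names and the dict-based data layout are hypothetical.

```python
def interval_attendance(events_at_t, interval, users, interest, activity, competing):
    """Summed expected attendance of all events scheduled at `interval`."""
    total = 0.0
    for u in users:
        rival = sum(interest[u][c] for c in competing.get(interval, []))
        own = sum(interest[u][e] for e in events_at_t)
        if rival + own > 0:
            # Summing the per-event shares gives own / (rival + own).
            total += activity[u][interval] * own / (rival + own)
    return total


def assignment_score(event, interval, schedule, users, interest, activity, competing):
    """Gain in expected attendance from adding `event` at `interval` to `schedule`."""
    current = [e for e, t in schedule.items() if t == interval]
    after = interval_attendance(current + [event], interval, users,
                                interest, activity, competing)
    before = interval_attendance(current, interval, users,
                                 interest, activity, competing)
    return after - before


# Toy data (made-up values): one user, one interval, one competing event "c".
users = ["u"]
interest = {"u": {"a": 0.5, "b": 0.25, "c": 0.25}}
activity = {"u": {"t": 0.5}}
competing = {"t": ["c"]}
score_b_alone = assignment_score("b", "t", {}, users, interest, activity, competing)
score_b_after_a = assignment_score("b", "t", {"a": "t"}, users, interest, activity, competing)
```

In the toy data, adding `b` to an empty interval yields a larger gain than adding it after `a` has been assigned there, because `b` then also draws attendance away from `a`. This is why scores of the remaining assignments for an interval become stale once an event is assigned to it.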
Algorithm Outline. Here we describe a simple greedy algorithm, referred to as the Greedy algorithm (GRD). The basic idea of GRD is that the assignments between all pairs of events and intervals are initially generated. Then, in each step/iteration, the assignment with the largest score is selected. After selecting an assignment, some of the remaining assignments’ scores have to be updated. Recall that an assignment’s score is defined w.r.t. the events assigned in the assignment’s interval (Eq. 4). Thus, when an assignment is selected, we have to recompute (update) the scores of the assignments referring to the same interval.
Algorithm Description. Algorithm 1 presents the pseudocode of GRD. At the beginning, the algorithm calculates the score for all possible assignments (line 3). The generated assignments are inserted into a list (line 4). Then, in each step, the assignment with the largest score is found and popped (line 6). If the popped assignment is feasible and its event is not previously assigned (i.e., the assignment is valid), it is inserted into the schedule (line 8). Until a valid assignment is selected, the assignment with the largest score is popped and checked. After selecting an assignment, the algorithm traverses the list, updating the scores of the assignments that refer to the same interval (loop in line 11). Finally, the algorithm terminates when $k$ assignments have been selected, where $k$ is the requested number of scheduled events.
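The steps described above can be sketched in Python as follows. This is a minimal illustration and not the authors’ C++ implementation: it assumes the Luce-style attendance model, stores the candidate assignments in a dict keyed by (event, interval) instead of a list, and uses hypothetical names throughout. It also stops early if no valid assignment remains.

```python
def interval_attendance(events_at_t, interval, users, interest, activity, competing):
    """Summed expected attendance of all events scheduled at `interval`."""
    total = 0.0
    for u in users:
        rival = sum(interest[u][c] for c in competing.get(interval, []))
        own = sum(interest[u][e] for e in events_at_t)
        if rival + own > 0:
            total += activity[u][interval] * own / (rival + own)
    return total


def grd(k, events, intervals, users, interest, activity, competing,
        location, required, available):
    """Greedy sketch: repeatedly select the valid assignment with the top score."""
    schedule = {}                        # event -> interval
    at = {t: [] for t in intervals}      # interval -> events assigned to it

    def score(event, t):
        args = (t, users, interest, activity, competing)
        return (interval_attendance(at[t] + [event], *args)
                - interval_attendance(at[t], *args))

    def is_valid(event, t):
        if event in schedule:
            return False                                   # already assigned
        if any(location[e] == location[event] for e in at[t]):
            return False                                   # location constraint
        used = sum(required[e] for e in at[t])
        return used + required[event] <= available         # resources constraint

    # Initial scores for all event-interval pairs.
    scores = {(e, t): score(e, t) for e in events for t in intervals}
    while len(schedule) < k and scores:
        event, t = max(scores, key=scores.get)             # pop the top assignment
        del scores[(event, t)]
        if not is_valid(event, t):
            continue
        schedule[event] = t
        at[t].append(event)
        # Only assignments referring to the selected interval need new scores.
        for (other, t_other) in scores:
            if t_other == t and other not in schedule:
                scores[(other, t_other)] = score(other, t)
    return schedule
```

Using a dict and a linear `max` scan keeps the sketch close to the textual description (find and pop the top assignment, then update the affected scores); a priority queue would be a natural alternative but requires handling stale entries.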
Complexity Analysis. Initially, the GRD computes the assignments for all event-interval pairs (loop in line 2), which requires . Note that, each assignment score (Eq. 4) is computed in . In the next phase (loop in line 5), the GRD performs iterations. In each iteration, the operation (line 6) traverses the list of size , where . Thus, in sum, the cost for traversing is . Additionally, in each iteration (except the last), assignment updates are performed, in the worst case. Hence, the overall cost for updates is . Therefore, the overall computation cost of GRD in the worst case is . Finally, the space complexity is .
IV Experimental Analysis
Data. In our experiments we use the largest Meetup dataset from , which contains data from California. Adopting the same approach as in [11, 12, 13, 15], in order to define the interest of a user to an event, we associate the events with the tags of the group who organize it. Then, we compute the likeness value using Jaccard similarity over the user-event tags. After preprocessing, we have the Meetup dataset containing , users and about K events.
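As an illustration of the tag-based interest computation, the following Python sketch computes the Jaccard similarity between a user’s tags and an event’s tags. The function name, the example tags, and the convention of returning 0 when both tag sets are empty are assumptions made for this sketch.

```python
def jaccard(tags_a, tags_b):
    """Jaccard similarity of two tag collections: |A ∩ B| / |A ∪ B|."""
    a, b = set(tags_a), set(tags_b)
    if not a and not b:
        return 0.0  # assumed convention: no tags on either side means no interest
    return len(a & b) / len(a | b)


# Hypothetical user and event (group) tags.
user_tags = {"pop", "music", "live"}
event_tags = {"music", "live", "jazz"}
interest_value = jaccard(user_tags, event_tags)  # 2 shared tags out of 4 distinct
```

The resulting value lies in the range from 0 to 1, so it can be used directly as an interest (likeness) value of a user over an event.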
Parameters. Adopting the same setting as in the related works [6, 11, 12, 13, 15], we set the default and maximum value of the number of scheduled events , to and , respectively. Also, we vary the number of time intervals , from up to , with default value set to . Further, the number of candidate events is set to .
In order to select the values for the number of competing events per interval, we analyze the two Meetup datasets from . From the analysis, we found that, on average, events are taking place during overlapping intervals. Therefore, the number of competing events per interval is selected from a uniform distribution having this average as its mean value.
Regarding the value for the number of available events’ locations, we consider the percentage of pairs of events that are spatio-temporally conflicting, as specified in . As a result, we set the number of available locations to . The social activity probability is defined using a Uniform distribution.
The performance and the effectiveness of the examined methods are marginally affected by the available/required resources parameters (here as resources we consider organizer’s staff). Hence, we choose a reasonable number based on our scenario, setting the number of available resources to . Also, the number of required resources is selected by a uniform distribution defined over the interval .
Methods. In our evaluation we study our method GRD, as well as two baselines. The first baseline method, TOP, computes the assignment scores for all the events and selects the events with the top-$k$ score values, where $k$ is the number of scheduled events. The second, denoted as RAND, assigns events to intervals randomly. Note that, since the objective, the solution and the setting of our problem are substantially different from those of the related works [6, 12, 13, 11, 15, 14, 5, 2] (see Sect. I), the existing methods cannot be used to solve the SES problem. All algorithms were written in C++ and the experiments were performed on a 2.67GHz Intel Xeon E5640 with 32GB of RAM.
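The two baselines can be sketched in Python as follows. The text leaves several details open, so this sketch makes explicit assumptions: TOP ranks assignments by their initial scores only (no updates) and skips invalid ones, RAND draws event-interval pairs in random order and keeps the valid ones, and validity is delegated to a caller-supplied `is_valid(event, interval, schedule)` function. All names are hypothetical.

```python
import random


def top_baseline(k, initial_scores, is_valid):
    """TOP sketch: pick assignments by initial score, never updating scores.

    initial_scores: dict (event, interval) -> score computed on an empty schedule
    is_valid:       callable (event, interval, schedule) -> bool (assumed helper)
    """
    schedule = {}
    ranked = sorted(initial_scores.items(), key=lambda item: item[1], reverse=True)
    for (event, interval), _ in ranked:
        if len(schedule) == k:
            break
        if event not in schedule and is_valid(event, interval, schedule):
            schedule[event] = interval
    return schedule


def rand_baseline(k, events, intervals, is_valid, seed=None):
    """RAND sketch: assign events to intervals in random order."""
    rng = random.Random(seed)
    pairs = [(e, t) for e in events for t in intervals]
    rng.shuffle(pairs)
    schedule = {}
    for event, interval in pairs:
        if len(schedule) == k:
            break
        if event not in schedule and is_valid(event, interval, schedule):
            schedule[event] = interval
    return schedule
```

Because TOP never refreshes scores, it can place several events in the same interval even though each additional event there competes with the ones already assigned, which is one plausible reading of why it reports low utility in the experiments.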
In the first experiment, we study the effect of varying the number of scheduled events $k$. In terms of utility (Fig. (a)), we observe that, in all cases, GRD significantly outperforms both baselines. The difference between RAND and GRD increases as $k$ increases. This is expected, since the larger the $k$, the larger the number of selected assignments that are “better” than random ones. Finally, TOP reports considerably low utility scores in all cases.
The results regarding the execution time are depicted in Fig. (b). Note that the computations that are performed due to updates increase with $k$, the number of scheduled events, while the number of initially computed scores is the same for all $k$. Also, TOP performs only the initial scores’ computations (there are no score updates). This is why the difference between GRD and TOP increases with $k$.
In the next experiment, we vary the number of time intervals $|T|$. Regarding utility (Fig. (c)), we observe that, as the number of intervals increases, the utility of the GRD and TOP methods increases too. This happens because the increase of available intervals results in a smaller number of events assigned to the same interval, as well as in a larger number of candidate assignments. In terms of execution time (Fig. (d)), for the same reason as in the first experiment, the difference between GRD and TOP increases with the number of intervals.
This paper introduced the Social Event Scheduling (SES) problem. The goal of SES is to maximize the overall events’ attendance considering several events’ and users’ factors. We showed that SES is strongly NP-hard and we developed a greedy algorithm.
This research has been financed by the European Union through the FP7 ERC IDEAS 308019 NGHCS project, the Horizon2020 688380 VaVeL project and a 2017 Google Faculty Research Award.
[1] I. Boutsis, S. Karanikolaou, and V. Kalogeraki. Personalized event recommendations using social networks. In MDM, 2015.
[2] Y. Cheng, Y. Yuan, L. Chen, C. G. Giraud-Carrier, and G. Wang. Complex event-participant planning and its incremental variant. In ICDE, 2017.
[3] R. Du, Z. Yu, T. Mei, Z. Wang, Z. Wang, and B. Guo. Predicting Activity Attendance in Event-based Social Networks: Content, Context and Social Influence. In UbiComp, 2014.
[4] K. Feng, G. Cong, S. S. Bhowmick, and S. Ma. In Search of Influential Event Organizers in Online Social Networks. In SIGMOD, 2014.
[5] J. Huang, Y. Zhou, X. Jia, and H. Sun. A novel social event organization approach for diverse user choices. The Computer Journal, 2016.
[6] K. Li, W. Lu, S. Bhagat, L. V. Lakshmanan, and C. Yu. On Social Event Organization. In KDD, 2014.
[7] X. Liu, Q. He, Y. Tian, W. Lee, J. McPherson, and J. Han. Event-based social networks: linking the online and offline social worlds. In KDD, 2012.
[8] S. Martello and P. Toth. Knapsack Problems: Algorithms and Computer Implementations. John Wiley & Sons, 1990.
[9] T. N. Pham, X. Li, G. Cong, and Z. Zhang. A General Graph-based Model for Recommendation in Event-based Social Networks. In ICDE, 2015.
[10] D. Romero, K. Reinecke, and L. Robert. The influence of early respondents: information cascade effects in online event scheduling. In WSDM, 2017.
[11] J. She, Y. Tong, and L. Chen. Utility-Aware Social Event-Participant Planning. In SIGMOD, 2015.
[12] J. She, Y. Tong, L. Chen, and C. C. Cao. Conflict-aware Event-participant Arrangement. In ICDE, 2015.
[13] J. She, Y. Tong, L. Chen, and C. C. Cao. Conflict-aware Event-participant Arrangement and Its Variant for Online Setting. TKDE, 2016.
[14] J. She, Y. Tong, L. Chen, and T. Song. Feedback-Aware Social Event-Participant Arrangement. In SIGMOD, 2017.
[15] Y. Tong, J. She, and R. Meng. Bottleneck-aware Arrangement Over Event-based Social Networks: The Max-Min Approach. WWWJ, 2015.
[16] T. Xu, H. Zhong, H. Zhu, H. Xiong, E. Chen, and G. Liu. Exploring the impact of dynamic mutual influence on social event participation. In SDM, 2015.
[17] W. Zhang, J. Wang, and W. Feng. Combining Latent Factor Model with Location Features for Event-based Group Recommendation. In KDD, 2013.
[18] X. Zhang, J. Zhao, and G. Cao. Who Will Attend? - Predicting Event Attendance in Event-based Social Network. In MDM, 2015.