Code and dataset instructions for our paper "Where Could We Go? Recommendations for Groups in Location-Based Social Networks", presented at ACM WebSci '17


You may find the paper here:

If you use any part of the code or dataset, kindly cite our work as:


ACM Ref:

Frederick Ayala-Gómez, Bálint Daróczy, Michael Mathioudakis, András Benczúr, and Aristides Gionis. 2017. Where Could We Go?: Recommendations for Groups in Location-Based Social Networks. In Proceedings of the 2017 ACM on Web Science Conference (WebSci '17). ACM, New York, NY, USA, 93-102. DOI:

Getting Started

  • Install Python and virtualenv
  • virtualenv venv
  • source venv/bin/activate
  • pip install -r requirements.txt

Additional third-party software to be installed

  • Install Turi's GraphLab Create (you need to request an academic license)

Data Collection

Getting data from Twitter

  • source venv/bin/activate
  • cd data_collection
  • Use data_collection/config/template.config as a template and fill in your own values
  • Resolve the check-ins using the Foursquare API

Parsing the JSON check-ins file

  • source venv/bin/activate
  • cd data_collection
  • python checkins_file output_dir
    • The script parses the check-ins into pandas DataFrames, saved as the following CSV files (columns listed after each file name):
      • df_checkin_group.csv: checkin_id, group_id
      • df_checkins.csv: checkin_beenHere, checkin_created_at, checkin_created_via, checkin_id, checkin_likes, checkin_timeZoneOffset, user_id, venue_id
      • df_checkins_with.csv: checkin_id, user_id, with, group_id
      • df_users.csv: user_firstname, user_gender, user_id, user_screen_name
      • df_venues_categories.csv: category_id, category_name, category_pluralName, category_primary, category_shortName, venue_id
      • df_venues.csv: venue_address, venue_cc, venue_checkinsCount, venue_city, venue_country, venue_id, venue_lat, venue_long, venue_name, venue_state, venue_tipCount, venue_usersCount, venue_verified
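
As a quick sanity check on the parsed output, the sketch below reads the CSV files and summarises them using only the Python standard library. It is an illustration, not part of the repository: the helper names (read_rows, group_members, checkins_per_venue) are hypothetical, and it assumes that each CSV has a header row with the column names listed above and that the `with` column of df_checkins_with.csv holds one companion user id per row.

```python
import csv
from collections import defaultdict
from pathlib import Path


def read_rows(csv_path):
    """Read a CSV file with a header row into a list of dicts."""
    with open(csv_path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


def group_members(checkins_with_rows):
    """Map each group_id to the set of user ids seen in that group.

    Assumption: `user_id` is the user who checked in and `with` is a
    single companion's user id (one companion per row).
    """
    members = defaultdict(set)
    for row in checkins_with_rows:
        members[row["group_id"]].add(row["user_id"])
        if row.get("with"):
            members[row["group_id"]].add(row["with"])
    return dict(members)


def checkins_per_venue(checkin_rows):
    """Count how many check-in rows refer to each venue_id."""
    counts = defaultdict(int)
    for row in checkin_rows:
        counts[row["venue_id"]] += 1
    return dict(counts)


if __name__ == "__main__":
    # "output_dir" stands for the directory passed to the parsing script.
    output_dir = Path("output_dir")
    groups = group_members(read_rows(output_dir / "df_checkins_with.csv"))
    venues = checkins_per_venue(read_rows(output_dir / "df_checkins.csv"))
    print(f"{len(groups)} groups, {len(venues)} venues with check-ins")
```

If the files were written with a pandas index column, csv.DictReader simply exposes it as an extra unnamed key, so the helpers above are unaffected.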

Data Analysis

Statistics and charts

  • cd data_analysis/
  • python configuration_file
    • Where:
    • configuration_file: Your configuration for running the analysis


Run Baselines

  • python configuration_file
    • Where:
    • configuration_file: Your configuration for running the baselines


  • For privacy reasons, we cannot share a public link to the dataset. Please contact the main author for further information.


