We aim to provide a comprehensive resource for the many research communities that deal with preferences: computational social choice, recommender systems, data mining, machine learning, and combinatorial optimization, to name just a few.
The strength of PrefLib is that it provides carefully curated data in a unified format. We encourage users to read the detailed explanations that we provide regarding the format and the modelling choices. Once everything is clear, feel free to explore the datasets we are hosting, or to search for specific files that may interest you.
Providing data is only the first step; the next one is actually using it. To help you with that, we provide a Python library specifically designed to work with PrefLib instances: PrefLib-Tools. This library is distributed on PyPI.
Data Usage and Citation Policy
Constructing and maintaining this website and its database requires a lot of work. We ask that you provide a reference to our website when publishing research based on data gathered here. Here are some references you can use.
- Nicholas Mattei and Toby Walsh. PrefLib: A Library of Preference Data. Proceedings of the Third International Conference on Algorithmic Decision Theory (ADT 2013).
- Nicholas Mattei and Toby Walsh. A Preflib.org Retrospective: Lessons Learned and New Directions. Trends in Computational Social Choice.
In addition, many datasets have specific citation requirements. Make sure to always include them whenever you use a file taken from such a dataset (especially if you downloaded aggregated data files).
Contributing to PrefLib
We rely on the support of the community in order to increase the usefulness and coverage of this site. If you want to donate a new dataset, report an issue with an existing dataset, or suggest changes to the website, several GitHub repositories are at your disposal.
- PrefLib-Contrib: used as a discussion platform around PrefLib.
- PrefLib-Data: hosts the (raw) data and the related scripts.
- PrefLib-Django: hosts the Django project for the website.
- PrefLib-Tools: hosts the code of the PrefLib-Tools library.
If you need anything, have a look at those repositories, open new issues, comment, like, subscribe and spread the word!
We currently host:
- 8 types of data
- 61 datasets
- 17772 data files
- More than 4.3 GB of data
Here are some links that you might find relevant as well.
- DEMOCRATIX: A Declarative Approach to Winner Determination
- Pnyx: An Easy to Use Aggregation Tool
- Whale4: Which Alternative is Elected?
- VoteLib: A Library of Voting Behavior
- Pabulib: A Library of Participatory Budgeting Instances
- CRISNER: A Qualitative Preference Reasoner
- Spliddit: Quick and Easy Solutions to Fair Division Problems
- RoboVote: AI Driven Decisions
To find more data, check these websites.
- UC Irvine Machine Learning Repository
- University of Minnesota GroupLens Data Sets
- CSPLib: A Problem Library for Constraints
- Microsoft Learning to Rank Datasets
- SATLib: The Satisfiability Library
- Toshihiro Kamishima's Sushi Preference Dataset
- MAX-SAT Evaluations and Datasets
- Stanford Network Analysis Project