{PrefLib}: A Library for Preferences

CD-00001: Trip Advisor Data

This dataset contains 675,069 reviews of 1,851 hotels across the world scraped from Trip Advisor. The data was scraped and donated by Hongning Wang.

One file contains the numerical aspect ratings provided by the users, along with other information about the hotel. The second file contains the text of the users' reviews. These reviews have been slightly modified: all excess spaces and tabs have been removed and all commas have been changed to semi-colons.

Both files are zipped due to their size. Both are in the dat format, and the first line of each file explains the fields within the file. Some of the usernames contain non-ASCII (Unicode) characters, so please be careful when parsing the files!
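
As a rough illustration of the parsing notes above, the following Python sketch reads the header line and then the records directly from one of the zipped files. It rests on several assumptions that this page does not confirm: that each archive holds a single data file, that fields are comma-separated (suggested by the commas in the review text having been replaced), and that the text is UTF-8. The function name read_dat_from_zip is hypothetical.

```python
import csv
import io
import zipfile


def read_dat_from_zip(zip_path, encoding="utf-8"):
    """Yield one dict per record from the first file inside a zip archive.

    Assumptions (not confirmed by the dataset page): comma-separated
    fields, a first line naming the fields, and UTF-8 text.
    """
    with zipfile.ZipFile(zip_path) as archive:
        member = archive.namelist()[0]  # assumption: one data file per archive
        with archive.open(member) as raw:
            # errors="replace" keeps parsing going if a username is not valid
            # in the assumed encoding, instead of raising UnicodeDecodeError.
            text = io.TextIOWrapper(
                raw, encoding=encoding, errors="replace", newline=""
            )
            reader = csv.reader(text, skipinitialspace=True)
            header = [name.strip() for name in next(reader)]
            for row in reader:
                if row:  # skip blank lines
                    yield dict(zip(header, (field.strip() for field in row)))
```

Because the function is a generator, the 75M review file can be processed one record at a time rather than loaded whole. If the files turn out to use a different delimiter or encoding, the csv.reader call and the encoding argument are the places to adjust.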

Description   Type              Modification  File Name              File Size
Ratings       Zipped Data File  Original      CD-00001-00000001.zip  2.3M
Review Texts  Zipped Data File  Original      CD-00001-00000002.zip  75M