Restaurant Recommendation
Care about food?
Care about service?
Care about price?
Pick a restaurant JUST FOR YOU.

Presented by:
Huayi Zhang
Jiaming Di
Weiqing Li
Yingnan Han
We built a restaurant recommendation system based on users' preferences for different aspects of a restaurant, using the Yelp dataset.
First, we used Hierarchical Latent Tree Analysis (HLTA), which clusters words into several topics. Then we labeled each review sentence by sentence. If a word in a sentence belongs to a certain topic A, we add 1 to this review's entry for A in the count table, and we add the sentence's sentiment score to this review's entry for A in the score table.
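The sentence-wise labeling described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: `TOPIC_WORDS`, `toy_sentiment`, and `label_review` are hypothetical names, the topic word sets stand in for the HLTA output, the sentiment scorer is a toy placeholder, and counting a topic at most once per sentence is an assumption.

```python
import re
from collections import defaultdict

# Hypothetical topic -> word sets; in the project these would come from HLTA.
TOPIC_WORDS = {
    "food": {"pizza", "burger", "taste"},
    "service": {"waiter", "staff", "service"},
}

def toy_sentiment(sentence):
    """Placeholder scorer (assumption): +1 per positive word, -1 per negative word."""
    pos = {"great", "good", "friendly"}
    neg = {"bad", "slow", "rude"}
    words = re.findall(r"[a-z']+", sentence.lower())
    return sum((w in pos) - (w in neg) for w in words)

def label_review(text, topic_words=TOPIC_WORDS, sentiment=toy_sentiment):
    """Return (counts, scores) per topic for a single review."""
    counts = defaultdict(int)
    scores = defaultdict(float)
    # Naive sentence split on ., ! and ?; a real tokenizer would be more robust.
    for sentence in re.split(r"[.!?]+", text):
        words = set(re.findall(r"[a-z']+", sentence.lower()))
        if not words:
            continue
        s = sentiment(sentence)
        for topic, vocab in topic_words.items():
            if words & vocab:          # the sentence mentions this topic
                counts[topic] += 1     # one count per sentence (assumption)
                scores[topic] += s     # add the sentence's sentiment score
    return dict(counts), dict(scores)
```

For example, `label_review("The pizza was great. The waiter was rude and slow.")` assigns the first sentence to the food topic and the second to the service topic.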
Second, we generated two tables: one containing every restaurant's mean score in each topic, and one containing every user's total count in each topic.
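One way to picture these two tables is the sketch below. It assumes each labeled review is available as a `(user_id, restaurant_id, counts, scores)` tuple, which is a hypothetical layout, and it averages a restaurant's topic score only over the reviews that mention that topic, which is also an assumption. `build_profiles` is an illustrative name.

```python
from collections import defaultdict

def build_profiles(reviews):
    """Aggregate labeled reviews into user count and restaurant mean-score tables.

    reviews: iterable of (user_id, restaurant_id, counts, scores) tuples,
    where counts and scores map topic -> value (hypothetical layout).
    """
    user_counts = defaultdict(lambda: defaultdict(int))
    score_sums = defaultdict(lambda: defaultdict(float))
    score_n = defaultdict(lambda: defaultdict(int))

    for user_id, rest_id, counts, scores in reviews:
        for topic, c in counts.items():
            user_counts[user_id][topic] += c      # user's total count per topic
        for topic, s in scores.items():
            score_sums[rest_id][topic] += s       # running sum for the mean
            score_n[rest_id][topic] += 1          # number of reviews mentioning topic

    restaurant_means = {
        rid: {t: score_sums[rid][t] / score_n[rid][t] for t in topics}
        for rid, topics in score_sums.items()
    }
    return {u: dict(c) for u, c in user_counts.items()}, restaurant_means
```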
Thus, given a user_id, we can calculate every restaurant's score for that user. The score is a linear combination of the restaurant's scores in all topics, weighted by the user's log-scaled count in each topic. We return the top k restaurants as recommendations.
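The scoring step can be sketched as below. This assumes the user's counts and the restaurants' mean scores are available as plain dictionaries, and it uses `log(1 + count)` as the log scaling so that a zero count gives zero weight; the exact scaling used in the project is not stated, so treat that as an assumption. `recommend` is a hypothetical name.

```python
import heapq
import math

def recommend(user_counts, restaurant_scores, k=3):
    """Return the top-k (restaurant_id, score) pairs for one user.

    user_counts: topic -> how often the user mentioned that topic.
    restaurant_scores: restaurant_id -> {topic -> mean sentiment score}.
    """
    # Log-scaled weights; log1p(c) = log(1 + c) is an assumed choice.
    weights = {t: math.log1p(c) for t, c in user_counts.items()}
    # Linear combination of each restaurant's topic scores.
    totals = {
        rid: sum(w * topic_scores.get(t, 0.0) for t, w in weights.items())
        for rid, topic_scores in restaurant_scores.items()
    }
    return heapq.nlargest(k, totals.items(), key=lambda kv: kv[1])
```

With this weighting, a user who mostly writes about food gets restaurants ranked mainly by their food score, while topics the user never mentions contribute nothing.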
Originally obtained from the Yelp Dataset Challenge
The original dataset is 6.53 GB in total. We mainly used review.json (4.2 GB) from it. For the local implementation, we used the first 50,000 reviews. After generating all the tables, the database is 137.8 MB.
Use HLTA to get word clusters
We ran HLTA on 50,000 reviews to generate the word cluster model. The words are allocated into 18 clusters.
Build a recommendation system based on users' preference in 18 topics.
The system resembles a content-based recommender; however, we built the user profiles and restaurant profiles from the 18 topics we obtained.
Use NDCG to compare the ranking outcome between our system and collaborative filtering.
Typically, our system reaches an NDCG score of 0.95.
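For reference, NDCG for a single ranked list can be computed as in the sketch below. It uses the linear-gain form of DCG, rel_i / log2(i + 1) with positions starting at 1; which NDCG variant and relevance labels the project used is not stated, so both are assumptions here.

```python
import math

def dcg(relevances):
    """Discounted cumulative gain with linear gain: sum of rel_i / log2(i + 1)."""
    # enumerate starts at 0, so position i (1-based) corresponds to index i - 1.
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances))

def ndcg(relevances, k=None):
    """NDCG@k for relevances listed in the order the system ranked the items."""
    ranked = list(relevances)[:k]
    ideal = sorted(relevances, reverse=True)[:k]   # best possible ordering
    ideal_dcg = dcg(ideal)
    return dcg(ranked) / ideal_dcg if ideal_dcg > 0 else 0.0
```

A perfectly ordered list gives an NDCG of 1, and any ordering that places less relevant items above more relevant ones gives a value below 1.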
We used Python for the local version, with the nltk, pandas, and numpy packages for calculation.
Locally, we used SQLite3 for data storage. We also used Azure Databricks for big data computation.
We used GitHub for version control and code sharing, and Google Drive for document sharing. Please check the code here.
This is a course project for CS595. Thank you for reading.
And special thanks to Professor Kyumin Lee.