Food is essential for human life. Beyond providing nourishment and sustaining well-being, it is an integral part of human culture. Food has been studied from various perspectives, using traditional methods of analysis to identify people’s perception and consumption. The advent of social media and mobile phones has allowed users to create, share and document food images, recipes and reviews, among others, and numerous apps allow users to track their food consumption. All this has led to a rich source of accessible information which fosters food-related studies. Min et al. introduce the concept of food computing, which involves obtaining food data and identifying areas where it can be applied effectively, such as health, food sciences and human behaviour. Food computing collects data from multiple sources (for example, food logging) and involves tasks such as perception, recognition, retrieval, recommendation, prediction and monitoring of food. One of the key outcomes of food computing is understanding the relationship between dietary choices and one’s health status.
A healthy diet promotes overall well-being and lowers the risk of chronic diseases. To aid in building a healthy diet,  discusses an algorithm to compute health scores for food items based on the user’s health status, day-to-day activities and other environmental factors. However, healthy food suffers from an adoption problem, since it is generally not well received.  puts forth the following idea:
“Perceiving food as tasty is important. It’s not good enough simply to tell people what is healthy if they don’t think those foods are also tasty.”
This perception results in people consuming food without regard for its effects on their overall health. This in turn calls for data and technology to be combined with expert knowledge to guide users towards optimal health and food choices.
In this work, we describe a recommendation system that incorporates  alongside users’ taste preferences to help them make health-aware dietary choices. We determine to what extent the characteristics of dishes, namely flavour and cuisine, and user inclination affect the quality of food recommendations, potentially solving the initial adoption problem.
II Supplementary Work
The multimedia community is well versed in logging systems that collect textual, auditory and visual data.  talks about food logging with multimodal signals using smart devices and how it plays an important role in understanding the user’s eating habits. This in turn serves as an important prerequisite for the recommendation system in this work. Food logging generally involves creating a personal model by logging the names, calories, nutritional values and, in some cases, recipes of the dishes consumed by a user. Taste is an important human sense that has not received much attention from the research community. We developed a custom food logging system which tracks a user’s intake based on the calories, nutritional value and flavour profile of their meal. While the exact details of the architecture are beyond the scope of this paper, we used a Faster R-CNN object detector followed by a one-shot classifier to detect the dishes being consumed.
III Related Work
Recommendation systems proactively provide a user with suggestions they are likely to find interesting, based on the user’s previous interactions and/or information previously gleaned from the user. In particular, food recommendation engines like  use TF-IDF to generate vectors from food items while taking into account food database information. An input to these systems is generally a question like “what’s for lunch”. These systems are, however, evaluated using accuracy, precision and recall, which fail to capture the quality of recommendations effectively. ,  and  extend food recommendation to incorporate healthiness criteria and discuss the initial adoption problem of such health-aware food recommendation systems. They introduce collaborative filtering to generate more personalized recommendations. Their evaluation consists not only of traditional quantitative analysis such as accuracy, but also of some form of qualitative analysis such as crowd-sourcing.  suggests an approach wherein users are clustered into distinct groups and recommendations are made for each group as a whole. Such systems treat the entire user cluster as the smallest unit and therefore do not offer truly personalized recommendations. The approaches taken by Elahi et al. and Kuo et al. involve some level of customization for a user, but have different areas of focus: the model of Elahi et al. is evaluated more on the usability and appeal of the recommendations than on their quality, while Kuo et al. provide a personalized menu based on ingredients the user previously specified an interest in. Another work, by Aberg, follows a similar approach, building a meal plan for users in an attempt to tackle malnutrition among the elderly. However, its primary focus is on meeting dietary and nutritional constraints.
Like most other recommendation systems, ours uses two components to form the input, the first one being a curated database of food items containing their ingredient list, nutritional value and cuisine. These food items were scraped from websites such as AllRecipes and Yummly while focusing on food items that reflected the diet of the Indian audience. Additionally, to account for the regional variety, food items were crowd-sourced by sending out surveys to roughly 200 users. This resulted in a food database containing 1381 items. To account for missing values, we scraped MyFitnessPal to fill in the gaps in nutritional values, while the cuisines were manually tagged.
The second component is the set of user profiles for which the recommendations would be generated. This was constructed by obtaining user reviews for the food items that were previously scraped from AllRecipes and crowd-sourcing ratings for the most recent items consumed. User reviews help us understand a user’s preferences, which in turn helps build user profiles for recommending food items.
|Database|Statistic|Count|
|User Database|Total reviews|30,193|
|User Database|Unique users|22,625|
|User Database|Users with more than 5 reviews|466|
|Food Database|Total dishes|1381|
|Food Database|Indian dishes|1051|
For each food item in the curated database, we estimate the intensity of sweet, salty, rich, umami and bitter on a scale of 1 to 10, 10 indicating highest intensity. This is done by identifying and quantifying the most influential chemicals for each flavour.
Once the flavour scores for all food items have been generated, we treat them as additional features for the recommendation engine. This means that each food item has the five flavour scores as five extra dimensions. We then use a similarity score to predict how a user would rate other food items based on their previous ratings.
V-A Flavour Computing
An objective flavour metric is a scale that looks purely at the content of a food item, without considering external factors such as user or cooking preferences. A prime example is the Scoville scale, which measures the spiciness of chilli peppers based on the amount of capsaicin present in them. However, no such metric exists for other flavours. The work of van Dongen et al. discusses the relationship between nutrients and taste. We follow a similar approach and attempt to identify the elements that influence each flavour the most. In all the calculations below, the total weight considered is the total active nutrient weight (TANW) of the dish.
The quantity of sodium indicates the saltiness of a dish. To highlight the prominence of sodium in the dish relative to its weight, we identify the ratio between the total sodium present in the dish and the total nutritional weight. This value is then normalized, by using 100g of table salt as a basis, which contains about 39g of sodium.
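As an illustration, the saltiness computation described above can be sketched as follows. This is a minimal sketch under stated assumptions rather than the exact implementation: the function name, the scaling of the normalised ratio to a 10-point range and the clamping are all assumptions; only the sodium-to-TANW ratio and the 39 g per 100 g table-salt basis come from the text.

```python
# About 39 g of sodium per 100 g of table salt (basis stated in the text).
SODIUM_FRACTION_OF_TABLE_SALT = 39.0 / 100.0


def saltiness_score(sodium_g: float, tanw_g: float, scale: float = 10.0) -> float:
    """Sodium share of the total active nutrient weight, normalised to table salt.

    `scale` and the clamping to [0, scale] are illustrative assumptions.
    """
    if tanw_g <= 0:
        raise ValueError("TANW must be positive")
    sodium_ratio = sodium_g / tanw_g
    normalised = sodium_ratio / SODIUM_FRACTION_OF_TABLE_SALT
    return max(0.0, min(scale, normalised * scale))
```

Under these assumptions, a dish whose sodium share equals that of pure table salt receives the maximum score, and a sodium-free dish receives zero.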
The carbohydrate content in foods consists of monosaccharides, disaccharides and polysaccharides. Monosaccharides (fructose) and disaccharides (sucrose) contribute positively towards the sweetness of a dish, while polysaccharides (dietary fibres) negate their effects. We do a weighted addition of the two components to compute the sweetness score.
where x and y are 0.85 and 0.1, respectively.
Bitterness is indicated by the calcium and iron content in the dish. To combat the lack of data available for these in Indian dishes, we maintained a list of ingredients that were manually tagged as ‘bitter’ or ‘too bitter’. The rank of each ingredient for each group is added and a weighted addition is performed, taking into account the iron content.
where x, y, and z are weights with the values 0.8, 2.4 and 1.3 respectively.
The umami taste is determined by the glutamic acid content, which is prominent in protein-rich dishes. The ingredients were divided into groups like meats, vegetables, umami enhancers (MSG), and protein supplements (whey), sorted in order of their glutamate and protein content. A multiplier was assigned to each ingredient group which was then added up to get the final group score. The fractional protein content and group score were again subjected to a weighted addition in obtaining the final umami score.
where the group multiplier is the weight assigned to each of the following groups: protein supplements, vegetables, meat and savoury phrases, with multipliers 0.8, 7, 3 and 9.45 respectively.
The richness score is computed by considering saturated fats, cholesterol and total fats. The saturated fat content is used as a fraction of the total fat content present in the food. The ratio of the total fat content to the total active nutritional weight and the amount of cholesterol in the dish relative to its weight is also taken into account while calculating the richness score. The final score is a linear combination of the aforementioned factors.
The weights x, y and z are 0.5, 0.7 and 50 respectively, and were arrived at via experimentation.
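The linear combination for richness can be sketched as below. The pairing of the weights with the three factors (x with the saturated-fat share, y with the fat-to-TANW ratio, z with the cholesterol-to-TANW ratio), the use of grams for every quantity and the function name are assumptions made for illustration; any rescaling onto the 1 to 10 range is not shown.

```python
def richness_score(
    saturated_fat_g: float,
    total_fat_g: float,
    cholesterol_g: float,
    tanw_g: float,
    x: float = 0.5,
    y: float = 0.7,
    z: float = 50.0,
) -> float:
    """Linear combination of three fat-related ratios (weight pairing is assumed)."""
    if tanw_g <= 0:
        raise ValueError("TANW must be positive")
    # Saturated fat as a fraction of total fat; zero when the dish has no fat.
    saturated_share = saturated_fat_g / total_fat_g if total_fat_g > 0 else 0.0
    fat_ratio = total_fat_g / tanw_g
    cholesterol_ratio = cholesterol_g / tanw_g
    return x * saturated_share + y * fat_ratio + z * cholesterol_ratio
```

The large weight on the cholesterol term is plausible because cholesterol is typically present in far smaller quantities than fat, but this interpretation is not confirmed by the text.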
|curried green bean salad||0.961||0.7||2.63||2.47||2.534|
|cilantro jalapeno pesto with lime||0.604||4.45||0.904||0.57||2.198|
Sample scores are shown in Table II, which lists the five flavour scores for each of the five sample food items. The scale used for the scores ranges from 1 to 10.
To validate this system, we conducted a survey of about 150 users. We built a website where the users had to assign flavour scores for dishes that were randomly sampled. The survey entries were used as an input to the validator along with the flavour scores generated by the system. The system then computes the error which indicates the difference between the system-generated and the user-provided scores.
In the error computation, the upper-quartile term is the upper quartile of the list of results obtained from the survey. ACTION THRESHOLD is a tunable value above which the error correction is activated for a taste; this accounts for minor user-to-user variations. The process is repeated for each taste to obtain scores adjusted for user feedback.
Here, the variance is computed differently than the conventional procedure - it is obtained on the data list obtained by computing the difference between the generated and the surveyed taste scores. This, therefore, provides the actual variation between the generated scores and what the surveyed users expect. Including the upper quartile ensures the majority (75%) of user responses are accounted for while avoiding considering the responses that are outliers, such as responses that may go against the general consensus. An example of such a response is a user whose taste preference is significantly skewed towards a particular flavour.
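The statistics involved in this validation step can be sketched as follows for a single taste. This is a hedged illustration only: the function name, the threshold value, the use of population variance and the choice to compare the variance against the threshold are assumptions, and the correction applied once the threshold is exceeded is not shown.

```python
from statistics import pvariance, quantiles

ACTION_THRESHOLD = 1.0  # hypothetical value; the text only says it is tunable


def taste_error_summary(generated, surveyed, threshold=ACTION_THRESHOLD):
    """Summarise the disagreement between generated and surveyed scores for one taste."""
    if len(generated) != len(surveyed) or len(generated) < 2:
        raise ValueError("need two equally long lists with at least two scores")
    # Variance is taken over the differences, not over the raw scores.
    diffs = [g - s for g, s in zip(generated, surveyed)]
    variance = pvariance(diffs)
    # Third cut point of the quartiles, i.e. the upper quartile of the survey scores.
    upper_quartile = quantiles(surveyed, n=4)[2]
    return {
        "variance": variance,
        "upper_quartile": upper_quartile,
        "needs_correction": variance > threshold,  # assumed trigger
    }
```

Running this once per taste would give one summary for each of the five flavours, mirroring the per-taste repetition described above.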
The error in Table III has been computed over the food dataset for all users. The values in the table show that the sweet, salty, bitter and rich flavour scores generated by our system are in line with the general consensus of the surveyed users. However, there is a significant deviation between the user-rated and the system-generated scores for umami. This could be explained by the fact that the general populace lacks a good understanding of the umami flavour, which the scores also seem to reflect. Looked at on a per-user basis, this error behaves as a sensitivity factor for each flavour. In this way, a flavour-sensitivity profile can be developed for each user and used to personalize the recommendation of food items even further.
V-B Recommendation Engine
In our work we explore two types of recommendation systems - Collaborative Filtering and Content Based Filtering. With an aim to incorporate food flavours to improve the quality of recommendations, we compare and contrast the effects of including flavour when making food recommendations.
The Content Based Filtering algorithm takes into consideration the ‘content’ or ‘properties’ of an item, such as the ingredients, cuisine and flavour in the case of a food recommendation system. Conversely, Collaborative Filtering makes predictions about a user’s interests by compiling preferences from similar users, and therefore does not take item-specific properties into consideration.
In this work, we incorporate flavour scores in the content-based filtering approach while using collaborative filtering as a baseline for comparing the performance of food recommendation systems.
The Collaborative Filtering approach for recommendation looks to make predictions regarding a user’s preference by collecting preferences from multiple similar users. The assumption in Collaborative Filtering is that people who view and evaluate items in a similar fashion are likely to assess other food dishes in a similar manner.
Matrix Factorisation is a Collaborative Filtering algorithm that takes as input a User-Item Rating Matrix. This matrix is sparse since it is not likely that a user has rated all dishes in the food database. The approach aims to break down the User-Item matrix into two matrices of latent user and item representation. The intent of this approach is to reform the original User-Item matrix while filling in the missing ratings. Figure 2 depicts a User-Item Matrix and latent matrices, which when multiplied, yield predicted scores for items a user has not rated while trying to generate scores as close to the original score for items the user has rated.
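The factorisation idea can be sketched in a few lines of plain Python. This is a toy illustration using stochastic gradient descent with L2 regularisation, which is one common way to fit such a model and not necessarily the variant used in this work; the function names, hyperparameters and toy ratings are all assumptions.

```python
import random


def factorise(ratings, n_users, n_items, n_factors=2, lr=0.01, reg=0.02,
              epochs=5000, seed=0):
    """Fit latent user and item factors to observed (user, item, rating) triples."""
    rng = random.Random(seed)
    users = [[rng.random() for _ in range(n_factors)] for _ in range(n_users)]
    items = [[rng.random() for _ in range(n_factors)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            pred = sum(pu * qi for pu, qi in zip(users[u], items[i]))
            err = r - pred
            for f in range(n_factors):
                pu, qi = users[u][f], items[i][f]
                # Gradient step on squared error with L2 regularisation.
                users[u][f] += lr * (err * qi - reg * pu)
                items[i][f] += lr * (err * pu - reg * qi)
    return users, items


def predict(users, items, u, i):
    """Predicted rating is the dot product of the latent vectors."""
    return sum(pu * qi for pu, qi in zip(users[u], items[i]))


# Toy sparse User-Item matrix: only six of the nine cells are observed.
observed = [(0, 0, 5), (0, 1, 3), (1, 0, 4), (1, 2, 1), (2, 1, 2), (2, 2, 5)]
users, items = factorise(observed, n_users=3, n_items=3)
missing = predict(users, items, 0, 2)  # score for a dish user 0 has not rated
```

Only observed ratings contribute to the updates, which is what allows the product of the two latent matrices to fill in the missing cells.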
|Tags|Potato|Spinach|Flour|Paneer|…|
|Aloo Paratha|0.877|0|0.685|0|…|
|Palak Paneer|0|0.819|0|0.841|…|
TABLE IV: Sample vector (truncated)

|Tags|Potato|Spinach|Flour|…|Bitter|…|
|Aloo Paratha|0.877|0|0.685|…|2.29|…|
|Palak Paneer|0|0.819|0|…|3.72|…|
TABLE V: Same vector as table IV, now including taste scores

|Dish Name|Rating|
|Chole Bhature|4|
|Paneer tikka masala|5|
|Veg Biryani|4|
|Bisi Bele bath|2|
|Aloo paratha|3|
|Vegetarian Korma|2|
|Veg momos|4|
|Veg fried rice|4|
|Rajma|3|
|Naan|4|
|Dal Makhani|5|
|Masala Dosa|5|
|Palak paneer|3|
|Khakhra|2|
|Malai Kofta|3|
TABLE VI: Dishes reviewed by the user

|MatF: Dish Name|MatF: Predicted Rating|TF-IDF: Dish Name|TF-IDF: Predicted Rating|TF-IDF with flavour: Dish Name|TF-IDF with flavour: Predicted Rating|
|Spiced Rice|4.51|Sooji Upma|3.54|Spiced Rice|3.76|
|Carrot Halwa|4.4|Bisi Bele Bath|3.52|Pumpkin Curry With Lentils And Apples|3.4|
|Rajma Chawal|4.37|Carrot|3.49|Dal Khichdi|3.38|
|Red Velvet Cake|4.21|Rajma|3.47|Mango Pickle|3.37|
|Dosa|4.03|Dal Tadka|3.46|Toasted Ravioli|3.31|
|Chickpea Curry|3.93|Carrot And Cilantro Soup|3.44|Carrots, Peas And Potatoes|3.3|
|Roasted Grapes And Carrots|3.92|Tomato Soup|3.43|Rajma|3.3|
|Tomato Soup|3.92|Matar Pulao With Nuts|3.42|Spiced Eggplant|3.3|
|Red Lentil Curry|3.91|Veg Hakka Noodles|3.4|Coconut Vegetarian Curry|3.29|
|Vegetable Bhaji|3.9|Dal Makhani|3.39|Chai Tea|3.28|
|Cilantro Chutney|3.87|Green Beans|3.39|Masala Idli|3.27|
TABLE VII: Predicted Ratings computed by various recommendation techniques

|Method|RMSE|MAE|CF|
|Matrix Factorisation|1.030|0.805|0.531|
|TF-IDF|1.040|0.799|0.541|
|TF-IDF with flavour|1.041|0.807|0.547|
TABLE VIII: Results on training data

|Method|RMSE|MAE|CF|
|Matrix Factorisation|1.712|1.289|1.146|
|TF-IDF|1.454|1.163|1.125|
|TF-IDF with flavour|1.394|1.111|1.017|
TABLE IX: Results of online A/B testing
When making recommendations for a particular user, the Collaborative Filtering algorithm only considers other similar users. It does not take into account the content or features of an item; hence, food flavour cannot be incorporated when using this method to make recommendations.
Content Based Filtering:
Content-Based Recommender systems stem from the idea of using the content, properties or description of an item for recommendation purposes. Items are described with a set of descriptor terms or tags which form the basis for item-based comparison. TF-IDF is a Content-based filtering approach that we utilized in this work.
Term Frequency Inverse Document Frequency (TF-IDF) has its roots in Information Retrieval but finds its application in Content Based Recommendation Systems. We describe food dishes using their ingredients, the cuisine and whether the dish is vegetarian or non-vegetarian. A few examples of tags associated with some dishes are:
“Aloo Paratha” - [vegetarian, cumin, flour, ginger, lemon, masala, oil, paratha, potato, salt, wheat, north indian, punjab]
“Palak Paneer” - [vegetarian, clove, coriander, cumin, curry, garlic, ginger, masala, paneer, salt, spinach, tomato, turmeric, north indian, punjab]
We associated the 1381 dishes in our database with 397 unique tags as described above. Next, for each tag associated with a dish, the TF-IDF score was calculated using the standard TF-IDF formula.
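$$\text{tfidf}_{x,y} = tf_{x,y} \times \log\left(\frac{N}{df_x}\right)$$
where x ranges over the set of tags and y over the set of dishes,
$tf_{x,y}$ = 1 if dish y has tag x, else 0,
$df_x$ = number of dishes containing tag x,
N = total number of dishes.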
For each dish the TF-IDF calculation results in the formation of a vector of length 397. A slice of such a vector is shown in table IV.
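A minimal sketch of how such per-dish vectors could be built from tag sets is given below. It uses the binary term frequency and logarithmic inverse document frequency of the standard formula; the function name and the three-dish toy database are assumptions, and the sketch makes no attempt to reproduce the exact values in Table IV.

```python
import math


def tfidf_vectors(dish_tags):
    """Map each dish to a TF-IDF vector over the sorted tag vocabulary."""
    n_dishes = len(dish_tags)
    vocabulary = sorted({tag for tags in dish_tags.values() for tag in tags})
    # Document frequency: number of dishes that carry each tag.
    doc_freq = {
        tag: sum(1 for tags in dish_tags.values() if tag in tags)
        for tag in vocabulary
    }
    vectors = {}
    for dish, tags in dish_tags.items():
        vectors[dish] = [
            math.log(n_dishes / doc_freq[tag]) if tag in tags else 0.0
            for tag in vocabulary
        ]
    return vocabulary, vectors


# Toy database; tag sets are shortened and partly hypothetical.
dishes = {
    "Aloo Paratha": {"vegetarian", "cumin", "flour", "potato"},
    "Palak Paneer": {"vegetarian", "cumin", "paneer", "spinach"},
    "Naan": {"vegetarian", "flour"},
}
vocab, vectors = tfidf_vectors(dishes)
```

A tag shared by every dish, such as "vegetarian" in this toy set, receives a weight of zero, so it does not influence the similarity between dishes.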
For a given user, formula (7) is used to calculate the preference score for an unrated dish i from the scores of all dishes j that the user has rated: the cosine similarity between dishes i and j is computed and weighted by the user’s score for dish j. Similarly, as Table V depicts, user preference is also calculated after including the five flavour scores, by taking a weighted average of the cosine similarity between the dishes’ ingredient descriptors and that between their flavour descriptors. Assume j ranges over the dishes rated by the user, and let the similarity between dishes i and j be CosineSimilarity(i, j). Then the TF-IDF score will be:
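One plausible reading of this computation is a similarity-weighted average of the user’s existing ratings, sketched below. The normalisation by the sum of similarities, the `flavour_weight` parameter and all function names are assumptions; the text specifies only that cosine similarities are weighted by the rated dish’s score and that ingredient and flavour similarities are combined by a weighted average.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def combined_similarity(tags_i, tags_j, flav_i, flav_j, flavour_weight):
    """Weighted average of ingredient-tag similarity and flavour similarity."""
    return ((1.0 - flavour_weight) * cosine_similarity(tags_i, tags_j)
            + flavour_weight * cosine_similarity(flav_i, flav_j))


def predict_rating(target, rated, flavour_weight=0.0):
    """Predict a rating for `target` from (dish, rating) pairs.

    Each dish is a (tag_vector, flavour_vector) tuple. Returns None when the
    target is dissimilar to every rated dish.
    """
    weighted_sum, similarity_sum = 0.0, 0.0
    for (tags_j, flav_j), rating in rated:
        sim = combined_similarity(target[0], tags_j, target[1], flav_j,
                                  flavour_weight)
        weighted_sum += sim * rating
        similarity_sum += sim
    return weighted_sum / similarity_sum if similarity_sum > 0 else None
```

Setting `flavour_weight` to zero corresponds to plain TF-IDF, while a positive value lets dishes with a similar flavour profile contribute even when they share few ingredients.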
In our work, we have considered three recommendation systems: Matrix Factorisation, which is an implementation of collaborative filtering, and content-based filtering using Term Frequency - Inverse Document Frequency (TF-IDF) with and without flavour as a property. Table VI shows the ratings a particular user has assigned to various food items. Table VII shows the ratings predicted by the three algorithms under consideration.
To evaluate the recommendation systems, we consider three metrics: Root Mean Squared Error (RMSE) (8), Mean Absolute Error (MAE) (9) and Cost Function (CF) (10).
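For reference, the standard definitions of the first two metrics are given below, assuming n rated dishes indexed by i. The Cost Function is specific to this work and is not reconstructed here.

```latex
\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}
\qquad
\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - \hat{y}_i\right|
```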
Here, y and $\hat{y}$ are the actual and predicted values, respectively. Table VIII shows the RMSE, MAE and CF on the training data, which fail to decisively show which recommendation method is best. To clearly distinguish between the three algorithms, we set up an A/B test and sent out a survey to 112 users, requesting them to rate a set of predefined dishes based on their preference on a scale of 1-5 (5 being the highest). On receiving the ratings, the user base was divided into three groups, one for each algorithm, and food predictions for each user were generated by that group’s algorithm from the preferences indicated in the survey. The top 10 and bottom 5 dishes indicated by the recommendation algorithm were sent to the user, who was again asked to rate each dish on a scale of 1-5. We received responses from only 93 of the original 112 users. The user scores were then matched with the scores predicted by the algorithm and evaluated on the three metrics. Table IX shows the results of this online A/B test on a live audience. As seen in Table IX, content-based filtering using TF-IDF with flavour outperforms the other two algorithms on all three metrics (for example, an RMSE of 1.394 against 1.454 for TF-IDF without flavour and 1.712 for Matrix Factorisation). Therefore, we can conclude that using TF-IDF with flavour improves recommendations.
The values seen here include the cuisine of a dish. We developed a rudimentary cuisine classifier using the Naive Bayes algorithm to assign cuisines to dishes based on their ingredients. However, the set of dishes we were working with was heavily biased towards North Indian dishes, and thus the cuisine had an insignificant impact on the quality of recommendations.
VII Future Work
The quality of recommendations could be significantly improved by incorporating a cuisine element. However, this will require our dataset to expand to cover a multitude of cuisines. Preparation techniques will also need to be considered during classification, as they vary from cuisine to cuisine.
A seasonal sensitivity factor could be incorporated into the system, that adds an element of personalization. For example, a novelty function could be incorporated which considers the seasonal trend of a user’s preferences which can then be used to fine-tune the flavour scores.
Significant improvements can be made to the flavour profiler, with the availability of complete and accurate nutritional data for store-bought foods. Existing regulations do not mandate suppliers to report such data in detail. However, other flavour scores could be refined if the data is present. Our database of foods can grow considerably larger as a result of this.
- Freyne, Jill, and Shlomo Berkovsky. “Intelligent food planning: personalized recipe recommendation.” Proceedings of the 15th international conference on Intelligent user interfaces. ACM, 2010.
- Freyne, Jill, Shlomo Berkovsky, and Gregory Smith. “Recipe recommendation: accuracy and reasoning.” International conference on user modeling, adaptation, and personalization. Springer, Berlin, Heidelberg, 2011.
- Svensson, Martin, et al. “A recipe based on-line food store.” Proceedings of the 5th international conference on Intelligent user interfaces. ACM, 2000.
- Elahi, Mehdi, et al. “Interaction design in a mobile food recommender system.” CEUR Workshop Proceedings. CEUR-WS, 2015.
- Kuo, Fang-Fei, et al. “Intelligent menu planning: Recommending set of recipes by ingredients.” Proceedings of the ACM multimedia 2012 workshop on Multimedia for cooking and eating activities. ACM, 2012.
- Aberg, Johan. “Dealing with Malnutrition: A Meal Planning System for Elderly.” AAAI spring symposium: Argumentation for consumers of healthcare. 2006.
- Elsweiler, David, et al. “Bringing the ‘healthy’ into Food Recommenders.” DMRS. 2015.
- van Dongen, Mirre Viskaal, et al. “Taste–nutrient relationships in commonly consumed foods.” British Journal of Nutrition 108.1 (2012): 140-147.
- Nag, Nitish, et al. “Pocket Dietitian: Automated Healthy Dish Recommendations by Location.” International Conference on Image Analysis and Processing. Springer, Cham, 2017.
- Min, Weiqing, Shuqiang Jiang, Linhu Liu, Yong Rui, and Ramesh Jain. “A Survey on Food Computing.” ACM Comput. Surv. 1, 1 (September 2018).
- Forwood, Suzanna E., et al. “Choosing between an apple and a chocolate bar: the impact of health and taste labels.” PloS one 8.10 (2013): e77500.
- Nag, Nitish, Vaibhav Pandey, and Ramesh Jain. “Live Personalized Nutrition Recommendation Engine.” Proceedings of the 2nd International Workshop on Multimedia for Personal Health and Health Care. ACM, 2017.