Endogenous and Exogenous Multi-Modal Layers in Context Aware Recommendation Systems for Health

by Nitish Nag, et al.

People care more about solutions to their problems than about data alone. In practice, this means using data to generate a list of recommendations for a given situation. The rapid growth of multi-modal wearables and sensors has not yet made this jump effectively in the domain of health. Modern content consumption and decision making in both cyber (e.g. entertainment, news) and physical (e.g. food, shopping) spaces rely heavily on targeted, personalized recommender systems. The utility function is the primary ranking method used to predict what a given person would explicitly prefer. In this work we describe two layers of user and context modeling that can be coupled to traditional recommender system approaches. The exogenous layer incorporates factors outside the person's body (e.g. location, weather, social context), while the endogenous layer integrates data to estimate the physiologic or innate needs of the user. This is accomplished through multi-modal sensor data integration applied to domain-specific utility functions, filters and re-ranking weights. We showcase this concept through a nutrition guidance system focused on controlling sodium intake at a personalized level, improving upon fixed, uniform recommendations.





1 Introduction

Recommendation systems are becoming an increasingly important part of daily life in various domains. The large sets from which users must choose can be overwhelming. In areas such as online shopping, entertainment, and information search, there has been considerable effort to streamline choice through recommendations. These traditional recommendation systems consider two main components: the user and the items. In more recent work, context has become a third valuable component. Context is complex and difficult to characterize, but it usually alters the recommendations to fit a given user situation. Building real-time user models from context has been a key component of advanced recommender systems. Generally, the contextual information of a given user is divided into three classes: fully observable, partially observable, or unobservable [1]. Incoming data used to determine context is divided along another dimension into static or dynamic data. Wearable devices, sensors, multimedia, social networks, and many more multi-modal data streams carry information that can better characterize the user model to enhance recommendations in both static and dynamic data domains.

Applying these multi-modal data sources to health applications remains a challenge. The growing momentum of wearable devices that monitor health metrics via sensors continues to promise better health for users. How will the data generated by these sensors become useful in daily life? Ideally, an important application would be to improve personalized recommendations pertaining to health decisions. In this work we add a new dimension to contextual data by distinguishing between information about the environmental situation of a user (exogenous to the user) and the inherent biological needs of a user under the skin (endogenous to the user). After an explanation of the concept, we demonstrate it by predicting sodium needs in real time for food recommendations.

Figure 1: Increasing aggregation of multi-modal user data streams can more accurately predict the health status and needs of a user.

2 Related Work

Context-Aware Recommendation Systems (CARS) have been used increasingly over the past decade to improve recommendation performance in various domains. The basic techniques of pre-filtering, post-filtering, and context modeling are well described in reviews and texts [1]. Furthermore, contextual data has usually been split into dynamic and static components that can be observed at various fidelity levels (the X and Y axes in Figure 2).

Individual context is a major category of CARS as described by Villegas et al. [2]. Within individual context, Villegas et al. describe a subcategory of "natural" events, which are characteristics that human intervention does not control, such as weather or pollution. The "human" subcategory describes preferences and behavior. One critical aspect these classifications do not address is the natural biological / health context of the individual, which is the main focus of this work.

The need to use multimedia and multi-modal data to power health recommendations has been echoed in previous work by several groups in the multimedia and recommender communities [3] [4]. A large part of this is understanding what the user needs, either through direct understanding of intent [5] or through implicit understanding of user needs, especially in the area of health, from sensors and wearable devices [6]. Health recommender systems have two main user groups that are well described: the patient and the health-care provider [7] [8]. Both of these user groups need to know the biological context in order to apply recommendations that affect health.

3 Biological Context: Exogenous and Endogenous Layers

Recommendation in health is inherently different from the popular recommendation systems most users interact with, which try to estimate user preferences or ratings (e.g. Netflix, Amazon). We need to consider at least two aspects when recommending an item to a user in the context of health: 1) how the item affects the user's health, and 2) how the item aligns with the user's preferences (the traditional recommender approach). The endogenous and exogenous layers in this work address the first aspect in an organized fashion. The same item may affect a user differently in different situations. The layers in this concept are clearly delineated between the environmental factors external to the skin and the physiologic factors beneath the skin. Combined, they give a clean separation of the situations affecting the body, so that physiological science can be used to predict needs (Figure 2).

3.1 Exogenous Context

Exogenous context defines the realm of the situation outside the body. The environmental factors that can be determined include the physical, social, and informational surroundings of the user. The sensors providing this data may not be on a user device; they may instead be stations in the environment that monitor the aforementioned components. For example, pollution, weather, traffic, and events in the external space are monitored by specialized fixed sensors. Using the user location from the device, we can determine which sensors to pull data from to estimate the user environment. One consideration is whether the user is inside or outside a building. The GPS location may give a general area, but if the user is inside a building, we must know the environmental status inside the building, not the outside weather. Sensors on the user may also capture information about the environment, since most wearables and smartphones used to capture data are external to the user's body.

Figure 2: The traditional context data dimensions of data change rate and context knowledge compose the X and Y axes. In health we add a Z axis with exogenous and endogenous layers. An example to consider is glucose monitoring: a single finger-prick reading would be a partially observable endogenous context data type, whereas a continuous glucose monitoring device would be a fully observable endogenous context data type. Location-based services would fall in the exogenous layer.
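The three axes of Figure 2 can be read as a small data model. The following Python sketch is only an illustration: the class names, field names, and the `in_layer` helper are hypothetical, and the change rate and observability assigned to each example signal are assumptions except where the caption states them.

```python
from dataclasses import dataclass
from enum import Enum


class Observability(Enum):
    FULL = "fully observable"
    PARTIAL = "partially observable"
    NONE = "unobservable"


class ChangeRate(Enum):
    STATIC = "static"
    DYNAMIC = "dynamic"


class Layer(Enum):
    EXOGENOUS = "exogenous"    # outside the body
    ENDOGENOUS = "endogenous"  # beneath the skin


@dataclass(frozen=True)
class ContextSignal:
    """One data stream placed on the three axes of Figure 2."""
    name: str
    observability: Observability
    change_rate: ChangeRate
    layer: Layer


def in_layer(signals, layer):
    """Return the names of the signals that belong to one layer."""
    return [s.name for s in signals if s.layer is layer]


SIGNALS = [
    # Layer and observability follow the Figure 2 caption; DYNAMIC is assumed.
    ContextSignal("finger-prick glucose", Observability.PARTIAL,
                  ChangeRate.DYNAMIC, Layer.ENDOGENOUS),
    ContextSignal("continuous glucose monitor", Observability.FULL,
                  ChangeRate.DYNAMIC, Layer.ENDOGENOUS),
    # The caption only places location in the exogenous layer;
    # FULL and DYNAMIC are assumptions made for this example.
    ContextSignal("location service", Observability.FULL,
                  ChangeRate.DYNAMIC, Layer.EXOGENOUS),
]
```

Tagging each stream this way lets later stages select, for instance, only the static signals for pre-filtering or only the endogenous ones for physiological modeling.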

3.2 Endogenous Context

Endogenous context defines the physiological status of the user's physical body. Many sensors and wearable devices are focused on capturing this data. Examples include heart rate monitors, thermometers, glucose monitors, blood pressure monitors, accelerometers for movement, and galvanic skin resistance sensors, to name a few. For sport-specific and military applications, this set is further expanded with embedded sensors, power meters, and multi-location accelerometers on the body. These sensors all try to estimate some physiological parameter of the body. Integrating them into a meaningful "body / health context" for use in recommender systems is essential for these sensors to become meaningful in daily life [9]. The context will be used in domain-specific applications determined by the use case of the user or health provider. For example, in diabetes, glucose monitoring along with other sensors aims to reduce overall plasma glucose levels to control the disease. Yet these sensors are still unable to provide common daily-life recommendations for diabetes management, such as what a user should be eating. In the case of heart failure (the condition with the most expensive hospital readmission costs in the United States [10]), blood pressure monitoring along with other factors does not control the primary risk factor for readmission to the hospital. A primary driver of readmission remains high-sodium food intake, which results in fluid retention and exacerbation of the heart failure. Endogenous context in this case could be used to recommend foods with the appropriate level of sodium for the patient. This is the specific aim we tackle in the application of this concept.

Figure 3: A hybrid of two paradigms used in CARS. Pre-filtering and multi-dimensional modeling can be an effective method for health contextual understanding. The relevant items are reduced in space through the pre-filter of exogenous and endogenous static data. This reduced set can then be incorporated into a multi-dimensional model that takes into account dynamic data streams from multi-modal data to give the final utility matrix of items.

3.3 Integration of Biological Context Layers in CARS

A method of bringing these layers into CARS is shown in Figure 3. The initial workflow uses the CARS tool of pre-filtering with static data parameters to reduce the item space to only relevant and safe-to-consume items. Following this, we use a multi-dimensional approach to integrate both the exogenous and endogenous dynamic data. We must combine these two layers in this phase because exogenous factors affect the endogenous state. In the case of sodium needs, for example, temperature and humidity affect the sweat rate of an individual. This situation modeling portion must be customized to the domain-specific use case and goals of the recommender system. The end goal of this workflow is not to produce an estimated preference score as in traditional recommender systems (hence no reference to ratings in the data space), but rather to estimate the user's health needs.
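As a rough illustration of the two phases in Figure 3, the sketch below first applies a static pre-filter and then lets exogenous readings modify an endogenous need estimate. All names (`Item`, `StaticProfile`, `pre_filter`, `situation_model`) are hypothetical, and the numeric coefficients are placeholders chosen for the example, not values from the cited physiology studies.

```python
from dataclasses import dataclass


@dataclass
class Item:
    name: str
    allergens: frozenset = frozenset()
    vegetarian: bool = False


@dataclass
class StaticProfile:
    """Static user data used only for pre-filtering."""
    allergens: frozenset = frozenset()
    vegetarian_only: bool = False


def pre_filter(items, profile):
    """Phase 1: static data shrinks the item space to safe, relevant items."""
    kept = []
    for item in items:
        if item.allergens & profile.allergens:
            continue  # unsafe for this user
        if profile.vegetarian_only and not item.vegetarian:
            continue  # violates a static dietary preference
        kept.append(item)
    return kept


def situation_model(baseline_need_mg, temperature_c, humidity_pct):
    """Phase 2: exogenous readings modify the endogenous need estimate.

    Placeholder model: heat above 20 C, scaled up by humidity, raises the
    estimated sodium need. The constants 20.0 and 40.0 are assumptions.
    """
    heat_load = max(0.0, temperature_c - 20.0) * (1.0 + humidity_pct / 100.0)
    return baseline_need_mg + 40.0 * heat_load
```

The output of the second phase is an estimated need rather than a predicted rating, which is the distinction the workflow is meant to draw.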

4 Application to Health: Nutrition Recommendations

4.1 System Goals

To capture the user attributes necessary to elucidate the exogenous and endogenous context, we create a multi-modal user vector from different sources of data, such as smartphones and sensors (body weight, accelerometer-based motion, barometric altitude gain/loss, timestamps), and from environmental data (altitude, temperature). We need to incorporate expert knowledge into the system to bring the user vector and the item vector into the same space in order to compute the utility matrix. We do this using a combination of the algorithms listed in Table 1, which convert the relevant aspects of the user vector to the space of the item vectors (i.e. nutritional sodium requirements) and match them to generate a score for how well the corresponding item satisfies the user’s sodium nutritional needs [11]. As mentioned earlier, heart failure patients have a great need for tailored sodium intake, yet the gold standard is a single uniform number (1,500 mg) given by the American Heart Association [12].

Component           Author
Temperature         Bates et al. [13]
Altitude            Hannon et al. [14]
Walking             Howley et al. [15]
Stairs              Aziz et al. [16]
Basal Metabolism    Schofield et al. [17]
Age                 Stapleton et al. [18]
Table 1: Scientific Resources to Understand Sodium Needs
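One way to picture how the components in Table 1 could map a user vector into the sodium dimension is an additive model with one term per component. The sketch below is only an assumption about the structure: the field names, the additive form, the age adjustment, and every coefficient in `COEFFS` are hypothetical placeholders and are not taken from the cited studies.

```python
from dataclasses import dataclass


@dataclass
class UserVector:
    weight_kg: float
    age_years: int
    steps: int
    floors: int
    altitude_m: float
    temperature_c: float


# Placeholder coefficients for illustration only (NOT from the literature).
COEFFS = {
    "basal_mg_per_kg": 20.0,
    "walk_mg_per_1000_steps": 30.0,
    "stairs_mg_per_floor": 2.0,
    "heat_mg_per_deg_above_20c": 25.0,
    "altitude_mg_per_1000_m": 50.0,
    "age_factor_over_60": 0.9,
}


def sodium_components(user, c=COEFFS):
    """Return one sodium term (mg) per non-age component of Table 1."""
    return {
        "basal": c["basal_mg_per_kg"] * user.weight_kg,
        "walking": c["walk_mg_per_1000_steps"] * user.steps / 1000.0,
        "stairs": c["stairs_mg_per_floor"] * user.floors,
        "temperature": c["heat_mg_per_deg_above_20c"]
        * max(0.0, user.temperature_c - 20.0),
        "altitude": c["altitude_mg_per_1000_m"] * user.altitude_m / 1000.0,
    }


def sodium_need_mg(user, c=COEFFS):
    """Sum the component terms, then apply a hypothetical age adjustment."""
    total = sum(sodium_components(user, c).values())
    if user.age_years > 60:
        total *= c["age_factor_over_60"]
    return total
```

Keeping each component as a separate term makes it straightforward to swap in the relationship from the corresponding study without touching the others.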

This is a real-world problem, as users very often rely on electronic review sources to look for food. These services (e.g. Yelp, Zomato, OpenTable) do not take any health factors into consideration. A major problem for users deciding where to eat is that they search at the granularity of the restaurant. This is not the appropriate way to match a user to a meal, as a restaurant may have various offerings that are healthy or unhealthy. The correct way is to match at the level of the individual menu item. This is analogous to shopping online for an item rather than for a store. A further problem with online food resources is that they are so large that a user cannot browse through all the options. A recommender system should therefore provide relevantly matched items to reduce the browsing burden. This problem is commonly known as the Long Tail Problem (Figure 4). The goal of this system is to capture the healthy items in the long tail and match them to the personal sodium needs of a user.

Figure 4: In consumer decisions, the long tail problem makes it difficult to find items that are not popular but may be relevant to the user. By understanding the health status of the user, CARS can match health relevant items that a user would not usually browse through. This essentially allows the user to discover healthy items that are relevant but would have been difficult to find through popularity based rating systems.

4.2 System Architecture

We adapt the paradigms of pre-filtering and multi-dimensional analysis from Figure 3 for the meal recommendation problem in Figure 5. This system uses the smartphone as a sensor capture device and for the recommendation delivery.

Following the workflow described in Figure 3, we use a database crawled from publicly available restaurant menus on the internet to gather nutrition facts and restaurant information and generate the item space. The exogenous pre-filter takes into account static data pertaining to the availability of meal resources, including timing, distance, and traffic conditions. The endogenous pre-filter takes into account static data pertaining to the user's food allergens and dietary preferences (vegan, vegetarian). The multi-dimensional situation model uses an algorithm derived from biological scientific studies measuring sodium loss and retention in various situations, as listed in Table 1.

4.3 Experiments and Results

The primary aim of our application system is to find the healthiest meal available for purchase near a given location, where healthiest is defined as best matching the user's sodium needs. We test three artificial users (Table 2) in three synthetic scenarios (Table 3). After computing the sodium loss from calculations derived from the literature, we obtain a sodium need for each user-context situation (Table 4) and try to match it to the most appropriate meal within a 30 km radius. The ranking of the foods is then carried out by the food ranking algorithm Elixir [11] on the subset of pre-filtered items.
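The matching step can be sketched as a radius filter followed by a ranking. The example below uses the standard haversine great-circle distance for the 30 km radius and, as a simple stand-in for Elixir (whose details are given in [11] and not reproduced here), ranks meals by the absolute difference between their sodium content and a target. The dictionary keys, the function names, and the idea of a single per-meal `target_mg` are assumptions made for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two latitude/longitude points in km."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def rank_meals(meals, user_lat, user_lon, target_mg, radius_km=30.0):
    """Keep meals within the radius, then sort by closeness to the target.

    Stand-in ranking only: smallest |sodium - target| first.
    """
    nearby = [
        m for m in meals
        if haversine_km(user_lat, user_lon, m["lat"], m["lon"]) <= radius_km
    ]
    return sorted(nearby, key=lambda m: abs(m["sodium_mg"] - target_mg))


# Hypothetical menu items; coordinates are near Los Angeles and Newport Beach.
MEALS = [
    {"name": "ramen", "sodium_mg": 1900, "lat": 34.06, "lon": -118.25},
    {"name": "grain bowl", "sodium_mg": 700, "lat": 34.04, "lon": -118.23},
    {"name": "fish tacos", "sodium_mg": 800, "lat": 33.62, "lon": -117.93},
]

ranked = rank_meals(MEALS, 34.05, -118.24, target_mg=900)
```

With the user placed in downtown Los Angeles, the Newport Beach item falls outside the radius even though its sodium content is close to the target, which shows why the distance filter has to run before the ranking.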

Figure 5: The conceptual architecture of the nutrition recommendation system. The Physio Module estimates the user physiological response to the endogenous and exogenous factors. These are mapped to the same dimensions as the nutrition features (sodium in this case) to allow for combination using the Elixir algorithm to rank foods [11].


Users    Health Parameters
         Height    Weight     Sex       Status    Age
         167 cm    125 lbs    Male      N         29
         190 cm    290 lbs    Male      O         37
         155 cm    85 lbs     Female    MA        18


Table 2: Synthetic Users. Health Status: N = Normal, O=Obese, MA=Muscle Atrophy


Situation       Sensor Parameters
                Steps     Floors    Altitude       Temp
Workday         2,400     12        20 feet        70 F
Hiking          30,650    207       10,700 feet    42 F
Beach Picnic    7,430     31        0 feet         92 F


Table 3: User scenarios. To generate latitude/longitude we use the following locations: Workday is inside an office building in Los Angeles, California; Hiking is in Yosemite National Park, USA; Beach Picnic is at Newport Beach, California.


Situation    Health Parameters
Workday      2767    3320
Hiking       3798    4558
Beach        2826    3391

Workday      3712    4455
Hiking       6022    7277
Beach        3923    4708

Workday      2183    2620
Hiking       2894    3472
Beach        2215    2658


Table 4: Sodium Need Results

5 Future Work and Conclusions

While we have demonstrated the system with only a health utility function for the healthy meal recommendation problem, this approach could be generalized to include other utility functions and CARS techniques which capture different aspects of the user’s decision making process affecting health. Future work to advance health needs profiling of a user will need to take into account many factors including the following:

5.1 Behavioral and Social Understanding

Humans are creatures of habit. Understanding the behavioral reasons for decisions made using recommender systems is critical to effectively engaging the user in choosing the healthiest options from a health recommender system. Behavioral modification remains a complex and difficult task in health applications. For example, a user may behave differently in the presence of friends and family due to peer pressure, which can be captured by identifying their choices in the presence of different people. This approach can also be adapted to solve problems in other domains; we would need to replace the expert knowledge module with a relevant expert system or a learning system that could model the behavioral utility function.

5.2 Multi-Criteria

Once we have computed the utility matrices capturing different aspects of the user’s decision process, we need to combine them to obtain better real-world final recommendations. This can be done by treating health and traditional rating systems as different recommender systems producing their own utility functions. Combining these utility functions can be accomplished in different ways, such as using a weighted average of different utility matrices where the weight of each matrix captures the extent to which the user exhibits the behavior aspect captured by the utility function, or using a threshold based system and chaining the output of one system to another system’s input.
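Both combination strategies mentioned here can be sketched in a few lines. In the example below, each recommender's output for one user is represented as a dictionary from item to utility in [0, 1]; this representation, the function names, and the sample scores are assumptions made for illustration.

```python
def weighted_average(utilities, weights):
    """Combine per-item utilities from several recommenders.

    utilities: list of dicts mapping item -> score in [0, 1]
    weights:   one non-negative weight per recommender
    Only items scored by every recommender are kept.
    """
    if len(utilities) != len(weights):
        raise ValueError("need exactly one weight per utility function")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    shared = set.intersection(*(set(u) for u in utilities))
    return {
        item: sum(w * u[item] for u, w in zip(utilities, weights)) / total
        for item in shared
    }


def threshold_chain(health, preference, min_health):
    """Chain two systems: the health utility gates, preference ranks.

    Items below min_health are dropped; the rest are ordered by
    preference, highest first.
    """
    passed = [i for i, h in health.items()
              if h >= min_health and i in preference]
    return sorted(passed, key=lambda i: preference[i], reverse=True)


# Hypothetical scores for three dishes.
health = {"soup": 0.9, "fries": 0.2, "salad": 0.7}
preference = {"soup": 0.4, "fries": 0.9, "salad": 0.6}

blended = weighted_average([health, preference], [3, 1])
gated = threshold_chain(health, preference, min_health=0.5)
```

The weighted average trades health against preference continuously, while the threshold chain treats health as a hard constraint; which is appropriate depends on how strict the health requirement is for the user in question.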

5.3 Cross-Domain Integrations

There are multiple avenues where recommender systems can be applied for users. Knowledge from various domains may need to be combined in certain health situations. Furthermore, knowledge from a source domain may have implications for a different target domain. There are some established approaches for general cross-domain recommenders in health. Applying these to the endogenous and exogenous layers remains to be explored. Two broad categories exist:

1. Multi-domain: Approaches may be used to understand how factors in various health source domains interact to affect various target recommendations. In the endogenous setting, this closely mirrors the way various organ systems interact and affect each other. For example, readings from a glucose monitor, heart rate monitor, and blood pressure monitor are all in different domains, but may all affect recommendations that result in changes in all domains.

2. Linked and Cross Domains: Knowledge about one domain may inform the best recommendation in a different domain. For example, the atmospheric exogenous domain may inform what the user would find most tasty (hot weather may increase the utility of saltier food in both the health and user preference matrices).

In conclusion, this work introduces the concept of extending traditional CARS approaches in health along the dimension of exogenous and endogenous components. This distinction may be a critical step in organizing the integration of wearable and multi-modal data into recommender systems. This layering allows mapping of needs for both consumer and health professional situations. These two components also depart from the traditional approach of estimating a user rating, which is a subjective measure of user preference, and move toward an objective measure of user needs. We demonstrate this in a system that is useful for heart failure patients who need customized sodium guidance. Ultimately, recommender systems for health are best utilized when they can fulfill these user needs most accurately.


  • [1] Gediminas Adomavicius and Alexander Tuzhilin, “Context-Aware Recommender Systems,” in Recommender Systems Handbook, pp. 191–226. Springer US, Boston, MA, 2015.
  • [2] Norha M. Villegas, Cristian Sánchez, Javier Díaz-Cely, and Gabriel Tamura, “Characterizing context-aware recommender systems: A systematic literature review,” Knowledge-Based Systems, vol. 140, pp. 173–200, 1 2018.
  • [3] Nitish Nag, Vaibhav Pandey, and Ramesh Jain, “Health Multimedia: Lifestyle Recommendations Based on Diverse Observations,” Proceedings of the 2017 ACM on International Conference on Multimedia Retrieval, pp. 99–106, 2017.
  • [4] Hanna Schäfer, Santiago Hors-Fraile, Raghav Pavan Karumur, André Calero Valdez, Alan Said, Helma Torkamaan, Tom Ulmer, and Christoph Trattner, “Towards Health (Aware) Recommender Systems,” in Proceedings of the 2017 International Conference on Digital Health - DH ’17, New York, New York, USA, 2017, pp. 157–161, ACM Press.
  • [5] Christoph Kofler, Martha Larson, and Alan Hanjalic, “User Intent in Multimedia Search,” ACM Computing Surveys, vol. 49, no. 2, pp. 1–37, 2016.
  • [6] Jie Lu, Dianshuang Wu, Mingsong Mao, Wei Wang, and Guangquan Zhang, “Recommender system application developments: A survey,” Decision Support Systems, vol. 74, pp. 12–32, 2015.
  • [7] Martin Wiesner and Daniel Pfeifer, “Health Recommender Systems: Concepts, Requirements, Technical Basics and Challenges,” International Journal of Environmental Research and Public Health, vol. 11, no. 3, pp. 2580–2607, 3 2014.
  • [8] Emre Sezgin and Sevgi Ozkan, “A systematic literature review on Health Recommender Systems,” E-Health and Bioengineering Conference (EHB), pp. 1–4, 2013.
  • [9] Nitish Nag, Vaibhav Pandey, Hari Bhimaraju, Srikanth Krishnan, Preston J Putzel, and Ramesh Jain, “Cross-Modal Health State Estimation,” ACM Multimedia, 2018.
  • [10] Anika L Hines, Marguerite L Barrett, H Joanna Jiang, and Claudia A Steiner, “Conditions With the Largest Number of Adult Hospital Readmissions by Payer, 2011,” 2011.
  • [11] Nitish Nag, Vaibhav Pandey, Abhisaar Sharma, Jonathan Lam, Runyi Wang, and Ramesh Jain, “Pocket Dietitian : Automated Healthy Dish Recommendations by Location,” Proceedings of the 3rd International Workshop on Multimedia Assisted Dietary Management - MADiMa ’17, pp. 1–9, 2017.
  • [12] Divya Gupta, Vasiliki V. Georgiopoulou, Andreas P. Kalogeropoulos, Sandra B. Dunbar, Carolyn M. Reilly, Jeff M. Sands, Gregg C. Fonarow, Mariell Jessup, Mihai Gheorghiade, Clyde Yancy, and Javed Butler, “Dietary sodium intake in heart failure,” Circulation, vol. 126, no. 4, pp. 479–485, 2012.
  • [13] Graham P Bates and Veronica S Miller, “Sweat rate and sodium loss during work in the heat,” Journal of Occupational Medicine and Toxicology, vol. 3, no. 1, pp. 4, 1 2008.
  • [14] J P Hannon, K S Chinn, and J L Shields, “Alterations in serum and extracellular electrolytes during high-altitude exposure,” Journal of applied physiology, vol. 31, no. 2, pp. 266–73, 8 1971.
  • [15] E T Howley and M E Glover, “The caloric costs of running and walking one mile for men and women,” Medicine and science in sports, vol. 6, no. 4, pp. 235–7, 1974.
  • [16] Abdul Rashid Aziz and Kong Chuan Teh, “Physiological responses to single versus double stepping pattern of ascending the stairs,” Journal of physiological anthropology and applied human science, vol. 24, no. 4, pp. 253–257, 7 2005.
  • [17] W N Schofield, “Predicting basal metabolic rate, new standards and review of previous work,” Human nutrition. Clinical nutrition, vol. 39 Suppl 1, pp. 5–41, 1985.
  • [18] J. M. Stapleton, N. Fujii, R. McGinn, K. McDonald, and G. P. Kenny, “Age-related differences in postsynaptic increases in sweating and skin blood flow postexercise,” Physiological Reports, vol. 2, no. 7, pp. e12078–e12078, 7 2014.