How Google is improving recommendation systems

Google has published a paper dedicated to improving recommender systems through a deeper understanding of users’ semantic intent. The research is aimed at enhancing the quality of personalization in products such as Google Discover, YouTube, and Google News.

The proposed approach aims to overcome the limitations of modern recommender systems, which rely largely on superficial behavioral signals, and to understand more accurately what each user wants to read, watch, or listen to.

Limitations of Traditional Recommender Approaches

Recommender systems are used to predict content that may be of interest to users, including within media, news, and video platforms, as well as in e-commerce environments. Traditionally, such systems analyze data related to clicks, views, ratings, and purchases in order to generate subsequent recommendations.

The paper describes these signals as primitive user feedback, because they do not adequately capture users’ subjective evaluations, such as perceptions of humor, aesthetic appeal, or level of interest.

Personalized Semantics as a New Approach

The authors note that advances in large language models (LLMs) create new opportunities to use natural language as a more informative source of user feedback. This enables recommender systems to better interpret user intent through semantic analysis of queries, descriptions, and interactions.

The paper emphasizes that interactive recommender systems allow users to express preferences, constraints, and context in a richer form, including through dialog-based interfaces and faceted search. A key challenge remains, however: accurately inferring semantic intent, particularly given the open-ended and informal nature of natural language.

The Role of Semantic Intent in Personalization

The research stresses that the ability of recommender systems to accurately interpret semantic intent is a critical prerequisite for maintaining intuitive user interactions with digital platforms. This approach allows systems to adapt recommendations more effectively to users’ real expectations rather than relying solely on behavioral history.

The Challenge of “Soft” Attributes in Recommender Systems

The authors distinguish between hard attributes and soft attributes. Hard attributes include objective content characteristics such as genre, artist, or director, which have unambiguous interpretations and can be directly utilized by recommender systems.

In contrast, soft attributes are subjective in nature and are not associated with clearly defined ground-truth values. The paper notes that such attributes are characterized by the absence of a definitive source of truth, imprecise interpretations, and dependence on individual user perception. This challenge served as the primary motivation for research into discovering personalized semantics for soft attributes in recommender systems.

Using Concept Activation Vectors to Interpret User Intent

The study proposes a novel application of Concept Activation Vectors (CAVs), a method for analyzing internal vector representations in machine learning models. Traditionally, CAVs are used to interpret how a model encodes specific concepts. In this research, however, the direction of application is reversed: CAVs are used to interpret users rather than the model itself.

Specifically, the method enables the transformation of subjective soft attributes expressed in natural language into mathematical representations suitable for recommender systems. The authors demonstrate that this adaptation of CAVs allows models to detect subtle differences in user intent and to account for individual interpretations of subjective characteristics.
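The idea can be illustrated with a minimal sketch, which is not the paper's implementation. It assumes item embeddings taken from an already trained recommender and a handful of binary labels for one soft attribute (say, "funny"). A linear classifier fitted on those embeddings yields a direction in the latent space, and that direction plays the role of the CAV. The function name train_cav, the synthetic embeddings, and the hyperparameters are all illustrative assumptions.

```python
import math
import random


def train_cav(embeddings, labels, epochs=200, lr=0.5):
    """Fit logistic regression by gradient descent and return the
    unit-length weight vector, used here as the concept direction."""
    dim = len(embeddings[0])
    n = len(embeddings)
    w = [0.0] * dim
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * dim
        grad_b = 0.0
        for x, y in zip(embeddings, labels):
            z = sum(wi * xi for wi, xi in zip(w, x)) + b
            p = 1.0 / (1.0 + math.exp(-z))  # predicted probability of the tag
            err = p - y
            for i in range(dim):
                grad_w[i] += err * x[i]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    norm = math.sqrt(sum(wi * wi for wi in w)) or 1.0
    return [wi / norm for wi in w]


# Synthetic stand-in for learned item embeddings (assumption): 200 items
# in a 4-dimensional latent space. The hypothetical tag "funny" is made
# to depend only on the first latent dimension.
random.seed(0)
items = [[random.gauss(0, 1) for _ in range(4)] for _ in range(200)]
labels = [1 if x[0] > 0 else 0 for x in items]

cav = train_cav(items, labels)
print([round(c, 2) for c in cav])
```

In this toy setup the learned direction should point mostly along the first latent dimension, because that is the only dimension the synthetic tag depends on. The recommender itself is never retrained; only the small classifier is fitted.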

Bridging the Semantic Gap Between Humans and Systems

One of the central challenges addressed by the research is bridging the semantic gap between how humans articulate their preferences and how recommender systems process information. Humans tend to think conceptually, using vague and descriptive language, whereas recommender systems operate on numerical vectors within high-dimensional embedding spaces.

The proposed approach reduces this ambiguity without requiring modification or retraining of the underlying recommender model. According to the researchers, the semantics of soft attributes are inferred directly from representations already learned by the model during training.

Advantages of the Proposed Approach

The authors identify four key advantages of the proposed methodology:

The model’s computational capacity is focused on predicting user-item preferences without the need to learn additional auxiliary information that typically does not improve recommendation performance.
The recommender system can incorporate new soft attributes without retraining when new tags, keywords, or phrases emerge.
The method enables evaluation of the relevance of individual soft attributes for predicting user preferences, which is important for recommendation explainability and preference elicitation.
The semantics of soft attributes can be learned using relatively small amounts of labeled data, consistent with pre-training and few-shot learning paradigms.

Overall System Architecture

At a conceptual level, the proposed system is based on two main components: a collaborative filtering recommender model that learns latent representations of users and items, and a limited set of soft attribute labels provided by users for a subset of content.

Applying Concept Activation Vectors to the model’s latent space makes it possible to determine the degree to which a soft attribute is exhibited by each item and to identify personalized differences in how that attribute is interpreted. This capability is critical for accurately inferring users’ true intent and improving recommendation quality.
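A small sketch of this scoring step, under simplifying assumptions: the item embeddings and the two per-user CAVs below are hand-written, hypothetical values rather than outputs of a trained model. Projecting each item embedding onto a CAV gives the degree to which the item exhibits the attribute, and comparing two users' CAVs for the same tag shows how far their interpretations diverge.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


def rank_by_attribute(item_embeddings, cav):
    """Return item ids ordered from most to least aligned with the CAV."""
    return sorted(
        item_embeddings,
        key=lambda item_id: dot(item_embeddings[item_id], cav),
        reverse=True,
    )


# Hypothetical 3-dimensional item embeddings (assumption, not real data).
item_embeddings = {
    "slapstick_comedy": [0.9, 0.1, 0.0],
    "dark_satire": [0.1, 0.9, 0.1],
    "war_drama": [-0.7, -0.5, 0.3],
}

# Hypothetical CAVs for the same tag, "funny", learned from the labels
# of two different users.
funny_for_alice = [1.0, 0.0, 0.0]
funny_for_bob = [0.0, 1.0, 0.0]

alice_ranking = rank_by_attribute(item_embeddings, funny_for_alice)
bob_ranking = rank_by_attribute(item_embeddings, funny_for_bob)
agreement = cosine(funny_for_alice, funny_for_bob)

print(alice_ranking)
print(bob_ranking)
print(agreement)
```

A cosine similarity near 1 would indicate that both users mean roughly the same thing by the tag; a low value, as in this constructed example, signals a personalized interpretation.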

Evaluation of System Effectiveness

Experimental results support the proposed approach. As a sanity check, the researchers tested an artificially constructed tag with no real semantic meaning (e.g., “odd year”), which yielded accuracy only slightly above random selection. This finding supports the hypothesis that Concept Activation Vectors pick up attributes related to genuine user preferences rather than arbitrary characteristics.
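The logic of that sanity check can be reproduced on synthetic data. The sketch below is an illustration under stated assumptions, not the paper's experiment: the embeddings are random, the "meaningful" tag is constructed to depend on the latent space, the "nonsense" tag is a coin flip, and a simple centroid-difference classifier stands in for whatever linear model is used to learn the CAV.

```python
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def centroid(rows):
    return [sum(col) / len(rows) for col in zip(*rows)]


def fit_direction(X, y):
    """Linear classifier from class centroids: the direction is the
    difference of the centroids, the threshold sits at their midpoint."""
    pos = centroid([x for x, t in zip(X, y) if t == 1])
    neg = centroid([x for x, t in zip(X, y) if t == 0])
    w = [p - q for p, q in zip(pos, neg)]
    midpoint = [(p + q) / 2 for p, q in zip(pos, neg)]
    return w, -dot(w, midpoint)


def accuracy(w, b, X, y):
    hits = sum((dot(w, x) + b > 0) == (t == 1) for x, t in zip(X, y))
    return hits / len(X)


rng = random.Random(42)
# Synthetic stand-in for item embeddings (assumption).
X = [[rng.gauss(0, 1) for _ in range(8)] for _ in range(1000)]
# A tag that is encoded in the latent space ...
meaningful = [1 if x[0] + x[1] > 0 else 0 for x in X]
# ... and a tag unrelated to it, analogous to "odd year".
nonsense = [rng.randint(0, 1) for _ in X]

train, held_out = slice(0, 700), slice(700, None)

w, b = fit_direction(X[train], meaningful[train])
acc_meaningful = accuracy(w, b, X[held_out], meaningful[held_out])

w, b = fit_direction(X[train], nonsense[train])
acc_nonsense = accuracy(w, b, X[held_out], nonsense[held_out])

print(acc_meaningful, acc_nonsense)
```

Held-out accuracy for the nonsense tag should hover around chance, while the meaningful tag is recovered well, which is the pattern the article describes.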

In addition, the use of CAVs in recommender systems produced positive outcomes in critiquing-based interaction scenarios, where users refine or adjust recommendations through subjective descriptions. In such cases, recommendation quality improved significantly.
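One simple way to picture a critiquing step, offered as a hedged sketch rather than the paper's procedure: when a user asks for something "funnier", the user's embedding is nudged along the CAV for that attribute and the items are ranked again. The embeddings, the CAV, and the step size alpha below are hypothetical values chosen for illustration.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def recommend(user_vec, item_embeddings):
    """Rank items by dot-product score, highest first."""
    return sorted(
        item_embeddings,
        key=lambda item_id: dot(user_vec, item_embeddings[item_id]),
        reverse=True,
    )


def apply_critique(user_vec, cav, alpha=1.0):
    """Shift the user embedding along the attribute direction.
    alpha is a hypothetical step size controlling how strong the shift is."""
    return [u + alpha * c for u, c in zip(user_vec, cav)]


# Hypothetical 2-dimensional embeddings (assumption, not real data).
item_embeddings = {
    "A": [1.0, 0.0],
    "B": [0.8, 0.6],
    "C": [0.0, 1.0],
}
user = [1.0, 0.0]
funnier_cav = [0.0, 1.0]  # hypothetical CAV for the tag "funny"

top_before = recommend(user, item_embeddings)[0]
critiqued_user = apply_critique(user, funnier_cav, alpha=1.0)
top_after = recommend(critiqued_user, item_embeddings)[0]

print(top_before, top_after)
```

Before the critique the top item is the one closest to the user's historical taste; after it, the top item is one that balances that taste with the requested attribute.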

Practical Benefits Identified

Within the study, the authors outline four primary practical benefits:

Leveraging collaborative filtering representations to identify attributes most relevant to the recommendation task.
Distinguishing between objective and subjective usage of tags and attributes.
Identifying personalized, user-specific semantics for subjective attributes.
Linking attribute semantics to preference representations, enabling the use of soft attributes in example-based critiquing and other forms of preference elicitation.

The approach demonstrated the strongest performance in contexts where discovering and interpreting soft attributes is critical. The potential application of this methodology in domains dominated by hard attributes, such as product shopping, is identified as a promising direction for future research.

Testing on Real Data and Integration with Production Systems

To validate the methodology, the researchers used the public MovieLens 20M dataset, which contains 20 million user ratings, together with a collaborative filtering model trained with Weighted Alternating Least Squares (WALS), a matrix factorization algorithm that is also used in Google Cloud products.

The paper notes that some experiments were conducted using internal Google production code that is not publicly releasable. Nevertheless, the fact that the approach ran on production code suggests that it is compatible with existing recommender systems without requiring retraining or architectural changes.

Analytical Conclusions and Implications for the Google Ecosystem

Although the research was published in 2024, it has gone largely unnoticed by the broader search and digital marketing community. Nevertheless, the findings point to a potentially significant shift in the evolution of recommender systems.

The ability to leverage the semantics of soft attributes enables much more precise personalization based on users’ subjective preferences. Given that Google Discover is considered part of Google’s broader search ecosystem, it is possible that similar approaches could be integrated into Google’s recommender products in the future.

If implemented in practice, this would result in recommendations that are more sensitive to individual user semantics, including how users interpret concepts such as “interesting,” “useful,” or “entertaining.”

Authorship and Research Collaboration

The research was conducted with contributions from several leading industry organizations. The primary contribution came from Google Research (approximately 60%), with additional involvement from Amazon, Midjourney, and Meta AI, underscoring the cross-industry nature of the work and its significance for the recommender systems field.
