What Are Recommender Systems?
Recommender systems are so commonplace now that many use them without knowing it. Because we can't possibly look through all the products or content on a website, a recommendation system plays an important role in helping us have a better user experience, while also exposing us to more inventory we might not discover otherwise.
You have probably seen this while searching for a favourite hoodie: the site suggests similar items, sometimes even better ones than the one you were looking for. That is a sign of a good recommendation system.
An important component of any of these systems is the recommender function, which takes information about the user and predicts the rating that the user might assign to a product, for example. Predicting user ratings, even before the user has provided one, makes recommender systems a powerful tool.
As you can see from the figure.
How Do Recommender Systems Work?
1- Understanding Relationships
Relationships provide recommender systems with tremendous insight, as well as an understanding of customers. Three main types occur:
User-Product Relationship
The user-product relationship occurs when users have an affinity or preference for specific products. For example, a cricket player might prefer cricket-related items, so the e-commerce website builds a user-product relation of player-cricket.
Product-Product Relationship
Product-product relationships occur when items are similar, either in appearance or in description, for example books or music of the same genre.
User-User Relationship
User-user relationships occur when some customers have similar tastes for a particular product or service. Examples include mutual friends, similar backgrounds, similar age, etc.
2- Data System
User Behavior Data
User behavior data is useful information about how the user engages with the product. It can be collected from ratings, clicks, and purchase history.
User Demographic Data
User demographic information is related to the user’s personal information such as age, education, income and location.
Product Attribute Data
Product attribute data is information related to the product itself such as genre in the case of books, cast in the case of movies, and cuisine in the case of food.
How do we provide data for the Recommender System?
1-User Ratings
2-Item-Item Filtering- If the user is browsing or searching for a particular product, they can be shown similar products.
3-User-User Filtering- If users have similar tastes, the recommendation system treats them as similar and recommends to each what the others liked.
User-user filtering needs a certain number of active users before the relations between them can be sampled, so it is not the best approach to start with. It works better as the user base grows, as we see with Netflix or Spotify, but a new app will not get that accuracy.
Product similarity is the one we should start with, as it requires products but not users. If you have neither enough products nor enough users, neither approach will work.
Similarity Measurements
Similarity is measured using a distance metric. The nearest points are the most similar and the farthest points are the least similar. Similarity is subjective and depends heavily on the domain and application.
There are several methods we can use:
1-Minkowski Distance: The general form of distance between numeric data points. For order p, it is the p-th root of the sum of the absolute coordinate differences raised to the power p. With p = 1 it becomes the Manhattan distance, and with p = 2 the Euclidean distance.
2-Manhattan Distance: The distance between two points measured along axes at right angles, i.e. the sum of the absolute differences of their coordinates.
3-Euclidean Distance: The square root of the sum of the squared differences between the coordinates, as given by the Pythagorean theorem.
4-Cosine Similarity: The cosine of the angle between two vectors. The cosine of 0 degrees is 1, which means the data points are similar, and the cosine of 90 degrees is 0, which means the data points are dissimilar.
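To make the four measures above concrete, here is a minimal sketch in plain Python. The function names and the sample vectors are my own choices for illustration, not part of any library.

```python
import math

def minkowski(a, b, p):
    """Minkowski distance of order p between two equal-length numeric vectors."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1 / p)

def manhattan(a, b):
    """Minkowski distance with p = 1: sum of absolute differences."""
    return minkowski(a, b, 1)

def euclidean(a, b):
    """Minkowski distance with p = 2: straight-line distance."""
    return minkowski(a, b, 2)

def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Illustrative vectors (made up for this example)
u = [1.0, 2.0, 3.0]
v = [4.0, 6.0, 3.0]

print("Manhattan:", manhattan(u, v))
print("Euclidean:", euclidean(u, v))
print("Cosine similarity:", cosine_similarity(u, v))
```

Note that the first three are distances (smaller means more similar), while cosine similarity is a similarity (larger means more similar), so they cannot be compared with each other directly.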
There are other approaches too, such as the Pearson correlation coefficient. I won't go into the maths here, but it is worth reading up on once.
Approaches to Content-Based Recommender Systems
Approach 1: Using Rated Content to Recommend
In this approach, products have already been rated, and based on the user's preferences a rating is predicted for a similar product.
Let's go through an example where I want to build a movie suggestion system. I need some datasets to implement it:
- Review data
- Movie Attributes
- Rating by users
With all the datasets combined, we can use any of the similarity measures mentioned above. The method that gives the best accuracy becomes your model, since machine learning involves a lot of trial and error.
Inception is suggested to Nik because she liked Interstellar and the movies share similar attributes. Kamal is suggested to Bruce because he liked The Shining, which is in the horror genre.
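The movie example can be sketched roughly as follows. This is only an illustration: the attribute columns and the 0/1 values assigned to each title are assumptions I made up to mirror the example, and cosine similarity is just one of the measures that could be used here.

```python
import math

# Hypothetical attribute vectors: [sci-fi, horror, thriller].
# The values are illustrative assumptions, not real dataset entries.
MOVIES = {
    "Interstellar": [1, 0, 0],
    "Inception": [1, 0, 1],
    "The Shining": [0, 1, 1],
    "Kamal": [0, 1, 0],
}

def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def recommend(liked_title, movies, top_n=1):
    """Return the top_n titles whose attributes are closest to the liked title."""
    liked = movies[liked_title]
    scored = [
        (cosine_similarity(liked, attrs), title)
        for title, attrs in movies.items()
        if title != liked_title
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [title for _, title in scored[:top_n]]

print(recommend("Interstellar", MOVIES))  # what to suggest to Nik
print(recommend("The Shining", MOVIES))   # what to suggest to Bruce
```

In a real system the attribute vectors would come from the movie attribute dataset, and the scores would be combined with the review and rating data.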
Advantages: Works even when a product has no user reviews.
Disadvantages: Requires descriptive data for all the content to be recommended, and is difficult to implement on large product databases, as users have different opinions about each item.
Approach 2: Recommendation through Description of the Content
The description delves into the product details, such as title, summary, taglines, genre, and more, offering comprehensive information about the item. Since these details are in text format (strings), it's essential to convert them into numerical form to facilitate similarity calculations.
Term Frequency-Inverse Document Frequency (TF-IDF)
TF-IDF is used in information retrieval and natural language processing (NLP) for feature extraction. It has two components: Term Frequency, which measures how often a term occurs in a document, and Inverse Document Frequency, which down-weights terms that occur in many documents.
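Here is a minimal sketch of TF-IDF in plain Python, using the textbook definitions tf = count / document length and idf = log(N / document frequency). The movie summaries are made-up placeholders, and libraries often use smoothed variants of these formulas, so their numbers will differ.

```python
import math
from collections import Counter

def tf_idf(documents):
    """Return one {term: weight} dict per document, using tf * log(N / df)."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: in how many documents does each term appear?
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors

# Hypothetical one-line summaries, invented for this example
summaries = [
    "a team travels through space",
    "a thief enters dreams",
    "a family stays in a haunted hotel",
]

for weights in tf_idf(summaries):
    print(weights)
```

The word "a" appears in every summary, so its weight is zero, while a word such as "space" that appears in only one summary gets a positive weight. The resulting vectors can then be compared with any of the similarity measures discussed earlier.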
Collaborative Filtering Recommender Systems
Collaborative filtering recommenders make suggestions based on how users rated items in the past, not on the products themselves. What matters is the way users responded to the content.
Going back to our earlier movie example, we have two elements here too:
- User Rating
Here we can see that Kalpesh and Krishna have the same taste.
- Predicted User Rating
Inception is recommended to Kalpesh because Krishna, who has similar taste, liked it.
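A rough sketch of this idea is shown below, assuming a similarity-weighted average as the prediction rule. The rating values and the third user are invented for illustration, and cosine similarity over co-rated movies is only one possible choice (the Pearson coefficient is a common alternative).

```python
import math

# Hypothetical ratings on a 1-5 scale; "Other user" is an invented third user.
RATINGS = {
    "Kalpesh": {"Interstellar": 5, "The Shining": 2},
    "Krishna": {"Interstellar": 5, "The Shining": 1, "Inception": 5},
    "Other user": {"Interstellar": 1, "The Shining": 5, "Inception": 2},
}

def user_similarity(a, b):
    """Cosine similarity computed over the items both users have rated."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[i] * b[i] for i in common)
    norm_a = math.sqrt(sum(a[i] ** 2 for i in common))
    norm_b = math.sqrt(sum(b[i] ** 2 for i in common))
    return dot / (norm_a * norm_b)

def predict_rating(user, item, ratings):
    """Similarity-weighted average of other users' ratings for the item."""
    weighted_sum = 0.0
    similarity_sum = 0.0
    for other, other_ratings in ratings.items():
        if other == user or item not in other_ratings:
            continue
        sim = user_similarity(ratings[user], other_ratings)
        weighted_sum += sim * other_ratings[item]
        similarity_sum += sim
    if similarity_sum == 0:
        return None  # no similar user has rated this item
    return weighted_sum / similarity_sum

print(user_similarity(RATINGS["Kalpesh"], RATINGS["Krishna"]))
print(predict_rating("Kalpesh", "Inception", RATINGS))
```

Because Krishna's ratings are closer to Kalpesh's, Krishna's high rating for Inception carries more weight in the prediction than the other user's low one.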
Advantages:
No requirement for product descriptions.
Disadvantages:
- Suffers from the cold start problem.
- Difficult to make recommendations for new users, and inclined to favour popular products with lots of reviews.
- Faces the "grey sheep problem" (i.e., useful predictions cannot be made for users whose tastes do not consistently match any group) and suffers when rating data is sparse.
- Difficult to recommend new releases since they have few reviews.
Most collaborative recommender systems perform poorly as the number of dimensions in the data increases. It is a good idea to reduce the number of features while retaining as much information as possible. This is called dimensionality reduction.
There are many dimensionality reduction algorithms, such as principal component analysis (PCA), linear discriminant analysis (LDA), and singular value decomposition (SVD), which is the most commonly used.
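As a small illustration of dimensionality reduction with SVD, the sketch below keeps only the top k singular values of a user-item rating matrix. It assumes NumPy is installed; the rating matrix is made up, and the function name `truncated_svd` is my own.

```python
import numpy as np

def truncated_svd(ratings, k):
    """Keep the k largest singular values of a user-item matrix.

    Returns the rank-k approximation, the k-dimensional user factors,
    and the full list of singular values.
    """
    U, s, Vt = np.linalg.svd(ratings, full_matrices=False)
    user_factors = U[:, :k] * s[:k]        # each user described by k numbers
    approx = user_factors @ Vt[:k, :]      # rank-k reconstruction
    return approx, user_factors, s

# Hypothetical ratings: rows are users, columns are movies
R = np.array([
    [5, 4, 1, 1],
    [4, 5, 1, 2],
    [1, 1, 5, 4],
    [2, 1, 4, 5],
], dtype=float)

R_k, user_factors, singular_values = truncated_svd(R, k=2)
print(np.round(R_k, 2))
print(np.round(user_factors, 2))
```

Each user is now described by 2 numbers instead of 4, and similarity between users can be computed on these shorter vectors. Real rating matrices have many missing entries, which this plain SVD does not handle, so treat this as a sketch of the idea only.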
Hybrid Recommender Systems
A hybrid recommender combines the benefits of collaborative filtering and content-based filtering, and it is one of the most widely used types of recommender system.
Netflix deploys hybrid recommenders on a large scale. When a new user subscribes to the service, they are required to rate content they have already seen or to rate particular genres. Based on that, Netflix provides better suggestions.
Association Rules Learning
Association rule learning links one product with another and tries to answer which products are associated with one another. It is mostly used in e-commerce, as users tend to buy a product paired with the main product.
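Association rules are usually scored with support, confidence, and lift. The sketch below computes these three measures on a handful of made-up shopping baskets; the basket contents and function names are assumptions for illustration.

```python
# Hypothetical shopping baskets, invented for this example
BASKETS = [
    {"phone", "case"},
    {"phone", "case", "charger"},
    {"phone", "charger"},
    {"laptop", "mouse"},
]

def support(itemset, baskets):
    """Fraction of baskets that contain every item in the itemset."""
    return sum(1 for basket in baskets if itemset <= basket) / len(baskets)

def confidence(antecedent, consequent, baskets):
    """Of the baskets containing the antecedent, the fraction that also contain the consequent."""
    return support(antecedent | consequent, baskets) / support(antecedent, baskets)

def lift(antecedent, consequent, baskets):
    """How much more often the pair occurs together than if the items were independent."""
    return confidence(antecedent, consequent, baskets) / support(consequent, baskets)

rule_from, rule_to = {"phone"}, {"case"}
print("support:", support(rule_from | rule_to, BASKETS))
print("confidence:", confidence(rule_from, rule_to, BASKETS))
print("lift:", lift(rule_from, rule_to, BASKETS))
```

A lift above 1 suggests that buying the main product makes the paired product more likely than usual, which is the signal an e-commerce site would use to show the pair together.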
Some Real-world Examples of Recommendation Systems
1-Netflix- It suggests content based on your watch history and the content you have liked.
2-Spotify- It uses hybrid filtering. Based on your playlists, it suggests songs and creates playlists to match your taste. Suggestions for new users are weaker at first, but they improve considerably as listening history builds up.
Some references to give you a better idea about the terms we discussed here.
I will discuss the coding section in the next part.