The factorized top-k metric is a machine learning evaluation metric that is often used to measure the performance of recommendation systems. It measures how well a recommendation system predicts which items a user is likely to interact with, based on their previous interactions with the system.

With the factorized top-k metric, the recommendation system is evaluated on the top-k items that it recommends for each user. The underlying model factorizes the user-item matrix into two matrices: one holding the user factors and the other holding the item factors. The dot product of a user's factor vector and an item's factor vector gives the predicted relevance of that item for that user.

To calculate the factorized top-k metric, the system computes, for each user, the dot product of that user's factor vector with the factor vector of every candidate item. The k items with the highest dot products form the user's top-k recommendations, and these are then checked to see how many of them correspond to items that the user has actually interacted with.
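The steps above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not a reference implementation: the function name `factorized_top_k`, the tiny factor matrices `U` and `V`, and the `relevant` sets are all hypothetical and chosen only to keep the arithmetic easy to follow.

```python
import numpy as np

def factorized_top_k(user_factors, item_factors, relevant, k):
    """Mean fraction of each user's top-k items that are relevant.

    user_factors: (n_users, d) array
    item_factors: (n_items, d) array
    relevant:     list of sets of item indices, one set per user
    """
    # Dot product of every user vector with every item vector.
    scores = user_factors @ item_factors.T  # shape (n_users, n_items)
    # Column indices of the k highest-scoring items per user, best first.
    top_k = np.argsort(-scores, axis=1)[:, :k]
    # Fraction of each user's top-k items found in that user's relevant set.
    hits = [len(set(row.tolist()) & rel) / k for row, rel in zip(top_k, relevant)]
    return float(np.mean(hits))

# Hypothetical toy data: 2 users, 3 items, 2 latent dimensions.
U = np.array([[1.0, 0.0], [0.0, 1.0]])
V = np.array([[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]])
print(factorized_top_k(U, V, relevant=[{0}, {1, 2}], k=2))  # 0.75
```

In this toy case the first user's top-2 items are items 0 and 2 (one hit) and the second user's are items 1 and 2 (two hits), so the mean is (0.5 + 1.0) / 2 = 0.75.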

The factorized top-k metric is closely related to simpler ranking metrics such as precision at k: as computed here, each user's score is the fraction of the top-k recommendations that the user actually interacted with. It reflects the relevance of the recommended items; it does not by itself capture the diversity of the recommendations, which would need a separate metric.

**Example:**

Suppose we have a recommendation system that suggests movies to users based on their previous movie ratings. Let’s say we have 5 users and 5 movies, and we have already collected the ratings for each user for some of the movies. We want to use this data to evaluate how well the recommendation system is able to predict which movies a user will like.

Here’s the user-item rating matrix:

| User/Movie | Movie 1 | Movie 2 | Movie 3 | Movie 4 | Movie 5 |
|---|---|---|---|---|---|
| User 1 | 5 | 4 | – | – | 2 |
| User 2 | – | 1 | 2 | – | 3 |
| User 3 | 3 | – | – | 4 | – |
| User 4 | – | 5 | 1 | 2 | – |
| User 5 | 2 | 3 | – | 1 | 4 |

In this matrix, “–” indicates that the user has not rated that movie yet.

Now, let’s say the recommendation system uses a factorized model to predict the ratings that each user would give to each movie. After training the model, it generates the following predicted user-item matrix:

| User/Movie | Movie 1 | Movie 2 | Movie 3 | Movie 4 | Movie 5 |
|---|---|---|---|---|---|
| User 1 | 4.5 | 3.8 | 2.1 | 1.9 | 2.7 |
| User 2 | 3.1 | 1.2 | 2.9 | 3.3 | 3.8 |
| User 3 | 3.3 | 3.6 | 1.9 | 4.1 | 2.4 |
| User 4 | 4.2 | 4.9 | 1.2 | 2.2 | 3.1 |
| User 5 | 1.9 | 3.2 | 3.1 | 0.8 | 3.8 |

We’ll use the factorized top-k metric to evaluate how well these predictions identify the movies each user is likely to enjoy. Let’s say we choose k=2, so we’ll evaluate the top-2 recommended movies for each user.
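Picking the top-2 movies per user from the predicted matrix can be sketched with NumPy as follows. The variable names (`predicted`, `top_k`) are illustrative assumptions; the values are copied from the predicted-rating table above.

```python
import numpy as np

# Predicted ratings (rows: Users 1-5, columns: Movies 1-5).
predicted = np.array([
    [4.5, 3.8, 2.1, 1.9, 2.7],
    [3.1, 1.2, 2.9, 3.3, 3.8],
    [3.3, 3.6, 1.9, 4.1, 2.4],
    [4.2, 4.9, 1.2, 2.2, 3.1],
    [1.9, 3.2, 3.1, 0.8, 3.8],
])

k = 2
# Sort each row by descending predicted rating and keep the first k columns.
top_k = np.argsort(-predicted, axis=1)[:, :k]

for user, movies in enumerate(top_k, start=1):
    names = [f"Movie {m + 1}" for m in movies]
    print(f"User {user}: {names}")
```

Negating the matrix before `np.argsort` is a common way to get a descending order, since `argsort` itself sorts ascending.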

For User 1, the top-2 recommended movies are Movie 1 and Movie 2, with predicted ratings of 4.5 and 3.8, respectively. Let’s say that User 1 actually watched and enjoyed Movie 1 and Movie 2. In this case, the system correctly recommended both of the movies that the user enjoyed, so the factorized top-k metric for User 1 would be 1.

For User 2, the top-2 recommended movies are Movie 5 and Movie 4, with predicted ratings of 3.8 and 3.3, respectively. Let’s say that User 2 actually watched and enjoyed Movie 5, but did not watch or enjoy Movie 4. In this case, only one of the two recommended movies is one that the user enjoyed, so the factorized top-k metric for User 2 would be 0.5.

For User 3, the top-2 recommended movies are Movie 4 and Movie 2, with predicted ratings of 4.1 and 3.6, respectively. Let’s say that User 3 actually watched and enjoyed Movie 4, but did not watch or enjoy Movie 2. In this case, only one of the two recommended movies is one that the user enjoyed, so the factorized top-k metric for User 3 would be 0.5.

For User 4, the top-2 recommended movies are Movie 2 and Movie 1, with predicted ratings of 4.9 and 4.2, respectively. Let’s say that User 4 actually watched and enjoyed Movie 2, but did not watch or enjoy Movie 1. In this case, only one of the two recommended movies is one that the user enjoyed, so the factorized top-k metric for User 4 would be 0.5.

For User 5, the top-2 recommended movies are Movie 5 and Movie 2, with predicted ratings of 3.8 and 3.2, respectively. Let’s say that User 5 actually enjoyed Movie 5, but did not enjoy Movie 2. In this case, only one of the two recommended movies is one that the user enjoyed, so the factorized top-k metric for User 5 would be 0.5.

To calculate the overall factorized top-k metric for the recommendation system, we can average the individual metrics across all users. In this case, the average metric would be:

(1 + 0.5 + 0.5 + 0.5 + 0.5) / 5 = 0.6

This means that, on average, 60% of each user’s top-2 recommendations were movies the user actually enjoyed, which works out to about 1.2 of the 2 recommended movies per user.
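The whole calculation can be reproduced with a short NumPy sketch. The `enjoyed` sets encode the “let’s say” assumptions from the walkthrough (for example, that User 1 enjoyed Movies 1 and 2); they and the other variable names are hypothetical and serve only to illustrate the arithmetic.

```python
import numpy as np

# Predicted ratings (rows: Users 1-5, columns: Movies 1-5).
predicted = np.array([
    [4.5, 3.8, 2.1, 1.9, 2.7],
    [3.1, 1.2, 2.9, 3.3, 3.8],
    [3.3, 3.6, 1.9, 4.1, 2.4],
    [4.2, 4.9, 1.2, 2.2, 3.1],
    [1.9, 3.2, 3.1, 0.8, 3.8],
])

# Assumed ground truth from the walkthrough, as 0-based movie indices:
# User 1 enjoyed Movies 1 and 2, User 2 Movie 5, User 3 Movie 4,
# User 4 Movie 2, User 5 Movie 5.
enjoyed = [{0, 1}, {4}, {3}, {1}, {4}]

k = 2
top_k = np.argsort(-predicted, axis=1)[:, :k]

# Per-user score: share of the top-k recommendations the user enjoyed.
per_user = [len(set(row.tolist()) & liked) / k for row, liked in zip(top_k, enjoyed)]
overall = sum(per_user) / len(per_user)

print(per_user)  # [1.0, 0.5, 0.5, 0.5, 0.5]
print(overall)   # 0.6
```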