DefinePK

DefinePK hosts the largest index of Pakistani journals, research articles, news headlines, and videos. It also offers chapter-level book search.

Contrasting Impact of Start State on Performance of a Reinforcement Learning Recommender System


Article Information

Title: Contrasting Impact of Start State on Performance of a Reinforcement Learning Recommender System

Authors: Sidra Hassan, Mubbashir Ayub, Muhammad Waqar, Tasawer Khan

Journal: International Journal of Innovations in Science & Technology

HEC Recognition History
Category    From          To
Y           2024-10-01    2025-12-31
Y           2023-07-01    2024-09-30
Y           2021-07-01    2022-06-30

Publisher: 50SEA JOURNALS (SMC-PRIVATE) LIMITED

Country: Pakistan

Year: 2024

Volume: 6

Issue: 2

Language: English

Keywords: Collaborative Filtering, recommender systems, similarity measures, Reinforcement Learning, Start State, Q-Learning


Abstract

A recommendation problem and an RL problem are very similar, as both try to increase user satisfaction in a given environment. Typical recommender systems rely mainly on a user's history to generate future recommendations and do not adapt well to changing user demands. RL can evolve with changing user demands by treating a reward function as feedback. In this paper, the recommendation problem is modeled as an RL problem using a squared grid environment, with each grid cell representing a unique state generated by the biclustering algorithm Bibit. These biclusters are sorted by their overlap and then mapped to a squared grid. An RL agent then moves on this grid to obtain recommendations. However, the agent must choose the most pertinent start state, i.e., the one that yields the best recommendations. Choosing this start state requires contrasting the impact of different start states on the performance of RL agent-based RSs. For this purpose, we applied seven different similarity measures to determine the start state of the RL agent. These measures are diverse: some do not use rating values, some use only rating values, and some use global parameters such as the average rating value or the standard deviation of rating values. Evaluation is performed on the ML-100K and FilmTrust datasets under different environment settings. Results show that careful selection of the start state can greatly improve the performance of RL-based recommender systems.


Research Objective

To investigate the contrasting impact of different start states on the performance of Reinforcement Learning (RL) agent-based recommender systems (RSs) and identify the most effective similarity measure for determining the start state.


Methodology

The research models the recommendation problem as an RL problem using a squared grid environment. Biclusters are generated with the Bibit algorithm, sorted by item overlap, and mapped onto a squared grid. An RL agent navigates this grid to produce recommendations. Seven similarity measures (ITR, Cosine, Jaccard, Euclidean, Manhattan, PCC, and TMJ) are applied to determine the agent's start state. Evaluation is performed on the ML-100K and FilmTrust datasets under different environment settings (6x6 and 7x7 grid sizes). Performance is assessed using user coverage, item coverage, precision, recall, F-measure, and return.
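A minimal sketch of the start-state selection step, assuming each grid cell is summarized by a centroid rating vector (the `co_rated` helper, the centroid representation, and the `pick_start_state` function are illustrative; the paper's exact ITR and TMJ formulas are not reproduced here):

```python
import numpy as np

def co_rated(u, v):
    """Indices where both users supplied a rating (0 = unrated)."""
    mask = (u > 0) & (v > 0)
    return u[mask], v[mask]

def cosine_sim(u, v):
    a, b = co_rated(u, v)
    if a.size == 0:
        return None  # no co-rated items: measure undefined
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def jaccard_sim(u, v):
    # Rating values are ignored; only which items were rated matters.
    ru, rv = set(np.nonzero(u)[0]), set(np.nonzero(v)[0])
    union = ru | rv
    return len(ru & rv) / len(union) if union else None

def euclidean_sim(u, v):
    a, b = co_rated(u, v)
    if a.size == 0:
        return None
    return 1.0 / (1.0 + np.linalg.norm(a - b))  # distance -> similarity

def pearson_sim(u, v):
    a, b = co_rated(u, v)
    if a.size < 2 or a.std() == 0 or b.std() == 0:
        return None  # PCC undefined with <2 co-rated items or zero variance
    return float(np.corrcoef(a, b)[0, 1])

def pick_start_state(active, centroids, sim):
    """Start state = grid cell whose bicluster centroid is most similar."""
    scores = [(i, sim(active, c)) for i, c in enumerate(centroids)]
    scores = [(i, s) for i, s in scores if s is not None]
    return max(scores, key=lambda t: t[1])[0] if scores else None
```

Note how measures based on co-rated items return `None` when no co-rated items exist; this is the failure mode on sparse data that the Discussion section attributes to such measures.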

Methodology Flowchart
graph TD
    A[Generate Biclusters using Bibit] --> B[Choose Grid Size: 6x6 or 7x7];
    B --> C[Sort Biclusters by Item Overlapping];
    C --> D[Map Sorted Biclusters to Squared Grid];
    D --> E[Apply Similarity Measure to Determine Start State];
    E --> F[RL Agent Navigates Grid];
    F --> G[Collect Recommendations];
    G --> H[Evaluate Performance Metrics];
    %% Loop for trying each of the 7 similarity measures
    E --> E;
    H --> I[Compare Performance of Measures];
    I --> J[Draw Conclusions];
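The "RL Agent Navigates Grid" step can be sketched as tabular Q-learning over the grid states. Here the reward is a stand-in for the paper's recommendation feedback, and the hyperparameters are illustrative, not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
GRID = 6                                 # 6x6 environment, as in the paper
N_STATES, N_ACTIONS = GRID * GRID, 4     # actions: up, down, left, right
ALPHA, GAMMA, EPS = 0.1, 0.9, 0.1        # illustrative hyperparameters

def step(state, action):
    """Move on the grid; the reward is a placeholder for recommendation feedback."""
    r, c = divmod(state, GRID)
    dr, dc = [(-1, 0), (1, 0), (0, -1), (0, 1)][action]
    r = min(max(r + dr, 0), GRID - 1)
    c = min(max(c + dc, 0), GRID - 1)
    nxt = r * GRID + c
    reward = 1.0 if nxt == N_STATES - 1 else 0.0   # placeholder reward cell
    return nxt, reward

def q_learn(start_state, episodes=500, horizon=50):
    """Epsilon-greedy tabular Q-learning starting every episode at start_state."""
    Q = np.zeros((N_STATES, N_ACTIONS))
    for _ in range(episodes):
        s = start_state
        for _ in range(horizon):
            a = rng.integers(N_ACTIONS) if rng.random() < EPS else int(Q[s].argmax())
            s2, rwd = step(s, a)
            # Standard tabular Q-learning update
            Q[s, a] += ALPHA * (rwd + GAMMA * Q[s2].max() - Q[s, a])
            s = s2
    return Q
```

Because every episode restarts at `start_state`, the learned Q-values (and hence the recommendations collected along the agent's path) depend directly on that choice, which is the effect the paper measures.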

Discussion

The study highlights that the choice of similarity measure for determining the start state significantly impacts the performance of RL-based recommender systems. Measures that consider a broader set of items (like ITR) tend to be more robust, especially with sparse data, by ensuring a start state can always be determined. Measures relying on co-rated items can struggle when such items are scarce. The trade-off between grid size and performance metrics like return suggests that while larger grids might improve accuracy, they can also lead to a decrease in the overall reward obtained by the RL agent.


Key Findings

The ITR similarity measure demonstrated the best performance in terms of user and item coverage, particularly on the sparser FilmTrust dataset, and had the fewest failures in determining a start state. Euclidean and Manhattan performed similarly and, along with Cosine and Jaccard, performed reasonably well. PCC and TMJ consistently performed poorly on both datasets. On both datasets, increasing the grid size from 6x6 to 7x7 generally improved precision, recall, and F-measure for most measures but decreased the return value.
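The accuracy metrics reported above follow the standard top-N recommendation definitions; a brief sketch (the function name and list-based interface are illustrative):

```python
def precision_recall_f1(recommended, relevant):
    """Top-N evaluation: hits are recommended items the user actually liked."""
    hits = len(set(recommended) & set(relevant))
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall > 0 else 0.0)
    return precision, recall, f1
```

For example, recommending `[1, 2, 3, 4]` to a user whose relevant items are `[2, 4, 5]` gives a precision of 0.5 and a recall of 2/3.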


Conclusion

The selection of an appropriate similarity measure for determining the start state is crucial for optimizing the performance of RL-based recommender systems. The ITR measure is effective, especially for sparse datasets, while the Cosine, Euclidean, Manhattan, and Jaccard measures offer viable alternatives. PCC and TMJ are less suitable. The study also shows that grid size influences performance, with a trade-off between accuracy and reward.


Fact Check

1. Dataset Sparsity: The ML-100K dataset has a sparsity of 93.70%, and the FilmTrust dataset has a sparsity of 98.86%. (Confirmed in "Dataset Description and Methodology" section).
2. Grid Sizes: The study evaluates 6x6 and 7x7 squared grid environments. (Confirmed in "Introduction" and "Dataset Description and Methodology" sections).
3. Number of Similarity Measures: Seven different similarity measures were applied: ITR, Cosine, Jaccard, Euclidean, Manhattan, PCC, and TMJ. (Confirmed in "Methodology" and "Conclusion" sections).
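The sparsity figure in point 1 follows directly from a dataset's dimensions; for ML-100K (100,000 ratings from 943 users on 1,682 items):

```python
def sparsity(n_ratings, n_users, n_items):
    """Fraction of the user-item rating matrix that is empty."""
    return 1.0 - n_ratings / (n_users * n_items)

# ML-100K: 100,000 ratings, 943 users, 1,682 items -> 93.70% sparse
ml100k = 100 * sparsity(100_000, 943, 1_682)
```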

