MyMediaLite is mainly a library, meant to be used by other applications.
We provide two command-line tools that offer much of MyMediaLite's functionality.
They allow you to find out how MyMediaLite deals with a dataset without having
to integrate the library into your application or having to develop your own program.
Rating Prediction
The programs expect the data to be in a simple text format:
user_id item_id rating

where user_id and item_id are integers referring to users and items, respectively, and rating is a floating-point number expressing how much a user likes an item. The separator between the values can be a space, tab, or comma. If there are more than three columns, all additional columns will be ignored. A small example dataset:

1 1 5
1 2 3
1 3 4
1 4 3
1 5 3
1 7 4
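To make the format concrete, here is a minimal Python sketch of a parser for it. The function name parse_ratings is made up for illustration; it is not part of MyMediaLite, but it follows the rules just described (space, tab, or comma separators; extra columns ignored):

```python
import re

def parse_ratings(lines):
    """Parse rating data in the simple text format described above:
    user_id, item_id, rating, separated by spaces, tabs, or commas.
    Columns beyond the third are ignored."""
    ratings = []
    for line in lines:
        fields = re.split(r"[\s,]+", line.strip())
        if len(fields) < 3:
            continue  # skip empty or malformed lines
        user, item, rating = int(fields[0]), int(fields[1]), float(fields[2])
        ratings.append((user, item, rating))
    return ratings

example = ["1 1 5", "1,2,3", "1\t3\t4 extra_column"]
print(parse_ratings(example))  # [(1, 1, 5.0), (1, 2, 3.0), (1, 3, 4.0)]
```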
The general usage of the rating prediction program is as follows:
rating_prediction --training-file=TRAINING_FILE --test-file=TEST_FILE --recommender=METHOD [OPTIONS]

METHOD is the recommender to use, which will be trained using the contents of TRAINING_FILE. The recommender will then predict the data in TEST_FILE, and the program will display the RMSE (root mean square error) and MAE (mean absolute error) of the predictions.
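The two evaluation measures are straightforward to compute yourself. A small Python sketch (not MyMediaLite code) of both definitions:

```python
import math

def rmse(actual, predicted):
    # root mean square error: squares each error, so large errors weigh more
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mae(actual, predicted):
    # mean absolute error: average magnitude of the prediction error
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

actual    = [5.0, 3.0, 4.0]
predicted = [4.0, 3.0, 5.0]
print(rmse(actual, predicted))  # sqrt(2/3) ≈ 0.8165
print(mae(actual, predicted))   # 2/3 ≈ 0.6667
```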
You can download the MovieLens 100k ratings dataset from the GroupLens Research website and unzip it to have something to play with. If you have downloaded the MyMediaLite source code, you can do this automatically by entering
make download-movielens
To try out a simple baseline method on the data, you just need to enter
rating_prediction --training-file=u1.base --test-file=u1.test --recommender=UserAverage

which should give a result like
UserAverage training_time 00:00:00.000098 RMSE 1.063 MAE 0.85019 testing_time 00:00:00.032326
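The UserAverage baseline simply predicts each user's mean training rating. A minimal Python sketch of the idea (the fallback to the global mean for unseen users is an assumption of this sketch, not necessarily MyMediaLite's exact behavior):

```python
class UserAverage:
    """Predict each user's mean training rating; fall back to the
    global mean for users not seen in training."""

    def fit(self, ratings):  # ratings: list of (user, item, rating)
        sums, counts, total = {}, {}, 0.0
        for user, _, r in ratings:
            sums[user] = sums.get(user, 0.0) + r
            counts[user] = counts.get(user, 0) + 1
            total += r
        self.global_mean = total / len(ratings)
        self.user_mean = {u: sums[u] / counts[u] for u in sums}
        return self

    def predict(self, user, item):
        # the item argument is ignored: this baseline only looks at the user
        return self.user_mean.get(user, self.global_mean)

model = UserAverage().fit([(1, 1, 5), (1, 2, 3), (2, 1, 4)])
print(model.predict(1, 99))  # user 1's mean rating: 4.0
print(model.predict(3, 1))   # unseen user -> global mean: 4.0
```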
To use a more advanced recommender, enter
rating_prediction --training-file=u1.base --test-file=u1.test --recommender=BiasedMatrixFactorization

which yields a better result than the user average:
BiasedMatrixFactorization num_factors=10 regularization=0.015 learn_rate=0.01 num_iter=30 init_mean=0 init_stdev=0.1 training_time 00:00:03.3575780 RMSE 0.96108 MAE 0.75124 testing_time 00:00:00.0159740

The key-value pairs after the method name represent arguments to the recommender that may be modified to get even better results. For instance, we could use more latent factors per user and item, which leads to a more complex (and hopefully more accurate) model:
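To illustrate what these hyperparameters control, here is a minimal SGD sketch of matrix factorization with user and item biases. It reuses the argument names shown above, but it is a simplified illustration, not MyMediaLite's implementation:

```python
import random

def train_biased_mf(ratings, num_factors=10, learn_rate=0.01,
                    regularization=0.015, num_iter=30, init_stdev=0.1, seed=42):
    """Stochastic gradient descent on rating = mu + b_u + b_i + p_u . q_i."""
    rng = random.Random(seed)
    users = {u for u, _, _ in ratings}
    items = {i for _, i, _ in ratings}
    mu = sum(r for _, _, r in ratings) / len(ratings)   # global mean
    bu = {u: 0.0 for u in users}                        # user biases
    bi = {i: 0.0 for i in items}                        # item biases
    P = {u: [rng.gauss(0, init_stdev) for _ in range(num_factors)] for u in users}
    Q = {i: [rng.gauss(0, init_stdev) for _ in range(num_factors)] for i in items}
    for _ in range(num_iter):
        for u, i, r in ratings:
            err = r - (mu + bu[u] + bi[i] + sum(p * q for p, q in zip(P[u], Q[i])))
            bu[u] += learn_rate * (err - regularization * bu[u])
            bi[i] += learn_rate * (err - regularization * bi[i])
            for f in range(num_factors):
                pu, qi = P[u][f], Q[i][f]
                P[u][f] += learn_rate * (err * qi - regularization * pu)
                Q[i][f] += learn_rate * (err * pu - regularization * qi)
    def predict(u, i):
        return mu + bu.get(u, 0.0) + bi.get(i, 0.0) + \
            sum(p * q for p, q in zip(P.get(u, []), Q.get(i, [])))
    return predict

predict = train_biased_mf([(1, 1, 5), (1, 2, 3), (2, 1, 4), (2, 2, 2)])
print(round(predict(1, 1), 2), round(predict(2, 2), 2))
```

More factors (num_factors) mean a more expressive model; regularization and the number of iterations then control how strongly it is kept from overfitting.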
rating_prediction --training-file=u1.base --test-file=u1.test --recommender=BiasedMatrixFactorization --recommender-options="num_factors=20"
...
RMSE 0.98029 MAE 0.76558

Wait a second. The RMSE actually got worse!
This may be because we did not train our model with the optimal arguments. One thing to look at is the number of iterations. If we iterate for too long, the learning process overfits the training data, which means the resulting model does not generalize well to predict unknown future data. The options --find-iter=A, --num-iter=B, and --max-iter=C help us to find the right number of iterations. Together, the three options mean: "From iteration B on, output the evaluation results on the test set every A iterations, until iteration C is reached."
rating_prediction --training-file=u1.base --test-file=u1.test --recommender=BiasedMatrixFactorization --recommender-options="num_factors=20 num_iter=0" --find-iter=1 --max-iter=25 --num-iter=0
...
RMSE 1.17083 MAE 0.96918 iteration 0
RMSE 1.01383 MAE 0.8143 iteration 1
RMSE 0.98742 MAE 0.78742 iteration 2
RMSE 0.97672 MAE 0.77668 iteration 3
RMSE 0.9709 MAE 0.77078 iteration 4
RMSE 0.96723 MAE 0.76702 iteration 5
RMSE 0.96466 MAE 0.76442 iteration 6
RMSE 0.96269 MAE 0.76241 iteration 7
RMSE 0.96104 MAE 0.76069 iteration 8
RMSE 0.95958 MAE 0.75917 iteration 9
RMSE 0.95825 MAE 0.75783 iteration 10
RMSE 0.95711 MAE 0.75667 iteration 11
RMSE 0.95626 MAE 0.75569 iteration 12
RMSE 0.95578 MAE 0.75501 iteration 13
RMSE 0.95573 MAE 0.75467 iteration 14
RMSE 0.95611 MAE 0.75467 iteration 15
RMSE 0.9569 MAE 0.75499 iteration 16
RMSE 0.95802 MAE 0.75551 iteration 17
RMSE 0.95942 MAE 0.75623 iteration 18
RMSE 0.96102 MAE 0.7571 iteration 19
RMSE 0.96277 MAE 0.75806 iteration 20
RMSE 0.96463 MAE 0.75909 iteration 21
RMSE 0.96656 MAE 0.76017 iteration 22
RMSE 0.96852 MAE 0.7613 iteration 23
RMSE 0.9705 MAE 0.76246 iteration 24
RMSE 0.97247 MAE 0.76364 iteration 25

This means that we should probably set --num-iter to around 15 to get better results.
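Reading such a per-iteration trace by hand works, but the same idea can be automated as early stopping: evaluate after every iteration and stop once the test error has stopped improving. A Python sketch of this (eval_after_iter is a hypothetical hook that trains one more iteration and returns the current test RMSE; the RMSE curve below is simulated, shaped like the run above):

```python
def early_stopping(eval_after_iter, max_iter=25, patience=3):
    """Evaluate after every iteration (as with --find-iter=1) and stop
    once the RMSE has not improved for `patience` consecutive iterations."""
    best_rmse, best_iter, since_best = float("inf"), -1, 0
    for i in range(max_iter):
        rmse = eval_after_iter(i)
        if rmse < best_rmse:
            best_rmse, best_iter, since_best = rmse, i, 0
        else:
            since_best += 1
            if since_best >= patience:
                break  # overfitting has set in; keep the best iteration seen
    return best_iter, best_rmse

# Simulated RMSE curve: falls quickly, bottoms out, then rises again
curve = [1.17, 1.01, 0.987, 0.976, 0.970, 0.967, 0.9657, 0.9655, 0.966, 0.967, 0.969]
print(early_stopping(lambda i: curve[i], max_iter=len(curve)))  # (7, 0.9655)
```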
Warning: Depending on the recommender, the choice of arguments (hyperparameters) may be crucial to the recommender's performance. Sometimes you may find suitable values by playing a bit with the arguments, starting from their default values. However, there is no guarantee that this will work! You should use a principled approach to find good hyperparameters, e.g. cross-validation and grid search.
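Grid search, one such principled approach, simply tries every combination of candidate hyperparameter values and keeps the best one. A Python sketch (train_and_eval is a user-supplied, here hypothetical, function that trains a recommender with the given parameters and returns its test error; fake_rmse below is a toy stand-in):

```python
from itertools import product

def grid_search(train_and_eval, grid):
    """Try every combination of the candidate values in `grid` and
    return the combination with the lowest error."""
    names = sorted(grid)
    best_params, best_score = None, float("inf")
    for values in product(*(grid[n] for n in names)):
        params = dict(zip(names, values))
        score = train_and_eval(params)
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score

# Toy stand-in for a real train/evaluate run:
fake_rmse = lambda p: abs(p["num_factors"] - 20) * 0.01 + p["regularization"]
grid = {"num_factors": [10, 20, 40], "regularization": [0.015, 0.05, 0.1]}
print(grid_search(fake_rmse, grid))
```

In practice the score for each combination should come from cross-validation rather than a single train/test split, so that the chosen hyperparameters do not overfit one particular split.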
Item Prediction from Positive-Only Feedback
The item recommendation program behaves similarly to the rating prediction program, so we concentrate on the differences here.
item_recommendation --training-file=TRAINING_FILE --test-file=TEST_FILE --recommender=METHOD [OPTIONS]
The third column in the data files may be omitted. If it is present, it will be ignored.
Example:
1 1
1 2
1 3
1 4
1 5
1 7
Instead of RMSE and MAE, the evaluation measures are now prec@N (precision at N), AUC (area under the ROC curve), MAP (mean average precision), and NDCG (normalized discounted cumulative gain).
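Two of these ranking measures are easy to show in a few lines of Python. The sketch below (not MyMediaLite code) computes prec@N and AUC for one user, given the recommender's ranking and the set of items the user actually accessed:

```python
def precision_at_n(ranked_items, relevant, n):
    """Fraction of the top-n recommended items that are relevant."""
    hits = sum(1 for item in ranked_items[:n] if item in relevant)
    return hits / n

def auc(ranked_items, relevant):
    """Probability that a relevant item is ranked above an irrelevant one."""
    pos = [rank for rank, item in enumerate(ranked_items) if item in relevant]
    neg = [rank for rank, item in enumerate(ranked_items) if item not in relevant]
    if not pos or not neg:
        return 0.5
    correctly_ordered = sum(1 for p in pos for q in neg if p < q)
    return correctly_ordered / (len(pos) * len(neg))

ranking = [3, 1, 4, 2, 5]   # items in predicted order, best first
relevant = {1, 4}           # items the user actually consumed
print(precision_at_n(ranking, relevant, 2))  # 0.5 (one hit in the top 2)
print(auc(ranking, relevant))                # 4 of 6 pairs ordered correctly
```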
Let us start again with some baseline methods, Random and MostPopular:
item_recommendation --training-file=u1.base --test-file=u1.test --recommender=Random

random training_time 00:00:00.0001040 AUC 0.49924 prec@5 0.02789 prec@10 0.02898 MAP 0.00122 NDCG 0.37214 num_users 459 num_items 1650 testing_time 00:00:02.7115540

item_recommendation --training-file=u1.base --test-file=u1.test --recommender=MostPopular

MostPopular training_time 00:00:00.0015710 AUC 0.8543 prec@5 0.322 prec@10 0.30458 MAP 0.02187 NDCG 0.57038 num_users 459 num_items 1650 testing_time 00:00:02.3813790
User-based collaborative filtering:
item_recommendation --training-file=u1.base --test-file=u1.test --recommender=UserKNN

UserKNN k=80 training_time 00:00:05.6057200 AUC 0.91682 prec@5 0.52505 prec@10 0.46776 MAP 0.06482 NDCG 0.68793 num_users 459 num_items 1650 testing_time 00:00:08.8362840
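The idea behind user-based kNN on positive-only feedback can be sketched compactly: find the users most similar to the target user (here by cosine similarity over binary item sets) and score the items those neighbors accessed. This is a simplified illustration, not MyMediaLite's UserKNN implementation:

```python
import math

def user_knn_ranking(train, target_user, k=80):
    """Rank unseen items for target_user by similarity-weighted votes
    from the k most similar users. train: list of (user, item) pairs."""
    profiles = {}
    for u, i in train:
        profiles.setdefault(u, set()).add(i)
    target = profiles[target_user]
    # cosine similarity between binary user profiles
    sims = []
    for u, items in profiles.items():
        if u == target_user:
            continue
        overlap = len(target & items)
        if overlap:
            sims.append((overlap / math.sqrt(len(target) * len(items)), u))
    sims.sort(reverse=True)
    # accumulate similarity-weighted votes for items the target hasn't seen
    scores = {}
    for sim, u in sims[:k]:
        for item in profiles[u] - target:
            scores[item] = scores.get(item, 0.0) + sim
    return sorted(scores, key=scores.get, reverse=True)

train = [(1, 1), (1, 2), (2, 1), (2, 2), (2, 3), (3, 4)]
print(user_knn_ranking(train, 1))  # [3]: recommended via similar user 2
```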
Note that item recommendation evaluation usually takes longer than the rating prediction evaluation, because for each user, scores for every candidate item (possibly all items) have to be computed. You can restrict the number of predictions to be made using the options --test-users=FILE and --candidate-items=FILE to save time.
The item recommendation program supports the same options (--find-iter=N etc.) for iteratively trained recommenders such as BPRMF and WRMF.