MyMediaLite
3.01
Shrunk Pearson correlation for rating data.
Public Member Functions | |
void | AddEntity (int entity_id) |
Add an entity to the CorrelationMatrix by growing it to the requested size. | |
override void | ComputeCorrelations (IRatings ratings, EntityType entity_type) |
Compute correlations for given ratings. | |
IMatrix< T > | CreateMatrix (int num_rows, int num_columns) |
Create a matrix with a given number of rows and columns. | |
int[] | GetNearestNeighbors (int entity_id, uint k) |
Get the k nearest neighbors of a given entity. | |
IList< int > | GetPositivelyCorrelatedEntities (int entity_id) |
Get all entities that are positively correlated to an entity, sorted by correlation. | |
void | Grow (int num_rows, int num_columns) |
Grows the matrix to the requested size, if necessary. | |
Pearson (int num_entities) | |
Constructor. Create a Pearson correlation matrix. | |
double | SumUp (int entity_id, ICollection< int > entities) |
Sum up the correlations between a given entity and the entities in a collection. | |
SymmetricMatrix (int dim) | |
Initializes a new instance of the SymmetricMatrix class. | |
IMatrix< T > | Transpose () |
Get the transpose of the matrix, i.e. a matrix where rows and columns are interchanged. | |
void | Write (StreamWriter writer) |
Write out the correlations to a StreamWriter. | |
Static Public Member Functions | |
static float | ComputeCorrelation (IRatings ratings, EntityType entity_type, int i, int j, float shrinkage) |
Compute correlation between two entities for given ratings. | |
static float | ComputeCorrelation (IRatings ratings, EntityType entity_type, IList< Pair< int, float >> entity_ratings, int j, float shrinkage) |
Compute correlation between two entities for given ratings. | |
static CorrelationMatrix | Create (int num_entities) |
Creates a correlation matrix. | |
static CorrelationMatrix | Create (IRatings ratings, EntityType entity_type, float shrinkage) |
Create a Pearson correlation matrix from given data. | |
static CorrelationMatrix | ReadCorrelationMatrix (StreamReader reader) |
Creates a CorrelationMatrix from the lines of a StreamReader. | |
Public Attributes | |
int | dim |
Dimension, the number of rows and columns. | |
Protected Attributes | |
internal T[][] | data |
Data array: data is stored in columns. | |
int | num_entities |
Number of entities, e.g. users or items. | |
Properties | |
override bool | IsSymmetric [get] |
Returns true if the matrix is symmetric, which is generally the case for similarity matrices. | |
int | NumberOfColumns [get] |
The number of columns of the matrix. | |
int | NumberOfRows [get] |
The number of rows of the matrix. | |
float | Shrinkage [get, set] |
The shrinkage parameter; if set to 0, the result is the standard Pearson correlation without shrinkage. | |
virtual T | this[int i, int j] [get, set] |
T | this[int x, int y] [get, set] |
The value at (i,j) |
Shrunk Pearson correlation for rating data.
The correlation values are shrunk towards zero, depending on the number of ratings the estimate is based on. Otherwise, we would give too much weight to similarities estimated from just a few examples.
http://en.wikipedia.org/wiki/Pearson_correlation
We apply shrinkage as in formula (5.16) of chapter 5 of the Recommender Systems Handbook. Note that the shrinkage formula has changed between the two publications. It is now based on the assumption that the true correlations are normally distributed; the shrunk estimate is the posterior mean of the empirical estimate.
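The effect of shrinkage can be illustrated with a minimal Python sketch. It is not the library's C# implementation: the helper name `shrunk_pearson`, the dict-based input, the choice to take means over the co-rated entries only, and the factor (n - 1) / (n - 1 + shrinkage) are all assumptions made for illustration, with the factor following the commonly cited handbook form.

```python
from math import sqrt


def shrunk_pearson(ratings_i, ratings_j, shrinkage=10.0):
    """Shrunk Pearson correlation between two entities (illustrative sketch).

    ratings_i, ratings_j: dicts mapping a key (e.g. an item ID when comparing
    users) to a rating value. Hypothetical helper, not part of MyMediaLite.
    """
    # Only entries rated by both entities contribute to the estimate.
    common = ratings_i.keys() & ratings_j.keys()
    n = len(common)
    if n < 2:
        return 0.0

    xs = [ratings_i[key] for key in common]
    ys = [ratings_j[key] for key in common]
    mean_x = sum(xs) / n  # assumption: means over the co-rated entries
    mean_y = sum(ys) / n

    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0

    pearson = cov / sqrt(var_x * var_y)
    # Assumed shrinkage factor: pulls the estimate towards zero when n is small.
    return pearson * (n - 1) / (n - 1 + shrinkage)
```

With `shrinkage=0` the factor is 1 and the plain Pearson correlation is returned; larger values damp correlations that rest on few co-ratings more strongly.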
Literature:
Pearson | ( | int | num_entities | ) | [inline] |
Constructor. Create a Pearson correlation matrix.
num_entities | the number of entities |
void AddEntity | ( | int | entity_id | ) | [inline, inherited] |
Add an entity to the CorrelationMatrix by growing it to the requested size.
Note that you still have to compute and set the entity's correlation values yourself.
entity_id | the numerical ID of the entity |
static float ComputeCorrelation | ( | IRatings | ratings, |
EntityType | entity_type, | ||
int | i, | ||
int | j, | ||
float | shrinkage | ||
) | [inline, static] |
Compute correlation between two entities for given ratings.
ratings | the rating data |
entity_type | the entity type, either USER or ITEM |
i | the ID of the first entity |
j | the ID of the second entity |
shrinkage | the shrinkage parameter, set to 0 for the standard Pearson correlation without shrinkage |
static float ComputeCorrelation | ( | IRatings | ratings, |
EntityType | entity_type, | ||
IList< Pair< int, float >> | entity_ratings, | ||
int | j, | ||
float | shrinkage | ||
) | [inline, static] |
Compute correlation between two entities for given ratings.
ratings | the rating data |
entity_type | the entity type, either USER or ITEM |
entity_ratings | ratings identifying the first entity |
j | the ID of the second entity |
shrinkage | the shrinkage parameter, set to 0 for the standard Pearson correlation without shrinkage |
override void ComputeCorrelations | ( | IRatings | ratings, |
EntityType | entity_type | ||
) | [inline, virtual] |
Compute correlations for given ratings.
ratings | the rating data |
entity_type | the entity type, either USER or ITEM |
Implements RatingCorrelationMatrix.
static CorrelationMatrix Create | ( | int | num_entities | ) | [inline, static, inherited] |
Creates a correlation matrix.
Emits an informative warning if there is not enough memory.
num_entities | the number of entities |
static CorrelationMatrix Create | ( | IRatings | ratings, |
EntityType | entity_type, | ||
float | shrinkage | ||
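) | [inline, static] |
Create a Pearson correlation matrix from given data.
ratings | the rating data |
entity_type | the entity type, either USER or ITEM |
shrinkage | the shrinkage parameter, set to 0 for the standard Pearson correlation without shrinkage |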
IMatrix<T> CreateMatrix | ( | int | num_rows, |
int | num_columns | ||
) | [inline, inherited] |
Create a matrix with a given number of rows and columns.
num_rows | the number of rows |
num_columns | the number of columns |
Implements IMatrix< T >.
int [] GetNearestNeighbors | ( | int | entity_id, |
uint | k | ||
) | [inline, inherited] |
Get the k nearest neighbors of a given entity.
entity_id | the numerical ID of the entity |
k | the neighborhood size |
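A k-nearest-neighbor lookup over a correlation matrix might be sketched as follows in Python. This is an illustration only: the function name, the list-of-lists matrix, and the decision to exclude the entity itself from its own neighborhood are assumptions, not confirmed details of the library.

```python
def nearest_neighbors(corr, entity_id, k):
    """Return the IDs of the k entities most correlated with entity_id.

    corr: square list-of-lists correlation matrix (hypothetical stand-in
    for the CorrelationMatrix class).
    """
    # Assumption: an entity is not its own neighbor.
    candidates = [e for e in range(len(corr)) if e != entity_id]
    # Highest correlation first.
    candidates.sort(key=lambda e: corr[entity_id][e], reverse=True)
    return candidates[:k]
```

If fewer than k other entities exist, the sketch simply returns all of them.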
IList<int> GetPositivelyCorrelatedEntities | ( | int | entity_id | ) | [inline, inherited] |
Get all entities that are positively correlated to an entity, sorted by correlation.
entity_id | the entity ID |
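The filtering and ordering described here could look like the Python sketch below. The function name and matrix representation are hypothetical, and both the descending sort order and the exclusion of the entity itself are assumptions.

```python
def positively_correlated(corr, entity_id):
    """Return IDs of entities with correlation > 0 to entity_id, best first.

    corr: square list-of-lists correlation matrix (hypothetical stand-in).
    """
    row = corr[entity_id]
    # Keep strictly positive correlations; assumption: skip the entity itself.
    positive = [e for e in range(len(row)) if e != entity_id and row[e] > 0]
    # Assumption: "sorted by correlation" means descending order.
    positive.sort(key=lambda e: row[e], reverse=True)
    return positive
```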
void Grow | ( | int | num_rows, |
int | num_cols | ||
) | [inline, inherited] |
Grows the matrix to the requested size, if necessary.
The new entries are filled with zeros.
num_rows | the minimum number of rows |
num_cols | the minimum number of columns |
Implements IMatrix< T >.
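The grow-if-necessary behavior with zero-filled new entries can be sketched in Python as follows. The function name and the list-of-lists representation are assumptions for illustration; the library's own storage layout may differ.

```python
def grow(matrix, num_rows, num_cols):
    """Return a copy of matrix with at least num_rows rows and num_cols columns.

    Existing values are preserved; new entries are filled with zeros.
    A matrix that is already large enough keeps its size.
    """
    old_rows = len(matrix)
    old_cols = len(matrix[0]) if matrix else 0
    rows = max(old_rows, num_rows)
    cols = max(old_cols, num_cols)

    grown = [[0.0] * cols for _ in range(rows)]
    for r in range(old_rows):
        # Copy the existing row into the leading columns of the new row.
        grown[r][:old_cols] = matrix[r]
    return grown
```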
static CorrelationMatrix ReadCorrelationMatrix | ( | StreamReader | reader | ) | [inline, static, inherited] |
Creates a CorrelationMatrix from the lines of a StreamReader.
The first line is expected to contain the number of entities. All other lines have the format
EntityID1 EntityID2 Correlation
where EntityID1 and EntityID2 are non-negative integers and Correlation is a floating point number.
reader | the StreamReader to read from |
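A reader for the format described above might be sketched in Python as follows. The function name is hypothetical, and several details are assumptions: fields are taken to be whitespace-separated, blank lines are skipped, each value is mirrored to keep the matrix symmetric, and unlisted pairs default to 0.

```python
import io


def read_correlation_matrix(reader):
    """Parse a correlation matrix from a text stream (illustrative sketch).

    First line: the number of entities.
    Remaining lines: "EntityID1 EntityID2 Correlation".
    """
    num_entities = int(reader.readline())
    corr = [[0.0] * num_entities for _ in range(num_entities)]
    for line in reader:
        fields = line.split()  # assumption: whitespace-separated fields
        if not fields:
            continue  # assumption: blank lines are ignored
        i, j, value = int(fields[0]), int(fields[1]), float(fields[2])
        corr[i][j] = value
        corr[j][i] = value  # assumption: mirror to keep the matrix symmetric
    return corr


# Usage with an in-memory stream standing in for a file.
example = io.StringIO("3\n0 1 0.5\n1 2 -0.25\n")
matrix = read_correlation_matrix(example)
```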
double SumUp | ( | int | entity_id, |
ICollection< int > | entities | ||
) | [inline, inherited] |
Sum up the correlations between a given entity and the entities in a collection.
entity_id | the numerical ID of the entity |
entities | a collection containing the numerical IDs of the entities to compare to |
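As a rough Python illustration, the summation amounts to adding up one row's entries for the given IDs. The function name and list-of-lists matrix are hypothetical stand-ins.

```python
def sum_up(corr, entity_id, entities):
    """Sum the correlations between entity_id and each ID in entities.

    corr: square list-of-lists correlation matrix (hypothetical stand-in).
    """
    return sum(corr[entity_id][e] for e in entities)
```

An empty collection yields 0.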
SymmetricMatrix | ( | int | dim | ) | [inline, inherited] |
Initializes a new instance of the SymmetricMatrix class.
dim | the number of rows and columns |
IMatrix<T> Transpose | ( | ) | [inline, inherited] |
Get the transpose of the matrix, i.e. a matrix where rows and columns are interchanged.
Implements IMatrix< T >.
void Write | ( | StreamWriter | writer | ) | [inline, inherited] |
Write out the correlations to a StreamWriter.
writer | the StreamWriter to write to |
int dim [inherited] |
Dimension, the number of rows and columns.
int num_entities [protected, inherited] |
Number of entities, e.g. users or items.
int NumberOfColumns [get, inherited] |
The number of columns of the matrix.
int NumberOfRows [get, inherited] |
The number of rows of the matrix.
float Shrinkage [get, set] |
The shrinkage parameter; if set to 0, the result is the standard Pearson correlation without shrinkage.
T this[int x, int y] [get, set, inherited] |
The value at (x, y)
x | the row ID |
y | the column ID |
Implemented in SparseMatrix< T >, SkewSymmetricSparseMatrix, SymmetricSparseMatrix< T >, SparseBooleanMatrix, SparseBooleanMatrixBinarySearch, and SparseBooleanMatrixStatic.