SASRec algorithms and utilities

class sasrec.model.Encoder(*args, **kwargs)[source]

Invokes a Transformer-based encoder with a user-defined number of layers.

Parameters
  • num_layers (int) – Number of layers.

  • seq_max_len (int) – Maximum sequence length.

  • embedding_dim (int) – Embedding dimension.

  • attention_dim (int) – Dimension of the attention embeddings.

  • num_heads (int) – Number of heads in the multi-head self-attention module.

  • conv_dims (list) – List of the dimensions of the Feedforward layer.

  • dropout_rate (float) – Dropout probability.
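
Examples

A minimal construction sketch. It assumes the constructor accepts the parameters above as keyword arguments; the tensor shapes and the mask convention are illustrative assumptions, not part of the documented API:

>>> import tensorflow as tf
>>> from sasrec.model import Encoder
>>> encoder = Encoder(
...     num_layers=2,
...     seq_max_len=50,
...     embedding_dim=100,
...     attention_dim=100,
...     num_heads=1,
...     conv_dims=[100, 100],
...     dropout_rate=0.2,
... )
>>> x = tf.random.uniform((32, 50, 100))        # assumed (batch, seq_max_len, embedding_dim)
>>> mask = tf.ones((32, 50, 1))                 # assumed padding mask (1 = real item, 0 = padding)
>>> out = encoder(x, training=True, mask=mask)  # output tensor with the same leading dimensions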

call(x, training, mask)[source]

Model forward pass.

Parameters
  • x (tf.Tensor) – Input tensor.

  • training (tf.Tensor) – Boolean tensor indicating whether the model is in training mode.

  • mask (tf.Tensor) – Mask tensor.

Returns

tf.Tensor – Output tensor

class sasrec.model.EncoderLayer(*args, **kwargs)[source]

Transformer-based encoder layer.

Parameters
  • seq_max_len (int) – Maximum sequence length.

  • embedding_dim (int) – Embedding dimension.

  • attention_dim (int) – Dimension of the attention embeddings.

  • num_heads (int) – Number of heads in the multi-head self-attention module.

  • conv_dims (list) – List of the dimensions of the Feedforward layer.

  • dropout_rate (float) – Dropout probability.

call(x, training, mask)[source]

Model forward pass.

Parameters
  • x (tf.Tensor) – Input tensor.

  • training (tf.Tensor) – Boolean tensor indicating whether the model is in training mode.

  • mask (tf.Tensor) – Mask tensor.

Returns

tf.Tensor – Output tensor

call_(x, training, mask)[source]

Model forward pass.

Parameters
  • x (tf.Tensor) – Input tensor.

  • training (tf.Tensor) – Boolean tensor indicating whether the model is in training mode.

  • mask (tf.Tensor) – Mask tensor.

Returns

tf.Tensor – Output tensor

class sasrec.model.LayerNormalization(*args, **kwargs)[source]

Layer normalization using mean and variance; gamma and beta are the learnable parameters.

Parameters
  • seq_max_len (int) – Maximum sequence length.

  • embedding_dim (int) – Embedding dimension.

  • epsilon (float) – Epsilon value.
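
Examples

Conceptually this is standard layer normalization. A sketch of the computation it describes, assuming the constructor accepts the parameters above as keyword arguments (the exact implementation may differ in detail):

>>> # out = gamma * (x - mean(x, axis=-1)) / sqrt(var(x, axis=-1) + epsilon) + beta
>>> from sasrec.model import LayerNormalization
>>> ln = LayerNormalization(seq_max_len=50, embedding_dim=100, epsilon=1e-8)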

call(x)[source]

Model forward pass.

Parameters

x (tf.Tensor) – Input tensor.

Returns

tf.Tensor – Output tensor

class sasrec.model.MultiHeadAttention(*args, **kwargs)[source]
  • Q (query), K (key) and V (value) are split into multiple heads (num_heads)

  • each tuple (q, k, v) is fed to scaled_dot_product_attention

  • all attention outputs are concatenated

Parameters
  • attention_dim (int) – Dimension of the attention embeddings.

  • num_heads (int) – Number of heads in the multi-head self-attention module.

  • dropout_rate (float) – Dropout probability.
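
Examples

A minimal usage sketch. The shapes are assumptions; since Q, K and V are split across heads, attention_dim is typically chosen to be divisible by num_heads:

>>> import tensorflow as tf
>>> from sasrec.model import MultiHeadAttention
>>> mha = MultiHeadAttention(attention_dim=100, num_heads=2, dropout_rate=0.2)
>>> queries = tf.random.uniform((32, 50, 100))  # assumed (batch, seq_len, attention_dim)
>>> keys = tf.random.uniform((32, 50, 100))
>>> out = mha(queries, keys)                    # (32, 50, 100)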

call(queries, keys)[source]

Model forward pass.

Parameters
  • queries (tf.Tensor) – Tensor of queries.

  • keys (tf.Tensor) – Tensor of keys.

Returns

tf.Tensor – Output tensor

class sasrec.model.PointWiseFeedForward(*args, **kwargs)[source]

Convolution layers with a residual connection.

Parameters
  • conv_dims (list) – List of the dimensions of the Feedforward layer.

  • dropout_rate (float) – Dropout probability.
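
Examples

A minimal sketch. The values are placeholders; because of the residual connection, the last entry of conv_dims normally has to match the input dimension:

>>> import tensorflow as tf
>>> from sasrec.model import PointWiseFeedForward
>>> ffn = PointWiseFeedForward(conv_dims=[100, 100], dropout_rate=0.2)
>>> x = tf.random.uniform((32, 50, 100))  # assumed (batch, seq_len, dim)
>>> out = ffn(x)                          # same shape as x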

call(x)[source]

Model forward pass.

Parameters

x (tf.Tensor) – Input tensor.

Returns

tf.Tensor – Output tensor

class sasrec.model.SASREC(*args, **kwargs)[source]

Self-Attentive Sequential Recommendation Using Transformer

Keyword Arguments
  • item_num (int) – Number of items in the dataset.

  • seq_max_len (int) – Maximum number of items in user history.

  • num_blocks (int) – Number of Transformer blocks to be used.

  • embedding_dim (int) – Item embedding dimension.

  • attention_dim (int) – Transformer attention dimension.

  • attention_num_heads (int) – Number of Transformer attention heads.

  • conv_dims (list) – List of the dimensions of the Feedforward layer.

  • dropout_rate (float) – Dropout rate.

  • l2_reg (float) – Coefficient of the L2 regularization.
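
Examples

A construction sketch using the keyword arguments listed above (the values are placeholders, not recommended settings):

>>> from sasrec.model import SASREC
>>> model = SASREC(
...     item_num=10000,
...     seq_max_len=50,
...     num_blocks=2,
...     embedding_dim=100,
...     attention_dim=100,
...     attention_num_heads=1,
...     conv_dims=[100, 100],
...     dropout_rate=0.2,
...     l2_reg=0.0,
... )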

epoch

Epoch of trained model.

Type

int

best_score

Best validation HR@10 score while training.

Type

float

val_users

User list for validation.

Type

list

history

Training history containing epoch, NDCG@10, and HR@10.

Type

pd.DataFrame

batch_predict(inputs, cand_n)[source]

Returns the logits for the item candidates.

Parameters
  • inputs (tf.Tensor) – Input tensor.

  • cand_n (int) – Number of candidates.

Returns

tf.Tensor – Output tensor

call(x, training)[source]

Model forward pass.

Parameters
  • x (tf.Tensor) – Input tensor.

  • training (tf.Tensor) – Boolean tensor indicating whether the model is in training mode.

Returns
  • tf.Tensor – Logits of the positive examples

  • tf.Tensor – Logits of the negative examples

  • tf.Tensor – Mask for nonzero targets

create_combined_dataset(u, seq, pos, neg)[source]

Creates model inputs from sampled batch data. This function is used during training.

create_combined_dataset_pred(u, seq, cand)[source]

Creates model inputs from sampled batch data. This function is used when predicting on a batch.

embedding(input_seq)[source]

Compute the sequence and positional embeddings.

Parameters

input_seq (tf.Tensor) – Input sequence

Returns
  • tf.Tensor – Sequence embeddings

  • tf.Tensor – Positional embeddings

evaluate(dataset, target_user_n=1000, target_item_n=-1, rank_threshold=10, is_val=False)[source]

Evaluate the model on the validation set or the test set.

Parameters
  • dataset (SASRecDataSet) – SASRecDataSet containing user-item interaction history.

  • target_user_n (int, optional) – Number of randomly sampled users to evaluate. Defaults to 1000.

  • target_item_n (int, optional) – Number of candidate items. Defaults to -1, which means all items.

  • rank_threshold (int, optional) – k value in NDCG@k and HR@k. Defaults to 10.

  • is_val (bool, optional) – If True, evaluate on the validation set; if False, evaluate on the test set. Defaults to False.

Returns
  • float – NDCG@k

  • float – HR@k
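
Examples

A usage sketch (assumes a trained model and a prepared SASRecDataSet, as in train below):

>>> ndcg, hr = model.evaluate(dataset, target_user_n=1000, rank_threshold=10, is_val=False)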

get_user_item_score(dataset, user_id_list, item_list, user_map_dict, item_map_dict, batch_size=128)[source]

Get item scores for each user, processed in batches.

Parameters
  • dataset (SASRecDataSet) – SASRecDataSet containing user-item interaction history.

  • user_id_list (list) – User list to predict.

  • item_list (list) – Item list to predict.

  • user_map_dict (dict) – Dict {user_id: encoded_user_label, …}

  • item_map_dict (dict) – Dict {item: encoded_item_label, …}

  • batch_size (int, optional) – Batch size. Defaults to 128.

Raises

Exception – batch_size must be smaller than the size of user_id_list.

Returns

pd.DataFrame – User-item scores.
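
Examples

A usage sketch. The user_map_dict and item_map_dict mappings are assumed to have been built when the dataset was encoded; their construction is not shown here:

>>> scores = model.get_user_item_score(
...     dataset,
...     user_id_list=[1, 2, 3],
...     item_list=[10, 20, 30],
...     user_map_dict=user_map_dict,
...     item_map_dict=item_map_dict,
...     batch_size=2,
... )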

loss_function(pos_logits, neg_logits, istarget)[source]

Losses are calculated separately for the positive and negative items based on the corresponding logits. A mask is included to take care of the zero items (added for padding).

Parameters
  • pos_logits (tf.Tensor) – Logits of the positive examples.

  • neg_logits (tf.Tensor) – Logits of the negative examples.

  • istarget (tf.Tensor) – Mask for nonzero targets.

Returns

float – Loss value.
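
Examples

The description above amounts to a masked binary loss over the positive and negative logits. A conceptual sketch, not necessarily the exact implementation:

>>> # loss is roughly
>>> #   sum((-log(sigmoid(pos_logits)) - log(1 - sigmoid(neg_logits))) * istarget) / sum(istarget)
>>> loss = model.loss_function(pos_logits, neg_logits, istarget)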

old_get_user_item_score(dataset, user_map_dict, item_map_dict, user_id_list, item_list, is_test=False)[source]

Deprecated

predict(inputs, neg_cand_n)[source]

Returns the logits for the test items.

Parameters
  • inputs (tf.Tensor) – Input tensor.

  • neg_cand_n (int) – Number of negative candidates.

Returns

tf.Tensor – Output tensor

recommend_item(dataset, user_map_dict, user_id_list, target_item_n=-1, top_n=10, exclude_purchased=True, is_test=False)[source]

Recommend items to users.

Parameters
  • dataset (util.SASRecDataSet) – SASRecDataSet containing user-item interaction history.

  • user_map_dict (dict) – Dict {user_id: encoded_user_label, …}

  • user_id_list (list) – User list to predict.

  • target_item_n (int, optional) – Number of candidate items. Defaults to -1, which means all items.

  • top_n (int, optional) – Number of items to recommend. Defaults to 10.

  • exclude_purchased (bool, optional) – If True, exclude already purchased items from the candidates. Defaults to True.

  • is_test (bool, optional) – If True, exclude the last item from each user’s sequence. Defaults to False.

Returns

pd.DataFrame – Recommended items for users.
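
Examples

A usage sketch (user_map_dict is the user-to-label mapping built when the dataset was encoded; its construction is assumed here):

>>> recs = model.recommend_item(
...     dataset,
...     user_map_dict=user_map_dict,
...     user_id_list=[1, 2, 3],
...     top_n=10,
...     exclude_purchased=True,
... )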

sample_val_users(dataset, target_user_n)[source]

Sample users for validation

Parameters
  • dataset (SASRecDataSet) – SASRec dataset used for training

  • target_user_n (int) – Number of users to sample

save(path, exp_name='sas_experiment')[source]

Save the trained SASRec model.

Parameters
  • path (str) – Path to save model.

  • exp_name (str) – Experiment name.

Examples

>>> model.save(path, exp_name)

train(dataset, sampler, num_epochs=10, batch_size=128, lr=0.001, val_epoch=5, val_target_user_n=1000, target_item_n=-1, auto_save=False, path='./', exp_name='SASRec_exp')[source]

High-level function for model training as well as evaluation on the validation and test datasets.

Parameters
  • dataset (util.SASRecDataSet) – SASRecDataSet containing user-item interaction history.

  • sampler (util.WarpSampler) – WarpSampler that provides training batches.

  • num_epochs (int, optional) – Number of training epochs. Defaults to 10.

  • batch_size (int, optional) – Batch size. Defaults to 128.

  • lr (float, optional) – Learning rate. Defaults to 0.001.

  • val_epoch (int, optional) – Interval (in epochs) between validation runs. Defaults to 5.

  • val_target_user_n (int, optional) – Number of randomly sampled users to conduct validation. Defaults to 1000.

  • target_item_n (int, optional) – Number of candidate items. Defaults to -1, which means all items.

  • auto_save (bool, optional) – If True, save the model with the best validation score. Defaults to False.

  • path (str, optional) – Path to save the model. Defaults to './'.

  • exp_name (str, optional) – Experiment name. Defaults to 'SASRec_exp'.
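
Examples

An end-to-end sketch. The SASRecDataSet and WarpSampler constructors, attributes, and import path shown here are assumptions inferred from how they are referenced in this module, not a documented contract:

>>> from sasrec.util import SASRecDataSet, WarpSampler
>>> from sasrec.model import SASREC
>>> dataset = SASRecDataSet('interactions.txt')  # hypothetical interaction file
>>> dataset.split()                              # assumed train/val/test split helper
>>> sampler = WarpSampler(dataset.user_train, dataset.usernum, dataset.itemnum,
...                       batch_size=128, maxlen=50)  # assumed signature
>>> model = SASREC(item_num=dataset.itemnum, seq_max_len=50, num_blocks=2,
...                embedding_dim=100, attention_dim=100, attention_num_heads=1,
...                conv_dims=[100, 100], dropout_rate=0.2, l2_reg=0.0)
>>> model.train(dataset, sampler, num_epochs=10, batch_size=128, lr=0.001,
...             val_epoch=5, val_target_user_n=1000, auto_save=True,
...             path='./', exp_name='SASRec_exp')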