diff --git a/README.md b/README.md
index 8d78202efc..9bc08bd717 100644
--- a/README.md
+++ b/README.md
@@ -82,7 +82,7 @@ The table below lists the recommender algorithms currently available in the repo
 | LightGBM/Gradient Boosting Tree<sup>*</sup> | Content-Based Filtering | Gradient Boosting Tree algorithm for fast training and low memory usage in content-based problems. It works in the CPU/GPU/PySpark environments. | [Quick start in CPU](examples/00_quick_start/lightgbm_tinycriteo.ipynb) / [Deep dive in PySpark](examples/02_model_content_based_filtering/mmlspark_lightgbm_criteo.ipynb) |
 | LightGCN | Collaborative Filtering | Deep learning algorithm which simplifies the design of GCN for predicting implicit feedback. It works in the CPU/GPU environment. | [Deep dive](examples/02_model_collaborative_filtering/lightgcn_deep_dive.ipynb) |
 | GeoIMC<sup>*</sup> | Hybrid | Matrix completion algorithm that has into account user and item features using Riemannian conjugate gradients optimization and following a geometric approach. It works in the CPU environment. | [Quick start](examples/00_quick_start/geoimc_movielens.ipynb) |
-| GRU4Rec | Collaborative Filtering | Sequential-based algorithm that aims to capture both long and short-term user preferences using recurrent neural networks. It works in the CPU/GPU environment. | [Quick start](examples/00_quick_start/sequential_recsys_amazondataset.ipynb) |
+| GRU | Collaborative Filtering | Sequential-based algorithm that aims to capture both long- and short-term user preferences using recurrent neural networks. It works in the CPU/GPU environment. | [Quick start](examples/00_quick_start/sequential_recsys_amazondataset.ipynb) |
 | Multinomial VAE | Collaborative Filtering | Generative model for predicting user/item interactions. It works in the CPU/GPU environment. | [Deep dive](examples/02_model_collaborative_filtering/multi_vae_deep_dive.ipynb) |
 | Neural Recommendation with Long- and Short-term User Representations (LSTUR)<sup>*</sup> | Content-Based Filtering | Neural recommendation algorithm for recommending news articles with long- and short-term user interest modeling. It works in the CPU/GPU environment. | [Quick start](examples/00_quick_start/lstur_MIND.ipynb) |
 | Neural Recommendation with Attentive Multi-View Learning (NAML)<sup>*</sup> | Content-Based Filtering | Neural recommendation algorithm for recommending news articles with attentive multi-view learning. It works in the CPU/GPU environment. | [Quick start](examples/00_quick_start/naml_MIND.ipynb) |
diff --git a/docs/source/models.rst b/docs/source/models.rst
index e15204e4dd..4b5080869c 100644
--- a/docs/source/models.rst
+++ b/docs/source/models.rst
@@ -57,9 +57,9 @@ Caser
 .. automodule:: recommenders.models.deeprec.models.sequential.caser
     :members:
 
-GRU4Rec
+GRU
 --------------
-.. automodule:: recommenders.models.deeprec.models.sequential.gru4rec
+.. automodule:: recommenders.models.deeprec.models.sequential.gru
     :members:
 
 NextItNet
diff --git a/examples/00_quick_start/README.md b/examples/00_quick_start/README.md
index 4755cdbb2b..5529ced6e9 100644
--- a/examples/00_quick_start/README.md
+++ b/examples/00_quick_start/README.md
@@ -20,7 +20,7 @@ In this directory, notebooks are provided to perform a quick demonstration of di
 | [sar_azureml_designer](sar_movieratings_with_azureml_designer.ipynb) | MovieLens | Python CPU | An example of how to implement SAR on [AzureML Designer](https://docs.microsoft.com/en-us/azure/machine-learning/concept-designer). |
 | [a2svd](sequential_recsys_amazondataset.ipynb) | Amazon | Python CPU, GPU | Use A2SVD [11] to predict a set of movies the user is going to interact in a short time. |
 | [caser](sequential_recsys_amazondataset.ipynb) | Amazon | Python CPU, GPU | Use Caser [12] to predict a set of movies the user is going to interact in a short time. |
-| [gru4rec](sequential_recsys_amazondataset.ipynb) | Amazon | Python CPU, GPU | Use GRU4Rec [13] to predict a set of movies the user is going to interact in a short time. |
+| [gru](sequential_recsys_amazondataset.ipynb) | Amazon | Python CPU, GPU | Use GRU [13] to predict a set of movies the user is going to interact with in a short time. |
 | [nextitnet](sequential_recsys_amazondataset.ipynb) | Amazon | Python CPU, GPU | Use NextItNet [14] to predict a set of movies the user is going to interact in a short time. |
 | [sli-rec](sequential_recsys_amazondataset.ipynb) | Amazon | Python CPU, GPU | Use SLi-Rec [11] to predict a set of movies the user is going to interact in a short time. |
 | [wide-and-deep](wide_deep_movielens.ipynb) | MovieLens | Python CPU, GPU |  Utilizing Wide-and-Deep Model (Wide-and-Deep) [5] to predict movie ratings in a Python+GPU (TensorFlow) environment.
@@ -38,5 +38,5 @@ In this directory, notebooks are provided to perform a quick demonstration of di
 [10] _NPA: Neural News Recommendation with Personalized Attention_, Chuhan Wu, Fangzhao Wu, Mingxiao An, Jianqiang Huang, Yongfeng Huang and Xing Xie. KDD 2019, ADS track.<br>
 [11] _Adaptive User Modeling with Long and Short-Term Preferences for Personailzed Recommendation_, Zeping Yu, Jianxun Lian, Ahmad Mahmoody, Gongshen Liu and Xing Xie, IJCAI 2019.<br>
 [12] _Personalized top-n sequential recommendation via convolutional sequence embedding_, Jiaxi Tang and Ke Wang, ACM WSDM 2018.<br>
-[13] _Session-based Recommendations with Recurrent Neural Networks_, Balazs Hidasi, Alexandros Karatzoglou, Linas Baltrunas and Domonkos Tikk, ICLR 2016.<br>
+[13] _Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation_, Kyunghyun Cho, Bart van Merrienboer, Caglar Gulcehre, Dzmitry Bahdanau, Fethi Bougares, Holger Schwenk, and Yoshua Bengio, arXiv preprint arXiv:1406.1078. 2014.<br>
 [14] _A Simple Convolutional Generative Network for Next Item Recommendation_, Fajie Yuan, Alexandros Karatzoglou, Ioannis Arapakis, Joemon M. Jose and Xiangnan He, WSDM 2019. <br>
diff --git a/examples/00_quick_start/sasrec_amazon.ipynb b/examples/00_quick_start/sasrec_amazon.ipynb
index 7378100769..164fb365f3 100644
--- a/examples/00_quick_start/sasrec_amazon.ipynb
+++ b/examples/00_quick_start/sasrec_amazon.ipynb
@@ -15,7 +15,7 @@
     "\n",
     "![image.png](attachment:image.png)\n",
     "\n",
-    "This is a class of sequential recommendation that uses Transformer \\[2\\] for encoding the users preference represented in terms of a sequence of items purchased/viewed before. Instead of using CNN (Caser \\[3\\]) or RNN (GRU4Rec \\[4\\], SLI-Rec \\[5\\] etc.) the approach relies on Transformer based encoder that generates a new representation of the item sequence. Two variants of this Transformer based approaches are included here, \n",
+    "This is a class of sequential recommendation that uses a Transformer \\[2\\] to encode the user's preferences, represented as a sequence of previously purchased or viewed items. Instead of using a CNN (Caser \\[3\\]) or an RNN (GRU \\[4\\], SLI-Rec \\[5\\], etc.), the approach relies on a Transformer-based encoder that generates a new representation of the item sequence. Two variants of this Transformer-based approach are included here:\n",
     "\n",
     "- Self-Attentive Sequential Recommendation (or SASRec [1]) that is based on vanilla Transformer and models only the item sequence and\n",
     "- Stochastic Shared Embedding based Personalized Transformer or SSE-PT [6], that also models the users along with the items. \n",
@@ -456,7 +456,7 @@
     "\n",
     "\\[3\\] Jiaxi Tang and Ke Wang. 2018. Personalized top-n sequential recommendation via convolutional sequence embedding. In Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining. ACM, 565–573.\n",
     "\n",
-    "\\[4\\] Balázs Hidasi, Alexandros Karatzoglou, Linas Baltrunas, and Domonkos Tikk. 2015. Session-based recommendations with recurrent neural networks. arXiv preprint arXiv:1511.06939 (2015)\n",
+    "\\[4\\] Kyunghyun Cho, Bart van Merrienboer, Caglar Gulcehre, Dzmitry Bahdanau, Fethi Bougares, Holger Schwenk, and Yoshua Bengio. Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation. arXiv preprint arXiv:1406.1078. 2014.\n",
     "\n",
     "\\[5\\] Zeping Yu, Jianxun Lian, Ahmad Mahmoody, Gongshen Liu, Xing Xie. Adaptive User Modeling with Long and Short-Term Preferences for Personailzed Recommendation. In Proceedings of the 28th International Joint Conferences on Artificial Intelligence, IJCAI’19, Pages 4213-4219. AAAI Press, 2019.\n",
     "\n",
diff --git a/examples/00_quick_start/sequential_recsys_amazondataset.ipynb b/examples/00_quick_start/sequential_recsys_amazondataset.ipynb
index c76b10aaf5..8d22ba4617 100644
--- a/examples/00_quick_start/sequential_recsys_amazondataset.ipynb
+++ b/examples/00_quick_start/sequential_recsys_amazondataset.ipynb
@@ -18,7 +18,7 @@
                 "### Example: SLi_Rec : Adaptive User Modeling with Long and Short-Term Preferences for Personailzed Recommendation\n",
                 "Unlike a general recommender such as Matrix Factorization or xDeepFM (in the repo) which doesn't consider the order of the user's activities, sequential recommender systems take the sequence of the user behaviors as context and the goal is to predict the items that the user will interact in a short time (in an extreme case, the item that the user will interact next).\n",
                 "\n",
-                "This notebook aims to give you a quick example of how to train a sequential model based on a public Amazon dataset. Currently, we can support NextItNet \\[4\\], GRU4Rec \\[2\\], Caser \\[3\\], A2SVD \\[1\\], SLi_Rec \\[1\\], and SUM \\[5\\]. Without loss of generality, this notebook takes [SLi_Rec model](https://www.microsoft.com/en-us/research/uploads/prod/2019/07/IJCAI19-ready_v1.pdf) for example.\n",
+                "This notebook aims to give you a quick example of how to train a sequential model based on a public Amazon dataset. Currently, the supported models are NextItNet \\[4\\], GRU \\[2\\], Caser \\[3\\], A2SVD \\[1\\], SLi_Rec \\[1\\], and SUM \\[5\\]. Without loss of generality, this notebook uses the [SLi_Rec model](https://www.microsoft.com/en-us/research/uploads/prod/2019/07/IJCAI19-ready_v1.pdf) as an example.\n",
                 "SLi_Rec \\[1\\] is a deep learning-based model aims at capturing both long and short-term user preferences for precise recommender systems. To summarize, SLi_Rec has the following key properties:\n",
                 "\n",
                 "* It adopts the attentive \"Asymmetric-SVD\" paradigm for long-term modeling;\n",
@@ -84,7 +84,7 @@
                 "####  to use the other model, use one of the following lines:\n",
                 "# from recommenders.models.deeprec.models.sequential.asvd import A2SVDModel as SeqModel\n",
                 "# from recommenders.models.deeprec.models.sequential.caser import CaserModel as SeqModel\n",
-                "# from recommenders.models.deeprec.models.sequential.gru4rec import GRU4RecModel as SeqModel\n",
+                "# from recommenders.models.deeprec.models.sequential.gru import GRUModel as SeqModel\n",
                 "# from recommenders.models.deeprec.models.sequential.sum import SUMModel as SeqModel\n",
                 "\n",
                 "#from recommenders.models.deeprec.models.sequential.nextitnet import NextItNetModel\n",
@@ -448,7 +448,7 @@
                 "| Models | AUC | g-AUC | NDCG@2 | NDCG@10 | seconds per epoch on GPU | seconds per epoch on CPU| config |\n",
                 "| :------| :------: | :------: | :------: | :------: | :------: | :------: | :------ |\n",
                 "| A2SVD | 0.8251 | 0.8178 | 0.2922 | 0.4264 | 249.5 | 440.0 | N/A |\n",
-                "| GRU4Rec | 0.8411 | 0.8332 | 0.3213 | 0.4547 | 439.0 | 4285.0 | max_seq_length=50, hidden_size=40|\n",
+                "| GRU | 0.8411 | 0.8332 | 0.3213 | 0.4547 | 439.0 | 4285.0 | max_seq_length=50, hidden_size=40|\n",
                 "| Caser | 0.8244 | 0.8171 | 0.283 | 0.4194 | 314.3 | 5369.9 | T=1, n_v=128, n_h=128, L=3, min_seq_length=5|\n",
                 "| SLi_Rec | 0.8631 | 0.8519 | 0.3491 | 0.4842 | 549.6 | 5014.0 | attention_size=40, max_seq_length=50, hidden_size=40|\n",
                 "| NextItNet* | 0.6793 | 0.6769 | 0.0602 | 0.1733 | 112.0 | 214.5 | min_seq_length=3, dilations=\\[1,2,4,1,2,4\\], kernel_size=3 |\n",
@@ -557,13 +557,13 @@
                 "## References\n",
                 "\\[1\\] Zeping Yu, Jianxun Lian, Ahmad Mahmoody, Gongshen Liu, Xing Xie. Adaptive User Modeling with Long and Short-Term Preferences for Personailzed Recommendation. In Proceedings of the 28th International Joint Conferences on Artificial Intelligence, IJCAI’19, Pages 4213-4219. AAAI Press, 2019.\n",
                 "\n",
-                "\\[2\\] Balázs Hidasi, Alexandros Karatzoglou, Linas Baltrunas, Domonkos Tikk. Session-based Recommendations with Recurrent Neural Networks. ICLR (Poster) 2016\n",
+                "\\[2\\] Kyunghyun Cho, Bart van Merrienboer, Caglar Gulcehre, Dzmitry Bahdanau, Fethi Bougares, Holger Schwenk, and Yoshua Bengio. Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation. arXiv preprint arXiv:1406.1078. 2014.\n",
                 "\n",
                 "\\[3\\] Tang, Jiaxi, and Ke Wang. Personalized top-n sequential recommendation via convolutional sequence embedding. Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining. ACM, 2018.\n",
                 "\n",
-                "\\[4\\] Yuan, F., Karatzoglou, A., Arapakis, I., Jose, J. M., & He, X. A Simple Convolutional Generative Network for Next Item Recommendation. WSDM, 2019\n",
+                "\\[4\\] Yuan, F., Karatzoglou, A., Arapakis, I., Jose, J. M., & He, X. A Simple Convolutional Generative Network for Next Item Recommendation. WSDM, 2019.\n",
                 "\n",
-                "\\[5\\] Lian, J., Batal, I., Liu, Z., Soni, A., Kang, E. Y., Wang, Y., & Xie, X. Multi-Interest-Aware User Modeling for Large-Scale Sequential Recommendations. (2021) arXiv preprint arXiv:2102.09211."
+                "\\[5\\] Lian, J., Batal, I., Liu, Z., Soni, A., Kang, E. Y., Wang, Y., & Xie, X. Multi-Interest-Aware User Modeling for Large-Scale Sequential Recommendations. arXiv preprint arXiv:2102.09211. 2021."
             ]
         },
         {
@@ -598,4 +598,4 @@
     },
     "nbformat": 4,
     "nbformat_minor": 2
-}
\ No newline at end of file
+}
diff --git a/recommenders/README.md b/recommenders/README.md
index a9e4ca7dfe..33f80a04df 100644
--- a/recommenders/README.md
+++ b/recommenders/README.md
@@ -143,7 +143,7 @@ The models submodule contains implementations of various algorithms that can be
   *  Convolutional Sequence Embedding Recommendation (CASER)
   *  Deep Knowledge-Aware Network (DKN)
   *  Extreme Deep Factorization Machine (xDeepFM)
-  *  GRU4Rec
+  *  GRU
   *  LightGCN
   *  Next Item Recommendation (NextItNet)
   *  Short-term and Long-term Preference Integrated Recommender (SLi-Rec)
diff --git a/recommenders/models/deeprec/config/gru4rec.yaml b/recommenders/models/deeprec/config/gru.yaml
similarity index 92%
rename from recommenders/models/deeprec/config/gru4rec.yaml
rename to recommenders/models/deeprec/config/gru.yaml
index ada50292d7..ef01998a39 100644
--- a/recommenders/models/deeprec/config/gru4rec.yaml
+++ b/recommenders/models/deeprec/config/gru.yaml
@@ -8,7 +8,7 @@ data:
 #model
 model:
     method : classification # classification or regression
-    model_type : GRU4Rec
+    model_type : GRU
     layer_sizes : [100, 64]  # layers' size of DNN. In this example, DNN has two layers, and each layer has 100 hidden nodes.
     activation : [relu, relu] # activation function for DNN
     user_dropout: True 
@@ -48,8 +48,8 @@ info:
     save_epoch : 1    # if save_model is set to True, save the model every save_epoch.
     metrics : ['auc','logloss']  # metrics for evaluation.
     pairwise_metrics : ['mean_mrr', 'ndcg@2;4;6', "group_auc"]  # pairwise metrics for evaluation, available when pairwise comparisons are needed
-    MODEL_DIR : ./tests/resources/deeprec/gru4rec/model/gru4rec_model/  # directory of saved models.
-    SUMMARIES_DIR : ./tests/resources/deeprec/gru4rec/summary/gru4rec_summary/  # directory of saved summaries.
+    MODEL_DIR : ./tests/resources/deeprec/gru/model/gru_model/  # directory of saved models.
+    SUMMARIES_DIR : ./tests/resources/deeprec/gru/summary/gru_summary/  # directory of saved summaries.
     write_tfevents : True  # whether to save summaries.
 
     
diff --git a/recommenders/models/deeprec/config/sum.yaml b/recommenders/models/deeprec/config/sum.yaml
index 62b9fa79bd..353298e6d3 100644
--- a/recommenders/models/deeprec/config/sum.yaml
+++ b/recommenders/models/deeprec/config/sum.yaml
@@ -51,8 +51,8 @@ info:
     save_epoch : 1    # if save_model is set to True, save the model every save_epoch.
     metrics : ['auc','logloss']  # metrics for evaluation.
     pairwise_metrics : ['mean_mrr', 'ndcg@2;4;6', "group_auc"]  # pairwise metrics for evaluation, available when pairwise comparisons are needed
-    MODEL_DIR : ./tests/resources/deeprec/gru4rec/model/gru4rec_model/  # directory of saved models.
-    SUMMARIES_DIR : ./tests/resources/deeprec/gru4rec/summary/gru4rec_summary/  # directory of saved summaries.
+    MODEL_DIR : ./tests/resources/deeprec/gru/model/gru_model/  # directory of saved models.
+    SUMMARIES_DIR : ./tests/resources/deeprec/gru/summary/gru_summary/  # directory of saved summaries.
     write_tfevents : True  # whether to save summaries.
 
     
diff --git a/recommenders/models/deeprec/deeprec_utils.py b/recommenders/models/deeprec/deeprec_utils.py
index 264514f944..cc9fce84b7 100644
--- a/recommenders/models/deeprec/deeprec_utils.py
+++ b/recommenders/models/deeprec/deeprec_utils.py
@@ -184,7 +184,7 @@ def check_nn_config(f_config):
             "data_format",
             "dropout",
         ]
-    if f_config["model_type"] in ["gru4rec", "GRU4REC", "GRU4Rec"]:
+    if f_config["model_type"] in ["gru", "GRU"]:
         required_parameters = [
             "item_embedding_dim",
             "cate_embedding_dim",
diff --git a/recommenders/models/deeprec/models/sequential/gru4rec.py b/recommenders/models/deeprec/models/sequential/gru.py
similarity index 82%
rename from recommenders/models/deeprec/models/sequential/gru4rec.py
rename to recommenders/models/deeprec/models/sequential/gru.py
index 591c68fafa..281e7f97b7 100644
--- a/recommenders/models/deeprec/models/sequential/gru4rec.py
+++ b/recommenders/models/deeprec/models/sequential/gru.py
@@ -8,25 +8,27 @@
 )
 from tensorflow.compat.v1.nn import dynamic_rnn
 
-__all__ = ["GRU4RecModel"]
+__all__ = ["GRUModel"]
 
 
-class GRU4RecModel(SequentialBaseModel):
-    """GRU4Rec Model
+class GRUModel(SequentialBaseModel):
+    """GRU Model
 
     :Citation:
 
-        B. Hidasi, A. Karatzoglou, L. Baltrunas, D. Tikk, "Session-based Recommendations
-        with Recurrent Neural Networks", ICLR (Poster), 2016.
+        Kyunghyun Cho, Bart van Merrienboer, Caglar Gulcehre, Dzmitry Bahdanau,
+        Fethi Bougares, Holger Schwenk, and Yoshua Bengio. Learning Phrase
+        Representations using RNN Encoder-Decoder for Statistical Machine Translation.
+        arXiv preprint arXiv:1406.1078. 2014.
     """
 
     def _build_seq_graph(self):
-        """The main function to create GRU4Rec model.
+        """The main function to create the GRU model.
 
         Returns:
-            object:the output of GRU4Rec section.
+            object: The output of the GRU section.
         """
-        with tf.compat.v1.variable_scope("gru4rec"):
+        with tf.compat.v1.variable_scope("gru"):
             # final_state = self._build_lstm()
             final_state = self._build_gru()
             model_output = tf.concat([final_state, self.target_item_embedding], 1)