Table 2. Number of shared and unshared parameters in each model

                            (A) Retailer S  (B) Retailer L   (C) Grocery S   (D) Grocery L
Vocabulary size                      2,138           2,160           8,765           9,521
#Total parameters                1,702,490       1,710,960       4,253,885       4,544,945
#Shared parameters                 879,360  (same value for all four models)
  Encoder layer                    406,528  (same value for all four models)
  Decoder layer                    472,832  (same value for all four models)
#Unshared parameters               823,130         831,600       3,374,525       3,665,585
  Encoder embedding layer          273,664         276,480       1,121,920       1,218,688
  Decoder embedding layer          273,664         276,480       1,121,920       1,218,688
  Output layer                     275,802         278,640       1,130,685       1,228,209
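
The counts in Table 2 can be cross-checked with a short script. The sketch below is not taken from the source: it assumes a hidden size of 128 and an output layer that is a dense projection with a bias term, both inferred only from the fact that each embedding count equals 128 times the vocabulary size and each output-layer count equals 129 times the vocabulary size. The names HIDDEN, SHARED, VOCAB, and count_parameters are hypothetical.

```python
# Cross-check of the parameter counts in Table 2.
# Assumption (inferred from the numbers, not stated in the source):
# the embedding/hidden size is 128 and the output layer has a bias term.
HIDDEN = 128

# Shared encoder and decoder layer counts, copied from the table.
SHARED = 406_528 + 472_832  # 879,360

# Vocabulary sizes, copied from the table.
VOCAB = {
    "(A) Retailer S": 2_138,
    "(B) Retailer L": 2_160,
    "(C) Grocery S": 8_765,
    "(D) Grocery L": 9_521,
}


def count_parameters(vocab_size, hidden=HIDDEN, shared=SHARED):
    """Return the per-model parameter breakdown under the assumptions above."""
    embedding = vocab_size * hidden        # one row of size `hidden` per token
    output = vocab_size * (hidden + 1)     # weights plus one bias per token
    unshared = 2 * embedding + output      # encoder + decoder embeddings + output
    return {
        "encoder_embedding": embedding,
        "decoder_embedding": embedding,
        "output": output,
        "unshared": unshared,
        "total": shared + unshared,
    }


if __name__ == "__main__":
    for name, vocab_size in VOCAB.items():
        counts = count_parameters(vocab_size)
        print(f"{name}: unshared={counts['unshared']:,} total={counts['total']:,}")
```

Under these assumptions the unshared parameters grow linearly with vocabulary size (385 parameters per vocabulary entry), which is why the two Grocery models are much larger than the Retailer models even though the shared encoder and decoder layers are identical.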