
Search results

Tag query: multiHeadAttention, 2 records found
  • Xiaohei's Algorithm Growth Diary 23: selfAttention and multiHeadAttention

    SelfAttention operation, from the perspective of a single token: $q_i = h_i W_Q$, $k_j = h_j W_K$, $v_j = h_j W_V$; $e_{ij} = q_i k_j^T$; $\alpha_i = \mathrm{Softmax}([e_{i,1}, ..., e_{i,T}])$ …

    2022/1/2 12:37:34
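
The per-token formulas in the teaser can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the article's own code: the helper names (`vec_mat`, `attention_weights`, `attend`), the toy hidden states `H`, and the identity projection matrices are all made up for the example. The teaser is cut off after the softmax, so the final weighted sum over the $v_j$ is an assumption based on the standard attention definition, and no scaling of $e_{ij}$ is applied because the excerpt does not show one.

```python
import math


def vec_mat(v, m):
    """Row vector (length n) times matrix (n x d) -> row vector (length d)."""
    return [sum(v[r] * m[r][c] for r in range(len(v))) for c in range(len(m[0]))]


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    top = max(xs)
    exps = [math.exp(x - top) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(h, i, w_q, w_k):
    """alpha_i = Softmax([e_i1, ..., e_iT]) with e_ij = q_i k_j^T."""
    q_i = vec_mat(h[i], w_q)              # q_i = h_i W_Q
    scores = []
    for h_j in h:
        k_j = vec_mat(h_j, w_k)           # k_j = h_j W_K
        scores.append(sum(a * b for a, b in zip(q_i, k_j)))
    return softmax(scores)


def attend(h, i, w_q, w_k, w_v):
    """Assumed final step (not in the excerpt): sum_j alpha_ij * v_j."""
    alpha = attention_weights(h, i, w_q, w_k)
    values = [vec_mat(h_j, w_v) for h_j in h]   # v_j = h_j W_V
    dim = len(values[0])
    return [sum(alpha[j] * values[j][c] for j in range(len(h))) for c in range(dim)]


# Toy inputs (hypothetical): T = 3 tokens, hidden size 2, identity projections.
H = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
IDENTITY = [[1.0, 0.0], [0.0, 1.0]]

alpha_0 = attention_weights(H, 0, IDENTITY, IDENTITY)
out_0 = attend(H, 0, IDENTITY, IDENTITY, IDENTITY)
print(alpha_0, out_0)
```

With identity projections, token 0 scores highest against itself and token 2, since both share its first component, so those two weights come out equal and larger than the weight on token 1.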