
Understanding tensordot

  •  54
  •  floflo29  ·  8 years ago

    After I learned how to use einsum, I am now trying to understand how np.tensordot works.

    However, I am a bit lost, especially regarding the various possibilities for the axes argument.

    To understand it, since I have never practiced tensor calculus, I use the following example:

    A = np.random.randint(2, size=(2, 3, 5))
    B = np.random.randint(2, size=(3, 2, 4))
    

    In this case, what are the different possible np.tensordot computations, and how would you compute them by hand?

    3 Answers  |  last activity 7 years ago
        1
  •  64
  •   Divakar    8 years ago

    The idea with tensordot is pretty simple: we input the arrays and the respective axes along which the sum-reductions are intended. The axes that take part in the sum-reduction are removed in the output, and all of the remaining axes from the input arrays are spread out as different axes in the output, keeping the order in which the input arrays are fed.
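
    To make that rule concrete before the examples, here is a small shape-rule sketch (mine, not part of the original answer): the output shape is just the non-summed axes of the first input followed by the non-summed axes of the second.

    import numpy as np

    # Sketch of the rule: drop the summed axes, concatenate what remains
    # in the order the inputs are fed.
    def tensordot_shape(a_shape, b_shape, a_axes, b_axes):
        kept_a = [d for i, d in enumerate(a_shape) if i not in a_axes]
        kept_b = [d for i, d in enumerate(b_shape) if i not in b_axes]
        return tuple(kept_a + kept_b)

    A = np.random.randint(2, size=(2, 6, 5))
    B = np.random.randint(2, size=(3, 2, 4))
    assert tensordot_shape(A.shape, B.shape, (0,), (1,)) \
           == np.tensordot(A, B, axes=((0,), (1,))).shape   # (6, 5, 3, 4)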

    Let's take a look at a few sample cases with one and two axes of sum-reduction, and also swap the input places, to see how the order is kept in the output.

    I. One axis of sum-reduction

    Inputs:

    In [7]: A = np.random.randint(2, size=(2, 6, 5))
       ...: B = np.random.randint(2, size=(3, 2, 4))
       ...: 
    

    Case #1 :

    In [9]: np.tensordot(A, B, axes=((0),(1))).shape
    Out[9]: (6, 5, 3, 4)
    
    A : (2, 6, 5) -> reduction of axis=0
    B : (3, 2, 4) -> reduction of axis=1
    
    Output : `(2, 6, 5)`, `(3, 2, 4)` ===(2 gone)==> `(6,5)` + `(3,4)` => `(6,5,3,4)`
    
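
    Answering the "by hand" part of the question for this case (a sketch I added, using the A and B above): every output element is a single sum over the reduced axis of length 2.

    # out[j, k, l, m] = sum over i of A[i, j, k] * B[l, i, m]
    out = np.empty((6, 5, 3, 4), dtype=A.dtype)
    for j in range(6):
        for k in range(5):
            for l in range(3):
                for m in range(4):
                    out[j, k, l, m] = sum(A[i, j, k] * B[l, i, m] for i in range(2))
    assert (out == np.tensordot(A, B, axes=((0,), (1,)))).all()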

    Case #2 (same as case #1 but the inputs are fed swapped):

    In [8]: np.tensordot(B, A, axes=((1),(0))).shape
    Out[8]: (3, 4, 6, 5)
    
    B : (3, 2, 4) -> reduction of axis=1
    A : (2, 6, 5) -> reduction of axis=0
    
    Output : `(3, 2, 4)`, `(2, 6, 5)` ===(2 gone)==> `(3,4)` + `(6,5)` => `(3,4,6,5)`.
    

    II. Two axes of sum-reduction

    Inputs:

    In [11]: A = np.random.randint(2, size=(2, 3, 5))
        ...: B = np.random.randint(2, size=(3, 2, 4))
        ...: 
    

    Case #1 :

    In [12]: np.tensordot(A, B, axes=((0,1),(1,0))).shape
    Out[12]: (5, 4)
    
    A : (2, 3, 5) -> reduction of axis=(0,1)
    B : (3, 2, 4) -> reduction of axis=(1,0)
    
    Output : `(2, 3, 5)`, `(3, 2, 4)` ===(2,3 gone)==> `(5)` + `(4)` => `(5,4)`
    
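
    By hand (again my own sketch, with the A and B of this section), the only change is that the sum now runs over both reduced axes:

    # out[k, m] = sum over i (size 2) and j (size 3) of A[i, j, k] * B[j, i, m]
    out = np.empty((5, 4), dtype=A.dtype)
    for k in range(5):
        for m in range(4):
            out[k, m] = sum(A[i, j, k] * B[j, i, m]
                            for i in range(2) for j in range(3))
    assert (out == np.tensordot(A, B, axes=((0, 1), (1, 0)))).all()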

    Case #2 :

    In [14]: np.tensordot(B, A, axes=((1,0),(0,1))).shape
    Out[14]: (4, 5)
    
    B : (3, 2, 4) -> reduction of axis=(1,0)
    A : (2, 3, 5) -> reduction of axis=(0,1)
    
    Output : `(3, 2, 4)`, `(2, 3, 5)` ===(2,3 gone)==> `(4)` + `(5)` => `(4,5)`
    

    We can extend this to as many axes as we want.
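
    For instance (an added illustration with made-up shapes), three axes of sum-reduction at once:

    A = np.random.randint(2, size=(2, 3, 5, 4))
    B = np.random.randint(2, size=(3, 2, 4, 6))
    np.tensordot(A, B, axes=((0, 1, 3), (1, 0, 2))).shape   # (5, 6)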

        2
  •  8
  •   hpaulj    8 years ago

    tensordot swaps axes and reshapes the inputs so it can apply np.dot to two 2d arrays. It then swaps and reshapes back to the target. It may be easier to experiment than to explain. There's no special tensor math here, just extending dot to work in higher dimensions. tensor just means arrays with more than 2 dimensions. If you are already comfortable with einsum, then comparing the results to that will be simplest.
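
    A rough sketch of that swap/reshape/dot pipeline (my own, for the single-pair case [0, 1] with the question's A and B):

    import numpy as np

    A = np.random.randint(2, size=(2, 3, 5))
    B = np.random.randint(2, size=(3, 2, 4))

    # Move the summed axis of A to the end and the summed axis of B to the
    # front, flatten both to 2d, apply np.dot, then restore the kept axes.
    A2 = np.moveaxis(A, 0, -1).reshape(3 * 5, 2)
    B2 = np.moveaxis(B, 1, 0).reshape(2, 3 * 4)
    out = np.dot(A2, B2).reshape(3, 5, 3, 4)
    assert (out == np.tensordot(A, B, [0, 1])).all()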

    A sample test, summing on one pair of axes:

    In [823]: np.tensordot(A,B,[0,1]).shape
    Out[823]: (3, 5, 3, 4)
    In [824]: np.einsum('ijk,lim',A,B).shape
    Out[824]: (3, 5, 3, 4)
    In [825]: np.allclose(np.einsum('ijk,lim',A,B),np.tensordot(A,B,[0,1]))
    Out[825]: True
    

    Another, summing on two pairs of axes:

    In [826]: np.tensordot(A,B,[(0,1),(1,0)]).shape
    Out[826]: (5, 4)
    In [827]: np.einsum('ijk,jim',A,B).shape
    Out[827]: (5, 4)
    In [828]: np.allclose(np.einsum('ijk,jim',A,B),np.tensordot(A,B,[(0,1),(1,0)]))
    Out[828]: True
    

    We could also do the (1,0) pair, but given the mix of dimensions, I don't think there is any other combination.
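
    That (1,0) pair, spelled out (added for completeness):

    np.tensordot(A, B, [1, 0]).shape    # (2, 5, 2, 4)
    np.einsum('ijk,jlm', A, B).shape    # (2, 5, 2, 4)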

        3
  •  5
  •   dereks    5 years ago

    The answers above are great and helped me a lot in understanding tensordot. But they don't show the actual math behind the operations. That's why I did the equivalent operations in TF 2 for myself and decided to share them here:

    a = tf.constant([1,2.])
    b = tf.constant([2,3.])
    print(f"{tf.tensordot(a, b, 0)}\t tf.einsum('i,j', a, b)\t\t- ((the last 0 axes of a), (the first 0 axes of b))")
    print(f"{tf.tensordot(a, b, ((),()))}\t tf.einsum('i,j', a, b)\t\t- ((() axis of a), (() axis of b))")
    print(f"{tf.tensordot(b, a, 0)}\t tf.einsum('i,j->ji', a, b)\t- ((the last 0 axes of b), (the first 0 axes of a))")
    print(f"{tf.tensordot(a, b, 1)}\t\t tf.einsum('i,i', a, b)\t\t- ((the last 1 axes of a), (the first 1 axes of b))")
    print(f"{tf.tensordot(a, b, ((0,), (0,)))}\t\t tf.einsum('i,i', a, b)\t\t- ((0th axis of a), (0th axis of b))")
    print(f"{tf.tensordot(a, b, (0,0))}\t\t tf.einsum('i,i', a, b)\t\t- ((0th axis of a), (0th axis of b))")
    
    [[2. 3.]
     [4. 6.]]    tf.einsum('i,j', a, b)     - ((the last 0 axes of a), (the first 0 axes of b))
    [[2. 3.]
     [4. 6.]]    tf.einsum('i,j', a, b)     - ((() axis of a), (() axis of b))
    [[2. 4.]
     [3. 6.]]    tf.einsum('i,j->ji', a, b) - ((the last 0 axes of b), (the first 0 axes of a))
    8.0          tf.einsum('i,i', a, b)     - ((the last 1 axes of a), (the first 1 axes of b))
    8.0          tf.einsum('i,i', a, b)     - ((0th axis of a), (0th axis of b))
    8.0          tf.einsum('i,i', a, b)     - ((0th axis of a), (0th axis of b))
    
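
    The same integer-axes convention holds in NumPy (a cross-check I added): axes=N pairs the last N axes of the first argument with the first N axes of the second.

    import numpy as np
    a = np.array([1., 2.])
    b = np.array([2., 3.])
    np.tensordot(a, b, 0)   # outer product: [[2., 3.], [4., 6.]]
    np.tensordot(a, b, 1)   # inner product: 8.0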

    (2,2) shapes:

    a = tf.constant([[1,2],
                     [-2,3.]])
    
    b = tf.constant([[-2,3],
                     [0,4.]])
    print(f"{tf.tensordot(a, b, 0)}\t tf.einsum('ij,kl', a, b)\t- ((the last 0 axes of a), (the first 0 axes of b))")
    print(f"{tf.tensordot(a, b, (0,0))}\t tf.einsum('ij,ik', a, b)\t- ((0th axis of a), (0th axis of b))")
    print(f"{tf.tensordot(a, b, (0,1))}\t tf.einsum('ij,ki', a, b)\t- ((0th axis of a), (1st axis of b))")
    print(f"{tf.tensordot(a, b, 1)}\t tf.matmul(a, b)\t\t- ((the last 1 axes of a), (the first 1 axes of b))")
    print(f"{tf.tensordot(a, b, ((1,), (0,)))}\t tf.einsum('ij,jk', a, b)\t- ((1st axis of a), (0th axis of b))")
    print(f"{tf.tensordot(a, b, (1, 0))}\t tf.matmul(a, b)\t\t- ((1st axis of a), (0th axis of b))")
    print(f"{tf.tensordot(a, b, 2)}\t tf.reduce_sum(tf.multiply(a, b))\t- ((the last 2 axes of a), (the first 2 axes of b))")
    print(f"{tf.tensordot(a, b, ((0,1), (0,1)))}\t tf.einsum('ij,ij->', a, b)\t\t- ((0th axis of a, 1st axis of a), (0th axis of b, 1st axis of b))")
    [[[[-2.  3.]
       [ 0.  4.]]
      [[-4.  6.]
       [ 0.  8.]]]
    
     [[[ 4. -6.]
       [-0. -8.]]
      [[-6.  9.]
       [ 0. 12.]]]]  tf.einsum('ij,kl', a, b)   - ((the last 0 axes of a), (the first 0 axes of b))
    [[-2. -5.]
     [-4. 18.]]      tf.einsum('ij,ik', a, b)   - ((0th axis of a), (0th axis of b))
    [[-8. -8.]
     [ 5. 12.]]      tf.einsum('ij,ki', a, b)   - ((0th axis of a), (1st axis of b))
    [[-2. 11.]
     [ 4.  6.]]      tf.matmul(a, b)            - ((the last 1 axes of a), (the first 1 axes of b))
    [[-2. 11.]
     [ 4.  6.]]      tf.einsum('ij,jk', a, b)   - ((1st axis of a), (0th axis of b))
    [[-2. 11.]
     [ 4.  6.]]      tf.matmul(a, b)            - ((1st axis of a), (0th axis of b))
    16.0    tf.reduce_sum(tf.multiply(a, b))    - ((the last 2 axes of a), (the first 2 axes of b))
    16.0    tf.einsum('ij,ij->', a, b)          - ((0th axis of a, 1st axis of a), (0th axis of b, 1st axis of b))