Misplaced Pages

Tensor reshaping

Article snapshot taken from Wikipedia with creative commons attribution-sharealike license. Give it a read and then ask your questions in the chat. We can research this topic together.
This article needs additional citations for verification. Please help improve this article by adding citations to reliable sources. Unsourced material may be challenged and removed.
Find sources: "Tensor reshaping" – news · newspapers · books · scholar · JSTOR (April 2023) (Learn how and when to remove this message)

In multilinear algebra, a reshaping of tensors is any bijection between the set of indices of an order- M {\displaystyle M} tensor and the set of indices of an order- L {\displaystyle L} tensor, where L < M {\displaystyle L<M} . The use of indices presupposes tensors in coordinate representation with respect to a basis. The coordinate representation of a tensor can be regarded as a multi-dimensional array, and a bijection from one set of indices to another therefore amounts to a rearrangement of the array elements into an array of a different shape. Such a rearrangement constitutes a particular kind of linear map between the vector space of order- M {\displaystyle M} tensors and the vector space of order- L {\displaystyle L} tensors.

Definition

Given a positive integer M {\displaystyle M} , the notation [ M ] {\displaystyle } refers to the set { 1 , , M } {\displaystyle \{1,\dots ,M\}} of the first M positive integers.

For each integer m {\displaystyle m} where 1 m M {\displaystyle 1\leq m\leq M} for a positive integer M {\displaystyle M} , let V m {\displaystyle V_{m}} denote an I m {\displaystyle I_{m}} -dimensional vector space over a field F {\displaystyle F} . Then there are vector space isomorphisms (linear maps)

V 1 V M F I 1 F I M F I π 1 F I π M F I π 1 I π 2 F I π 3 F I π M F I π 1 I π 3 F I π 2 F I π 4 F I π M F I 1 I 2 I M , {\displaystyle {\begin{aligned}V_{1}\otimes \cdots \otimes V_{M}&\simeq F^{I_{1}}\otimes \cdots \otimes F^{I_{M}}\\&\simeq F^{I_{\pi _{1}}}\otimes \cdots \otimes F^{I_{\pi _{M}}}\\&\simeq F^{I_{\pi _{1}}I_{\pi _{2}}}\otimes F^{I_{\pi _{3}}}\otimes \cdots \otimes F^{I_{\pi _{M}}}\\&\simeq F^{I_{\pi _{1}}I_{\pi _{3}}}\otimes F^{I_{\pi _{2}}}\otimes F^{I_{\pi _{4}}}\otimes \cdots \otimes F^{I_{\pi _{M}}}\\&\,\,\,\vdots \\&\simeq F^{I_{1}I_{2}\ldots I_{M}},\end{aligned}}}

where π S M {\displaystyle \pi \in {\mathfrak {S}}_{M}} is any permutation and S M {\displaystyle {\mathfrak {S}}_{M}} is the symmetric group on M {\displaystyle M} elements. Via these (and other) vector space isomorphisms, a tensor can be interpreted in several ways as an order- L {\displaystyle L} tensor where L M {\displaystyle L\leq M} .

Coordinate representation

The first vector space isomorphism on the list above, V 1 V M F I 1 F I M {\displaystyle V_{1}\otimes \cdots \otimes V_{M}\simeq F^{I_{1}}\otimes \cdots \otimes F^{I_{M}}} , gives the coordinate representation of an abstract tensor. Assume that each of the M {\displaystyle M} vector spaces V m {\displaystyle V_{m}} has a basis { v 1 m , v 2 m , , v I m m } {\displaystyle \{v_{1}^{m},v_{2}^{m},\ldots ,v_{I_{m}}^{m}\}} . The expression of a tensor with respect to this basis has the form A = i 1 = 1 I 1 i M = 1 I M a i 1 , i 2 , , i M v i 1 1 v i 2 2 v i M M , {\displaystyle {\mathcal {A}}=\sum _{i_{1}=1}^{I_{1}}\ldots \sum _{i_{M}=1}^{I_{M}}a_{i_{1},i_{2},\ldots ,i_{M}}v_{i_{1}}^{1}\otimes v_{i_{2}}^{2}\otimes \cdots \otimes v_{i_{M}}^{M},} where the coefficients a i 1 , i 2 , , i M {\displaystyle a_{i_{1},i_{2},\ldots ,i_{M}}} are elements of F {\displaystyle F} . The coordinate representation of A {\displaystyle {\mathcal {A}}} is i 1 = 1 I 1 i M = 1 I M a i 1 , i 2 , , i M e i 1 1 e i 2 2 e i M M , {\displaystyle \sum _{i_{1}=1}^{I_{1}}\ldots \sum _{i_{M}=1}^{I_{M}}a_{i_{1},i_{2},\ldots ,i_{M}}\mathbf {e} _{i_{1}}^{1}\otimes \mathbf {e} _{i_{2}}^{2}\otimes \cdots \otimes \mathbf {e} _{i_{M}}^{M},} where e i m {\displaystyle \mathbf {e} _{i}^{m}} is the i th {\displaystyle i^{\text{th}}} standard basis vector of F I m {\displaystyle F^{I_{m}}} . This can be regarded as a M-way array whose elements are the coefficients a i 1 , i 2 , , i M {\displaystyle a_{i_{1},i_{2},\ldots ,i_{M}}} .

General flattenings

For any permutation π S M {\displaystyle \pi \in {\mathfrak {S}}_{M}} there is a canonical isomorphism between the two tensor products of vector spaces V 1 V 2 V M {\displaystyle V_{1}\otimes V_{2}\otimes \cdots \otimes V_{M}} and V π ( 1 ) V π ( 2 ) V π ( M ) {\displaystyle V_{\pi (1)}\otimes V_{\pi (2)}\otimes \cdots \otimes V_{\pi (M)}} . Parentheses are usually omitted from such products due to the natural isomorphism between V i ( V j V k ) {\displaystyle V_{i}\otimes (V_{j}\otimes V_{k})} and ( V i V j ) V k {\displaystyle (V_{i}\otimes V_{j})\otimes V_{k}} , but may, of course, be reintroduced to emphasize a particular grouping of factors. In the grouping, ( V π ( 1 ) V π ( r 1 ) ) ( V π ( r 1 + 1 ) V π ( r 2 ) ) ( V π ( r L 1 + 1 ) V π ( r L ) ) , {\displaystyle (V_{\pi (1)}\otimes \cdots \otimes V_{\pi (r_{1})})\otimes (V_{\pi (r_{1}+1)}\otimes \cdots \otimes V_{\pi (r_{2})})\otimes \cdots \otimes (V_{\pi (r_{L-1}+1)}\otimes \cdots \otimes V_{\pi (r_{L})}),} there are L {\displaystyle L} groups with r l r l 1 {\displaystyle r_{l}-r_{l-1}} factors in the l th {\displaystyle l^{\text{th}}} group (where r 0 = 0 {\displaystyle r_{0}=0} and r L = M {\displaystyle r_{L}=M} ).

Letting S l = ( π ( r l 1 + 1 ) , π ( r l 1 + 2 ) , , π ( r l ) ) {\displaystyle S_{l}=(\pi (r_{l-1}+1),\pi (r_{l-1}+2),\ldots ,\pi (r_{l}))} for each l {\displaystyle l} satisfying 1 l L {\displaystyle 1\leq l\leq L} , an ( S 1 , S 2 , , S L ) {\displaystyle (S_{1},S_{2},\ldots ,S_{L})} -flattening of a tensor A {\displaystyle {\mathcal {A}}} , denoted A ( S 1 , S 2 , , S L ) {\displaystyle {\mathcal {A}}_{(S_{1},S_{2},\ldots ,S_{L})}} , is obtained by applying the two processes above within each of the L {\displaystyle L} groups of factors. That is, the coordinate representation of the l th {\displaystyle l^{\text{th}}} group of factors is obtained using the isomorphism ( V π ( r l 1 + 1 ) V π ( r l 1 + 2 ) V π ( r l ) ) ( F I π ( r l 1 + 1 ) F I π ( r l 1 + 2 ) F I π ( r l ) ) {\displaystyle (V_{\pi (r_{l-1}+1)}\otimes V_{\pi (r_{l-1}+2)}\otimes \cdots \otimes V_{\pi (r_{l})})\simeq (F^{I_{\pi (r_{l-1}+1)}}\otimes F^{I_{\pi (r_{l-1}+2)}}\otimes \cdots \otimes F^{I_{\pi (r_{l})}})} , which requires specifying bases for all of the vector spaces V k {\displaystyle V_{k}} . The result is then vectorized using a bijection μ l : [ I π ( r l 1 + 1 ) ] × [ I π ( r l 1 + 2 ) ] × × [ I π ( r l ) ] [ I S l ] {\displaystyle \mu _{l}:\times \times \cdots \times \to } to obtain an element of F I S l {\displaystyle F^{I_{S_{l}}}} , where I S l := i = r l 1 + 1 r l I π ( i ) {\textstyle I_{S_{l}}:=\prod _{i=r_{l-1}+1}^{r_{l}}I_{\pi (i)}} , the product of the dimensions of the vector spaces in the l th {\displaystyle l^{\text{th}}} group of factors. The result of applying these isomorphisms within each group of factors is an element of F I S 1 F I S L {\displaystyle F^{I_{S_{1}}}\otimes \cdots \otimes F^{I_{S_{L}}}} , which is a tensor of order L {\displaystyle L} .

Vectorization

By means of a bijective map μ : [ I 1 ] × × [ I M ] [ I 1 I M ] {\displaystyle \mu :\times \cdots \times \to } , a vector space isomorphism between F I 1 F I M {\displaystyle F^{I_{1}}\otimes \cdots \otimes F^{I_{M}}} and F I 1 I M {\displaystyle F^{I_{1}\cdots I_{M}}} is constructed via the mapping e i 1 1 e i m m e i M M e μ ( i 1 , i 2 , , i M ) , {\displaystyle \mathbf {e} _{i_{1}}^{1}\otimes \cdots \mathbf {e} _{i_{m}}^{m}\otimes \cdots \otimes \mathbf {e} _{i_{M}}^{M}\mapsto \mathbf {e} _{\mu (i_{1},i_{2},\ldots ,i_{M})},} where for every natural number i {\displaystyle i} such that 1 i I 1 I M {\displaystyle 1\leq i\leq I_{1}\cdots I_{M}} , the vector e i {\displaystyle \mathbf {e} _{i}} denotes the ith standard basis vector of F i 1 i M {\displaystyle F^{i_{1}\cdots i_{M}}} . In such a reshaping, the tensor is simply interpreted as a vector in F I 1 I M {\displaystyle F^{I_{1}\cdots I_{M}}} . This is known as vectorization, and is analogous to vectorization of matrices. A standard choice of bijection μ {\displaystyle \mu } is such that

vec ( A ) = [ a 1 , 1 , , 1 a 2 , 1 , , 1 a n 1 , 1 , , 1 a 1 , 2 , 1 , , 1 a I 1 , I 2 , , I M ] T , {\displaystyle \operatorname {vec} ({\mathcal {A}})={\begin{bmatrix}a_{1,1,\ldots ,1}&a_{2,1,\ldots ,1}&\cdots &a_{n_{1},1,\ldots ,1}&a_{1,2,1,\ldots ,1}&\cdots &a_{I_{1},I_{2},\ldots ,I_{M}}\end{bmatrix}}^{T},}

which is consistent with the way in which the colon operator in Matlab and GNU Octave reshapes a higher-order tensor into a vector. In general, the vectorization of A {\displaystyle {\mathcal {A}}} is the vector [ a μ 1 ( i ) ] i = 1 I 1 I M {\displaystyle _{i=1}^{I_{1}\cdots I_{M}}} .

The vectorization of A {\displaystyle {\mathcal {A}}} denoted with v e c ( A ) {\displaystyle vec({\mathcal {A}})} or A [ : ] {\displaystyle {\mathcal {A}}_{}} is an [ S 1 , S 2 ] {\displaystyle } -reshaping where S 1 = ( 1 , 2 , , M ) {\displaystyle S_{1}=(1,2,\ldots ,M)} and S 2 = {\displaystyle S_{2}=\emptyset } .

Mode-m Flattening / Mode-m Matrixization

Let A F I 1 F I 2 F I M {\displaystyle {\mathcal {A}}\in F^{I_{1}}\otimes F^{I_{2}}\otimes \cdots \otimes F^{I_{M}}} be the coordinate representation of an abstract tensor with respect to a basis. Mode-m matrixizing (a.k.a. flattening) of A {\displaystyle {\mathcal {A}}} is an [ S 1 , S 2 ] {\displaystyle } -reshaping in which S 1 = ( m ) {\displaystyle S_{1}=(m)} and S 2 = ( 1 , 2 , , m 1 , m + 1 , , M ) {\displaystyle S_{2}=(1,2,\ldots ,m-1,m+1,\ldots ,M)} . Usually, a standard matrixizing is denoted by

A [ m ] = A [ S 1 , S 2 ] {\displaystyle {\mathbf {A} }_{}={\mathcal {A}}_{}}

This reshaping is sometimes called matrixizing, matricizing, flattening or unfolding in the literature. A standard choice for the bijections μ 1 ,   μ 2 {\displaystyle \mu _{1},\ \mu _{2}} is the one that is consistent with the reshape function in Matlab and GNU Octave, namely

A [ m ] := [ a 1 , 1 , , 1 , 1 , 1 , , 1 a 2 , 1 , , 1 , 1 , 1 , , 1 a I 1 , I 2 , , I m 1 , 1 , I m + 1 , , I M a 1 , 1 , , 1 , 2 , 1 , , 1 a 2 , 1 , , 1 , 2 , 1 , , 1 a I 1 , I 2 , , I m 1 , 2 , I m + 1 , , I M a 1 , 1 , , 1 , I m , 1 , , 1 a 2 , 1 , , 1 , I m , 1 , , 1 a I 1 , I 2 , , I m 1 , I m , I m + 1 , , I M ] {\displaystyle {\mathbf {A} }_{}:={\begin{bmatrix}a_{1,1,\ldots ,1,1,1,\ldots ,1}&a_{2,1,\ldots ,1,1,1,\ldots ,1}&\cdots &a_{I_{1},I_{2},\ldots ,I_{m-1},1,I_{m+1},\ldots ,I_{M}}\\a_{1,1,\ldots ,1,2,1,\ldots ,1}&a_{2,1,\ldots ,1,2,1,\ldots ,1}&\cdots &a_{I_{1},I_{2},\ldots ,I_{m-1},2,I_{m+1},\ldots ,I_{M}}\\\vdots &\vdots &&\vdots \\a_{1,1,\ldots ,1,I_{m},1,\ldots ,1}&a_{2,1,\ldots ,1,I_{m},1,\ldots ,1}&\cdots &a_{I_{1},I_{2},\ldots ,I_{m-1},I_{m},I_{m+1},\ldots ,I_{M}}\end{bmatrix}}}

Definition Mode-m Matrixizing: [ A [ m ] ] j k = a i 1 i m i M ,  where  j = i m  and  k = 1 + n = 0 n m M ( i n 1 ) l = 0 l m n 1 I l . {\displaystyle }]_{jk}=a_{i_{1}\dots i_{m}\dots i_{M}},\;\;{\text{ where }}j=i_{m}{\text{ and }}k=1+\sum _{n=0 \atop n\neq m}^{M}(i_{n}-1)\prod _{l=0 \atop l\neq m}^{n-1}I_{l}.} The mode-m matrixizing of a tensor A F I 1 × . . . I M , {\displaystyle {\mathcal {A}}\in F^{I_{1}\times ...I_{M}},} is defined as the matrix A [ m ] F I m × ( I 1 I m 1 I m + 1 I M ) {\displaystyle {\mathbf {A} }_{}\in F^{I_{m}\times (I_{1}\dots I_{m-1}I_{m+1}\dots I_{M})}} . As the parenthetical ordering indicates, the mode-m column vectors are arranged by sweeping all the other mode indices through their ranges, with smaller mode indexes varying more rapidly than larger ones; thus

References

  1. Vasilescu, M. Alex O. (2009), "Multilinear (Tensor) Algebraic Framework for Computer Graphics, Computer Vision and Machine Learning" (PDF), University of Toronto, p. 21
Categories: