> A fairly easy way to introduce rotation invariance in DCNNs is to perform random rotations on the inputs during training. Likewise for scale invariance. Translation invariance is already introduced by the convolution operation itself.
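As a concrete illustration of the augmentation idea in the quote, here is a minimal sketch of random-rotation augmentation using `scipy.ndimage.rotate` (the function name and `max_angle` parameter are my choices for illustration, not from any particular framework):

```python
import numpy as np
from scipy.ndimage import rotate

def augment(image, rng, max_angle=30.0):
    """Rotate an image by a random angle drawn uniformly from
    [-max_angle, +max_angle] degrees. reshape=False keeps the
    output the same shape as the input."""
    angle = rng.uniform(-max_angle, max_angle)
    return rotate(image, angle, reshape=False, mode="nearest")

rng = np.random.default_rng(0)
img = rng.random((32, 32))   # stand-in for a training image
aug = augment(img, rng)
assert aug.shape == img.shape
```

In a real training loop this would be applied on the fly to each mini-batch, so the network sees a differently rotated copy of each sample on every epoch.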
Just to be clear (and I'm sorry if I'm being pedantic), you're talking about invariance of two separate things. In the first case, you're talking about the invariance of the overall network, F(x), i.e. if R is a rotation operator, F(Rx) = F(x). The network's prediction does not change for a suitable set of R's.
On the other hand, convolution is a shift-invariant operator in the signal-processing sense (the deep-learning literature often calls this shift equivariance), meaning it acts the same regardless of location: if Cx is the output of a convolutional layer and Sx is a shifted signal, then C(Sx) = S(Cx). Shifting the input shifts the output by the same amount; it does not leave the output unchanged, so this is not shift invariance of the output.
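The identity C(Sx) = S(Cx) is easy to verify numerically. A small sketch, using circular convolution (via the FFT) so that periodic boundaries make the identity exact rather than only approximate at the edges:

```python
import numpy as np

def circ_conv(x, k):
    # Circular convolution via the FFT; periodic boundaries mean
    # there are no edge effects to spoil the shift identity.
    return np.real(np.fft.ifft(np.fft.fft(x) * np.fft.fft(k, len(x))))

rng = np.random.default_rng(0)
x = rng.random(64)   # 1-D input signal
k = rng.random(5)    # convolution kernel
s = 7                # shift amount

shift_then_conv = circ_conv(np.roll(x, s), k)   # C(Sx)
conv_then_shift = np.roll(circ_conv(x, k), s)   # S(Cx)
assert np.allclose(shift_then_conv, conv_then_shift)
```

With ordinary zero-padded convolution the equality holds only away from the borders, which is one reason real CNNs are not perfectly shift equivariant either.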
The shift invariance of the operator means a convolutional layer will detect features that resonate well with its kernel irrespective of their location in the signal. However, this does not automatically guarantee that the network's prediction will be shift invariant, i.e. F(Sx) = F(x).
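To make the distinction concrete, here is a toy "network" (my own construction, not any standard architecture): a conv layer followed by a fully connected read-out. The dense head is position dependent, so the prediction changes under a shift even though the conv layer is equivariant; swapping the dense head for global average pooling restores invariance (exactly so here, because the shifts are circular):

```python
import numpy as np

def circ_conv(x, k):
    # Circular convolution via the FFT (periodic boundaries).
    return np.real(np.fft.ifft(np.fft.fft(x) * np.fft.fft(k, len(x))))

rng = np.random.default_rng(1)
k = rng.random(5)    # conv kernel
w = rng.random(64)   # dense read-out weights: position dependent

def predict_dense(x):
    # conv layer -> flatten -> fully connected layer, no pooling
    return float(circ_conv(x, k) @ w)

def predict_pooled(x):
    # conv layer -> global average pooling
    return float(circ_conv(x, k).mean())

x = rng.random(64)
sx = np.roll(x, 7)   # shifted copy of the same signal

# Equivariance of the conv layer does not make the dense head invariant:
assert not np.isclose(predict_dense(x), predict_dense(sx))
# Global pooling discards location, so the prediction is invariant:
assert np.isclose(predict_pooled(x), predict_pooled(sx))
```

This is why pooling (and, in older architectures, strided downsampling) is usually credited with whatever approximate shift invariance a CNN's predictions actually have.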