Look at the 1998 video (http://youtube.com/watch?v=dmmxVA5xhuo) around 4:30. You could perhaps argue that's 'scaling', not 'zooming', but that's a very pedantic position, and not one I personally would entertain.
Devil's advocate: It is nigh impossible to judge from the video, but to me it shows less of a direct link between hand movement and what happens on the display. It is not as if the hand really grabs the screen. I also got the impression that he had to issue a command, chosen from the top of the screen (outside the video image), to switch from scaling to rotation. Another argument could be that this user does not actually seem to touch the screen (at least, that is my impression). All of this is likely due to technological limitations, but if I had to defend this as "not prior art" in court, I would try to work from those observations. There is little else to argue about, as a lot of the interactions are there, in a technologically primitive way.