Are there out-of-the-box software tools to do this?
I've thought about something like this before, but it's non-trivial to train an image recognition system to do this, let alone work out the commands to mimic it through hardware. It's quite an engineering feat in itself.