So I would want to start from a big pretrained model like GPT-3 or this newfangled "Neo" thing, but still have it trained to respond to our own customers based on 200k email passages.
200k emails is not enough to train a model from scratch. If you check out the Google Colab notebook in the GPT-Neo repository, it shows how to fine-tune the model on your own data, which is what you want to do.
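Rough sketch of the data prep side, assuming your emails are already loaded as a list of strings. Fine-tuning scripts for GPT-style models commonly take plain text files with documents separated by the `<|endoftext|>` token that the GPT-2 / GPT-Neo tokenizer uses. The function name, file names, and split ratio here are made up for illustration, and the Colab notebook may expect a different layout, so check it before relying on this.

```python
import random
from pathlib import Path

EOT = "<|endoftext|>"  # end-of-text token in the GPT-2 / GPT-Neo vocabulary


def write_finetune_files(emails, out_dir, val_fraction=0.05, seed=0):
    """Shuffle emails and write train.txt / val.txt, each document followed by EOT.

    Hypothetical helper: the layout is an assumption, not what any
    particular fine-tuning script requires.
    """
    docs = [e.strip() for e in emails if e and e.strip()]  # drop empty passages
    random.Random(seed).shuffle(docs)  # seeded so the split is reproducible
    # Hold out at least one document for validation when there is more than one.
    n_val = max(1, int(len(docs) * val_fraction)) if len(docs) > 1 else 0
    splits = {"val.txt": docs[:n_val], "train.txt": docs[n_val:]}
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for name, items in splits.items():
        text = "".join(doc + "\n" + EOT + "\n" for doc in items)
        (out / name).write_text(text, encoding="utf-8")
    return len(splits["train.txt"]), n_val
```

Calling `write_finetune_files(emails, "data")` returns the number of training and validation documents it wrote. Strip signatures, quoted reply chains, and any personal data before this step, since the model will happily memorize them.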
I wouldn't trust any model to generate text for customers yet, not even the largest GPT-3. There are no guarantees on what it will output, and a bad response could be damaging to your business.
You're better off either:
1- Defining common "intents" that most customer queries fall into, and having a model map each incoming message to the appropriate canned response. Look at Rasa for an example of this.
2- If you insist on generating the text, having it be a recommendation to a human agent, who either chooses to send it or writes their own response.
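To make the first option concrete, here is a toy sketch of intent routing with a fallback to a human when nothing matches well. It is not how Rasa works internally; the intents, example phrasings, replies, and the 0.5 threshold are all hypothetical, and the bag-of-words cosine similarity is only a stand-in for a real trained classifier.

```python
import re
from collections import Counter
from math import sqrt

# Hypothetical intents: a few example phrasings plus the canned reply for each.
INTENTS = {
    "refund": {
        "examples": ["I want a refund", "please refund my money", "money back for my order"],
        "reply": "We're sorry to hear that. Your refund request has been logged.",
    },
    "password": {
        "examples": ["I forgot my password", "cannot log in to my account", "reset password"],
        "reply": "You can reset your password from the login page.",
    },
}


def bag_of_words(text):
    """Lowercase the text and count its word tokens."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def route(message, threshold=0.5):
    """Return (action, intent, reply); below the threshold, hand off to a human."""
    msg = bag_of_words(message)
    best_intent, best_score = None, 0.0
    for name, spec in INTENTS.items():
        # Score an intent by its closest example phrasing.
        score = max(cosine(msg, bag_of_words(ex)) for ex in spec["examples"])
        if score > best_score:
            best_intent, best_score = name, score
    if best_intent is not None and best_score >= threshold:
        return ("auto_reply", best_intent, INTENTS[best_intent]["reply"])
    return ("escalate_to_human", best_intent, None)
```

With this, `route("I forgot my password")` picks the canned password reply, while a message that matches no intent is escalated. The escalation branch is also where the second option would plug in: generate a draft there and show it to the agent instead of sending it.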
How to create a hybrid?