This is a problem that gets solved again and again, yet all the available solutions are bad.
In my experience what happens is:
1. you start with a "ship logs from X to Y" product
2. you add more sources and more destinations, making it more of a central router. you add config options for specifying your sources and dests.
3. since the way you checkpoint or consume or pull or push certain sources or dests doesn't generalize, you end up buffering internally to present a unified "I have received / sent this message successfully" concept to your inputs and outputs.
4. you want to do some basic transforms on the logs as you go. you implement "filters" or "transforms" or "steps" and make them configurable. your config now describes a graph of sources -> filters -> dests
5. your filters need to be more flexible. you add generic filters whose behaviour is mostly controlled by their config options. your configs grow more complicated as you use multiple layers of differently-configured filters
6. you have a bad Turing-complete programming language embedded in your config file. getting simple tasks done is possible, getting complex tasks done becomes an awful, inefficient and unreadable mess.
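To make steps 4 through 6 concrete, here is a rough sketch of what the config-driven approach tends to look like from the inside. Everything in it is made up for illustration (the `CONFIG` format, the filter names `match`, `rename` and `set`, and the `apply_filters` function); it is not taken from any real log shipper.

```python
# Hypothetical config: a list of generic filters, each steered by its options.
# In a real tool this would live in a YAML/TOML file rather than in code.
CONFIG = [
    {"filter": "match", "field": "level", "equals": "error"},
    {"filter": "rename", "from": "msg", "to": "message"},
    {"filter": "set", "field": "env", "value": "prod"},
]


def apply_filters(record, config):
    """Run one record through the configured steps; return None if it is dropped."""
    record = dict(record)  # work on a copy so the caller's dict is untouched
    for step in config:
        kind = step["filter"]
        if kind == "match":
            # Drop the record unless the field has the expected value.
            if record.get(step["field"]) != step["equals"]:
                return None
        elif kind == "rename":
            if step["from"] in record:
                record[step["to"]] = record.pop(step["from"])
        elif kind == "set":
            record[step["field"]] = step["value"]
        else:
            raise ValueError(f"unknown filter: {kind}")
    return record
```

The dispatch loop is already a small interpreter. Each new requirement (negated matches, conditions on two fields, loops over nested values) adds another branch and another set of options, which is how the config file drifts toward being a programming language.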
My solution to this cycle has been to just write simple hard-coded applications that can only do the job I need them to do. If they need a different configuration later I edit the source. I'm writing my transforms in a real programming language and I avoid the additional complexity of abstractions. Of course, that comes with its own costs but I consider it well worth it.
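As a minimal sketch of what I mean by a hard-coded application, assuming the logs arrive as JSON lines on stdin and should leave as JSON lines on stdout: the field names (`level`, `msg`, `message`, `env`) and the policy of skipping malformed lines are assumptions for the example, not a recommendation.

```python
import json
import sys


def transform(record):
    """Hard-coded transform: keep only errors, rename one field, tag the env."""
    if record.get("level") != "error":
        return None
    if "msg" in record:
        record["message"] = record.pop("msg")
    record["env"] = "prod"  # assumed constant; edit the source to change it
    return record


def run(lines, out):
    """Read JSON lines, transform each one, write JSON lines. Returns the count written."""
    shipped = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # assumption: malformed lines are silently skipped
        if not isinstance(record, dict):
            continue  # assumption: only JSON objects count as log records
        result = transform(record)
        if result is None:
            continue
        out.write(json.dumps(result) + "\n")
        shipped += 1
    return shipped


if __name__ == "__main__":
    run(sys.stdin, sys.stdout)
```

There is no config file and no filter graph. Changing the behaviour means editing `transform` and redeploying, which is the cost I mentioned, but the whole pipeline fits on one screen and can be tested like any other function.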