I'd like to note that this only performs strongly on news; on Yahoo Questions it is not the top performer.
It's not far-fetched to think that news articles are written in a similar way, sometimes even partly copied, and therefore have a lot of words in common.
Yahoo Questions is a forum, and I'd expect a greater variation of words there, even though the words themselves have a semantic similarity.
That is, gzip is strong when many words overlap (the size increase when gzipped together is smaller), but when the similarity is semantic, DNNs win every day.
The results are interesting, but not as interesting as they sound, IMO.
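The "size increase when gzipped" intuition can be sketched with the normalized compression distance; this is a minimal illustration of the general idea (the sentences are made up for the example), not the paper's exact code:

```python
import gzip

def ncd(x: str, y: str) -> float:
    """Normalized compression distance: small when x and y share substrings."""
    cx = len(gzip.compress(x.encode()))
    cy = len(gzip.compress(y.encode()))
    cxy = len(gzip.compress((x + " " + y).encode()))
    return (cxy - min(cx, cy)) / max(cx, cy)

a = "the stock market fell sharply on monday amid rate fears"
b = "the stock market fell sharply on tuesday amid rate fears"  # heavy overlap
c = "my cat enjoys sleeping in a patch of warm afternoon sun"   # little overlap

# Word overlap lets gzip reuse backreferences, so the joint size barely grows.
print(ncd(a, b) < ncd(a, c))
```

Two paraphrases with no shared vocabulary ("a physician" vs. "a doctor") get no such discount, which is exactly where a DNN's semantic representations help.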
How do they work, then, such that semantic similarity would be any different? Surely that's just a matter of grouping semantically similar 'representations' during training?
I really enjoy using DVC, but it does have some drawbacks compared to other offerings like MLflow and W&B.
1. It's harder to track experiments on remote VMs (e.g. Azure), as there's no server (we need to feed results back somehow)
2. It seems impossible (?) to track different types of experiments in the same repo. MLflow has a way to define experiments and runs, which means I can easily group regression vs. classification, or even try a completely different task with the same data.
If anyone has a good suggestion on how to solve these two I'd love to fully commit to DVC!
We solved 1. the same way, but it felt "off" somehow. Perhaps it's a good solution.
2. That's a sound solution, but a tiny bit cumbersome. I have projects where we deploy both a classifier and a regressor, and it'd be nice to keep everything in main. Alas, you can't have it all.
We have a strong focus on provenance: Tribuo models capture their input and output domains, along with the configuration necessary to rebuild the model. Tribuo is also more object-oriented; nothing returns a bare float or int, you always get a strongly typed prediction object back which you can use without looking things up. Tribuo is also happy to integrate with other ML libraries on the JVM, like TensorFlow and XGBoost, providing the same provenance/tracking benefits as standard Tribuo models, and we contribute fixes back to those projects to help support the ecosystem. Plus we can load models trained in Python via ONNX.
To your direct question, I've not benchmarked Smile against Tribuo. We are very interested in the upcoming Java Vector API - https://openjdk.java.net/jeps/338 - targeted at Java 16, which will let us accelerate computations that C2 or Graal don't autovectorise.
Awesome, I love seeing more and more frameworks for interpretability appear. It's incredibly important, especially when selling your solution to higher-ups.
There's another solution named LIME which seems to take a similar but more general approach; I like this more tailored idea, as it'll probably give a better interpretation of the NLP questions.
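LIME's core move, perturb the input and watch how the black-box prediction changes, can be sketched for text with a simple per-word occlusion. Everything here is a toy: the "model" is a hypothetical keyword scorer standing in for a real classifier, and this is a crude stand-in for LIME's weighted surrogate fitting:

```python
import re

# Toy black-box "sentiment" model (hypothetical): fraction of positive words.
POSITIVE = {"love", "great", "awesome", "important"}

def black_box(text: str) -> float:
    words = re.findall(r"\w+", text.lower())
    return sum(w in POSITIVE for w in words) / max(len(words), 1)

def word_importance(text: str) -> dict:
    """Score each word by how much the prediction drops when it is removed."""
    words = text.split()
    base = black_box(text)
    scores = {}
    for i, w in enumerate(words):
        perturbed = " ".join(words[:i] + words[i + 1:])
        scores[w] = base - black_box(perturbed)
    return scores

scores = word_importance("interpretability is incredibly important")
# The word whose removal hurts the prediction most ranks highest.
print(max(scores, key=scores.get))
```

Real LIME samples many random perturbations and fits a locally weighted linear model over them, which is what makes it model-agnostic rather than word-at-a-time.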