It provides a stateless, layer-based interface that feels intuitive and avoids bugs caused by global state.
But it is built on top of matplotlib, so if plotnine cannot do something, you can fall back to the matplotlib way of doing things and take advantage of the large number of blogs, examples, tutorials, and Stack Overflow answers.
Performance-wise, it can handle more data than Altair, but less than matplotlib if you use the built-in data manipulation features.
Altair builds on Vega-Lite, but if you cannot do something in Vega-Lite, you can do it in Vega.
Performance is a huge concern for us and we are working on some improvements in that area. We will first focus on pushing computation and aggregation down into the Python kernel.
In the grand scheme of things it's a minor gripe, but I really dislike APIs that encourage chaining of calls like `alt.Chart(data).mark_circle(size=200).encode(...)` from the article. This is found in many other libraries, especially JavaScript ones such as jQuery. I know it's compact, but it makes it harder to see what's going on and hides the fact that all of the operations are actually being applied to the leftmost object.
Smalltalk has the correct answer here, I feel, with its cascade operator `;`, which sends a series of messages to the same receiver. For cases like this where you build complex state on an object via method calls, it gives some pretty nice code. And since it's a separate construct, the individual method calls can have return values that make sense too, as opposed to this style of interface where the methods all have to return the object again for chaining.
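For anyone who hasn't seen cascades, here is a rough Python sketch of the idea. `Cascade` is a hypothetical helper made up for illustration, not part of any library: it forwards each call to the same underlying object and keeps the real return values on the side, so the methods themselves don't need to return `self`.

```python
class Cascade:
    """Hypothetical helper that mimics a Smalltalk-style cascade."""

    def __init__(self, target):
        self._target = target
        self.results = []  # real return value of each forwarded call, in order

    def __getattr__(self, name):
        # Only reached for names not defined on Cascade itself.
        method = getattr(self._target, name)

        def forward(*args, **kwargs):
            self.results.append(method(*args, **kwargs))
            return self  # every call goes to the same receiver

        return forward

    def done(self):
        """Return the object the cascade was applied to."""
        return self._target


# list.sort() and list.append() return None and list.pop() returns an item,
# yet all three can be cascaded on the same list.
stack = Cascade([3, 1, 2]).sort().pop().append(10)
print(stack.done())    # the list after all three calls
print(stack.results)   # what each call actually returned
```

The wrapper makes it explicit that everything is applied to the leftmost object, which is the part plain chaining hides.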
By default it throws an error if you try to plot more than 5000 data points. Internally, Altair produces a JSON representation (Vega-Lite) that is larger than the data you plot, because it contains that data plus formatting information. If you save this output, for example in an IPython notebook, it gets fat pretty quickly.
Altair is still great, but this issue makes it occasionally annoying.
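To make the size point concrete, here is a small sketch using only the standard library. The spec below is written by hand in the general shape of a Vega-Lite chart with inline data; it is an assumption for illustration and not Altair's exact output, and the rows are made-up numbers.

```python
import json

# Made-up data: 5,000 rows, matching the default limit mentioned above.
rows = [{"x": i, "y": (i * i) % 97} for i in range(5000)]

# Hand-written Vega-Lite-style spec with the data embedded inline.
spec = {
    "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
    "data": {"values": rows},
    "mark": "point",
    "encoding": {
        "x": {"field": "x", "type": "quantitative"},
        "y": {"field": "y", "type": "quantitative"},
    },
}

data_bytes = len(json.dumps(rows))
spec_bytes = len(json.dumps(spec))

# The spec carries every row plus the chart description, so it is always
# at least as large as the serialized data itself.
print(f"data alone: {data_bytes} bytes, full spec: {spec_bytes} bytes")
```

Every chart saved in a notebook carries its own copy of the rows like this, which is why the files grow. The row limit itself can be lifted with `alt.data_transformers.disable_max_rows()`, though that does nothing about the size.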
Has anyone figured out how to use Altair with VS Code to plot in a separate window? If I use matplotlib, I can use the show() method to plot in a separate window. I would love to have a similar thing for Altair. I'm fine if there's a working method in Spyder as an alternative.
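One workaround that doesn't depend on the editor is to write the chart to a standalone HTML page and open it in the default browser. Below is a minimal sketch using only the standard library; the helper names (`write_chart_html`, `show_in_browser`) and the hand-written spec are assumptions for illustration, and the page loads Vega, Vega-Lite, and vega-embed from the jsDelivr CDN, so it needs network access. With Altair installed, `chart.to_dict()` should give you the spec to pass in, and `chart.save("chart.html")` writes a similar page directly.

```python
import json
import tempfile
import webbrowser
from pathlib import Path

# Minimal page that renders a Vega-Lite spec client-side with vega-embed.
HTML_TEMPLATE = """<!DOCTYPE html>
<html>
<head>
  <script src="https://cdn.jsdelivr.net/npm/vega@5"></script>
  <script src="https://cdn.jsdelivr.net/npm/vega-lite@5"></script>
  <script src="https://cdn.jsdelivr.net/npm/vega-embed@6"></script>
</head>
<body>
  <div id="vis"></div>
  <script>vegaEmbed("#vis", __SPEC__);</script>
</body>
</html>
"""


def write_chart_html(spec, path):
    """Write a standalone HTML page for the given Vega-Lite spec (a dict)."""
    path = Path(path)
    path.write_text(HTML_TEMPLATE.replace("__SPEC__", json.dumps(spec)),
                    encoding="utf-8")
    return path


def show_in_browser(spec):
    """Write the page to a temporary directory and open it in the browser."""
    out = write_chart_html(spec, Path(tempfile.mkdtemp()) / "chart.html")
    webbrowser.open(out.resolve().as_uri())
    return out


# Hand-written example spec; call show_in_browser(example_spec) to view it.
example_spec = {
    "data": {"values": [{"a": "A", "b": 28}, {"a": "B", "b": 55}]},
    "mark": "bar",
    "encoding": {
        "x": {"field": "a", "type": "nominal"},
        "y": {"field": "b", "type": "quantitative"},
    },
}
```

It is a browser tab rather than a native window like matplotlib's show(), but it works the same from VS Code, Spyder, or a plain terminal.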
Me and my team are big Tableau users, so now that we are testing Python, Vega and Altair are the natural approach. I am struggling to make Altair work in JupyterLab (ipyvega works, so I could start playing), but I think it’s worth the effort. The Altair guys are extremely nice and responsive on GitHub, which is great too. For those looking for side projects, I think a nice Altair GUI that works in JupyterLab would be great. Anyway, why is this approach superior? Because once you get it, you’ll be amazed at the stuff you can do and how easily, and also at how easy it is to train new analysts on it. That matters, and matplotlib falls short there. But don’t take my word for it: grab a trial copy of Tableau and see for yourself.
Matplotlib, due to mental inertia. But nothing I do rises to the level of "visualization," just plotting. ;-) And my plots are rarely seen by anybody but me.
As part of our work, we create data dashboards. We use NumPy and Pandas to analyze data, either Flask or Django as a framework, and HighCharts for interactive charts. The JS charting library has a variety of charts to meet our needs. Our input data is dynamic and is stored either in SQL or as CSV.
I am beginning to wonder if HighCharts is replaceable with a fully open-source charting library.
It's pretty awesome and I use it for a similar purpose. My dashboards/charts are pretty basic, but there seems to be good support for interactivity and callbacks.
https://github.com/sirrice/pygg provides the ggplot2 syntax in Python as a wrapper around Wickham's R implementation. It is useful if you 1) want the R syntax, 2) program in Python, 3) just want static plots.
I'm looking for an interactive visualization tool that I could have site visitors access via a GitHub blog page. Does Altair manage client-side processing? I'm still pretty new in the search but like Altair's adoption of Vega-Lite.
Here's a guide to using GitHub+Kyso [1] to publish your type of article to the web. It should be a very similar workflow to GitHub Pages, and you can use any of the popular Python visualization libraries: we support plotly, bokeh, vega, altair, matplotlib, etc.
I'll dig in -- thanks! Looking for the interactivity -- surely that's not supportable in native HTML, right? Has to use JS or CSS? (I'm not a web-dev of any sort, only very superficial understanding, data scientist by day)
How is Altair on 3D data? I see no examples of this. Matplotlib is decent here (apart from all the same disadvantages the author lays out), but the default options look kinda fugly.
No fundamental reason. ggplot2 was released 4 years after matplotlib, and the Python ecosystem was already centered around the latter by the time it became obvious that the grammar of graphics approach was superior. Python's surging popularity in the data analysis space is also pretty recent.
But any approach to a ggplot2 equivalent either has to abandon the massive ecosystem around matplotlib, or build on top of it, and matplotlib's heavily state-based approach makes that difficult. Plotnine is attempting to do that; I hear it's pretty good.
There are several libraries inspired by the grammar of graphics and ggplot, and there has even been something like a port (although it's now abandoned):
What I think the author means is that there is no ggplot in the sense that there's no One Ring To Rule Them All -- ggplot2 basically killed off lattice and base R graphics for about 90% of users. The Python graphics ecosystem is more Balkanized.
The problem with all these nice new visualization libraries for Python is that they all (at least the nice shiny ones) fall well short when it comes to B&W graphics for journal publications. Things like fill patterns, line patterns, etc. are mostly missing.
I still use Matplotlib and I can make it look beautiful and exactly how I want... it's just a lot more work to get the shiny bits.
For my thesis, I tried a couple of different options, but in the end the only one that really made publication-grade output was gnuplot with the epslatex terminal. It's a bit fiddly to get it up and running, but hands down the best result I think.
The underlying vega library supports overriding styles for color and line properties - it may not be as difficult as you imagine to generate B&W graph outputs for print.
I save the data from Python or MATLAB and use pgfplots to create stunning plots. Nothing I saw in any other plotting lib ever came close to pgfplots in terms of beauty and flexibility.
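In case the workflow isn't obvious, the Python side can be as small as the sketch below. The function name, file name, and column names are made up for illustration; the pgfplots snippet in the comment assumes the file is read with `col sep=comma`.

```python
import csv
import math
import tempfile
from pathlib import Path


def export_for_pgfplots(path, xs, ys, header=("x", "y")):
    """Write two columns with a header row as CSV for pgfplots to read."""
    path = Path(path)
    with open(path, "w", newline="") as fh:
        writer = csv.writer(fh)
        writer.writerow(header)
        writer.writerows(zip(xs, ys))
    return path


# Made-up example data: one period of a sine curve.
xs = [i / 10 for i in range(63)]
ys = [math.sin(x) for x in xs]
out = export_for_pgfplots(Path(tempfile.mkdtemp()) / "curve.csv", xs, ys)

# On the LaTeX side (with \usepackage{pgfplots}), something like:
#   \begin{tikzpicture}
#     \begin{axis}[xlabel=$x$, ylabel=$\sin x$]
#       \addplot table [x=x, y=y, col sep=comma] {curve.csv};
#     \end{axis}
#   \end{tikzpicture}
```

All the styling then lives in the LaTeX source, so fonts and line styles match the rest of the paper.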
Second that. Also, in the scientific community at large, a big portion of the "not-so-happy" matplotlib users are just using whatever they or their admin installed some time ago, which is probably outdated and lacks a bunch of features introduced in v3.