
I believe karma comes solely from comment upvotes minus downvotes. Submissions don't count.

That might be true in real life ("afk"), but on HN even submissions give you karma.

Have a look at your submissions; they brought you karma. <https://news.ycombinator.com/submitted?id=exagolo>

Although the details aren't crystal clear, the karma system is not 1:1 for submissions.

<https://news.ycombinator.com/item?id=29024032>

    Comment upvote      +1
    Comment downvote    -1
    Submission upvote   >0 && <1 (not documented, to prevent abuse)
    Submission downvote not possible (only flagging is allowed)
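
Expressed as a quick sketch (the fractional submission weight is a guess, since the real value is deliberately undocumented):

    # Rough model of HN karma based on the rules above.
    # SUBMISSION_WEIGHT is a placeholder; the real value is undocumented.
    SUBMISSION_WEIGHT = 0.5  # somewhere in (0, 1)

    def karma(comment_ups, comment_downs, submission_ups):
        return comment_ups - comment_downs + SUBMISSION_WEIGHT * submission_ups

    print(karma(120, 15, 40))  # e.g. 125.0 with the guessed weight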

If you have a single table with time-series data, then ClickHouse will typically perform better; it's very much optimized for that type of use case. Once you are joining tables and running more advanced analytics, Exasol will easily outperform it.
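
To make the distinction concrete, here is the shape of workload I mean (table and column names are made up):

    # Illustrative only; all table and column names are invented.
    # ClickHouse shines on single-table scans and aggregations like this:
    single_table = """
        SELECT toStartOfHour(ts) AS hour, avg(value)
        FROM sensor_readings
        GROUP BY hour
    """

    # Join-heavy analytics like this is where Exasol tends to pull ahead:
    join_heavy = """
        SELECT c.region, p.category, sum(f.revenue)
        FROM fact_sales f
        JOIN dim_customer c ON f.customer_id = c.id
        JOIN dim_product  p ON f.product_id  = p.id
        GROUP BY c.region, p.category
    """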

Exasol has been the performance leader in the market for more than 15 years, as you can see in the official TPC-H publications, but hasn't gotten broader market attention yet. We are trying to change that now and have recently become more active in developer communities. We also just launched a completely free Exasol Personal edition that can be used for production use cases.


You mean the "execution plan" for your queries? Ideally, those types of decisions are made automatically by the database.
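
You can at least inspect what the database decided. A minimal sketch for ClickHouse, assuming the clickhouse-driver Python client (host and query are placeholders):

    # pip install clickhouse-driver
    from clickhouse_driver import Client

    client = Client(host="localhost")
    # EXPLAIN returns the chosen plan, one row per plan line.
    for (line,) in client.execute("EXPLAIN SELECT count() FROM numbers(10)"):
        print(line)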


ideally? yes. in practice? big nope.

How do you actually interpret what you're seeing here? Does it look more like optimizer fragility (plans that assume ideal memory conditions) or more like runtime memory management limits (good plans, but no adaptive behavior under pressure)?


I think the issue in the tests was the lack of proper resource management in ClickHouse, which led to queries failing under pressure. Although I have to admit that the level of pressure was minimal: just a few concurrent users shouldn't be considered pressure, and having far more RAM than the whole database size also means very little pressure. The schema model is quite simple as well, just two fact tables and a few dimension tables.

Any database should be able to handle 100 concurrent queries robustly, even if that means slowing down query execution.
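
That's easy to verify yourself. A minimal sketch of such a concurrency test, assuming the clickhouse-driver package (connection details and query are placeholders):

    # Fires 100 identical queries concurrently and counts failures.
    from concurrent.futures import ThreadPoolExecutor
    from clickhouse_driver import Client

    QUERY = "SELECT sum(number) FROM numbers_mt(100000000)"  # any heavy query

    def run_one(_):
        try:
            Client(host="localhost").execute(QUERY)
            return "ok"
        except Exception as e:
            return f"failed: {e}"

    with ThreadPoolExecutor(max_workers=100) as pool:
        results = list(pool.map(run_one, range(100)))

    print(sum(r == "ok" for r in results), "of", len(results), "queries succeeded")

A robust database should finish all 100, possibly slowly; failures under this load point to the resource management issue described above.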


Traditional database benchmarks focus on throughput and latency: how many queries per second can be processed, and how execution time changes as hardware resources increase. This benchmark revealed something different: reliability under realistic conditions is the first scalability constraint.


The tool is very flexible, and you can create your own benchmarks with your own data. That is always the best benchmark, as any public benchmark will have a certain bias.
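
Independent of any tool, the core of such a do-it-yourself benchmark is just timing your own queries against your own data. A bare-bones sketch (queries, connection, and repeat count are placeholders; assumes the clickhouse-driver package):

    import time
    from statistics import median
    from clickhouse_driver import Client

    client = Client(host="localhost")
    queries = ["SELECT sum(number) FROM numbers_mt(10000000)"]  # your own queries here

    for q in queries:
        times = []
        for _ in range(5):  # repeat to smooth out caching and noise
            t0 = time.perf_counter()
            client.execute(q)
            times.append(time.perf_counter() - t0)
        print(f"{median(times):.3f}s median  {q[:60]}")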


Another alternative is Exasol, which is factors (>10x) faster than ClickHouse and scales much better for complex analytics workloads that join data. There is a free edition for personal use, without a data limit, that can run on any number of cluster nodes.

If you just want to read and analyze single-table data, then ClickHouse or DuckDB are perfect.

Disclaimer: I work at Exasol


For the bigger tasks, Exasol might also be a very neat option for you. We have a free personal edition that scales in terms of data volume, number of servers (MPP architecture), and workload complexity.

Recently, we also compared ourselves against DuckDB and were 4 times faster even on a single node. We are in-memory optimized, but the data doesn't need to fit in RAM.

Disclaimer: I'm CTO@Exasol


If not having to adjust queries is a major driver for your considerations, then I would highly recommend looking at SQLGlot (https://github.com/tobymao/sqlglot), a transpiler that makes you (more) independent of SQL dialects. They already support 30 dialects (big vendors such as Snowflake, Databricks, and BigQuery, but also plenty of specialists such as ClickHouse, SingleStore, or Exasol). The repo is maintained extremely well.
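
A minimal example of what that looks like (the query is made up; dialect names follow SQLGlot's documentation):

    # pip install sqlglot
    import sqlglot

    # One source query, rendered for two different engines.
    sql = "SELECT user_id, COUNT(*) AS cnt FROM events GROUP BY user_id LIMIT 10"
    print(sqlglot.transpile(sql, read="duckdb", write="clickhouse")[0])
    print(sqlglot.transpile(sql, read="duckdb", write="snowflake")[0])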

Picking the best solution for your concrete workload (and your future demands) should be just as important as the implementation effort, so you don't run into walls later on. At least as long as data volume, query complexity, or concurrency scalability could become challenges.


Are you using a transpiler such as SQLGlot for the multi-vendor SQL generation?


I do agree. I still think the article articulates a very interesting thought: the better the input for a problem, the better the output. This applies to LLMs as well as to colleagues.

