* Redis Cluster: a distributed implementation of a subset of Redis.
* New "embedded string" object encoding resulting in less cache
misses. Big speed gain under certain work loads.
* AOF child -> parent final data transmission to minimize latency due
to "last write" during AOF rewrites.
* Much improved LRU approximation algorithm for key eviction.
* WAIT command to block waiting for a write to be transmitted to
  the specified number of slaves (see the sketch after this list).
* MIGRATE connection caching. Much faster key migrations.
* MIGRATE new options COPY and REPLACE.
* CLIENT PAUSE command: stop processing client requests for a
specified amount of time.
* BITCOUNT performance improvements.
* CONFIG SET accepts memory values in different units (for example
you can use "CONFIG SET maxmemory 1gb").
* Redis log format slightly changed: each line now reports the role of the
  instance (master/slave) or whether it is a saving child.
* INCR performance improvements.
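To make a couple of the new commands concrete, here is a minimal sketch, assuming a local Redis 3.0 instance on 127.0.0.1:6379 with at least one attached slave and the hiredis C client; the key name and values are purely illustrative:

    #include <stdio.h>
    #include <hiredis/hiredis.h>

    int main(void) {
        redisContext *c = redisConnect("127.0.0.1", 6379);
        if (c == NULL || c->err) return 1;
        redisReply *r;

        /* CONFIG SET now accepts memory values in human-readable units. */
        r = redisCommand(c, "CONFIG SET maxmemory 1gb");
        if (r) freeReplyObject(r);

        /* Write a key, then block for up to 100 ms until at least one slave
         * has acknowledged it; WAIT replies with how many slaves did. */
        r = redisCommand(c, "SET user:1000:name antirez");
        if (r) freeReplyObject(r);
        r = redisCommand(c, "WAIT 1 100");
        if (r && r->type == REDIS_REPLY_INTEGER)
            printf("write acknowledged by %lld slave(s)\n", r->integer);
        if (r) freeReplyObject(r);

        redisFree(c);
        return 0;
    }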
Yes, he is not doing Jepsen at the moment. The end of his last blog post on it has some explanation. That said, I wouldn't be surprised if he made an exception, given that it's Redis...
(To note, I have limited experience with using Redis. My questions may be stupid.)
As far as I can tell, most of the advantages of Redis come from the fact that it's all held in memory and so access is fast. Is networked access to other parts of the cluster quick enough that it is quicker than storing the data on one computer, partly on disk? When would one want to use a Redis cluster rather than something stored on-disk and cached in memory?
The performance section of the spec[1] does a pretty good job explaining how the implementation remains fast.
In Redis Cluster nodes don't proxy commands to the
node in charge of a given key, but instead they
redirect clients to the right nodes serving a given
portion of the key space.
Eventually clients obtain an up-to-date representation
of the cluster and of which node serves which subset of
keys, so during normal operations clients directly
contact the right nodes in order to send a given command.
Because of the use of asynchronous replication, nodes
do not wait for other nodes' acknowledgment of writes
(unless explicitly requested using the WAIT command).
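The practical consequence for clients is that a node that doesn't own a key answers with a redirection error instead of proxying. A rough sketch of the client side of that scheme, assuming the hiredis C client and a cluster node listening on 127.0.0.1:7000 (real cluster clients also cache the slot-to-node map, handle -ASK redirections, and reuse connections):

    #include <stdio.h>
    #include <string.h>
    #include <hiredis/hiredis.h>

    /* Follow a single -MOVED redirection for a GET; simplified on purpose. */
    redisReply *cluster_get(const char *host, int port, const char *key) {
        redisContext *c = redisConnect(host, port);
        if (c == NULL || c->err) return NULL;

        redisReply *r = redisCommand(c, "GET %s", key);
        if (r && r->type == REDIS_REPLY_ERROR &&
            strncmp(r->str, "MOVED ", 6) == 0) {
            /* The node does not proxy the command; it answers something like
             * "MOVED 3999 127.0.0.1:6381", naming the owner of the key's slot. */
            int slot, newport;
            char newhost[64];
            if (sscanf(r->str, "MOVED %d %63[^:]:%d", &slot, newhost, &newport) == 3) {
                fprintf(stderr, "slot %d served by %s:%d, retrying there\n",
                        slot, newhost, newport);
                freeReplyObject(r);
                redisFree(c);
                return cluster_get(newhost, newport, key);
            }
        }
        redisFree(c);
        return r;
    }

    int main(void) {
        redisReply *r = cluster_get("127.0.0.1", 7000, "foo");
        if (r && r->type == REDIS_REPLY_STRING) printf("foo = %s\n", r->str);
        if (r) freeReplyObject(r);
        return 0;
    }

So during steady state a request costs a single network round trip to the node that owns the key, which is how the cluster stays close to single-instance performance.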
The primary advantage of Redis is its native data types and access patterns beyond simple key/value storage: lists, sets, sorted sets, hashes, pub/sub, HyperLogLog, and Lua scripting support.
Being memory-based was simply a feature but not necessarily something that set it apart: Memcached had that area pretty well locked down for being a blazing fast key-value store. And then Membase was basically memcached + persistence and clustering. Now Redis has clustering too!
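To make the data-type point concrete, a small sketch; the keys and the local instance are hypothetical, and the hiredis C client is assumed:

    #include <stdio.h>
    #include <hiredis/hiredis.h>

    int main(void) {
        redisContext *c = redisConnect("127.0.0.1", 6379);
        if (c == NULL || c->err) return 1;
        redisReply *r;

        /* list: push work items, consume them later in order */
        r = redisCommand(c, "RPUSH jobs job:1 job:2");
        if (r) freeReplyObject(r);
        /* set: membership without duplicates */
        r = redisCommand(c, "SADD online:users 1000 1001");
        if (r) freeReplyObject(r);
        /* hash: field/value pairs under a single key */
        r = redisCommand(c, "HSET user:1000 name antirez");
        if (r) freeReplyObject(r);
        /* sorted set: ranked by score and queried by rank server-side,
         * something a plain key-value store can't do for you */
        r = redisCommand(c, "ZADD leaderboard 42 player:7");
        if (r) freeReplyObject(r);
        r = redisCommand(c, "ZREVRANGE leaderboard 0 9 WITHSCORES");
        if (r && r->type == REDIS_REPLY_ARRAY)
            printf("top-10 entries returned: %zu\n", r->elements / 2);
        if (r) freeReplyObject(r);

        redisFree(c);
        return 0;
    }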
Seconded. Memory = fast is not the point. The point is that it gives you specific data types that do specific things, and since it's in memory it does those things very quickly. That said, you need to implement your own clustering or sharding solution. The jury is still out on the new cluster code; I haven't reviewed it yet.
As well as the optimisations mentioned by famousactress, note that it's common to run Redis (or other in-memory databases such as memcached) on a separate server anyway. So there's already some network overhead but it tends to be small enough if the content is small and everything's co-located.
Right, but I assume the commenter was more curious about what the multiple network hops usually involved with a cluster that proxies requests to other nodes would do to the overall performance of the system... which is a totally reasonable question/concern.
Maybe it works great from day one,
maybe it will need a few more iterations,
and possibly with 3.2 we'll improve support for many things,
but my guess is that Redis 3.0.0 today, in some way, changes what Redis is.
Not a joke, the release was due around these days, so I picked 1 April since we now have a tradition of shipping on 1 April. Last year, with HyperLogLog support, because of the futuristic name of the data structure people had a hard time believing it was a real thing and not a joke.
I understand your reasons and nevertheless hope that you can realize that you are still contributing to April Fool's insanity by making it that much harder for people to tell what is serious and what isn't on this day of disregard.
You’re totally within your rights here. I just beg you to consider the audience.
Does anyone have more details on what the "embedded string" object encoding is and what workloads it helps with? The closest thing I can find is https://github.com/antirez/redis/issues/543, which seems pretty old.
Hey, it's very straightforward. Normally you have something like the Redis object structure, which has a type field and a pointer to the actual representation of the object. If the type is REDIS_STRING, you have a pointer to an "sds" string (where sds is the name of the string library used).
Now with embedded strings there is a special kind of string object that instead uses a single allocation for both the object structure and the string itself. This is slightly more memory efficient, but above all it improves memory locality a lot, so basically everything that uses string objects (the string type, or aggregate data types large enough to use string objects as values of the collection) will perform better.
These special strings are used only for small strings (the majority in most workloads).
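A rough sketch of the idea in C; this is not the actual Redis source (the real code uses the redisObject and sds structures and different constants), just the two layouts side by side:

    #include <stdlib.h>
    #include <string.h>

    #define TYPE_STRING   0
    #define ENC_RAW       0   /* object and string are two separate allocations */
    #define ENC_EMBEDDED  1   /* object header and bytes share one allocation   */

    typedef struct object {
        int type;
        int encoding;
        char *ptr;        /* points at the string bytes, wherever they live */
        char embedded[];  /* used only by the embedded encoding */
    } object;

    /* Raw encoding: two mallocs, so the header and the bytes can end up far
     * apart in memory and reading a value may cost two cache misses. */
    object *create_raw_string(const char *s, size_t len) {
        object *o = malloc(sizeof(*o));
        o->type = TYPE_STRING;
        o->encoding = ENC_RAW;
        o->ptr = malloc(len + 1);
        memcpy(o->ptr, s, len);
        o->ptr[len] = '\0';
        return o;
    }

    /* Embedded encoding: a single malloc for header plus bytes, which means
     * one allocation less and much better locality for small strings. */
    object *create_embedded_string(const char *s, size_t len) {
        object *o = malloc(sizeof(*o) + len + 1);
        o->type = TYPE_STRING;
        o->encoding = ENC_EMBEDDED;
        memcpy(o->embedded, s, len);
        o->embedded[len] = '\0';
        o->ptr = o->embedded;
        return o;
    }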
No... Sentinel will still be developed alongside Redis Cluster. For single-instance deployments where all you want is HA, Sentinel may be a more obvious way to get it compared to running Redis in cluster mode. In the long term, with plenty of early warning, we may support the current Sentinel use case with Cluster and merge the two.
Depends on the use case, but basically there is a big set of problems you can solve with different technologies depending on your exact details; one will be better in one way, another in some other way. It's up to the programmer's judgment to pick the right one, in an effort to balance the different aspects: data model fitness for the problem at hand, operational aspects, consistency guarantees, performance (number of nodes needed), scalability, simplicity (do I need support, since it's complex?), and so forth.
http://memcached.org/ used to be. I haven't done system architecture in about 2 years but when we were looking at in memory databases, it came down to Redis or Memcache.
Memcached is just a stupid key-value store, and by stupid I mean it just stores and retrieves values. It's extremely primitive compared to Redis.
It's not even close to the same thing as Redis except superficially.
The key/value part of Redis is just the beginning. The values themselves can be of several different types that allow for a lot more flexibility in how you store and query data.
>In order to achieve its outstanding performance, Redis works with an in-memory dataset. Depending on your use case, you can persist it either by dumping the dataset to disk every once in a while, or by appending each command to a log.
Well, it's written to disk either way. You're right that if you do the "once in a while" setup, you can lose some data in a power failure.
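For reference, a minimal sketch of selecting between the two modes at runtime with CONFIG SET (the same settings normally live in redis.conf; the values are illustrative, hiredis C client assumed):

    #include <hiredis/hiredis.h>

    int main(void) {
        redisContext *c = redisConnect("127.0.0.1", 6379);
        if (c == NULL || c->err) return 1;
        redisReply *r;

        /* RDB snapshotting: dump the dataset to disk if at least 1000 keys
         * changed within 60 seconds (the "once in a while" mode). */
        r = redisCommand(c, "CONFIG SET save %s", "60 1000");
        if (r) freeReplyObject(r);

        /* AOF: append every write to a log, fsync it once per second. */
        r = redisCommand(c, "CONFIG SET appendonly yes");
        if (r) freeReplyObject(r);
        r = redisCommand(c, "CONFIG SET appendfsync everysec");
        if (r) freeReplyObject(r);

        redisFree(c);
        return 0;
    }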
At a previous job people started using Redis thinking it was a fast in-memory data store. It turned out we had accumulated tens of thousands of records.
I haven't measured, but I doubt it's much faster than Postgres. It does have other nice features. I like using the expiring records for caching.
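For example, a small sketch of that caching pattern (the key name, value, and TTL are made up; hiredis C client assumed):

    #include <stdio.h>
    #include <hiredis/hiredis.h>

    int main(void) {
        redisContext *c = redisConnect("127.0.0.1", 6379);
        if (c == NULL || c->err) return 1;

        /* Cache a rendered fragment for 300 seconds; Redis deletes the key
         * on expiry, so there is no cleanup job to write. */
        redisReply *r = redisCommand(c, "SET cache:home:html %s EX 300",
                                     "<html>...</html>");
        if (r) freeReplyObject(r);

        r = redisCommand(c, "TTL cache:home:html");
        if (r && r->type == REDIS_REPLY_INTEGER)
            printf("seconds until expiry: %lld\n", r->integer);
        if (r) freeReplyObject(r);

        redisFree(c);
        return 0;
    }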
There are plenty of alternatives to every library having to ship yet another probably-broken security layer. It's probably better to keep that layer separate than to have everyone implement it themselves.
Why are you connecting to a Redis box across the internet? There's a great (and after Heartbleed, prophetic) post on the Varnish web site about why they don't implement SSL, I imagine Redis would be similar:
I love this post. Not every single piece of software needs to include SSL support out of the box. Sometimes, for the exact reasons Varnish explains, it just doesn't make sense.
It's hard to imagine every service in your infrastructure implementing SSL would be more secure than a single VPN tool. You are very optimistic about the difficulties of getting security right.
It's really simple to imagine, and I have even implemented it :) A "single VPN" may (and will) fail sometimes, so count your complexity and stability with and without one extra service.
I'm sorry to be skeptical, but when a random person on the internet claims to have implemented SSL more securely than open source tools that are completely built around security, I tend to not believe it.
Implementing SSL is easy. Implementing SSL correctly is very difficult, and you probably won't find out you did it wrong for a long time, if ever.
I'm not implementing SSL, I'm just using it. With MySQL you can just use it. With Redis you have to use a VPN, with all the costs of a VPN. Please calm down and stop pushing your preference for a VPN as the only right way.
As an operations person, I think this is the wrong way to go. The VPN becomes a single point of failure, and attempts at making it HA have failed in my experience.
Also, solutions like stunnel create a separate process that has to be managed. If I have one for Redis and then one for something else, it's harder to tell them apart, because both will be named stunnel.