Cassandra in 30 seconds

  1. writes 
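    1. writes entries to disk without first checking whether they already exist
    2. indexes entries (partition index, bloom filters) so reads can locate them quickly
    3. returns a write "OK" to the writing client after enough replicas have confirmed (a quorum, or whatever the write consistency level requires)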
  2. reads
    1. tries to return the newest entry when a client does a read
    2. has mechanisms (read repair, hinted handoff, anti-entropy repair) to eventually return the newest entry even if old ones are still around
  3. replication
    1. stores each entry on multiple nodes when the replication factor is greater than 1
  4. deletes
    1. doesn't officially delete, just marks dead entries with a "tombstone"
    2. compaction is what gets rid of old versions of entries and dead entries
  5. balancing
    1. automatically fills in data holes if a node disappears
    2. automatically spreads data if new nodes are added
  6. resurrection
    1. 3 nodes: X, Y, Z; each one replicates all data
    2. server X goes down
    3. a delete for key A goes to Y and Z
    4. Y and Z are "compacted"
      1. i.e., redundant keys & tombstones cleaned up / removed
      2. key A is completely gone as far as Y and Z know
    5. X comes up and has value for key A
    6. A is back! resurrected from the dead! life sucks.
    7. NOTE: if Y and Z had not had their tombstones removed, those tombstones would have carried a timestamp more recent than X's entry for key A, so they would have invalidated X's key A. But the tombstones are gone after a compaction or cleanup, so nothing is left to override X's old value.
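
The resurrection steps above can be sketched as a toy simulation. This is not Cassandra code: `Node`, `Cell`, `read`, and `run_scenario` are made-up names for illustration, a tombstone is modeled as a cell whose value is `None`, and compaction here drops tombstones immediately with no grace period.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Cell:
    value: Optional[str]  # None stands in for a tombstone (hypothetical model)
    timestamp: int


class Node:
    def __init__(self, name: str) -> None:
        self.name = name
        self.up = True
        self.cells: Dict[str, Cell] = {}

    def apply(self, key: str, cell: Cell) -> None:
        """Accept a write or delete; the newest timestamp wins."""
        if not self.up:
            return  # a down node simply misses the mutation
        current = self.cells.get(key)
        if current is None or cell.timestamp > current.timestamp:
            self.cells[key] = cell

    def compact(self) -> None:
        """Toy compaction: throw away every tombstone right now."""
        self.cells = {k: c for k, c in self.cells.items() if c.value is not None}


def read(nodes: List[Node], key: str) -> Optional[str]:
    """Return the value with the newest timestamp among live nodes."""
    cells = [n.cells[key] for n in nodes if n.up and key in n.cells]
    if not cells:
        return None
    return max(cells, key=lambda c: c.timestamp).value


def run_scenario(compact_before_x_returns: bool) -> Optional[str]:
    x, y, z = Node("X"), Node("Y"), Node("Z")
    nodes = [x, y, z]
    for n in nodes:
        n.apply("A", Cell("hello", timestamp=1))  # all three store key A
    x.up = False                                  # server X goes down
    for n in nodes:
        n.apply("A", Cell(None, timestamp=2))     # delete reaches only Y and Z
    if compact_before_x_returns:
        y.compact()                               # tombstones removed
        z.compact()
    x.up = True                                   # X comes back with old key A
    return read(nodes, "A")


print(run_scenario(compact_before_x_returns=True))   # hello  (resurrected)
print(run_scenario(compact_before_x_returns=False))  # None   (tombstone wins)
```

When the tombstones survive until X returns, their newer timestamp beats X's stale value and the key stays deleted. Real Cassandra keeps tombstones around for a grace period (the `gc_grace_seconds` table option) before compaction may drop them, which the toy skips; the resurrection case corresponds to X staying down longer than that window without being repaired.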

