flumelog-offset

An flumelog where the offset into the file is the key. Each value is appended to the log with a double ended framing, and the "sequence" is the position in the physical file where the value starts, this means if you can do a read in O(1) time!

Also, this is built on top of aligned-block-file so that caching works very well.

Usage

initialize with a file and a codec, and wrap with flumedb.

var OffsetLog = require('flumelog-offset')
var codec = require('flumecodec')
var Flume = require('flumedb')

var db = Flume(OffsetLog(filename, {codec: codec.json}))
  .use(...) //also add some flumeviews

db.append({greets: 'hello!'}, function (cb) {

})

Options

var OffsetLog = require('flumelog-offset')
var log = OffsetLog('/data/log', {
  blockSize: 1024,        // default is 1024*16
  codec: {encode, decode} // defaults to no codec, expects buffers. for json use flumecodec/json
  flags: 'r',             // default is 'r+' (from aligned-block-file)
  cache: {set, get}       // default is require('hashlru')(1024)
  offsetCodec: {          // default is require('./frame/offset-codecs')[32]
    byteWidth,            // with the default offset-codec, the file can have
    encode,               // a size of 4GB max.
    decodeAsync
  }
})

legacy

if you used flumelog-offset before 3, and want to read your old data, use require('flumelog-offset/legacy')

recovery

If your system crashes while an append is in progress, it's unlikely but possible to have a partially written state. flumelog-offset will rewind to the last good state on the next start up.

After running this for several months (in my personal secure-scuttlebutt instance) I eventually got an error, which lead to the changes in this version.

format

data is stored in a append only log, where the byte index of the start of a record is the primary key (offset).

offset-><data.length (UInt32BE)>
        <data ...>
        <data.length (UInt32BE)>
        <file_length (UInt32BE or Uint48BE or Uint53BE)>

by writing the length of the data both before and after each record it becomes possible to scan forward and backward (like a doubly linked list)

It's very handly to be able to scan backwards, as often you want to see the last N items, and so you don't need an index for this.

future ideas

  • secured file (hashes etc)
  • encrypted file
  • make the end of the record be the primary key. this might make other code nicer...

License

MIT

results matching ""

    No results matching ""