New year, new toy: forqlift

Posted by Q McCallum on 2011-01-01

I have a new project.

It’s called forqlift.

This one should be of use to people who crunch data with Hadoop or Mahout.

Here’s a bit of a blurb from forqlift’s page to get you started:

SequenceFiles are nice, but they can be unwieldy at times. I wrote forqlift to make it easier to manage SequenceFiles.

forqlift is a command-line tool that lets you:

  • create SequenceFiles from files on your local filesystem (just like creating an archive with tar or zip)
  • set compression (none, bzip2, gzip) and value types (text or binary)
  • extract the contents of a SequenceFile back to the filesystem
  • convert popular archive formats — tar (including tar.bz2 and tar.gz) and zip — to and from SequenceFile format
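To give a rough feel for the archive conversion idea: a SequenceFile is a flat file of key/value records, so an archive can be thought of as a stream of (name, contents) pairs. The Python sketch below is not forqlift's code, and it stops short of writing a real SequenceFile, since that needs the Hadoop libraries. It assumes, for illustration only, that each archive entry becomes one record keyed by its file name; the helper names `tar_to_records` and `make_sample_tar` are made up for this example.

```python
import io
import tarfile


def tar_to_records(tar_bytes):
    """Yield (member name, file contents) pairs from an in-memory tar archive.

    Assumption for illustration: one record per regular file, keyed by name.
    """
    # "r:*" lets tarfile detect plain, gzip, or bzip2 compression on its own.
    with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r:*") as archive:
        for member in archive:
            if member.isfile():  # skip directories, links, and so on
                yield member.name, archive.extractfile(member).read()


def make_sample_tar(files):
    """Build a small gzip-compressed tar archive in memory from name -> bytes."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            archive.addfile(info, io.BytesIO(data))
    return buffer.getvalue()


if __name__ == "__main__":
    sample = make_sample_tar({"a.txt": b"hello", "b.bin": b"\x00\x01"})
    for key, value in tar_to_records(sample):
        print(key, len(value), "bytes")
```

Each pair that comes out is the sort of thing a SequenceFile writer would store as one record, with the value kept as text or as raw bytes depending on the value type you pick.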

Head over to the forqlift page for more info!