PROTOCOL

Protocol between Spark Driver and Alchemist Driver (AlDriver):
  - Spark Driver sends sequence of commands to AlDriver, and AlDriver sends
    replies synchronously
  - all values are sent and received in network byte order
  - each command starts with a 32-bit "message type" field.

  - MESSAGE TYPES
    - handshake
      cmd:
        uint32_t typeCode = 0xABCD
        uint32_t protocolVersion = 0x1

      reply:
        uint32_t statusCode = 0xDCBA
        uint32_t protocolVersion = 0x1
        uint32_t numWorkers = number of AlWorkers
        WorkerInfo workers[numWorkers]


    - shutdown
        uint32_t typeCode = 0xFFFFFFFF

      reply:
        uint32_t statusCode = 0x1

    - newMatrix
        uint32_t typeCode = 0x1
        uint64_t numRows
        uint64_t numCols
        uint64_t numWorkers
        // each entry of layout must be number of rows to be recvd by that worker
        uint32_t layout[numWorkers]

      reply 1: "ready for receiving data"
        uint32_t statusCode = 0x1
        uint32_t matrixHandle

      [then spark sends data to workers]

      reply 2: "matrix successfully loaded"
        uint32_t statusCode = 0x1

    - matrixMul
        uint32_t matrixHandle 
        uint32_t matrixHandle 

      reply:
        uint32_t statusCode = 0x1
        uint32_t matrixHandle

  - STRUCTS
    - string
      uint32_t length
      uint8_t data[length]

    - WorkerInfo:
        string hostname
        uint32_t port


Block transfer protocol of AlWorker:
  uint32_t typeCode = 0x1
  uint32_t matrixHandle
  uint64_t rowIdx
  uint64_t numCols
  double rowData[numCols]

###
alchemist:
	use locality
	figure out how to force persistence of cached RDDs returned from Alchemist (to avoid replay -- maybe make it persist to disk in the worst case)
	allow freeing of matrices on the RDD and alchemist sides
	need logging system for better debugging and profiling (other than println to console)
	allow AlMatrices to be reshuffled internally (e.g. to row-wise distributions for kmeans)
	need to have more extensible way of adding to library functionality (e.g. a separate header for each lib we link against, and use strings for operations instead of hexs, since easier to make unique, e.g. "elemental/thinSVD" instead of 0x5)
	support returning vectors (of ints as well, e.g. for kmeans cluster assignments)
	support returning vector/matrices direct to driver (again e.g. for kmeans), gathering to Alchemist driver first if not already there (to avoid collects on Spark side)
	have better way to track the distributed (and new driver only) matrices/vectors/objects more generally, in the Alchemist driver
	more formal testing framework w/ unit tests, extensible for new functions, etc.