2011-02-26

Serializing Records and Incanter Matrix in Clojure with print-dup

For most Clojure data structures, you can get a serialization by doing something like this:


(def x (some-data-structure 1 2 3))
(def x2 (binding [*print-dup* true]
          (prn-str x)))
;; now x2 is a string serialization of x
;; You can save x2 to disk or whatever
;; Then to read it back in:
(def x-clone (read-string x2))

(Note: you really shouldn't be programming in Clojure with a bunch of def'ed vars like that...)

Unfortunately, Clojure records do not support this functionality yet. There is defrecord2, which implements this serialization for records; it is described in this discussion in the Clojure group.

I had a second problem though: my records store Incanter matrices, which are Parallel Colt matrices, and those don't print-dup in a way that can be read back in with read (or read-string). To solve both problems at once, we can implement our own print-dup for the record we create with defrecord (print-dup is a multimethod).


So as an example, let's say I have:

(ns silly.core
  (:use incanter.core))
(defrecord rec-mat [m])

where m will be an Incanter matrix. We'd write something like this:


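(defmethod print-dup silly.core.rec-mat [obj wrtr]
  (.write wrtr "#=(silly.core/new-rec-mat ")
  ;; Convert the matrix to list form, then copy the record into a plain map.
  ;; Calling print-dup on the record itself would dispatch back to this method;
  ;; for a plain map, print-dup writes the #=(...PersistentArrayMap/create ...) form itself.
  (print-dup (into {} (update-in obj [:m] matrix-to-list-frm))
    wrtr)
  (.write wrtr ")"))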

This writes a string that read or read-string can read back in. Reading it calls your silly.core/new-rec-mat function with a single argument: a map whose :m key holds the Incanter matrix in list form, as built by the matrix-to-list-frm function.
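To see the read-back mechanism without pulling in Incanter, here is a minimal sketch of the same pattern on a stand-in record. The sketch.core namespace, the Box record, new-box, and round-trip are all hypothetical names made up for this illustration; the field holds a plain vector, so no list conversion is needed.

```clojure
(ns sketch.core) ; hypothetical namespace, for illustration only

;; Stand-in record holding plain Clojure data instead of an Incanter matrix.
(defrecord Box [v])

;; Constructor named in the serialized string; takes a single map argument.
(defn new-box [{:keys [v]}]
  (Box. v))

(defmethod print-dup sketch.core.Box [obj ^java.io.Writer w]
  (.write w "#=(sketch.core/new-box ")
  ;; (into {} obj) copies the fields into a plain map, which print-dup
  ;; already knows how to write in a readable form.
  (print-dup (into {} obj) w)
  (.write w ")"))

(defn round-trip
  "Serializes x with *print-dup* bound to true, then reads it back."
  [x]
  (read-string (binding [*print-dup* true] (pr-str x))))

;; The reader evaluates the #=(...) form, which calls new-box on the map.
(= (Box. [1 2 3]) (round-trip (Box. [1 2 3])))
```

Because the reader evaluates #=(...) forms, the constructor only has to exist when the string is read, not when it is written.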

That leaves two more functions to write. (In an actual source file, matrix-to-list-frm has to be defined above the print-dup defmethod for rec-mat, since that method calls it.) First, matrix-to-list-frm:


(defn matrix-to-list-frm
  [mat]
  (if (== 1 (first (dim mat)))
    (list (to-list mat))
    (to-list mat)))

to-list is a function from Incanter, but it doesn't handle matrices with only a single row properly: single-row and single-column matrices get transformed into the same flat list form. When that is read back in, it's hard to tell whether it was originally a row or a column vector, which may be an important distinction in your program. That is why we wrap it in our own matrix-to-list-frm function, which puts a single row inside an outer list so that it stays distinguishable from a column.
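The ambiguity can be illustrated without Incanter. In this rough sketch a "matrix" is just a vector of row vectors, and flat-list only mimics the flattening described above; it is not Incanter's to-list. All names here (sketch.shape, dims, flat-list, shape-preserving-list) are hypothetical.

```clojure
(ns sketch.shape) ; hypothetical namespace, for illustration only

;; Stand-in for a matrix: a vector of row vectors. This is NOT Incanter.
(defn dims
  "Returns [rows cols] for a vector of row vectors."
  [rows]
  [(count rows) (count (first rows))])

(defn flat-list
  "Mimics the described flattening: single rows and single columns
  both collapse to one flat list."
  [rows]
  (let [[r c] (dims rows)]
    (cond
      (= c 1) (map first rows)
      (= r 1) (seq (first rows))
      :else   (map seq rows))))

(defn shape-preserving-list
  "Same idea as matrix-to-list-frm: keep a single row nested."
  [rows]
  (if (= 1 (first (dims rows)))
    (list (flat-list rows))
    (flat-list rows)))

(def row [[1 2 3]])     ; 1x3
(def col [[1] [2] [3]]) ; 3x1

;; The flat forms are identical, so the shape is lost.
(= (flat-list row) (flat-list col))

;; With the wrapper the two forms differ, so the shape survives.
(= (shape-preserving-list row) (shape-preserving-list col))
```

The first comparison is true and the second is false, which is the distinction the wrapper exists to keep.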

Secondly, we write the new-rec-mat function:

(defn new-rec-mat
  ;; Takes the map written by print-dup and rebuilds the matrix from its list form.
  [{:keys [m]}]
  (rec-mat. (matrix m)))

And that's it! Using the same approach, you can make records contain Clojure vectors of Incanter matrices, and still be able to print-dup them.

If you just need to save a single Incanter matrix, you should probably just use the incanter.core/save function though.

Edit: Argh! I forgot to write about how this doesn't work if you assoc into the record some key/val that's not pre-defined as part of that record. There's a way around it, of course, but I haven't had time to write it down here...one of these days though...
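One possible way around this, which is not necessarily the one I had in mind back then, is to write every key of the record into the map and have the constructor merge the leftover keys back in. A minimal sketch on a stand-in record with plain data (sketch.extras, Box, new-box, and round-trip are hypothetical names, and no Incanter is involved):

```clojure
(ns sketch.extras) ; hypothetical namespace, for illustration only

;; Stand-in record with one declared field.
(defrecord Box [v])

(defn new-box
  "Builds a Box from the declared field, then merges back any extra keys."
  [{:keys [v] :as data}]
  (merge (Box. v) (dissoc data :v)))

(defmethod print-dup sketch.extras.Box [obj ^java.io.Writer w]
  (.write w "#=(sketch.extras/new-box ")
  ;; (into {} obj) includes keys that were assoc'ed on after construction.
  (print-dup (into {} obj) w)
  (.write w ")"))

(defn round-trip
  "Serializes x with *print-dup* bound to true, then reads it back."
  [x]
  (read-string (binding [*print-dup* true] (pr-str x))))

;; :note is not a declared field of Box, but it survives the round trip.
(let [b (assoc (Box. [1 2 3]) :note "extra")]
  (= b (round-trip b)))
```

For the matrix record, the constructor would also need to convert :m back with matrix before merging.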
