
Wednesday, January 5, 2011

Manipulating nested hashes in Ruby

Lately I needed to work with some nested hashes in Ruby. By nested hashes I mean hashes whose values are themselves hashes (or any other type, actually), something like this:

nested_hash = {"first_key"=>{"second_key"=>"value"},"third_key"=>12}

That was when I found out that I could not find any built-in methods for walking or updating a hash like this recursively, and therefore I had to come up with my own.

If you want to get all the values in a nested hash, you can do this:

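def get_all_values_nested(nested_hash={})
  # Created on the first call and shared by every recursive call.
  @all_values ||= []
  nested_hash.each_pair do |k,v|
    case v
      # Integer also covers Fixnum, which newer Rubies no longer define.
      when String, Integer then @all_values << v
      when Hash then get_all_values_nested(v)
      else raise ArgumentError, "Unhandled type #{v.class}"
    end
  end

  return @all_values
end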

Obviously, you could just run through all the key/value pairs, but that only works if there are no nested hashes. If there are, you need to call the method recursively each time you find a hash.

That's why we need the case statement: to differentiate between plain values and other hashes (or Arrays...). If the value is a hash, you pass it back to the method, and the recursion stops when it hits a plain value, which is added to the final array. In this example I treat String and Fixnum (Integer in current Ruby) as values; if any other type is present, an exception is raised.

Note that the all_values variable is an instance variable. It cannot be a local variable, because its values are used in all the recursive calls. Because it also keeps its contents between top-level calls, reset it (@all_values = []) before each new traversal. Another way of doing it would be to pass the variable to the method each time and update it at each return. I find it simpler and prettier like this.
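For comparison, here is a minimal sketch of the other approach mentioned above, where the array is passed along instead of stored in an instance variable. The method name collect_values and its values parameter are made up for this illustration.

```ruby
# Hypothetical variant: the accumulator is an argument, not an instance variable.
def collect_values(nested_hash, values = [])
  nested_hash.each_value do |v|
    case v
    when String, Integer then values << v              # plain value: keep it
    when Hash then collect_values(v, values)           # nested hash: recurse with the same array
    else raise ArgumentError, "Unhandled type #{v.class}"
    end
  end
  values
end

nested_hash = {"first_key"=>{"second_key"=>"value"},"third_key"=>12}
p collect_values(nested_hash)  # => ["value", 12]
```

Since a fresh array is created on every top-level call, nothing has to be reset between traversals.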

In the previous example you get an array with all of the values, and that's it. You may, however, want to know the path travelled to get to each value. That is actually not hard to implement, by changing the code just a little bit.

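def get_all_values_nested(nested_hash={})
  # Both are created on the first call and shared by every recursive call.
  @all_values ||= {}
  @path ||= []
  nested_hash.each_pair do |k,v|
    @path << k
    case v
      when String, Integer then
        # Store the value itself so that its type is preserved.
        @all_values[@path.join(".")] = v
        @path.pop
      when Hash then get_all_values_nested(v)
      else raise ArgumentError, "Unhandled type #{v.class}"
    end
  end
  # Done with this hash: drop the key that led into it.
  @path.pop

  return @all_values
end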

There are two main differences in this code. The first is a new instance variable, path, an array holding all the keys that had to be "visited" to get to the value. The most recently visited key is the last element of the array, and it is popped each time a value is found, or when all the keys of a certain hash are exhausted.

Imagine you have the hash first presented as an example; the path array would evolve like this:

["first_key"] , ["first_key","second_key"] - Here the value is found, so the path is saved in the all_values hash and a pop occurs, leaving the array in its previous state:

["first_key"] - the hash under first_key does not have any more keys, so another pop happens, and so on:

[] , ["third_key"] , []

The other difference is that a hash is returned instead of an array. The hash has the following format (using the previous example):

{"first_key.second_key"=>"value" , "third_key"=>12}
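The same flattening can be sketched without any instance variables, by passing the current path down as an argument so that no pop bookkeeping is needed. The name flatten_paths and its prefix and result parameters are assumptions made for this example, not part of the code above.

```ruby
# Hypothetical variant: the path so far travels as an argument.
def flatten_paths(nested_hash, prefix = [], result = {})
  nested_hash.each_pair do |k, v|
    path = prefix + [k]                                 # new array, so siblings never share state
    case v
    when String, Integer then result[path.join(".")] = v
    when Hash then flatten_paths(v, path, result)
    else raise ArgumentError, "Unhandled type #{v.class}"
    end
  end
  result
end

nested_hash = {"first_key"=>{"second_key"=>"value"},"third_key"=>12}
p flatten_paths(nested_hash)
```

The printed hash holds the same two pairs as the format shown above, with 12 kept as a number.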

So now you can get the values and know where they came from. The next thing you will probably want to do is change them and update the nested hash.

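def set_value_from_path(nested_hash,path_to_value,new_value)
  path_array = path_to_value.split "."
  last_key = path_array.pop
  hash = create_hash_with_path(path_array,{last_key=>new_value})
  # Called without an explicit receiver, so it also works when the
  # helpers are defined at the top level of a script (where they are private).
  merge!(nested_hash,hash)
end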

This method receives the nested hash, the path to the value in the format used by the get method, and the new value to be inserted. It first splits the path string into an array like the ones we've used before, then turns that array and the value into a nested hash using an auxiliary method, create_hash_with_path. That is a very simple method that takes a path array and a simple hash holding the last key and the value, and wraps that hash in one level per key of the path.

def create_hash_with_path(path_array,hash)
  # Wrap the innermost hash in one level per key, from the last key outwards.
  # An empty path_array (a top-level key) returns the hash unchanged.
  path_array.reverse.inject(hash) do |inner_hash,key|
    {key => inner_hash}
  end
end

Afterwards, set_value_from_path merges the nested hash (the one from the first example) with the newly created hash. The only problem is that you cannot use the merge methods from the Hash class directly, because they replace a nested hash wholesale instead of merging its contents. You can write a simple method that merges recursively.

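def merge!(merge1,merge2)
  case merge1
    # A plain value was reached: the new value replaces it.
    when String, Integer then merge2
    # Hash#merge! calls the block only for keys present in both hashes.
    else merge1.merge!(merge2) {|key,old_value,new_value| merge!(old_value,new_value)} if merge1 && merge2
  end
end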

It just wraps the Hash class's merge! to make it recursive: it stops when it reaches a value (String, Fixnum) and uses the value in merge2, which is the new value.
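To see why the plain merge is not enough, here is a small sketch that contrasts it with a recursive one. The helper deep_merge and the sample config data are invented for this illustration; unlike the method above, this helper returns a new hash rather than modifying its argument.

```ruby
# Hypothetical standalone helper: merges hashes level by level.
def deep_merge(old_value, new_value)
  if old_value.is_a?(Hash) && new_value.is_a?(Hash)
    # The block runs only for keys present in both hashes.
    old_value.merge(new_value) { |_key, old_v, new_v| deep_merge(old_v, new_v) }
  else
    new_value  # anything that is not a pair of hashes: the new value wins
  end
end

config = {"db" => {"host" => "localhost", "port" => 5432}}
update = {"db" => {"port" => 6543}}

shallow = config.merge(update)        # the whole "db" hash is replaced
deep    = deep_merge(config, update)  # only "port" changes inside "db"

p shallow["db"].key?("host")  # => false
p deep["db"]["host"]          # => "localhost"
p deep["db"]["port"]          # => 6543
```

With the built-in merge the "host" entry disappears, which is exactly the problem the recursive version avoids.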

So, there you go. Now you have a whole new arsenal of methods to deal with nested hashes. Keep in mind that you can change this code to be compatible with other kinds of values, like Symbols, and/or to accept other containers such as Arrays.
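As a rough idea of what such an extension could look like, the sketch below treats Symbols, Floats and a few other scalars as values and walks Arrays by index. The method name flatten_with_arrays and the choice of using the array index as a path segment are assumptions of this example, one possible design among others.

```ruby
# Hypothetical extension: more scalar types, and Arrays walked by index.
def flatten_with_arrays(node, prefix = [], result = {})
  case node
  when Hash
    node.each_pair { |k, v| flatten_with_arrays(v, prefix + [k], result) }
  when Array
    node.each_with_index { |v, i| flatten_with_arrays(v, prefix + [i], result) }
  when String, Symbol, Integer, Float, true, false, nil
    result[prefix.join(".")] = node   # scalar: record it under its dotted path
  else
    raise ArgumentError, "Unhandled type #{node.class}"
  end
  result
end

data = {"name" => :example, "tags" => ["a", "b"], "size" => {"w" => 1.5}}
p flatten_with_arrays(data)
```

Here the two tags end up under the keys "tags.0" and "tags.1". Note that a key which itself contains a dot would make the resulting paths ambiguous.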
