What’s in this post
A design pattern is a general repeatable solution to a commonly occurring problem in software design (link for a detailed discussion).
To be honest, I think lots of patterns are peculiar. But there are a few very interesting patterns.
The Visitor pattern is my favorite.
In this post, we will using the visitor pattern to implement
- a JSON dump and load library,
- a binary serialization/deserialization library.
- handle type version.
The Visitor Pattern
The visitor pattern can be summarized as:
- build a (traverse) tree for types.
- a visitor object will traverse the tree and visit each element.
Consider this code,
class Traversable(ABC): @abstractmethod def traverse(self): pass class A(Traversable): def __init__(self): self.a = 1 self.b = "whatever" self.c = [1, 2, 3] def traverse(self, visitor): visitor.visit('a', self.a) visitor.visit('b', self.b) visitor.visit('c', self.c) class B(Traversable): def __init__(self): self.b1 = 100 self.instance_of_A = A() def traverse(self, visitor): visitor.visit('b1', self.b1) visitor.visit('instance_of_A', self.instance_of_A)
Type A and Type B has a method call traverse
. In the function, a visitor “visits” each member variable.
Now, let’s check the implementation of the visitor.
class VisitorBase(ABC): def __init__(self): pass @abstractmethod def on_leaf(self, name, obj): pass @abstractmethod def on_enter_level(self, name): pass @abstractmethod def on_leave_level(self): pass @abstractmethod def on_enter_list(self, name, obj): pass @abstractmethod def on_leave_list(self): pass def visit(self, name, obj): if isinstance(obj, list): self.on_enter_list(name, obj) for e in obj: self.visit(name = None, obj = e) self.on_leave_list() elif isinstance(obj, Traversable): self.on_enter_level(name) obj.traverse(self) self.on_leave_level() else: self.on_leaf(name, obj)
For a visitor, the visit function “visits” the current object. There are 3 cases.
- If the visited object is a list, the visitor visits all elements in the list.
- If the visited object is an instance of
Traversable
, we use the current visitor to traverse theTraversable
object. - else, we assume the current object is an instance of primitive types. e.g. int, float, string. We call the
self.on_leaf
method to do the actual work on the primitive types. A user will inhereint theVisitorBase
type and define the actual work.
There are some abstractmethods that define the behavior before or after an action. The usage will be clear later in a concrete example later.
If we call the traverse
function for an instance of Type B, the traversal and visits be can be visualized as,
Using the visitor pattern, we can implement functionalities related to serialization and deserialization.
We will implement,
- A JSON dumper and a JSON loader.
- A binary serializer and deserializer.
- Adding version handling.
A JSON dumper
We can use the visitor pattern to implement a JSON dumper.
We simply need to inherent the VisitorBase
and implement its abstract methods. Here is the full implementation for the JSON dumper.
class JsonDump(VisitorBase): def __init__(self): self.result = '' self.level = 0 self.indent = 2 def on_leaf(self, name, obj): obj_str = str(obj) if not isinstance(obj, str) else '"{}"'.format(obj) if name == None: self.result += ' ' * self.level + obj_str + ',\n' else: self.result += ' ' * self.level + '"' + name + '"' + ':' + obj_str + ',\n' def on_enter_level(self, name): if name == None: self.result += ' ' * self.level + '{\n' else: self.result += ' ' * self.level + '"' + name + '"' + ':' + '{\n' self.level += self.indent def on_leave_level(self): self.level -= self.indent self.result += ' ' * self.level + '}\n' def on_enter_list(self, name, obj): if name == None: self.result += ' ' * self.level + '[\n' else: self.result += ' ' * self.level + '"' + name + '"' + ':' + '[\n' self.level += self.indent def on_leave_list(self): # remove the last ',' self.result = self.result[:-2] + '\n' self.level -= self.indent self.result += ' ' * self.level + ']\n'
We maintain a indent.
When we enter a traversable
object, we save its name, increase the indent, and add a “{“. When we exit the traversable
object, we decrease the indent and add a “}”.
We do the same for list with the exception of add “[” or “]”.
For leave nodes, we need to save the leave’s name and value.
An example.
class A(Traversable): def __init__(self): self.a = 1 self.b = "whatever" self.c = [1, 2, 3] def traverse(self, visitor): visitor.visit('a', self.a) visitor.visit('b', self.b) visitor.visit('c', self.c) class B(Traversable): def __init__(self): self.b1 = 100 self.instance_of_A = A() def traverse(self, visitor): visitor.visit('b1', self.b1) visitor.visit('instance_of_A', self.instance_of_A) if __name__ == "__main__": b = B() jsonDump = JsonDump() jsonDump.visit('obj_b', b) print(jsonDump.result)
Outputs.
yimu@yimu-dell:/usr/bin/python3 ~/yimu-blog/random/visitor/simple/visitor_simple.py "obj_b":{ "b1":100, "instance_of_A":{ "a":1, "b":"whatever", "c":[ 1, 2, 3 ] } }
A JSON Loader
Using the visitor pattern, we can also implement a JSON loader.
However, there is a Python issue. For function arguments, Python passes objects by reference but pass primitive types (int/float) by value.
We can’t change a int argument for updating/outputting the int variable outside of the function.
For example.
# in C++, the function can change a. The value of a is updated out side of the function void update_a(int &a) { a = 100; } # in python, we copy the argument a. The value of a stay the same out side of the function def update_a(a : int) { a = 100 }
To work around the issue, I used a object to warp-around primitive types. I call the type RefObj
.
class RefObj: def __init__(self, val): self.val = val ... class TypeA(Traversable): def __init__(self): self.a = RefObj(1) self.b = RefObj("whatever") self.c = [RefObj(1), RefObj(-1), RefObj(2.2)] def traverse(self, visitor): visitor.visit('a', self.a) visitor.visit('b', self.b) visitor.visit('c', self.c) class TypeB(Traversable): def __init__(self): self.b1 = RefObj(123) self.instance_of_A = TypeA() def traverse(self, visitor): visitor.visit('b1', self.b1) visitor.visit('instance_of_A', self.instance_of_A)
Now we can implement the JSON loader. (The JSON dumper also need to be updated for the RefObj)
class JsonLoader(visitor_ref.VisitorBase): def __init__(self, dumped): self.dumped = deque(dumped.split('\n')) def on_leaf(self, name, obj): line = self.dumped.popleft() line = line.strip().rstrip(',') value_str = line if ':' in line: _, value_str = line.split(':') if value_str[0] == '"': obj.val = value_str.strip('"') else: obj.val = float(value_str) def on_enter_level(self, name): self.dumped.popleft() def on_leave_level(self): self.dumped.popleft() def on_enter_list(self, name, obj): cur = self.dumped.popleft() def indent(line): return len(line) - len(line.lstrip()) list_start_indent = indent(cur) idx = 0 while indent(self.dumped[idx]) != list_start_indent: idx += 1 list_size = idx # need to operate on the list object obj.clear() for _ in range(list_size): obj.append(visitor_ref.RefObj(None)) def on_leave_list(self): self.dumped.popleft()
The JsonLoader
take a JSON dumped string as initialization input. To make processing easier, we split the string by “\n” and put results into a deque. Using deque, we can process data one by one.
Then, as we traverse the tree, we simply put data into each leave node.
(one caveat: python is dynamically typed. There is no easy way to differentiate a type as an int or float. For all numerical string, I converted them to float.)
That’s it!
Here is a example using the JSON dumper and the JSON loader.
def json_example(): print("============= binary_example ================") # dump the body of b b1 = TypeB() bin_dumper = BinDumper() bin_dumper.visit('b1', b1) print('b1 dump:', bin_dumper.result, sep='\n') b1_dump = bin_dumper.result # b2 with no values b2 = TypeB() b2.b1 = RefObj(None) b2.instance_of_A.a = RefObj(None) b2.instance_of_A.b = RefObj(None) b2.instance_of_A.c = [] bin_dumper = BinDumper() bin_dumper.visit('b2', b2) print('b2 dump:', bin_dumper.result, sep='\n') # deserialize b dump into b2 loader = BinLoader(b1_dump) loader.visit('b2', b2) # check the dump of b2 jsonDump = BinDumper() jsonDump.visit('b2', b2) print('b2 dump after loading b1:', jsonDump.result, sep='\n')
output.
yimu@yimu-dell:~/Desktop/yimu-blog/least_squares/land_rocket/build$ /usr/bin/python3 /home/yimu/Desktop/yimu-blog/random/visitor/visitor_ref/example.py ============= binary_example ================ b1 dump: "b1":{ "b1":123, "instance_of_A":{ "a":1, "b":"whatever", "c":[ 1, -1, 2.2 ] } } b2 dump: "b2":{ "b1":None, "instance_of_A":{ "a":None, "b":None, "c":[ ] } } b2 dump after loading b1: "b2":{ "b1":123.0, "instance_of_A":{ "a":1.0, "b":"whatever", "c":[ 1.0, -1.0, 2.2 ] } }
Binary Serializer and Deserializer
We can use the visitor pattern to implement a binary serializer and a deserializer. Here is the full implementation.
class BinDumper(visitor_ref.VisitorBase): def __init__(self): self.result = '' self.seperator = '|' def on_leaf(self, name, obj): obj_str = str(obj.val) if not isinstance( obj.val, str) else '"{}"'.format(obj.val) self.result += obj_str + self.seperator def on_enter_level(self, name): pass def on_leave_level(self): pass def on_enter_list(self, name, obj): # save the length of the list self.result += str(len(obj)) + self.seperator def on_leave_list(self): pass class BinLoader(visitor_ref.VisitorBase): def __init__(self, dumped): self.seperator = '|' self.dumped = deque(dumped.split(self.seperator)) def on_leaf(self, name, obj): value_str = self.dumped.popleft() if value_str[0] == '"': obj.val = value_str.strip('"') else: obj.val = float(value_str) def on_enter_level(self, name): pass def on_leave_level(self): pass def on_enter_list(self, name, obj): list_size = int(self.dumped.popleft()) # need to operate on the list object obj.clear() for _ in range(list_size): obj.append(visitor_ref.RefObj(None)) def on_leave_list(self): pass
The only trick is using a separator for each leave node. We simple traversal the tree and save/read data.
An example.
def binary_example(): print("============= binary_example ================") # dump the body of b b1 = TypeB() json_dumper = JsonDumper() json_dumper.visit('b1', b1) print('b1 dump:', json_dumper.result, sep='\n') b1_dump = json_dumper.result # b2 with no values b2 = TypeB() b2.b1 = RefObj(None) b2.instance_of_A.a = RefObj(None) b2.instance_of_A.b = RefObj(None) b2.instance_of_A.c = [] json_dumper = JsonDumper() json_dumper.visit('b2', b2) print('b2 dump:', json_dumper.result, sep='\n') # deserialize b dump into b2 loader = JsonLoader(b1_dump) loader.visit('b2', b2) # check the dump of b2 json_dumper = JsonDumper() json_dumper.visit('b2', b2) print('b2 dump after load b1:', json_dumper.result, sep='\n')
outputs.
============= binary_example ================ b1 dump: 123|1|"whatever"|3|1|-1|2.2| b2 dump: None|None|None|0| b2 dump after load b1: 123.0|1.0|"whatever"|3|1.0|-1.0|2.2|
Handle Version
A common problem in practice is the version. For a type, we may want to add member variables to it in the development process.
However, after we added new variables, we can’t deserialize old dump object anymore.
The visitor pattern solve the issue by adding version.
In the Traversable
base class, we add a version int into it. In the base traverse
function, we visit the self.version
variable.
class Traversable(ABC): def __init__(self): self.version = RefObj(1) @abstractmethod def traverse(self, visitor): visitor.visit('version', self.version)
For types,
class TypeA(Traversable): def __init__(self): "SEE HERE!!!!!!" super().__init__() self.version = RefObj(2) self.a = RefObj(1) self.b = RefObj("whatever") self.c = [RefObj(1), RefObj(-1), RefObj(2.2)] self.v2_var = RefObj('version 2!') def traverse(self, visitor): "SEE HERE!!!!!!"" super().traverse(visitor) visitor.visit('a', self.a) visitor.visit('b', self.b) visitor.visit('c', self.c) "SEE HERE!!!!!!!"" if self.version.val >= 2: visitor.visit('v2_var', self.v2_var) class TypeB(Traversable): def __init__(self): "SEE HERE!!!!!!!"" super().__init__() self.b1 = RefObj(123) self.instance_of_A = TypeA() def traverse(self, visitor): "SEE HERE!!!!!!!"" super().traverse(visitor) visitor.visit('b1', self.b1) visitor.visit('instance_of_A', self.instance_of_A)
In each type, we need to call the base class __init__
function to init the version variable.
In the traverse
class, we call the base class traverse
function to visit the self.verison
first.
In line 21. We have an if statement for visiting variables. When loading an old version of a serialized string, we load the version first. if the loaded version is lower than the current version, we simply skip the variable.
An example.
def json_example(): print("============= binary_example ================") # dump the body of b b1 = TypeB() b1.instance_of_A.version = RefObj(1) json_dumper = JsonDumper() json_dumper.visit('b1', b1) print('b1 dump:', json_dumper.result, sep='\n') b1_dump = json_dumper.result # b2 with no values b2 = TypeB() b2.b1 = RefObj(None) b2.instance_of_A.a = RefObj(None) b2.instance_of_A.b = RefObj(None) b2.instance_of_A.c = [] json_dumper = JsonDumper() json_dumper.visit('b2', b2) print('b2 dump:', json_dumper.result, sep='\n') # deserialize b dump into b2 loader = JsonLoader(b1_dump) loader.visit('b2', b2) # check the dump of b2 json_dumper = JsonDumper() json_dumper.visit('b2', b2) print('b2 dump after load b1:', json_dumper.result, sep='\n')
Outputs.
yimu@yimu-dell:~/Desktop/yimu-blog/least_squares/land_rocket/build$ /usr/bin/python3 /home/yimu/Desktop/yimu-blog/random/visitor/vistor_ref_version/example.py ============= binary_example ================ b1 dump: "b1":{ "version":1, "b1":123, "instance_of_A":{ "version":1, "a":1, "b":"whatever", "c":[ 1, -1, 2.2 ] } } b2 dump: "b2":{ "version":1, "b1":None, "instance_of_A":{ "version":2, "a":None, "b":None, "c":[ ] "v2_var":"version 2!", } } b2 dump after load b1: "b2":{ "version":1.0, "b1":123.0, "instance_of_A":{ "version":1.0, "a":1.0, "b":"whatever", "c":[ 1.0, -1.0, 2.2 ] } }
The End
I love the pattern 🙂
I hope you enjoy it too!
71 thoughts on “The Visitor Pattern”