【Question title】: Visual alternative for hierarchical tree: (((A,B),(C,D)),E)?
【Posted】: 2014-05-03 16:58:48
【Question description】:

I have a hierarchical tree of the following form:

(((A,B),(C,D)),E)

Is there a simple way to rearrange or draw it (for example, with Python)?

【Comments】:

    Tags: python plot sublimetext2 visualization


    【Solution 1】:

    Run this series of replacements:

    • ( -> <ul><li>
    • ) -> </li></ul>
    • , -> </li><li>

    Then open the result in a browser.
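The replacements can also be scripted. The following is a minimal sketch, not part of the original answer: the function name `tree_to_html` is made up for illustration, and it assumes the labels themselves contain no parentheses or commas. Closing tags are emitted as `</li></ul>` so that the markup nests correctly.

```python
def tree_to_html(text):
    """Turn '(((A,B),(C,D)),E)' into nested <ul>/<li> markup."""
    # Map each structural character to its HTML counterpart; every other
    # character (the labels) is copied through unchanged.
    replacements = {"(": "<ul><li>", ")": "</li></ul>", ",": "</li><li>"}
    return "".join(replacements.get(ch, ch) for ch in text)


html = tree_to_html("(((A,B),(C,D)),E)")
print(html)
```

Writing `html` to a file with an `.html` extension and opening it in a browser shows the tree as a nested bullet list.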


    Or, if it is a Python object, use pprint:

    >>> x = ((("A","B"),("C","D")),"E")
    >>> from pprint import pprint
    >>> pprint(x, width=1)
    ((('A',
       'B'),
      ('C',
       'D')),
     'E')
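If the tree arrives as text rather than as a Python object, it has to be parsed before pprint can be used. Below is one possible sketch of a small recursive-descent parser. The name `parse_tree` is hypothetical, and the sketch assumes labels contain no whitespace, parentheses, or commas.

```python
from pprint import pprint


def parse_tree(text):
    """Parse a string such as '(((A,B),(C,D)),E)' into nested tuples of str."""
    text = "".join(text.split())  # assumption: labels contain no whitespace
    pos = 0

    def peek():
        # Current character, or "" once the input is exhausted.
        return text[pos] if pos < len(text) else ""

    def parse_node():
        nonlocal pos
        if peek() == "(":
            pos += 1  # consume '('
            children = [parse_node()]
            while peek() == ",":
                pos += 1  # consume ','
                children.append(parse_node())
            if peek() != ")":
                raise ValueError("expected ')' at position %d" % pos)
            pos += 1  # consume ')'
            return tuple(children)
        # Leaf: read characters up to the next delimiter.
        start = pos
        while pos < len(text) and text[pos] not in "(),":
            pos += 1
        return text[start:pos]

    tree = parse_node()
    if pos != len(text):
        raise ValueError("unexpected trailing text at position %d" % pos)
    return tree


tree = parse_tree("(((A,B),(C,D)),E)")
pprint(tree, width=1)
```

The resulting nested tuple has the same shape as the `x` used in the answers, so it can be passed to any of the drawing functions that expect tuples of strings.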
    

    Or a custom Python solution:

    # Python 3: the built-in zip is lazy, so itertools.izip
    # (Python 2 only) is not needed.

    def first_then(first, then):
        """Yield `first` once, then `then` forever."""
        yield first
        while True:
            yield then

    def tree_lines(x):
        """Yield the lines of a text drawing of the nested tuple x."""
        if isinstance(x, tuple):
            if len(x) == 1:
                # singular tuple
                # zip stops at the shorter input, so the infinite prefix
                # generator ends together with tree_lines(...)
                for p, l in zip(first_then('--', '  '), tree_lines(x[0])):
                    yield p + l
            else:
                first, rest, last = x[0], x[1:-1], x[-1]

                # first entry
                for p, l in zip(first_then('T-', '| '), tree_lines(first)):
                    yield p + l

                # middle entries
                for y in rest:
                    for p, l in zip(first_then('>-', '| '), tree_lines(y)):
                        yield p + l

                # last entry
                for p, l in zip(first_then('L-', '  '), tree_lines(last)):
                    yield p + l
        else:
            # leaf
            yield str(x)

    x = ((('A','B'),('C','D')),'E')

    for l in tree_lines(x):
        print(l)
    

    【Comments】:

      【Solution 2】:

      A while ago I wrote something that produces text representations of trees. It may be a good fit here.

      class Node:
          def __init__(self, value):
              self.value = value
              self.children = []
      
      
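      # Code page 437 box-drawing codes: the result shown below assumes
      # Python 2 on a CP437 console. Elsewhere chr(179) is '³', not '│'.
      pipe = chr(179)
      t = chr(195)
      l = chr(192)
      backwards_r = chr(191)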
      
      def printable(node, seq_is_last_child = []):
          """returns a string representation of the given node"""
          ret = ""
          if seq_is_last_child:
              for b in seq_is_last_child[:-1]:
                  if b:
                      ret = ret + "  "
                  else:
                      ret = ret + pipe + " "
              if seq_is_last_child[-1]:
                  ret = ret + l + " "
              else:
                  ret = ret + t + " "
          ret = ret + node.value
          for idx, c in enumerate(node.children):
              ret = ret + "\n" + printable(c, seq_is_last_child + [idx == len(node.children)-1])
          return ret
      
      def make_node(t):
          """creates a Node system from a nested tuple"""
          ret = Node(backwards_r)
          for child in t:
              if isinstance(child, str):
                  ret.children.append(Node(child))
              else:
                  ret.children.append(make_node(child))
          return ret
      
      x = ((('A','B'),('C','D')),'E')
      print(printable(make_node(x)))
      

      Result:

      ┐
      ├ ┐
      │ ├ ┐
      │ │ ├ A
      │ │ └ B
      │ └ ┐
      │   ├ C
      │   └ D
      └ E
      

      Edit: Unicode version:

      class Node:
          def __init__(self, value):
              self.value = value
              self.children = []
      
      def printable(node, seq_is_last_child = []):
          """returns a string representation of the given node"""
          ret = ""
          if seq_is_last_child:
              for b in seq_is_last_child[:-1]:
                  if b:
                      ret = ret + "  "
                  else:
                      ret = ret + "│ "
              if seq_is_last_child[-1]:
                  ret = ret + "└ "
              else:
                  ret = ret + "├ "
          ret = ret + node.value
          for idx, c in enumerate(node.children):
              ret = ret + "\n" + printable(c, seq_is_last_child + [idx == len(node.children)-1])
          return ret
      
      def make_node(t):
          """creates a Node system from a nested tuple"""
          ret = Node("┐")
          for child in t:
              if isinstance(child, str):
                  ret.children.append(Node(child))
              else:
                  ret.children.append(make_node(child))
          return ret
      
      x = ((('A','B'),('C','D')),'E')
      print(printable(make_node(x)))
      

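      【Comments】:

      • Maybe you should use Unicode. On my machine, for example, chr(179) == '³'. ASCII is 7-bit and only goes up to 127.
      • @Hyperboreus, you're right. I added a Unicode version.
      【Solution 3】:

      You can use a recursive function to find the depth and height of each point: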

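      from functools import partial

      import numpy as np

      def locate(xs, depth, cnt):
          """Return a dict with the depth and height of xs and of its children."""
          if isinstance(xs, str):
              # leaf: take the next free row, counting downwards
              return dict(depth=depth, height=-next(cnt), inner=None, txt=xs)
          else:
              fn = partial(locate, depth=depth+1, cnt=cnt)
              loc = list(map(fn, xs))
              # inner node: centred vertically on its children
              height = np.mean([x['height'] for x in loc])
              return dict(depth=depth, height=height, inner=loc, txt=None)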
      

      The function above returns a dictionary. We need a second function that walks this dictionary and plots each node:

      def walk(loc, ax):
          col, lw = 'DarkBlue', 2
          x, y, inner, txt = map(loc.get, ['depth', 'height', 'inner', 'txt'])
          if not inner:
              ax.text(x, y, ' ' + txt, ha='left', va='center', size='large')
              return y
          else:
              ys = [walk(t, ax) for t in inner]
              for y1 in ys:
                  ax.plot([x, x+1], [y1, y1], color=col, linewidth=lw)
              ax.plot([x, x], [min(ys), max(ys)], color=col, linewidth=lw)
              return y
      

      The locate function is called at the top level with a count iterator, and returns a dictionary that contains all the information needed to draw each level:

      from itertools import count
      xs = ((('A','B'),('C','D')),'E',)
      loc = locate(xs, 0, count())
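To make the layout concrete, the sketch below recomputes the same depth and height values with the standard library only and lists them. It is an illustration rather than the answer's own code: `locate_plain` and `describe` are hypothetical names, and `statistics.mean` stands in for `np.mean`.

```python
from itertools import count
from statistics import mean


def locate_plain(xs, depth, cnt):
    """Leaves get consecutive heights 0, -1, -2, ...; parents get the mean."""
    if isinstance(xs, str):
        return {"depth": depth, "height": -next(cnt), "inner": None, "txt": xs}
    inner = [locate_plain(x, depth + 1, cnt) for x in xs]
    height = mean(node["height"] for node in inner)
    return {"depth": depth, "height": height, "inner": inner, "txt": None}


def describe(node, out):
    """Collect (depth, height, label) for every node in pre-order."""
    label = node["txt"] if node["txt"] is not None else "(internal)"
    out.append((node["depth"], node["height"], label))
    for child in node["inner"] or []:
        describe(child, out)
    return out


rows = describe(locate_plain(((("A", "B"), ("C", "D")), "E"), 0, count()), [])
for depth, height, label in rows:
    print(depth, height, label)
```

For this tree the leaves A to E sit at heights 0 to -4, the pair (A,B) at -0.5, the pair (C,D) at -2.5, their parent at -1.5, and the root at -2.75, the mean of -1.5 and -4. The depth becomes the x coordinate and the height the y coordinate when the tree is drawn.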
      

      The dictionary and an axes object are passed to the walk function:

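      import matplotlib.pyplot as plt

      fig = plt.figure(figsize=(2, 3))
      ax = fig.add_axes([.05, .05, .9, .9])
      walk(loc, ax)

      plt.axis('off')
      # pad the limits slightly so the outermost lines are not clipped
      xl, yl = ax.get_xlim(), ax.get_ylim()
      ax.set_xlim(xl[0] - .05, xl[1] + .05)
      ax.set_ylim(yl[0] - .05, yl[1] + .05)
      plt.show()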
      

      The result is: [figure]

      Another example:

      xs = ((('A','B','C','D'),('E'),('F1','F2'),'G'),(('H1','H2'),('I','J','K'),'L'))
      

      【Comments】:

        【Solution 4】:

        Use scipy's cluster.hierarchy.dendrogram:

        import re
        import numpy as np
        import matplotlib.pyplot as plt
        import scipy.cluster.hierarchy as hier
        import scipy.spatial.distance as dist
        import itertools as IT
        
        def parse_nested(text, left=r'[(]', right=r'[)]', sep=r','):
            """ http://stackoverflow.com/a/17141899/190597 (falsetru) """
            pat = r'({}|{}|{})'.format(left, right, sep)
            tokens = re.split(pat, text)    
            stack = [[]]
            for x in tokens:
                if not x: continue
                if re.match(sep, x): continue
                if re.match(left, x):
                    stack[-1].append([])
                    stack.append(stack[-1][-1])
                elif re.match(right, x):
                    stack.pop()
                    if not stack:
                        raise ValueError('error: opening bracket is missing')
                else:
                    stack[-1].append(x)
            if len(stack) > 1:
                print(stack)
                raise ValueError('error: closing bracket is missing')
            return stack.pop()
        
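        def place_points(datalist, x=None, y=1):
            # Create a fresh counter per top-level call; a default of
            # IT.count() would be shared and keep advancing across calls.
            if x is None:
                x = IT.count()
            retval = []
            for item in datalist:
                if isinstance(item, list):
                    next(x)
                    retval.extend(place_points(item, x=x, y=y*2.5))
                else:
                    retval.append([item, (next(x), y)])
            return retval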
        
        # data = '(((A,B,G),(C,D,F)),(E,(H,I,J,(K,L,M,N))))'
        data = '((A,B),(C,D),E)'
        labels, points = zip(*place_points(parse_nested(data)))
        d = dist.pdist(points)
        linkage_matrix = hier.linkage(d)
        P = hier.dendrogram(linkage_matrix, labels=labels)
        plt.show()
        

        【Comments】:
