【Question title】: Is there a regex to recursively modify a list and change its elements in python3?
【Posted】: 2020-02-02 10:44:01
【Question】:

I have a list of strings in the following format:

["6,7",
"6-8",
"10,12",
"15-18"]

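I need to split each string into its individual elements. If it contains a `,`, I only need to split on it. If it contains a `-`, I need to generate the range of numbers, including the ones in between.

Example: '6,7' splits into ['6','7'], and '6-8' becomes ['6','7','8']

I wrote this function, which works well for those cases: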

def process_nums(verse_nums_):
    if ',' in verse_nums_:
        verse_nums = [i.strip() for i in verse_nums_.split(',')]
    elif '-' in verse_nums_:
        beg_end = [int(i) for i in verse_nums_.split('-')]
        # convert back to str so every branch returns a list of strings
        verse_nums = [str(i) for i in range(beg_end[0], beg_end[1] + 1)]
    else:
        verse_nums = [verse_nums_]
    return verse_nums

But then I ran into a string like '6-8,10'. This should become ['6','7','8','10']. I can do an initial split to get ['6-8','10'].

I wrote a small workaround in code:

verse_nums = process_nums('6-8,10')

for x in list(verse_nums):  # iterate over a copy, since the list is modified below
    if '-' in x:
        verse_nums.extend(process_nums(x))
        verse_nums.remove(x)
verse_nums = sorted(int(i) for i in verse_nums)  # list.sort() returns None, so use sorted()

Is there a more elegant way to do this?

Note: I wasn't sure how to phrase the question properly in the title. Feel free to edit it.

【Comments】:

    Tags: python regex list


    【Solution 1】:

    I think you're close. No regex is needed. I would always split on commas first; each resulting part is then either a range or a single item.

    def process_nums(nums):
      parts = nums.split(',')
      for part in parts:
        if '-' in part:
          a, b = part.split('-')
          yield from (str(i) for i in range(int(a), int(b)+1))
        else:
          yield part
    
    print(list(process_nums('6-8,10')))
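
The same split-on-commas idea can be applied to the whole list from the question. The following is only a sketch: the helper name `expand` is made up for illustration, and it assumes that spaces around commas should be ignored, which the question does not state explicitly.

```python
def expand(spec):
    """Expand one spec such as '6-8, 10' into ['6', '7', '8', '10']."""
    result = []
    for part in spec.split(','):
        part = part.strip()  # assumption: tolerate spaces around commas
        if '-' in part:
            a, b = part.split('-')
            result.extend(str(i) for i in range(int(a), int(b) + 1))
        else:
            result.append(part)
    return result


verses = ["6,7", "6-8", "10,12", "15-18", "6-8, 10"]
expanded = [expand(v) for v in verses]  # one list of strings per input string
print(expanded)
```

Each input string keeps its own result list here; flattening them into a single list is a separate choice.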
    

    【Comments】:

      【Solution 2】:

      IMO a regex is better, because str.split on its own may not detect invalid input such as ",-,-2".

      import re
      from typing import List
      
      
      def process(numbers: List[str]) -> List[str]:
          output = []
          for no_idea_what_this_is in numbers:
              for value in no_idea_what_this_is.split(","):
                  match = re.fullmatch(r"(\d+)-(\d+)", value)
                  if match:
                      start = int(match.group(1))
                      stop = int(match.group(2)) + 1
                      output.extend([str(i) for i in range(start, stop)])
                  elif re.fullmatch(r"\d+", value):
                      output.append(value)
                  else:
                      raise ValueError(f"Unable to parse {value}")
          return output
      
      
      print(process(["4-8,10"]))
      # ['4', '5', '6', '7', '8', '10']
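
If the goal is only to reject malformed input up front, one option is a single pattern that describes the whole string. This is a sketch rather than part of the answer above: the names `SPEC_RE` and `is_valid_spec` are hypothetical, and the pattern assumes no whitespace is allowed and does not check that a range's start is less than or equal to its end.

```python
import re

# One or more comma-separated items, each either "N" or "N-M".
SPEC_RE = re.compile(r"\d+(?:-\d+)?(?:,\d+(?:-\d+)?)*")


def is_valid_spec(spec):
    """Return True if the whole string has the expected shape."""
    return SPEC_RE.fullmatch(spec) is not None


for candidate in ["6-8,10", "6,7", ",-,-2", "6-", "2-3-4"]:
    print(repr(candidate), is_valid_spec(candidate))
```

Only the first two candidates pass; the other three are rejected before any call to `int()` is made.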
      
      

      【Comments】:

        【Solution 3】:

        Try this:

        mylist = ["6,7","6-8","10,12","15-18"]
        new_list = []
        
        
        
        for i in mylist :
            if ',' in i :
                # extend handles any number of comma-separated items
                new_list.extend(i.split(','))
            elif '-' in i :
                start, end = i.split('-')
                for y in range(int(start), int(end) + 1):
                    new_list.append(str(y))
        
        print(new_list)
        

        Output:

        ['6', '7', '6', '7', '8', '10', '12', '15', '16', '17', '18']
        

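        【Comments】:

          【Solution 4】:

          We can make a small modification to the code from GENERATE LIST OF NUMBERS FROM HYPHENATED AND COMMA SEPARETED STRING LIKE "1-5,25-30,4,5" (PYTHON RECIPE).

          The advantage of this approach (over the ones posted) is that it can handle more complex, overlapping ranges, for example:

          For the input: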

          '2, 3, 4-8, 2-5, 9'
          

          it produces

          ['2', '3', '4', '5', '6', '7', '8', '9'] 
          

          whereas the accepted solution produces

          ['2', ' 3', '4', '5', '6', '7', '8', '2', '3', '4', '5', ' 9']
          

          which contains duplicate entries.

          def hyphen_range(s):
              """Take a range in the form "a-b" and generate a list of the numbers from a to b inclusive.
              Also accepts comma-separated ranges: "a-b,c-d,f" builds a list that includes the
              numbers from a to b, from c to d, and f."""
              s = "".join(s.split())  # remove all whitespace
              r = set()
              for x in s.split(','):
                  t = x.split('-')
                  if len(t) not in [1, 2]:
                      raise SyntaxError("hyphen_range was given the argument " + s + ", which does not seem to be correctly formatted.")
                  if len(t) == 1:
                      r.add(int(t[0]))
                  else:
                      r.update(range(int(t[0]), int(t[1]) + 1))
              return [str(n) for n in sorted(r)]  # added string conversion
          
          # Test shows handling of overlapping ranges and commas in pattern
          # i.e. '2, 3, 4-8, 2-5, 9'
          for x in ["6,7", "6-8", "10,12", "15-18", '6-8,10', '2, 3, 4-8, 2-5, 9']:
            print(f"'{x}' -> {hyphen_range(x)}")
          

          Output

          '6,7' -> ['6', '7']
          '6-8' -> ['6', '7', '8']
          '10,12' -> ['10', '12']
          '15-18' -> ['15', '16', '17', '18']
          '6-8,10' -> ['6', '7', '8', '10']
          '2, 3, 4-8, 2-5, 9' -> ['2', '3', '4', '5', '6', '7', '8', '9']
          

          Generator version

          def hyphen_range_generator(s):
              """ yield each number, as a string, from a complex range string like "1-9,12, 15-20,23"
          
              >>> list(hyphen_range_generator('1-9,12, 15-20,23'))
              ['1', '2', '3', '4', '5', '6', '7', '8', '9', '12', '15', '16', '17', '18', '19', '20', '23']
          
              >>> list(hyphen_range_generator('1-9,12, 15-20,2-3-4'))
              Traceback (most recent call last):
                  ...
              ValueError: format error in 2-3-4
              """
              for x in s.split(','):
                  elem = x.split('-')
                  if len(elem) == 1: # a number
                      yield str(int(elem[0]))  # int() validates and drops surrounding whitespace
                  elif len(elem) == 2: # a range inclusive
                      start, end = map(int, elem)
                      for i in range(start, end+1):
                          yield str(i)  # string conversion added to the posted recipe
                  else: # more than one hyphen
                      raise ValueError('format error in %s' % x)
          
          # Need to use list(...) to see output since using generator
          for x in ["6,7", "6-8", "10,12", "15-18", '6-8,10', '2, 3, 4-8, 2-5, 9']:
            print(f"'{x}' -> {list(hyphen_range_generator(x))}")
          

          Output

          Same as the non-generator version above, except for the last input. The generator neither removes duplicates nor sorts, so it gives:

          '2, 3, 4-8, 2-5, 9' -> ['2', '3', '4', '5', '6', '7', '8', '2', '3', '4', '5', '9']
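
If a generator is preferred but repeated values are unwanted, the result can be post-processed. The sketch below is an illustration under assumptions, not part of the recipe: `iter_numbers` is a hypothetical stand-in for a generator that yields every number as a string, and the two final lines show two possible policies (keep first-seen order, or sort numerically).

```python
def iter_numbers(spec):
    """Yield each number in a spec like '10, 2-4, 3' as a string."""
    for part in spec.split(','):
        bounds = part.split('-')
        if len(bounds) == 1:
            yield str(int(bounds[0]))  # int() also drops surrounding spaces
        elif len(bounds) == 2:
            start, end = map(int, bounds)
            yield from (str(i) for i in range(start, end + 1))
        else:
            raise ValueError('format error in %s' % part)


spec = '10, 2-4, 3'
# dict keys are unique and keep insertion order, so this drops repeats only.
first_seen = list(dict.fromkeys(iter_numbers(spec)))
# key=int sorts numerically; a plain string sort would put '10' before '2'.
ascending = sorted(set(iter_numbers(spec)), key=int)
print(first_seen)
print(ascending)
```

Which of the two orders is appropriate depends on whether the order in the input string carries meaning.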
          

          【Comments】:
