【Question Title】:Python, how to move an if statement outside of a for loop without duplicating the loop
【Posted】:2017-08-07 19:54:03
【Question】:

I can write my function in two ways.

def output_ip_hist(target, final, stats, table_name, bulk_qty, type = "sql"):
    if(type == "sql"):
        field_names = ",".join(get_field_names(final, table_name))
        count = 0
        stats[table_name] = 0
        values = []
        for comp_name, row in final.items():
            for ip_address, sub_row in row.items():
                for index, ip_hist in enumerate(sub_row):
                    hist_item = ip_hist.replace('"', "'")
                    values.append('("' + comp_name + '", "' + ip_address + '", ' + str(index) + ',"' + hist_item + '")')
                    count += 1
                    if(count == bulk_qty):
                        insert_sql_many(target, count, table_name, field_names, values, stats)
                        count = 0
                        values = []
        if(count != 0):
            insert_sql_many(target, count, table_name, field_names, values, stats)
    elif(type == "csv"):
        for comp_name, row in final.items():
            for ip_address, sub_row in row.items():
                for index, ip_hist in enumerate(sub_row):
                    insert_csv(target, { "computer_name": comp_name, "id": str(index), "ip_address": ip_address, "hist_item": ip_hist.replace('"', "'") }, stats, table_name)

This is the first way. Its drawback is that the loop is written twice, which creates some duplication.

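The second way is to move the outermost if statement inside the loop so that the loop is written only once. The drawback is that the if statement is then evaluated on every iteration, which slows down a loop that may run over 4 million records.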
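
For comparison, the second way could look roughly like the sketch below. `emit_rows` and the two result lists are hypothetical stand-ins for the real `insert_sql_many` and `insert_csv` calls, and the batching is left out so that the loop shape stays visible.

```python
def emit_rows(final, kind="sql"):
    """One copy of the loop; the format check runs once per record."""
    sql_rows, csv_rows = [], []
    for comp_name, row in final.items():
        for ip_address, sub_row in row.items():
            for index, ip_hist in enumerate(sub_row):
                hist_item = ip_hist.replace('"', "'")
                # This branch is re-evaluated on every iteration.
                if kind == "sql":
                    sql_rows.append((comp_name, ip_address, index, hist_item))
                elif kind == "csv":
                    csv_rows.append({
                        "computer_name": comp_name,
                        "id": str(index),
                        "ip_address": ip_address,
                        "hist_item": hist_item,
                    })
    return sql_rows, csv_rows


# Made-up sample data in the same nested shape as `final`.
sample = {"pc1": {"10.0.0.1": ['a"b', "c"]}}
sql_rows, _ = emit_rows(sample, "sql")
_, csv_rows = emit_rows(sample, "csv")
```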

I'd like to know whether I can get the best of both worlds: less duplication while keeping the loop as fast as possible.

Thanks!

【Comments】:

    Tags: python loops optimization


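    【Solution 1】:

    You can define two functions: process_sql and process_csv. Depending on data_type, you bind process_data to either the first function or the second.

    Inside the loop, you then simply call process_data.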

    def process_sql(a, b):
        print("SQL")
        return a - b

    def process_csv(a, b):
        print("CSV")
        return a + b

    data_type = "CSV"

    # Pick the handler once, before the loops start.
    if data_type == "CSV":
        process_data = process_csv
    else:
        process_data = process_sql

    for a in range(3):
        for b in range(3):
            print(process_data(a, b))
    #   CSV 
    #   0
    #   CSV
    #   1
    #   CSV
    #   2
    #   CSV
    #   1
    #   CSV
    #   2
    #   CSV
    #   3
    #   CSV
    #   2
    #   CSV
    #   3
    #   CSV
    #   4
    
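
Applied to the nested loop from the question, the same idea might look like the sketch below. The names `format_sql`, `format_csv`, `HANDLERS` and `process_all` are hypothetical and not part of the original code. The handler is looked up once in a dictionary before the loop starts, so no format check is needed per record.

```python
def format_sql(comp_name, ip_address, index, hist_item):
    # Build one SQL VALUES tuple as text, as in the question.
    return '("%s", "%s", %d, "%s")' % (comp_name, ip_address, index, hist_item)


def format_csv(comp_name, ip_address, index, hist_item):
    # Build one CSV row as a dict, as in the question.
    return {
        "computer_name": comp_name,
        "id": str(index),
        "ip_address": ip_address,
        "hist_item": hist_item,
    }


HANDLERS = {"sql": format_sql, "csv": format_csv}


def process_all(final, kind="sql"):
    handler = HANDLERS[kind]  # chosen once; raises KeyError for an unknown kind
    out = []
    for comp_name, row in final.items():
        for ip_address, sub_row in row.items():
            for index, ip_hist in enumerate(sub_row):
                out.append(handler(comp_name, ip_address, index,
                                   ip_hist.replace('"', "'")))
    return out


# Made-up sample data in the same nested shape as `final`.
sample = {"pc1": {"10.0.0.1": ['a"b', "c"]}}
sql_out = process_all(sample, "sql")
csv_out = process_all(sample, "csv")
```

Note that this trades the per-iteration `if` for a per-iteration function call, so whether it is actually faster is something to measure on real data.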

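    【Comments】:

      【Solution 2】:

      You could try writing a generator that yields a stream of values from the nested loops. This also lets you use itertools.islice to simplify your SQL batching code.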

      from itertools import islice

      def my_generator(final):
          for comp_name, row in final.items():
              for ip_address, sub_row in row.items():
                  for index, ip_hist in enumerate(sub_row):
                      yield comp_name, ip_address, index, ip_hist.replace('"', "'")
      
      def output_ip_hist(target, final, stats, table_name, bulk_qty, type = "sql"):
          items = my_generator(final)
          if type == "sql":
              field_names = ",".join(get_field_names(final, table_name))
              stats[table_name] = 0
              while True:
                  values = ['("%s", "%s", "%s", "%s")' % i for i in islice(items, bulk_qty)]
                  if not values:
                      break                
                  insert_sql_many(target, len(values), table_name, field_names, values, stats)
          elif type == "csv":
              for comp_name, ip_address, index, hist_item in items:
                  blob = {
                      "computer_name": comp_name,
                      "id": str(index),
                      "ip_address": ip_address,
                      "hist_item": hist_item
                  }
                  insert_csv(target, blob, stats, table_name)
      

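      That said, combining what are really two different functions into a single function wrapped in an if statement is something of an anti-pattern. It may be cleaner to split them: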

      def output_ip_hist_sql(target, final, stats, table_name, bulk_qty):
          field_names = ",".join(get_field_names(final, table_name))
          stats[table_name] = 0
      
          items = my_generator(final)
          while True:
              values = ['("%s", "%s", "%s", "%s")' % i for i in islice(items, bulk_qty)]
              if not values:
                  break                
              insert_sql_many(target, len(values), table_name, field_names, values, stats)
      
      def output_ip_hist_csv(target, final, stats, table_name):
          items = my_generator(final)
          for comp_name, ip_address, index, hist_item in items:
              blob = {
                  "computer_name": comp_name,
                  "id": str(index),
                  "ip_address": ip_address,
                  "hist_item": hist_item
              }
              insert_csv(target, blob, stats, table_name)    
      
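
The `islice` batching used above can be tried in isolation with the sketch below. The helper name `batches` is hypothetical. The pattern relies on `islice` consuming a single shared iterator, which is why the helper calls `iter()` first; calling `islice` repeatedly on a plain list would return the same leading items every time.

```python
from itertools import islice


def batches(items, size):
    """Yield lists of up to `size` items until `items` is exhausted."""
    iterator = iter(items)  # one shared iterator, so each islice continues where the last stopped
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


result = list(batches(range(7), 3))
print(result)  # [[0, 1, 2], [3, 4, 5], [6]]
```

The last batch is simply shorter than the others, which matches the final partial `insert_sql_many` call in the original code.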

      【Comments】:
