【Question title】: How to loop through a line inside a file using a regex as the loop variable
【Posted】: 2019-01-20 10:32:20
【Question】:

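I am trying to build something like an explode function for a JSON file. The loop should read the file line by line. Each line holds several values that I want to extract and pair with the main part of the line (like a lateral view or the explode function in SQL).

The data looks like this: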

{"wl_id":0,"wl_customer_id":0,"wl_webpage_name":"webpage#00","wl_timestamp":"2013-01-27 16:07:02","wl_key2":103717,"wl_key3":589101,"wl_key4":23095,"wl_key5":200527,"wl_key6":60319}

What I want is to explode it, as SQL would, into this:

{"wl_id":0,"wl_customer_id":0,"wl_webpage_name":"webpage#00","wl_timestamp":"2013-01-27 16:07:02","wl_key2":103717}
{"wl_id":0,"wl_customer_id":0,"wl_webpage_name":"webpage#00","wl_timestamp":"2013-01-27 16:07:02","wl_key3":589101}
{"wl_id":0,"wl_customer_id":0,"wl_webpage_name":"webpage#00","wl_timestamp":"2013-01-27 16:07:02","wl_key4":23095}
{"wl_id":0,"wl_customer_id":0,"wl_webpage_name":"webpage#00","wl_timestamp":"2013-01-27 16:07:02","wl_key5":200527}


import io
import sys
import re

i = 0
with io.open('lateral_result.json', 'w', encoding="utf-8") as f, io.open('lat.json', encoding="utf-8") as g:
    for line in g:
        x = re.search('(.*wl_timestamp":"[^"]+",)', line)
        y = re.search('("wl_key[^,]+),', line)
        for y in line:
            i = i + 1
            print(x.group(0), y.group(i), '}', file=f)

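I always get an error saying that a str has no group. When I move the regex into the inner for loop instead, it either gives me only the first result and nothing more, or it takes that same result and writes it once for every character it finds in the line.

【Comments】:

  • Why parse JSON with a regex? Use json.load() and inspect the resulting data structure. What is an XY problem?
  • The tags explode and lateral are misleading: explode is PHP, not Python, and lateral is followed by only 3 people. A better tag would be python, in addition to python-3.x. By tagging explode you target PHP developers who cannot really help with Python.

Tags: regex python-3.x for-loop explode lateral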


【Solution 1】:

Don't use a regex on JSON. Parse it with the json module and work with the resulting data structure:

import json

data_str = """{"wl_id":0,"wl_customer_id":0,"wl_webpage_name":"webpage#00","wl_timestamp":"2013-01-27 16:07:02","wl_key2":103717,"wl_key3":589101,"wl_key4":23095,"wl_key5":200527,"wl_key6":60319}"""

data = json.loads(data_str)  # you can use json.load( file_handle )

print(data)

for k in (x for x in data.keys() if x.startswith("wl_key")):
    print(data["wl_timestamp"],k,data[k])

Output (after the line printed by print(data)):

2013-01-27 16:07:02 wl_key2 103717
2013-01-27 16:07:02 wl_key3 589101
2013-01-27 16:07:02 wl_key4 23095
2013-01-27 16:07:02 wl_key5 200527
2013-01-27 16:07:02 wl_key6 60319
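
The loop above prints the timestamp, key and value, whereas the question asks for one complete JSON row per key. A minimal sketch of that step follows. It assumes every line is a flat JSON object and that the keys to explode all start with "wl_key". The names explode_line and prefix are hypothetical and do not come from the original posts.

```python
import json


def explode_line(line, prefix="wl_key"):
    """Yield one JSON string per key starting with `prefix`.

    Every yielded row repeats the fields that do not start with `prefix`.
    """
    record = json.loads(line)
    # Fields shared by all output rows (everything except the exploded keys).
    shared = {k: v for k, v in record.items() if not k.startswith(prefix)}
    for key, value in record.items():
        if key.startswith(prefix):
            row = dict(shared)
            row[key] = value
            # Compact separators reproduce the spacing of the sample data.
            yield json.dumps(row, separators=(",", ":"))


sample = (
    '{"wl_id":0,"wl_customer_id":0,"wl_webpage_name":"webpage#00",'
    '"wl_timestamp":"2013-01-27 16:07:02","wl_key2":103717,"wl_key3":589101}'
)

for row in explode_line(sample):
    print(row)
```

To process a whole file, the same function could be called once per line inside a `for line in g:` loop, writing each yielded row to the output file.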

【Comments】:

    【Solution 2】:

    Here is the code that solved my case:

    import io
    import json
    import re

    with io.open('lateral_result.json', 'w', encoding="utf-8") as f, io.open('lat.json', encoding="utf-8") as g:
        for line in g:
            data = json.loads(line)
            # Shared prefix: everything up to the timestamp, plus the opening quote of the next key
            x = re.search('(.*wl_timestamp":"[^"]+",")', line)
            for k in (key for key in data.keys() if key.startswith("wl_key")):
                # Write one row per wl_key* field
                print(x.group(0) + str(k) + '":' + str(data[k]) + '}', file=f)
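
The error in the question comes from `for y in line`, which iterates over the characters of the string and rebinds y to a str, so y.group(...) fails. If a regex loop is really wanted, re.finditer yields one match object per occurrence. The sketch below is only an illustration of that idea: it assumes flat, single-line records whose wl_key values contain no commas or braces, and the names PREFIX_RE, KEY_RE and explode_with_regex are hypothetical. Parsing with the json module remains the more robust route.

```python
import re

# Everything up to and including the timestamp field and its trailing comma.
PREFIX_RE = re.compile(r'(.*"wl_timestamp":"[^"]+",)')
# One "wl_keyN":value pair; the value ends at the next comma or closing brace.
KEY_RE = re.compile(r'"wl_key\d+":[^,}]+')


def explode_with_regex(line):
    """Return one JSON-like row per wl_key* pair found in `line`."""
    prefix = PREFIX_RE.search(line)
    if prefix is None:
        return []
    # finditer yields a match object for every wl_key* pair in the line.
    return [prefix.group(1) + m.group(0) + "}" for m in KEY_RE.finditer(line)]


line = (
    '{"wl_id":0,"wl_customer_id":0,"wl_webpage_name":"webpage#00",'
    '"wl_timestamp":"2013-01-27 16:07:02","wl_key2":103717,"wl_key3":589101}'
)

for row in explode_with_regex(line):
    print(row)
```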
    

    【Comments】:
