【Posted】: 2021-02-01 17:08:11
【Question】:
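I am trying to generate the following sequence.
text = ACCCEBCE
target = 000000D0
A random text is generated from a set of distinct characters. If one of the following subsequences is found in the text sequence, the target is D or E; otherwise the target is 0.
ABC --> D
BCD --> E
I wrote the code below. It works fine when I generate a small number of characters, but if I set timesteps = 1000 or so, it produces no output.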
import string
import random as rn
import numpy as np

def is_subseq(x, y):
    it = iter(y)
    return all(any(c == ch for c in it) for ch in x)

def count(a, b, m, n):
    # If both first and second string
    # is empty, or if second string
    # is empty, return 1
    if ((m == 0 and n == 0) or n == 0):
        return 1
    # If only first string is empty
    # and second string is not empty,
    # return 0
    if (m == 0):
        return 0
    # If last characters are same
    # Recur for remaining strings by
    # 1. considering last characters
    # of both strings
    # 2. ignoring last character
    # of first string
    if (a[m - 1] == b[n - 1]):
        return (count(a, b, m - 1, n - 1) +
                count(a, b, m - 1, n))
    else:
        # If last characters are different,
        # ignore last char of first string
        # and recur for remaining string
        return count(a, b, m - 1, n)

# create a sequence classification instance
def get_sequence(n_timesteps):
    alphabet = "ABCDE"  # string.ascii_uppercase
    text = ''.join(rn.choices(alphabet, k=n_timesteps))
    print(text)
    seq_length = 3
    subseqX = []
    subseqY = []
    for i in range(0, len(alphabet) - seq_length, 1):
        seq_in = alphabet[i:i + seq_length]
        seq_out = alphabet[i + seq_length]
        subseqX.append([char for char in seq_in])
        subseqY.append(seq_out)
        print(seq_in, "\t-->\t", seq_out)
    y2 = []
    match = 0
    countlist = np.zeros(len(subseqX))
    for i, val in enumerate(text):
        found = False
        counter = 0
        for g, val2 in enumerate(subseqX):
            listToStr = ''.join(map(str, subseqX[g]))
            howmany = count(text[:i], listToStr, len(text[:i]), len(listToStr))
            if is_subseq(listToStr, text[:i]):
                if countlist[g] < howmany:
                    match = match + howmany
                    countlist[g] = howmany
                    temp = g
                    found = True
        if found:
            y2.append(subseqY[temp])
        else:
            y2.append(0)
    print("counter:\t", counter)
    print(text)
    print(y2)

# define problem properties
n_timesteps = 100
get_sequence(n_timesteps)
This is probably due to the depth of the recursive function, but I need to generate 1000 or 10000 characters. How can I solve this? Any ideas?
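One way to test the recursion-depth suspicion is to swap the recursive `count` for an iterative one. The sketch below assumes `count(a, b, m, n)` is meant to return how many times `b[:n]` occurs as a subsequence of `a[:m]`; `count_subseq` is a hypothetical name, not something from the code above. It fills a small table in a single pass over the text, so it neither recurses nor recomputes the same subproblems, which the plain recursive version does on every call.

```python
def count_subseq(a, b):
    """Count how often b occurs as a (not necessarily contiguous) subsequence of a.

    Hypothetical iterative stand-in for count(a, b, len(a), len(b)).
    """
    # dp[j] = number of ways to form b[:j] from the characters of a seen so far
    dp = [0] * (len(b) + 1)
    dp[0] = 1  # the empty pattern can be formed in exactly one way
    for ch in a:
        # go through j backwards so one character of a extends each
        # partial match at most once
        for j in range(len(b), 0, -1):
            if b[j - 1] == ch:
                dp[j] += dp[j - 1]
    return dp[len(b)]
```

For example, `count_subseq("AABBCC", "ABC")` returns 8. The work is proportional to `len(a) * len(b)` per call, so inputs of 10000 characters stay cheap, and Python's recursion limit (1000 by default in CPython) is never involved.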
【Discussion】:
-
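You could create a dictionary of the desired subsequences, e.g. d={'ABC':'A', 'BCD':'B'}, in which you track the next character each subsequence still needs. Once the last character is found, fill the list with the required letter (use another dictionary for that) and restart the loop from the beginning. Sorry, it needs a lot of code and I don't have time to work on it.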
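The idea in this comment can be sketched roughly as follows. This is one reading of it, not the commenter's code: `label_sequence`, `rules` and `progress` are hypothetical names, the label is written at the position where a pattern is completed (as in the ACCCEBCE example), "restart from the beginning" is taken to mean that every pattern's progress is reset after a match, and if two patterns finish on the same character the first one in `rules` wins.

```python
def label_sequence(text, rules):
    """Return one label per character of text.

    rules maps a non-empty pattern to its target, e.g. {"ABC": "D", "BCD": "E"}.
    A position gets a rule's target when that pattern is completed there
    as a subsequence, and "0" otherwise.
    """
    # progress[p] = how many characters of pattern p have been matched so far
    progress = {pattern: 0 for pattern in rules}
    labels = []
    for ch in text:
        label = "0"
        for pattern, target in rules.items():
            # advance this pattern if ch is the character it is waiting for
            if ch == pattern[progress[pattern]]:
                progress[pattern] += 1
                if progress[pattern] == len(pattern) and label == "0":
                    label = target
        if label != "0":
            # assumption: after any match, all patterns start over
            progress = {pattern: 0 for pattern in rules}
        labels.append(label)
    return "".join(labels)
```

With the rules from the question, `label_sequence("ACCCEBCE", {"ABC": "D", "BCD": "E"})` returns `"000000D0"`. Each character is examined once per pattern, so 1000 or 10000 characters are handled in a single pass without recursion. Resetting only the completed pattern instead of all of them is an equally plausible reading and would give different labels when patterns overlap.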