[Posted]: 2020-06-19 12:57:49
[Question]:
I have the following column, in which each value is repeated an arbitrary number of times:
FRUIT
Apples
Bananas
Bananas
Pear
Pear
Pear
Pear
Melon
Melon
Melon
Melon
Melon
Melon
Orange
Orange
Orange
Orange
Orange
Orange
Orange
Orange
Orange
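I want to assign an index number to each value, but for repeated values the same index number should be reused at most 4 times. If a value appears 10 times, the first four occurrences should share one index, the next four should get index + 1, and so on. For example: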
Index FRUIT
1 Apples
2 Bananas
2 Bananas
3 Pear
3 Pear
3 Pear
3 Pear
4 Melon
4 Melon
4 Melon
4 Melon
5 Melon
5 Melon
6 Orange
6 Orange
6 Orange
6 Orange
7 Orange
7 Orange
7 Orange
7 Orange
8 Orange
Here is my attempt:
import pandas as pd

fruit = {'FRUIT':['Apples','Bananas','Bananas','Pear','Pear','Pear','Pear','Melon','Melon','Melon','Melon','Melon','Melon','Orange','Orange','Orange','Orange','Orange','Orange','Orange','Orange','Orange']}
fruit_df = pd.DataFrame(fruit)
index = 0
index_and_fruit = []
# .items() replaces .iteritems(), which was removed in pandas 2.0
for (columnName, columnData) in fruit_df.items():
    fruit_list = fruit_df['FRUIT'].tolist()
    index = index + 1
    for i in fruit_list:
        # count(i) is the total number of occurrences in the whole list,
        # not the position within the current run of repeats
        if fruit_list.count(i) >= 4:
            index = index + 1
            index_with_fruit_list = {i:index}
            index_and_fruit.append(index_with_fruit_list)
        if fruit_list.count(i) >= 8:
            index_with_fruit_list = {i:index}
            index_and_fruit.append(index_with_fruit_list)
        else:
            index_with_fruit_list = {i:index}
            index_and_fruit.append(index_with_fruit_list)
print(index_and_fruit)
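The rule described in the question (a new index for each new value, and another new index after every fourth repeat) could be sketched roughly as follows. This is only one possible approach, not a definitive answer. It assumes that equal values always sit next to each other, as they do in the sample column. The helper name `chunked_index` and its `max_repeat` parameter are made up for illustration.

```python
from itertools import groupby


def chunked_index(values, max_repeat=4):
    """Number consecutive runs of equal values, moving to the next number
    after every `max_repeat` repeats within a run (hypothetical helper)."""
    result = []
    current = 0
    # groupby yields one group per run of consecutive equal values
    for _, run in groupby(values):
        for position, _ in enumerate(run):
            # Start a new index at the beginning of each run and again
            # after every `max_repeat` items inside the run
            if position % max_repeat == 0:
                current += 1
            result.append(current)
    return result


fruits = (["Apples"] + ["Bananas"] * 2 + ["Pear"] * 4
          + ["Melon"] * 6 + ["Orange"] * 9)
indices = chunked_index(fruits)
for idx, name in zip(indices, fruits):
    print(idx, name)
```

With the DataFrame from the attempt above, something like `fruit_df['Index'] = chunked_index(fruit_df['FRUIT'])` should attach the result as a new column. If the same fruit can reappear later in the column after other values, this sketch treats each run separately and gives it fresh index numbers.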
[Comments]:
Tags: python indexing duplicates