【Question title】:Apply clustering algorithm to group similar actors
【Posted】:2021-03-15 10:40:36
【Question】:
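import pandas as pd
from gensim.models import Word2Vec

data = pd.read_csv('movie_actor_network.csv', index_col=False, names=['movie', 'actor'])
# `walks` is not defined in the post; presumably it holds random walks generated from the graph
model = Word2Vec(walks, size=128, window=5)  # gensim 3.x keyword; gensim 4 renamed it to vector_size
model.wv.vectors.shape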

Output:

(4703, 128)

node_ids

node_ids = model.wv.index2word  # list of node IDs
node_embeddings = model.wv.vectors  # numpy.ndarray of size number of nodes times embeddings dimensionality
node_targets = [ A.node[node_id]['label'] for node_id in node_ids]

Now I use this function:

def data_split(node_ids, node_targets, node_embeddings):
    '''In this function, we will split the node embeddings into actor_embeddings, movie_embeddings'''
    actor_nodes, movie_nodes = [], []
    actor_embeddings, movie_embeddings = [], []
    # split the node_embeddings into actor_embeddings, movie_embeddings based on node_ids
    actor_embedding = [x for i, x in enumerate(node_embeddings) if node_targets[i] == 'actor']
    actor_embeddings.append(actor_embedding)
    actor_node = [x for i, x in enumerate(node_ids) if node_targets[i] == 'actor']
    actor_nodes.append(actor_node)
    movie_embedding = [x for i, x in enumerate(node_embeddings) if node_targets[i] == 'movies']
    movie_embeddings.append(movie_embedding)
    movie_node = [x for i, x in enumerate(node_ids) if node_targets[i] == 'movie']
    movie_nodes.append(movie_node)

    # By using node_embedding and node_targets, we can extract actor_embedding and movie embedding
    # By using node_ids and node_targets, we can extract actor_nodes and movie nodes

    return actor_nodes, movie_nodes, actor_embeddings, movie_embeddings

Grader function - 1

def grader_actors(data):
    assert(len(data)==3411)
    return True
grader_actors(actor_nodes)

Now I get an error:

NameError                                 Traceback (most recent call last)
<ipython-input-30-ee1852cb1df5> in <module>
      2     assert(len(data)==3411)
      3     return True
----> 4 grader_actors(actor_nodes)

NameError: name 'actor_nodes' is not defined

How can we fix it?

【Question comments】:

    Tags: python python-3.x machine-learning


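    【Solution 1】:

    The function returns a list nested inside another list, so you will still get an assertion error from grader_actors(data). If you want grader_actors to pass without raising an assertion error, use the following code: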

    def data_split(node_ids, node_targets, node_embeddings):
        '''Split the node IDs and embeddings into actor and movie lists.'''
        # Build flat lists directly; appending a whole list would nest it one level deeper.
        actor_nodes = [x for i, x in enumerate(node_ids) if node_targets[i] == 'actor']
        actor_embeddings = [x for i, x in enumerate(node_embeddings) if node_targets[i] == 'actor']
        # The original compared against 'movies' for the embeddings but 'movie' for the IDs;
        # 'movie' is assumed here to be the actual label.
        movie_nodes = [x for i, x in enumerate(node_ids) if node_targets[i] == 'movie']
        movie_embeddings = [x for i, x in enumerate(node_embeddings) if node_targets[i] == 'movie']

        return actor_nodes, movie_nodes, actor_embeddings, movie_embeddings
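
Note that the NameError in the question is a separate issue from the assertion: names assigned inside data_split are local to the function, so actor_nodes does not exist at the top level until the function is called and its return values are bound to names. A minimal sketch of that, using small made-up lists in place of the real node_ids, node_targets and node_embeddings (the toy values and the 'movie' label are assumptions for illustration, not data from the post):

```python
# Toy stand-ins for the real data (hypothetical values, for illustration only).
node_ids = ['a1', 'm1', 'a2', 'm2', 'a3']
node_targets = ['actor', 'movie', 'actor', 'movie', 'actor']
node_embeddings = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8], [0.9, 1.0]]


def data_split(node_ids, node_targets, node_embeddings):
    """Split node IDs and embeddings into flat actor and movie lists."""
    actor_nodes, movie_nodes = [], []
    actor_embeddings, movie_embeddings = [], []
    for node_id, target, embedding in zip(node_ids, node_targets, node_embeddings):
        if target == 'actor':
            actor_nodes.append(node_id)
            actor_embeddings.append(embedding)
        elif target == 'movie':  # assumed label; the post uses both 'movie' and 'movies'
            movie_nodes.append(node_id)
            movie_embeddings.append(embedding)
    return actor_nodes, movie_nodes, actor_embeddings, movie_embeddings


# Bind the returned values at the call site; without this line,
# actor_nodes is undefined outside the function and a NameError follows.
actor_nodes, movie_nodes, actor_embeddings, movie_embeddings = data_split(
    node_ids, node_targets, node_embeddings
)

# Appending a whole list nests it: the outer list then has exactly one element.
nested = []
nested.append([n for n, t in zip(node_ids, node_targets) if t == 'actor'])

print(len(actor_nodes))  # number of actors in the toy data
print(len(nested))       # length of the outer list only
```

With the real data, the same unpacking line placed before grader_actors(actor_nodes) should make the name available; whether the count equals 3411 depends on the labels stored in the graph.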
    

    【Comments】:
