Neo4j Cypher query results into a Pandas DataFrame: answers

【Question title】: Neo4j cypher query results into Pandas DataFrame
【Posted】: 2021-05-26 06:13:27
【Question description】:

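I am trying to read a csv file that holds node ids and their respective relations. The first two columns represent the nodes and the third column represents the relation between them. So far I have been able to create the database in neo4j, but I am not sure what the Cypher query should be to extract the required data into a pandas DataFrame!

I will use a subset of the large dataset here to illustrate my problem. The original dataset contains thousands of nodes and relations.

My csv file (Node1_id, Node2_id, relation_id) looks like this: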

0   1   1
4   2   1
44  3   1
0   4   1
0   5   1
4   10173   3
4   10191   2
4   10192   2
6   10193   2
8   10194   2
3   10195   2
6   10196   2

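Here is the node creation and the definition of the relationships between the nodes, done by loading the ids from the csv file. (I think this graph is correct, but please let me know if you spot any problem.) I am using the ids from the csv file to assign an "id" property to both the nodes and the relationships.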

LOAD CSV WITH HEADERS FROM  'file:///edges.csv' AS row FIELDTERMINATOR ","
WITH row
WHERE row.relation_id = '1'
MERGE (paper:Paper{id:(row.Node1_id)})
MERGE (author:Author{id:(row.Node2_id)})
CREATE (paper)-[au:AUTHORED{id: '1'}]->(author);
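
A side note on the filter in the query above: LOAD CSV reads every field as a string, which is why the comparison uses '1' rather than 1 and why the ids later come back as strings. The following is a minimal sketch of the same filter in plain Python, assuming the file is comma-separated and starts with a header line (the sample above is shown without one). The helper name `rows_for_relation` is made up for illustration.

```python
import csv
import io

# Sample rows from the question, assumed to be comma-separated with a header line.
SAMPLE = """Node1_id,Node2_id,relation_id
0,1,1
4,2,1
44,3,1
0,4,1
0,5,1
4,10173,3
4,10191,2
4,10192,2
6,10193,2
8,10194,2
3,10195,2
6,10196,2
"""

def rows_for_relation(text, relation_id):
    """Return the rows whose relation_id field equals the given value."""
    reader = csv.DictReader(io.StringIO(text))
    return [row for row in reader if row["relation_id"] == relation_id]

# Fields are strings, so the filter value must be a string as well.
authored = rows_for_relation(SAMPLE, "1")
print(len(authored))  # 5
print(authored[0])
```

Passing the integer 1 instead of "1" matches nothing here, which mirrors what would happen in Cypher if the quotes were dropped without converting the column first.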

So far, I have tried something like this:

    query = ''' MATCH (paper)-[au:AUTHORED{id: '1'}]->(author) RETURN paper,author LIMIT 3; '''
    result = session.run(query)
    df = DataFrame(result)

    for row in df.itertuples(index=False):
        print(row)

It returns this:

0   1
0   (id)    (id)
1   (id)    (id)
2   (id)    (id)

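Desired result:

I would like to get the results into a pandas DataFrame in a format with node ids and relation ids (as defined in the csv above), by querying the data from the graphDB and iterating over the results row by row.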

0   1   1
4   2   1
44  3   1
0   4   1
0   5   1
4   10173   3
4   10191   2
4   10192   2
6   10193   2
8   10194   2
3   10195   2
6   10196   2

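I would also like to know what the return type of the Cypher query result object is. In this case it ends up as a pandas.core.frame.DataFrame, but how do I access the individual properties of the nodes and relationships returned by the query? This is the main question.

Please feel free to explain in detail; your help is much appreciated.

Neo4j version used: 4.2.1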

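【Question discussion】:

Tags: python pandas neo4j cypher graph-databases

【Solution 1】:

I am using py2neo, so if you use something different, you can either switch to it or tell me which neo4j library you are using and I will edit my answer.

#1: Desired result

I would like to get the results into a pandas DataFrame with node ids and relation ids, by querying the graphDB and iterating over the results row by row.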

    from py2neo import Graph
    from pandas import DataFrame

    session = Graph("bolt://localhost:7687", auth=("neo4j", "****"))
    # Return the id properties in the query rather than the whole nodes.
    # To get every row, drop the {id: '1'} filter and the LIMIT clause.
    query = ''' MATCH (paper)-[au:AUTHORED{id: '1'}]->(author) RETURN paper.id, author.id, au.id LIMIT 3; '''
    # .data() returns the records as a list of dicts
    result = session.run(query).data()
    # convert the list of dicts into a pandas DataFrame
    df = DataFrame(result)
    df.head()

Result:

0   1   1
4   2   1
44  3   1
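
The snippet in the question calls session.run directly, which suggests the official neo4j Python driver rather than py2neo. Under that assumption, the sketch below shows one way to get the same table: each record returned by the driver has a data() method that yields a plain dict, and the AS aliases in the query give the columns the csv names. The function name `edges_to_frame` is hypothetical, and the connection details in the comment are placeholders copied from the answer above.

```python
import pandas as pd

# Aliases give the DataFrame the same column names as the csv file.
QUERY = """
MATCH (paper)-[au:AUTHORED]->(author)
RETURN paper.id AS Node1_id, author.id AS Node2_id, au.id AS relation_id
"""

def edges_to_frame(session, query=QUERY):
    """Run the query and build a DataFrame with one row per returned record.

    `session` is assumed to behave like a neo4j driver session: run() yields
    records, and each record has a data() method returning a plain dict.
    """
    rows = [record.data() for record in session.run(query)]
    return pd.DataFrame(rows, columns=["Node1_id", "Node2_id", "relation_id"])

# Hypothetical usage with the official driver (not executed here):
#   from neo4j import GraphDatabase
#   driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "****"))
#   with driver.session() as session:
#       df = edges_to_frame(session)
#       for row in df.itertuples(index=False):
#           print(row.Node1_id, row.Node2_id, row.relation_id)
#   driver.close()
```

Because the query returns scalar properties instead of whole nodes, the DataFrame cells hold the id values themselves, so iterating with itertuples prints ids rather than node objects.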

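#2: The other question

How do I access the individual properties of nodes and relationships returned by a Cypher query? ANS: The properties inside a node behave like a dict, so use the get function.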

    # Note that we are returning the nodes and not ids
    query = ''' MATCH (paper)-[au:AUTHORED{id: '1'}]->(author) RETURN paper, author, au LIMIT 3; '''
    result = session.run(query).data()
    print("What is data type of result? ", type(result))
    print("What is the data type of each item? ", type(result[0]))
    print("What are the keys of the dictionary? ", result[0].keys())
    print("What is the class of the node? ", type(result[0].get('paper')))
    print("How to access the first node? ", result[0].get('paper'))
    print("How to access values inside the node? ", result[0].get('paper',{}).get('id'))

Result:
What is data type of result?  <class 'list'>
What is the data type of each item?  <class 'dict'>
What are the keys of the dictionary?  dict_keys(['paper', 'author', 'au'])
What is the class of the node?  <class 'py2neo.data.Node'>
How to access the first node?  (_888:paper {id: '1'})
How to access values inside the node?  '1'
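
Building on the get calls above, the same access pattern can turn records that hold whole nodes into flat rows of ids. This is a minimal sketch: plain dicts stand in for the py2neo Node and Relationship objects, on the assumption that the relationship supports get in the same way as the node shown above. The helper name `flatten` is made up for illustration.

```python
def flatten(record, keys=("paper", "author", "au"), prop="id"):
    """Pick one property from each entity in a record as returned by .data()."""
    return {f"{key}.{prop}": record[key].get(prop) for key in keys}

# Plain dicts stand in for py2neo Node/Relationship objects here (assumption).
records = [
    {"paper": {"id": "0"}, "author": {"id": "1"}, "au": {"id": "1"}},
    {"paper": {"id": "4"}, "author": {"id": "2"}, "au": {"id": "1"}},
]

rows = [flatten(record) for record in records]
for row in rows:
    print(row["paper.id"], row["author.id"], row["au.id"])
```

The resulting list of flat dicts can be passed to pandas DataFrame in the same way as the list of dicts in #1, giving one column per key.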

【Discussion】: