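[Posted]: 2021-10-15 08:43:18
[Question]:
I am currently working on a project that requires me to run some XQuery queries against a specific dataset. Here is a small part of that dataset: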
<root>
<Transaction>
<Person>
<Full_Name> Katherine Eaton</Full_Name>
<Age>87</Age>
<Ssn>314-44-0462</Ssn>
<Credit_Card>
<Cc_Provider>JCB 15 digit</Cc_Provider>
<Cc_Number>5547858204343354 </Cc_Number>
</Credit_Card>
<Bought_From>
<Date>2021-02-13</Date>
<Price>$34478.90</Price>
<Status>Undisputed </Status>
<Merchant>
<Shop> McDonalds</Shop>
<Phone>+1-371-602-9171x83395</Phone>
<Resides_In>
<Province>Colorado</Province>
<City>West Morgantown</City>
<Address>834 Walker Canyon</Address>
<Lat>80.2658445</Lat>
<Lon>156.324095 </Lon>
</Resides_In>
</Merchant>
</Bought_From>
</Person>
</Transaction>
<Transaction>
<Person>
<Full_Name> Charles Wright</Full_Name>
<Age>55</Age>
<Ssn>420-62-7501</Ssn>
<Credit_Card>
<Cc_Provider>Diners Club / Carte Blanche</Cc_Provider>
<Cc_Number>4743336688954504 </Cc_Number>
</Credit_Card>
<Bought_From>
<Date>2020-09-24</Date>
<Price>$477.99</Price>
<Status>Undisputed </Status>
<Merchant>
<Shop> Subway</Shop>
<Phone>6922856236</Phone>
<Resides_In>
<Province>Wisconsin</Province>
<City>West Sherri</City>
<Address>807 Cordova Ferry</Address>
<Lat>-6.079631</Lat>
<Lon>-150.485761 </Lon>
</Resides_In>
</Merchant>
</Bought_From>
</Person>
</Transaction>
<Transaction>
<Person>
<Full_Name> Scott Gibbs</Full_Name>
<Age>52</Age>
<Ssn>717-01-2401</Ssn>
<Credit_Card>
<Cc_Provider>VISA 19 digit</Cc_Provider>
<Cc_Number>371936215412640 </Cc_Number>
</Credit_Card>
<Bought_From>
<Date>2021-01-06</Date>
<Price>$2.52</Price>
<Status>Disputed </Status>
<Merchant>
<Shop> American Apparel</Shop>
<Phone>(453)737-9365</Phone>
<Resides_In>
<Province>Nebraska</Province>
<City>Sheilamouth</City>
<Address>70734 Frye Ridge</Address>
<Lat>51.8881985</Lat>
<Lon>-147.147829 </Lon>
</Resides_In>
</Merchant>
</Bought_From>
</Person>
</Transaction>
<Transaction>
<Person>
<Full_Name> Wesley Underwood</Full_Name>
<Age>82</Age>
<Ssn>265-39-3658</Ssn>
<Credit_Card>
<Cc_Provider>Discover</Cc_Provider>
<Cc_Number>30354748203291 </Cc_Number>
</Credit_Card>
<Bought_From>
<Date>2021-07-20</Date>
<Price>$691.93</Price>
<Status>Disputed </Status>
<Merchant>
<Shop> Amazon</Shop>
<Phone>(274)381-6022</Phone>
<Resides_In>
<Province>Minnesota</Province>
<City>Jorgeview</City>
<Address>877 Debra Way Apt. 305</Address>
<Lat>-59.405851</Lat>
<Lon>3.413555 </Lon>
</Resides_In>
</Merchant>
</Bought_From>
</Person>
</Transaction>
<Transaction>
<Person>
<Full_Name> Scott Gibbs</Full_Name>
<Age>52</Age>
<Ssn>717-01-2401</Ssn>
<Credit_Card>
<Cc_Provider>VISA 19 digit</Cc_Provider>
<Cc_Number>371936215412640 </Cc_Number>
</Credit_Card>
<Bought_From>
<Date>2020-12-03</Date>
<Price>$1.21</Price>
<Status>Disputed </Status>
<Merchant>
<Shop> Amazon</Shop>
<Phone>(274)381-6022</Phone>
<Resides_In>
<Province>Minnesota</Province>
<City>Jorgeview</City>
<Address>877 Debra Way Apt. 305</Address>
<Lat>-59.405851</Lat>
<Lon>3.413555 </Lon>
</Resides_In>
</Merchant>
</Bought_From>
</Person>
</Transaction>
I want to find the person with the most disputed transactions across all transactions, using this query:
for $xml in
doc("dataset_100.xml")/root/Transaction
where $xml//Status = "Disputed"
for $x in
(
for $name in distinct-values(//Full_Name)
order by count(//Full_Name[. = $name]) descending
return <x>{$name}</x>
)
return fn:concat(
$x,
' - Contexted Transactions - ',
xs:string(count(//Full_Name[. = $x])))
But the result is a list that runs from the first element to the last and counts every transaction made, whether disputed or undisputed:
`Katherine Eaton - Contexted Transactions - 3
Charles Wright - Contexted Transactions - 6
Scott Gibbs - Contexted Transactions - 3
Wesley Underwood - Contexted Transactions - 3
Andres Hanna - Contexted Transactions - 2`
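I know this is incorrect because I checked it against the same data in neo4j, but I really don't know where to start fixing it.
Edit: I actually figured out how to post the missing part of the code that I was not allowed to post before. So I'm really sorry, and thank you for your answer, Martin Honnen, but this is the actual XML.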
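One possible reading of why the query above counts everything: the inner `//Full_Name` paths start from the root of the context document rather than from `$xml`, so the `where` clause never constrains what is counted, and the full name list is produced again for each transaction that passes the filter. The sample values also carry stray whitespace (`Disputed `, ` Scott Gibbs`), so an exact comparison with `"Disputed"` may not match. A minimal sketch of an alternative, assuming an XQuery 3.0 processor (for `group by`), exactly one `Status` and one `Full_Name` per `Transaction`, and the `dataset_100.xml` file name and output label from the question:

```xquery
xquery version "3.0";

(: Sketch only: assumes one Status and one Full_Name per Transaction. :)
for $t in doc("dataset_100.xml")/root/Transaction
(: normalize-space trims the trailing blank in "Disputed " :)
where normalize-space($t/Person/Bought_From/Status) = "Disputed"
(: after grouping, $t holds every disputed transaction for this name :)
group by $name := normalize-space($t/Person/Full_Name)
order by count($t) descending
return concat($name, " - Contexted Transactions - ", count($t))
```

Wrapping the whole FLWOR expression in parentheses and appending `[1]` would keep only the top entry; ties are not handled in this sketch.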
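[Discussion]:
-
I certainly hope these are entirely fictional names, SSNs, and credit card numbers.
-
Your query expects `<root>` and `<Transaction>`, but your sample data does not include them. Please make it representative.
-
By the way, in XQuery 3.0 you can use `declare context item` to embed your document together with your query in a single unit that can be run directly. That makes it much easier to build (and test) fully self-contained minimal reproducible examples.
-
("Please make it representative" means "please test that your query actually runs against the data you provided, and that it returns the specific output you claim when run against that data".)
-
Yes, the data is entirely randomly generated with the Faker class in Python.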
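To illustrate the `declare context item` suggestion from the comments, here is a minimal sketch that embeds a trimmed sample and a query in one file. It assumes a processor that supports the XQuery 3.0 prolog declaration; the two transactions are taken from the question's data and cut down to the elements the query reads, and the query itself is only an illustration, not the asker's query.

```xquery
xquery version "3.0";

(: Embedded sample: two transactions from the question, trimmed to the
   elements this query reads. The document becomes the context item. :)
declare context item := document {
  <root>
    <Transaction>
      <Person>
        <Full_Name> Scott Gibbs</Full_Name>
        <Bought_From><Status>Disputed </Status></Bought_From>
      </Person>
    </Transaction>
    <Transaction>
      <Person>
        <Full_Name> Katherine Eaton</Full_Name>
        <Bought_From><Status>Undisputed </Status></Bought_From>
      </Person>
    </Transaction>
  </root>
};

(: Paths starting with "/" now resolve against the embedded document.
   Selects the whitespace-normalized names of people with a disputed
   transaction. :)
/root/Transaction[normalize-space(Person/Bought_From/Status) = "Disputed"]
  /Person/Full_Name/normalize-space()
```

Because the data travels with the query, anyone can paste the file into a processor and see the same result without needing `dataset_100.xml`.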