Or statement with Match statement in Gremlin

【Question title】: Or statement with Match statement in Gremlin
【Posted】: 2020-04-29 18:21:42
【Question description】:

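I have a JanusGraph database with the following schema:

(Journal)

I am trying to write a query with the Gremlin match() step that searches two different journals for papers with a keyword in the title, together with the authors of those papers. This is what I have so far: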

sg = g.V().match(
    __.as('a').has('Journal', 'displayName', textContains('Journal Name 1')),
    __.as('a').has('Journal', 'displayName', textContains('Journal Name 2')),
    __.as('a').inE('PublishedIn').subgraph('sg').outV().as('b'), 
    __.as('b').has('Paper', 'paperTitle', textContains('My Key word')),
    __.as('b').inE('AuthorOf').subgraph('sg').outV().as('c')).
 cap('sg').next()

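This query runs successfully but returns 0 vertices and 0 edges. If I split the query in two and search for each Journal displayName separately, I get complete graphs, so I assume something is wrong with the logic or syntax of my query.

If I write the query like this: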

sg = g.V().or(has('JournalFixed', 'displayName', textContains('Journal Name 1')),
              has('JournalFixed', 'displayName', textContains('Journal Name 2'))).
              inE('PublishedInFixed').subgraph('sg').
              outV().has('Paper', 'paperTitle', textContains('My Key word')).
              inE('AuthorOf').subgraph('sg').
              outV().
              cap('sg').
              next()

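It returns a network of about 7000 nodes. How can I rewrite this query to use the match() step?

【Question discussion】:

Tags: gremlin graph-databases janusgraph

【Solution 1】:

I'm not sure this is your whole problem, but I think your match() is modeling your two "displayName" patterns as an and() rather than an or(). Both patterns start from the same label 'a', so a vertex has to satisfy both of them. You can check this with profile(), as I did here with TinkerGraph: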

gremlin> g.V().match(__.as('a').has('name','marko'), __.as('a').has('name','josh')).profile()
==>Traversal Metrics
Step                                                               Count  Traversers       Time (ms)    % Dur
=============================================================================================================
TinkerGraphStep(vertex,[name.eq(marko), name.eq...                                             0.067   100.00
                                            >TOTAL                     -           -           0.067        -

I think you could resolve this in a number of ways. For my example, using within(), as described in a different answer to an earlier question from you, works nicely:

gremlin> g.V().match(__.as('a').has('name', within('marko','josh'))).profile()
==>Traversal Metrics
Step                                                               Count  Traversers       Time (ms)    % Dur
=============================================================================================================
TinkerGraphStep(vertex,[name.within([marko, jos...                     2           2           0.098   100.00
                                            >TOTAL                     -           -           0.098        -

For your case, I would replace:

or(has('JournalFixed', 'displayName', textContains('Journal Name 1')),
   has('JournalFixed', 'displayName', textContains('Journal Name 2')))

with:

has('JournalFixed', 'displayName', textContains('Journal Name 1').
                                   or(textContains('Journal Name 2')))

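This basically takes advantage of P.or(). I think either of these options should be better than using the or()-step up front, but only a profile() run against JanusGraph will tell, as discussed here.

All that said, I wonder why your or() doesn't translate directly into match() as follows: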

g.V().match(
    __.as('a').or(has('Journal', 'displayName', textContains('Journal Name 1')),
                  has('Journal', 'displayName', textContains('Journal Name 2'))),
    __.as('a').inE('PublishedIn').subgraph('sg').outV().as('b'), 
    __.as('b').has('Paper', 'paperTitle', textContains('My Key word')),
    __.as('b').inE('AuthorOf').subgraph('sg').outV().as('c')).
 cap('sg')

I would expect my P.or() suggestion to perform much better, though.
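
The difference between these variants can be reproduced on a small graph. The following is a minimal sketch that assumes TinkerPop's built-in "modern" toy graph (via TinkerFactory) instead of the JanusGraph data from the question, and it uses eq() and within() in place of JanusGraph's textContains(). The variable names andCount, withinCount, and orCount are illustrative only.

```groovy
// Sketch for the Gremlin Console, or a Groovy script with tinkergraph-gremlin on the classpath.
// Assumption: TinkerPop's "modern" toy graph stands in for the asker's JanusGraph data.
import org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.__
import org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerFactory
import static org.apache.tinkerpop.gremlin.process.traversal.P.eq
import static org.apache.tinkerpop.gremlin.process.traversal.P.within

g = TinkerFactory.createModern().traversal()

// Two patterns bound to the same label 'a' must both hold, so this behaves like and()
andCount = g.V().match(__.as('a').has('name', 'marko'),
                       __.as('a').has('name', 'josh')).count().next()

// A single pattern with within(): either name is accepted
withinCount = g.V().match(__.as('a').has('name', within('marko', 'josh'))).count().next()

// A single has() with P.or(): the same condition expressed as one predicate
orCount = g.V().has('name', eq('marko').or(eq('josh'))).count().next()

println "and-style match: ${andCount}, within: ${withinCount}, P.or: ${orCount}"
```

On the modern graph, the and-style match() should find nothing, while the within() and P.or() variants should each find the two vertices named marko and josh. How JanusGraph optimizes each form may differ, so profile() on the real data is still the way to compare them.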

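【Discussion】:

Thanks Stephen, this is helpful. Also, I did not know about profile(); I think that will help. I really appreciate your help as I get to grips with Gremlin.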