Timeout on Neo4j traversal framework: answers

【Question title】: Timeout on Neo4j traversal framework
【Posted】: 2015-09-02 10:04:51
【Question】:

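I have a very large graph with hundreds of millions of nodes and relationships, and I need to traverse it to find out whether a specific node is connected to another node that has a particular property.
The data is highly interconnected, and a pair of nodes can be linked by multiple relationships.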

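Since this operation has to run on a real-time system, I have very strict time constraints: it must take no more than 200 ms to find possible results.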

So I created the following TraversalDescription:

TraversalDescription td = graph.traversalDescription()
                           .depthFirst()
                           .uniqueness(Uniqueness.NODE_GLOBAL)
                           .expand(new SpecificRelsPathExpander(requiredEdgeProperty))
                           .evaluator(new IncludePathWithTargetPropertyEvaluator(targetNodeProperty));

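The evaluator checks, for each path, whether the end node is my target. If it is, the path is included and pruned; if not, the path is excluded and the traversal continues. I also set a limit on the time the traversal may take and on the maximum number of results to find. Everything can be seen in the code below: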

private class IncludePathWithTargetPropertyEvaluator implements Evaluator {

    private String targetProperty;
    private int results;
    private long startTime, curTime, elapsed;

    public IncludePathWithTargetPropertyEvaluator(String targetProperty) {
        this.targetProperty = targetProperty;
        // The clock starts when the evaluator is constructed.
        this.startTime = System.currentTimeMillis();
        this.results = 0;
    }

    public Evaluation evaluate(Path path) {

        curTime = System.currentTimeMillis();
        elapsed = curTime - startTime;

        // Time budget used up: do not follow this branch any further.
        if (elapsed >= 200) {
            return Evaluation.EXCLUDE_AND_PRUNE;
        }

        // Enough results found: do not follow this branch any further.
        if (results >= 3) {
            return Evaluation.EXCLUDE_AND_PRUNE;
        }

        String property = (String) path.endNode().getProperty("propertyName");

        // Target reached: include the path and stop expanding below it.
        if (property.equals(targetProperty)) {
            results = results + 1;
            return Evaluation.INCLUDE_AND_PRUNE;
        }

        return Evaluation.EXCLUDE_AND_CONTINUE;
    }
}

Finally, I wrote a custom PathExpander, because each time we only need to traverse edges that have a specific property value:

private class SpecificRelsPathExpander implements PathExpander<Object> {

    private String requiredProperty;

    public SpecificRelsPathExpander(String requiredProperty) {
        this.requiredProperty = requiredProperty;
    }

    public Iterable<Relationship> expand(Path path, BranchState<Object> state) {
        Iterable<Relationship> rels = path.endNode().getRelationships(RelTypes.FOO, Direction.BOTH);
        // Nothing to expand: return an empty collection rather than null.
        if (!rels.iterator().hasNext())
            return Collections.<Relationship>emptyList();
        // Keep only the relationships that carry the required property value.
        List<Relationship> validRels = new LinkedList<Relationship>();
        for (Relationship rel : rels) {
            String property = (String) rel.getProperty("propertyName");
            if (property.equals(requiredProperty)) {
                validRels.add(rel);
            }
        }
        return validRels;
    }

    // not used
    public PathExpander<Object> reverse() {
        return null;
    }
}

The problem is that the traverser keeps running even after the 200 ms have elapsed.

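As far as I understand, the evaluator's behaviour is that all subsequent branches of every path evaluated with EXCLUDE_AND_CONTINUE are enqueued, and the traverser itself does not stop until it has visited every subsequent path in the queue.
So what can happen is this: even if I have only a few high-degree nodes, they result in thousands of paths to traverse.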

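In that case, is there a way to make the traverser stop abruptly once the timeout is reached and return whatever valid paths were found in the meantime?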

【Question comments】:

Tags: java neo4j traversal

【Solution 1】:

I would go with the following line of thought:

Once the timeout has elapsed, stop expanding the graph.

private class SpecificRelsPathExpander implements PathExpander<Object> {

    private String requiredProperty;
    private long startTime, curTime, elapsed;

    public SpecificRelsPathExpander(String requiredProperty) {
        this.requiredProperty = requiredProperty;
        // The clock starts when the expander is constructed.
        this.startTime = System.currentTimeMillis();
    }

    public Iterable<Relationship> expand(Path path, BranchState<Object> state) {
        curTime = System.currentTimeMillis();
        elapsed = curTime - startTime;

        // Time budget used up: offer no further relationships.
        if (elapsed >= 200) {
            return Collections.<Relationship>emptyList();
        }

        Iterable<Relationship> rels = path.endNode().getRelationships(RelTypes.FOO, Direction.BOTH);
        // Nothing to expand: return an empty collection rather than null.
        if (!rels.iterator().hasNext())
            return Collections.<Relationship>emptyList();
        // Keep only the relationships that carry the required property value.
        List<Relationship> validRels = new LinkedList<Relationship>();
        for (Relationship rel : rels) {
            String property = (String) rel.getProperty("propertyName");
            if (property.equals(requiredProperty)) {
                validRels.add(rel);
            }
        }
        return validRels;
    }

    // not used
    public PathExpander<Object> reverse() {
        return null;
    }
}

I think taking a look at the Neo4j TraversalDescription definition might also be helpful for you.
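
A side note on the expander above: the start time is captured in the constructor, so the 200 ms budget begins when the expander is built, not when the traversal starts. One possible way around that is a small time-budget object created right before the traversal and shared by the expander and the evaluator. The following is only a minimal sketch in plain Java; the class name `Deadline` and its methods are hypothetical helpers, not part of the Neo4j API.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of the Neo4j API: a time budget that the
// expander and the evaluator of one traversal could share.
final class Deadline {
    private final long startNanos;
    private final long budgetNanos;

    private Deadline(long startNanos, long budgetNanos) {
        this.startNanos = startNanos;
        this.budgetNanos = budgetNanos;
    }

    // Starts the clock now; meant to be called right before the traversal starts.
    static Deadline startingNow(long amount, TimeUnit unit) {
        return new Deadline(System.nanoTime(), unit.toNanos(amount));
    }

    // True once the budget is used up. System.nanoTime() is monotonic, so
    // wall-clock adjustments do not affect the result.
    boolean expired() {
        return System.nanoTime() - startNanos >= budgetNanos;
    }
}
```

Under this assumption, the expander would offer no further relationships once `deadline.expired()` returns true, and the evaluator would return `Evaluation.EXCLUDE_AND_PRUNE` in the same situation.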

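【Comments】:

This is a good improvement and definitely helps towards the goal. The problem remains, though: a node with hundreds of relationships can still be expanded, and with a few such nodes the number of paths queued in the traversal grows quickly. I need a stricter constraint that can stop the traversal immediately.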

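【Solution 2】:

I would implement the expander so that it keeps the traversal framework lazy, also because the code is simpler. This prevents the traversal from eagerly collecting all relationships of a node, like this: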

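// Imports assume the Neo4j 2.x helper classes (org.neo4j.helpers.*).
import org.neo4j.graphdb.Direction;
import org.neo4j.graphdb.Path;
import org.neo4j.graphdb.PathExpander;
import org.neo4j.graphdb.Relationship;
import org.neo4j.graphdb.traversal.BranchState;
import org.neo4j.helpers.Predicate;
import org.neo4j.helpers.collection.Iterables;

public class SpecificRelsPathExpander implements PathExpander<Object>, Predicate<Relationship>
{
    private final String requiredProperty;

    public SpecificRelsPathExpander( String requiredProperty )
    {
        this.requiredProperty = requiredProperty;
    }

    @Override
    public Iterable<Relationship> expand( Path path, BranchState<Object> state )
    {
        // Lazy filter: relationships are checked only as the traversal pulls them.
        Iterable<Relationship> rels = path.endNode().getRelationships( RelTypes.FOO, Direction.BOTH );
        return Iterables.filter( this, rels );
    }

    @Override
    public boolean accept( Relationship relationship )
    {
        // The null default means a missing property does not throw.
        return requiredProperty.equals( relationship.getProperty( "propertyName", null ) );
    }

    // not used
    @Override
    public PathExpander<Object> reverse()
    {
        return null;
    }
}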

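The traversal continues only as long as the client, i.e. whoever holds the iterator received when starting the traversal, calls hasNext/next. No traversal happens on its own; it all happens inside hasNext/next.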
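
Since the work happens inside hasNext/next, the caller can enforce the time budget simply by no longer asking for more paths. The following is a minimal sketch of that idea in plain Java; `BoundedCollector` and `collect` are hypothetical names, not part of the Neo4j API. The budget is checked between elements, so a single slow hasNext() call can still overrun it, which is why a lazy expander matters.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical helper, not part of the Neo4j API.
final class BoundedCollector {
    private BoundedCollector() {}

    // Pulls items from a lazy iterator until maxResults items were collected,
    // the time budget is used up, or the iterator is exhausted.
    static <T> List<T> collect(Iterator<T> lazy, int maxResults, long budgetNanos) {
        List<T> found = new ArrayList<>();
        long start = System.nanoTime();
        // The cheap checks come first, so hasNext() (where the traversal work
        // would happen) is not called once a limit has been reached.
        while (found.size() < maxResults
                && System.nanoTime() - start < budgetNanos
                && lazy.hasNext()) {
            found.add(lazy.next());
        }
        return found;
    }
}
```

With the traversal from the question, a call might look like `BoundedCollector.collect(td.traverse(startNode).iterator(), 3, TimeUnit.MILLISECONDS.toNanos(200))`, where `startNode` stands for whatever node the traversal begins at. The evaluator would then only need to decide whether a path is a match, and the time and result limits would live in the calling code.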

【Comments】: