【Posted】: 2015-09-19 17:08:57
【Question】:
I want to load 2.5 million vertices into Titan from a client application. I have a formatted txt file. The first line of the file:
id:12345,companyname:Abcd,country:Abcd,... (format: propertyname:propertyvalue,...)
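For illustration, a line in this format can be split into name/value pairs with plain Java before anything is handed to the graph. This is only a sketch: the class name `PropertyLineParser` and its `parse` method are hypothetical, and it assumes that values never contain a comma. Values are kept as strings, so a key that is declared with a numeric type (such as `id`, which is created with `dataType(Integer.class)` further down) would still need converting before it is stored.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper for lines shaped like "id:12345,companyname:Abcd,country:Abcd". */
final class PropertyLineParser {

    /** Splits one line into property name/value pairs, preserving field order. */
    static Map<String, String> parse(String line) {
        Map<String, String> properties = new LinkedHashMap<>();
        for (String field : line.split(",")) {
            // Limit 2 keeps any further ':' characters inside the value.
            String[] pair = field.split(":", 2);
            if (pair.length < 2 || pair[0].trim().isEmpty()) {
                continue; // skip fields that do not have a "name:value" shape
            }
            properties.put(pair[0].trim(), pair[1].trim());
        }
        return properties;
    }

    public static void main(String[] args) {
        Map<String, String> row = parse("id:12345,companyname:Abcd,country:Abcd");
        // Convert the id explicitly, since the parser only produces strings.
        int id = Integer.parseInt(row.get("id"));
        System.out.println(id + " " + row);
    }
}
```

Parsing into a map first also makes it easy to pull out the `id` field and use it as the vertex identifier, instead of a fixed value.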
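I tried loading a sample of 100 lines into Titan using Rexster from the client application, and that succeeded.
For 2.5 million lines, I think BatchGraph is the best approach. For testing, I took just the first line and saved it as test.txt.
This code compiles and runs: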
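import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Collections;
import java.util.List;
import org.apache.commons.configuration.BaseConfiguration;
import com.thinkaurelius.titan.core.TitanFactory;
import com.thinkaurelius.titan.core.TitanGraph;
import com.tinkerpop.blueprints.Vertex;
import com.tinkerpop.blueprints.util.wrappers.batch.BatchGraph;
import com.tinkerpop.blueprints.util.wrappers.batch.VertexIDType;

BaseConfiguration config = new BaseConfiguration();
// The in-memory backend lives inside this JVM; hostname and port only
// matter for remote backends such as cassandra.
config.setProperty("storage.backend", "inmemory");
config.setProperty("storage.hostname", "192.168.200.141");
// 8182 is Rexster's default HTTP port; Cassandra's Thrift port defaults to 9160.
config.setProperty("storage.port", "8182");
config.setProperty("storage.batch-loading", "true");
TitanGraph graph = TitanFactory.open(config);
// Buffer 1000 elements per underlying transaction; vertex ids are numeric.
BatchGraph<TitanGraph> bg = new BatchGraph<TitanGraph>(graph, VertexIDType.NUMBER, 1000);
String path = "c:\\test.txt";
Charset encoding = Charset.forName("ISO-8859-1");
List<String> lines;
try {
    lines = Files.readAllLines(Paths.get(path), encoding);
} catch (IOException e) {
    e.printStackTrace();
    lines = Collections.emptyList(); // nothing to load
}
long nextId = 1;
for (String line : lines) {
    // Each vertex needs its own id; a constant id only works for a one-line file.
    Vertex currentNode = bg.addVertex(nextId++);
    for (String value : line.split(",")) {
        // Limit 2 keeps any ':' inside the property value intact.
        String[] property = value.split(":", 2);
        if (property.length < 2) {
            continue; // skip malformed fields
        }
        currentNode.setProperty(property[0], property[1]);
    }
}
// BatchGraph commits whenever its buffer fills; flush the remainder once at the end.
bg.commit();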
When the properties are added, I get this error:
java.lang.IllegalArgumentException: Property Key with given name does not exist: id
at com.thinkaurelius.titan.graphdb.types.typemaker.DisableDefaultSchemaMaker.makePropertyKey(DisableDefaultSchemaMaker.java:27)
at com.thinkaurelius.titan.graphdb.transaction.StandardTitanTx.getOrCreatePropertyKey(StandardTitanTx.java:902)
at com.thinkaurelius.titan.graphdb.vertices.AbstractVertex.setProperty(AbstractVertex.java:239)
at com.tinkerpop.blueprints.util.wrappers.batch.BatchGraph$BatchVertex.setProperty(BatchGraph.java:492)
at tr.com.titanbulk.TitanBulk$5.widgetSelected(TitanBulk.java:213)
at org.eclipse.swt.widgets.TypedListener.handleEvent(Unknown Source)
at org.eclipse.swt.widgets.EventTable.sendEvent(Unknown Source)
at org.eclipse.swt.widgets.Widget.sendEvent(Unknown Source)
at org.eclipse.swt.widgets.Display.runDeferredEvents(Unknown Source)
at org.eclipse.swt.widgets.Display.readAndDispatch(Unknown Source)
I had already set up the property keys and composite indexes through gremlin:
mgmt = g.getManagementSystem()
id = mgmt.makePropertyKey('id').dataType(Integer.class).make()
companyname = mgmt.makePropertyKey('companyname').dataType(String.class).make()
country = mgmt.makePropertyKey('country').dataType(String.class).make()
mgmt.buildIndex('ni_id',Vertex.class).addKey(id).buildCompositeIndex()
mgmt.buildIndex('ni_companynamecountry',Vertex.class).addKey(companyname).addKey(country).buildCompositeIndex()
mgmt.buildIndex('ni_companyname',Vertex.class).addKey(companyname).buildCompositeIndex()
mgmt.buildIndex('ni_country',Vertex.class).addKey(country).buildCompositeIndex()
mgmt.commit()
g.getIndexedKeys(Vertex.class)
==>id
==>companyname
==>country
Loading the txt file through gremlin with the cassandra backend (How to import a CSV file into Titan graph database?) succeeded, but I still need to do it from my application. I changed config.setProperty("storage.backend", "inmemory"); to config.setProperty("storage.backend", "cassandra");
But when opening the connection (graph = TitanFactory.open(config);) I get this error:
18:26:15.503 [main] DEBUG c.t.t.d.c.a.AstyanaxStoreManager - About to instantiate class public com.netflix.astyanax.connectionpool.impl.FixedRetryBackoffStrategy(int,int) with 2 arguments
18:26:15.509 [main] DEBUG c.t.t.d.c.a.AstyanaxStoreManager - Instantiated RetryBackoffStrategy object com.netflix.astyanax.connectionpool.impl.FixedRetryBackoffStrategy@52e6fdee from config string "com.netflix.astyanax.connectionpool.impl.FixedRetryBackoffStrategy,1000,5000"
18:26:15.511 [main] DEBUG c.t.t.d.c.a.AstyanaxStoreManager - About to instantiate class public com.netflix.astyanax.retry.BoundedExponentialBackoff(long,long,int) with 3 arguments
18:26:15.512 [main] DEBUG c.t.t.d.c.a.AstyanaxStoreManager - Instantiated RetryPolicy object com.netflix.astyanax.retry.BoundedExponentialBackoff@7ec7ffd3[maxSleepTimeMs=25000,MAX_SHIFT=30,random=java.util.Random@dd8ba08,baseSleepTimeMs=100,maxAttempts=8,attempts=0] from config string "com.netflix.astyanax.retry.BoundedExponentialBackoff,100,25000,8"
18:26:15.530 [main] DEBUG c.t.t.d.c.a.AstyanaxStoreManager - Custom RetryBackoffStrategy com.netflix.astyanax.connectionpool.impl.FixedRetryBackoffStrategy@52e6fdee
18:26:15.810 [main] INFO c.n.a.c.i.ConnectionPoolMBeanManager - Registering mbean: com.netflix.MonitoredResources:type=ASTYANAX,name=ClusterTitanConnectionPool,ServiceType=connectionpool
18:26:15.823 [main] INFO c.n.a.c.i.CountingConnectionPoolMonitor - AddHost: 192.168.200.141
18:26:16.851 [pool-4-thread-1] DEBUG c.n.astyanax.thrift.ThriftConverter - java.net.ConnectException: Connection refused: connect
18:26:25.832 [main] DEBUG c.t.t.d.c.a.AstyanaxStoreManager - Failed to describe keyspace titan
18:26:25.832 [main] DEBUG c.t.t.d.c.a.AstyanaxStoreManager - Creating keyspace titan...
18:26:26.853 [pool-4-thread-1] DEBUG c.n.astyanax.thrift.ThriftConverter - java.net.ConnectException: Connection refused: connect
18:26:35.848 [main] DEBUG c.t.t.d.c.a.AstyanaxStoreManager - Failed to create keyspace titan
java.lang.IllegalArgumentException: Could not instantiate implementation: com.thinkaurelius.titan.diskstorage.cassandra.astyanax.AstyanaxStoreManager
at com.thinkaurelius.titan.util.system.ConfigurationUtil.instantiate(ConfigurationUtil.java:55)
at com.thinkaurelius.titan.diskstorage.Backend.getImplementationClass(Backend.java:421)
at com.thinkaurelius.titan.diskstorage.Backend.getStorageManager(Backend.java:361)
at com.thinkaurelius.titan.graphdb.configuration.GraphDatabaseConfiguration.<init>(GraphDatabaseConfiguration.java:1275)
at com.thinkaurelius.titan.core.TitanFactory.open(TitanFactory.java:93)
at com.thinkaurelius.titan.core.TitanFactory.open(TitanFactory.java:73)
at tr.com.kale.titanbulk.TitanBulk$5.widgetSelected(TitanBulk.java:196)
at org.eclipse.swt.widgets.TypedListener.handleEvent(Unknown Source)
at org.eclipse.swt.widgets.EventTable.sendEvent(Unknown Source)
at org.eclipse.swt.widgets.Widget.sendEvent(Unknown Source)
at org.eclipse.swt.widgets.Display.runDeferredEvents(Unknown Source)
at org.eclipse.swt.widgets.Display.readAndDispatch(Unknown Source)
at tr.com.kale.titanbulk.TitanBulk.open(TitanBulk.java:68)
at tr.com.kale.titanbulk.TitanBulk.main(TitanBulk.java:52)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(Unknown Source)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown Source)
at java.lang.reflect.Constructor.newInstance(Unknown Source)
at com.thinkaurelius.titan.util.system.ConfigurationUtil.instantiate(ConfigurationUtil.java:44)
... 13 more
Caused by: com.thinkaurelius.titan.diskstorage.TemporaryBackendException: Temporary failure in storage backend
at com.thinkaurelius.titan.diskstorage.cassandra.astyanax.AstyanaxStoreManager.ensureKeyspaceExists(AstyanaxStoreManager.java:563)
at com.thinkaurelius.titan.diskstorage.cassandra.astyanax.AstyanaxStoreManager.<init>(AstyanaxStoreManager.java:283)
... 18 more
Caused by: com.netflix.astyanax.connectionpool.exceptions.PoolTimeoutException: PoolTimeoutException: [host=192.168.200.141(192.168.200.141):9160, latency=10002(10002), attempts=1]Timed out waiting for connection
at com.netflix.astyanax.connectionpool.impl.SimpleHostConnectionPool.waitForConnection(SimpleHostConnectionPool.java:231)
at com.netflix.astyanax.connectionpool.impl.SimpleHostConnectionPool.borrowConnection(SimpleHostConnectionPool.java:198)
at com.netflix.astyanax.connectionpool.impl.RoundRobinExecuteWithFailover.borrowConnection(RoundRobinExecuteWithFailover.java:84)
at com.netflix.astyanax.connectionpool.impl.AbstractExecuteWithFailoverImpl.tryOperation(AbstractExecuteWithFailoverImpl.java:117)
at com.netflix.astyanax.connectionpool.impl.AbstractHostPartitionConnectionPool.executeWithFailover(AbstractHostPartitionConnectionPool.java:338)
at com.netflix.astyanax.thrift.ThriftClusterImpl.executeSchemaChangeOperation(ThriftClusterImpl.java:146)
at com.netflix.astyanax.thrift.ThriftClusterImpl.internalCreateKeyspace(ThriftClusterImpl.java:321)
at com.netflix.astyanax.thrift.ThriftClusterImpl.addKeyspace(ThriftClusterImpl.java:294)
at com.thinkaurelius.titan.diskstorage.cassandra.astyanax.AstyanaxStoreManager.ensureKeyspaceExists(AstyanaxStoreManager.java:558)
... 19 more
java.lang.IllegalArgumentException: Graph may not be null
at com.tinkerpop.blueprints.util.wrappers.batch.BatchGraph.<init>(BatchGraph.java:81)
at tr.com.kale.titanbulk.TitanBulk$5.widgetSelected(TitanBulk.java:206)
at org.eclipse.swt.widgets.TypedListener.handleEvent(Unknown Source)
at org.eclipse.swt.widgets.EventTable.sendEvent(Unknown Source)
at org.eclipse.swt.widgets.Widget.sendEvent(Unknown Source)
at org.eclipse.swt.widgets.Display.runDeferredEvents(Unknown Source)
at org.eclipse.swt.widgets.Display.readAndDispatch(Unknown Source)
at tr.com.kale.titanbulk.TitanBulk.open(TitanBulk.java:68)
at tr.com.kale.titanbulk.TitanBulk.main(TitanBulk.java:52)
I also tried cassandrathrift:
18:35:18.296 [main] DEBUG c.t.t.d.c.t.t.CTConnectionFactory - Creating TSocket(192.168.200.141, 9160, null, null, 10000)
java.lang.IllegalArgumentException: Could not instantiate implementation: com.thinkaurelius.titan.diskstorage.cassandra.thrift.CassandraThriftStoreManager
at com.thinkaurelius.titan.util.system.ConfigurationUtil.instantiate(ConfigurationUtil.java:55)
at com.thinkaurelius.titan.diskstorage.Backend.getImplementationClass(Backend.java:421)
at com.thinkaurelius.titan.diskstorage.Backend.getStorageManager(Backend.java:361)
at com.thinkaurelius.titan.graphdb.configuration.GraphDatabaseConfiguration.<init>(GraphDatabaseConfiguration.java:1275)
at com.thinkaurelius.titan.core.TitanFactory.open(TitanFactory.java:93)
at com.thinkaurelius.titan.core.TitanFactory.open(TitanFactory.java:73)
at tr.com.kale.titanbulk.TitanBulk$5.widgetSelected(TitanBulk.java:196)
at org.eclipse.swt.widgets.TypedListener.handleEvent(Unknown Source)
at org.eclipse.swt.widgets.EventTable.sendEvent(Unknown Source)
at org.eclipse.swt.widgets.Widget.sendEvent(Unknown Source)
at org.eclipse.swt.widgets.Display.runDeferredEvents(Unknown Source)
at org.eclipse.swt.widgets.Display.readAndDispatch(Unknown Source)
at tr.com.kale.titanbulk.TitanBulk.open(TitanBulk.java:68)
at tr.com.kale.titanbulk.TitanBulk.main(TitanBulk.java:52)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(Unknown Source)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown Source)
at java.lang.reflect.Constructor.newInstance(Unknown Source)
at com.thinkaurelius.titan.util.system.ConfigurationUtil.instantiate(ConfigurationUtil.java:44)
... 13 more
Caused by: com.thinkaurelius.titan.diskstorage.TemporaryBackendException: Temporary failure in storage backend
at com.thinkaurelius.titan.diskstorage.cassandra.thrift.CassandraThriftStoreManager.getCassandraPartitioner(CassandraThriftStoreManager.java:218)
at com.thinkaurelius.titan.diskstorage.cassandra.thrift.CassandraThriftStoreManager.<init>(CassandraThriftStoreManager.java:196)
... 18 more
Caused by: org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection refused: connect
at org.apache.thrift.transport.TSocket.open(TSocket.java:187)
at org.apache.thrift.transport.TFramedTransport.open(TFramedTransport.java:81)
at com.thinkaurelius.titan.diskstorage.cassandra.thrift.thriftpool.CTConnectionFactory.makeRawConnection(CTConnectionFactory.java:88)
at com.thinkaurelius.titan.diskstorage.cassandra.thrift.thriftpool.CTConnectionFactory.makeObject(CTConnectionFactory.java:52)
at com.thinkaurelius.titan.diskstorage.cassandra.thrift.thriftpool.CTConnectionFactory.makeObject(CTConnectionFactory.java:21)
at org.apache.commons.pool.impl.GenericKeyedObjectPool.borrowObject(GenericKeyedObjectPool.java:1220)
at com.thinkaurelius.titan.diskstorage.cassandra.thrift.CassandraThriftStoreManager.getCassandraPartitioner(CassandraThriftStoreManager.java:215)
... 19 more
Caused by: java.net.ConnectException: Connection refused: connect
at java.net.DualStackPlainSocketImpl.waitForConnect(Native Method)
at java.net.DualStackPlainSocketImpl.socketConnect(Unknown Source)
at java.net.AbstractPlainSocketImpl.doConnect(Unknown Source)
at java.net.AbstractPlainSocketImpl.connectToAddress(Unknown Source)
at java.net.AbstractPlainSocketImpl.connect(Unknown Source)
at java.net.PlainSocketImpl.connect(Unknown Source)
at java.net.SocksSocketImpl.connect(Unknown Source)
at java.net.Socket.connect(Unknown Source)
at org.apache.thrift.transport.TSocket.open(TSocket.java:182)
... 25 more
Thanks.
【Discussion】:
-
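Can you clarify exactly where your problem lies? It looks like you ran into several issues and solved some of them yourself, but apart from possibly connecting to cassandra, I can't tell exactly where you are stuck. Is that the main problem? If so, you should check the network connection to the cassandra instance and double-check that the port is open. Perhaps test on localhost first to verify that you can get a connection.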
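The port check suggested here can be sketched in plain Java, without Titan on the classpath, to separate a network problem from a Titan configuration problem. The class name `PortProbe` and the method `canConnect` are hypothetical; the default host and port are the ones that appear in the stack traces in the question (192.168.200.141 and the Thrift port 9160).

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

/** Hypothetical helper: checks whether a TCP port accepts connections. */
final class PortProbe {

    /** Returns true if a TCP connection to host:port succeeds within timeoutMillis. */
    static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Covers "Connection refused", timeouts and unresolved host names.
            return false;
        }
    }

    public static void main(String[] args) {
        String host = args.length > 0 ? args[0] : "192.168.200.141";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 9160;
        System.out.println(host + ":" + port + " reachable: " + canConnect(host, port, 3000));
    }
}
```

If this reports the port as unreachable from the client machine but reachable when run on the server itself, the cause is more likely the network path or the address Cassandra is listening on than anything in the Titan configuration.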
-
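Nice to see you here. Over the past week I went through all the questions and answers about Titan, and the thing they have in common is your answers. First, after creating the property keys through gremlin, I tried the "inmemory" backend but got the error "Property Key with given name does not exist". I realized that this backend is memory only, not storage. I think that with the in-memory backend the property keys have to be defined in the same transaction. I then changed the backend to "cassandra" to write the data to storage, but got a connection error: "Connection refused: connect".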
-
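I commented out the port definition in the BaseConfiguration properties in my code, but got the same error. After that, I loaded my file on localhost through the gremlin console with the cassandra backend, and it loaded successfully. So the main problem is connecting to the cassandra backend from my client application. I think my code will work once I can connect to cassandra.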
-
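Yes, this does not sound like a Titan problem. If your code works on localhost, then the problem lies solely with your network, your cassandra configuration, and so on. Establish basic connectivity to the remote cluster and your problem should be resolved.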
-
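I checked the firewall logs and nothing is being denied. I also checked all the properties, xml and yaml files on the server, but found no clues. I cannot open a connection from my PC to the Titan server :(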