
Comparing the Mainstream Graph Databases Neo4J, ArangoDB, and OrientDB: Architecture Analysis

Time: 2021-07-01 10:21:17

# neo4j-import --into njaq --nodes /home/dawn/csv/perosnInfo.csv --relationships /home/dawn/csv/know.csv --skip-bad-relationships true --skip-bad-entries-logging true --bad-tolerance true
WARNING: neo4j-import is deprecated and support for it will be removed in a future version of Neo4j; please use neo4j-admin import instead.
Neo4j version: 3.2.1
Importing the contents of these files into njaq:
Nodes:
  /home/dawn/csv/perosnInfo.csv
Relationships:
  /home/dawn/csv/know.csv

Available resources:
  Total machine memory: 3.84 GB
  Free machine memory: 1.61 GB
  Max heap memory : 875.00 MB
  Processors: 4
  Configured max memory: 700.35 MB

Nodes, started 2017-06-08 05:35:30.741+0000
[>:18.87 MB|NODE:152.59 MB----|*PROPERTIES(3)============|LABEL SCAN--|v:37.14 MB/s-----------]20.0M ∆21.8K
Done in 51s 548ms
Prepare node index, started 2017-06-08 05:36:22.495+0000
[*DETECT:419.62 MB----------------------------------------------------------------------------]20.0M ∆-6500000
Done in 9s 126ms
Relationships, started 2017-06-08 05:36:31.678+0000
[>:7|T|*PREPARE(4)=========================================================|RE|CALCULATE-|P|v:]79.9M ∆10.9K
Done in 4m 17s 742ms
Relationship --> Relationship 1/1, started 2017-06-08 05:40:49.548+0000
[*>-----------------------------------------------------------------------|LINK------------|v:]79.9M ∆ 405K
Done in 2m 5s 784ms
RelationshipGroup 1/1, started 2017-06-08 05:42:55.404+0000
[*>:??----------------------------------------------------------------------------------------] 0 ∆ 0
Done in 11ms
Node --> Relationship, started 2017-06-08 05:42:55.439+0000
[>:13|*>-------------------------------------------------|LIN|v:26.00 MB/s--------------------]19.9M ∆2.18M
Done in 11s 833ms
Relationship <-- Relationship 1/1, started 2017-06-08 05:43:07.308+0000
[*>-------------------------------------------------------------------------------|LINK----|v:]79.9M ∆ 168K
Done in 11m 29s 787ms
Count groups, started 2017-06-08 05:54:37.570+0000
[*>:??----------------------------------------------------------------------------------------] 0 ∆ 0
Done in 1ms
Gather, started 2017-06-08 05:54:38.061+0000
[*>:??----------------------------------------------------------------------------------------] 0 ∆ 0
Done in 4ms
Write, started 2017-06-08 05:54:38.156+0000
[*>:??----------------------------------------------------------------------------------------] 0 ∆ 0
Done in 15ms
Node --> Group, started 2017-06-08 05:54:38.213+0000
[*>:??----------------------------------------------------------------------------------------] 0 ∆ 0
Done in
Node counts, started 2017-06-08 05:54:38.264+0000
[*>(4)====================================================================================|COU]20.0M ∆80.0K
Done in 1m 26s 338ms
Relationship counts, started 2017-06-08 05:56:04.625+0000
[*>(4)======================================================|COUNT----------------------------]80.0M ∆1.81M
Done in 2m 47s 277ms

IMPORT DONE in 23m 22s 420ms.
Imported:
  20000000 nodes
  79994052 relationships
  80000000 properties
Peak memory usage: 899.62 MB

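After a bulk import like this, Neo4J still needs its indexes to be built; without them query performance is essentially unusable. Building the indexes is fast.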
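As a rough illustration of that indexing step, the sketch below builds a Neo4j 3.x schema-index statement in Python. The label `person` and the property `no` are taken from the queries later in this article; that this is the index the author actually created is an assumption, and `index_statement` is a hypothetical helper, not part of any Neo4j driver.

```python
def index_statement(label: str, prop: str) -> str:
    """Return a Neo4j 3.x `CREATE INDEX ON` statement for one label/property pair.

    Hypothetical helper for illustration only; it just formats a Cypher string.
    """
    # Reject anything that is not a plain identifier, so the formatted
    # statement cannot be altered by odd input.
    for name in (label, prop):
        if not name.isidentifier():
            raise ValueError("not a plain identifier: %r" % name)
    return "CREATE INDEX ON :%s(%s)" % (label, prop)


# Assumed label/property, matching the `person {no: ...}` lookups used below.
print(index_statement("person", "no"))
```

The resulting statement would then be run once against the imported database, for example from the Neo4j browser or a driver session.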


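Graph algorithm support

Graph algorithm support in ArangoDB

  1. AQL
    1. Traversal: starting from a given vertex, retrieve the vertices connected to it, subject to a chosen strategy, edge type, graph type, and depth.
      1. Data source: a graph or edge collections
      2. Edge direction: outbound, inbound, or any
      3. Traversal strategy: BFS or DFS
    2. Shortest path: the shortest path between two vertices; the options are largely the same as for traversal
  2. Pregel
    1. Many distributed graph algorithms live under the @arangodb/pregel folder
    2. PageRank
    3. CC: strongly and weakly connected components
    4. Single-source shortest path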
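The traversal options listed above (start vertex, edge direction, BFS, depth limit) can be sketched in plain Python. This is a minimal in-memory illustration of the idea, not ArangoDB's implementation; the function name `traverse`, the edge-list input, and the direction strings are assumptions chosen for the example.

```python
from collections import deque


def traverse(edges, start, direction="outbound", max_depth=1):
    """Breadth-first traversal over a list of (source, target) edges.

    Returns {vertex: depth} for every vertex reached within max_depth,
    excluding the start vertex. Each vertex is visited at most once.
    """
    if direction not in ("outbound", "inbound", "any"):
        raise ValueError("unknown direction: %r" % direction)

    # Index edges both ways so each direction is a simple lookup.
    out_adj, in_adj = {}, {}
    for src, dst in edges:
        out_adj.setdefault(src, []).append(dst)
        in_adj.setdefault(dst, []).append(src)

    def neighbours(v):
        if direction in ("outbound", "any"):
            yield from out_adj.get(v, [])
        if direction in ("inbound", "any"):
            yield from in_adj.get(v, [])

    depth = {start: 0}
    queue = deque([start])
    while queue:
        v = queue.popleft()
        if depth[v] == max_depth:
            continue  # depth limit reached, do not expand further
        for w in neighbours(v):
            if w not in depth:
                depth[w] = depth[v] + 1
                queue.append(w)
    del depth[start]
    return depth


edges = [("a", "b"), ("b", "c"), ("d", "a")]
print(traverse(edges, "a", "outbound", max_depth=2))
print(traverse(edges, "a", "inbound", max_depth=1))
print(traverse(edges, "a", "any", max_depth=1))
```

Swapping the queue for a stack would give the DFS variant; the direction and depth handling stay the same.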

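JS extensions

  • The built-in algorithms can be extended through JS, but a user-defined function is a single-threaded JS function, so performance is a concern when it is used to implement an algorithm. The best choice is to implement graph algorithms with the built-in methods.
  • Many algorithms can be implemented in JS, but code in arangosh runs single-threaded (JS inside arangod runs on multiple threads, arangosh does not), and JS itself is not well suited to compute-heavy code. By comparison, code written purely in the database's built-in query language, rather than in a mix of query language and JS, is much faster; the query languages of Neo4J and OrientDB are examples.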

Neo4J

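For ordinary traversal and shortest-path algorithms, Neo4J offers the same support as ArangoDB, but the traversal-depth threshold is harder to set in Neo4J, and the algorithms become inefficient once the depth exceeds 6.

By contrast, ArangoDB's algorithm parameters are all set through key-value options, which makes the algorithms very flexible at the code level.

For algorithms such as PageRank and CC, Neo4J offers two approaches:

  • Write a Jar package; there is one on GitHub that is not officially endorsed
  • Implement the algorithm directly in Cypher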
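To make concrete what such a Jar or Cypher implementation has to compute, here is a minimal PageRank sketch in plain Python using power iteration. It is an illustration under stated assumptions, not the code of any Neo4J plugin: the damping factor 0.85, the fixed iteration count, and the even redistribution of rank from vertices without outgoing edges are choices made for this example.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Power-iteration PageRank over a list of (source, target) edges.

    Returns {vertex: rank}; the ranks sum to 1.
    """
    nodes = sorted({v for edge in edges for v in edge})
    out_links = {v: [] for v in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        # Rank held by vertices with no outgoing edges is spread evenly.
        dangling = sum(rank[v] for v in nodes if not out_links[v])
        new_rank = {v: (1.0 - damping) / n + damping * dangling / n
                    for v in nodes}
        for v in nodes:
            if out_links[v]:
                share = damping * rank[v] / len(out_links[v])
                for w in out_links[v]:
                    new_rank[w] += share
        rank = new_rank
    return rank


# Tiny example graph: both a and b point at c, and c points back at a.
ranks = pagerank([("a", "c"), ("b", "c"), ("c", "a")])
for vertex in sorted(ranks, key=ranks.get, reverse=True):
    print(vertex, round(ranks[vertex], 3))
```

On a graph the size of the import above (20 million vertices), a single-process loop like this would be slow, which is why the distributed or in-database implementations matter.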

OrientDB

As with Neo4J, graph algorithms can also be added by importing a Jar package.


Built-in graph algorithms

Shortest path

Cypher:

match (p1:person{no:'%s'}),(p2:person{no:'%s'}) match p=shortestPath((p1)-[*..3]->(p2)) return p

OrientDB SQL:

select dijkstra((select @RID from persons where no='%s'),(select @RID from persons where no='%s'),'E')

AQL:

for v,e in outbound shortest_path '%s' to '%s' graph 'graphPersons' return [v._key,e._key]
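The queries above all ask for a shortest directed path between two vertices. As a rough sketch of the unweighted, hop-limited case (comparable in spirit to the `[*..3]` bound in the Cypher query), the following plain-Python BFS returns one shortest path or `None`. The function name and the edge-list input are assumptions for illustration; none of the three databases is claimed to work exactly this way, and OrientDB's `dijkstra` additionally takes edge weights into account.

```python
from collections import deque


def shortest_path(edges, source, target, max_hops=3):
    """Return the vertices of one shortest directed path from source to
    target that uses at most max_hops edges, or None if there is none."""
    adj = {}
    for src, dst in edges:
        adj.setdefault(src, []).append(dst)

    parent = {source: None}  # also serves as the visited set
    hops = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        if v == target:
            # Walk the parent links back to the source.
            path = []
            while v is not None:
                path.append(v)
                v = parent[v]
            return path[::-1]
        if hops[v] == max_hops:
            continue  # hop limit reached, do not expand further
        for w in adj.get(v, []):
            if w not in parent:
                parent[w] = v
                hops[w] = hops[v] + 1
                queue.append(w)
    return None


edges = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "e"), ("a", "c")]
print(shortest_path(edges, "a", "d"))
print(shortest_path(edges, "a", "e", max_hops=2))
```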

Adjacent vertices

Cypher:

MATCH (js:person)-[:know]-(surfer) WHERE js.no = '%s' return surfer

OrientDB SQL:

select from E where out = (select @RID from persons where no='%s')

ArangoDB (a traversal call made from Python rather than an AQL query):

traversal_results = graphPersons.traverse(
      start_vertex='persons/'+getSingleInfo(id).no,
      strategy='bfs',
      direction='outbound',
      edge_uniqueness='global',
      vertex_uniqueness='global',
      max_depth=1
  )

References

  • Appendix B. Frequently Asked Questions (Neo4J, latest version)
  • ArangoDB 3.0 – A Solid Ground to Scale
  • OrientDB calculate PageRank
