Neo4j: what is the most efficient solution to import csv?
I have the following query and I am looking for a more efficient alternative, if one exists: my CSV has a lot of rows, and Neo4j takes too long to import them all.
USING PERIODIC COMMIT 1000
LOAD CSV WITH HEADERS FROM "file:///registry_office.csv" AS f
FIELDTERMINATOR "|"
WITH f AS a
WHERE a.JobName IS NOT NULL AND a.JobCode IS NOT NULL
  AND a.JobDescription IS NOT NULL AND a.JobLongDescription IS NOT NULL
  AND a.Long_Description IS NOT NULL AND a.Position IS NOT NULL
  AND a.birthDate IS NOT NULL AND a.startWorkingDate IS NOT NULL
MERGE (b:Job {Name: a.JobName, Code: a.JobCode, Job: a.JobDescription,
  JobLongDescription: a.JobLongDescription})
MERGE (c:Person {PersonName: a.PersonName, PersonSurname: a.PersonSurname,
  CF: a.CF, birthDate: a.birthDate, address: a.address, age: a.age,
  married: a.married, birthPlace: a.birthPlace})
MERGE (b)<-[:RELATED_TO {startWorkingDate: a.startWorkingDate,
  JobPosition: a.Position}]-(c)
RETURN *;
Do you have any suggestions for me?
csv graph neo4j cypher load
Have you created any constraints in your graph? Can you give the EXPLAIN of your query?
– logisima
Nov 16 '18 at 13:28
Also include your Neo4j version.
– Tezra
Nov 16 '18 at 14:10
Yes, I created constraints and an index, and I have the latest version of Neo4j.
– raf
Nov 16 '18 at 14:15
Please add EXPLAIN to the start of your Cypher query, and share the output.
– Tezra
Nov 16 '18 at 21:03
edited Nov 16 '18 at 10:07
asked Nov 16 '18 at 9:52 by raf
1 Answer
The import tool is generally much faster than LOAD CSV.
However, your query suggests that each CSV row ends up as a pattern (b)<--(c), so you'd need to do some pre-processing on this CSV: first filter out the rows with null values, then split it into 3 CSVs (2 for nodes, 1 for relationships).
To do this, you have 3 main options:
Excel - not viable for huge CSVs
CLI tool - something like csvkit
Program - if you are OK with Python or JavaScript, you'll be able to do this in 20 minutes or so.
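As a rough illustration of the "Program" option, here is a minimal Python sketch of that pre-processing. It is not part of the original answer and rests on several assumptions: that JobCode uniquely identifies a job and CF uniquely identifies a person (the original MERGE matches on all properties instead), that empty fields should be treated as missing, and that the output file names (jobs.csv, persons.csv, rels.csv) and the function names are free to choose. The header suffixes (:ID, :LABEL, :START_ID, :END_ID, :TYPE) follow the import tool's CSV header format.

```python
import csv

# Columns the original WHERE clause requires to be non-null.
REQUIRED = ["JobName", "JobCode", "JobDescription", "JobLongDescription",
            "Long_Description", "Position", "birthDate", "startWorkingDate"]
JOB_COLS = ["JobName", "JobCode", "JobDescription", "JobLongDescription"]
PERSON_COLS = ["PersonName", "PersonSurname", "CF", "birthDate",
               "address", "age", "married", "birthPlace"]


def split_rows(rows):
    """Split row dicts into (jobs, persons, rels).

    Assumption: JobCode identifies a job and CF identifies a person,
    so the first row seen for each key wins.
    """
    jobs, persons, rels = {}, {}, set()
    for row in rows:
        # Skip rows where a required field (or the person key) is empty.
        if any(not row.get(col) for col in REQUIRED) or not row.get("CF"):
            continue
        jobs.setdefault(row["JobCode"],
                        [row[c] for c in JOB_COLS] + ["Job"])
        persons.setdefault(row["CF"],
                           [row.get(c) or "" for c in PERSON_COLS] + ["Person"])
        # A set removes duplicate relationships, like MERGE would.
        rels.add((row["CF"], row["JobCode"], row["startWorkingDate"],
                  row["Position"], "RELATED_TO"))
    return jobs, persons, rels


def write_csv(path, header, rows):
    """Write a header line plus data rows as a comma-separated file."""
    with open(path, "w", newline="", encoding="utf-8") as fh:
        writer = csv.writer(fh)
        writer.writerow(header)
        writer.writerows(rows)


def main(src="registry_office.csv"):
    """Read the pipe-delimited source and write the three import files."""
    with open(src, newline="", encoding="utf-8") as fh:
        jobs, persons, rels = split_rows(csv.DictReader(fh, delimiter="|"))
    write_csv("jobs.csv",
              ["Name", "Code:ID(Job)", "Job", "JobLongDescription", ":LABEL"],
              jobs.values())
    write_csv("persons.csv",
              ["PersonName", "PersonSurname", "CF:ID(Person)", "birthDate",
               "address", "age", "married", "birthPlace", ":LABEL"],
              persons.values())
    write_csv("rels.csv",
              [":START_ID(Person)", ":END_ID(Job)", "startWorkingDate",
               "JobPosition", ":TYPE"],
              sorted(rels))
```

Calling main() in the directory that holds registry_office.csv should produce the three files, which can then be passed to the import tool as node and relationship inputs. The exact command-line flags differ between Neo4j versions, so check the documentation for the version you run. Also note that the import tool is meant for loading into a new, empty database, so this route only fits an initial bulk load.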
edited Nov 17 '18 at 0:02
answered Nov 16 '18 at 23:47 by Izhaki