Low Cassandra write throughput: 1500-2000 writes per second in a 6-node cluster
Cassandra cluster specs:
Nodes: 6
Storage: 1536 GB
Cores: 48
RAM: 168 GB
Latency to the Cassandra cluster from my local machine: 330-390 ms
I am using the Cassandra Java driver, spark-cassandra-connector_2.11 version 2.3.2.
Cluster configuration in the Java driver:
    private static Session connect(
        final String node, final Integer port, final String userName, final String password) {
      Builder b = Cluster.builder().addContactPoints(node.split(COMMA_SEPARATOR));
      if (!Strings.isNullOrEmpty(userName) && !Strings.isNullOrEmpty(password)) {
        b.withCredentials(userName, password);
      }
      if (port != null && port != 0) {
        b.withPort(port);
      }
      PoolingOptions poolingOptions = new PoolingOptions();
      poolingOptions
          .setMaxRequestsPerConnection(HostDistance.LOCAL, 32768)
          .setMaxRequestsPerConnection(HostDistance.REMOTE, 10000)
          .setMaxConnectionsPerHost(HostDistance.LOCAL, 3)
          .setMaxConnectionsPerHost(HostDistance.REMOTE, 3)
          .setNewConnectionThreshold(HostDistance.LOCAL, 3)
          .setNewConnectionThreshold(HostDistance.REMOTE, 3)
          .setCoreConnectionsPerHost(HostDistance.LOCAL, 3)
          .setCoreConnectionsPerHost(HostDistance.REMOTE, 3);
      b.withSocketOptions(
          new SocketOptions()
              .setConnectTimeoutMillis(SOCKET_CONNECT_TIMEOUT)
              .setReadTimeoutMillis(SOCKET_READ_TIMEOUT));
      b.withPoolingOptions(poolingOptions);
      cluster = b.build();
      session = cluster.connect();
      return session;
    }
Below is my test table:
    CREATE TABLE my_keyspace.test_table (
        id int PRIMARY KEY
    );
To write to Cassandra I call session.executeAsync, store the returned futures in a list, and wait for all of them to complete.
100000 writes take 50-65 seconds.
Is it supposed to be this slow, or am I missing something in the configuration?
I have already tried several settings in the socket options and pooling options, but this is the best I have got.
cassandra datastax-java-driver
That's very low throughput. How much memory is allocated for Cassandra? Do you see anything in the logs?
– Alex Ott
Nov 15 '18 at 9:00
I haven't checked the logs, but it should be 168/6 GB per node. This is a dedicated Cassandra cluster.
– Bikas Katwal
Nov 15 '18 at 14:11
No, in your setup the heap will be 1/4 of the memory available on the machine, so it's around 7 GB... I would recommend increasing it to 12 or 16 GB explicitly.
– Alex Ott
Nov 15 '18 at 14:31
Sure. The cluster is not owned by us, but I will definitely check.
– Bikas Katwal
Nov 15 '18 at 14:57
edited Nov 15 '18 at 8:59 by Alex Ott
asked Nov 15 '18 at 6:26 by Bikas Katwal
1 Answer
The first thing I would check is whether your Cassandra servers are running at 100% CPU utilization. If they aren't, and assuming the servers are not bottlenecked on disk (1500 writes per second is no problem even for a spinning disk), then the bottleneck has to be somewhere else:
One possibility you should always check first is whether the client is the bottleneck, i.e., whether it is using 100% CPU.
Then, you said that "Latency to cassandra cluster from my local is 330ms". Is this the ping time between your test machine and the Cassandra cluster? If so, you may have one of two kinds of problem. First, maybe this is some sort of low-bandwidth WAN that really can't support more than 2000 requests per second. But I doubt that. Another possibility is that your client simply doesn't have enough concurrency... With 1/3 second latency, to achieve 2000 writes per second the client needs about 666 requests (2000 × 1/3) in flight at the same time. Is the setMaxRequestsPerConnection() value you set really taking effect? Because if it isn't, the default (according to https://docs.datastax.com/en/developer/java-driver/2.1/manual/pooling/ ) is 256 requests per connection, times the 3 connections you set, which is 768, close to the 666 computed above.
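The arithmetic above can be checked with a small sketch. This is only an illustration of the latency-times-throughput estimate, not driver code: the class and method names are hypothetical, and the cap of 256 × 3 is the assumed default mentioned above, which may not match what your driver version actually applies.

```java
public class ConcurrencyEstimate {

    // Little's law: requests in flight = throughput (req/s) * latency (s).
    public static double requiredInFlight(double writesPerSecond, double latencySeconds) {
        return writesPerSecond * latencySeconds;
    }

    // Throughput ceiling when at most maxInFlight requests can be outstanding.
    public static double maxThroughput(int maxInFlight, double latencySeconds) {
        return maxInFlight / latencySeconds;
    }

    public static void main(String[] args) {
        // Roughly 1/3 second round trip, as reported in the question.
        double latencySeconds = 1.0 / 3.0;
        System.out.printf("In flight needed for 2000 writes/s: %.0f%n",
                requiredInFlight(2000, latencySeconds));

        // Assumption: 256 requests per connection * 3 connections.
        int assumedCap = 256 * 3;
        System.out.printf("Ceiling at 330 ms: %.0f writes/s%n",
                maxThroughput(assumedCap, 0.330));
        System.out.printf("Ceiling at 390 ms: %.0f writes/s%n",
                maxThroughput(assumedCap, 0.390));
    }
}
```

With a cap of 768 in-flight requests and 330-390 ms latency, the ceiling works out to roughly 1970-2330 writes per second, which is in the same range as the throughput you observed. That is suggestive rather than conclusive, since other limits could produce similar numbers.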
And of course it can be many other things. It's hard to guess without more data.
Thanks @nyh. 1. Yes, it's the ping response time. 2. I tried the same setup locally with 2 nodes and got 10000-15000 writes per second, so I feel it has to do with bandwidth. I haven't confirmed that yet; I will try running the same program in a VM close to the Cassandra cluster, which might give me a clearer idea. 3. Yes, setMaxRequestsPerConnection is definitely making a difference.
– Bikas Katwal
Nov 15 '18 at 14:04
answered Nov 15 '18 at 13:31 by Nadav Har'El