Low Cassandra write throughput: 1500-2000 writes per second on a 6-node cluster



























Cassandra cluster specs:

Nodes: 6
Storage: 1536 GB
Cores: 48
RAM: 168 GB

Latency to the Cassandra cluster from my local machine: 330-390 ms

I am using the Cassandra Java driver, spark-cassandra-connector_2.11 version 2.3.2.

Cluster configuration in the Java driver:



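import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Cluster.Builder;
import com.datastax.driver.core.HostDistance;
import com.datastax.driver.core.PoolingOptions;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.SocketOptions;
import com.google.common.base.Strings; // Guava

// COMMA_SEPARATOR, SOCKET_CONNECT_TIMEOUT, SOCKET_READ_TIMEOUT and the static
// fields cluster and session are defined elsewhere in the class.
private static Session connect(
        final String node, final Integer port, final String userName, final String password) {

    // node is a comma-separated list of contact points.
    Builder b = Cluster.builder().addContactPoints(node.split(COMMA_SEPARATOR));
    if (!Strings.isNullOrEmpty(userName) && !Strings.isNullOrEmpty(password)) {
        b.withCredentials(userName, password);
    }

    if (port != null && port != 0) {
        b.withPort(port);
    }

    // Fixed pool of 3 connections per host (core == max), with raised
    // per-connection request limits.
    PoolingOptions poolingOptions = new PoolingOptions();
    poolingOptions
            .setMaxRequestsPerConnection(HostDistance.LOCAL, 32768)
            .setMaxRequestsPerConnection(HostDistance.REMOTE, 10000)
            .setMaxConnectionsPerHost(HostDistance.LOCAL, 3)
            .setMaxConnectionsPerHost(HostDistance.REMOTE, 3)
            .setNewConnectionThreshold(HostDistance.LOCAL, 3)
            .setNewConnectionThreshold(HostDistance.REMOTE, 3)
            .setCoreConnectionsPerHost(HostDistance.LOCAL, 3)
            .setCoreConnectionsPerHost(HostDistance.REMOTE, 3);

    b.withSocketOptions(
            new SocketOptions()
                    .setConnectTimeoutMillis(SOCKET_CONNECT_TIMEOUT)
                    .setReadTimeoutMillis(SOCKET_READ_TIMEOUT));
    b.withPoolingOptions(poolingOptions);

    cluster = b.build();
    session = cluster.connect();

    return session;
}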


Below is my test table:



CREATE TABLE my_keyspace.test_table (
    id int PRIMARY KEY
);


To write to Cassandra, I call session.executeAsync, store the returned futures in a list, and wait for all of them to complete.
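
A minimal sketch of this write pattern, assuming a simulated write in place of session.executeAsync so that it runs without a cluster. The names AsyncWriteSketch, fakeWrite and writeAll are hypothetical and not part of the driver or of the original program:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

final class AsyncWriteSketch {
    // Hypothetical stand-in for session.executeAsync(...): the returned
    // future completes after latencyMillis, like a write acknowledged later.
    static CompletableFuture<Integer> fakeWrite(
            ScheduledExecutorService timer, int id, long latencyMillis) {
        CompletableFuture<Integer> done = new CompletableFuture<>();
        timer.schedule(() -> { done.complete(id); }, latencyMillis, TimeUnit.MILLISECONDS);
        return done;
    }

    // Issues every write without blocking, stores the futures in a list,
    // waits for all of them, and returns how many completed normally.
    static int writeAll(int total, long latencyMillis) {
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        try {
            List<CompletableFuture<Integer>> futures = new ArrayList<>(total);
            for (int id = 0; id < total; id++) {
                futures.add(fakeWrite(timer, id, latencyMillis));
            }
            CompletableFuture.allOf(futures.toArray(new CompletableFuture[0])).join();
            int completed = 0;
            for (CompletableFuture<Integer> f : futures) {
                if (f.isDone() && !f.isCompletedExceptionally()) {
                    completed++;
                }
            }
            return completed;
        } finally {
            timer.shutdown();
        }
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        int written = writeAll(1000, 50);
        long elapsedMillis = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        System.out.println(written + " simulated writes in " + elapsedMillis + " ms");
    }
}
```

In this simulation nothing limits how many writes are outstanding at once, so the total time stays close to a single write's latency. With a real driver, the pooling limits cap the number of in-flight requests.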



When I do 100000 writes, it takes 50-65 seconds.
Is it supposed to be this slow, or is there something I am missing in the configuration?
I have already tried several settings in the socket options and pooling options, but this is the best I got.



































  • That's very low throughput. How much memory is allocated to Cassandra? Do you see anything in the logs?

    – Alex Ott
    Nov 15 '18 at 9:00











  • I haven't checked the logs, but it should be 168/6 GB per node. This is a dedicated Cassandra cluster.

    – Bikas Katwal
    Nov 15 '18 at 14:11











  • No, in your setup the heap will be 1/4 of the memory available on the machine, so it's around 7 GB... I would recommend increasing it to 12 or 16 GB explicitly.

    – Alex Ott
    Nov 15 '18 at 14:31











  • Sure. The cluster is not owned by us, but I will definitely check.

    – Bikas Katwal
    Nov 15 '18 at 14:57
















cassandra datastax-java-driver














edited Nov 15 '18 at 8:59









Alex Ott











asked Nov 15 '18 at 6:26









Bikas Katwal


























1 Answer














The first thing I would check is whether your Cassandra servers are running at 100% CPU utilization. If they aren't, and assuming the servers are not bottlenecked on disk (1500 writes per second is no problem even for a spinning disk), then the bottleneck has to be somewhere else.



One possibility you should always rule out first is that the client is the bottleneck, i.e., check that it isn't using 100% CPU.



Then, you said that "Latency to cassandra cluster from my local is 330ms". Is this the ping time between your test machine and the Cassandra cluster? If so, you may have one of two kinds of problem. First, this may be some sort of low-bandwidth WAN that really can't support more than 2000 requests per second, but I doubt that. The other possibility is that your client simply doesn't have enough concurrency. With 1/3 second latency, achieving 2000 writes per second requires the client to keep about 666 requests in flight at once (2000 × 1/3 ≈ 666). Is the setMaxRequestsPerConnection() value you set really taking effect? If it isn't, the default (according to https://docs.datastax.com/en/developer/java-driver/2.1/manual/pooling/ ) is 256 requests per connection, times the 3 connections you set, which gives 768, close to the 666 above.
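
The arithmetic above is Little's law: requests in flight = throughput × latency. A minimal sketch of the calculation follows; the class and method names are hypothetical, the 768 limit is the figure estimated above (an assumption about which default applies), and the 330-390 ms latencies come from the question:

```java
final class InFlightMath {
    // In-flight requests needed to sustain writesPerSecond when each
    // request takes latencyMillis to be acknowledged (rounded up).
    static long requiredInFlight(long writesPerSecond, long latencyMillis) {
        return (writesPerSecond * latencyMillis + 999) / 1000;
    }

    // Upper bound on writes per second when at most maxInFlight requests
    // can be outstanding and each takes latencyMillis (rounded down).
    static long maxWritesPerSecond(long maxInFlight, long latencyMillis) {
        return maxInFlight * 1000 / latencyMillis;
    }

    public static void main(String[] args) {
        System.out.println(requiredInFlight(2000, 333));   // prints 666
        System.out.println(maxWritesPerSecond(768, 330));  // prints 2327
        System.out.println(maxWritesPerSecond(768, 390));  // prints 1969
    }
}
```

If 768 really were the effective limit, the bound would be roughly 2000-2300 writes per second at the reported latency, which is in the same range as the observed throughput. That is consistent with a concurrency limit but does not prove it.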



And of course it can be many other things. It's hard to guess without more data.






























  • Thanks @nyh. 1. Yes, it's the ping response time. 2. I tried the same setup locally with 2 nodes and get 10000-15000 writes per second, so I feel it has to do with bandwidth. I haven't confirmed that yet; I will try running the same program in a VM close to the Cassandra cluster, which should give me a clearer idea. 3. Yes, setMaxRequestsPerConnection is definitely making a difference.

    – Bikas Katwal
    Nov 15 '18 at 14:04























answered Nov 15 '18 at 13:31









Nadav Har'El













