Ordering across partitions in Kafka



























I am writing a Kafka producer and need help deciding how to partition the data.
I have a group table and a user table. A group contains multiple users, and a user can be part of only one group at a time.

There are two types of events that I will receive as input and, based on them, write to Kafka:

  1. Events related to users.

  2. Events related to groups.

Whenever a group event happens, all the users in that group must be updated in bulk at the consumer end.
Whenever a user event happens, it must be applied as is at the consumer end.

Also, I want to maintain ordering on the basis of time.

If I create user-level partitioning, then the bulk update won't be possible at the consumer end.

If I create group-level partitioning, then the parallel update of user events won't happen.

I am trying to figure out what options I have here.

















      apache-kafka kafka-consumer-api kafka-producer-api














      edited Nov 16 '18 at 14:41









      cricket_007











      asked Nov 16 '18 at 10:48









Bhawandeep Singla

























          2 Answers















"Also, I want to maintain ordering on the basis of time."

This means that your topics, no matter how many you have, cannot have more than one partition; otherwise you could receive messages out of order.

The exception is if you implement something like sequence IDs in your messages (and can share that sequence across possibly multiple producers).
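To make the sequence-ID idea concrete, here is a minimal sketch of a consumer-side reorder buffer. It assumes every event carries a gap-free, globally increasing sequence number starting at 0; the class name `ReorderBuffer` and the sample payloads are hypothetical and not part of any Kafka API. How the producers would agree on that shared sequence is a separate problem that this sketch does not address.

```python
import heapq
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class ReorderBuffer:
    """Releases events in strict sequence order, holding back any that arrive early."""
    next_seq: int = 0  # assumption: sequence numbers start at 0 and have no gaps
    _heap: List[Tuple[int, str]] = field(default_factory=list)

    def accept(self, seq: int, payload: str) -> List[str]:
        """Take one event and return every payload that can now be delivered in order."""
        if seq < self.next_seq:
            return []  # already delivered; treat as a duplicate and ignore it
        heapq.heappush(self._heap, (seq, payload))
        ready: List[str] = []
        while self._heap and self._heap[0][0] <= self.next_seq:
            s, p = heapq.heappop(self._heap)
            if s == self.next_seq:
                ready.append(p)
                self.next_seq += 1
            # otherwise s < next_seq: a duplicate that was buffered earlier, so drop it
        return ready


# Events as they might arrive when read from several partitions.
arrivals = [
    (1, "user-7 renamed"),
    (0, "group-3 created"),
    (3, "user-9 moved"),
    (2, "group-3 updated"),
]
buffer = ReorderBuffer()
delivered: List[str] = []
for seq, payload in arrivals:
    delivered.extend(buffer.accept(seq, payload))
```

The trade-off is that a single missing sequence number stalls delivery of everything after it, so a real consumer would also need a policy for gaps (for example a timeout).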




"If I create user-level partitioning, then the bulk update won't be possible at the consumer end."

"If I create group-level partitioning, then the parallel update of user events won't happen."

This sounds like a very simple messaging design: a single queue (backed by a single topic with a single partition) that is consumed by multiple consumers. In fact, any pub-sub messaging technology would be sufficient here (e.g. RabbitMQ's fanout exchanges).

Each message on the queue indicates whether it is a group update or a user update; the consumers then filter the input depending on what they are interested in.
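The filtering described above could look roughly like the following sketch. The event shape (`type`, `user_id`, `group_id`, `changes`) and the in-memory dictionaries are assumptions made for illustration; they stand in for the real message format and for the user and group tables. Events are applied in the order they are read, which mirrors reading from a single partition.

```python
from collections import defaultdict
from typing import Dict, FrozenSet, List

# Hypothetical in-memory state standing in for the group and user tables.
group_members: Dict[str, List[str]] = {"g1": ["alice", "bob"], "g2": ["carol"]}
user_state: Dict[str, dict] = defaultdict(dict)

ALL_TYPES: FrozenSet[str] = frozenset({"user", "group"})


def handle_event(event: dict, interested_in: FrozenSet[str] = ALL_TYPES) -> List[str]:
    """Apply one event and return the IDs of the users it updated."""
    if event["type"] not in ALL_TYPES:
        raise ValueError(f"unknown event type: {event['type']!r}")
    if event["type"] not in interested_in:
        return []  # this consumer filters out event types it does not care about
    if event["type"] == "group":
        # A group event fans out to every member of the group as one bulk update.
        members = group_members.get(event["group_id"], [])
        for user_id in members:
            user_state[user_id].update(event["changes"])
        return list(members)
    # A user event touches exactly one user.
    user_state[event["user_id"]].update(event["changes"])
    return [event["user_id"]]


# Events in the order they were read from the single partition.
events = [
    {"type": "user", "user_id": "alice", "changes": {"plan": "free"}},
    {"type": "group", "group_id": "g1", "changes": {"plan": "pro"}},
    {"type": "user", "user_id": "bob", "changes": {"plan": "trial"}},
]
for event in events:
    handle_event(event)
```

Because the group update sits between the two user updates in the log, `alice` ends up with the group's value while `bob` keeps the later user-level value, which is the time ordering the question asks for.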



As for an alternative, one queue for group updates and another for user updates: as I understand it, that would not be enough given the ordering requirement, because a group update could be consumed independently of a user update, breaking the ordering.




















From the Kafka documentation (https://kafka.apache.org/documentation/#intro_consumers):

"Kafka only provides a total order over records within a partition, not between different partitions in a topic. Per-partition ordering combined with the ability to partition data by key is sufficient for most applications. However, if you require a total order over records this can be achieved with a topic that has only one partition, though this will mean only one consumer process per consumer group."

So, if you need a total order, the best you can do is to use a single topic with a single partition.
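To illustrate the per-partition guarantee in the quoted passage, here is a small sketch of how keyed records map to partitions. It is a simulation, not Kafka client code: `partition_for` and `NUM_PARTITIONS` are hypothetical names, and CRC32 is used only as a stand-in hash (Kafka's default partitioner hashes keys with murmur2). The point is only that records sharing a key always land in the same partition and therefore keep their relative order, while nothing orders records across different partitions.

```python
import zlib
from collections import defaultdict
from typing import Dict, List

NUM_PARTITIONS = 4  # hypothetical partition count


def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a record key to a partition index (CRC32 is a stand-in for Kafka's murmur2)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


# (key, value) pairs in production order; here the group ID is used as the key.
records = [
    ("g1", "group g1 updated"),
    ("g2", "group g2 updated"),
    ("g1", "user alice in g1 updated"),
    ("g1", "user bob in g1 updated"),
]

partitions: Dict[int, List[str]] = defaultdict(list)
for key, value in records:
    partitions[partition_for(key)].append(value)

# All g1 records share one partition, so their relative order is preserved.
g1_in_order = [v for v in partitions[partition_for("g1")] if "g1" in v]
```

Keying by group ID is the group-level partitioning from the question: it preserves order within a group but gives no ordering between groups, which is why a total order still requires a single partition.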









































                  answered Nov 16 '18 at 12:19









Adam Kotwasinski




































                          answered Nov 17 '18 at 12:14









bittu






























