How to aggregate several columns into a JSON file in HIVE and avoid nulls












1














user_id reservation_id  nights  price 
--------------------------------------
AAA 10001 1 100
AAA 10002 1 120
BBB 20003 7 350
ccc 10005 150
DDD 10007 3
CCC 10006 5


to



user_id    reservation_details
AAA [{"nights":"1", "price":"100"}, {"nights":"1","price":"120"}]
BBB [{"nights":"7", "price":"350"}]
CCC [{"price":"150"}, {"nights":"3"}]
DDD [{"nights":"5"}]


Here my query is



select user_id
,concat("
{",concat_ws(',',collect_list(concat(string(reservation_id),":

{'nights':",string(nights),",'price':",string(price),"}"))),"}") as
reservation_details
from mytable
group by user_id


I want to eliminate the columns which have value as nulls and convert that single quotes into double quotes which looks like a exact JSON.










share|improve this question





























    1














    user_id reservation_id  nights  price 
    --------------------------------------
    AAA 10001 1 100
    AAA 10002 1 120
    BBB 20003 7 350
    ccc 10005 150
    DDD 10007 3
    CCC 10006 5


    to



    user_id    reservation_details
    AAA [{"nights":"1", "price":"100"}, {"nights":"1","price":"120"}]
    BBB [{"nights":"7", "price":"350"}]
    CCC [{"price":"150"}, {"nights":"3"}]
    DDD [{"nights":"5"}]


    Here my query is



    select user_id
    ,concat("
    {",concat_ws(',',collect_list(concat(string(reservation_id),":

    {'nights':",string(nights),",'price':",string(price),"}"))),"}") as
    reservation_details
    from mytable
    group by user_id


    I want to eliminate the columns which have value as nulls and convert that single quotes into double quotes which looks like a exact JSON.










    share|improve this question



























      1












      1








      1







      user_id reservation_id  nights  price 
      --------------------------------------
      AAA 10001 1 100
      AAA 10002 1 120
      BBB 20003 7 350
      ccc 10005 150
      DDD 10007 3
      CCC 10006 5


      to



      user_id    reservation_details
      AAA [{"nights":"1", "price":"100"}, {"nights":"1","price":"120"}]
      BBB [{"nights":"7", "price":"350"}]
      CCC [{"price":"150"}, {"nights":"3"}]
      DDD [{"nights":"5"}]


      Here my query is



      select user_id
      ,concat("
      {",concat_ws(',',collect_list(concat(string(reservation_id),":

      {'nights':",string(nights),",'price':",string(price),"}"))),"}") as
      reservation_details
      from mytable
      group by user_id


      I want to eliminate the columns which have value as nulls and convert that single quotes into double quotes which looks like a exact JSON.










      share|improve this question















      user_id reservation_id  nights  price 
      --------------------------------------
      AAA 10001 1 100
      AAA 10002 1 120
      BBB 20003 7 350
      ccc 10005 150
      DDD 10007 3
      CCC 10006 5


      to



      user_id    reservation_details
      AAA [{"nights":"1", "price":"100"}, {"nights":"1","price":"120"}]
      BBB [{"nights":"7", "price":"350"}]
      CCC [{"price":"150"}, {"nights":"3"}]
      DDD [{"nights":"5"}]


      Here my query is



      select user_id
      ,concat("
      {",concat_ws(',',collect_list(concat(string(reservation_id),":

      {'nights':",string(nights),",'price':",string(price),"}"))),"}") as
      reservation_details
      from mytable
      group by user_id


      I want to eliminate the columns which have value as nulls and convert that single quotes into double quotes which looks like a exact JSON.







      arrays json hive






      share|improve this question















      share|improve this question













      share|improve this question




      share|improve this question








      edited Nov 12 '18 at 22:35









      Vamsi Prabhala

      40.5k41939




      40.5k41939










      asked Nov 12 '18 at 22:32









      Hadoop User

      63




      63
























          1 Answer
          1






          active

          oldest

          votes


















          0














          Using in-built datatypes map and array along with a case expression to handle nulls.



          select user_id,collect_list(map_nights_price)
          from (select user_id,
          case when nights is null then map('price',price)
          when price is null then map('nights',nights)
          else map('nights',nights,'price',price) end as map_nights_price
          from mytable
          where not (price is null and nights is null) --ignore row where price and nights are null
          ) t
          group by user_id





          share|improve this answer





















          • Tried and it error ed. UDFArgumentTypeException Only primitive type arguments are accepted but map<string,string> was passed as parameter 1.
            – Hadoop User
            Nov 13 '18 at 0:27










          • not sure about the underlying data..it works for me with the example data shown.
            – Vamsi Prabhala
            Nov 13 '18 at 1:34












          • all my hive columns dataypes are string only. Also i need in JSON format. If i have 10 more cloumns and need to check each column whether it is null or not? IS there any dyamic way of checking nulls instead of hardcoding. I know in Java we can achive thru UDF. but want to know in HIVE.
            – Hadoop User
            Nov 13 '18 at 12:36













          Your Answer






          StackExchange.ifUsing("editor", function () {
          StackExchange.using("externalEditor", function () {
          StackExchange.using("snippets", function () {
          StackExchange.snippets.init();
          });
          });
          }, "code-snippets");

          StackExchange.ready(function() {
          var channelOptions = {
          tags: "".split(" "),
          id: "1"
          };
          initTagRenderer("".split(" "), "".split(" "), channelOptions);

          StackExchange.using("externalEditor", function() {
          // Have to fire editor after snippets, if snippets enabled
          if (StackExchange.settings.snippets.snippetsEnabled) {
          StackExchange.using("snippets", function() {
          createEditor();
          });
          }
          else {
          createEditor();
          }
          });

          function createEditor() {
          StackExchange.prepareEditor({
          heartbeatType: 'answer',
          autoActivateHeartbeat: false,
          convertImagesToLinks: true,
          noModals: true,
          showLowRepImageUploadWarning: true,
          reputationToPostImages: 10,
          bindNavPrevention: true,
          postfix: "",
          imageUploader: {
          brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
          contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
          allowUrls: true
          },
          onDemand: true,
          discardSelector: ".discard-answer"
          ,immediatelyShowMarkdownHelp:true
          });


          }
          });














          draft saved

          draft discarded


















          StackExchange.ready(
          function () {
          StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53271049%2fhow-to-aggregate-several-columns-into-a-json-file-in-hive-and-avoid-nulls%23new-answer', 'question_page');
          }
          );

          Post as a guest















          Required, but never shown

























          1 Answer
          1






          active

          oldest

          votes








          1 Answer
          1






          active

          oldest

          votes









          active

          oldest

          votes






          active

          oldest

          votes









          0














          Using in-built datatypes map and array along with a case expression to handle nulls.



          select user_id,collect_list(map_nights_price)
          from (select user_id,
          case when nights is null then map('price',price)
          when price is null then map('nights',nights)
          else map('nights',nights,'price',price) end as map_nights_price
          from mytable
          where not (price is null and nights is null) --ignore row where price and nights are null
          ) t
          group by user_id





          share|improve this answer





















          • Tried and it error ed. UDFArgumentTypeException Only primitive type arguments are accepted but map<string,string> was passed as parameter 1.
            – Hadoop User
            Nov 13 '18 at 0:27










          • not sure about the underlying data..it works for me with the example data shown.
            – Vamsi Prabhala
            Nov 13 '18 at 1:34












          • all my hive columns dataypes are string only. Also i need in JSON format. If i have 10 more cloumns and need to check each column whether it is null or not? IS there any dyamic way of checking nulls instead of hardcoding. I know in Java we can achive thru UDF. but want to know in HIVE.
            – Hadoop User
            Nov 13 '18 at 12:36


















          0














          Using in-built datatypes map and array along with a case expression to handle nulls.



          select user_id,collect_list(map_nights_price)
          from (select user_id,
          case when nights is null then map('price',price)
          when price is null then map('nights',nights)
          else map('nights',nights,'price',price) end as map_nights_price
          from mytable
          where not (price is null and nights is null) --ignore row where price and nights are null
          ) t
          group by user_id





          share|improve this answer





















          • Tried and it error ed. UDFArgumentTypeException Only primitive type arguments are accepted but map<string,string> was passed as parameter 1.
            – Hadoop User
            Nov 13 '18 at 0:27










          • not sure about the underlying data..it works for me with the example data shown.
            – Vamsi Prabhala
            Nov 13 '18 at 1:34












          • all my hive columns dataypes are string only. Also i need in JSON format. If i have 10 more cloumns and need to check each column whether it is null or not? IS there any dyamic way of checking nulls instead of hardcoding. I know in Java we can achive thru UDF. but want to know in HIVE.
            – Hadoop User
            Nov 13 '18 at 12:36
















          0












          0








          0






          Using in-built datatypes map and array along with a case expression to handle nulls.



          select user_id,collect_list(map_nights_price)
          from (select user_id,
          case when nights is null then map('price',price)
          when price is null then map('nights',nights)
          else map('nights',nights,'price',price) end as map_nights_price
          from mytable
          where not (price is null and nights is null) --ignore row where price and nights are null
          ) t
          group by user_id





          share|improve this answer












          Using in-built datatypes map and array along with a case expression to handle nulls.



          select user_id,collect_list(map_nights_price)
          from (select user_id,
          case when nights is null then map('price',price)
          when price is null then map('nights',nights)
          else map('nights',nights,'price',price) end as map_nights_price
          from mytable
          where not (price is null and nights is null) --ignore row where price and nights are null
          ) t
          group by user_id






          share|improve this answer












          share|improve this answer



          share|improve this answer










          answered Nov 12 '18 at 22:57









          Vamsi Prabhala

          40.5k41939




          40.5k41939












          • Tried and it error ed. UDFArgumentTypeException Only primitive type arguments are accepted but map<string,string> was passed as parameter 1.
            – Hadoop User
            Nov 13 '18 at 0:27










          • not sure about the underlying data..it works for me with the example data shown.
            – Vamsi Prabhala
            Nov 13 '18 at 1:34












          • all my hive columns dataypes are string only. Also i need in JSON format. If i have 10 more cloumns and need to check each column whether it is null or not? IS there any dyamic way of checking nulls instead of hardcoding. I know in Java we can achive thru UDF. but want to know in HIVE.
            – Hadoop User
            Nov 13 '18 at 12:36




















          • Tried and it error ed. UDFArgumentTypeException Only primitive type arguments are accepted but map<string,string> was passed as parameter 1.
            – Hadoop User
            Nov 13 '18 at 0:27










          • not sure about the underlying data..it works for me with the example data shown.
            – Vamsi Prabhala
            Nov 13 '18 at 1:34












          • all my hive columns dataypes are string only. Also i need in JSON format. If i have 10 more cloumns and need to check each column whether it is null or not? IS there any dyamic way of checking nulls instead of hardcoding. I know in Java we can achive thru UDF. but want to know in HIVE.
            – Hadoop User
            Nov 13 '18 at 12:36


















          Tried and it error ed. UDFArgumentTypeException Only primitive type arguments are accepted but map<string,string> was passed as parameter 1.
          – Hadoop User
          Nov 13 '18 at 0:27




          Tried and it error ed. UDFArgumentTypeException Only primitive type arguments are accepted but map<string,string> was passed as parameter 1.
          – Hadoop User
          Nov 13 '18 at 0:27












          not sure about the underlying data..it works for me with the example data shown.
          – Vamsi Prabhala
          Nov 13 '18 at 1:34






          not sure about the underlying data..it works for me with the example data shown.
          – Vamsi Prabhala
          Nov 13 '18 at 1:34














          all my hive columns dataypes are string only. Also i need in JSON format. If i have 10 more cloumns and need to check each column whether it is null or not? IS there any dyamic way of checking nulls instead of hardcoding. I know in Java we can achive thru UDF. but want to know in HIVE.
          – Hadoop User
          Nov 13 '18 at 12:36






          all my hive columns dataypes are string only. Also i need in JSON format. If i have 10 more cloumns and need to check each column whether it is null or not? IS there any dyamic way of checking nulls instead of hardcoding. I know in Java we can achive thru UDF. but want to know in HIVE.
          – Hadoop User
          Nov 13 '18 at 12:36




















          draft saved

          draft discarded




















































          Thanks for contributing an answer to Stack Overflow!


          • Please be sure to answer the question. Provide details and share your research!

          But avoid



          • Asking for help, clarification, or responding to other answers.

          • Making statements based on opinion; back them up with references or personal experience.


          To learn more, see our tips on writing great answers.





          Some of your past answers have not been well-received, and you're in danger of being blocked from answering.


          Please pay close attention to the following guidance:


          • Please be sure to answer the question. Provide details and share your research!

          But avoid



          • Asking for help, clarification, or responding to other answers.

          • Making statements based on opinion; back them up with references or personal experience.


          To learn more, see our tips on writing great answers.




          draft saved


          draft discarded














          StackExchange.ready(
          function () {
          StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53271049%2fhow-to-aggregate-several-columns-into-a-json-file-in-hive-and-avoid-nulls%23new-answer', 'question_page');
          }
          );

          Post as a guest















          Required, but never shown





















































          Required, but never shown














          Required, but never shown












          Required, but never shown







          Required, but never shown

































          Required, but never shown














          Required, but never shown












          Required, but never shown







          Required, but never shown







          Popular posts from this blog

          Bressuire

          Vorschmack

          Quarantine