How to create a new column using withColumn to concatenate two numeric columns as a String? [duplicate]




























This question already has an answer here:

  • Concatenate columns in Apache Spark DataFrame (10 answers)




I have the following DataFrame:



val employees = sc.parallelize(Array[(String, Int, BigInt)](
  ("Rafferty", 31, 222222222), ("Jones", 33, 111111111),
  ("Heisenberg", 33, 222222222), ("Robinson", 34, 111111111),
  ("Smith", 34, 333333333), ("Williams", 15, 222222222)
)).toDF("LastName", "DepartmentID", "Code")

employees.show()

+----------+------------+---------+
|  LastName|DepartmentID|     Code|
+----------+------------+---------+
|  Rafferty|          31|222222222|
|     Jones|          33|111111111|
|Heisenberg|          33|222222222|
|  Robinson|          34|111111111|
|     Smith|          34|333333333|
|  Williams|          15|222222222|
+----------+------------+---------+


I want to create another column, personal_id, that concatenates DepartmentID and Code. Example: Rafferty => 31222222222



So I wrote the following code:



val anotherdf = employees.withColumn("personal_id", $"DepartmentID".cast("String") + $"Code".cast("String"))


+----------+------------+---------+------------+
|  LastName|DepartmentID|     Code| personal_id|
+----------+------------+---------+------------+
|  Rafferty|          31|222222222|2.22222253E8|
|     Jones|          33|111111111|1.11111144E8|
|Heisenberg|          33|222222222|2.22222255E8|
|  Robinson|          34|111111111|1.11111145E8|
|     Smith|          34|333333333|3.33333367E8|
|  Williams|          15|222222222|2.22222237E8|
+----------+------------+---------+------------+


But personal_id came out as a double instead of a string:



anotherdf.printSchema

root
|-- LastName: string (nullable = true)
|-- DepartmentID: integer (nullable = false)
|-- Code: decimal(38,0) (nullable = true)
|-- personal_id: double (nullable = true)






















Marked as duplicate by Shaido, user6910411 on Nov 16 '18 at 10:28.























scala apache-spark apache-spark-sql






asked Nov 16 '18 at 4:44 by Haha TTpro






























1 Answer














I should use concat instead of +:


import org.apache.spark.sql.functions.concat
val anotherdf2 = employees.withColumn("personal_id", concat($"DepartmentID".cast("String"), $"Code".cast("String")))


+----------+------------+---------+-----------+
|  LastName|DepartmentID|     Code|personal_id|
+----------+------------+---------+-----------+
|  Rafferty|          31|222222222|31222222222|
|     Jones|          33|111111111|33111111111|
|Heisenberg|          33|222222222|33222222222|
|  Robinson|          34|111111111|34111111111|
|     Smith|          34|333333333|34333333333|
|  Williams|          15|222222222|15222222222|
+----------+------------+---------+-----------+
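As far as I can tell, the original attempt went wrong because + on a Column is arithmetic addition, not string concatenation: Spark implicitly casts the two string operands back to numbers and sums them, which matches the double values in the question (31 + 222222222 = 2.22222253E8). Below is a minimal sketch that puts both versions side by side. The local SparkSession, the two-row sample and the names added and joined are my own assumptions for illustration, not part of the original code.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, concat}
import org.apache.spark.sql.types.StringType

// Assumption: a throwaway local session; in spark-shell a `spark` value already exists.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("concat-sketch")
  .getOrCreate()
import spark.implicits._

// Two-row sample with the same column types as the question (Int and BigInt).
val employees = Seq(
  ("Rafferty", 31, BigInt(222222222)),
  ("Jones", 33, BigInt(111111111))
).toDF("LastName", "DepartmentID", "Code")

// `+` is arithmetic: the string operands are cast to numbers and added.
val added = employees.withColumn(
  "personal_id",
  col("DepartmentID").cast("string") + col("Code").cast("string"))

// concat joins the string values end to end.
val joined = employees.withColumn(
  "personal_id",
  concat(col("DepartmentID").cast("string"), col("Code").cast("string")))

added.printSchema()   // personal_id is reported as double
joined.printSchema()  // personal_id is reported as string
joined.show()
```

One thing to keep in mind: concat returns null if any of its inputs is null. If DepartmentID or Code can be missing, concat_ws with an empty separator may be worth a look, since it skips null inputs.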





answered Nov 16 '18 at 4:51 by Haha TTpro
















