Adding a new column in the first ordinal position in a pyspark dataframe
I have a pyspark data frame like:
+------+------+------+
| col1 | col2 | col3 |
+------+------+------+
| 25   | 01   | 2    |
| 23   | 12   | 5    |
| 11   | 22   | 8    |
+------+------+------+
and I want to create a new DataFrame by adding a new column in the first position, like this:
+------------+------+------+------+
| new_column | col1 | col2 | col3 |
+------------+------+------+------+
| 0          | 25   | 01   | 2    |
| 0          | 23   | 12   | 5    |
| 0          | 11   | 22   | 8    |
+------------+------+------+------+
I know I can add a column with:
from pyspark.sql.functions import lit
df.withColumn("new_column", lit(0))
but that appends the column at the end, like this:
+------+------+------+------------+
| col1 | col2 | col3 | new_column |
+------+------+------+------------+
| 25   | 01   | 2    | 0          |
| 23   | 12   | 5    | 0          |
| 11   | 22   | 8    | 0          |
+------+------+------+------------+
python apache-spark pyspark apache-spark-sql
Add it using withColumn, then select('new_column', other columns).
– Suresh, Nov 16 '18 at 12:46
edited Nov 16 '18 at 15:38 by pault
asked Nov 16 '18 at 11:16 by PRASHANT KUMAR GUPTA
3 Answers
After adding the column with withColumn, you can reorder the columns using select:
df = df.select('new_column', 'col1', 'col2', 'col3')
df.show()
answered Nov 16 '18 at 13:45 by Terry
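The column list passed to select does not have to be typed out by hand. As a minimal sketch, assuming df is an ordinary PySpark DataFrame, the order can be derived from df.columns instead. The names move_to_front and add_first_column are hypothetical helpers for illustration, not part of PySpark:

```python
def move_to_front(columns, name):
    """Return the column names with `name` first and the rest in their original order."""
    if name not in columns:
        raise ValueError(f"column {name!r} not found in {columns}")
    return [name] + [c for c in columns if c != name]


def add_first_column(df, name, column):
    """Add `column` to `df` under `name` and place it in the first position.

    `df` is expected to be a PySpark DataFrame and `column` a Column
    expression, for example lit(0) from pyspark.sql.functions.
    """
    # withColumn appends a new column at the end (or replaces one with the same name).
    with_new = df.withColumn(name, column)
    # select then reorders without hardcoding the other column names.
    return with_new.select(*move_to_front(with_new.columns, name))
```

For the DataFrame in the question, add_first_column(df, "new_column", lit(0)), with lit imported from pyspark.sql.functions, should give the column order new_column, col1, col2, col3.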
df.select(['new_column', 'col1', 'col2', 'col3'])
answered Nov 16 '18 at 15:13 by Kris
You can always reorder the columns in a Spark DataFrame using select, as shown in this post.
In this case, you can also achieve the desired output in one step using select and alias, as follows:
from pyspark.sql.functions import lit
df = df.select(lit(0).alias("new_column"), "*")
This is logically equivalent to the following SQL code:
SELECT 0 AS new_column, * FROM df
answered Nov 16 '18 at 15:37 by pault
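To run the SQL form from PySpark, the DataFrame first has to be registered under a name the query can refer to. The sketch below shows both variants side by side. The function names and the default view name "df" are assumptions made for illustration, and spark is assumed to be an existing SparkSession:

```python
def prepend_column(df, column, name):
    """DataFrame API variant: `column` (e.g. lit(0)) first, then every existing column."""
    return df.select(column.alias(name), "*")


def zero_first_query(view_name):
    """Build the SQL text that selects a constant 0 column ahead of all others.

    `view_name` is interpolated directly, so it should be a trusted identifier.
    """
    return f"SELECT 0 AS new_column, * FROM {view_name}"


def prepend_zero_sql(spark, df, view_name="df"):
    """SQL variant: register `df` as a temporary view, then query it."""
    df.createOrReplaceTempView(view_name)
    return spark.sql(zero_first_query(view_name))
```

Both functions return a new DataFrame and leave the original unchanged. The SQL variant only works once the temporary view exists, which is why the registration step comes first.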