How to join pandas DataFrames based on list values in a column [duplicate]





This question already has an answer here:




  • How to unnest (explode) a column in a pandas DataFrame?

    6 answers




There are two pandas DataFrames:



df_A = pd.DataFrame([['r1', ['a','b']], ['r2',['aabb','b']], ['r3', ['xyz']]], columns=['col1', 'col2'])

col1 col2
r1 [a, b]
r2 [aabb, b]
r3 [xyz]


df_B = pd.DataFrame([['a', 10], ['b',2]], columns=['C1', 'C2'])

C1 C2
a 10
b 2


I want to join the two DataFrames so that df_C is:



col1  C1    C2
r1    a     10
r1    b     2
r2    aabb  0
r2    b     2
r3    xyz   0









Marked as duplicate by Sandeep Kadapa, cs95 (pandas badge).
Users with the pandas badge can single-handedly close pandas questions as duplicates and reopen them as needed.

Nov 17 '18 at 8:04


This question has been asked before and already has an answer. If those answers do not fully address your question, please ask a new question.



















  • Thanks. I guess it might need expert-level knowledge to understand that "unnesting" is the same as what I was looking for.

    – Watt
    Nov 17 '18 at 20:27




















python pandas






asked Nov 17 '18 at 7:25









Watt




1 Answer

You need:



import numpy as np
import pandas as pd

df = pd.DataFrame([['r1', ['a','b']], ['r2',['aabb','b']], ['r3', ['xyz']]], columns=['col1', 'col2'])

# Repeat each col1 value once per list element and flatten col2 into one array
df = pd.DataFrame({'col1': np.repeat(df.col1.values, df.col2.str.len()),
                   'C1': np.concatenate(df.col2.values)})

df_B = pd.DataFrame([['a', 10], ['b',2]], columns=['C1', 'C2'])
# Map each C1 key to its C2 value
lookup = dict(zip(df_B.C1, df_B.C2))
# {'a': 10, 'b': 2}

# Keys missing from the lookup get 0
df['C2'] = df['C1'].apply(lambda x: lookup.get(x, 0))

print(df)


Output:



  col1    C1  C2
0   r1     a  10
1   r1     b   2
2   r2  aabb   0
3   r2     b   2
4   r3   xyz   0
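
As a possible alternative to the apply/lambda lookup, Series.map with a lookup Series followed by fillna(0) should give the same C2 column. This is only a sketch: the name c2_by_c1 is made up for illustration, the flattened frame is typed in by hand, and it assumes the C1 values in df_B are unique.

```python
import pandas as pd

# Flattened frame, typed in by hand to match the output shown above (without C2).
df = pd.DataFrame({'col1': ['r1', 'r1', 'r2', 'r2', 'r3'],
                   'C1': ['a', 'b', 'aabb', 'b', 'xyz']})
df_B = pd.DataFrame([['a', 10], ['b', 2]], columns=['C1', 'C2'])

# Hypothetical helper: a Series of C2 values indexed by C1 (assumes unique C1).
c2_by_c1 = df_B.set_index('C1')['C2']

# map() yields NaN for keys that are not in the index; replace those with 0.
df['C2'] = df['C1'].map(c2_by_c1).fillna(0).astype(int)

print(df)
```

The astype(int) is there because the intermediate NaN values turn the column into floats.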


Edit



The code below gives the length of the list in each row (df here is the original DataFrame, before it is overwritten with the flattened one).



print(df.col2.str.len())

# 0 2
# 1 2
# 2 1


np.repeat repeats each value of col1 as many times as the corresponding length obtained above,
e.g. r1 and r2 are each repeated twice.



print(np.repeat(df.col1.values, df.col2.str.len()))
# ['r1' 'r1' 'r2' 'r2' 'r3']


Using np.concatenate on col2.values flattens the lists into a single 1D array:



print(np.concatenate(df.col2.values))
# ['a' 'b' 'aabb' 'b' 'xyz']
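
On pandas 0.25 or later (released after this answer was written), the same result can probably be obtained with DataFrame.explode and a left merge instead of np.repeat and np.concatenate. A minimal sketch, assuming that pandas version; flat is a hypothetical intermediate name, not something from the question:

```python
import pandas as pd

df_A = pd.DataFrame([['r1', ['a', 'b']], ['r2', ['aabb', 'b']], ['r3', ['xyz']]],
                    columns=['col1', 'col2'])
df_B = pd.DataFrame([['a', 10], ['b', 2]], columns=['C1', 'C2'])

# explode() emits one row per list element (requires pandas >= 0.25);
# rename col2 so it shares the key name C1 with df_B.
flat = df_A.explode('col2').rename(columns={'col2': 'C1'})

# A left merge keeps every exploded row; keys absent from df_B get NaN in C2.
df_C = flat.merge(df_B, on='C1', how='left')

# Replace the missing values with 0 and restore the integer dtype.
df_C['C2'] = df_C['C2'].fillna(0).astype(int)

print(df_C)
```

A left merge keeps the rows in the order of the exploded frame, so df_C should come out in the same order as the output shown in the answer.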






  • Thanks, can you please explain what you are doing here df= pd.DataFrame({'col1':np.repeat(df.col1.values, df.col2.str.len()), 'C1':np.concatenate(df.col2.values)})

    – Watt
    Nov 17 '18 at 20:41






  • @Watt I have edited my answer. Hope it is helpful.

    – AkshayNevrekar
    Nov 18 '18 at 3:39


















edited Nov 18 '18 at 3:39

























answered Nov 17 '18 at 7:57









AkshayNevrekar












