What is the difference between `json.loads()` and `.apply(json.loads)`?

I am quite new to coding, and I am trying to work with the TMDB_5000 dataset from Kaggle.

I ran into a problem when trying to deal with JSON-formatted data like this:

[{"cast_id": 242, "character": "Jake Sully", "credit_id": "5602a8a7c3a3685532001c9a", "gender": 2, "id": 65731, "name": "Sam Worthington", "order": 0}, {"cast_id": 3, "character": "Neytiri", "credit_i...}]

I am trying to use json.loads() to parse the data. The code is credits['cast'] = json.loads(credits['cast']), but it gives me an error like this:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
in ()
----> 1 credits['cast'] = json.loads(credits['cast'])

/anaconda3/lib/python3.6/json/__init__.py in loads(s, encoding, cls, object_hook, parse_float, parse_int, parse_constant, object_pairs_hook, **kw)
    346         if not isinstance(s, (bytes, bytearray)):
    347             raise TypeError('the JSON object must be str, bytes or bytearray, '
--> 348                             'not {!r}'.format(s.__class__.__name__))
    349         s = s.decode(detect_encoding(s), 'surrogatepass')
    350

TypeError: the JSON object must be str, bytes or bytearray, not 'Series'

However, the code credits['cast'] = credits['cast'].apply(json.loads) works. I am very confused, because I thought there was no difference between these two lines of code.

Can anyone explain that to me?

edited Nov 11 at 7:21

pygo

1,654416

asked Nov 11 at 3:53

Qiaoyi Li

To make it clear, cell number 7 works.
– Qiaoyi Li
Nov 11 at 3:53

When I try to load JSON-formatted data, `credits['cast'] = json.loads(credits['cast'])` doesn't work and gives me the error "the JSON object must be str, bytes or bytearray, not 'Series'". However, `credits['cast'] = credits['cast'].apply(json.loads)` works. I don't understand; is there any difference between these two lines of code?
– Qiaoyi Li
Nov 11 at 4:04

Wouldn't it be better to first load your data and then do the pandas operations?
– pygo
Nov 11 at 4:32

python pandas

3 Answers

accepted

The issue is that your credits variable is a pandas DataFrame, so credits['cast'] is a Series. The json.loads function doesn't know how to deal with pandas data types, so you get an error when you call json.loads(credits['cast']).

The Series type, however, has an apply method that accepts a function and calls it on each value the Series contains. That's why credits['cast'].apply(json.loads) works: it passes json.loads as the argument to apply, which then calls it on each string in turn.
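
To make this concrete, here is a minimal sketch. The two-row credits DataFrame is made up for illustration and only stands in for the real TMDB data; it suggests that .apply(json.loads) behaves roughly like calling json.loads once per element.

```python
import json

import pandas as pd

# Hypothetical stand-in for the real dataset: each cell holds a JSON string.
credits = pd.DataFrame({
    "cast": [
        '[{"cast_id": 242, "character": "Jake Sully"}]',
        '[{"cast_id": 3, "character": "Neytiri"}]',
    ]
})

# Series.apply calls json.loads once per element and collects the results.
parsed = credits["cast"].apply(json.loads)

# Roughly the same thing written as an explicit loop over the values.
parsed_by_hand = pd.Series(
    [json.loads(cell) for cell in credits["cast"]],
    index=credits.index,
)

print(type(credits["cast"].iloc[0]))  # each cell is a str before parsing
print(type(parsed.iloc[0]))           # each cell is a list after parsing
print(parsed.tolist() == parsed_by_hand.tolist())
```

The explicit loop is only there to show what apply does conceptually; in practice the apply form is the shorter way to write it.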

answered Nov 11 at 4:23

Blckknght

The following code:

credits['cast'] = credits['cast'].apply(json.loads)

applies the function json.loads to each element of credits['cast'] (each element being a string). The result is a Series of decoded objects.

The following code:

credits['cast'] = json.loads(credits['cast'])

attempts to apply the same function to the Series credits['cast'] as a whole, but json.loads accepts only a str, bytes, or bytearray, not a Series.
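
The failing case can be reproduced in a few lines. This is a sketch with a made-up two-element Series named cast (an assumption, not the real column); the exact wording of the error message may differ slightly between Python versions.

```python
import json

import pandas as pd

# Hypothetical Series standing in for credits['cast'].
cast = pd.Series(['[{"cast_id": 242}]', '[{"cast_id": 3}]'])

# json.loads expects a single str/bytes/bytearray document,
# so handing it the whole Series raises TypeError.
try:
    json.loads(cast)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)

# Passing one element works, because each element is a str.
first = json.loads(cast.iloc[0])
print(first)
```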

answered Nov 11 at 4:23

DYZ

Thank you, it is very explicitly explained~😊
– Qiaoyi Li
Nov 11 at 6:33

A detailed explanation has already been provided, but I would like to add that if you are using pandas to read and process the data, you can use:

import pandas as pd

d_list = [{"cast_id": 242, "character": "Jake Sully", "credit_id": "5602a8a7c3a3685532001c9a", "gender": 2, "id": 65731, "name": "Sam Worthington", "order": 0}, {"cast_id": 3, "character": "Neytiri"}]

Create a DataFrame using DataFrame.from_dict:

df = pd.DataFrame.from_dict(d_list)

print(df)

   cast_id   character                 credit_id  gender       id             name  order
0      242  Jake Sully  5602a8a7c3a3685532001c9a     2.0  65731.0  Sam Worthington    0.0
1        3     Neytiri                       NaN     NaN      NaN              NaN    NaN

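Another approach suited to this purpose is pd.read_json with orient='records'. Note that read_json expects JSON text (a string, path, or file-like object) rather than a Python list, so the list is serialized first:

import io
import json
import pandas as pd

d_list = [{"cast_id": 242, "character": "Jake Sully", "credit_id": "5602a8a7c3a3685532001c9a", "gender": 2, "id": 65731, "name": "Sam Worthington", "order": 0}, {"cast_id": 3, "character": "Neytiri"}]

# read_json needs JSON text; wrapping it in StringIO makes it file-like
df = pd.read_json(io.StringIO(json.dumps(d_list)), orient='records')

print(df)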
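
For data that arrives as a JSON string, as in the question, the two ideas can be combined. The following is a sketch only: the cell string is a shortened, made-up example shaped like one value of credits['cast'], and cast_df is a hypothetical name.

```python
import json

import pandas as pd

# Hypothetical JSON string, shaped like one cell of credits['cast'].
cell = (
    '[{"cast_id": 242, "character": "Jake Sully"},'
    ' {"cast_id": 3, "character": "Neytiri"}]'
)

# Decode the string first, then hand the resulting list of dicts to pandas.
records = json.loads(cell)
cast_df = pd.DataFrame(records)

print(cast_df)
```

Each dictionary becomes one row, and the dictionary keys become the column names.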

edited Nov 11 at 4:52

answered Nov 11 at 4:45

pygo

You can accept the answer that is most useful in your case by marking it green, using the check mark on the left-hand side of the answer.
– pygo
Nov 11 at 6:41
