what is the difference between `json.loads()` and `.apply(json.loads)`?























I am quite new to coding, and I am working on the TMDB_5000 dataset from Kaggle.

I ran into a problem when trying to deal with JSON-formatted data like this:

    [{"cast_id": 242, "character": "Jake Sully", "credit_id": "5602a8a7c3a3685532001c9a", "gender": 2, "id": 65731, "name": "Sam Worthington", "order": 0}, {"cast_id": 3, "character": "Neytiri", "credit_i...}]

I am trying to use `json.loads()` to parse the data. The code is `credits['cast'] = json.loads(credits['cast'])`, but it gives me an error like this:



    ---------------------------------------------------------------------------
    TypeError                                 Traceback (most recent call last)
    in ()
    ----> 1 credits['cast'] = json.loads(credits['cast'])

    /anaconda3/lib/python3.6/json/__init__.py in loads(s, encoding, cls, object_hook, parse_float, parse_int, parse_constant, object_pairs_hook, **kw)
        346         if not isinstance(s, (bytes, bytearray)):
        347             raise TypeError('the JSON object must be str, bytes or bytearray, '
    --> 348                             'not {!r}'.format(s.__class__.__name__))
        349         s = s.decode(detect_encoding(s), 'surrogatepass')
        350

    TypeError: the JSON object must be str, bytes or bytearray, not 'Series'



However, the code `credits['cast'] = credits['cast'].apply(json.loads)` works. I am very confused, because I thought there was no difference between these two lines of code.

Can anyone explain the difference to me?


































  • To make it clear, cell number 7 works.
    – Qiaoyi Li
    Nov 11 at 3:53

  • When I try to load the JSON-formatted data, `credits['cast'] = json.loads(credits['cast'])` doesn't work and gives me the error "the JSON object must be str, bytes or bytearray, not 'Series'". However, `credits['cast'] = credits['cast'].apply(json.loads)` works. I don't understand: is there any difference between these two lines of code?
    – Qiaoyi Li
    Nov 11 at 4:04

  • Wouldn't it be better to load your data first and then do the pandas operations?
    – pygo
    Nov 11 at 4:32















python pandas














edited Nov 11 at 7:21 by pygo

asked Nov 11 at 3:53 by Qiaoyi Li















3 Answers
Accepted answer

The issue is that your `credits` variable is a pandas DataFrame, so `credits['cast']` is a Series. The `json.loads` function doesn't know how to deal with pandas data types, so you get an error when you call `json.loads(credits['cast'])`.

The Series type, however, has an `apply` method that accepts a function and calls it on each value the Series contains. That's why `credits['cast'].apply(json.loads)` works: it passes `json.loads` as the argument to `apply`, which then calls it on each string in the column.
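To make the mechanism concrete, here is a minimal sketch that uses only the standard library. `MiniSeries` is a hypothetical stand-in written for this illustration, not how pandas implements `Series`; it only mimics the idea that `apply` calls the given function once per stored value.

```python
import json


class MiniSeries:
    """Hypothetical stand-in for a pandas Series: a thin wrapper around a list."""

    def __init__(self, values):
        self.values = list(values)

    def apply(self, func):
        # Call func once per stored value and wrap the results again.
        return MiniSeries(func(v) for v in self.values)


cast = MiniSeries([
    '[{"cast_id": 242, "character": "Jake Sully"}]',
    '[{"cast_id": 3, "character": "Neytiri"}]',
])

# json.loads only accepts str, bytes or bytearray, so the container is rejected.
try:
    json.loads(cast)
    whole_column_error = None
except TypeError as exc:
    whole_column_error = str(exc)

# apply hands each string to json.loads individually, which is what it expects.
parsed = cast.apply(json.loads)
first_character = parsed.values[0][0]["character"]
```

The same asymmetry holds for a real Series: the container is not JSON text, but each value inside it is.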






answered Nov 11 at 4:23 by Blckknght
























The following code:

    credits['cast'] = credits['cast'].apply(json.loads)

applies the function `json.loads` to each row of `credits['cast']` (each row being a string). The result is a Series of decoded objects.

The following code:

    credits['cast'] = json.loads(credits['cast'])

attempts to apply the same function to the Series `credits['cast']` itself, but `json.loads` accepts only `str`, `bytes`, or `bytearray`, so it cannot be applied to a Series.
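The two cases can be compared side by side in a short sketch. It assumes pandas is installed; the two-row `credits` frame is a made-up stand-in for the real dataset, shortened from the sample in the question.

```python
import json

import pandas as pd

# Small stand-in for the dataset: each cell holds JSON text, not a parsed object.
credits = pd.DataFrame({
    "cast": [
        '[{"cast_id": 242, "character": "Jake Sully"}]',
        '[{"cast_id": 3, "character": "Neytiri"}]',
    ]
})

type_before = type(credits["cast"].iloc[0])  # each cell is a str

# Parse every cell; the result is a Series whose values are lists of dicts.
credits["cast"] = credits["cast"].apply(json.loads)

type_after = type(credits["cast"].iloc[0])  # each cell is now a list
first_names = [row[0]["character"] for row in credits["cast"]]

# Passing the whole Series at once raises the TypeError from the question.
try:
    json.loads(credits["cast"])
    raised = False
except TypeError:
    raised = True
```

After the `apply` call, each cell can be indexed like an ordinary Python list, which is why `row[0]["character"]` works in the comprehension.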

















answered Nov 11 at 4:23 by DYZ












  • Thank you, it is explained very clearly~😊
    – Qiaoyi Li
    Nov 11 at 6:33




























Detailed explanations have already been provided, but I would like to add that if you are using pandas to read and process the data, you can use:

    import pandas as pd
    d_list = [{"cast_id": 242, "character": "Jake Sully", "credit_id": "5602a8a7c3a3685532001c9a", "gender": 2, "id": 65731, "name": "Sam Worthington", "order": 0}, {"cast_id": 3, "character": "Neytiri"}]

Create a DataFrame using `DataFrame.from_dict`:

    df = pd.DataFrame.from_dict(d_list)
    print(df)

       cast_id   character                 credit_id  gender       id             name  order
    0      242  Jake Sully  5602a8a7c3a3685532001c9a     2.0  65731.0  Sam Worthington    0.0
    1        3     Neytiri                       NaN     NaN      NaN              NaN    NaN

Another approach suited to this purpose is `pd.read_json` with `orient='records'`. Note that `read_json` expects JSON text (or a file), not a Python list, so the data is passed as a string wrapped in `io.StringIO`:

    import io
    import pandas as pd
    # Same two records as above, written as JSON text instead of a Python list
    json_text = '[{"cast_id": 242, "character": "Jake Sully", "credit_id": "5602a8a7c3a3685532001c9a", "gender": 2, "id": 65731, "name": "Sam Worthington", "order": 0}, {"cast_id": 3, "character": "Neytiri"}]'
    df = pd.read_json(io.StringIO(json_text), orient='records')
    print(df)
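To connect this back to the `cast` column from the question, here is a rough sketch of turning a single cell into its own DataFrame. It assumes pandas is installed; the `cell` string is a shortened, made-up version of one value from the column.

```python
import json

import pandas as pd

# One cell of the 'cast' column: JSON text describing a list of records.
cell = '[{"cast_id": 242, "character": "Jake Sully"}, {"cast_id": 3, "character": "Neytiri"}]'

# Decode the text into a list of dicts, then build one row per record.
records = json.loads(cell)
cast_df = pd.DataFrame(records)

column_names = list(cast_df.columns)
second_character = cast_df.loc[1, "character"]
```

This works on one movie at a time, so it is an addition to the `apply(json.loads)` step on the whole column and not a replacement for it.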














edited Nov 11 at 4:52

answered Nov 11 at 4:45 by pygo












  • You can accept the answer that was most useful in your case by clicking the check mark to the left of the answer, which turns it green.
    – pygo
    Nov 11 at 6:41




































             
