Parsing large string values in Pandas

I have a .csv which I've generated a dataframe from. This csv has raw data outputs from a system that follows this format:

{"DataType1":"Value","DataType2":"Value","DataType3":"Value",.....}

Each row in the dataframe has just this in one column. I'm trying to break this out so that the data types become column headers and the values populate the rows. One other aspect is that not all rows have the same data types; some have additional data types that might not be present in other rows. For example, row 1 may have DataType1, DataType2, and DataType3, and row 2 may have DataType2, DataType4, and DataType5. Ideally, I'd like the column headers of the output to incorporate all data types, whether or not a given row has a value for each. So the final dataframe would have this structure:

-------------------------------------------------------------
| DataType1 | DataType2 | DataType3 | DataType4 | DataType5 |
-------------------------------------------------------------
| Value     | Value     | Value     | NaN       | NaN       |
-------------------------------------------------------------
| NaN       | Value     | NaN       | Value     | Value     |
-------------------------------------------------------------

edited Nov 16 '18 at 7:40

Aqueous Carlos


asked Nov 16 '18 at 5:58

Danny

Hi, welcome to Stack Overflow. Please look around SO for similar problems, e.g. stackoverflow.com/questions/14745022/…, stackoverflow.com/questions/29370211/…, stackoverflow.com/questions/39553392/… etc.

– Evan
Nov 16 '18 at 6:02

Possible duplicate of Split strings in tuples into columns, in Pandas

– Evan
Nov 16 '18 at 6:02

If you know, is the data JSON, or a Python dictionary? What have you tried so far?

– Evan
Nov 16 '18 at 6:03

The data is in a csv table as listed above. Each row has just one column containing one string, and that string follows the dictionary format shown.

– Danny
Nov 17 '18 at 20:26

python pandas csv dataframe


1 Answer

A DataFrame built from a dictionary expects this format:

data = {'column 1':[1,2], 'column 2':[3,4], ...}

Notice that every key maps to a list of the same length; otherwise

import pandas as pd
pd.DataFrame(data)

will raise a ValueError.

To get around the error, you can iterate over the dictionary and wrap each value in a pd.Series, so that shorter columns are padded with NaN:

pd.DataFrame(dict([(k, pd.Series(v)) for k, v in data.items()]))

*Assuming 'data' is the name of your dictionary. Avoid calling it 'dict': that shadows the built-in type, and the dict(...) call above would then fail.

This way you should get the desired output.
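For the case described in the question, where each row holds one dictionary-style string, a minimal sketch could look like the following. It assumes the strings are valid JSON (the comments never confirm this), and the column name "raw" and the sample values are made up for illustration. If the strings turn out to be Python dict literals with single quotes, ast.literal_eval could be swapped in for json.loads.

```python
import json

import pandas as pd

# Hypothetical stand-in for the dataframe read from the csv:
# a single column (here called "raw") holding one JSON string per row.
df = pd.DataFrame({
    "raw": [
        '{"DataType1":"a","DataType2":"b","DataType3":"c"}',
        '{"DataType2":"d","DataType4":"e","DataType5":"f"}',
    ]
})

# Parse every string into a Python dict (assumes valid JSON).
records = df["raw"].apply(json.loads).tolist()

# Building a DataFrame from a list of dicts uses the union of all keys
# as columns and fills keys missing from a row with NaN.
parsed = pd.DataFrame(records)

print(parsed)
```

Because the frame is built from a list of per-row dicts rather than a dict of columns, rows with different keys do not trigger the length error mentioned above.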

answered Nov 16 '18 at 6:12

gauravtolani

