Python: combining rows of 2 csv files on multiple conditions [on hold]

up vote
-2
down vote

favorite

I want to combine 2 CSV files with a different number of columns and each having unique ones too. Both files are thousands of lines long.
the files look like this:

file1:

Sample Chromosome Name Total coverage Strand coverage Context coverage

0 x3ng 7 CpG 2 1 1

1 x3ng 7 CpG 2 1 1

2 x3ng 7 CpG 1 1 1

3 x3ng 7 CpG 1 1 1

4 x3ng 7 CpG 1 1 1

Methylated coverage Methylation level position complement

0 1 1 207605 non

1 1 1 207616 non

2 1 1 210880 non

3 1 1 218029 non

4 1 1 218064 non

file2:

Chromosome Gene ENSEMBL start end complement

0 7 AC093627.11 ENSG00000239715 174920 176013 non

1 7 FAM20C ENSG00000177706 192969 300711 non

2 11 RP11-359E10.1 ENSG00000270607 19732480 19734165 complement

3 11 MIR4694 ENSG00000264309 19781550 19781629 complement

4 11 NAV2-AS3 ENSG00000254542 20000245 20002883 complement

5 11 NAV2-AS2 ENSG00000254453 20065392 20070849 complement

and the result should look like this:

Sample Chromosome Name Total coverage Strand coverage Context coverage Methylated coverage Methylation level position complement Gene ENSEMBL start end

x3ng 7 CpG 2 1 1 1 1 207605 non FAM20C ENSG00000177706 192969 300711

x3ng 7 CpG 2 1 1 1 1 207616 non FAM20C ENSG00000177707 192969 300711

x3ng 7 CpG 1 1 1 1 1 210880 non FAM20C ENSG00000177708 192969 300711

x3ng 7 CpG 1 1 1 1 1 218029 non FAM20C ENSG00000177709 192969 300711

x3ng 7 CpG 1 1 1 1 1 218064 non FAM20C ENSG00000177710 192969 300711

as you can see, the conditions are:

'Chromosome' in file1 = 'Chromosome' in file2

'complement' in file1 = 'complement' in file2

'position' of file 1 between 'start' and 'end' of file2

Thank you in advance.

edited 11 hours ago

asked yesterday

P.Stegen

New contributor

put on hold as unclear what you're asking by timgeb, legoscia, greg-449, Robert Columbia, Rob yesterday

Please clarify your specific problem or add additional details to highlight exactly what you need. As it's currently written, it’s hard to tell exactly what you're asking. See the How to Ask page for help clarifying this question. If this question can be reworded to fit the rules in the help center, please edit the question.

1

Please be clearer about what the exact problem here is. Do we have to start with how to read in a csv file? If yes, the question is too broad.
– timgeb
yesterday

No i know that, i just don't know how to work with multiple conditions.
– P.Stegen
yesterday

Good, then please provide a Minimal, Complete, and Verifiable example and your own attempt.
– timgeb
yesterday

@P.Stegen, are you saying to combine two csv files for the rows while removing duplicates between them? also please provide the code which you have tried so far and update that in text form rather in image form.
– pygo
yesterday

@P.Stegen, try this if you are looking for this pd.concat([df1, df2], axis = 1, join = 'inner', join_axes = [df1.index])[df1.columns]
– pygo
yesterday

|
show 1 more comment

up vote
-2
down vote

favorite

I want to combine 2 CSV files with a different number of columns and each having unique ones too. Both files are thousands of lines long.
the files look like this:

file1:

Sample Chromosome Name Total coverage Strand coverage Context coverage

0 x3ng 7 CpG 2 1 1

1 x3ng 7 CpG 2 1 1

2 x3ng 7 CpG 1 1 1

3 x3ng 7 CpG 1 1 1

4 x3ng 7 CpG 1 1 1

Methylated coverage Methylation level position complement

0 1 1 207605 non

1 1 1 207616 non

2 1 1 210880 non

3 1 1 218029 non

4 1 1 218064 non

file2:

and the result should look like this:

as you can see, the conditions are:

'Chromosome' in file1 = 'Chromosome' in file2

'complement' in file1 = 'complement' in file2

'position' of file 1 between 'start' and 'end' of file2

Thank you in advance.

edited 11 hours ago

asked yesterday

P.Stegen

New contributor

put on hold as unclear what you're asking by timgeb, legoscia, greg-449, Robert Columbia, Rob yesterday

1

Please be clearer about what the exact problem here is. Do we have to start with how to read in a csv file? If yes, the question is too broad.
– timgeb
yesterday

No i know that, i just don't know how to work with multiple conditions.
– P.Stegen
yesterday

Good, then please provide a Minimal, Complete, and Verifiable example and your own attempt.
– timgeb
yesterday

@P.Stegen, are you saying to combine two csv files for the rows while removing duplicates between them? also please provide the code which you have tried so far and update that in text form rather in image form.
– pygo
yesterday

@P.Stegen, try this if you are looking for this pd.concat([df1, df2], axis = 1, join = 'inner', join_axes = [df1.index])[df1.columns]
– pygo
yesterday

|
show 1 more comment

up vote
-2
down vote

favorite

I want to combine 2 CSV files with a different number of columns and each having unique ones too. Both files are thousands of lines long.
the files look like this:

file1:

Sample Chromosome Name Total coverage Strand coverage Context coverage

0 x3ng 7 CpG 2 1 1

1 x3ng 7 CpG 2 1 1

2 x3ng 7 CpG 1 1 1

3 x3ng 7 CpG 1 1 1

4 x3ng 7 CpG 1 1 1

Methylated coverage Methylation level position complement

0 1 1 207605 non

1 1 1 207616 non

2 1 1 210880 non

3 1 1 218029 non

4 1 1 218064 non

file2:

and the result should look like this:

as you can see, the conditions are:

'Chromosome' in file1 = 'Chromosome' in file2

'complement' in file1 = 'complement' in file2

'position' of file 1 between 'start' and 'end' of file2

Thank you in advance.

edited 11 hours ago

asked yesterday

P.Stegen

New contributor

I want to combine 2 CSV files with a different number of columns and each having unique ones too. Both files are thousands of lines long.
the files look like this:

file1:

Sample Chromosome Name Total coverage Strand coverage Context coverage

0 x3ng 7 CpG 2 1 1

1 x3ng 7 CpG 2 1 1

2 x3ng 7 CpG 1 1 1

3 x3ng 7 CpG 1 1 1

4 x3ng 7 CpG 1 1 1

Methylated coverage Methylation level position complement

0 1 1 207605 non

1 1 1 207616 non

2 1 1 210880 non

3 1 1 218029 non

4 1 1 218064 non

file2:

and the result should look like this:

as you can see, the conditions are:

'Chromosome' in file1 = 'Chromosome' in file2

'complement' in file1 = 'complement' in file2

'position' of file 1 between 'start' and 'end' of file2

Thank you in advance.

python csv dataframe

edited 11 hours ago

asked yesterday

P.Stegen

New contributor

edited 11 hours ago

asked yesterday

P.Stegen

New contributor

edited 11 hours ago

asked yesterday

P.Stegen

New contributor

asked yesterday

P.Stegen

asked yesterday

P.Stegen

New contributor

P.Stegen is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.

put on hold as unclear what you're asking by timgeb, legoscia, greg-449, Robert Columbia, Rob yesterday

1

Please be clearer about what the exact problem here is. Do we have to start with how to read in a csv file? If yes, the question is too broad.
– timgeb
yesterday

No i know that, i just don't know how to work with multiple conditions.
– P.Stegen
yesterday

Good, then please provide a Minimal, Complete, and Verifiable example and your own attempt.
– timgeb
yesterday

@P.Stegen, are you saying to combine two csv files for the rows while removing duplicates between them? also please provide the code which you have tried so far and update that in text form rather in image form.
– pygo
yesterday

@P.Stegen, try this if you are looking for this pd.concat([df1, df2], axis = 1, join = 'inner', join_axes = [df1.index])[df1.columns]
– pygo
yesterday

|
show 1 more comment

1

Please be clearer about what the exact problem here is. Do we have to start with how to read in a csv file? If yes, the question is too broad.
– timgeb
yesterday

No i know that, i just don't know how to work with multiple conditions.
– P.Stegen
yesterday

Good, then please provide a Minimal, Complete, and Verifiable example and your own attempt.
– timgeb
yesterday

@P.Stegen, are you saying to combine two csv files for the rows while removing duplicates between them? also please provide the code which you have tried so far and update that in text form rather in image form.
– pygo
yesterday

@P.Stegen, try this if you are looking for this pd.concat([df1, df2], axis = 1, join = 'inner', join_axes = [df1.index])[df1.columns]
– pygo
yesterday

Please be clearer about what the exact problem here is. Do we have to start with how to read in a csv file? If yes, the question is too broad.
– timgeb
yesterday

No i know that, i just don't know how to work with multiple conditions.
– P.Stegen
yesterday

Good, then please provide a Minimal, Complete, and Verifiable example and your own attempt.
– timgeb
yesterday

@P.Stegen, are you saying to combine two csv files for the rows while removing duplicates between them? also please provide the code which you have tried so far and update that in text form rather in image form.
– pygo
yesterday

@P.Stegen, try this if you are looking for this pd.concat([df1, df2], axis = 1, join = 'inner', join_axes = [df1.index])[df1.columns]
– pygo
yesterday

|
show 1 more comment

active

oldest

votes

This page is only for reference, If you need detailed information, please check here

N onYaFlcCgi9J4Onxy WgyJLbzQcba v

搜尋此網誌

Vfrdtyky