Python + regex: How to extract values between two underscores in Python?
I am trying to extract values between two underscores. For that I have written this code:
patient_ids = []
for file in files:
    print(file)
    patient_id = re.findall("_(.*?)_", file)
    patient_ids.append(patient_id)
print(patient_ids)
Output:
PT_112_NIM 26-04-2017_merged.csv
PT_114_NIM_merged.csv
PT_115_NIM_merged.csv
PT_116_NIM_merged.csv
PT_117_NIM_merged.csv
PT_118_NIM_merged.csv
PT_119_NIM_merged.csv
[['112'], ['114'], ['115'], ['116'], ['117'], ['118'], ['119'], ['120'], ['121'], ['122'], ['123'], ['124'], ['125'], ['126'], ['127'], ['128'], ['129'], ['130'], ['131'], ['132'], ['133'], ['134'], ['135'], ['136'], ['137'], ['138'], ['139'], ['140'], ['141'], ['142'], ['143'], ['144'], ['145'], ['146'], ['147'], ['150'], ['151'], ['152'], ['153'], ['154'], ['155'], ['156'], ['157'], ['158'], ['159'], ['160'], ['161'], ['162'], ['163'], ['165']]
So the extracted values are in this form: ['121']. I want them in this form: 121, i.e., just the number between the two underscores.
What change should I make to my code?
python regex
asked Nov 16 '18 at 12:47 by Debbie
How about int(patient_id[0])? – zipa, Nov 16 '18 at 12:50
Try int(re.search(r"_(\d*?)_", file).group(1)) – schwobaseggl, Nov 16 '18 at 12:50
@usr2564301 I think you misunderstood... – schwobaseggl, Nov 16 '18 at 12:51
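The re.search suggestion from the comments can be sketched roughly as follows. This is only an illustration: the filenames are copied from the question's output, extract_id is a hypothetical helper name, and the pattern uses \d+ (one or more digits) on the assumption that every ID is purely numeric.

```python
import re

# Sample filenames copied from the question's output.
files = [
    "PT_112_NIM 26-04-2017_merged.csv",
    "PT_114_NIM_merged.csv",
    "PT_115_NIM_merged.csv",
]


def extract_id(filename):
    """Return the first run of digits enclosed by underscores as an int, or None."""
    match = re.search(r"_(\d+)_", filename)
    # re.search returns None when nothing matches, so guard before calling group().
    return int(match.group(1)) if match else None


patient_ids = [extract_id(name) for name in files]
print(patient_ids)  # [112, 114, 115]
```

Unlike re.findall, re.search stops at the first match, so each file contributes a single value rather than a list.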
4 Answers
An easy fix: instead of appending each findall result (which is itself a list) to patient_ids, extend patient_ids with it, so the matches are added as individual items:
patient_ids = []
for file in files:
    print(file)
    patient_ids.extend(re.findall("_(.*?)_", file))
print(patient_ids)
answered Nov 16 '18 at 13:01 by connectyourcharger (edited Nov 16 '18 at 13:02)
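The difference between append and extend that the answer above relies on can be seen on a toy list. A minimal sketch; the variable names are made up for illustration, and matches stands in for what re.findall returns for one file.

```python
matches = ["112"]  # stand-in for the result of re.findall on one filename

appended = []
appended.append(matches)  # adds the list itself as a single element
print(appended)  # [['112']]

extended = []
extended.extend(matches)  # adds each element of the list individually
print(extended)  # ['112']
```

Note that extend alone still leaves the values as strings; converting them to int is a separate step.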
Just replace the last line of your for loop with:
    patient_ids.extend(int(x) for x in patient_id)
extend flattens your results, and int(x) converts each matched string to an int. (patient_id is a list, so it cannot be passed to int() directly.)
answered Nov 16 '18 at 12:52 by Matina G
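For reference, here is a sketch of the question's loop with the last line changed to extend with converted values. The two filenames are taken from the question's output; everything else follows the original code. It assumes every captured value is numeric, since int() raises ValueError otherwise.

```python
import re

# Two sample filenames from the question's output.
files = ["PT_112_NIM 26-04-2017_merged.csv", "PT_114_NIM_merged.csv"]

patient_ids = []
for file in files:
    patient_id = re.findall("_(.*?)_", file)  # a list such as ['112']
    # Convert each match to int and add the values individually.
    patient_ids.extend(int(x) for x in patient_id)

print(patient_ids)  # [112, 114]
```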
You need to flatten your results, e.g. like this:
flat_list = [item for sublist in patient_ids for item in sublist]
print(flat_list)
# => ['112', '114', '115', '116', '117', '118', '119', '120', '121', '122', '123', '124', '125', '126', '127', '128', '129', '130', '131', '132', '133', '134', '135', '136', '137', '138', '139', '140', '141', '142', '143', '144', '145', '146', '147', '150', '151', '152', '153', '154', '155', '156', '157', '158', '159', '160', '161', '162', '163', '165']
answered Nov 16 '18 at 12:52 by mrzasa
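As an alternative to the nested comprehension above (a swapped-in technique, not what the answer itself uses), the standard library's itertools.chain.from_iterable can do the flattening. A minimal sketch with a shortened sample list; nested, flat and as_ints are illustrative names.

```python
from itertools import chain

# Shortened sample of the nested findall results from the question.
nested = [["112"], ["114"], ["115"]]

# chain.from_iterable yields the items of each inner list in order.
flat = list(chain.from_iterable(nested))
print(flat)  # ['112', '114', '115']

# Converting to integers is a separate step.
as_ints = [int(s) for s in flat]
print(as_ints)  # [112, 114, 115]
```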
You have a list of findall results (it seems there is only ever one result per file). You can either just convert the strings to integers, or also flatten the result:
patient_ids = [['112'], ['114', '4711'], ['115'], ['116'], ['117'], ['118'], ['119']]
# the second entry was modified to hold 2 ids for demo purposes
# if you want to keep the nesting
numms = [list(map(int, m)) for m in patient_ids]
# converted and flattened
numms2 = [x for y in [list(map(int, m)) for m in patient_ids] for x in y]
print(numms)
print(numms2)
Output:
# this keeps the findall results together in inner lists
[[112], [114, 4711], [115], [116], [117], [118], [119]]
# this flattens all results
[112, 114, 4711, 115, 116, 117, 118, 119]
Documentation:
- map() and int() are described in the Python documentation's overview of built-in functions
answered Nov 16 '18 at 13:01 by Patrick Artner