Parsing full names from a list of names
I am using namesparser to extract full names from a list of names.
from namesparser import HumanNames
names = HumanNames('Randy Heimerman, James Durham, Nate Green')
print(names.human_names[0])
Namesparser works well in most cases, but it fails on the example above. I believe this is because the name "Randy" contains "and", which namesparser treats as a separator.
When I move Randy's name to the end of the string, the correct first name is printed (James Durham). If I try to print either of the other two names, though, the wrong strings are returned.
Any ideas on how I can resolve this?
python parsing
Can you show the output you currently get?
– Andreas
Nov 14 '18 at 2:00
Are all the full names literally comma separated? Also, have you considered using a named entity recognition pipeline? Stanford's CoreNLP would parse this no sweat, and then you'd just use while loops to collect consecutive tokens with the person attribute...
– duhaime
Nov 14 '18 at 2:01
What's HumanNames? I don't see that class in the docs, only HumanName, which takes one person at a time. If you could provide more info, there is probably already a way to do this in the library.
– aws_apprentice
Nov 14 '18 at 2:03
github.com/gwu-libraries/namesparser
– Steve Just
Nov 14 '18 at 2:29
The issue is here -> github.com/gwu-libraries/namesparser/blob/master/… You'll have to change that line, otherwise it will keep splitting the name on the "and".
– aws_apprentice
Nov 14 '18 at 2:53
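The substring behaviour described in these comments can be reproduced without the library. The sketch below is not namesparser's actual code (the linked line is elided above); the two regular expressions are assumptions made for illustration. It contrasts a separator pattern that matches "and" anywhere with one that only matches "and" as a whole word.

```python
import re

raw = "Randy Heimerman, James Durham, Nate Green"

# Hypothetical separator pattern: matches "and" anywhere, even inside a word.
naive = re.compile(r",|and|&")
# Same separators, but "and" must stand alone as a whole word.
bounded = re.compile(r",|\band\b|&")

def split_names(pattern, text):
    """Split text on the pattern and drop empty pieces."""
    return [part.strip() for part in pattern.split(text) if part.strip()]

naive_result = split_names(naive, raw)
bounded_result = split_names(bounded, raw)

print(naive_result)    # ['R', 'y Heimerman', 'James Durham', 'Nate Green']
print(bounded_result)  # ['Randy Heimerman', 'James Durham', 'Nate Green']
```

If the library's separator really is an unanchored "and", adding word boundaries in a local copy would be one way to avoid cutting "Randy" in two. Note that this sketch also splits on every comma, so it does not deal with suffixes such as "Jr.".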
edited Nov 15 '18 at 14:29 by rici
asked Nov 14 '18 at 1:53 by Steve Just
1 Answer
I think you should use the comma (,) as your delimiter.
def print_names(name_string):
    return (name.strip() for name in name_string.split(","))
This splits your string on commas and strips leading and trailing whitespace from each piece, returning a generator of names (not a list).
Now that you have a generator of names, you can pass it into other things, for example:
from nameparser import HumanName  # the single-name parser from the module linked below
humans = [HumanName(name) for name in print_names(name_string)]
Then again, I don't know what your HumanNames / HumanName class really does, and you didn't include a class definition.
If you are looking at this module: https://pypi.org/project/nameparser/, which takes a string containing a single name, the above will still work.
I can't simply split based on comma because of names like "John Smith, Jr." Namesparser is supposed to account for these situations (and in fact, works well). It just seems to be getting tripped up by this scenario where "and" falls within the name itself.
– Steve Just
Nov 14 '18 at 2:31
I think you need to point me to a class that defines "HumanNames", because HumanName takes only one name. I don't know what the constraints are for the plural version. Can you link it? I should be able to help more if I can see its implementation.
– Fallenreaper
Nov 14 '18 at 4:21
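One way to keep the comma approach while handling names like "John Smith, Jr." is to re-attach suffix pieces after splitting. This is a minimal sketch, not what namesparser does: the SUFFIXES set and the split_full_names helper are hypothetical names introduced here, and the suffix list would need extending for real data.

```python
# Hypothetical set of suffixes; extend it for your own data.
SUFFIXES = {"jr", "sr", "ii", "iii", "iv"}

def split_full_names(name_string):
    """Split on commas, re-attaching suffix pieces to the previous name."""
    names = []
    for piece in name_string.split(","):
        piece = piece.strip()
        if not piece:
            continue
        # "Jr." -> "jr"; a suffix belongs to the name that precedes it.
        if names and piece.rstrip(".").lower() in SUFFIXES:
            names[-1] = f"{names[-1]}, {piece}"
        else:
            names.append(piece)
    return names

result = split_full_names("Randy Heimerman, John Smith, Jr., Nate Green")
print(result)  # ['Randy Heimerman', 'John Smith, Jr.', 'Nate Green']
```

Because it never looks for "and", "Randy" is left intact. It assumes every name is written "First Last"; input in "Last, First" order would be split incorrectly.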
edited Nov 14 '18 at 2:08
answered Nov 14 '18 at 2:01 by Fallenreaper