Parsing full names from a list of names












I am using namesparser to extract full names from a list of names.



from namesparser import HumanNames
names = HumanNames('Randy Heimerman, James Durham, Nate Green')
print(names.human_names[0])


Namesparser works well in most cases, but it fails on the example above. I believe this is because the name "Randy" contains the substring "and", which namesparser treats as a separator.



When I move Randy's name to the end of the string, the correct name is printed (James Durham). If I try to print either of the 2 other names, though, the wrong strings are returned.
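To illustrate the suspected cause, here is a minimal sketch using only the standard library. It is not namesparser's actual code; the two patterns are hypothetical and only show how an unanchored "and" separator would cut "Randy" in two, while a word-boundary version would not.

```python
import re

text = "Randy Heimerman, James Durham, Nate Green"

# Hypothetical naive separator: matches "and" anywhere, even inside a word.
naive = re.compile(r"\s*(?:,|and)\s*")
# Word-boundary version: matches "and" only as a standalone word.
bounded = re.compile(r"\s*(?:,|\band\b)\s*")

# Drop empty pieces that can appear next to a separator.
naive_parts = [p for p in naive.split(text) if p]
bounded_parts = [p for p in bounded.split(text) if p]

print(naive_parts)    # "Randy" is split into "R" and "y Heimerman"
print(bounded_parts)  # each full name stays intact
```

If the library's separator pattern behaves like the naive one, anchoring "and" with `\b` would be one possible fix.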



Any ideas on how I can resolve this?










  • Can you provide the output that you currently get?

    – Andreas
    Nov 14 '18 at 2:00











  • Are all the full names literally comma separated? Also, have you considered using a named entity recognition pipeline? Stanford's CoreNLP would parse this no sweat, and then you'd just use while loops to collect consecutive tokens with the person attribute...

    – duhaime
    Nov 14 '18 at 2:01
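A rough sketch of the token-collecting step described in this comment, assuming a tagger has already produced (token, tag) pairs. The pairs below are written by hand for illustration and are not CoreNLP output, and `collect_people` is a hypothetical helper. It uses `itertools.groupby` in place of explicit while loops.

```python
from itertools import groupby

# Hypothetical (token, tag) pairs, shaped like the output of an NER tagger.
# These tags are hand-written for illustration, not produced by CoreNLP.
tagged = [
    ("Randy", "PERSON"), ("Heimerman", "PERSON"), (",", "O"),
    ("James", "PERSON"), ("Durham", "PERSON"), (",", "O"),
    ("Nate", "PERSON"), ("Green", "PERSON"),
]

def collect_people(tagged_tokens):
    """Join each run of consecutive PERSON tokens into one full name."""
    people = []
    for is_person, run in groupby(tagged_tokens, key=lambda pair: pair[1] == "PERSON"):
        if is_person:
            people.append(" ".join(token for token, _ in run))
    return people

print(collect_people(tagged))
```

Because names are built from tagged tokens rather than from separators, a substring such as "and" inside "Randy" plays no role here.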











  • What is HumanNames? I don't see that class in the docs, only HumanName, which takes one person at a time. If you could provide more info, there is probably already a way to do this in the library.

    – aws_apprentice
    Nov 14 '18 at 2:03











  • github.com/gwu-libraries/namesparser

    – Steve Just
    Nov 14 '18 at 2:29











  • The issue is here: github.com/gwu-libraries/namesparser/blob/master/… You'll have to change that line; otherwise it will keep splitting the name on the "and".

    – aws_apprentice
    Nov 14 '18 at 2:53
















python parsing






edited Nov 15 '18 at 14:29 by rici

asked Nov 14 '18 at 1:53 by Steve Just













1 Answer
I think you should use the comma as your delimiter.

    def print_names(name_string):
        # Split on commas and strip surrounding whitespace from each piece.
        return (name.strip() for name in name_string.split(","))

This splits your string on the comma and strips leading and trailing whitespace from each piece. Note that it returns a generator of names, not a list.

Now that you have a generator of names, you can pass it into other things, for example:

    from nameparser import HumanName  # assuming the nameparser package linked below
    humans = [HumanName(name) for name in print_names(name_string)]

Then again, I don't know what your HumanNames / HumanName class really is, since you didn't include a class definition.

If you are looking at this module: https://pypi.org/project/nameparser/, which takes a string containing a single name, the above will still work.
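Since the question indexes into the result and a generator cannot be indexed, a list-returning variant may be more convenient. The sketch below uses only the standard library; `split_names` is a hypothetical name, not part of any library mentioned here.

```python
def split_names(name_string):
    """Split on commas and strip whitespace, returning a list that can be indexed."""
    return [name.strip() for name in name_string.split(",")]

names = split_names("Randy Heimerman, James Durham, Nate Green")
print(names[0])
print(names[1:])
```

This works for the sample string, but it assumes every comma separates two people, which does not hold for every input.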






  • I can't simply split based on comma because of names like "John Smith, Jr." Namesparser is supposed to account for these situations (and in fact, works well). It just seems to be getting tripped up by this scenario where "and" falls within the name itself.

    – Steve Just
    Nov 14 '18 at 2:31
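One possible way to keep comma splitting while handling the "John Smith, Jr." case is to reattach suffix pieces to the preceding name. This is only a sketch under stated assumptions: `SUFFIXES` is a hypothetical, deliberately small set, `split_names_keep_suffixes` is a made-up helper, and inputs such as "Last, First" or names joined by the word "and" are not handled.

```python
# Hypothetical, deliberately small set of suffixes; extend as needed.
SUFFIXES = {"jr", "sr", "ii", "iii", "iv"}

def split_names_keep_suffixes(name_string):
    """Split on commas, but reattach a suffix piece to the preceding name."""
    names = []
    for piece in name_string.split(","):
        piece = piece.strip()
        if not piece:
            continue
        # Normalize "Jr." to "jr" before checking against the suffix set.
        if names and piece.rstrip(".").lower() in SUFFIXES:
            names[-1] = f"{names[-1]}, {piece}"
        else:
            names.append(piece)
    return names

print(split_names_keep_suffixes("John Smith, Jr., Randy Heimerman, James Durham"))
```

Because the split happens only on commas, the "and" inside "Randy" is never treated as a separator.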











  • I think you need to point me to the class that defines "HumanNames", because HumanName takes only one name. I don't know what the constraints are for the plural version. Can you link it? I should be able to assist a bit more if I can see its implementation.

    – Fallenreaper
    Nov 14 '18 at 4:21













edited Nov 14 '18 at 2:08

answered Nov 14 '18 at 2:01 by Fallenreaper












