Why is rune in golang an alias for int32 and not uint32?












The type rune in Go is defined as "an alias for int32 and is equivalent to int32 in all ways. It is used, by convention, to distinguish character values from integer values."




If the intention is to use this type to represent character values, why did the authors of the Go language not use uint32 instead of int32? How do they expect a rune value to be handled in a program when it is negative? The other similar type, byte, is an alias for uint8 (and not int8), which seems reasonable.










  • Note: byte is an alias for uint8, not uint.

    – Filipe Gonçalves
    Aug 26 '15 at 23:11











  • You selected the right answer before, what has changed?

    – VonC
    May 17 '18 at 18:44
















asked Jul 12 '14 at 15:55 by Tapan Karecha
edited Nov 15 '18 at 7:28 by Rene Knop








3 Answers

I searched and found this golang-nuts thread:
https://groups.google.com/forum/#!topic/golang-nuts/d3_GPK8bwBg

There, Christoph Hack writes:

"This has been asked several times. rune occupies 4 bytes and not just one because it is supposed to store unicode codepoints and not just ASCII characters. Like array indices, the datatype is signed so that you can easily detect overflows or other errors while doing arithmetic with those types."
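To make the point about detecting errors concrete, here is a minimal sketch. The helper name letterIndex is hypothetical and not part of any library. It relies on the fact that subtracting two runes yields a signed result, so an out-of-range input shows up as a negative number instead of wrapping around.

```go
package main

import "fmt"

// letterIndex is a hypothetical helper: it returns the zero-based position
// of r in the lowercase ASCII alphabet, or -1 if r is outside 'a'..'z'.
func letterIndex(r rune) int {
	d := r - 'a' // signed difference: negative when r sorts before 'a'
	if d < 0 || d > 'z'-'a' {
		return -1
	}
	return int(d)
}

func main() {
	fmt.Println(letterIndex('c')) // 2
	fmt.Println(letterIndex('A')) // -1
	fmt.Println('A' - 'a')        // -32, a plainly invalid offset
}
```

With an unsigned type, the same subtraction would produce a very large positive value, and the d < 0 check could never fire.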







  • All answers in that thread argue that there is enough space to reference all code points of Unicode in a signed 32 bit integer. Hence, I do understand how rune is big enough to address the Unicode range. The question still remains about the choice of type. Why not uint16 (which has comparable range of values for positive integers) but uses only half the space as int32?

    – Tapan Karecha
    Jul 12 '14 at 16:20






  • @TapanKarecha: uint16 doesn’t fit all of Unicode, though. It fits a really big chunk of it, but Unicode’s code space runs up to 0x10FFFF.

    – Ry-
    Jul 12 '14 at 16:21








  • Christoph Hack: "This has been asked several times. rune occupies 4 bytes and not just one because it is supposed to store unicode codepoints and not just ASCII characters. Like array indices, the datatype is signed so that you can easily detect overflows or other errors while doing arithmetic with those types."

    – chendesheng
    Jul 12 '14 at 16:27






  • @chendesheng, please add your comment into your answer. It is the most important part, in my opinion.

    – andybalholm
    Jul 12 '14 at 17:55






  • Yes: uint can have hard-to-debug behavior like a-b > 1000 when a=1 and b=2. So Go uses int where it can.

    – twotwotwo
    Jul 13 '14 at 2:21
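The wraparound described in the last comment can be sketched as follows. The function names diffUnsigned and diffSigned are hypothetical, and the sketch uses the fixed-size types uint32 and int32 rather than uint so that the printed values do not depend on the platform.

```go
package main

import "fmt"

// diffUnsigned and diffSigned are hypothetical helpers used only to
// contrast unsigned and signed subtraction on the same inputs.
func diffUnsigned(a, b uint32) uint32 { return a - b }

func diffSigned(a, b int32) int32 { return a - b }

func main() {
	fmt.Println(diffUnsigned(1, 2))        // 4294967295: wraps around modulo 2^32
	fmt.Println(diffUnsigned(1, 2) > 1000) // true
	fmt.Println(diffSigned(1, 2))          // -1
	fmt.Println(diffSigned(1, 2) > 1000)   // false
}
```

Since rune is an int32, a difference between two runes behaves like diffSigned here.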





















A rune holding a Unicode code point doesn’t become negative. Unicode’s code space currently has 1,114,112 code points (0 through 0x10FFFF), which is far from 2,147,483,647 (0x7fffffff) – even considering all the reserved blocks.






  • Thanks! Though a rune may address a range much larger than needed by unicode at this time, the question is about the fact that a negative value can be assigned to a rune. This could have been avoided if it was an unsigned integer. But there may be other considerations that make sense for a rune to still be a signed type, and I wonder what those are.

    – Tapan Karecha
    Jul 12 '14 at 16:10













  • @TapanKarecha: Sure, but you could also assign a positive value outside of Unicode’s range. Neither one would be valid Unicode. (Negative numbers might be more obvious to check for as an error condition, as a habit taken from C?)

    – Ry-
    Jul 12 '14 at 16:23













  • @false: Yes, there will be invalid values on the positive end of the type range, but having invalid values on both ends of the type range is something I am having trouble with as a concept. As you said, if the type were unsigned, I wouldn't have to worry about checking for negative values, which is one less check during validation.

    – Tapan Karecha
    Jul 12 '14 at 16:32













  • @TapanKarecha: No, I was saying that a negative return value on something that ought to return Unicode would be an obvious error (not something that Go needs, but something that you might commonly do in other languages), but checking the positive isn’t convenient at all. Judging by Unicode’s stability policy, it might not even be possible.

    – Ry-
    Jul 12 '14 at 16:35






  • I think chendesheng's quote gets at the root cause best: Go uses a lot of signed values, not just for runes but also for array indices, Read/Write byte counts, etc. That's because uints, in any language, behave confusingly unless you guard every piece of arithmetic against overflow (for example, with var a, b uint = 1, 2, both a-b > 0 and a-b > 1000000 hold: play.golang.org/p/lsdiZJiN7V). ints behave more like numbers in everyday life, which is a compelling reason to use them, and there is no equally compelling reason not to use them.

    – twotwotwo
    Jul 13 '14 at 2:03





















"Golang, Go : what is rune by the way?" mentions:

"With the recent Unicode 6.3, there are over 110,000 symbols defined. This requires at least 21-bit representation of each code point, so a rune is like int32 and has plenty of bits."

As for overflow or negative values, note that the implementation of some functions in the unicode package, such as unicode.IsGraphic, includes this comment:

"We convert to uint32 to avoid the extra test for negative"

Code:



const MaxLatin1 = '\u00FF' // maximum Latin-1 value.

// IsGraphic reports whether the rune is defined as a Graphic by Unicode.
// Such characters include letters, marks, numbers, punctuation, symbols, and
// spaces, from categories L, M, N, P, S, Zs.
func IsGraphic(r rune) bool {
	// We convert to uint32 to avoid the extra test for negative,
	// and in the index we convert to uint8 to avoid the range check.
	if uint32(r) <= MaxLatin1 {
		return properties[uint8(r)]&pg != 0
	}
	return In(r, GraphicRanges...)
}


That may be because a rune literal is an untyped constant (as mentioned in "Go rune type explanation"): its constant value can be stored in an int32, a uint32, or even a float32, among other numeric types.
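The conversion used in IsGraphic can be tried in isolation. The sketch below is not the library's code: maxLatin1, inLatin1Signed and inLatin1Unsigned are hypothetical names. It only illustrates why a single unsigned comparison covers both the negative case and the upper bound.

```go
package main

import "fmt"

// maxLatin1 has the same value as unicode.MaxLatin1; it is redeclared here
// so the sketch stays self-contained.
const maxLatin1 = '\u00FF'

// inLatin1Signed checks the range with two comparisons on the signed rune.
func inLatin1Signed(r rune) bool {
	return r >= 0 && r <= maxLatin1
}

// inLatin1Unsigned checks the range with one comparison: converting a
// negative rune to uint32 gives a value of at least 0x80000000, which is
// far above maxLatin1, so negatives are rejected without a separate test.
func inLatin1Unsigned(r rune) bool {
	return uint32(r) <= maxLatin1
}

func main() {
	for _, r := range []rune{-1, 0, 'A', 0xFF, 0x100, '世'} {
		fmt.Println(r, inLatin1Signed(r), inLatin1Unsigned(r))
	}
}
```

Both functions agree on every input, so keeping rune signed costs nothing in checks like this one.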






    First answer (quoting the golang-nuts thread): answered Jul 12 '14 at 16:08 by chendesheng, edited Sep 6 '16 at 15:22 by Trevor Hickey













    Second answer ("It doesn’t become negative"): answered Jul 12 '14 at 16:00 by Ry-








    • 2





      Thanks! Though a rune may address a range much larger than needed by unicode at this time, the question is about the fact that a negative value can be assigned to a rune. This could have been avoided if it was an unsigned integer. But there may be other considerations that make sense for a rune to still be a signed type, and I wonder what those are.

      – Tapan Karecha
      Jul 12 '14 at 16:10













    • @TapanKarecha: Sure, but you could also assign a positive value outside of Unicode’s range. Neither one would be valid Unicode. (Negative numbers might be more obvious to check for as an error condition, as a habit taken from C?)

      – Ry-
      Jul 12 '14 at 16:23













    • .@false: Yes, there will be invalid values on the positive end of the type range, but having invalid values on both ends of the type range is something I am having trouble dealing with as a concept. As you said, if the type was unsigned, I wont have to worry about checking for the negative value, which is one less check during validation.

      – Tapan Karecha
      Jul 12 '14 at 16:32













    • @TapanKarecha: No, I was saying that a negative return value on something that ought to return Unicode would be an obvious error (not something that Go needs, but something that you might commonly do in other languages), but checking the positive isn’t convenient at all. Judging by Unicode’s stability policy, it might not even be possible.

      – Ry-
      Jul 12 '14 at 16:35






    • 6





      I think chendesheng's quote gets at the root cause best: Go uses a lot of signed values, not just for runes but array indices, Read/Write byte counts, etc. That's because uints, in any language, behave confusingly unless you guard every piece of arithmetic against overflow (for example if var a, b uint = 1, 2, a-b > 0 and a-b > 1000000: play.golang.org/p/lsdiZJiN7V). ints behave more like numbers in everyday life, which is a compelling reason to use them, and there is no equally compelling reason not to use them.

      – twotwotwo
      Jul 13 '14 at 2:03
















    • 2





      Thanks! Though a rune may address a range much larger than needed by unicode at this time, the question is about the fact that a negative value can be assigned to a rune. This could have been avoided if it was an unsigned integer. But there may be other considerations that make sense for a rune to still be a signed type, and I wonder what those are.

      – Tapan Karecha
      Jul 12 '14 at 16:10













    • @TapanKarecha: Sure, but you could also assign a positive value outside of Unicode’s range. Neither one would be valid Unicode. (Negative numbers might be more obvious to check for as an error condition, as a habit taken from C?)

      – Ry-
      Jul 12 '14 at 16:23













    • .@false: Yes, there will be invalid values on the positive end of the type range, but having invalid values on both ends of the type range is something I am having trouble dealing with as a concept. As you said, if the type was unsigned, I wont have to worry about checking for the negative value, which is one less check during validation.

      – Tapan Karecha
      Jul 12 '14 at 16:32













    • @TapanKarecha: No, I was saying that a negative return value on something that ought to return Unicode would be an obvious error (not something that Go needs, but something that you might commonly do in other languages), but checking the positive isn’t convenient at all. Judging by Unicode’s stability policy, it might not even be possible.

      – Ry-
      Jul 12 '14 at 16:35






    • 6





      I think chendesheng's quote gets at the root cause best: Go uses a lot of signed values, not just for runes but array indices, Read/Write byte counts, etc. That's because uints, in any language, behave confusingly unless you guard every piece of arithmetic against overflow (for example if var a, b uint = 1, 2, a-b > 0 and a-b > 1000000: play.golang.org/p/lsdiZJiN7V). ints behave more like numbers in everyday life, which is a compelling reason to use them, and there is no equally compelling reason not to use them.

      – twotwotwo
      Jul 13 '14 at 2:03






































    4














    "Golang, Go : what is rune by the way?" mentioned:




    With the recent Unicode 6.3, there are over 110,000 symbols defined. This requires at least 21-bit representation of each code point, so a rune is like int32 and has plenty of bits.
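
The 21-bit figure in the quote can be checked against the largest code point the standard library accepts. This is only a sketch: bitsNeeded is a hypothetical helper, and it assumes a non-negative rune.

```go
package main

import (
	"fmt"
	"math/bits"
	"unicode"
)

// bitsNeeded returns the minimum number of bits required to hold r.
// It assumes r is non-negative.
func bitsNeeded(r rune) int {
	return bits.Len32(uint32(r))
}

func main() {
	fmt.Printf("%#x\n", unicode.MaxRune)     // 0x10ffff
	fmt.Println(bitsNeeded(unicode.MaxRune)) // 21
}
```

Since 21 bits fit in the 31 value bits of an int32, the sign bit is never needed to represent a valid code point.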




    As for the overflow and negative-value concerns, note that the implementations of some unicode functions, such as unicode.IsGraphic, include this comment:




    We convert to uint32 to avoid the extra test for negative




    Code:



    const MaxLatin1 = '\u00FF' // maximum Latin-1 value.

    // IsGraphic reports whether the rune is defined as a Graphic by Unicode.
    // Such characters include letters, marks, numbers, punctuation, symbols, and
    // spaces, from categories L, M, N, P, S, Zs.
    func IsGraphic(r rune) bool {
        // We convert to uint32 to avoid the extra test for negative,
        // and in the index we convert to uint8 to avoid the range check.
        if uint32(r) <= MaxLatin1 {
            return properties[uint8(r)]&pg != 0
        }
        return In(r, GraphicRanges...)
    }
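
To see why the conversion saves a test, here is a small sketch comparing the two forms of the range check. The names maxLatin1, inLatin1TwoTests and inLatin1OneTest are hypothetical and are not part of the unicode package:

```go
package main

import "fmt"

// maxLatin1 is a local stand-in for unicode.MaxLatin1 (an assumption of this sketch).
const maxLatin1 = '\u00FF'

// inLatin1TwoTests checks the lower and upper bound explicitly.
func inLatin1TwoTests(r rune) bool {
	return r >= 0 && r <= maxLatin1
}

// inLatin1OneTest converts to uint32 first. A negative rune becomes a value
// of at least 2^31, so the single upper-bound comparison rejects it too.
func inLatin1OneTest(r rune) bool {
	return uint32(r) <= maxLatin1
}

func main() {
	for _, r := range []rune{'A', '\u00FF', '\u0100', -1} {
		fmt.Println(r, inLatin1TwoTests(r), inLatin1OneTest(r))
	}
}
```

Both functions agree on every input, so a negative rune costs nothing extra to reject.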


    That may be because a rune is meant to be a constant (as mentioned in "Go rune type explanation", where a rune constant can be held in an int32, a uint32, or even a float32, ...): its constant value allows it to be stored in any of those numeric types.
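
A short sketch of that last point: in Go a rune literal such as 'a' is an untyped constant, so it can initialise variables of several numeric types without a conversion. The constant letter and the function asNumericTypes are hypothetical names used only for illustration.

```go
package main

import "fmt"

// letter is an untyped rune constant; its default type is rune (int32).
const letter = 'a'

// asNumericTypes stores the same constant in four different numeric types.
func asNumericTypes() (int32, uint32, float32, byte) {
	var i int32 = letter
	var u uint32 = letter
	var f float32 = letter
	var b byte = letter
	return i, u, f, b
}

func main() {
	i, u, f, b := asNumericTypes()
	r := letter // takes the default type, rune
	fmt.Printf("%T %T %T %T %T\n", i, u, f, b, r) // int32 uint32 float32 uint8 int32
	fmt.Println(i, u, f, b, r)                    // 97 97 97 97 97
}
```

This only applies to constants: once a value has type rune, storing it in a uint32 or float32 requires an explicit conversion.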












































        edited May 23 '17 at 12:25
        – Community

        answered Jul 12 '14 at 18:21
        – VonC





























