Why not allow an external interface to provide hashCode/equals for a HashMap?












13















With a TreeMap it's trivial to provide a custom Comparator, thus overriding the semantics provided by Comparable objects added to the map. HashMaps however cannot be controlled in this manner; the functions providing hash values and equality checks cannot be 'side-loaded'.



I suspect it would be both easy and useful to design an interface and to retrofit this into HashMap (or a new class)? Something like this, except with better names:



  interface Hasharator<T> {
int alternativeHashCode(T t);
boolean alternativeEquals(T t1, T t2);
}

class HasharatorMap<K, V> {
HasharatorMap(Hasharator<? super K> hasharator) { ... }
}

class HasharatorSet<T> {
HasharatorSet(Hasharator<? super T> hasharator) { ... }
}


The case insensitive Map problem gets a trivial solution:



 new HasharatorMap(String.CASE_INSENSITIVE_EQUALITY);


Would this be doable, or can you see any fundamental problems with this approach?



Is the approach used in any existing (non-JRE) libs? (Tried google, no luck.)



EDIT: Nice workaround presented by hazzen, but I'm afraid this is the workaround I'm trying to avoid... ;)



EDIT: Changed title to no longer mention "Comparator"; I suspect this was a bit confusing.



EDIT: Accepted answer with relation to performance; would love a more specific answer!



EDIT: There is an implementation; see the accepted answer below.



EDIT: Rephrased the first sentence to indicate more clearly that it's the side-loading I'm after (and not ordering; ordering does not belong in HashMap).










share|improve this question

























  • "This class makes no guarantees as to the order of the map; in particular, it does not guarantee that the order will remain constant over time." -- HashMap's Javadocs. In other words, HashMap isn't ordered.

    – Powerlord
    Dec 9 '09 at 20:55











  • This statement allows ANY hashCode implementation to be used and also allows the Map to resize itself as it goes. So this is a feature and not a problem in this context?

    – volley
    Dec 9 '09 at 21:01
















13















With a TreeMap it's trivial to provide a custom Comparator, thus overriding the semantics provided by Comparable objects added to the map. HashMaps however cannot be controlled in this manner; the functions providing hash values and equality checks cannot be 'side-loaded'.



I suspect it would be both easy and useful to design an interface and to retrofit this into HashMap (or a new class)? Something like this, except with better names:



  interface Hasharator<T> {
int alternativeHashCode(T t);
boolean alternativeEquals(T t1, T t2);
}

class HasharatorMap<K, V> {
HasharatorMap(Hasharator<? super K> hasharator) { ... }
}

class HasharatorSet<T> {
HasharatorSet(Hasharator<? super T> hasharator) { ... }
}


The case insensitive Map problem gets a trivial solution:



 new HasharatorMap(String.CASE_INSENSITIVE_EQUALITY);


Would this be doable, or can you see any fundamental problems with this approach?



Is the approach used in any existing (non-JRE) libs? (Tried google, no luck.)



EDIT: Nice workaround presented by hazzen, but I'm afraid this is the workaround I'm trying to avoid... ;)



EDIT: Changed title to no longer mention "Comparator"; I suspect this was a bit confusing.



EDIT: Accepted answer with relation to performance; would love a more specific answer!



EDIT: There is an implementation; see the accepted answer below.



EDIT: Rephrased the first sentence to indicate more clearly that it's the side-loading I'm after (and not ordering; ordering does not belong in HashMap).










share|improve this question

























  • "This class makes no guarantees as to the order of the map; in particular, it does not guarantee that the order will remain constant over time." -- HashMap's Javadocs. In other words, HashMap isn't ordered.

    – Powerlord
    Dec 9 '09 at 20:55











  • This statement allows ANY hashCode implementation to be used and also allows the Map to resize itself as it goes. So this is a feature and not a problem in this context?

    – volley
    Dec 9 '09 at 21:01














13












13








13


2






With a TreeMap it's trivial to provide a custom Comparator, thus overriding the semantics provided by Comparable objects added to the map. HashMaps however cannot be controlled in this manner; the functions providing hash values and equality checks cannot be 'side-loaded'.



I suspect it would be both easy and useful to design an interface and to retrofit this into HashMap (or a new class)? Something like this, except with better names:



  interface Hasharator<T> {
int alternativeHashCode(T t);
boolean alternativeEquals(T t1, T t2);
}

class HasharatorMap<K, V> {
HasharatorMap(Hasharator<? super K> hasharator) { ... }
}

class HasharatorSet<T> {
HasharatorSet(Hasharator<? super T> hasharator) { ... }
}


The case insensitive Map problem gets a trivial solution:



 new HasharatorMap(String.CASE_INSENSITIVE_EQUALITY);


Would this be doable, or can you see any fundamental problems with this approach?



Is the approach used in any existing (non-JRE) libs? (Tried google, no luck.)



EDIT: Nice workaround presented by hazzen, but I'm afraid this is the workaround I'm trying to avoid... ;)



EDIT: Changed title to no longer mention "Comparator"; I suspect this was a bit confusing.



EDIT: Accepted answer with relation to performance; would love a more specific answer!



EDIT: There is an implementation; see the accepted answer below.



EDIT: Rephrased the first sentence to indicate more clearly that it's the side-loading I'm after (and not ordering; ordering does not belong in HashMap).










share|improve this question
















With a TreeMap it's trivial to provide a custom Comparator, thus overriding the semantics provided by Comparable objects added to the map. HashMaps however cannot be controlled in this manner; the functions providing hash values and equality checks cannot be 'side-loaded'.



I suspect it would be both easy and useful to design an interface and to retrofit this into HashMap (or a new class)? Something like this, except with better names:



  interface Hasharator<T> {
int alternativeHashCode(T t);
boolean alternativeEquals(T t1, T t2);
}

class HasharatorMap<K, V> {
HasharatorMap(Hasharator<? super K> hasharator) { ... }
}

class HasharatorSet<T> {
HasharatorSet(Hasharator<? super T> hasharator) { ... }
}


The case insensitive Map problem gets a trivial solution:



 new HasharatorMap(String.CASE_INSENSITIVE_EQUALITY);


Would this be doable, or can you see any fundamental problems with this approach?



Is the approach used in any existing (non-JRE) libs? (Tried google, no luck.)



EDIT: Nice workaround presented by hazzen, but I'm afraid this is the workaround I'm trying to avoid... ;)



EDIT: Changed title to no longer mention "Comparator"; I suspect this was a bit confusing.



EDIT: Accepted answer with relation to performance; would love a more specific answer!



EDIT: There is an implementation; see the accepted answer below.



EDIT: Rephrased the first sentence to indicate more clearly that it's the side-loading I'm after (and not ordering; ordering does not belong in HashMap).







java collections hashmap trove4j






share|improve this question















share|improve this question













share|improve this question




share|improve this question








edited May 23 '17 at 11:53









Community

11




11










asked Oct 17 '08 at 23:06









volleyvolley

5,77912327




5,77912327













  • "This class makes no guarantees as to the order of the map; in particular, it does not guarantee that the order will remain constant over time." -- HashMap's Javadocs. In other words, HashMap isn't ordered.

    – Powerlord
    Dec 9 '09 at 20:55











  • This statement allows ANY hashCode implementation to be used and also allows the Map to resize itself as it goes. So this is a feature and not a problem in this context?

    – volley
    Dec 9 '09 at 21:01



















  • "This class makes no guarantees as to the order of the map; in particular, it does not guarantee that the order will remain constant over time." -- HashMap's Javadocs. In other words, HashMap isn't ordered.

    – Powerlord
    Dec 9 '09 at 20:55











  • This statement allows ANY hashCode implementation to be used and also allows the Map to resize itself as it goes. So this is a feature and not a problem in this context?

    – volley
    Dec 9 '09 at 21:01

















"This class makes no guarantees as to the order of the map; in particular, it does not guarantee that the order will remain constant over time." -- HashMap's Javadocs. In other words, HashMap isn't ordered.

– Powerlord
Dec 9 '09 at 20:55





"This class makes no guarantees as to the order of the map; in particular, it does not guarantee that the order will remain constant over time." -- HashMap's Javadocs. In other words, HashMap isn't ordered.

– Powerlord
Dec 9 '09 at 20:55













This statement allows ANY hashCode implementation to be used and also allows the Map to resize itself as it goes. So this is a feature and not a problem in this context?

– volley
Dec 9 '09 at 21:01





This statement allows ANY hashCode implementation to be used and also allows the Map to resize itself as it goes. So this is a feature and not a problem in this context?

– volley
Dec 9 '09 at 21:01












9 Answers
9






active

oldest

votes


















9














A bit late for you, but for future visitors, it might be worth knowing that commons-collections has an AbstractHashedMap (in 3.2.2 and with generics in 4.0). You can override these protected methods to achieve your desired behaviour:



protected int hash(Object key) { ... }
protected boolean isEqualKey(Object key1, Object key2) { ... }
protected boolean isEqualValue(Object value1, Object value2) { ... }
protected HashEntry createEntry(
HashEntry next, int hashCode, Object key, Object value) { ... }


An example implementation of such an alternative HashedMap is commons-collections' own IdentityMap (only up to 3.2.2 as Java has its own since 1.4).



This is not as powerful as providing an external "Hasharator" to a Map instance. You have to implement a new map class for every hashing strategy (composition vs. inheritance striking back...). But it's still good to know.






share|improve this answer





















  • 1





    PlusOne. You might want to update that link to AbstractHashedMap to point to v4 which finally has generics.

    – Nicolai
    Dec 30 '14 at 9:59



















8














.NET has this via IEqualityComparer (for a type which can compare two objects) and IEquatable (for a type which can compare itself to another instance).



In fact, I believe it was a mistake to define equality and hashcodes in java.lang.Object or System.Object at all. Equality in particular is hard to define in a way which makes sense with inheritance. I keep meaning to blog about this...



But yes, basically the idea is sound.






share|improve this answer
























  • And it accounts for the concept that there may be more than one concept of equality for a given type.

    – Kenogu Labz
    Mar 28 '16 at 20:18



















6














HashingStrategy is the concept you're looking for. It's a strategy interface that allows you to define custom implementations of equals and hashcode.



public interface HashingStrategy<E>
{
int computeHashCode(E object);
boolean equals(E object1, E object2);
}


You can't use a HashingStrategy with the built in HashSet or HashMap. GS Collections includes a java.util.Set called UnifiedSetWithHashingStrategy and a java.util.Map called UnifiedMapWithHashingStrategy.



Let's look at an example.



public class Data
{
private final int id;

public Data(int id)
{
this.id = id;
}

public int getId()
{
return id;
}

// No equals or hashcode
}


Here's how you might set up a UnifiedSetWithHashingStrategy and use it.



java.util.Set<Data> set =
new UnifiedSetWithHashingStrategy<>(HashingStrategies.fromFunction(Data::getId));
Assert.assertTrue(set.add(new Data(1)));

// contains returns true even without hashcode and equals
Assert.assertTrue(set.contains(new Data(1)));

// Second call to add() doesn't do anything and returns false
Assert.assertFalse(set.add(new Data(1)));


Why not just use a Map? UnifiedSetWithHashingStrategy uses half the memory of a UnifiedMap, and one quarter the memory of a HashMap. And sometimes you don't have a convenient key and have to create a synthetic one, like a tuple. That can waste more memory.



How do we perform lookups? Remember that Sets have contains(), but not get(). UnifiedSetWithHashingStrategy implements Pool in addition to Set, so it also implements a form of get().



Here's a simple approach to handle case-insensitive Strings.



UnifiedSetWithHashingStrategy<String> set = 
new UnifiedSetWithHashingStrategy<>(HashingStrategies.fromFunction(String::toLowerCase));
set.add("ABC");
Assert.assertTrue(set.contains("ABC"));
Assert.assertTrue(set.contains("abc"));
Assert.assertFalse(set.contains("def"));
Assert.assertEquals("ABC", set.get("aBc"));


This shows off the API, but it's not appropriate for production. The problem is that the HashingStrategy constantly delegates to String.toLowerCase() which creates a bunch of garbage Strings. Here's how you can create an efficient hashing strategy for case-insensitive Strings.



public static final HashingStrategy<String> CASE_INSENSITIVE =
new HashingStrategy<String>()
{
@Override
public int computeHashCode(String string)
{
int hashCode = 0;
for (int i = 0; i < string.length(); i++)
{
hashCode = 31 * hashCode + Character.toLowerCase(string.charAt(i));
}
return hashCode;
}

@Override
public boolean equals(String string1, String string2)
{
return string1.equalsIgnoreCase(string2);
}
};


Note: I am a developer on GS collections.






share|improve this answer

































    4














    Trove4j has the feature I'm after and they call it hashing strategies.



    Their map has an implementation with different limitations and thus different prerequisites, so this does not implicitly mean that an implementation for Java's "native" HashMap would be feasible.






    share|improve this answer

































      3














      Note: As noted in all other answers, HashMaps don't have an explicit ordering. They only recognize "equality". Getting an order out of a hash-based data structure is meaningless, as each object is turned into a hash - essentially a random number.



      You can always write a hash function for a class (and often times must), as long as you do it carefully. This is a hard thing to do properly because hash-based data structures rely on a random, uniform distribution of hash values. In Effective Java, there is a large amount of text devoted to properly implementing a hash method with good behaviour.



      With all that being said, if you just want your hashing to ignore the case of a String, you can write a wrapper class around String for this purpose and insert those in your data structure instead.



      A simple implementation:



      public class LowerStringWrapper {
      public LowerStringWrapper(String s) {
      this.s = s;
      this.lowerString = s.toLowerString();
      }

      // getter methods omitted

      // Rely on the hashing of String, as we know it to be good.
      public int hashCode() { return lowerString.hashCode(); }

      // We overrode hashCode, so we MUST also override equals. It is required
      // that if a.equals(b), then a.hashCode() == b.hashCode(), so we must
      // restore that invariant.
      public boolean equals(Object obj) {
      if (obj instanceof LowerStringWrapper) {
      return lowerString.equals(((LowerStringWrapper)obj).lowerString;
      } else {
      return lowerString.equals(obj);
      }
      }

      private String s;
      private String lowerString;
      }





      share|improve this answer

































        0














        good question, ask josh bloch. i submitted that concept as an RFE in java 7, but it was dropped, i believe the reason was something performance related. i agree, though, should have been done.






        share|improve this answer
























        • Hmm. Maybe it's because you miss out on the opportunity to cache the calculated hash codes..

          – volley
          Oct 18 '08 at 14:21



















        0














        I suspect this has not been done because it would prevent hashCode caching?



        I attempted creating a generic Map solution where all keys are silently wrapped. It turned out that the wrapper would have to hold the wrapped object, the cached hashCode and a reference to the callback interface responsible for equality-checks. This is obviously not as efficient as using a wrapper class, where you'd only have to cache the original key plus one more object (see hazzens answer).



        (I also bumped into a problem related to generics; the get-method accepts Object as input, so the callback interface responsible for hashing would have to perform an additional instanceof-check. Either that, or the map class would have to know the Class of its keys.)






        share|improve this answer































          0














          This is an interesting idea, but it's absolutely horrendous for performance. The reason for this is quite fundamental to the idea of a hashtable: the ordering cannot be relied upon. Hashtables are very fast (constant time) because of the way in which they index elements in the table: by computing a pseudo-unique integer hash for that element and accessing that location in an array. It's literally computing a location in memory and directly storing the element.



          This contrasts with a balanced binary search tree (TreeMap) which must start at the root and work its way down to the desired node every time a lookup is required. Wikipedia has some more in-depth analysis. To summarize, the efficiency of a tree map is dependent upon a consistent ordering, thus the order of the elements is predictable and sane. However, because of the performance hit imposed by the "traverse to your destination" approach, BSTs are only able to provide O(log(n)) performance. For large maps, this can be a significant performance hit.



          It is possible to impose a consistent ordering on a hashtable, but to do so involves using techniques similar to LinkedHashMap and manually maintaining the ordering. Alternatively, two separate data structures can be maintained internally: a hashtable and a tree. The table can be used for lookups, while the tree can be used for iteration. The problem of course is this uses more than double the required memory. Also, insertions are only as fast as the tree: O(log(n)). Concurrent tricks can bring this down a bit, but that isn't a reliable performance optimization.



          In short, your idea sounds really good, but if you actually tried to implement it, you would see that to do so would impose massive performance limitations. The final verdict is (and has been for decades): if you need performance, use a hashtable; if you need ordering and can live with degraded performance, use a balanced binary search tree. I'm afraid there's really no efficiently combining the two structures without losing some of the guarantees of one or the other.






          share|improve this answer



















          • 1





            I don't think your answer has much to do with the question. volley just wants to use a HashTable where the hash function is user-specified, instead of the default Object.hashCode().

            – Adam Rosenfield
            Oct 18 '08 at 16:49











          • No, I think he wants a little more than that. His proposed "solution" is to impose ordering using an alternative hash code, but that's not going to work (hashing into a limited domain). To order a hashtable, some ancillary structure is needed.

            – Daniel Spiewak
            Oct 18 '08 at 16:53






          • 1





            Hmm actually I think Adam is right; note that the interface I suggest contains one method for calculating the hash and one method for checking if two objects are equal. Ordering is not in there! The Comparator is mentioned as an analogy. (By the way, löve the Darwinian logo, Daniel!)

            – volley
            Oct 21 '08 at 20:09











          • You're completely wrong, just look at the Hasharator, there's nothing like ordering there. Look at com.google.common.base.Equivalence and their CustomConcurrentHashMap, that's exactly it.

            – maaartinus
            Apr 18 '11 at 10:49






          • 1





            Before telling me I'm completely wrong, it might be worth looking at the edit history of the question. What I replied to was something very different from what we have now.

            – Daniel Spiewak
            Apr 18 '11 at 20:00



















          0














          There's such a feature in com.google.common.collect.CustomConcurrentHashMap, unfortunately, there's currently no public way how to set the Equivalence (their Hasharator). Maybe they're not yet done with it, maybe they don't consider the feature to be useful enough. Ask at the guava mailing list.



          I wonder why it haven't happened yet, as it was mentioned in this talk over two years ago.






          share|improve this answer

























            Your Answer






            StackExchange.ifUsing("editor", function () {
            StackExchange.using("externalEditor", function () {
            StackExchange.using("snippets", function () {
            StackExchange.snippets.init();
            });
            });
            }, "code-snippets");

            StackExchange.ready(function() {
            var channelOptions = {
            tags: "".split(" "),
            id: "1"
            };
            initTagRenderer("".split(" "), "".split(" "), channelOptions);

            StackExchange.using("externalEditor", function() {
            // Have to fire editor after snippets, if snippets enabled
            if (StackExchange.settings.snippets.snippetsEnabled) {
            StackExchange.using("snippets", function() {
            createEditor();
            });
            }
            else {
            createEditor();
            }
            });

            function createEditor() {
            StackExchange.prepareEditor({
            heartbeatType: 'answer',
            autoActivateHeartbeat: false,
            convertImagesToLinks: true,
            noModals: true,
            showLowRepImageUploadWarning: true,
            reputationToPostImages: 10,
            bindNavPrevention: true,
            postfix: "",
            imageUploader: {
            brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
            contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
            allowUrls: true
            },
            onDemand: true,
            discardSelector: ".discard-answer"
            ,immediatelyShowMarkdownHelp:true
            });


            }
            });














            draft saved

            draft discarded


















            StackExchange.ready(
            function () {
            StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f214136%2fwhy-not-allow-an-external-interface-to-provide-hashcode-equals-for-a-hashmap%23new-answer', 'question_page');
            }
            );

            Post as a guest















            Required, but never shown

























            9 Answers
            9






            active

            oldest

            votes








            9 Answers
            9






            active

            oldest

            votes









            active

            oldest

            votes






            active

            oldest

            votes









            9














            A bit late for you, but for future visitors, it might be worth knowing that commons-collections has an AbstractHashedMap (in 3.2.2 and with generics in 4.0). You can override these protected methods to achieve your desired behaviour:



            protected int hash(Object key) { ... }
            protected boolean isEqualKey(Object key1, Object key2) { ... }
            protected boolean isEqualValue(Object value1, Object value2) { ... }
            protected HashEntry createEntry(
            HashEntry next, int hashCode, Object key, Object value) { ... }


            An example implementation of such an alternative HashedMap is commons-collections' own IdentityMap (only up to 3.2.2 as Java has its own since 1.4).



            This is not as powerful as providing an external "Hasharator" to a Map instance. You have to implement a new map class for every hashing strategy (composition vs. inheritance striking back...). But it's still good to know.






            share|improve this answer





















            • 1





              PlusOne. You might want to update that link to AbstractHashedMap to point to v4 which finally has generics.

              – Nicolai
              Dec 30 '14 at 9:59
















            9














            A bit late for you, but for future visitors, it might be worth knowing that commons-collections has an AbstractHashedMap (in 3.2.2 and with generics in 4.0). You can override these protected methods to achieve your desired behaviour:



            protected int hash(Object key) { ... }
            protected boolean isEqualKey(Object key1, Object key2) { ... }
            protected boolean isEqualValue(Object value1, Object value2) { ... }
            protected HashEntry createEntry(
            HashEntry next, int hashCode, Object key, Object value) { ... }


            An example implementation of such an alternative HashedMap is commons-collections' own IdentityMap (only up to 3.2.2 as Java has its own since 1.4).



            This is not as powerful as providing an external "Hasharator" to a Map instance. You have to implement a new map class for every hashing strategy (composition vs. inheritance striking back...). But it's still good to know.






            share|improve this answer





















            • 1





              PlusOne. You might want to update that link to AbstractHashedMap to point to v4 which finally has generics.

              – Nicolai
              Dec 30 '14 at 9:59














            9












            9








            9







            A bit late for you, but for future visitors, it might be worth knowing that commons-collections has an AbstractHashedMap (in 3.2.2 and with generics in 4.0). You can override these protected methods to achieve your desired behaviour:



            protected int hash(Object key) { ... }
            protected boolean isEqualKey(Object key1, Object key2) { ... }
            protected boolean isEqualValue(Object value1, Object value2) { ... }
            protected HashEntry createEntry(
            HashEntry next, int hashCode, Object key, Object value) { ... }


            An example implementation of such an alternative HashedMap is commons-collections' own IdentityMap (only up to 3.2.2 as Java has its own since 1.4).



            This is not as powerful as providing an external "Hasharator" to a Map instance. You have to implement a new map class for every hashing strategy (composition vs. inheritance striking back...). But it's still good to know.






            share|improve this answer















            A bit late for you, but for future visitors, it might be worth knowing that commons-collections has an AbstractHashedMap (in 3.2.2 and with generics in 4.0). You can override these protected methods to achieve your desired behaviour:



            protected int hash(Object key) { ... }
            protected boolean isEqualKey(Object key1, Object key2) { ... }
            protected boolean isEqualValue(Object value1, Object value2) { ... }
            protected HashEntry createEntry(
            HashEntry next, int hashCode, Object key, Object value) { ... }


            An example implementation of such an alternative HashedMap is commons-collections' own IdentityMap (only up to 3.2.2 as Java has its own since 1.4).



            This is not as powerful as providing an external "Hasharator" to a Map instance. You have to implement a new map class for every hashing strategy (composition vs. inheritance striking back...). But it's still good to know.







            share|improve this answer














            share|improve this answer



            share|improve this answer








            edited Nov 15 '18 at 0:41









            caco3

            1,8481719




            1,8481719










            answered Nov 17 '13 at 12:43









            Lukas EderLukas Eder

            134k72438967




            134k72438967








            • 1





              PlusOne. You might want to update that link to AbstractHashedMap to point to v4 which finally has generics.

              – Nicolai
              Dec 30 '14 at 9:59














            • 1





              PlusOne. You might want to update that link to AbstractHashedMap to point to v4 which finally has generics.

              – Nicolai
              Dec 30 '14 at 9:59








            1




            1





            PlusOne. You might want to update that link to AbstractHashedMap to point to v4 which finally has generics.

            – Nicolai
            Dec 30 '14 at 9:59





            PlusOne. You might want to update that link to AbstractHashedMap to point to v4 which finally has generics.

            – Nicolai
            Dec 30 '14 at 9:59













            8














            .NET has this via IEqualityComparer (for a type which can compare two objects) and IEquatable (for a type which can compare itself to another instance).



            In fact, I believe it was a mistake to define equality and hashcodes in java.lang.Object or System.Object at all. Equality in particular is hard to define in a way which makes sense with inheritance. I keep meaning to blog about this...



            But yes, basically the idea is sound.






            share|improve this answer
























            • And it accounts for the concept that there may be more than one concept of equality for a given type.

              – Kenogu Labz
              Mar 28 '16 at 20:18
















            8














            .NET has this via IEqualityComparer (for a type which can compare two objects) and IEquatable (for a type which can compare itself to another instance).



            In fact, I believe it was a mistake to define equality and hashcodes in java.lang.Object or System.Object at all. Equality in particular is hard to define in a way which makes sense with inheritance. I keep meaning to blog about this...



            But yes, basically the idea is sound.






            share|improve this answer
























            • And it accounts for the concept that there may be more than one concept of equality for a given type.

              – Kenogu Labz
              Mar 28 '16 at 20:18














            8












            8








            8







            .NET has this via IEqualityComparer (for a type which can compare two objects) and IEquatable (for a type which can compare itself to another instance).



            In fact, I believe it was a mistake to define equality and hashcodes in java.lang.Object or System.Object at all. Equality in particular is hard to define in a way which makes sense with inheritance. I keep meaning to blog about this...



            But yes, basically the idea is sound.






            share|improve this answer













            .NET has this via IEqualityComparer (for a type which can compare two objects) and IEquatable (for a type which can compare itself to another instance).



            In fact, I believe it was a mistake to define equality and hashcodes in java.lang.Object or System.Object at all. Equality in particular is hard to define in a way which makes sense with inheritance. I keep meaning to blog about this...



            But yes, basically the idea is sound.







            share|improve this answer












            share|improve this answer



            share|improve this answer










            answered Oct 17 '08 at 23:40









            Jon SkeetJon Skeet

            1088k68979438440




            1088k68979438440













            • And it accounts for the concept that there may be more than one concept of equality for a given type.

              – Kenogu Labz
              Mar 28 '16 at 20:18



















            • And it accounts for the concept that there may be more than one concept of equality for a given type.

              – Kenogu Labz
              Mar 28 '16 at 20:18

















            And it accounts for the concept that there may be more than one concept of equality for a given type.

            – Kenogu Labz
            Mar 28 '16 at 20:18





            And it accounts for the concept that there may be more than one concept of equality for a given type.

            – Kenogu Labz
            Mar 28 '16 at 20:18











            6














            HashingStrategy is the concept you're looking for. It's a strategy interface that allows you to define custom implementations of equals and hashcode.



            public interface HashingStrategy<E>
            {
            int computeHashCode(E object);
            boolean equals(E object1, E object2);
            }


            You can't use a HashingStrategy with the built in HashSet or HashMap. GS Collections includes a java.util.Set called UnifiedSetWithHashingStrategy and a java.util.Map called UnifiedMapWithHashingStrategy.



            Let's look at an example.



            public class Data
            {
            private final int id;

            public Data(int id)
            {
            this.id = id;
            }

            public int getId()
            {
            return id;
            }

            // No equals or hashcode
            }


            Here's how you might set up a UnifiedSetWithHashingStrategy and use it.



            java.util.Set<Data> set =
            new UnifiedSetWithHashingStrategy<>(HashingStrategies.fromFunction(Data::getId));
            Assert.assertTrue(set.add(new Data(1)));

            // contains returns true even without hashcode and equals
            Assert.assertTrue(set.contains(new Data(1)));

            // Second call to add() doesn't do anything and returns false
            Assert.assertFalse(set.add(new Data(1)));


            Why not just use a Map? UnifiedSetWithHashingStrategy uses half the memory of a UnifiedMap, and one quarter the memory of a HashMap. And sometimes you don't have a convenient key and have to create a synthetic one, like a tuple. That can waste more memory.



            How do we perform lookups? Remember that Sets have contains(), but not get(). UnifiedSetWithHashingStrategy implements Pool in addition to Set, so it also implements a form of get().



            Here's a simple approach to handle case-insensitive Strings.



            UnifiedSetWithHashingStrategy<String> set = 
            new UnifiedSetWithHashingStrategy<>(HashingStrategies.fromFunction(String::toLowerCase));
            set.add("ABC");
            Assert.assertTrue(set.contains("ABC"));
            Assert.assertTrue(set.contains("abc"));
            Assert.assertFalse(set.contains("def"));
            Assert.assertEquals("ABC", set.get("aBc"));


            This shows off the API, but it's not appropriate for production. The problem is that the HashingStrategy constantly delegates to String.toLowerCase() which creates a bunch of garbage Strings. Here's how you can create an efficient hashing strategy for case-insensitive Strings.



            public static final HashingStrategy<String> CASE_INSENSITIVE =
            new HashingStrategy<String>()
            {
            @Override
            public int computeHashCode(String string)
            {
            int hashCode = 0;
            for (int i = 0; i < string.length(); i++)
            {
            hashCode = 31 * hashCode + Character.toLowerCase(string.charAt(i));
            }
            return hashCode;
            }

            @Override
            public boolean equals(String string1, String string2)
            {
            return string1.equalsIgnoreCase(string2);
            }
            };


            Note: I am a developer on GS collections.






            share|improve this answer






























              6














              HashingStrategy is the concept you're looking for. It's a strategy interface that allows you to define custom implementations of equals and hashcode.



              public interface HashingStrategy<E>
              {
              int computeHashCode(E object);
              boolean equals(E object1, E object2);
              }


              You can't use a HashingStrategy with the built in HashSet or HashMap. GS Collections includes a java.util.Set called UnifiedSetWithHashingStrategy and a java.util.Map called UnifiedMapWithHashingStrategy.



              Let's look at an example.



              public class Data
              {
              private final int id;

              public Data(int id)
              {
              this.id = id;
              }

              public int getId()
              {
              return id;
              }

              // No equals or hashcode
              }


              Here's how you might set up a UnifiedSetWithHashingStrategy and use it.



              java.util.Set<Data> set =
              new UnifiedSetWithHashingStrategy<>(HashingStrategies.fromFunction(Data::getId));
              Assert.assertTrue(set.add(new Data(1)));

              // contains returns true even without hashcode and equals
              Assert.assertTrue(set.contains(new Data(1)));

              // Second call to add() doesn't do anything and returns false
              Assert.assertFalse(set.add(new Data(1)));


              Why not just use a Map? UnifiedSetWithHashingStrategy uses half the memory of a UnifiedMap, and one quarter the memory of a HashMap. And sometimes you don't have a convenient key and have to create a synthetic one, like a tuple. That can waste more memory.



              How do we perform lookups? Remember that Sets have contains(), but not get(). UnifiedSetWithHashingStrategy implements Pool in addition to Set, so it also implements a form of get().



              Here's a simple approach to handle case-insensitive Strings.



              UnifiedSetWithHashingStrategy<String> set = 
              new UnifiedSetWithHashingStrategy<>(HashingStrategies.fromFunction(String::toLowerCase));
              set.add("ABC");
              Assert.assertTrue(set.contains("ABC"));
              Assert.assertTrue(set.contains("abc"));
              Assert.assertFalse(set.contains("def"));
              Assert.assertEquals("ABC", set.get("aBc"));


              This shows off the API, but it's not appropriate for production. The problem is that the HashingStrategy constantly delegates to String.toLowerCase() which creates a bunch of garbage Strings. Here's how you can create an efficient hashing strategy for case-insensitive Strings.



              public static final HashingStrategy<String> CASE_INSENSITIVE =
              new HashingStrategy<String>()
              {
              @Override
              public int computeHashCode(String string)
              {
              int hashCode = 0;
              for (int i = 0; i < string.length(); i++)
              {
              hashCode = 31 * hashCode + Character.toLowerCase(string.charAt(i));
              }
              return hashCode;
              }

              @Override
              public boolean equals(String string1, String string2)
              {
              return string1.equalsIgnoreCase(string2);
              }
              };


              Note: I am a developer on GS collections.






              share|improve this answer




























                6












                6








                6







                HashingStrategy is the concept you're looking for. It's a strategy interface that allows you to define custom implementations of equals and hashcode.



                public interface HashingStrategy<E>
                {
                int computeHashCode(E object);
                boolean equals(E object1, E object2);
                }


                You can't use a HashingStrategy with the built in HashSet or HashMap. GS Collections includes a java.util.Set called UnifiedSetWithHashingStrategy and a java.util.Map called UnifiedMapWithHashingStrategy.



                Let's look at an example.



                public class Data
                {
                private final int id;

                public Data(int id)
                {
                this.id = id;
                }

                public int getId()
                {
                return id;
                }

                // No equals or hashcode
                }


                Here's how you might set up a UnifiedSetWithHashingStrategy and use it.



                java.util.Set<Data> set =
                new UnifiedSetWithHashingStrategy<>(HashingStrategies.fromFunction(Data::getId));
                Assert.assertTrue(set.add(new Data(1)));

                // contains returns true even without hashcode and equals
                Assert.assertTrue(set.contains(new Data(1)));

                // Second call to add() doesn't do anything and returns false
                Assert.assertFalse(set.add(new Data(1)));


                Why not just use a Map? UnifiedSetWithHashingStrategy uses half the memory of a UnifiedMap, and one quarter the memory of a HashMap. And sometimes you don't have a convenient key and have to create a synthetic one, like a tuple. That can waste more memory.



                How do we perform lookups? Remember that Sets have contains(), but not get(). UnifiedSetWithHashingStrategy implements Pool in addition to Set, so it also implements a form of get().



                Here's a simple approach to handle case-insensitive Strings.



                UnifiedSetWithHashingStrategy<String> set = 
                new UnifiedSetWithHashingStrategy<>(HashingStrategies.fromFunction(String::toLowerCase));
                set.add("ABC");
                Assert.assertTrue(set.contains("ABC"));
                Assert.assertTrue(set.contains("abc"));
                Assert.assertFalse(set.contains("def"));
                Assert.assertEquals("ABC", set.get("aBc"));


                This shows off the API, but it's not appropriate for production. The problem is that the HashingStrategy constantly delegates to String.toLowerCase() which creates a bunch of garbage Strings. Here's how you can create an efficient hashing strategy for case-insensitive Strings.



                public static final HashingStrategy<String> CASE_INSENSITIVE =
                new HashingStrategy<String>()
                {
                @Override
                public int computeHashCode(String string)
                {
                int hashCode = 0;
                for (int i = 0; i < string.length(); i++)
                {
                hashCode = 31 * hashCode + Character.toLowerCase(string.charAt(i));
                }
                return hashCode;
                }

                @Override
                public boolean equals(String string1, String string2)
                {
                return string1.equalsIgnoreCase(string2);
                }
                };


                Note: I am a developer on GS collections.






                share|improve this answer















                HashingStrategy is the concept you're looking for. It's a strategy interface that allows you to define custom implementations of equals and hashcode.



                public interface HashingStrategy<E>
                {
                int computeHashCode(E object);
                boolean equals(E object1, E object2);
                }


                You can't use a HashingStrategy with the built in HashSet or HashMap. GS Collections includes a java.util.Set called UnifiedSetWithHashingStrategy and a java.util.Map called UnifiedMapWithHashingStrategy.



                Let's look at an example.



                public class Data
                {
                private final int id;

                public Data(int id)
                {
                this.id = id;
                }

                public int getId()
                {
                return id;
                }

                // No equals or hashcode
                }


                Here's how you might set up a UnifiedSetWithHashingStrategy and use it.



                java.util.Set<Data> set =
                new UnifiedSetWithHashingStrategy<>(HashingStrategies.fromFunction(Data::getId));
                Assert.assertTrue(set.add(new Data(1)));

                // contains returns true even without hashcode and equals
                Assert.assertTrue(set.contains(new Data(1)));

                // Second call to add() doesn't do anything and returns false
                Assert.assertFalse(set.add(new Data(1)));


                Why not just use a Map? UnifiedSetWithHashingStrategy uses half the memory of a UnifiedMap, and one quarter the memory of a HashMap. And sometimes you don't have a convenient key and have to create a synthetic one, like a tuple. That can waste more memory.



                How do we perform lookups? Remember that Sets have contains(), but not get(). UnifiedSetWithHashingStrategy implements Pool in addition to Set, so it also implements a form of get().



                Here's a simple approach to handle case-insensitive Strings.



                UnifiedSetWithHashingStrategy<String> set = 
                new UnifiedSetWithHashingStrategy<>(HashingStrategies.fromFunction(String::toLowerCase));
                set.add("ABC");
                Assert.assertTrue(set.contains("ABC"));
                Assert.assertTrue(set.contains("abc"));
                Assert.assertFalse(set.contains("def"));
                Assert.assertEquals("ABC", set.get("aBc"));


                This shows off the API, but it's not appropriate for production. The problem is that the HashingStrategy constantly delegates to String.toLowerCase() which creates a bunch of garbage Strings. Here's how you can create an efficient hashing strategy for case-insensitive Strings.



                public static final HashingStrategy<String> CASE_INSENSITIVE =
                new HashingStrategy<String>()
                {
                @Override
                public int computeHashCode(String string)
                {
                int hashCode = 0;
                for (int i = 0; i < string.length(); i++)
                {
                hashCode = 31 * hashCode + Character.toLowerCase(string.charAt(i));
                }
                return hashCode;
                }

                @Override
                public boolean equals(String string1, String string2)
                {
                return string1.equalsIgnoreCase(string2);
                }
                };


                Note: I am a developer on GS collections.







                share|improve this answer














                share|improve this answer



                share|improve this answer








                edited Mar 4 '15 at 20:55

























                answered Dec 31 '14 at 23:31









                Craig P. MotlinCraig P. Motlin

                18.4k1586120




                18.4k1586120























                    4














                    Trove4j has the feature I'm after and they call it hashing strategies.



                    Their map has an implementation with different limitations and thus different prerequisites, so this does not implicitly mean that an implementation for Java's "native" HashMap would be feasible.






                    share|improve this answer






























                      4














                      Trove4j has the feature I'm after and they call it hashing strategies.



                      Their map has an implementation with different limitations and thus different prerequisites, so this does not implicitly mean that an implementation for Java's "native" HashMap would be feasible.






                      share|improve this answer




























                        4












                        4








                        4







                        Trove4j has the feature I'm after and they call it hashing strategies.



                        Their map has an implementation with different limitations and thus different prerequisites, so this does not implicitly mean that an implementation for Java's "native" HashMap would be feasible.






                        share|improve this answer















                        Trove4j has the feature I'm after and they call it hashing strategies.



                        Their map has an implementation with different limitations and thus different prerequisites, so this does not implicitly mean that an implementation for Java's "native" HashMap would be feasible.







                        share|improve this answer














                        share|improve this answer



                        share|improve this answer








                        edited Dec 9 '09 at 20:48

























                        answered Dec 9 '09 at 20:42









                        volleyvolley

                        5,77912327




                        5,77912327























                            3














                            Note: As noted in all other answers, HashMaps don't have an explicit ordering. They only recognize "equality". Getting an order out of a hash-based data structure is meaningless, as each object is turned into a hash - essentially a random number.



                            You can always write a hash function for a class (and often times must), as long as you do it carefully. This is a hard thing to do properly because hash-based data structures rely on a random, uniform distribution of hash values. In Effective Java, there is a large amount of text devoted to properly implementing a hash method with good behaviour.



                            With all that being said, if you just want your hashing to ignore the case of a String, you can write a wrapper class around String for this purpose and insert those in your data structure instead.



                            A simple implementation:



                            public class LowerStringWrapper {
                            public LowerStringWrapper(String s) {
                            this.s = s;
                            this.lowerString = s.toLowerString();
                            }

                            // getter methods omitted

                            // Rely on the hashing of String, as we know it to be good.
                            public int hashCode() { return lowerString.hashCode(); }

                            // We overrode hashCode, so we MUST also override equals. It is required
                            // that if a.equals(b), then a.hashCode() == b.hashCode(), so we must
                            // restore that invariant.
                            public boolean equals(Object obj) {
                            if (obj instanceof LowerStringWrapper) {
                            return lowerString.equals(((LowerStringWrapper)obj).lowerString;
                            } else {
                            return lowerString.equals(obj);
                            }
                            }

                            private String s;
                            private String lowerString;
                            }





                            share|improve this answer






























                              3














                              Note: As noted in all other answers, HashMaps don't have an explicit ordering. They only recognize "equality". Getting an order out of a hash-based data structure is meaningless, as each object is turned into a hash - essentially a random number.



                              You can always write a hash function for a class (and often times must), as long as you do it carefully. This is a hard thing to do properly because hash-based data structures rely on a random, uniform distribution of hash values. In Effective Java, there is a large amount of text devoted to properly implementing a hash method with good behaviour.



                              With all that being said, if you just want your hashing to ignore the case of a String, you can write a wrapper class around String for this purpose and insert those in your data structure instead.



                              A simple implementation:



                              public class LowerStringWrapper {
                              public LowerStringWrapper(String s) {
                              this.s = s;
                              this.lowerString = s.toLowerString();
                              }

                              // getter methods omitted

                              // Rely on the hashing of String, as we know it to be good.
                              public int hashCode() { return lowerString.hashCode(); }

                              // We overrode hashCode, so we MUST also override equals. It is required
                              // that if a.equals(b), then a.hashCode() == b.hashCode(), so we must
                              // restore that invariant.
                              public boolean equals(Object obj) {
                              if (obj instanceof LowerStringWrapper) {
                              return lowerString.equals(((LowerStringWrapper)obj).lowerString;
                              } else {
                              return lowerString.equals(obj);
                              }
                              }

                              private String s;
                              private String lowerString;
                              }





                              share|improve this answer




























                                3












                                3








                                3







                                Note: As noted in all other answers, HashMaps don't have an explicit ordering. They only recognize "equality". Getting an order out of a hash-based data structure is meaningless, as each object is turned into a hash - essentially a random number.



                                You can always write a hash function for a class (and often times must), as long as you do it carefully. This is a hard thing to do properly because hash-based data structures rely on a random, uniform distribution of hash values. In Effective Java, there is a large amount of text devoted to properly implementing a hash method with good behaviour.



                                With all that being said, if you just want your hashing to ignore the case of a String, you can write a wrapper class around String for this purpose and insert those in your data structure instead.



                                A simple implementation:



                                public class LowerStringWrapper {
                                public LowerStringWrapper(String s) {
                                this.s = s;
                                this.lowerString = s.toLowerString();
                                }

                                // getter methods omitted

                                // Rely on the hashing of String, as we know it to be good.
                                public int hashCode() { return lowerString.hashCode(); }

                                // We overrode hashCode, so we MUST also override equals. It is required
                                // that if a.equals(b), then a.hashCode() == b.hashCode(), so we must
                                // restore that invariant.
                                public boolean equals(Object obj) {
                                if (obj instanceof LowerStringWrapper) {
                                return lowerString.equals(((LowerStringWrapper)obj).lowerString;
                                } else {
                                return lowerString.equals(obj);
                                }
                                }

                                private String s;
                                private String lowerString;
                                }





                                share|improve this answer















                                Note: As noted in all other answers, HashMaps don't have an explicit ordering. They only recognize "equality". Getting an order out of a hash-based data structure is meaningless, as each object is turned into a hash - essentially a random number.



                                You can always write a hash function for a class (and often times must), as long as you do it carefully. This is a hard thing to do properly because hash-based data structures rely on a random, uniform distribution of hash values. In Effective Java, there is a large amount of text devoted to properly implementing a hash method with good behaviour.



                                With all that being said, if you just want your hashing to ignore the case of a String, you can write a wrapper class around String for this purpose and insert those in your data structure instead.



                                A simple implementation:



                                public class LowerStringWrapper {
                                public LowerStringWrapper(String s) {
                                this.s = s;
                                this.lowerString = s.toLowerString();
                                }

                                // getter methods omitted

                                // Rely on the hashing of String, as we know it to be good.
                                public int hashCode() { return lowerString.hashCode(); }

                                // We overrode hashCode, so we MUST also override equals. It is required
                                // that if a.equals(b), then a.hashCode() == b.hashCode(), so we must
                                // restore that invariant.
                                public boolean equals(Object obj) {
                                if (obj instanceof LowerStringWrapper) {
                                return lowerString.equals(((LowerStringWrapper)obj).lowerString;
                                } else {
                                return lowerString.equals(obj);
                                }
                                }

                                private String s;
                                private String lowerString;
                                }






                                share|improve this answer














                                share|improve this answer



                                share|improve this answer








                                edited Oct 18 '08 at 18:33

























                                answered Oct 17 '08 at 23:09









                                hazzenhazzen

                                12.2k53533




                                12.2k53533























                                    0














                                    good question, ask josh bloch. i submitted that concept as an RFE in java 7, but it was dropped, i believe the reason was something performance related. i agree, though, should have been done.






                                    share|improve this answer
























                                    • Hmm. Maybe it's because you miss out on the opportunity to cache the calculated hash codes..

                                      – volley
                                      Oct 18 '08 at 14:21
















                                    0














                                    good question, ask josh bloch. i submitted that concept as an RFE in java 7, but it was dropped, i believe the reason was something performance related. i agree, though, should have been done.






                                    share|improve this answer
























                                    • Hmm. Maybe it's because you miss out on the opportunity to cache the calculated hash codes..

                                      – volley
                                      Oct 18 '08 at 14:21














                                    0












                                    0








                                    0







                                    good question, ask josh bloch. i submitted that concept as an RFE in java 7, but it was dropped, i believe the reason was something performance related. i agree, though, should have been done.






                                    share|improve this answer













                                    good question, ask josh bloch. i submitted that concept as an RFE in java 7, but it was dropped, i believe the reason was something performance related. i agree, though, should have been done.







                                    share|improve this answer












                                    share|improve this answer



                                    share|improve this answer










                                    answered Oct 18 '08 at 1:35







                                    james




















                                    • Hmm. Maybe it's because you miss out on the opportunity to cache the calculated hash codes..

                                      – volley
                                      Oct 18 '08 at 14:21



















                                    • Hmm. Maybe it's because you miss out on the opportunity to cache the calculated hash codes..

                                      – volley
                                      Oct 18 '08 at 14:21

















                                    Hmm. Maybe it's because you miss out on the opportunity to cache the calculated hash codes..

                                    – volley
                                    Oct 18 '08 at 14:21





                                    Hmm. Maybe it's because you miss out on the opportunity to cache the calculated hash codes..

                                    – volley
                                    Oct 18 '08 at 14:21











                                    0














                                    I suspect this has not been done because it would prevent hashCode caching?



                                    I attempted creating a generic Map solution where all keys are silently wrapped. It turned out that the wrapper would have to hold the wrapped object, the cached hashCode and a reference to the callback interface responsible for equality-checks. This is obviously not as efficient as using a wrapper class, where you'd only have to cache the original key plus one more object (see hazzens answer).



                                    (I also bumped into a problem related to generics; the get-method accepts Object as input, so the callback interface responsible for hashing would have to perform an additional instanceof-check. Either that, or the map class would have to know the Class of its keys.)






                                    share|improve this answer




























                                      0














                                      I suspect this has not been done because it would prevent hashCode caching?



                                      I attempted creating a generic Map solution where all keys are silently wrapped. It turned out that the wrapper would have to hold the wrapped object, the cached hashCode and a reference to the callback interface responsible for equality-checks. This is obviously not as efficient as using a wrapper class, where you'd only have to cache the original key plus one more object (see hazzens answer).



                                      (I also bumped into a problem related to generics; the get-method accepts Object as input, so the callback interface responsible for hashing would have to perform an additional instanceof-check. Either that, or the map class would have to know the Class of its keys.)






                                      share|improve this answer


























                                        0












                                        0








                                        0







                                        I suspect this has not been done because it would prevent hashCode caching?



                                        I attempted creating a generic Map solution where all keys are silently wrapped. It turned out that the wrapper would have to hold the wrapped object, the cached hashCode and a reference to the callback interface responsible for equality-checks. This is obviously not as efficient as using a wrapper class, where you'd only have to cache the original key plus one more object (see hazzens answer).



                                        (I also bumped into a problem related to generics; the get-method accepts Object as input, so the callback interface responsible for hashing would have to perform an additional instanceof-check. Either that, or the map class would have to know the Class of its keys.)






                                        share|improve this answer













                                        I suspect this has not been done because it would prevent hashCode caching?



                                        I attempted creating a generic Map solution where all keys are silently wrapped. It turned out that the wrapper would have to hold the wrapped object, the cached hashCode and a reference to the callback interface responsible for equality-checks. This is obviously not as efficient as using a wrapper class, where you'd only have to cache the original key plus one more object (see hazzens answer).



                                        (I also bumped into a problem related to generics; the get-method accepts Object as input, so the callback interface responsible for hashing would have to perform an additional instanceof-check. Either that, or the map class would have to know the Class of its keys.)







                                        share|improve this answer












                                        share|improve this answer



                                        share|improve this answer










                                        answered Oct 18 '08 at 15:43









                                        volleyvolley

                                        5,77912327




                                        5,77912327























                                            0














                                            This is an interesting idea, but it's absolutely horrendous for performance. The reason for this is quite fundamental to the idea of a hashtable: the ordering cannot be relied upon. Hashtables are very fast (constant time) because of the way in which they index elements in the table: by computing a pseudo-unique integer hash for that element and accessing that location in an array. It's literally computing a location in memory and directly storing the element.



                                            This contrasts with a balanced binary search tree (TreeMap) which must start at the root and work its way down to the desired node every time a lookup is required. Wikipedia has some more in-depth analysis. To summarize, the efficiency of a tree map is dependent upon a consistent ordering, thus the order of the elements is predictable and sane. However, because of the performance hit imposed by the "traverse to your destination" approach, BSTs are only able to provide O(log(n)) performance. For large maps, this can be a significant performance hit.



                                            It is possible to impose a consistent ordering on a hashtable, but to do so involves using techniques similar to LinkedHashMap and manually maintaining the ordering. Alternatively, two separate data structures can be maintained internally: a hashtable and a tree. The table can be used for lookups, while the tree can be used for iteration. The problem of course is this uses more than double the required memory. Also, insertions are only as fast as the tree: O(log(n)). Concurrent tricks can bring this down a bit, but that isn't a reliable performance optimization.



                                            In short, your idea sounds really good, but if you actually tried to implement it, you would see that to do so would impose massive performance limitations. The final verdict is (and has been for decades): if you need performance, use a hashtable; if you need ordering and can live with degraded performance, use a balanced binary search tree. I'm afraid there's really no efficiently combining the two structures without losing some of the guarantees of one or the other.






                                            share|improve this answer



















                                            • 1





                                              I don't think your answer has much to do with the question. volley just wants to use a HashTable where the hash function is user-specified, instead of the default Object.hashCode().

                                              – Adam Rosenfield
                                              Oct 18 '08 at 16:49











                                            • No, I think he wants a little more than that. His proposed "solution" is to impose ordering using an alternative hash code, but that's not going to work (hashing into a limited domain). To order a hashtable, some ancillary structure is needed.

                                              – Daniel Spiewak
                                              Oct 18 '08 at 16:53






                                            • 1





                                              Hmm actually I think Adam is right; note that the interface I suggest contains one method for calculating the hash and one method for checking if two objects are equal. Ordering is not in there! The Comparator is mentioned as an analogy. (By the way, löve the Darwinian logo, Daniel!)

                                              – volley
                                              Oct 21 '08 at 20:09











                                            • You're completely wrong, just look at the Hasharator, there's nothing like ordering there. Look at com.google.common.base.Equivalence and their CustomConcurrentHashMap, that's exactly it.

                                              – maaartinus
                                              Apr 18 '11 at 10:49






                                            • 1





                                              Before telling me I'm completely wrong, it might be worth looking at the edit history of the question. What I replied to was something very different from what we have now.

                                              – Daniel Spiewak
                                              Apr 18 '11 at 20:00
















                                            0














                                            This is an interesting idea, but it's absolutely horrendous for performance. The reason for this is quite fundamental to the idea of a hashtable: the ordering cannot be relied upon. Hashtables are very fast (constant time) because of the way in which they index elements in the table: by computing a pseudo-unique integer hash for that element and accessing that location in an array. It's literally computing a location in memory and directly storing the element.



                                            This contrasts with a balanced binary search tree (TreeMap) which must start at the root and work its way down to the desired node every time a lookup is required. Wikipedia has some more in-depth analysis. To summarize, the efficiency of a tree map is dependent upon a consistent ordering, thus the order of the elements is predictable and sane. However, because of the performance hit imposed by the "traverse to your destination" approach, BSTs are only able to provide O(log(n)) performance. For large maps, this can be a significant performance hit.



                                            It is possible to impose a consistent ordering on a hashtable, but to do so involves using techniques similar to LinkedHashMap and manually maintaining the ordering. Alternatively, two separate data structures can be maintained internally: a hashtable and a tree. The table can be used for lookups, while the tree can be used for iteration. The problem of course is this uses more than double the required memory. Also, insertions are only as fast as the tree: O(log(n)). Concurrent tricks can bring this down a bit, but that isn't a reliable performance optimization.



                                            In short, your idea sounds really good, but if you actually tried to implement it, you would see that to do so would impose massive performance limitations. The final verdict is (and has been for decades): if you need performance, use a hashtable; if you need ordering and can live with degraded performance, use a balanced binary search tree. I'm afraid there's really no efficiently combining the two structures without losing some of the guarantees of one or the other.






                                            share|improve this answer



















                                            • 1





                                              I don't think your answer has much to do with the question. volley just wants to use a HashTable where the hash function is user-specified, instead of the default Object.hashCode().

                                              – Adam Rosenfield
                                              Oct 18 '08 at 16:49











                                            • No, I think he wants a little more than that. His proposed "solution" is to impose ordering using an alternative hash code, but that's not going to work (hashing into a limited domain). To order a hashtable, some ancillary structure is needed.

                                              – Daniel Spiewak
                                              Oct 18 '08 at 16:53






                                            • 1





                                              Hmm actually I think Adam is right; note that the interface I suggest contains one method for calculating the hash and one method for checking if two objects are equal. Ordering is not in there! The Comparator is mentioned as an analogy. (By the way, löve the Darwinian logo, Daniel!)

                                              – volley
                                              Oct 21 '08 at 20:09











                                            • You're completely wrong, just look at the Hasharator, there's nothing like ordering there. Look at com.google.common.base.Equivalence and their CustomConcurrentHashMap, that's exactly it.

                                              – maaartinus
                                              Apr 18 '11 at 10:49






                                            • 1





                                              Before telling me I'm completely wrong, it might be worth looking at the edit history of the question. What I replied to was something very different from what we have now.

                                              – Daniel Spiewak
                                              Apr 18 '11 at 20:00














                                            0












                                            0








                                            0







                                            This is an interesting idea, but it's absolutely horrendous for performance. The reason for this is quite fundamental to the idea of a hashtable: the ordering cannot be relied upon. Hashtables are very fast (constant time) because of the way in which they index elements in the table: by computing a pseudo-unique integer hash for that element and accessing that location in an array. It's literally computing a location in memory and directly storing the element.



                                            This contrasts with a balanced binary search tree (TreeMap) which must start at the root and work its way down to the desired node every time a lookup is required. Wikipedia has some more in-depth analysis. To summarize, the efficiency of a tree map is dependent upon a consistent ordering, thus the order of the elements is predictable and sane. However, because of the performance hit imposed by the "traverse to your destination" approach, BSTs are only able to provide O(log(n)) performance. For large maps, this can be a significant performance hit.



                                            It is possible to impose a consistent ordering on a hashtable, but to do so involves using techniques similar to LinkedHashMap and manually maintaining the ordering. Alternatively, two separate data structures can be maintained internally: a hashtable and a tree. The table can be used for lookups, while the tree can be used for iteration. The problem of course is this uses more than double the required memory. Also, insertions are only as fast as the tree: O(log(n)). Concurrent tricks can bring this down a bit, but that isn't a reliable performance optimization.



                                            In short, your idea sounds really good, but if you actually tried to implement it, you would see that to do so would impose massive performance limitations. The final verdict is (and has been for decades): if you need performance, use a hashtable; if you need ordering and can live with degraded performance, use a balanced binary search tree. I'm afraid there's really no efficiently combining the two structures without losing some of the guarantees of one or the other.






                                            share|improve this answer













                                            This is an interesting idea, but it's absolutely horrendous for performance. The reason for this is quite fundamental to the idea of a hashtable: the ordering cannot be relied upon. Hashtables are very fast (constant time) because of the way in which they index elements in the table: by computing a pseudo-unique integer hash for that element and accessing that location in an array. It's literally computing a location in memory and directly storing the element.



                                            This contrasts with a balanced binary search tree (TreeMap) which must start at the root and work its way down to the desired node every time a lookup is required. Wikipedia has some more in-depth analysis. To summarize, the efficiency of a tree map is dependent upon a consistent ordering, thus the order of the elements is predictable and sane. However, because of the performance hit imposed by the "traverse to your destination" approach, BSTs are only able to provide O(log(n)) performance. For large maps, this can be a significant performance hit.



                                            It is possible to impose a consistent ordering on a hashtable, but to do so involves using techniques similar to LinkedHashMap and manually maintaining the ordering. Alternatively, two separate data structures can be maintained internally: a hashtable and a tree. The table can be used for lookups, while the tree can be used for iteration. The problem of course is this uses more than double the required memory. Also, insertions are only as fast as the tree: O(log(n)). Concurrent tricks can bring this down a bit, but that isn't a reliable performance optimization.



                                            In short, your idea sounds really good, but if you actually tried to implement it, you would see that to do so would impose massive performance limitations. The final verdict is (and has been for decades): if you need performance, use a hashtable; if you need ordering and can live with degraded performance, use a balanced binary search tree. I'm afraid there's really no efficiently combining the two structures without losing some of the guarantees of one or the other.







                                            share|improve this answer












                                            share|improve this answer



                                            share|improve this answer










                                            answered Oct 18 '08 at 16:26









                                            Daniel SpiewakDaniel Spiewak

                                            46.5k1098119




                                            46.5k1098119








                                            • 1





                                              I don't think your answer has much to do with the question. volley just wants to use a HashTable where the hash function is user-specified, instead of the default Object.hashCode().

                                              – Adam Rosenfield
                                              Oct 18 '08 at 16:49











                                            • No, I think he wants a little more than that. His proposed "solution" is to impose ordering using an alternative hash code, but that's not going to work (hashing into a limited domain). To order a hashtable, some ancillary structure is needed.

                                              – Daniel Spiewak
                                              Oct 18 '08 at 16:53






                                            • 1





                                              Hmm actually I think Adam is right; note that the interface I suggest contains one method for calculating the hash and one method for checking if two objects are equal. Ordering is not in there! The Comparator is mentioned as an analogy. (By the way, löve the Darwinian logo, Daniel!)

                                              – volley
                                              Oct 21 '08 at 20:09











                                            • You're completely wrong, just look at the Hasharator, there's nothing like ordering there. Look at com.google.common.base.Equivalence and their CustomConcurrentHashMap, that's exactly it.

                                              – maaartinus
                                              Apr 18 '11 at 10:49






                                            • 1





                                              Before telling me I'm completely wrong, it might be worth looking at the edit history of the question. What I replied to was something very different from what we have now.

                                              – Daniel Spiewak
                                              Apr 18 '11 at 20:00














                                            • 1





                                              I don't think your answer has much to do with the question. volley just wants to use a HashTable where the hash function is user-specified, instead of the default Object.hashCode().

                                              – Adam Rosenfield
                                              Oct 18 '08 at 16:49











                                            • No, I think he wants a little more than that. His proposed "solution" is to impose ordering using an alternative hash code, but that's not going to work (hashing into a limited domain). To order a hashtable, some ancillary structure is needed.

                                              – Daniel Spiewak
                                              Oct 18 '08 at 16:53






                                            • 1





                                              Hmm actually I think Adam is right; note that the interface I suggest contains one method for calculating the hash and one method for checking if two objects are equal. Ordering is not in there! The Comparator is mentioned as an analogy. (By the way, löve the Darwinian logo, Daniel!)

                                              – volley
                                              Oct 21 '08 at 20:09











                                            • You're completely wrong, just look at the Hasharator, there's nothing like ordering there. Look at com.google.common.base.Equivalence and their CustomConcurrentHashMap, that's exactly it.

                                              – maaartinus
                                              Apr 18 '11 at 10:49






                                            • 1





                                              Before telling me I'm completely wrong, it might be worth looking at the edit history of the question. What I replied to was something very different from what we have now.

                                              – Daniel Spiewak
                                              Apr 18 '11 at 20:00








                                            1




                                            1





                                            I don't think your answer has much to do with the question. volley just wants to use a HashTable where the hash function is user-specified, instead of the default Object.hashCode().

                                            – Adam Rosenfield
                                            Oct 18 '08 at 16:49





                                            I don't think your answer has much to do with the question. volley just wants to use a HashTable where the hash function is user-specified, instead of the default Object.hashCode().

                                            – Adam Rosenfield
                                            Oct 18 '08 at 16:49













                                            No, I think he wants a little more than that. His proposed "solution" is to impose ordering using an alternative hash code, but that's not going to work (hashing into a limited domain). To order a hashtable, some ancillary structure is needed.

                                            – Daniel Spiewak
                                            Oct 18 '08 at 16:53





                                            No, I think he wants a little more than that. His proposed "solution" is to impose ordering using an alternative hash code, but that's not going to work (hashing into a limited domain). To order a hashtable, some ancillary structure is needed.

                                            – Daniel Spiewak
                                            Oct 18 '08 at 16:53




                                            1




                                            1





                                            Hmm actually I think Adam is right; note that the interface I suggest contains one method for calculating the hash and one method for checking if two objects are equal. Ordering is not in there! The Comparator is mentioned as an analogy. (By the way, löve the Darwinian logo, Daniel!)

                                            – volley
                                            Oct 21 '08 at 20:09





                                            Hmm actually I think Adam is right; note that the interface I suggest contains one method for calculating the hash and one method for checking if two objects are equal. Ordering is not in there! The Comparator is mentioned as an analogy. (By the way, löve the Darwinian logo, Daniel!)

                                            – volley
                                            Oct 21 '08 at 20:09













                                            You're completely wrong, just look at the Hasharator, there's nothing like ordering there. Look at com.google.common.base.Equivalence and their CustomConcurrentHashMap, that's exactly it.

                                            – maaartinus
                                            Apr 18 '11 at 10:49





                                            You're completely wrong, just look at the Hasharator, there's nothing like ordering there. Look at com.google.common.base.Equivalence and their CustomConcurrentHashMap, that's exactly it.

                                            – maaartinus
                                            Apr 18 '11 at 10:49




                                            1




                                            1





                                            Before telling me I'm completely wrong, it might be worth looking at the edit history of the question. What I replied to was something very different from what we have now.

                                            – Daniel Spiewak
                                            Apr 18 '11 at 20:00





                                            Before telling me I'm completely wrong, it might be worth looking at the edit history of the question. What I replied to was something very different from what we have now.

                                            – Daniel Spiewak
                                            Apr 18 '11 at 20:00











                                            0














                                            There's such a feature in com.google.common.collect.CustomConcurrentHashMap, unfortunately, there's currently no public way how to set the Equivalence (their Hasharator). Maybe they're not yet done with it, maybe they don't consider the feature to be useful enough. Ask at the guava mailing list.



                                            I wonder why it haven't happened yet, as it was mentioned in this talk over two years ago.






                                            share|improve this answer






























                                              0














                                              There's such a feature in com.google.common.collect.CustomConcurrentHashMap, unfortunately, there's currently no public way how to set the Equivalence (their Hasharator). Maybe they're not yet done with it, maybe they don't consider the feature to be useful enough. Ask at the guava mailing list.



                                              I wonder why it haven't happened yet, as it was mentioned in this talk over two years ago.






                                              share|improve this answer




























                                                0












                                                0








                                                0







                                                There's such a feature in com.google.common.collect.CustomConcurrentHashMap, unfortunately, there's currently no public way how to set the Equivalence (their Hasharator). Maybe they're not yet done with it, maybe they don't consider the feature to be useful enough. Ask at the guava mailing list.



                                                I wonder why it haven't happened yet, as it was mentioned in this talk over two years ago.






                                                share|improve this answer















                                                There's such a feature in com.google.common.collect.CustomConcurrentHashMap, unfortunately, there's currently no public way how to set the Equivalence (their Hasharator). Maybe they're not yet done with it, maybe they don't consider the feature to be useful enough. Ask at the guava mailing list.



                                                I wonder why it haven't happened yet, as it was mentioned in this talk over two years ago.







                                                share|improve this answer














                                                share|improve this answer



                                                share|improve this answer








                                                edited Apr 18 '11 at 11:29

























                                                answered Apr 18 '11 at 10:54









                                                maaartinusmaaartinus

                                                27.1k2195235




                                                27.1k2195235






























                                                    draft saved

                                                    draft discarded




















































                                                    Thanks for contributing an answer to Stack Overflow!


                                                    • Please be sure to answer the question. Provide details and share your research!

                                                    But avoid



                                                    • Asking for help, clarification, or responding to other answers.

                                                    • Making statements based on opinion; back them up with references or personal experience.


                                                    To learn more, see our tips on writing great answers.




                                                    draft saved


                                                    draft discarded














                                                    StackExchange.ready(
                                                    function () {
                                                    StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f214136%2fwhy-not-allow-an-external-interface-to-provide-hashcode-equals-for-a-hashmap%23new-answer', 'question_page');
                                                    }
                                                    );

                                                    Post as a guest















                                                    Required, but never shown





















































                                                    Required, but never shown














                                                    Required, but never shown












                                                    Required, but never shown







                                                    Required, but never shown

































                                                    Required, but never shown














                                                    Required, but never shown












                                                    Required, but never shown







                                                    Required, but never shown







                                                    Popular posts from this blog

                                                    Xamarin.iOS Cant Deploy on Iphone

                                                    Glorious Revolution

                                                    Dulmage-Mendelsohn matrix decomposition in Python