Postgres similarity function not appropriately using trigram index












0















I have a simple person table with a last_name column that I've added a GIST index with



CREATE INDEX last_name_idx ON person USING gist (last_name gist_trgm_ops);


According to the docs at https://www.postgresql.org/docs/10/pgtrgm.html, the <-> operator should utilize this index. However, when I actually try to use this difference operator using this query:



explain verbose select * from person where last_name <-> 'foobar' > 0.5;


I get this back:



Seq Scan on public.person  (cost=0.00..290.82 rows=4485 width=233)
Output: person_id, first_name, last_name
Filter: ((person.last_name <-> 'foobar'::text) < '0.5'::double precision)


And it doesn't look like the index is being used. However, if I use the % operator with this command:



explain verbose select * from person where last_name % 'foobar';


It seems to use the index:



Bitmap Heap Scan on public.person  (cost=4.25..41.51 rows=13 width=233)
Output: person_id, first_name, last_name
Recheck Cond: (person.last_name % 'foobar'::text)
-> Bitmap Index Scan on last_name_idx (cost=0.00..4.25 rows=13 width=0)
Index Cond: (person.last_name % 'foobar'::text)


I also noticed that if I move the operator to the select portion of the query, the index gets ignored again:



explain verbose select last_name % 'foobar' from person;

Seq Scan on public.person (cost=0.00..257.19 rows=13455 width=1)
Output: (last_name % 'foobar'::text)


Am I missing something obvious about how the similarity function uses the trigram index?



I am using Postgres 10.5 on OSX.



EDIT 1



As per Laurenz's suggestion, I tried setting enable_seqscan = off but unfortunately, the query with the <-> operator still seems to ignore the index.



show enable_seqscan;
enable_seqscan
----------------
off

explain verbose select * from person where last_name <-> 'foobar' < 0.5;

-----------------------------------------------------------------------------------------------------------------------------
Seq Scan on public.person (cost=10000000000.00..10000000290.83 rows=4485 width=233)
Output: person_id, first_name, last_name
Filter: ((person.last_name <-> 'foobar'::text) < '0.5'::double precision)









share|improve this question





























    0















    I have a simple person table with a last_name column that I've added a GIST index with



    CREATE INDEX last_name_idx ON person USING gist (last_name gist_trgm_ops);


    According to the docs at https://www.postgresql.org/docs/10/pgtrgm.html, the <-> operator should utilize this index. However, when I actually try to use this difference operator using this query:



    explain verbose select * from person where last_name <-> 'foobar' > 0.5;


    I get this back:



    Seq Scan on public.person  (cost=0.00..290.82 rows=4485 width=233)
    Output: person_id, first_name, last_name
    Filter: ((person.last_name <-> 'foobar'::text) < '0.5'::double precision)


    And it doesn't look like the index is being used. However, if I use the % operator with this command:



    explain verbose select * from person where last_name % 'foobar';


    It seems to use the index:



    Bitmap Heap Scan on public.person  (cost=4.25..41.51 rows=13 width=233)
    Output: person_id, first_name, last_name
    Recheck Cond: (person.last_name % 'foobar'::text)
    -> Bitmap Index Scan on last_name_idx (cost=0.00..4.25 rows=13 width=0)
    Index Cond: (person.last_name % 'foobar'::text)


    I also noticed that if I move the operator to the select portion of the query, the index gets ignored again:



    explain verbose select last_name % 'foobar' from person;

    Seq Scan on public.person (cost=0.00..257.19 rows=13455 width=1)
    Output: (last_name % 'foobar'::text)


    Am I missing something obvious about how the similarity function uses the trigram index?



    I am using Postgres 10.5 on OSX.



    EDIT 1



    As per Laurenz's suggestion, I tried setting enable_seqscan = off but unfortunately, the query with the <-> operator still seems to ignore the index.



    show enable_seqscan;
    enable_seqscan
    ----------------
    off

    explain verbose select * from person where last_name <-> 'foobar' < 0.5;

    -----------------------------------------------------------------------------------------------------------------------------
    Seq Scan on public.person (cost=10000000000.00..10000000290.83 rows=4485 width=233)
    Output: person_id, first_name, last_name
    Filter: ((person.last_name <-> 'foobar'::text) < '0.5'::double precision)









    share|improve this question



























      0












      0








      0








      I have a simple person table with a last_name column that I've added a GIST index with



      CREATE INDEX last_name_idx ON person USING gist (last_name gist_trgm_ops);


      According to the docs at https://www.postgresql.org/docs/10/pgtrgm.html, the <-> operator should utilize this index. However, when I actually try to use this difference operator using this query:



      explain verbose select * from person where last_name <-> 'foobar' > 0.5;


      I get this back:



      Seq Scan on public.person  (cost=0.00..290.82 rows=4485 width=233)
      Output: person_id, first_name, last_name
      Filter: ((person.last_name <-> 'foobar'::text) < '0.5'::double precision)


      And it doesn't look like the index is being used. However, if I use the % operator with this command:



      explain verbose select * from person where last_name % 'foobar';


      It seems to use the index:



      Bitmap Heap Scan on public.person  (cost=4.25..41.51 rows=13 width=233)
      Output: person_id, first_name, last_name
      Recheck Cond: (person.last_name % 'foobar'::text)
      -> Bitmap Index Scan on last_name_idx (cost=0.00..4.25 rows=13 width=0)
      Index Cond: (person.last_name % 'foobar'::text)


      I also noticed that if I move the operator to the select portion of the query, the index gets ignored again:



      explain verbose select last_name % 'foobar' from person;

      Seq Scan on public.person (cost=0.00..257.19 rows=13455 width=1)
      Output: (last_name % 'foobar'::text)


      Am I missing something obvious about how the similarity function uses the trigram index?



      I am using Postgres 10.5 on OSX.



      EDIT 1



      As per Laurenz's suggestion, I tried setting enable_seqscan = off but unfortunately, the query with the <-> operator still seems to ignore the index.



      show enable_seqscan;
      enable_seqscan
      ----------------
      off

      explain verbose select * from person where last_name <-> 'foobar' < 0.5;

      -----------------------------------------------------------------------------------------------------------------------------
      Seq Scan on public.person (cost=10000000000.00..10000000290.83 rows=4485 width=233)
      Output: person_id, first_name, last_name
      Filter: ((person.last_name <-> 'foobar'::text) < '0.5'::double precision)









      share|improve this question
















      I have a simple person table with a last_name column that I've added a GIST index with



      CREATE INDEX last_name_idx ON person USING gist (last_name gist_trgm_ops);


      According to the docs at https://www.postgresql.org/docs/10/pgtrgm.html, the <-> operator should utilize this index. However, when I actually try to use this difference operator using this query:



      explain verbose select * from person where last_name <-> 'foobar' > 0.5;


      I get this back:



      Seq Scan on public.person  (cost=0.00..290.82 rows=4485 width=233)
      Output: person_id, first_name, last_name
      Filter: ((person.last_name <-> 'foobar'::text) < '0.5'::double precision)


      And it doesn't look like the index is being used. However, if I use the % operator with this command:



      explain verbose select * from person where last_name % 'foobar';


      It seems to use the index:



      Bitmap Heap Scan on public.person  (cost=4.25..41.51 rows=13 width=233)
      Output: person_id, first_name, last_name
      Recheck Cond: (person.last_name % 'foobar'::text)
      -> Bitmap Index Scan on last_name_idx (cost=0.00..4.25 rows=13 width=0)
      Index Cond: (person.last_name % 'foobar'::text)


      I also noticed that if I move the operator to the select portion of the query, the index gets ignored again:



      explain verbose select last_name % 'foobar' from person;

      Seq Scan on public.person (cost=0.00..257.19 rows=13455 width=1)
      Output: (last_name % 'foobar'::text)


      Am I missing something obvious about how the similarity function uses the trigram index?



      I am using Postgres 10.5 on OSX.



      EDIT 1



      As per Laurenz's suggestion, I tried setting enable_seqscan = off but unfortunately, the query with the <-> operator still seems to ignore the index.



      show enable_seqscan;
      enable_seqscan
      ----------------
      off

      explain verbose select * from person where last_name <-> 'foobar' < 0.5;

      -----------------------------------------------------------------------------------------------------------------------------
      Seq Scan on public.person (cost=10000000000.00..10000000290.83 rows=4485 width=233)
      Output: person_id, first_name, last_name
      Filter: ((person.last_name <-> 'foobar'::text) < '0.5'::double precision)






      postgresql similarity postgresql-10 trigram






      share|improve this question















      share|improve this question













      share|improve this question




      share|improve this question








      edited Nov 14 '18 at 16:22







      archeezee

















      asked Nov 13 '18 at 16:32









      archeezeearcheezee

      12611




      12611
























          1 Answer
          1






          active

          oldest

          votes


















          1














          This behavior is normal for all kinds of indexes.



          The first query is not in a form that can use the index. For that, a condition would have to be of the form



          <indexed expression> <operator supported by the index> <quasi-constant>


          where the last expressions remains constant for the duration of the index scan and the operator returns a boolean value. Your expression ´last_name <-> 'foobar' > 0.5` is not of that form.



          The <-> operator has to be used in an ORDER BY clause to be able to use the index.



          The third query doesn't use the index because the query affects all rows of the table. An index does not speed up the evaluation of an expression, it is only useful to quickly identify a subset of the table (or to get rows in a certain sort order).






          share|improve this answer


























          • Thanks, that all makes sense. However, I just tried setting enable_seqscan = off but the query is still doing the seq scan with the <-> operator.

            – archeezee
            Nov 13 '18 at 22:56











          • I've updated the question with the results from your suggestion.

            – archeezee
            Nov 14 '18 at 16:23











          • Ah, stupid me. I have fixed the answer.

            – Laurenz Albe
            Nov 14 '18 at 19:12











          Your Answer






          StackExchange.ifUsing("editor", function () {
          StackExchange.using("externalEditor", function () {
          StackExchange.using("snippets", function () {
          StackExchange.snippets.init();
          });
          });
          }, "code-snippets");

          StackExchange.ready(function() {
          var channelOptions = {
          tags: "".split(" "),
          id: "1"
          };
          initTagRenderer("".split(" "), "".split(" "), channelOptions);

          StackExchange.using("externalEditor", function() {
          // Have to fire editor after snippets, if snippets enabled
          if (StackExchange.settings.snippets.snippetsEnabled) {
          StackExchange.using("snippets", function() {
          createEditor();
          });
          }
          else {
          createEditor();
          }
          });

          function createEditor() {
          StackExchange.prepareEditor({
          heartbeatType: 'answer',
          autoActivateHeartbeat: false,
          convertImagesToLinks: true,
          noModals: true,
          showLowRepImageUploadWarning: true,
          reputationToPostImages: 10,
          bindNavPrevention: true,
          postfix: "",
          imageUploader: {
          brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
          contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
          allowUrls: true
          },
          onDemand: true,
          discardSelector: ".discard-answer"
          ,immediatelyShowMarkdownHelp:true
          });


          }
          });














          draft saved

          draft discarded


















          StackExchange.ready(
          function () {
          StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53285509%2fpostgres-similarity-function-not-appropriately-using-trigram-index%23new-answer', 'question_page');
          }
          );

          Post as a guest















          Required, but never shown

























          1 Answer
          1






          active

          oldest

          votes








          1 Answer
          1






          active

          oldest

          votes









          active

          oldest

          votes






          active

          oldest

          votes









          1














          This behavior is normal for all kinds of indexes.



          The first query is not in a form that can use the index. For that, a condition would have to be of the form



          <indexed expression> <operator supported by the index> <quasi-constant>


          where the last expressions remains constant for the duration of the index scan and the operator returns a boolean value. Your expression ´last_name <-> 'foobar' > 0.5` is not of that form.



          The <-> operator has to be used in an ORDER BY clause to be able to use the index.



          The third query doesn't use the index because the query affects all rows of the table. An index does not speed up the evaluation of an expression, it is only useful to quickly identify a subset of the table (or to get rows in a certain sort order).






          share|improve this answer


























          • Thanks, that all makes sense. However, I just tried setting enable_seqscan = off but the query is still doing the seq scan with the <-> operator.

            – archeezee
            Nov 13 '18 at 22:56











          • I've updated the question with the results from your suggestion.

            – archeezee
            Nov 14 '18 at 16:23











          • Ah, stupid me. I have fixed the answer.

            – Laurenz Albe
            Nov 14 '18 at 19:12
















          1














          This behavior is normal for all kinds of indexes.



          The first query is not in a form that can use the index. For that, a condition would have to be of the form



          <indexed expression> <operator supported by the index> <quasi-constant>


          where the last expressions remains constant for the duration of the index scan and the operator returns a boolean value. Your expression ´last_name <-> 'foobar' > 0.5` is not of that form.



          The <-> operator has to be used in an ORDER BY clause to be able to use the index.



          The third query doesn't use the index because the query affects all rows of the table. An index does not speed up the evaluation of an expression, it is only useful to quickly identify a subset of the table (or to get rows in a certain sort order).






          share|improve this answer


























          • Thanks, that all makes sense. However, I just tried setting enable_seqscan = off but the query is still doing the seq scan with the <-> operator.

            – archeezee
            Nov 13 '18 at 22:56











          • I've updated the question with the results from your suggestion.

            – archeezee
            Nov 14 '18 at 16:23











          • Ah, stupid me. I have fixed the answer.

            – Laurenz Albe
            Nov 14 '18 at 19:12














          1












          1








          1







          This behavior is normal for all kinds of indexes.



          The first query is not in a form that can use the index. For that, a condition would have to be of the form



          <indexed expression> <operator supported by the index> <quasi-constant>


          where the last expressions remains constant for the duration of the index scan and the operator returns a boolean value. Your expression ´last_name <-> 'foobar' > 0.5` is not of that form.



          The <-> operator has to be used in an ORDER BY clause to be able to use the index.



          The third query doesn't use the index because the query affects all rows of the table. An index does not speed up the evaluation of an expression, it is only useful to quickly identify a subset of the table (or to get rows in a certain sort order).






          share|improve this answer















          This behavior is normal for all kinds of indexes.



          The first query is not in a form that can use the index. For that, a condition would have to be of the form



          <indexed expression> <operator supported by the index> <quasi-constant>


          where the last expressions remains constant for the duration of the index scan and the operator returns a boolean value. Your expression ´last_name <-> 'foobar' > 0.5` is not of that form.



          The <-> operator has to be used in an ORDER BY clause to be able to use the index.



          The third query doesn't use the index because the query affects all rows of the table. An index does not speed up the evaluation of an expression, it is only useful to quickly identify a subset of the table (or to get rows in a certain sort order).







          share|improve this answer














          share|improve this answer



          share|improve this answer








          edited Nov 14 '18 at 19:11

























          answered Nov 13 '18 at 22:03









          Laurenz AlbeLaurenz Albe

          45.1k102747




          45.1k102747













          • Thanks, that all makes sense. However, I just tried setting enable_seqscan = off but the query is still doing the seq scan with the <-> operator.

            – archeezee
            Nov 13 '18 at 22:56











          • I've updated the question with the results from your suggestion.

            – archeezee
            Nov 14 '18 at 16:23











          • Ah, stupid me. I have fixed the answer.

            – Laurenz Albe
            Nov 14 '18 at 19:12



















          • Thanks, that all makes sense. However, I just tried setting enable_seqscan = off but the query is still doing the seq scan with the <-> operator.

            – archeezee
            Nov 13 '18 at 22:56











          • I've updated the question with the results from your suggestion.

            – archeezee
            Nov 14 '18 at 16:23











          • Ah, stupid me. I have fixed the answer.

            – Laurenz Albe
            Nov 14 '18 at 19:12

















          Thanks, that all makes sense. However, I just tried setting enable_seqscan = off but the query is still doing the seq scan with the <-> operator.

          – archeezee
          Nov 13 '18 at 22:56





          Thanks, that all makes sense. However, I just tried setting enable_seqscan = off but the query is still doing the seq scan with the <-> operator.

          – archeezee
          Nov 13 '18 at 22:56













          I've updated the question with the results from your suggestion.

          – archeezee
          Nov 14 '18 at 16:23





          I've updated the question with the results from your suggestion.

          – archeezee
          Nov 14 '18 at 16:23













          Ah, stupid me. I have fixed the answer.

          – Laurenz Albe
          Nov 14 '18 at 19:12





          Ah, stupid me. I have fixed the answer.

          – Laurenz Albe
          Nov 14 '18 at 19:12


















          draft saved

          draft discarded




















































          Thanks for contributing an answer to Stack Overflow!


          • Please be sure to answer the question. Provide details and share your research!

          But avoid



          • Asking for help, clarification, or responding to other answers.

          • Making statements based on opinion; back them up with references or personal experience.


          To learn more, see our tips on writing great answers.




          draft saved


          draft discarded














          StackExchange.ready(
          function () {
          StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53285509%2fpostgres-similarity-function-not-appropriately-using-trigram-index%23new-answer', 'question_page');
          }
          );

          Post as a guest















          Required, but never shown





















































          Required, but never shown














          Required, but never shown












          Required, but never shown







          Required, but never shown

































          Required, but never shown














          Required, but never shown












          Required, but never shown







          Required, but never shown







          Popular posts from this blog

          Xamarin.iOS Cant Deploy on Iphone

          Glorious Revolution

          Dulmage-Mendelsohn matrix decomposition in Python