8 Replies Latest reply: Sep 16, 2011 3:12 PM by Clovis Wichoski RSS

a SELECT DISTINCT that uses LIMIT returns wrong results

Clovis Wichoski
Currently Being Moderated

HI,

 

In MaxDB version 7.6.06 on Linux X86_64 using follow SQL:

 

SELECT  DISTINCT

    "PECA"."OID"

FROM

    "PECA","ENTIDADEPRODUTO" "ENTIDADEPRODUTO_0",

    "PRODUTO" "PRODUTO_1",

    "PRODUTO"

WHERE

    "PECA"."OID"="PRODUTO"."OID" AND

    "PRODUTO"."OID"="ENTIDADEPRODUTO_0"."MY_PRODUTO" AND

    "ENTIDADEPRODUTO_0"."MY_PRODUTO"="PRODUTO_1"."OID" AND

    ("PECA"."OID" = "PRODUTO_1"."OID" AND

     "PRODUTO"."SUBCLASS" = 'PC' AND

     "ENTIDADEPRODUTO_0"."MY_ENTIDADE" = 'AAAAD8BK')

 

1) I get  these records:

 

AAAAA89-

AAAAA824

AAAAA239

AAAAA825

AAAAA817

AAAAA818

AAAAQ24-

 

2) If i append in that SQL statement a LIMIT 4 clause i get:

 

AAAADzA4

AAAADzA5

AAAADzA6

AAAADzA7

 

3) If i append in that SQL statement a ORDER BY 1 LIMIT 4 clause i get:

 

AAAAA20N

AAAAA20O

AAAAA20P

AAAAA20Q

 

4) If i suround that SQL statment with SELECT * FROM ( putSqlStatementHere) LIMIT 4 i get:

 

AAAAA89-

AAAAA824

AAAAA239

AAAAA825

 

 

The result at case 4 is the result that i expect when executed the case 2, anyone knows why this occurs?

 

regards

 

Clóvis

 

  • Re: a SELECT DISTINCT that uses LIMIT returns wrong results
    Lars Breddemann
    Currently Being Moderated
    Clovis Wichoski wrote:

    HI,

     

    In MaxDB version 7.6.06 on Linux X86_64 using follow SQL:

     

    SELECT  DISTINCT

        "PECA"."OID"

    FROM

        "PECA","ENTIDADEPRODUTO" "ENTIDADEPRODUTO_0",

        "PRODUTO" "PRODUTO_1",

        "PRODUTO"

    WHERE

        "PECA"."OID"="PRODUTO"."OID" AND

        "PRODUTO"."OID"="ENTIDADEPRODUTO_0"."MY_PRODUTO" AND

        "ENTIDADEPRODUTO_0"."MY_PRODUTO"="PRODUTO_1"."OID" AND

        ("PECA"."OID" = "PRODUTO_1"."OID" AND

         "PRODUTO"."SUBCLASS" = 'PC' AND

         "ENTIDADEPRODUTO_0"."MY_ENTIDADE" = 'AAAAD8BK')

     

    1) I get  these records:

     

    AAAAA89-

    AAAAA824

    AAAAA239

    AAAAA825

    AAAAA817

    AAAAA818

    AAAAQ24-

     

    2) If i append in that SQL statement a LIMIT 4 clause i get:

     

    AAAADzA4

    AAAADzA5

    AAAADzA6

    AAAADzA7

     

     

    4) If i suround that SQL statment with SELECT * FROM ( putSqlStatementHere) LIMIT 4 i get:

     

    AAAAA89-

    AAAAA824

    AAAAA239

    AAAAA825

     

     

    The result at case 4 is the result that i expect when executed the case 2, anyone knows why this occurs?

     

    Hi Clóvis,

     

    it's because your expectation is wrong.

    EVERYTIME and ALWAYS when you want/expect/need your data in a specific order, then you have to use the ORDER BY clause.

    Without the ORDER BY the actual order of result rows has to be considered to be random.

     

    So far about the developers point of view. That's what you need to keep in mind when using SQL.

     

    Now, to figure out why this happens technically - a good first approach would be to check whether the execution plan for case 2 is different from the plan for case 1.

     

    regards,

    Lars

     

    • Re: a SELECT DISTINCT that uses LIMIT returns wrong results
      Clovis Wichoski
      Currently Being Moderated

      Hi, Lars,

       

      I think that you missinterpret my question, or maybe i'm not so clear about the problem, without surround the SQL and having DISTINCT with LIMIT, some records ahead the MaxDB always returns same records, in case 2 for example:

       

      LIMIT 1, 100 returns values (note its not random, its based on primary key index) as you expected.

       

      LIMIT 101, 100 returns random values as you expected.

       

      but LIMIT 201, 100 returns same results as above.

       

      LIMIT 301, 100 same

       

      and so on,

       

      this query have about 2997 records

       

      regards

       

      Clóvis

       

      • Re: a SELECT DISTINCT that uses LIMIT returns wrong results
        Lars Breddemann
        Currently Being Moderated

        > I think that you missinterpret my question, or maybe i'm not so clear about the problem, without surround the SQL and having DISTINCT with LIMIT, some records ahead the MaxDB always returns same records, in case 2 for example:

         

        Nope I got your question right.

        You wonder that there is a different order in the second case.

        And yes, it would be interesting to know why this is the case (that's why I would check the execution plan).

         

        But the key point up to here is that the LIMIT clause does not imply any kind of ordering.

        It just says: start counting the rows you get as they come to you and then print out n rows starting at row x.

        Since the primary key is due to the implementation kind of a "natural order"ing element, it might often happen that the output appears to be ordered by the primary key.

        But it is not. There hadn't been a specific ordering process step before giving out the rows.

         

        So the order is random in so far as there is no guarantee for the order.

         

        If you want to use the LIMIT clause to create a kind of data window over a fixed result set, then you've to use the ORDER BY clause.

         

        I hope I got your point right.

         

        regards,

        Lars

        • Re: a SELECT DISTINCT that uses LIMIT returns wrong results
          Clovis Wichoski
          Currently Being Moderated

          Hi Lars,

           

          Sorry for long delay,

           

          explain plan for 2 case:

           

          OWNER     TABLENAME     COLUMN_OR_INDEX     STRATEGY     PAGECOUNT
               ENTIDADEPRODUTO_0     IENTIDADEPRODUTO1     EQUAL CONDITION FOR INDEX           5523
                    MY_ENTIDADE          (USED INDEX COLUMN)     
               PRODUTO_1     OID     JOIN VIA KEY COLUMN           8887
          VENDOR_CONCES     PRODUTO     OID     JOIN VIA KEY COLUMN           8887
          VENDOR_CONCES     PECA     OID     JOIN VIA KEY COLUMN            960
                              NO TEMPORARY RESULTS CREATED     
               JDBC_CURSOR_7               RESULT IS COPIED   , COSTVALUE IS            237
               JDBC_CURSOR_7          QUERYREWRITE - APPLIED RULES:     
               JDBC_CURSOR_7             DistinctPushDownTo              1

           

          • Re: a SELECT DISTINCT that uses LIMIT returns wrong results
            Lars Breddemann
            Currently Being Moderated

            HI Clovis,

             

            wow - this is a long dead thread coming back to life

             

            I still don't see the wrong behaviour here - as long as the ORDER BY clause is not given, the result set can be delivered in any order.

             

            To understand the different result orders for the different statements, we would have to compare the different execution plans.

             

            In a quick test for myself, I was able to produce different result orders as well and the major difference was the usage of the internal temp. table for the SELECT FROM (SUBQUERY) WHERE ROWNUM <=10 case.

             

            Do you still consider it an issue for your use case?

             

            regards,

            Lars

            • Re: a SELECT DISTINCT that uses LIMIT returns wrong results
              Clovis Wichoski
              Currently Being Moderated

              Hi, Lars,

               

              i got this long time because, in past i dont have time to test the concepts you explained.

               

              today I tried limit clause with that distinct command, and now i'm getting a exception:

               

              com.sap.dbtech.jdbc.exceptions.JDBCDriverException: SAP DBTech JDBC: [-7085]: Function not allowed for this cursor

               

              but as I remember the big problem in that time, using limit, next 100, next 100, above 300 records, always i get same result. But now with that exception i cant simulate, the exception occurs, if i use LIMIT, here is the command:

               

              SELECT * FROM (
              SELECT DISTINCT
              "PECA"."OID"
              FROM
              "PECA","ENTIDADEPRODUTO" "ENTIDADEPRODUTO_0",
              "PRODUTO" "PRODUTO_1",
              "PRODUTO"
              WHERE
              "PECA"."OID"="PRODUTO"."OID" AND
              "PRODUTO"."OID"="ENTIDADEPRODUTO_0"."MY_PRODUTO" AND
              "ENTIDADEPRODUTO_0"."MY_PRODUTO"="PRODUTO_1"."OID" AND
              ("PECA"."OID" = "PRODUTO_1"."OID" AND
              "PRODUTO"."SUBCLASS" = 'PC' AND
              "ENTIDADEPRODUTO_0"."MY_ENTIDADE" = 'AAAAD8BK')
              )
              LIMIT 200, 100

               

              but if i use explain on that i get the explain plan:

               

              OWNER     TABLENAME     COLUMN_OR_INDEX     STRATEGY     PAGECOUNT
                        MY_ENTIDADE          (USED INDEX COLUMN)     
              VENDOR_CONCES     PECA     OID     JOIN VIA KEY COLUMN            764
                   PRODUTO_1     OID     JOIN VIA KEY COLUMN           6554
              VENDOR_CONCES     PRODUTO     OID     JOIN VIA KEY COLUMN           6554
                                  NO TEMPORARY RESULTS CREATED     
              INTERNAL     TEMPORARY RESULT          TABLE SCAN              1
                   JDBC_CURSOR_29               RESULT IS COPIED   , COSTVALUE IS             17
                   JDBC_CURSOR_29          QUERYREWRITE - APPLIED RULES:     
                   JDBC_CURSOR_29             DistinctPushDownTo              1

               

              database version still 7.6.06.10

              • Re: a SELECT DISTINCT that uses LIMIT returns wrong results
                Lars Breddemann
                Currently Being Moderated

                Hi Clovis,

                 

                the error message is due to the kind of cursor (I believe it's a forward-only cursor) used by MaxDB Studio in this case.

                You may try it with SQL Studio or sqlcli - this should work (at least it did for me).

                 

                When you really see the same results this would be an error - true.

                How about adding a rownum pseudo-column to the output to be able to identify the single result rows?

                 

                regards,

                Lars

                • Re: a SELECT DISTINCT that uses LIMIT returns wrong results
                  Clovis Wichoski
                  Currently Being Moderated

                  Hi Lars,

                   

                  Using ROWNO, appears that the command runs as i expect without surround with a subquery.

                   

                  SELECT DISTINCT
                  ROWNO, "PECA"."OID"
                  FROM
                  "PECA","ENTIDADEPRODUTO" "ENTIDADEPRODUTO_0",
                  "PRODUTO" "PRODUTO_1",
                  "PRODUTO"
                  WHERE
                  "PECA"."OID"="PRODUTO"."OID" AND
                  "PRODUTO"."OID"="ENTIDADEPRODUTO_0"."MY_PRODUTO" AND
                  "ENTIDADEPRODUTO_0"."MY_PRODUTO"="PRODUTO_1"."OID" AND
                  ("PECA"."OID" = "PRODUTO_1"."OID" AND
                  "PRODUTO"."SUBCLASS" = 'PC' AND
                  "ENTIDADEPRODUTO_0"."MY_ENTIDADE" = 'AAAAD8BK')
                  LIMIT 300, 100

                   

                  and here is the explain:

                   

                  OWNER     TABLENAME     COLUMN_OR_INDEX     STRATEGY     PAGECOUNT
                       ENTIDADEPRODUTO_0     IENTIDADEPRODUTO1     EQUAL CONDITION FOR INDEX           5523
                            MY_ENTIDADE          (USED INDEX COLUMN)     
                       PRODUTO_1     OID     JOIN VIA KEY COLUMN           8887
                  VENDOR_CONCES     PRODUTO     OID     JOIN VIA KEY COLUMN           8887
                  VENDOR_CONCES     PECA     OID     JOIN VIA KEY COLUMN            960
                                      NO TEMPORARY RESULTS CREATED     
                       SYSCURSORNAME               RESULT IS COPIED   , COSTVALUE IS            237
                       SYSCURSORNAME          QUERYREWRITE - APPLIED RULES:     
                       SYSCURSORNAME             DistinctPushDownTo              1

                   

                   

                  I used the JDBC driver to run these commands, directly inside my application. Appears that using ROWNO solves all things, its right to think, that when use rowno and distinct, MaxDB first put an order for temp result table, and just after that works with limit?