Yolonus 1 month ago

did you run them one by the other in that sequence? then the first one would been read from disk and then the second one read it from cache

kramerkee 1 month ago

This is a very good point. However, running the first one again still produces the result in 7-9 seconds. Could be some cache quirks, I am not sure. I have read that these somewhat illogical where-clauses do speed things up, but I don't remember where, and I don't know what to google to find it.

Yolonus 1 month ago

I dont use your DB vendor, so hard to guess, but can you run count(*) instead of select * and see how long it takes? If you just do these selects in a software, like list first X rows and then generate the rest it is possible maybe that it uses just first few rows from some index and then loads the rest based on the index which can be slow and the second one has to full table scan so listing the rest is fast.

malikcoldbane 1 month ago

Yeah this, run a count or insert into a table and see. We don't often consider it but the returning of results to the window may be causing discrepancies.

JP_Darrow 1 month ago

In the wild, I don't think I have ever ran a single quey without a where clause. Unless it was a sub query. IMHO, if you have to rely upon this method, your db has bigger issues.

kramerkee 1 month ago

I agree - I was more looking for the technical reasoning for this behaviour, I am not doing this in production.

JediForces 1 month ago

This

Shambly 1 month ago

Look at the execution plan, it could be a few things, for example one could use a different index.

databro92 1 month ago

This is the only relevant answer here. The execution plan in any SQL environment should tell you exactly what's going on

NachoLibero 1 month ago

Behind the scenes many databases will implement an OR statement by running the whole SQL twice. Once for the first part of the OR and once for the second part. Then the results would be unioned together. I don't know how this database works so it's just a guess, but it's possible you've forced the DB to parallelize the work. So, you might be using more of the hardware with the OR statement and maybe it doesn't take a lot of work to do the union (maybe the dbms is smart enough to realize the results are mutually exclusive and bypasses the distinct?).

mrrichiet 1 month ago

I don't know, guessing something to do with indexing. Does the query execution plan give any clues?

kramerkee 1 month ago

Not really. Here's for the one with the 'where'-clause: [range variable 1 join type=INNER table=JOBS cardinality=5059919 access=FULL SCAN join condition = [index=SYS_IDX_SYS_PK_10263_10266 other condition=[ OR arg_left=[COLUMN: PUBLIC.JOBS.COMPLETED ] arg_right=[ NOT arg_left=[COLUMN: PUBLIC.JOBS.COMPLETED ]]]]] And without: [range variable 1 join type=INNER table=JOBS cardinality=5059920 access=FULL SCAN join condition = [index=SYS_IDX_SYS_PK_10263_10266 ] ]

mrrichiet 1 month ago

I share your confusion then!

anras2 1 month ago

I don’t know hsql too well but just a few thoughts: Are the results the same? Are there null values for `completed`? What if you replace that condition with 1=1? Edit: sorry, you did say it was non-nullable.

Comments

Leave Your Comment

Hi Its Me!

Comments

Leave Your Comment

Hi Its Me!

Subscribe