Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simplify query used by CDBAdapterPostgreSQL::getFilesAndIndicesForDimensions #341

Open
lukas-phaf opened this issue Mar 1, 2024 · 2 comments

Comments

@lukas-phaf
Copy link
Collaborator

lukas-phaf commented Mar 1, 2024

The function CDBAdapterPostgreSQL::getFilesAndIndicesForDimensions() constructs a query of the form:

select distinct * from
    (select a0.path,member,dimmember,height,dimheight,time,dimtime from
        (select path,dimmember,member from t20240301t095153031_xkq9pp0nv3kqmonttvhf where member = 'member4' and adaguctilinglevel != -1 ORDER BY member DESC limit 512)a0
        ,(select path,dimheight,height from t20240301t095153038_hqug9msl1wgxpew4yozf where abs(5000 - height) = (select min(abs(5000 - height)) from t20240301t095153038_hqug9msl1wgxpew4yozf)ORDER BY height DESC)a1
        ,(select path,dimtime,time from t20240301t095153043_oe2pyeutb8pdrvb7gu7r where time = '2017-01-01T00:15:00Z' ORDER BY time DESC)a2
     where a0.path=a1.path and a0.path=a2.path)T
order by member,height,time

A simpler from that does (roughly?) the same is:

select distinct t1.path,member,dimmember,height,dimheight,time,dimtime from
    t20240301t095153031_xkq9pp0nv3kqmonttvhf t1
inner join t20240301t095153038_hqug9msl1wgxpew4yozf t2
    ON t1.path = t2.path
inner join t20240301t095153043_oe2pyeutb8pdrvb7gu7r t3
    ON t1.path = t3.path
where
    member = 'member4' and
    t1.adaguctilinglevel != -1 and
    abs(5000 - height) = (select min(abs(5000 - height)) from t20240301t095153038_hqug9msl1wgxpew4yozf) and
    time = '2017-01-01T00:15:00Z'
order by member,height,time
limit 512

This should give the same result, except for the limit. In the original query it is on the first dimension (which is incorrect, it should be on time). In the second query, it limits the total amount of files returned to 512.

This new query should of course be tested to make sure it give equivalent results.

@maartenplieger
Copy link
Member

👍

@mgrunbauer
Copy link
Collaborator

For reference, I'm currently hitting the limit 512 when querying an hourly updated dataset with a lot of layers through EDR. I only receive a smaller amount of timesteps than expected. Configuring the layer with maxquerylimit="2048" works, since that changes the limit 512 to limit 2024.

I think the proposed query would help in my case, since it would only limit the amount of output files.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants