Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add production for AR regions #4894

Open
wants to merge 41 commits into
base: master
Choose a base branch
from

Conversation

meiyiniu
Copy link
Contributor

Issue

None

Description

Adds regional production data for Argentina (real-time renewable and non-renewable), pulling from the Cammesa data source which is currently used for Argentina national data.

Preview

image

@github-actions github-actions bot added frontend 🎨 parser tests translations 🗣 zone config Pull request or issue for zone configurations labels Dec 21, 2022
@unitrium
Copy link
Contributor

Thanks for the great work, looking forward to see this on the map!
TLDR: try to have only changes related to your feature on this PR, here there are also some stuff related to FR_O.
Running git rebase master should do the trick.

Long version:
Please just make sure to rebase your branch on the latest master branch to avoid taking commits from your other branches. The reasoning is that we should have more or less 1 PR = 1 branch = 1 feature. Otherwise it gets a bit mixed up when reviewing. I see here that we have modifications related to FR_O which is not the same feature as the Argentinian regions :) Your changes for FR_O will be merged in with #4501 .

@VIKTORVAV99
Copy link
Member

VIKTORVAV99 commented Dec 21, 2022

Thanks for the great work, looking forward to see this on the map! TLDR: try to have only changes related to your feature on this PR, here there are also some stuff related to FR_O. Running git rebase master should do the trick.

Long version: Please just make sure to rebase your branch on the latest master branch to avoid taking commits from your other branches. The reasoning is that we should have more or less 1 PR = 1 branch = 1 feature. Otherwise it gets a bit mixed up when reviewing. I see here that we have modifications related to FR_O which is not the same feature as the Argentinian regions :) Your changes for FR_O will be merged in with #4501 .

Agree with everything in here but I just wanted to add since you are using a fork it is probably better to use upstream/master as the rebase target to avoid rebasing on a outdated master branch. (git rebase upstream/master)

@meiyiniu
Copy link
Contributor Author

@unitrium @VIKTORVAV99 Thanks for letting me know! This is our first time contributing to open-source so we're still trying to gain familiarity with workflows - your advice is much appreciated :) I've rebased the branch onto upstream/master - marking this PR as ready to review now!

@meiyiniu meiyiniu marked this pull request as ready for review December 21, 2022 23:31
Copy link
Contributor

@unitrium unitrium left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks promising! A few comments about some improvements that could be done on the code.
Also, we have a few exchanges for Argentina, you will need to create the regional exchanges in the config as well.
So for instance we have AR_BR in config/exchanges, you will need to find out in which region on the argentinian side this exchange lives and create the corresponding exchange config. It's probably something like AR-CEN_BR-S. Also note that Brazil is an aggregated country so make sure to do it on the regional level there.

config/zones/AR.yaml Show resolved Hide resolved
parsers/AR.py Outdated Show resolved Hide resolved
parsers/AR.py Outdated Show resolved Hide resolved
parsers/AR.py Outdated Show resolved Hide resolved
parsers/AR.py Outdated Show resolved Hide resolved
@unitrium
Copy link
Contributor

As I said, we will probably want to keep the AR zone without it becoming an aggregation as we have direct data for it. I will discuss it with the team to decide on how we want to tackle this.
Probably in your config file it will be something like this to add

AR.yaml

contryView: true

@unitrium unitrium mentioned this pull request Dec 22, 2022
7 tasks
@VIKTORVAV99
Copy link
Member

As I said, we will probably want to keep the AR zone without it becoming an aggregation as we have direct data for it. I will discuss it with the team to decide on how we want to tackle this. Probably in your config file it will be something like this to add

AR.yaml

contryView: true

We could probably just check if the country level zones have any parsers instead of adding additional config keys?

That way we could also use the parser information to infer what parts should be aggregated or not.
Example country level:

parser:
    production: "example parser"

Here we have a country level config with a production parser so we would only need to aggregate say consumption and price from the subzones (assuming they have it).

@unitrium
Copy link
Contributor

True that would be a better way of being more granular. I'm just afraid it might add some hidden mechanics.

As I said, we will probably want to keep the AR zone without it becoming an aggregation as we have direct data for it. I will discuss it with the team to decide on how we want to tackle this. Probably in your config file it will be something like this to add

AR.yaml

contryView: true

We could probably just check if the country level zones have any parsers instead of adding additional config keys?

That way we could also use the parser information to infer what parts should be aggregated or not. Example country level:

parser:
    production: "example parser"

Here we have a country level config with a production parser so we would only need to aggregate say consumption and price from the subzones (assuming they have it).

True that would be a better way of having a more granular control, I'm just afraid it would result in some hidden mechanics people are not aware of.

@meiyiniu meiyiniu requested review from unitrium and VIKTORVAV99 and removed request for unitrium December 22, 2022 18:59
Copy link
Member

@VIKTORVAV99 VIKTORVAV99 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code changes looks mostly good but I do have some questions about the parser itself.
First it seems like the regional zones only return one data point at once? Would it be possible to return a list like for the AR zone?

It also seems to be some availability issues? I tried ran though the zones and all of the sudden it stopped working? This would probably be resolved if it's possible to return it as a list as well.

parsers/AR.py Outdated
Comment on lines 175 to 179
assert (
regions_response.status_code == 200
), "Exception when fetching regions for AR: " "error when calling url={}".format(
CAMMESA_REGIONS_ENDPOINT
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer if this used f"string {object}" instead, it's a little easier to work with and it's sligltly faster from a preformance point of view.

Copy link
Contributor

@unitrium unitrium left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks better! I haven't been able to test it locally yet though.
If you are up for it, it would be great if you could write a unit test to demonstrate how it works. For this you could just add some mock data from the API and put it in the tests/mock folder. You can have a look at test/test_NTESMO.py for inspiration.

Otherwise just missing the exchange config and we're almost done!

parsers/AR.py Outdated Show resolved Hide resolved
parsers/AR.py Outdated Show resolved Hide resolved
zone_key, current_session
)
non_renewables_production = non_renewables_production_mix(zone_key, current_session)
renewables_production = renewables_production_mix(zone_key, current_session)

full_production_list = [
{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm just thinking that here we will miss datapoints which have only non_renewables.

renewables = {
'2022-12-22 01:00': {},
'2022-12-22 02:00': {}
}

non_renewables = {
'2022-12-22 01:00': {},
'2022-12-22 03:00': {}
}
Here the 03:00 timestamp would be ignored for non_renewable.

Maybe it would be better to rewrite this a bit to merge them if they exist with the same timestamp

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would probably add unique keys for hydro (since it exits in both) and then just merge the dicts with the datetime as the key, that would ensure we get all data in one place grouped by the correct datetime and then format the data in a separate function.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@unitrium please see my comment here, which addresses your concern:
#4894 (comment)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Like Robin I am unsure how to best handle this (based on the above comment) as there seem to be no way to ensure the two APIs are in sync.

@meiyiniu
Copy link
Contributor Author

First it seems like the regional zones only return one data point at once? Would it be possible to return a list like for the AR zone?
It also seems to be some availability issues? I tried ran though the zones and all of the sudden it stopped working? This would probably be resolved if it's possible to return it as a list as well.

@VIKTORVAV99 Both of these stem from the same reason, which is that the renewables endpoint for regional zones only returns live data, i.e. data for the current time, which is only 1 data point. The non-renewables endpoint, which AR uses, returns data in 5-minute intervals over the entire day. I tried looking for renewables data across the entire day like the non-renewables one, but it does not seem to be available.

To merge the data from the two endpoints, I'm assigning the datetime for the renewables live data to the last 5min mark (e.g. 12:09 -> 12:05), and checking the non-renewables data for the datetime. So, to answer your first question, that's why there's only one data point - we only have both renewables and non-renewable data for the last 5min mark. To answer your second question, sometimes the non-renewables data takes a few minutes to update, so the last 5min mark might not yet be available (e.g. if it's 12:09, the datetime is set to be 12:05, but the non-renewables might have only updated up to 12:00). In this case, no data would be returned since the datetime cannot be found in the non-renewables data. A solution could be to merge the live renewables data with the last updated non-renewables data whatever the time is, but this adds a dependency on the non-renewable API: if the non-renewables data takes longer than just a few minutes to update (it could be the case that it hasn't been updated for 2+ hours) the data returned by the parser would be inaccurate/out of date.

Not sure what to do about the first problem of there being only 1 data point, aside from just returning all of the non-renewables data even if there is no corresponding renewables data for the older datetimes. But, not sure if that is ideal/meaningful/accurate to have just non-renewables data for past times but both for live data. Any suggestions would be appreciated!

@github-actions github-actions bot added the exchange config Pull request or issue for exchange configurations label Dec 23, 2022
@meiyiniu
Copy link
Contributor Author

@unitrium I've added the regional exchange under AR-NEA_BR-S :)

Comment on lines 1 to 6
lonlat:
- -56.407453
- -28.880006
parsers:
exchange: BR.fetch_exchange
rotation: 110
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The BR.py parser does not support the AR-NEA->BR-S exchange so this will have to be implemented in the AR.py parser. I would suggest adding it in #4893 and then we can merge that first.

Copy link
Contributor Author

@meiyiniu meiyiniu Dec 24, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added in: 44946b6

@VIKTORVAV99
Copy link
Member

Once #4893 is merged we can proceed with this and create all the new exchange configs in this PR.

@VIKTORVAV99
Copy link
Member

Since #4893 is merged now we can proceed with this.
What we need to do in here now is first to clean up the merge conflicts, create config files for the exchanges between all zones, and then last of all make sure it's all functioning as expected. 🙂

@github-actions github-actions bot added dependencies Pull requests that update a dependency file infrastructure labels Jan 3, 2023
@VIKTORVAV99
Copy link
Member

@VIKTORVAV99 Hi Viktor, are there any action items for me for this PR?

Yes and no, we need a review from @unitrium (as he requested changes) and then we need to determine if the parser is stable enough. I think my latest changes improved things a little (at least during my local testing) but we might have to rework things further.

Copy link
Contributor

@unitrium unitrium left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for taking so much time to come back to this!
A few more stuff about the exchanges, it should be easy to fix, it's just removing code.

Regarding exchanges we should query them at the smallest granularity level, here the regions. We should maintain a single source of truth for the data. The aggregation pipeline will then create the data for the country derived from those.

parsers/UY.py Outdated Show resolved Hide resolved
config/exchanges/AR_UY.yaml Outdated Show resolved Hide resolved
config/exchanges/AR_CL-SEN.yaml Outdated Show resolved Hide resolved
config/exchanges/AR_PY.yaml Outdated Show resolved Hide resolved
config/exchanges/AR-NEA_PY.yaml Show resolved Hide resolved
- -58.192
- -32.456
parsers:
exchange: UY.fetch_exchange
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, we won't need that, see my comments below :)
It's the same as for DK and DK-1 DK-2.

- -58.192
- -32.456
parsers:
exchange: UY.fetch_exchange
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before this is merged, in the database we will remap all exchanges from AR->UY to AR-LIT->UY.

parsers/AR.py Outdated Show resolved Hide resolved
parsers/AR.py Outdated Show resolved Hide resolved
@unitrium
Copy link
Contributor

Regarding the conflicts that are on your branch you probably need to:

  • pull the latest version of master. A lot has changed recently.
  • overwrite your web/config/exchanges.json and web/config/zones.json with the version on master.
  • rerun npm generate-world in /web.

@meiyiniu
Copy link
Contributor Author

@unitrium I've fixed the things you mentioned, please take another look :)

@meiyiniu meiyiniu requested review from unitrium and removed request for VIKTORVAV99 February 23, 2023 20:17
@unitrium
Copy link
Contributor

Hey @meiyiniu,
I gave it a more in depths review this time.
Regarding the config, thanks for fixing the mentioned items, I think we are good to go on that end.

One final request, would be to add a bit of unit testing so we can have a proof that the parser works and is not regressing in the future. I know we are a bit behind on that area, but it's probably essential so we don't break stuff.
I took the liberty of writing a small unit test for fetching the national production, so you can have structure for writing yours. I downloaded some json data from the api and truncated it to make it easy.
You will find it under test/test_AR.py. You can also have a look at how we did this for test_ENTSOE.py.

So it would be great to finish the PR by adding a test case for fetching regional production data and exchanges.

While writing the unit test I found out that for the national data there seemed to be an error, the renewable production list was always empty, probably because it was missing this. Have a look at it and let me know if you agree.
Also another issue I stumbled upon was that I couldn't get the datetimes to match between renewables and non renewables. I think I fixed it by using the same formatting. If you could also have a look and let me know if it sounds ok for you ?

Copy link
Member

@VIKTORVAV99 VIKTORVAV99 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Things look good from my side so there should be nothing blocking this from approval once the tests are in.

I also did a final cleanup as the fix for AR broke all the AR-* zones but now it should work for both.

@meiyiniu
Copy link
Contributor Author

meiyiniu commented Mar 9, 2023

@unitrium Thanks for making the fixes! I've added a test for regional production and for exchange. Please take a look and let me know what you think.

@meiyiniu meiyiniu requested review from VIKTORVAV99 and removed request for unitrium March 9, 2023 05:55
@VIKTORVAV99 VIKTORVAV99 linked an issue Mar 13, 2023 that may be closed by this pull request
@VIKTORVAV99
Copy link
Member

@unitrium what did you and @FelixDQ come up with during your discussion, is it safe to merge this as is assuming the tests check out (I'll give them a review later today) or do we need to make any other changes in here or in the backend? (Like the retry function in the parser itself we discussed really quick a while back?)

It would be great to get this merged as it seems we are way off on the carbon emissions as described in #5196 which this would take care of.

@unitrium
Copy link
Contributor

unitrium commented Mar 15, 2023

So to put my discussion with Victor here as well.
Actually in the backend we do the time realignment. So if you publish two separate production events like this:
[{solar:10, datetime: 11/05/2015 15:05}, {unknown:20, datetime: 11/05/2015 15:35}]
The backend would reconstruct a coherent dict for the hour 15:00 with

{
 solar: average of all solar events from 15:00 to 16:00,
 unknown: average of all unknown events from 15:00 to 16:00
}

This means that we can eliminate the flakiness coming from the fact that we have to merge two dictionaries from two endpoints. We just have to return all the production events separately.

The problem in this case is that we get two hydro events. One probably big hydro coming from the renewable endpoint and another coming from smaller hydro in the other endpoint. They should be summed not averaged. So they would still need to be merged together somehow. I don't think there's an easy way out in the backend. So we would prefer having this issue solved here.

I would suggest the following:

  • report all renewables and other production modes appart from hydro separately, let the backend handle the reconstruction part.
  • for hydro we need to merge the dictionaries, therefore we would need to add some fail safe like the retry function @VIKTORVAV99 is suggesting.

@FelixDQ
Copy link
Member

FelixDQ commented Mar 15, 2023

Actually in the backend we do the time realignment. So if you publish two separate production events like this: [{solar:10, datetime: 11/05/2015 15:05}, {unknown:20, datetime: 11/05/2015 15:35}] The backend would reconstruct a coherent dict for the hour 15:00 with

{
 solar: average of all solar events from 15:00 to 16:00,
 unknown: average of all unknown events from 15:00 to 16:00
}

I might be misunderstanding what you mean, but I don't think that is exactly what's happening. The time alignment would essentially decide that {solar:10, datetime: 11/05/2015 15:05} is valid from 15:05 to 15:34, and {unknown:20, datetime: 11/05/2015 15:35} would be valid from 15:35 to 15:35+2 hours or until another event comes in. It wouldn't merge the events.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
exchange config Pull request or issue for exchange configurations frontend 🎨 parser tests translations 🗣 zone config Pull request or issue for zone configurations
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Carbon intensity for Argentina (AR)
4 participants