Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[feat] CarbonEmissionMonitor #148

Merged
merged 5 commits into from
Dec 22, 2024
Merged

[feat] CarbonEmissionMonitor #148

merged 5 commits into from
Dec 22, 2024

Conversation

danielhou0515
Copy link
Collaborator

This pull requests implements the CarbonEmissionMonitor that measures carbon emission. The class features the same methods exposes by ZeusMonitor and is built with a single polling process retrieving energy consumption measurements from ZeusMonitor and carbon intensity measurements from ElectricityMaps.

This pull request also includes tests that test the following scenarios:

  • One window measurement for a window length of <= 1hr
  • One window measurement for a window length of >= 24hr
  • Two window measurements each with a window length of <= 1hr
  • Two window measurements each with a window length of >= 24hr

Copy link
Member

@jaywonchung jaywonchung left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the great work! I went through files in detail except for the test file.

examples/carbon_emission_monitor/run_clm.py Outdated Show resolved Hide resolved
examples/carbon_emission_monitor/README.md Outdated Show resolved Hide resolved
examples/carbon_emission_monitor/README.md Outdated Show resolved Hide resolved
zeus/monitor/carbon.py Outdated Show resolved Hide resolved
zeus/monitor/carbon.py Outdated Show resolved Hide resolved
zeus/monitor/carbon.py Outdated Show resolved Hide resolved
zeus/monitor/carbon.py Outdated Show resolved Hide resolved
zeus/monitor/carbon.py Outdated Show resolved Hide resolved
tests/test_carbon.py Outdated Show resolved Hide resolved
zeus/monitor/carbon.py Outdated Show resolved Hide resolved
@jaywonchung jaywonchung changed the title [feat] Implemented CarbonEmissionMonitor with Test Cases [feat] CarbonEmissionMonitor Dec 16, 2024
@@ -54,7 +62,7 @@ def get_current_carbon_intensity(self) -> float:
pass

@abc.abstractmethod
def get_recent_carbon_intensity(self) -> dict[str, float]:
def get_recent_carbon_intensity(self) -> dict[datetime, float]:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question: When I get a dictionary that maps 2024 December 18th 10:33 PM ET as the key to a certain carbon intensity number, what does that mean? Carbon intensity of the hour that includes 2024 December 18th 10:33 PM ET, i.e., 2024 December 18th 10:00 - 10:59 PM ET? The duration and boundary of the timeframe that corresponds to each carbon intensity value is unclear from this API contract.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe in that example, the duration is from 2024 December 18th 10:00 - 10:59 PM ET.

If I make a fetch for live carbon intensity at 2024 December 18th 10:33 PM ET, the datetime recorded in the response is 2024 December 18th 10:00 ET. This will continue to be the live carbon intensity until the current datetime becomes 2024 December 18th 11:00 ET.

There shouldn't be a situation where 2024 December 18th 10:33 PM ET is a key in the map, the dictionary only includes keys of "whole hours" (2024 December 18th 10:00, 2024 December 18th 11:00, 2024 December 18th 12:00, etc.)

Copy link
Member

@jaywonchung jaywonchung Dec 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, so for ElectricityMaps, a datetime with 10 PM maps to the carbon intensity value of 10 PM to 11 PM. And we're doing all this back and forth because this is ambiguous in the API contract. What if someone implemented a new carbon intensity client that return 11 PM to mean the carbon intensity value for 10 PM to 11 PM? Or, what if another service has a granularity of 30 minutes, and returns dictionaries with keys 11:00 PM, 11:30 PM, 12:00 AM, etc.? Downstream calculations will be wrong inside the polling process will be wrong.

Copy link
Collaborator Author

@danielhou0515 danielhou0515 Dec 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm, ok this makes sense. I think what we can do is have each specific carbon intensity class store an attribute with the polling frequency and we can pass that to polling process. As for time ranges, would most simple be having the carbon intensity providers map the beginning of each window with the carbon intensity?

For example, if the granularity of updates for the carbon intensity provider is 30 minutes, carbon intensity provider will provider:
{10:00PM -> 300, 10:30PM->345, 11:00PM->343}, where 300 is the carbon intensity for 10:00-10:30PM.

Copy link
Member

@jaywonchung jaywonchung Dec 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I agree that making "the datetime key must mark the beginning of the period" part of the contract is the way to go. And for the granularity, I can think of something like this:

@property
@abstractmethod
def update_period(self) -> int:
    """How long each carbon intensity value in the history dict remains current, in seconds."""
    return 3600

Or, datetime.timedelta(hours=1) should also work. That might actually be easier in some places because we already have datetime.datetime objects around.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar question-- should we also set up a similar method that can return how often you need to fetch from the API? ElectricityMaps needs to be fetched every 23 hours because thats how far back they give recent carbon intensity values for but other APIs might be different?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps that can be inferred from calling len on the return value of get_recent_carbon_intensity.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I ended up creating a similar @property method. I was thinking that this could save 1 API request and allows for the programmer to have finer control over the polling frequency from APIs.

zeus/monitor/carbon.py Outdated Show resolved Hide resolved
zeus/monitor/carbon.py Outdated Show resolved Hide resolved
zeus/monitor/carbon.py Outdated Show resolved Hide resolved
Comment on lines 376 to 391
for dt, measurement_map in energy_measurements[key].items():
for index, energy in measurement_map.items():
hardware_type, num_index = index.split("_")

for hour_key, carbon_intensity in carbon_intensities.items():
for gpu_index in zeus_monitor.gpu_indices:
# divide by 3600 to convert joules -> Wh
gpu_carbon_emissions[key][gpu_index] += (
energy_measurements[key][hour_key][f"gpu_{gpu_index}"]
/ 3600
* carbon_intensity
)

for cpu_index in zeus_monitor.cpu_indices:
# divide by 3600 to convert joules -> Wh
cpu_carbon_emissions[key][cpu_index] += (
energy_measurements[key][hour_key][f"cpu_{cpu_index}"]
/ 3600
* carbon_intensity
)
if hardware_type == "gpu":
gpu_carbon_emissions[key][int(num_index)] += (
energy / 3.6e6 * carbon_intensities[dt]
)
elif hardware_type == "cpu":
cpu_carbon_emissions[key][int(num_index)] += (
energy / 3.6e6 * carbon_intensities[dt]
)
elif hardware_type == "dram":
dram_carbon_emissions[key][int(num_index)] += (
energy / 3.6e6 * carbon_intensities[dt]
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, depending on the meaning of the datetime object part of the CarbonIntensityProvider API contract, this may or may not be correct. Here, for a datetime key in energy_measurements (this datetime object is from hour_floor, so it's probably the energy consumed by the previous one hour) is accessed. This is matched with the carbon intensity value of the carbon_intensities corresponding to the same datetime object used to access energy_measurements, meaning that it the carbon intensity value accessed must be for the past hour as well. This makes sense, but it is not apparent from the API contract.

Assuming that I understood things correctly, for the purposes of this PR, I think documenting this assumption in the CarbonIntensityProvider method is sufficient. But, in an ideal implementation, what the CarbonIntensityProvider API should provide is a dictionary that maps time ranges to carbon intensity values and the polling process should handle that correspondingly.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my code, hour_floor is used as the key and denotes the beginning of the windows of energy measurements. At the end of that hour, the ZeusMonitor.end_window() for each active window is called and the measurement is record with the hour_floor used at the beginning of the window. Example: 4PM is hour_floor and when ZeusMonitor.begin_window() are called for every active window. At 5PM, ZeusMonitor.end_window() is called for every active window, but each measurement collected is recorded into energy_measurements with the key 4PM. Hence, wouldn’t using hour_floor as the key mean each datetime key refers to the energy consumed within that hour and not the energy consumed with the previous hour? (i.e. if datetime key was 4PM in energy_measurements, energy consumed is measured in the time range from 4PM to 5PM and not from 3PM to 4PM?)

Copy link
Member

@jaywonchung jaywonchung Dec 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, together with how ElectricityMaps return their carbon intensity data, then this would be correct. My point about API contract stands.

Copy link
Member

@jaywonchung jaywonchung left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Beautifully done! Thanks again for your contribution.

@jaywonchung jaywonchung merged commit c5e09ed into master Dec 22, 2024
3 checks passed
@jaywonchung jaywonchung deleted the carbon_emission_monitor branch December 22, 2024 01:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants