
feat(datasets) Add semantic partitioner #3663

Open
wants to merge 39 commits into main
Conversation

KarhouTam
Contributor

@KarhouTam KarhouTam commented Jun 21, 2024

New Feature

Implement the semantic partitioning scheme (named SemanticPartitioner).

The semantic partitioning scheme was proposed in What Do We Mean by Generalization in Federated Learning? (accepted at ICLR 2022).

(Cited from the paper)
The semantic partitioner's goal is to reverse-engineer the federated dataset-generating process so that each client possesses semantically similar data.
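At a high level, the scheme embeds each image with a pretrained network and then groups samples whose embeddings are close. A minimal sketch of that grouping step, using a plain NumPy k-means in place of the paper's Gaussian-mixture fitting and cross-label cluster matching (`semantic_partition` is an illustrative name, not the PR's actual API):

```python
import numpy as np

def semantic_partition(embeddings: np.ndarray, num_partitions: int,
                       num_iters: int = 10, seed: int = 0) -> list:
    """Group sample indices into partitions by clustering their embeddings."""
    rng = np.random.default_rng(seed)
    # Initialize centroids from randomly chosen samples.
    init = rng.choice(len(embeddings), num_partitions, replace=False)
    centroids = embeddings[init].astype(float)
    for _ in range(num_iters):
        # Assign each sample to its nearest centroid.
        dists = np.linalg.norm(
            embeddings[:, None, :] - centroids[None, :, :], axis=-1
        )
        labels = dists.argmin(axis=1)
        # Recompute each centroid as the mean of its members.
        for k in range(num_partitions):
            members = embeddings[labels == k]
            if len(members) > 0:
                centroids[k] = members.mean(axis=0)
    return [np.where(labels == k)[0] for k in range(num_partitions)]
```

Each returned array holds the sample indices of one partition, so clients end up with embedding-wise (i.e. semantically) similar data.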

Checklist

  • Implement proposed change
  • Write tests
  • Update documentation
  • Make CI checks pass
  • Ping maintainers on Slack (channel #contributions)

Any other comments?

I should mention that this partitioner has a strong dependency on the PyTorch library, so I'm not sure whether that is acceptable.

This is my first time contributing a new feature to Flower.

Happy to see contributors' comments and suggestions ❤.

@KarhouTam KarhouTam force-pushed the semantic-partitioner branch 7 times, most recently from 35ce9fc to 2edf546 Compare June 21, 2024 09:20
@KarhouTam KarhouTam marked this pull request as ready for review June 21, 2024 09:41
@KarhouTam KarhouTam force-pushed the semantic-partitioner branch from c5758de to b4cfd2f Compare July 1, 2024 14:40
@KarhouTam
Contributor Author

Some checks failed due to the strong dependencies on PyTorch, scikit-learn, and SciPy.
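One common way to soften such hard dependencies is a lazy, guarded import that fails with a clear message only when the feature is actually used. This is a general pattern, not necessarily Flower's convention; `optional_import` is an illustrative helper name:

```python
import importlib

def optional_import(module_name: str, feature: str):
    """Import a module lazily; raise a helpful error if it is missing."""
    try:
        return importlib.import_module(module_name)
    except ImportError as err:
        raise ImportError(
            f"{feature} requires the optional dependency '{module_name}'. "
            f"Install it with, e.g., `pip install {module_name}`."
        ) from err
```

A partitioner could then call `optional_import("torch", "SemanticPartitioner")` inside `__init__` instead of importing PyTorch at module level, so the rest of the package stays importable without it.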

Comment on lines 265 to 271
```python
with torch.no_grad():
    for i in range(0, images.shape[0], self._batch_size):
        idxs = list(range(i, min(i + self._batch_size, images.shape[0])))
        batch = torch.tensor(images[idxs], dtype=torch.float, device=device)
        # Repeat grayscale (1-channel) images to 3 channels for the pretrained net.
        if batch.shape[1] == 1:
            batch = batch.broadcast_to((batch.shape[0], 3, *batch.shape[2:]))
        embedding_list.append(efficient_net(batch).cpu().numpy())
```
Contributor

I'm curious, what is the reason that you handle batches "by hand" instead of using DataLoader?

Contributor Author

I think using DataLoader requires converting the images to a torch.utils.data.Dataset first, and it is better suited to loading the data over multiple rounds. Here I just want to traverse the images once. Anyway, it's just my habit.
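The manual batching in question can be sketched framework-free. This toy version (assuming a NumPy array and a caller-supplied `embed` function, both hypothetical stand-ins) traverses the images exactly once, mirroring the loop in the PR:

```python
import numpy as np

def embed_in_batches(images: np.ndarray, batch_size: int, embed) -> np.ndarray:
    """Apply `embed` to successive slices of `images` in a single pass."""
    outputs = []
    for start in range(0, images.shape[0], batch_size):
        batch = images[start:start + batch_size]  # last batch may be smaller
        # Repeat grayscale (1-channel) inputs to 3 channels,
        # mirroring the broadcast in the reviewed code.
        if batch.shape[1] == 1:
            batch = np.broadcast_to(batch, (batch.shape[0], 3, *batch.shape[2:]))
        outputs.append(embed(batch))
    return np.concatenate(outputs, axis=0)
```

For a one-shot traversal like this, slicing by hand avoids wrapping the array in a Dataset; DataLoader mainly pays off when you need shuffling, workers, or repeated epochs.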

@adam-narozniak
Contributor

@KarhouTam I left a few more comments and questions.

@adam-narozniak
Contributor

I wanted to merge the code, but checking the results visually on the MNIST dataset doesn't seem to produce results where the same partition has images in the same style (D.3, Visualization of Semantically Partitioned MNIST Dataset, from the paper). I'm going to investigate further, but we should be able to produce something similar.

@KarhouTam
Contributor Author

Maybe some hyperparameter settings are not the same. I remember that I reproduced this partitioner from the official repo: https://github.com/google-research/federated/tree/master/generalization.

@adam-narozniak
Contributor

Hi, I haven't found the problem; however, until it is resolved, we won't be able to merge this PR. We need to ensure that we can reproduce these results.

@KarhouTam
Contributor Author

Hi, @adam-narozniak.
I've followed the source code and fixed some parts of the partitioner that processed data differently.

I also evaluated the partitioner with the same settings as in the paper; here is the graph:
[image: partition visualization plot]

@KarhouTam
Contributor Author

Hi, @adam-narozniak. I noticed that this PR has not been updated since my commits and comments on the partitioner's performance. Are there still any unresolved issues with this PR?

@adam-narozniak
Contributor

What parameters did you use? In the paper, the only directly stated setting I found was the number of clients (300), and it gives me this plot (which is not convincing):
[image: partition visualization plot]

@KarhouTam
Contributor Author

KarhouTam commented Nov 27, 2024

Hi, @adam-narozniak, I reviewed the source code from Google and noticed some parameter discrepancies between the paper and the code. I've updated them based on the official source code.

Could you please help test the latest version with 300 clients on MNIST? My laptop doesn't have enough memory to run the EfficientNet-B7 model with the 300-partition setting. Most parameters don't require any changes.
