UserWarning: Lambda function is not supported by pickle, please use regular python function or functools.partial instead #953

Closed
austinmw opened this issue Jan 19, 2023 · 6 comments

Comments

@austinmw

🐛 Describe the bug

When I run:

from torchdata.datapipes.iter import HttpReader

URL = "https://raw.githubusercontent.com/mhjabreel/CharCnn_Keras/master/data/ag_news_csv/train.csv"
ag_news_train = HttpReader([URL]).parse_csv().map(lambda t: (int(t[0]), " ".join(t[1:])))
agn_batches = ag_news_train.batch(2).map(lambda batch: {'labels': [sample[0] for sample in batch],\
                                      'text': [sample[1].split() for sample in batch]})

batch = next(iter(agn_batches))
assert batch['text'][0][0:8] == ['Wall', 'St.', 'Bears', 'Claw', 'Back', 'Into', 'the', 'Black']

I get the following:

UserWarning: Lambda function is not supported by pickle, please use regular python function or functools.partial instead.

Versions

Python 3.8.0
torch 2.0.0.dev20230119+cu116
torchdata 0.6.0.dev20230119

@ejguan
Contributor

ejguan commented Jan 19, 2023

Please try to avoid lambda functions in the pipeline: they can't be pickled, so the pipeline can't be used with multiprocessing.
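To illustrate the pickling point, here is a minimal standard-library sketch (no torchdata involved). The names `is_picklable` and `to_int` are hypothetical and exist only for this example; it simply tries `pickle.dumps` on a lambda, a `functools.partial`, and a named function.

```python
import functools
import pickle


def is_picklable(obj):
    """Return True if pickle can serialize obj (hypothetical helper for this sketch)."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


def to_int(text):
    # Module-level function: pickle stores it by reference (module + name).
    return int(text)


# A lambda has no importable name, so pickle cannot look it up again.
lambda_fn = lambda text: int(text)
# A partial over a picklable callable (here the built-in int) can be pickled.
partial_fn = functools.partial(int, base=10)

lambda_ok = is_picklable(lambda_fn)
partial_ok = is_picklable(partial_fn)
named_ok = is_picklable(to_int)
print(lambda_ok, partial_ok, named_ok)
```

The lambda should fail to pickle while the partial succeeds. The named function should also pickle when this is run as an ordinary script or imported module, because pickle can then resolve it by name.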

You can replace your lambda functions with

def map_fn1(t):
    return (int(t[0]), " ".join(t[1:]))

def map_fn2(batch):
    ...
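As a sketch of what the full replacement might look like, the two lambdas from the original report can be transcribed into named functions. The body of `map_fn2` below is copied from the second lambda in the report, not taken from the maintainers, and the `rows` list is made-up stand-in data so the functions can be exercised without downloading the CSV.

```python
def map_fn1(t):
    # t is one parsed CSV row: the first field is the label,
    # the remaining fields are joined into a single text string.
    return (int(t[0]), " ".join(t[1:]))


def map_fn2(batch):
    # batch is a list of (label, text) tuples produced by map_fn1.
    return {
        "labels": [sample[0] for sample in batch],
        "text": [sample[1].split() for sample in batch],
    }


# Made-up rows standing in for parsed CSV output (not real dataset content).
rows = [
    ["3", "Title one", "first description"],
    ["4", "Title two", "second description"],
]

samples = [map_fn1(row) for row in rows]
batch = map_fn2(samples)
print(batch["labels"])
print(batch["text"][0])
```

In the pipeline from the report, these would be passed in place of the lambdas, along the lines of `HttpReader([URL]).parse_csv().map(map_fn1).batch(2).map(map_fn2)`.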

@austinmw
Author

austinmw commented Jan 19, 2023

@ejguan But I am literally copying the "sanity check" example directly from TorchData's GitHub homepage.

I guess that is not an up-to-date/recommended way to use this library?

@ejguan
Contributor

ejguan commented Jan 19, 2023

Fair point, we should improve the sanity check section. cc: @NivekT since you are working on the README right now, we might remove the sanity check part and ask users to refer to the examples and online docs instead.

For reference, we have a folder of examples at https://github.com/pytorch/data/tree/main/examples
Our online docs also have a number of examples: https://pytorch.org/data/main/

@austinmw
Author

Thanks, I will refer to those examples!

@NivekT
Contributor

NivekT commented Jan 19, 2023

Added the fix to #954

@ejguan
Contributor

ejguan commented Jan 20, 2023

Closing as the sanity check has been removed from the README.

@ejguan ejguan closed this as completed Jan 20, 2023