
How to get DIN data from the raw data #21

Open · xhygh opened this issue Dec 6, 2018 · 10 comments

@xhygh commented Dec 6, 2018

I downloaded the dataset, but the data-reading code in the aliccp files doesn't match the dataset's file names or format, so it can't be processed. Was the dataset replaced?
The criteo dataset contains only readme.txt, train.txt, and test.txt; there are no files with the *-* naming used in aliccp, and the files contain no "," separator.

@lambdaji (Owner) commented Dec 7, 2018

DIN does not use the criteo dataset; refer to DeepMTL.

@xhygh (Author) commented Dec 11, 2018

Hi, I downloaded the Tianchi data from the link, but its file naming still doesn't match the "-" format expected by get_tfrecord.py. Did the official source replace the data, or could you share part of the processed data? I'd like to get the code running first so I can look at the model structure.

@lambdaji (Owner) commented:

Refer to DeepMTL/feature_pipline.

@xhygh (Author) commented Dec 11, 2018

I did refer to get_tfrecord.py; your DeepMTL notes say to run it, but the data format doesn't match.

@xhygh (Author) commented Dec 11, 2018

I see there are .sh scripts that use hadoop, e.g. the get_join files. If they are for processing the aliccp data, could you explain how to use them? I don't know how to use hadoop and would have to ask someone else to run it for me.

@lambdaji (Owner) commented:

raw files -----> libsvm ----> tfrecords
The tool doesn't matter; the sh+hadoop scripts only exist to convert the raw data into libsvm format (e.g. 40362692,0,0,216:9342395:1.0 301:9351665:1.0 205:7702673:1.0 206:8317829:1.0 207:8967741:1.0 508:9356012:2.30259), and then get_tfrecords.py converts the libsvm data into tfrecord.
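As a sanity check on the intermediate format, the sample line above can be parsed like this (a sketch; reading the first column as a sample id and the next two as labels is my assumption from the example, not confirmed by the repo):

```python
# Parse one libsvm-style line of the form
#   "40362692,0,0,216:9342395:1.0 301:9351665:1.0 ..."
# into (sample_id, labels, [(field, feature_id, value), ...]).
def parse_libsvm_line(line):
    parts = line.strip().split(",")
    sample_id = parts[0]                      # assumed: sample id
    labels = [int(x) for x in parts[1:3]]     # assumed: two binary labels
    triples = []
    for token in parts[3].split():
        field, feat, val = token.split(":")
        # field stays a string: ids like "127_14" are not plain integers
        triples.append((field, int(feat), float(val)))
    return sample_id, labels, triples
```

get_tfrecords.py would then, presumably, pack these triples into tf.train.Example protos and write them out with a TFRecordWriter.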

@xhygh (Author) commented Dec 12, 2018

Seeing your get_join scripts, I assumed they were the "raw data --> libsvm" step, and since you had already written them, I hoped to skip the data-processing part and focus on the model itself first.
From reading your code I understand the overall pipeline, but I've never used the .sh scripts, so I want to skip them for now and learn them later; after all, my focus is the model, not the data.

@xhygh (Author) commented Dec 12, 2018

Solution: skip the Tianchi dataset preprocessing and use fake data, e.g. 1,0,0,216:19:1.0 301:11:1.0 205:10:1.0 206:16:1.0 207:17:1.0 508:23:2.30259 210:19:1.0 210:20:1.0 210:21:1.0 210:22:1.0 210:24:1.0 127_14:14:2.3979 127_14:25:2.70805. Paste a few such lines; it just needs to run.
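The fake-data workaround above can be scripted instead of pasted by hand. A throwaway sketch (the field ids, value ranges, and output filename are made up; the only goal is to emit lines in the same id,label,label,field:feat:val format):

```python
import random

def fake_line(rng, n_multi=5):
    """Emit one fake libsvm-style line: id,label,label,field:feat:val ..."""
    fields = ["216", "301", "205", "206", "207", "508"]
    tokens = ["%s:%d:1.0" % (f, rng.randint(1, 30)) for f in fields]
    # repeating field 210 mimics a multi-valued (behavior-sequence) feature
    tokens += ["210:%d:1.0" % rng.randint(1, 30) for _ in range(n_multi)]
    return "%d,%d,0,%s" % (rng.randint(1, 100), rng.randint(0, 1),
                           " ".join(tokens))

rng = random.Random(42)
with open("fake_train.libsvm", "w") as f:
    for _ in range(100):
        f.write(fake_line(rng) + "\n")
```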

xhygh closed this as completed Dec 12, 2018
@zhenghang commented:

> Solution: skip the Tianchi dataset preprocessing and use fake data, e.g. 1,0,0,216:19:1.0 301:11:1.0 205:10:1.0 206:16:1.0 207:17:1.0 508:23:2.30259 210:19:1.0 210:20:1.0 210:21:1.0 210:22:1.0 210:24:1.0 127_14:14:2.3979 127_14:25:2.70805. Paste a few such lines; it just needs to run.

With your method the model still doesn't run for me. Did you manage to get it running?

xhygh reopened this Dec 13, 2018
@xhygh (Author) commented Dec 13, 2018

> With your method the model still doesn't run for me. Did you manage to get it running?

Running this code, I get:
INFO:tensorflow:Saving checkpoints for 0 into 20181212\model.ckpt.
INFO:tensorflow:Loss for final step: None.
WARNING:tensorflow:No new checkpoint ready for evaluation. Skip the current evaluation pass as evaluation results are expected to be same for the same checkpoint.
Traceback (most recent call last):
File "D:/workspace/tmpsfk/tf_repos/deep_ctr/Model_pipeline/DIN.py", line 401, in <module>
tf.app.run()
File "D:\python\anaconda3\lib\site-packages\tensorflow\python\platform\app.py", line 125, in run
_sys.exit(main(argv))
File "D:/workspace/tmpsfk/tf_repos/deep_ctr/Model_pipeline/DIN.py", line 377, in main
tf.estimator.train_and_evaluate(Estimator, train_spec, eval_spec)
File "D:\python\anaconda3\lib\site-packages\tensorflow\python\estimator\training.py", line 447, in train_and_evaluate
return executor.run()
File "D:\python\anaconda3\lib\site-packages\tensorflow\python\estimator\training.py", line 531, in run
return self.run_local()
File "D:\python\anaconda3\lib\site-packages\tensorflow\python\estimator\training.py", line 687, in run_local
'Eval status: {}'.format(eval_result.status))
RuntimeError: There was no new checkpoint after the training. Eval status: no new checkpoint

Process finished with exit code 1
For this error, I ended up running it inside the official wide&deep example: swap in the model_fn and input_fn, comment out export_model, and it works. I only wanted to look at the network structure.
