After training, the test output is all punctuation... #6

shizhediao · 2018-04-12T02:33:07Z

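As shown in the screenshot, I entered '你好' ("hello"),
but the output is nothing but digits and symbols...
I don't know what the problem is.
Any help would be appreciated.
Thanks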

qhduan · 2018-04-13T01:03:48Z

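I'm not sure whether you are using chatbot or chatbot_cut.

I have tuned the cut version less, so it has more problems.

That said, the non-cut results are not much better, just somewhat more stable.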

shizhediao · 2018-04-13T02:03:02Z

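I'm using chatbot...
I followed the README for every step, exactly as written,
but getting nothing but symbols is a bit confusing...
Can this model chat in Chinese, or only in English?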

qhduan · 2018-04-13T02:04:32Z

It may be a data preparation problem. If you are on Windows, some steps can run into encoding issues.

shizhediao · 2018-04-13T02:05:49Z

Right, it's Linux, Ubuntu 14.04,
and I trained on a GPU.
What could be going wrong in data preparation?
I followed this procedure 😀

shizhediao · 2018-04-13T02:06:24Z

Also, could you please take a look at this issue?
Sorry to bother you, haha.
qhduan/Seq2Seq_Chatbot_QA#21
Thanks!

qhduan · 2018-04-13T02:11:44Z

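Look at the part around the 25% mark above: test.py prints a portion of the training data (corpus sentences).

That part already looks wrong, so I suspect the training data itself (that is, chatbot.pkl) is broken.

You can open any python3 shell and run: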

import pickle
t = pickle.load(open('chatbot.pkl', 'rb'))

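Something like this, then check whether the contents of t are wrong.

If they are, I suggest re-cloning the project and redoing the data preparation from scratch.

My output from running test.py looks roughly like this: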
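The check described above can be sketched a little more systematically. This is only a sketch: the thread does not say what structure chatbot.pkl holds, so it assumes nested lists, tuples, sets, or dicts of strings, and the helper names `iter_strings`, `has_cjk`, `summarize`, and `inspect_pickle` are made up for illustration. If the pickle stores custom classes, loading it will also need those classes to be importable.

```python
import pickle


def iter_strings(obj):
    """Yield every str found inside nested lists, tuples, sets, and dicts."""
    if isinstance(obj, str):
        yield obj
    elif isinstance(obj, dict):
        for key, value in obj.items():
            yield from iter_strings(key)
            yield from iter_strings(value)
    elif isinstance(obj, (list, tuple, set, frozenset)):
        for item in obj:
            yield from iter_strings(item)


def has_cjk(text):
    """True if text contains a character in the CJK Unified Ideographs block."""
    return any('\u4e00' <= ch <= '\u9fff' for ch in text)


def summarize(obj):
    """Return (number of strings, number of strings containing Chinese characters)."""
    total = 0
    with_cjk = 0
    for s in iter_strings(obj):
        total += 1
        if has_cjk(s):
            with_cjk += 1
    return total, with_cjk


def inspect_pickle(path):
    """Load a pickle file (assumed layout: nested containers of str) and summarize it."""
    with open(path, 'rb') as f:
        data = pickle.load(f)
    return summarize(data)


# Usage in a python3 shell, from the directory that holds the file:
#   total, with_cjk = inspect_pickle('chatbot.pkl')
#   print(total, with_cjk)
```

If the second number is zero or close to it for a Chinese corpus, the data was most likely damaged before training, and retraining on the same file would not help.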

qhduan-station-gpu-01% python3 test.py 
/home/qhduan/.local/lib/python3.5/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters
畹 华 吾 侄
你 接 到 这 封 信 的 时 候
畹 华 吾 侄 ， 你 接 到 这 封 信 的 时 候
咱 们 梅 家 从 你 爷 爷 起
咱 们 梅 家 从 你 爷 爷 起
2018-04-13 10:06:39.736784: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2018-04-13 10:06:39.849721: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:898] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2018-04-13 10:06:39.850069: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1212] Found device 0 with properties: 
name: GeForce GTX 1070 major: 6 minor: 1 memoryClockRate(GHz): 1.7845
pciBusID: 0000:01:00.0
totalMemory: 7.92GiB freeMemory: 6.69GiB
2018-04-13 10:06:39.850103: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1312] Adding visible gpu devices: 0
2018-04-13 10:06:40.038533: I tensorflow/core/common_runtime/gpu/gpu_device.cc:993] Creating TensorFlow device (/device:GPU:0 with 6459 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1070, pci bus id: 0000:01:00.0, compute capability: 6.1)
2018-04-13 10:06:40.413912: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1312] Adding visible gpu devices: 0
2018-04-13 10:06:40.414090: I tensorflow/core/common_runtime/gpu/gpu_device.cc:993] Creating TensorFlow device (/device:GPU:0 with 114 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1070, pci bus id: 0000:01:00.0, compute capability: 6.1)
2018-04-13 10:06:40.583457: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1312] Adding visible gpu devices: 0
try load model from ./s2ss_chatbot.ckpt
Input Chat Sentence:你好
[[   3 1456  562]] [3]
[[2067 2017  562  530 2483  456 3425    3]]
['</s>', '好', '你']
['我', '想', '你', '会', '有', '事', '的', '</s>']
Input Chat Sentence:

shizhediao · 2018-04-13T02:23:06Z

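You're right!
I looked at chat.pkl and it is all commas, with no meaningful characters.
Could it be a system language (locale) issue? When I open extract_conv.py in vim, the code shows up garbled too...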

qhduan · 2018-04-13T02:30:50Z

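Encoding problems can probably happen on Linux too; reading and writing the intermediate files can go wrong.

I have basically only tested it on a system using UTF-8, where it works.

This is a solo project and my time is limited.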
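One way to check the encoding point above is to ask Python which encodings it picks up from the system. This is a sketch under an assumption the thread does not confirm: that the data preparation scripts open files without an explicit `encoding=` argument, so Python falls back to the locale default. The helper names `report_encodings`, `is_utf8`, and `read_lines_utf8` are hypothetical.

```python
import locale
import sys


def report_encodings():
    """Collect the encodings Python derives from the current environment."""
    return {
        # Default used by open() in text mode when no encoding is given.
        'preferred': locale.getpreferredencoding(False),
        # Used for file names.
        'filesystem': sys.getfilesystemencoding(),
        # Used when printing to the terminal (may be None if stdout is replaced).
        'stdout': getattr(sys.stdout, 'encoding', None),
    }


def is_utf8(name):
    """True if an encoding name such as 'UTF-8' or 'utf8' refers to UTF-8."""
    if name is None:
        return False
    return name.lower().replace('-', '').replace('_', '') == 'utf8'


def read_lines_utf8(path):
    """Read a text file as UTF-8 regardless of the system locale."""
    with open(path, 'r', encoding='utf-8') as f:
        return [line.rstrip('\n') for line in f]


# Usage in a python3 shell:
#   for key, value in report_encodings().items():
#       print(key, value, 'ok' if is_utf8(value) else 'NOT utf-8')
```

If `preferred` is not UTF-8 (for example an ASCII or POSIX locale), setting a UTF-8 locale such as `LANG=en_US.UTF-8` before redoing the data preparation, or passing `encoding='utf-8'` to every `open()` call as in `read_lines_utf8`, are the usual remedies.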

shizhediao · 2018-04-13T02:32:04Z

OK, thanks.
I'll give it another try.
Sorry for the bother!

sadxiaohu · 2019-03-20T08:51:59Z

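OK, thanks.
I'll give it another try.
Sorry for the bother!

Hi, did you ever solve this? I'm running into the same problem, and on Linux the encoding issue already shows up when preprocessing the training data.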
