TypeError: int() argument must be a string, a bytes-like object or a number, not 'NoneType' #204

Closed
Aldemaro14 opened this issue Jul 24, 2020 · 2 comments

Comments

@Aldemaro14

Hello good people, I'm trying to train this model with my own data. After working through some other issues, I now get the following error when running this command:

`(venv) C:\Users\itres\Desktop\OCR\craft_crnn\deep-text-recognition-benchmark>python train.py --train_data ../result --valid_data ../result_val --Transformation None --FeatureExtraction VGG --SequenceModeling BiLSTM --Prediction CTC
Filtering the images containing characters which are not in opt.character
Filtering the images whose label is longer than opt.batch_max_length

dataset_root: ../result
opt.select_data: ['/']
opt.batch_ratio: ['1']

dataset_root: ../result dataset: /
None
Traceback (most recent call last):
  File "train.py", line 304, in <module>
    train(opt)
  File "train.py", line 31, in train
    train_dataset = Batch_Balanced_Dataset(opt)
  File "C:\Users\itres\Desktop\OCR\craft_crnn\deep-text-recognition-benchmark\dataset.py", line 42, in __init__
    _dataset, _dataset_log = hierarchical_dataset(root=opt.train_data, opt=opt, select_data=[selected_d])
  File "C:\Users\itres\Desktop\OCR\craft_crnn\deep-text-recognition-benchmark\dataset.py", line 118, in hierarchical_dataset
    dataset = LmdbDataset(dirpath, opt)
  File "C:\Users\itres\Desktop\OCR\craft_crnn\deep-text-recognition-benchmark\dataset.py", line 143, in __init__
    nSamples = int(txn.get('num-samples'.encode()))
TypeError: int() argument must be a string, a bytes-like object or a number, not 'NoneType'`

I was also able to generate the LMDB dataset:

image
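The traceback above shows `txn.get('num-samples'.encode())` returning `None`, which `int()` then rejects. As a rough sketch (not the repository's code), a small guard could turn that into a more readable error. The helper name `parse_num_samples` and its `dataset_path` argument are hypothetical, introduced only for illustration:

```python
from typing import Optional


def parse_num_samples(raw: Optional[bytes], dataset_path: str) -> int:
    """Convert the raw 'num-samples' value to an int, with a clearer error.

    `raw` stands for whatever txn.get('num-samples'.encode()) returned.
    It is None when the key is missing, for example when the path does
    not point at the folder that actually holds the LMDB data.
    """
    if raw is None:
        raise ValueError(
            f"No 'num-samples' key found in '{dataset_path}'. "
            "Check that --train_data / --valid_data point at the folder "
            "written by create_lmdb_dataset.py."
        )
    # int() accepts a bytes object holding ASCII digits.
    return int(raw)
```

With a guard like this, a missing key reports the dataset path directly instead of surfacing as a `TypeError` about `NoneType`.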

@Aldemaro14
Author

#172 (comment)

I also followed this advice and ran into another error:

(venv) C:\Users\itres\Desktop\OCR\craft_crnn\deep-text-recognition-benchmark>python create_lmdb_dataset.py --inputPath ../data/training_data --gtFile ../data/training_data/gt.txt --outputPath result/
Traceback (most recent call last):
  File "create_lmdb_dataset.py", line 89, in <module>
    fire.Fire(createDataset)
  File "C:\Users\itres\Desktop\OCR\craft_crnn\venv\lib\site-packages\fire\core.py", line 138, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "C:\Users\itres\Desktop\OCR\craft_crnn\venv\lib\site-packages\fire\core.py", line 463, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
  File "C:\Users\itres\Desktop\OCR\craft_crnn\venv\lib\site-packages\fire\core.py", line 672, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "create_lmdb_dataset.py", line 49, in createDataset
    imagePath, label = datalist[i].strip('\n').split('\t', 1)
ValueError: not enough values to unpack (expected 2, got 1)

My gt.txt separates the image path and the label with a space, not a tab. It worked properly after changing that line in create_lmdb_dataset.py from this:
imagePath, label = datalist[i].strip('\n').split('\t')
to this:
imagePath, label = datalist[i].strip('\n').split(' ', 1)
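For anyone who would rather not edit the separator by hand, here is a minimal sketch of a splitter that accepts either format. It is not part of the repository; `split_gt_line` is a hypothetical helper name. It prefers a tab and falls back to the first space, so labels may contain spaces, but with the space fallback the image path itself must not.

```python
from typing import Tuple


def split_gt_line(line: str) -> Tuple[str, str]:
    """Split one ground-truth line into (image_path, label).

    Tries a tab separator first, then falls back to the first space.
    Splitting only once keeps any spaces inside the label intact.
    """
    line = line.rstrip("\r\n")
    for sep in ("\t", " "):
        if sep in line:
            image_path, label = line.split(sep, 1)
            return image_path, label
    raise ValueError(f"No tab or space separator in line: {line!r}")
```

A line with neither separator raises a `ValueError` that shows the offending line, which is easier to trace than the unpacking error above.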

@Aldemaro14
Author

Fixed.

1st- you can train the model with a gt file that uses image path + space instead of image path + tab; just use the code above.

2nd- the issue was that I was pointing to the wrong folder.
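Since the root cause was a wrong folder, a quick pre-flight check before training may help. This is only a sketch under one assumption: that the dataset was written with LMDB's default directory layout, which stores the data in a file named `data.mdb` inside the environment folder. The function name `looks_like_lmdb_dir` is hypothetical.

```python
from pathlib import Path


def looks_like_lmdb_dir(path: str) -> bool:
    """Return True if `path` is a directory that contains a data.mdb file."""
    folder = Path(path)
    return folder.is_dir() and (folder / "data.mdb").is_file()
```

Calling it on the values passed to `--train_data` and `--valid_data` (or on their dataset subfolders) would flag a mistyped path before `train.py` tries to read `num-samples`.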
