random_pairs_without_replacement_large_frames does not create random_state when it's not given as a parameter #128

pauldg · 2020-01-21T08:44:38Z

I ran the following code:

(len(df_a), len(df_b))

(270351, 1555850)

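# Imports assumed here; they were not shown in the original report.
import recordlinkage as rl
from recordlinkage.index import Random

indexer = rl.Index()
indexer.add(Random(n=10, replace=False))
candidate_links = indexer.index(df_a, df_b)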

AttributeError Traceback (most recent call last)
in

~/.local/lib/python3.7/site-packages/recordlinkage/base.py in index(self, x, x_link)
122 pairs = None
123 for cl_alg in self.algorithms:
--> 124 pairs_i = cl_alg.index(x, x_link)
125
126 if pairs is None:

~/.local/lib/python3.7/site-packages/recordlinkage/base.py in index(self, x, x_link)
343 if not self._deduplication(x):
344
--> 345 pairs = self._link_index(*x)
346 names = self._make_index_names(x[0].index.name, x[1].index.name)
347

~/.local/lib/python3.7/site-packages/recordlinkage/index.py in _link_index(self, df_a, df_b)
420 else:
421 pairs = random_pairs_without_replacement_large_frames(
--> 422 self.n, shape, self.random_state)
423
424 levels = [df_a.index.values, df_b.index.values]

~/.local/lib/python3.7/site-packages/recordlinkage/algorithms/indexing.py in random_pairs_without_replacement_large_frames(n, shape, random_state)
82 # because the duplicates are dropped).
83 n_sample_size = (n - len(sample)) * 2
---> 84 sample = random_state.randint(n_max, size=n_sample_size)
85
86 # concatenate pairs and deduplicate

AttributeError: 'NoneType' object has no attribute 'randint'

In the source code I did not find any creation of the random_state.
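
To illustrate what seems to be missing, here is a minimal sketch of normalizing `random_state` before it is used. The helper name `ensure_random_state` is hypothetical and not part of recordlinkage, and this is not necessarily how the library fixes the problem. It only shows that the failing `randint` call works once `None` is turned into a `numpy.random.RandomState`.

```python
import numpy as np


def ensure_random_state(random_state=None):
    """Return a numpy RandomState, creating one from None or an int seed.

    Hypothetical helper for illustration only; not part of recordlinkage.
    """
    if random_state is None:
        # No seed given: let numpy seed itself.
        return np.random.RandomState()
    if isinstance(random_state, (int, np.integer)):
        return np.random.RandomState(random_state)
    if isinstance(random_state, np.random.RandomState):
        return random_state
    raise ValueError(f"cannot build a RandomState from {random_state!r}")


# Mirrors the failing call from the traceback, but never operates on None.
rng = ensure_random_state(None)
n_max = 270351 * 1555850  # number of possible pairs for the frames above
# dtype=np.int64 because n_max does not fit in a 32-bit integer.
sample = rng.randint(n_max, size=20, dtype=np.int64)
```

Until a fix is available, passing an explicit `random_state` to `Random(...)` may avoid the error. That assumes the constructor accepts such an argument, which the `self.random_state` in the traceback suggests.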


J535D165 · 2020-02-02T21:47:13Z

Thanks for reporting. This is indeed a bug. PR #130 fixes it, along with a few other issues in this function.

J535D165 mentioned this issue Feb 2, 2020

Fix bug in low memory random sampling #130

Merged

J535D165 closed this as completed in #130 Feb 23, 2020
