random_pairs_without_replacement_large_frames does not create random_state when it's not given as a parameter #128

Closed
pauldg opened this issue Jan 21, 2020 · 1 comment · Fixed by #130

pauldg commented Jan 21, 2020

I ran the following code:

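import recordlinkage as rl
from recordlinkage.index import Random

# df_a and df_b are pandas DataFrames loaded earlier
(len(df_a), len(df_b))
# (270351, 1555850)

indexer = rl.Index()
indexer.add(Random(n=10, replace=False))
candidate_links = indexer.index(df_a, df_b)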


AttributeError Traceback (most recent call last)
in

~/.local/lib/python3.7/site-packages/recordlinkage/base.py in index(self, x, x_link)
    122         pairs = None
    123         for cl_alg in self.algorithms:
--> 124             pairs_i = cl_alg.index(x, x_link)
    125
    126             if pairs is None:

~/.local/lib/python3.7/site-packages/recordlinkage/base.py in index(self, x, x_link)
    343         if not self._deduplication(x):
    344
--> 345             pairs = self._link_index(*x)
    346             names = self._make_index_names(x[0].index.name, x[1].index.name)
    347

~/.local/lib/python3.7/site-packages/recordlinkage/index.py in _link_index(self, df_a, df_b)
    420         else:
    421             pairs = random_pairs_without_replacement_large_frames(
--> 422                 self.n, shape, self.random_state)
    423
    424         levels = [df_a.index.values, df_b.index.values]

~/.local/lib/python3.7/site-packages/recordlinkage/algorithms/indexing.py in random_pairs_without_replacement_large_frames(n, shape, random_state)
     82         # because the duplicates are dropped).
     83         n_sample_size = (n - len(sample)) * 2
---> 84         sample = random_state.randint(n_max, size=n_sample_size)
     85
     86         # concatenate pairs and deduplicate

AttributeError: 'NoneType' object has no attribute 'randint'

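I could not find anywhere in the source code where random_state is created when it is not passed as a parameter, so None reaches random_pairs_without_replacement_large_frames unchanged.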
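For illustration, a common way to handle an optional random_state is to normalize it into a generator object before calling randint on it, following the same pattern as scikit-learn's check_random_state. The sketch below is an assumption about how such a guard could look, not the fix that was merged. The names check_random_state and sample_pairs_without_replacement are hypothetical stand-ins defined here, and the sampling loop is a simplified version of the one visible in the traceback.

```python
import numbers

import numpy as np


def check_random_state(seed=None):
    """Turn None, an int seed, or a RandomState into a RandomState.

    Hypothetical helper for this sketch; not taken from recordlinkage.
    """
    if seed is None:
        return np.random.RandomState()
    if isinstance(seed, numbers.Integral):
        return np.random.RandomState(seed)
    if isinstance(seed, np.random.RandomState):
        return seed
    raise ValueError(f"{seed!r} cannot be used to seed a RandomState")


def sample_pairs_without_replacement(n, shape, random_state=None):
    """Draw n distinct (row_a, row_b) pairs from a shape[0] x shape[1] grid.

    Simplified stand-in for the library function named in the traceback.
    """
    # The guard: random_state is never None past this line.
    random_state = check_random_state(random_state)

    n_max = shape[0] * shape[1]
    if n > n_max:
        raise ValueError("n is larger than the number of possible pairs")

    sample = np.array([], dtype=np.int64)
    while len(sample) < n:
        # Oversample by a factor of 2 because duplicates are dropped.
        n_sample_size = (n - len(sample)) * 2
        draw = random_state.randint(n_max, size=n_sample_size, dtype=np.int64)

        # Concatenate and deduplicate, keeping first-seen order.
        merged = np.concatenate([sample, draw])
        _, first_idx = np.unique(merged, return_index=True)
        sample = merged[np.sort(first_idx)]

    # Map flat pair ids back to (index in df_a, index in df_b).
    return np.unravel_index(sample[:n], shape)


# Works with no random_state, as in the failing call above.
rows_a, rows_b = sample_pairs_without_replacement(10, (270351, 1555850))
```

With a guard like this, passing an integer seed also gives reproducible pairs, while passing nothing falls back to a freshly seeded generator instead of raising AttributeError.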

J535D165 (Owner) commented Feb 2, 2020

Thanks for reporting. This is indeed a bug. PR #130 fixes it, along with a few other issues in this function.
