I ran the following code:
(len(df_a), len(df_b))
(270351, 1555850)
indexer = rl.Index()
indexer.add(Random(n=10, replace=False))
candidate_links = indexer.index(df_a, df_b)
AttributeError                            Traceback (most recent call last)
in

~/.local/lib/python3.7/site-packages/recordlinkage/base.py in index(self, x, x_link)
    122         pairs = None
    123         for cl_alg in self.algorithms:
--> 124             pairs_i = cl_alg.index(x, x_link)
    125
    126             if pairs is None:

~/.local/lib/python3.7/site-packages/recordlinkage/base.py in index(self, x, x_link)
    343         if not self._deduplication(x):
    344
--> 345             pairs = self._link_index(*x)
    346             names = self._make_index_names(x[0].index.name, x[1].index.name)
    347

~/.local/lib/python3.7/site-packages/recordlinkage/index.py in _link_index(self, df_a, df_b)
    420         else:
    421             pairs = random_pairs_without_replacement_large_frames(
--> 422                 self.n, shape, self.random_state)
    423
    424         levels = [df_a.index.values, df_b.index.values]

~/.local/lib/python3.7/site-packages/recordlinkage/algorithms/indexing.py in random_pairs_without_replacement_large_frames(n, shape, random_state)
     82         # because the duplicates are dropped).
     83         n_sample_size = (n - len(sample)) * 2
---> 84         sample = random_state.randint(n_max, size=n_sample_size)
     85
     86         # concatenate pairs and deduplicate

AttributeError: 'NoneType' object has no attribute 'randint'
In the source code, I could not find anywhere that random_state is created, so it appears to still be None when randint is called on it.
Thanks for reporting. This is indeed a bug. The PR fixes your bug and a few other issues regarding this function.
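For anyone hitting the same traceback: the failure is that `random_state` reaches `randint` as `None`. The sketch below shows one common way to guard against that, by normalizing the argument into a `numpy.random.RandomState` before sampling. It is not the code from the PR. `ensure_random_state` and `sample_flat_pair_ids` are hypothetical names made up for this illustration, and the sampling loop only loosely mirrors the lines visible in the traceback.

```python
import numpy as np


def ensure_random_state(seed=None):
    """Turn None, an int seed, or a RandomState into a RandomState.

    Hypothetical helper for illustration; not part of recordlinkage.
    """
    if seed is None:
        return np.random.RandomState()
    if isinstance(seed, np.random.RandomState):
        return seed
    if isinstance(seed, (int, np.integer)):
        return np.random.RandomState(seed)
    raise ValueError("cannot build a RandomState from %r" % (seed,))


def sample_flat_pair_ids(n, shape, random_state=None):
    """Draw n distinct flat pair ids from range(shape[0] * shape[1])."""
    rng = ensure_random_state(random_state)  # never None past this point
    n_max = shape[0] * shape[1]
    if n > n_max:
        raise ValueError("n exceeds the number of possible pairs")

    sample = np.array([], dtype=np.int64)
    while len(sample) < n:
        # Oversample by a factor of two, because duplicates are dropped.
        n_sample_size = (n - len(sample)) * 2
        draw = rng.randint(n_max, size=n_sample_size, dtype=np.int64)

        # Concatenate and deduplicate, keeping first occurrences in order.
        sample = np.concatenate([sample, draw])
        _, first_seen = np.unique(sample, return_index=True)
        sample = sample[np.sort(first_seen)]

    return sample[:n]


# Frame sizes taken from the report above.
shape = (270351, 1555850)
flat_ids = sample_flat_pair_ids(10, shape, random_state=42)

# Convert flat ids back to (row in df_a, row in df_b) positions.
rows_a, rows_b = np.unravel_index(flat_ids, shape)
```

Passing `dtype=np.int64` is deliberate: the number of possible pairs here is roughly 4.2e11, which does not fit in a 32-bit integer.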