[SPARK-4348] [PySpark] [MLlib] rename random.py to rand.py #3216

Closed
wants to merge 1 commit into from

davies
Contributor

@davies davies commented Nov 12, 2014

This PR renames random.py to rand.py to avoid the side effects of the name conflict with Python's standard-library random module, while keeping the same interface as before.

>>> from pyspark.mllib.random import RandomRDDs
$ pydoc pyspark.mllib.random
Help on module random in pyspark.mllib:
NAME
    random - Python package for random data generation.

FILE
    /Users/davies/work/spark/python/pyspark/mllib/rand.py

CLASSES
    __builtin__.object
        pyspark.mllib.random.RandomRDDs

    class RandomRDDs(__builtin__.object)
     |  Generator methods for creating RDDs comprised of i.i.d samples from
     |  some distribution.
     |
     |  Static methods defined here:
     |
     |  normalRDD(sc, size, numPartitions=None, seed=None)

cc @mengxr

reference link: http://xion.org.pl/2012/05/06/hacking-python-imports/
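
One way to keep the old import path working after such a rename can be sketched as follows. This is a minimal illustration, not the code in this PR: the package name `demo_pkg` is hypothetical, the sketch targets Python 3 (hence the explicit relative import), and registering the alias directly in `sys.modules` is an assumption chosen for brevity. Only the three alias lines mirror what the `__init__.py` excerpt in the tracebacks later in this thread shows.

```python
import os
import sys
import tempfile

# Build a throwaway package on disk. "demo_pkg" is a hypothetical stand-in
# for pyspark.mllib; it contains rand.py but no random.py.
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "demo_pkg")
os.mkdir(pkg_dir)

with open(os.path.join(pkg_dir, "rand.py"), "w") as f:
    f.write("class RandomRDDs(object):\n    pass\n")

init_src = "\n".join([
    "import sys",
    "from . import rand as random",          # the file on disk is rand.py
    "random.__name__ = 'random'",            # but it reports the old name
    "random.RandomRDDs.__module__ = __name__ + '.random'",
    # Assumption: register the alias so 'demo_pkg.random' resolves on import.
    "sys.modules[__name__ + '.random'] = random",
    "",
])
with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
    f.write(init_src)

sys.path.insert(0, root)

# The old import path still works, although no random.py exists in the package.
from demo_pkg.random import RandomRDDs

# The standard-library module is untouched by the alias.
import random as stdlib_random

print(RandomRDDs.__module__)
print(sys.modules["demo_pkg.random"].__file__)
print(stdlib_random.randint(1, 6))
```

The class reports the old module path while the module's file is still `rand.py`, which matches the `pydoc` output above.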

@davies
Contributor Author

davies commented Nov 12, 2014

cc @JoshRosen

@SparkQA

SparkQA commented Nov 12, 2014

Test build #23237 has started for PR 3216 at commit 7ac4e8b.

  • This patch merges cleanly.

@SparkQA

SparkQA commented Nov 12, 2014

Test build #23237 has finished for PR 3216 at commit 7ac4e8b.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23237/
Test PASSed.

@mengxr
Contributor

mengxr commented Nov 12, 2014

@davies I tried the following, but it failed:

In [1]: from pyspark.mllib.feature import Word2Vec
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-1-ab62774bf8f7> in <module>()
----> 1 from pyspark.mllib.feature import Word2Vec

/Users/meng/src/spark/python/pyspark/mllib/__init__.py in <module>()
     32 import rand as random
     33 random.__name__ = 'random'
---> 34 random.RandomRDDs.__module__ = __name__ + '.random'
     35
     36

AttributeError: 'module' object has no attribute 'RandomRDDs'

This doesn't work either:

In [2]: from pyspark.mllib.random import RandomRDDs
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-2-f53a0b3853a0> in <module>()
----> 1 from pyspark.mllib.random import RandomRDDs

/Users/meng/src/spark/python/pyspark/mllib/__init__.py in <module>()
     32 import rand as random
     33 random.__name__ = 'random'
---> 34 random.RandomRDDs.__module__ = __name__ + '.random'
     35
     36

AttributeError: 'module' object has no attribute 'RandomRDDs'

I'm using IPython 2.1.0 with Python 2.7.8.

@davies
Contributor Author

davies commented Nov 12, 2014

It works for me:

Python 2.7.5 (default, Mar  9 2014, 22:15:05)
Type "copyright", "credits" or "license" for more information.

IPython 2.1.0 -- An enhanced Interactive Python.
?         -> Introduction and overview of IPython's features.
%quickref -> Quick reference.
help      -> Python's own help system.
object?   -> Details about 'object', use 'object??' for extra details.

In [1]: from pyspark.mllib.feature import Word2Vec

In [2]: from pyspark.mllib.random import RandomRDDs

Could you try importing rand and show where it is loaded from?

>>> import rand
>>> rand
>>> from pyspark.mllib.rand import RandomRDDs
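
The same check can be done without importing the module at all. The sketch below assumes Python 3.4 or later, where `importlib.util.find_spec` is available (the thread itself uses Python 2.7). The helper name `locate` is hypothetical, and `json` is only a stand-in for whichever module is under suspicion.

```python
import importlib.util


def locate(module_name):
    """Return the file a module would be loaded from, or None if not found."""
    spec = importlib.util.find_spec(module_name)
    if spec is None:
        return None
    # For a package, origin points at its __init__ file; for a plain module,
    # at the module file itself.
    return spec.origin


print(locate("json"))
print(locate("no_such_module_xyz_123"))
```

If the reported path points into a directory rather than at the expected `.py` file, a package with the same name is being picked up instead.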

@mengxr
Contributor

mengxr commented Nov 12, 2014

It seems that I cannot import anything under mllib:

In [4]: import pyspark.mllib
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-4-932bde561ea1> in <module>()
----> 1 import pyspark.mllib

/Users/meng/src/spark/python/pyspark/mllib/__init__.py in <module>()
     32 import rand as random
     33 random.__name__ = 'random'
---> 34 random.RandomRDDs.__module__ = __name__ + '.random'
     35
     36

AttributeError: 'module' object has no attribute 'RandomRDDs'

@dbtsai
Member

dbtsai commented Nov 13, 2014

It works for me as well.

᚛ |activeIterator *|$ ./bin/pyspark
Python 2.7.6 (default, Sep  9 2014, 15:04:36) 
[GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.39)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
Warning: SPARK_MEM is deprecated, please use a more specific config option
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /__ / .__/\_,_/_/ /_/\_\   version 1.2.0-SNAPSHOT
      /_/

Using Python version 2.7.6 (default, Sep  9 2014 15:04:36)
SparkContext available as sc.
>>> from pyspark.mllib.feature import Word2Vec
>>> from pyspark.mllib.random import RandomRDDs

@mengxr
Contributor

mengxr commented Nov 13, 2014

It turned out I had a folder called rand under pyspark/mllib with __init__.pyc inside, which may have been sitting there since we added this random module. After I removed that folder, everything worked correctly. Thanks @davies for helping debug!

This looks good to me.
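
The shadowing described above can be reproduced in isolation. The sketch below is an illustration under stated assumptions: it runs on Python 3, the name `shadow_demo_rand` is hypothetical, and an empty `__init__.py` stands in for the stale `__init__.pyc` from the report. It shows that a package directory takes precedence over a module file of the same name, so the class defined in the `.py` file is missing until the directory is removed.

```python
import importlib
import os
import shutil
import sys
import tempfile

root = tempfile.mkdtemp()

# The real module: shadow_demo_rand.py defines RandomRDDs.
with open(os.path.join(root, "shadow_demo_rand.py"), "w") as f:
    f.write("class RandomRDDs(object):\n    pass\n")

# A stale package directory with the same name and an empty __init__.py
# (assumption: this stands in for the leftover __init__.pyc).
stale_dir = os.path.join(root, "shadow_demo_rand")
os.mkdir(stale_dir)
with open(os.path.join(stale_dir, "__init__.py"), "w") as f:
    f.write("")

sys.path.insert(0, root)
importlib.invalidate_caches()

# The directory wins over the .py file, so the class is not there.
shadowed = importlib.import_module("shadow_demo_rand")
shadowed_file = shadowed.__file__
shadowed_has_class = hasattr(shadowed, "RandomRDDs")
print(shadowed_file, shadowed_has_class)

# Remove the stale directory, forget the cached module, and import again.
shutil.rmtree(stale_dir)
del sys.modules["shadow_demo_rand"]
importlib.invalidate_caches()

fixed = importlib.import_module("shadow_demo_rand")
fixed_has_class = hasattr(fixed, "RandomRDDs")
print(fixed.__file__, fixed_has_class)
```

Printing `__file__` on the imported module is what exposes the problem: it points inside the stale directory rather than at the `.py` file.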

@mengxr
Contributor

mengxr commented Nov 13, 2014

Merged into master and branch-1.2. Thanks!

@asfgit asfgit closed this in ce0333f Nov 13, 2014
asfgit pushed a commit that referenced this pull request Nov 13, 2014
Author: Davies Liu <[email protected]>

Closes #3216 from davies/random and squashes the following commits:

7ac4e8b [Davies Liu] rename random.py to rand.py

(cherry picked from commit ce0333f)
Signed-off-by: Xiangrui Meng <[email protected]>
JoshRosen pushed a commit to JoshRosen/spark that referenced this pull request Jan 12, 2015
Author: Davies Liu <[email protected]>

Closes apache#3216 from davies/random and squashes the following commits:

7ac4e8b [Davies Liu] rename random.py to rand.py

(cherry picked from commit ce0333f)
Signed-off-by: Josh Rosen <[email protected]>

Conflicts:
	python/pyspark/mllib/feature.py
	python/run-tests
asfgit pushed a commit that referenced this pull request Jan 13, 2015
…o branch-1.1

This backports #3216 and #3669 to `branch-1.1` in order to fix the PySpark unit tests.

Author: Joseph K. Bradley <[email protected]>
Author: Davies Liu <[email protected]>

Closes #4011 from JoshRosen/pyspark-rand-fix-1.1-backport and squashes the following commits:

ace4cb6 [Joseph K. Bradley] [SPARK-4821] [mllib] [python] [docs] Fix for pyspark.mllib.rand doc
7ae5a1c [Davies Liu] [SPARK-4348] [PySpark] [MLlib] rename random.py to rand.py