Skip to content

Commit

Permalink
[MLlib] fix python example of ALS in guide
Browse files Browse the repository at this point in the history
fix python example of ALS in guide, use Rating instead of np.array.

Author: Davies Liu <[email protected]>

Closes apache#4226 from davies/fix_als_guide and squashes the following commits:

1433d76 [Davies Liu] fix python example of als in guide
  • Loading branch information
Davies Liu authored and mengxr committed Jan 27, 2015
1 parent ff356e2 commit fdaad4e
Showing 1 changed file with 5 additions and 6 deletions.
11 changes: 5 additions & 6 deletions docs/mllib-collaborative-filtering.md
Original file line number Diff line number Diff line change
Expand Up @@ -192,23 +192,22 @@ We use the default ALS.train() method which assumes ratings are explicit. We eva
recommendation by measuring the Mean Squared Error of rating prediction.

{% highlight python %}
from pyspark.mllib.recommendation import ALS
from numpy import array
from pyspark.mllib.recommendation import ALS, Rating

# Load and parse the data
data = sc.textFile("data/mllib/als/test.data")
ratings = data.map(lambda line: array([float(x) for x in line.split(',')]))
ratings = data.map(lambda l: l.split(',')).map(lambda l: Rating(int(l[0]), int(l[1]), float(l[2])))

# Build the recommendation model using Alternating Least Squares
rank = 10
numIterations = 20
model = ALS.train(ratings, rank, numIterations)

# Evaluate the model on training data
testdata = ratings.map(lambda p: (int(p[0]), int(p[1])))
testdata = ratings.map(lambda p: (p[0], p[1]))
predictions = model.predictAll(testdata).map(lambda r: ((r[0], r[1]), r[2]))
ratesAndPreds = ratings.map(lambda r: ((r[0], r[1]), r[2])).join(predictions)
MSE = ratesAndPreds.map(lambda r: (r[1][0] - r[1][1])**2).reduce(lambda x, y: x + y)/ratesAndPreds.count()
MSE = ratesAndPreds.map(lambda r: (r[1][0] - r[1][1])**2).reduce(lambda x, y: x + y) / ratesAndPreds.count()
print("Mean Squared Error = " + str(MSE))
{% endhighlight %}

Expand All @@ -217,7 +216,7 @@ signals), you can use the trainImplicit method to get better results.

{% highlight python %}
# Build the recommendation model using Alternating Least Squares based on implicit ratings
model = ALS.trainImplicit(ratings, rank, numIterations, alpha = 0.01)
model = ALS.trainImplicit(ratings, rank, numIterations, alpha=0.01)
{% endhighlight %}
</div>

Expand Down

0 comments on commit fdaad4e

Please sign in to comment.