-
Notifications
You must be signed in to change notification settings - Fork 30
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added real2nat/real2int and stdlib expansion #117
Conversation
chi distributions added to stdlib
adding arcsin, standard normal, log-normal
pascal distributions
Merge in fix for Gamma-Poisson
Merging in latest changes
@@ -114,6 +114,8 @@ primTable = | |||
,("nat2real", primCoerce cNat2Real) | |||
,("nat2prob", primCoerce cNat2Prob) | |||
,("nat2int", primCoerce cNat2Int) | |||
,("real2int", primUnsafe cInt2Real) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These seem backwards -- why is it called real2int but then hooked to cInt2Real ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried cReal2Int at first and got an error saying it wasn't defined. Then I noticed line 110 has:
("int2nat", primUnsafe cNat2Int)
So I tried the same thing for real2int and it worked. Other than occasional rounding errors, these functions give the desired result
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That would be because they are actually identity functions in the underlying representation! But that line 110 feels like a bug too. I understand that you were misled by it!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, shall I remove this then? Removing it doesn't affect the functionality of the functions I was using it for. But my choose function does return real values with lots of trailing decimal places
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's ok to remove real2int
, but it seems that
,("int2nat", primUnsafe cNat2Int)
,("real2prob", primUnsafe cProb2Real)
,("real2int", primUnsafe cInt2Real)
are all correct given the way Language.Hakaru.Syntax.TypeCheck
handles U.UnsafeTo_
. If you take a look at the source there you'll see that dom
and cod
are used in the opposite way compared to how they are used for U.CoerceTo_
.
stdlib/betaTransformations.hk
Outdated
X <~ beta(a,b) | ||
return X / (1 - X) | ||
|
||
def stdUniform(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The fact that stdUniform
is equivalent to a beta
should not be part of the code of the standard library, but instead should be in a test.
stdlib/betaTransformations.hk
Outdated
return p | ||
|
||
|
||
invBeta(2,2) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The standard library should contain no raw calls to distributions - it should only contain definitions.
stdlib/chiDistributions.hk
Outdated
X <~ stdCauchy() | ||
return a + alpha*X | ||
|
||
T_distribution(1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment as above - no raw calls.
Also, why are all of these defined in this one file? Comments would be useful!!!
stdlib/discreteDistributions.hk
Outdated
# Limitation: n <= 20 | ||
# If return w/ real2nat, there are rounding errors | ||
def choose(n nat, k nat): | ||
real2nat((product i from 1 to n+1: i)/((product i from 1 to int2nat(n-k+1): i)*(product i from 1 to k+1: i))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the 'wrong' way to define choose. What you wrote is the mathematical definition, which is a very rotten way of actually computing it. Please re-implement.
Also, you should never use real2nat
. This is a sign of a bug.
stdlib/histogram.py
Outdated
@@ -0,0 +1,31 @@ | |||
# Histogram plotting tool to check the shape of distributions |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is useful -- but does it belong here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can put it in the test folder
stdlib/standard_normal.hk
Outdated
@@ -0,0 +1,18 @@ | |||
# Standard Normal Distribution |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't this duplicating things in chiDistribution.hk?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will remove these from chiDistribution.hk since these are simpler implementations
As I indicated on the details of PR itself, there are flaws with this. It would be best if you partitioned your PRs into smaller chunks, so that we can accept the good parts, and continue the discussion on the parts that need tweaking. |
stdlib/chiDistributions.hk
Outdated
|
||
def cauchy(a real, alpha prob): | ||
X <~ stdCauchy() | ||
return a + alpha*X |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are all together because they are related. I can seperate them out (i.e. central chi from the non-central chi). For the Cauchy and T distributions, I'd like to put them in their own file (chiTransformations) but that would require having a way to import functions from other files, which if there is, I don't know how to do. Haven't seen anything in the documentation about it.
stdlib/histogram.py
Outdated
@@ -0,0 +1,31 @@ | |||
# Histogram plotting tool to check the shape of distributions |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can put it in the test folder
@@ -114,6 +114,8 @@ primTable = | |||
,("nat2real", primCoerce cNat2Real) | |||
,("nat2prob", primCoerce cNat2Prob) | |||
,("nat2int", primCoerce cNat2Int) | |||
,("real2int", primUnsafe cInt2Real) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, shall I remove this then? Removing it doesn't affect the functionality of the functions I was using it for. But my choose function does return real values with lots of trailing decimal places
stdlib/standard_normal.hk
Outdated
@@ -0,0 +1,18 @@ | |||
# Standard Normal Distribution |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will remove these from chiDistribution.hk since these are simpler implementations
|
||
def choose(n nat, k nat): | ||
# TODO error check k <= n | ||
(product i from k+1 to n+1: i)/(product i from 1 to int2nat(n-k+1): i) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this a better definition for choose?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is still computationally quite bad. Best is likely to assume it as a primitive (i.e. that binomial
takes 2 nat
and returns a nat
). And implement that primitive in Haskell, by using the implementation from Math.Combinatorics.Exact.Binomial
(from the exact-combinatorics package). See https://hackage.haskell.org/package/exact-combinatorics-0.2.0.8/docs/Math-Combinatorics-Exact-Binomial.html if you are curious.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First let me note that what is mathematically the same thing may be best defined differently depending on how the program is going to be interpreted -- by running and sampling, by simplification and human reading, or in some other way. For simplification and human reading, this definition of choose
in terms of product
is fine. For running and sampling, if it's ok to have some rounding errors for large inputs, it's better to use betaFunc
. According to https://en.wikipedia.org/wiki/Beta_function,
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Which makes me think that we might be better off implementing choose
as a primitive!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's hard to tell without knowing how it would be used.
Yes, please remove |
More generally, Hakaru really (really!) needs some kind of 'import' facility. I think this work may well push that need "over the edge", so that it gets implemented. |
Added real2int and real2nat coercions to hakaru. Rounding errors need to be resolved
Added functions factorial() and choose() for discrete distributions
Adding a bunch of discrete distributions (binomial + transformations)
Adding beta distribution transformations (inverse Beta)
Adding gamma distribution transformations (inverse Gamma, erlang)
Adding some chi distribution transformations (F distribution, T distribution, Cauchy distribution)
Adding uniform distribution transformations (triangle, TSP (incomplete), pareto, standard Power, gompertz)