Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pure functions/effect system #1631

Closed
SoniEx2 opened this issue May 26, 2016 · 53 comments
Closed

Pure functions/effect system #1631

SoniEx2 opened this issue May 26, 2016 · 53 comments
Labels
T-lang Relevant to the language team, which will review and decide on the RFC.

Comments

@SoniEx2
Copy link

SoniEx2 commented May 26, 2016

Can we get pure fn which would be a way to indicate a function has no side-effects/can only call other pure fns?

It'd prohibit mutation of statics, pointers and references, and heap allocations. Traits would be able to demand pure fns, which would be useful for getters.

@burdges
Copy link

burdges commented May 26, 2016

Would unsafe would allow mutation or calls to non-pure functions inside a pure fn? I suppose say taking a reference to an Arc<T> is not pure since it must update the reference count?

@eddyb
Copy link
Member

eddyb commented May 26, 2016

@burdges Taking a reference to Arc<T> is free, .clone() does a refcount increment.

@eddyb
Copy link
Member

eddyb commented May 26, 2016

I believe the general solution for this which is not completely cumbersome, and can support various predicates on the behavior (another useful one is "can panic/abort") is an effect system, but I can't find open issues on anything like that.

@SoniEx2
Copy link
Author

SoniEx2 commented May 26, 2016

@burdges I guess unsafe could be used to say "this doesn't strictly follow pure rules, but we can guarantee it's still pure ourselves"...

@SoniEx2
Copy link
Author

SoniEx2 commented May 26, 2016

I guess unsafe blocks inside pure impls would be useful for RPC and stuff.

@burdges
Copy link

burdges commented May 26, 2016

I'd agree with @eddyb that one should work out how a (non-)effects system might fit into Rust.

A priori, I'd wager this requires first working out how higher-kinded types fit in, if only because effects are monads in Idris. Ain't clear if a non-effect like pure can or should be viewed a monad, comonad, etc., but I doubt you drop abstraction with the non.

There might be interesting papers on using HTKs to do non-effects like pure in non-pure functional languages. Ask @edwinb if he's seen anything similar. :)

@birkenfeld
Copy link

Did anything change since http://thread.gmane.org/gmane.comp.lang.rust.devel/3674/focus=3855 ?

@glaebhoerl
Copy link
Contributor

glaebhoerl commented May 26, 2016

Rust's affine types would in fact already enable having a really strong and flexible notion of purity that's well-integrated with the existing type system, without having to add any major new features or annotations on top of it. If you chase links from this recent discussion I had on reddit you can read more of my thoughts about it. TL;DR there'd be a zero-sized IO type (a capability token) that's not Copy or Clone, I/O and shared mutable state like Cell and RefCell would require it, and any fn and &Fn which don't receive IO as an argument could therefore not be anything other than pure. Further refinements possible.

But it's not something that could be easily retrofitted onto Rust 1.x.

@ticki
Copy link
Contributor

ticki commented May 26, 2016

I am against this because of the lack of generality. pure is just one of many effects systems, which could be useful. Ideally, these should be expressed through the type system (i.e., as traits implemented by function pointers and closures), but I am unsure how it can be "restricted" (in the sense that pure code can't call impure code, non-panic code can't call panicable code, and so on). In particular, there are two ways of breaking such an restriction. The first one being by propagating it, namely letting callers carry the same bound. The second way is local and non-propagating way, which might not always be allowed.

Either way, such an addition should be general and most importantly expressive, covering lots of possible use cases (an example is the insecure, denoting standard non-UB security problems, which is frequently asked for).

Finally, I could see such a feature being abused in one way or another, especially since it might not be consistent across libraries. One way that could be solved is to provide a medium sized collection of common "effect systems" (in the lack of a better term), but uniformity is still not guaranteed: each library might have its own definition of "secure" and "not secure", and these might or might not play well together.

@SoniEx2
Copy link
Author

SoniEx2 commented May 26, 2016

pure should be synonym for "mathematical function", and unsafe inside pure should let you use I/O for RPC.

@HallM
Copy link

HallM commented May 26, 2016

Design by contract may be a possible path for this by allowing a contract to flag all side effects a function may generate. With Spark14 (maybe Ada12 as well), you specify which globals/statics you may read from, which you may read & write, and which parameters you may write. Params passed in are assumed read access. Then would need the compiler to propagate those contracts and enforce that if B calls A, then B's contract must contain the same level of side effects as A in it's contract. For backwards compatibility, no listed contracts would have to be assumed as "anything goes".

I would not agree with allowing unsafe or impure blocks inside of a pure. It breaks the contract and purpose of saying the function is pure. I use layers where the outer-most layer handles IO/user input/anything external to the app, calls an internal pure-layer to do processing, then that returns something to return back out of the app.

@SoniEx2
Copy link
Author

SoniEx2 commented May 27, 2016

I think "unsafe pure" is "hey compiler, you can't check the purity of this but I'm a smart developer so trust me on this".

Allowing some RPCs to be pure would be nice, and "unsafe pure" does exactly that. In theory you'd need to use unsafe { unsafe { stuff here } } to be able to do unsafe things, in practice that'd be a pain for the compiler and you'd instead either have impure {} or just use a plain unsafe for both.

The compiler can't guarantee a RPC is pure because 1. it knows nothing about RPCs 2. it's implemented over I/O which's impure, but that doesn't mean a RPC isn't pure.

Take, for example, getting the title of a Guake tab. This must be implemented as a RPC, but it's still pure as it mutates nothing.

@killercup
Copy link
Member

Huh, I may be crazy, but: How much of the pure fn stuff is already available by using const fn?

@gsnoff
Copy link

gsnoff commented Aug 15, 2016

@killercup I think there are quite a lot of limitations in the current RFC for const fn per se (see here, for example), and it's also quite likely that any existing implementation for that is even more limited.

@killercup
Copy link
Member

That was my expectations as well. There are (unofficial) plans to use Miri
in the future for const eval, though.

Ilyas Gasanov [email protected] schrieb am Mo. 15. Aug. 2016 um
18:00:

@killercup https://github.com/killercup I think there are quite a lot
of limitations in the current RFC for const fn per se (see here
https://github.com/rust-lang/rfcs/blob/master/text/0911-const-fn.md#detailed-design,
for example), and it's also quite likely that any existing implementation
for that is even more limited.


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#1631 (comment),
or mute the thread
https://github.com/notifications/unsubscribe-auth/AABOXwBaMWh268pLgzMbKhV31dIbqU58ks5qgI0fgaJpZM4Inbba
.

@gsnoff
Copy link

gsnoff commented Aug 15, 2016

Allowing some RPCs to be pure would be nice, and "unsafe pure" does exactly that. In theory you'd need to use unsafe { unsafe { stuff here } } to be able to do unsafe things, in practice that'd be a pain for the compiler and you'd instead either have impure {} or just use a plain unsafe for both.

@SoniEx2 In my opinion, plain unsafe should be the way to go for a number of reasons. One of the most important ones is that, for example, if we use the pure constraint on deref() in the Deref trait as a guarantee that each consequent call to it should yield the same value (see the above reference link on conversion traits), and we have some code which makes an assertion that this is always the case (safe bindings to FFI code being a major example of this, see e.g. here) - in this case using impure code can be as unsafe as one generally marked as unsafe.

@gsnoff
Copy link

gsnoff commented Aug 15, 2016

TL;DR there'd be a zero-sized IO type (a capability token) that's not Copy or Clone, I/O and shared mutable state like Cell and RefCell would require it, and any fn and &Fn which don't receive IO as an argument could therefore not be anything other than pure. Further refinements possible.

But it's not something that could be easily retrofitted onto Rust 1.x.

@glaebhoerl If traits can be implemented for fn types, a possible way of doing this would be having Io as a marker bound which would be implicitly implemented for all fn declarations, barring const fn ones, and then the pure keyword as a syntactic sugar for marking the type of fn as !Io. This would be similar to how Sized is implicit for type arguments in generics.

Either that, or we could use Pure trait with semantics inverse to that, with pure keyword being basically synonymous to marking the function type as #[derive(Pure)]. I think Sync works in a similar way.

Of course this implies that the types of a pure fn and an "impure" fn would be different, hence their signatures would not match even if otherwise similar. However I don't think that working around it would be hard; allowing linking an impure consumer with a pure producer (but not the other way around) should be a matter of little more than adding just another (generic) coercion rule.

@nrc nrc added the T-lang Relevant to the language team, which will review and decide on the RFC. label Aug 17, 2016
@nrc nrc changed the title Pure fn Pure functions/effect system Aug 17, 2016
@nrc
Copy link
Member

nrc commented Aug 17, 2016

This has been much discussed in the past, and may even have existed in very old Rust. It is highly unlikely that we'd accept an RFC for this. The best way to make progress here (for anyone interested) would be to design and implement a system in a fork of the compiler and use that to prove utility. That would be huge, but very interesting task.

@ticki
Copy link
Contributor

ticki commented Aug 18, 2016

@nrc Rust used to have an effect system (pure fn).

@brendanzab
Copy link
Member

More fine grained effect types were discussed, and proposed, but never agreed upon or implemented. I have no idea how we could retrofit it now :(

@gsnoff
Copy link

gsnoff commented Aug 18, 2016

Speaking of effect types in general, apart from side effects there is another useful concept - the one of predictable timing/performance. It would be nice to be able to guarantee that any call to some fn won't take more time - or number of instructions - than a certain fixed maximum (and in the case of instructions that maximum can be determined statically after compilation). This would disallow stuff like arbitrary loops (unless compiler can reason that a loop is finite and has a maximum number of possible iterations), heap/resource allocations, calls to other fns which aren't marked as predictable, possibly some types of dereferencing too. As a bonus, some certain metadata would be written for every such fn after compilation, depending on maximum numbers of different instruction types &c in code path, which can then be used by profiling tools.

This is similar to certain goals that language subsets like MISRA C are designed to achieve. Common use cases for this are going to be interrupt and signal handlers, and generally whatever code constrained to real-time conditions.

@socumbersome
Copy link

socumbersome commented Feb 23, 2017

How do you guys feel about having effect system similar to Koka's? (see http://www.rise4fun.com/koka/tutorialcontent/guide or https://www.microsoft.com/en-us/research/wp-content/uploads/2016/08/algeff-tr-2016-v3.pdf).
Would that fit into Rust? Certainly nice thing about such design is that purity is just a special case then.

@leonardo-m
Copy link

Adding purity to D language, despite several successive refinements, was a big failure. If you want to add something like that to Rust you need at least to do it in a much more principled way. Like row (set) effects:
http://www.purescript.org/learn/eff/

@w0rp
Copy link

w0rp commented Sep 7, 2017

I wouldn't call pure functions in D "a big failure." Pure functions in D are very useful and interesting, and the rules for how they work are very logical. There are ways of implementing pure functions in programming languages, and you have to decide what it is you want to really implement.

@Centril
Copy link
Contributor

Centril commented Apr 26, 2018

As const fns are pure (== deterministic) functions, I'm closing this issue as resolved.

@Centril Centril closed this as completed Apr 26, 2018
@w0rp
Copy link

w0rp commented Apr 26, 2018

Isn't const a compile-time only function? If so, that's not the same thing. A pure function is any function which can be executed at runtime, which cannot produce side-effects, to some defined degree of success. ("Safe" code is never really safe, as you must at some point do an unsafe thing in safe code, and that's okay.)

@Centril
Copy link
Contributor

Centril commented Apr 26, 2018

@w0rp No, const fns are not limited to compile time execution and they can be given arguments that are evaluated at runtime. But they can't do things that require run-time execution. This happens to coincide with purity.

@leonardo-m
Copy link

"Pure functions in D are very useful": what they are actually used for and useful for in D? (I've tried to use them for years and I've seen people try to use them for years). In D standard library even in the few spots where a purity annotation is useful, it's not used.

@w0rp
Copy link

w0rp commented Apr 26, 2018

@Centril Okay, that could be considered some definition of pure, then.

@leonardo-m You don't need them, like how you don't need memory safety, but they are nice to have. I'm not sure I'd be able to convince you that they are useful, but I like them. I find them useful for documenting which functions may or may not have side effects, with compile time checking for that promise of no side effects. They are just difficult to use some times because too much code has side effects, even functions as simple as sin.

@SoniEx2
Copy link
Author

SoniEx2 commented Apr 26, 2018

Things that I would consider pure (but not const):

  • Getting an &-ptr from a box (i.e. Deref).
  • RPC calls to pure functions.
  • Interning/memoization of pure functions.

Things that I would consider const (but not pure):

  • Anything that involves &mut. Just because you have a &mut, doesn't mean it's not deterministic.

@Ixrec
Copy link
Contributor

Ixrec commented Apr 26, 2018

The more persuasive argument against a separate purity effect/annotation is that Rust already tried that before 1.0. I wasn't around back then, but I've heard that we ended up removing it because it wasn't that useful, no one could agree what it meant, and had composability issues that led to a de facto standard practice of marking everything as if it was impure. Unfortunately I can't find a source to cite for that right now.

@SoniEx2
Copy link
Author

SoniEx2 commented Apr 26, 2018

I guess we mostly have purity through immutable references already. Altho some ppl (diesel.rs >.>) don't use them correctly.

@CryZe
Copy link

CryZe commented Apr 26, 2018

Altho some ppl (diesel.rs >.>) don't use them correctly.

I very much doubt that. This sounds more like there's a misunderstanding. Rust doesn't have immutable references. Rust only has shared references, and you can very much use shared references to mutate state.

@SoniEx2
Copy link
Author

SoniEx2 commented Apr 26, 2018

I'd expect diesel to take &mut Connections but (apparently?) it takes &Connections.

I do consider "immutable" references to be a form of purity system. While you can use interior mutability for everything, which is like marking everything as pure and overriding the purity system (see #1631 (comment) which says RPC can be pure), I'd rather things used & for immutable-ish/pure and encapsulation (see eventbus crate), and &mut for mutable-ish.

@hasufell
Copy link

RPC calls to pure functions.

I think that is a problematic semantics of purity and might be the cause of confusion in the future.

The problem is, I can only find one real academic definiton of purity, which is in the realm of functional programming languages (lambda calculus) and I'm not sure how useful those semantics are: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.27.7800&rep=rep1&type=pdf

Comparing with existing non-academic definitions of imperative languages:

The loosest definition is probably the gcc one (which is slightly different to const functions, because it allows reading global variables), but even that assumes no side-effects.

@SoniEx2
Copy link
Author

SoniEx2 commented Apr 28, 2018

What's wrong with the ability to have pure RPC functions?

@hasufell
Copy link

What's wrong with the ability to have pure RPC functions?

Because they are inherently side-effect full (possible network calls and whatnot, even if they are hidden behind an abstraction). I haven't seen a definition of "purity" yet that allows side-effects. I'm not convinced rust is a good experiment to introduce such a foggy definition.

@SoniEx2
Copy link
Author

SoniEx2 commented Apr 29, 2018

network state doesn't affect data state (unless you're Wireshark)

@hasufell
Copy link

hasufell commented Apr 29, 2018

network state doesn't affect data state (unless you're Wireshark)

Even if that was true, it's irrelevant to the definition of purity. Networking is a side-effect and introduces additional sources of failure that you cannot easily reason about.

The function call may fail, regardless of input. How is that pure?

You're introducing a totally new definition that no language (or academical source) has previously deployed.

@Centril
Copy link
Contributor

Centril commented Apr 29, 2018

The function call may fail, regardless of input. How is that pure?

To be clear; it is not the failure itself that is the cause of impurity; Failure can easily be modelled in a partiality monad and can be handled purely.

What is the cause of impurity is that if you call net_fun(x) twice you may get two different results (including the server failing, or simply returning something else cause of changed state on the server side) despite x being the same in both calls.
That is, networking can easily violate the requirement of a pure function f : A -> B that:
forall x : A, y : A. x == y -> f(x) == f(y).

If you somehow manage to define a linear function f : UniqueToken x A -o B that consumes the UniqueToken and for which the same UniqueToken can truly never be created twice, then f should be a pure function. Of course, to generate a truly unique token, you need impurity, so you have moved out the impurity from f to somewhere else.

@hasufell
Copy link

To be clear; it is not the failure itself that is the cause of impurity; Failure can easily be modelled in a partiality monad and can be handled purely.

What is the cause of impurity is that if you call net_fun(x) twice you may get two different results

Failure that doesn't depend on input (or initial program state) automatically violates referential transparency, no matter if you model it as a monadic value or not. So yes, I think we kind of mean the same.

Expanding on that:

Purity is most often defined in terms of evaluation. That means: if evaluation doesn't cause execution (e.g. of network calls), then the function is still pure. In practice, that means: if you make effects explicit via e.g. an IO type, then you can still have a pure function that does network stuff, given that your evaluation model doesn't cause execution of that IO action. That would allow the compiler to recognize this and have correct optimisation on pure functions in either way.

Since we don't have an IO type (anymore) and evaluation and execution are not strictly separated, we cannot allow network calls.

@Centril
Copy link
Contributor

Centril commented Apr 29, 2018

Failure that doesn't depend on input (or initial program state) automatically violates referential transparency, no matter if you model it as a monadic value or not.

We are mostly in agreement, but I have a nit: This doesn't have to be the case. We can encode the following function:

alwaysError :: b -> Maybe a
alwaysError _ = Nothing

The use of alwaysError whatever is referentially transparent, but it does not depend on the value of b. However, it is true that the call by name/need and the call by value semantics differ as alwaysError undefined == Nothing for the former two and alwaysError undefined == _|_ for the latter.

I agree with your notes on practice, however, I think I showed that there can be pure networked functions given that they take an argument for which the value can never be repeated. I think this is a bizarre theoretical result which is boring and uninteresting, but well...

EDIT: In this bizarre and unpractical world, networkedFun ut is a pure expression,
but do ut <- mkUniqueToken; pure $ networkedFun ut is not pure.

@hasufell
Copy link

We are mostly in agreement, but I have a nit: This doesn't have to be the case. We can encode the following function:

You are right, my informal definition was maybe misleading. My point was rather that whether a function is pure or not doesn't depend on how you model failure (e.g. monadic). Which I think we still agree on.

However, it is true that the call by name/need and the call by value semantics differ

Kind of, but it still meets the formal definition of purity, since it's defined as the "(weak) equivalence of call-by-name, call-by-value and call-by-need" (weak equivalence = considering undefined to be equivalent).

I think I showed that there can be pure networked functions given that they take an argument for which the value can never be repeated.

Whether a value can be repeated in practice is irrelevant for the theory of functional purity. A function doesn't suddenly turn pure or impure depending on what exact instance the value has in the range of the type.
Even if you insist on this model (which I argue is flawed, because it pulls practical computability into a theoretical equivalence model), then you would have to say "I cannot say whether the function is pure or not, because I cannot repeat the function call with the exact values".

As such, every function that causes networking during evaluation is always impure. Do we agree on that?

@SoniEx2
Copy link
Author

SoniEx2 commented May 13, 2018

let's say you have a networked FFT function. that is, you call it with some set of values, and it calculates the FFT on a computing cluster.

is it pure? yes, because it's an FFT. same input, same output, always.

@hasufell
Copy link

let's say you have a networked FFT function. that is, you call it with some set of values, and it calculates the FFT on a computing cluster.

is it pure? yes, because it's an FFT. same input, same output, always.

Absolutely not. The syscall may fail, your computer may be offline, the cluster may be down, corrupted, ...

As such, calling this function twice may totally yield different results (including unexpected failure).

@SoniEx2
Copy link
Author

SoniEx2 commented May 13, 2018

oh yeah I can also point a heatgun to my RAM while running a pure local FFT function.

those failures don't exist in the abstract machine. they're failures of the physical system.

@hasufell
Copy link

oh yeah I can also point a heatgun to my RAM while running a pure local FFT function.

Execution is always impure. I'm not sure you listened to the conversation.

@SoniEx2
Copy link
Author

SoniEx2 commented May 13, 2018

networked FFT is pure, because if you consider the network as a whole as an abstract machine, it's no different from calling a local function.

@hasufell
Copy link

because if you consider the network as a whole as an abstract machine

It isn't. That's not how you design a language. That's how you design your own architecture. You are confusing totally different abstraction levels. Please.

@SoniEx2
Copy link
Author

SoniEx2 commented May 13, 2018

the function's implementation should not affect its purity. neither should side-channels. (e.g. checking if your FFT is running locally or on another machine based on CPU usage)

@Serentty
Copy link

@SoniEx2
The language has absolutely no way to know if the function is pure or not if it's remote. Also, the odds of a network failure are quite high compared to a local failure, so the odds of the assumption of purity being violated become unacceptably high.

@Swoorup
Copy link

Swoorup commented Jan 18, 2019

Using pure means the language has to guarantee or provide contracts that there would be no side-effects. Like @Serentty said if the language have no way to figure out the purity of function, its pointless to be able to mark a function as pure.

@Centril
Copy link
Contributor

Centril commented Jan 18, 2019

I think this thread has outlived it's usefulness.

@rust-lang rust-lang locked as resolved and limited conversation to collaborators Jan 18, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
T-lang Relevant to the language team, which will review and decide on the RFC.
Projects
None yet
Development

No branches or pull requests