Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use compose-with-values instead of compose #191

Open
wants to merge 2 commits into
base: deforest-all-the-things
Choose a base branch
from

Conversation

countvajhula
Copy link
Collaborator

@countvajhula countvajhula commented Jan 16, 2025

Summary of Changes

This approach was suggested by @samth @notjack and @benknoble on Discord as an alternative to using compose, and which may be more compatible with Typed Racket.

Below is a script for basic benchmarking written by Sam (and modified to include using compose-with-values). This clearly shows a greater than 2x improvement. But after integrating it into the Qi compiler (i.e., this PR), the existing composition benchmark seems to show a slight degradation in performance. It's hard to tell since the number is already so low, but increasing the number of runs by an order of magnitude shows comparable or slightly worse performance as a result of the change. This seems unexpected though, and I'm more inclined to trust the standalone result, but it would be great to at least understand the reason for the discrepancy between the results.

To run the Qi benchmark for "composition" (along with others -- make install-sdk first, though, if you haven't already):

$ make benchmark-nonlocal

The standalone benchmarking script:

#lang racket

(require syntax/parse/define)

(define (add1 x) (+ x 1))
(define (sub1 x) (- x 1))

(define (f1 x)
  (add1 (sub1 x)))

(define (f2 x)
  (call-with-values
   (lambda ()
     (call-with-values
      (lambda ()
        x)
      sub1))
   add1))

(define (f3 x)
  ((compose add1 sub1) x))

(define (f4 x)
  ((compose1 add1 sub1) x))

(define-syntax-parser compose-with-values
  [(_ args)
   #'(apply values args)]
  [(_ args f g ...)
   #'(call-with-values
      (λ ()
        (compose-with-values args g ...))
      f)])

(define-syntax-parser comp
  [(_ f ...) #'(λ args
                 (compose-with-values args
                                      f ...))])

(define (f5 x)
  ((comp add1 sub1) x))

(define N 100000000)

(collect-garbage)
'app
(time (for/fold ([k 0]) ([i (in-range N)])
    (f1 i)))
'call-with-values
(time (for/fold ([k 0]) ([i (in-range N)])
    (f2 i)))
'compose
(time (for/fold ([k 0]) ([i (in-range N)])
    (f3 i)))
'compose1
(time (for/fold ([k 0]) ([i (in-range N)])
        (f4 i)))
'compose-with-values
(time (for/fold ([k 0]) ([i (in-range N)])
        (f5 i)))

Public Domain Dedication

  • In contributing, I relinquish any copyright claims on my contribution and freely release it into the public domain in the simple hope that it will provide value.

(Why: The freely released, copyright-free work in this repository represents an investment in a better way of doing things called attribution-based economics. Attribution-based economics is based on the simple idea that we gain more by giving more, not by holding on to things that, truly, we could only create because we, in our turn, received from others. As it turns out, an economic system based on attribution -- where those who give more are more empowered -- is significantly more efficient than capitalism while also being stable and fair (unlike capitalism, on both counts), giving it transformative power to elevate the human condition and address the problems that face us today along with a host of others that have been intractable since the beginning. You can help make this a reality by releasing your work in the same way -- freely into the public domain in the simple hope of providing value. Learn more about attribution-based economics at drym.org, tell your friends, do your part.)

This was suggested by @samth, @notjack and @benknoble, as a way both
to make composition using `~>` more performant, and as a step towards
making it interoperable with Typed Racket.

Currently, although standalone benchmarking shows a > 2x improvement,
our "smoke" nonlocal benchmark for composition seems to show a slight
drop in performance. Reason for the discrepancy unclear.
@benknoble
Copy link
Collaborator

I have a feeling the outer (lambda args (and maybe deeply-nested (apply values args)) is what's hurting in our case.

Are there major cp0 output differences?

It would be nice to say something more substantial here, but I don't
understand the issues yet myself. For now, it's at least useful to
acknowledge this in a minimal way.
@countvajhula
Copy link
Collaborator Author

The outer args and deeply nested (apply values args) are present in the standalone benchmark too, though. And that shows the speedup that the Qi nonlocal benchmark doesn't show...

cp0, I created this module:

#lang racket

(require qi)

(define-flow composition
  (~> add1 sqr sub1))

(composition 5)

Then I ran:

$ PLT_LINKLET_SHOW_CP0=1 raco make scratch.rkt

That produced a ton of output with many sections claiming to be "cp0". But I found what appeared to be the right cp0 output.

When using the original compose:

;; cp0 ---------------------
(lambda (instance-variable-reference .get-syntax-literal!1
         .set-transformer!2 compose3 sqr4 print-values5 composition6
         lifted/1.17)
  (variable-set!/define lifted/1.17 (#2%void) 'consistent)
  (call-with-module-prompt
    (lambda ()
      ((if (#2%procedure? compose3)
           compose3
           ($app/no-inline
             (#3%$top-level-value 'slow-extract-procedure)
             compose3
             3))
        #2%sub1
        sqr4
        #2%add1))
    '(composition)
    '(constant)
    composition6)
  (let ([composition (#3%$object-ref
                       'scheme-object
                       composition6
                       9)])
    (call-with-module-prompt
      (lambda ()
        (#%call-with-values
          (lambda ()
            ((if (#2%procedure? composition)
                 composition
                 ($app/no-inline
                   (#3%$top-level-value 'slow-extract-procedure)
                   composition
                   1))
              5))
          print-values5)))
    (#2%void)))

When using compose-with-values:

;; cp0 ---------------------
(lambda (instance-variable-reference .get-syntax-literal!1
         .set-transformer!2 sqr3 print-values4 composition5
         lifted/1.16)
  (letrec ([composition (lambda args_2
                          (#%call-with-values
                            (lambda ()
                              (#%call-with-values
                                (lambda ()
                                  (#%call-with-values
                                    (lambda () (apply #2%values args_2))
                                    #2%add1))
                                sqr3))
                            #2%sub1))])
    (set-consistent-variables!/define
      (#2%vector lifted/1.16 composition5)
      (#2%vector (#2%void) composition))
    (call-with-module-prompt
      (lambda ()
        (#%call-with-values
          (lambda ()
            (let ([args_2 (#3%list 5)])
              (#%call-with-values
                (lambda ()
                  (#%call-with-values
                    (lambda ()
                      (#%call-with-values
                        (lambda () (apply #2%values args_2))
                        #2%add1))
                    sqr3))
                #2%sub1)))
          print-values4)))
    (#2%void)))

That suggest anything to you?

@benknoble
Copy link
Collaborator

does that suggest anything to you

No, but there are some apparent differences in the way the composition is handled (see the lets).

@samth might be about to land some improvements in the compiler though. racket/racket#5182

@countvajhula
Copy link
Collaborator Author

I verified that cp0 for Sam's original script for the two alternatives -- i.e., call-with-values manually and using compose-with-values (which expands to repeated call-with-values) -- are effectively the same (modulo the extra wrapping lambda and final apply -- but that's just due to how they are written in the source module):

compose-with-values:

	[f5 (lambda (x_133)
		  (let ([args_134 (#2%list x_133)])
			(#%call-with-values
			  (lambda ()
				(#%call-with-values
				  (lambda () (apply #2%values args_134))
				  1/sub1))
			  1/add1)))]

call-with-values:

	[f2 (lambda (x_130)
		  (#%call-with-values
			(lambda ()
			  (#%call-with-values (lambda () x_130) 1/sub1))
			1/add1))]

@countvajhula
Copy link
Collaborator Author

Looking again at CP0 once integrated into the compiler, it looks like this bit that composes the functions is repeated twice:

(lambda args_2
  (#%call-with-values
    (lambda ()
      (#%call-with-values
        (lambda ()
          (#%call-with-values
            (lambda () (apply #2%values args_2))
            #2%add1))
        sqr3))
    #2%sub1))

... once in the letrec bindings clause, and then again in the body, where the binding, composition, is just not used! It seems to be doing double the work, which would seem to explain why the 2x improvement becomes nullified.

Looking at the changes in this PR, though, I don't immediately see why it should be duplicated that way. Any ideas?

@benknoble
Copy link
Collaborator

Looking again at CP0 once integrated into the compiler, it looks like this bit that composes the functions is repeated twice:

(lambda args_2
  (#%call-with-values
    (lambda ()
      (#%call-with-values
        (lambda ()
          (#%call-with-values
            (lambda () (apply #2%values args_2))
            #2%add1))
        sqr3))
    #2%sub1))

... once in the letrec bindings clause, and then again in the body, where the binding, composition, is just not used! It seems to be doing double the work, which would seem to explain why the 2x improvement becomes nullified.

Looking at the changes in this PR, though, I don't immediately see why it should be duplicated that way. Any ideas?

That stood out to me, too, and I wasn't clear at pointing it out. Not sure why it's happening, but a key factor is probably

    (set-consistent-variables!/define
      (#2%vector lifted/1.16 composition5)
      (#2%vector (#2%void) composition))

("Macro magic" ?)

@samth
Copy link

samth commented Jan 27, 2025

  1. I think the duplication is just inlining.
  2. set-consistent-variables!/define is part of how linklets/modules are implemented by translation to Chez Scheme. There are two module-level variables there, and they get filled in with those two values

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants