  • Read bad apple and extract frames

    • Read pixel data from frame

      - hard-code color depth and number of channels for now
      - man, never knew strides were a thing (show image without stride?)
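
        A rough sketch of what the stride does when reading a pixel, assuming 3 channels of 8 bits each in RGB order (the hard-coded values mentioned above). `StrideSketch`, `rgbAt` and `strideBytes` are made-up names, not taken from the actual code.

        ```java
        final class StrideSketch {
            private static final int CHANNELS = 3; // assumption: hard-coded RGB, 8 bits per channel

            // Returns the packed 0xRRGGBB value of pixel (x, y).
            // strideBytes is the number of bytes from the start of one row to the start
            // of the next; it can be larger than width * CHANNELS because of row padding.
            static int rgbAt(byte[] data, int strideBytes, int x, int y) {
                int offset = y * strideBytes + x * CHANNELS;
                int r = data[offset] & 0xff;
                int g = data[offset + 1] & 0xff;
                int b = data[offset + 2] & 0xff;
                return (r << 16) | (g << 8) | b;
            }
        }
        ```

        Using `width * 3` instead of the real stride reads padding bytes as pixel data from the second row onwards, which is what skews the image.
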
      - holy cow it's slow... let's measure fps
      - hoisting things out of the loop (barely any effect at all.... 3.1 ->
        3.2 fps)
      - remove reader (big jump! 3 -> 9 fps!)
      - still kinda slow though, i guess it's time to bust out VisualVM and see
        why
        - oh... the string format thing is kind of slow....
          - ![](./visualvm_1.jpg)
        - using a lookup table lets us get somewhat stable 30 fps!
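
        A minimal sketch of the lookup-table idea, assuming the command being built is a Pixelflut-style `PX x y rrggbb` line (pixelpwnr-server speaks that protocol; the exact format string used here isn't shown in these notes). All names are made up for illustration.

        ```java
        import java.util.Locale;

        final class HexLookupSketch {
            // Two-character lowercase hex strings for every byte value, built once.
            private static final String[] HEX = new String[256];

            static {
                char[] digits = "0123456789abcdef".toCharArray();
                for (int i = 0; i < 256; i++) {
                    HEX[i] = new String(new char[] {digits[i >>> 4], digits[i & 0x0f]});
                }
            }

            // The slow variant: parses the format string on every call.
            static String viaFormat(int x, int y, int r, int g, int b) {
                return String.format(Locale.ROOT, "PX %d %d %02x%02x%02x\n", x, y, r, g, b);
            }

            // The lookup variant: three array reads instead of hex formatting.
            static String viaLookup(int x, int y, int r, int g, int b) {
                return new StringBuilder(24)
                        .append("PX ").append(x).append(' ').append(y).append(' ')
                        .append(HEX[r & 0xff]).append(HEX[g & 0xff]).append(HEX[b & 0xff])
                        .append('\n')
                        .toString();
            }
        }
        ```
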
      - but.... that can't be it, right? how far can we go?
      - ~120fps by only sending diffs!
      - Let's see where the bottleneck is now
        - Looks like the println statement takes quite some time
          - ![](./visualvm_2.jpg)
        - doing byte array manipulation instead of using/creating Strings seems to
          have helped somewhat.
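
        The diff idea and the byte-array idea can be sketched together like this. It assumes frames are `int[]` arrays of packed 0xRRGGBB values in row-major order and the same `PX x y rrggbb` command format as before; neither is confirmed by these notes, and the class and method names are hypothetical.

        ```java
        import java.io.ByteArrayOutputStream;
        import java.nio.charset.StandardCharsets;

        final class DiffEncoderSketch {
            private static final byte[] HEX = "0123456789abcdef".getBytes(StandardCharsets.US_ASCII);
            private static final byte[] PREFIX = {'P', 'X', ' '};

            // Emits a command only for pixels that changed since the previous frame,
            // writing ASCII bytes directly instead of building Strings.
            static byte[] encodeDiff(int[] previous, int[] current, int width) {
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                for (int i = 0; i < current.length; i++) {
                    if (previous[i] == current[i]) {
                        continue; // unchanged pixel, nothing to send
                    }
                    out.write(PREFIX, 0, PREFIX.length);
                    writeDecimal(out, i % width); // x
                    out.write(' ');
                    writeDecimal(out, i / width); // y
                    out.write(' ');
                    for (int shift = 20; shift >= 0; shift -= 4) {
                        out.write(HEX[(current[i] >>> shift) & 0x0f]);
                    }
                    out.write('\n');
                }
                return out.toByteArray();
            }

            // Writes a non-negative int as ASCII decimal digits.
            private static void writeDecimal(ByteArrayOutputStream out, int value) {
                if (value >= 10) {
                    writeDecimal(out, value / 10);
                }
                out.write('0' + value % 10);
            }
        }
        ```

        For a mostly black-and-white video like this one, consecutive frames share most pixels, so the diff is usually a small fraction of the full frame.
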
      - it is starting to become hard to measure performance just using the crude
        fps measurements, maybe it's time to improve that
      - implemented jmh
      
        - Current score:
      
        | Benchmark       | Mode | Cnt | Score  | Error   | Units |
        | --------------- | ---- | --- | ------ | ------- | ----- |
        | Benchmarks.init | avgt | 25  | 21,780 | ± 0,415 | s/op  |
      
      - i think it's time to switch to nio, since the write operation on the buffer still takes the majority of the frame time
      - ![](./visualvm_3.jpg)
      
        - score with nio:
      
        | Benchmark       | Mode | Cnt | Score  | Error   | Units |
        | --------------- | ---- | --- | ------ | ------- | ----- |
        | Benchmarks.init | avgt | 25  | 16,305 | ± 0,112 | s/op  |
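
        For reference, a minimal sketch of a blocking NIO write, assuming the encoded frame already sits in a `ByteBuffer` that has been flipped for reading. `ChannelWriteSketch` and `writeFully` are made-up names.

        ```java
        import java.io.IOException;
        import java.nio.ByteBuffer;
        import java.nio.channels.WritableByteChannel;

        final class ChannelWriteSketch {
            // Writes everything between the buffer's position and limit.
            // A single write() may be partial, so loop until the buffer is drained.
            static int writeFully(WritableByteChannel channel, ByteBuffer buffer) throws IOException {
                int total = 0;
                while (buffer.hasRemaining()) {
                    total += channel.write(buffer);
                }
                return total;
            }
        }
        ```

        In the real thing the channel would be something like a `SocketChannel` opened against the server. On a non-blocking channel this loop would spin whenever `write` returns 0, so it only makes sense as written for blocking mode.
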
      
      - and write times are looking much better
      - ![](./visualvm_4.jpg)
      - changing to non-blocking nio doesn't make a big difference. let's try double buffering!
      - ![](./visualvm_5.jpg)
      
        | Benchmark       | Mode | Cnt | Score  | Error   | Units |
        | --------------- | ---- | --- | ------ | ------- | ----- |
        | Benchmarks.init | avgt | 25  | 15,337 | ± 0,320 | s/op  |
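
        A minimal sketch of double buffering, assuming two pre-allocated buffers handed back and forth between a thread that fills in pixel commands and a thread that writes to the socket. The class and its methods are hypothetical; the actual implementation isn't shown in these notes.

        ```java
        import java.nio.ByteBuffer;
        import java.util.concurrent.ArrayBlockingQueue;
        import java.util.concurrent.BlockingQueue;

        final class DoubleBufferSketch {
            private final BlockingQueue<ByteBuffer> free = new ArrayBlockingQueue<>(2);
            private final BlockingQueue<ByteBuffer> filled = new ArrayBlockingQueue<>(2);

            DoubleBufferSketch(int capacity) {
                free.add(ByteBuffer.allocate(capacity));
                free.add(ByteBuffer.allocate(capacity));
            }

            // Producer side: blocks until a buffer is free, then hands it out empty.
            ByteBuffer acquireForFill() {
                ByteBuffer buffer = take(free);
                buffer.clear();
                return buffer;
            }

            // Producer side: marks the buffer as ready to be written out.
            void submit(ByteBuffer buffer) {
                buffer.flip();
                filled.add(buffer); // never full: only two buffers exist in total
            }

            // Writer side: blocks until a filled buffer is available.
            ByteBuffer acquireForWrite() {
                return take(filled);
            }

            // Writer side: returns the buffer so the producer can reuse it.
            void release(ByteBuffer buffer) {
                free.add(buffer);
            }

            int freeCount() {
                return free.size();
            }

            private static ByteBuffer take(BlockingQueue<ByteBuffer> queue) {
                try {
                    return queue.take();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted while waiting for a buffer", e);
                }
            }
        }
        ```

        While one buffer is being written to the channel, the other can already be filled with the next frame. If the writer is slower than the producer, `acquireForFill` is where the producer ends up waiting.
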
      
      - it's better, but writing the pixels to the buffer takes the majority of the time, let's do something about that!
      - Hm... writing in parallel did make it a bit faster, but the screen-tearing is awful and it's not as fast as it can be, since i have to synchronize the write to the buffer...
      
        | Benchmark       | Mode | Cnt | Score  | Error   | Units |
        | --------------- | ---- | --- | ------ | ------- | ----- |
        | Benchmarks.init | avgt | 25  | 14,788 | ± 0,352 | s/op  |
      
      - well... introducing local buffers _did_ make the screen tearing go away, but the performance is virtually the same
      - oh.....
      - ![](./visualvm_6.jpg)
      - looks like we wait half the frame time to acquire the next frame buffer...
      - replaced `allocate` with `allocateDirect` (heap vs. off-heap?), but that just makes the share of time spent in acquire worse...
      
        | Benchmark       | Mode | Cnt | Score  | Error   | Units |
        | --------------- | ---- | --- | ------ | ------- | ----- |
        | Benchmarks.init | avgt | 25  | 14,698 | ± 0,102 | s/op  |
      
      - running on a stronger machine (8 cores instead of 4) improves things again, but... throwing hardware at the problem is only half as fun
      
        | Benchmark       | Mode | Cnt | Score  | Error   | Units |
        | --------------- | ---- | --- | ------ | ------- | ----- |
        | Benchmarks.init | avgt | 25  | 10.256 | ± 0.319 | s/op  |
      
      - .... or is it the stronger single cores??
        - Why is the pixelpwnr-server using exactly 1 core to the max?
        - ![](./usage_pixelpwn-server.jpg)
        - Okay, i should have guessed as much, but the server i am running against is bound to 1 thread per connection....
          - That explains the long wait times for the buffer to be released...
          - Should i change the server code?
          - Should i update my code to spawn multiple connections?
          - ..... changing the server would be quite fun... but maybe not in the spirit of this endeavour?
          - Decisions decisions...
        - After a _very crude_ test, results with multiple connections are a lot better, but... the screen tearing.... eww, not worth it

          | Benchmark       | Mode | Cnt | Score | Error   | Units |
          | --------------- | ---- | --- | ----- | ------- | ----- |
          | Benchmarks.init | avgt | 25  | 5.578 | ± 0.154 | s/op  |
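
One way the multiple-connection test could be set up is to give each connection its own horizontal band of the frame. This is only a sketch of that split under that assumption; how the crude test actually divided the work isn't recorded here, and `BandSplitSketch` and `rowBands` are made-up names.

```java
final class BandSplitSketch {
    // Splits `height` rows into `connections` contiguous bands.
    // Each entry is {startRow, endRowExclusive}; earlier bands absorb the remainder.
    static int[][] rowBands(int height, int connections) {
        if (height < 0 || connections <= 0) {
            throw new IllegalArgumentException("height must be >= 0 and connections > 0");
        }
        int[][] bands = new int[connections][2];
        int base = height / connections;
        int extra = height % connections;
        int start = 0;
        for (int i = 0; i < connections; i++) {
            int rows = base + (i < extra ? 1 : 0);
            bands[i][0] = start;
            bands[i][1] = start + rows;
            start += rows;
        }
        return bands;
    }
}
```

Since the server handles each connection on its own thread, the bands are drawn independently of each other, which would explain the tearing along band boundaries.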