
Investigate other marker input strategies #7

Closed
zepumph opened this issue Mar 23, 2021 · 4 comments
@zepumph
Member

zepumph commented Mar 23, 2021

Over in #4 we added beholder-detection to ratio and proportion, but the responsiveness with motion tracking was not sufficient. It would be best to investigate other software.

Tagging @BLFiedler to mention timeline and priority here as it pertains to future data studies.

Potential leads:

https://trackingjs.com/
https://www.npmjs.com/package/handtrackjs (though this may be too specific for our needs in Ratio and Proportion; a rough sketch of its detection loop follows below)
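
For reference, here's a rough, untested sketch of what a handtrack.js detection loop could look like, based on its README examples. `moveSimPointer` is a hypothetical callback standing in for however the sim would consume the position, and is not part of the library:

```js
// Untested sketch: poll handtrack.js and forward the first detected hand's
// bounding-box center as a normalized (0..1) position.
// Assumes the handtrack.js script tag has been loaded, exposing the handTrack global.
// moveSimPointer is a hypothetical callback, not part of handtrack.js.
const video = document.querySelector( 'video' );

const modelParams = {
  flipHorizontal: true, // mirror for a front-facing webcam
  maxNumBoxes: 1,       // track a single hand
  scoreThreshold: 0.6
};

handTrack.load( modelParams ).then( model => {
  handTrack.startVideo( video ).then( status => {
    if ( status ) {
      const runDetection = () => {
        model.detect( video ).then( predictions => {
          if ( predictions.length > 0 ) {

            // bbox is [ x, y, width, height ] in video pixel coordinates
            const [ x, y, width, height ] = predictions[ 0 ].bbox;
            moveSimPointer( ( x + width / 2 ) / video.videoWidth,
              ( y + height / 2 ) / video.videoHeight );
          }
          requestAnimationFrame( runDetection );
        } );
      };
      runDetection();
    }
  } );
} );
```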

zepumph self-assigned this Mar 23, 2021
@samreid
Member

samreid commented Oct 6, 2021

Here's an open source library for hand and face tracking that works great on my machine:
https://storage.googleapis.com/tfjs-models/demos/handtrack/index.html
https://blog.tensorflow.org/2020/03/face-and-hand-tracking-in-browser-with-mediapipe-and-tensorflowjs.html
https://codepen.io/mediapipe/pen/RwGWYJw

@samreid
Member

samreid commented Oct 6, 2021

I was interested in the MediaPipe implementation and wanted to test it out in a PhET sim. I used it to implement gesture-based control for Build an Atom using this patch:

Index: js/common/view/BAAScreenView.js
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/js/common/view/BAAScreenView.js b/js/common/view/BAAScreenView.js
--- a/js/common/view/BAAScreenView.js	(revision 0661b5ff8c8ffb7316bc3c8e5936b4cafe59fd1f)
+++ b/js/common/view/BAAScreenView.js	(date 1633537840919)
@@ -15,6 +15,8 @@
 import Shape from '../../../../kite/js/Shape.js';
 import ModelViewTransform2 from '../../../../phetcommon/js/view/ModelViewTransform2.js';
 import BucketFront from '../../../../scenery-phet/js/bucket/BucketFront.js';
+import Vector3 from '../../../../dot/js/Vector3.js';
+import Utils from '../../../../dot/js/Utils.js';
 import BucketHole from '../../../../scenery-phet/js/bucket/BucketHole.js';
 import ResetAllButton from '../../../../scenery-phet/js/buttons/ResetAllButton.js';
 import PhetFont from '../../../../scenery-phet/js/PhetFont.js';
@@ -195,6 +197,8 @@
       } ) );
     } );
 
+    this.pressProperty = new BooleanProperty( false );
+
     // Add the particle count indicator.
     const particleCountDisplay = new ParticleCountDisplay( model.particleAtom, 13, 250, {
       tandem: tandem.createTandem( 'particleCountDisplay' )
@@ -380,6 +384,45 @@
   reset() {
     this.periodicTableAccordionBoxExpandedProperty.reset();
   }
+
+  step( dt ) {
+    if ( window.results && window.results.multiHandLandmarks.length > 0 ) {
+
+      const thumb = window.results.multiHandLandmarks[ 0 ][ 4 ];
+      const indexFinger = window.results.multiHandLandmarks[ 0 ][ 8 ];
+
+      const thumbVector = new Vector3( thumb.x, thumb.y, thumb.z );
+      const indexVector = new Vector3( indexFinger.x, indexFinger.y, indexFinger.z );
+      const d = thumbVector.distance( indexVector );
+
+      const xValues = window.results.multiHandLandmarks[ 0 ].map( landmark => landmark.x );
+      const yValues = window.results.multiHandLandmarks[ 0 ].map( landmark => landmark.y );
+
+      const x = Utils.linear( 0.2, 0.8, phet.joist.sim.display.width, 0, _.mean( xValues ) );
+      const y = Utils.linear( 0.2, 0.8, 0, phet.joist.sim.display.height, _.mean( yValues ) );
+
+      // our move event
+      const domEvent = document.createEvent( 'MouseEvent' ); // not 'MouseEvents' according to DOM Level 3 spec
+
+      // technically deprecated, but DOM4 event constructors not out yet. people on #whatwg said to use it
+      domEvent.initMouseEvent( 'mousemove', true, true, window, 0, // click count
+        x, y, x, y,
+        false, false, false, false,
+        0, // button
+        null );
+      phet.joist.sim.display._input.mouseMove( new Vector2( x, y ), domEvent );
+
+      if ( d < 0.04 && !this.pressProperty.value ) {
+        phet.joist.sim.display._input.mouseDown( 0, new Vector2( x, y ), domEvent );
+        this.pressProperty.value = true;
+      }
+
+      if ( d > 0.10 && this.pressProperty.value ) {
+        phet.joist.sim.display._input.mouseUp( new Vector2( x, y ), domEvent );
+        this.pressProperty.value = false;
+      }
+    }
+  }
 }
 
 // @public export for usage when creating shred Particles
Index: build-an-atom_en.html
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/build-an-atom_en.html b/build-an-atom_en.html
--- a/build-an-atom_en.html	(revision 0661b5ff8c8ffb7316bc3c8e5936b4cafe59fd1f)
+++ b/build-an-atom_en.html	(date 1633535285222)
@@ -9,6 +9,11 @@
   <meta name="phet-sim-level" content="development">
 
   <title>build-an-atom</title>
+
+  <script src="https://cdn.jsdelivr.net/npm/@mediapipe/camera_utils/camera_utils.js" crossorigin="anonymous"></script>
+  <script src="https://cdn.jsdelivr.net/npm/@mediapipe/control_utils/control_utils.js" crossorigin="anonymous"></script>
+  <script src="https://cdn.jsdelivr.net/npm/@mediapipe/drawing_utils/drawing_utils.js" crossorigin="anonymous"></script>
+  <script src="https://cdn.jsdelivr.net/npm/@mediapipe/hands/hands.js" crossorigin="anonymous"></script>
 </head>
 
 <!-- body is only made black for the loading phase so that the splash screen is black -->
@@ -140,5 +145,54 @@
   // This is done in load-unbuilt-strings.js
   window.phet.chipper.loadModules = () => loadURL( 'js/build-an-atom-main.js', 'module' );
 </script>
+
+<div class="container">
+  <video class="input_video" style="display:none"></video>
+  <canvas class="output_canvas" width="1280px" height="720px" style="display:none"></canvas>
+</div>
+
+<script type="module">
+  const videoElement = document.getElementsByClassName( 'input_video' )[ 0 ];
+  const canvasElement = document.getElementsByClassName( 'output_canvas' )[ 0 ];
+  const canvasCtx = canvasElement.getContext( '2d' );
+
+  function onResults( results ) {
+    window.results = results;
+    canvasCtx.save();
+    canvasCtx.clearRect( 0, 0, canvasElement.width, canvasElement.height );
+    canvasCtx.drawImage(
+      results.image, 0, 0, canvasElement.width, canvasElement.height );
+    if ( results.multiHandLandmarks ) {
+      for ( const landmarks of results.multiHandLandmarks ) {
+        drawConnectors( canvasCtx, landmarks, HAND_CONNECTIONS,
+          { color: '#00FF00', lineWidth: 5 } );
+        drawLandmarks( canvasCtx, landmarks, { color: '#FF0000', lineWidth: 2 } );
+      }
+    }
+    canvasCtx.restore();
+  }
+
+  const hands = new Hands( {
+    locateFile: ( file ) => {
+      return `https://cdn.jsdelivr.net/npm/@mediapipe/hands/${file}`;
+    }
+  } );
+  hands.setOptions( {
+    maxNumHands: 2,
+    minDetectionConfidence: 0.5,
+    minTrackingConfidence: 0.5
+  } );
+  hands.onResults( onResults );
+
+  const camera = new Camera( videoElement, {
+    onFrame: async () => {
+      await hands.send( { image: videoElement } );
+    },
+    width: 1280,
+    height: 720
+  } );
+  camera.start();
+</script>
+
 </body>
 </html>
\ No newline at end of file

I was able to use "in the air" gestures to drag particles from the bucket and release them to build an atom. It felt very interesting operating Build an Atom without touching the computer, and it was precise enough that I could toggle all accordion boxes and checkboxes and use the Reset All button.

The implementation took about 1 hour and around 40 lines of code, starting from the JS boilerplate and examples in https://google.github.io/mediapipe/solutions/hands.html. Input is wired through the mouse channel: pinching index finger to thumb simulates mouse down, and separating them simulates mouse up. Future versions would want to use touch for this so we can leverage the more forgiving touch areas and support multiple hands.

It's all very prototype-y, but a neat proof of concept for controlling a PhET sim using gestures only. To test, apply the patch and launch the sim with ?screens=1&showPointers.
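
One possible direction for the touch idea, as an untested sketch (not part of the patch above): dispatch standard PointerEvents with pointerType 'touch' instead of the deprecated initMouseEvent path, with one pointerId per detected hand. Whether scenery picks up synthetic pointer events dispatched on the display's DOM element is an assumption I haven't verified:

```js
// Untested sketch: drive the sim through synthetic touch pointer events instead of
// the deprecated initMouseEvent path. One pointerId per detected hand would allow
// multi-hand input. Assumes the display's DOM element handles synthetic PointerEvents.
function dispatchHandPointer( type, x, y, pointerId ) {
  const event = new PointerEvent( type, {
    pointerType: 'touch',
    pointerId: pointerId,       // e.g. hand index + 1
    isPrimary: pointerId === 1,
    clientX: x,
    clientY: y,
    bubbles: true,
    cancelable: true
  } );
  phet.joist.sim.display.domElement.dispatchEvent( event );
}

// Per animation frame, for each detected hand i at screen position ( x, y ):
//   pinch started  -> dispatchHandPointer( 'pointerdown', x, y, i + 1 );
//   pinch held     -> dispatchHandPointer( 'pointermove', x, y, i + 1 );
//   pinch released -> dispatchHandPointer( 'pointerup', x, y, i + 1 );
```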

I recorded a demo video:

hands.mov

@samreid
Member

samreid commented Oct 7, 2021

@mattpen reported he also used Wekinator for a similar project:

http://www.wekinator.org/ - this made it really easy to hook up an external input source, like a webcam, to anything that can receive messages via WebSockets. For a class project I hooked up a webcam to wave-on-a-string; I think I connected the left arm to frequency and the right arm to amplitude.
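
For context, a minimal sketch of the receiving side of a setup like that, assuming something (Wekinator itself or a small bridge in front of it) pushes JSON messages over a WebSocket. The URL, message shape, and the frequencyProperty/amplitudeProperty names are placeholders for illustration, not the sim's actual API:

```js
// Minimal sketch: receive externally computed control values over a WebSocket and
// map them onto sim model Properties. The URL, message shape, and Property names
// are assumptions for illustration only.
const socket = new WebSocket( 'ws://localhost:8080' );

socket.onmessage = event => {

  // e.g. { "frequency": 0.4, "amplitude": 0.7 } with values normalized to 0..1
  const message = JSON.parse( event.data );

  if ( typeof message.frequency === 'number' ) {
    frequencyProperty.value = message.frequency; // scale into the sim's actual range as needed
  }
  if ( typeof message.amplitude === 'number' ) {
    amplitudeProperty.value = message.amplitude;
  }
};

socket.onerror = error => console.log( 'tracking socket error', error );
```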

@zepumph
Member Author

zepumph commented May 27, 2022

MediaPipe (Ratio and Proportion) and OpenCV (Quadrilateral) are working really well for our purposes here. We can come back to this issue if we need additional input strategies.

zepumph closed this as completed May 27, 2022