<!DOCTYPE html>
<html>
<head>
<title>Underactuated Robotics</title>
<meta name="Underactuated Robotics" content="text/html; charset=utf-8;" />
<link rel="canonical" href="https://underactuated.csail.mit.edu/index.html" />
<script src="https://hypothes.is/embed.js" async></script>
<script type="text/javascript" src="htmlbook/book.js"></script>
<script src="htmlbook/mathjax-config.js" defer></script>
<script type="text/javascript" id="MathJax-script" defer
src="htmlbook/MathJax/es5/tex-chtml.js">
</script>
<script>window.MathJax || document.write('<script type="text/javascript" src="https://cdn.jsdelivr.net/npm/mathjax@3/es5/tex-chtml.js" defer><\/script>')</script>
<link rel="stylesheet" href="htmlbook/highlight/styles/default.css">
<script src="htmlbook/highlight/highlight.pack.js"></script> <!-- http://highlightjs.readthedocs.io/en/latest/css-classes-reference.html#language-names-and-aliases -->
<script>hljs.initHighlightingOnLoad();</script>
<link rel="stylesheet" type="text/css" href="htmlbook/book.css">
</head>
<body onload="loadIndex();">
<div data-type="titlepage">
<header>
<h1><a href="index.html" style="text-decoration:none;">Underactuated Robotics</a></h1>
<p data-type="subtitle">Algorithms for Walking, Running, Swimming, Flying, and Manipulation</p>
<p style="font-size: 18px;"><a href="http://people.csail.mit.edu/russt/">Russ Tedrake</a></p>
<p style="font-size: 14px; text-align: right;">
© Russ Tedrake, 2023<br/>
Last modified <span id="last_modified"></span>.<br/>
<script>
var d = new Date(document.lastModified);
document.getElementById("last_modified").innerHTML = d.getFullYear() + "-" + (d.getMonth()+1) + "-" + d.getDate();</script>
<a href="misc.html">How to cite these notes, use annotations, and give
feedback.</a><br/>
</p>
</header>
</div>
<div id="mathjax"/>
<p><b>Note:</b> These are working notes used for <a
href="https://underactuated.csail.mit.edu/Spring2023/">a course being taught
at MIT</a>. They will be updated throughout the Spring 2023 semester. <a
href="https://www.youtube.com/channel/UChfUOAhz7ynELF-s_1LPpWg">Lecture videos are available on YouTube</a>.</p>
<section id="search" pdf="no"><h1>Search these notes</h1>
<form id="search_form" action="https://google.com/search" method="get">
<input type="text" name="q" placeholder="Search these notes…" />
<input type="hidden" name="q" value="site:underactuated.csail.mit.edu" />
</form>
</section>
<section id="pdf"><h1>PDF version of the notes</h1>
<p pdf="no">You can also download a PDF version of these notes (updated
much less frequently) from <a
href="https://github.com/RussTedrake/underactuated/releases">here</a>.</p>
<p>The PDF version of these notes is autogenerated from the HTML version.
There are a few conversion/formatting artifacts that are easy to fix (please
feel free to point them out). But there are also interactive elements in the
HTML version that are not easy to put into the PDF. When possible, I try to
provide a link. But I consider the <a
href="https://underactuated.csail.mit.edu">online HTML version</a> to be the
main version.
</p>
</section>
<section id="table_of_contents">
<h1>Table of Contents</h1>
<ul>
<li><a href="#preface">Preface</a></li>
<li><a href="intro.html">Chapter 1: Fully-actuated vs Underactuated Systems</a></li>
<ul>
<li><a href=intro.html#section1>Motivation</a></li>
<ul>
<li>Honda's ASIMO vs. passive dynamic walkers</li>
<li>Birds vs. modern aircraft</li>
<li>Manipulation</li>
<li>The common theme</li>
</ul>
<li><a href=intro.html#section2>Definitions</a></li>
<li><a href=intro.html#section3>Feedback Equivalence</a></li>
<li><a href=intro.html#section4>Input and State Constraints</a></li>
<ul>
<li>Nonholonomic constraints</li>
</ul>
<li><a href=intro.html#section5>Underactuated robotics</a></li>
<li><a href=intro.html#section6>Goals for the course</a></li>
<li><a href=intro.html#section7>Exercises</a></li>
</ul>
<p style="margin-bottom: 0; text-decoration: underline;font-variant: small-caps;"><b>Model Systems</b></p>
<li><a href="pend.html">Chapter 2: The Simple Pendulum</a></li>
<ul>
<li><a href=pend.html#section1>Introduction</a></li>
<li><a href=pend.html#section2>Nonlinear dynamics with a
constant
torque</a></li>
<ul>
<li>The overdamped pendulum</li>
<li>The undamped pendulum with zero torque</li>
<ul>
<li>Orbit calculations</li>
</ul>
<li>The undamped pendulum with a constant
torque</li>
</ul>
<li><a href=pend.html#section3>The torque-limited simple pendulum</a></li>
<ul>
<li>Energy-shaping control</li>
</ul>
<li><a href=pend.html#section4>Exercises</a></li>
</ul>
<li><a href="acrobot.html">Chapter 3: Acrobots, Cart-Poles, and Quadrotors</a></li>
<ul>
<li><a href=acrobot.html#section1>The Acrobot</a></li>
<ul>
<li>Equations of motion</li>
</ul>
<li><a href=acrobot.html#cart_pole>The Cart-Pole system</a></li>
<ul>
<li>Equations of motion</li>
</ul>
<li><a href=acrobot.html#section3>Quadrotors</a></li>
<ul>
<li>The Planar Quadrotor</li>
<li>The Full 3D Quadrotor</li>
</ul>
<li><a href=acrobot.html#section4>Balancing</a></li>
<ul>
<li>Linearizing the manipulator equations</li>
<li>Controllability of linear systems</li>
<ul>
<li>The special case of non-repeated
eigenvalues</li>
<li>A general solution</li>
<li>Controllability vs. underactuated</li>
</ul>
<li>LQR feedback</li>
</ul>
<li><a href=acrobot.html#partial_feedback_linearization>Partial feedback linearization</a></li>
<ul>
<li>PFL for the Cart-Pole System</li>
<ul>
<li>Collocated</li>
<li>Non-collocated</li>
</ul>
<li>General form</li>
<ul>
<li>Collocated linearization</li>
<li>Non-collocated linearization</li>
<li>Task-space partial feedback linearization</li>
</ul>
</ul>
<li><a href=acrobot.html#section6>Swing-up control</a></li>
<ul>
<li>Energy shaping</li>
<li>Cart-Pole</li>
<li>Acrobot</li>
<li>Discussion</li>
</ul>
<li><a href=acrobot.html#section7>Other model systems</a></li>
<li><a href=acrobot.html#section8>Exercises</a></li>
</ul>
<li><a href="simple_legs.html">Chapter 4: Simple Models
of Walking and Running</a></li>
<ul>
<li><a href=simple_legs.html#limit_cycle>Limit Cycles</a></li>
<ul>
<li>Poincaré Maps</li>
</ul>
<li><a href=simple_legs.html#section2>Simple Models of Walking</a></li>
<ul>
<li>The Rimless Wheel</li>
<ul>
<li>Stance Dynamics</li>
<li>Foot Collision</li>
<li>Forward simulation</li>
<li>Poincaré Map</li>
<li>Fixed Points and Stability</li>
<li>Stability of standing still</li>
</ul>
<li>The Compass Gait</li>
<li>The Kneed Walker</li>
<li>Curved feet</li>
<li>And beyond...</li>
</ul>
<li><a href=simple_legs.html#running>Simple Models of Running</a></li>
<ul>
<li>The Spring-Loaded Inverted Pendulum (SLIP)</li>
<ul>
<li>Analysis on the apex-to-apex map</li>
<li>SLIP Control</li>
<li>SLIP extensions</li>
</ul>
<li>Hopping robots from the MIT Leg Laboratory</li>
<ul>
<li>The Planar Monopod Hopper</li>
<li>Running on four legs as though they were
one</li>
</ul>
<li>Towards human-like running</li>
</ul>
<li><a href=simple_legs.html#section4>A simple model that can walk and run</a></li>
<li><a href=simple_legs.html#section5>Exercises</a></li>
</ul>
<li><a href="humanoids.html">Chapter 5: Highly-articulated Legged
Robots</a></li>
<ul>
<li><a href=humanoids.html#section1>A thought experiment</a></li>
<ul>
<li>A spacecraft model</li>
<li>Robots with (massless) legs</li>
</ul>
<li><a href=humanoids.html#section2>Centroidal dynamics</a></li>
<ul>
<li>Impact dynamics</li>
<li>The special case of flat terrain</li>
<ul>
<li>An aside: the zero-moment point derivation</li>
</ul>
</ul>
<li><a href=humanoids.html#section3>ZMP-based planning</a></li>
<ul>
<li>Heuristic footstep planning</li>
<li>Planning trajectories for the center of mass</li>
<ul>
<li>The ZMP "Stability" Metric</li>
</ul>
<li>From a CoM plan to a whole-body
plan</li>
</ul>
<li><a href=humanoids.html#section4>Whole-Body Control</a></li>
<li><a href=humanoids.html#section5>Footstep planning and push recovery</a></li>
<li><a href=humanoids.html#section6>Beyond ZMP planning</a></li>
<li><a href=humanoids.html#section7>Exercises</a></li>
</ul>
<li><a href="stochastic.html">Chapter 6: Model Systems
with Stochasticity</a></li>
<ul>
<li><a href=stochastic.html#section1>The Master Equation</a></li>
<li><a href=stochastic.html#section2>Stationary Distributions</a></li>
<li><a href=stochastic.html#section3>Extended Example: The Rimless Wheel on Rough
Terrain</a></li>
<li><a href=stochastic.html#section4>Noise models for real robots/systems.</a></li>
</ul>
<p style="margin-bottom: 0; text-decoration: underline;font-variant: small-caps;"><b>Nonlinear Planning and Control</b></p>
<li><a href="dp.html">Chapter 7: Dynamic Programming</a></li>
<ul>
<li><a href=dp.html#section1>Formulating control design as an optimization</a></li>
<ul>
<li>Additive cost</li>
</ul>
<li><a href=dp.html#graph_search>Optimal control as graph search</a></li>
<li><a href=dp.html#continuous>Continuous dynamic programming</a></li>
<ul>
<li>The Hamilton-Jacobi-Bellman Equation</li>
<li>Solving for the minimizing
control</li>
<li>Numerical solutions for $J^*$</li>
<ul>
<li>Value iteration with function approximation</li>
<li>Linear function approximators</li>
<li>Value iteration on a mesh</li>
<li>Neural fitted value iteration</li>
<li>Continuous-time systems</li>
</ul>
</ul>
<li><a href=dp.html#section4>Extensions</a></li>
<ul>
<li>Discounted and average cost formulations</li>
<li>Stochastic control for finite MDPs</li>
<ul>
<li>Stochastic interpretation of deterministic,
continuous-state value iteration</li>
</ul>
<li>Approximate dynamic programming with convex
optimization</li>
</ul>
<li><a href=dp.html#section5>Exercises</a></li>
</ul>
<li><a href="lqr.html">Chapter 8: Linear Quadratic Regulators</a></li>
<ul>
<li><a href=lqr.html#section1>Basic Derivation</a></li>
<ul>
<li>Local stabilization of nonlinear systems</li>
</ul>
<li><a href=lqr.html#finite_horizon>Finite-horizon formulations</a></li>
<ul>
<li>Finite-horizon LQR</li>
<li>Time-varying LQR</li>
<li>Local trajectory stabilization for nonlinear systems</li>
<li>Linear Quadratic Optimal Tracking</li>
<li>Linear Final Boundary Value Problems</li>
</ul>
<li><a href=lqr.html#section3>Variations and extensions</a></li>
<ul>
<li>Discrete-time Riccati Equations</li>
<li>LQR with input and state constraints</li>
<li>LQR on a manifold</li>
<li>LQR for linear systems in implicit form</li>
<li>LQR as a convex optimization</li>
<li>Finite-horizon LQR via least
squares</li>
<li>Minimum-time LQR</li>
</ul>
<li><a href=lqr.html#section4>Exercises</a></li>
<li><a href=lqr.html#section5>Notes</a></li>
<ul>
<li>Finite-horizon LQR derivation (general form)</li>
</ul>
</ul>
<li><a href="lyapunov.html">Chapter 9: Lyapunov
Analysis</a></li>
<ul>
<li><a href=lyapunov.html#section1>Lyapunov Functions</a></li>
<ul>
<li>Global Stability</li>
<li>LaSalle's Invariance Principle</li>
<li>Relationship to the Hamilton-Jacobi-Bellman
equations</li>
<li>Lyapunov functions for estimating regions of
attraction</li>
<li>Robustness analysis using "common Lyapunov functions"</li>
<li>Barrier functions</li>
</ul>
<li><a href=lyapunov.html#optimization>Lyapunov analysis with convex optimization</a></li>
<ul>
<li>Linear systems</li>
<li>Global analysis for polynomial systems</li>
<li>Region of attraction estimation for polynomial systems</li>
<ul>
<li>The S-procedure</li>
<li>Basic region of attraction formulation</li>
<li>The equality-constrained formulation</li>
<li>Searching for $V(\bx)$</li>
<li>Convex outer approximations</li>
<li>Regions of attraction codes in Drake</li>
</ul>
<li>Robustness analysis using the S-procedure</li>
<li>Piecewise-polynomial systems</li>
<li>Rigid-body dynamics are (rational) polynomial</li>
<ul>
<li>Linear feedback and quadratic
forms</li>
<li>Alternatives for obtaining polynomial equations</li>
</ul>
<li>Verifying dynamics in implicit form</li>
</ul>
<li><a href=lyapunov.html#finite-time>Finite-time Reachability</a></li>
<ul>
<li>Time-varying dynamics and Lyapunov functions</li>
<li>Finite-time reachability</li>
<li>Reachability via Lyapunov functions</li>
</ul>
<li><a href=lyapunov.html#control>Control design</a></li>
<ul>
<li>Control design via
alternations</li>
<ul>
<li>Global stability</li>
<li>Maximizing the region of attraction</li>
</ul>
<li>State feedback for linear systems</li>
<li>Control-Lyapunov Functions</li>
<li>Approximate dynamic programming with SOS</li>
<ul>
<li>Upper and lower bounds on cost-to-go</li>
<li>Linear Programming Dynamic Programming</li>
<li>Sums-of-Squares Dynamic
Programming</li>
</ul>
</ul>
<li><a href=lyapunov.html#section5>Alternative computational approaches</a></li>
<ul>
<li>Sampling Quotient-Ring Sum-of-Squares</li>
<li>"Satisfiability modulo theories" (SMT)
</li>
<li>Mixed-integer programming (MIP) formulations</li>
<li>Continuation methods</li>
</ul>
<li><a href=lyapunov.html#section6>Neural Lyapunov functions</a></li>
<li><a href=lyapunov.html#section7>Contraction metrics</a></li>
<li><a href=lyapunov.html#section8>Other variations and extensions</a></li>
<li><a href=lyapunov.html#section9>Exercises</a></li>
</ul>
<li><a href="trajopt.html">Chapter 10: Trajectory
Optimization</a></li>
<ul>
<li><a href=trajopt.html#section1>Problem Formulation</a></li>
<li><a href=trajopt.html#section2>Convex Formulations for Linear Systems</a></li>
<ul>
<li>Direct Transcription</li>
<li>Direct Shooting</li>
<li>Computational
Considerations</li>
<li>Continuous Time</li>
</ul>
<li><a href=trajopt.html#section3>Nonconvex Trajectory Optimization</a></li>
<ul>
<li>Direct Transcription and Direct Shooting</li>
<li>Direct Collocation</li>
<li>Pseudo-spectral Methods</li>
<ul>
<li>Dynamic constraints in implicit form</li>
</ul>
</ul>
<li><a href=trajopt.html#section4>Solution techniques</a></li>
<ul>
<li>Efficiently computing gradients</li>
<li>The special case of direct shooting without state constraints</li>
<li>Penalty methods and the Augmented Lagrangian</li>
<li>Zero-order optimization</li>
<li>Getting good solutions... in practice.</li>
</ul>
<li><a href=trajopt.html#section5>Local Trajectory Feedback Design</a></li>
<ul>
<li>Finite-horizon LQR</li>
<li>Model-Predictive Control</li>
<ul>
<li>Receding-horizon MPC</li>
<li>Recursive feasibility</li>
<li>MPC and Lyapunov functions</li>
</ul>
</ul>
<li><a href=trajopt.html#perching>Case Study: A glider that can land on a perch
like a bird</a></li>
<ul>
<li>The Flat-Plate Glider Model</li>
<li>Trajectory optimization</li>
<li>Trajectory stabilization</li>
<li>Trajectory funnels</li>
<li>Beyond a single trajectory</li>
</ul>
<li><a href=trajopt.html#pontryagin>Pontryagin's Minimum Principle</a></li>
<ul>
<li>Lagrange multiplier derivation of the adjoint equations</li>
<li>Necessary conditions for optimality in continuous time</li>
</ul>
<li><a href=trajopt.html#section8>Variations and Extensions</a></li>
<ul>
<li>Differential Flatness</li>
<li>Iterative LQR and Differential Dynamic
Programming</li>
<li>Mixed-integer convex optimization for non-convex
constraints</li>
<li>Explicit model-predictive control</li>
</ul>
<li><a href=trajopt.html#section9>Exercises</a></li>
</ul>
<li><a href="policy_search.html">Chapter 11: Policy
Search</a></li>
<ul>
<li><a href=policy_search.html#section1>Problem formulation</a></li>
<li><a href=policy_search.html#lqr>Linear Quadratic Regulator</a></li>
<ul>
<li>Policy Evaluation</li>
<li>A nonconvex objective in ${\bf K}$</li>
<li>No local minima</li>
<li>True gradient descent</li>
</ul>
<li><a href=policy_search.html#section3>More convergence results and counter-examples</a></li>
<li><a href=policy_search.html#section4>Trajectory-based policy search</a></li>
<ul>
<li>Infinite-horizon objectives</li>
<li>Search strategies for global optimization</li>
</ul>
<li><a href=policy_search.html#section5>Policy Iteration</a></li>
</ul>
<li><a href="planning.html">Chapter 12: Motion Planning as
Search</a></li>
<ul>
<li><a href=planning.html#section1>Artificial Intelligence as Search</a></li>
<li><a href=planning.html#section2>Sampling-based motion planning</a></li>
<ul>
<li>Rapidly-Exploring Random Trees (RRTs)</li>
<li>RRTs for robots with dynamics</li>
<li>Variations and extensions</li>
<li>Discussion</li>
</ul>
<li><a href=planning.html#section3>Decomposition methods</a></li>
<li><a href=planning.html#section4>Planning as Combinatorial + Continuous Optimization</a></li>
<ul>
<li>Motion Planning on Graphs of Convex Sets (GCS)</li>
</ul>
<li><a href=planning.html#section5>Exercises</a></li>
</ul>
<li><a href="feedback_motion_planning.html">Chapter 13: Feedback Motion
Planning</a></li>
<li><a href="robust.html">Chapter 14: Robust and
Stochastic Control</a></li>
<ul>
<li><a href=robust.html#section1>Stochastic models</a></li>
<li><a href=robust.html#section2>Costs and constraints for stochastic systems</a></li>
<li><a href=robust.html#section3>Finite Markov Decision Processes</a></li>
<li><a href=robust.html#section4>Linear optimal control</a></li>
<ul>
<li>Stochastic LQR</li>
<li>$L_2$ gain</li>
<ul>
<li>Dissipation inequalities</li>
<li>Small-gain theorem</li>
</ul>
<li>Robust LQR as $\mathcal{H}_\infty$</li>
<li>Linear Exponential-Quadratic Gaussian (LEQG)</li>
<li>Adaptive control</li>
<li>Structured uncertainty</li>
<li>Linear parameter-varying (LPV) control</li>
</ul>
<li><a href=robust.html#section5>Trajectory optimization</a></li>
<ul>
<li>Monte-carlo trajectory optimization</li>
<li>Iterative $\mathcal{H}_2$/iLQG</li>
<li>Finite-time (reachability) analysis</li>
</ul>
<li><a href=robust.html#section6>Nonlinear analysis and control</a></li>
<li><a href=robust.html#section7>Domain randomization</a></li>
<li><a href=robust.html#section8>Extensions</a></li>
<ul>
<li>Alternative risk/robustness metrics</li>
</ul>
</ul>
<li><a href="output_feedback.html">Chapter 15:
Output Feedback (aka Pixels-to-Torques)</a></li>
<ul>
<li><a href=output_feedback.html#section1>Background</a></li>
<ul>
<li>The classical perspective</li>
<li>From pixels to torques</li>
</ul>
<li><a href=output_feedback.html#section2>Static Output Feedback</a></li>
<ul>
<li>A hardness result</li>
<li>Via policy search</li>
</ul>
<li><a href=output_feedback.html#section3>Observer-based Feedback</a></li>
<ul>
<li>Luenberger Observer</li>
<li>Linear Quadratic Regulator w/ Gaussian Noise
(LQG)</li>
<li>Partially-observable Markov Decision Processes</li>
<li>Trajectory optimization with Iterative LQG</li>
</ul>
<li><a href=output_feedback.html#section4>Disturbance-based feedback</a></li>
<ul>
<li>System-Level Synthesis</li>
</ul>
<li><a href=output_feedback.html#section5>Feedback from Pixels</a></li>
</ul>
<li><a href="limit_cycles.html">Chapter 16: Algorithms
for Limit Cycles</a></li>
<ul>
<li><a href=limit_cycles.html#trajopt>Trajectory optimization</a></li>
<li><a href=limit_cycles.html#lyapunov>Lyapunov analysis</a></li>
<ul>
<li>Transverse coordinates</li>
<li>Transverse linearization</li>
<li>Region of attraction estimation using sums-of-squares</li>
</ul>
<li><a href=limit_cycles.html#section3>Feedback design</a></li>
<ul>
<li>For underactuation degree one.</li>
<li>Transverse LQR</li>
<li>Orbital stabilization for non-periodic trajectories</li>
</ul>
</ul>
<li><a href="contact.html">Chapter 17: Planning and
Control through Contact</a></li>
<ul>
<li><a href=contact.html#section1>(Autonomous) Hybrid Systems</a></li>
<ul>
<li>Hybrid trajectory optimization</li>
<ul>
<li>Given a fixed mode sequence</li>
<li>Direct shooting</li>
</ul>
<li>Deriving hybrid models: minimal vs floating-base
coordinates</li>
<li>Discrete control (between events)</li>
<li>Hybrid LQR</li>
<li>Hybrid Lyapunov analysis</li>
</ul>
<li><a href=contact.html#contact_implicit>Contact-implicit trajectory optimization</a></li>
<li><a href=contact.html#section3>Exercises</a></li>
</ul>
<p style="margin-bottom: 0; text-decoration: underline;font-variant: small-caps;"><b>Estimation and Learning</b></p>
<li><a href="sysid.html">Chapter 18: System Identification</a></li>
<ul>
<li><a href=sysid.html#section1>Problem formulation</a></li>
<ul>
<li>Equation error vs simulation error</li>
<li>Online optimization</li>
<li>Learning models for control</li>
</ul>
<li><a href=sysid.html#lumped>Parameter Identification for Mechanical Systems</a></li>
<ul>
<li>Kinematic parameters and calibration</li>
<li>Least-squares formulation (of the inverse dynamics).</li>
<li>Identification using energy instead of inverse
dynamics.</li>
<li>Residual physics models with linear function
approximators</li>
<li>Experiment design as a trajectory optimization</li>
<li>Online estimation and adaptive control</li>
<li>Identification with contact</li>
</ul>
<li><a href=sysid.html#section3>Identifying (time-domain) linear dynamical systems</a></li>
<ul>
<li>From state observations</li>
<ul>
<li>Model-based Iterative Learning Control (ILC)</li>
<li>Compression using the dominant eigenmodes</li>
<li>Linear dynamics in a nonlinear basis</li>
</ul>
<li>From input-output data (the state-realization problem)</li>
<li>Adding stability constraints</li>
<li>Autoregressive models</li>
<li>Statistical analysis of learning linear models</li>
</ul>
<li><a href=sysid.html#section4>Identification of finite (PO)MDPs</a></li>
<ul>
<li>From state observations</li>
<li>Identifying Hidden Markov Models (HMMs)</li>
</ul>
<li><a href=sysid.html#section5>Neural network models</a></li>
<ul>
<li>Generating training data</li>
<li>From state observations</li>
<li>State-space models from input-output data (recurrent networks)</li>
<li>Input-output (autoregressive) models</li>
<li>Particle-based models</li>
<li>Object-centric models</li>
<li>Modeling stochasticity</li>
<li>Control design for neural network models</li>
</ul>
<li><a href=sysid.html#section6>Alternatives for nonlinear system identification</a></li>
<li><a href=sysid.html#section7>Identification of hybrid systems</a></li>
<li><a href=sysid.html#section8>Task-relevant models</a></li>
<li><a href=sysid.html#section9>Exercises</a></li>
</ul>
<li><a href="state_estimation.html">Chapter 19: State Estimation</a></li>
<ul>
<li><a href=state_estimation.html#section1>Observers and the Kalman Filter</a></li>
<li><a href=state_estimation.html#section2>Recursive Bayesian Filters</a></li>
<li><a href=state_estimation.html#section3>Smoothing</a></li>
</ul>
<li><a href="rl_policy_search.html">Chapter 20: Model-Free Policy Search</a></li>
<ul>
<li><a href=rl_policy_search.html#section1>Policy Gradient Methods</a></li>
<ul>
<li>The Likelihood Ratio Method (aka
REINFORCE)</li>
<li>Sample efficiency</li>
<li>Stochastic Gradient Descent</li>
<li>The Weight Perturbation Algorithm</li>
<li>Weight Perturbation with an Estimated
Baseline</li>
<li>REINFORCE w/ additive Gaussian noise</li>
<li>Summary</li>
</ul>
<li><a href=rl_policy_search.html#section2>Sample performance via the signal-to-noise
ratio.</a></li>
<ul>
<li>Performance of Weight Perturbation</li>
</ul>
</ul>
<p style="margin-bottom: 0; text-decoration: underline;font-variant: small-caps;"><b>Appendix</b></p>
<li><a href="drake.html">Appendix A: Drake</a></li>
<ul>
<li><a href=drake.html#section1>Pydrake</a></li>
<li><a href=drake.html#notebooks>Online Jupyter Notebooks</a></li>
<ul>
<li>Running on Deepnote</li>
<li>Running on Google Colab</li>
<li>Enabling licensed solvers</li>
</ul>
<li><a href=drake.html#section3>Running on your own machine</a></li>
<li><a href=drake.html#section4>Getting help</a></li>
</ul>
<li><a href="multibody.html">Appendix B: Multi-Body
Dynamics</a></li>
<ul>
<li><a href=multibody.html#section1>Deriving the equations of motion</a></li>
<li><a href=multibody.html#manipulator>The Manipulator Equations</a></li>
<ul>
<li>Recursive Dynamics Algorithms</li>
<li>Bilateral Position Constraints</li>
<li>Bilateral Velocity Constraints</li>
</ul>
<li><a href=multibody.html#contact>The Dynamics of Contact</a></li>
<ul>
<li>Compliant Contact Models</li>
<li>Rigid Contact with Event Detection</li>
<ul>
<li>Impulsive Collisions</li>
<li>Putting it all together</li>
</ul>
<li>Time-stepping Approximations for Rigid Contact</li>
<ul>
<li>Complementarity formulations</li>
<li>Anitescu's convex formulation</li>
<li>Todorov's regularization</li>
<li>The Semi-Analytic Primal (SAP) solver</li>
<li>Beyond Point Contact</li>
</ul>
</ul>
<li><a href=multibody.html#mechanics>Variational mechanics</a></li>
<ul>
<li>Virtual work</li>
<li>D'Alembert's principle and the force of
inertia</li>
<li>Principle of Stationary Action</li>
<li>Hamiltonian Mechanics</li>
</ul>
<li><a href=multibody.html#section5>Exercises</a></li>
</ul>
<li><a href="optimization.html">Appendix C: Optimization and
Mathematical Programming</a></li>
<ul>
<li><a href=optimization.html#section1>Optimization software</a></li>
<li><a href=optimization.html#section2>General concepts</a></li>
<ul>
<li>Convex vs nonconvex optimization</li>
<li>Constrained optimization with Lagrange multipliers</li>
</ul>
<li><a href=optimization.html#section3>Convex optimization</a></li>
<ul>
<li>Linear Programs/Quadratic Programs/Second-Order Cones</li>
<li>Semidefinite Programming and Linear Matrix Inequalities</li>
<li>Sums-of-squares optimization</li>
<ul>
<li>Sums of squares on a Semi-Algebraic Set</li>
<li>Sums of squares optimization on an Algebraic Variety</li>
<li>DSOS and SDSOS</li>
</ul>
<li>Solution techniques</li>
</ul>
<li><a href=optimization.html#nonlinear>Nonlinear programming</a></li>
<ul>
<li>Second-order methods (SQP /
Interior-Point)</li>
<li>First-order methods (SGD / ADMM) </li>
<ul>
<li>Penalty methods</li>
<li>Projected Gradient Descent</li>
</ul>
<li>Zero-order methods (CMA)</li>
<li>Example: Inverse Kinematics</li>
</ul>
<li><a href=optimization.html#section5>Combinatorial optimization</a></li>
<ul>
<li>Search, SAT, First order logic, SMT solvers, LP interpretation</li>
<li>Mixed-integer convex optimization</li>
</ul>
<li><a href=optimization.html#section6>"Black-box" optimization</a></li>
</ul>
<li><a href="playbook.html">Appendix D: An Optimization
Playbook</a></li>
<ul>
<li><a href=playbook.html#section1>Matrices</a></li>
<li><a href=playbook.html#section2>Ellipsoids</a></li>
<li><a href=playbook.html#section3>Polytopes</a></li>
<li><a href=playbook.html#section4>(Mixed-)Integer Programming</a></li>
<li><a href=playbook.html#section5>Bilinear Matrix Inequalities (BMIs)</a></li>
<li><a href=playbook.html#section6>Geometry (SE(3), Penetration, and
Contact)</a></li>
</ul>
<li><a href="misc.html">Appendix E: Miscellaneous</a></li>
<ul>
<li><a href=misc.html#cite>How to cite these notes</a></li>
<li><a href=misc.html#annotation>Annotation tool etiquette</a></li>
<li><a href=misc.html#projects>Some great final projects</a></li>
<li><a href=misc.html#feedback>Please give me feedback!</a></li>
</ul>
</ul>
</section>
<section id="preface"><h1>Preface</h1>
<p>This book is about nonlinear dynamics and control, with a focus on
mechanical systems. I've spent my career thinking about how to make robots
move robustly, but also with speed, efficiency, and grace. I believe that
this is best achieved through a tight coupling between mechanical design,
passive dynamics, and nonlinear control synthesis. These notes contain
selected material from dynamical systems theory, as well as linear and
nonlinear control. But the dynamics of our robots quickly get too complex for
us to handle with a pencil-and-paper approach. As a result, the primary
focus of these notes is on computational approaches to control design,
especially using optimization and machine learning.</p>
<p>When I started teaching this class, and writing these notes, the
computational approach to control was far from mainstream in robotics. I had
just finished my Ph.D. focused on reinforcement learning (applied to a
bipedal robot), and was working on optimization-based motion planning. I
remember sitting at a robotics conference dinner as a young faculty member,
surrounded by people I admired, talking about optimization. One of the
senior faculty said "Russ: the people that talk like you aren't the people
that get real robots to work." Wow, have things changed. Now almost every
advanced robot is using optimization or learning in the planning/control
system.</p>
<p>Today, the conversations about reinforcement learning (RL) are loud and
passionate enough to drown out almost every other conversation in the room.
Ironically, now I am the older professor and I find myself still believing in
RL, but not with the complete faith of my youth. There is so much one can
understand about the structure of the equations that govern our mechanical
systems; algorithms which don't make use of that structure are missing
obvious opportunities for data efficiency and robustness. The dream is to
make the learning algorithms discover this structure on their own; but even
then it pays for you (the designer) to understand the optimization landscape
the learning systems are operating on. That's why my goal for this course is
to help <i>you</i> discover this structure, and to learn how to use this
structure to develop stronger algorithms and to guide your scientific
endeavors into learning-based control.</p>
<p>I'll go even further. I'm willing to bet that our views of intelligence
in 10-20 years will look less like feedforward networks with a training mode
and a test mode, and more like a <i>system</i> with dynamics that ebb and
flow in a beautiful dance with streams of incoming data and the ever-changing
dynamics of the environment. These systems will move more flexibly between
perception, forward prediction / sequential decision making, storing and
retrieving long-term memories, and taking action. Dynamical systems theory
offers us a way to understand and harness the complexity of these systems
that we are building.</p>
<p>Although the material in the book comes from many sources, the
presentation is targeted very specifically at a handful of robotics problems.
Concepts are introduced only when and if they can help advance the
capabilities we are trying to develop. Many of the disciplines that I am
drawing from are traditionally very rigorous, to the point where the basic
ideas can be hard to penetrate for someone who is new to the field. I've made
a conscious effort in these notes to keep a very informal, conversational tone
even when introducing these rigorous topics, and to reference the most
powerful theorems but to prove them only when that proof would add particular
insight without distracting from the main presentation. I hope that
the result is a broad but reasonably self-contained and readable manuscript
that will be of use to any enthusiastic roboticist.</p>
<section><h1>Organization</h1>
<p>The material in these notes is organized into a few main parts. "Model
Systems" introduces a series of increasingly complex dynamical systems and
overviews some of the relevant results from the literature for each system.
"Nonlinear Planning and Control" introduces quite general computational
algorithms for reasoning about those dynamical systems, with optimization
theory playing a central role. Many of these algorithms treat the dynamical
system as known and deterministic until the last chapters in this part, which
introduce stochasticity and robustness. "Estimation and Learning" follows
this up with techniques from statistics and machine learning which
capitalize on this viewpoint to introduce additional algorithms which can
operate with fewer assumptions about knowing the model or having perfect
sensors. The book closes with an "Appendix" that provides slightly more
introduction (and references) for the main topics used in the course.</p>
<p>The order of the chapters was chosen to make the book valuable as a
reference. When teaching the course, however, I take a spiral trajectory
through the material, introducing robot dynamics and control problems one at
a time, and introducing only the techniques that are required to solve that
particular problem.</p>
<todo>insert figure showing progression of problems here. pendulum ->
cp/acro -> walking ... with chapter numbers associated.</todo>
</section>
<section><h1>Software</h1>
<p> All of the examples and algorithms in this book, plus many more, are now
available as a part of our open-source software project: <drake></drake>.
<drake></drake> is a C++ project, but in this text we will use Drake's <a
href="http://drake.mit.edu/python_bindings.html">Python bindings</a>. I
encourage super-users or readers who want to dig deeper to explore the C++
code as well (and to contribute back). </p>
<p>Please see the <a href="drake.html">appendix</a>
for specific instructions for using <drake></drake> along with these
notes.</p>
</section>
<p style="text-align:right;"><a href="intro.html">First chapter</a></p>
</section> <!-- end preface -->
<div id="footer">
<hr/>
<table style="width:100%;">
<tr><td><a href="https://accessibility.mit.edu/">Accessibility</a></td><td style="text-align:right">© Russ
Tedrake, 2023</td></tr>
</table>
</div>
</body>
</html>