forked from apache/mesos
CHANGELOG
(WIP) Release Notes - Mesos - Version 0.22.0
--------------------------------------
This release contains several new features:
* Support for explicitly sending status update acknowledgements from
schedulers; refer to the upgrades document for upgrading schedulers.
* API Changes:
* [MESOS-1143] - TASK_ERROR is now sent instead of TASK_LOST when rescheduling
a task should not be attempted.
* [MESOS-2086] - Update messages.proto to use a raw bytestream instead of a
string for AuthenticationStartMessage.
* [MESOS-2322] - All arguments can now read their values from a file; to do
so, specify --name=file://path/to/file.
* [MESOS-2347] - The C++/Java/Python APIs have been updated to provide the
ability for schedulers to explicitly send acknowledgements.
TaskStatus now includes a UUID to enable this.
* Deprecations:
* [MESOS-2058] - Deprecate stats.json endpoints for Master and Slave.
* [MESOS-2322] - Deprecated specifying JSON blobs to parse using an absolute
path to point at the filename.
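As an illustration of the file:// convention described under MESOS-2322 above, the following Python sketch shows one way a flag value of that form could be resolved. It is not the Mesos flag parser: the name resolve_flag_value, the choice to strip surrounding whitespace, and the POSIX-style absolute path used in the demo are all assumptions made for this example.

```python
import tempfile
from pathlib import Path

FILE_PREFIX = "file://"  # prefix taken from the release note above


def resolve_flag_value(raw: str) -> str:
    """Return the flag value, reading it from disk when it starts with file://.

    Hypothetical helper for illustration only; Mesos may treat whitespace
    and relative paths differently.
    """
    if raw.startswith(FILE_PREFIX):
        path = Path(raw[len(FILE_PREFIX):])
        # Assumption: surrounding whitespace (e.g. a trailing newline) is dropped.
        return path.read_text().strip()
    return raw


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        value_file = Path(tmp) / "value"
        value_file.write_text("from-file\n")
        # A value given via file:// is read from the file ...
        print(resolve_flag_value(f"{FILE_PREFIX}{value_file}"))
        # ... while any other value is passed through unchanged.
        print(resolve_flag_value("inline-value"))
```

Keeping the resolution in one small function means every flag gets the same behavior, which matches the note's statement that all arguments accept this form.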
Release Notes - Mesos - Version 0.21.1
--------------------------------------
* This is a bug fix release.
** Bug
* [MESOS-2047] - Isolator cleanup failures shouldn't cause TASK_LOST.
* [MESOS-2071] - Libprocess generates invalid HTTP
* [MESOS-2147] - Large number of connections slows statistics.json responses.
* [MESOS-2182] - Performance issue in libprocess SocketManager.
** Improvement
* [MESOS-1925] - Docker kill does not allow containers to exit gracefully
* [MESOS-2113] - Improve configure to find apr and svn libraries/headers in OSX
Release Notes - Mesos - Version 0.21.0
--------------------------------------
This release includes several new features.
* State reconciliation for frameworks:
* Allows frameworks to reconcile the states of the tasks.
* Support for Mesos modules
* Support for modules in master, slave and tests using the --modules
flag.
* Task status now includes source and reason:
* [MESOS-343] - Expose TASK_FAILED reason to Frameworks.
* [MESOS-1143] - Add a TASK_ERROR task status.
* A shared filesystem isolator:
* Volumes can be mounted from the host into a container's
filesystem.
* Parts of the shared filesystem can be made private to each
container, e.g., a private /tmp for each container.
* A pid namespace isolator:
* Processes inside a container will not have visibility to host
processes or processes in any other container.
* Containers will be destroyed by terminating the 'init' process for
the pid namespace rather than using the freezer cgroup, avoiding known
kernel bugs.
* API Changes:
* [MESOS-1461] - Add task reconciliation to the Python API.
* Deprecations:
* [MESOS-1807] - Disallow executors with cpu only or memory only resources.
* [MESOS-1986] - Disabling checkpointing is deprecated and the --checkpoint flag will be removed in a future release.
* Build changes:
* [MESOS-1044] - Require C++11 compiler support.
This release also includes several bug fixes and stability
improvements.
** Bug
* [MESOS-487] - Balloon framework fails to run due to bad flags
* [MESOS-631] - Slave started in cleanup mode shouldn't accept new tasks
* [MESOS-947] - Slave should properly handle a killTask() that arrives between runTask() and _runTask()
* [MESOS-1081] - Master should not deactivate authenticated framework/slave on new AuthenticateMessage unless new authentication succeeds.
* [MESOS-1195] - systemd.slice + cgroup enablement fails in multiple ways.
* [MESOS-1208] - 3rdparty/libprocess/3rdparty/boost-1.53.0/boost/math/special_functions/sign.hpp:113:55: error: typedef 'fp_tag' locally defined but not used [-Werror=unused-local-typedefs]
* [MESOS-1219] - Master should disallow frameworks that reconnect after failover timeout.
* [MESOS-1389] - Reconciliation can send TASK_LOST before a terminal update reaches the framework.
* [MESOS-1392] - Failure when znode is removed before we can read its contents.
* [MESOS-1414] - Status updates should not be sent from the slave until it is registered.
* [MESOS-1463] - mesos-local.sh dumps core
* [MESOS-1668] - Handle a temporary one-way master --> slave socket closure.
* [MESOS-1676] - ZooKeeperMasterContenderDetectorTest.MasterDetectorTimedoutSession is flaky
* [MESOS-1688] - No offers if no memory is allocatable
* [MESOS-1695] - The stats.json endpoint on the slave exposes "registered" as a string.
* [MESOS-1696] - Improve reconciliation between master and slave.
* [MESOS-1703] - better error message when replicated log hasn't been initialized
* [MESOS-1712] - Automate disallowing of commits mixing mesos/libprocess/stout
* [MESOS-1715] - The slave does not send pending tasks during re-registration.
* [MESOS-1716] - The slave does not add pending tasks as part of the staging tasks metric.
* [MESOS-1722] - Wrong attributes separator in slave --help
* [MESOS-1741] - mesos-slave shouldn't fail if dockerd is down
* [MESOS-1746] - clear TaskStatus data to avoid OOM
* [MESOS-1748] - MasterZooKeeperTest.LostZooKeeperCluster is flaky
* [MESOS-1769] - Segfault when using external containerizer
* [MESOS-1774] - Fix protobuf detection on systems with Python 3 as default
* [MESOS-1782] - AllocatorTest/0.FrameworkExited is flaky
* [MESOS-1783] - MasterTest.LaunchDuplicateOfferTest is flaky
* [MESOS-1786] - FaultToleranceTest.ReconcilePendingTasks is flaky.
* [MESOS-1797] - Packaged Zookeeper does not compile on OSX Yosemite
* [MESOS-1799] - Reconciliation can send out-of-order updates.
* [MESOS-1814] - Task attempted to use more offers than requested in example Java and Python frameworks
* [MESOS-1817] - Completed tasks remain in TASK_RUNNING when framework is disconnected
* [MESOS-1821] - CHECK failure in master.
* [MESOS-1824] - when "docker ps -a" returns 400+ lines enabling docker containerizer results in all executors dying
* [MESOS-1833] - Running docker container with colon in executor id generates error
* [MESOS-1834] - Default port for mesos is 5050, but documentation states it as 5051
* [MESOS-1843] - Specifying --with-curl doesn't work.
* [MESOS-1844] - AllocatorTest/0.SlaveLost is flaky
* [MESOS-1849] - Cannot execute container in privileged mode
* [MESOS-1853] - Remove /proc and /sys remounts from port_mapping isolator
* [MESOS-1854] - SlaveRecoveryTest.MultipleSlaves is flaky.
* [MESOS-1855] - Mesos 0.20.1 doesn't compile
* [MESOS-1857] - path::join() is broken
* [MESOS-1858] - Leaked file descriptors in StatusUpdateStream.
* [MESOS-1862] - Performance regression in the Master's http metrics.
* [MESOS-1866] - Race between ~Authenticator() and Authenticator::authenticate() can lead to schedulers/slaves to never get authenticated
* [MESOS-1869] - UpdateFramework message might reach the slave before Reregistered message and get dropped
* [MESOS-1873] - Don't pass task-related arguments to mesos-executor
* [MESOS-1875] - os::killtree() incorrectly returns early if pid has terminated
* [MESOS-1878] - Access to sandbox on slave from master UI does not show the sandbox contents
* [MESOS-1881] - Reviewbot should not apply reviews that are submitted.
* [MESOS-1884] - Composing Containerizer is not sending calls to still launching containers
* [MESOS-1892] - Using mesos-0.20.1.jar with libmesos-0.21.0 reliably segfaults
* [MESOS-1901] - Slave resources obtained from localhost:5051/state.json is not correct.
* [MESOS-1915] - Docker containers that fail to launch are not killed
* [MESOS-1945] - SlaveTest.KillTaskBetweenRunTaskParts is flaky
* [MESOS-1948] - Docker tests are flaky
* [MESOS-1967] - Test RoutingTest.INETSockets fails on some machines
* [MESOS-1969] - RBT only takes revision ranges as args for versions >= 0.6
* [MESOS-1970] - slave and offer ids are indistinguishable in the logs
* [MESOS-1975] - Module manager causes make check failure for annotated mesos versions.
* [MESOS-1989] - Container network stats reported by the port mapping isolator are the reverse of the actual network stats.
* [MESOS-2025] - OsTest.killtreeNoRoot: Process reparent assumes new parent is init pid 1
* [MESOS-2036] - Fix the Json format for the --modules and update the help message
* [MESOS-2046] - Configure should check headers and libraries for svn and apr
* [MESOS-2050] - InMemoryAuxProp plugin used by Authenticators results in SEGFAULT
* [MESOS-2052] - RunState::recover should always recover 'completed'
* [MESOS-2078] - Scheduler driver may ACK status updates when the scheduler threw an exception
** Documentation
* [MESOS-1506] - Update documentation/flags regarding new default hostname semantics
* [MESOS-1950] - Add module writers guide
* [MESOS-1984] - Documentation for Egress Control Limit
* [MESOS-2033] - Documentation for isolator filesystem/shared.
* [MESOS-2034] - Documentation for isolator namespaces/pid.
* [MESOS-2037] - Update docs/configuration.md
** Epic
* [MESOS-1407] - Provide state reconciliation for frameworks.
** Improvement
* [MESOS-186] - Resource offers should be rescinded after some configurable timeout
* [MESOS-750] - Require compilers that support c++11
* [MESOS-1181] - Improve cpplint rule coverage
* [MESOS-1502] - expose message event queue size from libprocess
* [MESOS-1567] - Add logging of the user uid when receiving SIGTERM.
* [MESOS-1586] - Isolate system directories, e.g., per-container /tmp
* [MESOS-1643] - Provide APIs to return port resource for a given role
* [MESOS-1656] - Do not remove docker container until gc process runs
* [MESOS-1728] - Libprocess: report bind parameters on failure
* [MESOS-1752] - Allow variadic templates
* [MESOS-1771] - introduce unique_ptr
* [MESOS-1779] - Mesos style checker should catch trailing white space
* [MESOS-1811] - Reconcile disconnected/deactivated semantics in the master code
* [MESOS-1813] - Fail fast in example frameworks if task goes into unexpected state
* [MESOS-1863] - Split launch tasks and decline offers metrics
* [MESOS-1896] - Enable module specific command line parameters
* [MESOS-1927] - Enable implicit local cluster launch to load modules
* [MESOS-1932] - Install git pre commit hook during bootstrap
* [MESOS-1951] - Add --isolation flag to mesos-tests
* [MESOS-1972] - Move TASK_LOST generations due to invalid tasks from scheduler driver to master
* [MESOS-2038] - Remove dead code in Slave::_runTask
** Story
* [MESOS-343] - Expose TASK_FAILED reason to Frameworks.
* [MESOS-1765] - Use PID namespace to avoid freezing cgroup
** Task
* [MESOS-681] - Document the reconciliation API.
* [MESOS-1410] - Keep terminal unacknowledged tasks in the master's state.
* [MESOS-1808] - Expose RTT in container stats
* [MESOS-1864] - Add test integration for module developers
* [MESOS-1931] - Add support for isolator modules
* [MESOS-1943] - Add event queue size metrics to scheduler driver
* [MESOS-1964] - 0.21.0 release
* [MESOS-1965] - Create mesos::modules namespace for all module related stuff
* [MESOS-1985] - Use more standard debug / release build flags
Release Notes - Mesos - Version 0.20.1
--------------------------------------
* This is a bug fix release.
** Bug
* [MESOS-1705] - SubprocessTest.Status sometimes flakes out
* [MESOS-1724] - Can't include port in DockerInfo's image
* [MESOS-1727] - Configure fails with ../configure: line 18439: syntax error near unexpected token `PROTOBUFPREFIX,'
* [MESOS-1729] - LogZooKeeperTest.WriteRead fails due to SIGPIPE (escalated to SIGABRT)
* [MESOS-1730] - Should be an error if commandinfo shell=true when using docker containerizer
* [MESOS-1732] - Mesos containerizer doesn't reject tasks with container info set
* [MESOS-1737] - Isolation=external result in core dump on 0.20.0
* [MESOS-1740] - Bad error message when docker containerizer isn't enabled
* [MESOS-1749] - SlaveRecoveryTest.ShutdownSlave is flaky
* [MESOS-1755] - Add docker support to mesos-execute
* [MESOS-1758] - Freezer failure leads to lost task during container destruction.
* [MESOS-1760] - MasterAuthorizationTest.FrameworkRemovedBeforeReregistration is flaky
* [MESOS-1764] - Build Fixes from 0.20 release
* [MESOS-1766] - MasterAuthorizationTest.DuplicateRegistration test is flaky
* [MESOS-1809] - Modify docker pull to use docker inspect after a successful pull
** Improvement
* [MESOS-1621] - Docker run networking should be configurable and support bridge network
* [MESOS-1762] - Avoid docker pull on each container run
* [MESOS-1770] - Docker with command shell=true should override entrypoint
Release Notes - Mesos - Version 0.20.0
--------------------------------------
This release includes many new features. The major ones are listed below:
* Docker support in Mesos:
* Users can now launch executors/tasks within Docker containers.
* Mesos now supports running multiple containerizers simultaneously. The slave
can dynamically choose a containerizer to launch containers based on the
configuration of executors/tasks.
* Container level network monitoring for mesos containerizer:
* Network statistics for each active container can be retrieved through the
/monitor/statistics.json endpoint on the slave.
* Completely transparent to the tasks running on the slave. No need to change
the service discovery mechanism for tasks.
* Framework authorization:
* Allows frameworks to (re-)register with authorized roles.
* Allows frameworks to launch tasks/executors as authorized users.
* Allows authorized principals to shut down framework(s) through HTTP endpoint.
* Framework rate limiting:
* In a multi-framework environment, this feature aims to protect the
throughput of high-SLA (e.g., production, service) frameworks by having the
master throttle messages from other (e.g., development, batch) frameworks.
* Enable building against installed third-party dependencies.
* API Changes:
* [MESOS-857] - The Python API now uses different namespacing. This will break
existing schedulers; please refer to the upgrades document.
* [MESOS-1409] - Status update acknowledgements are sent through the Master
now. This only affects you if you're using a non-Mesos binding (e.g. pure
language binding), in which case refer to the upgrades document.
* HTTP endpoint changes:
* [MESOS-1188] - "deactivated_slaves" represents inactive slaves in "/stats.json" and "/state.json".
* [MESOS-1390] - "/shutdown" authenticated endpoint has been added to master to shut down a framework.
* Deprecations:
* [MESOS-1219] - Master should disallow completed frameworks from re-registering with same framework id.
* [MESOS-1695] - "/stats.json" on the slave exposes "registered" value as string instead of integer.
This release also includes several bug fixes and stability improvements.
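To make the idea behind the framework rate limiting feature listed above more concrete, here is a minimal Python sketch of per-principal throttling using a token bucket. This is a generic technique chosen for illustration, not the master's actual implementation: TokenBucket, admit, and the principal names are hypothetical, and the sketch simply rejects messages over the limit, whereas a real master may delay them instead.

```python
from typing import Dict, Optional


class TokenBucket:
    """Hypothetical token-bucket throttle; the caller supplies the clock."""

    def __init__(self, qps: float, capacity: float = 1.0) -> None:
        self.qps = qps            # tokens added per second
        self.capacity = capacity  # maximum number of stored tokens
        self.tokens = capacity    # start with a full bucket
        self.last = 0.0           # time of the previous call, in seconds

    def allow(self, now: float) -> bool:
        # Refill in proportion to elapsed time, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.qps)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


def admit(limits: Dict[str, Optional[TokenBucket]],
          principal: str, now: float) -> bool:
    """Admit a message unless the principal's bucket is empty.

    Principals mapped to None (or not listed) are not throttled here.
    """
    bucket = limits.get(principal)
    return True if bucket is None else bucket.allow(now)


if __name__ == "__main__":
    # Assumed setup: the high-SLA principal is unthrottled, batch gets 2 qps.
    limits = {"production": None, "batch": TokenBucket(qps=2.0)}
    for t in (0.0, 0.25, 0.5):
        print(t, admit(limits, "batch", t), admit(limits, "production", t))
```

Passing the current time in explicitly keeps the sketch deterministic and easy to test; a real component would read a clock instead.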
** Sub-task
* [MESOS-1292] - [MESOS-1259]:Enrich the Java Docs in the src/java files. -- ZooKeeperState.java
* [MESOS-1293] - [MESOS-1259]:Enrich the Java Docs in the src/java files. -- Variable.java
* [MESOS-1294] - [MESOS-1259]:Enrich the Java Docs in the src/java files. -- State.java
** Bug
* [MESOS-445] - Scheduler driver destructor waits forever
* [MESOS-473] - Freezer fails fatally when it is unable to write 'FROZEN' to freezer.state
* [MESOS-759] - The cgroups TaskKiller should skip freezing the cgroup if it is already empty.
* [MESOS-856] - TasksKiller may run forever because the cgroup cannot be frozen.
* [MESOS-878] - Slave should not register with the master when in TERMINATING.
* [MESOS-1001] - registrar doesn't build on Linux/Clang
* [MESOS-1119] - Allocator should make an allocation decision per slave instead of per framework/role.
* [MESOS-1149] - SlaveRecovery.Reboot test doesn't reap executor
* [MESOS-1170] - Update system check (glog)
* [MESOS-1171] - Update system check (gmock)
* [MESOS-1172] - Update system check (libev)
* [MESOS-1173] - Update system check (picojson)
* [MESOS-1174] - Update system check (protobuf)
* [MESOS-1178] - Only enable the oom killer if it's not enabled
* [MESOS-1337] - AllocatorZooKeeperTest/0.FrameworkReregistersFirst runs forever
* [MESOS-1341] - AllocatorZooKeeperTest/0.FrameworkReregistersFirst is flaky
* [MESOS-1348] - The SlaveRecoveryTest.GCExecutor test leaks child processes.
* [MESOS-1354] - Resource leak in jvm.cpp
* [MESOS-1404] - Glibc 'fork()' is not async signal safe
* [MESOS-1417] - Slave should not send terminal status update before containerizer update is finished
* [MESOS-1422] - AllocatorTest/0.SchedulerFailover test is flaky
* [MESOS-1428] - Failed to update 'registry': Failed to perform store within 5secs (caused flaky MasterTest.StatusUpdateAcknowledgementsThroughMaster)
* [MESOS-1435] - RegistrarZooKeeperTest.TaskRunning is flaky, sometimes runs forever.
* [MESOS-1436] - AllocatorZooKeeperTest/0.SlaveReregistersFirst flaky and can run forever
* [MESOS-1437] - SlaveRecoveryTest/0.RestartBeforeContainerizerLaunch is flaky
* [MESOS-1439] - SchedulerTest.MetricsEndpoint is flaky
* [MESOS-1454] - Command executor should have nonzero resources
* [MESOS-1467] - commit msg was changed after run ./support/post-reviews.py
* [MESOS-1477] - Deadlock when terminating ZooKeeperProcess
* [MESOS-1479] - Cgroups cpu isolator should only report cfs stats if cfs is enabled
* [MESOS-1492] - Add support for optionally throttling the frameworks not specified in RateLimits config
* [MESOS-1504] - mesos.pb.h header include is problematic.
* [MESOS-1513] - FaultToleranceTest.SlaveReregisterTerminatedExecutor is flaky
* [MESOS-1526] - Regression in 'make distclean': files left around.
* [MESOS-1529] - Handle a network partition between Master and Slave
* [MESOS-1532] - AllocatorZooKeeperTest/0.SlaveReregistersFirst and AllocatorZooKeeperTest/0.FrameworkReregistersFirst are flaky
* [MESOS-1533] - HealthCheck tests are flaky
* [MESOS-1536] - AllocatorZooKeeperTest/0.FrameworkReregistersFirst
* [MESOS-1540] - Fix a typo in src/Makefile.am to include java test cases
* [MESOS-1543] - MasterTest.OrphanTasks is flaky
* [MESOS-1544] - DRFAllocatorTest.SameShareAllocations is flaky
* [MESOS-1549] - The configure script should check for libnl headers as well
* [MESOS-1555] - ExecutorInfo validity check is broken in Master
* [MESOS-1578] - Improve framework rate limiting by imposing the max number of outstanding messages per framework principal
* [MESOS-1604] - LowLevelSchedulerLibprocess did not receive offers from Master
* [MESOS-1610] - Mesos containerizer should not call isolate if the child process already died.
* [MESOS-1617] - Linux kernel generates duplicated tc u32 filter handles
* [MESOS-1624] - Apache Jenkins build fails due to -lsnappy is set when building leveldb
* [MESOS-1627] - Installed protobuf header files include wrong path to mesos header file
* [MESOS-1629] - GLOG Initialized twice if the Framework Scheduler also uses GLOG
* [MESOS-1632] - Seg fault due to infinite recursion "<< RepeatedPtrField<Resource>"
* [MESOS-1633] - Create a static mesos library
* [MESOS-1635] - zk flag fails when specifying a file and the replicated logs
* [MESOS-1639] - Master OOMs when throttling traffic from LoadGeneratorFramework
* [MESOS-1649] - Network isolator should tolerate slave crashes while doing isolate/cleanup.
* [MESOS-1653] - HealthCheckTest.GracePeriod is flaky.
* [MESOS-1655] - ZooKeeperTest.LeaderDetectorTimeoutHandling is flaky
* [MESOS-1658] - Implementation of process::io::poll can lead to broken pipes.
* [MESOS-1670] - Build Failure on Mac OSX with undefined link
* [MESOS-1673] - The value of MASTER_PING_TIMEOUT is non-deterministic
* [MESOS-1677] - AllocatorTest.FrameworkReregistersFirst is flaky.
* [MESOS-1692] - Build error on gcc-4.4.
* [MESOS-1693] - Enable builds for ARM
* [MESOS-1700] - ThreadLocal does not release pthread keys or log properly.
* [MESOS-1704] - Mac OS X build breaks in DockerContainerizerProcess::fetch
* [MESOS-1705] - SubprocessTest.Status sometimes flakes out
* [MESOS-1710] - Compilation against master fails on make check
** Documentation
* [MESOS-1480] - Write Documentation for Authorization
* [MESOS-1702] - Add document for network monitoring.
** Epic
* [MESOS-1071] - Enable building against installed third-party dependencies.
* [MESOS-1228] - Container level network monitoring
* [MESOS-1342] - Add authorization support.
** Improvement
* [MESOS-292] - Remove unnecessary includes of headers to improve compile times
* [MESOS-320] - Add instrumentation into libprocess.
* [MESOS-857] - restructure mesos python namespace
* [MESOS-921] - Consider simultaneous containerizer support
* [MESOS-987] - Wire up a code coverage tool
* [MESOS-1188] - Rename slaves/frameworks.activated/deactivated
* [MESOS-1236] - stout's os module uses a mix of Try<Nothing> and bool returns
* [MESOS-1237] - stout's os::ls should return a Try<>
* [MESOS-1259] - Enrich the Java Docs in the src/java files.
* [MESOS-1312] - Show active tasks orphaned by a framework disconnect
* [MESOS-1324] - Create a network isolator based on port mapping
* [MESOS-1339] - Add "per-framework-principal" counters for all messages from a scheduler on Master
* [MESOS-1379] - Provide a reconciliation mechanism for tasks unknown to the framework.
* [MESOS-1390] - Add an authenticated '/shutdown' endpoint for shutting down a running framework
* [MESOS-1446] - Create an abstraction for launching an operation in a subprocess.
* [MESOS-1450] - Add setns utilities to stout
* [MESOS-1453] - Update reconciliation semantics send statuses for each task.
* [MESOS-1499] - Add flags parse support for specific protobufs
* [MESOS-1501] - Add flags parse support for RateLimits protobuf
* [MESOS-1511] - Simplify 'Operation' semantics to only handle logics in the subprocess side
* [MESOS-1519] - Expose constructors of types used in java APIs
* [MESOS-1523] - ZooKeeper timeout should be longer
* [MESOS-1525] - Don't require slave id for reconciliation requests.
* [MESOS-1528] - Refactor Subprocess to support execve style launch and customized clone function
* [MESOS-1557] - Allow the network isolator to handle those tasks that are not isolated by the network isolator
* [MESOS-1559] - Allow jenkins build machine to dump stack traces of all threads when timeout
* [MESOS-1590] - Allow LoadGeneratorFramework to read password from a file
* [MESOS-1591] - Do not install LoadGeneratorFramework
* [MESOS-1608] - Add support for installing stout headers
* [MESOS-1616] - ReregisterCompletedFrameworks test does not use real JSON parser
* [MESOS-1620] - Reconciliation does not send back tasks pending validation / authorization.
* [MESOS-1652] - Stream Docker logs into sandbox logs
** Story
* [MESOS-1350] - Initial implementation of framework API rate limiter, taking the config via master flag
* [MESOS-1595] - Provide a way to install libprocess
** Task
* [MESOS-1307] - Authorize offer allocations
* [MESOS-1325] - Create a linux routing library abstraction based on libnl
* [MESOS-1343] - Authorize "/shutdown" HTTP endpoint through ACLs.
* [MESOS-1374] - Verify static libprocess scheduler port works with Mesos Master
* [MESOS-1409] - Send status update acknowledgments through the Master.
* [MESOS-1443] - Create a protobuf for framework rate limit configuration and load it as JSON through master flags
* [MESOS-1444] - Integrate rate limiter into the master
* [MESOS-1445] - Add new tests for framework rate limiting
* [MESOS-1451] - Remove 'offer_id' field from LaunchTasksMessage.
* [MESOS-1505] - Add a test to verify that frameworks with same share get equal number of allocations
* [MESOS-1530] - Create LoadGeneratorScheduler to test Framework Rate Limiting
* [MESOS-1568] - Support ENTRYPOINT style containers
* [MESOS-1580] - Accept --isolation=external through a deprecation cycle.
* [MESOS-1593] - Add DockerInfo Configuration
* [MESOS-1600] - IP classifiers in routing lib should ignore IP packets with IP options
* [MESOS-1601] - Add metrics for port mapping network isolator
* [MESOS-1671] - Expose executor metrics for slave.
* [MESOS-1672] - Add filter to allocator resourcesRecovered method
* [MESOS-1674] - Kill private_resources and treat 'ephemeral_ports' as a resource.
* [MESOS-1683] - Create user doc for framework rate limiting feature
Release Notes - Mesos - Version 0.19.1
--------------------------------------
* This is a bug fix release.
** Bug
* [MESOS-1448] - Mesos Fetcher doesn't support URLs that have 30X redirects.
* [MESOS-1534] - Scheduler process is not explicitly terminated in the destructor of MesosSchedulerDriver.
* [MESOS-1538] - A container destruction in the middle of a launch leads to CHECK failure.
* [MESOS-1539] - No longer able to spin up Mesos master in local mode.
* [MESOS-1550] - MesosSchedulerDriver should never, ever, call 'stop'.
* [MESOS-1551] - Master does not create work directory when missing.
Release Notes - Mesos - Version 0.19.0
--------------------------------------
* The primary feature of this release is the "Registrar". This is the addition
of replicated state in the master to ensure the set of slaves in the cluster
remains consistent in the presence of master failovers.
* This feature is currently used in a write-only manner by default to allow
smooth upgrades. 0.20.0 by default will be write *and* read.
* Operators must now specify the 'work_dir' for the master, along with the
'quorum' size of the ensemble of masters.
* This means adding or removing masters must be done carefully! The best
practice is to only ever add or remove a single master at a time and to
allow a small amount of time for the replicated log to catch up on the new
master.
* Authentication support has been added for slaves.
* Metrics reporting has been overhauled and is now exposed on /metrics/snapshot.
* Support for external containerization strategies has been added to support
custom container needs as well as experimentation; this is an alpha release!
* There are also several bug fixes and stability improvements.
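The 'quorum' size mentioned above is typically chosen as a strict majority of the masters. The following Python sketch assumes that rule; majority_quorum is a hypothetical helper for sizing the value and is not part of Mesos.

```python
def majority_quorum(num_masters: int) -> int:
    """Smallest number of masters that forms a strict majority.

    Assumption for this sketch: quorum must be greater than num_masters / 2.
    """
    if num_masters < 1:
        raise ValueError("at least one master is required")
    return num_masters // 2 + 1


if __name__ == "__main__":
    for n in (1, 3, 5):
        print(f"{n} master(s) -> quorum {majority_quorum(n)}")
```

Under this assumption, growing an ensemble from 3 to 5 masters moves the quorum from 2 to 3, which is one reason the note above recommends adding or removing only a single master at a time.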
** Sub-task
* [MESOS-562] - Update 'Getting Started' Documentation Page
* [MESOS-783] - Master::killTask must not answer with TASK_LOST when the task is unknown.
* [MESOS-841] - Enforce only leading master can write to the Registrar.
* [MESOS-880] - introduce observe endpoint to master
* [MESOS-957] - introduce RepairCoordinator stub into master
* [MESOS-1226] - Add flags for replicated log backed registry.
* [MESOS-1338] - Add global counters for each message type on Master
** Bug
* [MESOS-361] - Restrict the character space of user provided TaskIDs.
* [MESOS-577] - bootstrap fails with automake 1.14
* [MESOS-578] - configure fails on OSX 10.8.4
* [MESOS-682] - Master should properly consolidate "slaves" and "deactivated" maps
* [MESOS-743] - ReservationAllocatorTest.ResourcesReturned test is flaky
* [MESOS-767] - Slave should re-register with completed frameworks/executors
* [MESOS-779] - mesos python examples use 2 space indent
* [MESOS-873] - Crash in os::killtree on Mavericks
* [MESOS-931] - post-review is deprecated.
* [MESOS-1000] - Clang build broken on 0.18.0 master
* [MESOS-1019] - AllocatorZooKeeperTest/0.SlaveReregistersFirst is flaky.
* [MESOS-1020] - AllocatorZooKeeperTest/0.SlaveReregistersFirst is flaky
* [MESOS-1025] - json_tests fails build
* [MESOS-1042] - Fix bad CGROUPS_ROOT_Write test
* [MESOS-1048] - LimitedCpuIsolatorTest.CgroupsCfs is broken when run as non-root
* [MESOS-1053] - tar: You must specify one of the `-Acdtrux' or `--test-label' options
* [MESOS-1054] - Java extension build is broken if libsnappy is installed
* [MESOS-1058] - Master CHECK failure: hierarchical_allocator_process.hpp:421 Check failed: !slaves.contains(slaveId)
* [MESOS-1062] - CpuIsolatorTest/0.SystemCpuUsage is flaky
* [MESOS-1067] - Specifying minimum logging level doesn't work
* [MESOS-1072] - Update system check (python boto)
* [MESOS-1077] - Registrar tests are flaky.
* [MESOS-1080] - cpplint.py doesn't analyze hpp files
* [MESOS-1082] - Make fails on AWS Ubuntu 12.04 and 13.10
* [MESOS-1083] - Error in CgroupsTest::SetUpTestCase() and TearDownTestCase()
* [MESOS-1088] - ZooKeeperMasterContenderDetectorTest.MasterDetectorExpireSlaveZKSessionNewMaster is flaky
* [MESOS-1092] - [Doc] "bin/mesos-master --help" to "mesos-master --help"
* [MESOS-1099] - Log health checks in mesos
* [MESOS-1100] - Drop "OOM notifier is triggered" log message
* [MESOS-1124] - Mesos EC2 scripts: Cannot find any cluster
* [MESOS-1126] - Change linkage around libjvm to use dlopen.
* [MESOS-1152] - ProcTest.MultipleThreads is flaky
* [MESOS-1157] - make dist fails
* [MESOS-1158] - make distcheck fails
* [MESOS-1161] - Inconsistent completed frameworks state between slave and master
* [MESOS-1164] - URL encoded urls do not work in slave
* [MESOS-1165] - Retry required when recovering an empty log
* [MESOS-1167] - Update system check (boost)
* [MESOS-1168] - Update system check (zookeeper)
* [MESOS-1175] - Update system check (http-parser)
* [MESOS-1191] - ProcTest unit tests flaky
* [MESOS-1202] - Make it easy to apply GitHub pull requests
* [MESOS-1210] - OsTest.children test is flaky
* [MESOS-1211] - MesosContainerizer should recover isolators after the launcher recovers
* [MESOS-1214] - CHECK failure in Group
* [MESOS-1230] - Compiler warning in libprocess statistics
* [MESOS-1231] - CHECK failed in log coordinator
* [MESOS-1235] - Metrics.Snapshot* tests fail
* [MESOS-1239] - Group CHECK failure
* [MESOS-1264] - Slave authentication retries can trigger TASK_LOST for non-checkpointing frameworks.
* [MESOS-1265] - Group should not process enqueued events from previous ZooKeeper instance (and ZK session)
* [MESOS-1268] - distclean breaks during maven clean up
* [MESOS-1271] - CHECK failure in replica.
* [MESOS-1273] - SlaveRecoveryTest/0.RestartBeforeContainerizerLaunch is flaky
* [MESOS-1275] - FaultToleranceTest.SlaveReregisterOnZKExpiration is flaky
* [MESOS-1276] - Make the delay between master detection and registration configurable
* [MESOS-1310] - Queuing up slave (re-)registration during authentication causes reply() to fail
* [MESOS-1318] - ProcessWatcher triggers seg fault
* [MESOS-1331] - SlaveRecoveryTest/0.NonCheckpointingFramework is flaky.
* [MESOS-1333] - Runtime error when invoking post-reviews.py with rbt 0.6
* [MESOS-1347] - GarbageCollectorIntegrationTest.DiskUsage is flaky.
* [MESOS-1348] - The SlaveRecoveryTest.GCExecutor test leaks child processes.
* [MESOS-1361] - Flaky test: SlaveRecoveryTest/0.RecoverCompletedExecutor
* [MESOS-1362] - Flaky test: SlaveRecoveryTest/0.RemoveNonCheckpointingFramework
* [MESOS-1365] - SlaveRecoveryTest/0.MultipleFrameworks is flaky
* [MESOS-1368] - Credentials file permissions check is broken
* [MESOS-1370] - SlaveRecoveryTest/0.RemoveNonCheckpointingFramework is flaky
* [MESOS-1372] - Compiler warning from stout flags
* [MESOS-1376] - CHECK failure in the Registrar
* [MESOS-1400] - Master doesn't recover resources for invalid offers
* [MESOS-1406] - Master stats.json using boolean instead of integral value for 'elected'.
* [MESOS-1408] - Unnecessary queuing of status update acknowledgments in the scheduler driver.
* [MESOS-1413] - MesosContainerizerExecuteTest.IoRedirection fails on OSX
* [MESOS-1415] - Web UI master redirect message doesn't show up
* [MESOS-1418] - Master should remove/rescind offers for disconnected slave.
* [MESOS-1419] - Properly rescind offers
* [MESOS-1449] - Isolator::recover will attempt to remove slave cgroup when using --slave_subsystems
* [MESOS-1455] - Segfault in libprocess during Process linking.
** Documentation
* [MESOS-1002] - Add "make check" instruction to getting started doc
* [MESOS-1377] - Update configuration documentation to reflect 0.19.0 master flags.
** Epic
* [MESOS-764] - Implement Master persistence using the Registrar.
** Improvement
* [MESOS-135] - Improve javadoc (use @param, @return, etc)
* [MESOS-269] - Better JSON Support
* [MESOS-295] - Allow new masters to have better understanding of cluster state
* [MESOS-581] - Expose cpu and memory usage statistics for master and slave
* [MESOS-610] - Split slave specific tests out of master_tests
* [MESOS-922] - Containerizer to support launching tasks by TaskInfo
* [MESOS-945] - Show framework host name in the WebUI
* [MESOS-956] - Add a "Sequence" abstraction to serialize callbacks.
* [MESOS-980] - Revisit Future discard semantics to enforce that transitions occur through a Promise.
* [MESOS-982] - Relax slave (re-)registration retries and add a backoff mechanism.
* [MESOS-983] - Expose log coordinator demotion.
* [MESOS-984] - Implement "auto-initialization" of the Replicated Log.
* [MESOS-995] - Extend Subprocess to support environment variables, changing user and working directory
* [MESOS-1015] - Some header files have 'using' statements
* [MESOS-1026] - Pull std::tuple / boost::tuples::tuple into tuples namespace of stout
* [MESOS-1036] - Implement a library for exposing statistical metrics.
* [MESOS-1041] - fatal() should use abort rather than exit(1) to get stacktraces
* [MESOS-1052] - Add a script that can run via CI to verify the reviews.
* [MESOS-1055] - Add explicit to single argument constructors
* [MESOS-1057] - libprocess: Add explicit to single argument constructors
* [MESOS-1068] - No --version command line parameter
* [MESOS-1087] - Display warning for credentials file permissions
* [MESOS-1105] - TODO(benh): choose a better scheme to set mem in slave/containerizer/containerizer.cpp
* [MESOS-1112] - Refactor the Registrar to push the operations to the caller to simplify the interface
* [MESOS-1151] - Make review bot check for style issues
* [MESOS-1155] - Improve the performance of Registrar
* [MESOS-1160] - Support flattening from Try into Future.
* [MESOS-1182] - Implement an output stream operator overload for Master::Slave
* [MESOS-1224] - Add dynamic loadable library abstraction to stout.
* [MESOS-1234] - Mesos ReviewBot should look at old reviews first
* [MESOS-1252] - Support ENV MAVEN_HOME to establish the path of the `mvn` executable.
* [MESOS-1255] - Master UI should show Mesos version
* [MESOS-1270] - Reconcile logging messages in master
* [MESOS-1274] - Disallow further operations in the Registrar when a failure occurs.
* [MESOS-1287] - metrics collection should not wait indefinitely
* [MESOS-1332] - Improve Master and Slave metric names
* [MESOS-1344] - Add flags support for JSON
* [MESOS-1349] - Mesos style checker should only check for updated files
* [MESOS-1358] - Show when the leading master was elected in the webui
* [MESOS-1382] - Include the error message in routing::socket().
* [MESOS-1405] - Mesos fetcher does not support S3(n)
** Story
* [MESOS-804] - Add authentication support for slaves
* [MESOS-838] - Consider exporting queue size as a metric from the master
** Task
* [MESOS-911] - Add pluggable authorization interface
* [MESOS-974] - Add a unit test for java api of replicated log
* [MESOS-981] - Implement Storage on the Replicated Log.
* [MESOS-1116] - Create library to track statistics of metrics
* [MESOS-1123] - Implement tests for stout/cache.hpp
* [MESOS-1132] - Port master stats.json over to new metrics library
* [MESOS-1133] - Port slave stats.json over to new metrics library
* [MESOS-1146] - Port system process stats over to new metrics library
* [MESOS-1197] - Adding signal safe os::system
* [MESOS-1217] - Add Timer metric to Metrics library
* [MESOS-1284] - metrics Timer should use Clock
* [MESOS-1304] - Create framework rate limiting design document and gather feedback
* [MESOS-1305] - Export frameworks QPS through metrics endpoint
* [MESOS-1314] - Update default registry to "replicated_log".
* [MESOS-1317] - Add integration tests to enforce the semantics of a "strict" registry.
* [MESOS-1319] - Add recovery integration tests for a "strict" registry.
* [MESOS-1320] - Add reconciliation integration tests for a "strict" registry.
* [MESOS-1321] - Add killTask integration tests for a "strict" registry.
* [MESOS-1322] - Add failover integration tests for a "strict" registry.
* [MESOS-1371] - Expose libprocess queue length from scheduler driver to metrics endpoint
* [MESOS-1373] - Keep track of the principals for authenticated pids in Master.
* [MESOS-1380] - mesos-local should set default work_dir
* [MESOS-1383] - Expose the authenticated principal through Authenticator::authenticate() result
* [MESOS-1387] - Integrate Authorizer into Master
* [MESOS-1411] - Update Master and Slave to handle status update acknowledgments going through the master.
Release Notes - Mesos - Version 0.18.2
--------------------------------------
* This is a bug fix release.
** Bug
* [MESOS-1313] - The executable bit is now essentially ignored with the 0.18.1 fetcher implementation
Release Notes - Mesos - Version 0.18.1
--------------------------------------
* This is a bug fix release.
** Bug
* [MESOS-979] - Master segfault when tasks.json endpoint is hit
* [MESOS-1045] - Unrecognized file extension in CommandInfo.URI causes executor to exit
* [MESOS-1078] - JNI calls hasNext on ArrayList instead of iterator
* [MESOS-1221] - Slave should update the containerizers with executor resources after recovery
* [MESOS-1241] - Unable to disable the auto-extraction of URIs (mesos-fetcher)
** Improvement
* [MESOS-1212] - Use maven to compile and package Mesos' Java files
Release Notes - Mesos - Version 0.18.0
--------------------------------------
* The primary feature of this release is a refactor of the isolation
abstraction to make it easy to add pluggable isolators/containerizers.
** Sub-task
* [MESOS-1043] - Change configure.ac to use C++11 by default.
** Bug
* [MESOS-422] - Master leader election should be more robust to stale ephemeral nodes
* [MESOS-537] - ZooKeeperMasterDetectorTest.MasterDetectorExpireSlaveZKSessionNewMaster is flaky
* [MESOS-672] - Web UI redirection does not work for hosts whose ip addresses are not publicly accessible
* [MESOS-837] - AWAIT_READY should not depend on process::Clock
* [MESOS-904] - Check for libcxx is missing in configure.ac
* [MESOS-912] - Slave sometimes crashes with SIGPIPE
* [MESOS-927] - OsTest.killtree is flaky
* [MESOS-937] - Fix "pure virtual method called" bug in zookeeper::ProcessWatcher
* [MESOS-952] - Clock::resume should adjust timeouts that were created in a paused/advanced Clock context.
* [MESOS-954] - The /__processes__ endpoint in libprocess is missing a needed lock acquisition.
* [MESOS-958] - Group should not ignore the ZNOAUTH error in creating the parent path for the group
* [MESOS-963] - Compile fails on 10.9
* [MESOS-965] - GroupTest.GroupWatchWithSessionExpiration is flaky
* [MESOS-966] - symbolize.cc:235:58: error: invalid suffix on literal; C++11 requires a space between literal and identifier
* [MESOS-967] - configure: error: cannot find libsasl2
* [MESOS-977] - MasterZooKeeperTest.LostZooKeeperCluster is flaky
* [MESOS-985] - FaultToleranceTest.IgnoreKillTaskFromUnregisteredFramework is flaky
* [MESOS-989] - Flaky whitelist tests
* [MESOS-991] - hashmap.hpp error: control reaches end of non-void function
* [MESOS-1009] - src/demangle.cc:170:13: error: comparison between pointer and integer ('const char *' and 'int')
* [MESOS-1029] - lib stout compile errors on Ubuntu 13.10 with Clang 3.5
* [MESOS-1030] - Mesos compile errors on Ubuntu 13.10 with Clang 3.5: const & ..., header guard
* [MESOS-1038] - Log coordinator should demote itself after a write is discarded.
* [MESOS-1045] - Unrecognized file extension in CommandInfo.URI causes executor to exit
* [MESOS-1049] - Cpu Isolator incorrectly writes double values when writing cpu.cfs_quota_us.
* [MESOS-1050] - Containerizer broke getting hadoop binary from $HADOOP_HOME and $PATH
* [MESOS-1051] - tar command used in fetcher not portable to OS X
* [MESOS-1063] - Containerizer fails when fetching more than one URL
* [MESOS-1079] - Mesos python egg build failure on OS X Mavericks (Xcode 5.1)
* [MESOS-1086] - DRF allocator should take into account past allocations when determining an ordering so frameworks are not starved.
* [MESOS-1095] - Build failure on OSX when using gcc-4.7
* [MESOS-1121] - /usr/include/c++/4.7/type_traits:1834:9: error: no match for call to '(std::_Bind<process::Future<process::http::Response> (*(std::_Placeholder<1>))(const std::basic_string<char>&)>) ()'
* [MESOS-1128] - ':' colon in executor work directories is unusual
* [MESOS-1135] - A re-registering framework that authenticates with Master might not get any offers
* [MESOS-1176] - make distcheck fails when enabling c++11
** Documentation
* [MESOS-926] - Document change to separate cgroup mounts
** Improvement
* [MESOS-903] - Store MasterInfo in ZK to enable master web UI redirection etc.
* [MESOS-943] - Provide an abstraction for asynchronous launching of subprocesses.
* [MESOS-975] - Show git tag info in master and slave log output
** New Feature
* [MESOS-600] - Rework Isolator abstraction
Release Notes - Mesos - Version 0.17.0
--------------------------------------
* The primary feature of this release is recovery support for the
replicated log, which makes it more resilient to disk failures.
* If fewer than a quorum of disks fail, the replicated log will
automatically perform catch-up to recover lost data.
** Sub-task
* [MESOS-902] - add post to libprocess
** Bug
* [MESOS-280] - ExecutorDriver methods' javadocs should not be referring to SchedulerDriver methods
* [MESOS-533] - SlaveRecoveryTest/0.CleanupExecutor is flaky on Jenkins.
* [MESOS-789] - Make link to times in the webui clickable
* [MESOS-799] - Mesos python egg is faulty on OS X Mavericks
* [MESOS-831] - script-without-shebang
* [MESOS-861] - FaultToleranceTest.FrameworkReliableRegistration could hang
* [MESOS-875] - A recovering slave should not ignore valid status updates.
* [MESOS-877] - Future::then and Promise::associate have memory leaks.
* [MESOS-897] - Cleanup of stout headers from fedora review
* [MESOS-913] - Help endpoint does not work on slaves.
* [MESOS-916] - add .gitignore-template file for ./bootstrap generated files
* [MESOS-925] - remove --without-curl from libprocess
* [MESOS-941] - Memory limit not correctly set when no memory resource set on executor level
* [MESOS-951] - Build failure: in log/catchup.cpp on Clang
* [MESOS-993] - Performance issue during replicated log catch-up when the initial log position is large
* [MESOS-1014] - Log truncation takes a long time during catch-up if the initial position is very large
** Documentation
* [MESOS-929] - Aurora not added to the framework docs
** Improvement
* [MESOS-749] - Add support for multiple offers in launchTasks
* [MESOS-772] - expose count of running tasks
* [MESOS-827] - Create LOOP_FOR(duration) macro to guard testing loop from running indefinitely
* [MESOS-860] - Get mesos' libprocess dependency glog to compile with clang and libc++
* [MESOS-863] - Get mesos' libprocess dependency protobuf to compile with clang and libc++
* [MESOS-864] - Eliminate the use of internal stdlibc++ templates for achieving libc++ compatibility
* [MESOS-896] - Enable newer versions of http_parser.
** New Feature
* [MESOS-736] - Support catch-up replicated log
** Task
* [MESOS-323] - Get mesos compiling with clang to open up path forward to c++11
* [MESOS-519] - Deprecate and remove old monitoring endpoint.
Release Notes - Mesos - Version 0.16.0
--------------------------------------
* The primary feature of this release is major refactoring work on the
master election and detection process to improve its reliability and
flexibility.
** Sub-task
* [MESOS-645] - Improve the performance of Future.
** Bug
* [MESOS-403] - CoordinatorTest.TruncateLearnedFill test is flaky
* [MESOS-455] - ZooKeeperTest.MasterDetectorShutdownNetwork runs forever
* [MESOS-463] - Detector ZNode creation failure.
* [MESOS-465] - Failures due to ZooKeeper operation timeouts in the master detector.
* [MESOS-498] - ZooKeeperTest.MasterDetectorTimedoutSession is flaky
* [MESOS-536] - GarbageCollectorTest.Unschedule is flaky
* [MESOS-592] - Don't dump a stack trace from bad --zk flag in the detector, use EXIT(1) instead of LOG(FATAL).
* [MESOS-624] - Master improperly prints the exit status of the executor
* [MESOS-641] - Stout killtree / pstree tests fail on Ubuntu 10.04.
* [MESOS-778] - FaultToleranceTest.ReconcileIncompleteTasks test is flaky
* [MESOS-782] - Slaves in local cluster should get unique work directories
* [MESOS-795] - ZooKeeperTest.MasterDetectorTimedoutSession test is flaky
* [MESOS-800] - CHECK failure in cgroups_isolator.
* [MESOS-807] - Discard is not propagated in process::dispatch.
* [MESOS-811] - Group::cancel can return a failed future if the membership is already cancelled
* [MESOS-822] - AllocatorTest/0.SchedulerFailover is flaky
* [MESOS-823] - ZooKeeperMasterContenderDetectorTest.ContenderDetectorShutdownNetwork is flaky
* [MESOS-826] - Bad 'master' flag in slave should not print a stack trace
* [MESOS-828] - CgroupsIsolator BalloonFramework Test is broken.
* [MESOS-842] - ZooKeeperMasterContenderDetectorTest.ContenderDetectorShutdownNetwork runs forever
* [MESOS-844] - Slave should not recover checkpointed data immediately after reboot
* [MESOS-851] - Scheduler Driver does not guarantee that abort() prevents further calls on the Scheduler.
* [MESOS-858] - Ignore launch/kill requests in the slave originating from non-leading masters.
* [MESOS-859] - Cgroup kill should use cgroup.procs, not tasks
* [MESOS-866] - Pailer popup window is not scrollable in Chrome or Safari
* [MESOS-867] - ZK Membership IDs are 32 bit signed integers, not 64 bit unsigned integers.
* [MESOS-870] - Slave http endpoint can crash the slave when no master is detected.
* [MESOS-871] - GroupTest.RetryableErrors is flaky
* [MESOS-883] - Group's handling of non-retryable errors and local timeout is incorrect
* [MESOS-884] - Incorrect asynchronous detection and contention loops in Master
* [MESOS-889] - Bad 'master' string given by scheduler should not print a stack trace
* [MESOS-892] - Additional Issues with contender related change
* [MESOS-935] - Group should tell MasterDetector "no memberships detected" when it locally times out
* [MESOS-940] - Slave should checkpoint bootid after recovery instead of after registration
** Improvement
* [MESOS-111] - Add SVN ignore and git ignore info to repository
* [MESOS-728] - Masters should seppuku using EXIT instead of abort() when leadership is lost.
* [MESOS-756] - Improve release tooling.
* [MESOS-760] - Capture memory usage statistics before OOM
* [MESOS-761] - Export all memory stats from memory.stat via CgroupsIsolator's usage()
* [MESOS-768] - Executor driver stop() should dispatch stop to executor process instead of terminating it
* [MESOS-802] - Web UI shows no errors when navigation to slave fails
* [MESOS-806] - Allow converting from an Owned<T> to a Shared<T>.
* [MESOS-818] - Bump up the minimum number of threads libprocess creates to accommodate new tests
* [MESOS-833] - The Status Update Manager should use a back-off mechanism for retried updates.
* [MESOS-835] - Reduce the minimum amount of CPUs required to make offers
* [MESOS-849] - As a developer I should be able to set the AUTOMAKE and ACLOCAL environment variables for autoconf to pickup when using the bootstrap script.
* [MESOS-881] - Tests are slow because the scheduler attempts to authenticate before the master realizes it is elected.
* [MESOS-900] - Paginate all tables in the web UI
Release Notes - Mesos - Version 0.15.0
--------------------------------------
* The primary feature in this release is to add authentication support
between frameworks and masters.
* You can set --authentication=true on masters to only allow
authenticated frameworks to register.
* Frameworks can call the new `MesosSchedulerDriver` constructor to
enable authentication.
* This release also moves the Jenkins framework out of the mesos repo to
https://github.com/jenkinsci/mesos-plugin.
** Sub-task
* [MESOS-742] - GC directories based on modification time
* [MESOS-766] - Make --checkpoint default to true
** Bug
* [MESOS-400] - Example Java framework test is flaky
* [MESOS-467] - AllocatorTest.FrameworkExited is flaky
* [MESOS-477] - Improve stout duration tests and Stringify(Days(value))
* [MESOS-512] - GroupTest.MultipleGroups is flaky.
* [MESOS-577] - bootstrap fails with automake 1.14
* [MESOS-650] - SlaveExecutorRerouterCtrl does not handle missing slave.
* [MESOS-655] - FaultToleranceTest.MasterFailover not simulating a realistic Master shutdown
* [MESOS-661] - WebUI pailer does not preserve newlines when data is copied from firefox.
* [MESOS-664] - Type resolution issue on 32 bit systems
* [MESOS-685] - SlaveRecoveryTest/0.RecoveryTimeout Java SIGSEGV
* [MESOS-686] - Testing isolator is broken when multiple frameworks are in play
* [MESOS-702] - Webui table headers are not consistently aligned vertically
* [MESOS-729] - ./stout/include/stout/hashmap.hpp:49:5: error: ‘erase’ was not declared in this scope, and no declarations were found by argument-dependent lookup at the point of instantiation [-fpermissive]
* [MESOS-732] - Make slave recovery asynchronous
* [MESOS-734] - MasterTest.ReconcileTaskTest "not authenticated"
* [MESOS-737] - Recover completed frameworks/executors during recovery
* [MESOS-738] - CgroupsIsolatorTest.ROOT_CGROUPS_BalloonFramework_NoBuffer can't finish
* [MESOS-746] - Master error when started with --weights input parameters
* [MESOS-747] - FaultToleranceTest.ReregisterFrameworkExitedExecutor test fails
* [MESOS-758] - Incorrect memory statistics are reported under linux
* [MESOS-762] - Revert the use of the soft limit and memory threshold notifications.
* [MESOS-771] - StatusUpdateManagerTest.DuplicateTerminalUpdateBeforeAck is flaky
* [MESOS-773] - StatusUpdateManagerTest.DuplicateTerminalUpdateBeforeAck is flaky
* [MESOS-774] - FaultToleranceTest.MasterFailover test is flaky
* [MESOS-777] - GarbageCollectorIntegrationTest.ExitedFramework test is flaky
* [MESOS-787] - Authenticatee process deadlocks
* [MESOS-792] - FaultToleranceTest.SchedulerFailoverFrameworkMessage is flaky
* [MESOS-801] - SlaveRecoveryTest / ReconcileTasksMissingFromSlave is flaky.
** Documentation
* [MESOS-518] - Improve README with Markdown
** Improvement
* [MESOS-769] - Master's authenticate should not depend on 'from'
** New Feature
* [MESOS-704] - Add authentication support using SASL and CRAM-MD5
** Task
* [MESOS-608] - Move Jenkins code out of the mesos repo to Jenkins CI repo
Release Notes - Mesos - Version 0.14.1
--------------------------------------
* This is a bug fix release.
** Sub-task
* [MESOS-725] - Slave should cleanup meta directory if started in non-strict mode and slave info changes.
** Bug
* [MESOS-420] - Master crashes when re-registering framework
* [MESOS-488] - The Master incorrectly sends a "Framework failed over" message when the scheduler driver retries an initial failover re-registration.
* [MESOS-641] - Stout killtree / pstree tests fail on Ubuntu 10.04.
* [MESOS-658] - A framework can be incorrectly removed by the Master.
* [MESOS-662] - Executor OOM could lead to a kernel hang
* [MESOS-679] - Inability to find a latest run should not be considered a recovery error
* [MESOS-680] - Empty files should not be considered as recovery errors
* [MESOS-690] - Slave finalize() throws segfault
* [MESOS-694] - Preserve exit status for SIGTERM.
* [MESOS-711] - Master::reconcile incorrectly recovers resources from reconciled tasks.
* [MESOS-714] - Slave should check if the (re-)registered is from the expected master
** Improvement
* [MESOS-620] - Add slaveDisconnected and slaveReconnected calls to the Allocator
Release Notes - Mesos - Version 0.14.0
--------------------------------------
* The primary feature in this release is "Slave Recovery" which allows
restarted slaves (e.g., deploys, crashes) to reconnect with old live
executors/tasks. To enable slave recovery:
* First, enable checkpointing on slaves with the "--checkpoint" flag.
* Frameworks can opt in to this feature by setting "FrameworkInfo.checkpoint"
when registering with the master.
* Once a Framework opts in, a restarted slave will recover all the framework's
tasks and executors. The tasks/executors stay alive through a slave
restart and reconnect with the restarted slave.
* Slave recovery also improves the reliability of delivering status updates.
* The release also includes a new feature called "Resource Reservations" which
allows reserving resources on a slave for particular roles (this is an
experimental feature).
* This release also includes a new Mesos plugin for Jenkins which allows Jenkins
to dynamically launch Jenkins slaves on a Mesos cluster (this is an
experimental feature).
* There are also several bug fixes and stability improvements.
** Sub-task
* [MESOS-548] - Upgrade angular.js to use the full angular-ui.js
* [MESOS-549] - Change truncated IDs to show on hover
* [MESOS-630] - Improve the performance of Master::Http::stats().
** Bug
* [MESOS-235] - Mesos daemon ignores --conf option
* [MESOS-368] - HTTP.Endpoints test is flaky.
* [MESOS-370] - The process based isolation module should walk the process tree to collect resource usage.
* [MESOS-380] - Command Executor doesn't send TASK_KILLED for killed tasks.
* [MESOS-434] - Process isolator libprocess throws exception
* [MESOS-449] - CgroupsTests are flaky on Ubuntu
* [MESOS-451] - Always update resources for re-registered executors.
* [MESOS-461] - Freezer failure while in FREEZING state.
* [MESOS-479] - SlaveRecoveryTest/0.CleanupExecutor failure.
* [MESOS-485] - Latest trunk fails on strict aliasing on CentOS
* [MESOS-490] - Update mesos-daemon.sh (and associated scripts) to work with new flags mechanisms.
* [MESOS-497] - Queued tasks should be launched in the order they were received
* [MESOS-499] - Local slave run crashes on startup
* [MESOS-508] - Master crash due to Broken Pipe
* [MESOS-514] - FaultToleranceTest.ReconcileIncompleteTasks is flaky
* [MESOS-522] - ZooKeeperMasterDetectorTest.MasterDetectorExpireSlaveZKSessionNewMaster
* [MESOS-534] - ReaperTest.TerminatedChildProcess is flaky on Jenkins.
* [MESOS-545] - Remove hack in post-reviews.py for tracking parent branch
* [MESOS-582] - HTTP.Endpoints is flaky
* [MESOS-594] - Add CXXFLAGS='-fno-strict-aliasing' if using gcc 4.4.*.
* [MESOS-597] - Set MESOS_NATIVE_LIBRARY or (DY)LD_LIBRARY_PATH before launching an executor in order to enable JVM based executors to easily find libmesos.so.
* [MESOS-599] - Make sure stderr/stdout get launcher output.
* [MESOS-607] - Slave recovery should properly handle executors that were cleanly terminated in the previous run
* [MESOS-611] - Refactor slave recovery to ensure slave recovers its state first
* [MESOS-612] - Slave should not recover completed executors
* [MESOS-614] - Master should remove checkpointing slave that gets disconnected when the new slave tries to register
* [MESOS-616] - The Master / Slave should not store frameworks as both active and completed.
* [MESOS-619] - Master should properly reconcile KillTasks
* [MESOS-627] - Slave should offer total disk instead of available disk by default
* [MESOS-628] - A non-checkpointing slave should still cleanup the latest slave symlink
* [MESOS-632] - Executor driver should commit suicide if it cannot re-connect with a slave after a timeout
* [MESOS-633] - Master should inform a recovered slave about frameworks that were completed
* [MESOS-635] - Master doesn't update the task state when it generates TASK_LOST
* [MESOS-636] - Executors under cgroups isolator die immediately when a slave dies if it has a controlling TTY attached
* [MESOS-637] - Executor should re-register with the updates in the same order as it received them
* [MESOS-638] - Slave should not send command executor infos to master when it reregisters
* [MESOS-640] - Duplicate status update with same UUID crashes the slave
* [MESOS-644] - Slave doesn't correctly handle checkpointed terminal update whose ack doesn't reach the executor
* [MESOS-646] - Slave recovery doesn't properly handle checkpointed queued tasks
* [MESOS-648] - Slave should properly handle partial writes of status updates
* [MESOS-657] - SlaveRecoveryTest/1.PartitionedSlave fails with cgroups
* [MESOS-668] - SlaveRecoveryTest/0.MultipleFrameworks flaky
* [MESOS-671] - CgroupsIsolator does not listen for OOM events on recovered executors.
* [MESOS-673] - Task reconciliation does not properly release executor resources.
* [MESOS-675] - CHECK failure in the Master.
* [MESOS-676] - Slave::reregistered LOG(FATAL)s due to being in RECOVERING state.
* [MESOS-689] - Master incorrectly rejects tasks for long lived executors if they don't have FrameworkID set
** Improvement
* [MESOS-179] - Need to check for Python development headers
* [MESOS-221] - New Allocators
* [MESOS-329] - Add 'help' endpoints to libprocess.
* [MESOS-552] - Jenkins scheduler should use the latest Mesos jar built from the repo
* [MESOS-553] - Jenkins plugin should bundle the native Mesos library
* [MESOS-554] - Jenkins scheduler should properly handle TASK_LOST
* [MESOS-555] - Jenkins scheduler should reuse a Jenkins slave
* [MESOS-557] - Upgrade to Bootstrap CSS v2.3.2
* [MESOS-558] - Upgrade to full release of Angular JS
* [MESOS-559] - Replace Bootstrap's JS with Angular UI Bootstrap
* [MESOS-580] - Improve Command Executor
* [MESOS-613] - Give better guidance when recovery fails
* [MESOS-626] - Add the ability for example frameworks to checkpoint
* [MESOS-634] - Make slave recovery more robust by ignoring absence of files
* [MESOS-651] - Expose slave re-registration time in the Web UI
* [MESOS-663] - Expose recovery errors when running recovery in --no-strict mode
** New Feature
* [MESOS-110] - Slave Recovery: A slave restart should not restart tasks
* [MESOS-203] - Killtree that recursively kills sessions