Skip to content

Commit

Permalink
Add A/B testing examples
Browse files Browse the repository at this point in the history
  • Loading branch information
stefanprodan committed Mar 8, 2019
1 parent bf1ca29 commit 49c942b
Show file tree
Hide file tree
Showing 5 changed files with 188 additions and 1 deletion.
2 changes: 1 addition & 1 deletion README.md
Original file line number Diff line number Diff line change
Expand Up @@ -29,6 +29,7 @@ Flagger documentation can be found at [docs.flagger.app](https://docs.flagger.ap
* [Routing](https://docs.flagger.app/how-it-works#istio-routing)
* [Canary deployment stages](https://docs.flagger.app/how-it-works#canary-deployment)
* [Canary analysis](https://docs.flagger.app/how-it-works#canary-analysis)
* [A/B testing](https://docs.flagger.app/how-it-works#ab-testing)
* [HTTP metrics](https://docs.flagger.app/how-it-works#http-metrics)
* [Custom metrics](https://docs.flagger.app/how-it-works#custom-metrics)
* [Webhooks](https://docs.flagger.app/how-it-works#webhooks)
Expand Down Expand Up @@ -167,7 +168,6 @@ For more details on how the canary analysis and promotion works please [read the
### Roadmap
* Add A/B testing capabilities using fixed routing based on HTTP headers and cookies match conditions
* Integrate with other service mesh technologies like AWS AppMesh and Linkerd v2
* Add support for comparing the canary metrics to the primary ones and do the validation based on the derivation between the two
Expand Down
61 changes: 61 additions & 0 deletions artifacts/ab-testing/canary.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,61 @@
apiVersion: flagger.app/v1alpha3
kind: Canary
metadata:
name: abtest
namespace: test
spec:
# deployment reference
targetRef:
apiVersion: apps/v1
kind: Deployment
name: abtest
# the maximum time in seconds for the canary deployment
# to make progress before it is rollback (default 600s)
progressDeadlineSeconds: 60
# HPA reference (optional)
autoscalerRef:
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
name: abtest
service:
# container port
port: 9898
# Istio gateways (optional)
gateways:
- public-gateway.istio-system.svc.cluster.local
# Istio virtual service host names (optional)
hosts:
- abtest.istio.weavedx.com
canaryAnalysis:
# schedule interval (default 60s)
interval: 10s
# max number of failed metric checks before rollback
threshold: 10
# total number of iterations
iterations: 10
# canary match condition
match:
- headers:
user-agent:
regex: "^(?!.*Chrome)(?=.*\bSafari\b).*$"
- headers:
cookie:
regex: "^(.*?;)?(user=test)(;.*)?$"
metrics:
- name: istio_requests_total
# minimum req success rate (non 5xx responses)
# percentage (0-100)
threshold: 99
interval: 1m
- name: istio_request_duration_seconds_bucket
# maximum req duration P99
# milliseconds
threshold: 500
interval: 30s
# external checks (optional)
webhooks:
- name: load-test
url: http://flagger-loadtester.test/
timeout: 5s
metadata:
cmd: "hey -z 1m -q 10 -c 2 http://podinfo.test:9898/"
67 changes: 67 additions & 0 deletions artifacts/ab-testing/deployment.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,67 @@
apiVersion: apps/v1
kind: Deployment
metadata:
name: abtest
namespace: test
labels:
app: abtest
spec:
minReadySeconds: 5
revisionHistoryLimit: 5
progressDeadlineSeconds: 60
strategy:
rollingUpdate:
maxUnavailable: 0
type: RollingUpdate
selector:
matchLabels:
app: abtest
template:
metadata:
annotations:
prometheus.io/scrape: "true"
labels:
app: abtest
spec:
containers:
- name: podinfod
image: quay.io/stefanprodan/podinfo:1.4.0
imagePullPolicy: IfNotPresent
ports:
- containerPort: 9898
name: http
protocol: TCP
command:
- ./podinfo
- --port=9898
- --level=info
- --random-delay=false
- --random-error=false
env:
- name: PODINFO_UI_COLOR
value: blue
livenessProbe:
exec:
command:
- podcli
- check
- http
- localhost:9898/healthz
initialDelaySeconds: 5
timeoutSeconds: 5
readinessProbe:
exec:
command:
- podcli
- check
- http
- localhost:9898/readyz
initialDelaySeconds: 5
timeoutSeconds: 5
resources:
limits:
cpu: 2000m
memory: 512Mi
requests:
cpu: 100m
memory: 64Mi
19 changes: 19 additions & 0 deletions artifacts/ab-testing/hpa.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,19 @@
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
name: abtest
namespace: test
spec:
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: abtest
minReplicas: 2
maxReplicas: 4
metrics:
- type: Resource
resource:
name: cpu
# scale up if usage is above
# 99% of the requested CPU (100m)
targetAverageUtilization: 99
40 changes: 40 additions & 0 deletions docs/gitbook/how-it-works.md
Original file line number Diff line number Diff line change
Expand Up @@ -327,6 +327,46 @@ At any time you can set the `spec.skipAnalysis: true`.
When skip analysis is enabled, Flagger checks if the canary deployment is healthy and
promotes it without analysing it. If an analysis is underway, Flagger cancels it and runs the promotion.

### A/B Testing

Besides weighted routing, Flagger can be configured to route traffic to the canary based on HTTP match conditions.
In an A/B testing scenario, you'll be using HTTP headers or cookies to target a certain segment of your users.

Spec:

```yaml
canaryAnalysis:
# schedule interval (default 60s)
interval: 1m
# total number of iterations
iterations: 10
# max number of failed iterations before rollback
threshold: 2
# canary match condition
match:
- headers:
user-agent:
regex: "^(?!.*Chrome)(?=.*\bSafari\b).*$"
- headers:
cookie:
regex: "^(.*?;)?(user=test)(;.*)?$"
```

The above configuration will run an analysis for ten minutes targeting the Safari users and those that have a test cookie.
You can determine the minimum time that it takes to validate and promote a canary deployment using this formula:

```
interval * iterations
```

And the time it takes for a canary to be rollback when the metrics or webhook checks are failing:

```
interval * threshold
```

Make sure that the analysis threshold is lower than the number of iterations.

### HTTP Metrics

The canary analysis is using the following Prometheus queries:
Expand Down

0 comments on commit 49c942b

Please sign in to comment.