Design: Microservice Health Check Endpoints #1

Open
StephenOTT opened this issue Jul 16, 2017 · 0 comments

WIP. Designs for Health Check Endpoints

Not every item below is purely a monitoring concern. Some items may belong in authentication or security modules instead.

Potential Features:

  1. Online/Offline
  2. Number of Database Rows
  3. Database Size
  4. Connectivity Checks with Other Microservices that the current service connects to.
  5. IsHealthy status based on the exceptions thrown within a period of N.
  6. Current Ports
  7. Last Ping DateTime
  8. Next Ping DateTime
  9. System Timezone
  10. System time.
  11. The rate of requests directed at each service or subsystem.
  12. The response times of these requests.
  13. The volume of data flowing into and out of each service.
  14. Middleware indicators, such as queue length.
  15. All sign-in attempts, whether they fail or succeed.
  16. Detect attempted intrusions by an unauthenticated entity.
  17. Identify attempts by entities to perform operations on data for which they have not been granted access.
  18. If one account makes repeated failed sign-in attempts within a specified period.
  19. Meeting SLAs for response times.
  20. Whether the server is able to read documents from the database.
  21. Whether the server is able to write documents to the database.
  22. Whether the server is dissociated from the cloud.
  23. Whether the server is considered to be in an active state.
  24. Whether the server is a member of a specific cluster.
  25. Whether the server is in the list of servers.
  26. Specific Configs in the Database
  27. ...
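
Items 5 and 18 both amount to counting events inside a time window. A minimal sketch of that idea follows, assuming a fixed threshold and an in-memory store. The class name `SlidingWindowCounter` and its parameters are hypothetical and not part of any existing module.

```python
import time
from collections import deque


class SlidingWindowCounter:
    """Counts events that occurred within the last `window_seconds`."""

    def __init__(self, window_seconds, threshold, clock=time.monotonic):
        self.window_seconds = window_seconds
        self.threshold = threshold
        self._clock = clock      # injectable so the window can be tested
        self._events = deque()   # timestamps, oldest first

    def record(self):
        """Record one event (e.g. a thrown exception) at the current time."""
        self._events.append(self._clock())

    def count(self):
        """Drop events older than the window and return how many remain."""
        cutoff = self._clock() - self.window_seconds
        while self._events and self._events[0] <= cutoff:
            self._events.popleft()
        return len(self._events)

    def is_healthy(self):
        """Healthy while the event count stays below the threshold."""
        return self.count() < self.threshold
```

For item 5, the service would call `record()` from its exception handler and expose `is_healthy()` in the health response. For item 18, one counter per account could track failed sign-in attempts in the same way.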

Special Features

  1. Ping URL: polled continuously to confirm that the system is reachable. It returns only a response code, and only if the system responds.
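
A ping URL of this kind could be sketched with Python's standard `http.server` as shown here. The `/ping` path, the `204 No Content` status, and the `make_server` helper are assumptions made for illustration; the issue does not specify them.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


class PingHandler(BaseHTTPRequestHandler):
    """Answers GET /ping with an empty 204 response; anything else is 404."""

    def do_GET(self):
        # The /ping path is a hypothetical choice, not taken from the issue.
        status = 204 if self.path == "/ping" else 404
        self.send_response(status)
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, format, *args):
        # Silence per-request logging so frequent pings do not flood stderr.
        pass


def make_server(host="127.0.0.1", port=0):
    """Build the server; port 0 lets the OS pick a free port."""
    return HTTPServer((host, port), PingHandler)
```

Calling `make_server(port=8080).serve_forever()` would start it on a fixed port. The handler does no work beyond answering, so a failed ping indicates that the process or the network is down rather than that a dependency is slow.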

Types of Checks to perform:

  1. State of the DB
  2. State of the various applications
  3. The connectivity to other Microservices.
  4. Symfony Errors recently thrown
  5. Storage
  6. SSL
  7. Last Admin Login?
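
One way to organise these check types is as a set of named functions whose results are gathered into a single report. The sketch below covers only storage and connectivity, and assumes each check returns a `(healthy, reason)` pair. The function names are hypothetical; the report field names (`isHealthy`, `failureReason`, `time`) are borrowed from the JIRA example in this issue.

```python
import shutil
import socket
import time


def check_storage(path="/", min_free_bytes=100 * 1024 * 1024):
    """Healthy when the filesystem holding `path` has enough free space."""
    free = shutil.disk_usage(path).free
    if free >= min_free_bytes:
        return True, ""
    return False, f"only {free} bytes free on {path}"


def check_tcp(host, port, timeout=2.0):
    """Healthy when a TCP connection to a dependency can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, ""
    except OSError as exc:
        return False, f"cannot reach {host}:{port}: {exc}"


def run_checks(checks):
    """Run each named check and collect the results into one report."""
    statuses = []
    for name, check in checks.items():
        try:
            healthy, reason = check()
        except Exception as exc:  # a crashing check counts as a failure
            healthy, reason = False, f"check raised {exc!r}"
        statuses.append({
            "name": name,
            "isHealthy": healthy,
            "failureReason": reason,
            "time": int(time.time() * 1000),  # epoch milliseconds
        })
    return {
        "isHealthy": all(s["isHealthy"] for s in statuses),
        "statuses": statuses,
    }
```

A caller might register checks as `run_checks({"storage": check_storage, "db": lambda: check_tcp("127.0.0.1", 5432)})`, where the host and port are placeholders. Treating an exception inside a check as a failure keeps one broken check from taking down the whole health endpoint.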

Relevant Links:

  1. https://github.com/hootsuite/health-checks-api#api-documentation
  2. https://www.consul.io/api/health.html

Examples:

Example from JIRA:

{
  "statuses": [
    {
      "id": 0,
      "completeKey": "com.atlassian.jira.plugins.jira-healthcheck-plugin:hsqlHealthCheck",
      "name": "Embedded database",
      "description": "Checks if the instance is connected to an HSQL or H2 database",
      "isHealthy": true,
      "failureReason": "You are not using an HSQL or H2 embedded database with a production license.",
      "application": "JIRA",
      "time": 1484054268590,
      "severity": "undefined",
      "documentation": "https://confluence.atlassian.com/x/1guaEQ",
      "tag": "Supported Platforms",
      "healthy": true
    },
    {
      "id": 0,
      "completeKey": "com.atlassian.jira.plugins.jira-healthcheck-plugin:eolHealthCheck",
      "name": "End of Life",
      "description": "Checks if the running version of JIRA is approaching, or has reached End of Life.",
      "isHealthy": true,
      "failureReason": "JIRA version 7.3.x has not reached End of Life. This version will reach End of Life in 722 days.",
      "application": "JIRA",
      "time": 1484054268591,
      "severity": "undefined",
      "documentation": "https://confluence.atlassian.com/x/HjnRLg",
      "tag": "Supported Platforms",
      "healthy": true
    },
    {
      "id": 0,
      "completeKey": "com.atlassian.jira.plugins.jira-healthcheck-plugin:baseUrlHealthCheck",
      "name": "Base URL",
      "description": "Checks if JIRA is able to access itself through the configured Base URL.",
      "isHealthy": false,
      "failureReason": "JIRA is not able to access itself through the configured Base URL. This is necessary so that dashboard gadgets can be generated successfully. Please verify the current Base URL and if necessary, review your network configurations to resolve the problem.",
      "application": "JIRA",
      "time": 1484054344127,
      "severity": "warning",
      "documentation": "https://confluence.atlassian.com/x/WCA6Mw",
      "tag": "Supported Platforms",
      "healthy": false
    },
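
A consumer of a payload shaped like the JIRA response might extract only the failing checks. The sketch below is a rough illustration: the helper `failing_checks` is hypothetical, and the sample data is a shortened stand-in that reuses the field names above with made-up values.

```python
import json


def failing_checks(payload):
    """Return (name, severity, failureReason) for each unhealthy status."""
    return [
        (s["name"], s["severity"], s["failureReason"])
        for s in payload.get("statuses", [])
        if not s.get("isHealthy", False)
    ]


# A trimmed sample using field names from the response above.
sample = json.loads("""
{
  "statuses": [
    {"name": "Embedded database", "isHealthy": true,
     "severity": "undefined", "failureReason": ""},
    {"name": "Base URL", "isHealthy": false,
     "severity": "warning", "failureReason": "Base URL not reachable"}
  ]
}
""")

for name, severity, reason in failing_checks(sample):
    print(f"[{severity}] {name}: {reason}")
```

In the JIRA response, `failureReason` is populated even for healthy checks, so the filter keys on `isHealthy` rather than on whether a reason is present.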

Consul Example:

$ curl \
    https://consul.rocks/v1/health/service/my-service
[
  {
    "Node": {
      "ID": "40e4a748-2192-161a-0510-9bf59fe950b5",
      "Node": "foobar",
      "Address": "10.1.10.12",
      "Datacenter": "dc1",
      "TaggedAddresses": {
        "lan": "10.1.10.12",
        "wan": "10.1.10.12"
      },
      "Meta": {
        "instance_type": "t2.medium"
      }
    },
    "Service": {
      "ID": "redis",
      "Service": "redis",
      "Tags": ["primary"],
      "Address": "10.1.10.12",
      "Port": 8000
    },
    "Checks": [
      {
        "Node": "foobar",
        "CheckID": "service:redis",
        "Name": "Service 'redis' check",
        "Status": "passing",
        "Notes": "",
        "Output": "",
        "ServiceID": "redis",
        "ServiceName": "redis",
        "ServiceTags": ["primary"]
      },
      {
        "Node": "foobar",
        "CheckID": "serfHealth",
        "Name": "Serf Health Status",
        "Status": "passing",
        "Notes": "",
        "Output": "",
        "ServiceID": "",
        "ServiceName": "",
        "ServiceTags": null
      }
    ]
  }
]
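
A client could reduce a response of this shape to the list of usable instances. This sketch assumes an instance is usable only when every entry in `Checks` reports `passing`. The helper `healthy_instances` is hypothetical, the fallback to the node address is an assumption, and the sample is a trimmed copy of the response above.

```python
def healthy_instances(entries):
    """Return (address, port) for each instance whose checks all pass."""
    result = []
    for entry in entries:
        checks = entry.get("Checks") or []
        if all(c.get("Status") == "passing" for c in checks):
            service = entry["Service"]
            # Assumption: fall back to the node address if the service
            # address is empty.
            address = service.get("Address") or entry["Node"]["Address"]
            result.append((address, service["Port"]))
    return result


# Trimmed copy of the Consul response shown above.
sample = [{
    "Node": {"Node": "foobar", "Address": "10.1.10.12"},
    "Service": {"ID": "redis", "Service": "redis",
                "Address": "10.1.10.12", "Port": 8000},
    "Checks": [
        {"CheckID": "service:redis", "Status": "passing"},
        {"CheckID": "serfHealth", "Status": "passing"},
    ],
}]

print(healthy_instances(sample))
```

An entry with an empty `Checks` list counts as healthy in this sketch, which may or may not be the desired policy.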