Consolidate incident Response Plans #77

Closed
9 of 11 tasks
afeld opened this issue Aug 19, 2019 · 16 comments

@afeld
Contributor

afeld commented Aug 19, 2019

User Story:

As a member of TTS, I don't want to have to figure out the differences per project team/system, especially in real-time.

As a member of the Tech Portfolio, I want to see where we can reconcile the plans, to make ATOs and incident response easier / more consistent.

Problem Statement:

It seems that we have a number of incident response plans floating around.

Actions to take:

  • Create a list of known incident response plans
  • Propose thinning of the cloud.gov plan
  • Propose thinning of the login.gov plan
  • Figure out how to integrate with Continuity of Operations Plan (COOP)
  • Clearly state required resolution times and where they come from

Acceptance criteria:

  • Obtain cloud.gov IR plan
  • Obtain login.gov IR plan
  • Obtain search.gov IR plan
  • Obtain data.gov IR plan
  • Resolution times and where they come from are clearly stated
  • Set up biweekly meetings with interested TTS programs

Supporting Documentation:

Related issues:

cc #49

@hillaryj

Here's the IR guide and checklist for cloud.gov:

@adborden
Contributor

Data.gov tracks incidents in gsa/datagov-incident-response and the plan/checklist is in Drive.

@afeld
Contributor Author

afeld commented Aug 29, 2019

From #49 (comment):

Re SLAs, here's a document to be updated: https://github.com/18F/bug-bounty/blob/master/project-instructions.md#slas-your-program-must-meet

Bug bounty refers to a 90-day SLA, but recommends 7-day for high, 30-day for medium.

Though the 90-day SLA above is the only required timeline, we do (strongly) suggest that your program aims for faster remediation of more severe issues.

We might want to align them better with GSA's policies.
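
To make the gap between the required and suggested timelines concrete, here is a minimal sketch that computes both deadlines for a report. It assumes only the figures quoted above (90 days required, 7 days suggested for high, 30 days suggested for medium); the function and constant names are hypothetical, and GSA's own policy timelines are not represented here.

```python
from datetime import date, timedelta

# Figures quoted from the bug bounty instructions above.
REQUIRED_SLA_DAYS = 90
# Suggested (not required) targets; other severities fall back to the SLA.
SUGGESTED_DAYS = {"high": 7, "medium": 30}


def remediation_deadlines(reported: date, severity: str) -> dict:
    """Return the suggested and required remediation dates for a report.

    `severity` is a hypothetical label; anything other than "high" or
    "medium" only gets the 90-day required deadline as its target.
    """
    suggested = SUGGESTED_DAYS.get(severity.lower(), REQUIRED_SLA_DAYS)
    return {
        "suggested": reported + timedelta(days=suggested),
        "required": reported + timedelta(days=REQUIRED_SLA_DAYS),
    }


if __name__ == "__main__":
    for level in ("high", "medium", "low"):
        print(level, remediation_deadlines(date(2019, 8, 29), level))
```

If the timelines are later aligned with GSA's policies, only the two constants would need to change.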

afeld changed the title from "As a Tech Portfolio member, I want a consistent approach to incident response" to "As a TTS member responding to / triaging incidents, I want a consistent approach to incident response" on Aug 29, 2019
afeld changed the title from "As a TTS member responding to / triaging incidents, I want a consistent approach to incident response" to "As a TTS member responding to / triaging security reports, I want a consistent approach to incident response" on Aug 29, 2019
@afeld
Contributor Author

afeld commented Aug 29, 2019

Clarified this issue to be about triaging and response, while #49 is about reporting.

afeld changed the title from "As a TTS member responding to / triaging security reports, I want a consistent approach to incident response" to "As a TTS member responding to / triaging security reports, I want a consistent approach" on Aug 29, 2019
afeld changed the title from "As a TTS member responding to / triaging security reports, I want a consistent approach" to "As a TTS member triaging / responding to security reports, I want a consistent approach" on Aug 29, 2019
@afeld
Contributor Author

afeld commented Sep 24, 2019

its-a-lisa-at-work self-assigned this on Oct 9, 2019
@dawnpm

dawnpm commented Oct 10, 2019

Are you talking about Contingency Plans (which would include COOP) or Incident Response Plans?

GSA uses the terms separately - Incidents are human driven (e.g., attacks, infiltrations, PII leakage), Contingencies are force majeure (e.g., the Cloud Host Has Major Problems, COOP situations). The official incident response plan is managed by the GSA IR team, so much so that there is common language for that control in the SSP templates. We are finalizing our official Contingency Plan in the GSA template, which will be posted to Google Drive. We also have an emergency procedures doc for our team's use.

@afeld
Contributor Author

afeld commented Oct 10, 2019

Good point on that distinction. I think we should consider both, though maybe not at the same time.

@gbinal

gbinal commented Oct 11, 2019

For api.data.gov, you can partially see this in how we document the process for handling abnormalities that show up in the regular monitoring. But the process then basically says that we follow the 18F incident response process.

https://github.com/18F/api.data.gov/blob/master/docs/procedures.md#weekly-monitoring-checklist

@dawnpm

dawnpm commented Oct 15, 2019

> Good point on that distinction. I think we should consider both, though maybe not at the same time.

Search.gov manages our own contingency plan, and uses GSA's IR plan.

@afeld
Contributor Author

afeld commented Nov 15, 2019

@dawnpm Mind providing a link?

@afeld
Contributor Author

afeld commented Nov 15, 2019

  • Split the Contingency / Continuity of Operations plan consolidation into its own issue

@hillaryj

hillaryj commented Nov 15, 2019

Per discussion at the wg-security meeting, @its-a-lisa will be leading the unification of IR plans from the TTS Tech Portfolio side. She'll coordinate simplifying the plans into one base plan with addendums.

Marshall Brown (representing the COOP process & docs) says that the next two months are needed to integrate login.gov and cloud.gov into the COOP contingency process.

@dawnpm

dawnpm commented Nov 15, 2019

its-a-lisa-at-work removed this from the Nov IRL milestone on Nov 23, 2019
its-a-lisa-at-work changed the title from "As a TTS member triaging / responding to security reports, I want a consistent approach" to "Consolidate incident Response Plans" on Nov 29, 2019
@mogul
Contributor

mogul commented Dec 13, 2019

We ran an IR exercise for data.gov this week, and just did a retro; notes are here.

One of our action items was to follow up and make sure y'all know we really want this consolidation. It's confusing to have our own process, and then have to fork up/out to the broader TTS process, which results in a lot of redundant comms juggling as well as documentation overload in-the-moment.

@pburkholder

Back to @hillaryj's point, cloud.gov actually has four guides:

https://cloud.gov/docs/ops/security-ir/ - the documentation
https://cloud.gov/docs/ops/security-ir-checklist/ - the process guide that's supposed to be quicker to use in real time
https://cloud.gov/docs/ops/service-disruption-guide/ - incidents that aren't security incidents
https://cloud.gov/docs/ops/contingency-plan/ - sustained loss of services we depend on

I would like to see consolidation so we start with "Service disruption" and then follow paths depending on what the source of the disruption is.
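
One way to picture that flow is as a single entry point with branches. The sketch below is only an illustration of the idea, not an agreed design: the `Source` categories and the mapping of guides to them are assumptions, and only the URLs come from the list above.

```python
from enum import Enum
from typing import List


class Source(Enum):
    """Hypothetical categories for what caused a service disruption."""
    SECURITY = "security incident"
    DEPENDENCY_LOSS = "sustained loss of a service we depend on"
    OTHER = "non-security disruption"


# Everyone starts from the service disruption guide.
ENTRY_POINT = "https://cloud.gov/docs/ops/service-disruption-guide/"

# Assumed mapping from disruption source to the guides to follow next.
FOLLOW_UP = {
    Source.SECURITY: [
        "https://cloud.gov/docs/ops/security-ir-checklist/",
        "https://cloud.gov/docs/ops/security-ir/",
    ],
    Source.DEPENDENCY_LOSS: ["https://cloud.gov/docs/ops/contingency-plan/"],
    Source.OTHER: [],
}


def guides_for(source: Source) -> List[str]:
    """Return the guides to read, in order, starting from the entry point."""
    return [ENTRY_POINT] + FOLLOW_UP[source]


if __name__ == "__main__":
    for url in guides_for(Source.SECURITY):
        print(url)
```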

@its-a-lisa-at-work
Contributor

Closing this issue today based on today's meeting.

Opened up new issue to continue on this effort: #229
