Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create UPGRADING.md #1607

Merged
merged 3 commits into from
Sep 3, 2020
Merged

Create UPGRADING.md #1607

merged 3 commits into from
Sep 3, 2020

Conversation

hilagross
Copy link

@hilagross hilagross commented Jun 10, 2020

What does this PR do?

In this PR, we add upgrade instructions to this project. The instructions apply to docker-compose deployments of Conjur OSS. For helm-based upgrade instructions, users can see the Conjur helm chart upgrade guide.

In addition, this adds custom upgrade instructions for users who will be upgrading from pre-1.8 versions to 1.8+.

What ticket does this PR close?

Connected to #1584, #1528.

Checklists

Change log

  • The CHANGELOG has been updated, or
  • This PR does not include user-facing changes and doesn't require a CHANGELOG update

Test coverage

  • This PR includes new unit and integration tests to go with the code changes, or
  • The changes in this PR do not require tests

Documentation

  • This PR does not require updating any documentation, or
  • Docs (e.g. READMEs) were updated in this PR, and/or there is a follow-on issue to update docs

UPGRADING.md Outdated

## Specific steps

**This step should be done when upgrading from version 1.7.2 and below to a newer version**
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we make this a markdown subsection? It would be convenient to use this going forward to note any additional version-specific instructions, and if the line is

### 1.8.0

Then we can link to it in our release notes and it's clear that the custom upgrade instructions apply to this specific version

Note that I am assuming the changes will land before the next tag, and that the tag will be a minor version bump.

Copy link
Author

@hilagross hilagross Jun 15, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@izgeri just to make sure I understood correctly, also the from version should be change to 1.8.0 or 1.7.2 is ok?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am assuming that the FIPS changes will be included in the next tag, and that we'll bump the minor version since we're adding functionality in a backwards compatible manner (following semver - see here)

That would mean they'll be included in 1.8.0

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's for sure, what I meant should we say 'from 1.7.2 and below' or 'below version 1.8.0'?
what should be the correct term for this FYO?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could just make it say Upgrading to 1.8.0

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what will happen if a user will upgrade from 1.7.0 to 1.8.1 or even 1.9.0 in the future?
I think it should be clear the any upgrade from a version before 1.8.0 should run this step.
WDYT?
@alexkalish @izgeri

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think what you have is fine for now.

@izgeri
Copy link
Contributor

izgeri commented Jun 16, 2020

@hilagross I'd like to have a technical review of this before we merge it. @diverdane and @sgnn7 have recently been through the upgrade process in great detail for the helm chart - the docker-compose upgrade flow is by comparison much simpler. I am checking with @alexkalish about whether we might be able to schedule a detailed technical review of this in the next week or so - I think the end result would really benefit from this.

Copy link
Contributor

@alexkalish alexkalish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hilagross: Thank you for doing this! I made a ton of suggestions, mostly around wording and punctuation. I think that only my comment about the special step requires a manual change.

UPGRADING.md Outdated

## Specific steps

**This step should be done when upgrading from version 1.7.2 and below to a newer version**
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think what you have is fine for now.

@InbalZilberman
Copy link
Contributor

@alexkalish @izgeri can you please take over this PR or close it and open another one?
Hila is not expected to return any time soon :)

@izgeri
Copy link
Contributor

izgeri commented Jul 15, 2020

@alexkalish I will make your changes and try a quick sanity check before asking you to rereview

@izgeri izgeri force-pushed the hilagross-patch-1 branch from bdceab6 to d5bedcd Compare July 15, 2020 21:06
@hilagross hilagross requested a review from a team as a code owner July 15, 2020 21:06
@izgeri izgeri force-pushed the hilagross-patch-1 branch from d5bedcd to fa57c08 Compare July 15, 2020 21:06
@izgeri
Copy link
Contributor

izgeri commented Jul 15, 2020

@InbalZilberman @alexkalish I applied Alex's updates and then tried to run through the upgrade steps as documented using the quick start docker-compose with the Conjur image version specified (e.g. I modified this line to read image: cyberark/conjur:1.5.0, for example).

I was not able to get the original draft instructions to work, but a quick bit of research allowed me to tweak them to get them working for a standard upgrade from 1.5.0 to 1.7.4. Those updates are now in a separate commit on this branch.

Once I got those working, I tried the custom upgrade instructions for upgrading from 1.7.4 to 1.8.0. I was NOT SUCCESSFUL. Running bundle exec rake slosilo:migrate on the Conjur server container did not produce any log output, and when I attempted to interact with the API afterwards:

  • I was able to successfully authenticate, and exchange an API key for an access token

  • when I tried to use that access token on a secret retrieval request, I got an Invalid token response:

    The retrieved value is: HTTP/1.1 401 Unauthorized
    Server: nginx/1.13.6
    Date: Wed, 15 Jul 2020 21:03:35 GMT
    Content-Type: text/plain
    Content-Length: 27
    Connection: keep-alive
    Cache-Control: no-cache
    X-Request-Id: a27dc38f-0772-4fdc-b62b-a3181a641123
    X-Runtime: 0.001891
    
    Unauthorized: Invalid token
    

    The Conjur logs show:

     Started POST "/authn/myConjurAccount/host%2FBotApp%2FmyDemoApp/authenticate" for 192.168.32.3 at 2020-07-15 21:03:33 +0000
    [origin=192.168.32.3] [request_id=29c707d4-a7ff-4e58-a317-0071c60f6a30] [tid=36] Processing by AuthenticateController#authenticate as */*
    [origin=192.168.32.3] [request_id=29c707d4-a7ff-4e58-a317-0071c60f6a30] [tid=36]   Parameters: {:controller=>"authenticate", :action=>"authenticate", :authenticator=>"authn", :account=>"myConjurAccount", :id=>"host/BotApp/myDemoApp"}
    [origin=192.168.32.3] [request_id=29c707d4-a7ff-4e58-a317-0071c60f6a30] [tid=36] myConjurAccount:host:BotApp/myDemoApp successfully authenticated with authenticator authn service myConjurAccount:webservice:conjur/authn
    [origin=192.168.32.3] [request_id=29c707d4-a7ff-4e58-a317-0071c60f6a30] [tid=36] Completed 200 OK in 17ms (Views: 0.2ms)
    [origin=192.168.32.3] [request_id=a27dc38f-0772-4fdc-b62b-a3181a641123] [tid=34] Started GET "/secrets/myConjurAccount/variable/BotApp%2FsecretVar" for 192.168.32.3 at 2020-07-15 21:03:35 +0000
    

    There is nothing after the Started GET line in the Conjur logs - the server appears to hang. The nginx logs show:

    192.168.32.2 - - [15/Jul/2020:21:03:33 +0000] "POST /authn/myConjurAccount/host%2FBotApp%2FmyDemoApp/authenticate HTTP/1.1" 200 644 "-" "curl/7.55.0"
    192.168.32.2 - - [15/Jul/2020:21:03:35 +0000] "GET /secrets/myConjurAccount/variable/BotApp%2FsecretVar HTTP/1.1" 401 27 "-" "curl/7.55.0"
    

I am going to try again going from 1.7.4 to 1.8.1, but if I get the same result I will likely need help from @uCatu and team to figure out next steps. I don't think we'll include post-1.8 releases in the suite release until this is resolved.

ETA: I have confirmed that the same problem occurs:

  • when upgrading from 1.7.4 to 1.8.1 following the standard instructions + the custom documented step for 1.8.0
  • when upgrading from 1.7.4 to 1.8.0 following the standard instructions

Copy link
Contributor

@uCatu uCatu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WIP please

UPGRADING.md Outdated

1. View Docker containers and verify all are healthy, up and running:
```
docker ps -a
Copy link
Contributor

@uCatu uCatu Jul 16, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggest using docker-compose as we assum user also use. And in case user has additional container that are not related to conjur they will not appear in output
docker-compose ps -a
image

UPGRADING.md Outdated
using Docker Compose. These steps assume you have defined your Conjur image in
a service named `conjur`, and that you have access to the Conjur data key
that was used when you originally deployed your Conjur server.
1. In your terminal window, set the `CONJUR_DATA_KEY` environment variable:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
1. In your terminal window, set the `CONJUR_DATA_KEY` environment variable:
1. In local terminal session, set `CONJUR_DATA_KEY` environment variable:

UPGRADING.md Outdated

The following steps describe a standard upgrade of the Conjur server, when deployed
using Docker Compose. These steps assume you have defined your Conjur image in
a service named `conjur`, and that you have access to the Conjur data key
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The mention of Conjur data key is much more important than naming of conjur service
Can we put the spotlight on it? It can be confusing with API KEY I think we need to give here a link (maybe from here or here ) to where we suggsted to store it, that it's a Base64-encoded 256 bit data encrytion key etc, without it - there is no upgrade

UPGRADING.md Outdated
that was used when you originally deployed your Conjur server.
1. In your terminal window, set the `CONJUR_DATA_KEY` environment variable:
```
export CONJUR_DATA_KEY={your Conjur data key}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
export CONJUR_DATA_KEY={your Conjur data key}
export CONJUR_DATA_KEY=<Conjur data key content>

UPGRADING.md Outdated
```

It may also be useful to check the logs of the Conjur
container to ensure that Puma started successfully, which can be done by
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't know if we want to talk about Puma here, adding comlpexitiy

UPGRADING.md Outdated
### Troubleshooting

If you run through the steps above _without_ setting the `CONJUR_DATA_KEY`
environment variable first, you will be able to complete the steps successfully
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we change "succesfully" to "without visible/explicit error message"

UPGRADING.md Outdated
environment variable first, you will be able to complete the steps successfully
but the logs of the new Conjur container will show an error like:
```
$ docker logs conjur_server
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we chose to use docer-compose please switch this example to the one above

UPGRADING.md Outdated
Comment on lines 61 to 62
`conjur_server` in the example command above is defined as the `container_name`
of the `conjur` service in the `docker-compose.yml`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Redundent in any case

UPGRADING.md Outdated
newer version.**

Starting in version 1.8.0, the hashing algorithm used to fingerprint and identify
encryption keys was changed from MD5 to SHA256. This fingerprint is stored in
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We want to add
"from MD5 to a more secure SHA256" ?

uCatu
uCatu previously requested changes Jul 16, 2020
Copy link
Contributor

@uCatu uCatu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WIP please

UPGRADING.md Outdated
the process again. This time when you check the logs of the Conjur server
container you should see the Puma service starting as expected:
```
$ docker logs conjur_server
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this command should be better? encapsulate with CLI
docker-compose exec conjur conjurctl wait

image

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm glad you suggested that! I use that command myself and was trying to find a way to work it in - I hesitated, because I'm not sure it works well enough when the server fails. I'll play around with it and see how it goes.

@izgeri
Copy link
Contributor

izgeri commented Jul 16, 2020

@uCatu thanks for your feedback, and I'll try to work in your changes.

IMPORTANTLY the custom upgrade instructions for v1.8.0 don't work as currently written. do you have any more info on this?

@uCatu
Copy link
Contributor

uCatu commented Jul 16, 2020

@uCatu thanks for your feedback, and I'll try to work in your changes.

IMPORTANTLY the custom upgrade instructions for v1.8.0 don't work as currently written. do you have any more info on this?

@izgeri
Yes I'm on it will take me 30 minutes to figure out if we can use slosilo task - as it never been tested. The original approach was to change fingerprint manual on PG via psql - just wanted to touch base on the whole tutorial since it's the first time I'm reading it, thats why I commented on other issues

Available on Slack for quick response :)

@uCatu
Copy link
Contributor

uCatu commented Jul 16, 2020

@uCatu thanks for your feedback, and I'll try to work in your changes.
IMPORTANTLY the custom upgrade instructions for v1.8.0 don't work as currently written. do you have any more info on this?

@izgeri
Yes I'm on it will take me 30 minutes to figure out if we can use slosilo task - as it never been tested. The original approach was to change fingerprint manual on PG via psql - just wanted to touch base on the whole tutorial since it's the first time I'm reading it, thats why I commented on other issues

Available on Slack for quick response :)

Revisiting Slosilo migrate task:
https://github.com/cyberark/slosilo/blob/4c4c6d2beac698b8958acab5627d926fe240c6fb/lib/slosilo/adapters/sequel_adapter.rb#L52

tells us that in order to get fingerprint value, fingrperint column must be deleted or not present
as we are talking about upgrade the latter is not relevent

That means that we can stil utilize that task by still needs to (was able to create this oneliner):
docker-compose exec -u postgres database psql -c "ALTER TABLE slosilo_keystore DROP COLUMN fingerprint"

and than run:
docker-compose exec conjur bundle exec rake slosilo:migrate

I tested it, looks fine can you verify please?

Copy link
Contributor

@izgeri izgeri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I deployed v1.5.0 using the quick start instructions

Then I built Conjur locally and changed the docker-compose to read conjur:1.8.1-14274915 for the image. I followed the upgrade instructions and it worked beautifully! Very nice work @h-artzi 👏

I did make one suggestion to change a section title. But other than that, I think this PR is good to go.

@izgeri
Copy link
Contributor

izgeri commented Sep 3, 2020

@h-artzi it looks like it also needs a rebase before I can approve

izgeri
izgeri previously approved these changes Sep 3, 2020
Copy link
Contributor

@izgeri izgeri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work to everyone involved! Thanks @hilagross for kicking this off, @uCatu for drafting an initial process, and @h-artzi for getting it over the finish line! It feels great to have this added to our project :)

@izgeri izgeri dismissed uCatu’s stale review September 3, 2020 19:22

@uCatu I believe @h-artzi addressed all of your requested changes, with one optional request that seems ok to skip. If you have any additional requests, let's work on them in a follow-up issue - this one's been open long enough!

@izgeri izgeri dismissed alexkalish’s stale review September 3, 2020 19:23

Dismissing your review @alexkalish because it would appear all of your changes have been addressed. Thanks for helping make this a better document!!

izgeri
izgeri previously approved these changes Sep 3, 2020
@codeclimate
Copy link

codeclimate bot commented Sep 3, 2020

Code Climate has analyzed commit 9a661bc and detected 0 issues on this pull request.

The test coverage on the diff in this pull request is 100.0% (50% is the threshold).

This pull request will bring the total coverage in the repository to 86.8% (0.0% change).

View more on Code Climate.

@izgeri izgeri merged commit 3b7cbe2 into master Sep 3, 2020
@izgeri izgeri deleted the hilagross-patch-1 branch September 3, 2020 20:33
@alexkalish
Copy link
Contributor

Wohoo!!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants