
Changes to runtimes variables in support of retiring 'whisk/*' runtimes #3412

Closed
wants to merge 1 commit

Conversation

jonpspri
Contributor

@jonpspri jonpspri commented Mar 8, 2018

Changes to the ansible processes for determining, and optionally pulling, runtime images, as a preliminary step toward eliminating the hardcoding of the 'whisk' prefix for runtime docker images.

Description

As per the discussion in #3407, this PR contains changes to the ansible processes that will be needed to remove the requirement that 'whisk/*' images be in place for builtin runtimes. Instead, runtimes may (do we eventually want should or must?) be specified in the runtimes.json file.

A pleasant side effect of this change is that all variables (I think) impacting runtimes have now been consolidated into the 'runtimes' YAML object in ansible/group_vars/all, and all overrides are tested and defaulted at that point. There may be some redundant defaulting remaining in ansible/roles/(controller|invoker)/tasks/deploy.yml, but I'm trying to manage the number of changes and other maintenance risks that may creep in.
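A minimal sketch of what such a consolidated object might look like (the variable names and defaults here are illustrative only, not the actual contents of ansible/group_vars/all):

```yaml
# Hypothetical sketch -- keys and defaults are assumptions, not the real file.
runtimes:
  # One place to override the registry prefix instead of hardcoding 'whisk/'
  prefix: "{{ runtimes_registry_prefix | default('openwhisk') }}"
  tag: "{{ runtimes_tag | default('latest') }}"
  # Manifest of builtin runtimes, overridable from the inventory
  manifest: "{{ runtimes_manifest | default(lookup('file', 'files/runtimes.json') | from_json) }}"
```

With a single object like this, the controller and invoker deploy tasks can reference one source of truth rather than re-defaulting each variable.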

Review and PG requested -- there may be breakage in the IBM Functions build.

Related issue and scope

  • I opened an issue to propose and discuss this change (#????)

My changes affect the following components

  • API
  • Controller
  • Message Bus (e.g., Kafka)
  • Loadbalancer
  • Invoker
  • Intrinsic actions (e.g., sequences, conductors)
  • Data stores (e.g., CouchDB)
  • Tests
  • Deployment
  • CLI
  • General tooling
  • Documentation

Types of changes

  • Bug fix (generally a non-breaking change which closes an issue).
  • Enhancement or new feature (adds new functionality).
  • Breaking change (a bug fix or enhancement which changes existing behavior).

Checklist:

  • I signed an Apache CLA.
  • I reviewed the style guides and followed the recommendations (Travis CI will check :).
  • I added tests to cover my changes.
  • My changes require further changes to the documentation.
  • I updated the documentation where necessary.

@jonpspri
Contributor Author

jonpspri commented Mar 8, 2018

Yep, I see the Travis failure. Something funky is going on with how the templating of 'whisk.properties' works with my changes. I need to set up a regression environment to cross-test against. Gotta go do the day job now, but I'll likely sort this out during the breaks.

@jonpspri jonpspri force-pushed the runtime-prefix-tag branch from 0a9ab41 to d0b05d1 Compare March 8, 2018 22:42
@jonpspri
Contributor Author

jonpspri commented Mar 8, 2018

^^^ Rebased in case a fix to the Kafka heisenbug has been committed.

@rabbah rabbah requested review from dgrove-oss and csantanapr March 20, 2018 03:16
@rabbah rabbah added the deployment and review (Review for this PR has been requested and yet needs to be done.) labels Mar 20, 2018
@dgrove-oss
Member

dgrove-oss commented Mar 20, 2018

I thought we decided to move the misc defaults to pureconfig (https://github.com/apache/incubator-openwhisk/blob/8fb3a6fda1700b8cde1de300492b1461a176df15/common/scala/src/main/resources/application.conf#L131-#L137) and removed them from ansible. Why do they need to move back to ansible? (We don't use ansible to deploy on kube, which is why moving the defaults out of ansible was desirable.)

@jonpspri
Contributor Author

Hi @dgrove-oss, I missed that memo, so forgive me while I catch up. Currently there's logic in the ansible deploy that pre-fetches the images listed in 'runtime_manifest.json', and it therefore needs to know the defaults as defined in application.conf. Unfortunately, ansible doesn't natively support parsing application.conf (one of several reasons I dislike the PureConfig approach). For the time being, that means I do have to keep that separate list of overrides in group_vars/all. Ideally, I'd do away with the pre-loading entirely and have the Controller/Invoker be smart enough to initialize their own runtime space, but I don't think we're there yet.

@chetanmeh
Member

Unfortunately, ansible doesn’t natively support parsing application.conf (one of several reasons I dislike the pure config approach)

@jonpspri One possible workaround would be to convert application.conf to JSON using the Typesafe Config API (see serialization):

import com.typesafe.config.{Config, ConfigFactory, ConfigRenderOptions}

// Load application.conf from the classpath, then render the whole tree as compact JSON
val config: Config = ConfigFactory.load()
val configJSON: String = config.root().render(ConfigRenderOptions.concise())

And then consume that in the ansible logic. We can add a script to dev tools, which can then be invoked from ansible via a gradle command like gradlew -p tools/dev confToJson
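As a sketch of the consuming side (the JSON key layout below is an assumption for illustration, not the actual structure of application.conf), the rendered config could be filtered down to the runtime image defaults with a few lines of Python:

```python
import json

# Hypothetical shape of the JSON rendered from application.conf by the
# Typesafe Config snippet above; the real key layout under 'whisk' may differ.
dumped = json.dumps({
    "whisk": {
        "runtimes": {
            "defaultImagePrefix": "openwhisk",
            "defaultImageTag": "latest",
        }
    }
})

def runtime_image_defaults(config_json: str) -> dict:
    """Extract the runtime image defaults from a concise-rendered config."""
    config = json.loads(config_json)
    return config.get("whisk", {}).get("runtimes", {})

defaults = runtime_image_defaults(dumped)
print(defaults["defaultImagePrefix"])  # -> openwhisk
```

The same extraction would be a one-liner in ansible with the from_json filter once the dump exists on disk.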

@dgrove-oss
Member

I'd like to have just one place to get these defaults. I like @chetanmeh's suggestion of parsing application.conf into ansible, so that application.conf becomes available to the non-Scala parts of the system.
@markusthoemmes any thoughts?

@jonpspri
Contributor Author

I was thinking along the same lines as @chetanmeh, although I'd rather go the other way: a JSON or YAML config file that is used to configure the Scala code. Then we're less likely to run into synchronization problems, and edits/overrides are well understood. I expect that's a non-starter from an effort standpoint, though.

Can we leave gradle out of this solution? Ideally, in this scenario, one would use 'shell:' to invoke an entry point in the invoker jar that dumps the JSON to stdout. That way we avoid all sorts of build out-of-sync issues.

@chetanmeh
Member

chetanmeh commented Mar 20, 2018

Ideally, in this scenario, one would use 'shell:' to invoke an entry point in the invoker jar that dumps the JSON to stdout.

Makes sense. We can add a utility main class in common (implemented in Java to avoid a Scala dependency on the classpath) to dump the config, and in ansible just invoke

java -cp typesafe-config.jar:openwhisk-common.jar whisk.utils.ConfigDumper /path/to/output.json
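On the ansible side, the invocation could then be a plain task pair along these lines (the jar paths, class name, and output path follow the proposal above and are assumptions, not existing code):

```yaml
# Hypothetical ansible tasks -- classpath and output path are illustrative.
- name: dump application.conf defaults to JSON
  shell: >
    java -cp typesafe-config.jar:openwhisk-common.jar
    whisk.utils.ConfigDumper /tmp/openwhisk-config.json

- name: load dumped defaults as ansible facts
  set_fact:
    whisk_config: "{{ lookup('file', '/tmp/openwhisk-config.json') | from_json }}"
```

This keeps gradle out of the deploy path, at the cost of requiring the common jar to be built before ansible runs.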

@jonpspri
Contributor Author

I like that approach, assuming we need to continue pre-pulling Docker images in ansible. My preferred approach would be for the Invoker to pre-pull its needed images on startup. I'm opening the covers on the invoker in my next PR, so can we table this conversation until then? I'm referencing #3407 here so we track this conversation. I'll also add a TODO to the ansible files where the suspect variables are defined and used so we don't build in more dependencies. We can phase them out over the next few weeks while the JARs are open. This PR's intention was to do pre-JAR work. Deal?

@jonpspri jonpspri force-pushed the runtime-prefix-tag branch from 1de490c to d615b6e Compare March 22, 2018 17:17
@dgrove-oss
Member

As long as we leave the redundant defaults in application.conf so that kube deploy won't suddenly break, I'm ok with staging things and having duplication for a couple of weeks while the changes flow through.

fwiw, in the kube deployments, the pulls are done by an init container that runs before the main invoker container starts. it does the pulls by reading (a duplicated... 😬 ) runtimes.json file.

@@ -33,7 +33,7 @@ whisk.api.host.name={{ whisk_api_host_name | default(groups['edge'] | first) }}
 whisk.api.localhost.name={{ whisk_api_localhost_name | default(whisk_api_host_name) | default(whisk_api_localhost_name_default) }}
 whisk.api.vanity.subdomain.parts=1

-runtimes.manifest={{ runtimesManifest | to_json }}
+runtimes.manifest={{ runtimes.manifest | to_json }}
Member

i think... we can drop this... gulp, i didn't try.

Contributor Author

Oh that's scary. Let me dig through the invoker code -- I'm not sure how much indirection it uses to get the manifest.

Member

This file is used for tests only, not the invoker. It might actually still be needed to run tests.

Contributor Author

I'm trying to slowly factor it out of tests. I'll take a look there. In the meantime, I guess I'll have to raise an issue so we don't lose track of it.

Member

feel free to ignore my comment it's inconsequential for this pr.

@jonpspri
Contributor Author

@dgrove-oss Yeah, ansible is doing the pulls based on runtimes.json as well, hence some of this stuff. If we get invoker to do its own pulls, does that let you cut out a container? I'm working on the extra Java class now -- probably will have something on Monday.

@dgrove-oss
Member

It won't help with the KubernetesContainerFactory, because in that architecture we don't run an invoker on every node. (We run a smaller number of invokers as a StatefulSet, plus a small, slim agent written in Go on every worker node that handles the interactions with docker.)

@jonpspri
Contributor Author

@dgrove-oss Hmm... so maybe we should eventually factor out an invoker-helper that runs in the worker pods and manages the associated image-space. Or perhaps we defer the whole docker-caching problem up to kube? We can't be the only kube users with that concern...

@jonpspri jonpspri force-pushed the runtime-prefix-tag branch from 8a32e2e to 426f56a Compare May 5, 2018 12:27
@jonpspri jonpspri force-pushed the runtime-prefix-tag branch from 426f56a to e2163c8 Compare May 5, 2018 13:37
@codecov-io

Codecov Report

Merging #3412 into master will not change coverage.
The diff coverage is n/a.

Impacted file tree graph

@@          Coverage Diff           @@
##           master   #3412   +/-   ##
======================================
  Coverage    74.4%   74.4%           
======================================
  Files         125     125           
  Lines        5951    5951           
  Branches      384     384           
======================================
  Hits         4428    4428           
  Misses       1523    1523

Continue to review full report at Codecov.

Last update 6fef5c4...e2163c8.

@rabbah
Member

rabbah commented Jun 11, 2018

We've settled on #3680.

@rabbah rabbah closed this Jun 11, 2018
@jonpspri jonpspri deleted the runtime-prefix-tag branch April 19, 2020 11:17