Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Portmap doesn't fail if chain doesn't exist #421

Merged
merged 4 commits into from
Dec 18, 2019

Conversation

aojea
Copy link
Contributor

@aojea aojea commented Dec 11, 2019

It turns out that the portmap plugin is not idempotent if its executed in parallel.
The errors are caused due to a race of different instantiations deleting the chains.
This patch does that the portmap plugin doesn't fail if the errors are because the chain doesn't exist on teardown.

xref: #418

Signed-off-by: Antonio Ojea [email protected]

Copy link
Member

@mars1024 mars1024 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Line 93 in chain.go if err := ipt.Delete(c.table, entryChain, chainParts...); err != nil { should be idempotent.

plugins/meta/portmap/chain.go Outdated Show resolved Hide resolved
@aojea
Copy link
Contributor Author

aojea commented Dec 11, 2019

Line 93 in chain.go if err := ipt.Delete(c.table, entryChain, chainParts...); err != nil { should be idempotent.

Do you mean that I should check if there is an error of non existent chain and swallow it?

@mars1024
Copy link
Member

Line 93 in chain.go if err := ipt.Delete(c.table, entryChain, chainParts...); err != nil { should be idempotent.

Do you mean that I should check if there is an error of non existent chain and swallow it?

Yes, that's what I want, and if you could print errs in test, I believe we will locate the root cause exactly.

@aojea
Copy link
Contributor Author

aojea commented Dec 11, 2019

Line 93 in chain.go if err := ipt.Delete(c.table, entryChain, chainParts...); err != nil { should be idempotent.

Do you mean that I should check if there is an error of non existent chain and swallow it?

Yes, that's what I want, and if you could print errs in test, I believe we will locate the root cause exactly.

I've already found the root cause in #418, and It's easy to reproduce #418 (comment)

You can see that it always fail in those lines, however, you are right and I've found a small percentage of time that fails in the line you mention

@aojea
Copy link
Contributor Author

aojea commented Dec 11, 2019

@mars1024 PTAL, I added your suggestions and it's working now, I can't reproduce the problem
Thank you for the guidance

EDIT

it failed in the CI :/, adding logging to the test

EDIT2

this is the failure

• Failure [0.084 seconds]
chain tests
/home/travis/gopath/src/github.com/containernetworking/plugins/gopath/src/github.com/containernetworking/plugins/plugins/meta/portmap/chain_test.go:33
  deletes chains idempotently in parallel [It]
  /home/travis/gopath/src/github.com/containernetworking/plugins/gopath/src/github.com/containernetworking/plugins/plugins/meta/portmap/chain_test.go:201
  Expected error:
      <*errors.errorString | 0xc00007cfe0>: {
          s: "Failed to delete referring rule filter -A cni-test-8177199 -d 203.0.113.1/32 -j cni-test-5330018: running [/sbin/iptables -t filter -D cni-test-8177199 -d 203.0.113.1/32 -j cni-test-5330018 --wait]: exit status 2: iptables v1.6.0: Couldn't load target `cni-test-5330018':No such file or directory\n\nTry `iptables -h' or 'iptables --help' for more information.\n",
      }
      Failed to delete referring rule filter -A cni-test-8177199 -d 203.0.113.1/32 -j cni-test-5330018: running [/sbin/iptables -t filter -D cni-test-8177199 -d 203.0.113.1/32 -j cni-test-5330018 --wait]: exit status 2: iptables v1.6.0: Couldn't load target `cni-test-5330018':No such file or directory
      
      Try `iptables -h' or 'iptables --help' for more information.
      
  not to have occurred

@aojea aojea force-pushed the portmapErrors2 branch 4 times, most recently from 91c31ea to 8a57c17 Compare December 11, 2019 17:17
@aojea
Copy link
Contributor Author

aojea commented Dec 11, 2019

@squeed please take look to current PR
the failure is because of this error

running [/sbin/iptables -t filter -X cni-test-6142911 --wait]: exit status 1: Another app is currently holding the xtables lock; waiting (1s) for it to exit...
      iptables: No chain/target/match by that name.

the xtables lock message makes that
https://github.com/coreos/go-iptables/blob/af017ce15ce0adc4beab89d270abfeeb4d262cf1/iptables/iptables.go#L50-L54

fails to parse correctly the error message, thus it doesn't swallow the error

Should I submit a new patch to go-iptables to consider this scenario? add it to current PR? or is it ok submitting the PR as it is now?

Use a Describe container for the It code block of the
portmap port forward integration test.

Signed-off-by: Antonio Ojea <[email protected]>
It turns out that the portmap plugin is not idempotent if its
executed in parallel.
The errors are caused due to a race of different instantiations
deleting the chains.
This patch does that the portmap plugin doesn't fail if the
errors are because the chain doesn't exist on teardown.

Signed-off-by: Antonio Ojea <[email protected]>
@aojea
Copy link
Contributor Author

aojea commented Dec 12, 2019

@mars1024 @squeed please take a look, I've rebased over the commit that fixes the flaky test and I think that is ready to go

@mars1024
Copy link
Member

Thanks for your efforts, almost LGTM, and I think it's better to squash some lines into some idempotent utility functions in pkg/utils/iptables.go just like #408 .

@squeed
Copy link
Member

squeed commented Dec 12, 2019

Should I submit a new patch to go-iptables to consider this scenario? add it to current PR? or is it ok submitting the PR as it is now?

Yes, please! Though, I'm surprised that's failing. It should wait indefinitely. (I have merge rights in go-iptables too).

Add the following idempotent functions to iptables utils:

DeleteRule: idempotently delete an iptables rule
DeleteChain: idempotently delete an iptables chain
ClearChain: idempotently flush an iptables chain

Signed-off-by: Antonio Ojea <[email protected]>
@aojea
Copy link
Contributor Author

aojea commented Dec 12, 2019

@squeed @mars1024 I think that I've addressed all the remaining comments, can you please take a look again?

@squeed
Copy link
Member

squeed commented Dec 13, 2019

@aojea I've released go-iptables, can you pick that up in this PR?

@aojea
Copy link
Contributor Author

aojea commented Dec 13, 2019

@squeed here you go, go-iptables upgraded to 0.4.4 with the last changes
by the way, I had a fun time with go1.13 removing some dependencies from the vendor folder :-)

Is it possible to release a new CNI version?
This is introducing flakiness in kubernetes CI
kubernetes/kubernetes#85727 (comment)

@squeed
Copy link
Member

squeed commented Dec 13, 2019

lgtm!

Copy link
Member

@mars1024 mars1024 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm, thanks!

@aojea
Copy link
Contributor Author

aojea commented Dec 16, 2019

/hold @squeed @mars1024 there is a bug in the go-iptables code I've introduced, my apologies, we need to bump the iptables code again
I've submitted the fix coreos/go-iptables#72

bump the go-iptables module to v0.4.5 to avoid
concurrency issues with the portmap plugin and
errors related to iptables not able to hold the
lock.

Signed-off-by: Antonio Ojea <[email protected]>
@aojea
Copy link
Contributor Author

aojea commented Dec 16, 2019

/hold cancel

updated the go-iptables dependency to 0.4.5 with the fix
@squeed seems we are ready to go 😄

@mars1024
Copy link
Member

/lgtm for the newest push

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants