Skip to content
This repository has been archived by the owner on Dec 8, 2022. It is now read-only.

Support caching more than 1 IP address per DNS cache entry #1981

Merged
merged 47 commits into from
May 14, 2020
Merged

Support caching more than 1 IP address per DNS cache entry #1981

merged 47 commits into from
May 14, 2020

Conversation

gkwicker
Copy link
Contributor

@gkwicker gkwicker commented May 6, 2020

Description

These changes address this FreeRTOS+TCP issue, add a test for the new multiple IP address DNS feature, and re-enable DNS caching on all FreeRTOS+TCP platforms. If FreeRTOS+TCP is configured with (ipconfigUSE_DNS_CACHE == 1) && (ipconfigDNS_CACHE_ADDRESSES_PER_ENTRY > 1), FreeRTOS+TCP will store multiple answers from a DNS query response; successive calls to FreeRTOS_gethostbyname() will return the next answer in the list.

Configuring ipconfigDNS_CACHE_ADDRESSES_PER_ENTRY > 1 when the DNS cache is enabled improves reliability in systems which contact load-balanced services, since addresses in the answer list can be stale. This change updates the build configurations for all FreeRTOS+TCP platforms to store up to 6 IP addresses per DNS cache entry.

Testing

Tested with Full_TCP and MQTT_System test groups on the following boards (all passing):

  • pc.windows
  • infineon.xmc4800_iotkit
  • infineon.xmc4800_plus_optiga_trust_x
  • nuvoton.numaker_iot_m487_wifi
  • marvell.mw320
  • microchip.curiosity_pic32mzef

Checklist:

  • [ X ] I have tested my changes. No regression in existing tests.
  • My code is Linted.

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@markrtuttle
Copy link
Contributor

Feel free to contact me if you want help with those broken CBMC proofs.

@AniruddhaKanhere
Copy link
Member

@gkwicker "Harden the code to handle uninitialized/corrupted DNS cache fields." This is nice 👍

@gkwicker
Copy link
Contributor Author

gkwicker commented May 7, 2020

/bot run checks

ulIPAddressIndex = ( xDNSCache[ x ].ucCurrentIPAddress %
xDNSCache[ x ].ucNumIPAddresses ) % ipconfigDNS_CACHE_ADDRESSES_PER_ENTRY;

xDNSCache[ x ].ucCurrentIPAddress++;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suppose that 10 addresses were found. When calling this function repeatedly, I would see the addresses 0..9, 0..9, and when ucCurrentIPAddress becomes 255, I would see 0..5, followed by 0..9 again?

Why not:

if( ++xDNSCache[ x ].ucCurrentIPAddress == xDNSCache[ x ].ucNumIPAddresses )
{
	xDNSCache[ x ].ucCurrentIPAddress = 0;
}

Making ucCurrentIPAddress a uint8_t is no problem, DNS can not return more than +/- 27 answers.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One more general question about the change:
Suppose I'm looking up the addresses of some host and it receives 5 IP-addresses.
I will call FreeRTOS_gethostbyname() repeatedly to see all addresses.
How do I know when I've seen all IP-addresses? Should I remember the first IP-address received?
Why not return 0 after the last IP-address was returned? And then the first IP-address again?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Applications using FreeRTOS_gethostbyname() should interpret the result as an independent scalar value rather than a member of a list, and they should not prefer any value over any other. When we have an implementation of FreeRTOS_getaddrinfo(), applications will have the option of choosing a particular address, which could be useful in some circumstances.

The reasoning behind this change is to give app developers an easy way to increase reliability of their app when connecting to a load-balanced service, without requiring them to change any application logic. As just one example, the amazon-freertos repo has dozens of tests and demos which can benefit immediately without any modification.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suppose that 10 addresses were found. When calling this function repeatedly, I would see the addresses 0..9, 0..9, and when ucCurrentIPAddress becomes 255, I would see 0..5, followed by 0..9 again?

Why not:

if( ++xDNSCache[ x ].ucCurrentIPAddress == xDNSCache[ x ].ucNumIPAddresses )
{
	xDNSCache[ x ].ucCurrentIPAddress = 0;
}

Making ucCurrentIPAddress a uint8_t is no problem, DNS can not return more than +/- 27 answers.

I didn't choose this option because I didn't want to have to do anything for thread safety for cases where two tasks both call FreeRTOS_gethostbyname(). In the event that there was a context switch after the if but before the statement in {...}, we could have a TOCTOU error resulting in an out-of-bound array access.

The side effect of doing the modulo approach as you have pointed out is that you the order that addresses appear may not reflect how they are stored internally, but this isn't important since the caller should make no assumption about these values. The only thing that is promised is that FreeRTOS_gethostbyname() will return an answer if one is available.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't choose this option because I didn't want to have to do anything for thread safety for cases where two tasks both call FreeRTOS_gethostbyname(). In the event that there was a context switch after the if but before the statement in {...}, we could have a TOCTOU error resulting in an out-of-bound array access.

What if we do

if( ++xDNSCache[ x ].ucCurrentIPAddress >= xDNSCache[ x ].ucNumIPAddresses )
{   /* Use >=  instead of == */
     .... 
}

Would it remove TOCTOU error? I could not think of a case where it fails. Do correct me if I am wrong.
I am just trying to reduce computation as much as we can. Modulo operation is quite expensive than a comparison statement in a lot of architectures.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remember that if interrupts aren't masked, a context switch can happen at any time. And since this code is not protected by any mutual-exclusion mechanism such as a mutex or semaphore (and we don't want to add those if we don't have to), you can still have the error.

In any case, the code as written is simpler and avoids any possibility of this occurring.

@gkwicker
Copy link
Contributor Author

gkwicker commented May 9, 2020

/bot run checks

@@ -805,6 +805,7 @@ TEST_GROUP_RUNNER( Full_TCP )
RUN_TEST_CASE( Full_TCP, AFQP_SOCKETS_Recv_Invalid );
RUN_TEST_CASE( Full_TCP, AFQP_SOCKETS_htons_HappyCase );
RUN_TEST_CASE( Full_TCP, AFQP_SOCKETS_inet_addr_quick_HappyCase );
RUN_TEST_CASE( Full_TCP, test_dns_cache_multiple_addresses );
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this the correct place for this test? Since it doesn't relate to secure sockets. This is not blocking but more of a comment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test suite is our "networking" test suite, so it's the best place for it. I think you're right though, ideally we would have a set of layered networking test suites. But that's another set of changes.

ulIPAddressIndex = ( xDNSCache[ x ].ucCurrentIPAddress %
xDNSCache[ x ].ucNumIPAddresses ) % ipconfigDNS_CACHE_ADDRESSES_PER_ENTRY;

xDNSCache[ x ].ucCurrentIPAddress++;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't choose this option because I didn't want to have to do anything for thread safety for cases where two tasks both call FreeRTOS_gethostbyname(). In the event that there was a context switch after the if but before the statement in {...}, we could have a TOCTOU error resulting in an out-of-bound array access.

What if we do

if( ++xDNSCache[ x ].ucCurrentIPAddress >= xDNSCache[ x ].ucNumIPAddresses )
{   /* Use >=  instead of == */
     .... 
}

Would it remove TOCTOU error? I could not think of a case where it fails. Do correct me if I am wrong.
I am just trying to reduce computation as much as we can. Modulo operation is quite expensive than a comparison statement in a lot of architectures.

*/
#define ipconfigUSE_DNS_CACHE ( 0 )
#define ipconfigUSE_DNS_CACHE ( 1 )
#define ipconfigDNS_CACHE_ADDRESSES_PER_ENTRY ( 6 )
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we make this 6U? MISRA tends to dislike operations between signed and unsigned values. (Such as modulo operation in FreeRTOS_DNS.c)

Copy link
Contributor Author

@gkwicker gkwicker May 11, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is vendor code and it likely has the same issues throughout (see line 77). If Coverity raises a warning, we'll address that in a separate PR.

*/
#define ipconfigUSE_DNS_CACHE ( 0 )
#define ipconfigUSE_DNS_CACHE ( 1 )
#define ipconfigDNS_CACHE_ADDRESSES_PER_ENTRY ( 6 )
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we make this 6U? MISRA tends to dislike arithmetic operations between signed and unsigned variables,.

Copy link
Contributor Author

@gkwicker gkwicker May 11, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See previous comment, that will be addressed in a PR if Coverity raises a warning.

*/
#define ipconfigUSE_DNS_CACHE ( 0 )
#define ipconfigUSE_DNS_CACHE ( 1 )
#define ipconfigDNS_CACHE_ADDRESSES_PER_ENTRY ( 6 )
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we make this 6U? MISRA tends to dislike arithmetic operations between signed and unsigned variables.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correction in other ports as well if possible.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See previous comment.

* address is out of date, no connections to the host will succeed until the TTL
* expires. See this GitHub issue for more details.
*
* https://github.com/FreeRTOS/FreeRTOS/issues/58
*/
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to have a similar note to this for a scenario where ipconfigUSE_DNS_CACHE == 1 and ipconfigDNS_CACHE_ADDRESSES_PER_ENTRY == 1? This seems like it would have the same issue that this note addresses.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would, that's a good point. I'll add a note in the example FreeRTOSIPConfig.h about this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a clarifying comment in FreeRTOSIPConfigDefaults.h

yanjos-dev
yanjos-dev previously approved these changes May 12, 2020
@@ -2961,6 +2962,61 @@ TEST( Full_TCP, AFQP_SOCKETS_htons_HappyCase )
}
/*-----------------------------------------------------------*/

TEST( Full_TCP, test_dns_cache_multiple_addresses )
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For consistency, can we name the test AFQP_SECURE_SOCKETS_TestDnsCacheMultipleAddresses?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The AFQP standards for Amazon FreeRTOS Qualification Program, and this test won't be added to that test suite now since that would be a qualification change. I think it would make sense to add it to that test suite eventually, however this would have to be coordinated with that program. We would change the name to an AFQP name at that time.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense - however can we use camel casing instead of snake case?

@@ -2961,6 +2962,61 @@ TEST( Full_TCP, AFQP_SOCKETS_htons_HappyCase )
}
/*-----------------------------------------------------------*/

TEST( Full_TCP, test_dns_cache_multiple_addresses )
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this test verifying DNS caching? The implementation of SOCKETS_GetHostByName may not use DNS caching and instead rely on DNS round robin- the test would still pass?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's a good point. This wouldn't be the case in the FreeRTOS+TCP implementation, but it could be in others. This test will be disabled in all platforms which use non-FreeRTOS+TCP networking until we can verify that it works on them. In fact, having the word cache in this test name may be misleading, since it's not verifying that a DNS cache is enabled (I'm not sure if this would even be possible in this test), only that different IP addresses are returned for different invocations of SOCKETS_GetHostByName(). I will rename the test to something more accurate, perhaps test_dns_multiple_addresses.

/*
* Resolve the broker endpoint to an array of IP addresses. Each
* subsequent call will return one of the addresses which the
* name resolves to. Call once more than the maximum number of addresses
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Call once more than the maximum number of addresses per cache entry to ensure that wraparound works.

How is this ensured?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment is misleading for a couple of reasons, I will change it. First, since this is Secure Sockets API-level test, it has no idea what sort of DNS caching strategy is employed (if any as pointed out in a previous comment), how many addresses could be stored per entry, etc. The only thing it needs to ensure is that subsequent calls to SOCKETS_GetHostByName() on an AWS IoT endpoint return different IP addresses.

Wraparound testing in the FreeRTOS+TCP implementation of FreeRTOS_gethostbyname() needs to be handled by a unit test in that code, not a system-level test here.

ulUnique = 0;
}
}
if( ( ulUnique == 1 ) && ( ulNumUniqueIPAddresses < 10 ) )
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we trying to obtain 10 different addresses while getting only 4 is required to pass the test?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, those numbers should match – will change.

@gkwicker gkwicker dismissed stale reviews from AniruddhaKanhere and yanjos-dev via 23a4ca2 May 13, 2020 23:16
@gkwicker gkwicker merged commit bb3c7e6 into aws:master May 14, 2020
alfred2g pushed a commit to alfred2g/amazon-freertos that referenced this pull request May 21, 2020
* Support multiple IP addresses per DNS cache entry.
* Add test for multiple IP address DNS results from AWS IoT Core endpoint.
* Re-enable DNS cache on FreeRTOS+TCP platforms
* Configure FreeRTOS+TCP platforms to remember up to 6 DNS answers per cache entry

More information at:

FreeRTOS/FreeRTOS#58
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants