Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug in detecting "expires" timestamp #1028

Open
sdc50 opened this issue Feb 26, 2025 · 5 comments
Open

Bug in detecting "expires" timestamp #1028

sdc50 opened this issue Feb 26, 2025 · 5 comments

Comments

@sdc50
Copy link
Contributor

sdc50 commented Feb 26, 2025

In the expiration_timedelta method there is a bug when determining if the expires value from extra_data is a timestamp or a number of seconds since the auth_time:

https://github.com/python-social-auth/social-core/blob/master/social_core/storage.py#L73

            now = datetime.now(timezone.utc)

            # Detect if expires is a timestamp
            if expires > now.timestamp():

The logic here seems to be that if expires is a number greater than now.timestamp() (i.e. around 1740600285) then it must be a timestamp, and not the number of seconds since auth_time. However, when the access_token is actually expired, then by definition, expires is less than now.timestamp().

With the current logic, when the token is expired then it is not determined to be a timestamp and is treated like the number of seconds since auth_time, so it calculates the expiration_timedelta to around 20,000 days in the future rather than being in the past. It therefore, never tries to refresh the token since it never realizes that it's expired.

@sdc50
Copy link
Contributor Author

sdc50 commented Feb 26, 2025

I think the solution is to come up with a hardcoded number that would be able to distinguish timestamps from seconds-from-auth_time.

Access tokens are generally not valid for more than several days, so 1 million seconds (~11.5 days) seems like a reasonable lower cutoff.

Current time stamps (anything after around Nov 14, 2023) are greater than 1.7 billion, so that seem like a reasonable upper cutoff.

Timestamp for 1.7B
In [1]: datetime.fromtimestamp(1.7e9)
Out[1]: datetime.datetime(2023, 11, 14, 15, 13, 20)

In other words comparing expired to anything between 1e6 and 1.7e9 seems like it would reasonably distinguish between (modern) timestamps and number of seconds. In my opinion, 1e8 seems like a reasonable selection:

    def expiration_timedelta(self):
        """Return provider session live seconds. Returns a timedelta ready to
        use with session.set_expiry().

        If provider returns a timestamp instead of session seconds to live, the
        timedelta is inferred from current time (using UTC timezone). None is
        returned if there's no value stored or it's invalid.
        """
        if self.extra_data and "expires" in self.extra_data:
            try:
                expires = int(self.extra_data.get("expires"))
            except (ValueError, TypeError):
                return None

            now = datetime.now(timezone.utc)

            # Detect if expires is a timestamp
            if expires > 1e8:
                # expires is a datetime, return the remaining difference
                expiry_time = datetime.fromtimestamp(expires, tz=timezone.utc)
                return expiry_time - now
            # expires is the time to live seconds since creation,
            # check against auth_time if present, otherwise return
            # the value
            auth_time = self.extra_data.get("auth_time")
            if auth_time:
                reference = datetime.fromtimestamp(auth_time, tz=timezone.utc)
                return (reference + timedelta(seconds=expires)) - now
            return timedelta(seconds=expires)
        return None

@sdc50
Copy link
Contributor Author

sdc50 commented Feb 26, 2025

My above suggestion would fix the way that it is currently supposed to work. To avoid this issue all together it seems like it may be better to check for two separate values in extra_data:

expires_on - a timestamp for when the token expires
expires_in - the number of seconds from auth_time when the token expires

These values seem to be more common among providers and remove the ambiguity.

I suggest that the expiration_timedelta tries to use these attributes in the following order:

expires_on
expires_in
expires - use the logic above to translate to either expires_on or expires_in.

Here is my suggested implementation:

    def _get_extra_data_value_as_int(self, key):
        """Utility to safely cast an extra_data value as an int.
        """
        try:
            return int(self.extra_data.get(key))
        except (ValueError, TypeError):
            return None

    def expiration_from_expires_on(self, expires_on=None):
        """Returns a datetime of when the provider session expires, calculated from the "expires_on" value.
        """
        expires_on = expires_on or self._get_extra_data_value_as_int("expires_on")
        if expires_on is not None:
            return datetime.fromtimestamp(expires_on, tz=timezone.utc)

    def expiration_from_expires_in(self, expires_in=None):
        """Returns a datetime of when the provider session expires, calculated from the "expires_in" value.
        """
        expires_in = expires_in or self._get_extra_data_value_as_int("expires_in")
        if expires_in is not None:
            auth_time = self.extra_data.get("auth_time")
            if auth_time:
                return datetime.fromtimestamp(auth_time, tz=timezone.utc) + timedelta(seconds=expires_in)

    def expiration_timedelta(self):
        """Return provider session live seconds. Returns a timedelta ready to
        use with session.set_expiry().

        If provider returns a timestamp instead of session seconds to live, the
        timedelta is inferred from current time (using UTC timezone). None is
        returned if there's no value stored or it's invalid.
        """
        if self.extra_data:
            expiry_time = self.expiration_from_expires_on()
            if expiry_time is None:
                expiry_time = self.expiration_from_expires_in()
                if expiry_time is None:
                    expires = self._get_extra_data_value_as_int("expires")
                    if expires is None:
                        return
                    # Detect if expires is a timestamp
                    if expires > 1e8:
                        expiry_time = self.expiration_from_expires_on(expires_on=expires)
                    else:
                        expiry_time = self.expiration_from_expires_in(expires_in=expires)

            now = datetime.now(timezone.utc)
            return expiry_time - now

If this approach is deemed acceptable, I'd be happy to submit a PR.

@nijel
Copy link
Member

nijel commented Feb 27, 2025

While your approach is definitely better than the current implementation, wouldn't it be better to store directly expires_on / expires_in in the extra data and avoid the heuristics completely? It would be definitely more work as each backend currently storing expires needs to be reviewed, but that would make the solution more robust.

@sdc50
Copy link
Contributor Author

sdc50 commented Feb 28, 2025

While your approach is definitely better than the current implementation, wouldn't it be better to store directly expires_on / expires_in in the extra data and avoid the heuristics completely? It would be definitely more work as each backend currently storing expires needs to be reviewed, but that would make the solution more robust.

@nijel Yes, it would definitely be better to store expires_on and expires_in directly, which is what I am proposing here. My suggested implementation keeps expires for backward compatibility which would allow backends to migrate on their own timeline to using expires_on or expires_in.

@nijel
Copy link
Member

nijel commented Feb 28, 2025

I don't think the backends would ever be fully migrated if this is not done at once.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

No branches or pull requests

2 participants