Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

to_sql Fails with if_exists="replace" When Table or Schema Names Contain Special Characters #562

Closed
AJM10565 opened this issue Dec 2, 2024 · 1 comment

Comments

@AJM10565
Copy link
Contributor

AJM10565 commented Dec 2, 2024

Issue Summary

When using the to_sql function in PyAthena with if_exists="replace", SQL statements like DROP TABLE fail if the table name or schema contains special characters (e.g., hyphens -). This happens because Athena requires such identifiers to be enclosed in backticks (`identifier`) to avoid syntax errors.

Steps to Reproduce

  1. Create a schema or table name containing special characters, such as my-schema.my-table.
  2. Use the to_sql function with the if_exists="replace" parameter.
  3. Observe that the generated DROP TABLE or DDL query fails with a syntax error.

Suggested Solution

Modify the to_sql implementation to properly escape identifiers wherever they appear in SQL statements. Specifically:

  1. DROP TABLE Statements:
    Update to:

    DROP TABLE `{schema}`.`{name}`
  2. CREATE TABLE (DDL) Statements:
    Ensure that schema and table names are enclosed in backticks in the generated DDL.

  3. Information Schema Queries:
    Escape schema and table names:

    SELECT table_name
    FROM information_schema.tables
    WHERE table_schema = '{schema}'
    AND table_name = '{name}'
  4. Add a Utility for Escaping:
    Create a helper function for escaping identifiers:

    def escape_identifier(identifier: str) -> str:
        return f"`{identifier}`"
  5. Update the Codebase:
    Apply the escape_identifier function wherever schema or table names are referenced in SQL.

Why This Matters

Properly escaping identifiers ensures compatibility with Athena's SQL syntax rules, avoids syntax errors, and makes PyAthena more robust when dealing with schemas or tables containing special characters. This change would also make the library easier to use without requiring additional handling from users.

laughingman7743 added a commit that referenced this issue Dec 3, 2024
Escape schema and table names in drop statements (fix #562)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants