Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fractional seconds dropped when using fast_executemany with PostgreSQL ODBC #1023

Closed
ghost opened this issue Feb 4, 2022 · 4 comments
Closed

Comments

@ghost
Copy link

ghost commented Feb 4, 2022

Hello,

I've been investigating the issues with ETL that is transferring data from MS SQL to PostgreSQL using following environment:

  • Python 3.8.10 (tags/v3.8.10:3d8993a, May 3 2021, 11:48:03) [MSC v.1928 64 bit (AMD64)] on win32
  • pyodbc: 4.0.32
  • OS: Windows 10
  • DB: MS SQL Server 2019 to PostgreSQL 14 (latest docker image)

Was able to reproduce issue without sourcing data from MS SQL Server, just mocking some datetimes.
datetime2(7) from MS SQL should be saved to timestamp(6) in PosgreSQL with 6 digits in milliseconds.

import datetime
import pyodbc

print('pyodbc version: %s' % pyodbc.version)

destination_pg = "DRIVER={PostgreSQL Unicode};SERVER=localhost;PORT=5432;DATABASE=XXX;UID=XXX;PWD=XXX"
pg_conn = pyodbc.connect(destination_pg, autocommit=True)

pg_cur = pg_conn.cursor()
pg_cur.fast_executemany = True

single = (datetime.datetime(2021, 1, 1, 11, 21, 11, 311114),)

many = [(datetime.datetime(2022, 2, 3, 17, 28, 15, 344014),),
        (datetime.datetime(2022, 2, 4, 12, 48, 47, 309952),)
        ]

print(f"Single = {single}")
print(f"Many = {many}")

pg_cur.execute("INSERT INTO public.data_table VALUES (?)", single)
pg_cur.executemany("INSERT INTO public.data_table VALUES (?)", many)

pg_cur.execute("SELECT datetime_value FROM public.data_table")

inserted = pg_cur.fetchmany(10)
print(f"Inserted = {inserted}")
# pg_cur.execute("truncate table public.data_table")

pg_conn.close()

Actual output:

pyodbc version: 4.0.32
Single = (datetime.datetime(2021, 1, 1, 11, 21, 11, 311114),)
Many = [(datetime.datetime(2022, 2, 3, 17, 28, 15, 344014),), (datetime.datetime(2022, 2, 4, 12, 48, 47, 309952),)]
Inserted = [(datetime.datetime(2021, 1, 1, 11, 21, 11, 311114), ), (datetime.datetime(2022, 2, 3, 17, 28, 15), ), (datetime.datetime(2022, 2, 4, 12, 48, 47), )]

Expected output:

pyodbc version: 4.0.32
Single = (datetime.datetime(2021, 1, 1, 11, 21, 11, 311114),)
Many = [(datetime.datetime(2022, 2, 3, 17, 28, 15, 344014),), (datetime.datetime(2022, 2, 4, 12, 48, 47, 309952),)]
Inserted = [(datetime.datetime(2021, 1, 1, 11, 21, 11, 311114), ), (datetime.datetime(2022, 2, 3, 17, 28, 15, 344014), ), (datetime.datetime(2022, 2, 4, 12, 48, 47, 309952), )]

Notice that milliseconds are not saved into PostgreSQL, when executemany is used to save data.

This is how it looks in database:

image

@gordthompson
Copy link
Collaborator

gordthompson commented Feb 5, 2022

I am able to reproduce the issue. fast_executemany = True causes PostgreSQL ODBC to drop fractional seconds

from datetime import datetime

import pyodbc

connection_string = (
    "DRIVER=PostgreSQL Unicode;"
    "SERVER=192.168.0.199;"
    "UID=scott;PWD=tiger;"
    "DATABASE=test;"
)
cnxn = pyodbc.connect(connection_string, autocommit=True)
print(cnxn.getinfo(pyodbc.SQL_DRIVER_VER))  # 13.02.0000
crsr = cnxn.cursor()
crsr.execute("TRUNCATE TABLE data_table")

data = [(datetime(2022, 2, 3, 17, 28, 15, 344014),)]

crsr.fast_executemany = True
crsr.executemany("INSERT INTO data_table (dt) VALUES (?)", data)

print(crsr.execute("SELECT * FROM data_table").fetchall())
# [(datetime.datetime(2022, 2, 3, 17, 28, 15), )]

postgresql.LOG

The fractional seconds are preserved if fast_executemany = False.

Fractional seconds are preserved with fast_executemany = True when using ODBC Driver 17 for SQL Server.

mssql.LOG

A possible explanation for the difference is that PostgreSQL ODBC returns this

gh1023          1824-ae4	EXIT  SQLDescribeParam  with return code 0 (SQL_SUCCESS)
        HSTMT               0x000000B57096CC00
        UWORD                        1 
        SWORD *             0x000000B570A431C2 (93)
        SQLULEN *           0x000000B570A431C8 (26)
        SWORD *             0x000000B570A431D0 (-1)
        SWORD *             0x000000B57085F458 (1)

whereas msodbcsql returns this

gh1023          1094-a18	EXIT  SQLDescribeParam  with return code 0 (SQL_SUCCESS)
        HSTMT               0x0000007F5ABA1180
        UWORD                        1 
        SWORD *             0x0000007F580E2CA2 (93)
        SQLULEN *           0x0000007F580E2CA8 (27)
        SWORD *             0x0000007F580E2CB0 (7)
        SWORD *             0x0000007F5770EF38 (1)

@gordthompson gordthompson changed the title pyodbc drops milliseconds when using executemany with posgresql milliseconds dropped when using fast_executemany with PostgreSQL ODBC Feb 5, 2022
@ghost
Copy link
Author

ghost commented Feb 7, 2022

Thanks. fast_executemany = False helps.

According to Driver-support-for-fast_executemany this shouldn't impact speed significantly.
One might put a note on that page linking this exceptional case.

@gordthompson
Copy link
Collaborator

The wiki page has been updated.

@v-chojas
Copy link
Contributor

v-chojas commented Feb 7, 2022

https://docs.microsoft.com/en-us/sql/odbc/reference/appendixes/decimal-digits
"The number of digits to the right of the decimal point in the seconds part of the value (fractional seconds). This number cannot be negative."

Looks like buggy ODBC driver.

@gordthompson gordthompson changed the title milliseconds dropped when using fast_executemany with PostgreSQL ODBC fractional seconds dropped when using fast_executemany with PostgreSQL ODBC Feb 8, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants