Skip to content

Latest commit

 

History

History
69 lines (57 loc) · 4.25 KB

mssql.md

File metadata and controls

69 lines (57 loc) · 4.25 KB

MsSQL

SQLServer does not need to specify protocol.

MsSQL Connection

if the user password has special characters, they need to be sanitized. example: `from urllib import parse; password = parse.quote_plus(password)`
import connectorx as cx
conn = 'mssql://username:password@server:port/database?encrypt=true&trusted_connection=true'         # connection token
query = 'SELECT * FROM table'                                   # query string
cx.read_sql(conn, query)                                        # read data from MsSQL

Connection Parameters

  • By adding trusted_connection=true to connection uri parameter, windows authentication will be enabled.
    • Example: mssql://host:port/db?trusted_connection=true
  • By adding encrypt=true to connection uri parameter, SQLServer will use SSL encryption.
    • Example: mssql://host:port/db?encrypt=true&trusted_connection=true
  • By adding trust_server_certificate=true to connection uri parameter, the SQLServer certificate will not be validated and it is accepted as-is.
    • Example: mssql://host:port/db?trust_server_certificate=true&encrypt=true
  • By adding trust_server_certificate_ca=/path/to/ca-cert.crt to connection uri parameter, the SQLServer certificate will be validated against the given CA certificate in addition to the system-truststore.
    • Example: mssql://host:port/db?encrypt=true&trust_server_certificate_ca=/path/to/ca-cert.crt

SQLServer-Pandas Type Mapping

SQLServer Type Pandas Type Comment
TINYINT int64, Int64(nullable)
SMALLINT int64, Int64(nullable)
INT int64, Int64(nullable)
BIGINT int64, Int64(nullable)
FLOAT float64
NUMERIC float64
DECIMAL float64 cannot support precision larger than 28
BIT bool, boolean(nullable)
VARCHAR object
CHAR object
TEXT object
NVARCHAR object
NCHAR object
NTEXT object
VARBINARY object
BINARY object
IMAGE object
DATETIME datetime64[ns]
DATETIME2 datetime64[ns]
SMALLDATETIME datetime64[ns]
DATE datetime64[ns]
DATETIMEOFFSET datetime64[ns]
TIME object
UNIQUEIDENTIFIER object

Performance (r5.4xlarge docker in another EC2 instance)

Modin does not support read_sql on Mssql

  • Time chart, lower is better.

time chart

  • Memory consumption chart, lower is better.

memory chart

In conclusion, ConnectorX uses 3x less memory and 14x less time compared with Pandas.