Multiple components are involved when CSV Fast Importer is executed:
- file
- Ruby File wrapper
- database client (managed by ActiveRecord connection)
- SQL command (COPY for PostgreSQL)
- database server
Encoding must be consistent across all these components. Here is how to specify or check the encoding of each component.
You can get the current file encoding with the file -i [file_path] command (-I on macOS).
Some tools like iconv can modify file encoding.
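If you prefer to stay in Ruby, the same check and conversion can be sketched with the standard library. This is a minimal illustration, not part of CSV Fast Importer: the helper names valid_utf8_file? and convert_file are hypothetical, and the source encoding still has to be known in advance, since Ruby does not guess it.

```ruby
require 'tempfile'

# Hypothetical helper: true when the raw bytes of the file form valid UTF-8.
def valid_utf8_file?(path)
  File.binread(path).force_encoding(Encoding::UTF_8).valid_encoding?
end

# Hypothetical helper: writes a copy of `source` to `target`, transcoded
# from the `from` encoding to the `to` encoding (similar in spirit to iconv).
def convert_file(source, target, from:, to: 'UTF-8')
  content = File.binread(source).force_encoding(from)
  File.binwrite(target, content.encode(to))
end

# Build a small ISO-8859-1 sample: byte 0xE9 is "e acute" in that encoding.
latin1 = Tempfile.new(['knights', '.csv'])
latin1.binmode
latin1.write("name\nRen\xE9\n".b)
latin1.close

utf8_path = "#{latin1.path}.utf8"
puts valid_utf8_file?(latin1.path)
convert_file(latin1.path, utf8_path, from: 'ISO-8859-1')
puts valid_utf8_file?(utf8_path)
```

A valid UTF-8 check only tells you that the bytes can be read as UTF-8; it does not prove the file was written in that encoding.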
File uses the default Ruby encoding (given by Encoding.default_external; see External / Internal Encoding), which might be different from the file encoding!
File.new 'path/to/file.csv'
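To see which encoding Ruby assumes in that case, you can compare Encoding.default_external with the encoding reported by an opened file. The snippet below is a small sketch; the assumed_encoding helper is a hypothetical name introduced for illustration.

```ruby
require 'tempfile'

# Hypothetical helper: returns the encoding Ruby assumes for a file that is
# opened for reading without any encoding option.
def assumed_encoding(path)
  File.open(path) { |file| file.external_encoding }
end

sample = Tempfile.new(['sample', '.csv'])
sample.close

puts "Encoding.default_external:  #{Encoding.default_external}"
puts "File opened without option: #{assumed_encoding(sample.path)}"
```

Both lines should print the same encoding name, which depends on your environment (locale, Ruby options), not on the file content.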
However, you can specify the encoding with the encoding parameter:
File.new 'path/to/file.csv', encoding: 'ISO-8859-1'
Ruby File can also handle internal and external encodings (see File::new), which can be useful to manage automatic conversion:
File.new 'path/to/file.csv', external_encoding: 'ISO-8859-1', internal_encoding: 'UTF-8'
# or
File.new 'path/to/file.csv', encoding: 'ISO-8859-1:UTF-8'
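The effect of the external / internal pair can be observed on a small sample file. The sketch below writes ISO-8859-1 bytes to a temporary file and reads them back as UTF-8; the first_line helper and the sample content are hypothetical and only serve the illustration.

```ruby
require 'tempfile'

# Hypothetical helper: reads the first line of a file stored in the
# `external` encoding and returns it transcoded to the `internal` encoding.
def first_line(path, external:, internal: 'UTF-8')
  File.open(path, external_encoding: external, internal_encoding: internal) do |file|
    file.gets
  end
end

# Sample file in ISO-8859-1: byte 0xE9 is "e acute" in that encoding.
csv = Tempfile.new(['knights', '.csv'])
csv.binmode
csv.write("Ren\xE9;Lancelot\n".b)
csv.close

line = first_line(csv.path, external: 'ISO-8859-1')
puts line.encoding
puts line.valid_encoding?
```

The returned string is tagged as UTF-8 and the accented character is stored as a two-byte UTF-8 sequence, so downstream code no longer needs to know the original file encoding.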
The database is accessed through a dedicated client. This client is managed by ActiveRecord with some configuration (database.yml in a Rails application) where the encoding parameter can be defined.
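As an illustration of where this setting lives, the sketch below embeds a hypothetical database.yml entry and reads its encoding key with Ruby's standard YAML library. The adapter, database name and encoding value are assumptions made for the example, and a real Rails database.yml may also contain ERB or YAML aliases, which this simplified parsing does not handle.

```ruby
require 'yaml'

# Hypothetical database.yml content; every value here is illustrative only.
DATABASE_YML = <<~YAML
  development:
    adapter: postgresql
    database: knights_development
    encoding: unicode
YAML

# Hypothetical helper: returns the encoding configured for the given
# environment, or nil when the environment or the key is missing.
def configured_encoding(yaml_source, environment)
  config = YAML.safe_load(yaml_source)
  config.dig(environment, 'encoding')
end

puts configured_encoding(DATABASE_YML, 'development')
```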
By default, the COPY and LOAD DATA INFILE commands follow the database client encoding configuration, but you can override this with a dedicated parameter. This is the purpose of CSV Fast Importer's encoding parameter.
Each PostgreSQL database is encoded with a specific character set. You can list the databases and their encodings with the following command:
psql -l
Or, from the psql client:
\l