
proposal: database/sql: add configuration option to ignore null values in Scan #57099

Open
MisterSquishy opened this issue Dec 6, 2022 · 3 comments

MisterSquishy commented Dec 6, 2022

like #28414, but for all null values

copied from #57066:

Currently, when scanning a NULL database value into a go type that is not nil-able, this package returns an error like "sql: Scan error on column index 0, name "foo": converting NULL to int is unsupported". This makes logical sense; nil and 0 are different values, and only the latter can be assigned to an int. The Null* types (NullString, NullInt64, etc) account for this difference, giving programmers the ability to safely bridge the gap between nullable SQL values and primitive go types.
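For concreteness, here is a minimal sketch of both paths; the foo table, bar column, and fetchBar helper are hypothetical:

package example

import (
	"database/sql"
	"fmt"
)

// fetchBar illustrates the current behavior: scanning a NULL column into a
// plain string fails, while sql.NullString makes the NULL case explicit.
func fetchBar(db *sql.DB, id int) error {
	var bar string
	// If the stored value is NULL, this returns an error like:
	//   sql: Scan error on column index 0, name "bar": converting NULL to string is unsupported
	if err := db.QueryRow(`SELECT bar FROM foo WHERE id = ?`, id).Scan(&bar); err != nil {
		return err
	}

	var nbar sql.NullString
	if err := db.QueryRow(`SELECT bar FROM foo WHERE id = ?`, id).Scan(&nbar); err != nil {
		return err
	}
	if nbar.Valid {
		fmt.Println("bar =", nbar.String)
	} else {
		fmt.Println("bar is NULL")
	}
	return nil
}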

I propose we give programmers the option to change that behavior and instead ignore NULL values. There are a few reasons I think this option makes sense:

Alignment with programmer intent

In go, if I want a struct with a string field, I would write type foo struct{ bar string }
In mysql, if I want a table with a string column, I would write CREATE TABLE foo (bar varchar(255));

But this produces incompatible types -- in mysql, any column can be NULL unless you explicitly say otherwise. If we wanted to enforce that foo.bar could never be NULL, we would have to write CREATE TABLE foo (bar varchar(255) NOT NULL); instead. This nuance is seldom clear to programmers, and is a lesson that generally only gets learned the hard way. For instance, if you google "go sql null", you will get back numerous pages of blog posts, Stack Overflow posts, etc -- whether or not it is "right", the undeniable fact is that this behavior subverts programmer expectations on a regular basis.

Furthermore, when a programmer writes some go code to fetch sql data and work with it in a go program, choosing to type their variable as a non-nilable type is a signal that they do not need to write special handling for the null case. It isn't up to us to determine whether or not they should; by writing their code this way, the programmer has indicated that they have no use for this distinction. When we force them to care (at runtime, often long after the code has been written, when we unexpectedly encounter a NULL value in the database), this produces broken software that isn't always straightforward to triage/remediate.

Fundamentally, the responsibility of this package is to bridge the gap between sql and go. Accounting for the disparate treatment of NULL in these two languages is currently the responsibility of the programmer, but by reconciling this difference in the translation layer, we can prevent bugs and make it easier to write great go code.

Consistency with json unmarshaling behaviors

As shown in this playground, unmarshaling json with NULL values into non-nilable types already works according to the programmer expectations described above -- we leave these values untouched. Whether or not the json unmarshaler is "right" is up for debate (see #33835), but it will likely have to remain the default behavior forever. That these two standard library packages have opposite opinions is unfortunate and probably can never be changed; however, we compound the issue by giving programmers no mechanism to make them behave the same way. So every go programmer has to be aware of this distinction and design their types correctly for the environment(s) that they expect to instantiate them from. This is clunky at best, and in practice it creates vast swaths of bug habitat.

It would be great to introduce an analogous configuration option in the standard json unmarshaler (or perhaps even reuse the same configuration flag), but that is out of scope for this issue.

For the above reasons, it's important to give programmers the ability to opt out of this behavior. Null* types and pointers to primitives remain viable for situations where we need to distinguish between NULL and other values, but where the programmer does not care about this distinction, we should allow them this freedom.

I'm not totally sure of the most standard way to specify this bit of configuration; I think a global setting (rather than per connection, transaction, or statement) would likely suffice.

@gopherbot gopherbot added this to the Proposal milestone Dec 6, 2022

rittneje commented Dec 6, 2022

Why is this necessary over using COALESCE/IFNULL in the SQL query itself?

unmarshaling json with NULL values into non-nilable types already works according to the programmer expectations described above

The JSON unmarshaler does not turn null into the zero value. Rather, it treats this the same as the field being absent.
https://go.dev/play/p/iTpNi9LewMR
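A minimal reproduction of that behavior (not necessarily the exact playground code, but it demonstrates the same point):

package main

import (
	"encoding/json"
	"fmt"
)

type foo struct {
	Bar string
}

func main() {
	f := foo{Bar: "initial"}
	// A JSON null does not zero the field; it is simply skipped,
	// exactly as if the key were absent from the document.
	if err := json.Unmarshal([]byte(`{"Bar": null}`), &f); err != nil {
		panic(err)
	}
	fmt.Println(f.Bar) // prints "initial"
}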

choosing to type their variable as a non-nilable type is a signal that they do not care about the distinction between NULL and zero

It could also mean that they have an application bug. You could make the same argument about Go turning nil pointer dereferences into the corresponding zero value.

MisterSquishy (Author) commented

Why is this necessary over using COALESCE/IFNULL in the SQL query itself?

This is preferable as it doesn't require the programmer to know/remember whether or not a given SQL column is nullable. It's not strictly necessary, though; you're right that writing all your queries this way would avoid the problem too (as would remembering to include NOT NULL in the definition of every column you intend to someday scan into a non-nilable type).
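For reference, this is roughly what the query-side approach looks like; the table and column names are hypothetical, and it assumes database/sql is already imported:

// fetchBarCoalesced shows the query-side approach: COALESCE substitutes ''
// for NULL inside SQL, so Scan never sees a NULL and a plain string works.
func fetchBarCoalesced(db *sql.DB, id int) (string, error) {
	var bar string
	err := db.QueryRow(`SELECT COALESCE(bar, '') FROM foo WHERE id = ?`, id).Scan(&bar)
	return bar, err
}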

The JSON unmarshaler does not turn null into the zero value. Rather, it treats this the same as the field being absent

This is a great point, thanks! I was mistaken about how that worked. The sql package should be able to behave like the JSON unmarshaler, so I think we should instead give the option to ignore null values (and not coalesce them to zero). I'll update the proposal to be less prescriptive.

It could also mean that they have an application bug. You could make the same argument about Go turning nil pointer dereferences into the corresponding zero value.

Agreed that there are scenarios where programmers would not want this behavior (as described in #33835). We should allow programmers to make this decision for themselves.

@MisterSquishy MisterSquishy changed the title proposal: database/sql: optionally coalesce nulls to zero proposal: database/sql: add configuration option to ignore null values in Scan Dec 6, 2022
@ianlancetaylor ianlancetaylor moved this to Incoming in Proposals Dec 7, 2022

hungtcs commented Apr 27, 2023

This problem has bothered me for days.
I finally realized that the data cannot be scanned directly into the destination; my current workaround is to use a data wrapper.

import (
	"time"
)

// DatabaseValueScanner captures whatever value the driver returns, including
// NULL, instead of forcing a conversion into a non-nilable Go type.
type DatabaseValueScanner struct {
	Value            any
	Name             string
	DatabaseTypeName string
}

func (scanner *DatabaseValueScanner) Scan(src any) error {
	switch value := src.(type) {
	case int8:
		scanner.Value = value
	// ...
	// ...
	case []uint8:
		scanner.Value = string(value)
	case time.Time:
		scanner.Value = value
	case nil:
		scanner.Value = nil // NULL stays nil instead of causing a conversion error
	default:
		scanner.Value = value
	}
	return nil
}

Here is a common use case

// One scanner per column; columnTypes comes from rows.ColumnTypes().
columnTypes, err := rows.ColumnTypes()
if err != nil {
	return err
}

pointers := make([]any, len(columnTypes))
for i := range pointers {
	pointers[i] = &DatabaseValueScanner{
		Name:             columnTypes[i].Name(),
		DatabaseTypeName: columnTypes[i].DatabaseTypeName(),
	}
}

for rows.Next() {
	if err := rows.Scan(pointers...); err != nil {
		return err
	}

	// Collect the raw values for this row, preserving nil for NULL columns.
	values := make([]any, 0, len(pointers))
	for _, valueScanner := range pointers {
		values = append(values, valueScanner.(*DatabaseValueScanner).Value)
	}
	dataset.Rows = append(dataset.Rows, values)
}
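Along the same lines, here is a minimal sketch of a per-column workaround that gives the "leave the destination untouched on NULL" semantics this proposal asks for. The NullIgnoringString type is hypothetical (not an existing database/sql API), and it assumes "fmt" is imported wherever it lives:

// NullIgnoringString is a hypothetical Scanner: on NULL it leaves the
// destination at whatever value it already holds (mirroring encoding/json),
// instead of returning a conversion error.
type NullIgnoringString struct {
	Dest *string
}

func (n NullIgnoringString) Scan(src any) error {
	switch v := src.(type) {
	case nil:
		return nil // NULL: keep the existing value
	case string:
		*n.Dest = v
	case []byte:
		*n.Dest = string(v)
	default:
		return fmt.Errorf("NullIgnoringString: unsupported type %T", v)
	}
	return nil
}

Scanning with rows.Scan(NullIgnoringString{Dest: &bar}) then behaves like the proposed option, but only for the columns the caller wraps explicitly, which is the kind of boilerplate the proposal aims to remove.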
