s3 source stream doesn't have cursor field defined #575

Open
risky-rickman opened this issue Jan 15, 2025 · 0 comments


I am trying out PyAirbyte and have configured the script below with credentials that work fine in Airbyte Cloud. All of the config spec checks pass, but I keep getting an error. The logs appear to show that the stream's cursor field is not defined, but I don't see any place to specify the cursor in the config schema, and in Airbyte Cloud the cursor appears to be implicitly the modified date associated with the file in the S3 bucket.

Here is my script (with parts redacted):

import airbyte as ab


source = ab.get_source("source-s3")
source.set_config(
    {
        "bucket": "REDACTED",
        "aws_access_key_id": "REDACTED",
        "aws_secret_access_key": "REDACTED",
        "streams": [
            {
                "name": "all data",
                "globs": ["**/REDACTED*.csv"],
                "start_date": "2020-01-01T00:00:00.000000Z",
                "format": {
                    "filetype": "csv",
                },
                "primary_key": "REDACTED",
            }
        ],
    }
)
source.check()
# source.print_config_spec(output_file="./source.yml")

destination = ab.get_destination("destination-mysql")
destination.set_config(
    {
        "database": "REDACTED",
        "host": "REDACTED",
        "password": "REDACTED",
        "port": 3306,
        "username": "REDACTED",
    }
)
destination.check()
# destination.print_config_spec(output_file="./destination.yml")
destination.write(source)

And here is the error I am getting in the log files:

2025-01-15 09:06:31 - INFO - ERROR main i.a.c.i.b.s.SshWrappedDestination(getSerializedMessageConsumer):136 Exception occurred while getting the delegate consumer, closing SSH tunnel java.lang.NullPointerException: Cannot invoke "java.util.List.size()" because the return value of "io.airbyte.protocol.models.v0.ConfiguredAirbyteStream.getCursorField()" is null
at io.airbyte.integrations.base.destination.typing_deduping.CatalogParser.toStreamConfig(CatalogParser.kt:134) ~[airbyte-cdk-typing-deduping-0.33.0.jar:?]
at io.airbyte.integrations.base.destination.typing_deduping.CatalogParser.parseCatalog(CatalogParser.kt:29) ~[airbyte-cdk-typing-deduping-0.33.0.jar:?]
at io.airbyte.cdk.integrations.destination.jdbc.AbstractJdbcDestination.getV2MessageConsumer(AbstractJdbcDestination.kt:294) ~[airbyte-cdk-db-destinations-0.33.0.jar:?]
at io.airbyte.cdk.integrations.destination.jdbc.AbstractJdbcDestination.getSerializedMessageConsumer(AbstractJdbcDestination.kt:264) ~[airbyte-cdk-db-destinations-0.33.0.jar:?]
at io.airbyte.cdk.integrations.base.ssh.SshWrappedDestination.getSerializedMessageConsumer(SshWrappedDestination.kt:130) [airbyte-cdk-core-0.33.0.jar:?]
at io.airbyte.cdk.integrations.base.IntegrationRunner.runInternal(IntegrationRunner.kt:208) [airbyte-cdk-core-0.33.0.jar:?]
at io.airbyte.cdk.integrations.base.IntegrationRunner.run(IntegrationRunner.kt:116) [airbyte-cdk-core-0.33.0.jar:?]
at io.airbyte.integrations.destination.mysql.MySQLDestination$Companion.main(MySQLDestination.kt:234) [io.airbyte.airbyte-integrations.connectors-destination-mysql.jar:?]
at io.airbyte.integrations.destination.mysql.MySQLDestination.main(MySQLDestination.kt) [io.airbyte.airbyte-integrations.connectors-destination-mysql.jar:?]
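
For anyone reading the trace: the NullPointerException is raised because getCursorField() returned null for a configured stream, i.e. the catalog entry handed to the destination has no cursor_field list at all. Below is a minimal sketch of that condition using plain dicts shaped like the Airbyte protocol's configured catalog JSON. It does not call PyAirbyte or the destination; the helper names (streams_missing_cursor, fill_cursor_fields) are hypothetical, the example cursor column name is an assumption for illustration, and whether supplying a non-null list actually avoids the error in destination-mysql is untested here.

```python
from copy import deepcopy
from typing import Any


def streams_missing_cursor(catalog: dict[str, Any]) -> list[str]:
    """Return names of configured streams whose cursor_field is absent or null."""
    return [
        entry["stream"]["name"]
        for entry in catalog.get("streams", [])
        if entry.get("cursor_field") is None
    ]


def fill_cursor_fields(catalog: dict[str, Any]) -> dict[str, Any]:
    """Return a copy in which every null cursor_field becomes a list.

    The stream's default_cursor_field is used when present; otherwise an
    empty list is used so the field is at least non-null.
    """
    patched = deepcopy(catalog)
    for entry in patched.get("streams", []):
        if entry.get("cursor_field") is None:
            default = entry["stream"].get("default_cursor_field") or []
            entry["cursor_field"] = list(default)
    return patched


# Hand-written example catalog (not captured from a real run).
example_catalog = {
    "streams": [
        {
            "stream": {
                "name": "all data",
                "json_schema": {},
                "supported_sync_modes": ["full_refresh", "incremental"],
                "source_defined_cursor": True,
                # Assumed column name, for illustration only.
                "default_cursor_field": ["_ab_source_file_last_modified"],
            },
            "sync_mode": "incremental",
            "destination_sync_mode": "append",
            # cursor_field is intentionally left out, mirroring the error.
        }
    ]
}

print(streams_missing_cursor(example_catalog))
patched_catalog = fill_cursor_fields(example_catalog)
print(streams_missing_cursor(patched_catalog))
```

The first helper only reports which streams would trip the null check; the second returns a patched copy rather than mutating its input, so the original catalog can still be compared against it.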
