[Bug] [connector-file-ftp] Accessing an FTP server in a different subnet fails to retrieve the FTP directory listing: Get file list failed #8493

Open
2 of 3 tasks
justlkp opened this issue Jan 10, 2025 · 0 comments
justlkp commented Jan 10, 2025

Search before asking

  • I had searched in the issues and found no similar issues.

What happened

When the FTP server is in the same network segment as the machine running SeaTunnel, the FTP connector reads files normally. When the same FTP service is accessed through an IP address in a different segment, the connector cannot retrieve the file list and reports that the file does not exist. Under the same circumstances, an FTP client such as FileZilla accesses the server without problems. Repeated tests gave the same result: the FTP connector works only within the same network segment.

To clarify further:

  • When the FTP server and the connector are on the same subnet, file reading works as expected.
  • When the FTP server and the machine running the connector are on different subnets, the job fails with "Get file list from this path [/test/test.csv] failed".
  • FileZilla connects to the FTP server successfully from either subnet, which suggests the problem is specific to the FTP connector's configuration or limitations rather than to the server.

The problem was reproduced repeatedly against both vsftpd 3.0.2 and vsftpd 3.0.3.
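One hypothesis worth ruling out, which this report does not confirm, is a data-connection problem rather than a missing file. FTP uses a separate data connection for listings. In passive mode the server advertises the data endpoint in a 227 reply, and behind NAT that reply can carry a private address the client cannot reach. The sketch below parses such a reply with the standard library only. PasvReplyCheck and its methods are hypothetical helper names, not part of SeaTunnel, and 203.0.113.10 is a documentation placeholder because the real public address is elided in this report.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical diagnostic helper for illustration; not part of SeaTunnel. */
final class PasvReplyCheck {
    // RFC 959 passive reply body: (h1,h2,h3,h4,p1,p2)
    private static final Pattern PASV =
            Pattern.compile("\\((\\d+),(\\d+),(\\d+),(\\d+),(\\d+),(\\d+)\\)");

    /** Extracts "host:port" from a 227 reply; the port is p1 * 256 + p2. */
    static String dataEndpoint(String reply) {
        Matcher m = PASV.matcher(reply);
        if (!m.find()) {
            throw new IllegalArgumentException("Not a PASV reply: " + reply);
        }
        String host = m.group(1) + "." + m.group(2) + "." + m.group(3) + "." + m.group(4);
        int port = Integer.parseInt(m.group(5)) * 256 + Integer.parseInt(m.group(6));
        return host + ":" + port;
    }

    /**
     * Returns true when the server advertises a private (site-local) address
     * that differs from the host used for the control connection.
     */
    static boolean looksNatted(String reply, String controlHost) {
        String endpoint = dataEndpoint(reply);
        String dataHost = endpoint.substring(0, endpoint.lastIndexOf(':'));
        try {
            // For an IPv4 literal, getByName only validates the format (no DNS lookup).
            boolean privateAddress = InetAddress.getByName(dataHost).isSiteLocalAddress();
            return privateAddress && !dataHost.equals(controlHost);
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("Bad address in PASV reply: " + dataHost, e);
        }
    }
}
```

If a packet capture or a verbose FTP client log shows a reply for which looksNatted returns true, the server's passive address settings would be the next thing to inspect. If the connector never sends PASV at all, the problem is more likely on the active-mode side.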

successful config

env {
  parallelism = 1
  job.mode = "BATCH"
}

source {
  FtpFile {
    result_table_name = "fake2"
    host = "192.168.8.44"
    port = 19872
    user = "test"
    password = "123456"
    path = "/test/test.csv"
    file_filter_pattern = "test.csv"
    file_format_type = "csv"
    schema = {
      fields {
        name = "string"
        age = "string"
      }
    }
  }
}

transform {
}

sink {
  Jdbc {
    url = "jdbc:mysql://192.168.8.42:3306/test?useUnicode=true&characterEncoding=utf8&zeroDateTimeBehavior=convertToNull&useSSL=false&serverTimezone=GMT%2B8"
    driver = "com.mysql.cj.jdbc.Driver"
    user = "test"
    password = "123456"
    database = "sea_test"
    table = "dev_user_ftp"
    schema_save_mode = "RECREATE_SCHEMA"
    data_save_mode = "APPEND_DATA"
    generate_sink_sql = true
  }
}

fail config

env {
  parallelism = 1
  job.mode = "BATCH"
}

source {
  FtpFile {
    result_table_name = "fake2"
    host = "111.xxx.xx.xxx"
    port = 19872
    user = "test"
    password = "123456"
    path = "/test/test.csv"
    file_filter_pattern = "test.csv"
    file_format_type = "csv"
    schema = {
      fields {
        name = "string"
        age = "string"
      }
    }
  }
}

transform {
}

sink {
  Jdbc {
    url = "jdbc:mysql://192.168.8.42:3306/test?useUnicode=true&characterEncoding=utf8&zeroDateTimeBehavior=convertToNull&useSSL=false&serverTimezone=GMT%2B8"
    driver = "com.mysql.cj.jdbc.Driver"
    user = "test"
    password = "123456"
    database = "sea_test"
    table = "dev_user_ftp"
    schema_save_mode = "RECREATE_SCHEMA"
    data_save_mode = "APPEND_DATA"
    generate_sink_sql = true
  }
}
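One setting that may be worth testing with the failing config, offered as an assumption rather than a confirmed fix: the FtpFile source documentation lists a connection_mode option with the values active_local (the default) and passive_local. In active mode the server opens the data connection back to the client, which often fails across subnets or NAT, whereas FileZilla uses passive mode by default. Assuming the option is available in 2.3.8, the failing source block could be retried with only that one line added:

```hocon
source {
  FtpFile {
    result_table_name = "fake2"
    host = "111.xxx.xx.xxx"
    port = 19872
    user = "test"
    password = "123456"
    # Assumption: switch the data connection to passive mode (default is active_local)
    connection_mode = "passive_local"
    path = "/test/test.csv"
    file_filter_pattern = "test.csv"
    file_format_type = "csv"
    schema = {
      fields {
        name = "string"
        age = "string"
      }
    }
  }
}
```

If the job still fails in passive mode, the vsftpd passive address and port-range settings, and any firewall between the two subnets, would be the next things to check.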

SeaTunnel Version

2.3.8

SeaTunnel Config

seatunnel:
  engine:
    http:
      enable-http: true
      port: 6086
    history-job-expire-minutes: 1440
    backup-count: 1
    queue-type: blockingqueue
    print-execution-info-interval: 60
    print-job-metrics-info-interval: 60
    slot-service:
      dynamic-slot: true
    checkpoint:
      interval: 300000
      timeout: 60000
      storage:
        type: hdfs
        max-retained: 3
        plugin-config:
          namespace: /tmp/seatunnel/checkpoint_snapshot
          storage.type: hdfs
          fs.defaultFS: file:///tmp/ # Ensure that the directory has written permission
    telemetry:
      metric:
        enabled: false

Running Command

./bin/seatunnel.sh --config ./config/testftp2mysqlremote.template -m local

Error Exception

Exception in thread "main" org.apache.seatunnel.core.starter.exception.CommandExecuteException: SeaTunnel job executed failed
	at org.apache.seatunnel.core.starter.seatunnel.command.ClientExecuteCommand.execute(ClientExecuteCommand.java:213)
	at org.apache.seatunnel.core.starter.SeaTunnel.run(SeaTunnel.java:40)
	at org.apache.seatunnel.example.engine.SeaTunnelEngineExample.main(SeaTunnelEngineExample.java:43)
Caused by: org.apache.seatunnel.connectors.seatunnel.file.exception.FileConnectorException: ErrorCode:[FILE-03], ErrorDescription:[Get file list failed] - Get file list from this path [/test/test.csv] failed
	at org.apache.seatunnel.connectors.seatunnel.file.ftp.source.FtpFileSource.prepare(FtpFileSource.java:96)
	at org.apache.seatunnel.engine.core.parse.JobConfigParser.parseSource(JobConfigParser.java:83)
	at org.apache.seatunnel.engine.core.parse.MultipleTableJobConfigParser.parseSource(MultipleTableJobConfigParser.java:370)
	at org.apache.seatunnel.engine.core.parse.MultipleTableJobConfigParser.parse(MultipleTableJobConfigParser.java:209)
	at org.apache.seatunnel.engine.client.job.ClientJobExecutionEnvironment.getLogicalDag(ClientJobExecutionEnvironment.java:114)
	at org.apache.seatunnel.engine.client.job.ClientJobExecutionEnvironment.execute(ClientJobExecutionEnvironment.java:182)
	at org.apache.seatunnel.core.starter.seatunnel.command.ClientExecuteCommand.execute(ClientExecuteCommand.java:160)
	... 2 more
Caused by: java.io.FileNotFoundException: File /test/test.csv does not exist.
	at org.apache.seatunnel.connectors.seatunnel.file.ftp.system.SeaTunnelFTPFileSystem.getFileStatus(SeaTunnelFTPFileSystem.java:523)
	at org.apache.seatunnel.connectors.seatunnel.file.ftp.system.SeaTunnelFTPFileSystem.listStatus(SeaTunnelFTPFileSystem.java:439)
	at org.apache.seatunnel.connectors.seatunnel.file.ftp.system.SeaTunnelFTPFileSystem.listStatus(SeaTunnelFTPFileSystem.java:421)
	at org.apache.seatunnel.connectors.seatunnel.file.hadoop.HadoopFileSystemProxy.listStatus(HadoopFileSystemProxy.java:154)
	at org.apache.seatunnel.connectors.seatunnel.file.source.reader.AbstractReadStrategy.getFileNamesByPath(AbstractReadStrategy.java:110)
	at org.apache.seatunnel.connectors.seatunnel.file.ftp.source.FtpFileSource.prepare(FtpFileSource.java:92)
	... 8 more

Process finished with exit code 1
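
The nested FileNotFoundException may be misleading. SeaTunnelFTPFileSystem appears to follow the pattern of Hadoop's FTPFileSystem.getFileStatus, which lists the parent directory and scans the result for the file name. Under that assumption, an empty listing (for example, because the data connection never opened) is reported as "does not exist" even when the file is present on the server. The sketch below simulates the pattern with plain collections. ListingLookupSketch and its methods are hypothetical names, and no real FTP call is made.

```java
import java.io.FileNotFoundException;
import java.util.List;

/** Hypothetical simulation of a lookup-by-parent-listing; not SeaTunnel code. */
final class ListingLookupSketch {

    /** Finds the file by scanning the names returned from listing its parent directory. */
    static String getFileStatus(String path, List<String> parentListing)
            throws FileNotFoundException {
        String name = path.substring(path.lastIndexOf('/') + 1);
        for (String entry : parentListing) {
            if (entry.equals(name)) {
                return path; // found in the parent listing
            }
        }
        // An empty or failed listing ends up here too, indistinguishable from a missing file.
        throw new FileNotFoundException("File " + path + " does not exist.");
    }

    /** Convenience wrapper: true when the lookup would report the file as missing. */
    static boolean reportsMissing(String path, List<String> parentListing) {
        try {
            getFileStatus(path, parentListing);
            return false;
        } catch (FileNotFoundException e) {
            return true;
        }
    }
}
```

With a listing that contains test.csv the lookup succeeds. With an empty listing the same path is reported as missing, which matches the message in the trace above without proving that the file is absent.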

Zeta or Flink or Spark Version

No response

Java or Scala Version

Java 1.8.0_202

Screenshots

[screenshot not reproduced]

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Code of Conduct

justlkp added the bug label Jan 10, 2025