This issue was moved to a discussion.

feat: Mock capability for taps #2845

Closed
ReubenFrankel opened this issue Jan 29, 2025 · 1 comment

Comments

@ReubenFrankel
Contributor

ReubenFrankel commented Jan 29, 2025

Feature scope

Taps (catalog, state, tests, etc.)

Description

Do you see value in adding a mock data capability to SDK taps? In the spirit of #1009 and MeltanoLabs/Singer-Most-Wanted#90 (comment), the idea would be to generate mock records for each stream from its schema (JSON Schema property type, format, enum, etc.) using Faker, possibly reusing or refactoring the current stream maps implementation. This is already possible with stream maps today, but it requires stubbing out each stream property individually. The aim is to streamline that process: reduce the amount of config needed and assume sensible mock values. Another benefit is that taps with statically defined schemas could run without any config or connection to the source, allowing for quick onboarding and prototyping. This would also work for taps that perform dynamic discovery, although config would still be required in most cases.
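To make the idea concrete, here is a minimal sketch of schema-driven record generation. It is not SDK code: `mock_value` and `mock_record` are hypothetical helpers, and the standard library's `random` stands in for Faker so the snippet runs without extra dependencies. The value ranges and the fixed base date are arbitrary assumptions.

```python
import random
import string
from datetime import datetime, timedelta, timezone


def mock_value(schema: dict, rng: random.Random):
    """Return one mock value for a JSON Schema property (hypothetical helper)."""
    if "enum" in schema:
        return rng.choice(schema["enum"])

    json_type = schema.get("type", "string")
    if isinstance(json_type, list):
        # Nullable types such as ["string", "null"]: use the first non-null type.
        json_type = next((t for t in json_type if t != "null"), "null")

    if json_type == "null":
        return None
    if json_type == "integer":
        return rng.randint(0, 1000)
    if json_type == "number":
        return round(rng.uniform(0, 1000), 2)
    if json_type == "boolean":
        return rng.choice([True, False])
    if json_type == "array":
        # Assumption: two items per array is enough for a mock record.
        return [mock_value(schema.get("items", {}), rng) for _ in range(2)]
    if json_type == "object":
        return mock_record(schema, rng)
    if schema.get("format") == "date-time":
        base = datetime(2025, 1, 1, tzinfo=timezone.utc)
        return (base + timedelta(seconds=rng.randint(0, 86_400 * 365))).isoformat()
    # Fallback for plain strings (Faker would supply something more realistic).
    return "".join(rng.choices(string.ascii_lowercase, k=8))


def mock_record(schema: dict, rng: random.Random) -> dict:
    """Build one mock record from an object schema's properties."""
    return {
        name: mock_value(prop, rng)
        for name, prop in schema.get("properties", {}).items()
    }
```

Calling `mock_record(stream_schema, random.Random(0))` repeatedly with the same seed yields the same record, which is the behaviour a seeded Faker instance would also be expected to give.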

Spec

  • New MOCK (or similar) capability, which exposes some SDK-native settings
    capabilities:
    - MOCK
    # perhaps these could be declared per-stream, although some overlap with stream maps
    settings:
    - name: mock_enabled
      kind: boolean
      default: false
    - name: mock_records
      kind: integer
      default: 10
  • Underlying Faker instance seed and locale configurable with faker_config (as with stream maps)
    config:
      faker_config:
        seed: 0
        locale:
        - en_US
        - en_GB
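As a rough illustration of how the settings above might be consumed, the sketch below resolves `mock_enabled`, `mock_records` and `faker_config` from a plain config dict and switches between real and mock records. The function names (`resolve_mock_settings`, `get_records`) and the callback signatures are hypothetical and not part of the SDK, and `random.Random` again stands in for a seeded Faker instance. The `en_US` fallback locale is an assumption.

```python
import random
from typing import Callable, Iterable, Iterator


def resolve_mock_settings(config: dict) -> dict:
    """Apply the proposed defaults: mock_enabled=false, mock_records=10."""
    faker_config = config.get("faker_config") or {}
    locale = faker_config.get("locale", ["en_US"])  # assumed fallback
    if isinstance(locale, str):
        locale = [locale]
    return {
        "enabled": bool(config.get("mock_enabled", False)),
        "records": int(config.get("mock_records", 10)),
        "seed": faker_config.get("seed"),
        "locale": list(locale),
    }


def get_records(
    config: dict,
    fetch_real: Callable[[], Iterable[dict]],
    make_mock: Callable[[random.Random], dict],
) -> Iterator[dict]:
    """Yield real records, or `mock_records` mock ones when mocking is enabled."""
    settings = resolve_mock_settings(config)
    if not settings["enabled"]:
        # Mocking is opt-in, so the source is only contacted in this branch.
        yield from fetch_real()
        return
    rng = random.Random(settings["seed"])
    for _ in range(settings["records"]):
        yield make_mock(rng)
```

Because `fetch_real` is never called when `mock_enabled` is true, a tap with a static schema would not need credentials or a connection in mock mode.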

Considerations

  • Inferring a Faker provider from just the JSON Schema property type/format is fairly straightforward. It may be possible to resolve a more accurate provider by also taking the property name into account, although this would be more of a "best guess"
  • With Meltano: warn if running meltano config <tap> test with mock_enabled: true, as the test could otherwise report a false positive
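The provider inference mentioned in the first consideration could look something like the sketch below. `infer_provider` and the lookup tables are hypothetical. The returned strings are Faker provider method names (for example `email`, `iso8601`, `pyint`), but nothing here imports Faker, and the property-name hints are only a naive best guess based on underscore-separated tokens.

```python
# JSON Schema `format` -> Faker provider method name.
FORMAT_PROVIDERS = {
    "date-time": "iso8601",
    "date": "date",
    "time": "time",
    "email": "email",
    "uri": "uri",
    "uuid": "uuid4",
    "ipv4": "ipv4",
    "ipv6": "ipv6",
    "hostname": "hostname",
}

# JSON Schema `type` -> generic Faker provider method name.
TYPE_PROVIDERS = {
    "integer": "pyint",
    "number": "pyfloat",
    "boolean": "pybool",
    "string": "pystr",
}

# Property-name token -> Faker provider method name (best guess, strings only).
NAME_HINTS = {
    "email": "email",
    "phone": "phone_number",
    "city": "city",
    "country": "country",
    "url": "url",
}


def infer_provider(name: str, schema: dict) -> str:
    """Pick a provider by format first, then property name, then type."""
    fmt = schema.get("format")
    if fmt in FORMAT_PROVIDERS:
        return FORMAT_PROVIDERS[fmt]

    json_type = schema.get("type", "string")
    if isinstance(json_type, list):
        # Nullable types such as ["string", "null"]: use the first non-null type.
        json_type = next((t for t in json_type if t != "null"), "string")

    if json_type == "string":
        for token in name.lower().split("_"):
            if token in NAME_HINTS:
                return NAME_HINTS[token]

    return TYPE_PROVIDERS.get(json_type, "pystr")
```

With a Faker instance `fake`, the result could then be resolved with `getattr(fake, infer_provider(name, schema))()`. Checking `format` before the property name keeps the explicit schema information authoritative and limits the guesswork to untyped strings.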
@edgarrmondragon
Collaborator

Hey @ReubenFrankel 👋

  • How do you imagine invoking a tap in "mock" mode? Something like MELTANO_EXTRACTOR_MOCK_ENABLED meltano run ...?
  • Other than your comment about meltano config <tap> test, do you think most of the work to enable this would be done in the SDK? Do you think non-SDK taps are out of scope for this feature?

@edgarrmondragon edgarrmondragon removed their assignment Feb 6, 2025
@meltano meltano locked and limited conversation to collaborators Feb 6, 2025
@edgarrmondragon edgarrmondragon converted this issue into discussion #2874 Feb 6, 2025
