[Subtask] Support Column and its default value #5202

Open
Tracked by #5198
unknowntpo opened this issue Oct 21, 2024 · 6 comments
@unknowntpo
Contributor

unknowntpo commented Oct 21, 2024

Describe the subtask

@xunliu @SophieTech88 This issue should also implement public class Column at api/src/main/java/org/apache/gravitino/rel/Column.java

Parent issue

#5198

@unknowntpo unknowntpo added the subtask Subtasks of umbrella issue label Oct 21, 2024
@unknowntpo unknowntpo changed the title from "[Subtask] Support Column default value" to "[Subtask] Support Column and its default value" Dec 23, 2024
@tsungchih

tsungchih commented Feb 23, 2025

Hi, I am George. I was wondering if I could take this one provided that there's no one working on it.

@justinmclean
Member

Sure, just be aware this is a slightly old issue and it may have already been done.

@tsungchih

tsungchih commented Feb 24, 2025

Thanks for the prompt reply and the reminder. So far, I have not seen any code related to this issue in the repo. After digging into the code for a couple of hours, I came up with the following list of Java classes that need to be implemented in client-python.

  • Column
  • ColumnDTO
  • SupportsTags
  • Tag
  • PartitionUtils
  • FieldReferenceDTO
  • FuncExpressionDTO
  • LiteralDTO
  • UnparsedExpressionDTO
  • FunctionArg
  • TypesSerializer
  • TypesDeserializer
  • ColumnDefaultValueSerializer
  • ColumnDefaultValueDeserializer
  • NoSuchTagException
  • TagAlreadyExistsException

Could you please take a look at it to see if I missed something?
I feel the PR for this issue will be rather large. Would you prefer to have everything in a single PR or split across multiple PRs?

Here's an initial version of the class diagram for your reference.

---
title: Class Diagram
---
classDiagram
    class Tag {
        <<interface>>
        +PROPERTY_COLOR = "color"
        +name(self) str
        +comment(self) str
        +properties(self) Dict[str, str]
        +inherited(self) Optional[bool]
    }
    class SupportsTags {
        <<interface>>
        +list_tags(self) List[str]
        +list_tags_info(self) List[Tag]
        +get_tag(self, name: str) Tag
        +associate_tags(self, tags_to_add: List[str], tags_to_remove: List[str]) List[str]
    }
    class Column {
        <<interface>>
        +name(self) str
        +data_type(self) Type
        +comment(self) Optional[str]
        +nullable(self) bool
        +auto_increment(self) bool
        +default_value(self) Expression
        +supports_tags(self) SupportsTags
        +of(name: str, data_type: Type, comment: Optional[str]=None, nullable: bool=True, auto_increment: bool=False, default_value: Optional[Expression]=None) ColumnImpl
    }
    class ColumnImpl {
        __init__(self, name: str, data_type: Type, comment: Optional[str], nullable: bool, auto_increment: bool, default_value: Optional[Expression])
    }
    class ColumnDTO {
        -name: str
        -data_type: Type
        -comment: Optional[str]
        -nullable: bool
        -auto_increment: bool
        -default_value: Expression
    }
    class ColumnDefaultValueSerde {
        +serialize(cls, data: Expression) Optional[dict]
        +deserialize(cls, data: Optional[dict]) Expression
    }
    class FieldReferenceDTO {
        -field_name: List[str]
        __init__(self, field_name: List[str])
        -field_name(self) List[str]
        -arg_type(self) FunctionArg.ArgType
    }
    class FuncExpressionDTO {
        -function_name: str
        -function_args: List[FunctionArg]
        __init__(self, function_name: str, function_args: List[FunctionArg])
        +args(self) List[FunctionArg]
        +function_name(self) str
        +arguments(self) List[Expression]
        +arg_type(self) FunctionArg.ArgType
    }
    class PartitionUtils {
        +validate_field_existence(columns: List[ColumnDTO], field_name: List[str]) None
    }
    class FunctionArg {
        <<interface>>
        EMPTY_ARGS: list[FunctionArg] = []
        +arg_type() ArgType
        +validate(columns: list[ColumnDTO]) None
    }
    class LiteralDTO~str~ {
        +NULL: LiteralDTO = LiteralDTO("NULL", Types.NullType.get())
        -value: str
        -data_type: Type
        __init__(self, value: str, data_type: Type) None
        +value() str
        +data_type() Type
        +arg_type() FunctionArg.ArgType
        __repr__(self) str
    }
    class SerdeBase~T,U~ {
        <<interface>>
        +serialize(cls, data: T) U
        +deserialize(cls, data: U) T
    }
    class TypesSerde {
        +serialize(cls, data: Type) str
        +deserialize(cls, data: str) Type
    }
    
    Auditable <|-- Tag
    Expression <|-- Literal

    Tag <.. SupportsTags
    SupportsTags <.. Column
    Column <|.. ColumnImpl: implements
    Column <|.. ColumnDTO: implements
    ColumnDTO <.. FunctionArg
    ColumnDTO <.. PartitionUtils
    DataClassJsonMixin <|.. ColumnDTO: implements
    SerdeBase <|.. TypesSerde: implements
    SerdeBase <|.. ColumnDefaultValueSerde: implements
    Expression <|-- FunctionArg
    PartitionUtils <.. FunctionArg
    TypesSerde <.. ColumnDTO
    ColumnDefaultValueSerde <.. ColumnDTO

    Literal <|.. LiteralDTO: implements
    FunctionArg <|.. LiteralDTO: implements
    NamedReference <|.. FieldReferenceDTO: implements
    FunctionArg <|.. FieldReferenceDTO: implements
    FunctionArg <|.. UnparsedExpressionDTO: implements
    UnparsedExpression <|.. UnparsedExpressionDTO: implements
    FunctionExpression <|.. FuncExpressionDTO: implements
    FunctionArg <|.. FuncExpressionDTO: implements
    LiteralDTO <.. ColumnDefaultValueSerde
    FieldReferenceDTO <.. ColumnDefaultValueSerde
    FuncExpressionDTO <.. ColumnDefaultValueSerde
    UnparsedExpressionDTO <.. ColumnDefaultValueSerde
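To make the Column and ColumnImpl boxes in the diagram more concrete, here is a minimal sketch of how they might look in Python. It is only an illustration under a few assumptions: `Type` and `Expression` are replaced by `Any` stand-ins, `supports_tags` is left out for brevity, and the validation in `__init__` is a guess at reasonable behavior rather than something decided in this issue.

```python
from abc import ABC, abstractmethod
from typing import Any, Optional


class Column(ABC):
    """Interface sketch following the Column box in the class diagram."""

    @abstractmethod
    def name(self) -> str: ...

    @abstractmethod
    def data_type(self) -> Any: ...  # Any is a stand-in for Type

    @abstractmethod
    def comment(self) -> Optional[str]: ...

    @abstractmethod
    def nullable(self) -> bool: ...

    @abstractmethod
    def auto_increment(self) -> bool: ...

    @abstractmethod
    def default_value(self) -> Any: ...  # Any is a stand-in for Expression

    @staticmethod
    def of(
        name: str,
        data_type: Any,
        comment: Optional[str] = None,
        nullable: bool = True,
        auto_increment: bool = False,
        default_value: Optional[Any] = None,
    ) -> "ColumnImpl":
        """Factory with the same defaults as the `of` signature in the diagram."""
        return ColumnImpl(
            name, data_type, comment, nullable, auto_increment, default_value
        )


class ColumnImpl(Column):
    def __init__(
        self,
        name: str,
        data_type: Any,
        comment: Optional[str],
        nullable: bool,
        auto_increment: bool,
        default_value: Optional[Any],
    ) -> None:
        # Assumed validation: reject blank names and missing data types.
        if not name or not name.strip():
            raise ValueError("Column name cannot be empty")
        if data_type is None:
            raise ValueError("Column data type cannot be None")
        self._name = name
        self._data_type = data_type
        self._comment = comment
        self._nullable = nullable
        self._auto_increment = auto_increment
        self._default_value = default_value

    def name(self) -> str:
        return self._name

    def data_type(self) -> Any:
        return self._data_type

    def comment(self) -> Optional[str]:
        return self._comment

    def nullable(self) -> bool:
        return self._nullable

    def auto_increment(self) -> bool:
        return self._auto_increment

    def default_value(self) -> Any:
        return self._default_value


# Usage: only name and data_type are required; the rest fall back to defaults.
col = Column.of("id", "integer", nullable=False, auto_increment=True)
```

Keeping the factory on the interface means callers never need to import `ColumnImpl` directly, which matches the `of(...) ColumnImpl` entry in the diagram.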

@justinmclean
Member

Smaller PRs are easier to review so if you can break it up into several PRs that would be a great help.

@tsungchih

The plan is to have four PRs. Classes implemented in each PR are listed as follows for your reference.

  • PART1: Column, SupportsTags, Tag, NoSuchTagException, TagAlreadyExistsException
  • PART2: FunctionArg, PartitionUtils, ColumnDTO, TypesSerializer, TypesDeserializer
  • PART3: LiteralDTO, FieldReferenceDTO, FuncExpressionDTO, UnparsedExpressionDTO
  • PART4: ColumnDefaultValueSerializer, ColumnDefaultValueDeserializer
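As a rough illustration of how PART3 and PART4 might fit together, the sketch below pairs a simplified `LiteralDTO` with a serde built on the `SerdeBase` interface from the class diagram. Several things here are assumptions and not part of the plan: `data_type` is a plain string standing in for `Type`, the class name `LiteralDefaultValueSerde` is hypothetical, the JSON keys (`type`, `dataType`, `value`) are a guess at the wire format, and a missing default value is represented as `None`.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Generic, Optional, TypeVar

T = TypeVar("T")
U = TypeVar("U")


class SerdeBase(ABC, Generic[T, U]):
    """Generic serializer/deserializer pair, as in the SerdeBase box."""

    @classmethod
    @abstractmethod
    def serialize(cls, data: T) -> U: ...

    @classmethod
    @abstractmethod
    def deserialize(cls, data: U) -> T: ...


@dataclass(frozen=True)
class LiteralDTO:
    value: str
    data_type: str  # stand-in for Type (assumption)


class LiteralDefaultValueSerde(SerdeBase[Optional[LiteralDTO], Optional[dict]]):
    """Hypothetical serde that handles literal default values only."""

    @classmethod
    def serialize(cls, data: Optional[LiteralDTO]) -> Optional[dict]:
        if data is None:
            return None
        # Assumed JSON layout; the real key names may differ.
        return {"type": "literal", "dataType": data.data_type, "value": data.value}

    @classmethod
    def deserialize(cls, data: Optional[dict]) -> Optional[LiteralDTO]:
        if data is None:
            return None
        if data.get("type") != "literal":
            raise ValueError(f"Unsupported expression type: {data.get('type')}")
        return LiteralDTO(value=data["value"], data_type=data["dataType"])


# Usage: a literal default value survives a serialize/deserialize round trip.
literal = LiteralDTO(value="0", data_type="integer")
payload = LiteralDefaultValueSerde.serialize(literal)
restored = LiteralDefaultValueSerde.deserialize(payload)
```

A full `ColumnDefaultValueSerde` would presumably dispatch on the `type` field to `FieldReferenceDTO`, `FuncExpressionDTO`, and `UnparsedExpressionDTO` as well, which is why the DTOs in PART3 come before the serde in PART4.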

@justinmclean
Member

Thanks, that sounds fine to me.

tsungchih added a commit to tsungchih/gravitino that referenced this issue Feb 26, 2025
add implementation of class Column

apache#5202

Signed-off-by: George T. C. Lai <[email protected]>
tsungchih added a commit to tsungchih/gravitino that referenced this issue Feb 28, 2025
rename method to follow snak_case convention

apache#5202

Signed-off-by: George T. C. Lai <[email protected]>

3 participants