-
Notifications
You must be signed in to change notification settings - Fork 390
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: add v1 enpoint to set Response
status
#2756
feat: add v1 enpoint to set Response
status
#2756
Conversation
A review @frascuchon and @jfcalvo would be much appreciated! I would like to also hear your opinion. Also, I've some questions:
|
Hi @gabrielmbmb, thanks for the PR. I will take a look to it. Regarding your questions:
Although is not within the immediate scope of the issue, if you have the time and desire to work on it, you're welcome to do so.
I think it's correct to use |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have added some questions that you can answer at your convenience and also included some suggestions to help improve the solution. Please feel free to respond to the questions whenever you have time.
@@ -36,6 +36,7 @@ def upgrade() -> None: | |||
sa.Column("values", sa.JSON, nullable=False), | |||
sa.Column("record_id", sa.Uuid, sa.ForeignKey("records.id", ondelete="CASCADE"), nullable=False, index=True), | |||
sa.Column("user_id", sa.Uuid, sa.ForeignKey("users.id", ondelete="SET NULL"), index=True), | |||
sa.Column("status", sa.String, nullable=False, index=True), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you explain your rationale for modifying this database migration instead of opting to create a new one?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So in the current latest version 1.6.0
, only three tables have been created and managed by alembic workspaces
, users
, and workspaces_users
.
In the branch feat/2615-instructions-task
, migrations files have been created for some other tables. I've opted to modify the migration file instead of generating a new one because I've assumed that as these changes have not been used yet in any production environment, it is better to modify the migration file directly to add this new column, instead of creating a new migration file for just adding one column. One more file, but a little bit more noise.
Of course, if the migrations files in feat/2615-instruction-task
were already applied in any production DB, then the logical thing to do would have been to generate a new migration script to add this new column, so alembic would have detected that there is a revision that has not been applied and that it needs to apply to the DB.
Having that said, for me it was not super clear how are you planning to handle the migrations with alembic
. Maybe it's a good idea to have one single alembic revision/script to update the DB per the released version of the package.
Also, I've realized that alembic revision --autogenerate
command cannot be used to generate migrations automatically because
argilla/src/argilla/server/alembic/env.py
Line 37 in 5e3ecd7
target_metadata = None |
target_metadata
has not been set to argilla/src/argilla/server/database.py
Line 51 in 5e3ecd7
class Base(DeclarativeBase): |
Base.metadata
.
Another doubt that I have, is that I've seen that the attributes nullable
, index
, etc are being set in the migrations files, but not in the mapped_column
function of sqlalchemy
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, I've realized that alembic revision --autogenerate command cannot be used to generate migrations automatically
In your opinion, what are the benefits of utilizing automatic migrations?
Another doubt that I have, is that I've seen that the attributes nullable, index, etc are being set in the migrations files, but not in the mapped_column function of sqlalchemy.
In the same vein as the previous question, what are your views on the advantages of including these attributes in models?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that I can answer both questions at the same time 😁
So if we include these attributes in the models, then alembic
will be available to automatically pick up them and it will generate a statement in the migration script with all the columns constraints, index, etc.
I think that it's really beneficial to use alembic
automatic migrations, because alembic
is able to automatically detect the updates you have done in your model class, or in other words, it is able to generate a diff
between the schema generated with all the previous revisions and the one you're about to generate, and it will only create statements for the things that need to be updated.
So in my opinion, it is better to let alembic
generate the migration file automatically with the SQLAlchemy
model metadata than to let a human generate it manually and increase the chances of forgetting to include something in the migration script.
|
||
class Response(BaseModel): | ||
id: UUID | ||
values: Dict[str, Any] | ||
record_id: UUID | ||
user_id: UUID | ||
status: ResponseStatus |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In src/argilla/server/schemas/v1/datasets.py
, there exists a distinct Response
schema. Would you suggest including the status
field in this schema as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I didn't realize there was this Response
schema. Yes, we should include the status
field here too, otherwise when calling the listing records /api/v1/datasets/{dataset_id}/records
the status for each response won't be there, and the frontend would have to send an additional request to get the status for getting the responses status, which is not ideal.
I think it makes sense to have a ResponseBase
class in src/argilla/server/schemas/v1/responses.py
, this way we avoid code duplication.
- We can create a
Response
class inheriting fromResponseBase
class with additional attributes that have to be exposed when listing responses (like therecord_id
) insrc/argilla/server/schemas/v1/responses.py
. - We can create a
Response
class inheriting fromResponseBase
with no additional attributes that can be used when listing records insrc/argilla/server/schemas/v1/datasets.py
.
hey @jfcalvo, I've changed the status of the PR to ready for review. Is there anything else missing that I should add? |
Thank you, @gabrielmbmb, for your answers. The PR looks fine, and @frascuchon should be reaching out to you soon for the next steps. |
Description
I've added a new endpoint
PUT /api/v1/responses/{response_id}/status
that allows setting the status of aResponse
.To do so, I've added a new column to the
Response
model/table calledstatus
, which right now has 3 possible values:pending
(which is the default value),submitted
, anddiscarded
. I've also updated thealembic
migrations files to include this new column, which is notnullable
(it will have a default value always), and also anindex
will be created for it, as I suppose that we would like to filter by it and having an index will allow querying using a filter on this column faster.Regarding the endpoint definition, I've decided to just add a new
PUT
endpoint to update thestatus
of theResponse
instead of using thePUT /v1/response/{response_id}
endpoint, because I think:As an alternative to this new endpoint, we could have:
PUT /api/v1/responses/{response_id}/submit
andPUT /api/v1/responses/{response_id}/discard
), one for submitting and one for discarding, but I think this is not a good idea, because if a new possible status is introduced, then a new endpoint has to be created and maintained. The logic of both endpoints will be exactly the same, so we would have two endpoints almost identical. Also, this approach would require the client to send a PUT request to a different endpoint depending on the status, and this can make the API harder to use.PUT /api/v1/responses/{response_id}/status/{status}
. This approach has the inconvenience that the client would have to send a PUT request to a different endpoint each time the status has to be changed.Closes #2733
Type of change
(Please delete options that are not relevant. Remember to title the PR according to the type of change)
How Has This Been Tested
Checklist