-
Notifications
You must be signed in to change notification settings - Fork 344
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Enhancement: Add user feedback for responses #287
Conversation
Added basic feedback mechanism for responses generated by the chatbot. The feedbacks are stored in DynamoDB which can be queried to do analysis as required by admin users. In future we can add a UI page to display the feedbacks, but for now these are being stored and manual analysis would be required. The feedbacks are not adding to the learning of the chatbot.
@massi-ang could you please have a look? |
Hi @azaylamba thanks for your submission. I really like this feature as it opens up to offline analyses and more importantly fine tuning of models where possible and I think this should be the end goal. Though, I think this should be stored in a form of dataset that then can be used for fine-tuning. See Bedrock's prepare for finetuning docs which requires dataset in the following format:
Now, in this case, we can bring on more fields such as
and stored in S3 maybe with a Infrequent Access storage class. The advantages would be:
You can even store 1 object per feedback, Athena, Glue/EMR will do the heavy-lifting to merge them when needed. |
@bigadsoleiman Thanks for your input. I will try the approach you suggested. |
…mal information. This was good for beta phase but was not useful if we want to do analysis on it using other services like Athena/Glue etc. As part of this change, we have changed the mechanism and the feedback is being stored in S3 bucket now. We are storing the feedback in the following format: { "feedbackId": "<feedbackId>", "sessionId": "<sessionId>", "userId": "<userId>", "key": "<key>", "prompt": "<question>", "completion": "<generated text>", "model": "<model name>", "feedback": "thumbsUp/thumbsDown", "createdAt": "<timestamp>" } This feedback can be used by AWS Athena/Glue etc to do analysis and also to train the model if required. We are keeping one object per feedback in S3 'STANDARD_IA' class.
@bigadsoleiman I have modified the code to store the user feedback in S3 bucket as per the suggested format. Please have a look. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!! going in right direction.
One thing I would change is the value of the feedback, I would avoid strings such as thumbsUp/thumbsDown
as it's easier in a context of EMR, Athena to deal with numbers especially when it comes to aggregations so I'd rather use 0/1
Unless there is a specific reason I'm missing for using thumbsUp/thumbsDown
, let me know.
Key=feedbackId, | ||
Body=json.dumps(item), | ||
ContentType="application/json", | ||
StorageClass='STANDARD_IA', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's handle storage class in CDK through lifecycle rules instead.
something along these line should ensure that objects are transitioned to the right class and it's easier to handle more complex rules in future
const userFeedbackBucket = new s3.Bucket(this, 'UserFeedbackBucket', {
...
lifecycleRules: [{
transitions: [{
storageClass: s3.StorageClass.INFREQUENT_ACCESS,
transitionAfter: cdk.Duration.days(0), // Transition immediately after creation
}],
}],
});
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The minimum allowed value of transitionAfter days is 30. So we cannot use transitionAfter: cdk.Duration.days(0),
The recommended approach to directly store the objects in STANDARD_IA is setting the storage class at object level.
@@ -39,7 +40,31 @@ export class ChatBotS3Buckets extends Construct { | |||
], | |||
}); | |||
|
|||
const userFeedbackBucket = new s3.Bucket(this, "UserFeedbackBucket", { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's set the lifecycle rules here for simplicity and having them handled in one place (instead of object creation)
const userFeedbackBucket = new s3.Bucket(this, 'UserFeedbackBucket', {
...
lifecycleRules: [{
transitions: [{
storageClass: s3.StorageClass.INFREQUENT_ACCESS,
transitionAfter: cdk.Duration.days(0), // Transition immediately after creation
}],
}],
});
}
transferAcceleration: true, | ||
enforceSSL: true, | ||
serverAccessLogsBucket: logsBucket, | ||
cors: [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why CORS are enabled here? We are not dealing with objects directly from web clients
blockPublicAccess: s3.BlockPublicAccess.BLOCK_ALL, | ||
removalPolicy: cdk.RemovalPolicy.DESTROY, | ||
autoDeleteObjects: true, | ||
transferAcceleration: true, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't think we need transfer acceleration here right?
Changed the user feedback type from string to number as number are easy to handle in Athena and other such tools.
@bigadsoleiman Addressed the review comments related to cors, transfer acceleration and feedback type. |
thanks for your inputs @bigadsoleiman |
Enhancement: Add user feedback for responses
Issue #151
Description of changes: Added basic feedback mechanism for responses generated by the chatbot.
As part of this change, we have stored the user feedback in S3 bucket.
We are storing the feedback in the following format:
{
"feedbackId": "",
"sessionId": "",
"userId": "",
"key": "",
"prompt": "",
"completion": "",
"model": "",
"feedback": "1/0",
"createdAt": ""
}
This feedback can be used by AWS Athena/Glue etc to do analysis and also to train the model if required. We are keeping one object per feedback in S3 'STANDARD_IA' class.
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.