Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore: backup support for new multitenant system #2549

Merged
merged 39 commits into from
Nov 4, 2022
Merged

Conversation

BonapartePC
Copy link
Contributor

@BonapartePC BonapartePC commented Oct 11, 2022

Description

Supporting backup upload for the multitenant system. The following are the changes:

  1. We support using self-storage by the customer now
  2. In a table with multiple customers, we might have to send different customers to different storage. With this PR, it should be possible.
  3. Every gz file will now have a workspace id at the end attached to the file name.
  4. Depending on the user preferences, proc_errors and gw dumps will be uploaded if they are set to true.

Notion Ticket

https://www.notion.so/rudderstacks/Allow-Customers-to-use-self-storage-26f45f75e38b4ef6bd872bea6187aa06
https://www.notion.so/rudderstacks/Divide-customers-events-in-jobs-table-and-upload-them-in-respective-storages-04dbf3dbd7bd435786e6a3808bed73ed

Security

  • The code changed/added as part of this pull request won't create any security issues with how the software is being used.

@codecov
Copy link

codecov bot commented Oct 11, 2022

Codecov Report

Base: 43.74% // Head: 43.65% // Decreases project coverage by -0.08% ⚠️

Coverage data is based on head (774836b) compared to base (00ba231).
Patch coverage: 77.36% of modified lines in pull request are covered.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #2549      +/-   ##
==========================================
- Coverage   43.74%   43.65%   -0.09%     
==========================================
  Files         187      191       +4     
  Lines       39993    40486     +493     
==========================================
+ Hits        17494    17675     +181     
- Misses      21403    21704     +301     
- Partials     1096     1107      +11     
Impacted Files Coverage Δ
config/backend-config/types.go 0.00% <0.00%> (ø)
jobsdb/jobsdb.go 68.79% <20.00%> (-0.02%) ⬇️
services/fileuploader/gzwriter.go 64.00% <64.00%> (ø)
processor/stash/stash.go 21.39% <73.68%> (ø)
jobsdb/backup.go 76.30% <74.46%> (+1.39%) ⬆️
services/fileuploader/fileuploader.go 96.96% <96.96%> (ø)
processor/manager.go 86.04% <100.00%> (+0.33%) ⬆️
processor/processor.go 71.21% <100.00%> (-0.04%) ⬇️
middleware/middleware.go 86.20% <0.00%> (-8.63%) ⬇️
services/rsources/handler.go 69.72% <0.00%> (-1.39%) ⬇️
... and 5 more

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

config/backend-config/types.go Outdated Show resolved Hide resolved
config/backend-config/types.go Outdated Show resolved Hide resolved
@atzoum atzoum requested a review from saurav-malani October 24, 2022 07:58
@atzoum
Copy link
Contributor

atzoum commented Oct 24, 2022

@BonapartePC #2542 introduced a major refactoring on most backup functions. Furthermore, all backup-related functions are now moved in a separate file, backup.go. @saurav-malani may provide you with some more information regarding the scope of this aforementioned refactoring.
Please resolve all relevant conflicts, in order to proceed with the review.

services/backup/backup.go Outdated Show resolved Hide resolved
@cisse21 cisse21 assigned cisse21 and unassigned cisse21 Oct 26, 2022
@cisse21 cisse21 self-requested a review October 26, 2022 08:04
@atzoum atzoum requested a review from saurav-malani November 3, 2022 17:53
Copy link
Contributor

@atzoum atzoum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My only concern is test coverage.
Such large refactorings should increase overal code coverage instead of decreasing it.

@BonapartePC
Copy link
Contributor Author

My only concern is test coverage. Such large refactorings should increase overal code coverage instead of decreasing it.

The coverage is a little better now. Most of the impacted files are the ones where the fileuploader is just passed but there are no tests that result in decreasing coverage a bit.

@@ -68,7 +70,6 @@ func (jd *HandleT) backupDSLoop(ctx context.Context) {
opID := jd.JournalMarkStart(backupDSOperation, opPayload)
err = jd.backupDS(ctx, backupDSRange)
if err != nil {
stats.Default.NewTaggedStat("backup_ds_failed", stats.CountType, stats.Tags{"customVal": jd.tablePrefix, "provider": config.GetString("JOBS_BACKUP_STORAGE_PROVIDER", "S3")}).Increment()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this not change the semantics of the backup_ds_failed metric which would count all failures (including issues while calling cleanStatusTable, which gets skipped if you remove this). I see this has been moved to the uploadDumps function with a different set of labels. Should we not keep this as is and rename the stat in uploadDumps to somethinglike upload_dump_failed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had to remove this here because at the point where we are making this stat we can't determine the provider tag. This will be different for each workspace which was previously taken from the env.

jobsdb/backup.go Outdated
Comment on lines 132 to 133
g, _ := errgroup.WithContext(ctx)
g.SetLimit(config.GetInt("JobsDB.JobsBackupUploadWorkers", 100))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question: Should this not be

g, ctx := errgroup.WithContext(ctx)

and we should be passing the derived ctx is passed to all the goroutines

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense. added

jobsdb/backup.go Outdated
Comment on lines 182 to 197
dumps, err := jd.createTableDumps(getFailedOnlyBackupQueryFn(backupDSRange), getFileName, totalCount)
if err != nil {
return fmt.Errorf("error while creating table dump: %w", err)
}
defer func() { _ = os.Remove(path) }()

err = jd.uploadTableDump(ctx, path)
defer func() {
for _, filePath := range dumps {
os.Remove(filePath)
}
}()
err = jd.uploadDumps(ctx, dumps)
if err != nil {
jd.logger.Errorf("[JobsDB] :: Failed to upload table %v", tableName)
return err
return fmt.Errorf("error while uploading dumps for table: %s: %w", tableName, err)
}

stats.Default.NewTaggedStat("total_TableDump_TimeStat", stats.TimerType, stats.Tags{"customVal": jd.tablePrefix}).Since(start)
return nil
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor: This code is almost identical to the code in lines 241-256, 294-311 factorize it maybe?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We decided to leave it this way so it is clearly shown as different steps being taken

jobsdb/backup.go Outdated
err = jd.uploadTableDump(ctx, path)
defer func() {
for _, filePath := range dumps {
os.Remove(filePath)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_ = os.Remove(filePath)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Made the change

Comment on lines +60 to +69
func (p *provider) GetFileManager(workspaceID string) (filemanager.FileManager, error) {
<-p.init
settings, ok := p.storageSettings[workspaceID]
if !ok {
return nil, fmt.Errorf("no storage settings found for workspace: %s", workspaceID)
}
return filemanager.DefaultFileManagerFactory.New(&filemanager.SettingsT{
Provider: settings.Bucket.Type,
Config: settings.Bucket.Config,
})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please add a test scenario for this function

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added

Comment on lines +38 to +44
func NewStaticProvider(storageSettings map[string]StorageSettings) Provider {
s := &provider{
init: make(chan struct{}),
storageSettings: storageSettings,
}
close(s.init)
return s
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please add a test scenario for this function

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

}

p.onceInit.Do(func() {
close(p.init)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why this appears to be not covered with tests?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a test for that

Comment on lines +120 to +125
func (*defaultProvider) GetFileManager(_ string) (filemanager.FileManager, error) {
defaultConfig := getDefaultBucket(context.Background(), config.GetString("JOBS_BACKUP_STORAGE_PROVIDER", "S3"))
return filemanager.DefaultFileManagerFactory.New(&filemanager.SettingsT{
Provider: defaultConfig.Type,
Config: defaultConfig.Config,
})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please add a test scenario for this function

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@@ -0,0 +1,62 @@
package fileuploader
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is the rational behind creating this services/fileuploader/gzwriter? I mean it seems to be doing things very specific to our jobsDB backup (atleast as of now). Creating a gzFileHandler which has a map of path & gzipwriter. Where else are we going to use it? If no where, then won’t it be better to move it to jobsdb backup?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea is it can be used anywhere not just jobsdb so I kept it in services folder

@atzoum atzoum requested a review from fracasula November 4, 2022 10:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants