Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[EPIC] S3 multi-tenancy on tmail-backend #1315

Open
Arsnael opened this issue Nov 18, 2024 · 11 comments
Open

[EPIC] S3 multi-tenancy on tmail-backend #1315

Arsnael opened this issue Nov 18, 2024 · 11 comments

Comments

@Arsnael
Copy link
Member

Arsnael commented Nov 18, 2024

After the latest discussions on this topic on the james ML, it seems that multi-tenancy might be more suitable for implementing it on tmail-backend side.

Initial topic on james backlog: linagora/james-project#5263

We need to groom and find a solution how to achieve a similar result on Tmail side in a smart way

@vttranlina
Copy link
Member

vttranlina commented Nov 20, 2024

After analyzing the related code that uses the method org.apache.james.blob.api.BlobStore#getDefaultBucketName, here's my proposal:
We will modify all class where the BlobStore APIs are used to explicitly pass the calculated BucketName (based on the domain).

For the Tmail-backend module:

  • EncryptedEmailContentStore
  • PublicAssetRepository

We will update the interfaces to include an additional argument, such as a Domain or Tenant.

For example:
Current:
def store(messageId: MessageId, encryptedEmailContent: EncryptedEmailContent): Publisher[Unit]
Proposed:
def store(domain: Optional[Domain], messageId: MessageId, encryptedEmailContent: EncryptedEmailContent): Publisher[Unit]

And using MailboxSession to bypass Domain value

For the James site (classes from james-project module):

We will create new classes specifically for Tmail-backend and use Guice to rebind them:

  • DeleteMessageListener
  • CassandraAttachmentMapper (and related class, eg: StoreAttachmentManager...)
  • CassandraMessageDAOV3
  • DeletedMessageVaultDeletionCallback
  • ExportService

To reduce effort, we will focus on BlobStore for storing body data. For metadata, we can retain the existing logic and continue using a shared default bucket.

@quantranhong1999
Copy link
Member

My idea:

  • Make BucketName an interface in James (Or make the BucketName overriable), and provide the James default implementation for it as of today DefaultBucketName.

  • Implement a TenancyBucketName implementation on TMail, which carries on the domain information.
    How to propagate domain information from James callers to TMail blob store:

    Introduce in James a BucketNameSupplier:

    @FunctionalInterface
    public interface BucketNameSupplier {
        BucketName apply(BucketName sourceBucketName, Optional<Domain> domain);
    }

    James would implement a DefaultBucketNameSupplier that is tenant agnostic:

    public class DefaultBucketNameSupplier implements BucketNameSupplier {
        @Override
        public BucketName apply(BucketName sourceBucketName, Optional<Domain> domain) {
            return sourceBucketName;
        }
    }

    Meanwhile, TMail would implement its TMailBucketNameSupplier:

    public class TMailBucketNameSupplier implements BucketNameSupplier {
        @Override
        public BucketName apply(BucketName sourceBucketName, Optional<Domain> domain) {
            //Take domain/tenant information into account, and return a TenancyBucketName
        }
    }

    Then we would try on James side to make the callers provide the tenant/domain information at best.

  • On TMail, implement a TenancyS3BlobStoreDAO that applies the tenancy logic (bucket/prefix/sse-c) and, of course, takes into account the TenancyBucketName with tenant information.
    Then replace S3BlobStoreDAO binding with TenancyS3BlobStoreDAO, especially for the Postgres-app.

@hungphan227
Copy link
Contributor

UploadRepository implementations may be also needed to override

@Arsnael
Copy link
Member Author

Arsnael commented Nov 27, 2024

I think we should avoid as much as possible (if not entirely) code changes on James side, and I feel like the changes proposed above would enrage the community again.

How about implementing a TenancyBlobStoreDAO that would take as arguments:

  • BlobStoreDAO : the blobstore dao that we are using in the end to read/write on object storage (S3, Postgres, Cassandra, ...)
  • TenancyBucketNameResolver : would be responsible to resolve the bucket name if tenancy is enabled and the tenancy conf is on bucket.
  • TenancyBlobIdResolver: would be responsible to resolve the blobId if tenancy is enabled and the tenancy conf is prefix for example.

Then TenancyBlobStoreDAO would reimplement the read and write operations, resolve the bucket name depending if we have a domain or not, resolve the blobId depending if we have a prefix for domain or not, then call the function of the blobStoreDAO with those resolved names.

For ssec key derivation, I believe just adding like in here https://github.com/apache/james-project/pull/2493/files#diff-0ba4d3f414a7fb4f8a8843347bf31ee1e51fe6b4a7cfef4b4a878ec7a2e5a311 the DerivedCustomerKeyFactory with proper conf on tmail with a module chooser for the bindings might do the trick.

TenancyBucketNameResolver:

public class TenancyBucketNameResolver {
    public static BucketName resolve(BucketName bucketName, Optional<Domain> maybeDomain) {
        Preconditions.checkNotNull(bucketName);

        return BucketName.of(maybeDomain.map(domain -> domain + "-" + bucketName.asString())
            .orElse(bucketName.asString())
    }
}

TenancyBlobIdResolver:

public class TenancyBlobIdResolver {
    private final BlobId.Factory blobIdFactory;

    @Inject
    public TenancyBlobIdResolver(BlobId.Factory blobIdFactory) {
        this.blobIdFactory = blobIdFactory;
    }

    public BlobId resolve(BlobId blobId, Optional<Domain> maybeDomain) {
        Preconditions.checkNotNull(blobId);

        return blobIdFactory.of(maybeDomain.map(domain -> domain + "-" + blobId.asString())
            .orElse(blobId.asString())
    }
}

We then likely avoid changing code on james side?

However I still have some remaining issues that I'm thinking of:

  • let's not forget about edge cases mentioned by Hung like UploadRepository but also the Deleted message vault
  • how do we get the domain in the first place??? I've been wrapping my head around it to no success, and I don't see other propositions mentioning it either. How do we get the proper Username that is reading or writing a blob so we are able to identify the Domain that he belongs to? Ideas? Cause BlobStoreDAO does not hold this info (I believe). Even going up one layer and doing the same with a BlobStore (TenancyBlobStore) still does not seem to give us access to that

@chibenwa
Copy link
Member

My original idea would be to record on James side the mailboxSession onto the reactive context (like for MDC)

Then, we simply need to read the context within TMail MultiTenantS3BlobStoreDAO to actually solve this.

@vttranlina
Copy link
Member

My original idea would be to record on James side the mailboxSession onto the reactive context (like for MDC)

Then, we simply need to read the context within TMail MultiTenantS3BlobStoreDAO to actually solve this.

It looks great at first.
I'm think to we need to find all the code that invokes the blobStoreDao.save/read methods and write context at the point where we have the mailboxSession
It seems like RequestAware#startProcessingRequest was born to resolve with situations like this? but I'm not sure it is possible or not.
Maybe we need a POC first

@chibenwa
Copy link
Member

Maybe we need a POC first

+1

@Arsnael
Copy link
Member Author

Arsnael commented Dec 2, 2024

let's not forget about edge cases mentioned by Hung like UploadRepository but also the Deleted message vault

no need to care actually with this proposal, the TenancyBlobStoreDAO will do the underlying job

@Arsnael
Copy link
Member Author

Arsnael commented Dec 4, 2024

Of course it's just a proposition of tasks here and we are still in grooming phase, up for discussion:

  • ADR => [Multitenant] ADR #1356
  • [JAMES] Propagate domain onto the reactive context (potentially POC needed)
  • Implement TenancyBlobStoreDAO (conf: none)
  • Implement TenancyBucketNameResolver (conf: bucket)
  • Implement TenancyBlobIdResolver (conf: prefix)
  • Implement SSEC key derivation (conf: ssec with key derivation based on domain)
  • Performance tests

Explanation of the task list choice:

  • Based on adding conf similar to multi-tenancy.mode=none|bucket|prefix|ssec
  • TenancyBlobStoreDAO would be first implemented with conf none, so it would just call the underlying blobstoredao without doing much else
  • Then conf bucket with the TenancyBucketNameResolver and proper changes in TenancyBlobStoreDAO (bucket conf is possible for all blobstoredaos I believe)
  • Then conf prefix with the TenancyBlobIdResolver (prefix conf only for S3 blobstore?)
  • Then conf ssec key derivation per domain (ssec key derivation conf only for s3 blobstore)

Feel free to argue :)

@chibenwa
Copy link
Member

chibenwa commented Dec 4, 2024

Please stop grooming multi-tenancy and help on operational topics.
On the next 2 week I will be unable to follow and we shall focus on customer deployment,
(did not read)

@Arsnael
Copy link
Member Author

Arsnael commented Dec 5, 2024

Roger

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants