Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement cache system #55

Merged
merged 11 commits into from
Jul 31, 2024
Merged

Implement cache system #55

merged 11 commits into from
Jul 31, 2024

Conversation

pivilartisant
Copy link
Collaborator

No description provided.

@pivilartisant pivilartisant linked an issue Jul 30, 2024 that may be closed by this pull request
2 tasks
@pivilartisant pivilartisant changed the title improve cache Implemebt cache system Jul 30, 2024
@pivilartisant pivilartisant changed the title Implemebt cache system Implement cache system Jul 30, 2024
pkg/website/cache.go Outdated Show resolved Hide resolved
Copy link
Member

@thomas-senechal thomas-senechal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few comments
Let's try to not use Info log level in functions in pkg unless really required. We should try to use Debug log level instead. Having good practices like that now will allow us to not later require to check all the code to make sure our logs have the right level

Taskfile.yml Outdated Show resolved Hide resolved
cmd/cli/main.go Outdated Show resolved Hide resolved
Comment on lines 18 to 24
// fake last updated date
lastUpdated := time.Now()

// fake creation date
creationDate := time.Now()

shouldFetchWebsite := shouldFetch(fileName, cache, lastUpdated, creationDate)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't really understand how that works 🤔
IMO, we shouldn't compare something with the creation date of the SC, because, the lastUpdate will always be after the creation date (as it's set during SC deploy, and lastUpdate changes at each edit/delete)

So we either need a way to check the last update of the file on the disk, for example using os.Stat and ModTime() on the returned struct, or have a file storing the list of website that are supposed to be cached.

The file approach would allow us to store more informations than just the download date of each website as it could also allow us to store website immutability and potentially MNS resolution. For example, we could have the following structure, and the file would be an array of websites (in json, yaml or any file format we decide on):

type StoredWebsite struct {
	WebsiteAddress string
	Immutable      bool
	DownloadDate   time.Time
	MNS            string
}

And so, if we store an immutable website, we know we don't need to download it again, and we don't need to resolve the MNS domain. We could even imagine a way to warn people if something changed while the website was supposed to be immutable.

Please, note that you do not need to implement all I said, just a way to compare the website download date with the lastUpdate returned by the SC. So, we could go with the os.Stat method for now

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i like the file solution!

Without digging to deep I do have a couple of questions:

  • Users modifying the file by hand
  • The eventual size of the .json/yml (performance issues)

I will implement the os.stat but should this be converted into an issue and addressed in a later sprint ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this be converted into an issue and addressed in a later sprint ?

yes

Users modifying the file by hand

The file could be in a non human readable format to reduce the risk of user corrupting the file

The eventual size of the .json/yml (performance issues)

We could do something to delete website that were downloaded after a specific time, like, 1 month ? But that requires more thinking as immutable websites shouldn't be deleted, but at the same time, if you did not visit an immutable website for a while, you might want to remove it from cache

We could also allocate X Gb for website caching, and delete oldest downloaded websites when reaching the limit

But yeah, need to spend more time thinking abtout this

pkg/websiteManager/websiteManager.go Outdated Show resolved Hide resolved
int/zipper/zipper.go Outdated Show resolved Hide resolved
pkg/websiteManager/websiteManager.go Outdated Show resolved Hide resolved
pkg/websiteManager/websiteManager.go Outdated Show resolved Hide resolved
Taskfile.yml Show resolved Hide resolved
pkg/websiteManager/websiteManager.go Outdated Show resolved Hide resolved
pkg/websiteManager/websiteManager.go Outdated Show resolved Hide resolved
}

// Fetches the website and saves it to the cache.
func fetchAndCache(config *pkgConfig.Config, scAddress string, cache *cache.Cache, fileName string) ([]byte, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you replace it config with NetworkInfo directly ? We don't need the full object, and it might be an issue with the webserver which won't necessarily have the network and everything

Copy link
Member

@thomas-senechal thomas-senechal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good job !

@thomas-senechal thomas-senechal merged commit 3ea96a9 into main Jul 31, 2024
6 checks passed
@thomas-senechal thomas-senechal deleted the improve-cache branch September 5, 2024 09:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add cache to website fetching
2 participants