-
Notifications
You must be signed in to change notification settings - Fork 925
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Brave News]: Build feed off the main thread #21833
Conversation
Looks like we definitely don't need this part, but also seems like it doesn't remove the ~startup requests for data since the TabHelper (created for every Tab's WebContents) ensures the data has started to be fetched in its ctor. And this data is needed for the feature to work. So, unless we're going to delay using the feature away from the 1st WebContents, then we should probably also make sure we're doing all we can to not block the main thread when requesting, parsing or processing the publisher or feed data. |
da1bc5d
to
ef96a98
Compare
cc89d7d
to
6539ac2
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems sensible to remove the prefetch and do that separately if the thread moving is easier to reason
auto articles = GetArticleInfos( | ||
builder.publishers_controller_->GetLastLocale(), feed_items, | ||
builder.publishers_controller_->GetLastPublishers(), builder.signals_); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regardless of logical invariant of the builder's state,
to me it feel like accessing builder's member variable and publisher_controller is exposing us to potential risk of thread safety. If we're to do this, PublisherController's GetLastLocale() and others should be guaranteed of thread safety.
I think bind all recipes like publisher ids, locale, last publishers, signal from the main thread would be much safer.
(GenerateAllfeed as well)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay cool, I was considering that but got scared off by the fact that we were doing this for perf. Honestly, it'd make me feel a lot better about this to do it that way 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
got scared off by the fact that we were doing this for perf
Yeah, now I see where you're coming from. Your concern makes sense when data is huge. I think we can try using RefCountedThreadSafe and others. Let me try a bit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest not use RefCountedThreadSafe
, it makes the code much harder to maintain in terms of avoiding data races.
There are a few alternatives:
- Post everything each time you need as extra arguments as a copy in function/static method/lambda (depends on the data, could be costly).
- Make a task runner inner class with necessary data. See FooService here: https://source.chromium.org/chromium/chromium/src/+/main:docs/threading_and_tasks_testing.md?q=OnTaskRunnerDeleter%20lang:md
@sangwoo108 thanks for the feedback - I'm going to copy everything into the lambda. I've got some compile errors, so going to push tomorrow 😄 |
Does the PR cover all the heavy request processing or the other usage will be fixed separately? |
Also it's not clear about After this change the callbacks start to be invoked async => if the object is destroyed we get UAF. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! @atuchin-m , I think this PR is safe now. Could you make the final call? I think we can apply scoped_ref partially on problematic data later, if copying cost turns out to be too expensive.
|
||
return feed; | ||
FeedGenerationInfo info( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder what it's the typical size of FeedGenerationInfo
and how much CPU we need to construct it (including all the for-loops here)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sadly, 11ms also sounds like an issue for UI thread, especially for Android. UI thread is in charge of making frames each 33ms on average, therefore heavy computation is prohibited here.
From the upstream doc:
Do not perform expensive computation or blocking IO on the main thread
(a.k.a. “UI” thread in the browser process) or IO thread (each process's thread for receiving IPC).
A busy UI / IO thread can cause user-visible latency, so prefer running that work on the thread pool.
I believe the only way to solve this issue completely is to move all the inner things to a dedicated sequence (aka virtual thread in the pool).
A good example in Chromium code.
Otherwise, it's about impossible to manipulate such large structures on UI thread.
I'm not against solving the issue step-by-step, but I'd like to discuss and sync the final plan before we going to merge an individual PRs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah agreed, we should fix the root cause (that we're on the UI thread) rather than just putting bandaids over things. However, I think it's going to be a much more significant refactor to move everything into a sequence.
What I'm roughly thinking at the moment is this:
- We land & uplift this PR, roughly as is.
- We refactor
BraveNewsController
to have it's own sequence. Every call to subservices should happen on this sequence (this is going to be complicated because things are quite tightly tied toPrefService
at the moment). - We change this
FeedGenerationInfo
to just take a reference (still handy to tie everything together) and get rid of thebase::ThreadPool
usages here.
Does that sound reasonable? Happy to jump on a call to discuss 😄
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, after a bit more research, I think something a bit like what PasswordStore
does is probably the way to go.
We introduce a BraveNewsControllerFrontEnd
. Lives on the UI thread, has a scoped_refptr
to a BraveNewsController
and SequencedTaskRunner
which is posts tasks to the BraveNewsController
on.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's merge this PR (after we sure that all the callback are good), this should improve the situation.
The next steps are to solve the issue completely. I hope I could provide you some perf metrics to track the progress (in #21876)
4a6d5e6
to
aaed2f5
Compare
[](base::WeakPtr<FeedFetcher> fetcher, FetchFeedCallback callback, | ||
std::tuple<FeedItems, ETags> result) { | ||
// If we've been destroyed, don't run the callback. | ||
if (!fetcher) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Checking fetcher
makes sense. Although, the lifetime of the callback could be different.
Frankly, I can't 100% sure that we don't add new crashes because posting the callback.
The code use a very callback-based approach. It's really hard to surely find all the entry points and make a conclusion that it's still safe..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Im other words, we changed the public contract of the class. Before we run all the callbacks sync, now we could postpone. A callback could be bound not only to FeedFetcher
but to a mojo-related interface (to send the answer) or something else. Better to double check all the callers. Or even make renaming DoSomething(..)
=> DoSomethingAsync(..)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we have changed the the public contract of the class - the callback was never invoked synchronously, only ever after all the combined feeds and direct feeds are fetched (quite apart from the network request, JSON parsing of the combined feed potentially happens in another process).
I added the check that fetcher
wasn't destroyed because it was the only place I could think of where the contract was changing (previously the callback wouldn't be invoked if the FeedFetcher
was destroyed, and the WeakPtr check maintains that guarantee).
[puLL-Merge] - brave/brave-core@21833 DescriptionThis PR introduces several changes across multiple components of the brave-core repository, mainly focused on improving the handling and generation of news feeds in the Brave browser. The main motivation seems to be streamlining the consumption of news feeds by deduplicating content, improving asynchronous processing, and expanding the scope of what can be considered for inclusion in a news feed. ChangesChanges
|
eb6aee8
to
c1a7194
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
DEPS
++
// static | ||
mojom::FeedV2Ptr FeedV2Builder::GenerateAllFeed(FeedGenerationInfo info) { | ||
DVLOG(1) << __FUNCTION__; | ||
DCHECK(!content::BrowserThread::CurrentlyOn(content::BrowserThread::UI)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's not good to add content dependencies to components if they aren't strictly necessary because it makes this code unusable on ios
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn't work here in a static method, but normally you can get the current sequence in the constructor and use that for sequence checks.
Resolves brave/brave-browser#35695
Submitter Checklist:
QA/Yes
orQA/No
;release-notes/include
orrelease-notes/exclude
;OS/...
) to the associated issuenpm run test -- brave_browser_tests
,npm run test -- brave_unit_tests
wikinpm run lint
,npm run presubmit
wiki,npm run gn_check
,npm run tslint
git rebase master
(if needed)Reviewer Checklist:
gn
After-merge Checklist:
changes has landed on
Test Plan:
Learn More
link on the Enable Brave News popup