
feat: Introducing 2FA scrapers infrastructure + OneZero experimental scraper #760

Merged · 9 commits · Apr 8, 2023

Conversation

orzarchi (Contributor) commented Feb 25, 2023

Hi there!
Hopefully this is the first PR in the era of 2FA scrapers.
I've loosely followed the guidelines set up here.

  1. I've created a new base scraper - BaseTwoFactorAuthScraper.
  2. Since the original base-scraper file was getting a bit long, I've cleaned it up: error-related types moved to errors.ts, and public-facing interfaces (e.g. ScraperOptions, ScraperScrapingResult) moved to interface.ts.
  3. I've created the first 2FA scraper - OneZero. It's not perfect, but I've managed to scrape my personal account successfully. See the new documentation in the README to learn more about it.

Some limitations of the OneZero scraper:

  • I've supplied a way to provide a persistent OTP token, but it's limited: you can get one from OneZero that lasts 45 minutes, and it should be possible to refresh it indefinitely, but I haven't yet found the refresh GraphQL calls.
  • Using the OneZero scraper in automated scripts will probably log you out of whatever mobile devices you are connected on - their servers don't appreciate simultaneous logins. Perhaps this can be fixed in the future by supplying more headers and persisting cookies.

closes #765

esakal (Collaborator) commented Mar 5, 2023

@orzarchi thank you for this contribution 💯 . I'll free some time soon to review and play with the PR.

@eshaham @baruchiro @brafdlog as this is a major change in the design / capabilities of the scrapers, you might want to review it as well?

baruchiro (Collaborator):

I need to understand the interface changes more.

Meanwhile, I think you can continue with this change; it won't break our app, since we'll have to enable the new bank manually after picking up your version.

CC: @brafdlog

orzarchi (Contributor, Author) commented Mar 21, 2023

Any news?
@baruchiro for the record, there aren't any interface changes other than exporting more types - and even that can be reverted if necessary.

brafdlog (Contributor):

Hi @orzarchi
I (and I assume Baruch as well) am in a very busy period.
I suggest you don't wait for us. Based on your description there shouldn't be issues, and we'll report back if there are any when we update to the next version of israeli-bank-scrapers.

erikash (Collaborator) left a comment:

Great work :)
Please see my comments

return result.data as Promise<TResult>;
}

export async function fetchGetWithinPage<TResult>(page: Page, url: string): Promise<TResult | null> {
Review comment:

No await -> no need for this to be marked async
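The point above can be shown with a small sketch (the helper name and body here are illustrative, not the library's actual code): a function that never uses `await` can simply return the promise and drop the `async` keyword.

```typescript
// Hypothetical stand-in for a page fetch helper; illustrative only.
// Since nothing is awaited inside, `async` adds no value - the promise
// can be returned directly, avoiding an extra wrapper.
function fetchJson<TResult>(url: string): Promise<TResult | null> {
  void url; // placeholder: a real implementation would fetch this URL
  return Promise.resolve(JSON.parse('{"ok":true}') as TResult);
}

async function demo(): Promise<void> {
  const result = await fetchJson<{ ok: boolean }>('https://example.com/api');
  console.log(result?.ok); // true
}

demo();
```

Callers are unaffected: they can still `await` the returned promise exactly as before.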

ScraperGetLongTermTwoFactorTokenResult, ScraperCredentials,
} from './interface';

export type ScraperLoginResult = ErrorResult | {
Review comment:

Why not move to interface?



export class BaseTwoFactorAuthScraper<TTwoFactorCredentials> implements TwoFactorAuthScraper {
constructor(protected options: ScraperOptions) {
Review comment:

Love it!

}

private async fetchPortfolioMovements(portfolio: Portfolio, startDate: Date): Promise<TransactionsAccount> {
const account = portfolio.accounts[0];
Review comment:

Please open a task for future multi account support

const movements = [];


while (!movements.length || new Date(movements[0].movementTimestamp) >= startDate) {
Review comment:

Use moment and not Date to avoid timezone issues

}
}

movements.sort((x, y) => new Date(x.movementTimestamp).valueOf() - new Date(y.movementTimestamp).valueOf());
Review comment:

Same here


movements.sort((x, y) => new Date(x.movementTimestamp).valueOf() - new Date(y.movementTimestamp).valueOf());

const matchingMovements = movements.filter((movement) => new Date(movement.movementTimestamp) >= startDate);
Review comment:

And here

esakal (Collaborator) commented Mar 28, 2023

@orzarchi @erikash Hi,
Thanks again for this one - supporting OTP is required, as many services are switching to OTP login.

TL;DR

A long time ago I suggested a modular architecture that would let us compose flows from atomic operations, but it was archived, so for now we should assume we will keep the library's current architecture.

We should favor features over infrastructure consistency, since this library is in low-maintenance mode, but I believe we can easily modify the PR to use the existing scrapers while still supporting OTP with @orzarchi's great work.

I verified it with a Max account. You should verify OneZero as well.

Please check out the changes here: #765.

The other code review comments still stand, but they are pretty minor. Feel free to pull the branch one-zero-with-existing-scrapers into your repo and create a PR from there if you prefer; you can also get the diff here.

@orzarchi @erikash - wdyt?

Adjustments done in the PR

  1. Use ScraperCredentials everywhere and adjust its declaration to support OTP
  2. Move the relevant parts from base-two-factor-auth-scraper into base-scraper
  3. Delete base-two-factor-auth-scraper
  4. Extend the OneZero scraper from BaseScraper

Motivation behind my suggestion

The new BaseTwoFactorAuthScraper changes the library's current architecture, established long ago by @eshaham. I see this in a few places:

  1. It doesn't emit the same progress events that the existing scrapers BaseScraper and BaseScraperWithBrowser expose via the onProgress method.
  2. Since OneZero doesn't require puppeteer, the new scraper's implementation is similar to BaseScraper. Once someone needs an OTP scraper with puppeteer support, we will need a fourth base scraper for that.
  3. It makes some assumptions about the scraper credentials which aren't guaranteed once compiled and might break at runtime.
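For context on point 1, the progress reporting in the existing base scrapers is event-based. A minimal sketch of that mechanism (the class and event names here are illustrative, not the library's actual types) might look like:

```typescript
import { EventEmitter } from 'events';

// Illustrative handler shape; the real library defines its own progress types.
type ProgressHandler = (companyId: string, payload: { type: string }) => void;

class SketchScraper {
  private emitter = new EventEmitter();

  // Consumers subscribe once and receive events from every phase.
  onProgress(handler: ProgressHandler): void {
    this.emitter.on('progress', handler);
  }

  private emitProgress(type: string): void {
    this.emitter.emit('progress', 'oneZero', { type });
  }

  async login(): Promise<void> {
    this.emitProgress('StartScraping');
    await Promise.resolve(); // placeholder for real login work
    this.emitProgress('LoginSuccess');
  }
}
```

A scraper that reuses this mechanism keeps existing consumers working unchanged, which is the consistency concern raised above.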

erikash (Collaborator) commented Mar 28, 2023

Great insights @esakal
I agree it's cleaner. I guess that after a few more scrapers we might consider reintroducing base-two-factor-auth-scraper to reuse the login logic, but it's still too early, certainly for this PR.

I've made a small comment here:
#765 (comment)

@esakal: can you rephrase your second comment? I didn't understand it:
"Since one zero doesn't require puppeteer, the new scraper is similar to BaseScraper in the implementation. Once someone will need a an OTP scraper with puppeteer support we will need a fourth scraper for that."

esakal (Collaborator) commented Mar 28, 2023

@erikash thanks :)

"Since one zero doesn't require puppeteer, the new scraper is similar to BaseScraper in the implementation. Once someone will need a an OTP scraper with puppeteer support we will need a fourth scraper for that".

Today we have two base scrapers:

  • base-scraper - used to scrape without puppeteer.
  • base-scraper-with-browser - extends base scraper with puppeteer scraping capabilities.

In the original pull request, @orzarchi introduced a new scraper that wasn't inherited from either of the existing ones. Since this scraper doesn't require puppeteer support, it doesn't use puppeteer. My comment was that if we keep all three base scrapers, we may need to provide a fourth base scraper in the future for scrapers that require both OTP and puppeteer. However, in the suggested modification, this issue is no longer a concern. I hope this clarifies things.

orzarchi (Contributor, Author):

Hi @esakal - no problem, I'll fix @erikash's comments on top of your new PR with your changes.
Some thoughts about the discussion above:

  1. I think we've all proved once again that composition > inheritance, and that eventually the common functions in base-scraper/base-scraper-with-browser should be extracted into their own classes and used by all concrete scrapers :)
  2. I still feel my approach of typing ScraperCredentials per scraper is better - if we define it per scraper, we can avoid jumping through hoops all over the codebase. Why? In the long term we can make the scraper factory return the concrete scraper that accepts concrete credentials instead of the Scraper interface (using something like this technique), and remove files like this as well as the code that tests them.
    What do you think about keeping the generic credentials argument as I did here?
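The factory idea above can be sketched with a TypeScript generic keyed by company id (all names here are illustrative, not the library's actual API): the return type of the factory depends on the id, so callers get compile-time checking of the credentials they pass.

```typescript
// Illustrative credential shapes; the real scrapers define their own.
interface HapoalimCredentials { userCode: string; password: string; }
interface OneZeroCredentials { email: string; password: string; }

// Map each company id to its concrete credentials type.
interface CredentialsByCompany {
  hapoalim: HapoalimCredentials;
  oneZero: OneZeroCredentials;
}

interface Scraper<TCredentials> {
  login(credentials: TCredentials): Promise<boolean>;
}

// The generic parameter flows from the literal company id to the
// credentials type, so the compiler rejects mismatched credentials.
function createScraper<K extends keyof CredentialsByCompany>(
  companyId: K,
): Scraper<CredentialsByCompany[K]> {
  void companyId; // a real factory would dispatch on this id
  return {
    login: async (_credentials) => true, // stub
  };
}

const scraper = createScraper('oneZero');
// scraper.login({ userCode: 'x', password: 'y' }); // <- would not compile
```

With this shape, files that exist only to widen or re-check credentials at runtime become unnecessary, which is the cleanup suggested above.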

esakal (Collaborator) commented Mar 29, 2023

@orzarchi, I agree with you on both topics. Ideally, we should switch to a new composition-based architecture, but that would require active development work to refactor all the scrapers. Unfortunately, this library is not currently being actively evolved by a group of developers; we are mostly focused on ongoing maintenance and limited by depending on developers who have actual access to each institution. This makes it challenging to change the architecture of the project.

However, if you do have a suggestion for a new architecture, testing it out on new scrapers could be a good way to gradually transition to it. For instance, One Zero might be a good candidate for this.

Regarding the credentials, you can certainly implement them. Just keep in mind that they will only work for those who instantiate the scraper explicitly. Those who use createScraper with TypeScript might encounter runtime errors, because the common credentials declaration is `export declare type ScraperCredentials = Record<string, string>;`

orzarchi (Contributor, Author) commented Apr 1, 2023

Hey @erikash and @esakal, I've incorporated all of your suggestions and code.
I've also implemented a compromise regarding the ScraperCredentials type: each scraper now defines its own credentials, and ScraperCredentials is a big copied union of all the different options.
This way we get more type safety everywhere at the cost of some copy-pasted code. (This already flushed out some incorrect tests.)
In the future we can make the scraper factory function return concrete scraper types and get rid of ScraperCredentials altogether.
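The compromise described here - each scraper declaring its own credentials, with `ScraperCredentials` as a union of them - can be sketched as follows (the credential shapes are illustrative, not the library's actual fields):

```typescript
// Illustrative per-scraper credential shapes.
interface PasswordCredentials { username: string; password: string; }
interface OtpCredentials { phoneNumber: string; otpLongTermToken: string; }

// The public type becomes a union of the concrete options instead of
// the loose Record<string, string>, so each branch is checked.
type ScraperCredentials = PasswordCredentials | OtpCredentials;

// Narrowing with the `in` operator lets each code path access only
// the fields that actually exist on that credential shape.
function describe(credentials: ScraperCredentials): string {
  return 'otpLongTermToken' in credentials
    ? `otp:${credentials.phoneNumber}`
    : `password:${credentials.username}`;
}
```

Passing an object that matches neither member of the union fails at compile time, which is how this change "flushed out some incorrect tests".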

orzarchi force-pushed the master branch 2 times, most recently from fa2fdee to db4c832 on Apr 1, 2023
esakal (Collaborator) left a comment:

@orzarchi this is awesome! I want to merge it this weekend. There are a few minor comments - please review and decide whether to fix them; I'm OK with leaving it as is. Please check my comment about the ...extraHeaders.

Once you reply or commit fixes, I'll merge it.

orzarchi (Contributor, Author) commented Apr 7, 2023

> @orzarchi this is awesome! I want to merge it this weekend. There are a few minor comments - please review and decide whether to fix them; I'm OK with leaving it as is. Please check my comment about the ...extraHeaders.
>
> Once you reply or commit fixes, I'll merge it.

Sure, go ahead and merge it 🤩

esakal (Collaborator) commented Apr 8, 2023

I approved the PR; it will be merged soon. Thanks @orzarchi for your contribution!

closes #765

@esakal esakal merged commit 2da370e into eshaham:master Apr 8, 2023
github-actions (bot):
🎉 This PR is included in version 3.7.0 🎉

The release is available on GitHub release

Your semantic-release bot 📦🚀
