TagUI workflow to get Singapore public housing (HDB BTO) data - example #907

Closed
kensoh opened this issue Jan 7, 2021 · 3 comments

kensoh commented Jan 7, 2021

Wrote a TagUI version of @notha99y's Python script to get Singapore public housing (HDB BTO) data.

Posted it on his repo page (notha99y/hdb_sraper#1) and put a copy here for TagUI users.


Hi Ren Jie, your BTO scraper is innovative, especially the part where you got the pricing data from elsewhere in the HTML doc. For the problem of the page not being fully loaded, Selenium should have a way to wait until something appears, or you could sleep() for a few seconds.
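To make the "wait until something appears" idea concrete, here is a minimal polling sketch in plain Python. The `wait_until` helper and the `element_loaded` stand-in are hypothetical names made up for illustration; they are not part of Selenium or TagUI. Selenium ships its own explicit waits (`WebDriverWait` with `expected_conditions`), which would normally be the better choice in a real scraper.

```python
import time

def wait_until(condition, timeout=10.0, interval=0.5):
    """Poll condition() until it returns a truthy value or `timeout` seconds pass.

    Returns the truthy value, or None if the timeout is reached first.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            return None
        time.sleep(interval)

# Hypothetical stand-in for "is the element on the page yet?".
# It only reports success on the third check.
checks = {"count": 0}

def element_loaded():
    checks["count"] += 1
    return checks["count"] >= 3

found = wait_until(element_loaded, timeout=2.0, interval=0.01)
print(found, checks["count"])
```

Compared with a fixed sleep(), polling returns as soon as the condition holds and only waits the full timeout when the element never shows up.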

I did a version using TagUI RPA; the script is below. As you'd expect, it takes far fewer lines to get the same outcome (1/4 of your implementation). TagUI is a domain-specific language for RPA, so many things can be done in one line of instruction. Python users can use RPA for Python, a wrapper I made for TagUI (see the Python version below).

TagUI flow script - bto.tag

https://services2.hdb.gov.sg/webapp/BP13AWFlatAvail/BP13EBSFlatSearch?Town=Toa+Payoh&Flat_Type=BTO&selectedTown=Toa+Payoh&Flat=`room`-Room&ethnic=`ethnic`&ViewOption=A&projName=N9%3BC17&Block=0&DesType=A&EthnicA=&EthnicM=&EthnicC=C&EthnicO=&numSPR=&dteBallot=202011&Neighbourhood=&Contract=&BonusFlats1=N&searchDetails=Y&brochure=true
click `block`
wait 3 seconds

total_units = count('(//table)[2]//td')
for unit from 1 to total_units
    read ((//table)[2]//td)[`unit`]/font/@id to unit_number
    read //*[@id ="`unit_number`k"] to unit_information
    price = unit_information.split('____________________')[0]
    sqm = parseInt(unit_information.split('____________________')[1])
    write `csv_row([unit_number, price, sqm])` to `project`_unitprices_`room`room_block`block`.csv

Parameters file - tagui_local.csv

FIELD,VALUE
room,4
ethnic,C
block,233A
project,Parkview
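
As a rough illustration of how these parameters end up in the backtick placeholders of the flow (for example in the output file name), the sketch below mimics the substitution in plain Python. TagUI does this itself when it runs the flow; `load_parameters` and `fill_placeholders` are hypothetical helpers written only to show the effect, not TagUI's implementation.

```python
import csv
import io
import re

# Same content as the parameters file above
PARAMETERS_CSV = """FIELD,VALUE
room,4
ethnic,C
block,233A
project,Parkview
"""

def load_parameters(text):
    """Read FIELD,VALUE rows into a dict such as {'room': '4', ...}."""
    reader = csv.DictReader(io.StringIO(text))
    return {row['FIELD']: row['VALUE'] for row in reader}

def fill_placeholders(template, params):
    """Replace each `name` placeholder with its value; unknown names are left as-is."""
    return re.sub(r'`(\w+)`', lambda m: params.get(m.group(1), m.group(0)), template)

params = load_parameters(PARAMETERS_CSV)
filename = fill_placeholders('`project`_unitprices_`room`room_block`block`.csv', params)
print(filename)  # Parkview_unitprices_4room_block233A.csv
```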

RPA for Python script - bto.py

import rpa as r

# search parameters (same values as in tagui_local.csv)
room = '4'
ethnic = 'C'
block = '233A'
project = 'Parkview'

r.init()
r.url('https://services2.hdb.gov.sg/webapp/BP13AWFlatAvail/BP13EBSFlatSearch?Town=Toa+Payoh&Flat_Type=BTO&selectedTown=Toa+Payoh&Flat='+room+'-Room&ethnic='+ethnic+'&ViewOption=A&projName=N9%3BC17&Block=0&DesType=A&EthnicA=&EthnicM=&EthnicC=C&EthnicO=&numSPR=&dteBallot=202011&Neighbourhood=&Contract=&BonusFlats1=N&searchDetails=Y&brochure=true')
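r.click(block)
r.wait(3)  # give the unit table time to load

total_units = r.count('(//table)[2]//td')
for unit in range(1, total_units + 1):
    # the <font> id in each cell is the unit number;
    # the element with id "<unit_number>k" holds the price and floor area
    unit_number = r.read('((//table)[2]//td)['+str(unit)+']/font/@id')
    unit_information = r.read('//*[@id ="'+unit_number+'k"]')
    price = unit_information.split('____________________')[0]
    # drop the last 4 characters of the area text, leaving the number
    sqm = unit_information.split('____________________')[1][:-4]
    r.write(unit_number+',"'+price+'",'+sqm+'\r\n', project+'_unitprices_'+room+'room_block'+block+'.csv')

r.close()  # end the TagUI session when done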
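
The parsing step in both scripts can be tried without a browser. The sketch below assumes the raw text looks like a price, the underscore separator, then an area such as "90 Sqm"; that layout is inferred from the scripts (the TagUI flow uses parseInt, the Python one slices off the last 4 characters) and the sample value is made up. `parse_unit_information` is a hypothetical helper, not part of RPA for Python.

```python
import re

# Separator copied from the scripts above
SEPARATOR = '____________________'

def parse_unit_information(unit_information):
    """Split the raw text into (price, sqm), reading sqm as a leading integer."""
    parts = unit_information.split(SEPARATOR)
    price = parts[0].strip()
    match = re.match(r'\s*(\d+)', parts[1])
    sqm = int(match.group(1)) if match else None
    return price, sqm

# Made-up sample text, for illustration only
sample = '$400,000' + SEPARATOR + '90 Sqm'
price, sqm = parse_unit_information(sample)
print(price, sqm)
```

Reading the leading digits, as parseInt does in the TagUI flow, is a little more forgiving than slicing a fixed number of characters if the suffix ever changes.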

kensoh added the query label Jan 7, 2021
kensoh changed the title from "TagUI workflow to get Singapore public housing (HDB BTO) data" to "TagUI workflow to get Singapore public housing (HDB BTO) data - example" Jan 7, 2021

notha99y commented Jan 7, 2021

Nice!

kensoh commented Jan 7, 2021

Snapshot of the results (prices of subsidised 99-year-lease public housing in Singapore):
[image: results]

Adding the line below before writing to the CSV file creates the header row for the CSV:

dump Unit,Price,Sqm to `project`_unitprices_`room`room_block`block`.csv
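
For Python users, the header-plus-rows layout can be sketched with the standard csv module, which also takes care of quoting. The rows here are made up for illustration, and the idea that prices contain commas is an assumption based on the Python script wrapping price in double quotes. `build_csv` is a hypothetical helper.

```python
import csv
import io

def build_csv(rows):
    """Return CSV text with a Unit,Price,Sqm header followed by the given rows."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)  # default line ending is '\r\n'
    writer.writerow(['Unit', 'Price', 'Sqm'])
    writer.writerows(rows)
    return buffer.getvalue()

# Made-up row, for illustration only
csv_text = build_csv([('02-101', '$400,000', '90')])
print(csv_text)
```

The writer quotes only the fields that need it, so a price with a comma stays in one column, which is the same result the Python script gets by adding the quotes by hand.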

kensoh pinned this issue Jan 7, 2021
kensoh unpinned this issue Jan 7, 2021
kensoh pinned this issue Jan 13, 2021

kensoh commented Jan 17, 2021

Closing this issue for now. I'm planning with Basil around the HDB February BTO launch, to see whether there are gaps in the upcoming new portal where RPA can create value.

kensoh closed this as completed Jan 17, 2021
kensoh unpinned this issue Jan 25, 2021
kensoh self-assigned this Mar 22, 2021