Turbo Mode to run 10X faster (test b4 use, only works for some apps) - done #1093
This thread was originally created for a temporary turbo mode for the Automation Anywhere RPA Challenge. More details - https://community.aisingapore.org/groups/tagui-rpa/forum/discussion/rpa-challenge-submissions/
The temporary files below are now irrelevant, since turbo mode is now officially an option that can be used at runtime.
Week 1 Solution 1 - RPA automation
Using the tournament edition of RPA for Python, this solution takes under 10 seconds. It uses the Pandas library to read the CSV file and fill in the fields. This can also be done in TagUI human language, for example by using the Pandas library through Python integration.
import pandas as pd
import rpa as r
r.init()
r.url('https://developer.automationanywhere.com/challenges/automationanywherelabs-customeronboarding.html')
r.click('Download CSV')
df = pd.read_csv('MissingCustomers.csv')
df['Zip'] = df['Zip'].astype(str)
for i in range(len(df.axes[0])):
    r.type('customerName', df['Company Name'][i])
    r.type('customerID', df['Customer ID'][i])
    r.type('primaryContact', df['Primary Contact'][i])
    r.type('street', df['Street Address'][i])
    r.type('city', df['City'][i])
    r.select('state', df['State'][i])
    r.type('zip', df['Zip'][i].zfill(5))
    r.type('email', df['Email Address'][i])
    r.click('activeDiscount' + df['Offers Discounts'][i].capitalize())
    if df['Non-Disclosure On File'][i] == 'YES': r.click('NDA')
    r.click('Register')
r.wait()
r.close()
Week 1 Solution 2 - JavaScript hack
This solution works by batching all the instructions and sending them to the web browser in one go, so that they are executed all at once. This way, there is no communication overhead between TagUI and the Chrome browser anymore, and the speed of execution simply depends on how powerful your computer's CPU is.
A blocker I encountered was that Chrome seems to have a limit on how long that combined set of instructions can be. My initial attempt was to send the instructions line by line with the required data, but the total length was too long (imagine 10 lines x 7 = 70 lines). So I created an array instead and used a for loop (10 lines + a data array iterated in the for loop) to send a smaller instruction set to Chrome to run in one go.
Also, the webpage timer starts as soon as the page loads. I used an arbitrary manual delay before running. If you set it too short, the webpage is not ready and the automation won't work. If you set it too long, it affects your timing. For example, when I set a 1 second delay, the execution time is 0.305 second.
This JavaScript injection method is extremely dependent on the target webpage, and I will never send this as the solution for any customer doing RPA. This is for competitive sports fun only, not for production use.
import pandas as pd
import rpa as r
import time
r.init()
r.url('https://developer.automationanywhere.com/challenges/automationanywherelabs-customeronboarding.html')
df = pd.read_csv('MissingCustomers.csv')
df['Zip'] = df['Zip'].astype(str)
# simple manual delay to avoid overheads of initial communication with browser
time.sleep(2)
dom_query = "document.querySelector('[role = \"button\"]');" + "data = [[],[],[],[],[],[],[],[],[],[]];"
for i in range(7):
    dom_query += "data[" + str(i) + "][1]='" + df['Company Name'][i] + "';"
    dom_query += "data[" + str(i) + "][2]='" + df['Customer ID'][i] + "';"
    dom_query += "data[" + str(i) + "][3]='" + df['Primary Contact'][i] + "';"
    dom_query += "data[" + str(i) + "][4]='" + df['Street Address'][i] + "';"
    dom_query += "data[" + str(i) + "][5]='" + df['City'][i] + "';"
    dom_query += "data[" + str(i) + "][6]='" + df['State'][i] + "';"
    dom_query += "data[" + str(i) + "][7]='" + df['Zip'][i].zfill(5) + "';"
    dom_query += "data[" + str(i) + "][8]='" + df['Email Address'][i] + "';"
    dom_query += "data[" + str(i) + "][9]='" + df['Offers Discounts'][i].capitalize() + "';"
    dom_query += "data[" + str(i) + "][10]='" + df['Non-Disclosure On File'][i] + "';"
dom_query += "for (i=0;i<7;i++){"
dom_query += "document.querySelector('#customerName').value=data[i][1];"
dom_query += "document.querySelector('#customerID').value=data[i][2];"
dom_query += "document.querySelector('#primaryContact').value=data[i][3];"
dom_query += "document.querySelector('#street').value=data[i][4];"
dom_query += "document.querySelector('#city').value=data[i][5];"
dom_query += "document.querySelector('#state').value=data[i][6];"
dom_query += "document.querySelector('#zip').value=data[i][7];"
dom_query += "document.querySelector('#email').value=data[i][8];"
dom_query += "document.querySelector('#activeDiscount'+data[i][9]).click();"
dom_query += "if (data[i][10]=='YES') document.querySelector('#NDA').click();"
dom_query += "document.querySelector('#submit_button').click();}"
r.dom(dom_query)
r.wait()
r.close()
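The string concatenation above would break if a value contains a single quote (for example a company name like O'Neil). Below is a minimal sketch of one alternative, assuming the same element IDs as the code above: serialise the rows with json.dumps so the values are escaped, then build the loop once. The function name build_batch_script, the row keys and the sample row are hypothetical, and this has not been run against the challenge page.

```python
import json

# Form field IDs taken from the solution above; the same names are used
# as keys of each row dictionary (an assumption made for this sketch).
FIELDS = ['customerName', 'customerID', 'primaryContact', 'street',
          'city', 'state', 'zip', 'email']

def build_batch_script(rows):
    """Return one JavaScript string that fills and submits the form per row."""
    parts = ["var data = " + json.dumps(rows) + ";",
             "for (var i = 0; i < data.length; i++) {"]
    for field in FIELDS:
        parts.append("document.querySelector('#" + field + "').value = "
                     "data[i]['" + field + "'];")
    parts.append("document.querySelector('#activeDiscount' + data[i]['discount']).click();")
    parts.append("if (data[i]['nda'] === 'YES') document.querySelector('#NDA').click();")
    parts.append("document.querySelector('#submit_button').click();")
    parts.append("}")
    # Joined without newlines, like the single-line string built above
    return "".join(parts)

# Made-up sample row, only to show the escaping
rows = [{'customerName': "O'Neil Ltd", 'customerID': 'A1', 'primaryContact': 'Jo',
         'street': '1 Main St', 'city': 'Boston', 'state': 'MA', 'zip': '02134',
         'email': 'jo@example.com', 'discount': 'Yes', 'nda': 'YES'}]
script = build_batch_script(rows)
print(script)
```

The resulting string could then be passed to r.dom(script) in place of dom_query. The Chrome length limit mentioned above still applies, since all the row data travels inside the one string.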
Week 2 Solution
I used the Python version of TagUI for convenience in reading the Excel document. This can also be done in TagUI human language, for example by opening Excel to copy out the data to the clipboard for processing.
import pandas as pd
import rpa as r
r.init()
r.url('https://developer.automationanywhere.com/challenges/automationanywherelabs-supplychainmanagement.html')
r.click('Download Agent Territory Spreadsheet')
df = pd.read_excel('StateAssignments.xlsx')
po_numbers = []
for n in range(7):
    po_numbers.append(r.read('#PONumber' + str(n+1)))
r.dom('window.open("https://developer.automationanywhere.com/challenges/AutomationAnywhereLabs-POTrackingLogin.html")')
r.popup('POTracking')
r.click('(//button)[1]')
orders_list = []
for n in range(7):
    r.type('//input[@type = "search"]', '[clear]' + po_numbers[n])
    state = r.read('(//table//td)[5]')
    ship_date = r.read('(//table//td)[7]')
    order_total = r.read('(//table//td)[8]')
    orders_list.append([state, ship_date, order_total])
r.popup('supplychainmanagement')
for order in range(7):
    r.type('#shipDate' + str(order+1), orders_list[order][1])
    r.type('#orderTotal' + str(order+1), orders_list[order][2][1:])
    agent_name = df.loc[df['State'] == orders_list[order][0]].iloc[0]['Full Name']
    r.select('#agent' + str(order+1), agent_name)
r.click('#submitbutton')
r.wait()
r.close()
Week 3 Solution
I implemented this using TagUI in Microsoft Word (requires the TagUI MS Word Plug-in) - this is the Week 3 Word doc solution. Below is the solution in text format.
Calling a REST API is easy with TagUI's API step in the human language version. Python users can use the requests package. Grabbing info from the desktop app can be done using OCR. But in this case, since field values can be copied out, using the keyboard step to copy and clipboard() to access the value is faster and more accurate.
Week 4 Solution
I was the first to post a solution to week 4, the toughest challenge, ahead of all users of other RPA tools including Automation Anywhere, UiPath, and open-source RPA tools. This shows that free and open-source tools can also solve real-world RPA scenarios, since these Automation Anywhere challenges are crafted from real customer scenarios.
Choosing the solution
I implemented this using RPA for Python (the Python version of TagUI), so that it is easier to do OCR in the background instead of using the built-in OCR on-screen with a PDF viewer. The TagUI human language version can also do the same by running Python code.
There are free online OCR websites which can be automated by TagUI to do the OCR and return the extracted text, but I opted to do it locally on-prem to discourage sending sensitive documents like invoices to some random online 3rd-party providers.
For API-based online OCR providers, an example would be Amazon Textract, which comes free for 3 months for 1000 pages per month. The pricing thereafter is attractive, but I'm not sure whether the results for real-world scanned invoices would be as satisfactory as for these Automation Anywhere invoices, which are saved directly from source to images and are very high quality.
I would imagine that TagUI's on-screen OCR method, where you can specify certain zones and anchor elements to segment and do OCR, would yield better results than a generic online service based on a generalised machine learning model. However, the effort required would undoubtedly be greater, because the user has more, and very fine-tuned, control over the OCR process.
Implementing the solution
In my solution, I first convert the image invoices into image PDFs with the img2pdf Python package, then I convert the image PDFs to text PDFs with the interesting OCRmyPDF tool, which can generate accompanying text files of the OCR text. Next I write the 'business logic' to extract the data for the 3 invoice formats provided in the challenge. This is done using built-in helper functions like get_text() and del_chars(), as well as standard Python functions. The general idea is:
1. extract data (image to text using OCR)
2. clean data (remove rubbish characters or unnecessary sections)
3. extract data (using get_text() and standard functions)
4. clean data (eg remove the comma in the price, as required by the challenge)
Relative to using machine learning, this method is deterministic but requires writing the logic.
Part of the criteria for the week 4 challenge is to open up the file upload dialog box. So I used r.click('#fileupload') and visually automated entering the filename in the dialog box. You will notice that there is an additional click on the show_all2.png image because of the Mac OS file selection dialog design. It is necessary to click to expand the files listed in order to show all the files and make them available for selection. For Windows, you can type directly using the keyboard step. If this criterion is not mandatory, using r.upload('#fileupload', filename) is sufficient to instantly assign the filename to the upload field.
PS - A TagUI user from Brazil, Daniel Correa de Castro Freitas, has created a nicely documented GitHub repository showing his solutions for all 4 weeks. See this link for his elegant solution for week 4. Daniel used the Tesseract Python package to do the OCR and regex filtering to extract the data. See this link for another week 4 solution by Wei Soon Thia from Singapore. His solution didn't use regex but is more structured compared to mine, by defining functions for data extraction.
import os
import img2pdf
import rpa as r
# 1. convert image files to pdf files to text files
for filename in os.listdir('.'):
    if filename.endswith('.tiff'):
        with open(filename + '.pdf', 'wb') as f:
            f.write(img2pdf.convert(filename))
        r.run('ocrmypdf --sidecar ' + filename + '.txt ' + filename + '.pdf ' + filename + '.pdf')
# 2. start Automation Anywhere RPA challenge page
r.init(True)  # True turns on visual automation, used for the file dialog steps
r.url('https://developer.automationanywhere.com/challenges/automationanywherelabs-invoiceentry.html')
# 3. process and enter invoices from OCR results
for filename in os.listdir('.'):
    if filename.endswith('.tiff.txt'):
        if 'Ship to Invoice no.' in r.load(filename):
            ocr_text = r.get_text(r.load(filename), 'Invoice no.', 'Terms: ')
            ocr_line = ocr_text.split('\n')
            invoice_number = ocr_line[0].split(' ')[-1]
            invoice_date = ocr_line[2].split(' ')[-3] + ' ' + ocr_line[2].split(' ')[-2] + ' ' + ocr_line[2].split(' ')[-1]
            invoice_total = r.get_text(ocr_text + '$', 'Invoice Amount', '$')
            r.type('#invoiceNumber', invoice_number)
            r.type('#invoiceDate', invoice_date)
            r.type('#invoiceTotal', invoice_total.replace(',', ''))
            ocr_text = r.get_text(r.del_chars(ocr_text,'{}[]'), 'Tax Amount', 'Subtotal')
            ocr_line = ocr_text.split('\n')
            for item in range(len(ocr_line)):
                if ocr_line[item].count('|') == 1:
                    ocr_line[item] = ocr_line[item].replace(' G ', ' | G ')
                quantity = ocr_line[item].split('|')[0].strip()
                item_no = ocr_line[item].split('|')[1].strip().split(' ')[0]
                description = ocr_line[item].split('|')[1].replace(item_no, '').strip()
                unit_price = description.split(' ')[-1]
                description = description.replace(unit_price, '').strip()
                total_price = ocr_line[item].split('|')[-1].split(' ')[-1]
                r.type('#quantity_row_' + str(item + 1), quantity)
                r.type('#description_row_' + str(item + 1), description)
                r.type('#price_row_' + str(item + 1), total_price.replace(',', ''))
                if item != len(ocr_line) - 1:
                    r.click('//button')
            r.click('#fileupload')
            r.wait(1.25)
            if 'Invoice10' in filename: r.click('show_all2.png')
            r.keyboard(filename.replace('.tiff.txt', '') + '[enter]')
            r.click('#agreeToTermsYes')
            r.click('#submit_button')
        elif 'Sold to Ship to' in r.load(filename):
            ocr_text = r.load(filename)
            invoice_number = r.get_text(ocr_text, 'Invoice no.', 'Purchase Order')
            invoice_date = r.get_text(ocr_text, 'Invoice Date', 'Terms')
            invoice_total = r.del_chars(r.get_text(ocr_text, 'Invoice Amount', '\n'),'—=$, ')
            r.type('#invoiceNumber', invoice_number)
            r.type('#invoiceDate', invoice_date)
            r.type('#invoiceTotal', invoice_total)
            ocr_text = r.get_text(r.del_chars(ocr_text,'{}[]'), 'Tax Amount', 'Subtotal')
            ocr_line = ocr_text.split('\n')
            for item in range(len(ocr_line)):
                if ocr_line[item].count('|') == 1:
                    ocr_line[item] = ocr_line[item].replace(' G ', ' | G ')
                quantity = ocr_line[item].split('|')[0].strip()
                item_no = ocr_line[item].split('|')[1].strip().split(' ')[0]
                description = ocr_line[item].split('|')[1].replace(item_no, '').strip()
                unit_price = description.split(' ')[-1]
                description = description.replace(unit_price, '').strip()
                total_price = ocr_line[item].split('|')[-1].split(' ')[-1]
                r.type('#quantity_row_' + str(item + 1), quantity)
                r.type('#description_row_' + str(item + 1), description)
                r.type('#price_row_' + str(item + 1), total_price.replace(',', ''))
                if item != len(ocr_line) - 1:
                    r.click('//button')
            r.click('#fileupload')
            r.wait(1.25)
            if 'Invoice10' in filename: r.click('show_all2.png')
            r.keyboard(filename.replace('.tiff.txt', '') + '[enter]')
            r.click('#agreeToTermsYes')
            r.click('#submit_button')
        elif 'Sold to Invoice no' in r.load(filename):
            ocr_text = r.load(filename)
            invoice_number = r.get_text(ocr_text, 'Invoice no.', '\n')
            invoice_date = r.get_text(ocr_text, 'Invoice Date', '\n')
            invoice_total = r.del_chars(r.get_text(ocr_text, 'Invoice Amount', '\n'),'—=$, ')
            r.type('#invoiceNumber', invoice_number)
            r.type('#invoiceDate', invoice_date)
            r.type('#invoiceTotal', invoice_total)
            ocr_text = r.get_text(r.del_chars(ocr_text,'{}[]'), 'Tax Amount', 'Subtotal')
            ocr_line = ocr_text.split('\n')
            for item in range(len(ocr_line)):
                if ocr_line[item].count('|') == 1:
                    ocr_line[item] = ocr_line[item].replace(' G ', ' | G ')
                quantity = ocr_line[item].split('|')[0].strip()
                item_no = ocr_line[item].split('|')[1].strip().split(' ')[0]
                description = ocr_line[item].split('|')[1].replace(item_no, '').strip()
                unit_price = description.split(' ')[-1]
                description = description.replace(unit_price, '').strip()
                total_price = ocr_line[item].split('|')[-1].split(' ')[-1]
                r.type('#quantity_row_' + str(item + 1), quantity)
                r.type('#description_row_' + str(item + 1), description)
                r.type('#price_row_' + str(item + 1), total_price.replace(',', ''))
                if item != len(ocr_line) - 1:
                    r.click('//button')
            r.click('#fileupload')
            r.wait(1.25)
            if 'Invoice10' in filename: r.click('show_all2.png')
            r.keyboard(filename.replace('.tiff.txt', '') + '[enter]')
            r.click('#agreeToTermsYes')
            r.click('#submit_button')
        else:
            print('[ERROR][' + filename.replace('.txt','') + '] unrecognised invoice format')
r.wait(10)
r.close()
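To make the extract / clean / extract / clean idea easier to follow without a browser or OCR setup, here is a small standalone sketch. The helpers text_between and strip_chars are hypothetical stand-ins written for this example, they are not the implementations of get_text() and del_chars() in RPA for Python, and the sample OCR text is made up.

```python
def text_between(source, left, right):
    """Return the stripped text between the first `left` marker and the
    next `right` marker, or '' if either marker is missing."""
    start = source.find(left)
    if start == -1:
        return ''
    start += len(left)
    end = source.find(right, start)
    if end == -1:
        return ''
    return source[start:end].strip()

def strip_chars(source, characters):
    """Return `source` with every character listed in `characters` removed."""
    return ''.join(ch for ch in source if ch not in characters)

# Made-up OCR output, loosely shaped like the third invoice format above
sample_ocr = (
    "Sold to Invoice no. 12345\n"
    "Invoice Date 2020-07-01\n"
    "Invoice Amount = $1,234.50\n"
)

# Extract each field using its anchor text
invoice_number = text_between(sample_ocr, 'Invoice no.', '\n')
invoice_date = text_between(sample_ocr, 'Invoice Date', '\n')
# Clean the amount: drop the symbols, comma and spaces the form does not accept
invoice_total = strip_chars(text_between(sample_ocr, 'Invoice Amount', '\n'), '=$, ')

print(invoice_number, invoice_date, invoice_total)
```

Because every step is plain string handling, the same input always gives the same output. The trade-off is that each invoice format needs its own anchor texts.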
Wrap-up Post
Here's a giveaway and my post-RPA-challenge thoughts. There are 2 premium subscriptions of AI Singapore's LearnAI + DataCamp's data science learning platforms to give away, from Wei Soon THIA and Chee Huat Huang.
Last month, Automation Anywhere organised a series of RPA challenges. It was a great initiative to advance the broader RPA community, with users of various RPA tools solving the real-life customer scenarios that AA curated. AI Singapore is happy to join in the fun by giving away prizes to the fastest folks who solved the challenges using TagUI, one of the leading open-source RPA software. Wei Soon and Chee Huat were amongst the winners, but would like to give away their prizes to people in the community who might be able to benefit more. If you are a student in any capacity, or are learning data science, machine learning, Python etc, please comment with 'interested'.
Congratulations to the 2 gentlemen, François Blanc, Abdulaziz Shaikh, Nived N, Mirza Ahsan Baig, Daniel Correa de Castro Freitas 👏🏻👏🏻 They won $3500 USD worth of prizes for the impressive TagUI solutions they sent in.
I think it's beyond any reasonable doubt that free and open-source tools are viable options to solve real business scenarios, just as well as commercial RPA software. In fact, for week 4 (the toughest challenge) I was the first, amongst all users of different tools, to post a solution to the invoices OCR challenge.
TagUI, and its various 'flavours', are fully free and open-source. Go ahead and make a dent in the digital automation space. Create RPA solutions for your clients, your bosses, your colleagues, and even your loved ones 😄
PS 1 - special shout out to Daniel Correa de Castro Freitas, Infosys Consulting RPA Developer based in Brazil. He shared online all 4 weeks of solutions using both TagUI human language and Python versions. Very nicely documented. Check out his LinkedIn profile and post for the link.
PS 2 - the temporary turbo mode created to play in the challenge is now a permanent option. Users can now run at 10X faster than normal human user speed. A workflow that takes 1h to complete is now done in 6 minutes. Imagine what RPA can do for your deadlines!
TagUI workflow to randomly select 2 winners from participants of the giveaway prizes from Wei Soon and Chee Huat -
PS - the above formula was found by googling 'javascript how to generate a random number between 1 to 10'.
Video of running the workflow to randomly select the winners Bibin P John and Nur Ashikin Binti Rohaime - draw.mov
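For anyone who wants to run a similar draw from Python, here is a rough sketch using the standard random module. It is not the TagUI workflow or the JavaScript formula used for the giveaway. The function name draw_winners and the participant names are hypothetical.

```python
import random

def draw_winners(participants, count=2, seed=None):
    """Pick `count` distinct winners at random from `participants`.
    Passing a seed makes the draw repeatable, eg for auditing."""
    rng = random.Random(seed)
    return rng.sample(participants, count)

# Placeholder names, not the actual giveaway participants
participants = ['Participant A', 'Participant B', 'Participant C',
                'Participant D', 'Participant E']
winners = draw_winners(participants, count=2, seed=42)
print(winners)
```

random.sample picks without replacement, so the same person cannot win both prizes.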
A dollar sign in double quote would mean variable. Changing to single quote instead. Otherwise at least some PHP versions or config will throw warning (eg Colab)
Above commit fixes a PHP warning message on some version / config of PHP. First noticed on Colab example. A dollar sign in double quote would mean variable. Changing to single quote instead to treat |
Dear Ken, my folder did not produce the *.txt files, only the *.pdf files were produced. Any tips on how to solve this issue?
Oh, maybe it is related to some ocrmypdf setup issue. If it is set up correctly, the --sidecar option is supposed to generate text files of the OCR text.
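One way to narrow this down is to check that the ocrmypdf command is actually on the PATH, and to print the exact command so it can be run by hand to see any error message. A minimal sketch, assuming the same argument order as the week 4 code above. The helper names and file names are hypothetical.

```python
import shutil

def sidecar_command(pdf_in, pdf_out, text_out):
    """Build the ocrmypdf command that also writes the OCR text to text_out."""
    return ['ocrmypdf', '--sidecar', text_out, pdf_in, pdf_out]

def ocrmypdf_available():
    """Return True if an ocrmypdf executable can be found on the PATH."""
    return shutil.which('ocrmypdf') is not None

# Hypothetical file names, following the naming pattern in the week 4 code
command = sidecar_command('Invoice1.tiff.pdf', 'Invoice1.tiff.pdf', 'Invoice1.tiff.txt')
print('ocrmypdf found on PATH:', ocrmypdf_available())
print('command to try manually:', ' '.join(command))
```

If the command is found but still produces no text file, running it in a terminal should show the message from ocrmypdf itself.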
Also copying @ruthtxh for info
https://github.com/lookang/TagUI/tree/main/ken thanks @kensoh for your generous sharing and wonderful tool TagUI.
Thank you Lawrence, for sharing generously with the TagUI community! Also copying @ruthtxh
Closing since this change has made its way into the latest packaged release. |
Turbo Option
Adding a -turbo option to run TagUI 10X faster than normal human speed.
To run in turbo mode, add the -turbo option when running your workflow, or use the shortcut -t.
If you are using TagUI v6.46 and above, you can get this update with the tagui update command or the MS Word plug-in Update TagUI button. You can get TagUI v6.46 from this installation page, or manually unzip this zip file to overwrite your existing installation (drag all the folders and files under TagUI-master\src to your existing tagui\src folder).
Downsides
Most websites and desktop apps are not designed for the super-human speed user. If your RPA runs at a speed beyond what those websites are designed and tested for, you are surely going to run into problems with some apps. Problems could be fields and data not filling up properly, not triggering expected validations, form submissions with missing data, account being blocked etc.
And the problems might happen randomly, including working on your PC but not working on another PC due to difference in CPU speed. Because of this, using turbo mode option is not recommended. You may save some cheap computer time, but if something is broken or does not work, you may end up spending expensive human time (your time) to troubleshoot or fix.
So, in general, it doesn't make sense to give up what is important (reliability and trusted to always work, cheap computer cost) for what is relatively less important (super-fast execution of automation, much faster than doing it manually).
Why add this
I can see why this is useful for some users in some specific scenarios. For example, data collection from apps, data entry in web applications that can handle super-human speed reliably, backend RPA for a chatbot user, rapid prototyping, and perhaps taking part in RPA competitions. Thoroughly test for your use case before using!
Although the downsides are huge, as TagUI and its various flavours like RPA for Python become more mature, it makes sense to support a broader range of use cases instead of rejecting them as edge cases. With diversity, the ecosystem can flourish.
The additional computation cost of this implementation is 4ms. On a 6-year-old low-end Mac laptop, this is the additional overhead needed during initialisation to set up the TagUI process to be in turbo mode or not. This is a relatively small cost of 0.1% (4ms out of 4000ms), if you estimate an RPA workflow taking 4 seconds. In practice, most workflows take longer, so this cost of under 0.1% of the execution time is humanly insignificant.
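The two figures quoted in this thread (the 0.1% overhead, and 1h becoming 6 minutes at 10X) can be checked with a few lines of arithmetic. This is only a sketch of the calculation, the function names are made up, and the 10X factor is the nominal speed-up rather than a measurement.

```python
def overhead_percent(overhead_ms, workflow_seconds):
    """Overhead as a percentage of the total workflow time."""
    return overhead_ms / (workflow_seconds * 1000) * 100

def turbo_minutes(normal_minutes, speedup=10):
    """Run time in minutes if the whole workflow speeds up by `speedup`."""
    return normal_minutes / speedup

four_second_case = overhead_percent(4, 4)    # 4ms on a 4-second workflow
one_minute_case = overhead_percent(4, 60)    # 4ms on a 1-minute workflow
one_hour_in_turbo = turbo_minutes(60)        # 1h workflow at 10X

print(four_second_case, one_minute_case, one_hour_in_turbo)
```

The longer the workflow, the smaller the share taken by the fixed 4ms, which is why the cost is under 0.1% for anything longer than 4 seconds.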