The following terms appear throughout the project and are defined here to clarify precisely what they refer to.
| Term | Definition |
|---|---|
| Listing | All the units (i.e., floor plans) available to rent in a specific building. |
| Unit | A specific apartment in a building available to rent; synonymous with floor plan. |
| Amenities | Premiums included with a unit or a building. |
| Unit amenities | Balcony, In Unit Laundry, Air Conditioning, High Ceilings, Furnished, Hardwood Floor. |
| Building amenities | Controlled Access, Fitness Center, Swimming Pool, Roof Deck, Storage, Residents Lounge, Outdoor Space. |
This project runs on Python 3; make sure you have Python 3 installed.
To run the web scrapers that extract the rental listing data, you'll need to install `chromedriver`. Make sure Homebrew is installed, then run:

```
brew install chromedriver
```
Next, get the installation path:

```
which chromedriver
```
If you see the error "Google Chrome for Testing.app is damaged and can’t be opened. You should move it to the Trash.", run the following:

```
xattr -cr 'Google Chrome for Testing.app'
```
In the project directory, start by creating and activating a virtual environment:
```
python -m venv env          # create a virtual environment named env
source env/bin/activate     # activate it
```
Then install all the project requirements:
```
pip install -r requirements.txt
```
Now create a `.env` file in the root directory by making a copy of `.env.schema`. Replace the `CHROMEDRIVER_PATH` variable in your `.env` file with your `chromedriver` installation path.
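For reference, the entry might look like this; the exact path is whatever `which chromedriver` printed on your machine (the path below is only an example):

```
CHROMEDRIVER_PATH=/opt/homebrew/bin/chromedriver
```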
This project uses a PostgreSQL database to store the extracted rental listing data for each building and its units. You can recreate the setup by initializing your own PostgreSQL database (we use Neon for a serverless DB) and replacing the `DATABASE_URL` variable in your `.env` file with your database connection string.
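The exact string depends on your provider; a generic PostgreSQL URL has the following shape (placeholder values, not real credentials). Serverless providers such as Neon typically also require `sslmode=require`:

```
DATABASE_URL=postgresql://user:password@host:5432/dbname?sslmode=require
```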
Running `main.py` in the root directory starts the data acquisition and model training process, which executes the following steps (sketched below):

- Run the data scraper to acquire the rental listing data for the current month.
- Push the extracted unit and building details for each listing to your PostgreSQL DB.
- Re-train the model using the extracted data, then save the updated model as `model.joblib`, where it will be used by the backend API.
```
python -m main
```
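For orientation, the overall flow of `main.py` resembles the sketch below. Every name in it is a hypothetical stand-in, not the project's actual scraper, database, or training code:

```python
"""Illustrative sketch of the main.py pipeline; all names are hypothetical."""
import joblib
from sklearn.dummy import DummyRegressor  # stand-in for the real model


def scrape_listings():
    """Stand-in for the Selenium/chromedriver scraping step."""
    return [{"building": "Example Tower", "unit": "2B", "rent": 2400}]


def push_listings(listings):
    """Stand-in for writing unit and building rows to PostgreSQL."""
    print(f"Pushed {len(listings)} listings to the DB")


def train_model(listings):
    """Stand-in for re-training the model on all extracted data."""
    X = [[len(listing["unit"])] for listing in listings]
    y = [listing["rent"] for listing in listings]
    return DummyRegressor().fit(X, y)


# 1. Scrape, 2. persist, 3. re-train and save for the backend API.
listings = scrape_listings()
push_listings(listings)
joblib.dump(train_model(listings), "model.joblib")
```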
To start the backend API, run the following from the `backend` directory:

```
python -m uvicorn app.server:app --reload
```
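The `app.server:app` target tells uvicorn to load an ASGI application named `app` from `backend/app/server.py`. The project's actual routes aren't shown here, but a minimal FastAPI server of that shape, serving predictions from the saved `model.joblib`, might look like this sketch (the endpoint path and feature names are hypothetical):

```python
# app/server.py -- minimal sketch; the route and payload are hypothetical.
import joblib
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()
model = joblib.load("model.joblib")  # model saved by main.py


class UnitFeatures(BaseModel):
    # Hypothetical feature set; the real model's inputs will differ.
    bedrooms: int
    bathrooms: float
    sqft: int


@app.post("/predict")
def predict(features: UnitFeatures):
    X = [[features.bedrooms, features.bathrooms, features.sqft]]
    return {"predicted_rent": float(model.predict(X)[0])}
```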