
Real Estates Watcher

🏦 A simple C# command-line application that periodically watches selected real estate advertisement portals and sends notifications about new ads. 🏗 Supports watching adverts for both sales and leases.


Frameworks: .NET 8, Node.js (for web scraping script)

Supported OS: Windows, macOS, Linux

🌐 Currently supported Ads portals:


🚀 How to run the application

Run the app by executing the following command:

dotnet RealEstatesWatcher.UI.Console.dll --h <path to handlers.ini file> --p <path to portals.ini file> --f <path to filters.ini file> --e <path to engine.ini file>
* Run the application with the --help option to get a description of all command-line options.
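
The command above can be wrapped in a small launcher that checks the configuration files before starting. This is only a sketch: it assumes the published app and a ./configs folder sit in the current directory, and `check_required_files`, `APP_DLL` and `CONFIG_DIR` are illustrative names, not part of the application.

```shell
#!/bin/sh
# Launcher sketch. Assumption: the published app and a ./configs folder
# are both in the current directory; adjust the two variables otherwise.
APP_DLL="${APP_DLL:-RealEstatesWatcher.UI.Console.dll}"
CONFIG_DIR="${CONFIG_DIR:-./configs}"

# Print every file that does not exist; succeed only if all of them exist.
check_required_files() {
  missing=0
  for file in "$@"; do
    if [ ! -f "$file" ]; then
      echo "missing: $file" >&2
      missing=1
    fi
  done
  return "$missing"
}

if check_required_files "$APP_DLL" \
    "$CONFIG_DIR/handlers.ini" "$CONFIG_DIR/portals.ini" "$CONFIG_DIR/engine.ini"; then
  # Required arguments first, then the optional filters file if it is present.
  set -- --h "$CONFIG_DIR/handlers.ini" --p "$CONFIG_DIR/portals.ini" --e "$CONFIG_DIR/engine.ini"
  if [ -f "$CONFIG_DIR/filters.ini" ]; then
    set -- "$@" --f "$CONFIG_DIR/filters.ini"
  fi
  dotnet "$APP_DLL" "$@"
else
  echo "Not starting: fix the paths listed above." >&2
fi
```

Checking the files first gives a clear message about a wrong path before the application is started.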

πŸ› οΈ How to build, publish, deploy and prepare the application

🫳 Manually

Perform the following steps and commands to prepare the application for execution. Run them either from the root folder (where the solution file is placed) or from the UI Console project in the RealEstatesWatcher.UI.Console folder:

  1. Restore dependencies

    dotnet restore

  2. Build

    dotnet build --configuration Release

  3. Publish the UI Console project for desired platform

    dotnet publish ./RealEstatesWatcher.UI.Console/RealEstatesWatcher.UI.Console.csproj -c Release

    You can use your own publish parameters or the predefined profiles for Windows and Linux in the RealEstatesWatcher.UI.Console/Properties/PublishProfiles/ folder.

  4. Copy the web scraper files from the Tools/scraper folder to the ~/publish/scraper directory.

  5. Make sure the publish directory contains a /configs folder with all the configuration files, or copy them manually from the RealEstatesWatcher.UI.Console/configs/ folder:
    • handlers.ini
    • portals.ini
    • filters.ini
    • engine.ini
    • scraper.ini
  6. Deploy the whole publish directory to a server or run it locally.

  7. On the target machine, enter the scraper folder and install all required Node.js dependencies with the command

    npm install

    * It's important to do this on the target platform, as the dependencies are platform-specific.

❗CHANGE❗(since v1.4.3) 👇

  1. If an Ad portal requires specific cookies to be set during web scraping in order to reach the final ads page (for example to pass through a GDPR consent page), put all of those cookies into the cookies.json file as shown in the template. Remember to copy this file to the same ~/publish/scraper directory as stated above.

    * For example, the Sreality.cz Ads portal requires cookies.
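
Steps 1 to 5 above can be collected into one script run from the repository root. This is a sketch under assumptions: `PUBLISH_DIR` and the `run` helper are illustrative names, and the script only prints the commands unless `DRY_RUN=0` is set, so they can be reviewed first.

```shell
#!/bin/sh
# Sketch of manual steps 1-5, to be run from the repository root.
# Prints the commands by default; run with DRY_RUN=0 to execute them.
set -e

DRY_RUN="${DRY_RUN:-1}"
PUBLISH_DIR="${PUBLISH_DIR:-./publish}"   # assumption: any output folder works
PROJECT="./RealEstatesWatcher.UI.Console/RealEstatesWatcher.UI.Console.csproj"

# Echo the command in dry-run mode, execute it otherwise.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

run dotnet restore                                          # step 1
run dotnet build --configuration Release                    # step 2
run dotnet publish "$PROJECT" -c Release -o "$PUBLISH_DIR"  # step 3
run mkdir -p "$PUBLISH_DIR/scraper"                         # step 4
run cp -R Tools/scraper/. "$PUBLISH_DIR/scraper/"
run cp -R RealEstatesWatcher.UI.Console/configs "$PUBLISH_DIR/"  # step 5
```

The `npm install` step is left out on purpose, because it has to run on the target machine after deployment.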

There is a Dockerfile in the root of the repository that you can use to automatically build and set up a Docker image and run the app in a containerized environment. Use the basic build and run commands:

docker build -t real-estates-watcher:latest .
docker run -itd real-estates-watcher:latest

πŸ“ Configuration files description

engine.ini - configuration of the watching engine (required cmd argument --e or -engine)

[settings]
check_interval_minutes=            # <number> | required | periodic checking interval, minimum 1 minute
enable_multiple_portal_instances=  # <bool>   | optional | enable/disable multiple instances of the same portal (in case of watching multiple URLs of the same portal)
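
For illustration, a filled-in engine.ini could be created as follows. The values (a 15-minute interval, multiple instances enabled) and the ./configs location are assumptions for this example, not recommended defaults.

```shell
# Write an example engine.ini; both values are illustrative assumptions.
mkdir -p configs
cat > configs/engine.ini <<'EOF'
[settings]
check_interval_minutes=15
enable_multiple_portal_instances=true
EOF
```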

portals.* - configuration of all Ad portals to watch (required cmd argument --p or -portals)

❗CHANGE❗(since v1.4)

https://reality.idnes.cz/s/prodej/domy/cena-do-7000000/...↩
https://reality.idnes.cz/s/prodej/pozemky/...↩
https://www.sreality.cz/hledani/prodej/domy/...↩
https://reality.bazos.cz/prodam/dum/...↩
https://reality.bazos.cz/prodam/chata/...↩

handlers.ini - configuration of the classes handling the received Ad posts (required cmd argument --h or -handlers)

❗CHANGE❗(since v1.4.7)

[email]         
enabled=                      # <bool>    | required | enable/disable this handler
from=                         # <string>  | required | email address for outgoing notifications
to=                           # <strings> | required | list of email addresses where to send notifications (separated by comma)
cc=                           # <strings> | optional | list of email CC addresses where to send notifications (separated by comma)
bcc=                          # <strings> | optional | list of email BCC addresses where to send notifications (separated by comma)
sender_name=                  # <string>  | required | name of the sending entity shown with the 'from' address
username=                     # <string>  | required | login username of sending email account
password=                     # <string>  | required | login password of sending email account
smtp_server_host=             # <string>  | required | URL of the SMTP server that handles sending notification emails
smtp_server_port=             # <string>  | required | port of the SMTP server that handles sending notification emails
use_secure_connection=        # <bool>    | required | switch for using TLS connection to the SMTP server
skip_initial_notification=    # <bool>    | required | switch to disable sending the initial list of current Real estate offers
[file]
enabled=                      # <bool>    | required | enable/disable this handler
main_path=                    # <string>  | required | path to the file where to save initial and new Ad posts
separate_new_posts=           # <bool>    | optional | set 'true' if you want to save new Ad posts in separate file from the initial list
new_posts_path=               # <string>  | optional | path to the file where to save new Ad posts (required when separate_new_posts=true)
format=                       # <strings> | optional | format of printing parsed Ads to file, (allowed enum values below) (defaults to 'plain')
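
A filled-in handlers.ini might look like the sketch below. Every value is a placeholder assumption (addresses, SMTP host, port 587, file paths), and `format` is omitted so that it falls back to 'plain'. Because the file holds a password, the example also restricts access to it.

```shell
# Write an example handlers.ini; all values are placeholders to replace.
mkdir -p configs
cat > configs/handlers.ini <<'EOF'
[email]
enabled=true
from=watcher@example.com
to=alice@example.com,bob@example.com
sender_name=Real Estates Watcher
username=watcher@example.com
password=change-me
smtp_server_host=smtp.example.com
smtp_server_port=587
use_secure_connection=true
skip_initial_notification=true

[file]
enabled=true
main_path=./ads.txt
separate_new_posts=true
new_posts_path=./new-ads.txt
EOF

# The file contains a password: make it readable by the owner only.
chmod 600 configs/handlers.ini
```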

filters.ini - configuration of the Ads filter (optional cmd argument --f or -filters)

[basic]
price_min=          # <number>  | optional | minimal price of Real estate
price_max=          # <number>  | optional | maximal price of Real estate
layouts=            # <strings> | optional | layouts of Real estate (allowed enum values below)(separated by comma)
floor_area_min=     # <number>  | optional | minimal floor area of Real estate
floor_area_max=     # <number>  | optional | maximal floor area of Real estate
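
As an example, the following sketch creates a filters.ini that keeps only ads priced between 2,000,000 and 7,000,000 with at least 60 units of floor area. The numbers are assumptions chosen for illustration, and `layouts` is left out because it is optional.

```shell
# Write an example filters.ini; the limits are illustrative assumptions.
mkdir -p configs
cat > configs/filters.ini <<'EOF'
[basic]
price_min=2000000
price_max=7000000
floor_area_min=60
EOF
```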

❗ADDED❗(since v1.4.6) 👇

scraper.ini - configuration of the Web scraper script, used when an Ad portal page is dynamic and requires JavaScript in order to fully load the content (optional cmd argument --s or -scraper)

[nodejs]
path_to_script=                   # <string>  | required | path to web scraping script
page_scraping_timeout_seconds=    # <number>  | required | maximum timeout for scraping a single webpage (when set to 0 waits indefinitely)
path_to_cookies_file=             # <string>  | optional | path to file with cookies that may be needed with some Ad portals (see README)
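
A filled-in scraper.ini could be written like this. The script name `index.js` is hypothetical (use the actual file from the scraper folder), and the 30-second timeout is an assumed value.

```shell
# Write an example scraper.ini. Assumptions: the script file name
# (index.js is hypothetical) and the 30-second timeout.
mkdir -p configs
cat > configs/scraper.ini <<'EOF'
[nodejs]
path_to_script=./scraper/index.js
page_scraping_timeout_seconds=30
path_to_cookies_file=./scraper/cookies.json
EOF
```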