Recently, I faced a challenge: helping a friend search for an apartment in Belgrade. The requirements were strict: a limited budget, a specific neighborhood, and minimal time for searching. Manually monitoring dozens of aggregator websites was clearly inefficient. The solution became obvious: automation.
I decided to create a project that gathers all relevant listings from multiple sources and presents them in a convenient format. This would allow my friend to focus on choosing an apartment rather than endlessly searching for one.
The main idea of the project is to create a “black box” that regularly crawls aggregator websites, filters listings, and provides them in a structured format. At this stage, the project consists of the following key components:
```
.
├── app
│   ├── __init__.py           # Application initialization
│   ├── logging.py            # Logging configuration
│   ├── manager.py            # Main process management
│   ├── routes.py             # API interaction
│   ├── scheduler.py          # Task scheduler
│   ├── scrapers              # Modules for working with specific websites
│   │   ├── base_scraper.py   # Base class
│   │   ├── 4zida.py          # Scraper for 4zida
│   │   ├── cityexpert.py     # Scraper for CityExpert
│   │   ├── halooglasi.py     # Scraper for Halo Oglasi
│   │   ├── nekretnine.py     # Scraper for Nekretnine
│   │   └── sasomange.py      # Scraper for Sasomange
│   └── version.py            # Application version
├── config.py                 # Configuration
├── Dockerfile                # Docker image description
└── requirements.txt          # Dependencies
```
This modular approach simplifies adding new features and streamlines further development.
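To sketch how this modularity might be wired together, here is a minimal, hypothetical version of the manager's role: iterate over the configured scrapers and merge their filtered results. The function and class names below are my assumptions, not the project's actual `manager.py`.

```python
# Hypothetical sketch: run every configured scraper and merge the
# filtered results into a single list of listings.
def collect_listings(scrapers):
    results = []
    for scraper in scrapers:
        raw = scraper.fetch_data()                 # site-specific download/parse
        results.extend(scraper.filter_data(raw))   # site-specific filtering
    return results


# Minimal stand-in scraper so the sketch runs without network access.
class StubScraper:
    def __init__(self, listings):
        self._listings = listings

    def fetch_data(self):
        return list(self._listings)

    def filter_data(self, data):
        return data
```

With this shape, adding support for a new website means writing one new scraper class and registering it with the manager; nothing else changes.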
Each scraper module is responsible for collecting and processing data from a specific website. All scrapers inherit from the base class in base_scraper.py, which defines a unified interface for operation:
Example:

```python
# scrapers/base_scraper.py
class BaseScraper:
    def __init__(self, url):
        self.url = url

    def fetch_data(self):
        raise NotImplementedError("Subclasses must implement this method")

    def filter_data(self, data):
        raise NotImplementedError("Subclasses must implement this method")
```
Each specific scraper overrides these methods based on the website’s structure.
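For illustration, here is what a subclass might look like. The budget filter and the `"price"` field are assumptions for the sake of the example, not the project's actual code (the base class is repeated so the snippet is self-contained):

```python
class BaseScraper:
    def __init__(self, url):
        self.url = url

    def fetch_data(self):
        raise NotImplementedError("Subclasses must implement this method")

    def filter_data(self, data):
        raise NotImplementedError("Subclasses must implement this method")


# Hypothetical subclass: field names and the max_price filter are
# illustrative assumptions.
class ExampleScraper(BaseScraper):
    def __init__(self, url, max_price):
        super().__init__(url)
        self.max_price = max_price

    def fetch_data(self):
        # A real scraper would download and parse listing pages here,
        # e.g. with requests and BeautifulSoup.
        raise NotImplementedError

    def filter_data(self, data):
        # Keep only listings within budget; assumes a "price" key per listing.
        return [item for item in data if item["price"] <= self.max_price]
```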
! DO NOT USE A SITE'S INTERNAL APIs WITHOUT PERMISSION !
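One way to stay on the safe side is to scrape only public pages and honor each site's robots.txt. Python's standard library can check this; the rule lines below are illustrative, and a real crawler would fetch the actual file from `<site>/robots.txt`:

```python
from urllib.robotparser import RobotFileParser

# Parse an illustrative robots.txt. In production, use
# rp.set_url("https://<site>/robots.txt") followed by rp.read().
rp = RobotFileParser()
rp.parse([
    "User-agent: *",
    "Disallow: /private/",
])

print(rp.can_fetch("*", "https://example.com/private/page"))  # False
print(rp.can_fetch("*", "https://example.com/listings"))      # True
```

Checking `can_fetch()` before each request costs almost nothing and keeps the crawler within the rules the site has published.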
At this stage, the project focuses on data collection. Future plans include:

- adding data-processing modules;
- optimizing the system;
- integrating additional features.
The project has already proven its usefulness, saving hours of searching and providing a convenient way to analyze listings. The next step is to make it even more user-friendly and functional.