Depending on your business's niche and market, adjusting your services and prices means taking your competitors into account.
In many of the companies I have seen, this is a manual task that is done once a quarter, or at least once a year.
In this PHP web scraping tutorial, we are going to build a tiny tool that automates this process. Of course, the tool will need further refinement, but it is all about understanding the concepts, right? :)
Let's get started!
Prerequisites
We will need the following set of tools:
- Web server with PHP
- Composer
- Guzzle - HTTP client for fetching the pages
- PHP HTML Parser - HTML parser for extracting the price elements
- A currency parser
Download Composer from getcomposer.org and follow the installation instructions.
After Composer has been installed successfully, install Guzzle via Composer:
composer require guzzlehttp/guzzle
Next, let's install our HTML parser:
composer require paquettg/php-html-parser
Finally, we add the currency parser to our project:
composer require mcuadros/currency-detector dev-master
Building the scraper
As we want to build a competitor price monitoring tool, let's say that this product URL is our own:
https://www.allendalewine.com/products/11262719/diplomatico-reserva-exclusiva
As a competitor page, we select the following:
https://www.winetoship.com/diplomatico-rum-reserva-exclusiva.html
Next, we have to define the CSS selectors of the elements that contain the price information.
For our "own" website, the selector is .sale-price.currency. Going through the same process for the competitor, the selector is .less-price .o_price span.
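Before wiring a selector into the scraper, it can help to try it against a small piece of markup. The following is only a sketch under a few assumptions: the HTML snippet is made up and real product pages will look different, the helper name firstTextByClasses is hypothetical, and it uses PHP's bundled DOM extension with an XPath class test in place of a CSS selector engine, so it is not how PHP HTML Parser works internally.

```php
<?php
// Hypothetical markup standing in for a product page; real pages will differ.
$html = '<div class="product"><span class="sale-price currency">$39.99</span></div>';

/**
 * Return the trimmed text of the first element that carries all given
 * classes, or null if nothing matches (hypothetical helper).
 */
function firstTextByClasses(string $html, array $classes): ?string
{
    $doc = new DOMDocument();
    // Real-world HTML is rarely valid; collect parser warnings silently.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // XPath equivalent of a compound class selector such as ".a.b".
    $conditions = [];
    foreach ($classes as $class) {
        $conditions[] = "contains(concat(' ', normalize-space(@class), ' '), ' $class ')";
    }
    $xpath = new DOMXPath($doc);
    $nodes = $xpath->query('//*[' . implode(' and ', $conditions) . ']');

    if ($nodes === false || $nodes->length === 0) {
        return null;
    }
    return trim($nodes->item(0)->textContent);
}

// Prints the matched price text, or "not found" when the classes do not match.
echo firstTextByClasses($html, ['sale-price', 'currency']) ?? 'not found', PHP_EOL;
echo firstTextByClasses($html, ['missing']) ?? 'not found', PHP_EOL;
```

A null result is a useful early warning: shop pages change their markup from time to time, and a selector that silently stops matching would otherwise break the comparison.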
Putting the pieces together, we end up with the following script:
<?php
require 'vendor/autoload.php';

use GuzzleHttp\Client;
use PHPHtmlParser\Dom;
use CurrencyDetector\Detector;

// Each product maps provider names to a page URL and the CSS selector of its price element.
$productPairs = [
    'rum' => [
        'own' => [
            'url' => 'https://www.allendalewine.com/products/11262719/diplomatico-reserva-exclusiva',
            'selectorPath' => '.sale-price.currency',
        ],
        'competitor1' => [
            'url' => 'https://www.winetoship.com/diplomatico-rum-reserva-exclusiva.html',
            'selectorPath' => '.less-price .o_price span',
        ],
    ],
    // you can add as many product pairs as you wish
];

// The HTTP client and the currency detector can be reused for every request.
$client = new Client();
$detector = new Detector();
$comparison = [];

foreach ($productPairs as $productName => $pair) {
    foreach ($pair as $provider => $product) {
        // Fetch the product page and read its HTML body.
        $response = $client->request('GET', $product['url']);
        $html = (string) $response->getBody();

        // Parse the page and look up the elements matching the price selector.
        $parser = new Dom;
        $parser->loadStr($html);
        $matches = $parser->find($product['selectorPath']);
        if (count($matches) === 0) {
            continue; // selector matched nothing, e.g. after a page redesign
        }
        $priceString = $matches[0]->text;

        // Split the price string into a currency and a numeric amount.
        $comparison[$productName][$provider] = [
            'currency' => $detector->getCurrency($priceString),
            'amount' => $detector->getAmount($priceString),
        ];
    }
}

echo json_encode($comparison);
You can add as many products and competitors as you like. The scraper loops through every product and every provider listed for it and fetches the HTML markup of each page. The DOM parser then extracts the element matched by the CSS selector, and finally the currency detector turns the price string into a currency and an amount, a normalized format that can be compared across shops.
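Once the prices are in this normalized shape, the actual comparison is a small step. The sketch below is one possible way to do it, not part of the original tool: the function name priceGaps and the sample data are made up, and it assumes that the currency is returned as a code such as 'USD' and the amount as a number, which you should verify against the real output of the currency detector.

```php
<?php
/**
 * For each product, compute how far each competitor's price is from our own.
 * Competitors quoting a different currency are skipped (hypothetical helper).
 */
function priceGaps(array $comparison, string $ownKey = 'own'): array
{
    $gaps = [];
    foreach ($comparison as $productName => $providers) {
        if (!isset($providers[$ownKey])) {
            continue; // nothing to compare against
        }
        $own = $providers[$ownKey];
        $ownAmount = (float) $own['amount'];

        foreach ($providers as $provider => $price) {
            if ($provider === $ownKey || $price['currency'] !== $own['currency']) {
                continue;
            }
            // Negative values mean the competitor is cheaper than we are.
            $difference = round((float) $price['amount'] - $ownAmount, 2);
            $gaps[$productName][$provider] = [
                'difference' => $difference,
                'percent' => $ownAmount > 0 ? round($difference / $ownAmount * 100, 1) : null,
            ];
        }
    }
    return $gaps;
}

// Made-up sample data in the assumed shape of the scraper's output.
$sample = [
    'rum' => [
        'own' => ['currency' => 'USD', 'amount' => 40.00],
        'competitor1' => ['currency' => 'USD', 'amount' => 36.00],
    ],
];
echo json_encode(priceGaps($sample)), PHP_EOL;
```

Skipping mismatched currencies keeps the sketch honest: comparing a USD price with a EUR price would need an exchange rate, which is outside the scope of this tiny tool.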
I used the following PHP web scraping tutorial to create this scraper.