The Agenda

This tutorial explains how to write a script that would tale a list of URLs, and create a snapshot images with them. Next time when you run the script it will compare the images and alert if there are some differences.

This is very helpful while active developing because it could identify unexpected changes in the pages and potential errors.

Run Selenium standalone server

Download Selenium Standalone Server from here

You could put the jar file wherever you want but I personally prefer to keep things more organized and to put it in my home directory.

cp selenium-server-standalone-3.141.59.jar ~/.

Also we could create a shell script to start the server.

./start-selenium.sh

java -jar selenium-server-standalone-3.141.59.jar

Start the server ./start-selenium.sh and check if the server is running by navigating the browser to : http://localhost:4444/wd/hub/static/resource/hub.html

Install PHP composer.

Next step will be to install composer in our home folder, since we will need it to install the webDriver. Create a new shell scripts with the following contents:

php -r "copy('https://getcomposer.org/installer', 'composer-setup.php');"
php -r "if (hash_file('sha384', 'composer-setup.php') === '48e3236262b34d30969dca3c37281b3b4bbe3221bda826ac6a9a62d6444cdb0dcd0615698a5cbe587c3f0fe57a54d8f5') { echo 'Installer verified'; } else { echo 'Installer corrupt'; unlink('composer-setup.php'); } echo PHP_EOL;"
php composer-setup.php
php -r "unlink('composer-setup.php');"

then add execute rights chmod +x script.sh and run the script. This will install the composer.

install webDriver

composer require facebook/webdriver

Install chrome driver

brew cask install chromedriver

And now let’s write some PHP code.

<?php

 
use Facebook\WebDriver\Remote\DesiredCapabilities;
use Facebook\WebDriver\Remote\RemoteWebDriver;
use Facebook\WebDriver\WebDriverBy;
use Facebook\WebDriver\WebDriverDimension;
use Facebook\WebDriver\WebDriverExpectedCondition;
use Facebook\WebDriver\WebDriverPoint;
 
$composer_dir = '/Users/toninichev/composer';
require_once $composer_dir . '/vendor/autoload.php';
 
$host = 'http://127.0.0.1:4444/wd/hub'; // this is the default port
 
$driver = RemoteWebDriver::create($host, DesiredCapabilities::chrome());
 
// Set size
$driver->manage()->window()->setPosition(new WebDriverPoint(0,0));
$driver->manage()->window()->setSize(new WebDriverDimension(1280,800));

 
function takeScreenshot($driver, $url, $id) {
    // Navigate to the page
    $driver->get($url);
    
    $driver->wait(3);
    
    // Take a screenshot
    $driver->takeScreenshot(__DIR__ . "/screenshots/scr" . $id . "-tmp.png");
}

$urls = array('https://www.toni-develops.com/', 'https://www.toni-develops.com/2017/04/27/git-bash-cheatsheet/', 'https://www.toni-develops.com/webpack/', 'https://www.toni-develops.com/algorithms/');
//$urls = array('https://www.toni-develops.com/', 'https://www.toni-develops.com/2017/04/27/git-bash-cheatsheet/');

$html = '';

for($i = 0; $i < count($urls); $i++) {
    takeScreenshot($driver,$urls[$i], $i);   

    $a = sha1_file(__DIR__ . "/screenshots/scr" . $i . "-tmp.png");
    $b = file_exists(__DIR__ . "/screenshots/scr" . $i . ".png") ? sha1_file(__DIR__ . "/screenshots/scr" . $i . ".png") : null;
    
    if($a == $b || $b == null) {
        $className = "match";
    }else {
        $className = "noMatch";
    }

    rename(__DIR__ . "/screenshots/scr" . $i . "-tmp.png", __DIR__ . "/screenshots/scr" . $i . ".png");
    $html .= '<div class="picWrapper ' .$className . '"><div><input type="test" value="' . $urls[$i] . '" readonly /></div><img  src="screenshots/scr' . $i . '.png"/></div>';
}


// Close the Chrome browser
$driver->quit();
?>

<html>
<head>
<style>
    .picWrapper input {
        width: 100%;
        text-align:center;
    }

    .picWrapper {
        text-align: center;
    }

    .match {
        background: green;
    }

    .noMatch {
        background: red;
    }


</style>
</head>
<body>

    <?php
        echo $html;
    ?>
      
</body>
</html>

If you ever needed to fetch data from multiple sources on the backend, probably you already explored the benefits of using curl multi fetch.
In the examples below, I’m using curl to fetch data from two urls, which simply have

[php]
<?php
sleep(3);
echo “Content 1 …”;

[/php]

intentionally to delay the content generation so we could see the difference between regular http call and asynchronous curl multi fetch.

http://toni-develops.com/sandbox/examples/php/curl-multi-fetch/fetched-content/fetch1.php
http://toni-develops.com/sandbox/examples/php/curl-multi-fetch/fetched-content/fetch2.php

Single curl request
http://toni-develops.com/sandbox/examples/php/curl-multi-fetch/test1.php
As expected, this takes approximately 6 seconds (3 seconds for each url)

[php]

<?php

$t1 = microtime(true);

function fetchContent($Url) {
// is cURL installed yet?
if (!function_exists(‘curl_init’)){
die(‘Sorry cURL is not installed!’);
}

// OK cool – then let’s create a new cURL resource handle
$ch = curl_init();

// Now set some options (most are optional)

// Set URL to download
curl_setopt($ch, CURLOPT_URL, $Url);

// Set a referer
curl_setopt($ch, CURLOPT_REFERER, “http://www.example.org/yay.htm”);

// User agent
curl_setopt($ch, CURLOPT_USERAGENT, “MozillaXYZ/1.0”);

// Include header in result? (0 = yes, 1 = no)
curl_setopt($ch, CURLOPT_HEADER, 0);

// Should cURL return or print out the data? (true = return, false = print)
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);

// Timeout in seconds
curl_setopt($ch, CURLOPT_TIMEOUT, 10);

//Proxy if needed
//curl_setopt($ch, CURLOPT_PROXY, “64.210.197.20:80”);

curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, FALSE);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 2);

// Download the given URL, and return output
$output = curl_exec($ch);

// Close the cURL resource, and free system resources
curl_close($ch);
return $output;
}

echo “<textarea style=’width:100%;height:40%’>”;
echo fetchContent(‘http://toni-develops.com/sandbox/examples/php/curl-multi-fetch/fetched-content/fetch1.php’);
echo “</textarea>”;

echo ‘<hr>’;

echo “<textarea style=’width:100%;height:40%’>”;
echo fetchContent(‘http://toni-develops.com/sandbox/examples/php/curl-multi-fetch/fetched-content/fetch2.php’);
echo “</textarea>”;

echo ‘fetched for: ‘ . (microtime(true) – $t1) . “\n”;
[/php]

multiple http requests using curl
http://toni-develops.com/sandbox/examples/php/curl-multi-fetch/test2.php
But if we use curl multi fetch, we are cutting down to 3 seconds for both urls.

[php]
<?php

// is cURL installed yet?
if (!function_exists(‘curl_init’)){
die(‘Sorry cURL is not installed!’);
}

$ch = array();
$mh = curl_multi_init();
$total = 100;

$t1 = microtime(true);

$URLs = array( “http://toni-develops.com/sandbox/examples/php/curl-multi-fetch/fetched-content/fetch2.php”,
“http://toni-develops.com/sandbox/examples/php/curl-multi-fetch/fetched-content/fetch2.php”);

$i = 0;
foreach($URLs as $url) {
$ch[$i] = curl_init();
curl_setopt($ch[$i], CURLOPT_URL, $url);
curl_setopt($ch[$i], CURLOPT_HEADER, 0);
curl_setopt($ch[$i], CURLOPT_RETURNTRANSFER, true);

curl_multi_add_handle($mh, $ch[$i]);
$i ++;
}

$active = null;
do {
$mrc = curl_multi_exec($mh, $active);
//usleep(100); // Maybe needed to limit CPU load (See P.S.)
} while ($active);

$content = array();

$i = 0;
foreach ($ch AS $i => $c) {
$content[$i] = curl_multi_getcontent($c);
curl_multi_remove_handle($mh, $c);
}

curl_multi_close($mh);

echo “<textarea style=’width:100%;height:40%’>”;
echo $content[0];
echo “</textarea>”;

echo ‘<hr>’;

echo “<textarea style=’width:100%;height:40%’>”;
echo $content[1];
echo “</textarea>”;

echo ‘fetched for: ‘ . (microtime(true) – $t1) . “\n”;
[/php]

So this way we are reducing the amount of time to fetch data from multiple URLs to the amount needed to complete the longest transaction.

Toni-Develops

My personal developing blog

Tag Archives: php

Selenium and PHP – Write a snapshot comparing tool

The Agenda

Run Selenium standalone server

Install PHP composer.

install webDriver

Install chrome driver

Multiple asynchronous http calls using curl