Sunday, March 19, 2017

Extract data from a web page with Goutte when the data is loaded with AJAX

I'm trying to extract the data for these products: https://www.appliancesdelivered.ie/search?sort=price_desc#/?search_return=all&sort_by_field=Price+Low+to+High

I'm using Laravel and Goutte for this, but I can't make it work. When I extract the header menu, it works fine.

With the header menu:

$products = $crawler->filter('body > div.container > header > nav > ul > li > a')->each(function ($node) {
        print $node->text()."\n";
        return $node->text();
});
dd($products);

Now, when I try to get the titles of the products in the list as a test (building the selector path the same way), the result is an empty array:

$products = $crawler->filter('body > div.container >div > div.product-popup-container > div.container > div > div.search-results.col-md-9 > div > div.product-description.col-xs-8.col-sm-8 > div > div.col-xs-12.col-sm-7.col-lg-8 > h4 > a')->each(function ($node) {
        print $node->text()."\n";
        return $node->text();
});
dd($products);

I am starting to think this is because the product list is loaded with AJAX, so it is not part of the response to the first request the way the header menu is. What can I do to extract the information correctly? Please help!
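One way to check that suspicion is to count the matched nodes before extracting anything. The snippet below is only a sketch: the HTML string is a made-up stand-in for what the first request might return (menu present, results container empty), the shortened product selector is hypothetical, and it assumes symfony/dom-crawler and symfony/css-selector are installed through Composer, which they are when Goutte is.

```php
<?php
// Sketch only: assumes Composer's autoloader and the Symfony DomCrawler
// and CssSelector components (both are pulled in by Goutte).
require __DIR__ . '/vendor/autoload.php';

use Symfony\Component\DomCrawler\Crawler;

// Hypothetical stand-in for the HTML of the first response: the menu is
// in the markup, but the results container is empty because a script
// would fill it in later in a real browser.
$html = <<<'HTML'
<html><body>
<div class="container">
  <header><nav><ul>
    <li><a href="/a">Menu A</a></li>
    <li><a href="/b">Menu B</a></li>
  </ul></nav></header>
  <div class="search-results col-md-9"></div>
</div>
</body></html>
HTML;

$crawler = new Crawler($html);

// Same selector as the working header-menu example.
$menu = $crawler->filter('body > div.container > header > nav > ul > li > a');

// Shortened, hypothetical selector for the product titles.
$titles = $crawler->filter('div.search-results h4 > a');

echo 'menu links: ' . $menu->count() . "\n";
echo 'product titles: ' . $titles->count() . "\n";

// each() on an empty selection returns an empty array, which matches
// the symptom described above.
$result = $titles->each(function (Crawler $node) {
    return $node->text();
});
var_dump($result);
```

In the real scraper, the same count() call on the crawler returned by Goutte would show whether the product markup is in the first response at all. If the count is zero there, the products probably arrive through a separate request, which should be visible in the browser's network tab.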



via DobleseSays
