This module is API-compatible with node-phantom but doesn't rely on WebSockets / socket.io. In essence the communication between Node and Phantom / Slimer has been simplified significantly. It has the following advantages over node-phantom:
Works under cluster (node-phantom does not, due to how it works); server.listen(0) works in cluster.
Your software should work without changes, but may show deprecation warnings about outdated signatures. You need to update:
options.phantomPath -> options.path
.create(), .evaluate() & .waitForSelector() -> move callback to the last position of the arguments list.
That's all!
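The option rename can be illustrated with a small sketch. `migrateOptions` is a hypothetical helper (it is not part of node-phantom-simple), and the path in the commented call is only a placeholder; the call itself shows the callback-last form.

```javascript
// Hypothetical helper, not part of node-phantom-simple:
// renames the deprecated `phantomPath` option to `path`.
function migrateOptions(options) {
  var migrated = {};
  Object.keys(options || {}).forEach(function (key) {
    if (key !== 'phantomPath') {
      migrated[key] = options[key];
    }
  });
  // Keep an explicit `path` if one is already present.
  if (options && options.phantomPath !== undefined && migrated.path === undefined) {
    migrated.path = options.phantomPath;
  }
  return migrated;
}

// Callback-last usage (assumes `driver = require('node-phantom-simple')`;
// the path below is a placeholder):
// driver.create(migrateOptions({ phantomPath: '/usr/local/bin/phantomjs' }), function (err, browser) {
//   /* ... */
// });
```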
npm install node-phantom-simple
# Also need phantomjs OR slimerjs:
npm install phantomjs
# OR
npm install slimerjs
Note: SlimerJS is not headless and requires a windowing environment. Under Linux/FreeBSD/OSX, xvfb can be used to run it headlessly. For example, if you wish to run SlimerJS on Travis-CI, add these lines to your .travis.yml config:
before_script:
- export DISPLAY=:99.0
- "sh -e /etc/init.d/xvfb start"
You can use it exactly like node-phantom, and the entire API of PhantomJS should work, with the exception that every method call takes a callback (always as the last parameter), instead of returning values.
For example, this is an adaptation of a web scraping example:
var driver = require('node-phantom-simple');

driver.create({ path: require('phantomjs').path }, function (err, browser) {
  return browser.createPage(function (err, page) {
    return page.open("http://tilomitra.com/repository/screenscrape/ajax.html", function (err, status) {
      console.log("opened site? ", status);
      page.includeJs('http://ajax.googleapis.com/ajax/libs/jquery/1.7.2/jquery.min.js', function (err) {
        // jQuery Loaded.
        // Wait for a bit for AJAX content to load on the page. Here, we are waiting 5 seconds.
        setTimeout(function () {
          return page.evaluate(function () {
            // Get what you want from the page using jQuery. A good way is to populate an object
            // with all the jQuery commands that you need and then return the object.
            var h2Arr = [],
                pArr = [];
            $('h2').each(function () { h2Arr.push($(this).html()); });
            $('p').each(function () { pArr.push($(this).html()); });
            return {
              h2: h2Arr,
              p: pArr
            };
          }, function (err, result) {
            console.log(result);
            browser.exit();
          });
        }, 5000);
      });
    });
  });
});
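Because every method takes a Node-style callback as its last argument, Node's built-in `util.promisify` can be used to flatten nesting like this. The sketch below is one possible approach, not something the module provides: `promisifyMethods` is a hypothetical helper, and it assumes each wrapped method calls back exactly once with `(err, result)`.

```javascript
var util = require('util');

// Hypothetical helper: returns an object whose listed methods return promises.
// Assumes each method takes a Node-style callback (err, result) as its last argument.
function promisifyMethods(target, names) {
  var wrapped = {};
  names.forEach(function (name) {
    // bind() keeps `this` pointing at the original object.
    wrapped[name] = util.promisify(target[name].bind(target));
  });
  return wrapped;
}

// Sketch of use with a page object obtained from browser.createPage():
// var p = promisifyMethods(page, ['open', 'evaluate']);
// p.open('http://example.com/')
//   .then(function (status) { return p.evaluate(function () { return document.title; }); })
//   .then(function (title) { console.log(title); });
```

Only methods that follow the callback-last convention should be wrapped this way; event handlers such as `onResourceReceived` are assigned, not called, and are left alone.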
options (not mandatory):
/CoreText/ to suppress some common annoying font-related warnings.
For example:
driver.create({ parameters: { 'ignore-ssl-errors': 'yes' } }, callback)
driver.create({ parameters: ['-jsconsole', '-P', 'myVal'] }, callback)
will start phantom as:
phantomjs --ignore-ssl-errors=yes
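To make the mapping concrete, here is a minimal sketch of how a `parameters` value could be turned into command-line arguments. `toCliArgs` is a hypothetical function written for illustration, not the library's actual implementation; in particular, passing array entries through verbatim is an assumption based on the array example above.

```javascript
// Illustrative only: not the library's actual implementation.
// Object entries become `--name=value`; array entries are assumed to pass through verbatim.
function toCliArgs(parameters) {
  if (Array.isArray(parameters)) {
    return parameters.map(String);
  }
  return Object.keys(parameters || {}).map(function (name) {
    return '--' + name + '=' + parameters[name];
  });
}

console.log(toCliArgs({ 'ignore-ssl-errors': 'yes' }).join(' ')); // --ignore-ssl-errors=yes
console.log(toCliArgs(['-jsconsole', '-P', 'myVal']).join(' '));  // -jsconsole -P myVal
```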
You can rely on globally installed engines, but we recommend passing the path explicitly:
driver.create({ path: require('phantomjs').path }, callback)
// or for slimer
driver.create({ path: require('slimerjs').path }, callback)
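If a project may have either engine installed, the lookup can be wrapped in a small helper. The sketch below is hypothetical (`resolveEnginePath` is not part of the module); it assumes only that the `phantomjs` and `slimerjs` packages export a `path` property, as the calls above do.

```javascript
// Hypothetical helper: returns the `path` exported by the first engine package
// that can be loaded, or undefined if none is installed.
// `load` defaults to require; it is injectable so the lookup can be tried
// without the packages being present.
function resolveEnginePath(packageNames, load) {
  load = load || require;
  for (var i = 0; i < packageNames.length; i++) {
    try {
      var engine = load(packageNames[i]);
      if (engine && engine.path) {
        return engine.path;
      }
    } catch (err) {
      // Package not installed; try the next candidate.
    }
  }
  return undefined;
}

// Sketch of use (assumes `driver` and `callback` are defined as above):
// var enginePath = resolveEnginePath(['phantomjs', 'slimerjs']);
// driver.create(enginePath ? { path: enginePath } : {}, callback);
```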
You can also look at the test directory for examples of using the API; however, the de facto reference is the PhantomJS documentation. Just mentally substitute callbacks for all return values.
All of the WebPage callbacks have been implemented, including onCallback, and are set the same way as with the core phantomjs library:
page.onResourceReceived = function(response) {
console.log('Response (#' + response.id + ', stage "' + response.stage + '"): ' + JSON.stringify(response));
};
This includes the onPageCreated callback, which receives a new page object.
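Since handlers are plain property assignments, they are easy to wrap. As a small sketch, the hypothetical helper `trackResources` below collects the ids of completed responses; it relies on the `id` and `stage` fields used in the handler above, with `stage` being `'end'` once a resource has finished loading.

```javascript
// Hypothetical helper: records the id of every response that reaches its final stage.
// Assumes `page` accepts handler assignment as shown above.
function trackResources(page) {
  var finished = [];
  page.onResourceReceived = function (response) {
    // `stage` is 'start' for the first chunk and 'end' once the resource is complete.
    if (response.stage === 'end') {
      finished.push(response.id);
    }
  };
  return finished;
}

// Sketch of use: call before page.open(), then inspect the array afterwards.
// var finished = trackResources(page);
```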
Properties on the WebPage and Phantom objects are accessed via the get()/set() method calls:
page.get('content', function (err, html) {
console.log("Page HTML is: " + html);
});
page.set('zoomFactor', 0.25, function () {
page.render('capture.png');
});
// You can get/set nested values easily!
page.set('settings.userAgent', 'PhAnToSlImEr', callback);
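The dotted-name form can be pictured with a short sketch. `setNested` and `getNested` below are hypothetical functions that show how a name such as `'settings.userAgent'` can address a nested property on a plain object; they are not the library's actual implementation.

```javascript
// Illustrative only: not the library's actual implementation.
// Walks a dotted name and assigns the value at the final key.
function setNested(target, dottedName, value) {
  var keys = dottedName.split('.');
  var last = keys.pop();
  var node = keys.reduce(function (obj, key) {
    if (obj[key] === undefined) {
      obj[key] = {};
    }
    return obj[key];
  }, target);
  node[last] = value;
  return target;
}

// Walks a dotted name and returns the value, or undefined if any step is missing.
function getNested(target, dottedName) {
  return dottedName.split('.').reduce(function (obj, key) {
    return obj == null ? undefined : obj[key];
  }, target);
}
```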
Engines are buggy. Here are some cases you should know about.
.evaluate can return a corrupted result.
A page.onConfirm() handler cannot return a value, due to the asynchronous nature of the driver. Use .setFn() instead: page.setFn('onConfirm', function () { return true; }).
Made by Matt Sergeant for Hubdoc Inc.