javascript - Express app.get() equivalent in CasperJS -


I have built a simple web scraper that scrapes a website and outputs the data I need when I visit the URL localhost:3434/page. I implemented this functionality using the Express app.get() method.

I have the following questions:

1) I want to know if there is a way to implement this functionality in CasperJS.

2) Is there a way to make the code start scraping only after I visit the URL localhost:8081/scrape? I don't think I am creating the endpoint correctly, because it starts scraping before I visit the URL.

3) When I visit the URL, it gives me an error saying the URL is not available.

I think all of these problems would be solved if I could set up the endpoint localhost:3434/page correctly in CasperJS. I don't need the results to appear on the page. I just need the scraping to start when I visit the URL.

Below is the code I developed to scrape the website and create a server in Casper.

    var server = require('webserver').create();

    var service = server.listen(3434, function(request, response) {
        var casper = require('casper').create({
            logLevel: "verbose",
            debug: true
        });

        var links;
        var name;
        var paragraph;
        var firstname;
        var expression = /[-a-zA-Z0-9@:%_\+.~#?&//=]{2,256}\.[a-z]{2,4}\b(\/[-a-zA-Z0-9@:%_\+.~#?&//=]*)?/gi;
        var regex = new RegExp(expression);

        casper.start('http://www.home.com/professionals/c/oho,-tn');

        casper.then(function getLinks() {
            links = this.evaluate(function() {
                var links = document.getElementsByClassName('pro-title');
                links = Array.prototype.map.call(links, function(link) {
                    return link.getAttribute('href');
                });
                return links;
            });
        });

        casper.then(function() {
            this.each(links, function(self, link) {
                if (link.match(regex)) {
                    self.thenOpen(link, function(a) {
                        var firstname = this.fetchText('div.info-list-text');
                        this.echo(firstname);
                    });
                }
            });
        });

        casper.run(function() {
            response.statusCode = 200;
            response.write(firstname);
            response.close();
        });
    });

The web server used in a CasperJS script is PhantomJS's Web Server Module, which is "intended for ease of communication between PhantomJS scripts and the outside world and is not recommended for use as a general production server".
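For quick local experiments with that module, the closest thing to Express's app.get('/page', ...) is probably to check request.url inside the listen callback, because the callback fires for every request the browser makes (which may include ones such as /favicon.ico). The following is a minimal sketch under that assumption, not a definitive implementation. handlePageRequest, startScrape and runCasperScrape are hypothetical names, and the request and response objects are assumed to behave like the ones the PhantomJS webserver module passes in (request.method, request.url, response.statusCode, response.write, response.close).

```javascript
// Sketch only: handlePageRequest and startScrape are hypothetical names.
// request/response are assumed to follow PhantomJS's webserver module API.
function handlePageRequest(request, response, startScrape) {
    // Strip any query string so "/page?x=1" still matches "/page".
    var path = request.url.split('?')[0];

    if (request.method !== 'GET' || path !== '/page') {
        // Any other URL must not trigger a scrape.
        response.statusCode = 404;
        response.write('Not found');
        response.close();
        return false;
    }

    // Only now start scraping; reply once the scrape reports its result.
    startScrape(function (text) {
        response.statusCode = 200;
        response.write(text);
        response.close();
    });
    return true;
}

// Wiring inside a PhantomJS/CasperJS script would look roughly like this
// (runCasperScrape is a hypothetical function wrapping casper.start/run):
// require('webserver').create().listen(3434, function (request, response) {
//     handlePageRequest(request, response, runCasperScrape);
// });
```

With this shape, the CasperJS steps run only when the matching URL is requested, and every other request gets a 404 instead of starting another scrape.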

You should not build your web server in PhantomJS. Check out the node-phantom bridges, which let you use Phantom from a regular Node.js web server.

SpookyJS is a driver particularly for CasperJS, whereas the others are for PhantomJS only.
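Whichever bridge you pick, the Node.js side can stay very small. Here is a rough sketch using only Node's built-in http module, assuming the scrape itself is hidden behind a callback-style function. createScrapeServer, scrape and myScrape are hypothetical names; the body of scrape would be whatever code drives CasperJS or PhantomJS through the bridge you choose.

```javascript
var http = require('http');

// Sketch only: createScrapeServer and scrape are hypothetical names.
// scrape(callback) stands in for the bridge-specific scraping code and
// must call callback(err, text) when it finishes.
function createScrapeServer(scrape) {
    return http.createServer(function (req, res) {
        var path = req.url.split('?')[0];

        if (req.method !== 'GET' || path !== '/page') {
            res.writeHead(404, { 'Content-Type': 'text/plain' });
            res.end('Not found');
            return;
        }

        // The scrape starts only when /page is actually requested.
        scrape(function (err, text) {
            if (err) {
                res.writeHead(500, { 'Content-Type': 'text/plain' });
                res.end('Scrape failed: ' + err.message);
                return;
            }
            res.writeHead(200, { 'Content-Type': 'text/plain' });
            res.end(text);
        });
    });
}

// Usage (not run here): createScrapeServer(myScrape).listen(3434);
```

This keeps the HTTP handling in Node, where Express or the plain http module is meant to be used, and leaves the headless browser to do nothing but the scraping.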

CasperJS does allow being loaded from within PhantomJS, though, so you can at least use it in phridge (not sure about the others), since phridge has a .run function that runs a function directly inside the PhantomJS environment:

    casperPath = path.join(require.resolve('casperjs/bin/bootstrap'), '/../..');
    phantom.run(casperPath, function(casperPath) {
        phantom.casperPath = casperPath;
        phantom.injectJs(casperPath + '/bin/bootstrap.js');
        casper = require('casper').create();
        ...

Besides the ones that use PhantomJS, there are others:

ZombieJS uses native Node.js libraries, which makes it the fastest and most natural to use in a Node.js app. It is meant more for testing purposes, though, and may not work on all the sites that other scrapers can handle.

