Channel: Kodi Community Forum - All Forums

PhantomJS module?

I've recently been using PhantomJS inside an addon to scrape content from a JavaScript-heavy site. It occurred to me that it might be useful to extract this into a module that other plugin developers can use.

Rationale
Many content sites today rely heavily on JavaScript and can be hard, if not impossible, to scrape using simple HTTP libraries. For some sites you really need a full browser with a DOM implementation and JavaScript support. PhantomJS is a fully scriptable headless WebKit browser and is well suited to this task.

Features
The module would make it easy for plugin developers to scrape sites using PhantomJS. It would provide:
  • A Python API to invoke PhantomJS with a given script, passing arguments to it and returning the results as a dictionary.
  • A PhantomJS JavaScript file with some useful utilities for writing scraper scripts.
  • Possibly a "background" mode which keeps the PhantomJS process running so it can be reused on subsequent plugin calls.
  • Possibly automatic installation of PhantomJS for the user.
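
To make the first item concrete, here is a rough sketch of what the Python side could look like. It is only an illustration, not existing module code: the function name run_phantomjs, its parameters, and the convention that the scraper script prints a single JSON object to stdout are all assumptions of mine. The only PhantomJS behaviour it relies on is the usual "phantomjs script.js arg1 arg2" command line. It also assumes Python 3 for subprocess.run.

```python
import json
import subprocess


def run_phantomjs(script_path, args=(), executable="phantomjs", timeout=60):
    """Run a scraper script and return its output as a dictionary.

    Hypothetical helper: assumes the script writes exactly one JSON
    object to stdout and exits with status 0 on success.
    """
    # Command line has the form: phantomjs script.js arg1 arg2 ...
    command = [executable, script_path] + [str(a) for a in args]
    completed = subprocess.run(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        timeout=timeout,  # raises subprocess.TimeoutExpired if exceeded
    )
    if completed.returncode != 0:
        message = completed.stderr.decode("utf-8", "replace")
        raise RuntimeError(
            "scraper exited with status %d: %s" % (completed.returncode, message)
        )
    # Assumed convention: stdout holds a JSON object and nothing else.
    result = json.loads(completed.stdout.decode("utf-8"))
    if not isinstance(result, dict):
        raise ValueError("expected a JSON object on stdout")
    return result
```

A plugin would then call something like run_phantomjs("scrape.js", ["http://example.com/videos"]), where scrape.js is a hypothetical script that reads its arguments from require('system').args and prints its results with JSON.stringify. A background mode would need a different approach, since this sketch starts a new process on every call.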


Just wondering if there's any interest in such a module, or if there is a better way to approach the problem altogether?
