screen-scraper is a tool for extracting data from Web sites. It works much like a database, giving you structured access to information on the Web. Its graphical interface lets you designate the URLs to request, the data elements to extract, and the scripting logic used to traverse pages and work with scraped data. Once these items have been created, screen-scraper can be invoked from external languages and platforms such as .NET, Java, PHP, and Active Server Pages. It can be scheduled to scrape information at periodic intervals and can automatically write extracted data to CSV files.
|Tags||Internet, Web, Networking, Monitoring|
|Operating Systems||Mac OS X, OS Independent, Windows, POSIX, Linux|
|Implementation||Java, ASP, C#, Cold Fusion, PHP|
Release Notes: This release contains a number of feature enhancements and bugfixes. Objects can now be dragged and dropped into folders, the logging window has several new features such as automatic scrolling, scripts can be called from other scripts, and the database is backed up automatically. A new library has also been added to facilitate saving scraped data as XML.
Release Notes: Several bugfixes and minor features have been added. New features include automatic backup of the database and enhanced HTML rendering and HTML stripping. Fixes address an error that at times caused duplicate scripts to appear on import, as well as multiple errors relating to international character sets and non-ASCII characters.
Release Notes: The http-client library has been updated to accept all SSL certificates. A bug that, in certain situations, caused the database to be closed prematurely when screen-scraper was invoked from the command line has been fixed.
Release Notes: This release fixes a particularly annoying bug that slipped into version 2.7 related to running from the command line. It also contains a few other minor bugfixes.
Release Notes: screen-scraper can now generate RSS feeds from scraped data. The session.addToSessionVariable method was added. Log messages have been enhanced and clarified. All of screen-scraper's ports may now be set in the properties file. A number of miscellaneous bugfixes have been made.