Articles tagged with: XLS

24 January 2012

Mine Your Website's Data With a Private Custom Crawler

Written by Dr. Ulrich Sigmund, Posted in Blog

Web pages provide a plethora of information and mineable data. Unfortunately, most of them use classic HTML rather than the XML-based XHTML. We therefore decided to extend the ANKHOR XML parser to accept most HTML content.

With this extension, it is now quite simple to, for example, extract all <img> references from a web page and convert them into a table.
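For readers who want to try the same idea outside of ANKHOR, here is a minimal Python sketch. It is not the ANKHOR operator. It assumes the standard library's lenient html.parser as a stand-in for the extended parser, and the names ImgCollector and img_table are made up for this illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ImgCollector(HTMLParser):
    """Collects one table row per <img> tag; tolerates non-XML HTML."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.rows = []

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <img ... />.
        if tag != "img":
            return
        attributes = dict(attrs)
        src = attributes.get("src")
        if src:
            self.rows.append({
                # Resolve relative references against the page URL.
                "src": urljoin(self.base_url, src),
                "alt": attributes.get("alt") or "",
            })


def img_table(html, base_url):
    """Return a list of rows (dicts with 'src' and 'alt') for all images."""
    collector = ImgCollector(base_url)
    collector.feed(html)
    collector.close()
    return collector.rows
```

Each row is a plain dictionary, so the result can be handed to csv.DictWriter or any other table tool without further conversion.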


For testing purposes, I have created a simple web crawler that walks through all reachable documents on a given domain, starting at the root. It uses a while loop to iterate over the access depth. At each level, a HEAD request is executed in parallel for every resource that is reachable at that level and has not been accessed in an earlier iteration.
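The level-by-level loop described above can be sketched in Python as follows. This is an illustration under assumptions, not the FlowSheet implementation: the function names, the max_depth and workers parameters, the restriction to http(s) links on the same host, and the choice to fetch only text/html pages for links are all choices made for this example.

```python
from concurrent.futures import ThreadPoolExecutor
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse
from urllib.request import Request, urlopen


class LinkCollector(HTMLParser):
    """Collects the values of all href and src attributes."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.links.append(value)


def head_request(url):
    """Issue a HEAD request; return the content type, or None on failure."""
    try:
        with urlopen(Request(url, method="HEAD"), timeout=10) as response:
            return response.headers.get_content_type()
    except OSError:  # URLError and HTTPError are subclasses of OSError
        return None


def fetch_links(url):
    """Download a page and return the raw link targets found in it."""
    try:
        with urlopen(url, timeout=10) as response:
            text = response.read().decode("utf-8", errors="replace")
    except OSError:
        return []
    collector = LinkCollector()
    collector.feed(text)
    return collector.links


def crawl(root, max_depth=3, head_fn=head_request, links_fn=fetch_links, workers=8):
    """Walk one domain level by level; return (depth, url, content_type) rows."""
    host = urlparse(root).netloc
    seen = {root}
    level = [root]
    table = []
    depth = 0
    while level and depth <= max_depth:
        # HEAD requests for all resources of this level run in parallel.
        with ThreadPoolExecutor(max_workers=workers) as pool:
            content_types = list(pool.map(head_fn, level))
        next_level = []
        for url, content_type in zip(level, content_types):
            table.append((depth, url, content_type))
            if content_type != "text/html":
                continue  # only HTML documents are scanned for links
            for link in links_fn(url):
                target = urldefrag(urljoin(url, link)).url
                parts = urlparse(target)
                if (parts.scheme in ("http", "https")
                        and parts.netloc == host
                        and target not in seen):
                    seen.add(target)
                    next_level.append(target)
        level = next_level
        depth += 1
    return table
```

Passing head_fn and links_fn as parameters keeps the loop itself free of network code, so it can be exercised against an in-memory site map before being pointed at a real domain.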

06 May 2010

Direct Data Import From XLSX And XLS Files

Written by Stefan Herr, Posted in Blog

In the blog entry from March 28, we showed you how to exchange data between ANKHOR FlowSheet and traditional spreadsheet applications using the clipboard or CSV files as an intermediate data format. The latest release of ANKHOR FlowSheet, 0.9.47, now provides a new library that supports the direct import of XLSX and XLS files. It offers immediate access to the data fields in the workbooks and even (within certain limits) the automatic conversion of cell formulas into FlowSheet macros. This means that the original spreadsheet can be automatically converted into a corresponding FlowSheet!

Import from XLSX and XLS Files

In this article we explain how to make use of the macros in the new "Spreadsheet Import" library.