
Xidel

Xidel is a command-line tool for downloading Web pages and extracting data from them. It can download files over HTTP/S connections, follow redirects, links, or extracted values, and process local files. Data can be extracted using XPath 2.0, XQuery 1.0, and JSONiq expressions, CSS 3 selectors, and custom pattern-matching templates that resemble an annotated version of the processed page. The extracted values can then be exported as plain text, XML, HTML, or JSON, or assigned to variables that can be used in other extract expressions or exported to the shell. There is also an online CGI service for testing.
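The extract-and-follow idea described above can be illustrated outside of Xidel. The following is a minimal sketch in Python's standard library, not Xidel's own implementation or syntax: it selects link elements with a path expression and resolves their hrefs against a base URL, which is roughly what has to happen before a link can be followed. The page content, the base URL, and the name extract_links are assumptions made up for the illustration, and ElementTree supports only a small XPath subset and requires well-formed markup, unlike Xidel.

```python
from __future__ import annotations

import xml.etree.ElementTree as ET
from urllib.parse import urljoin

# Hypothetical page content and base URL, used only for illustration.
PAGE = """<html><body>
  <h1>Example</h1>
  <a href="/docs/index.html">Docs</a>
  <a href="news.html">News</a>
</body></html>"""
BASE_URL = "http://example.org/start/"


def extract_links(page: str, base_url: str) -> list[str]:
    """Select every <a> element that has an href and resolve it against base_url."""
    root = ET.fromstring(page)
    # ".//a[@href]" lies within the limited XPath subset ElementTree understands.
    return [urljoin(base_url, a.get("href")) for a in root.findall(".//a[@href]")]


if __name__ == "__main__":
    for url in extract_links(PAGE, BASE_URL):
        print(url)
```

A real page would usually need a forgiving HTML parser first, since ET.fromstring rejects markup that is not well-formed XML.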

Recent releases

  •  25 Mar 2014 12:28

    Release Notes: This release improved JSONiq support (updated to JSONiq 1.0.3) and the JSON extensions (more compatible with XQuery, assignable), uses arbitrary precision arithmetic for all numeric operations where necessary, and adds a trivial subset of XPath/XQuery 3 (!, ||, and switch), new functions for resolving URIs or HTML hrefs, some new multipage commands similar to XSLT, and more.

  •  26 Mar 2013 12:49

    Release Notes: This release added JSONiq support with functions/objects/arrays/literals, improved the command syntax by allowing command line options to be grouped so that they apply only to certain pages/links, changed the input/output formats, added support for exporting variables to the shell, fixed several HTML parsing/serialization bugs, changed the syntax of extended strings (from "..$var;.." to x"..{$var}.."), added more HTTP options (different methods, ports, authorization), and made various other minor changes.

  •  06 Nov 2012 22:27

    Release Notes: The XPath interpreter has been extended into a complete XQuery 1 interpreter; in the process, some bugs and design mistakes were found and fixed. Two functions were added: a "form" function that encodes HTML forms and makes it easy to follow POST requests (e.g. -f form(//form[1], "username=...&password=...")), and a "match" function that runs the pattern-matching templates from within XPath 2 expressions. The Windows command line interface was improved (e.g. support for single quotes), and the two online services were merged into one.
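To make the idea behind a form-encoding function more concrete, here is a rough Python sketch of what encoding an HTML form with overridden field values can look like. It is an illustration under stated assumptions, not Xidel's implementation: the names FormFieldCollector and encode_form, the sample login page, and the field values are all hypothetical, and only input elements of the first form are considered (no select, textarea, or checkbox handling).

```python
from html.parser import HTMLParser
from urllib.parse import parse_qsl, urlencode

# Hypothetical login page, used only for illustration.
LOGIN_PAGE = """<form action="/login" method="post">
  <input type="hidden" name="token" value="abc123">
  <input type="text" name="username">
  <input type="password" name="password">
</form>"""


class FormFieldCollector(HTMLParser):
    """Collects name/value pairs of <input> elements inside the first <form>."""

    def __init__(self):
        super().__init__()
        self.fields = {}
        self.in_form = False
        self.done = False

    def handle_starttag(self, tag, attrs):
        if self.done:
            return  # ignore everything after the first form
        attrs = dict(attrs)
        if tag == "form":
            self.in_form = True
        elif tag == "input" and self.in_form and attrs.get("name"):
            # Inputs without a value attribute default to an empty string.
            self.fields[attrs["name"]] = attrs.get("value") or ""

    def handle_endtag(self, tag):
        if tag == "form" and self.in_form:
            self.in_form = False
            self.done = True


def encode_form(html, overrides):
    """Return the first form's fields as a URL-encoded string.

    overrides is a query string such as "username=a&password=b" whose
    values replace the defaults found in the page.
    """
    collector = FormFieldCollector()
    collector.feed(html)
    fields = dict(collector.fields)
    fields.update(parse_qsl(overrides, keep_blank_values=True))
    return urlencode(fields)


if __name__ == "__main__":
    print(encode_form(LOGIN_PAGE, "username=alice&password=s3cret"))
```

The point of starting from the page's own fields is that hidden inputs, such as the token in this sample, are carried along automatically, so the caller only has to supply the values that change.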
