51 projects tagged "Parser"

Download Website Updated 12 Feb 2013 HtmlCleaner

Pop 16.61
Vit 22.14

HtmlCleaner is an HTML parser. HTML found on the Web is usually dirty, ill-formed, and unsuitable for further processing. For any serious consumption of such documents, it is necessary to first clean up the mess and bring order to the tags, attributes, and ordinary text. For a given HTML document, HtmlCleaner reorders individual elements and produces well-formed XML. By default, it follows rules similar to those which most Web browsers use to create a Document Object Model. However, the user may provide custom tag and rule sets for tag filtering and balancing.
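The tag-balancing idea described above can be illustrated with a minimal Python sketch. This is not HtmlCleaner's algorithm (which follows browser-like rules and is a Java library); it is a naive stack-based balancer built on the standard `html.parser` module. The `Balancer` class, the `balance` function, and the short `VOID` tag set are names assumed for the example. It closes unclosed tags, drops stray closing tags, and drops comments, but it does not guarantee a single root element.

```python
from html import escape
from html.parser import HTMLParser

# Assumption: a small, incomplete set of elements that never have content.
VOID = {"br", "hr", "img", "input", "link", "meta"}


class Balancer(HTMLParser):
    """Re-emit markup so that every opened tag is closed in nesting order."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []    # pieces of the cleaned output
        self.stack = []  # currently open, non-void tags

    def handle_starttag(self, tag, attrs):
        attr_text = "".join(
            ' {}="{}"'.format(k, escape(v or "", quote=True)) for k, v in attrs
        )
        if tag in VOID:
            self.out.append("<{}{} />".format(tag, attr_text))
        else:
            self.out.append("<{}{}>".format(tag, attr_text))
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag not in self.stack:
            return  # stray closing tag: drop it
        # Close everything opened after the matching tag, then the tag itself.
        while self.stack:
            open_tag = self.stack.pop()
            self.out.append("</{}>".format(open_tag))
            if open_tag == tag:
                break

    def handle_data(self, data):
        self.out.append(escape(data, quote=False))


def balance(html_text):
    parser = Balancer()
    parser.feed(html_text)
    parser.close()
    # Close whatever is still open at end of input.
    while parser.stack:
        parser.out.append("</{}>".format(parser.stack.pop()))
    return "".join(parser.out)
```

For instance, `balance("<p><b>bold<i>both</b> tail")` closes the dangling `<i>` before `</b>` and appends the missing `</p>`.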

Download Website Updated 17 Feb 2013 WTMParse

Pop 22.23
Vit 22.05

WTMParse is a script, originally intended for forensic examinations, that parses wtmp files from Unix-like operating systems and generates a CSS-styled HTML report listing the login terminal, username, log start date, and login time/date in a table. It is useful for postmortem forensic examinations, or for getting "last"-like information when you cannot boot the machine in question but can grab its wtmp file.
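As a rough illustration of what parsing a wtmp file involves, here is a minimal Python sketch. It is not WTMParse's code. It assumes the 384-byte `struct utmp` record layout used by glibc on Linux (little-endian); other Unix-like systems use different layouts. The names `parse_wtmp` and `_text` are made up for the example.

```python
import struct
from datetime import datetime, timezone

# Assumed layout: glibc's struct utmp on Linux, 384 bytes per record.
# ut_type, pad, ut_pid, ut_line, ut_id, ut_user, ut_host,
# exit status (2 shorts), ut_session, tv_sec, tv_usec, ut_addr_v6, unused.
UTMP = struct.Struct("<h2xi32s4s32s256shhiii4i20x")
USER_PROCESS = 7  # ut_type value for a normal login session


def _text(raw):
    """Decode a NUL-padded fixed-width field."""
    return raw.split(b"\0", 1)[0].decode("utf-8", "replace")


def parse_wtmp(data):
    """Yield (terminal, user, host, login_time) for each login record."""
    for offset in range(0, len(data) - UTMP.size + 1, UTMP.size):
        fields = UTMP.unpack_from(data, offset)
        ut_type, _pid, line, _id, user, host = fields[:6]
        tv_sec = fields[9]
        if ut_type == USER_PROCESS:
            when = datetime.fromtimestamp(tv_sec, tz=timezone.utc)
            yield _text(line), _text(user), _text(host), when
```

To use it on a copied file, read the file in binary mode and pass the bytes to `parse_wtmp`; each tuple could then be written out as one row of an HTML table.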

Download Website Updated 24 Feb 2013 PHP Emoticon Parser

Pop 22.52
Vit 21.90

PHP Emoticon Parser replaces emoticon text with HTML image tags. It searches a given text string for emoticon character sequences and replaces them with the equivalent emoticon images. The text-to-image mappings are defined in a separate script that maps each emoticon name to its equivalent text symbols.
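The replacement technique can be sketched in a few lines. The example below is in Python rather than PHP and is not the project's code; the `EMOTICONS` mapping, the image file names, and the `/img/` base URL are all hypothetical. It does not HTML-escape the surrounding text, which a real application would need to do separately.

```python
import html
import re

# Hypothetical mapping of emoticon text to image file names.
EMOTICONS = {
    ":)": "smile.png",
    ":-)": "smile.png",
    ":(": "sad.png",
    ";)": "wink.png",
}

# Longest symbols first so a longer emoticon is never shadowed by a shorter one.
_PATTERN = re.compile(
    "|".join(re.escape(s) for s in sorted(EMOTICONS, key=len, reverse=True))
)


def replace_emoticons(text, base_url="/img/"):
    """Replace each known emoticon in text with an <img> tag."""

    def to_img(match):
        symbol = match.group(0)
        return '<img src="{}{}" alt="{}">'.format(
            base_url, EMOTICONS[symbol], html.escape(symbol, quote=True)
        )

    return _PATTERN.sub(to_img, text)
```

Building one alternation pattern from the mapping keeps the replacement to a single pass over the text.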

No download Website Updated 01 Mar 2013 Metrix++

Pop 21.12
Vit 21.78

Metrix++ is a platform to collect and analyze code metrics. It has a plugin-based architecture, so it is easy to add support for new languages, define new metrics, and create new pre- and post-processing tools. Every metric has a 'turn-on' option and other configuration options. There are no predefined thresholds for metrics or rules; you can choose and configure any limit you want. It scales well to large codebases. For example, initial parsing of about 10,000 files takes 2-3 minutes on an average PC, and an iterative re-run takes only 10-20 seconds. Reporting summary results and exceeded limits takes from under 1 second to 10 seconds. It can compare results for two code snapshots (collections) and distinguish added regions (classes, functions, etc.), modified regions, and unchanged regions. As a result, it is easy to deploy on legacy software, helping you to deal with legacy code efficiently and either enforce the 'leave it not worse than it was before' rule or motivate refactoring.

No download Website Updated 27 Jun 2013 GrammarScope

Pop 27.78
Vit 18.89

GrammarScope provides a simple-to-use graphical interface to the parse tree, grammatical structure, typed dependencies, and semantic graph of any text as parsed by the Stanford Parser/Stanford CoreNLP.

No download Website Updated 26 Sep 2013 CSSParser

Pop 22.89
Vit 16.30

CSSParser is a class that evaluates a given CSS selector expression and returns the corresponding nodes from a given DOMNode object.
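To show what evaluating a selector against a node tree means, here is a minimal Python sketch using the standard `xml.etree.ElementTree` module. It is not CSSParser's code or API. The `select` function is a hypothetical name, and it only handles a single compound selector made of an optional tag name followed by `#id` and `.class` parts; combinators, attribute selectors, and pseudo-classes are out of scope.

```python
import re
import xml.etree.ElementTree as ET

# Optional tag name followed by any number of #id / .class parts.
_SIMPLE = re.compile(r"(?P<tag>[A-Za-z][\w-]*)?(?P<rest>(?:[#.][\w-]+)*)")


def select(root, selector):
    """Return the nodes under root (including root) matching a simple selector."""
    m = _SIMPLE.fullmatch(selector)
    if not selector or not m:
        raise ValueError("unsupported selector: {!r}".format(selector))
    tag = m.group("tag")
    ids = re.findall(r"#([\w-]+)", m.group("rest"))
    classes = re.findall(r"\.([\w-]+)", m.group("rest"))

    matches = []
    for node in root.iter():
        if tag and node.tag != tag:
            continue
        if any(node.get("id") != wanted for wanted in ids):
            continue
        node_classes = (node.get("class") or "").split()
        if all(c in node_classes for c in classes):
            matches.append(node)
    return matches
```

For example, `select(doc, "p.note")` returns every `p` element whose `class` attribute contains `note`, in document order.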

Download Website Updated 11 Nov 2013 jsoup

Pop 180.60
Vit 12.38

jsoup is a Java library for working with real-world HTML. It can parse HTML from a URL, file, or string. It can find and extract data, using DOM traversal or CSS selectors. The HTML elements, attributes, and text can be manipulated. It can clean user-submitted content against a safe white-list. jsoup is designed to deal with all varieties of HTML found in the wild, from pristine and validating to invalid tag-soup; jsoup will create a sensible parse tree.

Download Website Updated 16 Dec 2012 listparser

Pop 82.60
Vit 7.54

listparser is a Python library that parses subscription lists (also called reading lists) and returns all of the feeds, subscription lists, and "opportunity" URLs that it finds. It supports OPML, RDF+FOAF, and the iGoogle exported settings format.
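For the OPML case, extracting feeds from a subscription list can be sketched with the standard library alone. The example below does not use listparser or reproduce its API; `feeds_from_opml` is a hypothetical helper, and it assumes the common OPML convention that a feed is an `outline` element carrying an `xmlUrl` attribute. RDF+FOAF and iGoogle formats are not covered.

```python
import xml.etree.ElementTree as ET


def feeds_from_opml(opml_text):
    """Return (title, url) pairs for every outline that points at a feed."""
    root = ET.fromstring(opml_text)
    feeds = []
    # iter() walks nested outlines too, so feeds inside folders are found.
    for outline in root.iter("outline"):
        url = outline.get("xmlUrl")
        if url:
            title = outline.get("title") or outline.get("text") or ""
            feeds.append((title, url))
    return feeds
```

Outlines without `xmlUrl` are treated as folders and skipped, while their children are still visited.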

Download Website Updated 21 Mar 2011 LEPL

Pop 107.04
Vit 7.28

LEPL is a recursive descent parser library written in Python. It is based on parser combinator libraries popular in functional programming, but also exploits Python language features. Operators provide a friendly syntax, and the consistent use of generators supports full backtracking and resource management. Backtracking implies that a wide variety of grammars are supported; appropriate memoisation ensures that even left-recursive grammars terminate.
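The link between generators and backtracking can be shown with a small combinator sketch in plain Python. This is not LEPL's API: `literal`, `seq`, `alt`, `many`, and `parse_all` are names chosen for the example, and there is no memoisation, so left-recursive grammars would not terminate here. Each parser is a function that takes the text and a position and yields every `(value, next_position)` it can produce; asking a generator for its next result is what backtracking amounts to.

```python
def literal(s):
    """Match the exact string s."""
    def parse(text, pos):
        if text.startswith(s, pos):
            yield s, pos + len(s)
    return parse


def seq(*parsers):
    """Match each parser in order, trying every combination of alternatives."""
    def parse(text, pos):
        def go(i, pos, acc):
            if i == len(parsers):
                yield acc, pos
                return
            for value, nxt in parsers[i](text, pos):
                yield from go(i + 1, nxt, acc + [value])
        yield from go(0, pos, [])
    return parse


def alt(*parsers):
    """Yield the results of each alternative in turn."""
    def parse(text, pos):
        for p in parsers:
            yield from p(text, pos)
    return parse


def many(parser):
    """Match zero or more repetitions, longest first."""
    def parse(text, pos):
        for value, nxt in parser(text, pos):
            if nxt > pos:  # guard against zero-width loops
                for rest, end in parse(text, nxt):
                    yield [value] + rest, end
        yield [], pos
    return parse


def parse_all(parser, text):
    """Return the first result that consumes the whole text, or None."""
    for value, end in parser(text, 0):
        if end == len(text):
            return value
    return None
```

With `seq(many(literal("a")), literal("ab"))` and the input `"aaab"`, the greedy `many` first consumes all three `a` characters, the following `literal("ab")` fails, and the generator then offers the two-`a` result, which lets the parse succeed.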

Download No website Updated 02 Apr 2013 cardme

Pop 56.61
Vit 6.28

cardme is a Java library that implements RFC 2426 (vCard). It provides Java applications with a way to read from and write to the vCard file format. The project's goal is to provide a flexible, easy-to-use library with excellent documentation.
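Reading the vCard format comes down to unfolding continuation lines and splitting each content line into a name, parameters, and a value. The following Python sketch shows that much and nothing more. It is not cardme's API (cardme is a Java library); `unfold` and `parse_vcard` are hypothetical names, and the sketch does not handle quoted parameter values that contain `:` or `;`, value escaping, or property groups.

```python
def unfold(text):
    """Join folded lines: a line starting with a space or tab continues the previous one."""
    lines = []
    for raw in text.splitlines():
        if raw[:1] in (" ", "\t") and lines:
            lines[-1] += raw[1:]
        elif raw:
            lines.append(raw)
    return lines


def parse_vcard(text):
    """Return a list of (NAME, [params], value) tuples, one per content line."""
    card = []
    for line in unfold(text):
        head, _, value = line.partition(":")
        name, *params = head.split(";")
        card.append((name.upper(), params, value))
    return card
```

Writing is the reverse: join name and parameters with `;`, append `:` and the value, and fold long lines by inserting a line break followed by a space.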
