20 projects tagged "OCR"

OCR2DATA (updated 04 Mar 2011; no download, no website)

Pop 19.44
Vit 34.39

OCR2DATA is a full OCR stack for document digitization, analysis, and recognition. It connects to external systems through an API, standard document exchange formats, and a database.

OCRFeeder (updated 19 Dec 2011; no download, website available)

Pop 113.49
Vit 1.45

OCRFeeder is a document layout analysis and optical character recognition application. It is able to automatically outline a document image's contents, distinguish between graphics and text and perform OCR over the latter. It can export to several formats, its main one being ODT. OCRFeeder has a GTK+ graphical user interface that allows the user to control the application and, for example, edit and correct the automatic recognition. It can also be used from the command line for automation.

OCRKit (updated 21 Feb 2014; download and website available)

Pop 190.05
Vit 15.00

OCRKit uses OCR to recognize the text in a graphic, which is particularly useful for PDFs received via email or created by DTP and office applications, and for images obtained from a scanner, copier, or digital still camera.

PDF OCR X (updated 16 Nov 2013; download and website available)

Pop 242.05
Vit 22.33

PDF OCR X is a simple drag-and-drop utility that converts PDFs and images into text documents. It extracts the text of the PDF or image with OCR (optical character recognition), performed by the Tesseract engine, and currently supports over 20 languages. This is particularly useful for PDFs and images that were created via a scan-to-PDF function in a scanner or photocopier.

Paperless Office (updated 03 Sep 2010; download and website available)

Pop 88.62
Vit 1.00

Paperless Office is a document management and electronic filing system. It is similar to PaperPort, but adds many new features: automatic document classification, synchronization with your filing cabinet, date extraction, semantic Web integration, and sophisticated natural language processing (extracting to-do lists from documents, spam detection, urgency classification), along with planning, scheduling, and execution features. It also supports workflows: you can set due dates and interdependencies for documents and tasks.

Paperwork (updated 05 Mar 2014; download and website available)

Pop 193.66
Vit 2.40

Paperwork is a GUI for making papers easily searchable using OCR. The basic idea behind Paperwork is "scan & forget": you should be able to just scan a new document and forget about it until the day you need it again.

Pyocr (updated 20 Mar 2014; no download, website available)

Pop 149.35
Vit 3.50

Pyocr is a simple Python wrapper for OCR engines (Tesseract, Cuneiform, etc.). It supports Python 2.7 and Python 3.x, and requires Pillow.

Solr-Connector-Files (updated 20 May 2014; no download, website available)

Pop 21.05
Vit 3.13

Solr-Connector-Files crawls and indexes directories and files from your filesystem (whatever is mountable to Linux) into Apache Solr. It extracts file contents with Tika, which pulls metadata and text from many document and file formats. It also integrates automatic text recognition (OCR) for images, photos, and PDFs using Tesseract OCR.

getxbook (updated 14 Oct 2013; no download, website available)

Pop 114.89
Vit 5.33

getxbook is a collection of tools to download books from websites. There are tools to download from Google Books' "book preview", Amazon's "look inside the book", and Barnes and Noble's "book viewer". There is an optional GUI written in Tcl/Tk, and some shell scripts using OCR to create plain text or searchable PDFs and DjVu files from the downloaded books.

tesseract-ocr (updated 04 Nov 2012; download and website available)

Pop 146.42
Vit 2.65

tesseract-ocr is an OCR engine originally developed by Hewlett-Packard and now sponsored by Google. It is highly accurate; it reads a binary, grayscale, or color image and outputs text.
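
As a rough illustration of the "image in, text out" usage described above, here is a minimal Python sketch that drives the tesseract command-line program. It assumes the usual `tesseract imagename outputbase -l lang` invocation, which writes `outputbase.txt`. The helper names `build_tesseract_command` and `ocr_to_text` are made up for this example and are not part of tesseract-ocr.

```python
import shutil
import subprocess
from pathlib import Path


def build_tesseract_command(image_path, output_base, lang="eng"):
    """Return the argument list for one tesseract run (hypothetical helper).

    tesseract appends ".txt" to output_base itself, so pass the name
    without an extension.
    """
    return ["tesseract", str(image_path), str(output_base), "-l", lang]


def ocr_to_text(image_path, lang="eng"):
    """Run tesseract on an image and return the recognized text.

    Assumes the tesseract executable and the requested language data
    are installed; raises RuntimeError if the executable is missing.
    """
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract executable not found on PATH")
    image_path = Path(image_path)
    # Write the result next to the image: scan.png -> scan.txt
    output_base = image_path.with_suffix("")
    subprocess.run(
        build_tesseract_command(image_path, output_base, lang),
        check=True,
    )
    return Path(str(output_base) + ".txt").read_text(encoding="utf-8")
```

Calling `ocr_to_text("scan.png")` would then return the recognized text as a string, provided tesseract and the English language data are installed. Keeping the command construction in its own function makes it easy to inspect or log the exact invocation before running it.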
