August 17, 2006; 05:14 AM
Snowtide Informatics Systems, Inc., the leading provider of
enterprise-class PDF content extraction solutions, announced the
release of PDFTextStream v2.0, the latest version of its PDF content
extraction API. Adding a wealth of new capabilities in response to
customer requests, PDFTextStream v2.0 is now available on the Python
and .NET platforms as well as for Java; now supports the extraction of
Chinese, Japanese, and Korean (CJK) text; provides new tools that
simplify content extraction from unstructured PDF documents; adds the
ability to recognize and interpret tabular data in PDF documents; now
supports v1.9 and v2.0 of the Apache Lucene search engine; and includes
other critical performance enhancements.
PDFTextStream enables back-end enterprise systems to extract the text
and metadata contained in PDF documents. This latest version is
especially suited for large enterprises and government agencies that
need to automate and speed the extraction and cataloging of content
held in PDF documents, yet demand high extraction accuracy.
"Over the last year, large enterprises and government agencies have
been approaching us with increasingly complex PDF content extraction
problems revolving around pressing business issues," said Chas Emerick,
the President and Founder of Snowtide Informatics Systems, Inc.
"These problems often present unique technology challenges," he added.
"For example, some require the extraction of data from unstructured
content; others the extraction of CJK text, or the ability to interpret
and access tabular data in PDF documents so it can be more easily
converted into spreadsheets, XML files, or database-ready records. With
PDFTextStream v2.0, we can now offer an even more comprehensive API to
meet these sophisticated demands."
Functionality Not Matched by Competitive Offerings
The release of PDFTextStream v2.0 expands the leadership position of
PDFTextStream as the most comprehensive and highest-performing set of
developer tools for turning unstructured PDF content into structured
data. New capabilities include:
* The ability to use PDFTextStream within Python and .NET environments;
it was previously available only on the Java platform.
* Functionality that enables the recognition and interpretation of
tabular data -- along with the API for accessing the data -- for the
purpose of rendering spreadsheets, XML files, or other usable formats.
* Full CJK character encoding support built into the standard
PDFTextStream distribution, an increasingly important requirement in
today's global economy.
* Added support for v1.9 and v2.0 of the Apache Lucene search engine,
necessary to keep PDFTextStream's integration module up to date with
the latest Lucene releases.
Other important new features include improved accuracy of extracts
sourced from rotated text, the ability to more easily plug text
extracts into existing text analysis processes, functionality that
enables the merging of two or more PDF documents into a single file,
and significantly improved performance.
For additional PDFTextStream v2.0 product details, please visit http://snowtide.com/PDFTextStream
About Snowtide Informatics Systems, Inc.
Snowtide Informatics Systems, Inc. is a privately held software company
headquartered in Holyoke, Massachusetts. Its high-performance software
and custom development services enable large enterprises and government
agencies to automate the extraction, conversion, and cataloging of
content held in PDF documents. PDFTextStream, Snowtide Informatics'
flagship product, is a software component library for Java, Python, and
.NET environments that has been built from the ground up to rapidly and
accurately extract text and metadata held in PDF documents. For more
information about Snowtide Informatics Systems, visit snowtide.com.