Welcome to Ivy Tools

Ivy Pdf Parser (.NET library)

Our Pdf Parser lets you search and extract information from complex PDF documents, including those with inconsistent formatting, variable-length text columns and tables.

The difference

Traditional PDF extraction tools convert PDF documents to plain text and then search that text. This approach loses all rich formatting information, such as font type, character size and bold/italic attributes.

Ivy Pdf Parser reads a PDF as a graphical document, so you can use all rich text attributes for matching. For some documents, this makes it possible to extract the last pieces of information that cannot be determined from plain text alone.
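
To illustrate why font attributes matter, here is a minimal, language-neutral sketch in Python. It is not the Ivy Pdf Parser API: the TextElement type and its fields are hypothetical stand-ins for text elements that carry formatting. Three amounts look alike as plain text, but only the bold one is the total.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class TextElement:
    """Hypothetical text element that keeps its formatting attributes."""
    text: str
    font: str
    size: float
    bold: bool


elements = [
    TextElement("Widget", "Helvetica", 10.0, False),
    TextElement("40.00", "Helvetica", 10.0, False),
    TextElement("Gadget", "Helvetica", 10.0, False),
    TextElement("60.00", "Helvetica", 10.0, False),
    TextElement("100.00", "Helvetica", 10.0, True),  # the total, set in bold
]

AMOUNT = re.compile(r"^\d+\.\d{2}$")

# Plain-text matching: every amount matches, so the total is ambiguous.
amounts = [e for e in elements if AMOUNT.match(e.text)]

# Attribute-aware matching: combine the regular expression with the bold flag.
bold_amounts = [e for e in amounts if e.bold]

print([e.text for e in amounts])
print([e.text for e in bold_amounts])
```

The regular expression alone cannot tell the total from the line items. Adding the bold attribute as a second condition narrows the match to a single element.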

  • Flexible text search.
    Search by text, regular expressions and font attributes.
  • Geometrical search.
    Find elements relative to other elements.
    (For example, the text to the right of the word "Total".)
  • Table extraction.
    Tables are recognized automatically.
  • Built-in functions.
    Clean up extracted data and connect pieces together.
  • Extensible.
    Use the collection of raw elements and write your own logic to parse them.
  • 100% .NET managed code.
    Very fast.
  • Works with other file formats, such as Excel, text and HTML.
    Can be used for all your data processing needs.
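
The geometrical search idea from the list above (the text to the right of the word "Total") can be sketched as follows. This is an illustration in Python under stated assumptions, not the library's actual API: the Box type, its coordinate fields and the right_of helper are hypothetical, and coordinates are assumed to grow rightwards and downwards.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Box:
    """Hypothetical text element with a bounding box (top-left origin)."""
    text: str
    x0: float  # left edge
    y0: float  # top edge
    x1: float  # right edge
    y1: float  # bottom edge


def vertical_overlap(a: Box, b: Box) -> float:
    """Height shared by the two boxes; zero or negative means different lines."""
    return min(a.y1, b.y1) - max(a.y0, b.y0)


def right_of(anchor: Box, elements: List[Box]) -> Optional[Box]:
    """Return the nearest element on the same line, to the right of the anchor."""
    candidates = [
        e for e in elements
        if e is not anchor
        and e.x0 >= anchor.x1
        and vertical_overlap(anchor, e) > 0
    ]
    return min(candidates, key=lambda e: e.x0 - anchor.x1, default=None)


elements = [
    Box("Subtotal", 50, 100, 110, 112),
    Box("90.00", 300, 100, 340, 112),
    Box("Total", 50, 120, 90, 132),
    Box("100.00", 300, 120, 345, 132),
    Box("Thank you", 50, 140, 120, 152),
]

anchor = next(e for e in elements if e.text == "Total")
result = right_of(anchor, elements)
print(result.text if result else None)
```

Because the lookup depends on position rather than on text order, it still finds the value when the column width or the number of preceding rows varies between documents.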

Now includes a powerful Template Editor

  • Easy-to-use GUI. Create and test expressions, and preview results on multiple documents at the same time.
  • Any number of data points. Extract single values and tables.
  • Powerful post-processing. Joins, filters and conversions, with the full power of C# at your disposal.
  • Template inheritance. Create sub-templates to deal with different document variations.
  • Validation tests. Write logic to validate the results.
  • Export results to Excel, XML or JSON. Include tables in the results.
  • Automation support. Integrate extraction into your process.
  • Extensible. Reference any .NET library.