Top 20 NuGet Extraction Packages

.NET (C#) binding for the Rosette API
Processes, transforms, filters, and handles audio signals for machine learning and statistical applications. This package is part of the Accord.NET Framework.
Face analytics library based on deep neural networks and ONNX runtime.
Microsoft SQL Server model extraction utility.
Face analytics library based on deep neural networks and ONNX runtime. GPU implementation.
Extract tables from PDF files (port of tabula-java using PdfPig). JSON writer.
Extract tables from PDF files (port of tabula-java using PdfPig). CSV and TSV writers.
Extract tables from PDF files (port of tabula-java using PdfPig).
Toolkit with more powerful functionality for PDF rasterization, PDF redaction, PDF data extraction, PDF printing, and much more.
MSBuild.Xrm.SourceControl provides a simple but powerful method for extracting Dynamics 365 customisations. The extension uses PowerShell scripts that can seamlessly extract customisations from a Dynamics 365 instance and then subsequently rebuild them into a zipped Solution file ready for import wh...
A C# library that provides the ability to extract text from various document file formats, e.g. pdf, docx, ppt, etc.
A C# library that provides the ability to extract text from various document file formats, e.g. pdf, docx, ppt, etc.
A C# library that provides the ability to extract text from various document file formats, e.g. pdf, docx, ppt, etc.
Turn unstructured HTML pages into structured data. The OpenScraping library can extract information from HTML pages using a JSON config file with XPath rules. It can scrape even multi-level complex objects such as tables and forum posts.
Processes, transforms, filters, and handles audio signals for machine learning and statistical applications. This package is part of the Accord.NET Framework.
Simple but functional string functions that I think might be useful.
Toxy is a .NET data/text extraction framework similar to Apache Tika in Java. It supports a lot of popular formats such as docx, xlsx, xls, pdf, csv, txt, epub, html and so on.
Library for text extraction. Supports doc, docx, xlsx, odt, pdf, rtf, html, rar, zip,
Boilerpipe text extraction library ported to .NET Core, based on rasmusjp's implementation for .NET 4.5, which you can find here: https://github.com/rasmusjp/boilerpipe.net
Content extraction via text density