Top 20 NuGet crawler Packages

[Icon]
This is an agile HTML parser that builds a read/write DOM and supports plain XPATH or XSLT (you actually don't HAVE to understand XPATH nor XSLT to use it, don't worry...). It is a .NET code library that allows you to parse "out of the web" HTML files. The parser is very tolerant with "real world" m...
[Icon]
Deprecated as there's new maintainer for original HAP project. Please check the new repo at https://github.com/zzzprojects/html-agility-pack. This is a port of HtmlAgilityPack library created by Simon Mourrier and Jeff Klawiter for .NET Core platform. This NuGet package supports can be used with Un...
[Icon]
A .NET Standard web crawling library similar to WebMagic and Scrapy. It is a lightweight ,efficient and fast high-level web crawling & scraping framework for .NET
[Icon]
A .NET Standard web crawling library similar to WebMagic and Scrapy. It is a lightweight ,efficient and fast high-level web crawling & scraping framework for .NET
[Icon]
Offical package. DotnetSpider is a high performance, light weight cralwer developed by C#.
[Icon]
Abot is an open source C# web crawler built for speed and flexibility. It takes care of the low level plumbing (multithreading, http requests, scheduling, link parsing, etc..). You just register for events to process the page data. You can also plugin your own implementations of core interfaces to t...
[Icon]
A powerful C# web crawler that makes advanced crawling features easy to use. AbotX builds upon the open source Abot C# Web Crawler by providing a powerful set of wrappers and extensions.
[Icon]
dcsoup is a .NET library for working with real-world HTML. It provides a very convenient API for extracting and manipulating data, using the best of DOM, CSS, and jquery-like methods. This library is basically a port of jsoup, a Java HTML parser library. see also: http://jsoup.org/ API reference is ...
[Icon]
A library for reading/writing WARC files and scraping websites.
[Icon]
An API wrapper for the TF2 Outpost platform. A platform to find great deals for your Team Fortress 2, Counter-Strike: Global Offensive and Dota 2 items with zero hassle.
[Icon]
A .net core library to handle the login to Backpack.tf. Backpack.tf is a trading site for Team Fortress 2, Counter-Strike: Global Offensive, and Dota 2. Community item pricing, item trading and stats, and much more.
[Icon]
This is a port of HtmlAgilityPack library created by Simon Mourrier and Jeff Klawiter for .NET Core platform. This NuGet package supports can be used with Universal Windows Platform, ASP.NET 5 (using .NET Core) and full .NET Framework 4.6. Original description: This is an agile HTML parser that bu...
[Icon]
This is an agile HTML parser that builds a read/write DOM and supports plain XPATH or XSLT (you actually don't HAVE to understand XPATH nor XSLT to use it, don't worry...). It is a .NET code library that allows you to parse "out of the web" HTML files. The parser is very tolerant with "real world" m...
[Icon]
Web scraper / crawler / spider. Supports robots protocol and user agent.
[Icon]
The Crawler-Lib Engine is a general purpose workflow enabled task processor. It has evolved from a web crawler over data mining and information retrieval. It is throughput optimized and can perform thousands of tasks per second on standard hardware. Due to its workflow capabilities it allows to stru...
[Icon]
The Crawler-Lib Engine Test Helper simplifies the test of tasks. It can be used to develop unit tests and integration tests for tasks.
[Icon]
This is ESENT storage providers for the NCrawler
[Icon]
HTML processing of crawled data for NCrawler
[Icon]
Provides storing crawler data using EF for NCrawler
[Icon]
Storage in file system for the NCrawler