Info
Version: | 1.2.8 |
Author(s): | Vitaly A. Popov |
Last Update: | Monday, August 10, 2020 |
Project Url: | https://data-excavator.com/excavatorsharp-web-scraping/ |
NuGet Url: | https://www.nuget.org/packages/ExcavatorSharp.WebScraper.x64 |
Install
Install-Package ExcavatorSharp.WebScraper.x64
dotnet add package ExcavatorSharp.WebScraper.x64
paket add ExcavatorSharp.WebScraper.x64
ExcavatorSharp.WebScraper.x64 Download (Unzip the "nupkg" after downloading)
Dependencies
- EPPlus(<= 4.5.3.3)
- RestSharp(>= 106.10.1)
- cef.redist.x64(>= 79.1.36)
- cef.redist.x86(>= 79.1.36)
- CefSharp.Common(>= 75.1.360)
- CefSharp.OffScreen(>= 75.1.360)
- HtmlAgilityPack(>= 1.11.23)
- HtmlAgilityPack.CssSelectors(>= 1.0.2)
- Newtonsoft.Json(>= 12.0.3)
Tags
The library converts HTML code into a structured array of data. It can scrape data from multiple sites in parallel within a single running application. You can create scraping tasks and run data extraction on a schedule.
The library is designed for professional extraction and parsing of large volumes of data.
Under the hood, the library supports CSS selectors and XPath, data export to .csv/.xlsx/.sql/.json, online data export, proxy servers, dynamic content crawling, interaction with the site via JavaScript, and much more. The library uses .NET Sockets and the Chromium Embedded Framework (CEF).
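To make the idea of turning markup into structured records concrete, here is a minimal, language-neutral sketch in Python. It is not ExcavatorSharp's API: the sample markup, the field names, and the helper functions `extract_products`, `to_json`, and `to_csv` are all hypothetical. It uses only the Python standard library and assumes well-formed markup, which real pages often are not.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

# Hypothetical, well-formed sample markup. Real-world HTML usually needs a
# tolerant HTML parser rather than a strict XML parser.
SAMPLE_HTML = """
<html><body>
  <div class="product"><h2>Widget</h2><span class="price">9.99</span></div>
  <div class="product"><h2>Gadget</h2><span class="price">24.50</span></div>
</body></html>
"""


def extract_products(markup):
    """Return one dict per product block, selected with XPath-style expressions."""
    root = ET.fromstring(markup)
    records = []
    for node in root.findall(".//div[@class='product']"):
        records.append({
            "name": node.findtext("h2"),
            "price": float(node.findtext("span[@class='price']")),
        })
    return records


def to_json(records):
    """Serialize the extracted records as a JSON array."""
    return json.dumps(records, indent=2)


def to_csv(records):
    """Serialize the extracted records as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["name", "price"], lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


records = extract_products(SAMPLE_HTML)
print(to_json(records))
print(to_csv(records))
```

The same pattern (select repeating blocks, map child nodes to named fields, then serialize) applies regardless of which selector engine or export format is used.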
The library can be used separately as a crawler or as a parser. The sitemap.xml and robots.txt formats are supported, as is gzip / deflate compression.
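As an illustration of what robots.txt handling and gzip / deflate support involve for a crawler, the following sketch uses only the Python standard library. It is not ExcavatorSharp's API: the `decode_body` and `build_robots` helpers, the `ExampleBot` user agent, and the example.com URLs are assumptions made for the example.

```python
import gzip
import zlib
from urllib.robotparser import RobotFileParser


def decode_body(body, content_encoding):
    """Decompress an HTTP response body according to its Content-Encoding value."""
    encoding = (content_encoding or "").strip().lower()
    if encoding == "gzip":
        return gzip.decompress(body)
    if encoding == "deflate":
        try:
            # Most servers send a zlib-wrapped stream for "deflate".
            return zlib.decompress(body)
        except zlib.error:
            # Some servers send a raw deflate stream without the zlib header.
            return zlib.decompress(body, -zlib.MAX_WBITS)
    return body


def build_robots(robots_txt):
    """Parse robots.txt text into a rule set that can be queried per URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


# Hypothetical robots.txt content.
ROBOTS_TXT = """User-agent: *
Disallow: /private/
"""

robots = build_robots(ROBOTS_TXT)
print(robots.can_fetch("ExampleBot", "https://example.com/catalog"))
print(robots.can_fetch("ExampleBot", "https://example.com/private/data"))

page = b"<html>hello</html>"
print(decode_body(gzip.compress(page), "gzip") == page)
```

A crawler would typically consult the parsed rules before each request and decode each response body before handing it to the parser.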
Attention! Only x64 versions are supported for .NET 4.5.2 and 4.6 platforms.
AnyCPU builds are not supported! You will NOT be able to run the library if your project is built as AnyCPU. This is caused by limitations of CEF.
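One common way to force an x64 build is to set the platform target in the project file. The fragment below is a sketch using the standard MSBuild `PlatformTarget` property; the package description does not say whether any further settings are required, so treat it as a starting point rather than a complete configuration.

```xml
<!-- Inside the project file (.csproj); the rest of the project content is omitted. -->
<PropertyGroup>
  <PlatformTarget>x64</PlatformTarget>
</PropertyGroup>
```

In Visual Studio, the same setting is usually exposed as the "Platform target" option in the project's Build properties.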