We tested six leading video scraping providers to see how they handle video metadata on the largest video platform. Across 6,000 requests in total, we measured each provider's success rate, response time, and returned metadata fields.
Video scraping benchmark results
To see how we calculated these metrics, read video scraping benchmark methodology.
What data you can scrape from video platforms
Different providers return different amounts of metadata for the same video URL. JSON providers give you parsed fields you can use directly; HTML providers return the rendered page, so you pull the fields you need with CSS selectors.
The table below lists the metadata fields each provider returned for one video URL, highlighting the ones unique to that provider.
Beyond the unique fields shown, every JSON provider also returns the common video metadata you would expect: title, description, view count, like count, comment count, publish date, duration, channel name, channel URL, subscriber count, thumbnails, tags, and related videos. The HTML providers expose the same data, just through CSS selectors on the rendered page.
Video scrapers & benchmark results
Oxylabs averaged 17 seconds per URL in the benchmark. It returned the watch page as rendered HTML, leaving the four target fields to be extracted client-side. Oxylabs provides a Web Scraper API with eight YouTube-specific sources, each targeting a different object on the platform:
- search: up to 20 search results for a query
- search_max: up to 700 search results for a query
- metadata: metadata of a single video
- subtitles: subtitle track of a single video
- download: audio or video stream of a single video
- video_trainability: whether a video is eligible for AI training
- channel: full channel data including video list
- autocomplete: search-bar suggestions for a term
There is also a universal scraper with render=html for cases where none of the dedicated sources fit, which renders the page in a headless browser and returns the HTML.
For the video scraping benchmark we sent each video URL through the universal source with render=html, then parsed the rendered watch page to pull title, channel, view count, and duration.
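The request described above can be sketched roughly as follows. This is a minimal illustration, not the benchmark's actual client: the endpoint URL, the basic-auth scheme, and the `results[0]["content"]` response path are assumptions that should be checked against the Oxylabs documentation, and the function names are hypothetical.

```python
import base64
import json
import urllib.request

# Assumption: Oxylabs' realtime endpoint; confirm against your account docs.
OXYLABS_ENDPOINT = "https://realtime.oxylabs.io/v1/queries"


def build_oxylabs_payload(video_url: str) -> dict:
    """Request body for the universal source with browser rendering."""
    return {"source": "universal", "url": video_url, "render": "html"}


def fetch_rendered_html(video_url: str, username: str, password: str,
                        timeout: float = 180.0) -> str:
    """POST the payload and return the rendered watch page HTML."""
    body = json.dumps(build_oxylabs_payload(video_url)).encode("utf-8")
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    request = urllib.request.Request(
        OXYLABS_ENDPOINT,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        data = json.load(response)
    # Assumption: the rendered page sits at results[0]["content"].
    return data["results"][0]["content"]
```

Keeping the payload builder separate from the network call makes it easy to inspect exactly what is sent without spending a request.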
Decodo is the second-fastest provider tested at 4 seconds per URL, returning 22 parsed fields, five of them exclusive to Decodo. It has four scraper templates dedicated to the video platform, each covering a different object:
- Metadata: titles, durations, views, channel info and more for a single video
- Search: up to 20 search results for a query
- Subtitles: full subtitles and captions of a video for analysis or indexing
- Channel: channel metadata, video lists and engagement metrics for creator analysis
Metadata accepts a video ID via the query parameter and returns structured JSON containing title, channel, view count, duration, upload date, like count, and the remaining metadata fields. This is the template we used in the video scraping benchmark.
SerpApi’s Video API was the fastest provider in the benchmark at 1 second per URL, returning 18 parsed fields. It exposes three YouTube engines, each available as a single GET against https://serpapi.com/search.json:
- Video API: per-video details including title, channel, views, likes, published date, description, chapters, related videos, and pagination tokens for comments
- Search API: search results for a query, with upload-date, length, and quality filters via the sp parameter
- Video Transcript API: the transcript of a video by ID, with snippets, start/end timestamps, and language details
All three return parsed JSON in one synchronous call and accept gl (country) and hl (language) for localization. Video API accepts a video ID via the v parameter and returns the full payload in a single GET. This is the engine we used for SerpApi in the video scraping benchmark, with no_cache=true added to bypass the one-hour SerpApi cache.
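As an illustration of that single GET, the sketch below only assembles the request URL. The `engine=youtube_video` value and the `api_key` parameter name are assumptions based on SerpApi's usual conventions, and the helper name is hypothetical; the v, gl, hl, and no_cache parameters come from the description above.

```python
from urllib.parse import urlencode

SERPAPI_ENDPOINT = "https://serpapi.com/search.json"


def build_serpapi_video_url(video_id, api_key, gl=None, hl=None):
    """Return the GET URL for one video, bypassing SerpApi's cache."""
    params = {
        "engine": "youtube_video",  # assumption: engine identifier
        "v": video_id,
        "no_cache": "true",
        "api_key": api_key,         # assumption: auth parameter name
    }
    # Localization parameters are optional, so add them only when given.
    if gl:
        params["gl"] = gl
    if hl:
        params["hl"] = hl
    return f"{SERPAPI_ENDPOINT}?{urlencode(params)}"
```

The resulting URL can be passed to any HTTP client; the response body is the parsed JSON payload.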
Apify’s Video scraper took the longest at 21 seconds per URL but produced the richest payload of any provider tested, with 28 parsed fields.
Apify has six dedicated scraper actors in their marketplace, maintained by the Streamers team, each targeting a different object on the platform:
- Video scraper: full per-video metadata including channel name, likes, views, and subscriber counts
- Comments scraper: comment text, posting date, author username, and parent video info
- Channel scraper: channel info such as subscriber count, total video count, total views, and creation date
- Shorts scraper: short-form video data including caption, timestamps, likes, dislikes, views, and comment counts
- Hashtag video scraper: video records discovered by hashtag, with the same per-video fields
- Video downloader: MP4, MP3 and other format downloads pushed directly to cloud storage
Every actor accepts URLs or search terms as input and returns parsed JSON, CSV or Excel. The Video scraper is the actor we ran in the video scraping benchmark, called via the standard Apify /acts/{actor}/runs endpoint with a single video URL per startUrls entry, polled to completion, and read from the run’s dataset items.
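The start, poll, and read flow can be sketched as below. The HTTP calls are left out on purpose: `get_status` is a hypothetical callable that would wrap a request for the run's current status. The terminal status names and the `{"url": ...}` shape of each startUrls entry are assumptions drawn from common Apify conventions rather than from the benchmark code.

```python
import time

# Assumption: run states after which polling can stop.
TERMINAL_STATUSES = {"SUCCEEDED", "FAILED", "ABORTED", "TIMED-OUT"}


def build_run_input(video_url):
    """Actor input with a single video URL per startUrls entry."""
    return {"startUrls": [{"url": video_url}]}


def wait_for_run(get_status, poll_seconds=5.0, max_polls=120,
                 sleep=time.sleep):
    """Poll get_status() until the run reaches a terminal state."""
    for _ in range(max_polls):
        status = get_status()
        if status in TERMINAL_STATUSES:
            return status
        sleep(poll_seconds)
    raise TimeoutError("run did not finish within the polling budget")
```

Only a run that ends in SUCCEEDED has dataset items worth reading, and the polling interval counts toward the end-to-end time of an asynchronous provider.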
Nimble averaged 18 seconds per URL in the benchmark, returning rendered HTML rather than parsed fields. For web pages Nimble offers the Extract API: any URL goes in, anti-bot evasion and proxy rotation happen on Nimble’s side, and a stealth browser driver (we picked vx10) renders the page before returning the HTML.
Pulling the metadata out of that response was a client-side job: locate the embedded ytInitialPlayerResponse JSON inside the HTML, walk into videoDetails, and read off title, channel author, view count, and duration in seconds.
Zyte returned each URL in 9 seconds via its browserHtml mode, leaving metadata extraction to the client.
Zyte has a single Zyte API endpoint configured per request with payload flags. The httpResponseBody flag returns raw HTTP without running scripts, which works for static pages but misses content on a JS-hydrated video page. Switching to browserHtml: true boots a real browser, executes the page’s JavaScript, and returns the post-hydration HTML. From there the extraction matches what Nimble’s pipeline needed: grab ytInitialPlayerResponse from a <script> tag, balance-brace the JSON to its closing }, parse it, and lift the four target fields from videoDetails.
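A minimal sketch of that brace-balancing step is shown below. It assumes the page assigns the object with `ytInitialPlayerResponse = {...}` and that the object is valid JSON; the function names are hypothetical, and this is not the benchmark's actual parser. Tracking string state matters because titles and descriptions can contain braces.

```python
import json
import re

# Assumption: the object is assigned as `ytInitialPlayerResponse = {...}`.
ASSIGNMENT = re.compile(r"ytInitialPlayerResponse\s*=\s*\{")


def extract_player_response(html):
    """Return the embedded player response as a dict, or None."""
    match = ASSIGNMENT.search(html)
    if match is None:
        return None
    start = match.end() - 1  # index of the opening brace
    depth = 0
    in_string = False
    escaped = False
    for i in range(start, len(html)):
        ch = html[i]
        if in_string:
            # Inside a JSON string: ignore braces, honor escapes.
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
            continue
        if ch == '"':
            in_string = True
        elif ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return json.loads(html[start:i + 1])
    return None  # braces never balanced


def read_video_details(html):
    """Lift the four target fields from videoDetails."""
    player = extract_player_response(html)
    if player is None:
        return None
    details = player.get("videoDetails", {})
    keys = ("title", "author", "viewCount", "lengthSeconds")
    return {key: details.get(key) for key in keys}
```

The same function works on HTML from any of the rendered-page providers, since they all return the same watch page.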
Video scraping benchmark methodology
We tested 6 video scraping providers on 1,000 unique video URLs, sending one URL per request and recording the response. All URLs were verified to be live at the time the benchmark was run, so a removed-video edge case did not need to be handled in the validation logic.
The 1,000 URLs were in canonical watch?v=… form. Channel pages, playlists, and short-form videos were excluded so every entry passed to every provider was the same kind of object.
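One way to enforce that filter, and to obtain the video ID that some providers take instead of a URL, is sketched below. The helper name is hypothetical and the check is deliberately strict: anything whose path is not /watch is rejected.

```python
from urllib.parse import parse_qs, urlsplit


def video_id_from_watch_url(url):
    """Return the v parameter of a canonical watch URL, else None."""
    parts = urlsplit(url)
    if parts.path != "/watch":
        return None  # channel pages, playlists, short-form videos
    values = parse_qs(parts.query).get("v", [])
    return values[0] if values else None
```

For example, `video_id_from_watch_url("https://www.youtube.com/watch?v=abc123")` returns `"abc123"`, while a playlist or short-form URL returns `None`.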
Each provider was configured to use the URL-input mode its API supports:
- Decodo: YouTube Metadata template, video ID passed via query, parsed JSON.
- SerpApi: YouTube Video API engine, video ID passed via v, with no_cache=true so cached responses were never served.
- Apify: Video scraper actor via /acts/{actor}/runs with the URL in startUrls. The run was polled until completion and the dataset items were read once it finished.
- Oxylabs: Web Scraper API with source=universal and render=html. The previously documented youtube_metadata source now returns an unsupported-source error, so the universal scraper with rendered HTML was used instead.
- Nimble: Extract API with render=true and the vx10 stealth browser driver, returning rendered HTML.
- Zyte: Zyte API with browserHtml: true, returning post-hydration HTML.
A response was counted as valid when at least one of four fields was returned in a usable format: title as a non-empty string, view_count as a non-negative integer (or a string that parses as one), duration as either an MM:SS string or an integer of seconds, or published as a date string (either an exact date or a relative phrase such as “3 weeks ago”). A single field in correct form was enough to count the call as successful, because that already shows the provider reached the page and completed the scrape.
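The rule above could be expressed roughly as follows. This is a sketch, not the benchmark's validator: the text does not say how date strings were checked, so any non-empty string is accepted for published, and a digits-only string is also accepted as a duration in seconds. Both are assumptions, as are the function names.

```python
import re

MMSS = re.compile(r"[0-9]+:[0-5][0-9]")
DIGITS = re.compile(r"[0-9]+")


def _is_non_negative_int(value):
    """True for ints >= 0 and for strings made only of ASCII digits."""
    if isinstance(value, bool):
        return False  # bool is a subclass of int; reject it explicitly
    if isinstance(value, int):
        return value >= 0
    if isinstance(value, str):
        return DIGITS.fullmatch(value.strip()) is not None
    return False


def _is_non_empty_string(value):
    return isinstance(value, str) and bool(value.strip())


def is_valid_response(record):
    """A call succeeds when at least one of the four fields is usable."""
    duration = record.get("duration")
    checks = [
        _is_non_empty_string(record.get("title")),
        _is_non_negative_int(record.get("view_count")),
        _is_non_negative_int(duration)
        or (isinstance(duration, str)
            and MMSS.fullmatch(duration.strip()) is not None),
        # Assumption: any non-empty string counts as a date string.
        _is_non_empty_string(record.get("published")),
    ]
    return any(checks)
```

Because one field is enough, this measures whether the scrape reached the page, not whether the payload was complete.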
Three of the six providers returned rendered HTML rather than parsed JSON. For those responses, the validator located the embedded ytInitialPlayerResponse script and read the videoDetails object, applying the same check to its four fields: title, author, viewCount, and lengthSeconds.
HTTP 429 responses triggered a 30-second back-off and were retried up to three times. For each call, the wall-clock time from submission to a usable response was recorded, then averaged across the 1,000 URLs to produce the per-provider end-to-end time. The boolean validation result was averaged the same way to produce the per-provider success rate.
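The retry and averaging logic can be sketched as below. Here `send` is a hypothetical callable that returns a `(status_code, body)` pair. Whether back-off waits counted toward the recorded time is not stated above; this sketch includes them, which is an assumption.

```python
import time
from statistics import mean


def call_with_backoff(send, max_retries=3, backoff_seconds=30.0,
                      sleep=time.sleep):
    """Call send(); on HTTP 429 wait and retry up to max_retries times."""
    status, body = send()
    retries = 0
    while status == 429 and retries < max_retries:
        sleep(backoff_seconds)
        retries += 1
        status, body = send()
    return status, body


def timed_call(send, clock=time.monotonic, **retry_options):
    """Return (status, body, elapsed_seconds) including any back-off."""
    started = clock()
    status, body = call_with_backoff(send, **retry_options)
    return status, body, clock() - started


def summarize(results):
    """results: list of (elapsed_seconds, is_valid) pairs for one provider."""
    return {
        "avg_seconds": mean([elapsed for elapsed, _ in results]),
        "success_rate": mean([1.0 if ok else 0.0 for _, ok in results]),
    }
```

Passing `sleep` and `clock` as parameters keeps the logic testable without waiting through real 30-second pauses.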
FAQs
None of the providers expose a time series of past view counts directly. You can build one by scraping the same video URL on a schedule and storing the snapshots yourself; daily or hourly cron is usually enough for trend analysis.
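A minimal sketch of such a snapshot store, using SQLite from the Python standard library, might look like this. The table and function names are hypothetical, and the scheduled job that fetches the current view count from a provider is left out.

```python
import sqlite3
from datetime import datetime, timezone


def open_store(path=":memory:"):
    """Open (or create) the snapshot database."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS view_snapshots ("
        "video_id TEXT NOT NULL, "
        "captured_at TEXT NOT NULL, "
        "view_count INTEGER NOT NULL, "
        "PRIMARY KEY (video_id, captured_at))"
    )
    return conn


def record_snapshot(conn, video_id, view_count, captured_at=None):
    """Store one observation; defaults to the current UTC time."""
    captured_at = captured_at or datetime.now(timezone.utc).isoformat()
    conn.execute(
        "INSERT OR REPLACE INTO view_snapshots VALUES (?, ?, ?)",
        (video_id, captured_at, view_count),
    )
    conn.commit()


def view_history(conn, video_id):
    """Return (captured_at, view_count) rows in chronological order."""
    rows = conn.execute(
        "SELECT captured_at, view_count FROM view_snapshots "
        "WHERE video_id = ? ORDER BY captured_at",
        (video_id,),
    )
    return rows.fetchall()
```

Storing timestamps as ISO 8601 strings in UTC keeps the rows sortable as plain text.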
Search returns a ranked list of videos for a keyword, with shallow metadata per result. URL scraping returns deep metadata for a specific video you already know about. Search is for discovery; URL scraping is for monitoring a known set of items.
Public, non-personal data is generally legal to scrape in most jurisdictions, but every platform’s Terms of Service forbid automated access. The legal risk increases if you scrape personal data (comments tied to identifiable users), if you redistribute the raw video content, or if you bypass authentication. Consult a lawyer for high-stakes use cases.
You do not need to supply your own proxies. Every provider in the benchmark manages its own proxy pool and anti-bot evasion. You authenticate with an API key and send the target URL or video ID; the proxy layer is invisible to the caller.