Review Scraping Benchmark: Bright Data, Oxylabs & Decodo

updated on Jul 24, 2026

We tested 5 web scraping providers across 5 major review platforms for a total of 12,500 requests, and measured success rate, completion time, and metadata fields.

Provider	Best for
Bright Data	Highest success rate, structured JSON
Oxylabs	Fastest completion
Decodo	High success rate with low completion time on simpler targets
SerpApi	Platform-specific review APIs with structured JSON output

Review scraping benchmark

See the benchmark methodology section for details on the testing process.

Domain coverage by provider

Domain	Bright Data	Nimble	Zyte	Oxylabs	Decodo
Google Maps	✅✅	✅	✅	❌	❌
Yelp	✅✅	✅	✅	❌	❌
Amazon	✅✅	✅	✅	✅✅	✅✅
Trustpilot	✅✅	✅	✅	✅	✅
Tripadvisor	✅	✅	✅	✅	✅

✅ = supported, returns HTML
✅✅ = supported, returns structured data
❌ = no successful results in this benchmark

Review scraping performance by domain

Available metadata fields for providers with structured JSON responses

Provider	Domain	Field Count	Available Fields
Bright Data	Amazon	29	asin, author_id, author_link, author_name, badge, brand, categories, department, helpful_count, is_amazon_vine, is_verified, product_name, product_rating, product_rating_count, product_rating_max, product_rating_object, rating, review_country, review_header, review_id, review_images, review_posted_date, review_text, url, variant_asin, variant_name, videos
Bright Data	Google Maps	26	address, category, cid, country, fid_location, local_guide, number_of_likes, overall_place_riviews, photos, photos_by_reviewer, place_general_rating, place_id, place_name, profile_pic_url, questions_answers, response_date, response_of_owner, review, review_date, review_details, review_id, review_rating, reviewer_name, reviewer_url, reviews_by_reviewer, url
Bright Data	Trustpilot	39	1_star, 2_star, 3_star, 4_star, 5_star, breadcrumbs, company activity, company_about, company_category, company_country, company_email, company_id, company_location, company_logo, company_name, company_other_categories, company_overall_rating, company_phone, company_rating_name, company_total_reviews, company_website, date_posted, is_verified_company, is_verified_review, review_content, review_date, review_date_of_experience, review_id, review_rating, review_replies, review_title, review_url, review_useful_count, reviewer_location, reviewer_name, reviews_posted_overall, url
Bright Data	Yelp	17	Content, Date, Eelite_status, Rating, Reactions, Replies, Review_auther, Review_image, business_id, business_name, check-in_status, date_iso_format, profile_pic_url, recommended_review, review_id, review_order, url
Oxylabs	Amazon	10	author, content, helpful_count, id, is_verified, product_attributes, rating, review_from, timestamp, title

Review scraping providers pricing

Review scraping providers free trial

Vendor	Free trial
Bright Data	5K records per month
Oxylabs	7-day
Decodo	3-day trial (100 MB)
SerpApi	250 searches per month
Nimble	5K requests (one-time)
Zyte	$5 credits

Review scraping providers & benchmark results

Bright Data

Bright Data achieved the highest average success rate at 78% across all five review platforms and was the only provider to return structured JSON on four of them: Amazon, Google Maps, Trustpilot, and Yelp. It led on Amazon (96%) and Trustpilot (98%), delivering up to 39 metadata fields per review including verification status, reviewer location, and owner responses. Google Maps was its weakest domain at 39%, though most providers also failed on this domain due to JavaScript-rendered review content.

Oxylabs

Oxylabs was the fastest provider in the benchmark at 5s average completion time, well ahead of the next closest at 13s. It scored 98% on Trustpilot and 91% on Tripadvisor, and stayed in the top tier on Amazon (92%) with 10 structured JSON fields. It did not return results on Google Maps or Yelp, platforms for which it lacked dedicated scraping configurations.

Decodo

Decodo scored 93% on Trustpilot and 76% on Tripadvisor using its unblocker proxy, demonstrating solid performance on server-rendered review pages. However, it recorded 0% on both Google Maps and Yelp, and only 11% on Amazon despite using a structured API endpoint. This leaves it with usable results on only two of the five tested platforms, the narrowest coverage in the benchmark for review scraping.

SerpApi

SerpApi offers a separate dedicated API for each major review platform rather than a single general-purpose scraping endpoint. It provides individual APIs for Google Maps Reviews, Yelp Reviews, and Tripadvisor, each returning structured JSON with platform-specific fields: topic mentions and sub-ratings on Google Maps, elite status and language breakdowns on Yelp, and location details returned by query on Tripadvisor.

Zyte

Zyte was one of only two providers to return results on all five platforms, finishing with a 65% average success rate. It performed best on Tripadvisor (86%) and Yelp (57%), maintaining steady extraction across domains. Google Maps was a relative bright spot at 41%, one of the higher scores on a domain where most providers failed. All extraction was HTML-based with CSS selector parsing, so no structured metadata fields were returned beyond the five standard review fields.

Nimble

Nimble reached 92% on Amazon and 66% on Trustpilot, showing it can handle structured review pages effectively. However, performance dropped to 1% on Google Maps and 31% on Yelp, where JavaScript-heavy rendering limited its HTML-based extraction. Its 52% overall average reflects this uneven platform support, with completion times averaging 20s.

Review scraping benchmark methodology

We selected the top 5 review-focused domains from the Tranco top sites list: Amazon, Google Maps, Tripadvisor, Trustpilot, and Yelp. The five scraping providers were chosen from web data scraping companies with at least 100 employees. Each provider received the same set of 2,500 URLs (500 per platform), and we measured three metrics: success rate, completion time, and available metadata fields.

Providers and integration types

Providers were integrated using two approaches depending on the platform:

JSON structured API: The provider returns parsed review data in JSON format with named fields (e.g., reviewer_name, rating, review_text). Bright Data and Oxylabs offered this for select platforms.
HTML response: The provider returns rendered HTML, which we parsed using CSS selectors to extract review fields. Decodo, Nimble, and Zyte primarily used this approach.

Note: Decodo returned a JSON structured response for Amazon, but none of the responses contained successful review data. Its 11% success rate on Amazon came entirely from correct 404 detection, so no metadata fields are reported for that combination.

Review scraping benchmark validation rules

Each response went through a three-step validation:

Submission: The HTTP status code had to be in the 200-399 range, or be 404, to pass.
Execution: For async providers, the scraping job had to complete without timeout or error.
Validation: The response had to contain usable review data.
- For JSON responses: at least one review with a valid review_text (string) or rating (integer).
- For HTML responses: at least one CSS selector match returning review content.
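
The three checks above can be sketched as a small Python function. This is a minimal illustration, not the benchmark's actual harness: the function names and the returned labels are hypothetical, and treating an empty string as an invalid review_text is an assumption. Responses with a 404 status are only labeled here, since deciding whether a "not found" result counts as valid is a separate step.

```python
from typing import Any


def passes_submission(status_code: int) -> bool:
    """Step 1: accept HTTP 200-399, plus 404."""
    return 200 <= status_code <= 399 or status_code == 404


def has_valid_json_review(reviews: list[dict[str, Any]]) -> bool:
    """Step 3 (JSON): at least one review with a string review_text
    or an integer rating."""
    for review in reviews:
        text = review.get("review_text")
        rating = review.get("rating")
        # Assumption: an empty or whitespace-only string is not valid text.
        if isinstance(text, str) and text.strip():
            return True
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(rating, int) and not isinstance(rating, bool):
            return True
    return False


def has_valid_html_review(selector_matches: list[str]) -> bool:
    """Step 3 (HTML): at least one CSS selector match with content."""
    return any(match.strip() for match in selector_matches)


def validate(status_code: int, job_completed: bool, has_review_data: bool) -> str:
    """Run the three steps in order and return a hypothetical label."""
    if not passes_submission(status_code):
        return "submission_failed"
    if not job_completed:  # Step 2: async job timed out or errored
        return "execution_failed"
    if status_code == 404:
        return "not_found"  # edge case, judged separately
    return "valid" if has_review_data else "validation_failed"
```

For a synchronous provider, job_completed would simply be True whenever a response was received.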

Before running the full benchmark, we tested each provider with intentionally broken URLs, confirmed 404 pages, and live pages with zero reviews to map how each provider signals these edge cases. Providers returned different indicators depending on their implementation, including explicit error codes, HTTP 404 status, or empty response bodies.

When a provider correctly identified a page as not found or returned an appropriate response for a page with no reviews, the result was counted as valid. We then applied a cross-provider verification step: if a provider returned empty results on a URL where at least one other provider extracted review data, that empty result was reclassified as a failure. This separated extraction failures from pages that simply had no reviews to return.
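
A minimal sketch of the cross-provider verification step, assuming each result has already been labeled as "data", "empty", or "failed" (these labels and the nested dictionary layout are hypothetical, chosen only for illustration):

```python
def reclassify_empty_results(
    results: dict[str, dict[str, str]],
) -> dict[str, dict[str, str]]:
    """results[url][provider] is 'data', 'empty', or 'failed'.

    An 'empty' result becomes 'failed' when at least one other provider
    extracted review data for the same URL. Otherwise it stays 'empty',
    which is counted as a valid response for a page with no reviews.
    """
    reclassified = {}
    for url, by_provider in results.items():
        any_data = any(outcome == "data" for outcome in by_provider.values())
        reclassified[url] = {
            provider: "failed" if outcome == "empty" and any_data else outcome
            for provider, outcome in by_provider.items()
        }
    return reclassified
```

The input is left unmodified, so the raw and reclassified outcomes can be compared afterwards.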

Completion time

Completion time was measured end-to-end from the initial API request to receiving the final response. For async providers (e.g., Bright Data's Dataset API), this includes the polling and wait time until results were ready.

Available metadata fields

For providers returning structured JSON, we counted the total number of unique fields returned across all reviews. For HTML-based responses, the metadata count reflects the fixed set of CSS selector fields used for extraction (5 fields: reviewer_name, review_text, rating, review_date, review_title).
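
Counting unique fields across structured responses could look like the sketch below. It assumes each review is a flat dictionary and counts top-level keys only. How nested objects were counted is not stated, so that choice is an assumption.

```python
from typing import Any


def count_unique_fields(reviews: list[dict[str, Any]]) -> tuple[int, list[str]]:
    """Return the number of distinct top-level field names seen across
    all reviews, together with the sorted field names."""
    fields: set[str] = set()
    for review in reviews:
        fields.update(review.keys())
    return len(fields), sorted(fields)
```

A field that appears in only some reviews still counts once, which matches "unique fields returned across all reviews".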

Review scraping benchmark dataset

The 2,500 test URLs were collected from publicly accessible review pages across the five Tranco top-ranked review platforms. URLs were cleaned to remove locale parameters, invalid formats, and duplicates before testing.
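
The cleaning step might be sketched as follows with Python's standard urllib.parse module. The set of locale parameter names is hypothetical (the actual parameters removed are not listed), and "invalid format" is interpreted here as a URL without an http(s) scheme or host.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical locale parameter names, for illustration only.
LOCALE_PARAMS = {"hl", "gl", "lang", "locale"}


def clean_urls(urls: list[str]) -> list[str]:
    """Drop invalid URLs, strip locale query parameters, and remove
    duplicates while preserving the original order."""
    seen: set[str] = set()
    cleaned: list[str] = []
    for raw in urls:
        try:
            parts = urlsplit(raw.strip())
        except ValueError:
            continue  # malformed URL, e.g. a broken IPv6 host
        if parts.scheme not in ("http", "https") or not parts.netloc:
            continue
        query = [
            (key, value)
            for key, value in parse_qsl(parts.query, keep_blank_values=True)
            if key.lower() not in LOCALE_PARAMS
        ]
        url = urlunsplit(
            (parts.scheme, parts.netloc, parts.path, urlencode(query), parts.fragment)
        )
        if url not in seen:
            seen.add(url)
            cleaned.append(url)
    return cleaned
```

Deduplicating after the locale parameters are stripped means two URLs that differ only by locale collapse into one entry.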

Shared configuration

All providers received identical URLs from the same dataset and were tested under the same conditions:

Sequential execution: one request at a time, no parallel requests
Delay between requests: 2 seconds
Rate limit handling: 30-second wait with up to 3 retries on HTTP 429
Submission timeout: 300 seconds
Execution timeout: 600 seconds
Each URL was tested once per provider
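
The request loop implied by these settings can be sketched as below. The send callable is a hypothetical stand-in for a provider-specific request that returns an HTTP status code. The submission and execution timeouts are not enforced in this sketch and would normally be passed to the HTTP client.

```python
import time
from typing import Callable

REQUEST_DELAY_S = 2      # delay between consecutive requests
RATE_LIMIT_WAIT_S = 30   # wait after an HTTP 429
MAX_RETRIES = 3          # retries on HTTP 429, after the first attempt


def fetch_with_retries(
    url: str,
    send: Callable[[str], int],
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Send one request, retrying up to MAX_RETRIES times on HTTP 429."""
    status = send(url)
    retries = 0
    while status == 429 and retries < MAX_RETRIES:
        sleep(RATE_LIMIT_WAIT_S)
        retries += 1
        status = send(url)
    return status


def run_sequentially(
    urls: list[str],
    send: Callable[[str], int],
    sleep: Callable[[float], None] = time.sleep,
) -> dict[str, int]:
    """Process URLs one at a time with a fixed delay between them."""
    results: dict[str, int] = {}
    for index, url in enumerate(urls):
        if index > 0:
            sleep(REQUEST_DELAY_S)
        results[url] = fetch_with_retries(url, send, sleep)
    return results
```

Passing sleep as a parameter keeps the loop testable without real waiting. In this sketch a URL that is rate-limited on every attempt is requested four times in total before its final 429 is recorded.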

Provider configurations

Bright Data

Bright Data used two integration methods depending on the domain. For Amazon, Google Maps, Trustpilot, and Yelp, we used the Dataset API, which returns structured JSON with parsed fields. For Tripadvisor, we used a web unblocker that returns rendered HTML, which we parsed locally with CSS selectors.

The Dataset API was polled via the /progress/{snapshot_id} endpoint at 1-second intervals until the status reached ‘ready’. Results were then fetched from the /snapshot/{snapshot_id} endpoint.
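
A minimal sketch of this poll-then-fetch pattern follows. The get_json callable and base_url are hypothetical placeholders for an authenticated HTTP GET and the API root, and reading the state from a "status" key in the progress response is an assumption. The default timeout mirrors the 600-second execution timeout from the shared configuration.

```python
import time
from typing import Any, Callable


def wait_for_snapshot(
    snapshot_id: str,
    get_json: Callable[[str], Any],
    base_url: str,
    poll_interval_s: float = 1.0,
    timeout_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Any:
    """Poll the progress endpoint until the snapshot is ready, then
    fetch and return the results."""
    deadline = clock() + timeout_s
    while True:
        progress = get_json(f"{base_url}/progress/{snapshot_id}")
        # Assumption: the progress payload exposes its state under "status".
        if progress.get("status") == "ready":
            return get_json(f"{base_url}/snapshot/{snapshot_id}")
        if clock() >= deadline:
            raise TimeoutError(f"snapshot {snapshot_id} not ready in {timeout_s}s")
        sleep(poll_interval_s)
```

Because completion time is measured end-to-end, every polling interval spent in this loop counts toward the reported time for async providers. A production version would also stop early on an explicit failure status, which is omitted here.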

Decodo

Decodo used the Universal Scraper API for Amazon. For Google Maps, Tripadvisor, Trustpilot, and Yelp, we used the web unblocker with the X-SU-Headless: HTML header for JavaScript rendering. All requests included a desktop User-Agent header.

Oxylabs

Oxylabs used a dedicated source API for Amazon (source: amazon_reviews) with structured JSON output. For Google Maps, Tripadvisor, Trustpilot, and Yelp, we used the Web Unblocker proxy. Unblocker requests included a desktop User-Agent header.

Nimble

Nimble used the Web API for all domains with render: true for JavaScript rendering. All requests returned rendered HTML, which we parsed with CSS selectors. No domain-specific configuration was applied.

Zyte

Zyte used the Extract API for all domains with browserHtml: true, which returns JavaScript-rendered HTML via a headless browser. No domain-specific configuration was applied.

FAQs

Manual product review scraping is slow and incomplete. Scraping customer reviews using automated tools allows you to extract hundreds or thousands of reviews in minutes.

This saves time and ensures your data collection process captures both positive and negative reviews.

Scraped reviews provide valuable customer insights for market research. Companies can track customer concerns, measure customer loyalty, and analyze customer preferences over time.

Most review platforms set restrictions on automated data extraction. Running web scrapers too aggressively can trigger CAPTCHA, IP blocks, or bans.

To reduce risks, use a respectful automated process with rate limits, random delays, and residential proxies if needed.

Typical fields include review text, star ratings, user names, dates, and metadata. Some setups also track structured data like location, product category, or business type.

You can collect customer reviews from various websites, including e-commerce platforms, social media networks, and popular platforms like Amazon, Walmart, Yelp, Google Play, and Trustpilot.

Cite this benchmark

Nazlı Şipi (2026) - "Review Scraping Benchmark: Bright Data, Oxylabs & Decodo". Published online at AIMultiple.com. Retrieved July 24, 2026, from: https://aimultiple.com/review-scraping [Online Resource]

Şipi, N. (2026, July 24). Review Scraping Benchmark: Bright Data, Oxylabs & Decodo. AIMultiple. https://aimultiple.com/review-scraping

@misc{sipi2026,
  author = {Şipi, Nazlı},
  title  = {{Review Scraping Benchmark: Bright Data, Oxylabs \& Decodo}},
  year   = {2026},
  month  = jul,
  howpublished    = {\url{https://aimultiple.com/review-scraping}},
  note   = {AIMultiple. Retrieved July 24, 2026}
}

Download all data

Results and timestamps for about 14,000 data points. The data used in this article is available as a ZIP file containing one CSV file and a README.

Last updated: July 3, 2026

Nazlı Şipi

AI Researcher

Nazlı is a data analyst at AIMultiple. She has prior experience in data analysis across various industries, where she worked on transforming complex datasets into actionable insights.
