Skip to content

feat: provide Request instances in skipped request callbacks#1999

Open
vdusek wants to merge 4 commits into
masterfrom
provide-request-in-skipped-callbacks
Open

feat: provide Request instances in skipped request callbacks#1999
vdusek wants to merge 4 commits into
masterfrom
provide-request-in-skipped-callbacks

Conversation

@vdusek

@vdusek vdusek commented Jun 30, 2026

Copy link
Copy Markdown
Collaborator

on_skipped_request callbacks can now receive the full Request object instead of only the URL string, so handlers can access request metadata such as user_data.

This stays backward compatible by dual-dispatching on the type annotation of the callback's first parameter: annotate it as Request to receive the object, while a str annotation (or no annotation) keeps receiving the URL string as before. Skipped URLs are normalized to Request once at the choke points (add_requests and both extract_links implementations), the duplicated request-building helper was extracted to crawlee._utils.requests.create_request_from_options, and link extraction now applies transform_request_function to robots-skipped requests for consistency with enqueued ones.

Reworked from #1927 (original author inactive).

@vdusek vdusek added t-tooling Issues with this label are in the ownership of the tooling team. adhoc Ad-hoc unplanned task added during the sprint. labels Jun 30, 2026
@vdusek vdusek self-assigned this Jun 30, 2026
@codecov

codecov Bot commented Jun 30, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 92.77108% with 6 lines in your changes missing coverage. Please review.
✅ Project coverage is 93.17%. Comparing base (a58687e) to head (608fbb6).
⚠️ Report is 4 commits behind head on master.

Files with missing lines Patch % Lines
src/crawlee/crawlers/_basic/_basic_crawler.py 83.33% 6 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1999      +/-   ##
==========================================
+ Coverage   93.09%   93.17%   +0.08%     
==========================================
  Files         167      167              
  Lines       11775    11815      +40     
==========================================
+ Hits        10962    11009      +47     
+ Misses        813      806       -7     
Flag Coverage Δ
unit 93.17% <92.77%> (+0.08%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@vdusek vdusek requested a review from Pijukatel June 30, 2026 12:22
@vdusek vdusek marked this pull request as ready for review June 30, 2026 12:22
@vdusek vdusek requested a review from Mantisus July 1, 2026 07:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

adhoc Ad-hoc unplanned task added during the sprint. t-tooling Issues with this label are in the ownership of the tooling team.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants