Commit

Remove unused setting and improve settings.md docs
Wesley van Lee committed Jan 10, 2025
1 parent 1a7ce0e commit f86183d
Showing 4 changed files with 1 addition and 12 deletions.
10 changes: 1 addition & 9 deletions docs/settings.md
@@ -57,12 +57,4 @@ This setting defines the location of the WACZ file that should be used as a sour
 SW_WACZ_CRAWL = True
 ```

-Setting to ignore original `start_requests`, just yield all responses found.
-
-### `SW_WACZ_TIMEOUT`
-
-```python
-SW_WACZ_TIMEOUT = 60
-```
-
-Transport parameter for retrieving the `SW_WACZ_SOURCE_URI` from the defined location.
+Setting to control the scraping behavior. If set to `False`, the scraper will bypass the WACZ middleware/downloadermiddleware during the crawling process.
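For context, here is a minimal sketch of how the settings that remain after this commit might look in a Scrapy project's `settings.py`. The file URI is a placeholder assumption (the tests in this commit also use a `file://` URI), not a value taken from the repository.

```python
# settings.py (illustrative fragment; the path below is a hypothetical placeholder)

# Location of the WACZ file to use as a source.
SW_WACZ_SOURCE_URI = "file:///path/to/archive.wacz"

# The setting documented in the hunk above.
SW_WACZ_CRAWL = True

# SW_WACZ_TIMEOUT is removed by this commit, so it is intentionally not set here.
```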
1 change: 0 additions & 1 deletion scrapy_webarchive/spidermiddlewares.py
@@ -30,7 +30,6 @@ def __init__(self, settings: Settings, stats: StatsCollector) -> None:

         self.wacz_uris = re.split(r"\s*,\s*", wacz_uri)
         self.crawl = settings.get("SW_WACZ_CRAWL", False)
-        self.timeout = settings.getfloat("SW_WACZ_TIMEOUT", 60.0)

     @classmethod
     def from_crawler(cls, crawler: Crawler) -> Self:
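
The `re.split` call in the context lines above turns a comma-separated string into a list of individual URIs. A standalone sketch of that behavior follows. The function name `split_wacz_uris` and the example URIs are assumptions made for illustration and do not appear in the repository.

```python
import re


def split_wacz_uris(wacz_uri: str) -> list[str]:
    # Same pattern as in the diff: a comma with optional whitespace on either side.
    return re.split(r"\s*,\s*", wacz_uri)


uris = split_wacz_uris("file:///tmp/a.wacz , file:///tmp/b.wacz")
print(uris)
```

Note that this pattern only trims whitespace around the commas. Whitespace at the very start or end of the whole string is left in place, so a caller may want to call `.strip()` on the input first.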
1 change: 0 additions & 1 deletion tests/test_downloadermiddlewares.py
@@ -24,7 +24,6 @@ def _get_settings(self, **new_settings):
         settings = {
             "SW_WACZ_SOURCE_URI": self._get_wacz_source_url(),
             "SW_WACZ_CRAWL": True,
-            "SW_WACZ_TIMEOUT": 60,
         }
         settings.update(new_settings)
         return Settings(settings)
1 change: 0 additions & 1 deletion tests/test_middleware.py
@@ -17,7 +17,6 @@ def setup_method(self):
     def _get_settings(self, **new_settings):
         settings = {
             "SW_WACZ_SOURCE_URI": get_test_data_path("warc_1_1", "quotes.wacz").as_uri(),
-            "SW_WACZ_TIMEOUT": 60,
         }
         settings.update(new_settings)
         return Settings(settings)