Image Cloud: Diagnostics
This new wiki page was created mainly because the original image cloud page mixes all versions and is therefore inconsistent.
Check the status of an image
git repo: swiss army knife: https://git.picalike.corpex-kunden.de/incubator/swiss-army-knife/-/tree/master/image_cloud
python3 ./image_cloud_diag.py --url https://www.ferragamo.com/wcsstore/FerragamoCatalogAssetStore/images/products/756606/756606_01_r20.jpg
url: https://www.ferragamo.com/wcsstore/FerragamoCatalogAssetStore/images/products/756606/756606_01_r20.jpg
download requested: 2022-07-22 22:27
download started : 2022-08-17 17:30
download finished : 2022-08-17 17:30
content length: 20741
fail count: 2
fire+forget: False
The result shows what state an image is in. If the content length is greater than zero, the image has been downloaded successfully.
Because status codes are not stored, the reason for a failed download cannot be inferred. It may have been a time-out or a temporary block.
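The rule above can be applied to captured tool output. The following is a minimal sketch, assuming the report uses the field labels shown in the example (content length, fail count, fire+forget). The function names and the state labels "downloaded", "failed" and "pending" are hypothetical and not part of image_cloud_diag.py.

```python
import re


def parse_status(report: str) -> dict:
    """Extract selected fields from a status report (labels taken from the example output)."""
    fields = {}
    match = re.search(r"content length:\s*(\d+)", report)
    fields["content_length"] = int(match.group(1)) if match else None
    match = re.search(r"fail count:\s*(\d+)", report)
    fields["fail_count"] = int(match.group(1)) if match else None
    match = re.search(r"fire\+forget:\s*(True|False)", report)
    fields["fire_and_forget"] = (match.group(1) == "True") if match else None
    return fields


def classify(fields: dict) -> str:
    """Map parsed fields to a coarse state; the labels are illustrative assumptions."""
    length = fields.get("content_length")
    if length is not None and length > 0:
        # Content length > 0 means the image was downloaded.
        return "downloaded"
    if fields.get("fail_count"):
        # No content yet, but at least one failed attempt was counted.
        return "failed"
    return "pending"


if __name__ == "__main__":
    sample = "content length: 20741\nfail count: 2\nfire+forget: False\n"
    print(classify(parse_status(sample)))
```

Note that a non-zero fail count does not contradict a successful download: in the example above the image was fetched after two failed attempts.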
Domain Analysis
It is possible to get an overview of all failed downloads per domain, but only for downloads that were scheduled with the fire+forget worker(!):
python3 ./image_cloud_diag.py --url "https://www.yoox.com/images/items/37/37796884rx_14_f.jpg" --date 2022-08-04 --domain www.yoox.com
node image-cloud.picalike.corpex-kunden.de:54321 is responsible for https://www.yoox.com/images/items/37/37796884rx_14_f.jpg
connecting to node to retrieve image information...
2022-08-04 07:44 | 3 | https://www.yoox.com/images/items/11/11468480DT_14_f.jpg
domain analysis required 0.010085 secs for 1 URLs
The URL is only used to determine which node's DB to query.
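For longer result lists it can help to post-process the domain analysis output. The following is a minimal sketch, assuming each result row is printed on its own line in the form "timestamp | number | url" as in the example. Reading the middle column as the fail count is an assumption, and the names FailedDownload, parse_rows and domain_of are hypothetical helpers, not part of image_cloud_diag.py.

```python
from datetime import datetime
from typing import List, NamedTuple
from urllib.parse import urlsplit


class FailedDownload(NamedTuple):
    timestamp: datetime
    fail_count: int  # assumption: the middle column is the fail count
    url: str


def parse_rows(output: str) -> List[FailedDownload]:
    """Collect the 'timestamp | number | url' rows and skip all other lines."""
    rows = []
    for line in output.splitlines():
        parts = [part.strip() for part in line.split("|", 2)]
        if len(parts) != 3:
            continue
        try:
            when = datetime.strptime(parts[0], "%Y-%m-%d %H:%M")
            count = int(parts[1])
        except ValueError:
            # Not a result row (for example a status message containing '|').
            continue
        rows.append(FailedDownload(when, count, parts[2]))
    return rows


def domain_of(url: str) -> str:
    """Return the host part of a URL, suitable as the value for --domain."""
    return urlsplit(url).hostname or ""


if __name__ == "__main__":
    sample = (
        "connecting to node to retrieve image information...\n"
        "2022-08-04 07:44 | 3 | https://www.yoox.com/images/items/11/11468480DT_14_f.jpg\n"
        "domain analysis required 0.010085 secs for 1 URLs\n"
    )
    for row in parse_rows(sample):
        print(domain_of(row.url), row.fail_count, row.timestamp)
```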
Check Domain Block State
python3 ./image_cloud_diag.py --url "https://www.ferragamo.com/wcsstore/FerragamoCatalogAssetStore/images/products/756254/756254_01_r20.jpg" --date 2022-08-17 --domain www.ferragamo.com --check-block-state
WARNING: the query is not covered by an index and is likely slow
www.ferragamo.com: successful download at 2022-08-17 18:30
www.ferragamo.com: successful download at 2022-08-17 18:30
www.ferragamo.com: successful download at 2022-08-17 18:30
www.ferragamo.com: successful download at 2022-08-17 18:30
www.ferragamo.com: successful download at 2022-08-17 18:30
Reset Error Counters
python3 ./image_cloud_diag.py --url "https://www.ferragamo.com/wcsstore/FerragamoCatalogAssetStore/images/products/756254/756254_01_r20.jpg" --reset-fail-count
A variation resets the counters for a whole domain, restricted by date:
python3 ./image_cloud_diag.py --url "https://www.ferragamo.com/wcsstore/FerragamoCatalogAssetStore/images/products/756254/756254_01_r20.jpg" --reset-fail-count --domain www.ferragamo.com --date 2022-08-15
This resets all error counters for the given domain to zero, restricted to entries with download_requested >= 2022-08-15.
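The selection rule of the domain reset can be illustrated in memory. The following is a minimal sketch, assuming records carry a domain, a download_requested date and a fail count. The ImageRecord class and the reset_fail_counts function are hypothetical and say nothing about how image_cloud_diag.py or the node DB implement the reset.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass
class ImageRecord:
    # Hypothetical record layout, for illustration only.
    url: str
    domain: str
    download_requested: date
    fail_count: int


def reset_fail_counts(records: List[ImageRecord], domain: str, since: date) -> int:
    """Zero the fail count of every record of `domain` requested on or after `since`.

    Returns the number of records that matched the restriction.
    """
    matched = 0
    for record in records:
        if record.domain == domain and record.download_requested >= since:
            record.fail_count = 0
            matched += 1
    return matched


if __name__ == "__main__":
    records = [
        ImageRecord("https://www.ferragamo.com/a.jpg", "www.ferragamo.com", date(2022, 8, 14), 4),
        ImageRecord("https://www.ferragamo.com/b.jpg", "www.ferragamo.com", date(2022, 8, 15), 2),
    ]
    print(reset_fail_counts(records, "www.ferragamo.com", date(2022, 8, 15)))
```

Records requested before the given date keep their counters, so older failures remain visible.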