
Image Cloud: Diagnostics

This new wiki page was created mainly because the original image cloud page mixes any and all versions and is therefore inconsistent.

Check the status of an image

git repo: swiss army knife: https://git.picalike.corpex-kunden.de/incubator/swiss-army-knife/-/tree/master/image_cloud

python3 ./image_cloud_diag.py --url  https://www.ferragamo.com/wcsstore/FerragamoCatalogAssetStore/images/products/756606/756606_01_r20.jpg

url: https://www.ferragamo.com/wcsstore/FerragamoCatalogAssetStore/images/products/756606/756606_01_r20.jpg
download requested: 2022-07-22 22:27
download started  : 2022-08-17 17:30
download finished : 2022-08-17 17:30
content length: 20741
fail count: 2
fire+forget: False

The result shows what state an image is in. If the content length is greater than zero, the image was eventually downloaded.

However, status codes are not stored, so the cause of a failed download cannot be inferred. It may have been a timeout or temporary blocking, for example.
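
For scripted checks, the report can be parsed into fields. The following is a minimal sketch, assuming the tool prints one "key: value" pair per line exactly as in the example above. The function names and the state labels "failing" and "pending" are hypothetical and not part of image_cloud_diag.py; only the rule "content length greater than zero means downloaded" comes from this page.

```python
def parse_status(report: str) -> dict:
    """Split each 'key: value' line of the report into a dict of strings."""
    fields = {}
    for line in report.splitlines():
        # Split on the first colon only, so URLs and timestamps stay intact.
        key, sep, value = line.partition(":")
        if not sep:
            continue  # skip blank lines and lines without a colon
        fields[key.strip()] = value.strip()
    return fields


def classify(fields: dict) -> str:
    """Return a coarse state label (labels are assumptions, see lead-in)."""
    content_length = int(fields.get("content length") or 0)
    fail_count = int(fields.get("fail count") or 0)
    if content_length > 0:
        return "downloaded"  # rule stated on this page
    if fail_count > 0:
        return "failing"     # assumed label: attempts failed, nothing stored
    return "pending"         # assumed label: no content and no failures yet
```

Applied to the report above, classify(parse_status(report)) returns "downloaded", because the content length is 20741 even though the fail count is 2.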

Domain Analysis

It is possible to get an overview of all failed downloads per domain, but only for downloads that were scheduled with the fire+forget worker(!):

python3 ./image_cloud_diag.py --url "https://www.yoox.com/images/items/37/37796884rx_14_f.jpg" --date 2022-08-04 --domain www.yoox.com

node image-cloud.picalike.corpex-kunden.de:54321 is responsible for https://www.yoox.com/images/items/37/37796884rx_14_f.jpg
connecting to node to retrieve image information...
2022-08-04 07:44 |  3 | https://www.yoox.com/images/items/11/11468480DT_14_f.jpg
domain analysis required 0.010085 secs for 1 URLs

The URL is used to decide which node DB to query.
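
The result rows of the domain analysis can be picked out of the output for further processing. This is a rough sketch, assuming each row has the form "timestamp | number | url" as in the example above. The page does not say what the middle column means; treating it as the fail count is an assumption, and the names FailedDownload and parse_row are hypothetical.

```python
from datetime import datetime
from typing import NamedTuple, Optional


class FailedDownload(NamedTuple):
    timestamp: datetime
    fail_count: int  # assumed meaning of the middle column
    url: str


def parse_row(line: str) -> Optional[FailedDownload]:
    """Parse one 'timestamp | number | url' row; return None for other lines."""
    parts = [p.strip() for p in line.split("|", 2)]
    if len(parts) != 3:
        return None  # status lines such as 'connecting to node ...'
    try:
        timestamp = datetime.strptime(parts[0], "%Y-%m-%d %H:%M")
        count = int(parts[1])
    except ValueError:
        return None  # has pipes but is not a result row
    return FailedDownload(timestamp, count, parts[2])
```

Lines that are not result rows, such as the "connecting to node" message or the timing summary, yield None and can simply be filtered out.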

Check Domain Block State

python3 ./image_cloud_diag.py --url "https://www.ferragamo.com/wcsstore/FerragamoCatalogAssetStore/images/products/756254/756254_01_r20.jpg" --date 2022-08-17 --domain www.ferragamo.com --check-block-state

WARNING: the query is not covered by an index and is likely slow
www.ferragamo.com: successful download at 2022-08-17 18:30
www.ferragamo.com: successful download at 2022-08-17 18:30
www.ferragamo.com: successful download at 2022-08-17 18:30
www.ferragamo.com: successful download at 2022-08-17 18:30
www.ferragamo.com: successful download at 2022-08-17 18:30

Reset Error Counters

python3 ./image_cloud_diag.py --url "https://www.ferragamo.com/wcsstore/FerragamoCatalogAssetStore/images/products/756254/756254_01_r20.jpg" --reset-fail-count

A variation of this command resets the counters for a whole domain, restricted by date:

python3 ./image_cloud_diag.py --url "https://www.ferragamo.com/wcsstore/FerragamoCatalogAssetStore/images/products/756254/756254_01_r20.jpg" --reset-fail-count --domain www.ferragamo.com --date 2022-08-15 

This resets all error counters for the given domain to zero, restricted to entries with download_requested >= 2022-08-15.
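
The selection rule can be illustrated on in-memory records. This is only a sketch of the semantics described above, not the tool's actual implementation: the record layout (dicts with the keys url, download_requested and fail_count) and the function name are assumptions, and the real tool operates on the node DB.

```python
from datetime import datetime
from urllib.parse import urlsplit


def reset_fail_counts(records: list, domain: str, since: datetime) -> int:
    """Zero fail_count for records of `domain` requested at or after `since`.

    Returns the number of records that matched the restriction.
    """
    matched = 0
    for rec in records:
        host = urlsplit(rec["url"]).hostname
        if host == domain and rec["download_requested"] >= since:
            rec["fail_count"] = 0
            matched += 1
    return matched
```

With since set to datetime(2022, 8, 15), a record requested on 2022-08-14 keeps its counter, while one requested on 2022-08-15 or later is reset.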