====== Product Trend Calculator ======

Calculates a trend score for products based on their position history.

Maintainer: Silvia

===== Host =====

http://pci01.picalike.corpex-kunden.de:8004

===== Git =====

https://git.picalike.corpex-kunden.de/picalike/product_trend_calculator.git

===== Database =====

Host: **osa_data**\\
DB: **meta_db**

reads from:

  * meta_db_collection (products with the given ''%%shop_id%%'' and a ''%%last_visit%%'' not older than 30 days)
  * categories (categories with the given ''%%shop_id%%'' and a ''%%session%%'' not older than 30 days)
  * history (for relevant products and relevant sessions)

writes into:

  * product_trends (unique key: (''%%picalike_id%%'', ''%%session%%''))

===== Usage =====

The product trend calculator receives commands and sends stats to the [[shop_conveyor_belt|shop conveyor belt]] through port [tdb]. It follows the feed import in the pipeline.

===== Relevant Data =====

The trend score is calculated for all products in ''%%meta_db_collection%%'' whose ''%%last_visit%%'' is not older than 30 days. Likewise, only sessions not older than 30 days are taken into account for position history and category size.

===== Weights =====

==== Categories ====

Each category is weighted by its size in its last relevant session (i.e., the latest session in which the category appears), divided by the sum of the sizes of all categories, each taken in its own last relevant session.

=== Example ===

== Session 1 ==

category A: 10 items\\
category B: 5 items

== Session 2 ==

category A: does not appear\\
category B: 10 items\\
category C: 20 items

== Relevant sessions ==

category A: session 1 (10 items)\\
category B: session 2 (10 items)\\
category C: session 2 (20 items)

== Category weights ==

category A: 10/40 = 0.25\\
category B: 10/40 = 0.25\\
category C: 20/40 = 0.5

==== Sessions ====

Sessions are weighted according to their distance to the current session, measured in weeks, as exp(-weeks). The weights are then normalized by dividing by their sum.
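The two weighting schemes described above can be sketched in a few lines of Python. This is only an illustration under assumptions, not the calculator's actual code: the function names (''%%last_relevant_sizes%%'', ''%%category_weights%%'', ''%%time_weighted_position%%'') and the input shapes are hypothetical, and the distance in weeks is passed in directly rather than derived from session ids.

```python
import math


def last_relevant_sizes(sessions):
    """Return each category's size in the latest session where it appears.

    sessions: list of (session_id, {category: size}) pairs, in any order
    (hypothetical input shape).
    """
    latest = {}
    for _session_id, sizes in sorted(sessions, key=lambda s: s[0]):
        latest.update(sizes)  # later sessions overwrite earlier ones
    return latest


def category_weights(sizes):
    """Weight of a category = its size / sum of all category sizes."""
    total = sum(sizes.values())
    return {category: size / total for category, size in sizes.items()}


def time_weighted_position(observations):
    """Average the positions with weights exp(-weeks), normalized by their sum.

    observations: list of (position, weeks_since_current_session) pairs.
    """
    weights = [math.exp(-weeks) for _, weeks in observations]
    weighted_sum = sum(pos * w for (pos, _), w in zip(observations, weights))
    return weighted_sum / sum(weights)


sizes = last_relevant_sizes([(1, {"A": 10, "B": 5}), (2, {"B": 10, "C": 20})])
weights = category_weights(sizes)  # A: 0.25, B: 0.25, C: 0.5
position = time_weighted_position([(0.9, 1.5), (0.4, 0.5), (0.6, 0.0)])
```

With the category sizes from the example, the weights come out as 0.25, 0.25 and 0.5. With the positions and week distances from the worked session example, the time weighted position is about 0.57, also when the unrounded values of exp(-weeks) are used.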
=== Example ===

== Session 0 ==

position: 0.9\\
weeks since current session: 1.5\\
time weight: exp(-1.5) ≈ 0.2\\
result: 0.9 * 0.2 = 0.18

== Session 604800 ==

position: 0.4\\
weeks since current session: 0.5\\
time weight: exp(-0.5) ≈ 0.6\\
result: 0.4 * 0.6 = 0.24

== Session 907200 (current) ==

position: 0.6\\
weeks since current session: 0\\
time weight: exp(0) = 1\\
result: 0.6 * 1 = 0.6

== Time weighted position ==

sum(results)/sum(time weights): (0.18 + 0.24 + 0.6)/(0.2 + 0.6 + 1) = 1.02/1.8 ≈ 0.57

===== Score =====

The product trend score is the sum of the category scores divided by the sum of the category weights, taken over all categories in which the product appears in the relevant sessions.\\
For a given category, the category score is the category weight multiplied by the time weighted position.

=== Example ===

== Category A ==

category weight: 0.4\\
time weighted position: 0.5\\
category score: 0.4 * 0.5 = 0.2

== Category B ==

category weight: 0.2\\
time weighted position: 0.9\\
category score: 0.2 * 0.9 = 0.18

== Product trend score ==

sum(category scores)/sum(category weights): (0.2 + 0.18)/(0.4 + 0.2) = 0.38/0.6 ≈ 0.63

===== Possible problems =====

Too large a chunksize can lead to memory problems.
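As a rough illustration of both the score formula and the chunk size trade-off, the sketch below scores products in fixed-size chunks so that only one chunk is held in memory at a time. It is not the calculator's actual code: ''%%trend_score%%'', ''%%iter_chunks%%'', ''%%score_products%%'', the default ''%%chunk_size%%'' of 1000, and the reading of "chunksize" as the number of products processed per batch are all assumptions made for the example.

```python
from itertools import islice


def trend_score(categories):
    """sum(category scores) / sum(category weights).

    categories: list of (category_weight, time_weighted_position) pairs
    (hypothetical input shape).
    """
    total_weight = sum(weight for weight, _ in categories)
    if total_weight == 0:
        return None  # assumption: no score without any weighted category
    return sum(weight * pos for weight, pos in categories) / total_weight


def iter_chunks(items, chunk_size):
    """Yield lists of at most chunk_size items from any iterable."""
    iterator = iter(items)
    while True:
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return
        yield chunk


def score_products(products, chunk_size=1000):
    """Yield (picalike_id, score) pairs, processing one chunk at a time.

    products: iterable of (picalike_id, categories) pairs.
    """
    for chunk in iter_chunks(products, chunk_size):
        for picalike_id, categories in chunk:
            yield picalike_id, trend_score(categories)


# Values from the score example: category A (0.4, 0.5), category B (0.2, 0.9)
example = [("p1", [(0.4, 0.5), (0.2, 0.9)])]
scores = dict(score_products(example, chunk_size=1))
```

For the example values, ''%%scores["p1"]%%'' is about 0.63. A smaller chunk size lowers peak memory use at the cost of more round trips per run, so the right value depends on the size of the shop and the memory available on the host.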