Trend Analyzer

The purpose of this app is to extract and aggregate trend information for attributes from our many clusters. An attribute can be a color, a brand, or any other attribute. The results show, for example, the trend score of the attribute “red”.

Maintainer: Patrick

Host

http://pci01.picalike.corpex-kunden.de:1113/

Git

https://git.picalike.corpex-kunden.de/picalike/trend_analyzer_api

Database

reads from

Host: Live Solr
Zookeeper_Client necessary: solr01.picalike.corpex-kunden.de:2185,solr02.picalike.corpex-kunden.de:2186,sg03.picalike.corpex-kunden.de:2187

writes into

Host: OSA Database
DB: trends
Collection: trends (unique key: (shop_id, picalike_cat, attribute, gender))

Input to Start Calculation

Endpoint: pci01.picalike.corpex-kunden.de:1113/start
The application expects a POST request with the following parameters:

Upload Output to Mongo

    "trend_id": "{}#{}#{}#{}".format(shop_id, cat, element, gender),
    "shop_id": shop_id,
    "picalike_cat": cat,
    "attribute": element,
    "position": current_position,
    "type": "{}".format(element_type),
    "position_hist": position_hist,
    "cluster_trend": cluster_trend,
    "timestamp": time.time(),
    "gender": gender

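As a minimal sketch, the fields above could be assembled into one document as follows. The function name build_trend_document and the sample values ("shirts", "red", "color", "female") are hypothetical and only serve as illustration; the field names and the trend_id format are taken from the listing above.

```python
import time


def build_trend_document(shop_id, cat, element, element_type, gender,
                         current_position, position_hist, cluster_trend):
    """Assemble one trend document using the field names listed above.

    Hypothetical helper: the real application may build the document
    differently.
    """
    return {
        # trend_id joins the unique-key fields with '#'
        "trend_id": "{}#{}#{}#{}".format(shop_id, cat, element, gender),
        "shop_id": shop_id,
        "picalike_cat": cat,
        "attribute": element,
        "position": current_position,
        "type": "{}".format(element_type),
        "position_hist": position_hist,
        "cluster_trend": cluster_trend,
        "timestamp": time.time(),
        "gender": gender,
    }


# Sample values are made up for illustration only.
doc = build_trend_document("s24_de_feed", "shirts", "red", "color", "female",
                           0.25, {"5": 0.82}, 0.82)
print(doc["trend_id"])
```

Because trend_id is derived from the same four fields as the unique key of the trends collection, it could serve as a single lookup key for one attribute combination of one shop.
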
Input

Endpoint: pci01.picalike.corpex-kunden.de:1113/get_trends
The application expects a POST request with the following parameters:

Output Example

 {'results': [{'attribute': 'Cashmere Victim',
               'position_hist': {'5': 0.8240069150924683, '6': 0.8240069150924683},
               'cluster_trend': 0.8240069,
               'position': 0.25751608239999996},
              {'attribute': 'Clamp',
               'position_hist': {'3': 0.8273629397153854,
                                 '5': 0.7457152545452118,
                                 '6': 0.7457152545452118,
                                 '2': 0.8651039004325867},
               'cluster_trend': 0.745715242,
               'position': -404.0}],
  'status': 200,
  'msg': 'Found 6086 documents for s24_de_feed. This was a trend analysis on attribute'}
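
A client might consume a response shaped like the one above roughly as follows. This is only a sketch: the helper rank_by_cluster_trend is hypothetical, and the response dictionary is a shortened copy of the example rather than real output.

```python
def rank_by_cluster_trend(response):
    """Return (attribute, cluster_trend) pairs, highest cluster_trend first.

    Entries without a cluster_trend value are skipped, since the value
    may be null for some shops.
    """
    pairs = [
        (result["attribute"], result["cluster_trend"])
        for result in response.get("results", [])
        if result.get("cluster_trend") is not None
    ]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


# Shortened copy of the output example; not real service output.
response = {
    "results": [
        {"attribute": "Clamp", "cluster_trend": 0.745715242},
        {"attribute": "Cashmere Victim", "cluster_trend": 0.8240069},
    ],
    "status": 200,
}

ranked = rank_by_cluster_trend(response)
for attribute, score in ranked:
    print(attribute, score)
```
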

How the calculation works

Using the shop_id as reference, the app filters documents in Solr by the trendsetter shops defined for that shop. From the search results, the app collects each product's attributes and trend_score from Solr and aggregates the overall trend_score for each attribute combination.

All product metadata is collected in the following data structure:

 container = {picalike_cat:
                  {attribute:
                       {gender:
                            {position_hist:
                                 {calendar_week1: [past_positions],
                                  calendar_week2: [past_positions], ...},
                             position: [current_positions],
                             cluster_trend: [current_cluster_trend]
                            }
                       }
                  }
             }

The calculation considers the last 30 days. If the requested shop_id belongs to a feed shop, the position information is not considered. If the requested shop_id belongs to a crawler shop, the cluster_trend information is treated as null.

Attribute combination

This structure was chosen so that the trend_score values of each attribute combination can be collected in a list and then averaged. An attribute combination consists of the picalike_cat, the attribute name, and the gender, which together also form the combination's id (picalike_cat#attribute_name#gender).

Each product's metadata is added to its attribute combination: if the combination does not exist yet, it is created; if it already exists, only the position/cluster_trend scores are appended.
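
The collection step could be sketched with nested defaultdicts, so that a missing attribute combination is created on first access and later products only append their scores. The names new_entry, new_container, and add_product, as well as the sample values, are assumptions for illustration and not the application's actual code.

```python
from collections import defaultdict


def new_entry():
    """Empty score lists for one attribute combination."""
    return {"position_hist": defaultdict(list), "position": [], "cluster_trend": []}


def new_container():
    """Nested mapping: picalike_cat -> attribute -> gender -> entry."""
    return defaultdict(lambda: defaultdict(lambda: defaultdict(new_entry)))


def add_product(container, picalike_cat, attribute, gender,
                position=None, cluster_trend=None, past_positions=None):
    """Append one product's scores to its attribute combination.

    The combination is created automatically if it does not exist yet.
    """
    entry = container[picalike_cat][attribute][gender]
    if position is not None:
        entry["position"].append(position)
    if cluster_trend is not None:
        entry["cluster_trend"].append(cluster_trend)
    # past_positions maps a calendar week to one past position value
    for week, value in (past_positions or {}).items():
        entry["position_hist"][week].append(value)
    return entry


# Two products with the same combination share one entry.
container = new_container()
add_product(container, "shirts", "red", "female", cluster_trend=0.8)
add_product(container, "shirts", "red", "female", cluster_trend=0.6)
print(container["shirts"]["red"]["female"]["cluster_trend"])
```
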

Once all positions or cluster_trends have been collected for each attribute combination (depending on whether the shop is a crawler shop or a feed shop), the average of each is calculated.
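
The averaging step might look like the sketch below. It assumes that an empty score list yields None (for example cluster_trend for a crawler shop) and that position_hist is averaged per calendar week; both are assumptions, and the helper names are hypothetical.

```python
from statistics import mean


def average_or_none(values):
    """Average a list of scores; an empty list yields None (assumption)."""
    return mean(values) if values else None


def summarize(entry):
    """Reduce the collected score lists of one attribute combination to averages."""
    return {
        "position": average_or_none(entry["position"]),
        "cluster_trend": average_or_none(entry["cluster_trend"]),
        "position_hist": {
            week: average_or_none(values)
            for week, values in entry["position_hist"].items()
        },
    }


# Made-up scores for one attribute combination.
entry = {
    "position_hist": {"5": [0.5, 1.0]},
    "position": [],
    "cluster_trend": [0.5, 1.0],
}
summary = summarize(entry)
print(summary)
```
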