====== OnSight Analytics Interface ======

{{/dokuwiki/lib/images/smileys/icon_exclaim.gif|:!:}}**branch01 is only the home during development; there is no guarantee regarding uptime or version**{{/dokuwiki/lib/images/smileys/icon_exclaim.gif|:!:}}

Host: **http://branch01.picalike.corpex-kunden.de:9095/**

Host: **http://branch01.picalike.corpex-kunden.de:9995/** (no-timeout requests; please send only one request at a time)

Example: **http://branch01.picalike.corpex-kunden.de:9095/hello**

===== git =====

Available in git as:

  $ git clone ssh://picalike@sg01.picalike.corpex-kunden.de/home/picalike/repositories/onsight_analytics.git

===== General =====

===== MongoDB-Collections =====

  attributes (all picalike attributes)
  categories (all picalike categories)
  genders (all picalike genders)

**also in use**

  sessions

===== Apis =====

  http://:/

==== Solr Search ====

=== /update_solr_index (POST/json) ===

creates a new index in solr for shop_id or updates an existing one

**in:**
  shop_id:
  session_history:
Data used for the update must have a session >= (latest_session - session_history)

**out:**
  {}

=== /request_solr_index (POST/json) ===

makes a search request to solr

**in:**
  shop_id: or query:
  start_date: yyyy-mm-dd
  end_date: yyyy-mm-dd
  start:
  rows:
  sort:

example for start and rows:
  * start: 5 means you will skip the first 5 results
  * rows: 5 means you will get 5 results

example for query_str:
  gender:"Damen" AND name:"Schal" AND price:[* TO 2000]

example for sort:
  price desc
  price asc

known field names for now: ''%%pid, price, name, text, brand, gender, images, picalike_gender, picalike_cat, shop_cat%%''

{{/dokuwiki/lib/images/smileys/icon_exclaim.gif|:!:}} some fields may be implemented in the future

**out:**
  {"message": , "response": {"result_list": [], "total_match": , "stats":{"price": {"min": , "max": , "mean": , "median": }}}}

=== /is_synchronized (POST/json) ===

checks if the session number is the same for import and solr

**in:**
  {"shop_ids": []}

**out:**
  {: }

==== Shops ====
=== /translate_id (POST) ===

translates apikey ↔ shop_id for feeds

**in:**
  shop_id: OR apikey:

**out:**
  {"message": , "response": { (apikey or shop_id)}}

=== /get_shops (GET) ===

returns all unique shop_ids of shops that have been crawled during the last week (7*24*60*60 sec)

no cache; queries MongoDB directly

**in:**
  {}

**in_opt:**
  start_date= (yyyy-mm-dd)
  with_feed=

**out:**
  {"message": , "response": {"shop_ids": [], "infos":{ : {"segment": , "shop_name": , "type": }}}}

=== /get_shop_crawler_mapping (POST/json) ===

returns all crawler_shop_ids which are mapped to the shop_id

**in:**
  shop_id:

**out:**
  {"message": , "response": {"shop_ids": [], "infos":{ : {"segment": , "shop_name": , "type": }}}}

=== /remove_shop_crawler_mapping (POST/json) ===

removes all crawler_shop_ids which are mapped to the shop_id

**in:**
  shop_id:

**out:**
  {"message": , "response": {}}

=== /add_shop_crawler_mapping (POST/json) ===

updates the mapping of crawler_shop_ids to shop_id (will overwrite previous entries)

writes to shop_crawler_mapping in MongoDB

**in:**
  shop_id:
  crawler_shops: []

**out:**
  {"message": , "response": {}}

=== /add_inspiration_shops (POST/json) ===

updates the mapping of inspiration_shop_ids to shop_id (will overwrite previous entries)

writes to shop_crawler_mapping in MongoDB

**in:**
  shop_id:
  inspiration_shops: [] # list of shop_ids

**out:**
  {"message": , "response": {}}

=== /get_inspiration_shops (POST/json) ===

returns all inspiration_shop_ids which are mapped to the shop_id

**in:**
  shop_id:

**out:**
  {"message": , "response": {"shop_ids": [], "infos":{ : {"segment": , "shop_name": , "type": }}}}

=== /get_shop_stats (POST/json) ===

returns stats for a shop

**in:**
  shop_ids: []
  include_unmapped: bool,
  max_age:

**out:**
  {"message": , "response": {
      "total": # total number of mapped products
      "by_cat": # picalike_cat -> (absolute count, percent)
      # only if include_unmapped == True
      "unmapped_count":
      "unmapped": # ->
  }}

==== Products ====

=== /get_product_history (POST/json) ===

returns
history for given products

**in:**
  picalike_ids: []

**out:**
  {"message": , "response": {
      : [{ "timestamp": , "price": , "sort_key": , "position": [{"cat": [], "pos": }, ...]}, ...],
      : [], ...}}

for each picalike_id, the array is sorted by “timestamp”

price is in cents (int)

==== Miscellaneous ====

=== /get_attributes (GET) ===

returns all picalike attributes

no cache; queries MongoDB directly

**in:**
  {}

**out:**
  {"message": , "response": {"attributes": []}}

==== Categories ====

=== /get_categories (GET) ===

returns all picalike categories

no cache; queries MongoDB directly

**in:**
  {}

**out:**
  {"message": , "response": {"categories": []}}

=== /get_shop_categories (GET) ===

returns all category paths of the shop from the last session

the first request goes to MongoDB, later requests are served from the cache; refresh the cache with the parameter use_cache=false

**in:**
  shop_id=

**in_opt:**
  use_cache=false (default is true)
  session_history=
Shown data must have a session >= (latest_session - session_history)

**out:**
  {"message": , "response": {"categories": [ list() ]}}

list() returns the tree structure of the shop's categories, e.g. [ “Damen”, “Hosen”, “Jeans” ]

=== /add_category_relation (POST/json) ===

adds a shop_category → picalike_category relation

**in:**
  shop_id:
  p_cat:
  shop_cat:
see also /get_categories and /get_shop_categories

**out:**

=== /remove_category_relation (POST/json) ===

removes a shop_category → picalike_category relation

**in:**
  shop_id:
  shop_cat:

**out:**

=== /get_category_relation (GET) ===

gets all relations for shop_id

**in:**
  shop_id=

**out:**
  {"message": , "response": [ tuple(, ) ]}

The tuple is stored in JSON as a list with two elements.

=== /get_top_categories (GET) ===

gets the top categories across all shops listed in shop_id combined

**in:**
  shop_id=[]

**in_opt:**
  top_n=

**out:**
  {"message": , "response": {"n_products": , "top_cats": [ tuple(, ) ]}}

Categories are ordered from most to least frequent.
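The following is a minimal sketch of how a client might post-process a /get_top_categories response. It assumes, without confirmation from this page, that each ''top_cats'' entry is a two-element list of (category, product count) and that ''n_products'' is the total the counts refer to. The function name ''top_category_shares'' and the category ''fashion_top_shirt'' are hypothetical and used only for illustration.

```python
from typing import Any


def top_category_shares(payload: dict[str, Any]) -> list[tuple[str, float]]:
    """Turn a parsed /get_top_categories response into (category, share) pairs.

    Assumption: every entry in "top_cats" is [category, count]; the element
    order is not documented on this page.
    """
    response = payload["response"]
    n_products = response["n_products"]
    shares = []
    for category, count in response["top_cats"]:
        # Guard against an empty shop selection to avoid a division by zero.
        share = count / n_products if n_products else 0.0
        shares.append((category, share))
    # The API already orders categories from most to least frequent,
    # so the order of the input list is kept as-is.
    return shares


# Hand-written sample payload, not real API output.
sample = {
    "message": "OK",
    "response": {
        "n_products": 200,
        "top_cats": [["fashion_leg_pants", 120], ["fashion_top_shirt", 50]],
    },
}

for category, share in top_category_shares(sample):
    print(f"{category}: {share:.0%}")
```

Keeping the function free of any HTTP code makes it usable with whatever client fetches the JSON.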
=== /get_categorization (POST) ===

gets picalike category suggestions for the original shop categories

**in:**
  shop_id:

**in_opt:**
  shop_cat:[, ,...] # optional: default looks for all existing shop categories

**out:**
  {"message": , "response": [{"shop_cat": "Damen_Hosen_Lange Hosen", "picalike_cat": "fashion_leg_pants", "rel_freq": 1.0}]}

=== http:%%//%%branch01.picalike.corpex-kunden.de:5003/prepare_categorization (GET) ===

prepares suggestions of picalike categories for the original shop categories, based on the relative frequencies of the picalike categories within each original shop category.

Saves results in MongoDB: Sandy Picalike –> patrick_test –> categorization

==== Brands ====

=== /get_shop_brands (POST) ===

returns all brands of the shop from the last session

**in:**
  shop_id:

**out:**
  {"message": , "response": {"brands": [ ], "count":{: }}}

==== Genders ====

=== /get_genders (GET) ===

returns all picalike genders

no cache; queries MongoDB directly

**in:**

**out:**
  {"message": , "response": {"genders": []}}

=== /get_shop_genders (GET) ===

returns all genders of the shop from the last session

the first request goes to MongoDB, later requests are served from the cache; refresh the cache with the parameter use_cache=false

**in:**
  shop_id=

**in_opt:**
  use_cache=false (default is true)

**out:**
  {"message": , "response": {"genders": [ ]}}

=== /add_gender_relation (POST/json) ===

adds a shop_genders → picalike_genders relation

**in:**
  shop_id:
  p_gender:
  shop_gender:

**out:**

=== /remove_gender_relation (POST/json) ===

removes a shop_genders → picalike_genders relation

**in:**
  shop_id:
  shop_gender:

**out:**

=== /get_gender_relation (GET) ===

gets all relations for shop_id

**in:**
  shop_id=

**out:**
  {"message": , "response": [ tuple(, ) ]}

The tuple is stored in JSON as a list with two elements.
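Because the relation tuples arrive as two-element JSON lists, a client usually wants them as a lookup table. The sketch below shows one way to do that for /get_gender_relation; the same shape applies to /get_category_relation. It assumes the element order is (shop value, picalike value), which this page does not state explicitly. The function name ''relations_to_mapping'' and the sample values are hypothetical.

```python
from typing import Any


def relations_to_mapping(payload: dict[str, Any]) -> dict[str, str]:
    """Convert a parsed relation response into a shop value -> picalike value dict.

    Assumption: each relation is [shop_value, picalike_value]; swap the
    unpacking below if the service returns the opposite order.
    """
    mapping = {}
    for pair in payload["response"]:
        if len(pair) != 2:
            # The page documents exactly two elements per tuple.
            raise ValueError(f"expected a two-element list, got {pair!r}")
        shop_value, picalike_value = pair
        mapping[shop_value] = picalike_value
    return mapping


# Hand-written sample payload, not real API output.
sample = {"message": "OK", "response": [["Damen", "Women"], ["Herren", "Men"]]}
print(relations_to_mapping(sample))
```

If a shop value occurs more than once, the last relation wins in this sketch; whether duplicates can occur at all is not documented here.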
==== Reports ====

=== /existing_report (GET) ===

checks if the report name exists

**in:**
  report_name=
  shop_id=

**out:**
  {"message": , "response": {"exists": }}

=== /list_reports (GET) ===

returns all reports for the given product and shop

**in:**
  shop_id=
  product_id=

**out:**
  {"message": , "response": [{"report_id": , "user_id": , "report_name": }]}

=== /get_all_reports (POST) ===

returns all reports that match the filters (only manually created reports)

**in:** ALL OPTIONAL
  shop_id:
  user_id:
  product_id:
  date_from: # e.g.: "01/31/2019"
  date_till:

**out:**
  {"message": , "response": [{"report_id": , "user_id": , "report_name": , ...}]}

=== /add_report (POST/json) ===

adds a new report, or edits an existing one if a report ID is provided

**in:**
  shop_id:
  product_id:
  user_id:
  report_name:
  filter: { see below }

**in_opt:**
  report_id:

**out:**
  {"message": , "response": {"report_id": }}

**filter** explanation:
  cluster_dist: - similarity
  cluster_price_from - e.g. "10,00"
  cluster_price_till - e.g. "29,99"
  cluster_trendsetters
  cluster_competitors
  cluster_brands
  cluster_genders
  cluster_date_range - e.g.
"01/13/2019 - 01/17/2019" cluster_categories {{/dokuwiki/lib/images/smileys/icon_exclaim.gif|:!:}} cluster_genders can contain the value “all” which needs special treatment === /remove_report (POST/json) === remove a report **in:** report_id: **out:** {"message": } === /get_report (GET) === get a report **in:** report_id= **out:** {"message": , "response": {"report_id": , "report_name": , "product_id": , "shop_id": , "user_id": , "filter": {}, "date": }} === /exclude_product (POST) === remove product from results **in:** {"report_id": , "picalike_id": , "user_id": } === /unexclude_product (POST) === remove product from exclude list **in:** {"report_id": , "picalike_id": } === /get_excluded (GET) === get list of excluded products **in:** report_id= **out:** {"message": , "response": {"excluded": [(, )]}} ==== Cluster ==== === /comp_cluster (POST/json) === creates all the data that is needed by comp_cluster.php. It is also used to create automatic reports **in:** shop_id: prod_id: or image: optional: report_id: , if given, the report is stored immediately price: # only needed if image is given start: rows: limit: -- used for nearest neighbor search filter fields with the same name and in the same format as in /add_report **out:** shop_id: prod_id: # None if image was supplied reco_data: currently: { min_price, max_price, mean_price, ref_price } reco: { timestamp, error, uncertain, huge_markup, price_range, margin_range } cluster_trend: excluded: total_match: - total number of solr matches cluster: if **report_id** is supplied, the report will be saved on the fly **reco_data** is a dictionary with the required data to calculate the recommendation. Currently the fields //min_price, max_price, mean_price, ref_price// are needed. **reco** contains at least //timestamp// and //error//. //error// can be either a boolean or a string. //uncertain// is a boolean that is set to True if the algorithm does not want to give a recommendation. 
//huge_markup// is a boolean that is set to True if the price of the reference article is significantly below the mean price in the cluster. //price_range// returns a list of prices [low, high]. //margin_range// is the relative price increase in percent [low, high]

**cluster_trend** indicates whether the products in the cluster of similar products from trendsetter shops have recently become more popular (close to 1) or less popular (close to -1). A value of 0 indicates that there was no recent change in popularity. The value 1 indicates fast-growing popularity; the value -1 indicates a negative trend.

**cluster** contains a list of similar products. The documents are the solr responses with the additional fields: //distance//, //prod_trend// and //prod_history//. The reference product is always at the beginning of the list. The data is generated from similar products of competitor shops.

=== /get_cluster_report (POST/json) ===

returns the most recent results of the batch processing of the cluster reports

**in:**
  report_id: - can be a report_id or a picalike_id

**out:**
  data: - a merge of 'reco_data' and 'reco' from /comp_cluster

==== Trends ====

=== http:%%//%%branch01.picalike.corpex-kunden.de:5002/prepare_trends (GET) ===

ONLY USE DAILY - PROCESS TAKES MORE THAN 30 MIN

Receives a GET request and downloads the product infos from solr and from the API “http://branch01.picalike.corpex-kunden.de:9095/get_product_history” to find trendy products.
It then forecasts the future trend with double_exponential_smoothing and saves its results in MongoDB

=== /sort_trend_products (POST/json) ===

returns the trendiest products of the most recent trend analysis stored in MongoDB

only considers products from shops that are mapped to the given shop

**in:**
  shop_id: # only feed shops accepted
  optional:
  limit: # default: 10
  category: [] # default: []
  gender: [] # default: ["Men", "Women", "undefined"] --> some products do not have gender info
  selected_shops:[] # default: [] --> looks for all shops

**out:**
  {"message": , "response": [{"product_id":, "category":,"datetime":,"gender":,"pos_mov":,"shop_id":}]}

=== /sort_trend_brands (POST/json) ===

returns the trendiest brands of the most recent trend analysis stored in MongoDB

only considers products from shops that are mapped to the given shop

**in:**
  shop_id: # only feed shops accepted
  optional:
  limit: # default: 10
  category: [] # default: []
  gender: [] # default: ["Men", "Women", "undefined"] --> some products do not have gender info
  selected_shops:[] # default: [] --> looks for all shops

**out:**
  {"message": , "response": [{"brand":, "category":[],"datetime":,"gender":[],"pos_mov":,"shop_id":[]}]}

=== /sort_trend_category (POST/json) ===

returns the trendiest categories of the most recent trend analysis stored in MongoDB

only considers products from shops that are mapped to the given shop

**in:**
  shop_id: # only feed shops accepted
  optional:
  limit: # default: 10
  gender: [] # default: ["Men", "Women", "undefined"] --> some products do not have gender info
  selected_shops:[] # default: [] --> looks for all shops

**out:**
  {"message": , "response": [{"category":, "datetime":,"gender":[],"pos_mov":,"shop_id":[]}]}

=== /sort_trend_colors (POST/json) ===

returns the trendiest colors of the most recent trend analysis stored in MongoDB

only considers products from shops that are mapped to the given shop

**in:**
  shop_id: # only feed shops accepted
  optional:
  limit: # default: 10
  category: [] # default: []
  gender: [] # default: ["Men", "Women", "undefined"] --> some products do not have gender info
  selected_shops:[] # default: [] --> looks for all shops

**out:**
  {"message": , "response": [{"color":, "category":[],"datetime":,"gender":[],"pos_mov":,"shop_id":[]}]}

=== /sort_trend_attributes (POST/json) ===

returns the trendiest attributes of the most recent trend analysis stored in MongoDB

only considers products from shops that are mapped to the given shop

**in:**
  shop_id: # only feed shops accepted
  optional:
  limit: # default: 10
  category: [] # default: []
  gender: [] # default: ["Men", "Women", "undefined"] --> some products do not have gender info
  selected_shops:[] # default: [] --> looks for all shops

**out:**
  {"message": , "response": [{"attribute":, "category":[],"datetime":,"gender":[],"pos_mov":,"shop_id":[]}]}

=== /get_sim_trend_products (POST/json) ===

looks for products in the feed shop that are similar to the given trend product

this API is currently used by /sort_sim_trend_products

**in:**
  {"shop_id_feed":,
   "shop_id_crawler":,
   "product_id":,
   "distance_max": # optional: limits similarity distance, default is 0.5
  }

**out:**
  {"message":"OK","response":{
     "product_ids": ["96607881#bonprix_de_feed", "91524296#bonprix_de_feed", "90838081#bonprix_de_feed", "95056195#bonprix_de_feed", "95020895#bonprix_de_feed"]
   }
  }

=== http:%%//%%frontend01-hpc.picalike.corpex-kunden.de:5003/get_cat_trends (GET) Port 5003 ===

Returns the trendiest products from a category of one shop. Only returns available products. Products are sorted in descending order by cluster_trend.

**in:**
  shop_id:
  prod_id: # Becomes optional, if "i" as prod_ref is given as argument
  optional:
  limit: # default: 10
  brand: or [, ,...] # default: all
  shop_cat: or [, ,...] # default: all
  gender: or [, ,...] # default: all
  size: or [, ,...] # CURRENTLY NOT POSSIBLE TO FILTER.
  # default: all
  max_price: # default: no max
  min_price: # default: no min
  category_limitation: True/False # default: True
  i: # if "i" is set, gender and category are same as given picalike_id. Default: None

**out:**
  {"count": 10,
   "description": "Category Trends",
   "generator": "http://picalike.com",
   "modified": "2019-04-29 08:24:04.480205",
   "title": "picalike Request",
   "ids": {"0": {"brand": "NO",
                 "cluster_trend": 0.0,
                 "extraimg": "https://www.witt.eu/product/resized/029/029.00K3F.072-127.002.u_5.jpg",
                 "gender": "Damenmode",
                 "shop_cat":"Frauen_Bekleidung_Hosen_Sweatpants",
                 "id": "184097",
                 "img": "https://www.witt.eu/product/resized/027/027.00KAT.022-123.013.i_5.jpg",
                 "location": "https://www.witt-weiden.de/287202?articleNumber=184097",
                 "name": "Hose",
                 "price": 1599},
           "1":{...}}
  }

==== Krawla ====

=== http:%%//%%branch01.picalike.corpex-kunden.de:9095/get_krawla_sum (GET) ===

**in:**
  shop_id: # only feed shops accepted

**out:**
  {"message": , "response": {
      "w1": {"total_all" : , "total_new_prod" : , "total_sale" : },
      "w2": {"total_all" : , "total_new_prod" : , "total_sale" : },
      "w3": {"total_all" : , "total_new_prod" : , "total_sale" : },
      "w4": {"total_all" : , "total_new_prod" : , "total_sale" : },
      "w_all" : {"total_all" : , "total_new_prod" : , "total_sale" : }
    }
  }
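As a rough sketch of how the /get_krawla_sum output could be consumed, the snippet below builds the request URL and derives the share of new products per window. It makes several assumptions that this page does not confirm: that ''shop_id'' is passed as a query parameter, that the ''total_*'' values are numeric counts, and that ''total_new_prod'' is a subset of ''total_all''. The function names and the sample numbers are hypothetical.

```python
from typing import Any
from urllib.parse import urlencode

# Development host named at the top of this page (no uptime guarantee).
BASE_URL = "http://branch01.picalike.corpex-kunden.de:9095"


def krawla_sum_url(shop_id: str) -> str:
    """Build the GET URL; assumes shop_id is sent as a query parameter."""
    return f"{BASE_URL}/get_krawla_sum?{urlencode({'shop_id': shop_id})}"


def new_product_ratios(payload: dict[str, Any]) -> dict[str, float]:
    """Return total_new_prod / total_all for every window in the response.

    Assumption: the totals are numeric counts; windows without products
    yield 0.0 instead of raising a ZeroDivisionError.
    """
    ratios = {}
    for window, totals in payload["response"].items():
        total_all = totals["total_all"]
        ratios[window] = totals["total_new_prod"] / total_all if total_all else 0.0
    return ratios


# Hand-written sample payload, not real API output.
sample = {
    "message": "OK",
    "response": {
        "w1": {"total_all": 400, "total_new_prod": 100, "total_sale": 40},
        "w2": {"total_all": 400, "total_new_prod": 200, "total_sale": 10},
        "w_all": {"total_all": 800, "total_new_prod": 300, "total_sale": 50},
    },
}

print(krawla_sum_url("bonprix_de_feed"))
print(new_product_ratios(sample))
```

The URL builder and the ratio calculation are kept separate so the calculation can be checked without network access to the development host.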