====== Witt You API ======

Recommendations based on customer similarity for wwde.

===== git =====

https://git.picalike.corpex-kunden.de/smessuti/you_api

The working version is in the develop branch.

===== Database =====

host: **sandy.picalike.corpex-kunden.de**\\
database: you_api

**witt_products** contains info for all products in the dataset\\
  * ''%%prod_id%%'' product id
  * ''%%image%%'' image url
  * ''%%ratings%%'' similar products used to calculate customer similarity
  * ''%%paid%%'' list of customers that purchased the product
  * ''%%free%%'' list of customers that did not pay for the product

**witt_customers** contains info about the customers\\
  * ''%%customer_id%%'' customer id
  * ''%%paid%%'' list of products purchased by the customer
  * ''%%free%%'' list of products the customer did not pay for

**witt_customer_similarity** contains similar customers and products to be recommended\\
  * ''%%customer_id%%'' reference customer id
  * ''%%similar%%'' the 50 or more most similar customers, sorted and grouped by score
  * ''%%recommended%%'' 200 recommended products, sorted by the number of similar customers that bought them

===== Usage =====

Requests should be sent to\\
http://frontend04-hpc.picalike.corpex-kunden.de:5000/you.php?param1=value1&param2=value2...&paramn=valuen\\
method: GET

==== Parameters ====

**mandatory**\\
  * ''%%y%%'' customer id
  * ''%%key%%'' shop key (currently ignored; only products from wwde are used)

**result**\\
  * ''%%limit%%'' max number of products returned
  * ''%%format%%'' json or txt

**filter**\\
  * ''%%b%%'' brand
  * ''%%color%%'' color
  * ''%%gender%%'' gender
  * ''%%size%%'' size (search in sizes string)
  * ''%%category%%'' category (search in category string)
  * ''%%price_from%%'' min price
  * ''%%price_till%%'' max price

===== Customer Similarity =====

Given an active customer ''%%C%%'', we consider the set ''%%P%%'' of products purchased (paid and free) by ''%%C%%''.\\
For each ''%%p%%'' in ''%%P%%'' we calculate the distance between ''%%p%%'' and every product in the
shop.\\
We consider the 10 smallest distances ''%%s_0, s_1, …, s_9%%'' (note that ''%%s_0 = 0%%'') and for all products with distance ''%%s_i%%'' from ''%%p%%'' we assign a rating ''%%R_p(i) = 10-i%%''.\\
These ratings are saved in the Mongo collection witt_products.

For each customer ''%%D%%'' we calculate the similarity to ''%%C%%'' as a sum over the products in ''%%P%%'': for each ''%%p%%'' in ''%%P%%'' we add the highest rating ''%%R_p%%'' among the products that ''%%D%%'' purchased (paid and free), if ''%%D%%'' purchased any product rated for ''%%p%%''.

=== example ===

(for brevity suppose we only rate products with distance s_0 and s_1)

  * ''%%C%%'' has bought products 1, 2 and 3
  * ''%%D%%'' has bought products 1, 4, 5 and 7
  * ratings for product 1 are {1: 10, 33: 9, 467: 9}
  * ratings for product 2 are {2: 10, '6': 9}
  * ratings for product 3 are {3: 10, '5': 10, '4': 9}

The similarity of ''%%D%%'' to ''%%C%%'' is given by:

  * in the ratings for product 1, ''%%D%%'' has product 1, which gives a score of 10
  * in the ratings for product 2, ''%%D%%'' has no products
  * in the ratings for product 3, ''%%D%%'' has products 4 and 5, so we take 5 because it has the higher rating, which gives a score of 10
  * products 1, 4 and 5 are added to the list of rated products, which will not be included in the final result

Therefore we have score = 20 and count = 2, where count is the number of products from ''%%P%%'' that contributed to the score.

===== Recommendations =====

Once we have at least the 50 most similar customers (we take all customers with the same score, so there could be more than 50 in the end), we look at all products bought by them which are not contained in the rated set. These products are sorted by the number of similar customers that bought them, and the top 200 are saved in witt_customer_similarity.

When the API is called, the metadata for the recommended products is retrieved and those that pass the filters are returned, ordered by their score.
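
The scoring and ranking steps described above can be sketched in Python as follows. This is only a minimal illustration, not the code from the repository: the function names ''%%similarity%%'', ''%%top_similar%%'' and ''%%recommend%%'' are hypothetical, plain dicts stand in for the Mongo collections, and excluding one combined rated set from all similar customers is an assumption about how the rated products are applied.

```python
from collections import Counter


def similarity(ratings, c_products, d_products):
    """Score customer D against customer C.

    ratings maps each product p to {rated product: rating}, as stored per
    product. Returns (score, count, rated), where rated is the set of D's
    products that appeared in any ratings entry.
    """
    d_products = set(d_products)
    score, count, rated = 0, 0, set()
    for p in c_products:
        # Products rated for p that D also purchased (paid or free).
        matches = {q: r for q, r in ratings.get(p, {}).items() if q in d_products}
        if matches:
            score += max(matches.values())  # only the best rating counts
            count += 1
            rated.update(matches)
    return score, count, rated


def top_similar(scores, n=50):
    """Keep the n best-scoring customers plus everyone tied with the n-th."""
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    if len(ranked) <= n:
        return [cid for cid, _ in ranked]
    cutoff = ranked[n - 1][1]
    return [cid for cid, s in ranked if s >= cutoff]


def recommend(purchases_by_customer, excluded, top_n=200):
    """Rank products by how many similar customers bought them.

    Assumption: every product in `excluded` (the rated set) is dropped.
    """
    counts = Counter()
    for products in purchases_by_customer.values():
        counts.update(set(products) - excluded)
    return [p for p, _ in counts.most_common(top_n)]


# Data from the worked example (keys written as integers for simplicity).
ratings = {
    1: {1: 10, 33: 9, 467: 9},
    2: {2: 10, 6: 9},
    3: {3: 10, 5: 10, 4: 9},
}
score, count, rated = similarity(ratings, [1, 2, 3], [1, 4, 5, 7])
print(score, count, sorted(rated))  # 20 2 [1, 4, 5]
print(recommend({"D": [1, 4, 5, 7]}, rated))  # [7]
```

With the example data, ''%%D%%'' gets score 20 and count 2, and product 7 is the only purchase of ''%%D%%'' left as a recommendation candidate once the rated products 1, 4 and 5 are excluded.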