you_api

Witt You API

Recommendations based on customer similarity for wwde.

git

https://git.picalike.corpex-kunden.de/smessuti/you_api

The working version is on the develop branch.

Database

host: sandy.picalike.corpex-kunden.de
database: you_api

witt_products contains info for all products in the dataset

  • prod_id product id
  • image image url
  • ratings similar products used to calculate customer similarity
  • paid list of customers that purchased the product
  • free list of customers that did not pay for the product

witt_customers contains info about the customers

  • customer_id customer id
  • paid list of products purchased by the customer
  • free list of products the customer did not pay for

witt_customer_similarity contains similar customers and products to be recommended

  • customer_id reference customer id
  • similar >=50 most similar customers, sorted and grouped by score
  • recommended 200 recommended products sorted by number of similar customers that bought them

Usage

Parameters

mandatory

  • y customer id
  • key shop key (currently ignored; only products from wwde are used)

result

  • limit max number of products returned
  • format json or txt

filter

  • b brand
  • color color
  • gender gender
  • size size (search in sizes string)
  • category category (search in category string)
  • price_from min price
  • price_till max price
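
As a rough illustration of how these parameters fit together, the sketch below builds a request URL. It assumes the API is called with an HTTP GET and a query string; the base URL, the customer id and the helper name build_request_url are hypothetical, since this page does not state the real endpoint.

```python
from urllib.parse import urlencode

# Hypothetical base URL: the real endpoint is not given on this page.
BASE_URL = "https://example.invalid/you_api"


def build_request_url(customer_id, key, **optional):
    """Build a request URL from the mandatory and optional parameters."""
    # Mandatory parameters: y (customer id) and key (shop key).
    params = {"y": customer_id, "key": key}
    # Optional result/filter parameters; skip any that are unset.
    params.update({k: v for k, v in optional.items() if v is not None})
    return f"{BASE_URL}?{urlencode(params)}"


url = build_request_url(
    "12345", "wwde", limit=20, format="json", price_from=10, price_till=50
)
print(url)
```

Only y and key are required; every other parameter can be left out.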

Customer Similarity

Given active customer C, we consider the set P of products purchased (paid and free) by C.
For each p in P we calculate the distance between p and every product in the shop.
We take the 10 smallest distinct distances s_0, s_1, …, s_9 (note that s_0 = 0, the distance from p to itself) and assign every product at distance s_i from p the rating R_p(i) = 10 - i.
These ratings are saved in the Mongo collection witt_products.

For each customer D we calculate the similarity to C as follows: for each product p in P, we look at the products purchased by D (paid and free) that appear in the ratings of p and add the highest of those ratings to the score. Products in P for which D has no rated product contribute nothing.
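
The rating step can be sketched as follows. This is a minimal illustration, not the code from the repository: the function name rate_neighbours and the input format (a dict of precomputed distances) are assumptions, and ties are detected by exact equality of the distance values.

```python
def rate_neighbours(distances, n_levels=10):
    """Assign ratings to the products closest to a reference product p.

    distances: dict mapping product id -> distance from p
               (p itself is expected to be included with distance 0).
    Returns a dict mapping product id -> rating, where every product at
    the i-th smallest distinct distance gets the rating 10 - i.
    """
    # The n_levels smallest distinct distances s_0 < s_1 < ...
    levels = sorted(set(distances.values()))[:n_levels]
    rating_of = {d: 10 - i for i, d in enumerate(levels)}
    # Products farther away than the last level get no rating at all.
    return {pid: rating_of[d] for pid, d in distances.items() if d in rating_of}


# Illustrative distances from product 1 (made-up values).
example = rate_neighbours({1: 0.0, 33: 0.2, 467: 0.2, 8: 0.5}, n_levels=2)
print(example)
```

With n_levels=2 only the distances s_0 and s_1 are rated, so products 33 and 467, which share the same distance, both receive the rating 9.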

example

(for brevity suppose we only rate products with distance s_0 and s_1)

  • C has bought products 1, 2 and 3
  • D has bought products 1, 4, 5 and 7
  • ratings for product 1 are {1: 10, 33: 9, 467: 9}
  • ratings for product 2 are {2: 10, 6: 9}
  • ratings for product 3 are {3: 10, 5: 10, 4: 9}

The similarity of D to C is given by:

  • in the ratings for 1, D has 1 which gives score 10
  • in the ratings for 2, D has no products
  • in the ratings for 3, D has 4 and 5, so we take 5 because it has the higher rating, which gives score 10
  • 1, 4 and 5 are added to the set of rated products, which are excluded from the final result

Therefore we have score = 20 and count = 2 where count is the number of products from P that contributed to the score.
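
The example above can be reproduced with a short sketch. The function name similarity and the in-memory data layout are assumptions made for illustration; the actual implementation reads the ratings from witt_products.

```python
def similarity(purchases_c, purchases_d, ratings):
    """Score customer D against reference customer C.

    purchases_c, purchases_d: sets of product ids (paid and free).
    ratings: dict mapping product id p -> {rated product id: rating}.
    Returns (score, count, rated).
    """
    score, count, rated = 0, 0, set()
    for p in purchases_c:
        # Products bought by D that appear in the ratings of p.
        hits = {q: r for q, r in ratings.get(p, {}).items() if q in purchases_d}
        if not hits:
            continue  # p contributes nothing to the score
        score += max(hits.values())  # only the highest rating counts
        count += 1
        rated.update(hits)  # remembered so they can be excluded later
    return score, count, rated


ratings = {
    1: {1: 10, 33: 9, 467: 9},
    2: {2: 10, 6: 9},
    3: {3: 10, 5: 10, 4: 9},
}
score, count, rated = similarity({1, 2, 3}, {1, 4, 5, 7}, ratings)
print(score, count, sorted(rated))  # 20 2 [1, 4, 5]
```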

Recommendations

Once we have the 50 most similar customers (all customers tied on the cut-off score are kept, so there can be more than 50 in the end), we look at all products they bought that are not contained in the rated set. These products are sorted by the number of similar customers that bought them, and the top 200 are saved in witt_customer_similarity.
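
A minimal sketch of this selection step is shown below. The function names, the in-memory dicts and the way the rated set is passed in are assumptions for illustration only; how ties between products with the same count are ordered is also not specified on this page.

```python
from collections import Counter


def select_similar(scores, min_customers=50):
    """Keep the top customers by score, including all ties at the cut-off."""
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    if len(ranked) <= min_customers:
        return [customer for customer, _ in ranked]
    cutoff = ranked[min_customers - 1][1]
    return [customer for customer, s in ranked if s >= cutoff]


def recommend(similar_customers, purchases, rated, top_n=200):
    """Rank unrated products by how many similar customers bought them."""
    counts = Counter()
    for customer in similar_customers:
        # set() so a customer counts at most once per product
        for product in set(purchases.get(customer, ())):
            if product not in rated:
                counts[product] += 1
    return [product for product, _ in counts.most_common(top_n)]


# Small made-up example with a cut-off of 2 instead of 50.
scores = {"a": 30, "b": 20, "c": 20, "d": 10}
purchases = {"a": [1, 8, 9], "b": [8, 9], "c": [9, 4]}
similar = select_similar(scores, min_customers=2)
recommended = recommend(similar, purchases, rated={1, 4})
print(similar, recommended)  # ['a', 'b', 'c'] [9, 8]
```

Customers "b" and "c" share the cut-off score, so both are kept and the list grows to three entries.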

When the API is called, the metadata for the recommended products is retrieved and those that pass the filters are returned ordered by their score.
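
The filtering at request time might look like the sketch below. The metadata field names (brand, color, gender, sizes, category, price) and the exact-match versus substring semantics are assumptions based on the parameter list in the Usage section, not confirmed details of the implementation.

```python
# Request parameter -> assumed metadata field, compared for equality.
EXACT = {"b": "brand", "color": "color", "gender": "gender"}
# Request parameter -> assumed metadata field, searched as a substring.
SUBSTRING = {"size": "sizes", "category": "category"}


def passes_filters(meta, params):
    """Return True if a product's metadata satisfies all given filters."""
    for param, field in EXACT.items():
        if param in params and meta.get(field) != params[param]:
            return False
    for param, field in SUBSTRING.items():
        if param in params and params[param] not in meta.get(field, ""):
            return False
    price = meta.get("price")
    if "price_from" in params and (price is None or price < params["price_from"]):
        return False
    if "price_till" in params and (price is None or price > params["price_till"]):
        return False
    return True


def apply_filters(recommended, metadata, params, limit=None):
    """Filter the stored recommendations, keeping their original order."""
    kept = [
        p for p in recommended
        if p in metadata and passes_filters(metadata[p], params)
    ]
    return kept if limit is None else kept[:limit]


# Made-up metadata for two recommended products.
metadata = {
    9: {"brand": "x", "color": "red", "gender": "f",
        "sizes": "S,M,L", "category": "women/shirts", "price": 25.0},
    8: {"brand": "y", "color": "blue", "gender": "f",
        "sizes": "M", "category": "women/shirts", "price": 80.0},
}
result = apply_filters([9, 8], metadata, {"category": "shirts", "price_till": 50})
print(result)  # [9]
```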

you_api.txt · Last modified: 2024/04/11 14:23 by 127.0.0.1