Witt You API
Recommendations based on customer similarity for wwde.
git
https://git.picalike.corpex-kunden.de/smessuti/you_api
the working version is in the develop branch
Database
host: sandy.picalike.corpex-kunden.de
database: you_api
witt_products contains info for all products in the dataset
  prod_id: product id
  image: image url
  ratings: similar products used to calculate customer similarity
  paid: list of customers that purchased the product
  free: list of customers that did not pay for the product
witt_customers contains info about the customers
  customer_id: customer id
  paid: list of products purchased by the customer
  free: list of products the customer did not pay for
witt_customer_similarity contains similar customers and products to be recommended
  customer_id: reference customer id
  similar: >=50 most similar customers, sorted and grouped by score
  recommended: 200 recommended products sorted by number of similar customers that bought them
Usage
Requests should be sent to
http://frontend04-hpc.picalike.corpex-kunden.de:5000/you.php?param1=value1&param2=value2...&paramn=valuen
method: GET
Parameters
mandatory
  y: customer id
  key: shop key (this is currently ignored and only products from wwde are used)
result
  limit: max number of products returned
  format: json or txt
filter
  b: brand
  color: color
  gender: gender
  size: size (search in sizes string)
  category: category (search in category string)
  price_from: min price
  price_till: max price
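As an illustration of the parameters above, a request URL could be assembled as in the following minimal sketch. The helper name build_you_url, the customer id and the shop key are hypothetical; only the endpoint and the parameter names come from this page.

```python
from urllib.parse import urlencode

# Endpoint as given in the Usage section.
BASE_URL = "http://frontend04-hpc.picalike.corpex-kunden.de:5000/you.php"


def build_you_url(customer_id, shop_key, **optional):
    """Build a GET URL; optional parameters whose value is None are skipped."""
    params = {"y": customer_id, "key": shop_key}  # mandatory parameters
    params.update({k: v for k, v in optional.items() if v is not None})
    return BASE_URL + "?" + urlencode(params)


# Hypothetical customer id and shop key, for illustration only.
url = build_you_url(
    "12345", "example_key",
    limit=20, format="json", price_from=10, price_till=50,
)
print(url)
```

The resulting URL can then be fetched with any HTTP client using GET.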
Customer Similarity
Given active customer C, we consider the set P of products purchased (paid and free) by C.
For each p in P we calculate the distance between p and every product in the shop.
We consider the 10 smallest distances s_0, s_1, …, s_9 (note that s_0 = 0) and for all products with distance s_i from p we assign a rating R_p(i) = 10-i.
These ratings are saved in the Mongo collection witt_products. For each customer D we calculate the similarity to C by summing the rating of the highest rated product they purchased (paid and free) for each product in P, if such a rating exists.
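The rating step can be sketched as follows. This is a minimal illustration, not the production code: it assumes the distances from p are already available as a dict, and it reads s_0, s_1, … as the smallest distinct distance values, so that products at the same distance share a rating. The function name and the sample distances are hypothetical.

```python
TOP_RATING = 10  # R_p(i) = 10 - i


def rate_neighbours(distances, levels=10):
    """Map product id -> rating for products at one of the `levels` smallest distances.

    distances: dict mapping product id -> distance from the reference product p
               (p itself is expected to be present with distance 0).
    """
    # Smallest distinct distance values s_0 < s_1 < ... (assumption: ties share a level).
    smallest = sorted(set(distances.values()))[:levels]
    level_of = {s: i for i, s in enumerate(smallest)}
    return {
        pid: TOP_RATING - level_of[d]
        for pid, d in distances.items()
        if d in level_of
    }


# Hypothetical distances from product 1, rating only s_0 and s_1.
print(rate_neighbours({1: 0.0, 33: 0.2, 467: 0.2, 8: 0.5}, levels=2))
```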
Example
(for brevity, suppose we only rate products with distance s_0 and s_1)
C has bought products 1, 2 and 3
D has bought products 1, 4, 5 and 7
- ratings for product 1 are {1: 10, 33: 9, 467: 9}
- ratings for product 2 are {2: 10, '6': 9}
- ratings for product 3 are {3: 10, '5': 10, '4': 9}
The similarity of D to C is given by:
- in the ratings for 1, D has 1, which gives score 10
- in the ratings for 2, D has no products
- in the ratings for 3, D has 4 and 5, so we take 5 because it has the higher score, which gives 10
- 1, 4 and 5 are added to the list of rated products, which will not be included in the final result
Therefore we have score = 20 and count = 2, where count is the number of products from P that contributed to the score.
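The example can be reproduced with a short sketch. The function name and the in-memory data layout are assumptions made for illustration, and product ids are written as integers throughout for simplicity.

```python
def similarity(ratings_by_product, purchased_by_c, purchased_by_d):
    """Return (score, count, rated) for customer D relative to customer C.

    ratings_by_product: dict product id -> {neighbour product id: rating}
    purchased_by_c:     products bought by C (the set P)
    purchased_by_d:     set of products bought by D
    """
    score, count, rated = 0, 0, set()
    for p in purchased_by_c:
        # Rated neighbours of p that D has purchased.
        hits = {
            q: r
            for q, r in ratings_by_product.get(p, {}).items()
            if q in purchased_by_d
        }
        if hits:
            score += max(hits.values())  # only the highest rated product counts
            count += 1
            rated.update(hits)  # excluded from the recommendations later
    return score, count, rated


ratings = {
    1: {1: 10, 33: 9, 467: 9},
    2: {2: 10, 6: 9},
    3: {3: 10, 5: 10, 4: 9},
}
result = similarity(ratings, [1, 2, 3], {1, 4, 5, 7})
print(result)  # score 20, count 2, rated products 1, 4 and 5
```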
Recommendations
Once we have the 50 most similar customers (all customers tied at the lowest included score are kept, so there can be more than 50 in the end), we look at all products they bought that are not contained in the rated set. These products are sorted by the number of similar customers that bought them, and the top 200 are saved in witt_customer_similarity.
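A minimal sketch of this selection and counting step is shown below. The function names and the dict-based inputs are hypothetical, and the order of products with equal counts is left unspecified.

```python
from collections import Counter


def top_similar(scores, minimum=50):
    """Return the most similar customers, keeping everyone tied at the cutoff score.

    scores: dict customer id -> similarity score
    """
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    if len(ranked) <= minimum:
        return [customer for customer, _ in ranked]
    cutoff = ranked[minimum - 1][1]
    return [customer for customer, score in ranked if score >= cutoff]


def recommend(similar_customers, purchases, rated, top_n=200):
    """Rank unrated products by how many similar customers bought them.

    purchases: dict customer id -> iterable of product ids (paid and free)
    rated:     set of products already rated for the reference customer
    """
    counts = Counter()
    for customer in similar_customers:
        for product in set(purchases.get(customer, ())) - rated:
            counts[product] += 1
    return [product for product, _ in counts.most_common(top_n)]


# Tiny illustration with minimum=2 instead of 50.
similar = top_similar({"a": 30, "b": 20, "c": 20, "d": 10}, minimum=2)
purchases = {"a": [1, 8, 9], "b": [8, 9], "c": [9, 4]}
recommended = recommend(similar, purchases, rated={1, 4, 5})
print(similar, recommended)
```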
When the API is called, the metadata for the recommended products is retrieved and those that pass the filters are returned ordered by their score.
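The filtering at request time might look like the sketch below. The metadata field names (brand, color, gender, sizes, category, price) are assumptions, as are exact matching for brand, color and gender and inclusive price bounds. Only the request parameter names and the substring search for size and category come from the Parameters section.

```python
def passes_filters(meta, filters):
    """Check one product's metadata against the request filters."""
    # Exact-match filters: request parameter -> assumed metadata field.
    for param, field in {"b": "brand", "color": "color", "gender": "gender"}.items():
        if param in filters and meta.get(field) != filters[param]:
            return False
    # Substring filters: the value is searched in the sizes / category string.
    for param, field in {"size": "sizes", "category": "category"}.items():
        if param in filters and filters[param] not in meta.get(field, ""):
            return False
    price = meta.get("price")
    if "price_from" in filters and (price is None or price < float(filters["price_from"])):
        return False
    if "price_till" in filters and (price is None or price > float(filters["price_till"])):
        return False
    return True


def apply_filters(recommended, metadata, filters, limit=None):
    """Keep the stored order of the recommended products and drop non-matching ones."""
    kept = [
        p for p in recommended
        if p in metadata and passes_filters(metadata[p], filters)
    ]
    return kept if limit is None else kept[:limit]


# Hypothetical metadata for two recommended products.
metadata = {
    9: {"brand": "A", "sizes": "S,M,L", "category": "women/shirts", "price": 25.0},
    8: {"brand": "B", "sizes": "M", "category": "men/shoes", "price": 80.0},
}
filtered = apply_filters([9, 8], metadata, {"size": "M", "price_till": "50"})
print(filtered)
```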