Hello everyone,
First time on the forum.
Any suggestions are welcome.
I have a strongly balanced panel dataset with 600 different products over a time period of 50 months. All products have 3 characteristics (type, size and taste) stored as separate variables.
Furthermore, I have revenue, amount and price for all products. What I am looking for is a way to group products by characteristics, so all normal apple pies together, all normal pear pies together, etc.
That is, every combination of type, size and taste should be collapsed into a single entry per period, keeping the time periods intact. In addition, I want new ids and new product_ids to be created. The new price should be the average of the prices of all products with the same characteristics in that period, using revenue as weights.
So, with products 1 and 3 both being normal apple pies: new price (normal apple pie, 2009m2) = (10*100 + 5*10) / (100 + 10) = 9,55
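To make the formula concrete, here is a minimal sketch in plain Python. The function name `revenue_weighted_price` is made up for illustration and is not from any package; treat it as one possible reading of the formula, not a definitive implementation.

```python
def revenue_weighted_price(prices, revenues):
    """Mean of prices, weighted by the matching revenues."""
    total_revenue = sum(revenues)
    if total_revenue == 0:
        # Assumption: a cell without revenue has no defined weighted price.
        raise ValueError("total revenue is zero, weighted price is undefined")
    return sum(p * r for p, r in zip(prices, revenues)) / total_revenue

# normal apple pie, 2009m2: product 1 (price 10, revenue 100)
# and product 3 (price 5, revenue 10)
price = revenue_weighted_price([10.0, 5.0], [100.0, 10.0])
print(round(price, 2))  # 9.55
```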
Below is a small sample of the data I have. I show 4 products over 3 periods for simplicity:
id | product_id | period | type | size   | taste | revenue | amount | price |
1  | 1          | 2009m2 | pie  | normal | apple | 100     | 10     | 10,00 |
2  | 1          | 2009m3 | pie  | normal | apple | 120     | 11     | 10,91 |
3  | 1          | 2009m4 | pie  | normal | apple | 90      | 8      | 11,25 |
4  | 2          | 2009m2 | pie  | normal | pear  | 50      | 4      | 12,50 |
5  | 2          | 2009m3 | pie  | normal | pear  | 55      | 5      | 11,00 |
6  | 2          | 2009m4 | pie  | normal | pear  | 60      | 5      | 12,00 |
7  | 3          | 2009m2 | pie  | normal | apple | 10      | 2      | 5,00  |
8  | 3          | 2009m3 | pie  | normal | apple | 20      | 4      | 5,00  |
9  | 3          | 2009m4 | pie  | normal | apple | 20      | 3      | 6,67  |
10 | 4          | 2009m2 | cake | big    | apple | 200     | 10     | 20,00 |
11 | 4          | 2009m3 | cake | big    | apple | 210     | 10     | 21,00 |
12 | 4          | 2009m4 | cake | big    | apple | 220     | 11     | 20,00 |
And this is what I would like it to look like afterwards:
new_id | new_product_id | period | type | size   | taste | price |
1      | 1              | 2009m2 | pie  | normal | apple | 9,55  |
2      | 1              | 2009m3 | pie  | normal | apple | 10,06 |
3      | 1              | 2009m4 | pie  | normal | apple | 10,42 |
4      | 2              | 2009m2 | pie  | normal | pear  | 12,50 |
5      | 2              | 2009m3 | pie  | normal | pear  | 11,00 |
6      | 2              | 2009m4 | pie  | normal | pear  | 12,00 |
7      | 3              | 2009m2 | cake | big    | apple | 20,00 |
8      | 3              | 2009m3 | cake | big    | apple | 21,00 |
9      | 3              | 2009m4 | cake | big    | apple | 20,00 |
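For reference, the transformation from the first table to the second can be sketched in plain Python as below. This is only an illustration under some assumptions: the names `collapse_by_characteristics` and `period_key` are made up, prices are recomputed as revenue / amount instead of using the rounded price column, new product ids are numbered in order of first appearance, and every group is assumed to have positive revenue in every period.

```python
from collections import defaultdict

# Sample rows: (product_id, period, type, size, taste, revenue, amount)
rows = [
    (1, "2009m2", "pie", "normal", "apple", 100, 10),
    (1, "2009m3", "pie", "normal", "apple", 120, 11),
    (1, "2009m4", "pie", "normal", "apple", 90, 8),
    (2, "2009m2", "pie", "normal", "pear", 50, 4),
    (2, "2009m3", "pie", "normal", "pear", 55, 5),
    (2, "2009m4", "pie", "normal", "pear", 60, 5),
    (3, "2009m2", "pie", "normal", "apple", 10, 2),
    (3, "2009m3", "pie", "normal", "apple", 20, 4),
    (3, "2009m4", "pie", "normal", "apple", 20, 3),
    (4, "2009m2", "cake", "big", "apple", 200, 10),
    (4, "2009m3", "cake", "big", "apple", 210, 10),
    (4, "2009m4", "cake", "big", "apple", 220, 11),
]

def period_key(period):
    """Turn '2009m10' into (2009, 10) so months sort numerically."""
    year, month = period.split("m")
    return int(year), int(month)

def collapse_by_characteristics(rows):
    weighted_sum = defaultdict(float)  # sum of price * revenue per cell
    revenue_sum = defaultdict(float)   # sum of revenue per cell
    for _product_id, period, typ, size, taste, revenue, amount in rows:
        cell = ((typ, size, taste), period)
        price = revenue / amount       # unrounded price; assumes amount > 0
        weighted_sum[cell] += price * revenue
        revenue_sum[cell] += revenue

    # Number each combination of characteristics in order of first appearance.
    product_ids = {}
    for chars, _period in weighted_sum:
        product_ids.setdefault(chars, len(product_ids) + 1)

    # Sort by new product id, then chronologically within each product.
    cells = sorted(weighted_sum,
                   key=lambda c: (product_ids[c[0]], period_key(c[1])))
    result = []
    for new_id, (chars, period) in enumerate(cells, start=1):
        price = weighted_sum[(chars, period)] / revenue_sum[(chars, period)]
        result.append((new_id, product_ids[chars], period, *chars,
                       round(price, 2)))
    return result

for row in collapse_by_characteristics(rows):
    print(row)
```

Each printed tuple is (new_id, new_product_id, period, type, size, taste, price). On the sample the prices agree with the table above, although the code prints decimal points instead of commas.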
Any suggestions are welcome on how to tackle this problem, thank you in advance.
Johan