I am facing two problems related to dropping data and merging datasets.
1) In the table below, you can see several companies (like alphacrunch or betamaker) that reported monthly returns. Some of them, like alphacrunch, only reported until December 1996 and then stopped. I would like to analyse only the returns of companies that reported continuously from Jan 1996 until Dec 2011; all others should be excluded from my dataset. In other words, I'd like to get rid of every company that did not report a return for every month in my observation period. I also want to exclude the observations before Jan 1996 and after Dec 2011. How can I do that? Is it a combination of the "drop" command with an "if" qualifier?
id | year | month | return | company name |
5 | 1994 | 1 | 0.71 | alphacrunch |
5 | 1994 | 2 | 0.63 | alphacrunch |
5 | 1994 | 3 | 0.52 | alphacrunch |
… | … | … | … | … |
5 | 1996 | 12 | 0.43 | alphacrunch |
15 | 1994 | 1 | 0.32 | betamaker |
… | … | … | … | … |
15 | 2011 | 12 | 0.21 | betamaker |
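One possible way to approach problem 1 could look like the sketch below. It assumes the variables are named id, year, month and return as in the table, that "reported" means a non-missing return, and that each company has at most one row per month. The names mdate and nmonths are helper variables made up for illustration.

```stata
* Sketch only: variable names follow the sample table; adjust if yours differ.

* Build a monthly date from year and month (mdate is a made-up helper name)
gen mdate = ym(year, month)
format mdate %tm

* Keep only the observation window Jan 1996 to Dec 2011
keep if inrange(mdate, ym(1996, 1), ym(2011, 12))

* Assumption check: stops with an error if a company has two rows for one month
isid id mdate

* Count the months with a non-missing return for each company
bysort id: egen nmonths = total(!missing(return))

* 16 years x 12 months = 192 months, so keep only complete series
keep if nmonths == 192
drop nmonths
```

So yes, this is essentially drop (or keep) combined with an if qualifier, plus a per-company count to identify the complete series.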
2) In addition, I have numerical series of risk factors (like Mom, RMW & CMA in the table below) for the period Jan 1996 to Dec 2011 in an Excel file. I'd like to merge this Excel file with the Stata dataset described above. There are no ids in the Excel file yet. My Stata file consists of thousands of id numbers like 5, 15 etc., where every id belongs to a specific company. The idea is to combine these risk factors with the returns of every single company, i.e. to attach the risk factors to the returns over the same horizon, Jan 1996 until Dec 2011. Since I need to do that for every company, does that mean I need to duplicate my Excel file several thousand times? Is merge 1:m going to help in this case?
I am so grateful for any kind of help. Thanks a lot in advance.
Year | Month | Mom | RMW | CMA |
1996 | 1 | 0,56 | -0,56 | 2,26 |
1996 | 2 | 0,58 | 1,02 | -1,78 |
1996 | 3 | -1,89 | 1,4 | -0,95 |
… | … | … | … | … |
2011 | 11 | 2,73 | 1,94 | -2,24 |
2011 | 12 | 3,9 | 1,46 | 2,96 |
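For problem 2, a sketch of how the merge might be done without duplicating the Excel file is given below. The file names factors.xlsx, factors.dta and returns.dta are hypothetical, and the sketch assumes that the first row of the Excel sheet holds the headers Year, Month, Mom, RMW and CMA. With the returns data in memory, every company-month row looks up the single factor row for its year and month, so the merge is m:1 on year and month (it would be 1:m only if the factor file were the one in memory).

```stata
* Sketch only: file names are hypothetical placeholders.

* Read the factor file, taking variable names from the first row
import excel using "factors.xlsx", firstrow clear

* Match the key names used in the returns data
rename Year year
rename Month month

* Only needed if the factors arrive as text with decimal commas (e.g. "0,56"):
* destring Mom RMW CMA, replace dpcomma

save "factors.dta", replace

* Load the returns data and attach the factors by year and month
use "returns.dta", clear
merge m:1 year month using "factors.dta"

* Keep only the months that appear in both files, then drop the merge marker
keep if _merge == 3
drop _merge
```

No id is needed in the factor file, because the same factor values apply to every company in a given month.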