Hi everyone,
I have a problem getting Stata to run my intended rolling-window regression and really hope that some of you can (again) help me. Unfortunately, prior threads, the help files, and other searches did not resolve my issue.
What I have
Unbalanced panel data on monthly fund returns (cross-sectional identifier: fund_id; time-series identifier: month_id) and risk factor realizations. The risk factors are complete, but there are frequent gaps in the monthly returns of funds (gaps of either one month or several months).
Stylized data extract:
In the extract, fund 1 has no return data between 2005m7 and 2006m3 (2006m5 is also missing). To keep this post readable I shortened the extract; in the full data every fund has at least 48 months of return observations, which can and often do include gaps. The risk factors are made up for this example.
Code:
clear
input byte fund_id str10 month_id double return double market size value
1 "2005m1" 0.02 5 3 -2
1 "2005m2" 0.01 3 -2 4
1 "2005m3" 0.03 4 5 1
1 "2005m4" 0.03 2 5 1
1 "2005m5" 0.02 -1 2 3
1 "2005m6" 0.01 2 3 4
1 "2005m7" 0.04 1 2 1
1 "2006m3" 0.08 2 4 2
1 "2006m4" 0.04 3 2 1
1 "2006m6" 0.08 1 5 2
2 "2006m1" 0.03 -2 -2 3
2 "2006m1" 0.04 -1 -2 1
end
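For completeness, a minimal sketch of how the extract could be declared as panel data. It assumes month_id is a string such as "2005m1"; the variable name mdate is made up for this sketch and is not part of my data.

```stata
* Convert the string month (e.g. "2005m1") to a numeric Stata monthly date.
* mdate is a hypothetical name chosen for this sketch.
generate mdate = monthly(month_id, "YM")
format mdate %tm

* xtset needs each fund-month to appear only once; isid stops with an error
* otherwise (fund 2 has 2006m1 twice in the extract above).
isid fund_id mdate

* Declare the panel structure: funds over months.
xtset fund_id mdate
```

With the extract exactly as typed, isid would stop at the repeated fund 2 row, so that row would need to be corrected first.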
What I want Stata to do
Run a rolling regression over the last 60 months. If fewer than 60 months of return data are available for a specific fund-month, I require the fund to have at least 48 monthly return observations within the previous 60 months, and the regression should then be run on the data that are available.
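To make the requirement concrete, the number of available return months in each trailing 60-month window can be counted directly. This is only a minimal sketch: it assumes the data are already xtset on fund_id and a numeric monthly date (called mdate here, which is not a name in my data), and n_cum, n_win60 and enough48 are likewise hypothetical names.

```stata
* Assumes: xtset fund_id mdate, with mdate a numeric %tm monthly date.
* Fill internal gaps so that every fund has one row per calendar month
* between its first and last observation (return stays missing there).
tsfill

* Running count of months with a non-missing return, per fund.
bysort fund_id (mdate): generate long n_cum = sum(!missing(return))

* Months with a return in the window [t-59, t]: subtract the running count
* as of t-60. If the fund has no row at t-60, the window reaches back to
* the fund's first month and the running count itself is the answer.
generate long n_win60 = cond(missing(L60.n_cum), n_cum, n_cum - L60.n_cum)

* Flag fund-months that meet the 48-out-of-60 requirement.
generate byte enough48 = n_win60 >= 48
```

A fund-month with enough48 equal to 1 is one for which I would want regression results; all others should end up with missing coefficients.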
What I tried
- First, I declared the data as a panel with xtset fund_id month_id and then filled the gaps with tsfill; the return data are of course still missing for the filled-in observations.
- I created indicators via tsspell that tell me whether there are 48 months of continuous return information [cret48; equals 1 if there are 48 continuous months of returns] or 60 months [cret60; defined like cret48]. I guess these indicators become obsolete if anyone comes up with a working solution, but I used them below.
- rolling
Code:
rolling, window(60) reject(cret<=60 & _seq >= 48 & _seq <=60): xtreg return market size value
- BUT: this did not work. I tried to use the reject() option to tell Stata to run the regression only if there are at least 60 consecutive return observations; admittedly, this does not address the 48-out-of-60 requirement for cases in which 60 is not reached. Did I misuse the reject() option? How could I specify the second condition?
- rollreg (note: this is a community-contributed command from SSC): as I didn't find a solution with Stata's official rolling command, I tried this command instead.
Code:
rollreg return market size value if fund_id[_n] == fund_id[_n-1] & cret60 == 1 & _seq >=48, move(60) stub(a)
- As you might imagine, this didn't work either. Stata reported that the data have gaps, although I had filled them with tsfill. I guess this is because the filled-in observations still have missing returns: there is no return information for those months, and interpolation wouldn't make any sense.
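One variant I have not been able to verify, based on my reading of the help for rolling: as far as I understand it, reject() is evaluated after each window's estimation, so it may need to refer to stored results such as e(N) rather than to variables in the data. The sketch below assumes the data are xtset on fund_id and a numeric monthly date (mdate, a made-up name), that a separate time-series regression per fund is what I want (hence regress instead of xtreg), and that rollbetas is just a placeholder file name.

```stata
* Assumes: xtset fund_id mdate (mdate = numeric %tm monthly date, hypothetical name).
* For each fund, regress over every window of 60 calendar months; windows in
* which the regression uses fewer than 48 observations get missing results.
* rollbetas is a hypothetical file name for the saved coefficients.
rolling _b, window(60) reject(e(N) < 48) saving(rollbetas, replace): ///
    regress return market size value

* Look at what was saved without replacing the data in memory.
describe using rollbetas
```

The /// line continuation only works in a do-file. I am also not sure how rolling treats funds whose whole history spans fewer than 60 calendar months, so this would at best cover part of what I need.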
As always, many thanks in advance
Jan