Channel: Statalist
Viewing all 73331 articles

Use of the conditional market model with panel data stock returns

Dear all,

For a research project we need to use the conditional market model in our event study on earnings announcements. We have daily stock return data for 1,400 companies over a 6-year period. How can we run a panel data regression on the daily stock returns in Stata? We want to regress the daily return on the market return, a dummy for the event window, and several control variables, so that we can estimate the beta for the whole sample. Any help would be much appreciated!

Thanks

Max


Large sample and xtabond2

Hello Statalisters,
I have a dynamic panel model and I am testing for mortality persistence. I am working with 5,563 municipalities over a period of 8 years, using xtabond2 in Stata 13.
The coefficient on my lagged dependent variable is around 0.0200 (0.056).
After sample selection, I estimated the same model for 730 municipalities (T = 8). The coefficient on the lagged dependent variable is then around 0.350 (0.032).
Could the large sample explain the small coefficient on the lagged dependent variable?

Given that this coefficient is small, can I still use xtabond2? (I read that with weak persistence I should use xtabond, but my results are better under System GMM.)


Thanks for your time.

Reshape wide - coordinates level data

I have the following monthly data by coordinates (below). The data are identified by longitude and latitude. How can I reshape the data into wide form so that each column is a coordinate and each row is a month?
coordinate1 coordinate2 coordinate3...
Month1
Month2
Month3
.
.
.
Month12

I have tried to create an ID for each coordinate but ran into trouble because of the negative sign. Is there a way around it? Thank you!

Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input str3 iso3 int lon byte(lat month) float precip
"AFG" 61 32  1   .9966658
"AFG" 61 32  2  .15671173
"AFG" 61 32  3   .2424774
"AFG" 61 32  4          0
"AFG" 61 32  5          0
"AFG" 61 32  6          0
"AFG" 61 32  7          0
"AFG" 61 32  8          0
"AFG" 61 32  9          0
"AFG" 61 32 10          0
"AFG" 61 32 11    .121536
"AFG" 61 32 12   .3886606
"AFG" 61 33  1   1.133129
"AFG" 61 33  2   .3552083
"AFG" 61 33  3   .3072774
"AFG" 61 33  4    .008316
"AFG" 61 33  5          0
"AFG" 61 33  6          0
"AFG" 61 33  7          0
"AFG" 61 33  8 .000836129
"AFG" 61 33  9    .048312
"AFG" 61 33 10 .016165162
"AFG" 61 33 11    .190656
"AFG" 61 33 12   .6389768
"AFG" 61 34  1  1.2373316
"AFG" 61 34  2   .6235324
"AFG" 61 34  3   .3563652
"AFG" 61 34  4    .017568
"AFG" 61 34  5          0
end
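
One possible approach, sketched under the assumption that the data look like the -dataex- example above: build the coordinate identifier with -egen, group()-, which numbers each longitude/latitude pair and so is not affected by negative signs, and then reshape on that identifier. The variable name coord is hypothetical, and it may be worth saving a lookup of coord against lon and lat before dropping them.

```stata
* Sketch only: coord is a hypothetical name, not part of the original data
egen coord = group(lon lat)      // one integer id per (lon, lat) pair
keep month coord precip          // lon, lat and iso3 vary within month, so drop them first
reshape wide precip, i(month) j(coord)
```

After this, each row is a month and the columns precip1, precip2, ... hold the values for each coordinate.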

Help needed with output file viewing

Dear Statalist,

I have a problem viewing the output file.
I am using the checkrob command. After running it, I got this message:

See the help file for explanations.
Number of core variables: 1; number of testing variables: 4 (=16 regressions)
Output file: res13.txt; Table file: table_res13.txt

How can I see the output file in Stata? What command should I use? I would like to see it in table format. After using File => Log => View..., I got a Viewer window for the txt file with strings of numbers containing the results from checkrob, but I did not find any suitable command to display them as a table. One of the many things I tried was -type- followed by the name of the output file, and then -list-, but I get something unusable.
This is what I got:

. type res13.txt
no,b_D.logexg,se_D.logexg,b_D.logcrude,se_D.logcrude,b_D.logvatg,se_D.logvatg,b_D.logintvatexg,se_D.logintvatexg,b_i.id2,se_i.id2
1, .0734114618019212, .0529218264754015, ., ., ., ., ., ., ., .
2, .0724448110336447, .0519790060301233, .0744026301566806, .0070302039210622, ., ., ., ., ., .
3, .0734115377032312, .0529265181600998, ., ., .0030594376607996, .0180456194603842, ., ., ., .
4, .0724448702552365, .0519836093354547, .0744008815699698, .0070310547509631, .0014713938574941, .0160651014749696, ., ., ., .
5, .0127503809487351, .07838689156684, ., ., ., ., -.0373886276998492, .0320599679586234, ., .
6, .0195641291266282, .0746259882311454, .0743668835219694, .0070314548691679, ., ., -.0325934431892098, .0291530005111542, ., .
7, -.4970203770241216, .2426697069278353, ., ., -.1821802827021407, .0681808590125899, -.3515844815964865, .1375471846816433, ., .
8, -.4614933178053787, .2309786678587035, .074246001325228, .0070323564643987, -.1719148510268983, .0657187095219133, -.3290928920777571, .129847340
> 7096241, ., .


Could you please help me find the right commands to see it in table format in Stata, or at least to view the table file table_res13.txt?
It is very frustrating when Stata becomes so complicated and user-unfriendly.
Many thanks for your time.
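
A possible way to view these results as a table, assuming res13.txt is the comma-separated file shown by -type- above: read it into memory with -import delimited- and then list or browse it. Stata will rename the columns on import, because names such as b_D.logexg are not legal variable names, and some columns may come in as strings, in which case -destring- might help. Note that this replaces the data in memory, so save your work first.

```stata
* Sketch: load the checkrob output file as a dataset (replaces the data in memory)
import delimited using "res13.txt", varnames(1) clear
list, noobs abbreviate(20)
```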

Loop over csv files

My objective seems simple, but my code isn't working. I have many csv files and I want to write a loop that appends them all.
So I wrote the following code:

foreach base in "filiados_sd_al" "filiados_rede_al" "filiados_pv_al" "filiados_ptn_al" {
import delimited "\\fs-eesp-01\EESP\Usuarios\andre.pruner\Filicacao Partidaria\Dados Filiados\Alagoas\`base'.csv", delimiter(";")
tempfile `base'
save `base', replace
append using `base'
}

It doesn't work: Stata tells me "file not found". I checked the file names and they are all correct, so that does not seem to be the problem, yet this simple code still does not work.
Can someone help me with this?
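
A likely cause, offered as a sketch rather than a confirmed diagnosis: in Stata a backslash directly before a macro's left quote stops the macro from being expanded, so the file name is never filled in. Using a forward slash before the macro avoids this. The loop below also accumulates the files in one temporary dataset; the name combined is hypothetical, and -append- may need the force option if variable types differ across files.

```stata
* Sketch: accumulate all csv files in one temporary dataset
clear
tempfile combined
save `combined', emptyok
foreach base in "filiados_sd_al" "filiados_rede_al" "filiados_pv_al" "filiados_ptn_al" {
    * forward slash before the macro: a backslash there would block macro expansion
    import delimited "\\fs-eesp-01\EESP\Usuarios\andre.pruner\Filicacao Partidaria\Dados Filiados\Alagoas/`base'.csv", delimiter(";") clear
    append using `combined'
    save `combined', replace
}
```

When the loop finishes, the data in memory contain all four files.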

changing base category in interactions, testparm

Hi,

I am currently running a regression of the form:

regress loghrlwage women vzerf women#c.vzerf tzerf women#c.tzerf alerf women#c.alerf ///
yeduc women#c.yeduc i.indzweig women#indzweig i.brfsgruppe women#brfsgruppe pgerwzt ///
women#c.pgerwzt, cluster(pid)

My primary aim is to test whether there are significant differences for women and men in the coefficients in a regression of the log hourly wage.
Normally, I would estimate two models for males and females separately and do a Chow test, but this is not possible because of the cluster-option.
That's why I estimated the model as described above including the interactions of women with all other regressors.

First of all, I would like to know whether it is possible to define the base group of the interaction terms. With the i. prefix, Stata automatically sets the first category of indzweig and the first category of brfsgruppe as the base category, which is perfectly fine. But in the interaction terms women#indzweig and women#brfsgruppe, Stata omits the last categories of indzweig and brfsgruppe instead of the first. How can I change that?

Secondly, I wonder whether it is correct to test the significance of the interaction terms via the testparm command:

testparm women women#c.vzerf women#c.tzerf women#c.alerf women#c.yeduc women#indzweig ///
women#brfsgruppe women#c.pgerwzt

As a result I get an F statistic of 9.02 and Prob>F = 0.000.
Am I right in concluding that there is a significant influence of the variable "women" as well as of the interactions of women with the other regressors and, consequently, that there are different constants as well as different regression coefficients for males and females in regressing the log hourly wage?

Thanks a lot for your help!

Ally

Addressing variability in -metan-

I am comparing several studies with continuous data using -metan-. I'm taking the six-variable approach (n-exp, mean-exp, sd-exp, n-control, mean-control, sd-control) as follows:

metan nia blia sdia niv bliv sdiv, nowt nobox label(namevar=author, yearvar=year)

Unfortunately, my studies have relatively high heterogeneity (I2). I believe this to be a function of between-study variations in male-female balance. I've tried using -metareg- (for the first time):

metareg blia, wsse(iawsse)

But the I2 I get differs significantly from that given by -metan-. I'd like to use -metareg- to adjust for sex:

metareg blia male, wsse(iawsse)
...but I don't feel comfortable moving ahead with that until I get matching baseline I2 estimates. Any ideas what I might be doing wrong?


Thanks in advance!

returning specific values from a range of values

Hi there

I am using the following code to create a new variable (newvar).

gen newvar = 0
foreach var of varlist a1-a18 {
replace newvar = 1 if `var' >= 1200 & `var' <= 1400
}

So, my newvar takes value 1 if variables a1-a18 contain any value from the range 1200-1400.

For each observation that now has value 1 for newvar, I would like to know which value from the range 1200-1400 in variables a1-a18 made them "eligible" for inclusion in the newvar=1 group. I would therefore like to create another variable that returns that value for each observation where newvar = 1.

I hope that makes sense. Any suggestions greatly appreciated.

Thanks
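
One way this could be done, as a sketch: loop over the same variables and store the first value that falls in the range. The name matchval is hypothetical. If several of a1-a18 qualify for one observation, only the first is kept, and newvar is then equivalent to !missing(matchval).

```stata
* Sketch: record the first value in a1-a18 that lies in 1200-1400
* (matchval is a hypothetical variable name)
gen matchval = .
foreach var of varlist a1-a18 {
    replace matchval = `var' if missing(matchval) & inrange(`var', 1200, 1400)
}
```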


Problem of Missing Values in daily stock returns..

I'm trying to calculate daily stock returns in Stata from daily closing prices. However, the calculated returns contain many missing values, not just one missing value at the start of each time series, even though I have defined an id for the observations.
I used this command for the daily returns:
g daily_returns = ln(closing_rate) - ln(l.closing_rate)
and it creates too many missing values: more than 50% of the returns are missing.
Then I was advised to use the following command (after sorting by id and date):
gen daily_returns = ln(closing_rate / closing_rate[_n-1])
However, this time it calculates all the returns with only one missing value in total, rather than one missing value at the start of each id.

How can I generate a missing value for the return calculated at the start of each new id?

regards..
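
A minimal sketch of one way to do this, assuming the identifier and date variables are named id and date (the post mentions sorting by id and date): run the calculation under -bysort-, so that _n-1 refers to the previous observation within the same id and the first return of each id is missing. The many missing values from the l. operator may be due to gaps in the daily dates such as weekends and holidays, since l. looks for the previous calendar day rather than the previous observation.

```stata
* Sketch: log returns within each id, in date order
* (drop daily_returns first if it already exists)
bysort id (date): gen daily_returns = ln(closing_rate / closing_rate[_n-1])
```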

combine data sets with different structures

Hello,

I have two data sets that I would like to combine (not merge!)

1) The UN Micro Indicator Cluster Survey at district level: individual data on children's health outcomes. I created 4 "in-utero variables" for each child, namely date of conception, end of first trimester, end of second trimester, and birth.
district  child id  conception  end 1st trimester  end 2nd trimester  birth
A         1         19.03.1995  19.06.1995         19.09.1995         19.12.1995
A         2         24.7.1995   24.10.1995         24.01.1996         24.04.1996
A         3         etc.
B         4
B         5

2) A data set of incidents of violence at district level, with the date of each incident.
district  time/date   intensity of violence  event
A         25.03.1995  20                     1
A         etc.                               1
B                                            1
B                                            1
C                                            1
I would like to have a variable that tells me whether there was an incident of violence while the child was in utero and, if so, in which trimester.

For the above example, there would be an incident for child 1 in the first trimester and none in the second and third trimesters, and child 2 would not have been exposed to violence in utero at all.

How can I get a data set that gives me the incidence of violence (and the intensity) for the time spans I have assigned to each child?
district  child id  event  violence in 1st trimester  violence in 2nd trimester  violence in 3rd trimester
A         1         1      1                          0                          0
A         2         0      0                          0                          0
A         3
B         4
B         5
The dates are in Stata-readable format.

Thanks a lot for your help!

Panel Data- How write I(1) Variables in Equation and Use Bounds Test?

1. How can I use the bounds test with panel data in EViews 9?
2. How should I write an equation where some of my variables are I(1) and some are I(0)? I see that EViews automatically puts a 'D' in front of the short-run coefficients.

stcox and margins

Dear Stata experts,

I am using Cox survival analysis and have checked all my models with the different tests of the proportional-hazards assumption, the most appropriate functional form, overall model fit, etc. My coefficients are also stable (they show the same significant results) across different models.

However, now that my professor has asked me to display the confidence intervals from the margins command for my squared-term variable, all calculated margins, including the CIs, become insignificant. I used the margins command after stcox with the different options, and the results do not make sense. I searched this Stata forum (and Google) and found that apparently the margins command cannot be used appropriately after stcox because of the way it is calculated (which would make sense, as all other tests indicate that my variables and models are well specified).

So, what can I do now to show for which time intervals the HR (truly) is significant? I imagine something like a relative hazard or cumulative hazard line with confidence bands along the time range.

I hope my question (and probably also my confusion) is clear.

Thanks a lot for your answers in advance - I appreciate your support a lot!

Best,
Rike

Error; repeated time values in sample

Hi,
I want to do a time series analysis. I use the command tsset to tell Stata that the data are time series. When doing so I get the error message "repeated time values in sample".
I understand what the problem is, but would be very thankful if someone could help me solve it.

[Attached image not preserved]


Best regards,
Gregg

expanded stub width for frequency table

I need to tabulate a string variable with some very long values (100+ characters). How can I expand the stub width for the frequency table?

Can Stata read an Excel file that is password protected?

Can Stata read an Excel file that is password protected?

Thank you.

Crash when closing Do-file Editor on Stata for Mac after upgrading to macOS Sierra 10.12.2

Apple recently released the macOS Sierra 10.12.2 update. This update can cause Stata to crash when the Do-file Editor is closed. This affects both Stata 13 and Stata 14 for Mac. We will be releasing updates soon for both Stata 13 and Stata 14 to address the crash. We suggest that users not upgrade to macOS Sierra 10.12.2 until the updates to Stata for Mac have been released.

Difference between reg, reg with vce(robust) and reg with vce(cluster)

Dear Statalisters,

While running regressions in Stata I encountered a somewhat strange and counter-intuitive result. I ran the same regression three times: once without any options, once with vce(robust), and once with vce(cluster).

My intuition was that the regression without any options would have the smallest standard errors, followed by the robust option and then the cluster option. However, the regressions with the robust and cluster options had smaller standard errors than the regression without any option. In fact, the standard errors from the robust and cluster options were identical.

What is the possible explanation for this result? If it helps, the code that I ran was:

Code:
xtset industry year
xi: xtreg yvar xvar i.year i.industry
xi: xtreg yvar xvar i.year i.industry, vce(robust)
xi: xtreg yvar xvar i.year i.industry, vce(cluster industry)

Using the rolling command

Hello,

I have an unbalanced panel dataset where the panel variable is firmid and the time variable is year. I want to calculate a new variable, varsalest, defined as the variance of total sales (variable name sale) over the past 5 years (t-5 to t-1).

Going through the previous posts here, I learned to use the rolling command as below-

xtset firmid year
rolling varsales=r(Var), window(5) stepsize(5) : summarize sale, detail

I want to make sure of two things here-
1) If there are one or more missing values in the 5-year window, then the variance is calculated from the remaining values.
2) This calculation is done grouped by firm. But the 'by' prefix does not work with rolling command. How can I specify the calculation to be done by each firm?

Thanks.

Missing standard errors when using the code areg2gen (two-clustered standard-errors and with fixed effect)

Hi, everyone,

I am using Stata/SE 12.0. Using transaction data on corporate bonds, I am trying to run a panel regression with standard errors clustered at the bond-issuer and transaction-day levels, together with bond fixed effects. For a given bond, there are several trades in one transaction day. A bond issuer may issue more than one bond. The code is as follows:

areg2gen markup interval par_value round_indicator return_positive return_negative, absorb (cusip) fcluster(issuer_id) tcluster(trd_exctn_dt)

The running result is as follows:
[Attached image of the output not preserved]


If the question is unclear, please tell me. I would sincerely appreciate any help; if anyone knows how to solve this problem, please let me know.

Thanks a lot in advance.

Truncated variable labels and string variables

I am using Stata 14.

1. When I run tabulations, long value labels are truncated even though I apply the format command. How do I avoid that?
2. When I import data into Stata from Excel, SPSS, etc., my string variables are truncated. How do I avoid this?
Any help would be appreciated.