Channel: Statalist

How to assign the same value every 3 observations

Suppose I have panel data with 2 firms, and each firm has 6 observations of a variable called y:

id y
1 1
1 2
1 3
1 4
1 5
1 6
2 6
2 5
2 4
2 3
2 2
2 1

I would like to transform the variable as follows: every block of three observations takes the value of y at the block's first observation (the 1st, 4th, etc. observation within each firm).

id y
1 1
1 1
1 1
1 4
1 4
1 4
2 6
2 6
2 6
2 3
2 3
2 3

I know how to do this in other software with a for loop, but I couldn't figure out a way to do it in Stata. Could someone help me with it? Thanks a lot!
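
One way to approach this without an explicit loop is to subscript y within each firm. The following is a minimal sketch, assuming the rows are already in the intended order within id; the names seq and y_block are illustrative and do not come from the post.

```stata
clear
input byte(id y)
1 1
1 2
1 3
1 4
1 5
1 6
2 6
2 5
2 4
2 3
2 2
2 1
end

* remember the original row order so the sort within id is reproducible
generate long seq = _n

* within each id, observations 1-3 take y[1], observations 4-6 take y[4], and so on
bysort id (seq): generate y_block = y[3 * floor((_n - 1) / 3) + 1]

list id y y_block, sepby(id)
```

Under by, the subscript counts from the first observation of each id, so changing the 3 in both places should give blocks of another length.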

Upgrade to Stata 16 from Stata 13?

Hi,

I have multiple datasets and code written in Stata 13 that I need to continue using, but my department is offering to upgrade to Stata 16. Am I going to have compatibility issues? Is it a good idea to upgrade? What are the benefits of Stata 16 relative to older versions? Thanks.

What does the "predict" command produce in Principal Component Analysis?

I am trying to use principal component analysis to reduce 10 portfolio returns to the 3 most important statistical component returns. What I did is:
Code:
pca r1-r10
predict pc1 pc2 pc3
However, when looking at the generated variables, I am really confused. What are they? How are they computed from eigenvectors of the original covariance matrix?
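
For what it is worth, the scores can be checked by hand. The sketch below uses the auto dataset shipped with Stata instead of the poster's returns, and the names evec, z_* and pc1_manual are illustrative. It assumes pca is run with its default, which analyzes the correlation matrix, so each score should be the sum of the standardized variables weighted by the corresponding eigenvector; with the covariance option the variables would be centered but not scaled.

```stata
sysuse auto, clear

pca price mpg weight
predict pc1              // score for the first component

* eigenvectors: one row per variable, one column per component
matrix evec = e(L)

* standardize each variable (mean 0, sd 1)
foreach v in price mpg weight {
    egen double z_`v' = std(`v')
}

* rebuild the first score as eigenvector-weighted sum of standardized variables
generate double pc1_manual = z_price*evec[1,1] + z_mpg*evec[2,1] + z_weight*evec[3,1]

compare pc1 pc1_manual
```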

Group Invariance of Individual Parameters in SEM/GSEM

I'm looking for a way to test the invariance of individual parameters across groups in SEM or GSEM. Stata allows tests of "sets" or "classes" of parameters, but I can't identify an approach to test individual parameters within a class (e.g., coef). Any suggestions?

How to code to calculate Buy-and-hold Abnormal Return (BHAR)

I need to calculate peer-adjusted BHARs to measure long-run performance effects. Peer firms matched on market capitalization and the book-to-market ratio of equity perform well in randomized samples.

For each loan-announcing firm, I need to select a peer firm that resembles the sample firm except for the announcement of loan financing. I need to compute each firm's subsequent holding period return (HPR) as

\[ HPR_i = \left[ \prod_{t=1}^{T_i} (1 + R_{it}) - 1 \right] \times 100\% \]

where $R_{it}$ is the $i$th firm's stock return on the $t$th day, and $T_i$ is the number of trading days in the three-year period following the loan announcement.

In my data, "sprtrn" (return on the S&P) represents $R_{it}$. How can I write the Stata code to calculate $HPR_i$?


After calculating HPR for each sample firm and for its matching firm, I need to evaluate the difference, a stylized investor's BHAR, as follows:

\[ BHAR = HPR_{Event} - HPR_{Peer} \]

How can I write the Stata code to calculate BHAR?

I used the following code, but it does not match the formula above. How can I fix it?
egen firm=group(permno)
gen date=substr(date,1,4)
gen day=20
gen tempdate = date
gen date2 = date(date,"YMD")
format date2 %td
drop year month day tempdate

gen month=mofd(date2)

xtset firm date

gen adj_it = ret - sprtrn
gen compound = (1 + adj_it) * (1 + l.adj_it)
forvalues i=2/11 {
replace compound = compound * (1 + l`i'.adj_it)
}

gen bhar = compound - 1
drop compound
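
The holding period return formula above can be sketched directly with a running sum of log returns. This is only an illustration on made-up numbers: it assumes the data have already been restricted to the post-announcement window, that ret holds the firm's own daily return, and that firm 1 is the event firm and firm 2 its peer. The variables t, logsum and hpr are hypothetical names.

```stata
clear
input long permno int t double ret
1 1  0.10
1 2 -0.05
1 3  0.02
2 1  0.01
2 2  0.03
2 3  0.00
end

* running sum of log gross returns within firm, in time order
bysort permno (t): generate double logsum = sum(ln(1 + ret))

* HPR_i = [prod(1 + R_it) - 1] * 100, using the last running sum in each firm
by permno: generate double hpr = (exp(logsum[_N]) - 1) * 100

* BHAR = HPR of the event firm minus HPR of its peer
summarize hpr if permno == 1, meanonly
scalar hpr_event = r(mean)
summarize hpr if permno == 2, meanonly
scalar hpr_peer = r(mean)
display "BHAR = " hpr_event - hpr_peer
```

Compounding the raw returns first and differencing afterwards follows the formula as stated; the posted code instead compounds the difference ret - sprtrn over 12 lags, which is a different quantity.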

creating two different count variables

I am working on a rare disease dataset and preparing the data for Poisson regression. There are 4 predictor variables (age_cat, sex, inc_quint and rural) and the outcome is breast cancer (breast_ca). This is just a mock dataset.


Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input byte(age_cat sex inc_quint rural breast_ca)
1 2 1 1 1
2 1 2 2 1
3 1 3 1 1
1 2 4 2 1
1 1 5 2 0
2 2 4 1 0
2 2 3 1 0
4 2 3 1 0
5 1 2 1 0
2 1 1 2 0
1 2 1 2 0
1 2 2 1 0
1 2 2 1 0
1 2 3 1 1
2 1 4 1 1
2 1 5 2 1
2 1 5 2 1
2 2 5 1 1
2 2 4 1 0
4 2 3 1 0
3 1 4 1 0
5 1 5 2 0
2 1 5 2 1
7 1 4 2 1
2 2 3 2 1
3 2 2 2 1
2 1 1 2 1
1 1 3 1 1
7 1 4 1 1
8 1 4 1 1
6 2 5 1 0
5 2 5 1 0
4 1 4 1 0
6 1 3 1 1
3 1 3 2 1
2 1 4 2 1
4 2 5 2 1
5 2 5 2 0
6 2 4 2 0
7 1 3 1 0
2 1 3 1 1
3 1 2 1 1
4 2 3 1 0
5 2 4 1 0
6 2 4 1 1
7 1 5 1 1
8 1 5 1 1
9 2 5 1 1
2 2 5 1 1
3 1 4 1 1
6 2 4 1 0
8 2 3 2 0
1 1 3 2 1
2 1 2 2 1
3 2 2 2 1
4 1 2 2 1
5 2 2 1 1
3 1 4 1 1
2 2 5 1 1
1 2 5 1 1
7 2 4 1 0
7 1 4 1 0
8 2 5 1 1
1 2 1 1 0
2 1 2 2 0
3 1 3 1 0
1 2 4 2 1
1 1 5 2 0
2 2 4 1 0
2 2 3 1 0
4 2 3 1 0
5 1 2 1 0
2 1 1 2 0
1 2 1 2 0
1 2 2 1 0
1 2 2 1 0
1 2 3 1 1
2 1 4 1 1
2 1 5 2 1
2 1 5 2 1
2 2 5 1 1
2 2 4 1 0
4 2 3 1 0
3 1 4 1 0
5 1 5 2 0
2 1 5 2 0
7 1 4 2 0
2 2 3 2 0
3 2 2 2 0
2 1 1 2 0
1 1 3 1 0
7 1 4 1 0
8 1 4 1 0
6 2 5 1 0
5 2 5 1 0
4 1 4 1 0
6 1 3 1 0
3 1 3 2 0
2 1 4 2 0
4 2 5 2 0
end
I contracted the dataset using the command

contract age_cat-breast_ca

However, it gives a single frequency variable for different combinations of variables.

Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input byte(age_cat sex inc_quint rural breast_ca _freq)
1 1 3 1 0  4
1 1 3 1 1  1
1 1 3 2 0  8
1 1 3 2 1  1
1 1 5 2 0  2
1 2 1 1 0  1
1 2 1 1 1  1
1 2 1 2 0  2
1 2 2 1 0  4
1 2 3 1 1  2
1 2 4 2 1  2
1 2 5 1 0  8
1 2 5 1 1  1
2 1 1 2 0  6
2 1 1 2 1  1
2 1 2 2 0  9
2 1 2 2 1  2
2 1 3 1 0  2
2 1 3 1 1  3
2 1 4 1 1  2
2 1 4 2 0  4
2 1 4 2 1  1
2 1 5 2 0  4
2 1 5 2 1  5
2 2 3 1 0  2
2 2 3 2 0  4
2 2 3 2 1  1
2 2 4 1 0  4
2 2 5 1 0 16
2 2 5 1 1  4
3 1 2 1 0  4
3 1 2 1 1  1
3 1 3 1 0  1
3 1 3 1 1  1
3 1 3 2 0  4
3 1 3 2 1  1
3 1 4 1 0 19
3 1 4 1 1  2
3 2 2 2 0 12
3 2 2 2 1  2
4 1 2 2 0  8
4 1 2 2 1  1
4 1 4 1 0  5
4 2 3 1 0 10
4 2 5 2 0  4
4 2 5 2 1  1
5 1 2 1 0  2
5 1 5 2 0  3
5 2 2 1 0  8
5 2 2 1 1  1
5 2 3 1 0  2
5 2 4 1 0  5
5 2 5 1 0  5
5 2 5 2 0  5
6 1 3 1 0  4
6 1 3 1 1  1
6 2 4 1 0 13
6 2 4 1 1  1
6 2 4 2 0  5
6 2 5 1 0  5
7 1 3 1 0  5
7 1 4 1 0 13
7 1 4 1 1  1
7 1 4 2 0  4
7 1 4 2 1  1
7 1 5 1 0  8
7 1 5 1 1  1
7 2 4 1 0  9
8 1 4 1 0  4
8 1 4 1 1  1
8 1 5 1 0  8
8 1 5 1 1  1
8 2 3 2 0  9
8 2 5 1 0  8
8 2 5 1 1  1
9 2 5 1 0  8
9 2 5 1 1  1
end
I was wondering whether it is possible to get two separate frequency variables: one counting the breast cancer cases (those coded 1) for each combination of predictors, and another counting those coded 0 for breast cancer.

For example, I would like a dataset like the one below:
age_cat sex inc_quint rural breast_ca(1) breast_ca(0)
1 1 3 1 1 4
which means there was 1 case of breast cancer out of 4 people with that combination of predictor variables. I also want zero frequencies to be shown for breast_ca(1), but not for breast_ca(0).

Thank you.
Yuba
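
A possible alternative to contract here is collapse, which can produce the number of cases and the group size in one pass. The sketch below runs on a tiny made-up extract, and the names cases, n_total and noncases are illustrative. Because breast_ca is coded 0/1, its sum is the number of cases, and combinations with no cases keep a zero.

```stata
clear
input byte(age_cat sex inc_quint rural breast_ca)
1 1 3 1 0
1 1 3 1 0
1 1 3 1 1
1 2 1 1 0
1 2 1 1 0
end

* one row per covariate pattern: cases = sum of the 0/1 outcome, n_total = group size
collapse (sum) cases = breast_ca (count) n_total = breast_ca, ///
    by(age_cat sex inc_quint rural)

* people without breast cancer in each pattern
generate noncases = n_total - cases

list, noobs
```

For a Poisson model, n_total could then serve as the exposure, for example through the exposure() option of poisson.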

Appending Files with Similar Name but From Different Folder

I have two folders that contain files with identical names. I would like to append each file to its namesake in the other folder.

For example, interview__33 from Folder v2 should be appended with interview__33 from Folder v3.

I used this code:

global v3 "C:\Users\lm2\Desktop\Raw Files\20190816"
global v2 "C:\Users\lm2\Desktop\Raw Files\v2"
global temp "C:\Users\lm2\Desktop\Raw Files\20190816\temp"

cd "$v2"
local dta: dir . files "*.dta"
foreach f of local dta {
use "`f'", clear
append using "$v3\`f'"
save "$temp\`f'", replace
}

The error is: r(601);
File C:\Users\lm2\Desktop\Raw Files\20190816 not found
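
A likely cause is the backslash that sits directly before the local macro: in Stata a backslash in front of a left quote or dollar sign suppresses macro expansion, so the file name is never substituted. A common workaround is to write paths with forward slashes, which Stata accepts on Windows. The sketch below reuses the folder names from the post and assumes the temp folder already exists.

```stata
* same folders as in the post, written with forward slashes
global v3   "C:/Users/lm2/Desktop/Raw Files/20190816"
global v2   "C:/Users/lm2/Desktop/Raw Files/v2"
global temp "C:/Users/lm2/Desktop/Raw Files/20190816/temp"

* list the .dta files in v2 without changing the working directory
local dta : dir "$v2" files "*.dta"

foreach f of local dta {
    use "$v2/`f'", clear          // file from v2
    append using "$v3/`f'"        // same-named file from v3
    save "$temp/`f'", replace     // combined file
}
```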

How to display few IDs while maintaining the whole sample distribution in Fabplot?

Dear Statalist,


Could anyone advise how to display only a few IDs while maintaining the whole sample distribution in Fabplot? In my case, the ID is the city's pinyin.

My code is:

Code:
fabplot scatter Mgt_NAgrUrb year , by(pinyin)
By default, fabplot displays the distribution of Mgt_NAgrUrb over the years for every city (see the attachment).

Fabplot is an awesome package but the discussion on using it seems scant. Thank you very much in advance.

Regards.

Factor scores.

Hello!

I was wondering how I can get factor scores that have mean = 0 and sd = 1.
I'm using the following commands:

Code:
use https://stats.idre.ucla.edu/stat/stata/output/m255, clear
drop if missing(sexism,facsex, facnat, facrank, studrank, grade, salary, yrsut, nstud)
polychoric facsex facnat facrank studrank grade salary yrsut nstud
return list
matrix r = r(R)
factormat r, n(1250) pcf sds(stdev) means(mean)
predict fac1, regression
I'm not sure whether the scores produced by Stata are correct.

Using randomtag by group

Dear Statalisters,

I have a dataset with individuals belonging to different groups, and I am running a simulation with many repetitions. In each repetition, one individual is randomly drawn from each group using the sample command, e.g.

Code:
*A toy example
clear
*Generate 100 different groups
set obs 100
generate long group=_n
*Generate 1000 individuals per group
expand 1000
bysort group: gen individual=_n

*Sample 1 individual per group
by group: sample 1, count
Since sample relies on sorting the data (which makes the code run rather slowly), I would like to use the user-written command randomtag (from SSC) that tags the same observations that sample would select but does not sort the observations.

My problem is that randomtag does not have a by() option, so I can't use it to sample one individual per group. Does anyone have an idea how to accomplish this with randomtag or with another workaround?

If anyone has any ideas, please let me know, thank you in advance!

Ali
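
One workaround that avoids both sample and randomtag is to draw a single uniform number per group and turn it into a position within the group. This is only a sketch: it assumes the data are sorted by group once (by: itself does not re-sort), and the names u and picked are illustrative.

```stata
clear
set seed 12345
set obs 100
generate long group = _n
expand 1000
bysort group: generate individual = _n      // data are now sorted by group

* one uniform draw per group, stored in the group's first observation
by group: generate double u = runiform() if _n == 1

* tag the observation whose within-group position matches the draw
by group: generate byte picked = (_n == floor(u[1] * _N) + 1)

keep if picked
```

In a simulation, the two by group: lines could be repeated on the already sorted data in each repetition, for example with replace, so that no further sorting should be needed.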

VAR-MGARCH Spillover effects help. # DCC-MGARCH #CCC-MGARCH # BEKK-MGARCH

Dear Statalists:

I am fairly new to Stata, and this is my first time posting a question here.

I am trying to run a VAR-MGARCH model with BEKK, DCC and CCC specifications. My dataset contains 4 financial indices (each of which follows an I(1) process), and I want to investigate the potential return and volatility spillovers between them. However, it seems that the DCC and CCC models report only the diagonal elements of the ARCH and GARCH parameter matrices rather than the full matrices. It also seems that Stata 16 does not offer BEKK models.

My question: can anyone share commands or ado-files showing how to estimate VAR-DCC/VAR-CCC models with fully specified ARCH/GARCH parameter matrices, in order to test for spillover effects between the variables?

The code I used is listed below:

Code:
tsset t

// use information criteria to find the optimal lag order for the VAR
varsoc r_cnne r_eco r_wti r_cqqq, maxlag(10)

// VAR-DCC with lag order 1 for the mean equation and a GARCH(1,1) process for the DCC residuals
mgarch dcc (r_cnne r_eco r_wti r_cqqq = L.r_cnne L.r_eco L.r_wti L.r_cqqq), arch(1) garch(1) distribution(normal) nolog

// VAR-CCC with lag order 1 for the mean equation and a GARCH(1,1) process for the CCC residuals
mgarch ccc (r_cnne r_eco r_wti r_cqqq = L.r_cnne L.r_eco L.r_wti L.r_cqqq), arch(1) garch(1) distribution(normal) nolog
Thank you in advance. Your kind advice and help are highly appreciated.

Best regards
Ben

Interaction Term interpretation

Hello,

I have the following model

log trade volume = log distance*year

I want to figure out how transport costs approximated by distance change over time. I have 25 time periods.
Now I struggle to interpret the coefficient. The coefficient is +0,1322**.

Any help?

How do I create HHI by country and by industry for data in a time series

I'm trying to get started with my regression analysis in Stata, but I'm lacking the HHI variables: in particular, the export concentration HHI by country (from the exporter's side) and the industry HHI (market concentration).
I have data organized as follows:

Year   Dest-Country   Export value   Industry
2000   xxx            X$             A
...
2015   xxx            X$             A

2000   yyy            X$             A
...
2015   yyy            X$             A


And so on. I have about 13 countries in my study and 7 industries over a span of 16 years. Is there any code in particular that can help me find (1) the HHI by country and (2) the HHI by industry? Thanks.
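
An HHI is the sum of squared shares within a group, so egen with by() may be all that is needed. The sketch below runs on invented numbers and rests on an assumed reading of the question: the "HHI by country" is computed across industries within each country and year, and the "HHI by industry" across destination countries within each industry and year. All variable names are illustrative.

```stata
clear
input int year str3 country str1 industry double export_value
2000 xxx A 60
2000 xxx B 40
2000 yyy A 20
2000 yyy B 80
end

* concentration across industries within each destination country and year
egen double tot_country   = total(export_value), by(year country)
generate double sh_country = export_value / tot_country
egen double hhi_country   = total(sh_country^2), by(year country)

* concentration across destination countries within each industry and year
egen double tot_industry   = total(export_value), by(year industry)
generate double sh_industry = export_value / tot_industry
egen double hhi_industry   = total(sh_industry^2), by(year industry)

list, sepby(country)
```

With shares expressed as proportions the index lies between 0 and 1; multiplying the shares by 100 first would give the 0 to 10,000 scale.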

Factor Analysis

I am running the following command:


factor criswci criswmi criswpi criswyi criswui criswti criswbi, factors(1)

score criswi

and it says:
score is not valid after factor.


I think Stata has a new command for this but don't know it since I am not used to factor analysis.

P.S. The do-file that I am using is quite old and must have been written for an older version of Stata.
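
In current Stata, factor scores are obtained with predict after factor, which takes over the role of the old score command. A minimal sketch on the auto dataset shipped with Stata (the variable list and the name f1 are illustrative; in the do-file above the equivalent line would presumably be predict criswi):

```stata
sysuse auto, clear

* one-factor model, analogous to factors(1) in the post
factor price mpg weight length, factors(1)

* regression-scored factor, replacing the old -score- command
predict f1

summarize f1
```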

Help with Estimation of Total Factor Productivity in a BC'95 Stochastic Frontier Model

I ran a Battese and Coelli (1995) sfpanel model in Stata 12.1 on the following translog equation:

sfpanel lny lnl lnm lne lnk lnksq lnlsq lnesq lnklnl lnmsq lnklne lnklnm lnllne lnllnm lnelnm year, model(bc95) dist(tn) emean( for for5 for10 for15 for20 for25 exp_firm firm_size) ort(o)

The aim is to establish the effect of FDI on efficiency and productivity at the firm level.

I wish to estimate TFP whose components are Technical Change(TC), Technical Efficiency Change (TEC) and Scale Efficiency Change (SEC).
  1. Can this be done directly in Stata?
  2. And what is the syntax considering the 4-input translog equation?
Thanks.
Gabriel.

Path Model with (endogenous) Treatment - which model, which command?

Hello all

I have collected data on my students' intention to create a new business.

I measured it before the course (t1) and after it (t2). I have data on their absenteeism in class and the time they spent on the online platform. My hypothesis is that following my course will modify their attitude toward new business ventures and thus their intention to create a new business.

The hypothesized model is as follows:
[Path diagram attached to the original post; image not available]

All variables are continuous. What is difficult is that the model includes both panel and structural equation modeling features.

Do you know a class of model that could accommodate this?

In addition, absenteeism and time spent on the online course are probably endogenous. I would be very interested if you know of a way to treat this additional issue.

Thank you very much in advance!









bootstrap to get the confidence intervals for a linear combination of regression coefficients

I am estimating a quantity called tot, where tot = t1 + t2:
t1 = p1*c1, where p1 is the mean of a variable and c1 is the regression coefficient for that variable
t2 = p2*c2, defined the same way as t1 but for a different variable

I calculated the value of tot as a linear combination of these numbers. My coauthor asked me to get the variance of tot.

I wrote a bootstrap program to calculate the confidence interval of tot, but I get many error messages. I am wondering whether I am on the right track. Here is the code:
Code:
capture program drop bootc
program define bootc,rclass
drop quant p1 p2 c1 c2 t1 t2 tot

gen quant=40000

mean cond1
gen p1=r(mean)
mean cond2
gen p2=r(mean)

     
glm cost i.cond $char, family(gamma) link(log) le(95)
gen c1=_b[1.cond]
gen c2=_b[2.cond]


gen t1=quant*p1*c1
gen t2=quant*p2*c2

gen tot=t1+t2

end

simulate tot=r(tot), reps(4) seed(12345): bootc
estat bootstrap, all
Thank you very much!
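
A few things in the program above would plausibly cause trouble: it creates variables with gen where scalars would do, it never returns r(tot), and simulate reruns the program on the same data without resampling. The sketch below shows one way the pieces could fit together with the bootstrap prefix. It runs on simulated data, leaves out the $char covariates, and keeps the 40000 multiplier from the post; everything else (data-generating values, reps(50)) is an assumption for illustration.

```stata
clear
set seed 12345
set obs 500
generate byte cond  = floor(3 * runiform())      // toy 3-level condition
generate byte cond1 = (cond == 1)
generate byte cond2 = (cond == 2)
generate double cost = exp(1 + 0.3*cond1 + 0.5*cond2) * rgamma(2, 0.5)

capture program drop bootc
program define bootc, rclass
    tempname p1 p2
    summarize cond1, meanonly
    scalar `p1' = r(mean)                        // share with condition 1
    summarize cond2, meanonly
    scalar `p2' = r(mean)                        // share with condition 2
    glm cost i.cond, family(gamma) link(log)
    * return the statistic so that bootstrap can collect it
    return scalar tot = 40000 * (`p1' * _b[1.cond] + `p2' * _b[2.cond])
end

* resample the data, rerun bootc each time, and collect r(tot)
bootstrap tot = r(tot), reps(50) seed(12345): bootc
estat bootstrap, all
```

In practice the number of replications would normally be much larger than 50.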

Marginal effect*

Dear community! I have estimated an ordered probit regression with a 10-category DV, and I would like to create a graph of the marginal effects on the probability of DV outcomes 6 to 10.
I use the margins command; however, by default it applies the expression for outcome 0. I couldn't find an answer in the -help margins- document. Could you help me, please?
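
The outcome can be chosen through the predict() option of margins. The sketch below uses the auto dataset shipped with Stata, where rep78 has five categories, so outcomes 3 to 5 stand in for the poster's 6 to 10; the model and the variable mpg are illustrative only.

```stata
sysuse auto, clear

oprobit rep78 mpg weight

* average marginal effect of mpg on Pr(rep78 == k), one outcome at a time
foreach k of numlist 3/5 {
    margins, dydx(mpg) predict(pr outcome(`k'))
}
```

Running marginsplot immediately after a margins call should graph the results of that call.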

Use mimix command for cross-sectional data

Hi everyone, I'm interested in applying the reference-based sensitivity analyses using multiple imputation (i.e. the mimix command) for my research. However, my data is cross-sectional not longitudinal. Is there a way to use the mimix command for cross-sectional data? If not, are there similar commands to the mimix command for cross-sectional data? Any suggestions would be helpful! Thanks in advance.

Predicted probability for each DV outcome

Dear community!
How can I create a table of the predicted probability effect of each IV on each DV outcome, like the one in the attached screenshot?