Channel: Statalist

Transform annual data into quarterly data

Hello,

I have panel data with annual variables. I want to create a new variable at quarterly frequency, keeping the same annual value for each quarter. For example, I have the annual value (in 2015) land_area = 27500, and I want each quarter to carry this value:
land_area_q1=27500
land_area_q2=27500
land_area_q3=27500
land_area_q4=27500

Can you help me with the code?

Thank you in advance!
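A sketch of one common approach, assuming one observation per panel id and year (variable names id, year, and land_area are placeholders):

```stata
* Duplicate each annual row into four quarterly rows
expand 4
bysort id year: gen quarter = _n     // quarter 1-4 within each year
gen qdate = yq(year, quarter)        // Stata quarterly date
format qdate %tq
* land_area now repeats its annual value in all four quarters of each year
```

This keeps a single land_area variable in long form rather than four separate quarterly variables, which is usually easier to work with in panel commands.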

Referencing a specific cell

Hi everyone,

I want to divide the values of one variable by one specific value of that same variable, like below:

Code:
MARQUE   quality   quality1
brand1     29.11
brand2     33.04
brand3    100
brand4     37.19
brand5     26.33
brand6     48.1

i want for example to do below:

gen quality1 = quality / quality[brand1]     <- that is, divide all qualities by the quality of brand1

How can I do that?
Thanks
Ashkan
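One possible sketch: subscripts like quality[brand1] are not valid Stata (subscripts take observation numbers, not string values), but you can capture brand1's value first. This assumes MARQUE is a string variable:

```stata
* Grab brand1's quality, then divide every observation by it
summarize quality if MARQUE == "brand1", meanonly
gen quality1 = quality / r(mean)
```

Because only one observation matches, r(mean) is exactly brand1's value; this avoids relying on a particular sort order.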

Applying a command only to values with one or two decimal places

Hi,

Does anyone know how I can restrict the following command to observations whose value of ts has a decimal part:

Code:
bysort id (year): replace ts=ts[_n+1] if ts[_n-1]==ts[_n+1]
The variable of interest is ts. Basically, I want to add another condition: that ts has one or two decimal places. The command should not apply to whole numbers.

Thanks for any help!
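A sketch of one way to add that condition. Because ts is presumably stored as a float, the "is it a whole number?" test uses a small tolerance rather than an exact comparison:

```stata
* Apply the replacement only where ts has a fractional part
* (abs(ts - round(ts)) > tolerance screens out whole numbers)
bysort id (year): replace ts = ts[_n+1] ///
    if ts[_n-1] == ts[_n+1] & abs(ts - round(ts)) > 1e-6
```

If you specifically need "at most two decimal places" rather than "not a whole number", you could additionally require abs(ts*100 - round(ts*100)) < 1e-4.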

Pairwise Comparison of Means

Dear all,
I did a pairwise comparison of means using the following command:
pwmean Urban_Peri , over(Foodgrp12 Wealthindex2 ) pveffects cimeans cformat(%9.3f) pformat(%5.3f) sformat(%8.3f)

I got the following results:
Code:
Urban and Peri_Urban Dwellers
Food Groups#Wealth Index   Contrast   Std. Err.      t   P>|t|   [95% Conf. Interval]
(1 1) vs (1 0)     -0.33153   0.097091   -3.41   0.001   -0.52253   -0.14054
(4 1) vs (1 0)     -0.17915   0.103159   -1.74   0.083   -0.38208    0.023782
(5 1) vs (1 0)     -0.4531    0.128189   -3.53   0.000   -0.70527   -0.20093
(6 1) vs (1 0)     -0.31941   0.150245   -2.13   0.034   -0.61497   -0.02385
(9 1) vs (1 0)     -0.42042   0.162605   -2.59   0.010   -0.74029   -0.10055
(12 1) vs (10 1)   -0.26486   0.133916   -1.98   0.049   -0.5283    -0.00143
(2 0) vs (1 1)      0.466667  0.142141    3.28   0.001    0.187051   0.746283
(4 0) vs (1 1)      0.196396  0.097091    2.02   0.044    0.005402   0.387391
(6 0) vs (1 1)      0.266667  0.152951    1.74   0.082   -0.03422    0.567549
(8 0) vs (1 1)      0.466667  0.206239    2.26   0.024    0.060957   0.872376
(9 0) vs (1 1)      0.466667  0.159752    2.92   0.004    0.152406   0.780928
My problem is that I can't interpret what, for example, (1 1) vs (1 0) means. Can somebody help, please?


How to overlay Impulse Response Functions

Hi,

I have estimated a baseline SVAR time series model. As a robustness check, I have replaced one of the variables in the model. I would like to plot the impulse response function from the baseline model with its confidence intervals overlaid with the impulse response using the second model. I figure it has something to do with irf ograph but cannot work out how to get it to pull from two different irf files. Could someone help me with the code?

Thanks in advance!

PS. I know this forum does not like aliases. I have tried to change my name but cannot work out how to do that either!
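A hedged sketch of one route: irf add can copy results from another IRF file into the active one, after which irf ograph can overlay them. The names base and alt, the file names, and shockvar/respvar are placeholders for whatever names were used at estimation:

```stata
* Pull both sets of stored results into one active .irf file, then overlay
irf set combined, replace
irf add _all, using(baseline)    // baseline model's results, e.g. saved as "base"
irf add _all, using(robust)      // alternative model's results, e.g. saved as "alt"

* Overlay the two IRFs; the ci suboption requests bands for that plot
irf ograph (base shockvar respvar irf, ci) ///
           (alt  shockvar respvar irf)
```

Whether the confidence bands draw cleanly when overlaid with a second line is worth checking; if the graph gets cluttered, drop ci from one spec.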

Diverging stacked bar chart

Hi everyone,

I want to display the frequencies of one categorical variable (attitude) over the categories of another (country) in a stacked bar chart. The attitude variable has 5 (Likert scale) categories. I have two specific aims, which I do not know how to accomplish using Stata:
  1. The middle or neutral category of the attitude variable should be at the center of the graph over all the countries.
  2. The stacked bars should be sorted according to the frequency of one (or more) categories of the attitude variable.
For an example of what I have in mind, see: [link]. Using Stata 12.1 I tried the following, which did not exactly yield the desired results:

Code:
slideplot hbar attitude , by(country) percent neg(1 2 3) pos(4 5)
But slideplot cannot sort the bars. Also, you either have to choose whether the middle category 3 goes to the left or to the right, or you have to leave it out completely.

Code:
tab attitude, gen(attitudeCat)
graph hbar attitudeCat1 attitudeCat2 attitudeCat3 attitudeCat4 attitudeCat5 ///
, percent stack over(country, sort(5) descending)
With graph hbar I can at least sort the stacked bars according to one of the categories, but I still cannot center them around their middle category.

Does anyone have a suggestion of how to do this? Any help is much appreciated!

Kind regards,
Uwe

I produced a tiny example dataset in case you need it:

Code:
clear
input float attitude long country
 4 2
 4 2
 5 2
 4 2
 5 2
 2 2
 5 2
 4 2
 5 2
 4 2
 5 2
 1 2
 1 2
 1 2
.c 2
 5 3
 3 3
 2 3
 4 3
 4 3
 5 3
.c 3
 5 3
 5 3
 5 3
 4 3
 1 3
 1 3
 1 3
 1 3
.c 4
 3 4
 4 4
 5 4
 3 4
 5 4
 4 4
 4 4
 5 4
 4 4
.c 4
 5 5
 5 5
 3 5
 5 5
 4 5
 3 5
 4 5
 2 5
 4 5
 5 5
 4 5
 5 5
 2 5
 4 5
 4 5
 5 5
 4 5
 5 5
 3 5
 5 5
 4 5
.c 5
 1 6
 4 6
 5 6
 4 6
 3 6
 4 6
 4 6
 5 6
 4 6
 2 6
 4 6
 4 6
 2 6
 1 6
 5 6
 2 6
 2 6
 5 6
 4 6
 4 6
 4 6
 4 7
 4 7
 5 7
 1 7
 4 7
 3 7
 4 7
 5 7
 1 7
 1 7
 5 7
 5 7
 5 7
 5 7
 5 7
 5 7
 5 7
 5 7
 5 7
 4 7
 5 7
 2 7
 4 7
end
label values country country
label def country 2 "A", modify
label def country 3 "B", modify
label def country 4 "C", modify
label def country 5 "D", modify
label def country 6 "E", modify
label def country 7 "F", modify

recursive mixed model (cmp)

I am trying to estimate a model for a cross-section of firms (pooled cross-section dataset) based on three sequential decisions: 1) To export or not, 2) to export directly or through an intermediary, 3) how much to export indirectly.

I have tried a Heckman, but the main problem is that I have two sequential decisions, 1) and 2), and I can only include one of them in the first-step Heckman, for example 1):

heckman indirectshare2 lemp llabpro foreign transport crime legalfair eu cefta voleuro, select(xd = lemp llabpro foreign eu cefta transport customs crime permit legalfair voleuro ) vce(robust)


Then I thought about using cmp with three equations, two probits and one continuous model, something like:

cmp setup
cmp (xd = lemp llabpro foreign eu cefta transport customs crime permit legalfair voleuro )(indirect = lemp llabpro expintens foreign transport customs crime legalfair eu cefta ) (indirectshare2 =lemp llabpro foreign exint transport customs crime legalfair eu cefta ), indicators($cmp_probit $cmp_probit $cmp_cont) quietly


My question is: is this equivalent to a Heckman with two initial steps, or rather to a treatment-effects model with sequential (endogenous-treatment) decisions? I would be grateful for suggestions on whether this is appropriate or whether there are other ways to model this.
Many thanks
Inma

Error-correction equations using OLS

Hi,

I am trying to estimate the following model:

[The model equation was attached as an image.]
Here are the data:

Code:
DATE    R    INF    D1_R    D1_INF
Mar-84    13.77    1.225116        
Jun-84    12.81    4.863282    -.96    3.638166
Sep-84    10.53    5.997114    -2.28    1.133832
Dec-84    12.34    5.321586    1.81    -.6755276
Mar-85    15.29    9.289242    2.95    3.967656
Jun-85    15.75    9.078403    .46    -.2108383
Sep-85    16.78    7.778023    1.03    -1.300381
Dec-85    19.56    9.245823    2.78    1.4678
Mar-86    16.44    6.400136    -3.12    -2.845686
Jun-86    14.68    10.44446    -1.76    4.044322
Sep-86    18.08    11.18243    3.4    .7379732
Dec-86    15.53    7.940707    -2.55    -3.241724
Mar-87    16.41    5.853763    .88    -2.086944
Jun-87    13.68    6.722847    -2.73    .8690844
Sep-87    11.7    7.079831    -1.98    .3569832
Dec-87    11.46    6.956697    -.24    -.1231337
Mar-88    10.95    6.837773    -.51    -.1189237
Jun-88    13.1    7.61075    2.15    .7729769
Sep-88    13.64    7.90366    .54    .2929096
Dec-88    15.11    3.894027    1.47    -4.009632
Mar-89    17.25    9.782518    2.14    5.888491
Jun-89    18.37    9.138508    1.12    -.6440105
Sep-89    18.12    7.324721    -.25    -1.813787
Dec-89    17.89    6.796765    -.23    -.527956
Mar-90    15.84    6.293149    -2.05    -.5036168
Jun-90    15.02    3.109831    -.82    -3.183317
Sep-90    13.52    10.32069    -1.5    7.210856
Dec-90    12.19    -.7554299    -1.33    -11.07612
Mar-91    11.49    .7554299    -.7    1.51086
Jun-91    10.39    2.257767    -1.1    1.502337
Sep-91    9.58    3.734854    -.81    1.477087
Dec-91    7.84    0    -1.74    -3.734854
Mar-92    7.51    -1.116799    -.33    -1.116799
Jun-92    6.42    .372613    -1.09    1.489412
Sep-92    5.94    1.857876    -.48    1.485263
Dec-92    5.9    3.690063    -.04    1.832187
Mar-93    5.38    1.466546    -.52    -2.223517
Jun-93    5.22    1.825654    -.16    .3591075
Sep-93    4.86    .7279347    -.36    -1.097719
Dec-93    4.82    1.451907    -.04    .7239726
Mar-94    4.88    2.888099    .06    1.436192
Jun-94    5.12    2.510093    .24    -.3780057
Sep-94    5.93    3.204289    .81    .694196
Dec-94    7.95    6.681474    2.02    3.477185
Mar-95    8.13    5.197128    .18    -1.484346
Jun-95    7.55    4.790476    -.58    -.406652
Sep-95    7.49    3.04957    -.06    -1.740906
Dec-95    7.43    1.684213    -.06    -1.365357
Mar-96    7.53    2.680077    .1    .995864
Jun-96    7.57    1.000417    .04    -1.67966
Sep-96    6.91    .6655576    -.66    -.3348598
Dec-96    6.13    .664452    -.78    -.0011056
Mar-97    6.05    -.9970923    -.08    -1.661544
Jun-97    5.35    -1.667364    -.7    -.6702715
Sep-97    4.77    1.001252    -.58    2.668616
Dec-97    5.07    .9987521    .3    -.0025
Mar-98    4.96    2.320769    -.11    1.322017
Jun-98    5.32    .9905081    .36    -1.330261
Sep-98    5.03    1.973688    -.29    .9831801
Dec-98    4.8    -.3282725    -.23    -2.301961
Mar-99    4.81    1.638675    .01    1.966947
Jun-99    4.93    3.581628    .12    1.942953
Sep-99    5.01    2.262632    .08    -1.318995
Dec-99    5.65    3.529907    .64    1.267274
Mar-00    5.89    3.182197    .24    -.3477099
Jun-00    6.23    14.62629    .34    11.44409
Sep-00    6.57    1.220443    .34    -13.40585
Dec-00    6.2    4.242464    -.37    3.022021
Mar-01    5.14    3.302083    -1.06    -.9403815
Jun-01    4.97    1.194031    -.17    -2.108052
Sep-01    4.55    3.560854    -.42    2.366824
Dec-01    4.25    3.529435    -.3    -.0314198
Mar-02    4.46    2.917591    .21    -.6118433
Jun-02    5.07    2.60776    .61    -.3098314
Sep-02    4.92    2.87771    -.15    .2699504
Dec-02    4.83    5.128275    -.09    2.250565
Mar-03    4.76    0    -.07    -5.128275
Jun-03    4.67    2.258298    -.09    2.258298
Sep-03    4.91    1.965606    .24    -.2926922
Dec-03    5.47    3.624981    .56    1.659375
Mar-04    5.51    1.938391    .04    -1.68659
Jun-04    5.49    1.654034    -.02    -.2843567
Sep-04    5.42    3.014745    -.07    1.360711
Dec-04    5.41    2.721099    -.01    -.2936463
Mar-05    5.81    2.433262    .4    -.287837
Jun-05    5.66    3.755896    -.15    1.322634
Sep-05    5.62    2.130498    -.04    -1.625398
Dec-05    5.63    3.438038    .01    1.30754
Mar-06    5.61    6.27054    -.02    2.832502
Jun-06    5.96    3.612928    .35    -2.657612
Sep-06    6.21    -.5141389    .25    -4.127067
Dec-06    6.39    .2571521    .18    .7712909
Mar-07    6.43    4.854739    .04    4.597587
Jun-07    6.42    2.78394    -.01    -2.070798
Sep-07    6.94    3.765324    .52    .981384
Dec-07    7.29    5.212609    .35    1.447284
Mar-08    7.9    5.875258    .61    .6626496
Jun-08    7.81    4.590808    -.09    -1.28445
Sep-08    7.27    -1.203008    -.54    -5.793817
Dec-08    4.39    .4816376    -2.88    1.684646
Mar-09    3.16    1.920772    -1.23    1.439134
Jun-09    3.25    3.814093    .09    1.893321
Sep-09    3.37    2.129553    .12    -1.684541
Dec-09    4.13    3.524252    .76    1.394699
Mar-10    4.33    2.564859    .2    -.9593933
Jun-10    4.89    2.779397    .56    .2145388
Sep-10    4.82    1.612441    -.07    -1.166957
Dec-10    5.03    6.159232    .21    4.546791
Mar-11    4.92    3.605658    -.11    -2.553574
Jun-11    4.99    2.46017    .07    -1.145489
Sep-11    4.81    0    -.18    -2.46017
Dec-11    4.51    .2229033    -.3    .2229033
Mar-12    4.44    2.00056    -.07    1.777656
I have tried the following code but not sure if it is correct:
Code:
* e) Two error-correction using OLS:

vec D1_R D1_INF, trend(none) lags(5)
First, when I typed lags(4), the output seemed to give me only 3 lags, so I changed it to lags(5). Am I missing something? Second, I have removed the time trend; is that right? Finally, does the code achieve the specification overall? Thanks!



Panel Data, Fixed effect Model

I am using panel data in Stata. When I ran a fixed-effects model, three variables showed coefficients of zero, with the following messages:


note: Female members omitted because of collinearity
note: Number of Independent Director omitted because of collinearity
note: Audit committee size omitted because of collinearity

These three variables are completely different from each other in nature and concept, yet Stata reports collinearity. They are very important and I cannot drop them from my sample.
How can I solve this problem in Stata?
My mean VIF is 1.54, which indicates no collinearity; all my VIF results are below 10, yet Stata still reports collinearity.
I attached a file as well.

Can somebody guide me on how to resolve this problem, please?

Show prediction results after fitting a logit model

Dear Statalist,

I'm Wesley. I have fit a logit model, but I still do not know how to get predictions from it. Could you please show me the code or something similar?

Thank you all !
Best wishes

Wesley
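A minimal sketch; y, x1, and x2 are placeholder variable names:

```stata
* Fit the model, then get predicted probabilities
logit y x1 x2
predict phat, pr                                // Pr(y == 1) for each observation

* Optional: classify at a 0.5 cutoff and tabulate in-sample accuracy
gen byte yhat = phat >= .5 if !missing(phat)
estat classification
```

predict with the pr option is the usual post-logit prediction; xb instead would give the linear index (log-odds).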

Running R scripts from Stata via rsource on mac

I'm on a Mac and have installed/updated rsource by running:

adoupdate rsource, update

but when I try to run my R script I get the following:

. rsource using ~/.../rscript.R
Assumed R program path: "/usr/bin/r"


/bin/bash: /usr/bin/r: No such file or directory
Beginning of R output
End of R output


What am I doing wrong?
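The path rsource falls back to (/usr/bin/r) does not exist on a stock Mac. A sketch of one fix, assuming R lives at /usr/local/bin/R (verify with `which R` in Terminal; a Homebrew or CRAN install typically puts it there):

```stata
* Tell rsource where the R binary actually is, via the globals it reads
global Rterm_path    `"/usr/local/bin/R"'
global Rterm_options `"--vanilla"'

rsource using ~/rscript.R
```

Setting the globals once (e.g. in profile.do) means every later rsource call finds R without repeating the path.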

Using a loop to extract estimates for multiple companies from a regression


Hi guys, can you please provide some assistance? I have a very long panel dataset of daily prices for different companies and their market index return from 2001-2014. I am trying to run a regression of the daily stock return (y) on the index return (x) for each individual company, and to extract and save the residuals and coefficients of each. I ran the following code for each individual company within the dataset. I was wondering whether there is a way to run a loop so I won't have to repeat the code below for each company.
Code:
regress lnreturn_stock lnreturn_market in 1/1530
predict u in 1/1530, residuals
ereturn list
mat b = e(b)
svmat double b, n(beta)
Thank you,
Jake
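A sketch using levelsof, assuming a numeric firm identifier here called company_id (a placeholder; substitute your actual id variable):

```stata
* One regression per company; store residuals and the slope coefficient
gen double u    = .
gen double beta = .

levelsof company_id, local(ids)
foreach c of local ids {
    quietly regress lnreturn_stock lnreturn_market if company_id == `c'
    quietly predict double resid if company_id == `c', residuals
    quietly replace u    = resid if company_id == `c'
    quietly replace beta = _b[lnreturn_market] if company_id == `c'
    drop resid
}
```

If you only need the coefficients (not the residuals), `statsby _b, by(company_id): regress lnreturn_stock lnreturn_market` collects them in one line, at the cost of replacing the data in memory.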

How to visualise the functional form of the conditional effect of a squared independent variable?

Dear all,

I have a question with regard to the graphing of effects of squared variables and I was wondering whether you could help me with some advice.

I estimate a time-series cross-country multivariate regression using a sample of OECD countries. I want to investigate the linear and the squared effect of health decentralisation on the level of infant mortality. Now, I would like to visualise the functional form of the effect of the squared variable of health decentralisation on infant mortality. It may be a simple question but I want to make sure that I am doing the right thing.

The way I would go about this is as follows:

a) I regress infant mortality on linear health decentralisation as well as the several other control variables. I save the predicted values as a separate variable.

b) I create a two-way scatterplot (with confidence intervals in order to display density) between the saved predicted values and the squared health decentralisation variable.

My syntax is the following:

gen Health_Decentralisation_SQ = Health_Decentralisation*Health_Decentralisation

xtreg Infant_Mortality Health_Decentralisation Income_log Pub_Health_Expenditure_GDP Dependency_Ratio Education_int Fertility Alcohol_Consumption, re robust

predict predicted_values, xb

twoway (qfitci predicted_values Health_Decentralisation_SQ)


Is this approach correct, or am I missing something? Also, Health_Decentralisation_SQ obviously has a scale reflecting the squared health decentralisation values. To convert this into a meaningful scale, is it OK if I simply take the square root of the squared values and adjust the scale accordingly?

I would be glad to receive your view on this.

All the best

Helge
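A common alternative sketch: instead of plotting predictions against the squared term, use factor-variable notation so Stata knows the linear and squared terms belong together, then let margins trace out the fitted quadratic on the original (unsquared) scale. The at() grid below is a placeholder to adapt to the observed range of Health_Decentralisation:

```stata
* Quadratic via factor-variable notation (## adds both terms)
xtreg Infant_Mortality c.Health_Decentralisation##c.Health_Decentralisation ///
    Income_log Pub_Health_Expenditure_GDP Dependency_Ratio Education_int   ///
    Fertility Alcohol_Consumption, re robust

* Predicted infant mortality across decentralisation values, then plot
margins, at(Health_Decentralisation = (0(10)100))
marginsplot
```

This sidesteps the back-transformation question entirely: the x-axis is already in the original units, and marginsplot adds confidence intervals around the curve.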

Gravity model using ppml and time varying fixed effects

Hi,
I was hoping to get some help with my dissertation. I'm trying to study how terrorism affects bilateral trade, using panel data for 164 countries. I am using the PPML estimator (because of zero trade observations) with time-varying fixed effects for importer and exporter. However, half of my time-varying fixed-effects dummies get dropped from the model due to collinearity. I have tried pretty much everything and can't seem to get around this problem. I'd appreciate your advice. Thanks.

Line break problem with markdoc

Hi,

I have a problem with line breaks when using Markdoc. Here is a test case

Code:
log using auto, replace
sysuse auto, clear
list in 1
log close
markdoc auto, export(md) replace
Stata prints out the listing as follows

Code:
     +----------------------------------------------------------------------------------------------------------------+
     | make          price   mpg   rep78   headroom   trunk   weight   length   turn   displa~t   gear_r~o    foreign |
     |----------------------------------------------------------------------------------------------------------------|
  1. | AMC Concord   4,099    22       3        2.5      11    2,930      186     40        121       3.58   Domestic |
     +----------------------------------------------------------------------------------------------------------------+
The relevant part of the md file contains:

Code:
          . sysuse auto, clear
          (1978 Automobile Data)
          
          . list in 1
          
               +------------------------------------------------------------------------------------
          ----------------------------+
               | make          price   mpg   rep78   headroom   trunk   weight   length   turn   displa~t   gear_r~o    foreign |
               |------------------------------------------------------------------------------------
          ----------------------------|
            1. | AMC Concord   4,099    22       3        2.5      11    2,930      186     40      

121 3.58 Domestic |
+------------------------------------------------------------------------------------
----------------------------+

          . log close
There are three problems:

First, the md file breaks long lines with a line break whereas Stata does not (it may not be visible in this forum post). I would prefer the md file not to contain line breaks that are absent from the Stata log, leaving any wrapping to the user's Markdown viewer. Second, the third line is broken with two line breaks, creating an empty line between "... 186 40" and "121 3.58 Domestic". Third, if I export this SMCL to Word, the double line break causes the font to change.

I am using Stata 14.1 on Mac and just updated markdoc to the latest version available in SSC.

Mikko

String variables: identifying matches between survey responses

Hello Statalisters,
A colleague has received survey responses and suspects that some responses have been "fed" to the respondents. Is there a neat way to identify whether the responses given by any two individuals are systematically the same or similar? The problem is that we don't know what strings to look for; otherwise we'd use the general string functions.
Failing that, I might just ask them to use plagiarism-detection software.
Thanks so much!
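One possible sketch with the community-contributed matchit (SSC), which scores fuzzy string similarity without needing to know the strings in advance. It matches the dataset against a copy of itself; respid and answer are placeholder variable names:

```stata
* Pairwise fuzzy matching of free-text responses against themselves
ssc install matchit

preserve
keep respid answer
rename (respid answer) (respid2 answer2)
tempfile copy
save `copy'
restore

matchit respid answer using `copy', idusing(respid2) txtusing(answer2)

* Inspect highly similar pairs (excluding each response matched to itself)
gsort -similscore
list respid respid2 similscore if respid != respid2 & similscore > .8
```

The 0.8 cutoff is arbitrary; eyeball the distribution of similscore before settling on a threshold. Note the self-join is O(n²) in pairs, so it is only practical for modest sample sizes.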

How to use xtabond2?

Hi all,

I'm trying to do a dynamic panel-data analysis by estimating the equation x_ijt = b*x_ij,t-1 + a*h_ijt + fe_ij + v_ijt, where x is trade flows, h is the tariff, and fe is the fixed effect. I do it using the xtabond2 command: xtabond2 res2 L.res2 tariff, noleveleq gmmstyle(L.res2). I also tried it this way: xtabond2 res2 L.res2 tariff, noleveleq ivstyle(tariff, passthru) gmmstyle(L.res2).

First, I would like to ask, which command I should use, since I couldn't totally get the difference between the two. Second, in both of the regressions the Sargan statistic is equal to 0.000. How should I improve it?

Can someone please explain the difference between the two commands and give me some suggestions on improving my results?

Thanks a lot!

Counting observations depending on previous dates

Dear Stata experts,

my question seems ordinary, but I am missing this one command that helps me to finalize the code myself.

I want to aggregate/count observations as long as their time span is not more than 5 years.

For this I have the following data:

Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input str36 investor_uuid float(announced announced_minus_5years investment_experience)
"000a3a71-e224-c35d-6476-65e0d6579830" 15046 13221 1
"000a3a71-e224-c35d-6476-65e0d6579830" 15838 14013 2
"000a3a71-e224-c35d-6476-65e0d6579830" 20319 18494 3
"000a3a71-e224-c35d-6476-65e0d6579830" 20481 18656 4
"000e38ec-2b81-3772-880b-1335374cb824" 17764 15939 1
"000ede33-6c08-c865-e88e-0a59ec5daee8" 17592 15767 1
"000ede33-6c08-c865-e88e-0a59ec5daee8" 19387 17562 2
"000ede33-6c08-c865-e88e-0a59ec5daee8" 20121 18296 3
"000ede33-6c08-c865-e88e-0a59ec5daee8" 20183 18358 4
"00121759-7d9c-dc31-d0a7-60581cb33bb9" 19068 17243 1
"00121759-7d9c-dc31-d0a7-60581cb33bb9" 19509 17684 2
"00122598-4cf3-0a67-a91e-b1df72d8ee46" 17569 15744 1
"00122598-4cf3-0a67-a91e-b1df72d8ee46" 17882 16057 2
"00122598-4cf3-0a67-a91e-b1df72d8ee46" 19479 17654 3
end
format %td announced
format %td announced_minus_5years
I now want to create a new variable, "investment_experience_5precedingyears", that counts all observations whose announced date is within the 5 years preceding the respective announced date. For example, for the first investor_uuid the first two observations can be added up, but for the announced date 19aug2015 the previous observations happened long before the 5-year time frame, so investment_experience_5precedingyears should take the value 1 again (the observation itself always counts as the first investment). The last observed date for this investor_uuid should then again count 2 investment experiences, as only the previous one falls within the 5-year span.

I hope my question is clear. What I am specifically missing is a command like: for a given investor_uuid, look at all previous observations and count only the ones satisfying the following condition...

Thanks a lot for your always very quick and valuable support!
Best,
Rike
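A sketch using the community-contributed rangestat (SSC), which counts within a moving window per group. Here 1826 days approximates five years, and the window includes the current observation, so the first investment counts as 1 as described above:

```stata
* Count announcements in the preceding 5 years (inclusive of the current one)
ssc install rangestat
rangestat (count) investment_experience_5precedingyears = announced, ///
    interval(announced -1826 0) by(investor_uuid)
```

interval(announced -1826 0) means: for each row, count observations of the same investor_uuid whose announced date lies between (announced - 1826) and announced.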

Can the independent variable be the instrumental variable in a lagged period?

I am using panel data where there is reverse causality between the variable Y and the variable X, as follows:
Y depends on X of a lagged period (I denote it X-1). In a subsequent period, X (I denote it X+1) reacts to changes in Y.
I perform xtivreg as follows:
xtivreg X+1 (Y = X-1)
Is this right?
Thank you

eteffects command

Hi,
I learnt that Stata 14 has an eteffects command to account for endogeneity in treatment effects. Forgive my naivete, but is there a way to operationalize this in Stata 12?
Thank you in advance,
Suja